To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Steven G. Johnson
@ submitter email = stevenj@alum.mit.edu
@ submitter organization = MIT
@ computer manufacturer = Sun
@ computer model = Ultra HPC 5000
@ CPU manufacturer = Sun
@ CPU model = UltraSPARC-I
@ CPU speed = 167 MHz
@ RAM = 1024 MB
@ L2 cache size =
@ operating system = SunOS 5.5.1
@ C compiler = Sun Workshop cc 4.2
@ C compiler flags = -fast -native -DSOLARIS -dalign -xO5 -I../fftw-1.2/src/src -DUSE_SUNPERF
@ Fortran compiler = Sun Workshop f77 4.2
@ Fortran compiler flags = -fast -native -dalign -libmil -xO5
@ remarks = This machine is in a cluster of SMPs donated to MIT by Sun. (http://xolas.lcs.mit.edu)
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
    2 (0.000228882 MB)
    4 (0.000534058 MB)
    8 (0.000839233 MB)
    16 (0.00164795 MB)
    32 (0.00297546 MB)
    64 (0.00616455 MB)
    128 (0.0119019 MB)
    256 (0.0238037 MB)
    512 (0.0476074 MB)
    1024 (0.0939941 MB)
    2048 (0.189575 MB)
    4096 (0.37915 MB)
    8192 (0.765991 MB)
    16384 (1.51184 MB)
    32768 (3.02368 MB)
    65536 (6.09973 MB)
    131072 (12.1995 MB)
    262144 (25.4987 MB)
    524288 (50.9973 MB)
    1048576 (96 MB)
    2097152 (192 MB)
    4194304 (384 MB)
Maximum array size = 4194304
Benchmarking FFTs:
    0. Arndt DIF
    1. Arndt DIT
    2. Arndt Split-Radix
    3. Arndt 4-step
    4. Bailey
    5. Beauregard
    6. Bergland
    7. Brenner
    8. Burrus
    9. CWP (min N)
    10. CWP (best N)
    11. Edelblute
    12. FFTPACK
    13. FFTPACK (f2c)
    14. FFTW
    15. FFTW_ESTIMATE
    16. Frigo-old
    17. Green
    18. GSL
    19. GSL DIT
    20. GSL DIF
    21. Krukar
    22. Mayer (Buneman)
    23. Mayer (simple)
    24. Mayer (lookup)
    25. Monro
    26. NAPACK (f2c)
    27. Nielsen
    28. NR (C)
    29. NR (F)
    30. Ooura (C)
    31. Ooura (F)
    32. QFT
    33. Ransom
    34. SCIPORT
    35. Singleton
    36. Singleton (f2c)
    37. Sorensen
    38. Sorensen DIT
    39. Temperton
    40. Temperton (f2c)
    41. Valkenburg
    42. 
SUNPERF Computing normalized averages (43 transforms). Benchmarking for array size = 2 (power of 2): 0. Arndt DIF: elapsed time t=1.68418 s, 4194304 iters, t-(init.)=0.792235 s t(norm)=0.0944417, mflops=52.9427 (err=1.7e-17) 1. Arndt DIT: elapsed time t=1.61179 s, 4194304 iters, t-(init.)=1.0625 s t(norm)=0.12666, mflops=39.4759 (err=1.7e-17) 2. Arndt Split-Radix: elapsed time t=1.04271 s, 1048576 iters, t-(init.)=0.899132 s t(norm)=0.42874, mflops=11.6621 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.47151 s, 131072 iters, t-(init.)=1.44732 s t(norm)=5.52108, mflops=0.905619 (err=1.7e-17) 4. Bailey: elapsed time t=1.37497 s, 524288 iters, t-(init.)=1.30631 s t(norm)=1.2458, mflops=4.01349 (err=1.7e-17) 5. Beauregard: elapsed time t=1.24684 s, 524288 iters, t-(init.)=1.13478 s t(norm)=1.08221, mflops=4.62017 (err=1.7e-17) 6. Bergland: elapsed time t=1.17937 s, 524288 iters, t-(init.)=1.11069 s t(norm)=1.05924, mflops=4.72038 (err=1.7e-17) 7. Brenner: elapsed time t=1.08339 s, 524288 iters, t-(init.)=1.00535 s t(norm)=0.958774, mflops=5.215 (err=1.7e-17) 8. Burrus: elapsed time t=1.03731 s, 2097152 iters, t-(init.)=0.725198 s t(norm)=0.172901, mflops=28.9183 (err=1.7e-17) 9. CWP (min N): elapsed time t=1.37167 s, 262144 iters, t-(init.)=1.33576 s t(norm)=2.54775, mflops=1.96252 10. CWP (best N) (N=3): elapsed time t=1.41459 s, 262144 iters, t-(init.)=1.36466 s t(norm)=2.60289, mflops=1.92095 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.58293 s, 1048576 iters, t-(init.)=1.42063 s t(norm)=0.677411, mflops=7.38105 (err=1.7e-17) 13. FFTPACK (f2c): elapsed time t=2.00057 s, 1048576 iters, t-(init.)=1.76559 s t(norm)=0.841899, mflops=5.93896 (err=1.7e-17) FFTW_MEASURE plan: (cost = 3.958164e-07) FFTW_NOTW 2 14. FFTW: elapsed time t=1.04744 s, 2097152 iters, t-(init.)=0.74531 s t(norm)=0.177696, mflops=28.138 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 15. 
FFTW_ESTIMATE: elapsed time t=1.21729 s, 2097152 iters, t-(init.)=0.751007 s t(norm)=0.179054, mflops=27.9245 (err=1.7e-17) 16. Frigo-old: elapsed time t=1.09859 s, 4194304 iters, t-(init.)=0.524325 s t(norm)=0.0625044, mflops=79.9944 (err=1.7e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.0489 s, 524288 iters, t-(init.)=0.935118 s t(norm)=0.891798, mflops=5.60665 (err=1.7e-17) 19. GSL DIT: elapsed time t=1.57969 s, 524288 iters, t-(init.)=1.50167 s t(norm)=1.43211, mflops=3.49136 (err=1.7e-17) 20. GSL DIF: elapsed time t=1.56801 s, 524288 iters, t-(init.)=1.48687 s t(norm)=1.41799, mflops=3.52611 (err=1.7e-17) 21. Krukar: elapsed time t=1.09319 s, 2097152 iters, t-(init.)=0.706175 s t(norm)=0.168365, mflops=29.6973 (err=1.7e-17) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.79518 s, 262144 iters, t-(init.)=1.75618 s t(norm)=3.34964, mflops=1.4927 (err=1.7e-17) 27. Nielsen: elapsed time t=1.16851 s, 131072 iters, t-(init.)=1.15006 s t(norm)=4.38713, mflops=1.1397 (err=1.7e-17) 28. NR (C): elapsed time t=1.4168 s, 524288 iters, t-(init.)=1.34811 s t(norm)=1.28566, mflops=3.88905 (err=1.7e-17) 29. NR (F): elapsed time t=1.43845 s, 524288 iters, t-(init.)=1.36043 s t(norm)=1.29741, mflops=3.85384 (err=1.7e-17) 30. Ooura (C): elapsed time t=1.4948 s, 2097152 iters, t-(init.)=1.09531 s t(norm)=0.261141, mflops=19.1467 (err=1.7e-17) 31. Ooura (F): elapsed time t=1.4826 s, 2097152 iters, t-(init.)=1.20794 s t(norm)=0.287994, mflops=17.3615 (err=1.7e-17) 32. Skipping fft (QFT requires N >= 16). 33. Skipping fft (Ransom doesn't work for N=2). 34. Skipping fft (SCIPORT can't handle N < 4). 35. Singleton: elapsed time t=1.0335 s, 262144 iters, t-(init.)=0.98669 s t(norm)=1.88196, mflops=2.6568 (err=1.7e-17) 36. 
Singleton (f2c): elapsed time t=1.98022 s, 524288 iters, t-(init.)=1.91156 s t(norm)=1.82301, mflops=2.74272 (err=1.7e-17) 37. Sorensen: elapsed time t=1.85665 s, 2097152 iters, t-(init.)=1.58201 s t(norm)=0.37718, mflops=13.2563 (err=1.7e-17) 38. Sorensen DIT: elapsed time t=1.11632 s, 2097152 iters, t-(init.)=0.83708 s t(norm)=0.199575, mflops=25.0532 (err=1.7e-17) 39. Temperton: elapsed time t=1.11056 s, 262144 iters, t-(init.)=1.06063 s t(norm)=2.02299, mflops=2.47159 (err=1.7e-17) 40. Temperton (f2c): elapsed time t=1.36172 s, 262144 iters, t-(init.)=1.32739 s t(norm)=2.5318, mflops=1.97488 (err=1.7e-17) 41. Valkenburg: elapsed time t=1.40972 s, 524288 iters, t-(init.)=1.31607 s t(norm)=1.25511, mflops=3.98373 (err=1.7e-17) 42. SUNPERF: elapsed time t=1.59558 s, 1048576 iters, t-(init.)=1.45827 s t(norm)=0.695357, mflops=7.19055 (err=1.7e-17) Top mflops for N=2 = 79.9944 Normalized results and averages for N=2: fft 0: mflops = 52.9427 (norm. = 0.66183), norm. avg. (of 1) = 0.66183 fft 1: mflops = 39.4759 (norm. = 0.493483), norm. avg. (of 1) = 0.493483 fft 2: mflops = 11.6621 (norm. = 0.145786), norm. avg. (of 1) = 0.145786 fft 3: mflops = 0.905619 (norm. = 0.011321), norm. avg. (of 1) = 0.011321 fft 4: mflops = 4.01349 (norm. = 0.0501722), norm. avg. (of 1) = 0.0501722 fft 5: mflops = 4.62017 (norm. = 0.0577562), norm. avg. (of 1) = 0.0577562 fft 6: mflops = 4.72038 (norm. = 0.0590088), norm. avg. (of 1) = 0.0590088 fft 7: mflops = 5.215 (norm. = 0.065192), norm. avg. (of 1) = 0.065192 fft 8: mflops = 28.9183 (norm. = 0.361505), norm. avg. (of 1) = 0.361505 fft 9: mflops = 1.96252 (norm. = 0.0245332), norm. avg. (of 1) = 0.0245332 fft 10: mflops = 1.92095 (norm. = 0.0240135), norm. avg. (of 1) = 0.0240135 fft 11: mflops = -1 (norm. = -0.0125009), norm. avg. (of 0) = -1 fft 12: mflops = 7.38105 (norm. = 0.0922695), norm. avg. (of 1) = 0.0922695 fft 13: mflops = 5.93896 (norm. = 0.0742421), norm. avg. (of 1) = 0.0742421 fft 14: mflops = 28.138 (norm. 
= 0.351749), norm. avg. (of 1) = 0.351749 fft 15: mflops = 27.9245 (norm. = 0.349081), norm. avg. (of 1) = 0.349081 fft 16: mflops = 79.9944 (norm. = 1), norm. avg. (of 1) = 1 fft 17: mflops = -1 (norm. = -0.0125009), norm. avg. (of 0) = -1 fft 18: mflops = 5.60665 (norm. = 0.070088), norm. avg. (of 1) = 0.070088 fft 19: mflops = 3.49136 (norm. = 0.043645), norm. avg. (of 1) = 0.043645 fft 20: mflops = 3.52611 (norm. = 0.0440795), norm. avg. (of 1) = 0.0440795 fft 21: mflops = 29.6973 (norm. = 0.371243), norm. avg. (of 1) = 0.371243 fft 22: mflops = -1 (norm. = -0.0125009), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.0125009), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.0125009), norm. avg. (of 0) = -1 fft 25: mflops = -1 (norm. = -0.0125009), norm. avg. (of 0) = -1 fft 26: mflops = 1.4927 (norm. = 0.01866), norm. avg. (of 1) = 0.01866 fft 27: mflops = 1.1397 (norm. = 0.0142472), norm. avg. (of 1) = 0.0142472 fft 28: mflops = 3.88905 (norm. = 0.0486165), norm. avg. (of 1) = 0.0486165 fft 29: mflops = 3.85384 (norm. = 0.0481764), norm. avg. (of 1) = 0.0481764 fft 30: mflops = 19.1467 (norm. = 0.239351), norm. avg. (of 1) = 0.239351 fft 31: mflops = 17.3615 (norm. = 0.217033), norm. avg. (of 1) = 0.217033 fft 32: mflops = -1 (norm. = -0.0125009), norm. avg. (of 0) = -1 fft 33: mflops = -1 (norm. = -0.0125009), norm. avg. (of 0) = -1 fft 34: mflops = -1 (norm. = -0.0125009), norm. avg. (of 0) = -1 fft 35: mflops = 2.6568 (norm. = 0.0332124), norm. avg. (of 1) = 0.0332124 fft 36: mflops = 2.74272 (norm. = 0.0342864), norm. avg. (of 1) = 0.0342864 fft 37: mflops = 13.2563 (norm. = 0.165715), norm. avg. (of 1) = 0.165715 fft 38: mflops = 25.0532 (norm. = 0.313187), norm. avg. (of 1) = 0.313187 fft 39: mflops = 2.47159 (norm. = 0.030897), norm. avg. (of 1) = 0.030897 fft 40: mflops = 1.97488 (norm. = 0.0246877), norm. avg. (of 1) = 0.0246877 fft 41: mflops = 3.98373 (norm. = 0.0498001), norm. avg. (of 1) = 0.0498001 fft 42: mflops = 7.19055 (norm. 
= 0.0898882), norm. avg. (of 1) = 0.0898882 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.34116 s, 1048576 iters, t-(init.)=1.13517 s t(norm)=0.135323, mflops=36.9486 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.25747 s, 1048576 iters, t-(init.)=1.0702 s t(norm)=0.127578, mflops=39.1916 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.33437 s, 524288 iters, t-(init.)=1.24075 s t(norm)=0.295818, mflops=16.9023 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=2.01404 s, 262144 iters, t-(init.)=1.95943 s t(norm)=0.934328, mflops=5.35144 (err=1.3e-16) 4. Bailey: elapsed time t=1.48169 s, 262144 iters, t-(init.)=1.43332 s t(norm)=0.683462, mflops=7.3157 (err=1.3e-16) 5. Beauregard: elapsed time t=1.57064 s, 262144 iters, t-(init.)=1.50092 s t(norm)=0.715695, mflops=6.98622 (err=6.5e-17) 6. Bergland: elapsed time t=1.48559 s, 524288 iters, t-(init.)=1.38563 s t(norm)=0.33036, mflops=15.135 (err=5.3e-17) 7. Brenner: elapsed time t=1.90617 s, 524288 iters, t-(init.)=1.76606 s t(norm)=0.421062, mflops=11.8747 (err=5.3e-17) 8. Burrus: elapsed time t=1.17134 s, 524288 iters, t-(init.)=1.03713 s t(norm)=0.24727, mflops=20.2208 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.46454 s, 262144 iters, t-(init.)=1.41773 s t(norm)=0.676025, mflops=7.39618 10. CWP (best N) (N=15): elapsed time t=1.29516 s, 131072 iters, t-(init.)=1.24445 s t(norm)=1.1868, mflops=4.21302 11. Edelblute: elapsed time t=1.48689 s, 524288 iters, t-(init.)=1.38703 s t(norm)=0.330693, mflops=15.1198 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.67358 s, 1048576 iters, t-(init.)=1.40828 s t(norm)=0.16788, mflops=29.7831 (err=5.3e-17) 13. FFTPACK (f2c): elapsed time t=1.119 s, 524288 iters, t-(init.)=1.01914 s t(norm)=0.242983, mflops=20.5776 (err=5.3e-17) FFTW_MEASURE plan: (cost = 5.200664e-07) FFTW_NOTW 4 14. FFTW: elapsed time t=1.29346 s, 2097152 iters, t-(init.)=0.893965 s t(norm)=0.0532845, mflops=93.836 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 15. 
FFTW_ESTIMATE: elapsed time t=1.52685 s, 2097152 iters, t-(init.)=1.11488 s t(norm)=0.0664522, mflops=75.2421 (err=5.3e-17) 16. Frigo-old: elapsed time t=1.69783 s, 4194304 iters, t-(init.)=0.798534 s t(norm)=0.0237982, mflops=210.1 (err=5.3e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.35787 s, 524288 iters, t-(init.)=1.25177 s t(norm)=0.298445, mflops=16.7535 (err=5.3e-17) 19. GSL DIT: elapsed time t=1.49943 s, 262144 iters, t-(init.)=1.43516 s t(norm)=0.684337, mflops=7.30634 (err=6.5e-17) 20. GSL DIF: elapsed time t=1.50255 s, 262144 iters, t-(init.)=1.4417 s t(norm)=0.687456, mflops=7.2732 (err=6.5e-17) 21. Krukar: elapsed time t=1.59022 s, 2097152 iters, t-(init.)=1.17825 s t(norm)=0.0702291, mflops=71.1956 (err=5.3e-17) 22. Mayer (Buneman): elapsed time t=1.02964 s, 524288 iters, t-(init.)=0.939115 s t(norm)=0.223902, mflops=22.3312 (err=1.3e-16) 23. Mayer (simple): elapsed time t=1.04777 s, 524288 iters, t-(init.)=0.954122 s t(norm)=0.22748, mflops=21.9799 24. Mayer (lookup): elapsed time t=1.09895 s, 524288 iters, t-(init.)=0.995937 s t(norm)=0.23745, mflops=21.0571 (err=1.3e-16) 25. Monro: elapsed time t=1.98626 s, 262144 iters, t-(init.)=1.93789 s t(norm)=0.924056, mflops=5.41093 (err=1.3e-16) 26. NAPACK (f2c): elapsed time t=1.5657 s, 131072 iters, t-(init.)=1.53332 s t(norm)=1.46229, mflops=3.41929 (err=1.6e-16) 27. Nielsen: elapsed time t=1.92356 s, 262144 iters, t-(init.)=1.87519 s t(norm)=0.894158, mflops=5.59185 (err=1.3e-16) 28. NR (C): elapsed time t=1.378 s, 262144 iters, t-(init.)=1.33119 s t(norm)=0.634762, mflops=7.87696 (err=6.5e-17) 29. NR (F): elapsed time t=1.38578 s, 262144 iters, t-(init.)=1.32256 s t(norm)=0.630646, mflops=7.92838 (err=6.5e-17) 30. Ooura (C): elapsed time t=1.29195 s, 1048576 iters, t-(init.)=1.03292 s t(norm)=0.123133, mflops=40.6065 (err=5.3e-17) 31. Ooura (F): elapsed time t=1.29727 s, 1048576 iters, t-(init.)=1.11 s t(norm)=0.132323, mflops=37.7864 (err=5.3e-17) 32. 
Skipping fft (QFT requires N >= 16). 33. Ransom: elapsed time t=1.89381 s, 131072 iters, t-(init.)=1.86572 s t(norm)=1.77929, mflops=2.81011 (err=1.6e-16) 34. SCIPORT: elapsed time t=1.5167 s, 1048576 iters, t-(init.)=1.25187 s t(norm)=0.149235, mflops=33.5043 (err=6.5e-17) 35. Singleton: elapsed time t=1.02251 s, 262144 iters, t-(init.)=0.96946 s t(norm)=0.462275, mflops=10.8161 (err=5.3e-17) 36. Singleton (f2c): elapsed time t=1.89429 s, 524288 iters, t-(init.)=1.79602 s t(norm)=0.428204, mflops=11.6767 (err=5.3e-17) 37. Sorensen: elapsed time t=1.16284 s, 524288 iters, t-(init.)=1.0692 s t(norm)=0.254917, mflops=19.6143 (err=1.3e-16) 38. Sorensen DIT: elapsed time t=1.16494 s, 524288 iters, t-(init.)=1.0713 s t(norm)=0.255417, mflops=19.5758 (err=1.3e-16) 39. Temperton: elapsed time t=1.43441 s, 262144 iters, t-(init.)=1.383 s t(norm)=0.659465, mflops=7.58191 (err=5.3e-17) 40. Temperton (f2c): elapsed time t=1.61233 s, 262144 iters, t-(init.)=1.56552 s t(norm)=0.746497, mflops=6.69795 (err=5.3e-17) 41. Valkenburg: elapsed time t=1.3688 s, 131072 iters, t-(init.)=1.33915 s t(norm)=1.27712, mflops=3.91507 (err=1.6e-16) 42. SUNPERF: elapsed time t=1.60385 s, 1048576 iters, t-(init.)=1.41653 s t(norm)=0.168863, mflops=29.6097 (err=5.3e-17) Top mflops for N=4 = 210.1 Normalized results and averages for N=4: fft 0: mflops = 36.9486 (norm. = 0.175862), norm. avg. (of 2) = 0.418846 fft 1: mflops = 39.1916 (norm. = 0.186538), norm. avg. (of 2) = 0.340011 fft 2: mflops = 16.9023 (norm. = 0.0804486), norm. avg. (of 2) = 0.113117 fft 3: mflops = 5.35144 (norm. = 0.0254709), norm. avg. (of 2) = 0.018396 fft 4: mflops = 7.3157 (norm. = 0.03482), norm. avg. (of 2) = 0.0424961 fft 5: mflops = 6.98622 (norm. = 0.0332519), norm. avg. (of 2) = 0.045504 fft 6: mflops = 15.135 (norm. = 0.072037), norm. avg. (of 2) = 0.0655229 fft 7: mflops = 11.8747 (norm. = 0.0565194), norm. avg. (of 2) = 0.0608557 fft 8: mflops = 20.2208 (norm. = 0.0962436), norm. avg. 
(of 2) = 0.228874 fft 9: mflops = 7.39618 (norm. = 0.0352031), norm. avg. (of 2) = 0.0298681 fft 10: mflops = 4.21302 (norm. = 0.0200525), norm. avg. (of 2) = 0.022033 fft 11: mflops = 15.1198 (norm. = 0.0719646), norm. avg. (of 1) = 0.0719646 fft 12: mflops = 29.7831 (norm. = 0.141757), norm. avg. (of 2) = 0.117013 fft 13: mflops = 20.5776 (norm. = 0.0979419), norm. avg. (of 2) = 0.086092 fft 14: mflops = 93.836 (norm. = 0.446625), norm. avg. (of 2) = 0.399187 fft 15: mflops = 75.2421 (norm. = 0.358125), norm. avg. (of 2) = 0.353603 fft 16: mflops = 210.1 (norm. = 1), norm. avg. (of 2) = 1 fft 17: mflops = -1 (norm. = -0.00475963), norm. avg. (of 0) = -1 fft 18: mflops = 16.7535 (norm. = 0.0797405), norm. avg. (of 2) = 0.0749143 fft 19: mflops = 7.30634 (norm. = 0.0347755), norm. avg. (of 2) = 0.0392103 fft 20: mflops = 7.2732 (norm. = 0.0346178), norm. avg. (of 2) = 0.0393486 fft 21: mflops = 71.1956 (norm. = 0.338865), norm. avg. (of 2) = 0.355054 fft 22: mflops = 22.3312 (norm. = 0.106288), norm. avg. (of 1) = 0.106288 fft 23: mflops = 21.9799 (norm. = 0.104616), norm. avg. (of 1) = 0.104616 fft 24: mflops = 21.0571 (norm. = 0.100224), norm. avg. (of 1) = 0.100224 fft 25: mflops = 5.41093 (norm. = 0.025754), norm. avg. (of 1) = 0.025754 fft 26: mflops = 3.41929 (norm. = 0.0162746), norm. avg. (of 2) = 0.0174673 fft 27: mflops = 5.59185 (norm. = 0.0266152), norm. avg. (of 2) = 0.0204312 fft 28: mflops = 7.87696 (norm. = 0.0374915), norm. avg. (of 2) = 0.043054 fft 29: mflops = 7.92838 (norm. = 0.0377362), norm. avg. (of 2) = 0.0429563 fft 30: mflops = 40.6065 (norm. = 0.193272), norm. avg. (of 2) = 0.216311 fft 31: mflops = 37.7864 (norm. = 0.179849), norm. avg. (of 2) = 0.198441 fft 32: mflops = -1 (norm. = -0.00475963), norm. avg. (of 0) = -1 fft 33: mflops = 2.81011 (norm. = 0.0133751), norm. avg. (of 1) = 0.0133751 fft 34: mflops = 33.5043 (norm. = 0.159468), norm. avg. (of 1) = 0.159468 fft 35: mflops = 10.8161 (norm. = 0.0514806), norm. avg. 
(of 2) = 0.0423465 fft 36: mflops = 11.6767 (norm. = 0.0555767), norm. avg. (of 2) = 0.0449315 fft 37: mflops = 19.6143 (norm. = 0.0933567), norm. avg. (of 2) = 0.129536 fft 38: mflops = 19.5758 (norm. = 0.0931739), norm. avg. (of 2) = 0.20318 fft 39: mflops = 7.58191 (norm. = 0.0360871), norm. avg. (of 2) = 0.0334921 fft 40: mflops = 6.69795 (norm. = 0.0318798), norm. avg. (of 2) = 0.0282837 fft 41: mflops = 3.91507 (norm. = 0.0186343), norm. avg. (of 2) = 0.0342172 fft 42: mflops = 29.6097 (norm. = 0.140931), norm. avg. (of 2) = 0.11541 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.91315 s, 1048576 iters, t-(init.)=1.63851 s t(norm)=0.0651085, mflops=76.7949 (err=1.2e-16) 1. Arndt DIT: elapsed time t=1.93037 s, 1048576 iters, t-(init.)=1.68436 s t(norm)=0.0669306, mflops=74.7043 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.62844 s, 262144 iters, t-(init.)=1.5629 s t(norm)=0.248417, mflops=20.1275 (err=1.1e-16) 3. Arndt 4-step: elapsed time t=1.53851 s, 65536 iters, t-(init.)=1.52135 s t(norm)=0.967246, mflops=5.16931 (err=1.3e-16) 4. Bailey: elapsed time t=1.26404 s, 131072 iters, t-(init.)=1.23359 s t(norm)=0.392148, mflops=12.7503 (err=9.8e-17) 5. Beauregard: elapsed time t=1.45815 s, 65536 iters, t-(init.)=1.43902 s t(norm)=0.914907, mflops=5.46504 (err=1.2e-16) 6. Bergland: elapsed time t=1.3625 s, 262144 iters, t-(init.)=1.30009 s t(norm)=0.206644, mflops=24.1962 (err=1.3e-16) 7. Brenner: elapsed time t=1.97292 s, 262144 iters, t-(init.)=1.89803 s t(norm)=0.301684, mflops=16.5737 (err=1.2e-16) 8. Burrus: elapsed time t=1.09995 s, 131072 iters, t-(init.)=1.06247 s t(norm)=0.337751, mflops=14.8038 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.73937 s, 262144 iters, t-(init.)=1.67696 s t(norm)=0.266545, mflops=18.7585 10. CWP (best N) (N=15): elapsed time t=1.2885 s, 131072 iters, t-(init.)=1.23779 s t(norm)=0.393481, mflops=12.7071 11. 
Edelblute: elapsed time t=1.29156 s, 131072 iters, t-(init.)=1.25141 s t(norm)=0.397811, mflops=12.5688 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.66045 s, 524288 iters, t-(init.)=1.50259 s t(norm)=0.119415, mflops=41.8709 (err=1.2e-16) 13. FFTPACK (f2c): elapsed time t=1.09651 s, 262144 iters, t-(init.)=1.02942 s t(norm)=0.163622, mflops=30.5583 (err=1.2e-16) FFTW_MEASURE plan: (cost = 8.481875e-07) FFTW_NOTW 8 14. FFTW: elapsed time t=1.02287 s, 1048576 iters, t-(init.)=0.766949 s t(norm)=0.0304758, mflops=164.064 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.07672 s, 1048576 iters, t-(init.)=0.801408 s t(norm)=0.0318451, mflops=157.01 (err=1.2e-16) 16. Frigo-old: elapsed time t=1.55791 s, 2097152 iters, t-(init.)=1.05855 s t(norm)=0.0210316, mflops=237.738 (err=1.4e-16) 17. Green: elapsed time t=1.21183 s, 524288 iters, t-(init.)=1.08385 s t(norm)=0.086137, mflops=58.0471 (err=1.4e-16) 18. GSL: elapsed time t=1.27734 s, 262144 iters, t-(init.)=1.20244 s t(norm)=0.191123, mflops=26.1611 (err=1.4e-16) 19. GSL DIT: elapsed time t=1.27238 s, 131072 iters, t-(init.)=1.23571 s t(norm)=0.392822, mflops=12.7284 (err=1.2e-16) 20. GSL DIF: elapsed time t=1.22439 s, 131072 iters, t-(init.)=1.18777 s t(norm)=0.377583, mflops=13.2421 (err=1.4e-16) 21. Krukar: elapsed time t=1.61175 s, 1048576 iters, t-(init.)=1.33087 s t(norm)=0.052884, mflops=94.5466 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.7077 s, 524288 iters, t-(init.)=1.58597 s t(norm)=0.126042, mflops=39.6694 (err=1.2e-16) 23. Mayer (simple): elapsed time t=1.83033 s, 524288 iters, t-(init.)=1.69108 s t(norm)=0.134395, mflops=37.2038 24. Mayer (lookup): elapsed time t=1.86052 s, 524288 iters, t-(init.)=1.72943 s t(norm)=0.137443, mflops=36.3788 (err=1.2e-16) 25. Monro: elapsed time t=1.23156 s, 131072 iters, t-(init.)=1.20032 s t(norm)=0.381573, mflops=13.1037 (err=1.1e-08) 26. 
NAPACK (f2c): elapsed time t=1.2372 s, 65536 iters, t-(init.)=1.22003 s t(norm)=0.775676, mflops=6.44599 (err=1.7e-16) 27. Nielsen: elapsed time t=1.99314 s, 262144 iters, t-(init.)=1.92917 s t(norm)=0.306633, mflops=16.3061 (err=6.9e-16) 28. NR (C): elapsed time t=1.16655 s, 131072 iters, t-(init.)=1.13534 s t(norm)=0.360916, mflops=13.8536 (err=1.5e-16) 29. NR (F): elapsed time t=1.1385 s, 131072 iters, t-(init.)=1.10181 s t(norm)=0.350254, mflops=14.2753 (err=1.5e-16) 30. Ooura (C): elapsed time t=1.25199 s, 524288 iters, t-(init.)=1.11466 s t(norm)=0.0885853, mflops=56.4427 (err=1.3e-16) 31. Ooura (F): elapsed time t=1.26497 s, 524288 iters, t-(init.)=1.14012 s t(norm)=0.0906086, mflops=55.1824 (err=1.3e-16) 32. Skipping fft (QFT requires N >= 16). 33. Ransom: elapsed time t=1.48963 s, 32768 iters, t-(init.)=1.48066 s t(norm)=1.88275, mflops=2.65569 (err=3.4e-16) 34. SCIPORT: elapsed time t=1.56236 s, 524288 iters, t-(init.)=1.40858 s t(norm)=0.111944, mflops=44.6653 (err=1.4e-16) 35. Singleton: elapsed time t=1.43678 s, 131072 iters, t-(init.)=1.40011 s t(norm)=0.445084, mflops=11.2338 (err=1.4e-16) 36. Singleton (f2c): elapsed time t=1.41406 s, 131072 iters, t-(init.)=1.38285 s t(norm)=0.439597, mflops=11.3741 (err=1.4e-16) 37. Sorensen: elapsed time t=1.01298 s, 262144 iters, t-(init.)=0.949661 s t(norm)=0.150945, mflops=33.1248 (err=1.5e-16) 38. Sorensen DIT: elapsed time t=1.09601 s, 131072 iters, t-(init.)=1.064 s t(norm)=0.338236, mflops=14.7826 (err=1.1e-16) 39. Temperton: elapsed time t=1.01018 s, 131072 iters, t-(init.)=0.977413 s t(norm)=0.310711, mflops=16.0921 (err=4.6e-09) 40. Temperton (f2c): elapsed time t=1.25856 s, 131072 iters, t-(init.)=1.22735 s t(norm)=0.390164, mflops=12.8151 (err=1.4e-16) 41. Valkenburg: elapsed time t=1.01245 s, 32768 iters, t-(init.)=1.00251 s t(norm)=1.27475, mflops=3.92233 (err=1.5e-16) 42. 
SUNPERF: elapsed time t=1.3249 s, 524288 iters, t-(init.)=1.20317 s t(norm)=0.0956193, mflops=52.2907 (err=1.2e-16) Top mflops for N=8 = 237.738 Normalized results and averages for N=8: fft 0: mflops = 76.7949 (norm. = 0.323023), norm. avg. (of 3) = 0.386905 fft 1: mflops = 74.7043 (norm. = 0.314229), norm. avg. (of 3) = 0.331417 fft 2: mflops = 20.1275 (norm. = 0.0846624), norm. avg. (of 3) = 0.103632 fft 3: mflops = 5.16931 (norm. = 0.0217437), norm. avg. (of 3) = 0.0195119 fft 4: mflops = 12.7503 (norm. = 0.0536316), norm. avg. (of 3) = 0.046208 fft 5: mflops = 5.46504 (norm. = 0.0229876), norm. avg. (of 3) = 0.0379986 fft 6: mflops = 24.1962 (norm. = 0.101777), norm. avg. (of 3) = 0.0776075 fft 7: mflops = 16.5737 (norm. = 0.069714), norm. avg. (of 3) = 0.0638084 fft 8: mflops = 14.8038 (norm. = 0.0622693), norm. avg. (of 3) = 0.173339 fft 9: mflops = 18.7585 (norm. = 0.0789043), norm. avg. (of 3) = 0.0462135 fft 10: mflops = 12.7071 (norm. = 0.0534499), norm. avg. (of 3) = 0.0325053 fft 11: mflops = 12.5688 (norm. = 0.0528682), norm. avg. (of 2) = 0.0624164 fft 12: mflops = 41.8709 (norm. = 0.176122), norm. avg. (of 3) = 0.136716 fft 13: mflops = 30.5583 (norm. = 0.128538), norm. avg. (of 3) = 0.100241 fft 14: mflops = 164.064 (norm. = 0.690106), norm. avg. (of 3) = 0.49616 fft 15: mflops = 157.01 (norm. = 0.660433), norm. avg. (of 3) = 0.45588 fft 16: mflops = 237.738 (norm. = 1), norm. avg. (of 3) = 1 fft 17: mflops = 58.0471 (norm. = 0.244164), norm. avg. (of 1) = 0.244164 fft 18: mflops = 26.1611 (norm. = 0.110042), norm. avg. (of 3) = 0.0866234 fft 19: mflops = 12.7284 (norm. = 0.0535396), norm. avg. (of 3) = 0.0439867 fft 20: mflops = 13.2421 (norm. = 0.0557005), norm. avg. (of 3) = 0.0447992 fft 21: mflops = 94.5466 (norm. = 0.397692), norm. avg. (of 3) = 0.369267 fft 22: mflops = 39.6694 (norm. = 0.166862), norm. avg. (of 2) = 0.136575 fft 23: mflops = 37.2038 (norm. = 0.156491), norm. avg. (of 2) = 0.130554 fft 24: mflops = 36.3788 (norm. 
= 0.153021), norm. avg. (of 2) = 0.126622 fft 25: mflops = 13.1037 (norm. = 0.0551181), norm. avg. (of 2) = 0.0404361 fft 26: mflops = 6.44599 (norm. = 0.0271138), norm. avg. (of 3) = 0.0206828 fft 27: mflops = 16.3061 (norm. = 0.0685887), norm. avg. (of 3) = 0.0364837 fft 28: mflops = 13.8536 (norm. = 0.0582727), norm. avg. (of 3) = 0.0481269 fft 29: mflops = 14.2753 (norm. = 0.0600465), norm. avg. (of 3) = 0.048653 fft 30: mflops = 56.4427 (norm. = 0.237416), norm. avg. (of 3) = 0.223346 fft 31: mflops = 55.1824 (norm. = 0.232114), norm. avg. (of 3) = 0.209666 fft 32: mflops = -1 (norm. = -0.00420631), norm. avg. (of 0) = -1 fft 33: mflops = 2.65569 (norm. = 0.0111707), norm. avg. (of 2) = 0.0122729 fft 34: mflops = 44.6653 (norm. = 0.187876), norm. avg. (of 2) = 0.173672 fft 35: mflops = 11.2338 (norm. = 0.047253), norm. avg. (of 3) = 0.043982 fft 36: mflops = 11.3741 (norm. = 0.0478428), norm. avg. (of 3) = 0.045902 fft 37: mflops = 33.1248 (norm. = 0.139333), norm. avg. (of 3) = 0.132802 fft 38: mflops = 14.7826 (norm. = 0.0621801), norm. avg. (of 3) = 0.15618 fft 39: mflops = 16.0921 (norm. = 0.0676884), norm. avg. (of 3) = 0.0448909 fft 40: mflops = 12.8151 (norm. = 0.0539044), norm. avg. (of 3) = 0.036824 fft 41: mflops = 3.92233 (norm. = 0.0164986), norm. avg. (of 3) = 0.028311 fft 42: mflops = 52.2907 (norm. = 0.219951), norm. avg. (of 3) = 0.150257 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.24461 s, 131072 iters, t-(init.)=1.18688 s t(norm)=0.141487, mflops=35.339 (err=1.5e-16) 1. Arndt DIT: elapsed time t=1.15569 s, 131072 iters, t-(init.)=1.10888 s t(norm)=0.132188, mflops=37.8248 (err=2.2e-16) 2. Arndt Split-Radix: elapsed time t=1.87329 s, 131072 iters, t-(init.)=1.8257 s t(norm)=0.217641, mflops=22.9737 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.903 s, 65536 iters, t-(init.)=1.87728 s t(norm)=0.447578, mflops=11.1712 (err=2.0e-16) 4. 
Bailey: elapsed time t=1.04364 s, 65536 iters, t-(init.)=1.02024 s t(norm)=0.243244, mflops=20.5555 (err=2.0e-16) 5. Beauregard: elapsed time t=1.6572 s, 32768 iters, t-(init.)=1.64472 s t(norm)=0.784264, mflops=6.3754 (err=2.7e-16) 6. Bergland: elapsed time t=1.29462 s, 131072 iters, t-(init.)=1.24781 s t(norm)=0.148751, mflops=33.6133 (err=2.6e-16) 7. Brenner: elapsed time t=1.74026 s, 131072 iters, t-(init.)=1.69111 s t(norm)=0.201596, mflops=24.8021 (err=2.1e-16) 8. Burrus: elapsed time t=1.55203 s, 65536 iters, t-(init.)=1.52628 s t(norm)=0.363893, mflops=13.7403 (err=1.4e-16) 9. CWP (min N): elapsed time t=1.07768 s, 65536 iters, t-(init.)=1.05386 s t(norm)=0.25126, mflops=19.8997 10. CWP (best N) (N=28): elapsed time t=1.55374 s, 131072 iters, t-(init.)=1.47806 s t(norm)=0.176198, mflops=28.3771 11. Edelblute: elapsed time t=1.74023 s, 65536 iters, t-(init.)=1.71604 s t(norm)=0.409136, mflops=12.2209 (err=1.4e-16) 12. FFTPACK: elapsed time t=1.24684 s, 262144 iters, t-(init.)=1.14227 s t(norm)=0.0680846, mflops=73.438 (err=1.8e-16) 13. FFTPACK (f2c): elapsed time t=1.65817 s, 262144 iters, t-(init.)=1.55516 s t(norm)=0.0926948, mflops=53.9405 (err=1.8e-16) FFTW_MEASURE plan: (cost = 1.784156e-06) FFTW_NOTW 16 14. FFTW: elapsed time t=1.7977 s, 1048576 iters, t-(init.)=1.4107 s t(norm)=0.021021, mflops=237.857 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.89152 s, 1048576 iters, t-(init.)=1.46705 s t(norm)=0.0218608, mflops=228.72 (err=1.8e-16) 16. Frigo-old: elapsed time t=1.60996 s, 1048576 iters, t-(init.)=1.2292 s t(norm)=0.0183165, mflops=272.978 (err=1.8e-16) 17. Green: elapsed time t=1.32254 s, 262144 iters, t-(init.)=1.2289 s t(norm)=0.0732479, mflops=68.2614 (err=1.9e-16) 18. GSL: elapsed time t=1.1021 s, 131072 iters, t-(init.)=1.05295 s t(norm)=0.125522, mflops=39.8338 (err=1.8e-16) 19. 
GSL DIT: elapsed time t=1.07476 s, 65536 iters, t-(init.)=1.04979 s t(norm)=0.250289, mflops=19.9769 (err=2.1e-16) 20. GSL DIF: elapsed time t=1.01495 s, 65536 iters, t-(init.)=0.988428 s t(norm)=0.23566, mflops=21.217 (err=2.8e-16) 21. Krukar: elapsed time t=1.77859 s, 524288 iters, t-(init.)=1.56827 s t(norm)=0.046738, mflops=106.979 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.22443 s, 131072 iters, t-(init.)=1.17762 s t(norm)=0.140383, mflops=35.6169 (err=1.7e-16) 23. Mayer (simple): elapsed time t=1.10209 s, 131072 iters, t-(init.)=1.05605 s t(norm)=0.125892, mflops=39.7167 24. Mayer (lookup): elapsed time t=1.05295 s, 131072 iters, t-(init.)=0.996782 s t(norm)=0.118826, mflops=42.0785 (err=1.8e-16) 25. Monro: elapsed time t=1.91596 s, 131072 iters, t-(init.)=1.86913 s t(norm)=0.222818, mflops=22.4399 (err=2.1e-08) 26. NAPACK (f2c): elapsed time t=1.00213 s, 32768 iters, t-(init.)=0.9877 s t(norm)=0.470972, mflops=10.6163 (err=3.3e-16) 27. Nielsen: elapsed time t=1.43968 s, 65536 iters, t-(init.)=1.4155 s t(norm)=0.337481, mflops=14.8157 (err=1.6e-16) 28. NR (C): elapsed time t=1.93209 s, 131072 iters, t-(init.)=1.88528 s t(norm)=0.224743, mflops=22.2477 (err=2.1e-16) 29. NR (F): elapsed time t=1.98349 s, 131072 iters, t-(init.)=1.93433 s t(norm)=0.230591, mflops=21.6834 (err=2.1e-16) 30. Ooura (C): elapsed time t=1.27069 s, 262144 iters, t-(init.)=1.17235 s t(norm)=0.0698776, mflops=71.5536 (err=2.0e-16) 31. Ooura (F): elapsed time t=1.31184 s, 262144 iters, t-(init.)=1.21822 s t(norm)=0.0726118, mflops=68.8593 (err=2.0e-16) 32. QFT: elapsed time t=1.79304 s, 262144 iters, t-(init.)=1.70095 s t(norm)=0.101385, mflops=49.3171 (err=1.6e-16) 33. Ransom: elapsed time t=1.86984 s, 65536 iters, t-(init.)=1.84331 s t(norm)=0.439479, mflops=11.3771 (err=3.4e-16) 34. SCIPORT: elapsed time t=1.59895 s, 262144 iters, t-(init.)=1.49895 s t(norm)=0.0893443, mflops=55.9633 (err=2.8e-16) 35. 
Singleton: elapsed time t=1.35402 s, 131072 iters, t-(init.)=1.30209 s t(norm)=0.155221, mflops=32.2121 (err=1.7e-16) 36. Singleton (f2c): elapsed time t=1.27258 s, 131072 iters, t-(init.)=1.22575 s t(norm)=0.14612, mflops=34.2183 (err=1.7e-16) 37. Sorensen: elapsed time t=1.91802 s, 262144 iters, t-(init.)=1.82596 s t(norm)=0.108836, mflops=45.9407 (err=1.5e-16) 38. Sorensen DIT: elapsed time t=1.56483 s, 65536 iters, t-(init.)=1.54104 s t(norm)=0.367412, mflops=13.6087 (err=1.6e-16) 39. Temperton: elapsed time t=1.60095 s, 131072 iters, t-(init.)=1.54565 s t(norm)=0.184256, mflops=27.1361 (err=1.7e-08) 40. Temperton (f2c): elapsed time t=1.866 s, 131072 iters, t-(init.)=1.81997 s t(norm)=0.216957, mflops=23.046 (err=1.8e-16) 41. Valkenburg: elapsed time t=1.31013 s, 16384 iters, t-(init.)=1.30379 s t(norm)=1.24339, mflops=4.02127 (err=2.9e-16) 42. SUNPERF: elapsed time t=1.83859 s, 524288 iters, t-(init.)=1.6482 s t(norm)=0.0491201, mflops=101.791 (err=1.8e-16) Top mflops for N=16 = 272.978 Normalized results and averages for N=16: fft 0: mflops = 35.339 (norm. = 0.129457), norm. avg. (of 4) = 0.322543 fft 1: mflops = 37.8248 (norm. = 0.138564), norm. avg. (of 4) = 0.283204 fft 2: mflops = 22.9737 (norm. = 0.0841594), norm. avg. (of 4) = 0.0987641 fft 3: mflops = 11.1712 (norm. = 0.0409235), norm. avg. (of 4) = 0.0248648 fft 4: mflops = 20.5555 (norm. = 0.0753009), norm. avg. (of 4) = 0.0534812 fft 5: mflops = 6.3754 (norm. = 0.023355), norm. avg. (of 4) = 0.0343377 fft 6: mflops = 33.6133 (norm. = 0.123135), norm. avg. (of 4) = 0.0889895 fft 7: mflops = 24.8021 (norm. = 0.0908573), norm. avg. (of 4) = 0.0705707 fft 8: mflops = 13.7403 (norm. = 0.0503348), norm. avg. (of 4) = 0.142588 fft 9: mflops = 19.8997 (norm. = 0.0728985), norm. avg. (of 4) = 0.0528847 fft 10: mflops = 28.3771 (norm. = 0.103954), norm. avg. (of 4) = 0.0503674 fft 11: mflops = 12.2209 (norm. = 0.0447687), norm. avg. (of 3) = 0.0565338 fft 12: mflops = 73.438 (norm. = 0.269025), norm. avg. 
(of 4) = 0.169793 fft 13: mflops = 53.9405 (norm. = 0.1976), norm. avg. (of 4) = 0.12458 fft 14: mflops = 237.857 (norm. = 0.871342), norm. avg. (of 4) = 0.589956 fft 15: mflops = 228.72 (norm. = 0.837869), norm. avg. (of 4) = 0.551377 fft 16: mflops = 272.978 (norm. = 1), norm. avg. (of 4) = 1 fft 17: mflops = 68.2614 (norm. = 0.250062), norm. avg. (of 2) = 0.247113 fft 18: mflops = 39.8338 (norm. = 0.145923), norm. avg. (of 4) = 0.101448 fft 19: mflops = 19.9769 (norm. = 0.0731812), norm. avg. (of 4) = 0.0512853 fft 20: mflops = 21.217 (norm. = 0.0777243), norm. avg. (of 4) = 0.0530305 fft 21: mflops = 106.979 (norm. = 0.391897), norm. avg. (of 4) = 0.374924 fft 22: mflops = 35.6169 (norm. = 0.130475), norm. avg. (of 3) = 0.134542 fft 23: mflops = 39.7167 (norm. = 0.145494), norm. avg. (of 3) = 0.135534 fft 24: mflops = 42.0785 (norm. = 0.154146), norm. avg. (of 3) = 0.135797 fft 25: mflops = 22.4399 (norm. = 0.082204), norm. avg. (of 3) = 0.0543587 fft 26: mflops = 10.6163 (norm. = 0.0388908), norm. avg. (of 4) = 0.0252348 fft 27: mflops = 14.8157 (norm. = 0.0542741), norm. avg. (of 4) = 0.0409313 fft 28: mflops = 22.2477 (norm. = 0.0814998), norm. avg. (of 4) = 0.0564701 fft 29: mflops = 21.6834 (norm. = 0.0794329), norm. avg. (of 4) = 0.056348 fft 30: mflops = 71.5536 (norm. = 0.262122), norm. avg. (of 4) = 0.23304 fft 31: mflops = 68.8593 (norm. = 0.252252), norm. avg. (of 4) = 0.220312 fft 32: mflops = 49.3171 (norm. = 0.180663), norm. avg. (of 1) = 0.180663 fft 33: mflops = 11.3771 (norm. = 0.0416777), norm. avg. (of 3) = 0.0220745 fft 34: mflops = 55.9633 (norm. = 0.20501), norm. avg. (of 3) = 0.184118 fft 35: mflops = 32.2121 (norm. = 0.118002), norm. avg. (of 4) = 0.0624871 fft 36: mflops = 34.2183 (norm. = 0.125352), norm. avg. (of 4) = 0.0657645 fft 37: mflops = 45.9407 (norm. = 0.168295), norm. avg. (of 4) = 0.141675 fft 38: mflops = 13.6087 (norm. = 0.0498527), norm. avg. (of 4) = 0.129598 fft 39: mflops = 27.1361 (norm. = 0.0994077), norm. avg. 
(of 4) = 0.0585201 fft 40: mflops = 23.046 (norm. = 0.0844245), norm. avg. (of 4) = 0.0487241 fft 41: mflops = 4.02127 (norm. = 0.0147311), norm. avg. (of 4) = 0.024916 fft 42: mflops = 101.791 (norm. = 0.372892), norm. avg. (of 4) = 0.205916 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.25186 s, 65536 iters, t-(init.)=1.21129 s t(norm)=0.115518, mflops=43.2835 (err=3.1e-16) 1. Arndt DIT: elapsed time t=1.21572 s, 65536 iters, t-(init.)=1.17632 s t(norm)=0.112183, mflops=44.5701 (err=2.5e-16) 2. Arndt Split-Radix: elapsed time t=1.0259 s, 32768 iters, t-(init.)=1.0064 s t(norm)=0.191955, mflops=26.0478 (err=2.7e-16) 3. Arndt 4-step: elapsed time t=1.13992 s, 16384 iters, t-(init.)=1.12968 s t(norm)=0.430937, mflops=11.6026 (err=2.8e-16) 4. Bailey: elapsed time t=1.63662 s, 65536 iters, t-(init.)=1.59761 s t(norm)=0.15236, mflops=32.817 (err=2.7e-16) 5. Beauregard: elapsed time t=1.02503 s, 8192 iters, t-(init.)=1.01976 s t(norm)=0.778014, mflops=6.42662 (err=1.8e-16) 6. Bergland: elapsed time t=1.16385 s, 65536 iters, t-(init.)=1.12484 s t(norm)=0.107274, mflops=46.6098 (err=2.6e-16) 7. Brenner: elapsed time t=1.6627 s, 65536 iters, t-(init.)=1.62252 s t(norm)=0.154735, mflops=32.3132 (err=2.2e-16) 8. Burrus: elapsed time t=1.83223 s, 32768 iters, t-(init.)=1.81192 s t(norm)=0.345596, mflops=14.4678 (err=2.9e-16) 9. CWP (min N) (N=33): elapsed time t=1.14846 s, 65536 iters, t-(init.)=1.10789 s t(norm)=0.105657, mflops=47.323 10. CWP (best N) (N=35): elapsed time t=2.0011 s, 131072 iters, t-(init.)=1.9124 s t(norm)=0.0911903, mflops=54.8304 11. Edelblute: elapsed time t=1.01983 s, 16384 iters, t-(init.)=1.00979 s t(norm)=0.385203, mflops=12.9802 (err=2.9e-16) 12. FFTPACK: elapsed time t=1.34522 s, 131072 iters, t-(init.)=1.26172 s t(norm)=0.0601634, mflops=83.107 (err=1.9e-16) 13. 
FFTPACK (f2c): elapsed time t=1.00323 s, 65536 iters, t-(init.)=0.961885 s t(norm)=0.0917325, mflops=54.5063 (err=1.9e-16) FFTW_MEASURE plan: (cost = 5.353937e-06) FFTW_TWIDDLE 4 FFTW_NOTW 8 14. FFTW: elapsed time t=1.54384 s, 262144 iters, t-(init.)=1.38621 s t(norm)=0.0330499, mflops=151.287 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.10693 s, 262144 iters, t-(init.)=0.932681 s t(norm)=0.0222369, mflops=224.852 (err=2.1e-16) 16. Frigo-old: elapsed time t=1.11892 s, 262144 iters, t-(init.)=0.961299 s t(norm)=0.0229191, mflops=218.158 (err=2.2e-16) 17. Green: elapsed time t=1.27878 s, 131072 iters, t-(init.)=1.19997 s t(norm)=0.0572193, mflops=87.3832 (err=2.0e-16) 18. GSL: elapsed time t=1.48377 s, 65536 iters, t-(init.)=1.44359 s t(norm)=0.137672, mflops=36.3182 (err=2.0e-16) 19. GSL DIT: elapsed time t=1.87425 s, 65536 iters, t-(init.)=1.83407 s t(norm)=0.174911, mflops=28.586 (err=2.2e-16) 20. GSL DIF: elapsed time t=1.74799 s, 65536 iters, t-(init.)=1.70781 s t(norm)=0.16287, mflops=30.6994 (err=2.5e-16) 21. Krukar: elapsed time t=1.01706 s, 131072 iters, t-(init.)=0.933059 s t(norm)=0.0444917, mflops=112.38 (err=2.2e-16) 22. Mayer (Buneman): elapsed time t=1.31336 s, 65536 iters, t-(init.)=1.27435 s t(norm)=0.121532, mflops=41.1415 (err=2.7e-16) 23. Mayer (simple): elapsed time t=1.96459 s, 131072 iters, t-(init.)=1.88657 s t(norm)=0.0899588, mflops=55.581 24. Mayer (lookup): elapsed time t=1.97753 s, 131072 iters, t-(init.)=1.89717 s t(norm)=0.0904643, mflops=55.2704 (err=2.5e-16) 25. Monro: elapsed time t=1.49047 s, 65536 iters, t-(init.)=1.45146 s t(norm)=0.138422, mflops=36.1214 (err=3.7e-08) 26. NAPACK (f2c): elapsed time t=1.79932 s, 32768 iters, t-(init.)=1.77845 s t(norm)=0.339213, mflops=14.74 (err=5.4e-16) 27. Nielsen: elapsed time t=1.05828 s, 32768 iters, t-(init.)=1.03858 s t(norm)=0.198093, mflops=25.2406 (err=1.1e-15) 28. 
NR (C): elapsed time t=1.71071 s, 65536 iters, t-(init.)=1.6717 s t(norm)=0.159426, mflops=31.3626 (err=2.0e-16) 29. NR (F): elapsed time t=1.6768 s, 65536 iters, t-(init.)=1.63662 s t(norm)=0.156081, mflops=32.0347 (err=2.0e-16) 30. Ooura (C): elapsed time t=1.35905 s, 131072 iters, t-(init.)=1.27557 s t(norm)=0.0608241, mflops=82.2043 (err=2.7e-16) 31. Ooura (F): elapsed time t=1.41633 s, 131072 iters, t-(init.)=1.33831 s t(norm)=0.0638157, mflops=78.3506 (err=2.7e-16) 32. QFT: elapsed time t=1.19823 s, 65536 iters, t-(init.)=1.15883 s t(norm)=0.110515, mflops=45.2429 (err=2.6e-16) 33. Ransom: elapsed time t=1.35851 s, 16384 iters, t-(init.)=1.34856 s t(norm)=0.514434, mflops=9.71942 (err=7.0e-16) 34. SCIPORT: elapsed time t=1.69259 s, 131072 iters, t-(init.)=1.60911 s t(norm)=0.0767282, mflops=65.165 (err=1.8e-16) 35. Singleton: elapsed time t=1.18766 s, 65536 iters, t-(init.)=1.14553 s t(norm)=0.109246, mflops=45.7683 (err=2.2e-16) 36. Singleton (f2c): elapsed time t=1.16983 s, 65536 iters, t-(init.)=1.13043 s t(norm)=0.107806, mflops=46.3796 (err=2.2e-16) 37. Sorensen: elapsed time t=1.48439 s, 131072 iters, t-(init.)=1.40637 s t(norm)=0.0670611, mflops=74.5589 (err=2.7e-16) 38. Sorensen DIT: elapsed time t=1.81033 s, 32768 iters, t-(init.)=1.7908 s t(norm)=0.341567, mflops=14.6384 (err=2.6e-16) 39. Temperton: elapsed time t=1.55619 s, 65536 iters, t-(init.)=1.5125 s t(norm)=0.144244, mflops=34.6636 (err=3.1e-08) 40. Temperton (f2c): elapsed time t=1.08894 s, 32768 iters, t-(init.)=1.06924 s t(norm)=0.203941, mflops=24.5169 (err=2.0e-16) 41. Valkenburg: elapsed time t=1.59743 s, 8192 iters, t-(init.)=1.59236 s t(norm)=1.21487, mflops=4.11565 (err=4.3e-16) 42. SUNPERF: elapsed time t=1.02169 s, 131072 iters, t-(init.)=0.94367 s t(norm)=0.0449977, mflops=111.117 (err=1.9e-16) Top mflops for N=32 = 224.852 Normalized results and averages for N=32: fft 0: mflops = 43.2835 (norm. = 0.192498), norm. avg. (of 5) = 0.296534 fft 1: mflops = 44.5701 (norm. 
= 0.19822), norm. avg. (of 5) = 0.266207 fft 2: mflops = 26.0478 (norm. = 0.115844), norm. avg. (of 5) = 0.10218 fft 3: mflops = 11.6026 (norm. = 0.0516012), norm. avg. (of 5) = 0.0302121 fft 4: mflops = 32.817 (norm. = 0.145949), norm. avg. (of 5) = 0.0719748 fft 5: mflops = 6.42662 (norm. = 0.0285816), norm. avg. (of 5) = 0.0331864 fft 6: mflops = 46.6098 (norm. = 0.207291), norm. avg. (of 5) = 0.11265 fft 7: mflops = 32.3132 (norm. = 0.143709), norm. avg. (of 5) = 0.0851983 fft 8: mflops = 14.4678 (norm. = 0.0643435), norm. avg. (of 5) = 0.126939 fft 9: mflops = 47.323 (norm. = 0.210463), norm. avg. (of 5) = 0.0844004 fft 10: mflops = 54.8304 (norm. = 0.243851), norm. avg. (of 5) = 0.0890642 fft 11: mflops = 12.9802 (norm. = 0.0577276), norm. avg. (of 4) = 0.0568323 fft 12: mflops = 83.107 (norm. = 0.369608), norm. avg. (of 5) = 0.209756 fft 13: mflops = 54.5063 (norm. = 0.24241), norm. avg. (of 5) = 0.148146 fft 14: mflops = 151.287 (norm. = 0.672827), norm. avg. (of 5) = 0.60653 fft 15: mflops = 224.852 (norm. = 1), norm. avg. (of 5) = 0.641101 fft 16: mflops = 218.158 (norm. = 0.970231), norm. avg. (of 5) = 0.994046 fft 17: mflops = 87.3832 (norm. = 0.388625), norm. avg. (of 3) = 0.294284 fft 18: mflops = 36.3182 (norm. = 0.161521), norm. avg. (of 5) = 0.113463 fft 19: mflops = 28.586 (norm. = 0.127133), norm. avg. (of 5) = 0.0664548 fft 20: mflops = 30.6994 (norm. = 0.136532), norm. avg. (of 5) = 0.0697307 fft 21: mflops = 112.38 (norm. = 0.499798), norm. avg. (of 5) = 0.399899 fft 22: mflops = 41.1415 (norm. = 0.182971), norm. avg. (of 4) = 0.146649 fft 23: mflops = 55.581 (norm. = 0.247189), norm. avg. (of 4) = 0.163448 fft 24: mflops = 55.2704 (norm. = 0.245808), norm. avg. (of 4) = 0.1633 fft 25: mflops = 36.1214 (norm. = 0.160645), norm. avg. (of 4) = 0.0809303 fft 26: mflops = 14.74 (norm. = 0.0655543), norm. avg. (of 5) = 0.0332987 fft 27: mflops = 25.2406 (norm. = 0.112254), norm. avg. (of 5) = 0.0551959 fft 28: mflops = 31.3626 (norm. 
= 0.139481), norm. avg. (of 5) = 0.0730723 fft 29: mflops = 32.0347 (norm. = 0.14247), norm. avg. (of 5) = 0.0735725 fft 30: mflops = 82.2043 (norm. = 0.365593), norm. avg. (of 5) = 0.259551 fft 31: mflops = 78.3506 (norm. = 0.348454), norm. avg. (of 5) = 0.245941 fft 32: mflops = 45.2429 (norm. = 0.201212), norm. avg. (of 2) = 0.190938 fft 33: mflops = 9.71942 (norm. = 0.0432259), norm. avg. (of 4) = 0.0273623 fft 34: mflops = 65.165 (norm. = 0.289813), norm. avg. (of 4) = 0.210542 fft 35: mflops = 45.7683 (norm. = 0.203549), norm. avg. (of 5) = 0.0906994 fft 36: mflops = 46.3796 (norm. = 0.206267), norm. avg. (of 5) = 0.0938651 fft 37: mflops = 74.5589 (norm. = 0.331591), norm. avg. (of 5) = 0.179658 fft 38: mflops = 14.6384 (norm. = 0.0651024), norm. avg. (of 5) = 0.116699 fft 39: mflops = 34.6636 (norm. = 0.154162), norm. avg. (of 5) = 0.0776484 fft 40: mflops = 24.5169 (norm. = 0.109036), norm. avg. (of 5) = 0.0607864 fft 41: mflops = 4.11565 (norm. = 0.0183038), norm. avg. (of 5) = 0.0235936 fft 42: mflops = 111.117 (norm. = 0.494178), norm. avg. (of 5) = 0.263568 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.56183 s, 32768 iters, t-(init.)=1.52555 s t(norm)=0.12124, mflops=41.2405 (err=5.7e-16) 1. Arndt DIT: elapsed time t=1.44509 s, 32768 iters, t-(init.)=1.40998 s t(norm)=0.112055, mflops=44.6209 (err=5.6e-16) 2. Arndt Split-Radix: elapsed time t=1.10756 s, 16384 iters, t-(init.)=1.08988 s t(norm)=0.173232, mflops=28.863 (err=5.7e-16) 3. Arndt 4-step: elapsed time t=1.80899 s, 16384 iters, t-(init.)=1.79104 s t(norm)=0.284678, mflops=17.5637 (err=5.6e-16) 4. Bailey: elapsed time t=1.3387 s, 32768 iters, t-(init.)=1.3036 s t(norm)=0.1036, mflops=48.2623 (err=5.7e-16) 5. Beauregard: elapsed time t=1.22606 s, 4096 iters, t-(init.)=1.22152 s t(norm)=0.776624, mflops=6.43812 (err=5.9e-16) 6. Bergland: elapsed time t=1.20092 s, 32768 iters, t-(init.)=1.16562 s t(norm)=0.0926349, mflops=53.9754 (err=5.9e-16) 7. 
Brenner: elapsed time t=1.58791 s, 32768 iters, t-(init.)=1.55007 s t(norm)=0.123188, mflops=40.5883 (err=5.8e-16) 8. Burrus: elapsed time t=1.99217 s, 16384 iters, t-(init.)=1.9732 s t(norm)=0.313632, mflops=15.9422 (err=5.7e-16) 9. CWP (min N) (N=65): elapsed time t=1.11129 s, 32768 iters, t-(init.)=1.0756 s t(norm)=0.085481, mflops=58.4925 10. CWP (best N) (N=84): elapsed time t=1.50356 s, 65536 iters, t-(init.)=1.41108 s t(norm)=0.0560713, mflops=89.1722 11. Edelblute: elapsed time t=1.10365 s, 8192 iters, t-(init.)=1.09467 s t(norm)=0.347987, mflops=14.3684 (err=5.7e-16) 12. FFTPACK: elapsed time t=1.28587 s, 65536 iters, t-(init.)=1.21253 s t(norm)=0.0481818, mflops=103.774 (err=5.6e-16) 13. FFTPACK (f2c): elapsed time t=1.86918 s, 65536 iters, t-(init.)=1.79701 s t(norm)=0.0714068, mflops=70.0213 (err=5.6e-16) FFTW_MEASURE plan: (cost = 1.007700e-05) FFTW_TWIDDLE 4 FFTW_NOTW 16 14. FFTW: elapsed time t=1.35708 s, 131072 iters, t-(init.)=1.21509 s t(norm)=0.0241417, mflops=207.111 (err=5.5e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.69734 s, 131072 iters, t-(init.)=1.55454 s t(norm)=0.030886, mflops=161.886 (err=5.6e-16) 16. Frigo-old: elapsed time t=1.20895 s, 65536 iters, t-(init.)=1.13832 s t(norm)=0.0452326, mflops=110.54 (err=5.6e-16) 17. Green: elapsed time t=1.15562 s, 65536 iters, t-(init.)=1.08501 s t(norm)=0.0431144, mflops=115.971 (err=5.5e-16) 18. GSL: elapsed time t=1.41782 s, 32768 iters, t-(init.)=1.38096 s t(norm)=0.109749, mflops=45.5587 (err=5.6e-16) 19. GSL DIT: elapsed time t=1.78615 s, 32768 iters, t-(init.)=1.75046 s t(norm)=0.139114, mflops=35.9418 (err=5.6e-16) 20. GSL DIF: elapsed time t=1.52735 s, 32768 iters, t-(init.)=1.49062 s t(norm)=0.118464, mflops=42.207 (err=5.4e-16) 21. Krukar: elapsed time t=1.31006 s, 65536 iters, t-(init.)=1.23865 s t(norm)=0.0492196, mflops=101.586 (err=6.0e-16) 22. 
Mayer (Buneman): elapsed time t=1.49654 s, 32768 iters, t-(init.)=1.46124 s t(norm)=0.116129, mflops=43.0557 (err=5.4e-16) 23. Mayer (simple): elapsed time t=1.11223 s, 32768 iters, t-(init.)=1.0769 s t(norm)=0.0855846, mflops=58.4217 24. Mayer (lookup): elapsed time t=1.12359 s, 32768 iters, t-(init.)=1.0877 s t(norm)=0.0864423, mflops=57.842 (err=5.4e-16) 25. Monro: elapsed time t=1.31693 s, 32768 iters, t-(init.)=1.28162 s t(norm)=0.101854, mflops=49.0898 (err=3.4e-08) 26. NAPACK (f2c): elapsed time t=1.67344 s, 16384 iters, t-(init.)=1.65461 s t(norm)=0.262994, mflops=19.0118 (err=1.1e-15) 27. Nielsen: elapsed time t=1.63042 s, 32768 iters, t-(init.)=1.59511 s t(norm)=0.126768, mflops=39.442 (err=1.8e-15) 28. NR (C): elapsed time t=1.6269 s, 32768 iters, t-(init.)=1.59179 s t(norm)=0.126504, mflops=39.5245 (err=5.5e-16) 29. NR (F): elapsed time t=1.57224 s, 32768 iters, t-(init.)=1.53577 s t(norm)=0.122052, mflops=40.9662 (err=5.5e-16) 30. Ooura (C): elapsed time t=1.44586 s, 65536 iters, t-(init.)=1.3733 s t(norm)=0.0545701, mflops=91.6252 (err=5.9e-16) 31. Ooura (F): elapsed time t=1.44391 s, 65536 iters, t-(init.)=1.37366 s t(norm)=0.0545845, mflops=91.6011 (err=5.9e-16) 32. QFT: elapsed time t=1.43835 s, 32768 iters, t-(init.)=1.40304 s t(norm)=0.111504, mflops=44.8414 (err=5.4e-16) 33. Ransom: elapsed time t=1.33894 s, 16384 iters, t-(init.)=1.32109 s t(norm)=0.209982, mflops=23.8116 (err=8.6e-16) 34. SCIPORT: elapsed time t=1.708 s, 65536 iters, t-(init.)=1.63623 s t(norm)=0.0650178, mflops=76.902 (err=5.9e-16) 35. Singleton: elapsed time t=1.9712 s, 65536 iters, t-(init.)=1.89863 s t(norm)=0.0754448, mflops=66.2736 (err=9.2e-16) 36. Singleton (f2c): elapsed time t=1.87704 s, 65536 iters, t-(init.)=1.8068 s t(norm)=0.0717959, mflops=69.6418 (err=9.2e-16) 37. Sorensen: elapsed time t=1.39444 s, 65536 iters, t-(init.)=1.3242 s t(norm)=0.0526191, mflops=95.0226 (err=5.4e-16) 38. 
Sorensen DIT: elapsed time t=1.97552 s, 16384 iters, t-(init.)=1.95787 s t(norm)=0.311194, mflops=16.0671 (err=5.5e-16) 39. Temperton: elapsed time t=1.22061 s, 32768 iters, t-(init.)=1.18492 s t(norm)=0.0941686, mflops=53.0962 (err=3.8e-08) 40. Temperton (f2c): elapsed time t=1.80178 s, 32768 iters, t-(init.)=1.76667 s t(norm)=0.140402, mflops=35.6119 (err=5.6e-16) 41. Valkenburg: elapsed time t=1.87843 s, 4096 iters, t-(init.)=1.87384 s t(norm)=1.19136, mflops=4.19689 (err=8.1e-16) 42. SUNPERF: elapsed time t=1.75655 s, 131072 iters, t-(init.)=1.61612 s t(norm)=0.0321093, mflops=155.718 (err=5.6e-16) Top mflops for N=64 = 207.111 Normalized results and averages for N=64: fft 0: mflops = 41.2405 (norm. = 0.199123), norm. avg. (of 6) = 0.280299 fft 1: mflops = 44.6209 (norm. = 0.215445), norm. avg. (of 6) = 0.257747 fft 2: mflops = 28.863 (norm. = 0.13936), norm. avg. (of 6) = 0.108377 fft 3: mflops = 17.5637 (norm. = 0.0848034), norm. avg. (of 6) = 0.0393106 fft 4: mflops = 48.2623 (norm. = 0.233027), norm. avg. (of 6) = 0.0988168 fft 5: mflops = 6.43812 (norm. = 0.0310854), norm. avg. (of 6) = 0.0328363 fft 6: mflops = 53.9754 (norm. = 0.260611), norm. avg. (of 6) = 0.13731 fft 7: mflops = 40.5883 (norm. = 0.195974), norm. avg. (of 6) = 0.103661 fft 8: mflops = 15.9422 (norm. = 0.0769745), norm. avg. (of 6) = 0.118612 fft 9: mflops = 58.4925 (norm. = 0.282422), norm. avg. (of 6) = 0.117404 fft 10: mflops = 89.1722 (norm. = 0.430554), norm. avg. (of 6) = 0.145979 fft 11: mflops = 14.3684 (norm. = 0.0693753), norm. avg. (of 5) = 0.0593409 fft 12: mflops = 103.774 (norm. = 0.501054), norm. avg. (of 6) = 0.258306 fft 13: mflops = 70.0213 (norm. = 0.338086), norm. avg. (of 6) = 0.179803 fft 14: mflops = 207.111 (norm. = 1), norm. avg. (of 6) = 0.672108 fft 15: mflops = 161.886 (norm. = 0.781639), norm. avg. (of 6) = 0.664524 fft 16: mflops = 110.54 (norm. = 0.533723), norm. avg. (of 6) = 0.917326 fft 17: mflops = 115.971 (norm. = 0.559945), norm. avg. 
(of 4) = 0.360699 fft 18: mflops = 45.5587 (norm. = 0.219973), norm. avg. (of 6) = 0.131214 fft 19: mflops = 35.9418 (norm. = 0.173539), norm. avg. (of 6) = 0.0843022 fft 20: mflops = 42.207 (norm. = 0.20379), norm. avg. (of 6) = 0.0920739 fft 21: mflops = 101.586 (norm. = 0.490489), norm. avg. (of 6) = 0.414997 fft 22: mflops = 43.0557 (norm. = 0.207887), norm. avg. (of 5) = 0.158897 fft 23: mflops = 58.4217 (norm. = 0.28208), norm. avg. (of 5) = 0.187174 fft 24: mflops = 57.842 (norm. = 0.279281), norm. avg. (of 5) = 0.186496 fft 25: mflops = 49.0898 (norm. = 0.237022), norm. avg. (of 5) = 0.112149 fft 26: mflops = 19.0118 (norm. = 0.0917956), norm. avg. (of 6) = 0.0430482 fft 27: mflops = 39.442 (norm. = 0.190439), norm. avg. (of 6) = 0.0777365 fft 28: mflops = 39.5245 (norm. = 0.190837), norm. avg. (of 6) = 0.0926998 fft 29: mflops = 40.9662 (norm. = 0.197799), norm. avg. (of 6) = 0.0942769 fft 30: mflops = 91.6252 (norm. = 0.442397), norm. avg. (of 6) = 0.290025 fft 31: mflops = 91.6011 (norm. = 0.442281), norm. avg. (of 6) = 0.278664 fft 32: mflops = 44.8414 (norm. = 0.21651), norm. avg. (of 3) = 0.199462 fft 33: mflops = 23.8116 (norm. = 0.11497), norm. avg. (of 5) = 0.044884 fft 34: mflops = 76.902 (norm. = 0.371309), norm. avg. (of 5) = 0.242695 fft 35: mflops = 66.2736 (norm. = 0.319991), norm. avg. (of 6) = 0.128915 fft 36: mflops = 69.6418 (norm. = 0.336254), norm. avg. (of 6) = 0.134263 fft 37: mflops = 95.0226 (norm. = 0.458801), norm. avg. (of 6) = 0.226182 fft 38: mflops = 16.0671 (norm. = 0.0775775), norm. avg. (of 6) = 0.110179 fft 39: mflops = 53.0962 (norm. = 0.256367), norm. avg. (of 6) = 0.107435 fft 40: mflops = 35.6119 (norm. = 0.171946), norm. avg. (of 6) = 0.0793131 fft 41: mflops = 4.19689 (norm. = 0.020264), norm. avg. (of 6) = 0.0230386 fft 42: mflops = 155.718 (norm. = 0.751859), norm. avg. (of 6) = 0.34495 Benchmarking for array size = 128 (power of 2): 0. 
Arndt DIF: elapsed time t=1.58925 s, 16384 iters, t-(init.)=1.55589 s t(norm)=0.105987, mflops=47.1756 (err=3.5e-16) 1. Arndt DIT: elapsed time t=1.4588 s, 16384 iters, t-(init.)=1.42545 s t(norm)=0.0971009, mflops=51.4928 (err=3.2e-16) 2. Arndt Split-Radix: elapsed time t=1.15424 s, 8192 iters, t-(init.)=1.13756 s t(norm)=0.154981, mflops=32.2621 (err=3.6e-16) 3. Arndt 4-step: elapsed time t=1.12558 s, 4096 iters, t-(init.)=1.1171 s t(norm)=0.304384, mflops=16.4266 (err=3.3e-16) 4. Bailey: elapsed time t=1.11939 s, 16384 iters, t-(init.)=1.08611 s t(norm)=0.0739853, mflops=67.581 (err=3.2e-16) 5. Beauregard: elapsed time t=1.45254 s, 2048 iters, t-(init.)=1.44834 s t(norm)=0.789284, mflops=6.33485 (err=3.7e-16) 6. Bergland: elapsed time t=1.28293 s, 16384 iters, t-(init.)=1.24987 s t(norm)=0.0851405, mflops=58.7265 (err=3.7e-16) 7. Brenner: elapsed time t=1.65357 s, 16384 iters, t-(init.)=1.61983 s t(norm)=0.110342, mflops=45.3136 (err=4.1e-16) 8. Burrus: elapsed time t=1.04064 s, 4096 iters, t-(init.)=1.03218 s t(norm)=0.281246, mflops=17.778 (err=3.4e-16) 9. CWP (min N) (N=130): elapsed time t=1.96275 s, 32768 iters, t-(init.)=1.89527 s t(norm)=0.0645525, mflops=77.4564 10. CWP (best N) (N=140): elapsed time t=1.27864 s, 32768 iters, t-(init.)=1.20508 s t(norm)=0.0410449, mflops=121.818 11. Edelblute: elapsed time t=1.15147 s, 4096 iters, t-(init.)=1.14308 s t(norm)=0.311465, mflops=16.0532 (err=3.4e-16) 12. FFTPACK: elapsed time t=1.25736 s, 32768 iters, t-(init.)=1.1896 s t(norm)=0.0405177, mflops=123.403 (err=3.5e-16) 13. FFTPACK (f2c): elapsed time t=1.98367 s, 32768 iters, t-(init.)=1.91598 s t(norm)=0.0652581, mflops=76.6189 (err=3.5e-16) FFTW_MEASURE plan: (cost = 2.424900e-05) FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.76084 s, 65536 iters, t-(init.)=1.6274 s t(norm)=0.0277144, mflops=180.411 (err=3.8e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.59125 s, 65536 iters, t-(init.)=1.45548 s t(norm)=0.0247866, mflops=201.722 (err=3.5e-16) 16. Frigo-old: elapsed time t=1.36151 s, 32768 iters, t-(init.)=1.29481 s t(norm)=0.0441009, mflops=113.376 (err=3.4e-16) 17. Green: elapsed time t=1.36308 s, 32768 iters, t-(init.)=1.29656 s t(norm)=0.0441605, mflops=113.223 (err=4.2e-16) 18. GSL: elapsed time t=1.51656 s, 16384 iters, t-(init.)=1.4831 s t(norm)=0.101028, mflops=49.491 (err=3.4e-16) 19. GSL DIT: elapsed time t=1.78211 s, 16384 iters, t-(init.)=1.74753 s t(norm)=0.119041, mflops=42.0023 (err=3.5e-16) 20. GSL DIF: elapsed time t=1.53334 s, 16384 iters, t-(init.)=1.49989 s t(norm)=0.102172, mflops=48.9373 (err=3.7e-16) 21. Krukar: elapsed time t=1.00717 s, 16384 iters, t-(init.)=0.973524 s t(norm)=0.0663161, mflops=75.3965 (err=3.6e-16) 22. Mayer (Buneman): elapsed time t=1.47936 s, 16384 iters, t-(init.)=1.44611 s t(norm)=0.0985082, mflops=50.7572 (err=3.2e-16) 23. Mayer (simple): elapsed time t=1.14548 s, 16384 iters, t-(init.)=1.11242 s t(norm)=0.0757775, mflops=65.9826 24. Mayer (lookup): elapsed time t=1.08997 s, 16384 iters, t-(init.)=1.05603 s t(norm)=0.0719362, mflops=69.506 (err=3.4e-16) 25. Monro: elapsed time t=1.25111 s, 16384 iters, t-(init.)=1.21802 s t(norm)=0.0829712, mflops=60.2618 (err=5.2e-08) 26. NAPACK (f2c): elapsed time t=1.73787 s, 8192 iters, t-(init.)=1.72085 s t(norm)=0.234447, mflops=21.3268 (err=1.2e-15) 27. Nielsen: elapsed time t=1.07225 s, 8192 iters, t-(init.)=1.05557 s t(norm)=0.14381, mflops=34.7681 (err=1.0e-15) 28. NR (C): elapsed time t=1.61389 s, 16384 iters, t-(init.)=1.58064 s t(norm)=0.107672, mflops=46.4372 (err=3.2e-16) 29. NR (F): elapsed time t=1.55991 s, 16384 iters, t-(init.)=1.52655 s t(norm)=0.103988, mflops=48.0826 (err=3.2e-16) 30. Ooura (C): elapsed time t=1.52923 s, 32768 iters, t-(init.)=1.46031 s t(norm)=0.0497379, mflops=100.527 (err=3.3e-16) 31. 
Ooura (F): elapsed time t=1.56127 s, 32768 iters, t-(init.)=1.49512 s t(norm)=0.0509236, mflops=98.1863 (err=3.3e-16) 32. QFT: elapsed time t=1.6923 s, 16384 iters, t-(init.)=1.65911 s t(norm)=0.113018, mflops=44.2407 (err=4.4e-16) 33. Ransom: elapsed time t=1.84189 s, 8192 iters, t-(init.)=1.82497 s t(norm)=0.248633, mflops=20.11 (err=1.0e-15) 34. SCIPORT: elapsed time t=1.97042 s, 32768 iters, t-(init.)=1.90333 s t(norm)=0.064827, mflops=77.1284 (err=3.6e-16) 35. Singleton: elapsed time t=1.12187 s, 16384 iters, t-(init.)=1.08812 s t(norm)=0.0741222, mflops=67.4562 (err=4.2e-16) 36. Singleton (f2c): elapsed time t=1.10192 s, 16384 iters, t-(init.)=1.06876 s t(norm)=0.0728037, mflops=68.6779 (err=4.2e-16) 37. Sorensen: elapsed time t=1.29306 s, 32768 iters, t-(init.)=1.22694 s t(norm)=0.0417895, mflops=119.647 (err=3.1e-16) 38. Sorensen DIT: elapsed time t=1.05088 s, 4096 iters, t-(init.)=1.04257 s t(norm)=0.284076, mflops=17.6009 (err=3.1e-16) 39. Temperton: elapsed time t=1.41682 s, 16384 iters, t-(init.)=1.38279 s t(norm)=0.0941948, mflops=53.0815 (err=4.7e-08) 40. Temperton (f2c): elapsed time t=1.99443 s, 16384 iters, t-(init.)=1.96127 s t(norm)=0.133601, mflops=37.4248 (err=3.6e-16) 41. Valkenburg: elapsed time t=1.08376 s, 1024 iters, t-(init.)=1.08166 s t(norm)=1.17892, mflops=4.24117 (err=5.7e-16) 42. SUNPERF: elapsed time t=1.63576 s, 65536 iters, t-(init.)=1.50311 s t(norm)=0.0255978, mflops=195.33 (err=3.5e-16) Top mflops for N=128 = 201.722 Normalized results and averages for N=128: fft 0: mflops = 47.1756 (norm. = 0.233865), norm. avg. (of 7) = 0.273665 fft 1: mflops = 51.4928 (norm. = 0.255267), norm. avg. (of 7) = 0.257392 fft 2: mflops = 32.2621 (norm. = 0.159934), norm. avg. (of 7) = 0.115742 fft 3: mflops = 16.4266 (norm. = 0.081432), norm. avg. (of 7) = 0.045328 fft 4: mflops = 67.581 (norm. = 0.335021), norm. avg. (of 7) = 0.13256 fft 5: mflops = 6.33485 (norm. = 0.0314039), norm. avg. (of 7) = 0.0326317 fft 6: mflops = 58.7265 (norm. 
= 0.291126), norm. avg. (of 7) = 0.159284 fft 7: mflops = 45.3136 (norm. = 0.224634), norm. avg. (of 7) = 0.120943 fft 8: mflops = 17.778 (norm. = 0.0881314), norm. avg. (of 7) = 0.114257 fft 9: mflops = 77.4564 (norm. = 0.383976), norm. avg. (of 7) = 0.155486 fft 10: mflops = 121.818 (norm. = 0.60389), norm. avg. (of 7) = 0.211395 fft 11: mflops = 16.0532 (norm. = 0.0795807), norm. avg. (of 6) = 0.0627142 fft 12: mflops = 123.403 (norm. = 0.611748), norm. avg. (of 7) = 0.308798 fft 13: mflops = 76.6189 (norm. = 0.379825), norm. avg. (of 7) = 0.208378 fft 14: mflops = 180.411 (norm. = 0.894358), norm. avg. (of 7) = 0.703858 fft 15: mflops = 201.722 (norm. = 1), norm. avg. (of 7) = 0.712449 fft 16: mflops = 113.376 (norm. = 0.562043), norm. avg. (of 7) = 0.866571 fft 17: mflops = 113.223 (norm. = 0.561285), norm. avg. (of 5) = 0.400816 fft 18: mflops = 49.491 (norm. = 0.245343), norm. avg. (of 7) = 0.147519 fft 19: mflops = 42.0023 (norm. = 0.208219), norm. avg. (of 7) = 0.102005 fft 20: mflops = 48.9373 (norm. = 0.242598), norm. avg. (of 7) = 0.113577 fft 21: mflops = 75.3965 (norm. = 0.373765), norm. avg. (of 7) = 0.409107 fft 22: mflops = 50.7572 (norm. = 0.25162), norm. avg. (of 6) = 0.174351 fft 23: mflops = 65.9826 (norm. = 0.327097), norm. avg. (of 6) = 0.210495 fft 24: mflops = 69.506 (norm. = 0.344564), norm. avg. (of 6) = 0.212841 fft 25: mflops = 60.2618 (norm. = 0.298737), norm. avg. (of 6) = 0.143247 fft 26: mflops = 21.3268 (norm. = 0.105724), norm. avg. (of 7) = 0.0520018 fft 27: mflops = 34.7681 (norm. = 0.172356), norm. avg. (of 7) = 0.0912536 fft 28: mflops = 46.4372 (norm. = 0.230204), norm. avg. (of 7) = 0.112343 fft 29: mflops = 48.0826 (norm. = 0.238361), norm. avg. (of 7) = 0.11486 fft 30: mflops = 100.527 (norm. = 0.498345), norm. avg. (of 7) = 0.319785 fft 31: mflops = 98.1863 (norm. = 0.486741), norm. avg. (of 7) = 0.308389 fft 32: mflops = 44.2407 (norm. = 0.219316), norm. avg. (of 4) = 0.204425 fft 33: mflops = 20.11 (norm. 
= 0.0996917), norm. avg. (of 6) = 0.0540186 fft 34: mflops = 77.1284 (norm. = 0.38235), norm. avg. (of 6) = 0.265971 fft 35: mflops = 67.4562 (norm. = 0.334402), norm. avg. (of 7) = 0.15827 fft 36: mflops = 68.6779 (norm. = 0.340458), norm. avg. (of 7) = 0.16372 fft 37: mflops = 119.647 (norm. = 0.593131), norm. avg. (of 7) = 0.278603 fft 38: mflops = 17.6009 (norm. = 0.0872533), norm. avg. (of 7) = 0.106904 fft 39: mflops = 53.0815 (norm. = 0.263142), norm. avg. (of 7) = 0.129679 fft 40: mflops = 37.4248 (norm. = 0.185527), norm. avg. (of 7) = 0.0944865 fft 41: mflops = 4.24117 (norm. = 0.0210249), norm. avg. (of 7) = 0.022751 fft 42: mflops = 195.33 (norm. = 0.968312), norm. avg. (of 7) = 0.434001 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.748 s, 8192 iters, t-(init.)=1.71541 s t(norm)=0.102246, mflops=48.9016 (err=9.6e-16) 1. Arndt DIT: elapsed time t=1.57297 s, 8192 iters, t-(init.)=1.54071 s t(norm)=0.0918338, mflops=54.4462 (err=9.9e-16) 2. Arndt Split-Radix: elapsed time t=1.20748 s, 4096 iters, t-(init.)=1.19134 s t(norm)=0.142018, mflops=35.2067 (err=9.8e-16) 3. Arndt 4-step: elapsed time t=1.12499 s, 2048 iters, t-(init.)=1.11686 s t(norm)=0.26628, mflops=18.7772 (err=1.0e-15) 4. Bailey: elapsed time t=1.17407 s, 8192 iters, t-(init.)=1.14178 s t(norm)=0.0680553, mflops=73.4697 (err=9.8e-16) 5. Beauregard: elapsed time t=1.6645 s, 1024 iters, t-(init.)=1.66042 s t(norm)=0.791751, mflops=6.31512 (err=1.1e-15) 6. Bergland: elapsed time t=1.29512 s, 8192 iters, t-(init.)=1.26289 s t(norm)=0.075274, mflops=66.424 (err=1.0e-15) 7. Brenner: elapsed time t=1.59648 s, 8192 iters, t-(init.)=1.564 s t(norm)=0.093222, mflops=53.6354 (err=1.1e-15) 8. Burrus: elapsed time t=1.08118 s, 2048 iters, t-(init.)=1.0731 s t(norm)=0.255847, mflops=19.543 (err=1.0e-15) 9. CWP (min N) (N=260): elapsed time t=1.791 s, 16384 iters, t-(init.)=1.72408 s t(norm)=0.0513817, mflops=97.3109 10. 
CWP (best N) (N=280): elapsed time t=1.34912 s, 16384 iters, t-(init.)=1.2782 s t(norm)=0.0380933, mflops=131.257 11. Edelblute: elapsed time t=1.20331 s, 2048 iters, t-(init.)=1.19512 s t(norm)=0.284938, mflops=17.5476 (err=1.0e-15) 12. FFTPACK: elapsed time t=1.2583 s, 16384 iters, t-(init.)=1.19303 s t(norm)=0.0355551, mflops=140.627 (err=1.1e-15) 13. FFTPACK (f2c): elapsed time t=1.01534 s, 8192 iters, t-(init.)=0.982765 s t(norm)=0.0585774, mflops=85.3572 (err=1.1e-15) FFTW_MEASURE plan: (cost = 5.299800e-05) FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.89136 s, 32768 iters, t-(init.)=1.76222 s t(norm)=0.0262591, mflops=190.41 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.89695 s, 32768 iters, t-(init.)=1.76763 s t(norm)=0.0263398, mflops=189.827 (err=1.1e-15) 16. Frigo-old: elapsed time t=1.46294 s, 16384 iters, t-(init.)=1.39848 s t(norm)=0.0416779, mflops=119.968 (err=1.1e-15) 17. Green: elapsed time t=1.68218 s, 16384 iters, t-(init.)=1.61771 s t(norm)=0.0482117, mflops=103.709 (err=1.1e-15) 18. GSL: elapsed time t=1.51467 s, 8192 iters, t-(init.)=1.48198 s t(norm)=0.0883331, mflops=56.6039 (err=1.1e-15) 19. GSL DIT: elapsed time t=1.8612 s, 8192 iters, t-(init.)=1.82887 s t(norm)=0.109009, mflops=45.8677 (err=1.0e-15) 20. GSL DIF: elapsed time t=1.56216 s, 8192 iters, t-(init.)=1.52963 s t(norm)=0.0911733, mflops=54.8406 (err=1.1e-15) 21. Krukar: elapsed time t=1.02745 s, 8192 iters, t-(init.)=0.994873 s t(norm)=0.0592991, mflops=84.3184 (err=1.1e-15) 22. Mayer (Buneman): elapsed time t=1.61768 s, 8192 iters, t-(init.)=1.58549 s t(norm)=0.0945028, mflops=52.9085 (err=9.7e-16) 23. Mayer (simple): elapsed time t=1.33583 s, 8192 iters, t-(init.)=1.30367 s t(norm)=0.0777046, mflops=64.3463 24. Mayer (lookup): elapsed time t=1.25831 s, 8192 iters, t-(init.)=1.22579 s t(norm)=0.0730625, mflops=68.4345 (err=9.6e-16) 25. 
Monro: elapsed time t=1.26582 s, 8192 iters, t-(init.)=1.23363 s t(norm)=0.0735302, mflops=67.9992 (err=8.5e-08) 26. NAPACK (f2c): elapsed time t=1.82381 s, 4096 iters, t-(init.)=1.80754 s t(norm)=0.215476, mflops=23.2044 (err=3.8e-15) 27. Nielsen: elapsed time t=1.01252 s, 4096 iters, t-(init.)=0.996429 s t(norm)=0.118784, mflops=42.0933 (err=3.8e-15) 28. NR (C): elapsed time t=1.67919 s, 8192 iters, t-(init.)=1.64701 s t(norm)=0.0981694, mflops=50.9324 (err=1.1e-15) 29. NR (F): elapsed time t=1.61231 s, 8192 iters, t-(init.)=1.57968 s t(norm)=0.0941565, mflops=53.1031 (err=1.1e-15) 30. Ooura (C): elapsed time t=1.59181 s, 16384 iters, t-(init.)=1.52725 s t(norm)=0.0455155, mflops=109.853 (err=9.8e-16) 31. Ooura (F): elapsed time t=1.60616 s, 16384 iters, t-(init.)=1.54167 s t(norm)=0.0459453, mflops=108.825 (err=9.8e-16) 32. QFT: elapsed time t=1.97944 s, 8192 iters, t-(init.)=1.94723 s t(norm)=0.116064, mflops=43.0797 (err=1.1e-15) 33. Ransom: elapsed time t=1.25165 s, 4096 iters, t-(init.)=1.23546 s t(norm)=0.147279, mflops=33.9493 (err=1.9e-15) 34. SCIPORT: elapsed time t=1.03075 s, 8192 iters, t-(init.)=0.99822 s t(norm)=0.0594986, mflops=84.0356 (err=1.1e-15) 35. Singleton: elapsed time t=1.88083 s, 16384 iters, t-(init.)=1.81578 s t(norm)=0.0541145, mflops=92.3967 (err=1.7e-15) 36. Singleton (f2c): elapsed time t=1.83897 s, 16384 iters, t-(init.)=1.77458 s t(norm)=0.0528867, mflops=94.5417 (err=1.7e-15) 37. Sorensen: elapsed time t=1.30961 s, 16384 iters, t-(init.)=1.24358 s t(norm)=0.0370617, mflops=134.91 (err=9.8e-16) 38. Sorensen DIT: elapsed time t=1.0844 s, 2048 iters, t-(init.)=1.07636 s t(norm)=0.256624, mflops=19.4838 (err=9.8e-16) 39. Temperton: elapsed time t=1.35047 s, 8192 iters, t-(init.)=1.31774 s t(norm)=0.0785436, mflops=63.6589 (err=9.5e-08) 40. Temperton (f2c): elapsed time t=1.00467 s, 4096 iters, t-(init.)=0.988539 s t(norm)=0.117843, mflops=42.4293 (err=1.1e-15) 41. 
Valkenburg: elapsed time t=1.25887 s, 512 iters, t-(init.)=1.25684 s t(norm)=1.19861, mflops=4.17149 (err=1.2e-15) 42. SUNPERF: elapsed time t=1.60455 s, 32768 iters, t-(init.)=1.47579 s t(norm)=0.021991, mflops=227.365 (err=1.1e-15) Top mflops for N=256 = 227.365 Normalized results and averages for N=256: fft 0: mflops = 48.9016 (norm. = 0.215079), norm. avg. (of 8) = 0.266342 fft 1: mflops = 54.4462 (norm. = 0.239466), norm. avg. (of 8) = 0.255151 fft 2: mflops = 35.2067 (norm. = 0.154846), norm. avg. (of 8) = 0.12063 fft 3: mflops = 18.7772 (norm. = 0.0825861), norm. avg. (of 8) = 0.0499852 fft 4: mflops = 73.4697 (norm. = 0.323135), norm. avg. (of 8) = 0.156382 fft 5: mflops = 6.31512 (norm. = 0.0277752), norm. avg. (of 8) = 0.0320246 fft 6: mflops = 66.424 (norm. = 0.292146), norm. avg. (of 8) = 0.175892 fft 7: mflops = 53.6354 (norm. = 0.2359), norm. avg. (of 8) = 0.135312 fft 8: mflops = 19.543 (norm. = 0.0859539), norm. avg. (of 8) = 0.110719 fft 9: mflops = 97.3109 (norm. = 0.427993), norm. avg. (of 8) = 0.189549 fft 10: mflops = 131.257 (norm. = 0.577294), norm. avg. (of 8) = 0.257132 fft 11: mflops = 17.5476 (norm. = 0.0771782), norm. avg. (of 7) = 0.0647805 fft 12: mflops = 140.627 (norm. = 0.618505), norm. avg. (of 8) = 0.347511 fft 13: mflops = 85.3572 (norm. = 0.375418), norm. avg. (of 8) = 0.229258 fft 14: mflops = 190.41 (norm. = 0.837464), norm. avg. (of 8) = 0.720559 fft 15: mflops = 189.827 (norm. = 0.834898), norm. avg. (of 8) = 0.727756 fft 16: mflops = 119.968 (norm. = 0.527642), norm. avg. (of 8) = 0.824205 fft 17: mflops = 103.709 (norm. = 0.456135), norm. avg. (of 6) = 0.410036 fft 18: mflops = 56.6039 (norm. = 0.248956), norm. avg. (of 8) = 0.160198 fft 19: mflops = 45.8677 (norm. = 0.201736), norm. avg. (of 8) = 0.114471 fft 20: mflops = 54.8406 (norm. = 0.2412), norm. avg. (of 8) = 0.12953 fft 21: mflops = 84.3184 (norm. = 0.370849), norm. avg. (of 8) = 0.404325 fft 22: mflops = 52.9085 (norm. = 0.232702), norm. avg. 
(of 7) = 0.182687 fft 23: mflops = 64.3463 (norm. = 0.283008), norm. avg. (of 7) = 0.220854 fft 24: mflops = 68.4345 (norm. = 0.300989), norm. avg. (of 7) = 0.225433 fft 25: mflops = 67.9992 (norm. = 0.299075), norm. avg. (of 7) = 0.165508 fft 26: mflops = 23.2044 (norm. = 0.102058), norm. avg. (of 8) = 0.0582588 fft 27: mflops = 42.0933 (norm. = 0.185135), norm. avg. (of 8) = 0.102989 fft 28: mflops = 50.9324 (norm. = 0.224011), norm. avg. (of 8) = 0.126302 fft 29: mflops = 53.1031 (norm. = 0.233558), norm. avg. (of 8) = 0.129698 fft 30: mflops = 109.853 (norm. = 0.483154), norm. avg. (of 8) = 0.340206 fft 31: mflops = 108.825 (norm. = 0.478635), norm. avg. (of 8) = 0.32967 fft 32: mflops = 43.0797 (norm. = 0.189474), norm. avg. (of 5) = 0.201435 fft 33: mflops = 33.9493 (norm. = 0.149316), norm. avg. (of 7) = 0.0676325 fft 34: mflops = 84.0356 (norm. = 0.369606), norm. avg. (of 7) = 0.280776 fft 35: mflops = 92.3967 (norm. = 0.40638), norm. avg. (of 8) = 0.189284 fft 36: mflops = 94.5417 (norm. = 0.415814), norm. avg. (of 8) = 0.195231 fft 37: mflops = 134.91 (norm. = 0.593362), norm. avg. (of 8) = 0.317948 fft 38: mflops = 19.4838 (norm. = 0.0856937), norm. avg. (of 8) = 0.104253 fft 39: mflops = 63.6589 (norm. = 0.279985), norm. avg. (of 8) = 0.148467 fft 40: mflops = 42.4293 (norm. = 0.186613), norm. avg. (of 8) = 0.106002 fft 41: mflops = 4.17149 (norm. = 0.018347), norm. avg. (of 8) = 0.0222005 fft 42: mflops = 227.365 (norm. = 1), norm. avg. (of 8) = 0.504751 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.85046 s, 4096 iters, t-(init.)=1.81871 s t(norm)=0.096359, mflops=51.8893 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.6858 s, 4096 iters, t-(init.)=1.65395 s t(norm)=0.0876295, mflops=57.0584 (err=1.1e-15) 2. Arndt Split-Radix: elapsed time t=1.29639 s, 2048 iters, t-(init.)=1.28053 s t(norm)=0.13569, mflops=36.8488 (err=1.1e-15) 3. 
Arndt 4-step: elapsed time t=1.20371 s, 1024 iters, t-(init.)=1.19551 s t(norm)=0.253362, mflops=19.7346 (err=1.0e-15) 4. Bailey: elapsed time t=1.28711 s, 4096 iters, t-(init.)=1.25539 s t(norm)=0.0665128, mflops=75.1734 (err=1.1e-15) 5. Beauregard: elapsed time t=1.9676 s, 512 iters, t-(init.)=1.96358 s t(norm)=0.832275, mflops=6.00763 (err=1.1e-15) 6. Bergland: elapsed time t=1.47186 s, 4096 iters, t-(init.)=1.44017 s t(norm)=0.0763027, mflops=65.5285 (err=1.0e-15) 7. Brenner: elapsed time t=1.80555 s, 4096 iters, t-(init.)=1.7735 s t(norm)=0.0939635, mflops=53.2121 (err=1.0e-15) 8. Burrus: elapsed time t=1.14826 s, 1024 iters, t-(init.)=1.14032 s t(norm)=0.241665, mflops=20.6898 (err=1.1e-15) 9. CWP (min N) (N=520): elapsed time t=1.0347 s, 4096 iters, t-(init.)=1.00247 s t(norm)=0.0531127, mflops=94.1394 10. CWP (best N) (N=560): elapsed time t=1.83637 s, 8192 iters, t-(init.)=1.76698 s t(norm)=0.046809, mflops=106.817 11. Edelblute: elapsed time t=1.25706 s, 1024 iters, t-(init.)=1.24912 s t(norm)=0.264723, mflops=18.8877 (err=1.1e-15) 12. FFTPACK: elapsed time t=1.80717 s, 8192 iters, t-(init.)=1.74363 s t(norm)=0.0461904, mflops=108.248 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.696 s, 4096 iters, t-(init.)=1.66403 s t(norm)=0.0881637, mflops=56.7127 (err=1.0e-15) FFTW_MEASURE plan: (cost = 1.624700e-04) FFTW_TWIDDLE 32 FFTW_NOTW 16 14. FFTW: elapsed time t=1.31061 s, 8192 iters, t-(init.)=1.24721 s t(norm)=0.0330398, mflops=151.332 (err=9.8e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.20782 s, 8192 iters, t-(init.)=1.14414 s t(norm)=0.0303093, mflops=164.966 (err=9.7e-16) 16. Frigo-old: elapsed time t=1.84888 s, 8192 iters, t-(init.)=1.78554 s t(norm)=0.0473006, mflops=105.707 (err=9.5e-16) 17. Green: elapsed time t=1.83697 s, 4096 iters, t-(init.)=1.80524 s t(norm)=0.095645, mflops=52.2766 (err=9.6e-16) 18. 
GSL: elapsed time t=1.07672 s, 2048 iters, t-(init.)=1.0608 s t(norm)=0.112406, mflops=44.4815 (err=1.0e-15) 19. GSL DIT: elapsed time t=1.00208 s, 2048 iters, t-(init.)=0.986193 s t(norm)=0.104501, mflops=47.8466 (err=1.2e-15) 20. GSL DIF: elapsed time t=1.76063 s, 4096 iters, t-(init.)=1.72873 s t(norm)=0.0915916, mflops=54.5902 (err=1.1e-15) 21. Krukar: elapsed time t=1.27889 s, 4096 iters, t-(init.)=1.247 s t(norm)=0.0660685, mflops=75.6791 (err=1.0e-15) 22. Mayer (Buneman): elapsed time t=1.73638 s, 4096 iters, t-(init.)=1.70471 s t(norm)=0.0903188, mflops=55.3595 (err=1.0e-15) 23. Mayer (simple): elapsed time t=1.26372 s, 4096 iters, t-(init.)=1.23108 s t(norm)=0.0652248, mflops=76.658 24. Mayer (lookup): elapsed time t=1.37801 s, 4096 iters, t-(init.)=1.34617 s t(norm)=0.0713224, mflops=70.1042 (err=1.0e-15) 25. Monro: elapsed time t=1.33676 s, 4096 iters, t-(init.)=1.30503 s t(norm)=0.0691432, mflops=72.3137 (err=8.4e-08) 26. NAPACK (f2c): elapsed time t=1.06046 s, 1024 iters, t-(init.)=1.05252 s t(norm)=0.223059, mflops=22.4156 (err=7.1e-15) 27. Nielsen: elapsed time t=1.96494 s, 4096 iters, t-(init.)=1.93324 s t(norm)=0.102427, mflops=48.8154 (err=3.7e-15) 28. NR (C): elapsed time t=1.76672 s, 4096 iters, t-(init.)=1.73499 s t(norm)=0.0919233, mflops=54.3932 (err=9.7e-16) 29. NR (F): elapsed time t=1.71268 s, 4096 iters, t-(init.)=1.68084 s t(norm)=0.0890541, mflops=56.1457 (err=9.7e-16) 30. Ooura (C): elapsed time t=1.79129 s, 8192 iters, t-(init.)=1.72775 s t(norm)=0.0457698, mflops=109.242 (err=9.6e-16) 31. Ooura (F): elapsed time t=1.86149 s, 8192 iters, t-(init.)=1.7981 s t(norm)=0.0476334, mflops=104.968 (err=9.6e-16) 32. QFT: elapsed time t=1.31583 s, 2048 iters, t-(init.)=1.30002 s t(norm)=0.137755, mflops=36.2964 (err=1.3e-15) 33. Ransom: elapsed time t=1.60761 s, 2048 iters, t-(init.)=1.59169 s t(norm)=0.168661, mflops=29.6452 (err=1.4e-15) 34. 
SCIPORT: elapsed time t=1.30174 s, 4096 iters, t-(init.)=1.26987 s t(norm)=0.0672802, mflops=74.3161 (err=1.0e-15) 35. Singleton: elapsed time t=1.05328 s, 4096 iters, t-(init.)=1.02137 s t(norm)=0.0541143, mflops=92.397 (err=1.2e-15) 36. Singleton (f2c): elapsed time t=1.02997 s, 4096 iters, t-(init.)=0.998302 s t(norm)=0.0528919, mflops=94.5324 (err=1.2e-15) 37. Sorensen: elapsed time t=1.5236 s, 8192 iters, t-(init.)=1.46018 s t(norm)=0.0386816, mflops=129.26 (err=1.0e-15) 38. Sorensen DIT: elapsed time t=1.14223 s, 1024 iters, t-(init.)=1.1343 s t(norm)=0.24039, mflops=20.7995 (err=1.1e-15) 39. Temperton: elapsed time t=1.75517 s, 4096 iters, t-(init.)=1.7234 s t(norm)=0.091309, mflops=54.7591 (err=1.0e-07) 40. Temperton (f2c): elapsed time t=1.32475 s, 2048 iters, t-(init.)=1.3089 s t(norm)=0.138696, mflops=36.0501 (err=1.0e-15) 41. Valkenburg: elapsed time t=1.41057 s, 256 iters, t-(init.)=1.40857 s t(norm)=1.19406, mflops=4.18739 (err=1.3e-15) 42. SUNPERF: elapsed time t=1.09205 s, 8192 iters, t-(init.)=1.02871 s t(norm)=0.0272515, mflops=183.476 (err=1.0e-15) Top mflops for N=512 = 183.476 Normalized results and averages for N=512: fft 0: mflops = 51.8893 (norm. = 0.282812), norm. avg. (of 9) = 0.268172 fft 1: mflops = 57.0584 (norm. = 0.310985), norm. avg. (of 9) = 0.261355 fft 2: mflops = 36.8488 (norm. = 0.200837), norm. avg. (of 9) = 0.129542 fft 3: mflops = 19.7346 (norm. = 0.107559), norm. avg. (of 9) = 0.0563824 fft 4: mflops = 75.1734 (norm. = 0.409718), norm. avg. (of 9) = 0.18453 fft 5: mflops = 6.00763 (norm. = 0.0327434), norm. avg. (of 9) = 0.0321045 fft 6: mflops = 65.5285 (norm. = 0.35715), norm. avg. (of 9) = 0.196031 fft 7: mflops = 53.2121 (norm. = 0.290022), norm. avg. (of 9) = 0.152502 fft 8: mflops = 20.6898 (norm. = 0.112765), norm. avg. (of 9) = 0.110947 fft 9: mflops = 94.1394 (norm. = 0.513088), norm. avg. (of 9) = 0.225498 fft 10: mflops = 106.817 (norm. = 0.582185), norm. avg. (of 9) = 0.293249 fft 11: mflops = 18.8877 (norm. 
= 0.102943), norm. avg. (of 8) = 0.0695508 fft 12: mflops = 108.248 (norm. = 0.589982), norm. avg. (of 9) = 0.374452 fft 13: mflops = 56.7127 (norm. = 0.309101), norm. avg. (of 9) = 0.238129 fft 14: mflops = 151.332 (norm. = 0.824807), norm. avg. (of 9) = 0.732142 fft 15: mflops = 164.966 (norm. = 0.899112), norm. avg. (of 9) = 0.746795 fft 16: mflops = 105.707 (norm. = 0.576134), norm. avg. (of 9) = 0.796641 fft 17: mflops = 52.2766 (norm. = 0.284923), norm. avg. (of 7) = 0.392163 fft 18: mflops = 44.4815 (norm. = 0.242437), norm. avg. (of 9) = 0.169336 fft 19: mflops = 47.8466 (norm. = 0.260778), norm. avg. (of 9) = 0.130727 fft 20: mflops = 54.5902 (norm. = 0.297533), norm. avg. (of 9) = 0.148197 fft 21: mflops = 75.6791 (norm. = 0.412473), norm. avg. (of 9) = 0.40523 fft 22: mflops = 55.3595 (norm. = 0.301726), norm. avg. (of 8) = 0.197566 fft 23: mflops = 76.658 (norm. = 0.417809), norm. avg. (of 8) = 0.245473 fft 24: mflops = 70.1042 (norm. = 0.382089), norm. avg. (of 8) = 0.245015 fft 25: mflops = 72.3137 (norm. = 0.394131), norm. avg. (of 8) = 0.194086 fft 26: mflops = 22.4156 (norm. = 0.122172), norm. avg. (of 9) = 0.0653603 fft 27: mflops = 48.8154 (norm. = 0.266058), norm. avg. (of 9) = 0.121108 fft 28: mflops = 54.3932 (norm. = 0.296459), norm. avg. (of 9) = 0.145208 fft 29: mflops = 56.1457 (norm. = 0.306011), norm. avg. (of 9) = 0.149288 fft 30: mflops = 109.242 (norm. = 0.595403), norm. avg. (of 9) = 0.368562 fft 31: mflops = 104.968 (norm. = 0.572109), norm. avg. (of 9) = 0.356608 fft 32: mflops = 36.2964 (norm. = 0.197826), norm. avg. (of 6) = 0.200833 fft 33: mflops = 29.6452 (norm. = 0.161575), norm. avg. (of 8) = 0.0793753 fft 34: mflops = 74.3161 (norm. = 0.405045), norm. avg. (of 8) = 0.29631 fft 35: mflops = 92.397 (norm. = 0.503591), norm. avg. (of 9) = 0.224207 fft 36: mflops = 94.5324 (norm. = 0.51523), norm. avg. (of 9) = 0.230787 fft 37: mflops = 129.26 (norm. = 0.704508), norm. avg. (of 9) = 0.360899 fft 38: mflops = 20.7995 (norm. 
= 0.113364), norm. avg. (of 9) = 0.105265 fft 39: mflops = 54.7591 (norm. = 0.298453), norm. avg. (of 9) = 0.165132 fft 40: mflops = 36.0501 (norm. = 0.196484), norm. avg. (of 9) = 0.116056 fft 41: mflops = 4.18739 (norm. = 0.0228225), norm. avg. (of 9) = 0.0222696 fft 42: mflops = 183.476 (norm. = 1), norm. avg. (of 9) = 0.559779 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.9977 s, 2048 iters, t-(init.)=1.96624 s t(norm)=0.0937575, mflops=53.3291 (err=1.8e-15) 1. Arndt DIT: elapsed time t=1.79003 s, 2048 iters, t-(init.)=1.7586 s t(norm)=0.0838568, mflops=59.6255 (err=1.8e-15) 2. Arndt Split-Radix: elapsed time t=1.33435 s, 1024 iters, t-(init.)=1.31861 s t(norm)=0.125753, mflops=39.7606 (err=1.8e-15) 3. Arndt 4-step: elapsed time t=1.17377 s, 512 iters, t-(init.)=1.16588 s t(norm)=0.222374, mflops=22.4847 (err=1.8e-15) 4. Bailey: elapsed time t=1.46608 s, 2048 iters, t-(init.)=1.43465 s t(norm)=0.0684094, mflops=73.0894 (err=1.9e-15) 5. Beauregard: elapsed time t=1.08558 s, 128 iters, t-(init.)=1.0836 s t(norm)=0.82672, mflops=6.048 (err=2.0e-15) 6. Bergland: elapsed time t=1.52254 s, 2048 iters, t-(init.)=1.49108 s t(norm)=0.0711004, mflops=70.3231 (err=2.2e-15) 7. Brenner: elapsed time t=1.78653 s, 2048 iters, t-(init.)=1.7543 s t(norm)=0.0836517, mflops=59.7716 (err=1.9e-15) 8. Burrus: elapsed time t=1.16023 s, 512 iters, t-(init.)=1.15236 s t(norm)=0.219795, mflops=22.7485 (err=1.8e-15) 9. CWP (min N) (N=1040): elapsed time t=1.18695 s, 2048 iters, t-(init.)=1.15501 s t(norm)=0.055075, mflops=90.7853 10. CWP (best N) (N=1040): elapsed time t=1.18724 s, 2048 iters, t-(init.)=1.15521 s t(norm)=0.0550847, mflops=90.7693 11. Edelblute: elapsed time t=1.26923 s, 512 iters, t-(init.)=1.26135 s t(norm)=0.240583, mflops=20.7829 (err=1.8e-15) 12. FFTPACK: elapsed time t=1.09467 s, 2048 iters, t-(init.)=1.0631 s t(norm)=0.0506925, mflops=98.6339 (err=2.0e-15) 13. 
FFTPACK (f2c): elapsed time t=1.09455 s, 1024 iters, t-(init.)=1.07874 s t(norm)=0.102877, mflops=48.6017 (err=2.0e-15) FFTW_MEASURE plan: (cost = 3.541770e-04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 4 FFTW_NOTW 16 14. FFTW: elapsed time t=1.39131 s, 4096 iters, t-(init.)=1.3284 s t(norm)=0.0316716, mflops=157.87 (err=1.9e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.45135 s, 4096 iters, t-(init.)=1.38642 s t(norm)=0.0330547, mflops=151.264 (err=2.0e-15) 16. Frigo-old: elapsed time t=1.32703 s, 2048 iters, t-(init.)=1.29556 s t(norm)=0.0617771, mflops=80.9362 (err=1.9e-15) 17. Green: elapsed time t=1.54134 s, 2048 iters, t-(init.)=1.50988 s t(norm)=0.0719966, mflops=69.4477 (err=2.0e-15) 18. GSL: elapsed time t=1.15188 s, 1024 iters, t-(init.)=1.13614 s t(norm)=0.108351, mflops=46.1463 (err=2.0e-15) 19. GSL DIT: elapsed time t=1.07129 s, 1024 iters, t-(init.)=1.05554 s t(norm)=0.100664, mflops=49.6701 (err=2.1e-15) 20. GSL DIF: elapsed time t=1.80048 s, 2048 iters, t-(init.)=1.76894 s t(norm)=0.0843496, mflops=59.2771 (err=2.2e-15) 21. Krukar: elapsed time t=1.07853 s, 1024 iters, t-(init.)=1.06278 s t(norm)=0.101354, mflops=49.3318 (err=1.9e-15) 22. Mayer (Buneman): elapsed time t=1.86148 s, 2048 iters, t-(init.)=1.83004 s t(norm)=0.087263, mflops=57.298 (err=1.8e-15) 23. Mayer (simple): elapsed time t=1.37201 s, 2048 iters, t-(init.)=1.34058 s t(norm)=0.0639237, mflops=78.2183 24. Mayer (lookup): elapsed time t=1.39451 s, 2048 iters, t-(init.)=1.36296 s t(norm)=0.0649911, mflops=76.9336 (err=1.8e-15) 25. Monro: elapsed time t=1.35645 s, 2048 iters, t-(init.)=1.325 s t(norm)=0.0631809, mflops=79.1378 (err=1.0e-07) 26. NAPACK (f2c): elapsed time t=1.22256 s, 512 iters, t-(init.)=1.21466 s t(norm)=0.231678, mflops=21.5817 (err=1.7e-14) 27. Nielsen: elapsed time t=1.29023 s, 1024 iters, t-(init.)=1.27451 s t(norm)=0.121546, mflops=41.1366 (err=7.5e-15) 28. 
NR (C): elapsed time t=1.88667 s, 2048 iters, t-(init.)=1.85522 s t(norm)=0.0884637, mflops=56.5204 (err=1.9e-15) 29. NR (F): elapsed time t=1.89296 s, 2048 iters, t-(init.)=1.86147 s t(norm)=0.0887616, mflops=56.3307 (err=1.9e-15) 30. Ooura (C): elapsed time t=1.84855 s, 4096 iters, t-(init.)=1.78529 s t(norm)=0.0425646, mflops=117.468 (err=2.2e-15) 31. Ooura (F): elapsed time t=1.91547 s, 4096 iters, t-(init.)=1.85257 s t(norm)=0.0441686, mflops=113.203 (err=2.2e-15) 32. QFT: elapsed time t=1.56299 s, 1024 iters, t-(init.)=1.54727 s t(norm)=0.147559, mflops=33.8847 (err=2.2e-15) 33. Ransom: elapsed time t=1.23013 s, 1024 iters, t-(init.)=1.21439 s t(norm)=0.115813, mflops=43.1731 (err=2.3e-15) 34. SCIPORT: elapsed time t=1.54025 s, 2048 iters, t-(init.)=1.50872 s t(norm)=0.0719414, mflops=69.501 (err=2.0e-15) 35. Singleton: elapsed time t=1.03858 s, 2048 iters, t-(init.)=1.00704 s t(norm)=0.0480195, mflops=104.124 (err=2.8e-15) 36. Singleton (f2c): elapsed time t=1.01551 s, 2048 iters, t-(init.)=0.984069 s t(norm)=0.046924, mflops=106.555 (err=2.8e-15) 37. Sorensen: elapsed time t=1.58823 s, 4096 iters, t-(init.)=1.52535 s t(norm)=0.0363672, mflops=137.487 (err=1.8e-15) 38. Sorensen DIT: elapsed time t=1.1598 s, 512 iters, t-(init.)=1.15194 s t(norm)=0.219715, mflops=22.7567 (err=1.8e-15) 39. Temperton: elapsed time t=1.69843 s, 2048 iters, t-(init.)=1.66695 s t(norm)=0.0794864, mflops=62.9038 (err=1.1e-07) 40. Temperton (f2c): elapsed time t=1.31556 s, 1024 iters, t-(init.)=1.29984 s t(norm)=0.123962, mflops=40.3349 (err=2.0e-15) 41. Valkenburg: elapsed time t=1.57521 s, 128 iters, t-(init.)=1.57324 s t(norm)=1.20028, mflops=4.16568 (err=2.4e-15) 42. SUNPERF: elapsed time t=1.1579 s, 4096 iters, t-(init.)=1.09505 s t(norm)=0.026108, mflops=191.512 (err=2.0e-15) Top mflops for N=1024 = 191.512 Normalized results and averages for N=1024: fft 0: mflops = 53.3291 (norm. = 0.278463), norm. avg. (of 10) = 0.269201 fft 1: mflops = 59.6255 (norm. = 0.31134), norm. avg. 
(of 10) = 0.266354 fft 2: mflops = 39.7606 (norm. = 0.207614), norm. avg. (of 10) = 0.137349 fft 3: mflops = 22.4847 (norm. = 0.117406), norm. avg. (of 10) = 0.0624847 fft 4: mflops = 73.0894 (norm. = 0.381644), norm. avg. (of 10) = 0.204242 fft 5: mflops = 6.048 (norm. = 0.0315802), norm. avg. (of 10) = 0.032052 fft 6: mflops = 70.3231 (norm. = 0.367199), norm. avg. (of 10) = 0.213148 fft 7: mflops = 59.7716 (norm. = 0.312104), norm. avg. (of 10) = 0.168462 fft 8: mflops = 22.7485 (norm. = 0.118784), norm. avg. (of 10) = 0.11173 fft 9: mflops = 90.7853 (norm. = 0.474045), norm. avg. (of 10) = 0.250353 fft 10: mflops = 90.7693 (norm. = 0.473961), norm. avg. (of 10) = 0.311321 fft 11: mflops = 20.7829 (norm. = 0.10852), norm. avg. (of 9) = 0.0738807 fft 12: mflops = 98.6339 (norm. = 0.515027), norm. avg. (of 10) = 0.38851 fft 13: mflops = 48.6017 (norm. = 0.253779), norm. avg. (of 10) = 0.239694 fft 14: mflops = 157.87 (norm. = 0.824334), norm. avg. (of 10) = 0.741361 fft 15: mflops = 151.264 (norm. = 0.789841), norm. avg. (of 10) = 0.7511 fft 16: mflops = 80.9362 (norm. = 0.422616), norm. avg. (of 10) = 0.759239 fft 17: mflops = 69.4477 (norm. = 0.362628), norm. avg. (of 8) = 0.388471 fft 18: mflops = 46.1463 (norm. = 0.240957), norm. avg. (of 10) = 0.176498 fft 19: mflops = 49.6701 (norm. = 0.259357), norm. avg. (of 10) = 0.14359 fft 20: mflops = 59.2771 (norm. = 0.309521), norm. avg. (of 10) = 0.16433 fft 21: mflops = 49.3318 (norm. = 0.257591), norm. avg. (of 10) = 0.390466 fft 22: mflops = 57.298 (norm. = 0.299187), norm. avg. (of 9) = 0.208858 fft 23: mflops = 78.2183 (norm. = 0.408425), norm. avg. (of 9) = 0.263579 fft 24: mflops = 76.9336 (norm. = 0.401716), norm. avg. (of 9) = 0.262426 fft 25: mflops = 79.1378 (norm. = 0.413226), norm. avg. (of 9) = 0.218435 fft 26: mflops = 21.5817 (norm. = 0.112691), norm. avg. (of 10) = 0.0700934 fft 27: mflops = 41.1366 (norm. = 0.214799), norm. avg. (of 10) = 0.130477 fft 28: mflops = 56.5204 (norm. = 0.295127), norm. 
avg. (of 10) = 0.1602 fft 29: mflops = 56.3307 (norm. = 0.294136), norm. avg. (of 10) = 0.163773 fft 30: mflops = 117.468 (norm. = 0.613373), norm. avg. (of 10) = 0.393043 fft 31: mflops = 113.203 (norm. = 0.591098), norm. avg. (of 10) = 0.380057 fft 32: mflops = 33.8847 (norm. = 0.176932), norm. avg. (of 7) = 0.197419 fft 33: mflops = 43.1731 (norm. = 0.225433), norm. avg. (of 9) = 0.0956039 fft 34: mflops = 69.501 (norm. = 0.362906), norm. avg. (of 9) = 0.303709 fft 35: mflops = 104.124 (norm. = 0.543696), norm. avg. (of 10) = 0.256156 fft 36: mflops = 106.555 (norm. = 0.556388), norm. avg. (of 10) = 0.263347 fft 37: mflops = 137.487 (norm. = 0.7179), norm. avg. (of 10) = 0.396599 fft 38: mflops = 22.7567 (norm. = 0.118827), norm. avg. (of 10) = 0.106621 fft 39: mflops = 62.9038 (norm. = 0.328458), norm. avg. (of 10) = 0.181465 fft 40: mflops = 40.3349 (norm. = 0.210613), norm. avg. (of 10) = 0.125511 fft 41: mflops = 4.16568 (norm. = 0.0217515), norm. avg. (of 10) = 0.0222178 fft 42: mflops = 191.512 (norm. = 1), norm. avg. (of 10) = 0.603801 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.27729 s, 512 iters, t-(init.)=1.26158 s t(norm)=0.109376, mflops=45.7137 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.19426 s, 512 iters, t-(init.)=1.17859 s t(norm)=0.102181, mflops=48.933 (err=1.4e-15) 2. Arndt Split-Radix: elapsed time t=1.81646 s, 512 iters, t-(init.)=1.8008 s t(norm)=0.156125, mflops=32.0256 (err=1.4e-15) 3. Arndt 4-step: elapsed time t=1.39564 s, 256 iters, t-(init.)=1.3878 s t(norm)=0.240638, mflops=20.7781 (err=1.4e-15) 4. Bailey: elapsed time t=1.87104 s, 1024 iters, t-(init.)=1.83969 s t(norm)=0.0797483, mflops=62.6973 (err=1.4e-15) 5. Beauregard: elapsed time t=1.24093 s, 64 iters, t-(init.)=1.23897 s t(norm)=0.859325, mflops=5.81852 (err=1.4e-15) 6. Bergland: elapsed time t=1.71887 s, 1024 iters, t-(init.)=1.6866 s t(norm)=0.0731122, mflops=68.388 (err=1.5e-15) 7. 
Brenner: elapsed time t=1.05867 s, 512 iters, t-(init.)=1.043 s t(norm)=0.0904253, mflops=55.2943 (err=1.4e-15) 8. Burrus: elapsed time t=1.37567 s, 256 iters, t-(init.)=1.36782 s t(norm)=0.237173, mflops=21.0816 (err=1.4e-15) 9. CWP (min N) (N=2145): elapsed time t=1.56676 s, 1024 iters, t-(init.)=1.53396 s t(norm)=0.0664953, mflops=75.1933 10. CWP (best N) (N=2184): elapsed time t=1.33371 s, 1024 iters, t-(init.)=1.30023 s t(norm)=0.0563636, mflops=88.7097 11. Edelblute: elapsed time t=1.46498 s, 256 iters, t-(init.)=1.45714 s t(norm)=0.252661, mflops=19.7893 (err=1.4e-15) 12. FFTPACK: elapsed time t=1.11978 s, 1024 iters, t-(init.)=1.08841 s t(norm)=0.0471815, mflops=105.974 (err=1.4e-15) 13. FFTPACK (f2c): elapsed time t=1.11919 s, 512 iters, t-(init.)=1.10351 s t(norm)=0.0956721, mflops=52.2618 (err=1.4e-15) FFTW_MEASURE plan: (cost = 7.787320e-04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.54539 s, 2048 iters, t-(init.)=1.48274 s t(norm)=0.0321376, mflops=155.581 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.01291 s, 1024 iters, t-(init.)=0.981531 s t(norm)=0.0425482, mflops=117.514 (err=1.4e-15) 16. Frigo-old: elapsed time t=1.46454 s, 1024 iters, t-(init.)=1.43321 s t(norm)=0.062128, mflops=80.479 (err=1.3e-15) 17. Green: elapsed time t=1.5844 s, 1024 iters, t-(init.)=1.55306 s t(norm)=0.0673235, mflops=74.2683 (err=1.4e-15) 18. GSL: elapsed time t=1.40912 s, 512 iters, t-(init.)=1.3934 s t(norm)=0.120804, mflops=41.3892 (err=1.4e-15) 19. GSL DIT: elapsed time t=1.44417 s, 512 iters, t-(init.)=1.4285 s t(norm)=0.123847, mflops=40.3723 (err=2.0e-15) 20. GSL DIF: elapsed time t=1.25207 s, 512 iters, t-(init.)=1.2364 s t(norm)=0.107193, mflops=46.645 (err=2.3e-15) 21. Krukar: elapsed time t=1.31018 s, 512 iters, t-(init.)=1.29449 s t(norm)=0.112229, mflops=44.5516 (err=1.4e-15) 22. 
Mayer (Buneman): elapsed time t=1.03516 s, 512 iters, t-(init.)=1.0195 s t(norm)=0.0883881, mflops=56.5687 (err=1.4e-15) 23. Mayer (simple): elapsed time t=1.64414 s, 1024 iters, t-(init.)=1.61283 s t(norm)=0.0699143, mflops=71.5162 24. Mayer (lookup): elapsed time t=1.58896 s, 1024 iters, t-(init.)=1.55755 s t(norm)=0.067518, mflops=74.0544 (err=1.4e-15) 25. Monro: elapsed time t=1.98394 s, 1024 iters, t-(init.)=1.95262 s t(norm)=0.0846439, mflops=59.071 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.58278 s, 256 iters, t-(init.)=1.57495 s t(norm)=0.273088, mflops=18.3091 (err=1.5e-14) 27. Nielsen: elapsed time t=1.54963 s, 512 iters, t-(init.)=1.53396 s t(norm)=0.132991, mflops=37.5965 (err=1.1e-14) 28. NR (C): elapsed time t=1.27853 s, 512 iters, t-(init.)=1.26287 s t(norm)=0.109487, mflops=45.6673 (err=1.4e-15) 29. NR (F): elapsed time t=1.28251 s, 512 iters, t-(init.)=1.2668 s t(norm)=0.109829, mflops=45.5253 (err=1.4e-15) 30. Ooura (C): elapsed time t=1.33448 s, 1024 iters, t-(init.)=1.30312 s t(norm)=0.0564885, mflops=88.5135 (err=1.4e-15) 31. Ooura (F): elapsed time t=1.39347 s, 1024 iters, t-(init.)=1.36215 s t(norm)=0.0590476, mflops=84.6775 (err=1.4e-15) 32. QFT: elapsed time t=1.89551 s, 512 iters, t-(init.)=1.87985 s t(norm)=0.162978, mflops=30.6789 (err=1.9e-15) 33. Ransom: elapsed time t=1.74299 s, 512 iters, t-(init.)=1.72731 s t(norm)=0.149754, mflops=33.3881 (err=2.1e-15) 34. SCIPORT: elapsed time t=1.79465 s, 1024 iters, t-(init.)=1.76327 s t(norm)=0.0764358, mflops=65.4144 (err=1.4e-15) 35. Singleton: elapsed time t=1.72478 s, 1024 iters, t-(init.)=1.69341 s t(norm)=0.0734074, mflops=68.113 (err=1.9e-15) 36. Singleton (f2c): elapsed time t=1.6044 s, 1024 iters, t-(init.)=1.57308 s t(norm)=0.0681911, mflops=73.3234 (err=1.9e-15) 37. Sorensen: elapsed time t=1.28649 s, 1024 iters, t-(init.)=1.25518 s t(norm)=0.0544105, mflops=91.8941 (err=1.4e-15) 38. 
Sorensen DIT: elapsed time t=1.48754 s, 256 iters, t-(init.)=1.4797 s t(norm)=0.256574, mflops=19.4876 (err=1.4e-15) 39. Temperton: elapsed time t=1.11212 s, 512 iters, t-(init.)=1.09644 s t(norm)=0.0950588, mflops=52.599 (err=1.1e-07) 40. Temperton (f2c): elapsed time t=1.55197 s, 512 iters, t-(init.)=1.53631 s t(norm)=0.133194, mflops=37.5391 (err=1.4e-15) 41. Valkenburg: elapsed time t=1.76828 s, 64 iters, t-(init.)=1.76631 s t(norm)=1.22508, mflops=4.08136 (err=1.7e-15) 42. SUNPERF: elapsed time t=1.2105 s, 2048 iters, t-(init.)=1.14729 s t(norm)=0.0248668, mflops=201.071 (err=1.4e-15) Top mflops for N=2048 = 201.071 Normalized results and averages for N=2048: fft 0: mflops = 45.7137 (norm. = 0.227351), norm. avg. (of 11) = 0.265397 fft 1: mflops = 48.933 (norm. = 0.243361), norm. avg. (of 11) = 0.264263 fft 2: mflops = 32.0256 (norm. = 0.159275), norm. avg. (of 11) = 0.139342 fft 3: mflops = 20.7781 (norm. = 0.103337), norm. avg. (of 11) = 0.0661986 fft 4: mflops = 62.6973 (norm. = 0.311816), norm. avg. (of 11) = 0.214021 fft 5: mflops = 5.81852 (norm. = 0.0289376), norm. avg. (of 11) = 0.0317689 fft 6: mflops = 68.388 (norm. = 0.340118), norm. avg. (of 11) = 0.224691 fft 7: mflops = 55.2943 (norm. = 0.274998), norm. avg. (of 11) = 0.178148 fft 8: mflops = 21.0816 (norm. = 0.104847), norm. avg. (of 11) = 0.111105 fft 9: mflops = 75.1933 (norm. = 0.373963), norm. avg. (of 11) = 0.26159 fft 10: mflops = 88.7097 (norm. = 0.441185), norm. avg. (of 11) = 0.323126 fft 11: mflops = 19.7893 (norm. = 0.0984194), norm. avg. (of 10) = 0.0763346 fft 12: mflops = 105.974 (norm. = 0.527045), norm. avg. (of 11) = 0.401104 fft 13: mflops = 52.2618 (norm. = 0.259917), norm. avg. (of 11) = 0.241533 fft 14: mflops = 155.581 (norm. = 0.773759), norm. avg. (of 11) = 0.744307 fft 15: mflops = 117.514 (norm. = 0.584438), norm. avg. (of 11) = 0.735949 fft 16: mflops = 80.479 (norm. = 0.400251), norm. avg. (of 11) = 0.726604 fft 17: mflops = 74.2683 (norm. = 0.369363), norm. avg. 
(of 9) = 0.386348 fft 18: mflops = 41.3892 (norm. = 0.205843), norm. avg. (of 11) = 0.179166 fft 19: mflops = 40.3723 (norm. = 0.200786), norm. avg. (of 11) = 0.14879 fft 20: mflops = 46.645 (norm. = 0.231982), norm. avg. (of 11) = 0.17048 fft 21: mflops = 44.5516 (norm. = 0.221571), norm. avg. (of 11) = 0.375112 fft 22: mflops = 56.5687 (norm. = 0.281336), norm. avg. (of 10) = 0.216106 fft 23: mflops = 71.5162 (norm. = 0.355676), norm. avg. (of 10) = 0.272788 fft 24: mflops = 74.0544 (norm. = 0.368299), norm. avg. (of 10) = 0.273014 fft 25: mflops = 59.071 (norm. = 0.293781), norm. avg. (of 10) = 0.225969 fft 26: mflops = 18.3091 (norm. = 0.0910576), norm. avg. (of 11) = 0.0719992 fft 27: mflops = 37.5965 (norm. = 0.186981), norm. avg. (of 11) = 0.135614 fft 28: mflops = 45.6673 (norm. = 0.22712), norm. avg. (of 11) = 0.166284 fft 29: mflops = 45.5253 (norm. = 0.226414), norm. avg. (of 11) = 0.169467 fft 30: mflops = 88.5135 (norm. = 0.44021), norm. avg. (of 11) = 0.397331 fft 31: mflops = 84.6775 (norm. = 0.421131), norm. avg. (of 11) = 0.383791 fft 32: mflops = 30.6789 (norm. = 0.152577), norm. avg. (of 8) = 0.191814 fft 33: mflops = 33.3881 (norm. = 0.166051), norm. avg. (of 10) = 0.102649 fft 34: mflops = 65.4144 (norm. = 0.325329), norm. avg. (of 10) = 0.305871 fft 35: mflops = 68.113 (norm. = 0.33875), norm. avg. (of 11) = 0.263664 fft 36: mflops = 73.3234 (norm. = 0.364663), norm. avg. (of 11) = 0.272558 fft 37: mflops = 91.8941 (norm. = 0.457022), norm. avg. (of 11) = 0.402092 fft 38: mflops = 19.4876 (norm. = 0.0969187), norm. avg. (of 11) = 0.105739 fft 39: mflops = 52.599 (norm. = 0.261594), norm. avg. (of 11) = 0.188749 fft 40: mflops = 37.5391 (norm. = 0.186695), norm. avg. (of 11) = 0.131074 fft 41: mflops = 4.08136 (norm. = 0.0202981), norm. avg. (of 11) = 0.0220433 fft 42: mflops = 201.071 (norm. = 1), norm. avg. (of 11) = 0.639819 Benchmarking for array size = 4096 (power of 2): 0. 
Arndt DIF: elapsed time t=1.32765 s, 256 iters, t-(init.)=1.312 s t(norm)=0.104268, mflops=47.9531 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.23116 s, 256 iters, t-(init.)=1.21552 s t(norm)=0.0966009, mflops=51.7593 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.83272 s, 256 iters, t-(init.)=1.81706 s t(norm)=0.144407, mflops=34.6244 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.40426 s, 128 iters, t-(init.)=1.39643 s t(norm)=0.221957, mflops=22.5269 (err=3.7e-15) 4. Bailey: elapsed time t=1.04698 s, 256 iters, t-(init.)=1.03117 s t(norm)=0.08195, mflops=61.0128 (err=3.7e-15) 5. Beauregard: elapsed time t=1.36451 s, 32 iters, t-(init.)=1.36251 s t(norm)=0.866261, mflops=5.77193 (err=3.8e-15) 6. Bergland: elapsed time t=1.80717 s, 512 iters, t-(init.)=1.77591 s t(norm)=0.0705684, mflops=70.8533 (err=3.9e-15) 7. Brenner: elapsed time t=1.09666 s, 256 iters, t-(init.)=1.08101 s t(norm)=0.0859109, mflops=58.1998 (err=3.8e-15) 8. Burrus: elapsed time t=1.38184 s, 128 iters, t-(init.)=1.37402 s t(norm)=0.218394, mflops=22.8944 (err=3.7e-15) 9. CWP (min N) (N=4290): elapsed time t=1.70464 s, 512 iters, t-(init.)=1.6719 s t(norm)=0.0664351, mflops=75.2614 10. CWP (best N) (N=4368): elapsed time t=1.51554 s, 512 iters, t-(init.)=1.4822 s t(norm)=0.0588971, mflops=84.8938 11. Edelblute: elapsed time t=1.4984 s, 128 iters, t-(init.)=1.49032 s t(norm)=0.23688, mflops=21.1078 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.29305 s, 512 iters, t-(init.)=1.26176 s t(norm)=0.0501378, mflops=99.7252 (err=3.8e-15) 13. FFTPACK (f2c): elapsed time t=1.1312 s, 256 iters, t-(init.)=1.11546 s t(norm)=0.0886485, mflops=56.4026 (err=3.8e-15) FFTW_MEASURE plan: (cost = 1.980208e-03) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.74852 s, 1024 iters, t-(init.)=1.68595 s t(norm)=0.0334967, mflops=149.268 (err=3.8e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.74739 s, 1024 iters, t-(init.)=1.68476 s t(norm)=0.0334731, mflops=149.374 (err=3.8e-15) 16. Frigo-old: elapsed time t=1.80201 s, 512 iters, t-(init.)=1.77074 s t(norm)=0.0703631, mflops=71.06 (err=3.8e-15) 17. Green: elapsed time t=1.45147 s, 512 iters, t-(init.)=1.42017 s t(norm)=0.0564324, mflops=88.6015 (err=3.8e-15) 18. GSL: elapsed time t=1.43285 s, 256 iters, t-(init.)=1.41721 s t(norm)=0.112629, mflops=44.3934 (err=3.8e-15) 19. GSL DIT: elapsed time t=1.58899 s, 256 iters, t-(init.)=1.57335 s t(norm)=0.125039, mflops=39.9877 (err=4.1e-15) 20. GSL DIF: elapsed time t=1.38527 s, 256 iters, t-(init.)=1.36962 s t(norm)=0.108847, mflops=45.9359 (err=4.3e-15) 21. Krukar: elapsed time t=1.45784 s, 256 iters, t-(init.)=1.44219 s t(norm)=0.114615, mflops=43.6242 (err=3.8e-15) 22. Mayer (Buneman): elapsed time t=1.30434 s, 256 iters, t-(init.)=1.28871 s t(norm)=0.102417, mflops=48.8198 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.13705 s, 256 iters, t-(init.)=1.12142 s t(norm)=0.0891222, mflops=56.1027 24. Mayer (lookup): elapsed time t=1.11632 s, 256 iters, t-(init.)=1.10068 s t(norm)=0.0874739, mflops=57.1599 (err=3.7e-15) 25. Monro: elapsed time t=1.08097 s, 256 iters, t-(init.)=1.06532 s t(norm)=0.084664, mflops=59.057 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.66589 s, 128 iters, t-(init.)=1.65807 s t(norm)=0.263544, mflops=18.9722 (err=4.9e-14) 27. Nielsen: elapsed time t=1.69844 s, 256 iters, t-(init.)=1.68281 s t(norm)=0.133737, mflops=37.3867 (err=2.6e-14) 28. NR (C): elapsed time t=1.40586 s, 256 iters, t-(init.)=1.39023 s t(norm)=0.110485, mflops=45.2549 (err=3.9e-15) 29. NR (F): elapsed time t=1.37468 s, 256 iters, t-(init.)=1.35904 s t(norm)=0.108007, mflops=46.2934 (err=3.9e-15) 30. Ooura (C): elapsed time t=1.32626 s, 512 iters, t-(init.)=1.29494 s t(norm)=0.0514563, mflops=97.1698 (err=3.9e-15) 31. 
Ooura (F): elapsed time t=1.36433 s, 512 iters, t-(init.)=1.33306 s t(norm)=0.0529709, mflops=94.3915 (err=3.9e-15) 32. QFT: elapsed time t=1.09177 s, 128 iters, t-(init.)=1.08393 s t(norm)=0.172287, mflops=29.0214 (err=4.3e-15) 33. Ransom: elapsed time t=1.36586 s, 256 iters, t-(init.)=1.35022 s t(norm)=0.107306, mflops=46.5958 (err=4.4e-15) 34. SCIPORT: elapsed time t=1.94631 s, 512 iters, t-(init.)=1.915 s t(norm)=0.0760951, mflops=65.7072 (err=3.8e-15) 35. Singleton: elapsed time t=1.534 s, 512 iters, t-(init.)=1.5018 s t(norm)=0.0596763, mflops=83.7853 (err=5.8e-15) 36. Singleton (f2c): elapsed time t=1.4981 s, 512 iters, t-(init.)=1.46684 s t(norm)=0.0582869, mflops=85.7826 (err=5.8e-15) 37. Sorensen: elapsed time t=1.38814 s, 512 iters, t-(init.)=1.35688 s t(norm)=0.0539174, mflops=92.7344 (err=3.7e-15) 38. Sorensen DIT: elapsed time t=1.47763 s, 128 iters, t-(init.)=1.4698 s t(norm)=0.233619, mflops=21.4024 (err=3.7e-15) 39. Temperton: elapsed time t=1.10279 s, 256 iters, t-(init.)=1.08715 s t(norm)=0.0863987, mflops=57.8712 (err=1.2e-07) 40. Temperton (f2c): elapsed time t=1.6587 s, 256 iters, t-(init.)=1.64306 s t(norm)=0.130578, mflops=38.2912 (err=3.8e-15) 41. Valkenburg: elapsed time t=1.94393 s, 32 iters, t-(init.)=1.94197 s t(norm)=1.23467, mflops=4.04967 (err=4.0e-15) 42. SUNPERF: elapsed time t=1.28777 s, 1024 iters, t-(init.)=1.22524 s t(norm)=0.0243434, mflops=205.395 (err=3.8e-15) Top mflops for N=4096 = 205.395 Normalized results and averages for N=4096: fft 0: mflops = 47.9531 (norm. = 0.233468), norm. avg. (of 12) = 0.262736 fft 1: mflops = 51.7593 (norm. = 0.251999), norm. avg. (of 12) = 0.263241 fft 2: mflops = 34.6244 (norm. = 0.168575), norm. avg. (of 12) = 0.141778 fft 3: mflops = 22.5269 (norm. = 0.109676), norm. avg. (of 12) = 0.0698217 fft 4: mflops = 61.0128 (norm. = 0.297052), norm. avg. (of 12) = 0.22094 fft 5: mflops = 5.77193 (norm. = 0.0281017), norm. avg. (of 12) = 0.0314633 fft 6: mflops = 70.8533 (norm. = 0.344962), norm. 
avg. (of 12) = 0.234713 fft 7: mflops = 58.1998 (norm. = 0.283356), norm. avg. (of 12) = 0.186915 fft 8: mflops = 22.8944 (norm. = 0.111465), norm. avg. (of 12) = 0.111135 fft 9: mflops = 75.2614 (norm. = 0.366423), norm. avg. (of 12) = 0.270326 fft 10: mflops = 84.8938 (norm. = 0.41332), norm. avg. (of 12) = 0.330643 fft 11: mflops = 21.1078 (norm. = 0.102767), norm. avg. (of 11) = 0.0787375 fft 12: mflops = 99.7252 (norm. = 0.48553), norm. avg. (of 12) = 0.408139 fft 13: mflops = 56.4026 (norm. = 0.274606), norm. avg. (of 12) = 0.244289 fft 14: mflops = 149.268 (norm. = 0.72674), norm. avg. (of 12) = 0.742843 fft 15: mflops = 149.374 (norm. = 0.727253), norm. avg. (of 12) = 0.735224 fft 16: mflops = 71.06 (norm. = 0.345968), norm. avg. (of 12) = 0.694884 fft 17: mflops = 88.6015 (norm. = 0.431372), norm. avg. (of 10) = 0.39085 fft 18: mflops = 44.3934 (norm. = 0.216137), norm. avg. (of 12) = 0.182247 fft 19: mflops = 39.9877 (norm. = 0.194687), norm. avg. (of 12) = 0.152615 fft 20: mflops = 45.9359 (norm. = 0.223647), norm. avg. (of 12) = 0.17491 fft 21: mflops = 43.6242 (norm. = 0.212392), norm. avg. (of 12) = 0.361552 fft 22: mflops = 48.8198 (norm. = 0.237688), norm. avg. (of 11) = 0.218068 fft 23: mflops = 56.1027 (norm. = 0.273146), norm. avg. (of 11) = 0.272821 fft 24: mflops = 57.1599 (norm. = 0.278293), norm. avg. (of 11) = 0.273494 fft 25: mflops = 59.057 (norm. = 0.287529), norm. avg. (of 11) = 0.231566 fft 26: mflops = 18.9722 (norm. = 0.0923695), norm. avg. (of 12) = 0.0736967 fft 27: mflops = 37.3867 (norm. = 0.182024), norm. avg. (of 12) = 0.139481 fft 28: mflops = 45.2549 (norm. = 0.220332), norm. avg. (of 12) = 0.170788 fft 29: mflops = 46.2934 (norm. = 0.225388), norm. avg. (of 12) = 0.174127 fft 30: mflops = 97.1698 (norm. = 0.473088), norm. avg. (of 12) = 0.403644 fft 31: mflops = 94.3915 (norm. = 0.459562), norm. avg. (of 12) = 0.390105 fft 32: mflops = 29.0214 (norm. = 0.141296), norm. avg. (of 9) = 0.186201 fft 33: mflops = 46.5958 (norm. 
= 0.22686), norm. avg. (of 11) = 0.113941 fft 34: mflops = 65.7072 (norm. = 0.319907), norm. avg. (of 11) = 0.307147 fft 35: mflops = 83.7853 (norm. = 0.407924), norm. avg. (of 12) = 0.275686 fft 36: mflops = 85.7826 (norm. = 0.417648), norm. avg. (of 12) = 0.284648 fft 37: mflops = 92.7344 (norm. = 0.451494), norm. avg. (of 12) = 0.406209 fft 38: mflops = 21.4024 (norm. = 0.104201), norm. avg. (of 12) = 0.105611 fft 39: mflops = 57.8712 (norm. = 0.281756), norm. avg. (of 12) = 0.1965 fft 40: mflops = 38.2912 (norm. = 0.186427), norm. avg. (of 12) = 0.135686 fft 41: mflops = 4.04967 (norm. = 0.0197165), norm. avg. (of 12) = 0.0218494 fft 42: mflops = 205.395 (norm. = 1), norm. avg. (of 12) = 0.669834 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.34615 s, 128 iters, t-(init.)=1.33051 s t(norm)=0.0976056, mflops=51.2266 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.24124 s, 128 iters, t-(init.)=1.22559 s t(norm)=0.0899089, mflops=55.6119 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.90892 s, 128 iters, t-(init.)=1.89329 s t(norm)=0.138891, mflops=35.9995 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.45263 s, 64 iters, t-(init.)=1.4448 s t(norm)=0.21198, mflops=23.5871 (err=3.7e-15) 4. Bailey: elapsed time t=1.07568 s, 128 iters, t-(init.)=1.05995 s t(norm)=0.0777577, mflops=64.3023 (err=3.7e-15) 5. Beauregard: elapsed time t=1.49997 s, 16 iters, t-(init.)=1.49798 s t(norm)=0.879131, mflops=5.68743 (err=3.7e-15) 6. Bergland: elapsed time t=1.9467 s, 256 iters, t-(init.)=1.91521 s t(norm)=0.0702497, mflops=71.1747 (err=3.7e-15) 7. Brenner: elapsed time t=1.18235 s, 128 iters, t-(init.)=1.16619 s t(norm)=0.0855513, mflops=58.4444 (err=3.7e-15) 8. Burrus: elapsed time t=1.39535 s, 64 iters, t-(init.)=1.38752 s t(norm)=0.203576, mflops=24.5609 (err=3.7e-15) 9. CWP (min N) (N=8580): elapsed time t=1.74061 s, 256 iters, t-(init.)=1.70787 s t(norm)=0.0626444, mflops=79.8156 10. 
CWP (best N) (N=9240): elapsed time t=1.54564 s, 256 iters, t-(init.)=1.51039 s t(norm)=0.0554009, mflops=90.2513 11. Edelblute: elapsed time t=1.53929 s, 64 iters, t-(init.)=1.53147 s t(norm)=0.224696, mflops=22.2523 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.74619 s, 256 iters, t-(init.)=1.71492 s t(norm)=0.0629031, mflops=79.4874 (err=3.7e-15) 13. FFTPACK (f2c): elapsed time t=1.50032 s, 128 iters, t-(init.)=1.48468 s t(norm)=0.108916, mflops=45.9071 (err=3.7e-15) FFTW_MEASURE plan: (cost = 4.346116e-03) FFTW_TWIDDLE 16 FFTW_TWIDDLE 32 FFTW_NOTW 16 14. FFTW: elapsed time t=1.2066 s, 256 iters, t-(init.)=1.17491 s t(norm)=0.0430954, mflops=116.022 (err=3.7e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.09339 s, 256 iters, t-(init.)=1.06172 s t(norm)=0.0389438, mflops=128.39 (err=3.7e-15) 16. Frigo-old: elapsed time t=1.01265 s, 128 iters, t-(init.)=0.996931 s t(norm)=0.0731344, mflops=68.3673 (err=3.7e-15) 17. Green: elapsed time t=1.57278 s, 256 iters, t-(init.)=1.54152 s t(norm)=0.0565427, mflops=88.4287 (err=3.7e-15) 18. GSL: elapsed time t=1.69147 s, 128 iters, t-(init.)=1.67583 s t(norm)=0.122938, mflops=40.6709 (err=3.7e-15) 19. GSL DIT: elapsed time t=1.76255 s, 128 iters, t-(init.)=1.7469 s t(norm)=0.128152, mflops=39.0161 (err=4.3e-15) 20. GSL DIF: elapsed time t=1.52132 s, 128 iters, t-(init.)=1.50568 s t(norm)=0.110456, mflops=45.2668 (err=4.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.37108 s, 128 iters, t-(init.)=1.35544 s t(norm)=0.0994348, mflops=50.2842 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.15826 s, 128 iters, t-(init.)=1.14262 s t(norm)=0.0838223, mflops=59.65 24. Mayer (lookup): elapsed time t=1.15902 s, 128 iters, t-(init.)=1.14338 s t(norm)=0.0838778, mflops=59.6106 (err=3.7e-15) 25. Monro: elapsed time t=1.06331 s, 128 iters, t-(init.)=1.04766 s t(norm)=0.0768561, mflops=65.0566 (err=1.3e-07) 26. 
NAPACK (f2c): elapsed time t=1.84442 s, 64 iters, t-(init.)=1.83659 s t(norm)=0.269463, mflops=18.5554 (err=4.5e-14) 27. Nielsen: elapsed time t=1.95714 s, 128 iters, t-(init.)=1.9415 s t(norm)=0.142428, mflops=35.1055 (err=1.1e-14) 28. NR (C): elapsed time t=1.54874 s, 128 iters, t-(init.)=1.53311 s t(norm)=0.112468, mflops=44.4571 (err=3.9e-15) 29. NR (F): elapsed time t=1.49145 s, 128 iters, t-(init.)=1.47582 s t(norm)=0.108265, mflops=46.1828 (err=3.9e-15) 30. Ooura (C): elapsed time t=1.51455 s, 256 iters, t-(init.)=1.48323 s t(norm)=0.0544045, mflops=91.9042 (err=3.7e-15) 31. Ooura (F): elapsed time t=1.55795 s, 256 iters, t-(init.)=1.52665 s t(norm)=0.0559971, mflops=89.2903 (err=3.7e-15) 32. QFT: elapsed time t=1.33103 s, 64 iters, t-(init.)=1.32315 s t(norm)=0.194131, mflops=25.7558 (err=4.7e-15) 33. Ransom: elapsed time t=1.79249 s, 128 iters, t-(init.)=1.77685 s t(norm)=0.130349, mflops=38.3585 (err=4.9e-15) 34. SCIPORT: elapsed time t=1.07552 s, 64 iters, t-(init.)=1.06763 s t(norm)=0.156642, mflops=31.9199 (err=3.7e-15) 35. Singleton: elapsed time t=1.67425 s, 256 iters, t-(init.)=1.64298 s t(norm)=0.0602641, mflops=82.9681 (err=5.6e-15) 36. Singleton (f2c): elapsed time t=1.72035 s, 256 iters, t-(init.)=1.68821 s t(norm)=0.0619231, mflops=80.7454 (err=5.6e-15) 37. Sorensen: elapsed time t=1.5204 s, 256 iters, t-(init.)=1.4891 s t(norm)=0.0546198, mflops=91.5419 (err=3.7e-15) 38. Sorensen DIT: elapsed time t=1.50647 s, 64 iters, t-(init.)=1.49864 s t(norm)=0.219879, mflops=22.7397 (err=3.7e-15) 39. Temperton: elapsed time t=1.27749 s, 128 iters, t-(init.)=1.26185 s t(norm)=0.0925687, mflops=54.0139 (err=1.4e-07) 40. Temperton (f2c): elapsed time t=1.97203 s, 128 iters, t-(init.)=1.95638 s t(norm)=0.143519, mflops=34.8385 (err=3.7e-15) 41. Valkenburg: elapsed time t=1.05828 s, 8 iters, t-(init.)=1.05718 s t(norm)=1.24087, mflops=4.02944 (err=3.8e-15) 42. 
SUNPERF: elapsed time t=1.02419 s, 256 iters, t-(init.)=0.992932 s t(norm)=0.0364205, mflops=137.285 (err=3.7e-15) Top mflops for N=8192 = 137.285 Normalized results and averages for N=8192: fft 0: mflops = 51.2266 (norm. = 0.37314), norm. avg. (of 13) = 0.271229 fft 1: mflops = 55.6119 (norm. = 0.405083), norm. avg. (of 13) = 0.274152 fft 2: mflops = 35.9995 (norm. = 0.262224), norm. avg. (of 13) = 0.151044 fft 3: mflops = 23.5871 (norm. = 0.171811), norm. avg. (of 13) = 0.077667 fft 4: mflops = 64.3023 (norm. = 0.468385), norm. avg. (of 13) = 0.239975 fft 5: mflops = 5.68743 (norm. = 0.0414279), norm. avg. (of 13) = 0.0322298 fft 6: mflops = 71.1747 (norm. = 0.518444), norm. avg. (of 13) = 0.256539 fft 7: mflops = 58.4444 (norm. = 0.425715), norm. avg. (of 13) = 0.205284 fft 8: mflops = 24.5609 (norm. = 0.178904), norm. avg. (of 13) = 0.116348 fft 9: mflops = 79.8156 (norm. = 0.581385), norm. avg. (of 13) = 0.294254 fft 10: mflops = 90.2513 (norm. = 0.6574), norm. avg. (of 13) = 0.355778 fft 11: mflops = 22.2523 (norm. = 0.162088), norm. avg. (of 12) = 0.0856834 fft 12: mflops = 79.4874 (norm. = 0.578994), norm. avg. (of 13) = 0.421282 fft 13: mflops = 45.9071 (norm. = 0.334392), norm. avg. (of 13) = 0.25122 fft 14: mflops = 116.022 (norm. = 0.845113), norm. avg. (of 13) = 0.75071 fft 15: mflops = 128.39 (norm. = 0.935207), norm. avg. (of 13) = 0.750607 fft 16: mflops = 68.3673 (norm. = 0.497994), norm. avg. (of 13) = 0.679739 fft 17: mflops = 88.4287 (norm. = 0.644124), norm. avg. (of 11) = 0.413875 fft 18: mflops = 40.6709 (norm. = 0.296251), norm. avg. (of 13) = 0.191016 fft 19: mflops = 39.0161 (norm. = 0.284198), norm. avg. (of 13) = 0.162736 fft 20: mflops = 45.2668 (norm. = 0.329728), norm. avg. (of 13) = 0.186819 fft 21: mflops = -1 (norm. = -0.0072841), norm. avg. (of 12) = 0.361552 fft 22: mflops = 50.2842 (norm. = 0.366275), norm. avg. (of 12) = 0.230418 fft 23: mflops = 59.65 (norm. = 0.434497), norm. avg. 
(of 12) = 0.286294 fft 24: mflops = 59.6106 (norm. = 0.43421), norm. avg. (of 12) = 0.286887 fft 25: mflops = 65.0566 (norm. = 0.473879), norm. avg. (of 12) = 0.251759 fft 26: mflops = 18.5554 (norm. = 0.13516), norm. avg. (of 13) = 0.0784247 fft 27: mflops = 35.1055 (norm. = 0.255712), norm. avg. (of 13) = 0.148422 fft 28: mflops = 44.4571 (norm. = 0.32383), norm. avg. (of 13) = 0.18256 fft 29: mflops = 46.1828 (norm. = 0.3364), norm. avg. (of 13) = 0.18661 fft 30: mflops = 91.9042 (norm. = 0.66944), norm. avg. (of 13) = 0.42409 fft 31: mflops = 89.2903 (norm. = 0.6504), norm. avg. (of 13) = 0.410128 fft 32: mflops = 25.7558 (norm. = 0.187608), norm. avg. (of 10) = 0.186341 fft 33: mflops = 38.3585 (norm. = 0.279408), norm. avg. (of 12) = 0.127729 fft 34: mflops = 31.9199 (norm. = 0.232508), norm. avg. (of 12) = 0.300927 fft 35: mflops = 82.9681 (norm. = 0.604348), norm. avg. (of 13) = 0.300968 fft 36: mflops = 80.7454 (norm. = 0.588158), norm. avg. (of 13) = 0.307995 fft 37: mflops = 91.5419 (norm. = 0.6668), norm. avg. (of 13) = 0.426255 fft 38: mflops = 22.7397 (norm. = 0.165639), norm. avg. (of 13) = 0.110228 fft 39: mflops = 54.0139 (norm. = 0.393443), norm. avg. (of 13) = 0.211649 fft 40: mflops = 34.8385 (norm. = 0.253767), norm. avg. (of 13) = 0.14477 fft 41: mflops = 4.02944 (norm. = 0.0293509), norm. avg. (of 13) = 0.0224264 fft 42: mflops = 137.285 (norm. = 1), norm. avg. (of 13) = 0.695232 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.98846 s, 64 iters, t-(init.)=1.95977 s t(norm)=0.133499, mflops=37.4536 (err=6.8e-15) 1. Arndt DIT: elapsed time t=1.84215 s, 64 iters, t-(init.)=1.81356 s t(norm)=0.123539, mflops=40.4731 (err=6.8e-15) 2. Arndt Split-Radix: elapsed time t=1.36552 s, 32 iters, t-(init.)=1.35116 s t(norm)=0.184081, mflops=27.162 (err=6.8e-15) 3. Arndt 4-step: elapsed time t=1.49572 s, 32 iters, t-(init.)=1.48136 s t(norm)=0.20182, mflops=24.7746 (err=6.8e-15) 4. 
Bailey: elapsed time t=1.40438 s, 32 iters, t-(init.)=1.38993 s t(norm)=0.189363, mflops=26.4043 (err=6.8e-15) 5. Beauregard: elapsed time t=1.60389 s, 8 iters, t-(init.)=1.60025 s t(norm)=0.872069, mflops=5.73349 (err=6.8e-15) 6. Bergland: elapsed time t=1.3087 s, 64 iters, t-(init.)=1.27997 s t(norm)=0.087191, mflops=57.3454 (err=6.8e-15) 7. Brenner: elapsed time t=1.536 s, 64 iters, t-(init.)=1.50726 s t(norm)=0.102674, mflops=48.6979 (err=6.8e-15) 8. Burrus: elapsed time t=1.78146 s, 32 iters, t-(init.)=1.7671 s t(norm)=0.240748, mflops=20.7686 (err=6.8e-15) 9. CWP (min N) (N=17160): elapsed time t=1.01268 s, 64 iters, t-(init.)=0.983265 s t(norm)=0.0669796, mflops=74.6496 10. CWP (best N) (N=17160): elapsed time t=1.01268 s, 64 iters, t-(init.)=0.983236 s t(norm)=0.0669776, mflops=74.6518 11. Edelblute: elapsed time t=1.98924 s, 32 iters, t-(init.)=1.97488 s t(norm)=0.269056, mflops=18.5835 (err=6.8e-15) 12. FFTPACK: elapsed time t=1.60537 s, 64 iters, t-(init.)=1.57661 s t(norm)=0.107398, mflops=46.5558 (err=6.8e-15) 13. FFTPACK (f2c): elapsed time t=1.09957 s, 32 iters, t-(init.)=1.08515 s t(norm)=0.14784, mflops=33.8204 (err=6.8e-15) FFTW_MEASURE plan: (cost = 1.149136e-02) FFTW_TWIDDLE 64 FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.86281 s, 128 iters, t-(init.)=1.80486 s t(norm)=0.0614732, mflops=81.3362 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.91424 s, 128 iters, t-(init.)=1.85597 s t(norm)=0.0632141, mflops=79.0963 (err=6.8e-15) 16. Frigo-old: elapsed time t=1.54536 s, 64 iters, t-(init.)=1.51627 s t(norm)=0.103288, mflops=48.4083 (err=6.8e-15) 17. Green: elapsed time t=1.08648 s, 64 iters, t-(init.)=1.05778 s t(norm)=0.0720553, mflops=69.3911 (err=6.8e-15) 18. GSL: elapsed time t=1.12261 s, 32 iters, t-(init.)=1.10801 s t(norm)=0.150954, mflops=33.1226 (err=6.8e-15) 19. 
GSL DIT: elapsed time t=1.19803 s, 32 iters, t-(init.)=1.18366 s t(norm)=0.16126, mflops=31.0058 (err=7.2e-15) 20. GSL DIF: elapsed time t=1.09751 s, 32 iters, t-(init.)=1.08315 s t(norm)=0.147568, mflops=33.8828 (err=7.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.92215 s, 64 iters, t-(init.)=1.89348 s t(norm)=0.128983, mflops=38.7648 (err=6.8e-15) 23. Mayer (simple): elapsed time t=1.71479 s, 64 iters, t-(init.)=1.68613 s t(norm)=0.114858, mflops=43.5319 24. Mayer (lookup): elapsed time t=1.70929 s, 64 iters, t-(init.)=1.6806 s t(norm)=0.114482, mflops=43.675 (err=6.8e-15) 25. Monro: elapsed time t=1.67106 s, 64 iters, t-(init.)=1.64229 s t(norm)=0.111872, mflops=44.6938 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.18871 s, 16 iters, t-(init.)=1.18124 s t(norm)=0.321861, mflops=15.5346 (err=2.3e-13) 27. Nielsen: elapsed time t=1.3932 s, 32 iters, t-(init.)=1.37847 s t(norm)=0.187801, mflops=26.6239 (err=1.3e-13) 28. NR (C): elapsed time t=1.07883 s, 32 iters, t-(init.)=1.06447 s t(norm)=0.145022, mflops=34.4774 (err=6.9e-15) 29. NR (F): elapsed time t=1.06916 s, 32 iters, t-(init.)=1.0548 s t(norm)=0.143705, mflops=34.7936 (err=6.9e-15) 30. Ooura (C): elapsed time t=1.78226 s, 128 iters, t-(init.)=1.72491 s t(norm)=0.0587502, mflops=85.1061 (err=6.8e-15) 31. Ooura (F): elapsed time t=1.81746 s, 128 iters, t-(init.)=1.7601 s t(norm)=0.0599485, mflops=83.4049 (err=6.8e-15) 32. QFT: elapsed time t=1.63927 s, 32 iters, t-(init.)=1.62422 s t(norm)=0.221283, mflops=22.5955 (err=8.1e-15) 33. Ransom: elapsed time t=1.54383 s, 64 iters, t-(init.)=1.51514 s t(norm)=0.103211, mflops=48.4445 (err=7.4e-15) 34. SCIPORT: elapsed time t=1.18432 s, 16 iters, t-(init.)=1.17671 s t(norm)=0.320628, mflops=15.5944 (err=6.8e-15) 35. Singleton: elapsed time t=1.13893 s, 64 iters, t-(init.)=1.11017 s t(norm)=0.0756246, mflops=66.116 (err=1.0e-14) 36. 
Singleton (f2c): elapsed time t=1.16019 s, 64 iters, t-(init.)=1.13145 s t(norm)=0.0770742, mflops=64.8725 (err=1.0e-14) 37. Sorensen: elapsed time t=1.21991 s, 64 iters, t-(init.)=1.19126 s t(norm)=0.0811481, mflops=61.6157 (err=6.8e-15) 38. Sorensen DIT: elapsed time t=1.8949 s, 32 iters, t-(init.)=1.88052 s t(norm)=0.256201, mflops=19.5159 (err=6.8e-15) 39. Temperton: elapsed time t=1.59801 s, 64 iters, t-(init.)=1.5693 s t(norm)=0.1069, mflops=46.7727 (err=1.5e-07) 40. Temperton (f2c): elapsed time t=1.13819 s, 32 iters, t-(init.)=1.1235 s t(norm)=0.153065, mflops=32.6659 (err=6.8e-15) 41. Valkenburg: elapsed time t=1.20055 s, 4 iters, t-(init.)=1.19824 s t(norm)=1.30597, mflops=3.82856 (err=6.9e-15) 42. SUNPERF: elapsed time t=1.25767 s, 64 iters, t-(init.)=1.22893 s t(norm)=0.0837142, mflops=59.727 (err=6.8e-15) Top mflops for N=16384 = 85.1061 Normalized results and averages for N=16384: fft 0: mflops = 37.4536 (norm. = 0.440082), norm. avg. (of 14) = 0.283289 fft 1: mflops = 40.4731 (norm. = 0.475561), norm. avg. (of 14) = 0.288539 fft 2: mflops = 27.162 (norm. = 0.319154), norm. avg. (of 14) = 0.163051 fft 3: mflops = 24.7746 (norm. = 0.291102), norm. avg. (of 14) = 0.0929124 fft 4: mflops = 26.4043 (norm. = 0.310251), norm. avg. (of 14) = 0.244994 fft 5: mflops = 5.73349 (norm. = 0.0673688), norm. avg. (of 14) = 0.0347397 fft 6: mflops = 57.3454 (norm. = 0.673811), norm. avg. (of 14) = 0.286344 fft 7: mflops = 48.6979 (norm. = 0.572203), norm. avg. (of 14) = 0.231493 fft 8: mflops = 20.7686 (norm. = 0.244032), norm. avg. (of 14) = 0.125468 fft 9: mflops = 74.6496 (norm. = 0.877136), norm. avg. (of 14) = 0.335888 fft 10: mflops = 74.6518 (norm. = 0.877162), norm. avg. (of 14) = 0.393019 fft 11: mflops = 18.5835 (norm. = 0.218357), norm. avg. (of 13) = 0.095889 fft 12: mflops = 46.5558 (norm. = 0.547033), norm. avg. (of 14) = 0.430264 fft 13: mflops = 33.8204 (norm. = 0.397391), norm. avg. (of 14) = 0.26166 fft 14: mflops = 81.3362 (norm. = 0.955704), norm. 
avg. (of 14) = 0.765352 fft 15: mflops = 79.0963 (norm. = 0.929385), norm. avg. (of 14) = 0.763377 fft 16: mflops = 48.4083 (norm. = 0.5688), norm. avg. (of 14) = 0.671814 fft 17: mflops = 69.3911 (norm. = 0.815349), norm. avg. (of 12) = 0.447331 fft 18: mflops = 33.1226 (norm. = 0.389193), norm. avg. (of 14) = 0.205172 fft 19: mflops = 31.0058 (norm. = 0.364319), norm. avg. (of 14) = 0.177135 fft 20: mflops = 33.8828 (norm. = 0.398124), norm. avg. (of 14) = 0.201913 fft 21: mflops = -1 (norm. = -0.01175), norm. avg. (of 12) = 0.361552 fft 22: mflops = 38.7648 (norm. = 0.455488), norm. avg. (of 13) = 0.247731 fft 23: mflops = 43.5319 (norm. = 0.511501), norm. avg. (of 13) = 0.303618 fft 24: mflops = 43.675 (norm. = 0.513183), norm. avg. (of 13) = 0.304294 fft 25: mflops = 44.6938 (norm. = 0.525154), norm. avg. (of 13) = 0.272789 fft 26: mflops = 15.5346 (norm. = 0.182533), norm. avg. (of 14) = 0.0858609 fft 27: mflops = 26.6239 (norm. = 0.312832), norm. avg. (of 14) = 0.160165 fft 28: mflops = 34.4774 (norm. = 0.405111), norm. avg. (of 14) = 0.198457 fft 29: mflops = 34.7936 (norm. = 0.408826), norm. avg. (of 14) = 0.202483 fft 30: mflops = 85.1061 (norm. = 1), norm. avg. (of 14) = 0.465226 fft 31: mflops = 83.4049 (norm. = 0.980011), norm. avg. (of 14) = 0.450834 fft 32: mflops = 22.5955 (norm. = 0.265498), norm. avg. (of 11) = 0.193537 fft 33: mflops = 48.4445 (norm. = 0.569225), norm. avg. (of 13) = 0.161691 fft 34: mflops = 15.5944 (norm. = 0.183235), norm. avg. (of 13) = 0.291874 fft 35: mflops = 66.116 (norm. = 0.776867), norm. avg. (of 14) = 0.33496 fft 36: mflops = 64.8725 (norm. = 0.762255), norm. avg. (of 14) = 0.340442 fft 37: mflops = 61.6157 (norm. = 0.723988), norm. avg. (of 14) = 0.447521 fft 38: mflops = 19.5159 (norm. = 0.229313), norm. avg. (of 14) = 0.118734 fft 39: mflops = 46.7727 (norm. = 0.549581), norm. avg. (of 14) = 0.235787 fft 40: mflops = 32.6659 (norm. = 0.383826), norm. avg. (of 14) = 0.161845 fft 41: mflops = 3.82856 (norm. 
= 0.0449858), norm. avg. (of 14) = 0.0240378 fft 42: mflops = 59.727 (norm. = 0.701795), norm. avg. (of 14) = 0.6957 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.66889 s, 16 iters, t-(init.)=1.64306 s t(norm)=0.208926, mflops=23.9319 (err=1.4e-14) 1. Arndt DIT: elapsed time t=1.64719 s, 16 iters, t-(init.)=1.6214 s t(norm)=0.206172, mflops=24.2516 (err=1.4e-14) 2. Arndt Split-Radix: elapsed time t=1.14578 s, 8 iters, t-(init.)=1.13286 s t(norm)=0.288102, mflops=17.355 (err=1.4e-14) 3. Arndt 4-step: elapsed time t=1.01393 s, 8 iters, t-(init.)=1.00096 s t(norm)=0.254558, mflops=19.6419 (err=1.4e-14) 4. Bailey: elapsed time t=1.27491 s, 8 iters, t-(init.)=1.26149 s t(norm)=0.320814, mflops=15.5854 (err=1.4e-14) 5. Beauregard: elapsed time t=1.76649 s, 4 iters, t-(init.)=1.75997 s t(norm)=0.895168, mflops=5.58555 (err=1.4e-14) 6. Bergland: elapsed time t=1.93195 s, 32 iters, t-(init.)=1.88029 s t(norm)=0.119545, mflops=41.8251 (err=1.4e-14) 7. Brenner: elapsed time t=1.19505 s, 16 iters, t-(init.)=1.16916 s t(norm)=0.148666, mflops=33.6325 (err=1.4e-14) 8. Burrus: elapsed time t=1.44672 s, 8 iters, t-(init.)=1.43371 s t(norm)=0.364612, mflops=13.7132 (err=1.4e-14) 9. CWP (min N) (N=34320): elapsed time t=1.16343 s, 32 iters, t-(init.)=1.10935 s t(norm)=0.0705304, mflops=70.8914 10. CWP (best N) (N=34320): elapsed time t=1.16628 s, 32 iters, t-(init.)=1.11166 s t(norm)=0.0706773, mflops=70.7441 11. Edelblute: elapsed time t=1.43921 s, 8 iters, t-(init.)=1.42626 s t(norm)=0.362717, mflops=13.7849 (err=1.4e-14) 12. FFTPACK: elapsed time t=1.09472 s, 16 iters, t-(init.)=1.06802 s t(norm)=0.135806, mflops=36.8173 (err=1.4e-14) 13. FFTPACK (f2c): elapsed time t=1.457 s, 16 iters, t-(init.)=1.43029 s t(norm)=0.181871, mflops=27.492 (err=1.4e-14) FFTW_MEASURE plan: (cost = 3.478626e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 64 FFTW_NOTW 32 14. 
FFTW: elapsed time t=1.44147 s, 32 iters, t-(init.)=1.38793 s t(norm)=0.0882419, mflops=56.6624 (err=1.4e-14) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.41626 s, 32 iters, t-(init.)=1.36273 s t(norm)=0.0866403, mflops=57.7098 (err=1.4e-14) 16. Frigo-old: elapsed time t=1.04699 s, 16 iters, t-(init.)=1.01997 s t(norm)=0.129696, mflops=38.5516 (err=1.4e-14) 17. Green: elapsed time t=1.79474 s, 32 iters, t-(init.)=1.74294 s t(norm)=0.110813, mflops=45.121 (err=1.4e-14) 18. GSL: elapsed time t=1.49046 s, 16 iters, t-(init.)=1.46397 s t(norm)=0.186153, mflops=26.8596 (err=1.4e-14) 19. GSL DIT: elapsed time t=1.92481 s, 16 iters, t-(init.)=1.89914 s t(norm)=0.241488, mflops=20.7049 (err=1.4e-14) 20. GSL DIF: elapsed time t=1.83223 s, 16 iters, t-(init.)=1.80635 s t(norm)=0.229689, mflops=21.7686 (err=1.4e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.03263 s, 16 iters, t-(init.)=1.00705 s t(norm)=0.128053, mflops=39.0463 (err=1.4e-14) 23. Mayer (simple): elapsed time t=1.87098 s, 32 iters, t-(init.)=1.81976 s t(norm)=0.115697, mflops=43.2162 24. Mayer (lookup): elapsed time t=1.92433 s, 32 iters, t-(init.)=1.8729 s t(norm)=0.119076, mflops=41.9901 (err=1.4e-14) 25. Monro: elapsed time t=1.76415 s, 16 iters, t-(init.)=1.73827 s t(norm)=0.221033, mflops=22.6211 (err=1.5e-07) 26. NAPACK (f2c): elapsed time t=1.58826 s, 8 iters, t-(init.)=1.57458 s t(norm)=0.400436, mflops=12.4864 (err=5.6e-13) 27. Nielsen: elapsed time t=1.2067 s, 8 iters, t-(init.)=1.19261 s t(norm)=0.303296, mflops=16.4856 (err=2.3e-13) 28. NR (C): elapsed time t=1.79863 s, 16 iters, t-(init.)=1.77297 s t(norm)=0.225445, mflops=22.1784 (err=1.4e-14) 29. NR (F): elapsed time t=1.79465 s, 16 iters, t-(init.)=1.76884 s t(norm)=0.22492, mflops=22.2301 (err=1.4e-14) 30. 
Ooura (C): elapsed time t=1.38041 s, 32 iters, t-(init.)=1.32873 s t(norm)=0.0844786, mflops=59.1866 (err=1.4e-14) 31. Ooura (F): elapsed time t=1.39892 s, 32 iters, t-(init.)=1.34705 s t(norm)=0.0856431, mflops=58.3818 (err=1.4e-14) 32. QFT: elapsed time t=1.06118 s, 8 iters, t-(init.)=1.04654 s t(norm)=0.266149, mflops=18.7865 (err=1.6e-14) 33. Ransom: elapsed time t=1.07469 s, 16 iters, t-(init.)=1.04881 s t(norm)=0.133363, mflops=37.4917 (err=1.5e-14) 34. SCIPORT: elapsed time t=1.45809 s, 8 iters, t-(init.)=1.44406 s t(norm)=0.367244, mflops=13.6149 (err=1.4e-14) 35. Singleton: elapsed time t=1.18492 s, 16 iters, t-(init.)=1.15893 s t(norm)=0.147366, mflops=33.9292 (err=2.1e-14) 36. Singleton (f2c): elapsed time t=1.23559 s, 16 iters, t-(init.)=1.20964 s t(norm)=0.153813, mflops=32.507 (err=2.1e-14) 37. Sorensen: elapsed time t=1.35786 s, 16 iters, t-(init.)=1.33237 s t(norm)=0.16942, mflops=29.5125 (err=1.4e-14) 38. Sorensen DIT: elapsed time t=1.56782 s, 8 iters, t-(init.)=1.5549 s t(norm)=0.395431, mflops=12.6444 (err=1.4e-14) 39. Temperton: elapsed time t=1.4671 s, 16 iters, t-(init.)=1.44114 s t(norm)=0.183251, mflops=27.285 (err=1.5e-07) 40. Temperton (f2c): elapsed time t=1.78357 s, 16 iters, t-(init.)=1.75773 s t(norm)=0.223507, mflops=22.3707 (err=1.4e-14) 41. Valkenburg: elapsed time t=1.39377 s, 2 iters, t-(init.)=1.38938 s t(norm)=1.41336, mflops=3.53768 (err=1.4e-14) 42. SUNPERF: elapsed time t=1.89621 s, 32 iters, t-(init.)=1.84358 s t(norm)=0.117211, mflops=42.658 (err=1.4e-14) Top mflops for N=32768 = 70.8914 Normalized results and averages for N=32768: fft 0: mflops = 23.9319 (norm. = 0.337586), norm. avg. (of 15) = 0.286909 fft 1: mflops = 24.2516 (norm. = 0.342095), norm. avg. (of 15) = 0.292109 fft 2: mflops = 17.355 (norm. = 0.244811), norm. avg. (of 15) = 0.168502 fft 3: mflops = 19.6419 (norm. = 0.277071), norm. avg. (of 15) = 0.10519 fft 4: mflops = 15.5854 (norm. = 0.219849), norm. avg. (of 15) = 0.243318 fft 5: mflops = 5.58555 (norm. 
= 0.0787902), norm. avg. (of 15) = 0.0376764 fft 6: mflops = 41.8251 (norm. = 0.589988), norm. avg. (of 15) = 0.306587 fft 7: mflops = 33.6325 (norm. = 0.474423), norm. avg. (of 15) = 0.247688 fft 8: mflops = 13.7132 (norm. = 0.193439), norm. avg. (of 15) = 0.129999 fft 9: mflops = 70.8914 (norm. = 1), norm. avg. (of 15) = 0.380162 fft 10: mflops = 70.7441 (norm. = 0.997922), norm. avg. (of 15) = 0.433346 fft 11: mflops = 13.7849 (norm. = 0.194451), norm. avg. (of 14) = 0.102929 fft 12: mflops = 36.8173 (norm. = 0.519348), norm. avg. (of 15) = 0.436203 fft 13: mflops = 27.492 (norm. = 0.387805), norm. avg. (of 15) = 0.27007 fft 14: mflops = 56.6624 (norm. = 0.799285), norm. avg. (of 15) = 0.767614 fft 15: mflops = 57.7098 (norm. = 0.81406), norm. avg. (of 15) = 0.766756 fft 16: mflops = 38.5516 (norm. = 0.543813), norm. avg. (of 15) = 0.663281 fft 17: mflops = 45.121 (norm. = 0.63648), norm. avg. (of 13) = 0.461881 fft 18: mflops = 26.8596 (norm. = 0.378884), norm. avg. (of 15) = 0.216753 fft 19: mflops = 20.7049 (norm. = 0.292066), norm. avg. (of 15) = 0.184797 fft 20: mflops = 21.7686 (norm. = 0.30707), norm. avg. (of 15) = 0.208923 fft 21: mflops = -1 (norm. = -0.0141061), norm. avg. (of 12) = 0.361552 fft 22: mflops = 39.0463 (norm. = 0.55079), norm. avg. (of 14) = 0.269378 fft 23: mflops = 43.2162 (norm. = 0.609612), norm. avg. (of 14) = 0.325474 fft 24: mflops = 41.9901 (norm. = 0.592316), norm. avg. (of 14) = 0.324867 fft 25: mflops = 22.6211 (norm. = 0.319095), norm. avg. (of 14) = 0.276097 fft 26: mflops = 12.4864 (norm. = 0.176134), norm. avg. (of 15) = 0.0918792 fft 27: mflops = 16.4856 (norm. = 0.232547), norm. avg. (of 15) = 0.164991 fft 28: mflops = 22.1784 (norm. = 0.31285), norm. avg. (of 15) = 0.206083 fft 29: mflops = 22.2301 (norm. = 0.31358), norm. avg. (of 15) = 0.209889 fft 30: mflops = 59.1866 (norm. = 0.834891), norm. avg. (of 15) = 0.48987 fft 31: mflops = 58.3818 (norm. = 0.823539), norm. avg. 
(of 15) = 0.475681 fft 32: mflops = 18.7865 (norm. = 0.265004), norm. avg. (of 12) = 0.199493 fft 33: mflops = 37.4917 (norm. = 0.528861), norm. avg. (of 14) = 0.187917 fft 34: mflops = 13.6149 (norm. = 0.192053), norm. avg. (of 14) = 0.284744 fft 35: mflops = 33.9292 (norm. = 0.478608), norm. avg. (of 15) = 0.344537 fft 36: mflops = 32.507 (norm. = 0.458546), norm. avg. (of 15) = 0.348316 fft 37: mflops = 29.5125 (norm. = 0.416306), norm. avg. (of 15) = 0.44544 fft 38: mflops = 12.6444 (norm. = 0.178363), norm. avg. (of 15) = 0.12271 fft 39: mflops = 27.285 (norm. = 0.384885), norm. avg. (of 15) = 0.245727 fft 40: mflops = 22.3707 (norm. = 0.315563), norm. avg. (of 15) = 0.172093 fft 41: mflops = 3.53768 (norm. = 0.0499028), norm. avg. (of 15) = 0.0257621 fft 42: mflops = 42.658 (norm. = 0.601737), norm. avg. (of 15) = 0.689436 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.43623 s, 4 iters, t-(init.)=1.41241 s t(norm)=0.336746, mflops=14.848 (err=1.7e-14) 1. Arndt DIT: elapsed time t=1.4702 s, 4 iters, t-(init.)=1.44613 s t(norm)=0.344785, mflops=14.5018 (err=1.7e-14) 2. Arndt Split-Radix: elapsed time t=1.93632 s, 4 iters, t-(init.)=1.9125 s t(norm)=0.455976, mflops=10.9655 (err=1.7e-14) 3. Arndt 4-step: elapsed time t=1.10774 s, 4 iters, t-(init.)=1.08274 s t(norm)=0.258145, mflops=19.369 (err=1.7e-14) 4. Bailey: elapsed time t=1.34151 s, 4 iters, t-(init.)=1.31638 s t(norm)=0.313851, mflops=15.9311 (err=1.7e-14) 5. Beauregard: elapsed time t=1.98428 s, 2 iters, t-(init.)=1.97178 s t(norm)=0.940216, mflops=5.31793 (err=1.7e-14) 6. Bergland: elapsed time t=1.4893 s, 8 iters, t-(init.)=1.44033 s t(norm)=0.171701, mflops=29.1204 (err=1.7e-14) 7. Brenner: elapsed time t=1.94708 s, 8 iters, t-(init.)=1.89828 s t(norm)=0.226293, mflops=22.0952 (err=1.7e-14) 8. Burrus: elapsed time t=1.07173 s, 2 iters, t-(init.)=1.0604 s t(norm)=0.505639, mflops=9.88847 (err=1.7e-14) 9. 
CWP (min N) (N=72072): elapsed time t=1.46776 s, 16 iters, t-(init.)=1.3579 s t(norm)=0.0809371, mflops=61.7764 10. CWP (best N) (N=72072): elapsed time t=1.46705 s, 16 iters, t-(init.)=1.35711 s t(norm)=0.0808898, mflops=61.8125 11. Edelblute: elapsed time t=1.11162 s, 2 iters, t-(init.)=1.10029 s t(norm)=0.52466, mflops=9.52998 (err=1.7e-14) 12. FFTPACK: elapsed time t=1.26068 s, 8 iters, t-(init.)=1.21104 s t(norm)=0.144367, mflops=34.6339 (err=1.7e-14) 13. FFTPACK (f2c): elapsed time t=1.55567 s, 8 iters, t-(init.)=1.50603 s t(norm)=0.179533, mflops=27.8501 (err=1.7e-14) FFTW_MEASURE plan: (cost = 1.022785e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_TWIDDLE 32 FFTW_NOTW 16 14. FFTW: elapsed time t=1.44052 s, 16 iters, t-(init.)=1.33999 s t(norm)=0.0798695, mflops=62.6021 (err=1.7e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.49799 s, 16 iters, t-(init.)=1.39613 s t(norm)=0.0832161, mflops=60.0846 (err=1.7e-14) 16. Frigo-old: elapsed time t=1.24617 s, 8 iters, t-(init.)=1.19607 s t(norm)=0.142582, mflops=35.0675 (err=1.7e-14) 17. Green: elapsed time t=1.35789 s, 8 iters, t-(init.)=1.30871 s t(norm)=0.15601, mflops=32.0492 (err=1.7e-14) 18. GSL: elapsed time t=1.57133 s, 8 iters, t-(init.)=1.52181 s t(norm)=0.181414, mflops=27.5613 (err=1.7e-14) 19. GSL DIT: elapsed time t=1.58361 s, 4 iters, t-(init.)=1.55976 s t(norm)=0.371875, mflops=13.4454 (err=1.7e-14) 20. GSL DIF: elapsed time t=1.54897 s, 4 iters, t-(init.)=1.52397 s t(norm)=0.363343, mflops=13.7611 (err=1.8e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.49327 s, 8 iters, t-(init.)=1.4439 s t(norm)=0.172126, mflops=29.0484 (err=1.7e-14) 23. Mayer (simple): elapsed time t=1.40241 s, 8 iters, t-(init.)=1.35293 s t(norm)=0.161282, mflops=31.0015 24. Mayer (lookup): elapsed time t=1.5068 s, 8 iters, t-(init.)=1.45744 s t(norm)=0.17374, mflops=28.7786 (err=1.7e-14) 25. 
Monro: elapsed time t=1.4048 s, 4 iters, t-(init.)=1.38097 s t(norm)=0.329248, mflops=15.1861 (err=1.6e-07) 26. NAPACK (f2c): elapsed time t=1.73604 s, 4 iters, t-(init.)=1.71081 s t(norm)=0.407888, mflops=12.2583 (err=8.6e-13) 27. Nielsen: elapsed time t=1.60391 s, 4 iters, t-(init.)=1.57863 s t(norm)=0.376375, mflops=13.2846 (err=2.6e-13) 28. NR (C): elapsed time t=1.51868 s, 4 iters, t-(init.)=1.49486 s t(norm)=0.356402, mflops=14.0291 (err=1.7e-14) 29. NR (F): elapsed time t=1.5115 s, 4 iters, t-(init.)=1.48767 s t(norm)=0.354689, mflops=14.0969 (err=1.7e-14) 30. Ooura (C): elapsed time t=1.74281 s, 16 iters, t-(init.)=1.6437 s t(norm)=0.0979724, mflops=51.0348 (err=1.7e-14) 31. Ooura (F): elapsed time t=1.7687 s, 16 iters, t-(init.)=1.66959 s t(norm)=0.0995151, mflops=50.2436 (err=1.7e-14) 32. QFT: elapsed time t=1.25358 s, 4 iters, t-(init.)=1.22814 s t(norm)=0.29281, mflops=17.0759 (err=1.9e-14) 33. Ransom: elapsed time t=1.30954 s, 8 iters, t-(init.)=1.25954 s t(norm)=0.150149, mflops=33.3003 (err=1.7e-14) 34. SCIPORT: elapsed time t=1.92213 s, 4 iters, t-(init.)=1.89711 s t(norm)=0.452305, mflops=11.0545 (err=1.7e-14) 35. Singleton: elapsed time t=1.59522 s, 8 iters, t-(init.)=1.54513 s t(norm)=0.184194, mflops=27.1453 (err=2.3e-14) 36. Singleton (f2c): elapsed time t=1.60647 s, 8 iters, t-(init.)=1.55638 s t(norm)=0.185535, mflops=26.9491 (err=2.3e-14) 37. Sorensen: elapsed time t=1.63606 s, 8 iters, t-(init.)=1.58732 s t(norm)=0.189224, mflops=26.4237 (err=1.7e-14) 38. Sorensen DIT: elapsed time t=1.13937 s, 2 iters, t-(init.)=1.12816 s t(norm)=0.537948, mflops=9.29457 (err=1.7e-14) 39. Temperton: elapsed time t=1.88754 s, 8 iters, t-(init.)=1.83851 s t(norm)=0.219167, mflops=22.8137 (err=1.7e-07) 40. Temperton (f2c): elapsed time t=1.12925 s, 4 iters, t-(init.)=1.10521 s t(norm)=0.263502, mflops=18.9752 (err=1.7e-14) 41. Valkenburg: elapsed time t=1.5843 s, 1 iters, t-(init.)=1.57802 s t(norm)=1.50492, mflops=3.32244 (err=1.7e-14) 42. 
SUNPERF: elapsed time t=1.03979 s, 8 iters, t-(init.)=0.990151 s t(norm)=0.118035, mflops=42.3602 (err=1.7e-14) Top mflops for N=65536 = 62.6021 Normalized results and averages for N=65536: fft 0: mflops = 14.848 (norm. = 0.237181), norm. avg. (of 16) = 0.283801 fft 1: mflops = 14.5018 (norm. = 0.23165), norm. avg. (of 16) = 0.28833 fft 2: mflops = 10.9655 (norm. = 0.175162), norm. avg. (of 16) = 0.168918 fft 3: mflops = 19.369 (norm. = 0.309398), norm. avg. (of 16) = 0.117953 fft 4: mflops = 15.9311 (norm. = 0.254483), norm. avg. (of 16) = 0.244016 fft 5: mflops = 5.31793 (norm. = 0.0849481), norm. avg. (of 16) = 0.0406309 fft 6: mflops = 29.1204 (norm. = 0.465166), norm. avg. (of 16) = 0.316498 fft 7: mflops = 22.0952 (norm. = 0.352947), norm. avg. (of 16) = 0.254267 fft 8: mflops = 9.88847 (norm. = 0.157958), norm. avg. (of 16) = 0.131747 fft 9: mflops = 61.7764 (norm. = 0.98681), norm. avg. (of 16) = 0.418078 fft 10: mflops = 61.8125 (norm. = 0.987386), norm. avg. (of 16) = 0.467974 fft 11: mflops = 9.52998 (norm. = 0.152231), norm. avg. (of 15) = 0.106216 fft 12: mflops = 34.6339 (norm. = 0.553239), norm. avg. (of 16) = 0.443518 fft 13: mflops = 27.8501 (norm. = 0.444875), norm. avg. (of 16) = 0.280995 fft 14: mflops = 62.6021 (norm. = 1), norm. avg. (of 16) = 0.782138 fft 15: mflops = 60.0846 (norm. = 0.959785), norm. avg. (of 16) = 0.77882 fft 16: mflops = 35.0675 (norm. = 0.560165), norm. avg. (of 16) = 0.656836 fft 17: mflops = 32.0492 (norm. = 0.511951), norm. avg. (of 14) = 0.465458 fft 18: mflops = 27.5613 (norm. = 0.440262), norm. avg. (of 16) = 0.230722 fft 19: mflops = 13.4454 (norm. = 0.214775), norm. avg. (of 16) = 0.186671 fft 20: mflops = 13.7611 (norm. = 0.219819), norm. avg. (of 16) = 0.209604 fft 21: mflops = -1 (norm. = -0.0159739), norm. avg. (of 12) = 0.361552 fft 22: mflops = 29.0484 (norm. = 0.464017), norm. avg. (of 15) = 0.282354 fft 23: mflops = 31.0015 (norm. = 0.495216), norm. avg. (of 15) = 0.33679 fft 24: mflops = 28.7786 (norm. 
= 0.459706), norm. avg. (of 15) = 0.333856 fft 25: mflops = 15.1861 (norm. = 0.242582), norm. avg. (of 15) = 0.273862 fft 26: mflops = 12.2583 (norm. = 0.195812), norm. avg. (of 16) = 0.098375 fft 27: mflops = 13.2846 (norm. = 0.212207), norm. avg. (of 16) = 0.167942 fft 28: mflops = 14.0291 (norm. = 0.2241), norm. avg. (of 16) = 0.207209 fft 29: mflops = 14.0969 (norm. = 0.225182), norm. avg. (of 16) = 0.210845 fft 30: mflops = 51.0348 (norm. = 0.815225), norm. avg. (of 16) = 0.510205 fft 31: mflops = 50.2436 (norm. = 0.802587), norm. avg. (of 16) = 0.496112 fft 32: mflops = 17.0759 (norm. = 0.272769), norm. avg. (of 13) = 0.20513 fft 33: mflops = 33.3003 (norm. = 0.531937), norm. avg. (of 15) = 0.210852 fft 34: mflops = 11.0545 (norm. = 0.176583), norm. avg. (of 15) = 0.277533 fft 35: mflops = 27.1453 (norm. = 0.433617), norm. avg. (of 16) = 0.350104 fft 36: mflops = 26.9491 (norm. = 0.430482), norm. avg. (of 16) = 0.353451 fft 37: mflops = 26.4237 (norm. = 0.42209), norm. avg. (of 16) = 0.443981 fft 38: mflops = 9.29457 (norm. = 0.148471), norm. avg. (of 16) = 0.12432 fft 39: mflops = 22.8137 (norm. = 0.364423), norm. avg. (of 16) = 0.253146 fft 40: mflops = 18.9752 (norm. = 0.303108), norm. avg. (of 16) = 0.180281 fft 41: mflops = 3.32244 (norm. = 0.0530723), norm. avg. (of 16) = 0.027469 fft 42: mflops = 42.3602 (norm. = 0.676658), norm. avg. (of 16) = 0.688638 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.70268 s, 2 iters, t-(init.)=1.67622 s t(norm)=0.376134, mflops=13.2931 (err=3.3e-14) 1. Arndt DIT: elapsed time t=1.70881 s, 2 iters, t-(init.)=1.68286 s t(norm)=0.377623, mflops=13.2407 (err=3.3e-14) 2. Arndt Split-Radix: elapsed time t=1.18421 s, 1 iters, t-(init.)=1.17113 s t(norm)=0.525589, mflops=9.51313 (err=3.3e-14) 3. Arndt 4-step: elapsed time t=1.44634 s, 2 iters, t-(init.)=1.41958 s t(norm)=0.318544, mflops=15.6964 (err=3.3e-14) 4. 
Bailey: elapsed time t=1.28996 s, 2 iters, t-(init.)=1.26345 s t(norm)=0.28351, mflops=17.6361 (err=3.3e-14) 5. Beauregard: elapsed time t=2.14462 s, 1 iters, t-(init.)=2.13123 s t(norm)=0.956471, mflops=5.22755 (err=3.3e-14) 6. Bergland: elapsed time t=1.70079 s, 4 iters, t-(init.)=1.64797 s t(norm)=0.184898, mflops=27.042 (err=3.4e-14) 7. Brenner: elapsed time t=1.18071 s, 2 iters, t-(init.)=1.15475 s t(norm)=0.259119, mflops=19.2962 (err=3.3e-14) 8. Burrus: elapsed time t=1.2902 s, 1 iters, t-(init.)=1.27712 s t(norm)=0.573157, mflops=8.72362 (err=3.3e-14) 9. CWP (min N) (N=144144): elapsed time t=1.64295 s, 8 iters, t-(init.)=1.52565 s t(norm)=0.0855869, mflops=58.4202 10. CWP (best N) (N=144144): elapsed time t=1.64293 s, 8 iters, t-(init.)=1.52565 s t(norm)=0.0855868, mflops=58.4202 11. Edelblute: elapsed time t=1.3433 s, 1 iters, t-(init.)=1.32995 s t(norm)=0.596865, mflops=8.3771 (err=3.3e-14) 12. FFTPACK: elapsed time t=1.39945 s, 4 iters, t-(init.)=1.34581 s t(norm)=0.150995, mflops=33.1136 (err=3.3e-14) 13. FFTPACK (f2c): elapsed time t=1.7607 s, 4 iters, t-(init.)=1.70704 s t(norm)=0.191525, mflops=26.1063 (err=3.3e-14) FFTW_MEASURE plan: (cost = 2.170662e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.94079 s, 8 iters, t-(init.)=1.83371 s t(norm)=0.102868, mflops=48.6058 (err=3.3e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.73819 s, 8 iters, t-(init.)=1.63104 s t(norm)=0.091499, mflops=54.6454 (err=3.3e-14) 16. Frigo-old: elapsed time t=1.52041 s, 4 iters, t-(init.)=1.467 s t(norm)=0.164593, mflops=30.3779 (err=3.3e-14) 17. Green: elapsed time t=1.64275 s, 4 iters, t-(init.)=1.58974 s t(norm)=0.178364, mflops=28.0326 (err=3.3e-14) 18. GSL: elapsed time t=1.7754 s, 4 iters, t-(init.)=1.72204 s t(norm)=0.193207, mflops=25.8789 (err=3.3e-14) 19. 
GSL DIT: elapsed time t=1.87038 s, 2 iters, t-(init.)=1.84391 s t(norm)=0.413762, mflops=12.0842 (err=3.5e-14) 20. GSL DIF: elapsed time t=1.82801 s, 2 iters, t-(init.)=1.80124 s t(norm)=0.404187, mflops=12.3705 (err=3.5e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.27849 s, 2 iters, t-(init.)=1.25217 s t(norm)=0.28098, mflops=17.7948 (err=3.3e-14) 23. Mayer (simple): elapsed time t=1.24553 s, 2 iters, t-(init.)=1.21921 s t(norm)=0.273584, mflops=18.2759 24. Mayer (lookup): elapsed time t=1.30613 s, 2 iters, t-(init.)=1.27982 s t(norm)=0.287183, mflops=17.4105 (err=3.3e-14) 25. Monro: elapsed time t=1.70896 s, 2 iters, t-(init.)=1.6825 s t(norm)=0.377543, mflops=13.2435 (err=1.7e-07) 26. NAPACK (f2c): elapsed time t=1.91957 s, 2 iters, t-(init.)=1.89279 s t(norm)=0.424731, mflops=11.7721 (err=2.0e-12) 27. Nielsen: elapsed time t=1.82691 s, 2 iters, t-(init.)=1.80013 s t(norm)=0.403939, mflops=12.3781 (err=9.2e-13) 28. NR (C): elapsed time t=1.80265 s, 2 iters, t-(init.)=1.77619 s t(norm)=0.398566, mflops=12.545 (err=3.4e-14) 29. NR (F): elapsed time t=1.7951 s, 2 iters, t-(init.)=1.76853 s t(norm)=0.396848, mflops=12.5993 (err=3.4e-14) 30. Ooura (C): elapsed time t=1.01663 s, 4 iters, t-(init.)=0.963401 s t(norm)=0.108091, mflops=46.2575 (err=3.4e-14) 31. Ooura (F): elapsed time t=1.03087 s, 4 iters, t-(init.)=0.977633 s t(norm)=0.109687, mflops=45.5841 (err=3.4e-14) 32. QFT: elapsed time t=1.81475 s, 2 iters, t-(init.)=1.78798 s t(norm)=0.401212, mflops=12.4622 (err=3.6e-14) 33. Ransom: elapsed time t=1.6756 s, 4 iters, t-(init.)=1.62227 s t(norm)=0.182014, mflops=27.4704 (err=3.3e-14) 34. SCIPORT: elapsed time t=1.00907 s, 1 iters, t-(init.)=0.995724 s t(norm)=0.446869, mflops=11.189 (err=3.3e-14) 35. Singleton: elapsed time t=1.03973 s, 2 iters, t-(init.)=1.01286 s t(norm)=0.22728, mflops=21.9993 (err=4.8e-14) 36. 
Singleton (f2c): elapsed time t=1.04157 s, 2 iters, t-(init.)=1.01481 s t(norm)=0.227717, mflops=21.9571 (err=4.8e-14) 37. Sorensen: elapsed time t=1.04549 s, 2 iters, t-(init.)=1.01946 s t(norm)=0.228761, mflops=21.8569 (err=3.3e-14) 38. Sorensen DIT: elapsed time t=1.37301 s, 1 iters, t-(init.)=1.36044 s t(norm)=0.610551, mflops=8.18933 (err=3.3e-14) 39. Temperton: elapsed time t=1.1909 s, 2 iters, t-(init.)=1.16484 s t(norm)=0.261383, mflops=19.129 (err=1.9e-07) 40. Temperton (f2c): elapsed time t=1.40213 s, 2 iters, t-(init.)=1.37524 s t(norm)=0.308596, mflops=16.2024 (err=3.3e-14) 41. Valkenburg: elapsed time t=3.47896 s, 1 iters, t-(init.)=3.4657 s t(norm)=1.55536, mflops=3.21468 (err=3.4e-14) 42. SUNPERF: elapsed time t=1.18803 s, 4 iters, t-(init.)=1.13449 s t(norm)=0.127286, mflops=39.2815 (err=3.3e-14) Top mflops for N=131072 = 58.4202 Normalized results and averages for N=131072: fft 0: mflops = 13.2931 (norm. = 0.227543), norm. avg. (of 17) = 0.280492 fft 1: mflops = 13.2407 (norm. = 0.226646), norm. avg. (of 17) = 0.284702 fft 2: mflops = 9.51313 (norm. = 0.16284), norm. avg. (of 17) = 0.168561 fft 3: mflops = 15.6964 (norm. = 0.268681), norm. avg. (of 17) = 0.126819 fft 4: mflops = 17.6361 (norm. = 0.301883), norm. avg. (of 17) = 0.24742 fft 5: mflops = 5.22755 (norm. = 0.0894818), norm. avg. (of 17) = 0.0435045 fft 6: mflops = 27.042 (norm. = 0.462887), norm. avg. (of 17) = 0.325109 fft 7: mflops = 19.2962 (norm. = 0.330299), norm. avg. (of 17) = 0.258739 fft 8: mflops = 8.72362 (norm. = 0.149325), norm. avg. (of 17) = 0.132781 fft 9: mflops = 58.4202 (norm. = 0.999999), norm. avg. (of 17) = 0.452308 fft 10: mflops = 58.4202 (norm. = 1), norm. avg. (of 17) = 0.499269 fft 11: mflops = 8.3771 (norm. = 0.143394), norm. avg. (of 16) = 0.10854 fft 12: mflops = 33.1136 (norm. = 0.566817), norm. avg. (of 17) = 0.450771 fft 13: mflops = 26.1063 (norm. = 0.446871), norm. avg. (of 17) = 0.290753 fft 14: mflops = 48.6058 (norm. = 0.832003), norm. avg. 
(of 17) = 0.785072 fft 15: mflops = 54.6454 (norm. = 0.935385), norm. avg. (of 17) = 0.78803 fft 16: mflops = 30.3779 (norm. = 0.51999), norm. avg. (of 17) = 0.648786 fft 17: mflops = 28.0326 (norm. = 0.479844), norm. avg. (of 15) = 0.466417 fft 18: mflops = 25.8789 (norm. = 0.442979), norm. avg. (of 17) = 0.243208 fft 19: mflops = 12.0842 (norm. = 0.20685), norm. avg. (of 17) = 0.187858 fft 20: mflops = 12.3705 (norm. = 0.21175), norm. avg. (of 17) = 0.20973 fft 21: mflops = -1 (norm. = -0.0171174), norm. avg. (of 12) = 0.361552 fft 22: mflops = 17.7948 (norm. = 0.304601), norm. avg. (of 16) = 0.283745 fft 23: mflops = 18.2759 (norm. = 0.312836), norm. avg. (of 16) = 0.335293 fft 24: mflops = 17.4105 (norm. = 0.298022), norm. avg. (of 16) = 0.331617 fft 25: mflops = 13.2435 (norm. = 0.226694), norm. avg. (of 16) = 0.270914 fft 26: mflops = 11.7721 (norm. = 0.201508), norm. avg. (of 17) = 0.104442 fft 27: mflops = 12.3781 (norm. = 0.21188), norm. avg. (of 17) = 0.170527 fft 28: mflops = 12.545 (norm. = 0.214737), norm. avg. (of 17) = 0.207652 fft 29: mflops = 12.5993 (norm. = 0.215666), norm. avg. (of 17) = 0.211128 fft 30: mflops = 46.2575 (norm. = 0.791806), norm. avg. (of 17) = 0.52677 fft 31: mflops = 45.5841 (norm. = 0.780279), norm. avg. (of 17) = 0.512828 fft 32: mflops = 12.4622 (norm. = 0.213321), norm. avg. (of 14) = 0.205715 fft 33: mflops = 27.4704 (norm. = 0.47022), norm. avg. (of 16) = 0.227062 fft 34: mflops = 11.189 (norm. = 0.191526), norm. avg. (of 16) = 0.272158 fft 35: mflops = 21.9993 (norm. = 0.37657), norm. avg. (of 17) = 0.351661 fft 36: mflops = 21.9571 (norm. = 0.375847), norm. avg. (of 17) = 0.354769 fft 37: mflops = 21.8569 (norm. = 0.374132), norm. avg. (of 17) = 0.439872 fft 38: mflops = 8.18933 (norm. = 0.14018), norm. avg. (of 17) = 0.125253 fft 39: mflops = 19.129 (norm. = 0.327439), norm. avg. (of 17) = 0.257516 fft 40: mflops = 16.2024 (norm. = 0.277343), norm. avg. (of 17) = 0.185991 fft 41: mflops = 3.21468 (norm. 
= 0.0550269), norm. avg. (of 17) = 0.0290901 fft 42: mflops = 39.2815 (norm. = 0.672395), norm. avg. (of 17) = 0.687682 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=1.83108 s, 1 iters, t-(init.)=1.80431 s t(norm)=0.382384, mflops=13.0759 (err=4.3e-14) 1. Arndt DIT: elapsed time t=1.83182 s, 1 iters, t-(init.)=1.80574 s t(norm)=0.382686, mflops=13.0655 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=2.53282 s, 1 iters, t-(init.)=2.50605 s t(norm)=0.531102, mflops=9.41438 (err=4.3e-14) 3. Arndt 4-step: elapsed time t=1.30097 s, 1 iters, t-(init.)=1.2741 s t(norm)=0.270017, mflops=18.5174 (err=4.3e-14) 4. Bailey: elapsed time t=1.29442 s, 1 iters, t-(init.)=1.26755 s t(norm)=0.26863, mflops=18.613 (err=4.3e-14) 5. Beauregard: elapsed time t=4.56514 s, 1 iters, t-(init.)=4.53838 s t(norm)=0.961808, mflops=5.19854 (err=4.4e-14) 6. Bergland: elapsed time t=1.76118 s, 2 iters, t-(init.)=1.70776 s t(norm)=0.180961, mflops=27.6303 (err=4.4e-14) 7. Brenner: elapsed time t=1.24942 s, 1 iters, t-(init.)=1.22335 s t(norm)=0.259261, mflops=19.2856 (err=4.4e-14) 8. Burrus: elapsed time t=2.72702 s, 1 iters, t-(init.)=2.70025 s t(norm)=0.572257, mflops=8.73734 (err=4.3e-14) 9. CWP (min N) (N=360360): elapsed time t=1.08104 s, 2 iters, t-(init.)=1.00747 s t(norm)=0.106755, mflops=46.8361 10. CWP (best N) (N=360360): elapsed time t=1.08093 s, 2 iters, t-(init.)=1.00736 s t(norm)=0.106743, mflops=46.8413 11. Edelblute: elapsed time t=2.83179 s, 1 iters, t-(init.)=2.80502 s t(norm)=0.594461, mflops=8.41098 (err=4.3e-14) 12. FFTPACK: elapsed time t=1.45708 s, 2 iters, t-(init.)=1.40344 s t(norm)=0.148714, mflops=33.6217 (err=4.4e-14) 13. FFTPACK (f2c): elapsed time t=1.81727 s, 2 iters, t-(init.)=1.76375 s t(norm)=0.186894, mflops=26.7532 (err=4.4e-14) FFTW_MEASURE plan: (cost = 4.368217e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. 
FFTW: elapsed time t=1.01662 s, 2 iters, t-(init.)=0.963063 s t(norm)=0.10205, mflops=48.9957 (err=4.4e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.03521 s, 2 iters, t-(init.)=0.981566 s t(norm)=0.10401, mflops=48.0721 (err=4.4e-14) 16. Frigo-old: elapsed time t=1.84737 s, 2 iters, t-(init.)=1.79388 s t(norm)=0.190087, mflops=26.3038 (err=4.4e-14) 17. Green: elapsed time t=1.67682 s, 2 iters, t-(init.)=1.62387 s t(norm)=0.172071, mflops=29.0577 (err=4.4e-14) 18. GSL: elapsed time t=1.76473 s, 2 iters, t-(init.)=1.71133 s t(norm)=0.181339, mflops=27.5726 (err=4.4e-14) 19. GSL DIT: elapsed time t=2.00407 s, 1 iters, t-(init.)=1.9773 s t(norm)=0.419045, mflops=11.9319 (err=4.6e-14) 20. GSL DIF: elapsed time t=1.95225 s, 1 iters, t-(init.)=1.92549 s t(norm)=0.408064, mflops=12.253 (err=4.6e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.57031 s, 1 iters, t-(init.)=1.54385 s t(norm)=0.327184, mflops=15.2819 (err=4.3e-14) 23. Mayer (simple): elapsed time t=1.5452 s, 1 iters, t-(init.)=1.51873 s t(norm)=0.321861, mflops=15.5347 24. Mayer (lookup): elapsed time t=1.59545 s, 1 iters, t-(init.)=1.56898 s t(norm)=0.332511, mflops=15.0371 (err=4.3e-14) 25. Monro: elapsed time t=1.76437 s, 1 iters, t-(init.)=1.73761 s t(norm)=0.368247, mflops=13.5778 (err=1.8e-07) 26. NAPACK (f2c): elapsed time t=1.97268 s, 1 iters, t-(init.)=1.94592 s t(norm)=0.412393, mflops=12.1243 (err=3.7e-12) 27. Nielsen: elapsed time t=1.93061 s, 1 iters, t-(init.)=1.90372 s t(norm)=0.403451, mflops=12.3931 (err=2.1e-12) 28. NR (C): elapsed time t=1.93289 s, 1 iters, t-(init.)=1.90602 s t(norm)=0.403938, mflops=12.3781 (err=4.3e-14) 29. NR (F): elapsed time t=1.92208 s, 1 iters, t-(init.)=1.89531 s t(norm)=0.401668, mflops=12.4481 (err=4.3e-14) 30. 
Ooura (C): elapsed time t=1.01865 s, 2 iters, t-(init.)=0.965309 s t(norm)=0.102288, mflops=48.8817 (err=4.4e-14) 31. Ooura (F): elapsed time t=1.03036 s, 2 iters, t-(init.)=0.97713 s t(norm)=0.10354, mflops=48.2903 (err=4.4e-14) 32. QFT: elapsed time t=2.21824 s, 1 iters, t-(init.)=2.19144 s t(norm)=0.464427, mflops=10.7659 (err=4.8e-14) 33. Ransom: elapsed time t=1.41064 s, 2 iters, t-(init.)=1.35712 s t(norm)=0.143805, mflops=34.7692 (err=4.3e-14) 34. SCIPORT: elapsed time t=2.34907 s, 1 iters, t-(init.)=2.32219 s t(norm)=0.492136, mflops=10.1598 (err=4.4e-14) 35. Singleton: elapsed time t=1.0334 s, 1 iters, t-(init.)=1.00662 s t(norm)=0.213331, mflops=23.4377 (err=6.0e-14) 36. Singleton (f2c): elapsed time t=1.03238 s, 1 iters, t-(init.)=1.00561 s t(norm)=0.213117, mflops=23.4613 (err=6.0e-14) 37. Sorensen: elapsed time t=1.14456 s, 1 iters, t-(init.)=1.11804 s t(norm)=0.236943, mflops=21.1021 (err=4.3e-14) 38. Sorensen DIT: elapsed time t=2.87123 s, 1 iters, t-(init.)=2.84513 s t(norm)=0.602961, mflops=8.2924 (err=4.3e-14) 39. Temperton: elapsed time t=1.1804 s, 1 iters, t-(init.)=1.15394 s t(norm)=0.244552, mflops=20.4455 (err=2.0e-07) 40. Temperton (f2c): elapsed time t=1.38646 s, 1 iters, t-(init.)=1.36 s t(norm)=0.288221, mflops=17.3478 (err=4.4e-14) 41. Valkenburg: elapsed time t=7.50495 s, 1 iters, t-(init.)=7.47819 s t(norm)=1.58483, mflops=3.1549 (err=4.4e-14) 42. SUNPERF: elapsed time t=1.20326 s, 2 iters, t-(init.)=1.14913 s t(norm)=0.121766, mflops=41.0624 (err=4.4e-14) Top mflops for N=262144 = 48.9957 Normalized results and averages for N=262144: fft 0: mflops = 13.0759 (norm. = 0.266878), norm. avg. (of 18) = 0.279736 fft 1: mflops = 13.0655 (norm. = 0.266667), norm. avg. (of 18) = 0.2837 fft 2: mflops = 9.41438 (norm. = 0.192147), norm. avg. (of 18) = 0.169871 fft 3: mflops = 18.5174 (norm. = 0.377939), norm. avg. (of 18) = 0.14077 fft 4: mflops = 18.613 (norm. = 0.37989), norm. avg. (of 18) = 0.254779 fft 5: mflops = 5.19854 (norm. 
= 0.106102), norm. avg. (of 18) = 0.0469821 fft 6: mflops = 27.6303 (norm. = 0.563933), norm. avg. (of 18) = 0.338377 fft 7: mflops = 19.2856 (norm. = 0.393618), norm. avg. (of 18) = 0.266232 fft 8: mflops = 8.73734 (norm. = 0.178329), norm. avg. (of 18) = 0.135311 fft 9: mflops = 46.8361 (norm. = 0.955922), norm. avg. (of 18) = 0.480287 fft 10: mflops = 46.8413 (norm. = 0.956029), norm. avg. (of 18) = 0.524645 fft 11: mflops = 8.41098 (norm. = 0.171668), norm. avg. (of 17) = 0.112253 fft 12: mflops = 33.6217 (norm. = 0.686217), norm. avg. (of 18) = 0.463851 fft 13: mflops = 26.7532 (norm. = 0.546031), norm. avg. (of 18) = 0.304935 fft 14: mflops = 48.9957 (norm. = 1), norm. avg. (of 18) = 0.797012 fft 15: mflops = 48.0721 (norm. = 0.98115), norm. avg. (of 18) = 0.798759 fft 16: mflops = 26.3038 (norm. = 0.536859), norm. avg. (of 18) = 0.642568 fft 17: mflops = 29.0577 (norm. = 0.593066), norm. avg. (of 16) = 0.474332 fft 18: mflops = 27.5726 (norm. = 0.562756), norm. avg. (of 18) = 0.26096 fft 19: mflops = 11.9319 (norm. = 0.24353), norm. avg. (of 18) = 0.190951 fft 20: mflops = 12.253 (norm. = 0.250083), norm. avg. (of 18) = 0.211972 fft 21: mflops = -1 (norm. = -0.02041), norm. avg. (of 12) = 0.361552 fft 22: mflops = 15.2819 (norm. = 0.311903), norm. avg. (of 17) = 0.285401 fft 23: mflops = 15.5347 (norm. = 0.317062), norm. avg. (of 17) = 0.334221 fft 24: mflops = 15.0371 (norm. = 0.306907), norm. avg. (of 17) = 0.330163 fft 25: mflops = 13.5778 (norm. = 0.277123), norm. avg. (of 17) = 0.27128 fft 26: mflops = 12.1243 (norm. = 0.247457), norm. avg. (of 18) = 0.112387 fft 27: mflops = 12.3931 (norm. = 0.252942), norm. avg. (of 18) = 0.175105 fft 28: mflops = 12.3781 (norm. = 0.252637), norm. avg. (of 18) = 0.210151 fft 29: mflops = 12.4481 (norm. = 0.254065), norm. avg. (of 18) = 0.213514 fft 30: mflops = 48.8817 (norm. = 0.997673), norm. avg. (of 18) = 0.552931 fft 31: mflops = 48.2903 (norm. = 0.985603), norm. avg. 
(of 18) = 0.539093 fft 32: mflops = 10.7659 (norm. = 0.219733), norm. avg. (of 15) = 0.206649 fft 33: mflops = 34.7692 (norm. = 0.709638), norm. avg. (of 17) = 0.255449 fft 34: mflops = 10.1598 (norm. = 0.207361), norm. avg. (of 17) = 0.268346 fft 35: mflops = 23.4377 (norm. = 0.478363), norm. avg. (of 18) = 0.3587 fft 36: mflops = 23.4613 (norm. = 0.478844), norm. avg. (of 18) = 0.361662 fft 37: mflops = 21.1021 (norm. = 0.430694), norm. avg. (of 18) = 0.439362 fft 38: mflops = 8.2924 (norm. = 0.169248), norm. avg. (of 18) = 0.127697 fft 39: mflops = 20.4455 (norm. = 0.417293), norm. avg. (of 18) = 0.266392 fft 40: mflops = 17.3478 (norm. = 0.354068), norm. avg. (of 18) = 0.195328 fft 41: mflops = 3.1549 (norm. = 0.0643915), norm. avg. (of 18) = 0.0310513 fft 42: mflops = 41.0624 (norm. = 0.838082), norm. avg. (of 18) = 0.696038 Benchmarking for array size = 524288 (power of 2): 0. Arndt DIF: elapsed time t=3.91923 s, 1 iters, t-(init.)=3.86551 s t(norm)=0.388046, mflops=12.8851 (err=1.1e-13) 1. Arndt DIT: elapsed time t=3.88478 s, 1 iters, t-(init.)=3.83126 s t(norm)=0.384607, mflops=13.0003 (err=1.1e-13) 2. Arndt Split-Radix: elapsed time t=5.44053 s, 1 iters, t-(init.)=5.387 s t(norm)=0.540784, mflops=9.24584 (err=1.1e-13) 3. Arndt 4-step: elapsed time t=3.48606 s, 1 iters, t-(init.)=3.43252 s t(norm)=0.34458, mflops=14.5104 (err=1.1e-13) 4. Bailey: elapsed time t=2.94484 s, 1 iters, t-(init.)=2.89131 s t(norm)=0.290249, mflops=17.2266 (err=1.1e-13) 5. Beauregard: elapsed time t=9.6629 s, 1 iters, t-(init.)=9.60938 s t(norm)=0.964654, mflops=5.1832 (err=1.1e-13) 6. Bergland: elapsed time t=1.93192 s, 1 iters, t-(init.)=1.87841 s t(norm)=0.188568, mflops=26.5156 (err=1.1e-13) 7. Brenner: elapsed time t=2.74161 s, 1 iters, t-(init.)=2.68809 s t(norm)=0.269849, mflops=18.5289 (err=1.1e-13) 8. Burrus: elapsed time t=5.79873 s, 1 iters, t-(init.)=5.74511 s t(norm)=0.576733, mflops=8.66952 (err=1.1e-13) 9. 
CWP (min N) (N=720720): elapsed time t=1.15051 s, 1 iters, t-(init.)=1.07693 s t(norm)=0.108109, mflops=46.2494 10. CWP (best N) (N=720720): elapsed time t=1.15051 s, 1 iters, t-(init.)=1.07692 s t(norm)=0.108109, mflops=46.2497 11. Edelblute: elapsed time t=6.02441 s, 1 iters, t-(init.)=5.97078 s t(norm)=0.599387, mflops=8.34185 (err=1.1e-13) 12. FFTPACK: elapsed time t=1.514 s, 1 iters, t-(init.)=1.46048 s t(norm)=0.146613, mflops=34.1035 (err=1.1e-13) 13. FFTPACK (f2c): elapsed time t=1.8764 s, 1 iters, t-(init.)=1.82119 s t(norm)=0.182823, mflops=27.3489 (err=1.1e-13) FFTW_MEASURE plan: (cost = 9.121350e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.08698 s, 1 iters, t-(init.)=1.03344 s t(norm)=0.103744, mflops=48.1957 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 5.976883e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.11109 s, 1 iters, t-(init.)=1.05756 s t(norm)=0.106165, mflops=47.0964 (err=1.1e-13) 16. Frigo-old: elapsed time t=2.07531 s, 1 iters, t-(init.)=2.02174 s t(norm)=0.202956, mflops=24.6358 (err=1.1e-13) 17. Green: elapsed time t=1.87288 s, 1 iters, t-(init.)=1.81935 s t(norm)=0.182639, mflops=27.3764 (err=1.1e-13) 18. GSL: elapsed time t=1.95839 s, 1 iters, t-(init.)=1.90477 s t(norm)=0.191213, mflops=26.1488 (err=1.1e-13) 19. GSL DIT: elapsed time t=4.36805 s, 1 iters, t-(init.)=4.31452 s t(norm)=0.43312, mflops=11.5441 (err=1.1e-13) 20. GSL DIF: elapsed time t=4.23274 s, 1 iters, t-(init.)=4.17921 s t(norm)=0.419537, mflops=11.9179 (err=1.1e-13) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=3.35765 s, 1 iters, t-(init.)=3.30402 s t(norm)=0.33168, mflops=15.0748 (err=1.1e-13) 23. Mayer (simple): elapsed time t=3.3018 s, 1 iters, t-(init.)=3.24828 s t(norm)=0.326085, mflops=15.3334 24. 
Mayer (lookup): elapsed time t=3.40634 s, 1 iters, t-(init.)=3.35283 s t(norm)=0.336579, mflops=14.8553 (err=1.1e-13) 25. Monro: elapsed time t=3.86634 s, 1 iters, t-(init.)=3.8128 s t(norm)=0.382755, mflops=13.0632 (err=1.9e-07) 26. NAPACK (f2c): elapsed time t=4.27495 s, 1 iters, t-(init.)=4.22142 s t(norm)=0.423775, mflops=11.7987 (err=7.9e-12) 27. Nielsen: elapsed time t=4.29483 s, 1 iters, t-(init.)=4.2412 s t(norm)=0.425761, mflops=11.7437 (err=4.4e-12) 28. NR (C): elapsed time t=4.20787 s, 1 iters, t-(init.)=4.15434 s t(norm)=0.417041, mflops=11.9892 (err=1.1e-13) 29. NR (F): elapsed time t=4.18729 s, 1 iters, t-(init.)=4.13376 s t(norm)=0.414974, mflops=12.0489 (err=1.1e-13) 30. Ooura (C): elapsed time t=1.11706 s, 1 iters, t-(init.)=1.06345 s t(norm)=0.106756, mflops=46.8357 (err=1.1e-13) 31. Ooura (F): elapsed time t=1.13527 s, 1 iters, t-(init.)=1.08175 s t(norm)=0.108593, mflops=46.0433 (err=1.1e-13) 32. QFT: elapsed time t=5.24073 s, 1 iters, t-(init.)=5.1872 s t(norm)=0.520727, mflops=9.60197 (err=1.2e-13) 33. Ransom: elapsed time t=1.74519 s, 1 iters, t-(init.)=1.69157 s t(norm)=0.169812, mflops=29.4444 (err=1.1e-13) 34. SCIPORT: elapsed time t=4.78054 s, 1 iters, t-(init.)=4.727 s t(norm)=0.474529, mflops=10.5368 (err=1.1e-13) 35. Singleton: elapsed time t=2.72757 s, 1 iters, t-(init.)=2.67404 s t(norm)=0.268438, mflops=18.6263 (err=1.6e-13) 36. Singleton (f2c): elapsed time t=2.75195 s, 1 iters, t-(init.)=2.69842 s t(norm)=0.270885, mflops=18.458 (err=1.6e-13) 37. Sorensen: elapsed time t=2.4347 s, 1 iters, t-(init.)=2.38123 s t(norm)=0.239044, mflops=20.9166 (err=1.1e-13) 38. Sorensen DIT: elapsed time t=6.08586 s, 1 iters, t-(init.)=6.03235 s t(norm)=0.605568, mflops=8.25671 (err=1.1e-13) 39. Temperton: elapsed time t=2.66222 s, 1 iters, t-(init.)=2.60869 s t(norm)=0.261878, mflops=19.0929 (err=2.1e-07) 40. Temperton (f2c): elapsed time t=3.09039 s, 1 iters, t-(init.)=3.03688 s t(norm)=0.304862, mflops=16.4009 (err=1.1e-13) 41. 
Valkenburg: elapsed time t=16.0919 s, 1 iters, t-(init.)=16.0383 s t(norm)=1.61004, mflops=3.10552 (err=1.1e-13) 42. SUNPERF: elapsed time t=1.25058 s, 1 iters, t-(init.)=1.19705 s t(norm)=0.120168, mflops=41.6086 (err=1.1e-13) Top mflops for N=524288 = 48.1957 Normalized results and averages for N=524288: fft 0: mflops = 12.8851 (norm. = 0.267349), norm. avg. (of 19) = 0.279084 fft 1: mflops = 13.0003 (norm. = 0.269739), norm. avg. (of 19) = 0.282965 fft 2: mflops = 9.24584 (norm. = 0.19184), norm. avg. (of 19) = 0.171027 fft 3: mflops = 14.5104 (norm. = 0.301073), norm. avg. (of 19) = 0.149207 fft 4: mflops = 17.2266 (norm. = 0.35743), norm. avg. (of 19) = 0.260182 fft 5: mflops = 5.1832 (norm. = 0.107545), norm. avg. (of 19) = 0.0501696 fft 6: mflops = 26.5156 (norm. = 0.550167), norm. avg. (of 19) = 0.349524 fft 7: mflops = 18.5289 (norm. = 0.384452), norm. avg. (of 19) = 0.272455 fft 8: mflops = 8.66952 (norm. = 0.179882), norm. avg. (of 19) = 0.137657 fft 9: mflops = 46.2494 (norm. = 0.959618), norm. avg. (of 19) = 0.505515 fft 10: mflops = 46.2497 (norm. = 0.959624), norm. avg. (of 19) = 0.547539 fft 11: mflops = 8.34185 (norm. = 0.173083), norm. avg. (of 18) = 0.115632 fft 12: mflops = 34.1035 (norm. = 0.707605), norm. avg. (of 19) = 0.47668 fft 13: mflops = 27.3489 (norm. = 0.567455), norm. avg. (of 19) = 0.318752 fft 14: mflops = 48.1957 (norm. = 1), norm. avg. (of 19) = 0.807696 fft 15: mflops = 47.0964 (norm. = 0.977191), norm. avg. (of 19) = 0.80815 fft 16: mflops = 24.6358 (norm. = 0.511163), norm. avg. (of 19) = 0.635652 fft 17: mflops = 27.3764 (norm. = 0.568026), norm. avg. (of 17) = 0.479844 fft 18: mflops = 26.1488 (norm. = 0.542555), norm. avg. (of 19) = 0.275781 fft 19: mflops = 11.5441 (norm. = 0.239526), norm. avg. (of 19) = 0.193507 fft 20: mflops = 11.9179 (norm. = 0.247281), norm. avg. (of 19) = 0.213831 fft 21: mflops = -1 (norm. = -0.0207488), norm. avg. (of 12) = 0.361552 fft 22: mflops = 15.0748 (norm. = 0.312783), norm. avg. 
(of 18) = 0.286922 fft 23: mflops = 15.3334 (norm. = 0.31815), norm. avg. (of 18) = 0.333328 fft 24: mflops = 14.8553 (norm. = 0.30823), norm. avg. (of 18) = 0.328945 fft 25: mflops = 13.0632 (norm. = 0.271045), norm. avg. (of 18) = 0.271266 fft 26: mflops = 11.7987 (norm. = 0.244809), norm. avg. (of 19) = 0.119357 fft 27: mflops = 11.7437 (norm. = 0.243667), norm. avg. (of 19) = 0.178714 fft 28: mflops = 11.9892 (norm. = 0.248762), norm. avg. (of 19) = 0.212183 fft 29: mflops = 12.0489 (norm. = 0.25), norm. avg. (of 19) = 0.215434 fft 30: mflops = 46.8357 (norm. = 0.971783), norm. avg. (of 19) = 0.574976 fft 31: mflops = 46.0433 (norm. = 0.955341), norm. avg. (of 19) = 0.561001 fft 32: mflops = 9.60197 (norm. = 0.199229), norm. avg. (of 16) = 0.206185 fft 33: mflops = 29.4444 (norm. = 0.610935), norm. avg. (of 18) = 0.275198 fft 34: mflops = 10.5368 (norm. = 0.218625), norm. avg. (of 18) = 0.265584 fft 35: mflops = 18.6263 (norm. = 0.386472), norm. avg. (of 19) = 0.360162 fft 36: mflops = 18.458 (norm. = 0.38298), norm. avg. (of 19) = 0.362784 fft 37: mflops = 20.9166 (norm. = 0.433994), norm. avg. (of 19) = 0.43908 fft 38: mflops = 8.25671 (norm. = 0.171317), norm. avg. (of 19) = 0.129993 fft 39: mflops = 19.0929 (norm. = 0.396154), norm. avg. (of 19) = 0.273222 fft 40: mflops = 16.4009 (norm. = 0.340297), norm. avg. (of 19) = 0.202958 fft 41: mflops = 3.10552 (norm. = 0.0644357), norm. avg. (of 19) = 0.0328083 fft 42: mflops = 41.6086 (norm. = 0.863326), norm. avg. (of 19) = 0.704842 Benchmarking for array size = 1048576 (power of 2): 0. Arndt DIF: elapsed time t=8.50601 s, 1 iters, t-(init.)=8.39898 s t(norm)=0.400494, mflops=12.4846 (err=1.9e-13) 1. Arndt DIT: elapsed time t=8.48032 s, 1 iters, t-(init.)=8.37319 s t(norm)=0.399265, mflops=12.523 (err=1.9e-13) 2. Arndt Split-Radix: elapsed time t=11.7296 s, 1 iters, t-(init.)=11.6225 s t(norm)=0.554202, mflops=9.02198 (err=1.9e-13) 3. 
Arndt 4-step: elapsed time t=5.98921 s, 1 iters, t-(init.)=5.88218 s t(norm)=0.280484, mflops=17.8263 (err=1.9e-13) 4. Bailey: elapsed time t=5.50706 s, 1 iters, t-(init.)=5.40003 s t(norm)=0.257494, mflops=19.418 (err=1.9e-13) 5. Beauregard: elapsed time t=20.3775 s, 1 iters, t-(init.)=20.2704 s t(norm)=0.966568, mflops=5.17294 (err=1.9e-13) 6. Bergland: elapsed time t=4.08037 s, 1 iters, t-(init.)=3.97323 s t(norm)=0.189458, mflops=26.391 (err=1.9e-13) 7. Brenner: elapsed time t=5.72219 s, 1 iters, t-(init.)=5.61515 s t(norm)=0.267751, mflops=18.6741 (err=1.9e-13) 8. Burrus: elapsed time t=12.4056 s, 1 iters, t-(init.)=12.2986 s t(norm)=0.586443, mflops=8.52598 (err=1.9e-13) 9. Skipping fft (this transform size is too big for CWP). 10. Skipping fft (this transform size is too big for CWP). 11. Edelblute: elapsed time t=12.8878 s, 1 iters, t-(init.)=12.7807 s t(norm)=0.609431, mflops=8.20437 (err=1.9e-13) 12. FFTPACK: elapsed time t=3.09885 s, 1 iters, t-(init.)=2.99184 s t(norm)=0.142662, mflops=35.0478 (err=1.9e-13) 13. FFTPACK (f2c): elapsed time t=3.87019 s, 1 iters, t-(init.)=3.76306 s t(norm)=0.179437, mflops=27.865 (err=1.9e-13) FFTW_MEASURE plan: (cost = 2.352618e+00) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=2.29771 s, 1 iters, t-(init.)=2.19058 s t(norm)=0.104455, mflops=47.8676 (err=1.9e-13) FFTW_ESTIMATE plan: (cost = 1.195377e+07) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=2.24196 s, 1 iters, t-(init.)=2.13482 s t(norm)=0.101796, mflops=49.1177 (err=1.9e-13) 16. Frigo-old: elapsed time t=4.28822 s, 1 iters, t-(init.)=4.18068 s t(norm)=0.19935, mflops=25.0815 (err=1.9e-13) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=3.79862 s, 1 iters, t-(init.)=3.69146 s t(norm)=0.176022, mflops=28.4055 (err=1.9e-13) 19. 
GSL DIT: elapsed time t=9.45716 s, 1 iters, t-(init.)=9.35002 s t(norm)=0.445843, mflops=11.2147 (err=1.9e-13) 20. GSL DIF: elapsed time t=9.15652 s, 1 iters, t-(init.)=9.04938 s t(norm)=0.431508, mflops=11.5873 (err=1.9e-13) 21. Skipping fft (Krukar can't handle N > 4096). 22. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 23. Mayer (simple): elapsed time t=7.46417 s, 1 iters, t-(init.)=7.35713 s t(norm)=0.350815, mflops=14.2525 24. Mayer (lookup): elapsed time t=7.66728 s, 1 iters, t-(init.)=7.55843 s t(norm)=0.360414, mflops=13.8729 (err=1.9e-13) 25. Monro: elapsed time t=8.20542 s, 1 iters, t-(init.)=8.09839 s t(norm)=0.386161, mflops=12.948 (err=2.0e-07) 26. NAPACK (f2c): elapsed time t=8.74652 s, 1 iters, t-(init.)=8.63949 s t(norm)=0.411963, mflops=12.137 (err=1.5e-11) 27. Nielsen: elapsed time t=8.71732 s, 1 iters, t-(init.)=8.61018 s t(norm)=0.410565, mflops=12.1783 (err=8.1e-12) 28. NR (C): elapsed time t=9.11994 s, 1 iters, t-(init.)=9.01278 s t(norm)=0.429763, mflops=11.6343 (err=1.9e-13) 29. NR (F): elapsed time t=9.0906 s, 1 iters, t-(init.)=8.98355 s t(norm)=0.428369, mflops=11.6722 (err=1.9e-13) 30. Ooura (C): elapsed time t=2.2236 s, 1 iters, t-(init.)=2.11646 s t(norm)=0.10092, mflops=49.544 (err=1.9e-13) 31. Ooura (F): elapsed time t=2.24969 s, 1 iters, t-(init.)=2.14264 s t(norm)=0.102169, mflops=48.9385 (err=1.9e-13) 32. QFT: elapsed time t=11.4295 s, 1 iters, t-(init.)=11.3224 s t(norm)=0.539893, mflops=9.26109 (err=1.9e-13) 33. Ransom: elapsed time t=2.9605 s, 1 iters, t-(init.)=2.85337 s t(norm)=0.136059, mflops=36.7487 (err=1.9e-13) 34. SCIPORT: elapsed time t=11.2936 s, 1 iters, t-(init.)=11.1865 s t(norm)=0.533415, mflops=9.37356 (err=1.9e-13) 35. Singleton: elapsed time t=4.8266 s, 1 iters, t-(init.)=4.71946 s t(norm)=0.225042, mflops=22.2181 (err=2.6e-13) 36. Singleton (f2c): elapsed time t=4.79169 s, 1 iters, t-(init.)=4.68457 s t(norm)=0.223377, mflops=22.3836 (err=2.6e-13) 37. 
Sorensen: elapsed time t=5.39254 s, 1 iters, t-(init.)=5.2854 s t(norm)=0.252028, mflops=19.8391 (err=1.9e-13) 38. Sorensen DIT: elapsed time t=12.9908 s, 1 iters, t-(init.)=12.8837 s t(norm)=0.614344, mflops=8.13876 (err=1.9e-13) 39. Temperton: elapsed time t=5.4316 s, 1 iters, t-(init.)=5.32449 s t(norm)=0.253891, mflops=19.6935 (err=2.3e-07) 40. Temperton (f2c): elapsed time t=6.35561 s, 1 iters, t-(init.)=6.24859 s t(norm)=0.297956, mflops=16.781 (err=1.9e-13) 41. Valkenburg: elapsed time t=34.5408 s, 1 iters, t-(init.)=34.4338 s t(norm)=1.64193, mflops=3.0452 (err=1.9e-13) 42. SUNPERF: elapsed time t=2.47287 s, 1 iters, t-(init.)=2.36584 s t(norm)=0.112812, mflops=44.3215 (err=1.9e-13) Top mflops for N=1048576 = 49.544 Normalized results and averages for N=1048576: fft 0: mflops = 12.4846 (norm. = 0.25199), norm. avg. (of 20) = 0.277729 fft 1: mflops = 12.523 (norm. = 0.252766), norm. avg. (of 20) = 0.281455 fft 2: mflops = 9.02198 (norm. = 0.182101), norm. avg. (of 20) = 0.171581 fft 3: mflops = 17.8263 (norm. = 0.359808), norm. avg. (of 20) = 0.159737 fft 4: mflops = 19.418 (norm. = 0.391934), norm. avg. (of 20) = 0.266769 fft 5: mflops = 5.17294 (norm. = 0.104411), norm. avg. (of 20) = 0.0528817 fft 6: mflops = 26.391 (norm. = 0.532679), norm. avg. (of 20) = 0.358682 fft 7: mflops = 18.6741 (norm. = 0.376919), norm. avg. (of 20) = 0.277678 fft 8: mflops = 8.52598 (norm. = 0.172089), norm. avg. (of 20) = 0.139379 fft 9: mflops = -1 (norm. = -0.0201841), norm. avg. (of 19) = 0.505515 fft 10: mflops = -1 (norm. = -0.0201841), norm. avg. (of 19) = 0.547539 fft 11: mflops = 8.20437 (norm. = 0.165598), norm. avg. (of 19) = 0.118262 fft 12: mflops = 35.0478 (norm. = 0.707409), norm. avg. (of 20) = 0.488217 fft 13: mflops = 27.865 (norm. = 0.56243), norm. avg. (of 20) = 0.330936 fft 14: mflops = 47.8676 (norm. = 0.966164), norm. avg. (of 20) = 0.815619 fft 15: mflops = 49.1177 (norm. = 0.991397), norm. avg. (of 20) = 0.817312 fft 16: mflops = 25.0815 (norm. 
= 0.506247), norm. avg. (of 20) = 0.629182 fft 17: mflops = -1 (norm. = -0.0201841), norm. avg. (of 17) = 0.479844 fft 18: mflops = 28.4055 (norm. = 0.573339), norm. avg. (of 20) = 0.290659 fft 19: mflops = 11.2147 (norm. = 0.226359), norm. avg. (of 20) = 0.19515 fft 20: mflops = 11.5873 (norm. = 0.233878), norm. avg. (of 20) = 0.214833 fft 21: mflops = -1 (norm. = -0.0201841), norm. avg. (of 12) = 0.361552 fft 22: mflops = -1 (norm. = -0.0201841), norm. avg. (of 18) = 0.286922 fft 23: mflops = 14.2525 (norm. = 0.287674), norm. avg. (of 19) = 0.330925 fft 24: mflops = 13.8729 (norm. = 0.280013), norm. avg. (of 19) = 0.326369 fft 25: mflops = 12.948 (norm. = 0.261343), norm. avg. (of 19) = 0.270744 fft 26: mflops = 12.137 (norm. = 0.244975), norm. avg. (of 20) = 0.125637 fft 27: mflops = 12.1783 (norm. = 0.245809), norm. avg. (of 20) = 0.182068 fft 28: mflops = 11.6343 (norm. = 0.234828), norm. avg. (of 20) = 0.213315 fft 29: mflops = 11.6722 (norm. = 0.235592), norm. avg. (of 20) = 0.216442 fft 30: mflops = 49.544 (norm. = 1), norm. avg. (of 20) = 0.596227 fft 31: mflops = 48.9385 (norm. = 0.98778), norm. avg. (of 20) = 0.58234 fft 32: mflops = 9.26109 (norm. = 0.186927), norm. avg. (of 17) = 0.205053 fft 33: mflops = 36.7487 (norm. = 0.741738), norm. avg. (of 19) = 0.299753 fft 34: mflops = 9.37356 (norm. = 0.189197), norm. avg. (of 19) = 0.261564 fft 35: mflops = 22.2181 (norm. = 0.448453), norm. avg. (of 20) = 0.364576 fft 36: mflops = 22.3836 (norm. = 0.451793), norm. avg. (of 20) = 0.367234 fft 37: mflops = 19.8391 (norm. = 0.400434), norm. avg. (of 20) = 0.437147 fft 38: mflops = 8.13876 (norm. = 0.164274), norm. avg. (of 20) = 0.131707 fft 39: mflops = 19.6935 (norm. = 0.397495), norm. avg. (of 20) = 0.279436 fft 40: mflops = 16.781 (norm. = 0.33871), norm. avg. (of 20) = 0.209746 fft 41: mflops = 3.0452 (norm. = 0.0614645), norm. avg. (of 20) = 0.0342411 fft 42: mflops = 44.3215 (norm. = 0.894589), norm. avg. 
(of 20) = 0.71433 Benchmarking for array size = 2097152 (power of 2): 0. Arndt DIF: elapsed time t=18.9516 s, 1 iters, t-(init.)=18.7374 s t(norm)=0.425462, mflops=11.7519 (err=2.7e-13) 1. Arndt DIT: elapsed time t=18.832 s, 1 iters, t-(init.)=18.6179 s t(norm)=0.422747, mflops=11.8274 (err=2.7e-13) 2. Arndt Split-Radix: elapsed time t=25.7171 s, 1 iters, t-(init.)=25.5029 s t(norm)=0.579081, mflops=8.63436 (err=2.7e-13) 3. Arndt 4-step: elapsed time t=16.5565 s, 1 iters, t-(init.)=16.3423 s t(norm)=0.371078, mflops=13.4743 (err=2.7e-13) 4. Bailey: elapsed time t=14.748 s, 1 iters, t-(init.)=14.5338 s t(norm)=0.330013, mflops=15.1509 (err=2.7e-13) 5. Beauregard: elapsed time t=43.0602 s, 1 iters, t-(init.)=42.8459 s t(norm)=0.972882, mflops=5.13937 (err=2.7e-13) 6. Skipping fft (Bergland doesn't work for N > 2^20). 7. Brenner: elapsed time t=12.495 s, 1 iters, t-(init.)=12.2808 s t(norm)=0.278855, mflops=17.9304 (err=2.7e-13) 8. Burrus: elapsed time t=27.1791 s, 1 iters, t-(init.)=26.9649 s t(norm)=0.61228, mflops=8.1662 (err=2.7e-13) 9. Skipping fft (this transform size is too big for CWP). 10. Skipping fft (this transform size is too big for CWP). 11. Edelblute: elapsed time t=28.2423 s, 1 iters, t-(init.)=28.0281 s t(norm)=0.636421, mflops=7.85643 (err=2.7e-13) 12. FFTPACK: elapsed time t=7.40682 s, 1 iters, t-(init.)=7.19265 s t(norm)=0.16332, mflops=30.6147 (err=2.7e-13) 13. FFTPACK (f2c): elapsed time t=9.32656 s, 1 iters, t-(init.)=9.11238 s t(norm)=0.206911, mflops=24.165 (err=2.7e-13) FFTW_MEASURE plan: (cost = 5.501667e+00) FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_NOTW 32 14. FFTW: elapsed time t=4.72799 s, 1 iters, t-(init.)=4.51379 s t(norm)=0.102493, mflops=48.784 (err=2.7e-13) FFTW_ESTIMATE plan: (cost = 2.390753e+07) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=4.65021 s, 1 iters, t-(init.)=4.43603 s t(norm)=0.100727, mflops=49.6392 (err=2.7e-13) 16. Frigo-old: elapsed time t=9.01648 s, 1 iters, t-(init.)=8.8023 s t(norm)=0.19987, mflops=25.0163 (err=2.7e-13) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=9.10744 s, 1 iters, t-(init.)=8.89328 s t(norm)=0.201936, mflops=24.7604 (err=2.7e-13) 19. GSL DIT: elapsed time t=20.8778 s, 1 iters, t-(init.)=20.6636 s t(norm)=0.469198, mflops=10.6565 (err=2.7e-13) 20. GSL DIF: elapsed time t=20.5468 s, 1 iters, t-(init.)=20.3326 s t(norm)=0.461683, mflops=10.8299 (err=2.7e-13) 21. Skipping fft (Krukar can't handle N > 4096). 22. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 23. Mayer (simple): elapsed time t=16.7258 s, 1 iters, t-(init.)=16.5116 s t(norm)=0.374922, mflops=13.3361 24. Mayer (lookup): elapsed time t=17.1504 s, 1 iters, t-(init.)=16.9362 s t(norm)=0.384563, mflops=13.0018 (err=2.7e-13) 25. Skipping fft (Monro can't handle N > 2^20). 26. NAPACK (f2c): elapsed time t=18.7553 s, 1 iters, t-(init.)=18.5412 s t(norm)=0.421005, mflops=11.8763 (err=1.9e-11) 27. Nielsen: elapsed time t=18.315 s, 1 iters, t-(init.)=18.1009 s t(norm)=0.411008, mflops=12.1652 (err=6.3e-12) 28. NR (C): elapsed time t=20.1844 s, 1 iters, t-(init.)=19.9685 s t(norm)=0.453416, mflops=11.0274 (err=2.7e-13) 29. NR (F): elapsed time t=20.1353 s, 1 iters, t-(init.)=19.9211 s t(norm)=0.452338, mflops=11.0537 (err=2.7e-13) 30. Ooura (C): elapsed time t=5.2861 s, 1 iters, t-(init.)=5.07195 s t(norm)=0.115166, mflops=43.4154 (err=2.7e-13) 31. Ooura (F): elapsed time t=5.19581 s, 1 iters, t-(init.)=4.98161 s t(norm)=0.113115, mflops=44.2027 (err=2.7e-13) 32. QFT: elapsed time t=24.8387 s, 1 iters, t-(init.)=24.6245 s t(norm)=0.559136, mflops=8.94236 (err=2.8e-13) 33. Ransom: elapsed time t=7.68697 s, 1 iters, t-(init.)=7.4728 s t(norm)=0.169681, mflops=29.467 (err=2.7e-13) 34. 
SCIPORT: elapsed time t=23.1177 s, 1 iters, t-(init.)=22.9035 s t(norm)=0.520059, mflops=9.6143 (err=2.7e-13) 35. Singleton: elapsed time t=11.4046 s, 1 iters, t-(init.)=11.1904 s t(norm)=0.254095, mflops=19.6777 (err=3.7e-13) 36. Singleton (f2c): elapsed time t=11.4481 s, 1 iters, t-(init.)=11.2338 s t(norm)=0.255081, mflops=19.6016 (err=3.7e-13) 37. Sorensen: elapsed time t=11.9901 s, 1 iters, t-(init.)=11.776 s t(norm)=0.267392, mflops=18.6991 (err=2.7e-13) 38. Sorensen DIT: elapsed time t=28.5685 s, 1 iters, t-(init.)=28.3542 s t(norm)=0.643826, mflops=7.76607 (err=2.7e-13) 39. Temperton: elapsed time t=13.1552 s, 1 iters, t-(init.)=12.941 s t(norm)=0.293846, mflops=17.0157 (err=2.4e-07) 40. Temperton (f2c): elapsed time t=15.1242 s, 1 iters, t-(init.)=14.9081 s t(norm)=0.338511, mflops=14.7706 (err=2.7e-13) 41. Valkenburg: elapsed time t=74.2226 s, 1 iters, t-(init.)=74.0085 s t(norm)=1.68048, mflops=2.97535 (err=2.7e-13) 42. SUNPERF: elapsed time t=6.50081 s, 1 iters, t-(init.)=6.28664 s t(norm)=0.142748, mflops=35.0268 (err=2.7e-13) Top mflops for N=2097152 = 49.6392 Normalized results and averages for N=2097152: fft 0: mflops = 11.7519 (norm. = 0.236747), norm. avg. (of 21) = 0.275777 fft 1: mflops = 11.8274 (norm. = 0.238268), norm. avg. (of 21) = 0.279399 fft 2: mflops = 8.63436 (norm. = 0.173943), norm. avg. (of 21) = 0.171693 fft 3: mflops = 13.4743 (norm. = 0.271444), norm. avg. (of 21) = 0.165057 fft 4: mflops = 15.1509 (norm. = 0.305221), norm. avg. (of 21) = 0.2686 fft 5: mflops = 5.13937 (norm. = 0.103535), norm. avg. (of 21) = 0.0552938 fft 6: mflops = -1 (norm. = -0.0201454), norm. avg. (of 20) = 0.358682 fft 7: mflops = 17.9304 (norm. = 0.361216), norm. avg. (of 21) = 0.281656 fft 8: mflops = 8.1662 (norm. = 0.164511), norm. avg. (of 21) = 0.140576 fft 9: mflops = -1 (norm. = -0.0201454), norm. avg. (of 19) = 0.505515 fft 10: mflops = -1 (norm. = -0.0201454), norm. avg. (of 19) = 0.547539 fft 11: mflops = 7.85643 (norm. = 0.158271), norm. avg. 
(of 20) = 0.120263 fft 12: mflops = 30.6147 (norm. = 0.616746), norm. avg. (of 21) = 0.494337 fft 13: mflops = 24.165 (norm. = 0.486814), norm. avg. (of 21) = 0.338358 fft 14: mflops = 48.784 (norm. = 0.982772), norm. avg. (of 21) = 0.823579 fft 15: mflops = 49.6392 (norm. = 1), norm. avg. (of 21) = 0.826012 fft 16: mflops = 25.0163 (norm. = 0.503963), norm. avg. (of 21) = 0.623219 fft 17: mflops = -1 (norm. = -0.0201454), norm. avg. (of 17) = 0.479844 fft 18: mflops = 24.7604 (norm. = 0.498807), norm. avg. (of 21) = 0.300571 fft 19: mflops = 10.6565 (norm. = 0.214679), norm. avg. (of 21) = 0.19608 fft 20: mflops = 10.8299 (norm. = 0.218173), norm. avg. (of 21) = 0.214992 fft 21: mflops = -1 (norm. = -0.0201454), norm. avg. (of 12) = 0.361552 fft 22: mflops = -1 (norm. = -0.0201454), norm. avg. (of 18) = 0.286922 fft 23: mflops = 13.3361 (norm. = 0.268661), norm. avg. (of 20) = 0.327812 fft 24: mflops = 13.0018 (norm. = 0.261926), norm. avg. (of 20) = 0.323147 fft 25: mflops = -1 (norm. = -0.0201454), norm. avg. (of 19) = 0.270744 fft 26: mflops = 11.8763 (norm. = 0.239253), norm. avg. (of 21) = 0.131048 fft 27: mflops = 12.1652 (norm. = 0.245073), norm. avg. (of 21) = 0.185069 fft 28: mflops = 11.0274 (norm. = 0.222151), norm. avg. (of 21) = 0.213736 fft 29: mflops = 11.0537 (norm. = 0.222681), norm. avg. (of 21) = 0.216739 fft 30: mflops = 43.4154 (norm. = 0.87462), norm. avg. (of 21) = 0.609484 fft 31: mflops = 44.2027 (norm. = 0.890481), norm. avg. (of 21) = 0.597013 fft 32: mflops = 8.94236 (norm. = 0.180147), norm. avg. (of 18) = 0.203669 fft 33: mflops = 29.467 (norm. = 0.593624), norm. avg. (of 20) = 0.314447 fft 34: mflops = 9.6143 (norm. = 0.193684), norm. avg. (of 20) = 0.25817 fft 35: mflops = 19.6777 (norm. = 0.396415), norm. avg. (of 21) = 0.366093 fft 36: mflops = 19.6016 (norm. = 0.394882), norm. avg. (of 21) = 0.368551 fft 37: mflops = 18.6991 (norm. = 0.376701), norm. avg. (of 21) = 0.434269 fft 38: mflops = 7.76607 (norm. = 0.156451), norm. avg. 
(of 21) = 0.132885 fft 39: mflops = 17.0157 (norm. = 0.342788), norm. avg. (of 21) = 0.282452 fft 40: mflops = 14.7706 (norm. = 0.297559), norm. avg. (of 21) = 0.213928 fft 41: mflops = 2.97535 (norm. = 0.0599395), norm. avg. (of 21) = 0.0354649 fft 42: mflops = 35.0268 (norm. = 0.705628), norm. avg. (of 21) = 0.713915 Benchmarking for array size = 4194304 (power of 2): 0. Arndt DIF: elapsed time t=42.3504 s, 1 iters, t-(init.)=41.922 s t(norm)=0.454317, mflops=11.0055 (err=6.3e-13) 1. Arndt DIT: elapsed time t=42.9837 s, 1 iters, t-(init.)=42.5554 s t(norm)=0.461182, mflops=10.8417 (err=6.3e-13) 2. Arndt Split-Radix: elapsed time t=57.5676 s, 1 iters, t-(init.)=57.1392 s t(norm)=0.619229, mflops=8.07455 (err=6.3e-13) 3. Arndt 4-step: elapsed time t=24.865 s, 1 iters, t-(init.)=24.4365 s t(norm)=0.264823, mflops=18.8805 (err=6.3e-13) 4. Bailey: elapsed time t=31.1531 s, 1 iters, t-(init.)=30.7212 s t(norm)=0.332932, mflops=15.0181 (err=6.3e-13) 5. Beauregard: elapsed time t=90.8228 s, 1 iters, t-(init.)=90.3944 s t(norm)=0.979623, mflops=5.104 (err=6.3e-13) 6. Skipping fft (Bergland doesn't work for N > 2^20). 7. Brenner: elapsed time t=26.4348 s, 1 iters, t-(init.)=26.0063 s t(norm)=0.281836, mflops=17.7408 (err=6.3e-13) 8. Burrus: elapsed time t=61.0295 s, 1 iters, t-(init.)=60.6012 s t(norm)=0.656748, mflops=7.61328 (err=6.3e-13) 9. Skipping fft (this transform size is too big for CWP). 10. Skipping fft (this transform size is too big for CWP). 11. Edelblute: elapsed time t=62.5739 s, 1 iters, t-(init.)=62.1456 s t(norm)=0.673484, mflops=7.42408 (err=6.3e-13) 12. FFTPACK: elapsed time t=15.6253 s, 1 iters, t-(init.)=15.1933 s t(norm)=0.164653, mflops=30.3669 (err=6.3e-13) 13. FFTPACK (f2c): elapsed time t=19.3394 s, 1 iters, t-(init.)=18.911 s t(norm)=0.204943, mflops=24.3971 (err=6.3e-13) FFTW_MEASURE plan: (cost = 1.011110e+01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_NOTW 32 14. 
FFTW: elapsed time t=10.5109 s, 1 iters, t-(init.)=10.0753 s t(norm)=0.109188, mflops=45.7926 (err=6.3e-13) FFTW_ESTIMATE plan: (cost = 5.872026e+07) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=11.0601 s, 1 iters, t-(init.)=10.6282 s t(norm)=0.11518, mflops=43.4101 (err=6.3e-13) 16. Frigo-old: elapsed time t=21.2316 s, 1 iters, t-(init.)=20.8032 s t(norm)=0.225448, mflops=22.178 (err=6.3e-13) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=18.5733 s, 1 iters, t-(init.)=18.145 s t(norm)=0.196641, mflops=25.4271 (err=6.3e-13) 19. GSL DIT: elapsed time t=45.5209 s, 1 iters, t-(init.)=45.0924 s t(norm)=0.488676, mflops=10.2317 (err=6.7e-13) 20. GSL DIF: elapsed time t=45.3521 s, 1 iters, t-(init.)=44.9238 s t(norm)=0.486849, mflops=10.2701 (err=6.3e-13) 21. Skipping fft (Krukar can't handle N > 4096). 22. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 23. Mayer (simple): elapsed time t=37.81 s, 1 iters, t-(init.)=37.3815 s t(norm)=0.405111, mflops=12.3423 24. Mayer (lookup): elapsed time t=38.7723 s, 1 iters, t-(init.)=38.3439 s t(norm)=0.415541, mflops=12.0325 (err=6.3e-13) 25. Skipping fft (Monro can't handle N > 2^20). 26. NAPACK (f2c): elapsed time t=39.4335 s, 1 iters, t-(init.)=39.0051 s t(norm)=0.422707, mflops=11.8285 (err=7.2e-11) 27. Nielsen: elapsed time t=41.7986 s, 1 iters, t-(init.)=41.3703 s t(norm)=0.448338, mflops=11.1523 (err=4.5e-11) 28. NR (C): elapsed time t=44.0489 s, 1 iters, t-(init.)=43.6206 s t(norm)=0.472726, mflops=10.577 (err=6.3e-13) 29. NR (F): elapsed time t=43.9371 s, 1 iters, t-(init.)=43.5088 s t(norm)=0.471513, mflops=10.6042 (err=6.3e-13) 30. Ooura (C): elapsed time t=11.1885 s, 1 iters, t-(init.)=10.7602 s t(norm)=0.11661, mflops=42.8779 (err=6.3e-13) 31. Ooura (F): elapsed time t=11.1916 s, 1 iters, t-(init.)=10.7631 s t(norm)=0.116642, mflops=42.8662 (err=6.3e-13) 32. 
QFT: elapsed time t=57.8157 s, 1 iters, t-(init.)=57.3776 s t(norm)=0.621813, mflops=8.04101 (err=6.4e-13) 33. Ransom: elapsed time t=13.7871 s, 1 iters, t-(init.)=13.3588 s t(norm)=0.144772, mflops=34.5371 (err=6.3e-13) 34. SCIPORT: elapsed time t=53.0803 s, 1 iters, t-(init.)=52.652 s t(norm)=0.5706, mflops=8.7627 (err=6.3e-13) 35. Singleton: elapsed time t=23.7478 s, 1 iters, t-(init.)=23.3195 s t(norm)=0.252718, mflops=19.7849 (err=8.6e-13) 36. Singleton (f2c): elapsed time t=23.9449 s, 1 iters, t-(init.)=23.5166 s t(norm)=0.254854, mflops=19.619 (err=8.6e-13) 37. Sorensen: elapsed time t=28.6656 s, 1 iters, t-(init.)=28.2372 s t(norm)=0.306013, mflops=16.3392 (err=6.3e-13) 38. Sorensen DIT: elapsed time t=64.1439 s, 1 iters, t-(init.)=63.7121 s t(norm)=0.690462, mflops=7.24153 (err=6.3e-13) 39. Temperton: elapsed time t=27.841 s, 1 iters, t-(init.)=27.4127 s t(norm)=0.297077, mflops=16.8306 (err=2.5e-07) 40. Temperton (f2c): elapsed time t=31.5975 s, 1 iters, t-(init.)=31.1691 s t(norm)=0.337786, mflops=14.8023 (err=6.3e-13) 41. Valkenburg: elapsed time t=159.083 s, 1 iters, t-(init.)=158.655 s t(norm)=1.71937, mflops=2.90803 (err=6.3e-13) 42. SUNPERF: elapsed time t=13.3777 s, 1 iters, t-(init.)=12.9494 s t(norm)=0.140335, mflops=35.629 (err=6.3e-13) Top mflops for N=4194304 = 45.7926 Normalized results and averages for N=4194304: fft 0: mflops = 11.0055 (norm. = 0.240334), norm. avg. (of 22) = 0.274166 fft 1: mflops = 10.8417 (norm. = 0.236757), norm. avg. (of 22) = 0.27746 fft 2: mflops = 8.07455 (norm. = 0.176329), norm. avg. (of 22) = 0.171904 fft 3: mflops = 18.8805 (norm. = 0.412304), norm. avg. (of 22) = 0.176295 fft 4: mflops = 15.0181 (norm. = 0.327959), norm. avg. (of 22) = 0.271299 fft 5: mflops = 5.104 (norm. = 0.111459), norm. avg. (of 22) = 0.0578467 fft 6: mflops = -1 (norm. = -0.0218376), norm. avg. (of 20) = 0.358682 fft 7: mflops = 17.7408 (norm. = 0.387417), norm. avg. (of 22) = 0.286463 fft 8: mflops = 7.61328 (norm. = 0.166255), norm. 
avg. (of 22) = 0.141743 fft 9: mflops = -1 (norm. = -0.0218376), norm. avg. (of 19) = 0.505515 fft 10: mflops = -1 (norm. = -0.0218376), norm. avg. (of 19) = 0.547539 fft 11: mflops = 7.42408 (norm. = 0.162124), norm. avg. (of 21) = 0.122256 fft 12: mflops = 30.3669 (norm. = 0.66314), norm. avg. (of 22) = 0.50201 fft 13: mflops = 24.3971 (norm. = 0.532773), norm. avg. (of 22) = 0.347195 fft 14: mflops = 45.7926 (norm. = 1), norm. avg. (of 22) = 0.831598 fft 15: mflops = 43.4101 (norm. = 0.947972), norm. avg. (of 22) = 0.831555 fft 16: mflops = 22.178 (norm. = 0.484314), norm. avg. (of 22) = 0.616905 fft 17: mflops = -1 (norm. = -0.0218376), norm. avg. (of 17) = 0.479844 fft 18: mflops = 25.4271 (norm. = 0.555265), norm. avg. (of 22) = 0.312148 fft 19: mflops = 10.2317 (norm. = 0.223436), norm. avg. (of 22) = 0.197323 fft 20: mflops = 10.2701 (norm. = 0.224275), norm. avg. (of 22) = 0.215414 fft 21: mflops = -1 (norm. = -0.0218376), norm. avg. (of 12) = 0.361552 fft 22: mflops = -1 (norm. = -0.0218376), norm. avg. (of 18) = 0.286922 fft 23: mflops = 12.3423 (norm. = 0.269526), norm. avg. (of 21) = 0.325036 fft 24: mflops = 12.0325 (norm. = 0.262761), norm. avg. (of 21) = 0.320271 fft 25: mflops = -1 (norm. = -0.0218376), norm. avg. (of 19) = 0.270744 fft 26: mflops = 11.8285 (norm. = 0.258306), norm. avg. (of 22) = 0.136832 fft 27: mflops = 11.1523 (norm. = 0.243539), norm. avg. (of 22) = 0.187726 fft 28: mflops = 10.577 (norm. = 0.230975), norm. avg. (of 22) = 0.21452 fft 29: mflops = 10.6042 (norm. = 0.231569), norm. avg. (of 22) = 0.217413 fft 30: mflops = 42.8779 (norm. = 0.936349), norm. avg. (of 22) = 0.624341 fft 31: mflops = 42.8662 (norm. = 0.936093), norm. avg. (of 22) = 0.612426 fft 32: mflops = 8.04101 (norm. = 0.175596), norm. avg. (of 19) = 0.202191 fft 33: mflops = 34.5371 (norm. = 0.754207), norm. avg. (of 21) = 0.335388 fft 34: mflops = 8.7627 (norm. = 0.191356), norm. avg. (of 21) = 0.254988 fft 35: mflops = 19.7849 (norm. = 0.432054), norm. avg. 
(of 22) = 0.369091 fft 36: mflops = 19.619 (norm. = 0.428432), norm. avg. (of 22) = 0.371273 fft 37: mflops = 16.3392 (norm. = 0.356808), norm. avg. (of 22) = 0.430748 fft 38: mflops = 7.24153 (norm. = 0.158137), norm. avg. (of 22) = 0.134033 fft 39: mflops = 16.8306 (norm. = 0.36754), norm. avg. (of 22) = 0.28632 fft 40: mflops = 14.8023 (norm. = 0.323246), norm. avg. (of 22) = 0.218897 fft 41: mflops = 2.90803 (norm. = 0.0635044), norm. avg. (of 22) = 0.0367394 fft 42: mflops = 35.629 (norm. = 0.77805), norm. avg. (of 22) = 0.716831 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. Brenner 1. CWP (min N) 2. CWP (best N) 3. FFTPACK 4. FFTPACK (f2c) 5. FFTW 6. FFTW_ESTIMATE 7. Frigo-old 8. GSL 9. NAPACK (f2c) 10. Nielsen 11. Singleton 12. Singleton (f2c) 13. Temperton 14. Temperton (f2c) 15. Valkenburg 16. SUNPERF Computing normalized averages (17 transforms). Benchmarking for array size = 6: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.82876 s, 262144 iters, t-(init.)=1.77259 s t(norm)=0.435976, mflops=11.4685 2. CWP (best N) (N=15): elapsed time t=1.29039 s, 131072 iters, t-(init.)=1.23968 s t(norm)=0.609808, mflops=8.1993 3. FFTPACK: elapsed time t=1.5523 s, 524288 iters, t-(init.)=1.41871 s t(norm)=0.174468, mflops=28.6585 (err=1.4e-16) 4. FFTPACK (f2c): elapsed time t=1.02557 s, 262144 iters, t-(init.)=0.967836 s t(norm)=0.238044, mflops=21.0046 (err=1.8e-16) FFTW_MEASURE plan: (cost = 7.447617e-07) FFTW_NOTW 6 5. 
FFTW: elapsed time t=1.63187 s, 2097152 iters, t-(init.)=1.16729 s t(norm)=0.0358875, mflops=139.324 (err=1.1e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 6. FFTW_ESTIMATE: elapsed time t=1.81525 s, 2097152 iters, t-(init.)=1.30341 s t(norm)=0.0400725, mflops=124.774 (err=1.1e-16) 7. Frigo-old: elapsed time t=1.5665 s, 262144 iters, t-(init.)=1.51032 s t(norm)=0.371471, mflops=13.46 (err=3.3e-16) 8. GSL: elapsed time t=1.05517 s, 262144 iters, t-(init.)=0.981317 s t(norm)=0.241359, mflops=20.716 (err=1.2e-16) 9. NAPACK (f2c): elapsed time t=1.799 s, 131072 iters, t-(init.)=1.76543 s t(norm)=0.868429, mflops=5.75752 (err=4.7e-16) 10. Nielsen: elapsed time t=1.26979 s, 65536 iters, t-(init.)=1.25536 s t(norm)=1.23504, mflops=4.04844 (err=2.7e-16) 11. Singleton: elapsed time t=1.32462 s, 131072 iters, t-(init.)=1.28613 s t(norm)=0.63266, mflops=7.90314 (err=1.0e-16) 12. Singleton (f2c): elapsed time t=1.31983 s, 131072 iters, t-(init.)=1.29253 s t(norm)=0.635805, mflops=7.86405 (err=1.0e-16) 13. Temperton: elapsed time t=1.0984 s, 131072 iters, t-(init.)=1.06564 s t(norm)=0.524196, mflops=9.53842 (err=3.7e-16) 14. Temperton (f2c): elapsed time t=1.37062 s, 131072 iters, t-(init.)=1.34331 s t(norm)=0.660785, mflops=7.56676 (err=1.0e-16) 15. Valkenburg: elapsed time t=1.36486 s, 65536 iters, t-(init.)=1.34887 s t(norm)=1.32704, mflops=3.76779 (err=3.4e-16) 16. SUNPERF: elapsed time t=1.4037 s, 524288 iters, t-(init.)=1.29447 s t(norm)=0.159191, mflops=31.4088 (err=1.4e-16) Top mflops for N=6 = 139.324 Normalized results and averages for N=6: fft 0: mflops = -1 (norm. = -0.00717751), norm. avg. (of 0) = -1 fft 1: mflops = 11.4685 (norm. = 0.0823155), norm. avg. (of 1) = 0.0823155 fft 2: mflops = 8.1993 (norm. = 0.0588505), norm. avg. (of 1) = 0.0588505 fft 3: mflops = 28.6585 (norm. = 0.205696), norm. avg. (of 1) = 0.205696 fft 4: mflops = 21.0046 (norm. = 0.15076), norm. avg. (of 1) = 0.15076 fft 5: mflops = 139.324 (norm. = 1), norm. avg. 
(of 1) = 1 fft 6: mflops = 124.774 (norm. = 0.895566), norm. avg. (of 1) = 0.895566 fft 7: mflops = 13.46 (norm. = 0.0966093), norm. avg. (of 1) = 0.0966093 fft 8: mflops = 20.716 (norm. = 0.148689), norm. avg. (of 1) = 0.148689 fft 9: mflops = 5.75752 (norm. = 0.0413246), norm. avg. (of 1) = 0.0413246 fft 10: mflops = 4.04844 (norm. = 0.0290577), norm. avg. (of 1) = 0.0290577 fft 11: mflops = 7.90314 (norm. = 0.0567248), norm. avg. (of 1) = 0.0567248 fft 12: mflops = 7.86405 (norm. = 0.0564443), norm. avg. (of 1) = 0.0564443 fft 13: mflops = 9.53842 (norm. = 0.0684621), norm. avg. (of 1) = 0.0684621 fft 14: mflops = 7.56676 (norm. = 0.0543104), norm. avg. (of 1) = 0.0543104 fft 15: mflops = 3.76779 (norm. = 0.0270433), norm. avg. (of 1) = 0.0270433 fft 16: mflops = 31.4088 (norm. = 0.225437), norm. avg. (of 1) = 0.225437 Benchmarking for array size = 9: 0. Brenner: elapsed time t=2.00173 s, 65536 iters, t-(init.)=1.98379 s t(norm)=1.06102, mflops=4.71245 (err=3.6e-16) 1. CWP (min N): elapsed time t=1.91902 s, 262144 iters, t-(init.)=1.84214 s t(norm)=0.246315, mflops=20.2992 2. CWP (best N) (N=15): elapsed time t=1.28175 s, 131072 iters, t-(init.)=1.23182 s t(norm)=0.329416, mflops=15.1784 3. FFTPACK: elapsed time t=1.83944 s, 524288 iters, t-(init.)=1.68339 s t(norm)=0.112544, mflops=44.427 (err=1.4e-16) 4. FFTPACK (f2c): elapsed time t=1.3765 s, 262144 iters, t-(init.)=1.30005 s t(norm)=0.173831, mflops=28.7636 (err=2.4e-16) FFTW_MEASURE plan: (cost = 1.207547e-06) FFTW_NOTW 9 5. FFTW: elapsed time t=1.31186 s, 1048576 iters, t-(init.)=1.04095 s t(norm)=0.0347969, mflops=143.691 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.31603 s, 1048576 iters, t-(init.)=1.01369 s t(norm)=0.0338855, mflops=147.556 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.71503 s, 131072 iters, t-(init.)=1.67989 s t(norm)=0.449242, mflops=11.1299 (err=3.3e-16) 8. 
GSL: elapsed time t=1.76352 s, 262144 iters, t-(init.)=1.68392 s t(norm)=0.225159, mflops=22.2065 (err=1.4e-16) 9. NAPACK (f2c): elapsed time t=1.90537 s, 131072 iters, t-(init.)=1.87026 s t(norm)=0.500151, mflops=9.99698 (err=4.6e-16) 10. Nielsen: elapsed time t=1.25737 s, 65536 iters, t-(init.)=1.24018 s t(norm)=0.663308, mflops=7.53798 (err=4.7e-16) 11. Singleton: elapsed time t=1.37827 s, 131072 iters, t-(init.)=1.3364 s t(norm)=0.357383, mflops=13.9906 (err=1.5e-16) 12. Singleton (f2c): elapsed time t=1.33791 s, 131072 iters, t-(init.)=1.30436 s t(norm)=0.348817, mflops=14.3342 (err=1.5e-16) 13. Temperton: elapsed time t=1.21244 s, 131072 iters, t-(init.)=1.17655 s t(norm)=0.314638, mflops=15.8913 (err=1.1e-08) 14. Temperton (f2c): elapsed time t=1.3031 s, 131072 iters, t-(init.)=1.26953 s t(norm)=0.339501, mflops=14.7275 (err=1.4e-16) 15. Valkenburg: elapsed time t=1.27436 s, 32768 iters, t-(init.)=1.26461 s t(norm)=1.35274, mflops=3.6962 (err=3.7e-16) 16. SUNPERF: elapsed time t=1.71859 s, 524288 iters, t-(init.)=1.58128 s t(norm)=0.105717, mflops=47.2959 (err=1.4e-16) Top mflops for N=9 = 147.556 Normalized results and averages for N=9: fft 0: mflops = 4.71245 (norm. = 0.0319367), norm. avg. (of 1) = 0.0319367 fft 1: mflops = 20.2992 (norm. = 0.13757), norm. avg. (of 2) = 0.109943 fft 2: mflops = 15.1784 (norm. = 0.102865), norm. avg. (of 2) = 0.0808579 fft 3: mflops = 44.427 (norm. = 0.301086), norm. avg. (of 2) = 0.253391 fft 4: mflops = 28.7636 (norm. = 0.194933), norm. avg. (of 2) = 0.172847 fft 5: mflops = 143.691 (norm. = 0.973808), norm. avg. (of 2) = 0.986904 fft 6: mflops = 147.556 (norm. = 1), norm. avg. (of 2) = 0.947783 fft 7: mflops = 11.1299 (norm. = 0.0754281), norm. avg. (of 2) = 0.0860187 fft 8: mflops = 22.2065 (norm. = 0.150496), norm. avg. (of 2) = 0.149592 fft 9: mflops = 9.99698 (norm. = 0.0677505), norm. avg. (of 2) = 0.0545376 fft 10: mflops = 7.53798 (norm. = 0.0510856), norm. avg. (of 2) = 0.0400717 fft 11: mflops = 13.9906 (norm. 
= 0.0948157), norm. avg. (of 2) = 0.0757703 fft 12: mflops = 14.3342 (norm. = 0.097144), norm. avg. (of 2) = 0.0767942 fft 13: mflops = 15.8913 (norm. = 0.107697), norm. avg. (of 2) = 0.0880795 fft 14: mflops = 14.7275 (norm. = 0.0998097), norm. avg. (of 2) = 0.0770601 fft 15: mflops = 3.6962 (norm. = 0.0250495), norm. avg. (of 2) = 0.0260464 fft 16: mflops = 47.2959 (norm. = 0.320529), norm. avg. (of 2) = 0.272983 Benchmarking for array size = 12: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.04767 s, 131072 iters, t-(init.)=1.0071 s t(norm)=0.178606, mflops=27.9946 2. CWP (best N) (N=15): elapsed time t=1.28795 s, 131072 iters, t-(init.)=1.24114 s t(norm)=0.220112, mflops=22.7157 3. FFTPACK: elapsed time t=1.96401 s, 524288 iters, t-(init.)=1.78091 s t(norm)=0.0789597, mflops=63.3235 (err=1.7e-16) 4. FFTPACK (f2c): elapsed time t=1.46573 s, 262144 iters, t-(init.)=1.38301 s t(norm)=0.122636, mflops=40.771 (err=2.2e-16) FFTW_MEASURE plan: (cost = 1.271531e-06) FFTW_NOTW 12 5. FFTW: elapsed time t=1.33779 s, 1048576 iters, t-(init.)=1.01669 s t(norm)=0.0225384, mflops=221.843 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.40273 s, 1048576 iters, t-(init.)=1.05319 s t(norm)=0.0233476, mflops=214.155 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.48169 s, 131072 iters, t-(init.)=1.44268 s t(norm)=0.255855, mflops=19.5424 (err=2.9e-16) 8. GSL: elapsed time t=1.8978 s, 262144 iters, t-(init.)=1.81508 s t(norm)=0.16095, mflops=31.0656 (err=1.6e-16) 9. NAPACK (f2c): elapsed time t=1.47106 s, 65536 iters, t-(init.)=1.44882 s t(norm)=0.513889, mflops=9.72972 (err=5.5e-16) 10. Nielsen: elapsed time t=1.36225 s, 65536 iters, t-(init.)=1.34275 s t(norm)=0.476265, mflops=10.4983 (err=5.0e-16) 11. Singleton: elapsed time t=1.86186 s, 131072 iters, t-(init.)=1.81583 s t(norm)=0.322033, mflops=15.5264 (err=1.5e-16) 12. 
Singleton (f2c): elapsed time t=1.85711 s, 131072 iters, t-(init.)=1.81732 s t(norm)=0.322297, mflops=15.5136 (err=1.5e-16) 13. Temperton: elapsed time t=1.36416 s, 131072 iters, t-(init.)=1.32359 s t(norm)=0.234735, mflops=21.3006 (err=5.4e-16) 14. Temperton (f2c): elapsed time t=1.67489 s, 131072 iters, t-(init.)=1.63666 s t(norm)=0.290257, mflops=17.2261 (err=1.4e-16) 15. Valkenburg: elapsed time t=1.82662 s, 32768 iters, t-(init.)=1.81511 s t(norm)=1.28762, mflops=3.88313 (err=3.4e-16) 16. SUNPERF: elapsed time t=1.64832 s, 524288 iters, t-(init.)=1.49229 s t(norm)=0.0661632, mflops=75.5707 (err=1.7e-16) Top mflops for N=12 = 221.843 Normalized results and averages for N=12: fft 0: mflops = -1 (norm. = -0.00450769), norm. avg. (of 1) = 0.0319367 fft 1: mflops = 27.9946 (norm. = 0.126191), norm. avg. (of 3) = 0.115359 fft 2: mflops = 22.7157 (norm. = 0.102395), norm. avg. (of 3) = 0.088037 fft 3: mflops = 63.3235 (norm. = 0.285442), norm. avg. (of 3) = 0.264075 fft 4: mflops = 40.771 (norm. = 0.183783), norm. avg. (of 3) = 0.176492 fft 5: mflops = 221.843 (norm. = 1), norm. avg. (of 3) = 0.991269 fft 6: mflops = 214.155 (norm. = 0.965344), norm. avg. (of 3) = 0.953637 fft 7: mflops = 19.5424 (norm. = 0.0880908), norm. avg. (of 3) = 0.0867094 fft 8: mflops = 31.0656 (norm. = 0.140034), norm. avg. (of 3) = 0.146406 fft 9: mflops = 9.72972 (norm. = 0.0438585), norm. avg. (of 3) = 0.0509779 fft 10: mflops = 10.4983 (norm. = 0.0473233), norm. avg. (of 3) = 0.0424889 fft 11: mflops = 15.5264 (norm. = 0.069988), norm. avg. (of 3) = 0.0738428 fft 12: mflops = 15.5136 (norm. = 0.0699307), norm. avg. (of 3) = 0.0745063 fft 13: mflops = 21.3006 (norm. = 0.0960164), norm. avg. (of 3) = 0.0907251 fft 14: mflops = 17.2261 (norm. = 0.07765), norm. avg. (of 3) = 0.0772567 fft 15: mflops = 3.88313 (norm. = 0.0175039), norm. avg. (of 3) = 0.0231989 fft 16: mflops = 75.5707 (norm. = 0.340649), norm. avg. (of 3) = 0.295538 Benchmarking for array size = 15: 0. 
Brenner: elapsed time t=1.43332 s, 32768 iters, t-(init.)=1.41927 s t(norm)=0.739084, mflops=6.76513 (err=3.5e-16) 1. CWP (min N): elapsed time t=1.28304 s, 131072 iters, t-(init.)=1.237 s t(norm)=0.161042, mflops=31.0478 2. CWP (best N): elapsed time t=1.27357 s, 131072 iters, t-(init.)=1.22363 s t(norm)=0.159301, mflops=31.3871 3. FFTPACK: elapsed time t=1.28989 s, 262144 iters, t-(init.)=1.19471 s t(norm)=0.077768, mflops=64.2938 (err=2.1e-16) 4. FFTPACK (f2c): elapsed time t=1.9404 s, 262144 iters, t-(init.)=1.8343 s t(norm)=0.119401, mflops=41.8757 (err=4.4e-16) FFTW_MEASURE plan: (cost = 1.996953e-06) FFTW_NOTW 15 5. FFTW: elapsed time t=1.06428 s, 524288 iters, t-(init.)=0.883253 s t(norm)=0.028747, mflops=173.931 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.12294 s, 524288 iters, t-(init.)=0.938802 s t(norm)=0.0305549, mflops=163.64 (err=1.9e-16) 7. Frigo-old: elapsed time t=1.35244 s, 65536 iters, t-(init.)=1.3294 s t(norm)=0.346142, mflops=14.4449 (err=4.2e-16) 8. GSL: elapsed time t=1.49636 s, 131072 iters, t-(init.)=1.44172 s t(norm)=0.187693, mflops=26.6392 (err=2.0e-16) 9. NAPACK (f2c): elapsed time t=1.16812 s, 32768 iters, t-(init.)=1.15445 s t(norm)=0.601176, mflops=8.31703 (err=6.3e-16) 10. Nielsen: elapsed time t=1.55579 s, 65536 iters, t-(init.)=1.53316 s t(norm)=0.399195, mflops=12.5252 (err=4.6e-15) 11. Singleton: elapsed time t=1.01176 s, 65536 iters, t-(init.)=0.985621 s t(norm)=0.25663, mflops=19.4833 (err=2.8e-16) 12. Singleton (f2c): elapsed time t=1.03806 s, 65536 iters, t-(init.)=1.01543 s t(norm)=0.264392, mflops=18.9113 (err=2.8e-16) 13. Temperton: elapsed time t=1.42764 s, 131072 iters, t-(init.)=1.38083 s t(norm)=0.179766, mflops=27.8139 (err=7.9e-16) 14. Temperton (f2c): elapsed time t=1.0483 s, 65536 iters, t-(init.)=1.02567 s t(norm)=0.267058, mflops=18.7225 (err=2.0e-16) 15. 
Valkenburg: elapsed time t=1.48231 s, 16384 iters, t-(init.)=1.47578 s t(norm)=1.53702, mflops=3.25305 (err=4.5e-16) 16. SUNPERF: elapsed time t=1.05997 s, 262144 iters, t-(init.)=0.967908 s t(norm)=0.0630045, mflops=79.3594 (err=2.0e-16) Top mflops for N=15 = 173.931 Normalized results and averages for N=15: fft 0: mflops = 6.76513 (norm. = 0.0388955), norm. avg. (of 2) = 0.0354161 fft 1: mflops = 31.0478 (norm. = 0.178506), norm. avg. (of 4) = 0.131146 fft 2: mflops = 31.3871 (norm. = 0.180457), norm. avg. (of 4) = 0.111142 fft 3: mflops = 64.2938 (norm. = 0.369651), norm. avg. (of 4) = 0.290469 fft 4: mflops = 41.8757 (norm. = 0.24076), norm. avg. (of 4) = 0.192559 fft 5: mflops = 173.931 (norm. = 1), norm. avg. (of 4) = 0.993452 fft 6: mflops = 163.64 (norm. = 0.94083), norm. avg. (of 4) = 0.950435 fft 7: mflops = 14.4449 (norm. = 0.0830497), norm. avg. (of 4) = 0.0857945 fft 8: mflops = 26.6392 (norm. = 0.153159), norm. avg. (of 4) = 0.148095 fft 9: mflops = 8.31703 (norm. = 0.047818), norm. avg. (of 4) = 0.0501879 fft 10: mflops = 12.5252 (norm. = 0.0720125), norm. avg. (of 4) = 0.0498698 fft 11: mflops = 19.4833 (norm. = 0.112017), norm. avg. (of 4) = 0.0833865 fft 12: mflops = 18.9113 (norm. = 0.108729), norm. avg. (of 4) = 0.0830619 fft 13: mflops = 27.8139 (norm. = 0.159913), norm. avg. (of 4) = 0.108022 fft 14: mflops = 18.7225 (norm. = 0.107643), norm. avg. (of 4) = 0.0848533 fft 15: mflops = 3.25305 (norm. = 0.0187031), norm. avg. (of 4) = 0.022075 fft 16: mflops = 79.3594 (norm. = 0.456269), norm. avg. (of 4) = 0.335721 Benchmarking for array size = 18: 0. Brenner: elapsed time t=1.99872 s, 32768 iters, t-(init.)=1.98504 s t(norm)=0.807085, mflops=6.19513 (err=4.3e-16) 1. CWP (min N): elapsed time t=1.46367 s, 131072 iters, t-(init.)=1.4114 s t(norm)=0.143463, mflops=34.8522 2. CWP (best N) (N=28): elapsed time t=1.54136 s, 131072 iters, t-(init.)=1.4688 s t(norm)=0.149297, mflops=33.4902 3. 
FFTPACK: elapsed time t=1.90156 s, 262144 iters, t-(init.)=1.78893 s t(norm)=0.0909186, mflops=54.9943 (err=2.8e-16) 4. FFTPACK (f2c): elapsed time t=1.52375 s, 131072 iters, t-(init.)=1.46992 s t(norm)=0.149411, mflops=33.4647 (err=2.7e-16) FFTW_MEASURE plan: (cost = 3.815344e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6 5. FFTW: elapsed time t=1.05569 s, 262144 iters, t-(init.)=0.951151 s t(norm)=0.0483402, mflops=103.434 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.25476 s, 262144 iters, t-(init.)=1.15021 s t(norm)=0.0584571, mflops=85.5328 (err=2.0e-16) 7. Frigo-old: elapsed time t=1.8313 s, 65536 iters, t-(init.)=1.80477 s t(norm)=0.366894, mflops=13.6279 (err=5.0e-16) 8. GSL: elapsed time t=1.42087 s, 131072 iters, t-(init.)=1.36235 s t(norm)=0.138478, mflops=36.1069 (err=2.3e-16) 9. NAPACK (f2c): elapsed time t=1.86753 s, 65536 iters, t-(init.)=1.84022 s t(norm)=0.374101, mflops=13.3654 (err=8.7e-16) 10. Nielsen: elapsed time t=1.28655 s, 32768 iters, t-(init.)=1.27328 s t(norm)=0.517694, mflops=9.65821 (err=7.3e-16) 11. Singleton: elapsed time t=1.12059 s, 65536 iters, t-(init.)=1.09172 s t(norm)=0.221938, mflops=22.5288 (err=2.1e-16) 12. Singleton (f2c): elapsed time t=1.09954 s, 65536 iters, t-(init.)=1.07416 s t(norm)=0.218367, mflops=22.8972 (err=2.1e-16) 13. Temperton: elapsed time t=1.07682 s, 65536 iters, t-(init.)=1.05069 s t(norm)=0.213596, mflops=23.4087 (err=2.7e-08) 14. Temperton (f2c): elapsed time t=1.29298 s, 65536 iters, t-(init.)=1.26684 s t(norm)=0.257539, mflops=19.4145 (err=2.9e-16) 15. Valkenburg: elapsed time t=1.60919 s, 16384 iters, t-(init.)=1.60207 s t(norm)=1.30275, mflops=3.83804 (err=5.1e-16) 16. SUNPERF: elapsed time t=1.63412 s, 262144 iters, t-(init.)=1.53114 s t(norm)=0.0778171, mflops=64.2533 (err=2.8e-16) Top mflops for N=18 = 103.434 Normalized results and averages for N=18: fft 0: mflops = 6.19513 (norm. = 0.0598948), norm. avg. 
(of 3) = 0.0435757 fft 1: mflops = 34.8522 (norm. = 0.336953), norm. avg. (of 5) = 0.172307 fft 2: mflops = 33.4902 (norm. = 0.323785), norm. avg. (of 5) = 0.153671 fft 3: mflops = 54.9943 (norm. = 0.531687), norm. avg. (of 5) = 0.338713 fft 4: mflops = 33.4647 (norm. = 0.323538), norm. avg. (of 5) = 0.218755 fft 5: mflops = 103.434 (norm. = 1), norm. avg. (of 5) = 0.994762 fft 6: mflops = 85.5328 (norm. = 0.826935), norm. avg. (of 5) = 0.925735 fft 7: mflops = 13.6279 (norm. = 0.131755), norm. avg. (of 5) = 0.0949866 fft 8: mflops = 36.1069 (norm. = 0.349084), norm. avg. (of 5) = 0.188292 fft 9: mflops = 13.3654 (norm. = 0.129217), norm. avg. (of 5) = 0.0659937 fft 10: mflops = 9.65821 (norm. = 0.093376), norm. avg. (of 5) = 0.058571 fft 11: mflops = 22.5288 (norm. = 0.21781), norm. avg. (of 5) = 0.110271 fft 12: mflops = 22.8972 (norm. = 0.221371), norm. avg. (of 5) = 0.110724 fft 13: mflops = 23.4087 (norm. = 0.226316), norm. avg. (of 5) = 0.131681 fft 14: mflops = 19.4145 (norm. = 0.187701), norm. avg. (of 5) = 0.105423 fft 15: mflops = 3.83804 (norm. = 0.0371064), norm. avg. (of 5) = 0.0250813 fft 16: mflops = 64.2533 (norm. = 0.621204), norm. avg. (of 5) = 0.392818 Benchmarking for array size = 24: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.44178 s, 131072 iters, t-(init.)=1.37937 s t(norm)=0.0956364, mflops=52.2814 2. CWP (best N) (N=28): elapsed time t=1.52461 s, 131072 iters, t-(init.)=1.4493 s t(norm)=0.100485, mflops=49.7587 3. FFTPACK: elapsed time t=1.08138 s, 131072 iters, t-(init.)=1.0129 s t(norm)=0.0702281, mflops=71.1966 (err=2.2e-16) 4. FFTPACK (f2c): elapsed time t=1.72493 s, 131072 iters, t-(init.)=1.66017 s t(norm)=0.115106, mflops=43.4384 (err=2.7e-16) FFTW_MEASURE plan: (cost = 4.407562e-06) FFTW_TWIDDLE 2 FFTW_NOTW 12 5. 
FFTW: elapsed time t=1.17486 s, 262144 iters, t-(init.)=1.04927 s t(norm)=0.0363749, mflops=137.457 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.22069 s, 262144 iters, t-(init.)=1.09159 s t(norm)=0.037842, mflops=132.128 (err=2.1e-16) 7. Frigo-old: elapsed time t=1.47392 s, 65536 iters, t-(init.)=1.4423 s t(norm)=0.199999, mflops=25.0001 (err=3.9e-16) 8. GSL: elapsed time t=1.62243 s, 131072 iters, t-(init.)=1.55377 s t(norm)=0.107728, mflops=46.4131 (err=2.1e-16) 9. NAPACK (f2c): elapsed time t=1.25448 s, 32768 iters, t-(init.)=1.23692 s t(norm)=0.343041, mflops=14.5755 (err=8.0e-16) 10. Nielsen: elapsed time t=1.9447 s, 65536 iters, t-(init.)=1.91271 s t(norm)=0.26523, mflops=18.8515 (err=1.5e-15) 11. Singleton: elapsed time t=1.60179 s, 65536 iters, t-(init.)=1.56875 s t(norm)=0.217534, mflops=22.9849 (err=2.3e-16) 12. Singleton (f2c): elapsed time t=1.65201 s, 65536 iters, t-(init.)=1.62119 s t(norm)=0.224806, mflops=22.2414 (err=2.3e-16) 13. Temperton: elapsed time t=1.10538 s, 65536 iters, t-(init.)=1.0734 s t(norm)=0.148844, mflops=33.5921 (err=4.5e-09) 14. Temperton (f2c): elapsed time t=1.42795 s, 65536 iters, t-(init.)=1.39713 s t(norm)=0.193736, mflops=25.8084 (err=2.8e-16) 15. Valkenburg: elapsed time t=1.12981 s, 8192 iters, t-(init.)=1.12552 s t(norm)=1.24858, mflops=4.00455 (err=4.8e-16) 16. SUNPERF: elapsed time t=1.78567 s, 262144 iters, t-(init.)=1.65806 s t(norm)=0.0574797, mflops=86.9872 (err=2.2e-16) Top mflops for N=24 = 137.457 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.00727498), norm. avg. (of 3) = 0.0435757 fft 1: mflops = 52.2814 (norm. = 0.380346), norm. avg. (of 6) = 0.20698 fft 2: mflops = 49.7587 (norm. = 0.361994), norm. avg. (of 6) = 0.188391 fft 3: mflops = 71.1966 (norm. = 0.517954), norm. avg. (of 6) = 0.368586 fft 4: mflops = 43.4384 (norm. = 0.316014), norm. avg. (of 6) = 0.234965 fft 5: mflops = 137.457 (norm. = 1), norm. 
avg. (of 6) = 0.995635 fft 6: mflops = 132.128 (norm. = 0.96123), norm. avg. (of 6) = 0.931651 fft 7: mflops = 25.0001 (norm. = 0.181875), norm. avg. (of 6) = 0.109468 fft 8: mflops = 46.4131 (norm. = 0.337654), norm. avg. (of 6) = 0.213186 fft 9: mflops = 14.5755 (norm. = 0.106037), norm. avg. (of 6) = 0.0726676 fft 10: mflops = 18.8515 (norm. = 0.137145), norm. avg. (of 6) = 0.0716666 fft 11: mflops = 22.9849 (norm. = 0.167215), norm. avg. (of 6) = 0.119762 fft 12: mflops = 22.2414 (norm. = 0.161806), norm. avg. (of 6) = 0.119237 fft 13: mflops = 33.5921 (norm. = 0.244382), norm. avg. (of 6) = 0.150465 fft 14: mflops = 25.8084 (norm. = 0.187755), norm. avg. (of 6) = 0.119145 fft 15: mflops = 4.00455 (norm. = 0.029133), norm. avg. (of 6) = 0.0257565 fft 16: mflops = 86.9872 (norm. = 0.63283), norm. avg. (of 6) = 0.43282 Benchmarking for array size = 36: 0. Brenner: elapsed time t=1.93661 s, 16384 iters, t-(init.)=1.92549 s t(norm)=0.631443, mflops=7.91837 (err=5.3e-16) 1. CWP (min N): elapsed time t=1.96907 s, 131072 iters, t-(init.)=1.88246 s t(norm)=0.0771666, mflops=64.7949 2. CWP (best N): elapsed time t=1.98907 s, 131072 iters, t-(init.)=1.90088 s t(norm)=0.0779217, mflops=64.167 3. FFTPACK: elapsed time t=1.4841 s, 131072 iters, t-(init.)=1.38978 s t(norm)=0.0569704, mflops=87.7649 (err=3.8e-16) 4. FFTPACK (f2c): elapsed time t=1.3057 s, 65536 iters, t-(init.)=1.26005 s t(norm)=0.103305, mflops=48.4002 (err=4.7e-16) FFTW_MEASURE plan: (cost = 6.541375e-06) FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.73165 s, 262144 iters, t-(init.)=1.55843 s t(norm)=0.031942, mflops=156.534 (err=4.4e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.7208 s, 262144 iters, t-(init.)=1.5362 s t(norm)=0.0314862, mflops=158.8 (err=4.4e-16) 7. Frigo-old: elapsed time t=1.82218 s, 32768 iters, t-(init.)=1.80033 s t(norm)=0.2952, mflops=16.9377 (err=5.9e-16) 8. 
GSL: elapsed time t=1.25457 s, 65536 iters, t-(init.)=1.20814 s t(norm)=0.0990494, mflops=50.4798 (err=4.3e-16) 9. NAPACK (f2c): elapsed time t=1.63669 s, 32768 iters, t-(init.)=1.61387 s t(norm)=0.264626, mflops=18.8946 (err=1.4e-15) 10. Nielsen: elapsed time t=1.75003 s, 32768 iters, t-(init.)=1.72838 s t(norm)=0.283402, mflops=17.6428 (err=1.0e-15) 11. Singleton: elapsed time t=1.66778 s, 65536 iters, t-(init.)=1.62136 s t(norm)=0.132927, mflops=37.6146 (err=4.7e-16) 12. Singleton (f2c): elapsed time t=1.59672 s, 65536 iters, t-(init.)=1.55381 s t(norm)=0.127389, mflops=39.2499 (err=4.7e-16) 13. Temperton: elapsed time t=1.48379 s, 65536 iters, t-(init.)=1.4401 s t(norm)=0.118067, mflops=42.349 (err=5.1e-08) 14. Temperton (f2c): elapsed time t=1.80445 s, 65536 iters, t-(init.)=1.76154 s t(norm)=0.144419, mflops=34.6214 (err=3.7e-16) 15. Valkenburg: elapsed time t=1.93964 s, 8192 iters, t-(init.)=1.93413 s t(norm)=1.26855, mflops=3.94149 (err=6.2e-16) 16. SUNPERF: elapsed time t=1.23375 s, 131072 iters, t-(init.)=1.14713 s t(norm)=0.0470235, mflops=106.33 (err=3.8e-16) Top mflops for N=36 = 158.8 Normalized results and averages for N=36: fft 0: mflops = 7.91837 (norm. = 0.0498639), norm. avg. (of 4) = 0.0451477 fft 1: mflops = 64.7949 (norm. = 0.408029), norm. avg. (of 7) = 0.235701 fft 2: mflops = 64.167 (norm. = 0.404075), norm. avg. (of 7) = 0.219203 fft 3: mflops = 87.7649 (norm. = 0.552677), norm. avg. (of 7) = 0.394885 fft 4: mflops = 48.4002 (norm. = 0.304788), norm. avg. (of 7) = 0.24494 fft 5: mflops = 156.534 (norm. = 0.985732), norm. avg. (of 7) = 0.99422 fft 6: mflops = 158.8 (norm. = 1), norm. avg. (of 7) = 0.941415 fft 7: mflops = 16.9377 (norm. = 0.106661), norm. avg. (of 7) = 0.109067 fft 8: mflops = 50.4798 (norm. = 0.317884), norm. avg. (of 7) = 0.228143 fft 9: mflops = 18.8946 (norm. = 0.118984), norm. avg. (of 7) = 0.0792842 fft 10: mflops = 17.6428 (norm. = 0.111101), norm. avg. (of 7) = 0.0773001 fft 11: mflops = 37.6146 (norm. 
= 0.236869), norm. avg. (of 7) = 0.136491 fft 12: mflops = 39.2499 (norm. = 0.247166), norm. avg. (of 7) = 0.137513 fft 13: mflops = 42.349 (norm. = 0.266682), norm. avg. (of 7) = 0.167067 fft 14: mflops = 34.6214 (norm. = 0.218019), norm. avg. (of 7) = 0.13327 fft 15: mflops = 3.94149 (norm. = 0.0248205), norm. avg. (of 7) = 0.0256228 fft 16: mflops = 106.33 (norm. = 0.669585), norm. avg. (of 7) = 0.466643 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.49131 s, 8192 iters, t-(init.)=1.48022 s t(norm)=0.357269, mflops=13.995 (err=3.9e-16) 1. CWP (min N): elapsed time t=1.29171 s, 32768 iters, t-(init.)=1.24858 s t(norm)=0.0753403, mflops=66.3655 2. CWP (best N) (N=84): elapsed time t=1.51022 s, 65536 iters, t-(init.)=1.41777 s t(norm)=0.0427748, mflops=116.891 3. FFTPACK: elapsed time t=1.65451 s, 65536 iters, t-(init.)=1.56519 s t(norm)=0.0472224, mflops=105.882 (err=3.2e-16) 4. FFTPACK (f2c): elapsed time t=1.23142 s, 32768 iters, t-(init.)=1.18792 s t(norm)=0.07168, mflops=69.7545 (err=4.0e-16) FFTW_MEASURE plan: (cost = 1.589225e-05) FFTW_TWIDDLE 5 FFTW_NOTW 16 5. FFTW: elapsed time t=1.0202 s, 65536 iters, t-(init.)=0.933986 s t(norm)=0.0281787, mflops=177.439 (err=3.6e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 6. FFTW_ESTIMATE: elapsed time t=1.02781 s, 65536 iters, t-(init.)=0.941213 s t(norm)=0.0283967, mflops=176.077 (err=3.6e-16) 7. Frigo-old: elapsed time t=1.38421 s, 16384 iters, t-(init.)=1.36256 s t(norm)=0.164436, mflops=30.407 (err=3.5e-16) 8. GSL: elapsed time t=1.93547 s, 32768 iters, t-(init.)=1.89217 s t(norm)=0.114175, mflops=43.7925 (err=3.2e-16) 9. NAPACK (f2c): elapsed time t=1.64927 s, 8192 iters, t-(init.)=1.63839 s t(norm)=0.395447, mflops=12.6439 (err=5.0e-16) 10. Nielsen: elapsed time t=1.10666 s, 16384 iters, t-(init.)=1.08508 s t(norm)=0.130949, mflops=38.1827 (err=5.0e-15) 11. 
Singleton: elapsed time t=1.33844 s, 32768 iters, t-(init.)=1.29397 s t(norm)=0.078079, mflops=64.0377 (err=4.4e-16) 12. Singleton (f2c): elapsed time t=1.32452 s, 32768 iters, t-(init.)=1.28161 s t(norm)=0.0773333, mflops=64.6552 (err=4.4e-16) 13. Temperton: elapsed time t=1.32852 s, 32768 iters, t-(init.)=1.28425 s t(norm)=0.0774924, mflops=64.5225 (err=5.3e-08) 14. Temperton (f2c): elapsed time t=1.01276 s, 16384 iters, t-(init.)=0.991182 s t(norm)=0.119617, mflops=41.8 (err=3.4e-16) 15. Valkenburg: elapsed time t=1.38879 s, 2048 iters, t-(init.)=1.38604 s t(norm)=1.33815, mflops=3.7365 (err=4.6e-16) 16. SUNPERF: elapsed time t=1.10394 s, 65536 iters, t-(init.)=1.01812 s t(norm)=0.030717, mflops=162.776 (err=3.0e-16) Top mflops for N=80 = 177.439 Normalized results and averages for N=80: fft 0: mflops = 13.995 (norm. = 0.0788724), norm. avg. (of 5) = 0.0518927 fft 1: mflops = 66.3655 (norm. = 0.374019), norm. avg. (of 8) = 0.252991 fft 2: mflops = 116.891 (norm. = 0.65877), norm. avg. (of 8) = 0.274149 fft 3: mflops = 105.882 (norm. = 0.596724), norm. avg. (of 8) = 0.420115 fft 4: mflops = 69.7545 (norm. = 0.393118), norm. avg. (of 8) = 0.263462 fft 5: mflops = 177.439 (norm. = 1), norm. avg. (of 8) = 0.994943 fft 6: mflops = 176.077 (norm. = 0.992322), norm. avg. (of 8) = 0.947778 fft 7: mflops = 30.407 (norm. = 0.171366), norm. avg. (of 8) = 0.116854 fft 8: mflops = 43.7925 (norm. = 0.246803), norm. avg. (of 8) = 0.230475 fft 9: mflops = 12.6439 (norm. = 0.0712579), norm. avg. (of 8) = 0.0782809 fft 10: mflops = 38.1827 (norm. = 0.215188), norm. avg. (of 8) = 0.0945361 fft 11: mflops = 64.0377 (norm. = 0.3609), norm. avg. (of 8) = 0.164542 fft 12: mflops = 64.6552 (norm. = 0.36438), norm. avg. (of 8) = 0.165871 fft 13: mflops = 64.5225 (norm. = 0.363632), norm. avg. (of 8) = 0.191638 fft 14: mflops = 41.8 (norm. = 0.235574), norm. avg. (of 8) = 0.146058 fft 15: mflops = 3.7365 (norm. = 0.0210579), norm. avg. (of 8) = 0.0250522 fft 16: mflops = 162.776 (norm. 
= 0.917364), norm. avg. (of 8) = 0.522983 Benchmarking for array size = 108: 0. Brenner: elapsed time t=1.72908 s, 4096 iters, t-(init.)=1.72184 s t(norm)=0.576224, mflops=8.67718 (err=6.4e-16) 1. CWP (min N) (N=110): elapsed time t=1.67812 s, 32768 iters, t-(init.)=1.62038 s t(norm)=0.0677838, mflops=73.7639 2. CWP (best N) (N=112): elapsed time t=1.72148 s, 32768 iters, t-(init.)=1.6616 s t(norm)=0.069508, mflops=71.9342 3. FFTPACK: elapsed time t=1.02168 s, 32768 iters, t-(init.)=0.964342 s t(norm)=0.0403403, mflops=123.946 (err=3.8e-16) 4. FFTPACK (f2c): elapsed time t=1.01883 s, 16384 iters, t-(init.)=0.99016 s t(norm)=0.0828407, mflops=60.3568 (err=4.0e-16) FFTW_MEASURE plan: (cost = 2.623700e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 5. FFTW: elapsed time t=1.64751 s, 65536 iters, t-(init.)=1.53394 s t(norm)=0.0320839, mflops=155.841 (err=3.7e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.65715 s, 65536 iters, t-(init.)=1.54361 s t(norm)=0.0322861, mflops=154.866 (err=3.7e-16) 7. Frigo-old: elapsed time t=1.95379 s, 8192 iters, t-(init.)=1.9396 s t(norm)=0.324549, mflops=15.406 (err=5.7e-16) 8. GSL: elapsed time t=1.24309 s, 16384 iters, t-(init.)=1.21461 s t(norm)=0.101619, mflops=49.2033 (err=4.0e-16) 9. NAPACK (f2c): elapsed time t=1.19244 s, 8192 iters, t-(init.)=1.1782 s t(norm)=0.197146, mflops=25.3619 (err=3.1e-15) 10. Nielsen: elapsed time t=1.37127 s, 8192 iters, t-(init.)=1.35703 s t(norm)=0.227069, mflops=22.0198 (err=1.1e-15) 11. Singleton: elapsed time t=1.35101 s, 16384 iters, t-(init.)=1.32151 s t(norm)=0.110563, mflops=45.2232 (err=4.5e-16) 12. Singleton (f2c): elapsed time t=1.40415 s, 16384 iters, t-(init.)=1.37577 s t(norm)=0.115103, mflops=43.4395 (err=4.5e-16) 13. Temperton: elapsed time t=1.01463 s, 16384 iters, t-(init.)=0.986056 s t(norm)=0.0824973, mflops=60.608 (err=7.4e-08) 14. 
Temperton (f2c): elapsed time t=1.31763 s, 16384 iters, t-(init.)=1.28945 s t(norm)=0.10788, mflops=46.3477 (err=3.5e-16) 15. Valkenburg: elapsed time t=1.89121 s, 2048 iters, t-(init.)=1.88762 s t(norm)=1.26341, mflops=3.95755 (err=6.6e-16) 16. SUNPERF: elapsed time t=1.59612 s, 65536 iters, t-(init.)=1.48297 s t(norm)=0.0310178, mflops=161.198 (err=3.8e-16) Top mflops for N=108 = 161.198 Normalized results and averages for N=108: fft 0: mflops = 8.67718 (norm. = 0.0538295), norm. avg. (of 6) = 0.0522155 fft 1: mflops = 73.7639 (norm. = 0.457599), norm. avg. (of 9) = 0.275725 fft 2: mflops = 71.9342 (norm. = 0.446248), norm. avg. (of 9) = 0.293271 fft 3: mflops = 123.946 (norm. = 0.768904), norm. avg. (of 9) = 0.458869 fft 4: mflops = 60.3568 (norm. = 0.374427), norm. avg. (of 9) = 0.275791 fft 5: mflops = 155.841 (norm. = 0.966772), norm. avg. (of 9) = 0.991812 fft 6: mflops = 154.866 (norm. = 0.960718), norm. avg. (of 9) = 0.949216 fft 7: mflops = 15.406 (norm. = 0.0955721), norm. avg. (of 9) = 0.11449 fft 8: mflops = 49.2033 (norm. = 0.305236), norm. avg. (of 9) = 0.238782 fft 9: mflops = 25.3619 (norm. = 0.157334), norm. avg. (of 9) = 0.0870646 fft 10: mflops = 22.0198 (norm. = 0.136601), norm. avg. (of 9) = 0.0992099 fft 11: mflops = 45.2232 (norm. = 0.280545), norm. avg. (of 9) = 0.177431 fft 12: mflops = 43.4395 (norm. = 0.26948), norm. avg. (of 9) = 0.177383 fft 13: mflops = 60.608 (norm. = 0.375986), norm. avg. (of 9) = 0.212121 fft 14: mflops = 46.3477 (norm. = 0.287521), norm. avg. (of 9) = 0.161776 fft 15: mflops = 3.95755 (norm. = 0.0245509), norm. avg. (of 9) = 0.0249965 fft 16: mflops = 161.198 (norm. = 1), norm. avg. (of 9) = 0.575985 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1.44822 s, 2048 iters, t-(init.)=1.44144 s t(norm)=0.434464, mflops=11.5084 (err=6.7e-16) 1. CWP (min N): elapsed time t=1.0861 s, 16384 iters, t-(init.)=1.03272 s t(norm)=0.0389091, mflops=128.504 2. 
CWP (best N): elapsed time t=1.08675 s, 16384 iters, t-(init.)=1.03289 s t(norm)=0.0389155, mflops=128.483 3. FFTPACK: elapsed time t=1.70903 s, 16384 iters, t-(init.)=1.65491 s t(norm)=0.0623506, mflops=80.1917 (err=5.0e-16) 4. FFTPACK (f2c): elapsed time t=1.94667 s, 8192 iters, t-(init.)=1.91936 s t(norm)=0.144628, mflops=34.5713 (err=6.4e-16) FFTW_MEASURE plan: (cost = 7.180700e-05) FFTW_TWIDDLE 3 FFTW_TWIDDLE 10 FFTW_NOTW 7 5. FFTW: elapsed time t=1.29473 s, 16384 iters, t-(init.)=1.24139 s t(norm)=0.0467708, mflops=106.904 (err=4.7e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.41407 s, 16384 iters, t-(init.)=1.35996 s t(norm)=0.0512381, mflops=97.5836 (err=4.7e-16) 7. Frigo-old: elapsed time t=1.1003 s, 2048 iters, t-(init.)=1.09363 s t(norm)=0.329631, mflops=15.1685 (err=5.9e-16) 8. GSL: elapsed time t=1.86243 s, 8192 iters, t-(init.)=1.83571 s t(norm)=0.138325, mflops=36.1467 (err=6.3e-16) 9. NAPACK (f2c): elapsed time t=1.4783 s, 2048 iters, t-(init.)=1.47153 s t(norm)=0.443532, mflops=11.2731 (err=1.5e-14) 10. Nielsen: elapsed time t=1.15989 s, 4096 iters, t-(init.)=1.14655 s t(norm)=0.172791, mflops=28.9366 (err=7.2e-15) 11. Singleton: elapsed time t=1.71504 s, 8192 iters, t-(init.)=1.68827 s t(norm)=0.127215, mflops=39.3035 (err=6.4e-16) 12. Singleton (f2c): elapsed time t=1.86248 s, 8192 iters, t-(init.)=1.83585 s t(norm)=0.138336, mflops=36.1439 (err=6.4e-16) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.37205 s, 512 iters, t-(init.)=1.37036 s t(norm)=1.65216, mflops=3.02634 (err=7.1e-16) 16. SUNPERF: elapsed time t=1.28997 s, 16384 iters, t-(init.)=1.23678 s t(norm)=0.0465974, mflops=107.302 (err=4.9e-16) Top mflops for N=210 = 128.504 Normalized results and averages for N=210: fft 0: mflops = 11.5084 (norm. = 0.0895568), norm. avg. 
(of 7) = 0.0575499 fft 1: mflops = 128.504 (norm. = 1), norm. avg. (of 10) = 0.348153 fft 2: mflops = 128.483 (norm. = 0.999837), norm. avg. (of 10) = 0.363928 fft 3: mflops = 80.1917 (norm. = 0.624038), norm. avg. (of 10) = 0.475386 fft 4: mflops = 34.5713 (norm. = 0.269028), norm. avg. (of 10) = 0.275115 fft 5: mflops = 106.904 (norm. = 0.831911), norm. avg. (of 10) = 0.975822 fft 6: mflops = 97.5836 (norm. = 0.759379), norm. avg. (of 10) = 0.930232 fft 7: mflops = 15.1685 (norm. = 0.118038), norm. avg. (of 10) = 0.114845 fft 8: mflops = 36.1467 (norm. = 0.281288), norm. avg. (of 10) = 0.243033 fft 9: mflops = 11.2731 (norm. = 0.0877256), norm. avg. (of 10) = 0.0871307 fft 10: mflops = 28.9366 (norm. = 0.22518), norm. avg. (of 10) = 0.111807 fft 11: mflops = 39.3035 (norm. = 0.305853), norm. avg. (of 10) = 0.190274 fft 12: mflops = 36.1439 (norm. = 0.281265), norm. avg. (of 10) = 0.187772 fft 13: mflops = -1 (norm. = -0.00778183), norm. avg. (of 9) = 0.212121 fft 14: mflops = -1 (norm. = -0.00778183), norm. avg. (of 9) = 0.161776 fft 15: mflops = 3.02634 (norm. = 0.0235505), norm. avg. (of 10) = 0.0248519 fft 16: mflops = 107.302 (norm. = 0.835007), norm. avg. (of 10) = 0.601888 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.80002 s, 1024 iters, t-(init.)=1.79219 s t(norm)=0.38682, mflops=12.9259 (err=1.5e-15) 1. CWP (min N): elapsed time t=1.5727 s, 8192 iters, t-(init.)=1.51017 s t(norm)=0.0407436, mflops=122.719 2. CWP (best N): elapsed time t=1.57 s, 8192 iters, t-(init.)=1.50717 s t(norm)=0.0406627, mflops=122.963 3. FFTPACK: elapsed time t=1.06289 s, 4096 iters, t-(init.)=1.03156 s t(norm)=0.0556619, mflops=89.8281 (err=1.2e-15) 4. FFTPACK (f2c): elapsed time t=1.39666 s, 2048 iters, t-(init.)=1.381 s t(norm)=0.149036, mflops=33.549 (err=1.3e-15) FFTW_MEASURE plan: (cost = 2.167530e-04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 7 FFTW_NOTW 12 5. 
FFTW: elapsed time t=1.64782 s, 8192 iters, t-(init.)=1.58538 s t(norm)=0.0427727, mflops=116.897 (err=1.3e-15) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.71227 s, 8192 iters, t-(init.)=1.64986 s t(norm)=0.0445124, mflops=112.328 (err=1.2e-15) 7. Frigo-old: elapsed time t=1.25087 s, 1024 iters, t-(init.)=1.24306 s t(norm)=0.268297, mflops=18.6361 (err=1.3e-15) 8. GSL: elapsed time t=1.14533 s, 2048 iters, t-(init.)=1.12964 s t(norm)=0.121909, mflops=41.0144 (err=1.3e-15) 9. NAPACK (f2c): elapsed time t=1.57896 s, 1024 iters, t-(init.)=1.57113 s t(norm)=0.339106, mflops=14.7446 (err=4.1e-14) 10. Nielsen: elapsed time t=1.60788 s, 2048 iters, t-(init.)=1.59227 s t(norm)=0.171835, mflops=29.0977 (err=6.2e-15) 11. Singleton: elapsed time t=1.00979 s, 2048 iters, t-(init.)=0.994093 s t(norm)=0.107281, mflops=46.6067 (err=1.9e-15) 12. Singleton (f2c): elapsed time t=1.00094 s, 2048 iters, t-(init.)=0.985338 s t(norm)=0.106336, mflops=47.0208 (err=1.9e-15) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.70986 s, 256 iters, t-(init.)=1.7079 s t(norm)=1.47451, mflops=3.39096 (err=1.4e-15) 16. SUNPERF: elapsed time t=1.83409 s, 8192 iters, t-(init.)=1.7717 s t(norm)=0.0477997, mflops=104.603 (err=1.2e-15) Top mflops for N=504 = 122.963 Normalized results and averages for N=504: fft 0: mflops = 12.9259 (norm. = 0.105121), norm. avg. (of 8) = 0.0634963 fft 1: mflops = 122.719 (norm. = 0.998014), norm. avg. (of 11) = 0.407231 fft 2: mflops = 122.963 (norm. = 1), norm. avg. (of 11) = 0.421752 fft 3: mflops = 89.8281 (norm. = 0.73053), norm. avg. (of 11) = 0.498581 fft 4: mflops = 33.549 (norm. = 0.272839), norm. avg. (of 11) = 0.274908 fft 5: mflops = 116.897 (norm. = 0.950669), norm. avg. (of 11) = 0.973536 fft 6: mflops = 112.328 (norm. = 0.913514), norm. avg. 
(of 11) = 0.928713 fft 7: mflops = 18.6361 (norm. = 0.151559), norm. avg. (of 11) = 0.118182 fft 8: mflops = 41.0144 (norm. = 0.333551), norm. avg. (of 11) = 0.251262 fft 9: mflops = 14.7446 (norm. = 0.119911), norm. avg. (of 11) = 0.0901108 fft 10: mflops = 29.0977 (norm. = 0.236638), norm. avg. (of 11) = 0.123155 fft 11: mflops = 46.6067 (norm. = 0.379031), norm. avg. (of 11) = 0.207433 fft 12: mflops = 47.0208 (norm. = 0.382399), norm. avg. (of 11) = 0.205465 fft 13: mflops = -1 (norm. = -0.00813254), norm. avg. (of 9) = 0.212121 fft 14: mflops = -1 (norm. = -0.00813254), norm. avg. (of 9) = 0.161776 fft 15: mflops = 3.39096 (norm. = 0.0275771), norm. avg. (of 11) = 0.0250997 fft 16: mflops = 104.603 (norm. = 0.85069), norm. avg. (of 11) = 0.624506 Benchmarking for array size = 1000: 0. Brenner: elapsed time t=1.01321 s, 256 iters, t-(init.)=1.00937 s t(norm)=0.395638, mflops=12.6378 (err=1.1e-15) 1. CWP (min N) (N=1001): elapsed time t=1.3681 s, 2048 iters, t-(init.)=1.33732 s t(norm)=0.065523, mflops=76.3091 2. CWP (best N) (N=1008): elapsed time t=1.8466 s, 4096 iters, t-(init.)=1.7845 s t(norm)=0.0437165, mflops=114.373 3. FFTPACK: elapsed time t=1.08084 s, 2048 iters, t-(init.)=1.04751 s t(norm)=0.0513235, mflops=97.4213 (err=1.0e-15) 4. FFTPACK (f2c): elapsed time t=1.04072 s, 1024 iters, t-(init.)=1.0247 s t(norm)=0.100412, mflops=49.7951 (err=1.1e-15) FFTW_MEASURE plan: (cost = 4.931240e-04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 5. FFTW: elapsed time t=1.8102 s, 4096 iters, t-(init.)=1.74619 s t(norm)=0.042778, mflops=116.882 (err=9.7e-16) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.8145 s, 4096 iters, t-(init.)=1.75304 s t(norm)=0.0429457, mflops=116.426 (err=9.7e-16) 7. Frigo-old: elapsed time t=1.45587 s, 512 iters, t-(init.)=1.44819 s t(norm)=0.28382, mflops=17.6168 (err=1.0e-15) 8. 
GSL: elapsed time t=1.26768 s, 1024 iters, t-(init.)=1.25228 s t(norm)=0.122713, mflops=40.7455 (err=1.0e-15) 9. NAPACK (f2c): elapsed time t=1.13907 s, 256 iters, t-(init.)=1.13521 s t(norm)=0.444963, mflops=11.2369 (err=1.7e-14) 10. Nielsen: elapsed time t=1.24902 s, 1024 iters, t-(init.)=1.23366 s t(norm)=0.120889, mflops=41.3604 (err=1.5e-14) 11. Singleton: elapsed time t=1.55018 s, 2048 iters, t-(init.)=1.51936 s t(norm)=0.0744422, mflops=67.1662 (err=1.5e-15) 12. Singleton (f2c): elapsed time t=1.49407 s, 2048 iters, t-(init.)=1.46336 s t(norm)=0.0716983, mflops=69.7367 (err=1.5e-15) 13. Temperton: elapsed time t=1.45619 s, 2048 iters, t-(init.)=1.42542 s t(norm)=0.0698398, mflops=71.5925 (err=1.3e-07) 14. Temperton (f2c): elapsed time t=1.2819 s, 1024 iters, t-(init.)=1.26655 s t(norm)=0.124111, mflops=40.2865 (err=9.9e-16) 15. Valkenburg: elapsed time t=1.95154 s, 128 iters, t-(init.)=1.94961 s t(norm)=1.52837, mflops=3.27147 (err=1.1e-15) 16. SUNPERF: elapsed time t=1.35305 s, 4096 iters, t-(init.)=1.29166 s t(norm)=0.031643, mflops=158.013 (err=1.0e-15) Top mflops for N=1000 = 158.013 Normalized results and averages for N=1000: fft 0: mflops = 12.6378 (norm. = 0.0799798), norm. avg. (of 9) = 0.0653278 fft 1: mflops = 76.3091 (norm. = 0.48293), norm. avg. (of 12) = 0.413539 fft 2: mflops = 114.373 (norm. = 0.723823), norm. avg. (of 12) = 0.446925 fft 3: mflops = 97.4213 (norm. = 0.616541), norm. avg. (of 12) = 0.508411 fft 4: mflops = 49.7951 (norm. = 0.315133), norm. avg. (of 12) = 0.27826 fft 5: mflops = 116.882 (norm. = 0.739703), norm. avg. (of 12) = 0.95405 fft 6: mflops = 116.426 (norm. = 0.736815), norm. avg. (of 12) = 0.912721 fft 7: mflops = 17.6168 (norm. = 0.11149), norm. avg. (of 12) = 0.117624 fft 8: mflops = 40.7455 (norm. = 0.257862), norm. avg. (of 12) = 0.251812 fft 9: mflops = 11.2369 (norm. = 0.0711139), norm. avg. (of 12) = 0.0885277 fft 10: mflops = 41.3604 (norm. = 0.261754), norm. avg. 
(of 12) = 0.134705 fft 11: mflops = 67.1662 (norm. = 0.425068), norm. avg. (of 12) = 0.22557 fft 12: mflops = 69.7367 (norm. = 0.441336), norm. avg. (of 12) = 0.225121 fft 13: mflops = 71.5925 (norm. = 0.45308), norm. avg. (of 10) = 0.236217 fft 14: mflops = 40.2865 (norm. = 0.254958), norm. avg. (of 10) = 0.171094 fft 15: mflops = 3.27147 (norm. = 0.0207038), norm. avg. (of 12) = 0.0247333 fft 16: mflops = 158.013 (norm. = 1), norm. avg. (of 12) = 0.655797 Benchmarking for array size = 1960: 0. Brenner: elapsed time t=1.0002 s, 128 iters, t-(init.)=0.99645 s t(norm)=0.363166, mflops=13.7678 (err=2.9e-15) 1. CWP (min N) (N=1980): elapsed time t=1.12696 s, 1024 iters, t-(init.)=1.09667 s t(norm)=0.0499615, mflops=100.077 2. CWP (best N) (N=1980): elapsed time t=1.12686 s, 1024 iters, t-(init.)=1.09652 s t(norm)=0.049955, mflops=100.09 3. FFTPACK: elapsed time t=1.4102 s, 1024 iters, t-(init.)=1.37887 s t(norm)=0.0628179, mflops=79.5952 (err=2.8e-15) 4. FFTPACK (f2c): elapsed time t=1.8397 s, 512 iters, t-(init.)=1.8243 s t(norm)=0.166221, mflops=30.0804 (err=2.8e-15) FFTW_MEASURE plan: (cost = 1.266427e-03) FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 7 FFTW_NOTW 14 5. FFTW: elapsed time t=1.25064 s, 1024 iters, t-(init.)=1.22001 s t(norm)=0.0555807, mflops=89.9593 (err=2.7e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.2889 s, 1024 iters, t-(init.)=1.25888 s t(norm)=0.0573515, mflops=87.1817 (err=2.8e-15) 7. Frigo-old: elapsed time t=1.59772 s, 256 iters, t-(init.)=1.59022 s t(norm)=0.289786, mflops=17.2541 (err=2.8e-15) 8. GSL: elapsed time t=1.48095 s, 512 iters, t-(init.)=1.46593 s t(norm)=0.133568, mflops=37.434 (err=2.8e-15) 9. NAPACK (f2c): elapsed time t=1.48146 s, 128 iters, t-(init.)=1.4777 s t(norm)=0.538564, mflops=9.28394 (err=1.3e-13) 10. Nielsen: elapsed time t=1.89583 s, 512 iters, t-(init.)=1.88084 s t(norm)=0.171373, mflops=29.1761 (err=1.7e-14) 11. 
Singleton: elapsed time t=1.23113 s, 512 iters, t-(init.)=1.21611 s t(norm)=0.110806, mflops=45.1238 (err=4.3e-15) 12. Singleton (f2c): elapsed time t=1.26792 s, 512 iters, t-(init.)=1.25289 s t(norm)=0.114157, mflops=43.7993 (err=4.3e-15) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.19652 s, 32 iters, t-(init.)=1.19558 s t(norm)=1.74297, mflops=2.86867 (err=2.7e-15) 16. SUNPERF: elapsed time t=1.18419 s, 1024 iters, t-(init.)=1.15421 s t(norm)=0.0525828, mflops=95.0882 (err=2.8e-15) Top mflops for N=1960 = 100.09 Normalized results and averages for N=1960: fft 0: mflops = 13.7678 (norm. = 0.137554), norm. avg. (of 10) = 0.0725504 fft 1: mflops = 100.077 (norm. = 0.999868), norm. avg. (of 13) = 0.458642 fft 2: mflops = 100.09 (norm. = 1), norm. avg. (of 13) = 0.489469 fft 3: mflops = 79.5952 (norm. = 0.795235), norm. avg. (of 13) = 0.530474 fft 4: mflops = 30.0804 (norm. = 0.300533), norm. avg. (of 13) = 0.279974 fft 5: mflops = 89.9593 (norm. = 0.898783), norm. avg. (of 13) = 0.949798 fft 6: mflops = 87.1817 (norm. = 0.871031), norm. avg. (of 13) = 0.909514 fft 7: mflops = 17.2541 (norm. = 0.172386), norm. avg. (of 13) = 0.121837 fft 8: mflops = 37.434 (norm. = 0.374003), norm. avg. (of 13) = 0.261211 fft 9: mflops = 9.28394 (norm. = 0.0927558), norm. avg. (of 13) = 0.0888529 fft 10: mflops = 29.1761 (norm. = 0.291498), norm. avg. (of 13) = 0.146766 fft 11: mflops = 45.1238 (norm. = 0.450832), norm. avg. (of 13) = 0.242897 fft 12: mflops = 43.7993 (norm. = 0.437598), norm. avg. (of 13) = 0.241465 fft 13: mflops = -1 (norm. = -0.00999099), norm. avg. (of 10) = 0.236217 fft 14: mflops = -1 (norm. = -0.00999099), norm. avg. (of 10) = 0.171094 fft 15: mflops = 2.86867 (norm. = 0.0286609), norm. avg. (of 13) = 0.0250355 fft 16: mflops = 95.0882 (norm. = 0.950025), norm. avg. (of 13) = 0.67843 Benchmarking for array size = 4725: 0. 
Brenner: elapsed time t=1.74855 s, 64 iters, t-(init.)=1.74387 s t(norm)=0.47245, mflops=10.5831 (err=1.9e-15) 1. CWP (min N) (N=5005): elapsed time t=1.9696 s, 512 iters, t-(init.)=1.93007 s t(norm)=0.0653617, mflops=76.4973 2. CWP (best N) (N=5040): elapsed time t=1.3963 s, 512 iters, t-(init.)=1.35643 s t(norm)=0.0459356, mflops=108.848 3. FFTPACK: elapsed time t=1.67387 s, 512 iters, t-(init.)=1.63587 s t(norm)=0.0553989, mflops=90.2545 (err=1.8e-15) 4. FFTPACK (f2c): elapsed time t=1.91888 s, 256 iters, t-(init.)=1.90007 s t(norm)=0.128692, mflops=38.8524 (err=1.9e-15) FFTW_MEASURE plan: (cost = 3.012359e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 15 5. FFTW: elapsed time t=1.68993 s, 512 iters, t-(init.)=1.65206 s t(norm)=0.0559471, mflops=89.3702 (err=1.9e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.54245 s, 512 iters, t-(init.)=1.50506 s t(norm)=0.050969, mflops=98.0989 (err=1.8e-15) 7. Frigo-old: elapsed time t=1.4916 s, 64 iters, t-(init.)=1.48692 s t(norm)=0.402836, mflops=12.412 (err=1.9e-15) 8. GSL: elapsed time t=1.9128 s, 256 iters, t-(init.)=1.8941 s t(norm)=0.128288, mflops=38.9749 (err=1.9e-15) 9. NAPACK (f2c): elapsed time t=1.70367 s, 64 iters, t-(init.)=1.69896 s t(norm)=0.460283, mflops=10.8629 (err=3.5e-13) 10. Nielsen: elapsed time t=1.42171 s, 128 iters, t-(init.)=1.41229 s t(norm)=0.191308, mflops=26.1358 (err=3.8e-14) 11. Singleton: elapsed time t=1.70122 s, 256 iters, t-(init.)=1.68226 s t(norm)=0.113939, mflops=43.883 (err=2.4e-15) 12. Singleton (f2c): elapsed time t=1.81534 s, 256 iters, t-(init.)=1.79646 s t(norm)=0.121674, mflops=41.0933 (err=2.4e-15) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.5367 s, 16 iters, t-(init.)=1.53553 s t(norm)=1.66402, mflops=3.00477 (err=1.8e-15) 16. 
SUNPERF: elapsed time t=1.2967 s, 512 iters, t-(init.)=1.2593 s t(norm)=0.0426463, mflops=117.243 (err=1.8e-15) Top mflops for N=4725 = 117.243 Normalized results and averages for N=4725: fft 0: mflops = 10.5831 (norm. = 0.0902663), norm. avg. (of 11) = 0.0741609 fft 1: mflops = 76.4973 (norm. = 0.652466), norm. avg. (of 14) = 0.472486 fft 2: mflops = 108.848 (norm. = 0.928394), norm. avg. (of 14) = 0.520821 fft 3: mflops = 90.2545 (norm. = 0.769805), norm. avg. (of 14) = 0.547569 fft 4: mflops = 38.8524 (norm. = 0.331382), norm. avg. (of 14) = 0.283646 fft 5: mflops = 89.3702 (norm. = 0.762262), norm. avg. (of 14) = 0.936403 fft 6: mflops = 98.0989 (norm. = 0.836712), norm. avg. (of 14) = 0.904314 fft 7: mflops = 12.412 (norm. = 0.105865), norm. avg. (of 14) = 0.120696 fft 8: mflops = 38.9749 (norm. = 0.332427), norm. avg. (of 14) = 0.266298 fft 9: mflops = 10.8629 (norm. = 0.0926524), norm. avg. (of 14) = 0.0891243 fft 10: mflops = 26.1358 (norm. = 0.222919), norm. avg. (of 14) = 0.152206 fft 11: mflops = 43.883 (norm. = 0.374289), norm. avg. (of 14) = 0.252283 fft 12: mflops = 41.0933 (norm. = 0.350495), norm. avg. (of 14) = 0.249253 fft 13: mflops = -1 (norm. = -0.00852926), norm. avg. (of 10) = 0.236217 fft 14: mflops = -1 (norm. = -0.00852926), norm. avg. (of 10) = 0.171094 fft 15: mflops = 3.00477 (norm. = 0.0256285), norm. avg. (of 14) = 0.0250778 fft 16: mflops = 117.243 (norm. = 1), norm. avg. (of 14) = 0.701399 Benchmarking for array size = 10368: 0. Brenner: elapsed time t=1.78276 s, 32 iters, t-(init.)=1.77769 s t(norm)=0.401662, mflops=12.4483 (err=3.1e-15) 1. CWP (min N) (N=10920): elapsed time t=1.00594 s, 128 iters, t-(init.)=0.984763 s t(norm)=0.0556258, mflops=89.8864 2. CWP (best N) (N=11088): elapsed time t=1.07109 s, 128 iters, t-(init.)=1.04959 s t(norm)=0.0592874, mflops=84.3349 3. FFTPACK: elapsed time t=1.03697 s, 128 iters, t-(init.)=1.01669 s t(norm)=0.0574293, mflops=87.0636 (err=3.0e-15) 4. 
FFTPACK (f2c): elapsed time t=1.95982 s, 128 iters, t-(init.)=1.93964 s t(norm)=0.109563, mflops=45.6357 (err=3.0e-15) FFTW_MEASURE plan: (cost = 6.861827e-03) FFTW_TWIDDLE 6 FFTW_TWIDDLE 16 FFTW_TWIDDLE 9 FFTW_NOTW 12 5. FFTW: elapsed time t=1.81837 s, 256 iters, t-(init.)=1.77753 s t(norm)=0.0502032, mflops=99.5952 (err=3.0e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.94514 s, 256 iters, t-(init.)=1.90464 s t(norm)=0.0537931, mflops=92.9487 (err=3.0e-15) 7. Frigo-old: elapsed time t=1.1984 s, 32 iters, t-(init.)=1.19317 s t(norm)=0.269591, mflops=18.5466 (err=3.1e-15) 8. GSL: elapsed time t=1.05133 s, 64 iters, t-(init.)=1.04108 s t(norm)=0.117614, mflops=42.5121 (err=3.0e-15) 9. NAPACK (f2c): elapsed time t=1.25292 s, 32 iters, t-(init.)=1.24781 s t(norm)=0.281937, mflops=17.7345 (err=8.1e-14) 10. Nielsen: elapsed time t=1.62821 s, 64 iters, t-(init.)=1.61772 s t(norm)=0.182759, mflops=27.3585 (err=8.1e-15) 11. Singleton: elapsed time t=1.60487 s, 128 iters, t-(init.)=1.58462 s t(norm)=0.0895092, mflops=55.8602 (err=4.4e-15) 12. Singleton (f2c): elapsed time t=1.72198 s, 128 iters, t-(init.)=1.70177 s t(norm)=0.0961266, mflops=52.0147 (err=4.4e-15) 13. Temperton: elapsed time t=1.45654 s, 128 iters, t-(init.)=1.43641 s t(norm)=0.0811377, mflops=61.6236 (err=2.1e-07) 14. Temperton (f2c): elapsed time t=1.02985 s, 64 iters, t-(init.)=1.01976 s t(norm)=0.115205, mflops=43.4008 (err=3.0e-15) 15. Valkenburg: elapsed time t=1.45146 s, 8 iters, t-(init.)=1.44999 s t(norm)=1.31048, mflops=3.8154 (err=3.0e-15) 16. SUNPERF: elapsed time t=1.01625 s, 128 iters, t-(init.)=0.995784 s t(norm)=0.0562483, mflops=88.8916 (err=3.0e-15) Top mflops for N=10368 = 99.5952 Normalized results and averages for N=10368: fft 0: mflops = 12.4483 (norm. = 0.124989), norm. avg. (of 12) = 0.0783966 fft 1: mflops = 89.8864 (norm. = 0.902518), norm. avg. 
(of 15) = 0.501155 fft 2: mflops = 84.3349 (norm. = 0.846777), norm. avg. (of 15) = 0.542551 fft 3: mflops = 87.0636 (norm. = 0.874174), norm. avg. (of 15) = 0.569343 fft 4: mflops = 45.6357 (norm. = 0.458212), norm. avg. (of 15) = 0.295283 fft 5: mflops = 99.5952 (norm. = 1), norm. avg. (of 15) = 0.940643 fft 6: mflops = 92.9487 (norm. = 0.933265), norm. avg. (of 15) = 0.906244 fft 7: mflops = 18.5466 (norm. = 0.18622), norm. avg. (of 15) = 0.125064 fft 8: mflops = 42.5121 (norm. = 0.426849), norm. avg. (of 15) = 0.277001 fft 9: mflops = 17.7345 (norm. = 0.178065), norm. avg. (of 15) = 0.0950537 fft 10: mflops = 27.3585 (norm. = 0.274697), norm. avg. (of 15) = 0.160372 fft 11: mflops = 55.8602 (norm. = 0.560872), norm. avg. (of 15) = 0.272855 fft 12: mflops = 52.0147 (norm. = 0.522261), norm. avg. (of 15) = 0.267454 fft 13: mflops = 61.6236 (norm. = 0.618741), norm. avg. (of 11) = 0.270992 fft 14: mflops = 43.4008 (norm. = 0.435772), norm. avg. (of 11) = 0.195156 fft 15: mflops = 3.8154 (norm. = 0.038309), norm. avg. (of 15) = 0.0259599 fft 16: mflops = 88.8916 (norm. = 0.892529), norm. avg. (of 15) = 0.714141 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.01706 s, 4 iters, t-(init.)=1.0108 s t(norm)=0.635792, mflops=7.86421 (err=5.6e-15) 1. CWP (min N) (N=27720): elapsed time t=1.55603 s, 64 iters, t-(init.)=1.4595 s t(norm)=0.0573764, mflops=87.1438 2. CWP (best N) (N=27720): elapsed time t=1.55606 s, 64 iters, t-(init.)=1.45953 s t(norm)=0.0573774, mflops=87.1424 3. FFTPACK: elapsed time t=1.30656 s, 32 iters, t-(init.)=1.26049 s t(norm)=0.0991057, mflops=50.4512 (err=5.5e-15) 4. FFTPACK (f2c): elapsed time t=1.89919 s, 32 iters, t-(init.)=1.85313 s t(norm)=0.145701, mflops=34.3168 (err=5.5e-15) FFTW_MEASURE plan: (cost = 2.919067e-02) FFTW_TWIDDLE 3 FFTW_TWIDDLE 6 FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 15 5. 
FFTW: elapsed time t=1.01807 s, 32 iters, t-(init.)=0.971281 s t(norm)=0.0763666, mflops=65.4736 (err=5.5e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.00271 s, 32 iters, t-(init.)=0.955918 s t(norm)=0.0751587, mflops=66.5259 (err=5.6e-15) 7. Frigo-old: elapsed time t=1.2092 s, 8 iters, t-(init.)=1.19707 s t(norm)=0.376477, mflops=13.281 (err=5.7e-15) 8. GSL: elapsed time t=1.83419 s, 32 iters, t-(init.)=1.78801 s t(norm)=0.140582, mflops=35.5664 (err=5.5e-15) 9. NAPACK (f2c): elapsed time t=1.56187 s, 8 iters, t-(init.)=1.54991 s t(norm)=0.487445, mflops=10.2576 (err=1.1e-12) 10. Nielsen: elapsed time t=1.16125 s, 16 iters, t-(init.)=1.13743 s t(norm)=0.17886, mflops=27.9549 (err=2.0e-13) 11. Singleton: elapsed time t=1.88455 s, 32 iters, t-(init.)=1.83924 s t(norm)=0.14461, mflops=34.5759 (err=7.7e-15) 12. Singleton (f2c): elapsed time t=1.93766 s, 32 iters, t-(init.)=1.89237 s t(norm)=0.148787, mflops=33.6051 (err=7.7e-15) 13. Temperton: elapsed time t=1.71383 s, 32 iters, t-(init.)=1.66648 s t(norm)=0.131027, mflops=38.1602 (err=1.4e-07) 14. Temperton (f2c): elapsed time t=1.11589 s, 16 iters, t-(init.)=1.09328 s t(norm)=0.171917, mflops=29.0838 (err=5.6e-15) 15. Valkenburg: elapsed time t=1.24677 s, 2 iters, t-(init.)=1.24321 s t(norm)=1.56394, mflops=3.19704 (err=5.5e-15) 16. SUNPERF: elapsed time t=1.16421 s, 32 iters, t-(init.)=1.11814 s t(norm)=0.0879134, mflops=56.8742 (err=5.5e-15) Top mflops for N=27000 = 87.1438 Normalized results and averages for N=27000: fft 0: mflops = 7.86421 (norm. = 0.0902441), norm. avg. (of 13) = 0.0793079 fft 1: mflops = 87.1438 (norm. = 1), norm. avg. (of 16) = 0.532333 fft 2: mflops = 87.1424 (norm. = 0.999984), norm. avg. (of 16) = 0.571141 fft 3: mflops = 50.4512 (norm. = 0.578942), norm. avg. (of 16) = 0.569943 fft 4: mflops = 34.3168 (norm. = 0.393795), norm. avg. (of 16) = 0.30144 fft 5: mflops = 65.4736 (norm. 
= 0.751328), norm. avg. (of 16) = 0.92881 fft 6: mflops = 66.5259 (norm. = 0.763404), norm. avg. (of 16) = 0.897317 fft 7: mflops = 13.281 (norm. = 0.152403), norm. avg. (of 16) = 0.126773 fft 8: mflops = 35.5664 (norm. = 0.408135), norm. avg. (of 16) = 0.285197 fft 9: mflops = 10.2576 (norm. = 0.117709), norm. avg. (of 16) = 0.0964697 fft 10: mflops = 27.9549 (norm. = 0.32079), norm. avg. (of 16) = 0.170398 fft 11: mflops = 34.5759 (norm. = 0.396768), norm. avg. (of 16) = 0.2806 fft 12: mflops = 33.6051 (norm. = 0.385628), norm. avg. (of 16) = 0.27484 fft 13: mflops = 38.1602 (norm. = 0.437899), norm. avg. (of 12) = 0.284901 fft 14: mflops = 29.0838 (norm. = 0.333745), norm. avg. (of 12) = 0.206705 fft 15: mflops = 3.19704 (norm. = 0.036687), norm. avg. (of 16) = 0.0266303 fft 16: mflops = 56.8742 (norm. = 0.652647), norm. avg. (of 16) = 0.710298 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.86653 s, 2 iters, t-(init.)=1.85295 s t(norm)=0.756193, mflops=6.61207 (err=1.1e-14) 1. CWP (min N) (N=80080): elapsed time t=1.64899 s, 16 iters, t-(init.)=1.53557 s t(norm)=0.0783336, mflops=63.8296 2. CWP (best N) (N=80080): elapsed time t=1.6525 s, 16 iters, t-(init.)=1.5391 s t(norm)=0.078514, mflops=63.6829 3. FFTPACK: elapsed time t=1.67406 s, 8 iters, t-(init.)=1.62094 s t(norm)=0.165378, mflops=30.2338 (err=1.0e-14) 4. FFTPACK (f2c): elapsed time t=1.13536 s, 4 iters, t-(init.)=1.10868 s t(norm)=0.226228, mflops=22.1016 (err=1.1e-14) FFTW_MEASURE plan: (cost = 1.251942e-01) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 7 FFTW_TWIDDLE 4 FFTW_TWIDDLE 5 FFTW_NOTW 10 5. FFTW: elapsed time t=1.05249 s, 8 iters, t-(init.)=0.998987 s t(norm)=0.101922, mflops=49.057 (err=1.0e-14) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.04417 s, 8 iters, t-(init.)=0.988745 s t(norm)=0.100877, mflops=49.5651 (err=1.1e-14) 7. 
Frigo-old: elapsed time t=1.21079 s, 2 iters, t-(init.)=1.19702 s t(norm)=0.488507, mflops=10.2353 (err=1.1e-14) 8. GSL: elapsed time t=1.74132 s, 8 iters, t-(init.)=1.68837 s t(norm)=0.172257, mflops=29.0264 (err=1.1e-14) 9. NAPACK (f2c): elapsed time t=1.35803 s, 2 iters, t-(init.)=1.34423 s t(norm)=0.548584, mflops=9.11438 (err=5.1e-12) 10. Nielsen: elapsed time t=1.65421 s, 4 iters, t-(init.)=1.62713 s t(norm)=0.332017, mflops=15.0595 (err=4.8e-13) 11. Singleton: elapsed time t=1.17567 s, 4 iters, t-(init.)=1.15079 s t(norm)=0.23482, mflops=21.2929 (err=1.5e-14) 12. Singleton (f2c): elapsed time t=1.20029 s, 4 iters, t-(init.)=1.1754 s t(norm)=0.239843, mflops=20.847 (err=1.5e-14) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=2.16819 s, 1 iters, t-(init.)=2.16132 s t(norm)=1.76408, mflops=2.83434 (err=1.1e-14) 16. SUNPERF: elapsed time t=1.55443 s, 8 iters, t-(init.)=1.50135 s t(norm)=0.153176, mflops=32.6422 (err=1.0e-14) Top mflops for N=75600 = 63.8296 Normalized results and averages for N=75600: fft 0: mflops = 6.61207 (norm. = 0.103589), norm. avg. (of 14) = 0.0810423 fft 1: mflops = 63.8296 (norm. = 1), norm. avg. (of 17) = 0.559843 fft 2: mflops = 63.6829 (norm. = 0.997702), norm. avg. (of 17) = 0.596233 fft 3: mflops = 30.2338 (norm. = 0.473664), norm. avg. (of 17) = 0.564279 fft 4: mflops = 22.1016 (norm. = 0.34626), norm. avg. (of 17) = 0.304077 fft 5: mflops = 49.057 (norm. = 0.768562), norm. avg. (of 17) = 0.919384 fft 6: mflops = 49.5651 (norm. = 0.776523), norm. avg. (of 17) = 0.890211 fft 7: mflops = 10.2353 (norm. = 0.160353), norm. avg. (of 17) = 0.128748 fft 8: mflops = 29.0264 (norm. = 0.454749), norm. avg. (of 17) = 0.295171 fft 9: mflops = 9.11438 (norm. = 0.142792), norm. avg. (of 17) = 0.0991945 fft 10: mflops = 15.0595 (norm. = 0.235932), norm. avg. (of 17) = 0.174253 fft 11: mflops = 21.2929 (norm. = 0.333589), norm. avg. 
(of 17) = 0.283717 fft 12: mflops = 20.847 (norm. = 0.326604), norm. avg. (of 17) = 0.277885 fft 13: mflops = -1 (norm. = -0.0156667), norm. avg. (of 12) = 0.284901 fft 14: mflops = -1 (norm. = -0.0156667), norm. avg. (of 12) = 0.206705 fft 15: mflops = 2.83434 (norm. = 0.0444048), norm. avg. (of 17) = 0.0276759 fft 16: mflops = 32.6422 (norm. = 0.511397), norm. avg. (of 17) = 0.698598 Benchmarking for array size = 165375: 0. Brenner: elapsed time t=2.72325 s, 1 iters, t-(init.)=2.70678 s t(norm)=0.944167, mflops=5.29567 (err=2.7e-14) 1. CWP (min N) (N=180180): elapsed time t=1.03425 s, 4 iters, t-(init.)=0.964991 s t(norm)=0.0841511, mflops=59.4169 2. CWP (best N) (N=180180): elapsed time t=1.0296 s, 4 iters, t-(init.)=0.96044 s t(norm)=0.0837542, mflops=59.6985 3. FFTPACK: elapsed time t=1.55315 s, 2 iters, t-(init.)=1.52135 s t(norm)=0.265336, mflops=18.844 (err=2.7e-14) 4. FFTPACK (f2c): elapsed time t=1.03681 s, 1 iters, t-(init.)=1.0209 s t(norm)=0.356107, mflops=14.0407 (err=2.7e-14) FFTW_MEASURE plan: (cost = 3.273459e-01) FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_NOTW 15 5. FFTW: elapsed time t=1.36068 s, 4 iters, t-(init.)=1.29709 s t(norm)=0.113111, mflops=44.2042 (err=2.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.362 s, 4 iters, t-(init.)=1.2971 s t(norm)=0.113112, mflops=44.204 (err=2.7e-14) 7. Frigo-old: elapsed time t=1.94817 s, 1 iters, t-(init.)=1.93246 s t(norm)=0.674074, mflops=7.41758 (err=2.7e-14) 8. GSL: elapsed time t=1.05516 s, 2 iters, t-(init.)=1.02356 s t(norm)=0.178516, mflops=28.0086 (err=2.7e-14) 9. NAPACK (f2c): elapsed time t=1.87795 s, 1 iters, t-(init.)=1.86205 s t(norm)=0.649515, mflops=7.69805 (err=1.6e-11) 10. Nielsen: elapsed time t=1.11827 s, 1 iters, t-(init.)=1.10235 s t(norm)=0.384519, mflops=13.0033 (err=1.6e-12) 11. 
Singleton: elapsed time t=1.77716 s, 2 iters, t-(init.)=1.74729 s t(norm)=0.304742, mflops=16.4073 (err=4.0e-14) 12. Singleton (f2c): elapsed time t=1.80264 s, 2 iters, t-(init.)=1.77277 s t(norm)=0.309186, mflops=16.1715 (err=4.0e-14) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=5.69287 s, 1 iters, t-(init.)=5.67696 s t(norm)=1.98022, mflops=2.52498 (err=2.7e-14) 16. SUNPERF: elapsed time t=1.51251 s, 2 iters, t-(init.)=1.48071 s t(norm)=0.258248, mflops=19.3612 (err=2.7e-14) Top mflops for N=165375 = 59.6985 Normalized results and averages for N=165375: fft 0: mflops = 5.29567 (norm. = 0.088707), norm. avg. (of 15) = 0.0815533 fft 1: mflops = 59.4169 (norm. = 0.995284), norm. avg. (of 18) = 0.584034 fft 2: mflops = 59.6985 (norm. = 1), norm. avg. (of 18) = 0.618664 fft 3: mflops = 18.844 (norm. = 0.315654), norm. avg. (of 18) = 0.550467 fft 4: mflops = 14.0407 (norm. = 0.235194), norm. avg. (of 18) = 0.30025 fft 5: mflops = 44.2042 (norm. = 0.740458), norm. avg. (of 18) = 0.909444 fft 6: mflops = 44.204 (norm. = 0.740454), norm. avg. (of 18) = 0.881891 fft 7: mflops = 7.41758 (norm. = 0.124251), norm. avg. (of 18) = 0.128498 fft 8: mflops = 28.0086 (norm. = 0.469168), norm. avg. (of 18) = 0.304837 fft 9: mflops = 7.69805 (norm. = 0.128949), norm. avg. (of 18) = 0.100848 fft 10: mflops = 13.0033 (norm. = 0.217815), norm. avg. (of 18) = 0.176673 fft 11: mflops = 16.4073 (norm. = 0.274836), norm. avg. (of 18) = 0.283223 fft 12: mflops = 16.1715 (norm. = 0.270886), norm. avg. (of 18) = 0.277496 fft 13: mflops = -1 (norm. = -0.0167508), norm. avg. (of 12) = 0.284901 fft 14: mflops = -1 (norm. = -0.0167508), norm. avg. (of 12) = 0.206705 fft 15: mflops = 2.52498 (norm. = 0.0422955), norm. avg. (of 18) = 0.0284881 fft 16: mflops = 19.3612 (norm. = 0.324316), norm. avg. (of 18) = 0.677804 Benchmarking for array size = 362880: 0. 
Brenner: elapsed time t=5.71637 s, 1 iters, t-(init.)=5.68139 s t(norm)=0.847705, mflops=5.89828 (err=1.1e-13) 1. CWP (min N) (N=720720): elapsed time t=1.14576 s, 1 iters, t-(init.)=1.07638 s t(norm)=0.160604, mflops=31.1324 2. CWP (best N) (N=720720): elapsed time t=1.149 s, 1 iters, t-(init.)=1.07967 s t(norm)=0.161095, mflops=31.0376 3. FFTPACK: elapsed time t=1.19686 s, 1 iters, t-(init.)=1.16188 s t(norm)=0.173361, mflops=28.8415 (err=1.1e-13) 4. FFTPACK (f2c): elapsed time t=1.61092 s, 1 iters, t-(init.)=1.57604 s t(norm)=0.235157, mflops=21.2624 (err=1.1e-13) FFTW_MEASURE plan: (cost = 6.955900e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 9 FFTW_TWIDDLE 3 FFTW_TWIDDLE 8 FFTW_TWIDDLE 7 FFTW_NOTW 15 5. FFTW: elapsed time t=1.3941 s, 2 iters, t-(init.)=1.32431 s t(norm)=0.0987986, mflops=50.608 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.479 s, 2 iters, t-(init.)=1.40924 s t(norm)=0.105134, mflops=47.5582 (err=1.1e-13) 7. Frigo-old: elapsed time t=3.64253 s, 1 iters, t-(init.)=3.60769 s t(norm)=0.538295, mflops=9.2886 (err=1.1e-13) 8. GSL: elapsed time t=1.13183 s, 1 iters, t-(init.)=1.09695 s t(norm)=0.163673, mflops=30.5487 (err=1.1e-13) 9. NAPACK (f2c): elapsed time t=3.40343 s, 1 iters, t-(init.)=3.36855 s t(norm)=0.502612, mflops=9.94803 (err=3.4e-11) 10. Nielsen: elapsed time t=2.74325 s, 1 iters, t-(init.)=2.70837 s t(norm)=0.404109, mflops=12.3729 (err=3.5e-12) 11. Singleton: elapsed time t=2.2155 s, 1 iters, t-(init.)=2.18138 s t(norm)=0.325478, mflops=15.362 (err=1.6e-13) 12. Singleton (f2c): elapsed time t=2.28775 s, 1 iters, t-(init.)=2.25535 s t(norm)=0.336515, mflops=14.8582 (err=1.6e-13) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=12.1428 s, 1 iters, t-(init.)=12.1079 s t(norm)=1.80659, mflops=2.76764 (err=1.1e-13) 16. 
SUNPERF: elapsed time t=1.11979 s, 1 iters, t-(init.)=1.0849 s t(norm)=0.161876, mflops=30.8879 (err=1.1e-13)
Top mflops for N=362880 = 50.608
Normalized results and averages for N=362880:
fft 0: mflops = 5.89828 (norm. = 0.116548), norm. avg. (of 16) = 0.0837405
fft 1: mflops = 31.1324 (norm. = 0.615167), norm. avg. (of 19) = 0.585672
fft 2: mflops = 31.0376 (norm. = 0.613294), norm. avg. (of 19) = 0.618382
fft 3: mflops = 28.8415 (norm. = 0.5699), norm. avg. (of 19) = 0.55149
fft 4: mflops = 21.2624 (norm. = 0.42014), norm. avg. (of 19) = 0.30656
fft 5: mflops = 50.608 (norm. = 1), norm. avg. (of 19) = 0.91421
fft 6: mflops = 47.5582 (norm. = 0.939735), norm. avg. (of 19) = 0.884936
fft 7: mflops = 9.2886 (norm. = 0.18354), norm. avg. (of 19) = 0.131395
fft 8: mflops = 30.5487 (norm. = 0.603634), norm. avg. (of 19) = 0.320563
fft 9: mflops = 9.94803 (norm. = 0.19657), norm. avg. (of 19) = 0.105886
fft 10: mflops = 12.3729 (norm. = 0.244485), norm. avg. (of 19) = 0.180242
fft 11: mflops = 15.362 (norm. = 0.303549), norm. avg. (of 19) = 0.284293
fft 12: mflops = 14.8582 (norm. = 0.293594), norm. avg. (of 19) = 0.278343
fft 13: mflops = -1 (norm. = -0.0197597), norm. avg. (of 12) = 0.284901
fft 14: mflops = -1 (norm. = -0.0197597), norm. avg. (of 12) = 0.206705
fft 15: mflops = 2.76764 (norm. = 0.0546878), norm. avg. (of 19) = 0.029867
fft 16: mflops = 30.8879 (norm. = 0.610336), norm. avg. (of 19) = 0.674253
------------------------------------------------------
@@@@ bench.3d.p2.log
Benchmarking for sizes:
4x4x4 (0.00128174 MB)
8x8x8 (0.00830078 MB)
16x16x16 (0.0633545 MB)
32x32x32 (0.501587 MB)
64x64x64 (4.00305 MB)
256x64x32 (8.01184 MB)
16x1024x64 (16.047 MB)
128x128x128 (32.006 MB)
512x128x64 (64.0236 MB)
256x128x256 (128.012 MB)
256x256x256 (256.012 MB)
Maximum array size N = 16777216
Benchmarking FFTs:
0. FFTW
1. HARM
2. HARM (f2c)
3. NR (C)
4. NR (F)
5. PDA
6. PDA (f2c)
7. Singleton
8. Singleton (f2c)
9. Temperton
10.
Temperton (f2c) Computing normalized averages (11 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.86457 s, 131072 iters, t-(init.)=1.72413 s t(norm)=0.0342554, mflops=145.962 (err=1.9e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. Skipping fft (all dimensions must be > 4 for HARM). 3. NR (C): elapsed time t=1.14223 s, 16384 iters, t-(init.)=1.12383 s t(norm)=0.178628, mflops=27.9911 (err=2.4e-16) 4. NR (F): elapsed time t=1.63648 s, 32768 iters, t-(init.)=1.59981 s t(norm)=0.127141, mflops=39.3264 (err=2.4e-16) 5. PDA: elapsed time t=1.91109 s, 16384 iters, t-(init.)=1.89354 s t(norm)=0.30097, mflops=16.613 (err=2.8e-16) 6. PDA (f2c): elapsed time t=1.68354 s, 8192 iters, t-(init.)=1.67481 s t(norm)=0.532409, mflops=9.39127 (err=2.8e-16) 7. Singleton: elapsed time t=1.65113 s, 65536 iters, t-(init.)=1.57897 s t(norm)=0.0627425, mflops=79.6908 (err=1.9e-16) 8. Singleton (f2c): elapsed time t=1.69514 s, 65536 iters, t-(init.)=1.62493 s t(norm)=0.0645689, mflops=77.4367 (err=1.9e-16) 9. Temperton: elapsed time t=1.3215 s, 32768 iters, t-(init.)=1.28581 s t(norm)=0.102187, mflops=48.9298 (err=1.9e-16) 10. Temperton (f2c): elapsed time t=1.05315 s, 16384 iters, t-(init.)=1.03462 s t(norm)=0.164449, mflops=30.4046 (err=1.9e-16) Top mflops for N=64 = 145.962 Normalized results and averages for N=64: fft 0: mflops = 145.962 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.00685108), norm. avg. (of 0) = -1 fft 2: mflops = -1 (norm. = -0.00685108), norm. avg. (of 0) = -1 fft 3: mflops = 27.9911 (norm. = 0.191769), norm. avg. (of 1) = 0.191769 fft 4: mflops = 39.3264 (norm. = 0.269428), norm. avg. (of 1) = 0.269428 fft 5: mflops = 16.613 (norm. = 0.113817), norm. avg. (of 1) = 0.113817 fft 6: mflops = 9.39127 (norm. = 0.0643403), norm. avg. (of 1) = 0.0643403 fft 7: mflops = 79.6908 (norm. = 0.545968), norm. avg. (of 1) = 0.545968 fft 8: mflops = 77.4367 (norm. = 0.530525), norm. avg. 
(of 1) = 0.530525 fft 9: mflops = 48.9298 (norm. = 0.335222), norm. avg. (of 1) = 0.335222 fft 10: mflops = 30.4046 (norm. = 0.208304), norm. avg. (of 1) = 0.208304 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.75893 s, 16384 iters, t-(init.)=1.63203 s t(norm)=0.021617, mflops=231.3 (err=3.6e-16) 1. HARM: elapsed time t=1.23897 s, 4096 iters, t-(init.)=1.20711 s t(norm)=0.0639547, mflops=78.1803 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.48897 s, 4096 iters, t-(init.)=1.4571 s t(norm)=0.0771999, mflops=64.7669 (err=4.0e-16) 3. NR (C): elapsed time t=1.10369 s, 2048 iters, t-(init.)=1.08734 s t(norm)=0.115219, mflops=43.3956 (err=3.6e-16) 4. NR (F): elapsed time t=1.56542 s, 4096 iters, t-(init.)=1.5336 s t(norm)=0.0812533, mflops=61.536 (err=3.6e-16) 5. PDA: elapsed time t=1.49364 s, 2048 iters, t-(init.)=1.47778 s t(norm)=0.156591, mflops=31.9302 (err=3.0e-16) 6. PDA (f2c): elapsed time t=1.1598 s, 1024 iters, t-(init.)=1.15189 s t(norm)=0.244116, mflops=20.482 (err=3.0e-16) 7. Singleton: elapsed time t=1.58824 s, 4096 iters, t-(init.)=1.55633 s t(norm)=0.0824571, mflops=60.6376 (err=3.5e-16) 8. Singleton (f2c): elapsed time t=1.65141 s, 4096 iters, t-(init.)=1.61969 s t(norm)=0.0858142, mflops=58.2654 (err=3.5e-16) 9. Temperton: elapsed time t=1.8974 s, 8192 iters, t-(init.)=1.83372 s t(norm)=0.0485769, mflops=102.93 (err=1.3e-08) 10. Temperton (f2c): elapsed time t=1.0871 s, 4096 iters, t-(init.)=1.05521 s t(norm)=0.0559068, mflops=89.4346 (err=3.3e-16) Top mflops for N=512 = 231.3 Normalized results and averages for N=512: fft 0: mflops = 231.3 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 78.1803 (norm. = 0.338004), norm. avg. (of 1) = 0.338004 fft 2: mflops = 64.7669 (norm. = 0.280013), norm. avg. (of 1) = 0.280013 fft 3: mflops = 43.3956 (norm. = 0.187617), norm. avg. (of 2) = 0.189693 fft 4: mflops = 61.536 (norm. = 0.266044), norm. avg. (of 2) = 0.267736 fft 5: mflops = 31.9302 (norm. = 0.138047), norm. avg. 
(of 2) = 0.125932 fft 6: mflops = 20.482 (norm. = 0.088552), norm. avg. (of 2) = 0.0764461 fft 7: mflops = 60.6376 (norm. = 0.26216), norm. avg. (of 2) = 0.404064 fft 8: mflops = 58.2654 (norm. = 0.251905), norm. avg. (of 2) = 0.391215 fft 9: mflops = 102.93 (norm. = 0.445006), norm. avg. (of 2) = 0.390114 fft 10: mflops = 89.4346 (norm. = 0.386661), norm. avg. (of 2) = 0.297483 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.25201 s, 1024 iters, t-(init.)=1.18947 s t(norm)=0.0236326, mflops=211.572 (err=4.2e-16) 1. HARM: elapsed time t=1.54266 s, 512 iters, t-(init.)=1.51131 s t(norm)=0.0600541, mflops=83.2582 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.6032 s, 512 iters, t-(init.)=1.57189 s t(norm)=0.0624613, mflops=80.0495 (err=4.0e-16) 3. NR (C): elapsed time t=1.85976 s, 256 iters, t-(init.)=1.8441 s t(norm)=0.146556, mflops=34.1166 (err=5.0e-16) 4. NR (F): elapsed time t=1.30777 s, 256 iters, t-(init.)=1.29212 s t(norm)=0.102688, mflops=48.691 (err=5.0e-16) 5. PDA: elapsed time t=1.23941 s, 256 iters, t-(init.)=1.22376 s t(norm)=0.0972557, mflops=51.4109 (err=4.0e-16) 6. PDA (f2c): elapsed time t=1.95025 s, 256 iters, t-(init.)=1.93459 s t(norm)=0.153748, mflops=32.5208 (err=4.0e-16) 7. Singleton: elapsed time t=1.67498 s, 512 iters, t-(init.)=1.64369 s t(norm)=0.0653146, mflops=76.5526 (err=4.1e-16) 8. Singleton (f2c): elapsed time t=1.71004 s, 512 iters, t-(init.)=1.67877 s t(norm)=0.0667082, mflops=74.9533 (err=4.1e-16) 9. Temperton: elapsed time t=1.04717 s, 512 iters, t-(init.)=1.01587 s t(norm)=0.040367, mflops=123.863 (err=6.3e-08) 10. Temperton (f2c): elapsed time t=1.40619 s, 512 iters, t-(init.)=1.37486 s t(norm)=0.0546321, mflops=91.5212 (err=4.6e-16) Top mflops for N=4096 = 211.572 Normalized results and averages for N=4096: fft 0: mflops = 211.572 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 83.2582 (norm. = 0.393521), norm. avg. (of 2) = 0.365763 fft 2: mflops = 80.0495 (norm. = 0.378355), norm. avg. 
(of 2) = 0.329184 fft 3: mflops = 34.1166 (norm. = 0.161253), norm. avg. (of 3) = 0.180213 fft 4: mflops = 48.691 (norm. = 0.230139), norm. avg. (of 3) = 0.255204 fft 5: mflops = 51.4109 (norm. = 0.242994), norm. avg. (of 3) = 0.164953 fft 6: mflops = 32.5208 (norm. = 0.15371), norm. avg. (of 3) = 0.102201 fft 7: mflops = 76.5526 (norm. = 0.361827), norm. avg. (of 3) = 0.389985 fft 8: mflops = 74.9533 (norm. = 0.354268), norm. avg. (of 3) = 0.378899 fft 9: mflops = 123.863 (norm. = 0.585442), norm. avg. (of 3) = 0.455223 fft 10: mflops = 91.5212 (norm. = 0.432576), norm. avg. (of 3) = 0.342514 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.5286 s, 64 iters, t-(init.)=1.41347 s t(norm)=0.0449329, mflops=111.277 (err=5.2e-16) 1. HARM: elapsed time t=1.34668 s, 32 iters, t-(init.)=1.28886 s t(norm)=0.0819438, mflops=61.0175 (err=5.2e-16) 2. HARM (f2c): elapsed time t=1.46993 s, 32 iters, t-(init.)=1.41226 s t(norm)=0.089789, mflops=55.6861 (err=5.2e-16) 3. NR (C): elapsed time t=1.10744 s, 8 iters, t-(init.)=1.09331 s t(norm)=0.278043, mflops=17.9828 (err=5.9e-16) 4. NR (F): elapsed time t=1.96686 s, 16 iters, t-(init.)=1.9382 s t(norm)=0.246454, mflops=20.2877 (err=5.9e-16) 5. PDA: elapsed time t=1.47727 s, 16 iters, t-(init.)=1.44868 s t(norm)=0.184209, mflops=27.143 (err=4.2e-16) 6. PDA (f2c): elapsed time t=1.94485 s, 16 iters, t-(init.)=1.91622 s t(norm)=0.24366, mflops=20.5204 (err=4.2e-16) 7. Singleton: elapsed time t=1.26976 s, 16 iters, t-(init.)=1.24095 s t(norm)=0.157795, mflops=31.6868 (err=5.3e-16) 8. Singleton (f2c): elapsed time t=1.29839 s, 16 iters, t-(init.)=1.26964 s t(norm)=0.161444, mflops=30.9706 (err=5.3e-16) 9. Temperton: elapsed time t=1.21287 s, 32 iters, t-(init.)=1.15528 s t(norm)=0.0734506, mflops=68.073 (err=9.6e-08) 10. 
Temperton (f2c): elapsed time t=1.68001 s, 32 iters, t-(init.)=1.62244 s t(norm)=0.103152, mflops=48.4722 (err=4.7e-16) Top mflops for N=32768 = 111.277 Normalized results and averages for N=32768: fft 0: mflops = 111.277 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 61.0175 (norm. = 0.548338), norm. avg. (of 3) = 0.426621 fft 2: mflops = 55.6861 (norm. = 0.500427), norm. avg. (of 3) = 0.386265 fft 3: mflops = 17.9828 (norm. = 0.161604), norm. avg. (of 4) = 0.175561 fft 4: mflops = 20.2877 (norm. = 0.182317), norm. avg. (of 4) = 0.236982 fft 5: mflops = 27.143 (norm. = 0.243923), norm. avg. (of 4) = 0.184695 fft 6: mflops = 20.5204 (norm. = 0.184408), norm. avg. (of 4) = 0.122753 fft 7: mflops = 31.6868 (norm. = 0.284755), norm. avg. (of 4) = 0.363678 fft 8: mflops = 30.9706 (norm. = 0.278319), norm. avg. (of 4) = 0.353754 fft 9: mflops = 68.073 (norm. = 0.611743), norm. avg. (of 4) = 0.494353 fft 10: mflops = 48.4722 (norm. = 0.435599), norm. avg. (of 4) = 0.365785 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.75104 s, 4 iters, t-(init.)=1.64415 s t(norm)=0.08711, mflops=57.3987 (err=1.2e-15) 1. HARM: elapsed time t=1.85456 s, 4 iters, t-(init.)=1.74748 s t(norm)=0.0925847, mflops=54.0046 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.89361 s, 4 iters, t-(init.)=1.78653 s t(norm)=0.094654, mflops=52.824 (err=1.2e-15) 3. NR (C): elapsed time t=2.16915 s, 1 iters, t-(init.)=2.1424 s t(norm)=0.454033, mflops=11.0124 (err=1.2e-15) 4. NR (F): elapsed time t=2.0265 s, 1 iters, t-(init.)=1.99975 s t(norm)=0.423802, mflops=11.798 (err=1.2e-15) 5. PDA: elapsed time t=1.64654 s, 2 iters, t-(init.)=1.59304 s t(norm)=0.168805, mflops=29.62 (err=1.3e-15) 6. PDA (f2c): elapsed time t=1.06577 s, 1 iters, t-(init.)=1.03903 s t(norm)=0.220198, mflops=22.7068 (err=1.3e-15) 7. Singleton: elapsed time t=1.21417 s, 1 iters, t-(init.)=1.18743 s t(norm)=0.251648, mflops=19.869 (err=1.7e-15) 8. 
Singleton (f2c): elapsed time t=1.21247 s, 1 iters, t-(init.)=1.18563 s t(norm)=0.251267, mflops=19.8991 (err=1.7e-15) 9. Temperton: elapsed time t=1.94598 s, 4 iters, t-(init.)=1.83895 s t(norm)=0.0974312, mflops=51.3183 (err=1.3e-07) 10. Temperton (f2c): elapsed time t=1.11132 s, 2 iters, t-(init.)=1.05786 s t(norm)=0.112095, mflops=44.6049 (err=1.3e-15) Top mflops for N=262144 = 57.3987 Normalized results and averages for N=262144: fft 0: mflops = 57.3987 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 54.0046 (norm. = 0.940868), norm. avg. (of 4) = 0.555183 fft 2: mflops = 52.824 (norm. = 0.9203), norm. avg. (of 4) = 0.519774 fft 3: mflops = 11.0124 (norm. = 0.191858), norm. avg. (of 5) = 0.17882 fft 4: mflops = 11.798 (norm. = 0.205544), norm. avg. (of 5) = 0.230694 fft 5: mflops = 29.62 (norm. = 0.516039), norm. avg. (of 5) = 0.250964 fft 6: mflops = 22.7068 (norm. = 0.395598), norm. avg. (of 5) = 0.177322 fft 7: mflops = 19.869 (norm. = 0.346158), norm. avg. (of 5) = 0.360174 fft 8: mflops = 19.8991 (norm. = 0.346683), norm. avg. (of 5) = 0.35234 fft 9: mflops = 51.3183 (norm. = 0.894067), norm. avg. (of 5) = 0.574296 fft 10: mflops = 44.6049 (norm. = 0.777107), norm. avg. (of 5) = 0.44805 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.78148 s, 2 iters, t-(init.)=1.67448 s t(norm)=0.0840477, mflops=59.49 (err=1.2e-15) 1. HARM: elapsed time t=1.14173 s, 1 iters, t-(init.)=1.08823 s t(norm)=0.109244, mflops=45.7693 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.16166 s, 1 iters, t-(init.)=1.10817 s t(norm)=0.111246, mflops=44.9456 (err=1.2e-15) 3. NR (C): elapsed time t=4.69657 s, 1 iters, t-(init.)=4.64298 s t(norm)=0.466094, mflops=10.7275 (err=1.3e-15) 4. NR (F): elapsed time t=4.3984 s, 1 iters, t-(init.)=4.3449 s t(norm)=0.43617, mflops=11.4634 (err=1.3e-15) 5. PDA: elapsed time t=1.61854 s, 1 iters, t-(init.)=1.56503 s t(norm)=0.157108, mflops=31.8252 (err=1.2e-15) 6. 
PDA (f2c): elapsed time t=2.14235 s, 1 iters, t-(init.)=2.08884 s t(norm)=0.209692, mflops=23.8445 (err=1.2e-15) 7. Singleton: elapsed time t=2.97703 s, 1 iters, t-(init.)=2.92325 s t(norm)=0.293455, mflops=17.0384 (err=1.7e-15) 8. Singleton (f2c): elapsed time t=2.95089 s, 1 iters, t-(init.)=2.89737 s t(norm)=0.290858, mflops=17.1905 (err=1.7e-15) 9. Temperton: elapsed time t=1.69644 s, 2 iters, t-(init.)=1.58935 s t(norm)=0.0797746, mflops=62.6766 (err=1.5e-07) 10. Temperton (f2c): elapsed time t=1.00826 s, 1 iters, t-(init.)=0.954668 s t(norm)=0.0958361, mflops=52.1724 (err=1.3e-15) Top mflops for N=524288 = 62.6766 Normalized results and averages for N=524288: fft 0: mflops = 59.49 (norm. = 0.949158), norm. avg. (of 6) = 0.991526 fft 1: mflops = 45.7693 (norm. = 0.730246), norm. avg. (of 5) = 0.590195 fft 2: mflops = 44.9456 (norm. = 0.717103), norm. avg. (of 5) = 0.55924 fft 3: mflops = 10.7275 (norm. = 0.171156), norm. avg. (of 6) = 0.177543 fft 4: mflops = 11.4634 (norm. = 0.182898), norm. avg. (of 6) = 0.222728 fft 5: mflops = 31.8252 (norm. = 0.507768), norm. avg. (of 6) = 0.293765 fft 6: mflops = 23.8445 (norm. = 0.380437), norm. avg. (of 6) = 0.211174 fft 7: mflops = 17.0384 (norm. = 0.271846), norm. avg. (of 6) = 0.345452 fft 8: mflops = 17.1905 (norm. = 0.274273), norm. avg. (of 6) = 0.339329 fft 9: mflops = 62.6766 (norm. = 1), norm. avg. (of 6) = 0.645247 fft 10: mflops = 52.1724 (norm. = 0.832407), norm. avg. (of 6) = 0.512109 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=1.54015 s, 1 iters, t-(init.)=1.43313 s t(norm)=0.068337, mflops=73.1669 (err=2.0e-15) 1. HARM: elapsed time t=2.45574 s, 1 iters, t-(init.)=2.34864 s t(norm)=0.111992, mflops=44.6461 (err=1.9e-15) 2. HARM (f2c): elapsed time t=2.29323 s, 1 iters, t-(init.)=2.18614 s t(norm)=0.104243, mflops=47.9647 (err=1.9e-15) 3. NR (C): elapsed time t=9.80558 s, 1 iters, t-(init.)=9.69855 s t(norm)=0.462463, mflops=10.8117 (err=1.9e-15) 4. 
NR (F): elapsed time t=9.19787 s, 1 iters, t-(init.)=9.09075 s t(norm)=0.433481, mflops=11.5345 (err=1.9e-15) 5. PDA: elapsed time t=3.42684 s, 1 iters, t-(init.)=3.31987 s t(norm)=0.158304, mflops=31.5849 (err=2.0e-15) 6. PDA (f2c): elapsed time t=4.35833 s, 1 iters, t-(init.)=4.25127 s t(norm)=0.202716, mflops=24.665 (err=2.0e-15) 7. Singleton: elapsed time t=5.08178 s, 1 iters, t-(init.)=4.97466 s t(norm)=0.23721, mflops=21.0783 (err=2.8e-15) 8. Singleton (f2c): elapsed time t=5.05992 s, 1 iters, t-(init.)=4.9529 s t(norm)=0.236172, mflops=21.171 (err=2.8e-15) 9. Skipping fft (Temperton can't handle dimensions > 256). 10. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 73.1669 Normalized results and averages for N=1048576: fft 0: mflops = 73.1669 (norm. = 1), norm. avg. (of 7) = 0.992737 fft 1: mflops = 44.6461 (norm. = 0.610196), norm. avg. (of 6) = 0.593529 fft 2: mflops = 47.9647 (norm. = 0.655552), norm. avg. (of 6) = 0.575292 fft 3: mflops = 10.8117 (norm. = 0.147767), norm. avg. (of 7) = 0.173289 fft 4: mflops = 11.5345 (norm. = 0.157647), norm. avg. (of 7) = 0.213431 fft 5: mflops = 31.5849 (norm. = 0.431683), norm. avg. (of 7) = 0.313467 fft 6: mflops = 24.665 (norm. = 0.337106), norm. avg. (of 7) = 0.229164 fft 7: mflops = 21.0783 (norm. = 0.288086), norm. avg. (of 7) = 0.337257 fft 8: mflops = 21.171 (norm. = 0.289352), norm. avg. (of 7) = 0.332189 fft 9: mflops = -1 (norm. = -0.0136674), norm. avg. (of 6) = 0.645247 fft 10: mflops = -1 (norm. = -0.0136674), norm. avg. (of 6) = 0.512109 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=3.67535 s, 1 iters, t-(init.)=3.4612 s t(norm)=0.0785919, mflops=63.6198 (err=7.3e-16) 1. HARM: elapsed time t=4.80854 s, 1 iters, t-(init.)=4.5944 s t(norm)=0.104323, mflops=47.9282 (err=6.9e-16) 2. HARM (f2c): elapsed time t=5.01784 s, 1 iters, t-(init.)=4.80372 s t(norm)=0.109076, mflops=45.8397 (err=6.9e-16) 3. 
NR (C): elapsed time t=22.3569 s, 1 iters, t-(init.)=22.1428 s t(norm)=0.502786, mflops=9.94458 (err=7.4e-16) 4. NR (F): elapsed time t=21.0241 s, 1 iters, t-(init.)=20.81 s t(norm)=0.472523, mflops=10.5815 (err=7.4e-16) 5. PDA: elapsed time t=6.63909 s, 1 iters, t-(init.)=6.42496 s t(norm)=0.145889, mflops=34.2727 (err=7.1e-16) 6. PDA (f2c): elapsed time t=8.67154 s, 1 iters, t-(init.)=8.45737 s t(norm)=0.192038, mflops=26.0366 (err=7.1e-16) 7. Singleton: elapsed time t=17.7288 s, 1 iters, t-(init.)=17.5146 s t(norm)=0.397697, mflops=12.5724 (err=8.4e-16) 8. Singleton (f2c): elapsed time t=18.1229 s, 1 iters, t-(init.)=17.9065 s t(norm)=0.406594, mflops=12.2973 (err=8.4e-16) 9. Temperton: elapsed time t=5.73637 s, 1 iters, t-(init.)=5.52224 s t(norm)=0.125391, mflops=39.8753 (err=1.5e-07) 10. Temperton (f2c): elapsed time t=6.21746 s, 1 iters, t-(init.)=6.00311 s t(norm)=0.13631, mflops=36.6811 (err=7.4e-16) Top mflops for N=2097152 = 63.6198 Normalized results and averages for N=2097152: fft 0: mflops = 63.6198 (norm. = 1), norm. avg. (of 8) = 0.993645 fft 1: mflops = 47.9282 (norm. = 0.753353), norm. avg. (of 7) = 0.616361 fft 2: mflops = 45.8397 (norm. = 0.720526), norm. avg. (of 7) = 0.59604 fft 3: mflops = 9.94458 (norm. = 0.156313), norm. avg. (of 8) = 0.171167 fft 4: mflops = 10.5815 (norm. = 0.166324), norm. avg. (of 8) = 0.207543 fft 5: mflops = 34.2727 (norm. = 0.538712), norm. avg. (of 8) = 0.341623 fft 6: mflops = 26.0366 (norm. = 0.409253), norm. avg. (of 8) = 0.251676 fft 7: mflops = 12.5724 (norm. = 0.197618), norm. avg. (of 8) = 0.319802 fft 8: mflops = 12.2973 (norm. = 0.193293), norm. avg. (of 8) = 0.314827 fft 9: mflops = 39.8753 (norm. = 0.626776), norm. avg. (of 7) = 0.642608 fft 10: mflops = 36.6811 (norm. = 0.576568), norm. avg. (of 7) = 0.521318 Benchmarking for array size = 512x128x64 (power of 2): 0. FFTW: elapsed time t=8.09672 s, 1 iters, t-(init.)=7.66846 s t(norm)=0.0831047, mflops=60.1651 (err=1.3e-15) 1. 
HARM: elapsed time t=11.9251 s, 1 iters, t-(init.)=11.4969 s t(norm)=0.124594, mflops=40.1304 (err=1.2e-15) 2. HARM (f2c): elapsed time t=11.5623 s, 1 iters, t-(init.)=11.1302 s t(norm)=0.12062, mflops=41.4524 (err=1.2e-15) 3. NR (C): elapsed time t=48.1555 s, 1 iters, t-(init.)=47.7272 s t(norm)=0.51723, mflops=9.66688 (err=1.3e-15) 4. NR (F): elapsed time t=45.4796 s, 1 iters, t-(init.)=45.0513 s t(norm)=0.48823, mflops=10.2411 (err=1.3e-15) 5. PDA: elapsed time t=14.5187 s, 1 iters, t-(init.)=14.0904 s t(norm)=0.1527, mflops=32.7439 (err=1.3e-15) 6. PDA (f2c): elapsed time t=18.6866 s, 1 iters, t-(init.)=18.2584 s t(norm)=0.19787, mflops=25.2692 (err=1.3e-15) 7. Singleton: elapsed time t=32.9716 s, 1 iters, t-(init.)=32.5434 s t(norm)=0.352679, mflops=14.1772 (err=1.6e-15) 8. Singleton (f2c): elapsed time t=33.3941 s, 1 iters, t-(init.)=32.9658 s t(norm)=0.357257, mflops=13.9955 (err=1.6e-15) 9. Skipping fft (Temperton can't handle dimensions > 256). 10. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=4194304 = 60.1651 Normalized results and averages for N=4194304: fft 0: mflops = 60.1651 (norm. = 1), norm. avg. (of 9) = 0.994351 fft 1: mflops = 40.1304 (norm. = 0.667005), norm. avg. (of 8) = 0.622691 fft 2: mflops = 41.4524 (norm. = 0.688978), norm. avg. (of 8) = 0.607657 fft 3: mflops = 9.66688 (norm. = 0.160673), norm. avg. (of 9) = 0.170001 fft 4: mflops = 10.2411 (norm. = 0.170216), norm. avg. (of 9) = 0.203395 fft 5: mflops = 32.7439 (norm. = 0.544234), norm. avg. (of 9) = 0.364135 fft 6: mflops = 25.2692 (norm. = 0.419997), norm. avg. (of 9) = 0.270378 fft 7: mflops = 14.1772 (norm. = 0.235638), norm. avg. (of 9) = 0.310451 fft 8: mflops = 13.9955 (norm. = 0.232618), norm. avg. (of 9) = 0.305693 fft 9: mflops = -1 (norm. = -0.0166209), norm. avg. (of 7) = 0.642608 fft 10: mflops = -1 (norm. = -0.0166209), norm. avg. (of 7) = 0.521318 Benchmarking for array size = 256x128x256 (power of 2): 0. 
FFTW: elapsed time t=16.5805 s, 1 iters, t-(init.)=15.7238 s t(norm)=0.0814965, mflops=61.3524 (err=1.5e-15) 1. HARM: elapsed time t=22.4545 s, 1 iters, t-(init.)=21.598 s t(norm)=0.111943, mflops=44.6657 (err=1.4e-15) 2. HARM (f2c): elapsed time t=21.592 s, 1 iters, t-(init.)=20.7355 s t(norm)=0.107472, mflops=46.5236 (err=1.4e-15) 3. NR (C): elapsed time t=102.752 s, 1 iters, t-(init.)=101.892 s t(norm)=0.528106, mflops=9.4678 (err=1.5e-15) 4. NR (F): elapsed time t=97.2317 s, 1 iters, t-(init.)=96.3752 s t(norm)=0.499514, mflops=10.0097 (err=1.5e-15) 5. PDA: elapsed time t=33.0037 s, 1 iters, t-(init.)=32.1469 s t(norm)=0.166618, mflops=30.0088 (err=1.4e-15) 6. PDA (f2c): elapsed time t=41.1887 s, 1 iters, t-(init.)=40.3279 s t(norm)=0.20902, mflops=23.9212 (err=1.4e-15) 7. Singleton: elapsed time t=67.8224 s, 1 iters, t-(init.)=66.9659 s t(norm)=0.347085, mflops=14.4057 (err=2.0e-15) 8. Singleton (f2c): elapsed time t=68.8073 s, 1 iters, t-(init.)=67.9508 s t(norm)=0.35219, mflops=14.1969 (err=2.0e-15) 9. Temperton: elapsed time t=27.49 s, 1 iters, t-(init.)=26.6333 s t(norm)=0.138041, mflops=36.2212 (err=1.9e-07) 10. Temperton (f2c): elapsed time t=29.5758 s, 1 iters, t-(init.)=28.715 s t(norm)=0.14883, mflops=33.5954 (err=1.5e-15) Top mflops for N=8388608 = 61.3524 Normalized results and averages for N=8388608: fft 0: mflops = 61.3524 (norm. = 1), norm. avg. (of 10) = 0.994916 fft 1: mflops = 44.6657 (norm. = 0.72802), norm. avg. (of 9) = 0.634395 fft 2: mflops = 46.5236 (norm. = 0.758302), norm. avg. (of 9) = 0.624395 fft 3: mflops = 9.4678 (norm. = 0.154318), norm. avg. (of 10) = 0.168433 fft 4: mflops = 10.0097 (norm. = 0.163151), norm. avg. (of 10) = 0.199371 fft 5: mflops = 30.0088 (norm. = 0.489122), norm. avg. (of 10) = 0.376634 fft 6: mflops = 23.9212 (norm. = 0.389898), norm. avg. (of 10) = 0.28233 fft 7: mflops = 14.4057 (norm. = 0.234803), norm. avg. (of 10) = 0.302886 fft 8: mflops = 14.1969 (norm. = 0.231399), norm. avg. 
(of 10) = 0.298264 fft 9: mflops = 36.2212 (norm. = 0.590379), norm. avg. (of 8) = 0.636079 fft 10: mflops = 33.5954 (norm. = 0.547581), norm. avg. (of 8) = 0.5246 Benchmarking for array size = 256x256x256 (power of 2): 0. FFTW: elapsed time t=40.9607 s, 1 iters, t-(init.)=39.2434 s t(norm)=0.0974619, mflops=51.3021 (err=1.7e-15) 1. HARM: elapsed time t=46.5617 s, 1 iters, t-(init.)=44.8446 s t(norm)=0.111373, mflops=44.8943 (err=1.7e-15) 2. HARM (f2c): elapsed time t=43.9605 s, 1 iters, t-(init.)=42.2474 s t(norm)=0.104922, mflops=47.6542 (err=1.7e-15) 3. NR (C): elapsed time t=219.311 s, 1 iters, t-(init.)=217.594 s t(norm)=0.5404, mflops=9.25241 (err=1.8e-15) 4. NR (F): elapsed time t=207.949 s, 1 iters, t-(init.)=206.235 s t(norm)=0.512191, mflops=9.76198 (err=1.8e-15) 5. PDA: elapsed time t=71.7742 s, 1 iters, t-(init.)=70.061 s t(norm)=0.173998, mflops=28.7359 (err=1.7e-15) 6. PDA (f2c): elapsed time t=88.7749 s, 1 iters, t-(init.)=87.0615 s t(norm)=0.21622, mflops=23.1246 (err=1.7e-15) 7. Singleton: elapsed time t=127.961 s, 1 iters, t-(init.)=126.248 s t(norm)=0.31354, mflops=15.9469 (err=2.4e-15) 8. Singleton (f2c): elapsed time t=129.247 s, 1 iters, t-(init.)=127.534 s t(norm)=0.316734, mflops=15.7861 (err=2.4e-15) 9. Temperton: elapsed time t=58.1514 s, 1 iters, t-(init.)=56.4383 s t(norm)=0.140166, mflops=35.672 (err=2.1e-07) 10. Temperton (f2c): elapsed time t=62.5499 s, 1 iters, t-(init.)=60.8366 s t(norm)=0.151089, mflops=33.093 (err=1.7e-15) Top mflops for N=16777216 = 51.3021 Normalized results and averages for N=16777216: fft 0: mflops = 51.3021 (norm. = 1), norm. avg. (of 11) = 0.995378 fft 1: mflops = 44.8943 (norm. = 0.875098), norm. avg. (of 10) = 0.658465 fft 2: mflops = 47.6542 (norm. = 0.928895), norm. avg. (of 10) = 0.654845 fft 3: mflops = 9.25241 (norm. = 0.180352), norm. avg. (of 11) = 0.169516 fft 4: mflops = 9.76198 (norm. = 0.190284), norm. avg. (of 11) = 0.198545 fft 5: mflops = 28.7359 (norm. = 0.560131), norm. avg. 
(of 11) = 0.393315
fft 6: mflops = 23.1246 (norm. = 0.450755), norm. avg. (of 11) = 0.297641
fft 7: mflops = 15.9469 (norm. = 0.310844), norm. avg. (of 11) = 0.303609
fft 8: mflops = 15.7861 (norm. = 0.307709), norm. avg. (of 11) = 0.299122
fft 9: mflops = 35.672 (norm. = 0.695332), norm. avg. (of 9) = 0.642663
fft 10: mflops = 33.093 (norm. = 0.645062), norm. avg. (of 9) = 0.537985
------------------------------------------------------
@@@@ bench.3d.np2.log
Benchmarking for sizes:
5x5x5 (0.0022583 MB)
6x6x6 (0.00369263 MB)
7x7x7 (0.00567627 MB)
9x9x9 (0.0116577 MB)
10x10x10 (0.0158386 MB)
11x11x11 (0.0209351 MB)
12x12x12 (0.0270386 MB)
13x13x13 (0.0342407 MB)
14x14x14 (0.0426331 MB)
15x15x15 (0.0523071 MB)
24x25x28 (0.257751 MB)
48x48x48 (1.68982 MB)
49x49x49 (1.79755 MB)
60x60x60 (3.29877 MB)
72x60x56 (3.69482 MB)
75x75x75 (6.44086 MB)
80x80x80 (7.81628 MB)
84x84x84 (9.04791 MB)
96x96x96 (13.5045 MB)
105x105x105 (17.6689 MB)
112x112x112 (21.4427 MB)
120x120x120 (26.3728 MB)
144x144x144 (45.5692 MB)
180x180x180 (88.9976 MB)
240x240x240 (210.949 MB)
Maximum array size N = 13824000
Benchmarking FFTs:
0. FFTW
1. PDA
2. PDA (f2c)
3. Singleton
4. Singleton (f2c)
5. Temperton
6. Temperton (f2c)
Computing normalized averages (7 transforms).
Benchmarking for array size = 5x5x5:
0. FFTW: elapsed time t=1.42875 s, 32768 iters, t-(init.)=1.3638 s t(norm)=0.0477991, mflops=104.605 (err=3.0e-16)
1. PDA: elapsed time t=1.75114 s, 8192 iters, t-(init.)=1.73495 s t(norm)=0.24323, mflops=20.5567 (err=2.3e-16)
2. PDA (f2c): elapsed time t=1.30034 s, 4096 iters, t-(init.)=1.2922 s t(norm)=0.362318, mflops=13.8 (err=2.3e-16)
3. Singleton: elapsed time t=1.46424 s, 32768 iters, t-(init.)=1.39832 s t(norm)=0.0490089, mflops=102.022 (err=3.1e-16)
4. Singleton (f2c): elapsed time t=1.33615 s, 32768 iters, t-(init.)=1.27118 s t(norm)=0.0445529, mflops=112.226 (err=3.1e-16)
5.
Temperton: elapsed time t=1.906 s, 32768 iters, t-(init.)=1.83929 s t(norm)=0.0644645, mflops=77.562 (err=5.3e-16) 6. Temperton (f2c): elapsed time t=1.39086 s, 16384 iters, t-(init.)=1.35751 s t(norm)=0.0951575, mflops=52.5445 (err=2.4e-16) Top mflops for N=125 = 112.226 Normalized results and averages for N=125: fft 0: mflops = 104.605 (norm. = 0.932087), norm. avg. (of 1) = 0.932087 fft 1: mflops = 20.5567 (norm. = 0.183172), norm. avg. (of 1) = 0.183172 fft 2: mflops = 13.8 (norm. = 0.122966), norm. avg. (of 1) = 0.122966 fft 3: mflops = 102.022 (norm. = 0.909077), norm. avg. (of 1) = 0.909077 fft 4: mflops = 112.226 (norm. = 1), norm. avg. (of 1) = 1 fft 5: mflops = 77.562 (norm. = 0.691122), norm. avg. (of 1) = 0.691122 fft 6: mflops = 52.5445 (norm. = 0.468201), norm. avg. (of 1) = 0.468201 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.81465 s, 32768 iters, t-(init.)=1.70521 s t(norm)=0.031067, mflops=160.943 (err=2.9e-16) 1. PDA: elapsed time t=1.67985 s, 4096 iters, t-(init.)=1.66622 s t(norm)=0.242853, mflops=20.5885 (err=3.6e-16) 2. PDA (f2c): elapsed time t=1.10831 s, 2048 iters, t-(init.)=1.1015 s t(norm)=0.321087, mflops=15.5721 (err=3.6e-16) 3. Singleton: elapsed time t=1.21723 s, 8192 iters, t-(init.)=1.18972 s t(norm)=0.0867014, mflops=57.6692 (err=2.9e-16) 4. Singleton (f2c): elapsed time t=1.30068 s, 8192 iters, t-(init.)=1.27332 s t(norm)=0.0927939, mflops=53.8829 (err=2.9e-16) 5. Temperton: elapsed time t=1.9853 s, 16384 iters, t-(init.)=1.92981 s t(norm)=0.0703179, mflops=71.1057 (err=2.1e-15) 6. Temperton (f2c): elapsed time t=1.31295 s, 8192 iters, t-(init.)=1.28528 s t(norm)=0.0936653, mflops=53.3816 (err=3.1e-16) Top mflops for N=216 = 160.943 Normalized results and averages for N=216: fft 0: mflops = 160.943 (norm. = 1), norm. avg. (of 2) = 0.966043 fft 1: mflops = 20.5885 (norm. = 0.127925), norm. avg. (of 2) = 0.155548 fft 2: mflops = 15.5721 (norm. = 0.0967555), norm. avg. 
(of 2) = 0.109861 fft 3: mflops = 57.6692 (norm. = 0.358321), norm. avg. (of 2) = 0.633699 fft 4: mflops = 53.8829 (norm. = 0.334796), norm. avg. (of 2) = 0.667398 fft 5: mflops = 71.1057 (norm. = 0.441808), norm. avg. (of 2) = 0.566465 fft 6: mflops = 53.3816 (norm. = 0.331681), norm. avg. (of 2) = 0.399941 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.05838 s, 8192 iters, t-(init.)=1.01557 s t(norm)=0.0429147, mflops=116.51 (err=3.8e-16) 1. PDA: elapsed time t=1.32597 s, 1024 iters, t-(init.)=1.32061 s t(norm)=0.44644, mflops=11.1997 (err=4.5e-16) 2. PDA (f2c): elapsed time t=1.85868 s, 1024 iters, t-(init.)=1.85333 s t(norm)=0.626528, mflops=7.98049 (err=4.5e-16) 3. Singleton: elapsed time t=1.14206 s, 4096 iters, t-(init.)=1.12048 s t(norm)=0.0946959, mflops=52.8006 (err=5.8e-16) 4. Singleton (f2c): elapsed time t=1.18527 s, 4096 iters, t-(init.)=1.16387 s t(norm)=0.0983627, mflops=50.8323 (err=5.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 116.51 Normalized results and averages for N=343: fft 0: mflops = 116.51 (norm. = 1), norm. avg. (of 3) = 0.977362 fft 1: mflops = 11.1997 (norm. = 0.0961265), norm. avg. (of 3) = 0.135741 fft 2: mflops = 7.98049 (norm. = 0.0684961), norm. avg. (of 3) = 0.0960726 fft 3: mflops = 52.8006 (norm. = 0.453185), norm. avg. (of 3) = 0.573528 fft 4: mflops = 50.8323 (norm. = 0.436291), norm. avg. (of 3) = 0.590362 fft 5: mflops = -1 (norm. = -0.00858295), norm. avg. (of 2) = 0.566465 fft 6: mflops = -1 (norm. = -0.00858295), norm. avg. (of 2) = 0.399941 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.84373 s, 8192 iters, t-(init.)=1.75384 s t(norm)=0.0308817, mflops=161.908 (err=5.3e-16) 1. PDA: elapsed time t=1.06143 s, 1024 iters, t-(init.)=1.0502 s t(norm)=0.147936, mflops=33.7985 (err=4.9e-16) 2. 
PDA (f2c): elapsed time t=1.5995 s, 1024 iters, t-(init.)=1.58827 s t(norm)=0.223731, mflops=22.3483 (err=4.9e-16) 3. Singleton: elapsed time t=1.19325 s, 2048 iters, t-(init.)=1.17068 s t(norm)=0.0824534, mflops=60.6403 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.13101 s, 2048 iters, t-(init.)=1.10854 s t(norm)=0.0780771, mflops=64.0393 (err=4.5e-16) 5. Temperton: elapsed time t=1.2216 s, 4096 iters, t-(init.)=1.17654 s t(norm)=0.0414333, mflops=120.676 (err=6.0e-08) 6. Temperton (f2c): elapsed time t=1.53345 s, 4096 iters, t-(init.)=1.48823 s t(norm)=0.0524097, mflops=95.4022 (err=5.1e-16) Top mflops for N=729 = 161.908 Normalized results and averages for N=729: fft 0: mflops = 161.908 (norm. = 1), norm. avg. (of 4) = 0.983022 fft 1: mflops = 33.7985 (norm. = 0.208751), norm. avg. (of 4) = 0.153994 fft 2: mflops = 22.3483 (norm. = 0.138031), norm. avg. (of 4) = 0.106562 fft 3: mflops = 60.6403 (norm. = 0.374536), norm. avg. (of 4) = 0.52378 fft 4: mflops = 64.0393 (norm. = 0.395529), norm. avg. (of 4) = 0.541654 fft 5: mflops = 120.676 (norm. = 0.745336), norm. avg. (of 3) = 0.626089 fft 6: mflops = 95.4022 (norm. = 0.589237), norm. avg. (of 3) = 0.46304 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.27002 s, 4096 iters, t-(init.)=1.20855 s t(norm)=0.029607, mflops=168.879 (err=4.0e-16) 1. PDA: elapsed time t=1.49158 s, 1024 iters, t-(init.)=1.47623 s t(norm)=0.144658, mflops=34.5643 (err=4.3e-16) 2. PDA (f2c): elapsed time t=1.1474 s, 512 iters, t-(init.)=1.13972 s t(norm)=0.223365, mflops=22.3849 (err=4.3e-16) 3. Singleton: elapsed time t=1.44155 s, 2048 iters, t-(init.)=1.41071 s t(norm)=0.069119, mflops=72.339 (err=4.6e-16) 4. Singleton (f2c): elapsed time t=1.45551 s, 2048 iters, t-(init.)=1.42477 s t(norm)=0.0698079, mflops=71.6251 (err=4.6e-16) 5. Temperton: elapsed time t=1.63171 s, 4096 iters, t-(init.)=1.57007 s t(norm)=0.0384634, mflops=129.994 (err=6.3e-16) 6. 
Temperton (f2c): elapsed time t=1.41304 s, 2048 iters, t-(init.)=1.38226 s t(norm)=0.067725, mflops=73.828 (err=3.4e-16) Top mflops for N=1000 = 168.879 Normalized results and averages for N=1000: fft 0: mflops = 168.879 (norm. = 1), norm. avg. (of 5) = 0.986417 fft 1: mflops = 34.5643 (norm. = 0.204669), norm. avg. (of 5) = 0.164129 fft 2: mflops = 22.3849 (norm. = 0.13255), norm. avg. (of 5) = 0.11176 fft 3: mflops = 72.339 (norm. = 0.428348), norm. avg. (of 5) = 0.504693 fft 4: mflops = 71.6251 (norm. = 0.424121), norm. avg. (of 5) = 0.518147 fft 5: mflops = 129.994 (norm. = 0.769743), norm. avg. (of 4) = 0.662002 fft 6: mflops = 73.828 (norm. = 0.437164), norm. avg. (of 4) = 0.456571 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.06849 s, 1024 iters, t-(init.)=1.04809 s t(norm)=0.0740959, mflops=67.4802 (err=4.3e-16) 1. PDA: elapsed time t=1.55061 s, 256 iters, t-(init.)=1.54551 s t(norm)=0.437046, mflops=11.4404 (err=5.6e-16) 2. PDA (f2c): elapsed time t=1.06949 s, 128 iters, t-(init.)=1.06694 s t(norm)=0.603427, mflops=8.28601 (err=5.6e-16) 3. Singleton: elapsed time t=1.26364 s, 1024 iters, t-(init.)=1.24321 s t(norm)=0.08789, mflops=56.8893 (err=6.6e-16) 4. Singleton (f2c): elapsed time t=1.31604 s, 1024 iters, t-(init.)=1.29564 s t(norm)=0.0915965, mflops=54.5873 (err=6.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 67.4802 Normalized results and averages for N=1331: fft 0: mflops = 67.4802 (norm. = 1), norm. avg. (of 6) = 0.988681 fft 1: mflops = 11.4404 (norm. = 0.169538), norm. avg. (of 6) = 0.16503 fft 2: mflops = 8.28601 (norm. = 0.122792), norm. avg. (of 6) = 0.113598 fft 3: mflops = 56.8893 (norm. = 0.843052), norm. avg. (of 6) = 0.561086 fft 4: mflops = 54.5873 (norm. = 0.808938), norm. avg. (of 6) = 0.566612 fft 5: mflops = -1 (norm. = -0.0148192), norm. avg. (of 4) = 0.662002 fft 6: mflops = -1 (norm. = -0.0148192), norm. avg. 
(of 4) = 0.456571 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=2.00189 s, 4096 iters, t-(init.)=1.8961 s t(norm)=0.0249087, mflops=200.733 (err=3.9e-16) 1. PDA: elapsed time t=1.0016 s, 512 iters, t-(init.)=0.988376 s t(norm)=0.103873, mflops=48.1357 (err=3.8e-16) 2. PDA (f2c): elapsed time t=1.68211 s, 512 iters, t-(init.)=1.66889 s t(norm)=0.175391, mflops=28.5077 (err=3.8e-16) 3. Singleton: elapsed time t=1.7296 s, 1024 iters, t-(init.)=1.70311 s t(norm)=0.0894939, mflops=55.8697 (err=4.0e-16) 4. Singleton (f2c): elapsed time t=1.74224 s, 1024 iters, t-(init.)=1.71579 s t(norm)=0.0901603, mflops=55.4568 (err=4.0e-16) 5. Temperton: elapsed time t=1.51348 s, 2048 iters, t-(init.)=1.46054 s t(norm)=0.0383738, mflops=130.297 (err=1.8e-15) 6. Temperton (f2c): elapsed time t=1.58288 s, 1024 iters, t-(init.)=1.55641 s t(norm)=0.0817851, mflops=61.1358 (err=3.9e-16) Top mflops for N=1728 = 200.733 Normalized results and averages for N=1728: fft 0: mflops = 200.733 (norm. = 1), norm. avg. (of 7) = 0.990298 fft 1: mflops = 48.1357 (norm. = 0.2398), norm. avg. (of 7) = 0.175712 fft 2: mflops = 28.5077 (norm. = 0.142018), norm. avg. (of 7) = 0.117658 fft 3: mflops = 55.8697 (norm. = 0.278329), norm. avg. (of 7) = 0.520692 fft 4: mflops = 55.4568 (norm. = 0.276272), norm. avg. (of 7) = 0.525135 fft 5: mflops = 130.297 (norm. = 0.649108), norm. avg. (of 5) = 0.659424 fft 6: mflops = 61.1358 (norm. = 0.304563), norm. avg. (of 5) = 0.426169 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.01873 s, 512 iters, t-(init.)=1.00193 s t(norm)=0.0802344, mflops=62.3174 (err=4.5e-16) 1. PDA: elapsed time t=1.38843 s, 128 iters, t-(init.)=1.38423 s t(norm)=0.443396, mflops=11.2766 (err=9.0e-16) 2. PDA (f2c): elapsed time t=1.91721 s, 128 iters, t-(init.)=1.91301 s t(norm)=0.612777, mflops=8.15958 (err=9.0e-16) 3. Singleton: elapsed time t=1.21504 s, 512 iters, t-(init.)=1.19821 s t(norm)=0.095953, mflops=52.1088 (err=7.7e-16) 4. 
Singleton (f2c): elapsed time t=1.2069 s, 512 iters, t-(init.)=1.19007 s t(norm)=0.0953013, mflops=52.4652 (err=7.7e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 62.3174 Normalized results and averages for N=2197: fft 0: mflops = 62.3174 (norm. = 1), norm. avg. (of 8) = 0.991511 fft 1: mflops = 11.2766 (norm. = 0.180954), norm. avg. (of 8) = 0.176367 fft 2: mflops = 8.15958 (norm. = 0.130936), norm. avg. (of 8) = 0.119318 fft 3: mflops = 52.1088 (norm. = 0.836185), norm. avg. (of 8) = 0.560129 fft 4: mflops = 52.4652 (norm. = 0.841903), norm. avg. (of 8) = 0.564731 fft 5: mflops = -1 (norm. = -0.0160469), norm. avg. (of 5) = 0.659424 fft 6: mflops = -1 (norm. = -0.0160469), norm. avg. (of 5) = 0.426169 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.50943 s, 1024 iters, t-(init.)=1.4675 s t(norm)=0.0457244, mflops=109.351 (err=4.1e-16) 1. PDA: elapsed time t=1.03491 s, 128 iters, t-(init.)=1.02966 s t(norm)=0.256658, mflops=19.4812 (err=4.7e-16) 2. PDA (f2c): elapsed time t=1.39012 s, 128 iters, t-(init.)=1.38487 s t(norm)=0.345199, mflops=14.4844 (err=4.7e-16) 3. Singleton: elapsed time t=1.79517 s, 512 iters, t-(init.)=1.77414 s t(norm)=0.110558, mflops=45.2252 (err=5.6e-16) 4. Singleton (f2c): elapsed time t=1.90058 s, 512 iters, t-(init.)=1.8796 s t(norm)=0.117129, mflops=42.6878 (err=5.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 109.351 Normalized results and averages for N=2744: fft 0: mflops = 109.351 (norm. = 1), norm. avg. (of 9) = 0.992454 fft 1: mflops = 19.4812 (norm. = 0.178153), norm. avg. (of 9) = 0.176565 fft 2: mflops = 14.4844 (norm. = 0.132458), norm. avg. (of 9) = 0.120778 fft 3: mflops = 45.2252 (norm. = 0.413579), norm. avg. (of 9) = 0.543846 fft 4: mflops = 42.6878 (norm. = 0.390375), norm. avg. 
(of 9) = 0.545358 fft 5: mflops = -1 (norm. = -0.00914488), norm. avg. (of 5) = 0.659424 fft 6: mflops = -1 (norm. = -0.00914488), norm. avg. (of 5) = 0.426169 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.3074 s, 1024 iters, t-(init.)=1.25585 s t(norm)=0.0310035, mflops=161.272 (err=3.9e-16) 1. PDA: elapsed time t=1.0244 s, 256 iters, t-(init.)=1.0115 s t(norm)=0.0998849, mflops=50.0576 (err=4.4e-16) 2. PDA (f2c): elapsed time t=1.74545 s, 256 iters, t-(init.)=1.73255 s t(norm)=0.171088, mflops=29.2247 (err=4.4e-16) 3. Singleton: elapsed time t=1.79862 s, 512 iters, t-(init.)=1.77281 s t(norm)=0.0875318, mflops=57.1221 (err=5.0e-16) 4. Singleton (f2c): elapsed time t=1.7858 s, 512 iters, t-(init.)=1.75997 s t(norm)=0.086898, mflops=57.5387 (err=5.0e-16) 5. Temperton: elapsed time t=1.62565 s, 1024 iters, t-(init.)=1.57407 s t(norm)=0.0388596, mflops=128.668 (err=1.9e-15) 6. Temperton (f2c): elapsed time t=1.26379 s, 512 iters, t-(init.)=1.23764 s t(norm)=0.0611078, mflops=81.8227 (err=4.1e-16) Top mflops for N=3375 = 161.272 Normalized results and averages for N=3375: fft 0: mflops = 161.272 (norm. = 1), norm. avg. (of 10) = 0.993209 fft 1: mflops = 50.0576 (norm. = 0.310393), norm. avg. (of 10) = 0.189948 fft 2: mflops = 29.2247 (norm. = 0.181214), norm. avg. (of 10) = 0.126822 fft 3: mflops = 57.1221 (norm. = 0.354197), norm. avg. (of 10) = 0.524881 fft 4: mflops = 57.5387 (norm. = 0.356781), norm. avg. (of 10) = 0.5265 fft 5: mflops = 128.668 (norm. = 0.797836), norm. avg. (of 6) = 0.682492 fft 6: mflops = 81.8227 (norm. = 0.507358), norm. avg. (of 6) = 0.439701 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.91033 s, 128 iters, t-(init.)=1.85709 s t(norm)=0.0615269, mflops=81.2652 (err=4.7e-16) 1. PDA: elapsed time t=1.69834 s, 64 iters, t-(init.)=1.67172 s t(norm)=0.110771, mflops=45.1383 (err=4.7e-16) 2. PDA (f2c): elapsed time t=1.25036 s, 32 iters, t-(init.)=1.23658 s t(norm)=0.163875, mflops=30.511 (err=4.7e-16) 3. 
Singleton: elapsed time t=1.60777 s, 64 iters, t-(init.)=1.58121 s t(norm)=0.104773, mflops=47.7221 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.59462 s, 64 iters, t-(init.)=1.56806 s t(norm)=0.103902, mflops=48.1221 (err=5.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 81.2652 Normalized results and averages for N=16800: fft 0: mflops = 81.2652 (norm. = 1), norm. avg. (of 11) = 0.993826 fft 1: mflops = 45.1383 (norm. = 0.555445), norm. avg. (of 11) = 0.223175 fft 2: mflops = 30.511 (norm. = 0.37545), norm. avg. (of 11) = 0.149424 fft 3: mflops = 47.7221 (norm. = 0.587239), norm. avg. (of 11) = 0.53055 fft 4: mflops = 48.1221 (norm. = 0.592161), norm. avg. (of 11) = 0.53247 fft 5: mflops = -1 (norm. = -0.0123054), norm. avg. (of 6) = 0.682492 fft 6: mflops = -1 (norm. = -0.0123054), norm. avg. (of 6) = 0.439701 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.87015 s, 16 iters, t-(init.)=1.69008 s t(norm)=0.0570061, mflops=87.7099 (err=6.6e-16) 1. PDA: elapsed time t=1.275 s, 4 iters, t-(init.)=1.23046 s t(norm)=0.166013, mflops=30.1181 (err=6.2e-16) 2. PDA (f2c): elapsed time t=1.67051 s, 4 iters, t-(init.)=1.62597 s t(norm)=0.219375, mflops=22.792 (err=6.2e-16) 3. Singleton: elapsed time t=1.79257 s, 4 iters, t-(init.)=1.74771 s t(norm)=0.235801, mflops=21.2044 (err=6.5e-16) 4. Singleton (f2c): elapsed time t=1.7948 s, 4 iters, t-(init.)=1.74995 s t(norm)=0.236103, mflops=21.1772 (err=6.5e-16) 5. Temperton: elapsed time t=1.26673 s, 8 iters, t-(init.)=1.17739 s t(norm)=0.0794267, mflops=62.9511 (err=1.0e-07) 6. Temperton (f2c): elapsed time t=1.4924 s, 8 iters, t-(init.)=1.40303 s t(norm)=0.0946483, mflops=52.8271 (err=7.0e-16) Top mflops for N=110592 = 87.7099 Normalized results and averages for N=110592: fft 0: mflops = 87.7099 (norm. = 1), norm. avg. (of 12) = 0.994341 fft 1: mflops = 30.1181 (norm. = 0.343383), norm. avg. 
(of 12) = 0.233192 fft 2: mflops = 22.792 (norm. = 0.259857), norm. avg. (of 12) = 0.158627 fft 3: mflops = 21.2044 (norm. = 0.241755), norm. avg. (of 12) = 0.506484 fft 4: mflops = 21.1772 (norm. = 0.241446), norm. avg. (of 12) = 0.508218 fft 5: mflops = 62.9511 (norm. = 0.717719), norm. avg. (of 7) = 0.687525 fft 6: mflops = 52.8271 (norm. = 0.602294), norm. avg. (of 7) = 0.462929 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.31376 s, 8 iters, t-(init.)=1.2187 s t(norm)=0.0768721, mflops=65.0431 (err=6.5e-16) 1. PDA: elapsed time t=1.95118 s, 4 iters, t-(init.)=1.90369 s t(norm)=0.240159, mflops=20.8195 (err=7.4e-16) 2. PDA (f2c): elapsed time t=1.40315 s, 2 iters, t-(init.)=1.37988 s t(norm)=0.348158, mflops=14.3613 (err=7.4e-16) 3. Singleton: elapsed time t=1.7216 s, 4 iters, t-(init.)=1.67362 s t(norm)=0.211134, mflops=23.6816 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.72814 s, 4 iters, t-(init.)=1.68016 s t(norm)=0.21196, mflops=23.5893 (err=1.0e-15) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 65.0431 Normalized results and averages for N=117649: fft 0: mflops = 65.0431 (norm. = 1), norm. avg. (of 13) = 0.994776 fft 1: mflops = 20.8195 (norm. = 0.320088), norm. avg. (of 13) = 0.239877 fft 2: mflops = 14.3613 (norm. = 0.220797), norm. avg. (of 13) = 0.163409 fft 3: mflops = 23.6816 (norm. = 0.364091), norm. avg. (of 13) = 0.49553 fft 4: mflops = 23.5893 (norm. = 0.362672), norm. avg. (of 13) = 0.497022 fft 5: mflops = -1 (norm. = -0.0153744), norm. avg. (of 7) = 0.687525 fft 6: mflops = -1 (norm. = -0.0153744), norm. avg. (of 7) = 0.462929 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.89324 s, 8 iters, t-(init.)=1.717 s t(norm)=0.0560721, mflops=89.1708 (err=7.3e-16) 1. PDA: elapsed time t=1.21871 s, 2 iters, t-(init.)=1.17488 s t(norm)=0.153472, mflops=32.5793 (err=7.4e-16) 2. 
PDA (f2c): elapsed time t=1.59121 s, 2 iters, t-(init.)=1.54746 s t(norm)=0.202142, mflops=24.7351 (err=7.4e-16) 3. Singleton: elapsed time t=1.12715 s, 1 iters, t-(init.)=1.10608 s t(norm)=0.288971, mflops=17.3028 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.13936 s, 1 iters, t-(init.)=1.11836 s t(norm)=0.292179, mflops=17.1128 (err=1.0e-15) 5. Temperton: elapsed time t=1.16227 s, 4 iters, t-(init.)=1.0743 s t(norm)=0.0701665, mflops=71.2591 (err=2.0e-15) 6. Temperton (f2c): elapsed time t=1.70066 s, 4 iters, t-(init.)=1.61266 s t(norm)=0.105329, mflops=47.4701 (err=7.1e-16) Top mflops for N=216000 = 89.1708 Normalized results and averages for N=216000: fft 0: mflops = 89.1708 (norm. = 1), norm. avg. (of 14) = 0.995149 fft 1: mflops = 32.5793 (norm. = 0.365358), norm. avg. (of 14) = 0.24884 fft 2: mflops = 24.7351 (norm. = 0.27739), norm. avg. (of 14) = 0.171551 fft 3: mflops = 17.3028 (norm. = 0.194041), norm. avg. (of 14) = 0.473995 fft 4: mflops = 17.1128 (norm. = 0.19191), norm. avg. (of 14) = 0.475228 fft 5: mflops = 71.2591 (norm. = 0.79913), norm. avg. (of 8) = 0.701475 fft 6: mflops = 47.4701 (norm. = 0.53235), norm. avg. (of 8) = 0.471606 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.2144 s, 4 iters, t-(init.)=1.11596 s t(norm)=0.0644834, mflops=77.5393 (err=7.3e-16) 1. PDA: elapsed time t=1.44487 s, 2 iters, t-(init.)=1.39599 s t(norm)=0.161329, mflops=30.9925 (err=7.8e-16) 2. PDA (f2c): elapsed time t=1.95706 s, 2 iters, t-(init.)=1.90818 s t(norm)=0.22052, mflops=22.6736 (err=7.8e-16) 3. Singleton: elapsed time t=1.40464 s, 1 iters, t-(init.)=1.37994 s t(norm)=0.318949, mflops=15.6765 (err=9.4e-16) 4. Singleton (f2c): elapsed time t=1.41462 s, 1 iters, t-(init.)=1.38993 s t(norm)=0.321257, mflops=15.5639 (err=9.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 
Top mflops for N=241920 = 77.5393 Normalized results and averages for N=241920: fft 0: mflops = 77.5393 (norm. = 1), norm. avg. (of 15) = 0.995472 fft 1: mflops = 30.9925 (norm. = 0.399701), norm. avg. (of 15) = 0.258897 fft 2: mflops = 22.6736 (norm. = 0.292415), norm. avg. (of 15) = 0.179608 fft 3: mflops = 15.6765 (norm. = 0.202175), norm. avg. (of 15) = 0.455874 fft 4: mflops = 15.5639 (norm. = 0.200722), norm. avg. (of 15) = 0.456928 fft 5: mflops = -1 (norm. = -0.0128967), norm. avg. (of 8) = 0.701475 fft 6: mflops = -1 (norm. = -0.0128967), norm. avg. (of 8) = 0.471606 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.30656 s, 2 iters, t-(init.)=1.22056 s t(norm)=0.0774141, mflops=64.5877 (err=7.0e-16) 1. PDA: elapsed time t=1.21567 s, 1 iters, t-(init.)=1.17288 s t(norm)=0.148779, mflops=33.6069 (err=7.5e-16) 2. PDA (f2c): elapsed time t=1.59576 s, 1 iters, t-(init.)=1.55297 s t(norm)=0.196994, mflops=25.3815 (err=7.5e-16) 3. Singleton: elapsed time t=2.05589 s, 1 iters, t-(init.)=2.01276 s t(norm)=0.255318, mflops=19.5835 (err=9.8e-16) 4. Singleton (f2c): elapsed time t=2.04752 s, 1 iters, t-(init.)=2.00438 s t(norm)=0.254254, mflops=19.6653 (err=9.8e-16) 5. Temperton: elapsed time t=1.10343 s, 2 iters, t-(init.)=1.01736 s t(norm)=0.0645257, mflops=77.4885 (err=1.4e-07) 6. Temperton (f2c): elapsed time t=1.35991 s, 2 iters, t-(init.)=1.27384 s t(norm)=0.0807929, mflops=61.8867 (err=8.9e-16) Top mflops for N=421875 = 77.4885 Normalized results and averages for N=421875: fft 0: mflops = 64.5877 (norm. = 0.833513), norm. avg. (of 16) = 0.98535 fft 1: mflops = 33.6069 (norm. = 0.433702), norm. avg. (of 16) = 0.269822 fft 2: mflops = 25.3815 (norm. = 0.327552), norm. avg. (of 16) = 0.188855 fft 3: mflops = 19.5835 (norm. = 0.252727), norm. avg. (of 16) = 0.443177 fft 4: mflops = 19.6653 (norm. = 0.253784), norm. avg. (of 16) = 0.444231 fft 5: mflops = 77.4885 (norm. = 1), norm. avg. (of 9) = 0.734645 fft 6: mflops = 61.8867 (norm. 
= 0.798656), norm. avg. (of 9) = 0.507945 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.65754 s, 2 iters, t-(init.)=1.55298 s t(norm)=0.0799642, mflops=62.528 (err=6.3e-16) 1. PDA: elapsed time t=1.55419 s, 1 iters, t-(init.)=1.50196 s t(norm)=0.154674, mflops=32.3261 (err=6.2e-16) 2. PDA (f2c): elapsed time t=1.97716 s, 1 iters, t-(init.)=1.92493 s t(norm)=0.198232, mflops=25.223 (err=6.2e-16) 3. Singleton: elapsed time t=2.64502 s, 1 iters, t-(init.)=2.59269 s t(norm)=0.266999, mflops=18.7267 (err=8.2e-16) 4. Singleton (f2c): elapsed time t=2.62526 s, 1 iters, t-(init.)=2.57297 s t(norm)=0.264968, mflops=18.8702 (err=8.2e-16) 5. Temperton: elapsed time t=1.74543 s, 2 iters, t-(init.)=1.64097 s t(norm)=0.084495, mflops=59.1751 (err=1.7e-07) 6. Temperton (f2c): elapsed time t=1.0708 s, 1 iters, t-(init.)=1.01847 s t(norm)=0.104883, mflops=47.6721 (err=6.6e-16) Top mflops for N=512000 = 62.528 Normalized results and averages for N=512000: fft 0: mflops = 62.528 (norm. = 1), norm. avg. (of 17) = 0.986212 fft 1: mflops = 32.3261 (norm. = 0.516986), norm. avg. (of 17) = 0.284361 fft 2: mflops = 25.223 (norm. = 0.403387), norm. avg. (of 17) = 0.201474 fft 3: mflops = 18.7267 (norm. = 0.299493), norm. avg. (of 17) = 0.434725 fft 4: mflops = 18.8702 (norm. = 0.301788), norm. avg. (of 17) = 0.435852 fft 5: mflops = 59.1751 (norm. = 0.946378), norm. avg. (of 10) = 0.755818 fft 6: mflops = 47.6721 (norm. = 0.762412), norm. avg. (of 10) = 0.533392 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.07074 s, 1 iters, t-(init.)=1.01035 s t(norm)=0.0888905, mflops=56.249 (err=7.0e-16) 1. PDA: elapsed time t=1.97334 s, 1 iters, t-(init.)=1.91297 s t(norm)=0.168302, mflops=29.7084 (err=6.8e-16) 2. PDA (f2c): elapsed time t=2.77123 s, 1 iters, t-(init.)=2.7092 s t(norm)=0.238355, mflops=20.9771 (err=6.8e-16) 3. Singleton: elapsed time t=3.75006 s, 1 iters, t-(init.)=3.6902 s t(norm)=0.324663, mflops=15.4006 (err=8.9e-16) 4. 
Singleton (f2c): elapsed time t=3.81794 s, 1 iters, t-(init.)=3.75805 s t(norm)=0.330632, mflops=15.1225 (err=8.9e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 56.249 Normalized results and averages for N=592704: fft 0: mflops = 56.249 (norm. = 1), norm. avg. (of 18) = 0.986978 fft 1: mflops = 29.7084 (norm. = 0.52816), norm. avg. (of 18) = 0.297906 fft 2: mflops = 20.9771 (norm. = 0.372934), norm. avg. (of 18) = 0.211 fft 3: mflops = 15.4006 (norm. = 0.273793), norm. avg. (of 18) = 0.425785 fft 4: mflops = 15.1225 (norm. = 0.26885), norm. avg. (of 18) = 0.426574 fft 5: mflops = -1 (norm. = -0.0177781), norm. avg. (of 10) = 0.755818 fft 6: mflops = -1 (norm. = -0.0177781), norm. avg. (of 10) = 0.533392 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=1.52057 s, 1 iters, t-(init.)=1.43019 s t(norm)=0.0818285, mflops=61.1034 (err=7.7e-16) 1. PDA: elapsed time t=2.94682 s, 1 iters, t-(init.)=2.85644 s t(norm)=0.163432, mflops=30.5938 (err=6.4e-16) 2. PDA (f2c): elapsed time t=3.87047 s, 1 iters, t-(init.)=3.78017 s t(norm)=0.216284, mflops=23.1178 (err=6.4e-16) 3. Singleton: elapsed time t=6.52126 s, 1 iters, t-(init.)=6.43017 s t(norm)=0.367904, mflops=13.5905 (err=7.0e-16) 4. Singleton (f2c): elapsed time t=6.5994 s, 1 iters, t-(init.)=6.50923 s t(norm)=0.372427, mflops=13.4254 (err=7.0e-16) 5. Temperton: elapsed time t=2.10735 s, 1 iters, t-(init.)=2.01698 s t(norm)=0.115402, mflops=43.3269 (err=1.6e-07) 6. Temperton (f2c): elapsed time t=2.59297 s, 1 iters, t-(init.)=2.50261 s t(norm)=0.143187, mflops=34.9193 (err=7.5e-16) Top mflops for N=884736 = 61.1034 Normalized results and averages for N=884736: fft 0: mflops = 61.1034 (norm. = 1), norm. avg. (of 19) = 0.987663 fft 1: mflops = 30.5938 (norm. = 0.500688), norm. avg. (of 19) = 0.308578 fft 2: mflops = 23.1178 (norm. = 0.378339), norm. avg. (of 19) = 0.219807 fft 3: mflops = 13.5905 (norm. 
= 0.222418), norm. avg. (of 19) = 0.415081 fft 4: mflops = 13.4254 (norm. = 0.219717), norm. avg. (of 19) = 0.415687 fft 5: mflops = 43.3269 (norm. = 0.709074), norm. avg. (of 11) = 0.751569 fft 6: mflops = 34.9193 (norm. = 0.571478), norm. avg. (of 11) = 0.536854 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=2.12973 s, 1 iters, t-(init.)=2.01153 s t(norm)=0.0862661, mflops=57.9602 (err=7.4e-16) 1. PDA: elapsed time t=3.94854 s, 1 iters, t-(init.)=3.83037 s t(norm)=0.164269, mflops=30.4379 (err=7.1e-16) 2. PDA (f2c): elapsed time t=5.55199 s, 1 iters, t-(init.)=5.43386 s t(norm)=0.233036, mflops=21.456 (err=7.1e-16) 3. Singleton: elapsed time t=7.02673 s, 1 iters, t-(init.)=6.90913 s t(norm)=0.296304, mflops=16.8746 (err=8.0e-16) 4. Singleton (f2c): elapsed time t=7.10566 s, 1 iters, t-(init.)=6.98808 s t(norm)=0.29969, mflops=16.6839 (err=8.0e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 57.9602 Normalized results and averages for N=1157625: fft 0: mflops = 57.9602 (norm. = 1), norm. avg. (of 20) = 0.98828 fft 1: mflops = 30.4379 (norm. = 0.525152), norm. avg. (of 20) = 0.319407 fft 2: mflops = 21.456 (norm. = 0.370184), norm. avg. (of 20) = 0.227326 fft 3: mflops = 16.8746 (norm. = 0.291141), norm. avg. (of 20) = 0.408884 fft 4: mflops = 16.6839 (norm. = 0.287851), norm. avg. (of 20) = 0.409295 fft 5: mflops = -1 (norm. = -0.0172532), norm. avg. (of 11) = 0.751569 fft 6: mflops = -1 (norm. = -0.0172532), norm. avg. (of 11) = 0.536854 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=2.56815 s, 1 iters, t-(init.)=2.42477 s t(norm)=0.0845116, mflops=59.1635 (err=5.4e-16) 1. PDA: elapsed time t=5.00589 s, 1 iters, t-(init.)=4.86246 s t(norm)=0.169474, mflops=29.5031 (err=5.4e-16) 2. PDA (f2c): elapsed time t=6.72294 s, 1 iters, t-(init.)=6.5795 s t(norm)=0.229318, mflops=21.8037 (err=5.4e-16) 3. 
Singleton: elapsed time t=8.3734 s, 1 iters, t-(init.)=8.2294 s t(norm)=0.286823, mflops=17.4323 (err=6.4e-16) 4. Singleton (f2c): elapsed time t=8.45891 s, 1 iters, t-(init.)=8.31543 s t(norm)=0.289822, mflops=17.252 (err=6.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 59.1635 Normalized results and averages for N=1404928: fft 0: mflops = 59.1635 (norm. = 1), norm. avg. (of 21) = 0.988838 fft 1: mflops = 29.5031 (norm. = 0.49867), norm. avg. (of 21) = 0.327944 fft 2: mflops = 21.8037 (norm. = 0.368534), norm. avg. (of 21) = 0.23405 fft 3: mflops = 17.4323 (norm. = 0.294647), norm. avg. (of 21) = 0.403444 fft 4: mflops = 17.252 (norm. = 0.291598), norm. avg. (of 21) = 0.403691 fft 5: mflops = -1 (norm. = -0.0169023), norm. avg. (of 11) = 0.751569 fft 6: mflops = -1 (norm. = -0.0169023), norm. avg. (of 11) = 0.536854 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=2.93625 s, 1 iters, t-(init.)=2.75996 s t(norm)=0.0770824, mflops=64.8656 (err=7.3e-16) 1. PDA: elapsed time t=5.09752 s, 1 iters, t-(init.)=4.9211 s t(norm)=0.13744, mflops=36.3794 (err=7.8e-16) 2. PDA (f2c): elapsed time t=6.73181 s, 1 iters, t-(init.)=6.55542 s t(norm)=0.183085, mflops=27.3097 (err=7.8e-16) 3. Singleton: elapsed time t=14.8555 s, 1 iters, t-(init.)=14.679 s t(norm)=0.409968, mflops=12.1961 (err=9.4e-16) 4. Singleton (f2c): elapsed time t=14.88 s, 1 iters, t-(init.)=14.7035 s t(norm)=0.410652, mflops=12.1757 (err=9.4e-16) 5. Temperton: elapsed time t=3.47599 s, 1 iters, t-(init.)=3.2996 s t(norm)=0.0921539, mflops=54.2571 (err=1.1e-08) 6. Temperton (f2c): elapsed time t=4.02939 s, 1 iters, t-(init.)=3.85297 s t(norm)=0.107609, mflops=46.4645 (err=6.9e-16) Top mflops for N=1728000 = 64.8656 Normalized results and averages for N=1728000: fft 0: mflops = 64.8656 (norm. = 1), norm. avg. (of 22) = 0.989345 fft 1: mflops = 36.3794 (norm. = 0.560843), norm. avg. 
(of 22) = 0.33853 fft 2: mflops = 27.3097 (norm. = 0.42102), norm. avg. (of 22) = 0.242549 fft 3: mflops = 12.1961 (norm. = 0.18802), norm. avg. (of 22) = 0.393652 fft 4: mflops = 12.1757 (norm. = 0.187707), norm. avg. (of 22) = 0.393873 fft 5: mflops = 54.2571 (norm. = 0.836454), norm. avg. (of 12) = 0.758642 fft 6: mflops = 46.4645 (norm. = 0.71632), norm. avg. (of 12) = 0.55181 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=5.63611 s, 1 iters, t-(init.)=5.33128 s t(norm)=0.0830058, mflops=60.2367 (err=1.2e-15) 1. PDA: elapsed time t=9.0333 s, 1 iters, t-(init.)=8.72844 s t(norm)=0.135898, mflops=36.7923 (err=1.2e-15) 2. PDA (f2c): elapsed time t=12.1682 s, 1 iters, t-(init.)=11.8633 s t(norm)=0.184707, mflops=27.0699 (err=1.2e-15) 3. Singleton: elapsed time t=24.141 s, 1 iters, t-(init.)=23.8361 s t(norm)=0.371118, mflops=13.4728 (err=1.6e-15) 4. Singleton (f2c): elapsed time t=24.3421 s, 1 iters, t-(init.)=24.0373 s t(norm)=0.37425, mflops=13.36 (err=1.6e-15) 5. Temperton: elapsed time t=7.29116 s, 1 iters, t-(init.)=6.98627 s t(norm)=0.108773, mflops=45.9672 (err=1.8e-07) 6. Temperton (f2c): elapsed time t=8.07249 s, 1 iters, t-(init.)=7.76769 s t(norm)=0.12094, mflops=41.3429 (err=1.2e-15) Top mflops for N=2985984 = 60.2367 Normalized results and averages for N=2985984: fft 0: mflops = 60.2367 (norm. = 1), norm. avg. (of 23) = 0.989809 fft 1: mflops = 36.7923 (norm. = 0.610794), norm. avg. (of 23) = 0.350367 fft 2: mflops = 27.0699 (norm. = 0.449392), norm. avg. (of 23) = 0.251542 fft 3: mflops = 13.4728 (norm. = 0.223664), norm. avg. (of 23) = 0.386261 fft 4: mflops = 13.36 (norm. = 0.221792), norm. avg. (of 23) = 0.386391 fft 5: mflops = 45.9672 (norm. = 0.763109), norm. avg. (of 13) = 0.758986 fft 6: mflops = 41.3429 (norm. = 0.686341), norm. avg. (of 13) = 0.562158 Benchmarking for array size = 180x180x180: 0. FFTW: elapsed time t=12.3408 s, 1 iters, t-(init.)=11.7451 s t(norm)=0.0896046, mflops=55.8007 (err=9.5e-16) 1. 
PDA: elapsed time t=18.7221 s, 1 iters, t-(init.)=18.1266 s t(norm)=0.13829, mflops=36.156 (err=9.3e-16) 2. PDA (f2c): elapsed time t=25.3617 s, 1 iters, t-(init.)=24.7662 s t(norm)=0.188943, mflops=26.463 (err=9.3e-16) 3. Singleton: elapsed time t=53.8917 s, 1 iters, t-(init.)=53.2963 s t(norm)=0.406601, mflops=12.2971 (err=1.2e-15) 4. Singleton (f2c): elapsed time t=54.2457 s, 1 iters, t-(init.)=53.6502 s t(norm)=0.409302, mflops=12.2159 (err=1.2e-15) 5. Temperton: elapsed time t=9.64281 s, 1 iters, t-(init.)=9.04736 s t(norm)=0.069023, mflops=72.4396 (err=9.9e-08) 6. Temperton (f2c): elapsed time t=13.5464 s, 1 iters, t-(init.)=12.9509 s t(norm)=0.0988034, mflops=50.6056 (err=9.5e-16) Top mflops for N=5832000 = 72.4396 Normalized results and averages for N=5832000: fft 0: mflops = 55.8007 (norm. = 0.770306), norm. avg. (of 24) = 0.980663 fft 1: mflops = 36.156 (norm. = 0.499119), norm. avg. (of 24) = 0.356565 fft 2: mflops = 26.463 (norm. = 0.365311), norm. avg. (of 24) = 0.256282 fft 3: mflops = 12.2971 (norm. = 0.169756), norm. avg. (of 24) = 0.37724 fft 4: mflops = 12.2159 (norm. = 0.168636), norm. avg. (of 24) = 0.377318 fft 5: mflops = 72.4396 (norm. = 1), norm. avg. (of 14) = 0.776201 fft 6: mflops = 50.6056 (norm. = 0.698589), norm. avg. (of 14) = 0.571903 Benchmarking for array size = 240x240x240: 0. FFTW: elapsed time t=34.9571 s, 1 iters, t-(init.)=33.5453 s t(norm)=0.102299, mflops=48.8764 (err=1.5e-15) 1. PDA: elapsed time t=51.6586 s, 1 iters, t-(init.)=50.2469 s t(norm)=0.153232, mflops=32.6303 (err=1.5e-15) 2. PDA (f2c): elapsed time t=67.351 s, 1 iters, t-(init.)=65.9355 s t(norm)=0.201075, mflops=24.8663 (err=1.5e-15) 3. Singleton: elapsed time t=123.669 s, 1 iters, t-(init.)=122.257 s t(norm)=0.372832, mflops=13.4109 (err=2.1e-15) 4. Singleton (f2c): elapsed time t=123.42 s, 1 iters, t-(init.)=122.008 s t(norm)=0.372074, mflops=13.4382 (err=2.1e-15) 5. 
Temperton: elapsed time t=33.4545 s, 1 iters, t-(init.)=32.039 s t(norm)=0.0977053, mflops=51.1743 (err=1.8e-07) 6. Temperton (f2c): elapsed time t=39.819 s, 1 iters, t-(init.)=38.4075 s t(norm)=0.117126, mflops=42.6889 (err=1.5e-15) Top mflops for N=13824000 = 51.1743 Normalized results and averages for N=13824000: fft 0: mflops = 48.8764 (norm. = 0.955096), norm. avg. (of 25) = 0.97964 fft 1: mflops = 32.6303 (norm. = 0.637631), norm. avg. (of 25) = 0.367808 fft 2: mflops = 24.8663 (norm. = 0.485914), norm. avg. (of 25) = 0.265468 fft 3: mflops = 13.4109 (norm. = 0.262062), norm. avg. (of 25) = 0.372633 fft 4: mflops = 13.4382 (norm. = 0.262596), norm. avg. (of 25) = 0.372729 fft 5: mflops = 51.1743 (norm. = 1), norm. avg. (of 15) = 0.791121 fft 6: mflops = 42.6889 (norm. = 0.834186), norm. avg. (of 15) = 0.589389 ------------------------------------------------------ @@@@ bench.1d.p2.dat N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Nielsen, NR (C), NR (F), Ooura (C), Ooura (F), QFT, Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg, SUNPERF 2, 52.9427, 39.4759, 11.6621, 0.905619, 4.01349, 4.62017, 4.72038, 5.215, 28.9183, 1.96252, 1.92095, , 7.38105, 5.93896, 28.138, 27.9245, 79.9944, , 5.60665, 3.49136, 3.52611, 29.6973, , , , , 1.4927, 1.1397, 3.88905, 3.85384, 19.1467, 17.3615, , , , 2.6568, 2.74272, 13.2563, 25.0532, 2.47159, 1.97488, 3.98373, 7.19055 4, 36.9486, 39.1916, 16.9023, 5.35144, 7.3157, 6.98622, 15.135, 11.8747, 20.2208, 7.39618, 4.21302, 15.1198, 29.7831, 20.5776, 93.836, 75.2421, 210.1, , 16.7535, 7.30634, 7.2732, 71.1956, 22.3312, 21.9799, 21.0571, 5.41093, 3.41929, 5.59185, 7.87696, 7.92838, 40.6065, 37.7864, , 2.81011, 33.5043, 
10.8161, 11.6767, 19.6143, 19.5758, 7.58191, 6.69795, 3.91507, 29.6097 8, 76.7949, 74.7043, 20.1275, 5.16931, 12.7503, 5.46504, 24.1962, 16.5737, 14.8038, 18.7585, 12.7071, 12.5688, 41.8709, 30.5583, 164.064, 157.01, 237.738, 58.0471, 26.1611, 12.7284, 13.2421, 94.5466, 39.6694, 37.2038, 36.3788, 13.1037, 6.44599, 16.3061, 13.8536, 14.2753, 56.4427, 55.1824, , 2.65569, 44.6653, 11.2338, 11.3741, 33.1248, 14.7826, 16.0921, 12.8151, 3.92233, 52.2907 16, 35.339, 37.8248, 22.9737, 11.1712, 20.5555, 6.3754, 33.6133, 24.8021, 13.7403, 19.8997, 28.3771, 12.2209, 73.438, 53.9405, 237.857, 228.72, 272.978, 68.2614, 39.8338, 19.9769, 21.217, 106.979, 35.6169, 39.7167, 42.0785, 22.4399, 10.6163, 14.8157, 22.2477, 21.6834, 71.5536, 68.8593, 49.3171, 11.3771, 55.9633, 32.2121, 34.2183, 45.9407, 13.6087, 27.1361, 23.046, 4.02127, 101.791 32, 43.2835, 44.5701, 26.0478, 11.6026, 32.817, 6.42662, 46.6098, 32.3132, 14.4678, 47.323, 54.8304, 12.9802, 83.107, 54.5063, 151.287, 224.852, 218.158, 87.3832, 36.3182, 28.586, 30.6994, 112.38, 41.1415, 55.581, 55.2704, 36.1214, 14.74, 25.2406, 31.3626, 32.0347, 82.2043, 78.3506, 45.2429, 9.71942, 65.165, 45.7683, 46.3796, 74.5589, 14.6384, 34.6636, 24.5169, 4.11565, 111.117 64, 41.2405, 44.6209, 28.863, 17.5637, 48.2623, 6.43812, 53.9754, 40.5883, 15.9422, 58.4925, 89.1722, 14.3684, 103.774, 70.0213, 207.111, 161.886, 110.54, 115.971, 45.5587, 35.9418, 42.207, 101.586, 43.0557, 58.4217, 57.842, 49.0898, 19.0118, 39.442, 39.5245, 40.9662, 91.6252, 91.6011, 44.8414, 23.8116, 76.902, 66.2736, 69.6418, 95.0226, 16.0671, 53.0962, 35.6119, 4.19689, 155.718 128, 47.1756, 51.4928, 32.2621, 16.4266, 67.581, 6.33485, 58.7265, 45.3136, 17.778, 77.4564, 121.818, 16.0532, 123.403, 76.6189, 180.411, 201.722, 113.376, 113.223, 49.491, 42.0023, 48.9373, 75.3965, 50.7572, 65.9826, 69.506, 60.2618, 21.3268, 34.7681, 46.4372, 48.0826, 100.527, 98.1863, 44.2407, 20.11, 77.1284, 67.4562, 68.6779, 119.647, 17.6009, 53.0815, 37.4248, 4.24117, 195.33 256, 48.9016, 
54.4462, 35.2067, 18.7772, 73.4697, 6.31512, 66.424, 53.6354, 19.543, 97.3109, 131.257, 17.5476, 140.627, 85.3572, 190.41, 189.827, 119.968, 103.709, 56.6039, 45.8677, 54.8406, 84.3184, 52.9085, 64.3463, 68.4345, 67.9992, 23.2044, 42.0933, 50.9324, 53.1031, 109.853, 108.825, 43.0797, 33.9493, 84.0356, 92.3967, 94.5417, 134.91, 19.4838, 63.6589, 42.4293, 4.17149, 227.365 512, 51.8893, 57.0584, 36.8488, 19.7346, 75.1734, 6.00763, 65.5285, 53.2121, 20.6898, 94.1394, 106.817, 18.8877, 108.248, 56.7127, 151.332, 164.966, 105.707, 52.2766, 44.4815, 47.8466, 54.5902, 75.6791, 55.3595, 76.658, 70.1042, 72.3137, 22.4156, 48.8154, 54.3932, 56.1457, 109.242, 104.968, 36.2964, 29.6452, 74.3161, 92.397, 94.5324, 129.26, 20.7995, 54.7591, 36.0501, 4.18739, 183.476 1024, 53.3291, 59.6255, 39.7606, 22.4847, 73.0894, 6.048, 70.3231, 59.7716, 22.7485, 90.7853, 90.7693, 20.7829, 98.6339, 48.6017, 157.87, 151.264, 80.9362, 69.4477, 46.1463, 49.6701, 59.2771, 49.3318, 57.298, 78.2183, 76.9336, 79.1378, 21.5817, 41.1366, 56.5204, 56.3307, 117.468, 113.203, 33.8847, 43.1731, 69.501, 104.124, 106.555, 137.487, 22.7567, 62.9038, 40.3349, 4.16568, 191.512 2048, 45.7137, 48.933, 32.0256, 20.7781, 62.6973, 5.81852, 68.388, 55.2943, 21.0816, 75.1933, 88.7097, 19.7893, 105.974, 52.2618, 155.581, 117.514, 80.479, 74.2683, 41.3892, 40.3723, 46.645, 44.5516, 56.5687, 71.5162, 74.0544, 59.071, 18.3091, 37.5965, 45.6673, 45.5253, 88.5135, 84.6775, 30.6789, 33.3881, 65.4144, 68.113, 73.3234, 91.8941, 19.4876, 52.599, 37.5391, 4.08136, 201.071 4096, 47.9531, 51.7593, 34.6244, 22.5269, 61.0128, 5.77193, 70.8533, 58.1998, 22.8944, 75.2614, 84.8938, 21.1078, 99.7252, 56.4026, 149.268, 149.374, 71.06, 88.6015, 44.3934, 39.9877, 45.9359, 43.6242, 48.8198, 56.1027, 57.1599, 59.057, 18.9722, 37.3867, 45.2549, 46.2934, 97.1698, 94.3915, 29.0214, 46.5958, 65.7072, 83.7853, 85.7826, 92.7344, 21.4024, 57.8712, 38.2912, 4.04967, 205.395 8192, 51.2266, 55.6119, 35.9995, 23.5871, 64.3023, 5.68743, 71.1747, 58.4444, 
24.5609, 79.8156, 90.2513, 22.2523, 79.4874, 45.9071, 116.022, 128.39, 68.3673, 88.4287, 40.6709, 39.0161, 45.2668, , 50.2842, 59.65, 59.6106, 65.0566, 18.5554, 35.1055, 44.4571, 46.1828, 91.9042, 89.2903, 25.7558, 38.3585, 31.9199, 82.9681, 80.7454, 91.5419, 22.7397, 54.0139, 34.8385, 4.02944, 137.285 16384, 37.4536, 40.4731, 27.162, 24.7746, 26.4043, 5.73349, 57.3454, 48.6979, 20.7686, 74.6496, 74.6518, 18.5835, 46.5558, 33.8204, 81.3362, 79.0963, 48.4083, 69.3911, 33.1226, 31.0058, 33.8828, , 38.7648, 43.5319, 43.675, 44.6938, 15.5346, 26.6239, 34.4774, 34.7936, 85.1061, 83.4049, 22.5955, 48.4445, 15.5944, 66.116, 64.8725, 61.6157, 19.5159, 46.7727, 32.6659, 3.82856, 59.727 32768, 23.9319, 24.2516, 17.355, 19.6419, 15.5854, 5.58555, 41.8251, 33.6325, 13.7132, 70.8914, 70.7441, 13.7849, 36.8173, 27.492, 56.6624, 57.7098, 38.5516, 45.121, 26.8596, 20.7049, 21.7686, , 39.0463, 43.2162, 41.9901, 22.6211, 12.4864, 16.4856, 22.1784, 22.2301, 59.1866, 58.3818, 18.7865, 37.4917, 13.6149, 33.9292, 32.507, 29.5125, 12.6444, 27.285, 22.3707, 3.53768, 42.658 65536, 14.848, 14.5018, 10.9655, 19.369, 15.9311, 5.31793, 29.1204, 22.0952, 9.88847, 61.7764, 61.8125, 9.52998, 34.6339, 27.8501, 62.6021, 60.0846, 35.0675, 32.0492, 27.5613, 13.4454, 13.7611, , 29.0484, 31.0015, 28.7786, 15.1861, 12.2583, 13.2846, 14.0291, 14.0969, 51.0348, 50.2436, 17.0759, 33.3003, 11.0545, 27.1453, 26.9491, 26.4237, 9.29457, 22.8137, 18.9752, 3.32244, 42.3602 131072, 13.2931, 13.2407, 9.51313, 15.6964, 17.6361, 5.22755, 27.042, 19.2962, 8.72362, 58.4202, 58.4202, 8.3771, 33.1136, 26.1063, 48.6058, 54.6454, 30.3779, 28.0326, 25.8789, 12.0842, 12.3705, , 17.7948, 18.2759, 17.4105, 13.2435, 11.7721, 12.3781, 12.545, 12.5993, 46.2575, 45.5841, 12.4622, 27.4704, 11.189, 21.9993, 21.9571, 21.8569, 8.18933, 19.129, 16.2024, 3.21468, 39.2815 262144, 13.0759, 13.0655, 9.41438, 18.5174, 18.613, 5.19854, 27.6303, 19.2856, 8.73734, 46.8361, 46.8413, 8.41098, 33.6217, 26.7532, 48.9957, 48.0721, 26.3038, 
29.0577, 27.5726, 11.9319, 12.253, , 15.2819, 15.5347, 15.0371, 13.5778, 12.1243, 12.3931, 12.3781, 12.4481, 48.8817, 48.2903, 10.7659, 34.7692, 10.1598, 23.4377, 23.4613, 21.1021, 8.2924, 20.4455, 17.3478, 3.1549, 41.0624 524288, 12.8851, 13.0003, 9.24584, 14.5104, 17.2266, 5.1832, 26.5156, 18.5289, 8.66952, 46.2494, 46.2497, 8.34185, 34.1035, 27.3489, 48.1957, 47.0964, 24.6358, 27.3764, 26.1488, 11.5441, 11.9179, , 15.0748, 15.3334, 14.8553, 13.0632, 11.7987, 11.7437, 11.9892, 12.0489, 46.8357, 46.0433, 9.60197, 29.4444, 10.5368, 18.6263, 18.458, 20.9166, 8.25671, 19.0929, 16.4009, 3.10552, 41.6086 1048576, 12.4846, 12.523, 9.02198, 17.8263, 19.418, 5.17294, 26.391, 18.6741, 8.52598, , , 8.20437, 35.0478, 27.865, 47.8676, 49.1177, 25.0815, , 28.4055, 11.2147, 11.5873, , , 14.2525, 13.8729, 12.948, 12.137, 12.1783, 11.6343, 11.6722, 49.544, 48.9385, 9.26109, 36.7487, 9.37356, 22.2181, 22.3836, 19.8391, 8.13876, 19.6935, 16.781, 3.0452, 44.3215 2097152, 11.7519, 11.8274, 8.63436, 13.4743, 15.1509, 5.13937, , 17.9304, 8.1662, , , 7.85643, 30.6147, 24.165, 48.784, 49.6392, 25.0163, , 24.7604, 10.6565, 10.8299, , , 13.3361, 13.0018, , 11.8763, 12.1652, 11.0274, 11.0537, 43.4154, 44.2027, 8.94236, 29.467, 9.6143, 19.6777, 19.6016, 18.6991, 7.76607, 17.0157, 14.7706, 2.97535, 35.0268 4194304, 11.0055, 10.8417, 8.07455, 18.8805, 15.0181, 5.104, , 17.7408, 7.61328, , , 7.42408, 30.3669, 24.3971, 45.7926, 43.4101, 22.178, , 25.4271, 10.2317, 10.2701, , , 12.3423, 12.0325, , 11.8285, 11.1523, 10.577, 10.6042, 42.8779, 42.8662, 8.04101, 34.5371, 8.7627, 19.7849, 19.619, 16.3392, 7.24153, 16.8306, 14.8023, 2.90803, 35.629 Norm. 
Avg., 0.274166, 0.27746, 0.171904, 0.176295, 0.271299, 0.0578467, 0.358682, 0.286463, 0.141743, 0.505515, 0.547539, 0.122256, 0.50201, 0.347195, 0.831598, 0.831555, 0.616905, 0.479844, 0.312148, 0.197323, 0.215414, 0.361552, 0.286922, 0.325036, 0.320271, 0.270744, 0.136832, 0.187726, 0.21452, 0.217413, 0.624341, 0.612426, 0.202191, 0.335388, 0.254988, 0.369091, 0.371273, 0.430748, 0.134033, 0.28632, 0.218897, 0.0367394, 0.716831 ------------------------------------------------------ @@@@ bench.1d.np2.dat N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Nielsen, Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg, SUNPERF 6, , 11.4685, 8.1993, 28.6585, 21.0046, 139.324, 124.774, 13.46, 20.716, 5.75752, 4.04844, 7.90314, 7.86405, 9.53842, 7.56676, 3.76779, 31.4088 9, 4.71245, 20.2992, 15.1784, 44.427, 28.7636, 143.691, 147.556, 11.1299, 22.2065, 9.99698, 7.53798, 13.9906, 14.3342, 15.8913, 14.7275, 3.6962, 47.2959 12, , 27.9946, 22.7157, 63.3235, 40.771, 221.843, 214.155, 19.5424, 31.0656, 9.72972, 10.4983, 15.5264, 15.5136, 21.3006, 17.2261, 3.88313, 75.5707 15, 6.76513, 31.0478, 31.3871, 64.2938, 41.8757, 173.931, 163.64, 14.4449, 26.6392, 8.31703, 12.5252, 19.4833, 18.9113, 27.8139, 18.7225, 3.25305, 79.3594 18, 6.19513, 34.8522, 33.4902, 54.9943, 33.4647, 103.434, 85.5328, 13.6279, 36.1069, 13.3654, 9.65821, 22.5288, 22.8972, 23.4087, 19.4145, 3.83804, 64.2533 24, , 52.2814, 49.7587, 71.1966, 43.4384, 137.457, 132.128, 25.0001, 46.4131, 14.5755, 18.8515, 22.9849, 22.2414, 33.5921, 25.8084, 4.00455, 86.9872 36, 7.91837, 64.7949, 64.167, 87.7649, 48.4002, 156.534, 158.8, 16.9377, 50.4798, 18.8946, 17.6428, 37.6146, 39.2499, 42.349, 34.6214, 3.94149, 106.33 80, 13.995, 66.3655, 116.891, 105.882, 69.7545, 177.439, 176.077, 30.407, 43.7925, 12.6439, 38.1827, 64.0377, 64.6552, 64.5225, 41.8, 3.7365, 162.776 108, 8.67718, 73.7639, 71.9342, 123.946, 60.3568, 155.841, 154.866, 15.406, 49.2033, 
25.3619, 22.0198, 45.2232, 43.4395, 60.608, 46.3477, 3.95755, 161.198
210, 11.5084, 128.504, 128.483, 80.1917, 34.5713, 106.904, 97.5836, 15.1685, 36.1467, 11.2731, 28.9366, 39.3035, 36.1439, , , 3.02634, 107.302
504, 12.9259, 122.719, 122.963, 89.8281, 33.549, 116.897, 112.328, 18.6361, 41.0144, 14.7446, 29.0977, 46.6067, 47.0208, , , 3.39096, 104.603
1000, 12.6378, 76.3091, 114.373, 97.4213, 49.7951, 116.882, 116.426, 17.6168, 40.7455, 11.2369, 41.3604, 67.1662, 69.7367, 71.5925, 40.2865, 3.27147, 158.013
1960, 13.7678, 100.077, 100.09, 79.5952, 30.0804, 89.9593, 87.1817, 17.2541, 37.434, 9.28394, 29.1761, 45.1238, 43.7993, , , 2.86867, 95.0882
4725, 10.5831, 76.4973, 108.848, 90.2545, 38.8524, 89.3702, 98.0989, 12.412, 38.9749, 10.8629, 26.1358, 43.883, 41.0933, , , 3.00477, 117.243
10368, 12.4483, 89.8864, 84.3349, 87.0636, 45.6357, 99.5952, 92.9487, 18.5466, 42.5121, 17.7345, 27.3585, 55.8602, 52.0147, 61.6236, 43.4008, 3.8154, 88.8916
27000, 7.86421, 87.1438, 87.1424, 50.4512, 34.3168, 65.4736, 66.5259, 13.281, 35.5664, 10.2576, 27.9549, 34.5759, 33.6051, 38.1602, 29.0838, 3.19704, 56.8742
75600, 6.61207, 63.8296, 63.6829, 30.2338, 22.1016, 49.057, 49.5651, 10.2353, 29.0264, 9.11438, 15.0595, 21.2929, 20.847, , , 2.83434, 32.6422
165375, 5.29567, 59.4169, 59.6985, 18.844, 14.0407, 44.2042, 44.204, 7.41758, 28.0086, 7.69805, 13.0033, 16.4073, 16.1715, , , 2.52498, 19.3612
362880, 5.89828, 31.1324, 31.0376, 28.8415, 21.2624, 50.608, 47.5582, 9.2886, 30.5487, 9.94803, 12.3729, 15.362, 14.8582, , , 2.76764, 30.8879
Norm. Avg., 0.0837405, 0.585672, 0.618382, 0.55149, 0.30656, 0.91421, 0.884936, 0.131395, 0.320563, 0.105886, 0.180242, 0.284293, 0.278343, 0.284901, 0.206705, 0.029867, 0.674253
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM, HARM (f2c), NR (C), NR (F), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
4x4x4, 145.962, , , 27.9911, 39.3264, 16.613, 9.39127, 79.6908, 77.4367, 48.9298, 30.4046
8x8x8, 231.3, 78.1803, 64.7669, 43.3956, 61.536, 31.9302, 20.482, 60.6376, 58.2654, 102.93, 89.4346
16x16x16, 211.572, 83.2582, 80.0495, 34.1166, 48.691, 51.4109, 32.5208, 76.5526, 74.9533, 123.863, 91.5212
32x32x32, 111.277, 61.0175, 55.6861, 17.9828, 20.2877, 27.143, 20.5204, 31.6868, 30.9706, 68.073, 48.4722
64x64x64, 57.3987, 54.0046, 52.824, 11.0124, 11.798, 29.62, 22.7068, 19.869, 19.8991, 51.3183, 44.6049
256x64x32, 59.49, 45.7693, 44.9456, 10.7275, 11.4634, 31.8252, 23.8445, 17.0384, 17.1905, 62.6766, 52.1724
16x1024x64, 73.1669, 44.6461, 47.9647, 10.8117, 11.5345, 31.5849, 24.665, 21.0783, 21.171, ,
128x128x128, 63.6198, 47.9282, 45.8397, 9.94458, 10.5815, 34.2727, 26.0366, 12.5724, 12.2973, 39.8753, 36.6811
512x128x64, 60.1651, 40.1304, 41.4524, 9.66688, 10.2411, 32.7439, 25.2692, 14.1772, 13.9955, ,
256x128x256, 61.3524, 44.6657, 46.5236, 9.4678, 10.0097, 30.0088, 23.9212, 14.4057, 14.1969, 36.2212, 33.5954
256x256x256, 51.3021, 44.8943, 47.6542, 9.25241, 9.76198, 28.7359, 23.1246, 15.9469, 15.7861, 35.672, 33.093
Norm. Avg., 0.995378, 0.658465, 0.654845, 0.169516, 0.198545, 0.393315, 0.297641, 0.303609, 0.299122, 0.642663, 0.537985
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
5x5x5, 104.605, 20.5567, 13.8, 102.022, 112.226, 77.562, 52.5445
6x6x6, 160.943, 20.5885, 15.5721, 57.6692, 53.8829, 71.1057, 53.3816
7x7x7, 116.51, 11.1997, 7.98049, 52.8006, 50.8323, ,
9x9x9, 161.908, 33.7985, 22.3483, 60.6403, 64.0393, 120.676, 95.4022
10x10x10, 168.879, 34.5643, 22.3849, 72.339, 71.6251, 129.994, 73.828
11x11x11, 67.4802, 11.4404, 8.28601, 56.8893, 54.5873, ,
12x12x12, 200.733, 48.1357, 28.5077, 55.8697, 55.4568, 130.297, 61.1358
13x13x13, 62.3174, 11.2766, 8.15958, 52.1088, 52.4652, ,
14x14x14, 109.351, 19.4812, 14.4844, 45.2252, 42.6878, ,
15x15x15, 161.272, 50.0576, 29.2247, 57.1221, 57.5387, 128.668, 81.8227
24x25x28, 81.2652, 45.1383, 30.511, 47.7221, 48.1221, ,
48x48x48, 87.7099, 30.1181, 22.792, 21.2044, 21.1772, 62.9511, 52.8271
49x49x49, 65.0431, 20.8195, 14.3613, 23.6816, 23.5893, ,
60x60x60, 89.1708, 32.5793, 24.7351, 17.3028, 17.1128, 71.2591, 47.4701
72x60x56, 77.5393, 30.9925, 22.6736, 15.6765, 15.5639, ,
75x75x75, 64.5877, 33.6069, 25.3815, 19.5835, 19.6653, 77.4885, 61.8867
80x80x80, 62.528, 32.3261, 25.223, 18.7267, 18.8702, 59.1751, 47.6721
84x84x84, 56.249, 29.7084, 20.9771, 15.4006, 15.1225, ,
96x96x96, 61.1034, 30.5938, 23.1178, 13.5905, 13.4254, 43.3269, 34.9193
105x105x105, 57.9602, 30.4379, 21.456, 16.8746, 16.6839, ,
112x112x112, 59.1635, 29.5031, 21.8037, 17.4323, 17.252, ,
120x120x120, 64.8656, 36.3794, 27.3097, 12.1961, 12.1757, 54.2571, 46.4645
144x144x144, 60.2367, 36.7923, 27.0699, 13.4728, 13.36, 45.9672, 41.3429
180x180x180, 55.8007, 36.156, 26.463, 12.2971, 12.2159, 72.4396, 50.6056
240x240x240, 48.8764, 32.6303, 24.8663, 13.4109, 13.4382, 51.1743, 42.6889
Norm. Avg., 0.97964, 0.367808, 0.265468, 0.372633, 0.372729, 0.791121, 0.589389
@@@@ end