To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ name = Steven G. Johnson
@ email = stevenj@alum.mit.edu
@ organization = MIT
@ computer manufacturer = DEC
@ computer model = AlphaServer 4100
@ CPU manufacturer = DEC
@ CPU model = Alpha EV56
@ CPU speed = 467
@ RAM = 2048
@ L2 cache size =
@ operating system = OSF1 V4.0
@ C compiler = DEC C V5.2-033 (Rev. 564)
@ C compiler flags = -newc -w0 -O5 -ansi_alias -ansi_args -fp_reorder -tune host -std1 -DUSE_DXML
@ Fortran compiler = f77
@ Fortran compiler flags = -w0 -O5 -ansi_alias -ansi_args -fp_reorder -tune host -std1
@ remarks = The DXML library is the EV4 version.
@ FFTW version = FFTW V1.1 ($Id: executor.c,v 1.34 1997/04/30 13:15:56 fftw Exp $)
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
  2 (0.000228882 MB)
  4 (0.000534058 MB)
  8 (0.000839233 MB)
  16 (0.00164795 MB)
  32 (0.00297546 MB)
  64 (0.00616455 MB)
  128 (0.0119019 MB)
  256 (0.0238037 MB)
  512 (0.0476074 MB)
  1024 (0.0939941 MB)
  2048 (0.189575 MB)
  4096 (0.37915 MB)
  8192 (0.765991 MB)
  16384 (1.51184 MB)
  32768 (3.02368 MB)
  65536 (6.09973 MB)
  131072 (12.1995 MB)
  262144 (25.4987 MB)
Maximum array size = 360360
Benchmarking FFTs:
  0. Arndt DIF
  1. Arndt DIT
  2. Arndt Split-Radix
  3. Arndt 4-step
  4. Bailey
  5. Beauregard
  6. Bergland
  7. Brenner
  8. Burrus
  9. CWP (min N)
  10. CWP (best N)
  11. Edelblute
  12. FFTPACK
  13. FFTPACK (f2c)
  14. FFTW
  15. FFTW_ESTIMATE
  16. Frigo-old
  17. Green
  18. GSL
  19. GSL DIT
  20. GSL DIF
  21. Krukar
  22. Mayer (Buneman)
  23. Mayer (simple)
  24. Mayer (lookup)
  25. Monro
  26. NAPACK (f2c)
  27. Nielsen
  28. NR (C)
  29. NR (F)
  30. Ooura (C)
  31. Ooura (F)
  32. Ransom
  33. SCIPORT
  34. Singleton
  35. Singleton (f2c)
  36. Sorensen
  37. Sorensen DIT
  38. Temperton
  39. Temperton (f2c)
  40. Valkenburg
  41. DXML
Computing normalized averages (42 transforms).
Benchmarking for array size = 2 (power of 2):
0.
Arndt DIF: elapsed time t=1.83326 s, 8388608 iters, t-(init.)=0.99996 s t(norm)=0.0596023, mflops=83.8894 (err=1.7e-17) 1. Arndt DIT: elapsed time t=1.84993 s, 8388608 iters, t-(init.)=1.36661 s t(norm)=0.0814564, mflops=61.3825 (err=1.7e-17) 2. Arndt Split-Radix: elapsed time t=1.08329 s, 4194304 iters, t-(init.)=0.799968 s t(norm)=0.0953636, mflops=52.4309 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.03329 s, 262144 iters, t-(init.)=1.03329 s t(norm)=1.97085, mflops=2.53698 (err=1.7e-17) 4. Bailey: elapsed time t=1.09996 s, 2097152 iters, t-(init.)=0.949962 s t(norm)=0.226489, mflops=22.0762 (err=1.7e-17) 5. Beauregard: elapsed time t=1.38328 s, 1048576 iters, t-(init.)=1.31661 s t(norm)=0.62781, mflops=7.96419 (err=1.7e-17) 6. Bergland: elapsed time t=1.06662 s, 1048576 iters, t-(init.)=0.99996 s t(norm)=0.476818, mflops=10.4862 (err=1.7e-17) 7. Brenner: elapsed time t=1.7666 s, 2097152 iters, t-(init.)=1.64993 s t(norm)=0.393375, mflops=12.7105 (err=1.7e-17) 8. Burrus: elapsed time t=1.39994 s, 4194304 iters, t-(init.)=1.13329 s t(norm)=0.135098, mflops=37.01 (err=1.7e-17) 9. CWP (min N): elapsed time t=1.34995 s, 1048576 iters, t-(init.)=1.28328 s t(norm)=0.611917, mflops=8.17105 10. CWP (best N) (N=3): elapsed time t=1.46661 s, 1048576 iters, t-(init.)=1.38328 s t(norm)=0.659598, mflops=7.58037 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.73326 s, 4194304 iters, t-(init.)=1.46661 s t(norm)=0.174833, mflops=28.5987 (err=1.7e-17) 13. FFTPACK (f2c): elapsed time t=1.01663 s, 2097152 iters, t-(init.)=0.883298 s t(norm)=0.210595, mflops=23.7423 (err=1.7e-17) FFTW_MEASURE plan: (cost = 2.066212e-07) FFTW_NOTW 2 14. FFTW: elapsed time t=1.73326 s, 8388608 iters, t-(init.)=1.19995 s t(norm)=0.0715227, mflops=69.9079 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 15. FFTW_ESTIMATE: elapsed time t=1.7166 s, 8388608 iters, t-(init.)=1.03329 s t(norm)=0.061589, mflops=81.1833 (err=1.7e-17) 16. 
Frigo-old: elapsed time t=1.24995 s, 8388608 iters, t-(init.)=0.716638 s t(norm)=0.042715, mflops=117.055 (err=1.7e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.13329 s, 2097152 iters, t-(init.)=1.01663 s t(norm)=0.242383, mflops=20.6285 (err=1.7e-17) 19. GSL DIT: elapsed time t=1.09996 s, 1048576 iters, t-(init.)=1.03329 s t(norm)=0.492712, mflops=10.1479 (err=1.7e-17) 20. GSL DIF: elapsed time t=1.93326 s, 2097152 iters, t-(init.)=1.79993 s t(norm)=0.429136, mflops=11.6513 (err=1.7e-17) 21. Krukar: elapsed time t=1.13329 s, 4194304 iters, t-(init.)=0.866632 s t(norm)=0.103311, mflops=48.3978 (err=1.7e-17) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.68327 s, 1048576 iters, t-(init.)=1.6166 s t(norm)=0.770856, mflops=6.4863 (err=1.7e-17) 27. Nielsen: elapsed time t=1.63327 s, 524288 iters, t-(init.)=1.6166 s t(norm)=1.54171, mflops=3.24315 (err=1.7e-17) 28. NR (C): elapsed time t=1.79993 s, 2097152 iters, t-(init.)=1.64993 s t(norm)=0.393375, mflops=12.7105 (err=1.7e-17) 29. NR (F): elapsed time t=1.04996 s, 1048576 iters, t-(init.)=0.99996 s t(norm)=0.476818, mflops=10.4862 (err=1.7e-17) 30. Ooura (C): elapsed time t=1.86659 s, 8388608 iters, t-(init.)=1.34995 s t(norm)=0.0804631, mflops=62.1403 (err=1.7e-17) 31. Ooura (F): elapsed time t=1.83326 s, 8388608 iters, t-(init.)=1.33328 s t(norm)=0.0794697, mflops=62.9171 (err=1.7e-17) 32. Skipping fft (Ransom doesn't work for N=2). 33. Skipping fft (SCIPORT can't handle N < 4). 34. Singleton: elapsed time t=1.26662 s, 1048576 iters, t-(init.)=1.19995 s t(norm)=0.572182, mflops=8.73848 (err=1.7e-17) 35. Singleton (f2c): elapsed time t=1.29995 s, 1048576 iters, t-(init.)=1.21662 s t(norm)=0.580129, mflops=8.61878 (err=1.7e-17) 36. 
Sorensen: elapsed time t=1.49994 s, 4194304 iters, t-(init.)=1.23328 s t(norm)=0.147019, mflops=34.0092 (err=1.7e-17) 37. Sorensen DIT: elapsed time t=1.24995 s, 4194304 iters, t-(init.)=0.766636 s t(norm)=0.0913901, mflops=54.7105 (err=1.7e-17) 38. Temperton: elapsed time t=1.23328 s, 1048576 iters, t-(init.)=1.16662 s t(norm)=0.556288, mflops=8.98815 (err=1.7e-17) 39. Temperton (f2c): elapsed time t=1.86659 s, 1048576 iters, t-(init.)=1.79993 s t(norm)=0.858273, mflops=5.82566 (err=1.7e-17) 40. Valkenburg: elapsed time t=1.6666 s, 2097152 iters, t-(init.)=1.54994 s t(norm)=0.369534, mflops=13.5306 (err=1.7e-17) 41. DXML: elapsed time t=1.16662 s, 1048576 iters, t-(init.)=1.11662 s t(norm)=0.532447, mflops=9.39061 (err=1.7e-17) Top mflops for N=2 = 117.055 Normalized results and averages for N=2: fft 0: mflops = 83.8894 (norm. = 0.716667), norm. avg. (of 1) = 0.716667 fft 1: mflops = 61.3825 (norm. = 0.52439), norm. avg. (of 1) = 0.52439 fft 2: mflops = 52.4309 (norm. = 0.447917), norm. avg. (of 1) = 0.447917 fft 3: mflops = 2.53698 (norm. = 0.0216734), norm. avg. (of 1) = 0.0216734 fft 4: mflops = 22.0762 (norm. = 0.188596), norm. avg. (of 1) = 0.188596 fft 5: mflops = 7.96419 (norm. = 0.068038), norm. avg. (of 1) = 0.068038 fft 6: mflops = 10.4862 (norm. = 0.0895833), norm. avg. (of 1) = 0.0895833 fft 7: mflops = 12.7105 (norm. = 0.108586), norm. avg. (of 1) = 0.108586 fft 8: mflops = 37.01 (norm. = 0.316176), norm. avg. (of 1) = 0.316176 fft 9: mflops = 8.17105 (norm. = 0.0698052), norm. avg. (of 1) = 0.0698052 fft 10: mflops = 7.58037 (norm. = 0.064759), norm. avg. (of 1) = 0.064759 fft 11: mflops = -1 (norm. = -0.00854299), norm. avg. (of 0) = -1 fft 12: mflops = 28.5987 (norm. = 0.244318), norm. avg. (of 1) = 0.244318 fft 13: mflops = 23.7423 (norm. = 0.20283), norm. avg. (of 1) = 0.20283 fft 14: mflops = 69.9079 (norm. = 0.597222), norm. avg. (of 1) = 0.597222 fft 15: mflops = 81.1833 (norm. = 0.693548), norm. avg. 
(of 1) = 0.693548 fft 16: mflops = 117.055 (norm. = 1), norm. avg. (of 1) = 1 fft 17: mflops = -1 (norm. = -0.00854299), norm. avg. (of 0) = -1 fft 18: mflops = 20.6285 (norm. = 0.17623), norm. avg. (of 1) = 0.17623 fft 19: mflops = 10.1479 (norm. = 0.0866935), norm. avg. (of 1) = 0.0866935 fft 20: mflops = 11.6513 (norm. = 0.099537), norm. avg. (of 1) = 0.099537 fft 21: mflops = 48.3978 (norm. = 0.413462), norm. avg. (of 1) = 0.413462 fft 22: mflops = -1 (norm. = -0.00854299), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.00854299), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.00854299), norm. avg. (of 0) = -1 fft 25: mflops = -1 (norm. = -0.00854299), norm. avg. (of 0) = -1 fft 26: mflops = 6.4863 (norm. = 0.0554124), norm. avg. (of 1) = 0.0554124 fft 27: mflops = 3.24315 (norm. = 0.0277062), norm. avg. (of 1) = 0.0277062 fft 28: mflops = 12.7105 (norm. = 0.108586), norm. avg. (of 1) = 0.108586 fft 29: mflops = 10.4862 (norm. = 0.0895833), norm. avg. (of 1) = 0.0895833 fft 30: mflops = 62.1403 (norm. = 0.530864), norm. avg. (of 1) = 0.530864 fft 31: mflops = 62.9171 (norm. = 0.5375), norm. avg. (of 1) = 0.5375 fft 32: mflops = -1 (norm. = -0.00854299), norm. avg. (of 0) = -1 fft 33: mflops = -1 (norm. = -0.00854299), norm. avg. (of 0) = -1 fft 34: mflops = 8.73848 (norm. = 0.0746528), norm. avg. (of 1) = 0.0746528 fft 35: mflops = 8.61878 (norm. = 0.0736301), norm. avg. (of 1) = 0.0736301 fft 36: mflops = 34.0092 (norm. = 0.290541), norm. avg. (of 1) = 0.290541 fft 37: mflops = 54.7105 (norm. = 0.467391), norm. avg. (of 1) = 0.467391 fft 38: mflops = 8.98815 (norm. = 0.0767857), norm. avg. (of 1) = 0.0767857 fft 39: mflops = 5.82566 (norm. = 0.0497685), norm. avg. (of 1) = 0.0497685 fft 40: mflops = 13.5306 (norm. = 0.115591), norm. avg. (of 1) = 0.115591 fft 41: mflops = 9.39061 (norm. = 0.0802239), norm. avg. (of 1) = 0.0802239 Benchmarking for array size = 4 (power of 2): 0. 
Arndt DIF: elapsed time t=1.64993 s, 4194304 iters, t-(init.)=1.18329 s t(norm)=0.0352647, mflops=141.785 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.63327 s, 4194304 iters, t-(init.)=1.41661 s t(norm)=0.0422183, mflops=118.432 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.81659 s, 2097152 iters, t-(init.)=1.7166 s t(norm)=0.102317, mflops=48.8676 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.14995 s, 262144 iters, t-(init.)=1.13329 s t(norm)=0.540394, mflops=9.25251 (err=1.3e-16) 4. Bailey: elapsed time t=1.98325 s, 2097152 iters, t-(init.)=1.86659 s t(norm)=0.111258, mflops=44.9408 (err=1.3e-16) 5. Beauregard: elapsed time t=1.41661 s, 524288 iters, t-(init.)=1.38328 s t(norm)=0.329799, mflops=15.1607 (err=6.5e-17) 6. Bergland: elapsed time t=1.21662 s, 1048576 iters, t-(init.)=1.16662 s t(norm)=0.139072, mflops=35.9526 (err=5.3e-17) 7. Brenner: elapsed time t=1.28328 s, 1048576 iters, t-(init.)=1.23328 s t(norm)=0.147019, mflops=34.0092 (err=5.3e-17) 8. Burrus: elapsed time t=1.79993 s, 2097152 iters, t-(init.)=1.69993 s t(norm)=0.101324, mflops=49.3467 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.39994 s, 1048576 iters, t-(init.)=1.34995 s t(norm)=0.160926, mflops=31.0702 10. CWP (best N) (N=15): elapsed time t=1.34995 s, 524288 iters, t-(init.)=1.28328 s t(norm)=0.305958, mflops=16.3421 11. Edelblute: elapsed time t=1.04996 s, 1048576 iters, t-(init.)=0.983294 s t(norm)=0.117218, mflops=42.6556 (err=1.3e-16) 12. FFTPACK: elapsed time t=2.01659 s, 4194304 iters, t-(init.)=1.79993 s t(norm)=0.053642, mflops=93.2105 (err=5.3e-17) 13. FFTPACK (f2c): elapsed time t=1.24995 s, 2097152 iters, t-(init.)=1.13329 s t(norm)=0.0675492, mflops=74.0201 (err=5.3e-17) FFTW_MEASURE plan: (cost = 2.225151e-07) FFTW_NOTW 4 14. FFTW: elapsed time t=1.98325 s, 8388608 iters, t-(init.)=1.58327 s t(norm)=0.0235926, mflops=211.931 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 15. 
FFTW_ESTIMATE: elapsed time t=1.96659 s, 8388608 iters, t-(init.)=0.983294 s t(norm)=0.0146522, mflops=341.245 (err=5.3e-17) 16. Frigo-old: elapsed time t=1.36661 s, 8388608 iters, t-(init.)=0.949962 s t(norm)=0.0141555, mflops=353.219 (err=5.3e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.19995 s, 2097152 iters, t-(init.)=1.08329 s t(norm)=0.0645691, mflops=77.4364 (err=5.3e-17) 19. GSL DIT: elapsed time t=1.04996 s, 524288 iters, t-(init.)=1.01663 s t(norm)=0.242383, mflops=20.6285 (err=6.5e-17) 20. GSL DIF: elapsed time t=1.98325 s, 1048576 iters, t-(init.)=1.93326 s t(norm)=0.230462, mflops=21.6955 (err=6.5e-17) 21. Krukar: elapsed time t=1.53327 s, 4194304 iters, t-(init.)=1.31661 s t(norm)=0.0392382, mflops=127.427 (err=5.3e-17) 22. Mayer (Buneman): elapsed time t=1.26662 s, 2097152 iters, t-(init.)=1.14995 s t(norm)=0.0685426, mflops=72.9473 (err=1.3e-16) 23. Mayer (simple): elapsed time t=1.19995 s, 2097152 iters, t-(init.)=1.09996 s t(norm)=0.0655625, mflops=76.2631 24. Mayer (lookup): elapsed time t=1.21662 s, 2097152 iters, t-(init.)=1.09996 s t(norm)=0.0655625, mflops=76.2631 (err=1.3e-16) 25. Monro: elapsed time t=1.94992 s, 262144 iters, t-(init.)=1.94992 s t(norm)=0.929795, mflops=5.37753 (err=1.3e-16) 26. NAPACK (f2c): elapsed time t=1.7166 s, 524288 iters, t-(init.)=1.69993 s t(norm)=0.405295, mflops=12.3367 (err=1.6e-16) 27. Nielsen: elapsed time t=1.68327 s, 524288 iters, t-(init.)=1.6666 s t(norm)=0.397348, mflops=12.5834 (err=1.3e-16) 28. NR (C): elapsed time t=1.98325 s, 1048576 iters, t-(init.)=1.93326 s t(norm)=0.230462, mflops=21.6955 (err=6.5e-17) 29. NR (F): elapsed time t=1.13329 s, 524288 iters, t-(init.)=1.09996 s t(norm)=0.26225, mflops=19.0658 (err=6.5e-17) 30. Ooura (C): elapsed time t=1.91659 s, 4194304 iters, t-(init.)=1.7166 s t(norm)=0.0511586, mflops=97.7353 (err=5.3e-17) 31. 
Ooura (F): elapsed time t=1.46661 s, 4194304 iters, t-(init.)=1.24995 s t(norm)=0.0372514, mflops=134.223 (err=5.3e-17) 32. Ransom: elapsed time t=1.21662 s, 262144 iters, t-(init.)=1.21662 s t(norm)=0.580129, mflops=8.61878 (err=1.6e-16) 33. SCIPORT: elapsed time t=1.33328 s, 2097152 iters, t-(init.)=1.23328 s t(norm)=0.0735095, mflops=68.0185 (err=6.5e-17) 34. Singleton: elapsed time t=1.44994 s, 1048576 iters, t-(init.)=1.39994 s t(norm)=0.166886, mflops=29.9605 (err=5.3e-17) 35. Singleton (f2c): elapsed time t=1.48327 s, 1048576 iters, t-(init.)=1.41661 s t(norm)=0.168873, mflops=29.608 (err=5.3e-17) 36. Sorensen: elapsed time t=1.33328 s, 2097152 iters, t-(init.)=1.23328 s t(norm)=0.0735095, mflops=68.0185 (err=1.3e-16) 37. Sorensen DIT: elapsed time t=1.83326 s, 2097152 iters, t-(init.)=1.63327 s t(norm)=0.0973504, mflops=51.3609 (err=1.3e-16) 38. Temperton: elapsed time t=1.34995 s, 1048576 iters, t-(init.)=1.29995 s t(norm)=0.154966, mflops=32.2652 (err=5.3e-17) 39. Temperton (f2c): elapsed time t=1.06662 s, 524288 iters, t-(init.)=1.03329 s t(norm)=0.246356, mflops=20.2958 (err=5.3e-17) 40. Valkenburg: elapsed time t=1.43328 s, 524288 iters, t-(init.)=1.39994 s t(norm)=0.333773, mflops=14.9803 (err=1.6e-16) 41. DXML: elapsed time t=1.16662 s, 1048576 iters, t-(init.)=1.11662 s t(norm)=0.133112, mflops=37.5624 (err=5.3e-17) Top mflops for N=4 = 353.219 Normalized results and averages for N=4: fft 0: mflops = 141.785 (norm. = 0.401408), norm. avg. (of 2) = 0.559038 fft 1: mflops = 118.432 (norm. = 0.335294), norm. avg. (of 2) = 0.429842 fft 2: mflops = 48.8676 (norm. = 0.13835), norm. avg. (of 2) = 0.293133 fft 3: mflops = 9.25251 (norm. = 0.0261949), norm. avg. (of 2) = 0.0239341 fft 4: mflops = 44.9408 (norm. = 0.127232), norm. avg. (of 2) = 0.157914 fft 5: mflops = 15.1607 (norm. = 0.0429217), norm. avg. (of 2) = 0.0554798 fft 6: mflops = 35.9526 (norm. = 0.101786), norm. avg. (of 2) = 0.0956845 fft 7: mflops = 34.0092 (norm. = 0.0962838), norm. avg. 
(of 2) = 0.102435 fft 8: mflops = 49.3467 (norm. = 0.139706), norm. avg. (of 2) = 0.227941 fft 9: mflops = 31.0702 (norm. = 0.087963), norm. avg. (of 2) = 0.0788841 fft 10: mflops = 16.3421 (norm. = 0.0462662), norm. avg. (of 2) = 0.0555126 fft 11: mflops = 42.6556 (norm. = 0.120763), norm. avg. (of 1) = 0.120763 fft 12: mflops = 93.2105 (norm. = 0.263889), norm. avg. (of 2) = 0.254104 fft 13: mflops = 74.0201 (norm. = 0.209559), norm. avg. (of 2) = 0.206195 fft 14: mflops = 211.931 (norm. = 0.6), norm. avg. (of 2) = 0.598611 fft 15: mflops = 341.245 (norm. = 0.966102), norm. avg. (of 2) = 0.829825 fft 16: mflops = 353.219 (norm. = 1), norm. avg. (of 2) = 1 fft 17: mflops = -1 (norm. = -0.00283111), norm. avg. (of 0) = -1 fft 18: mflops = 77.4364 (norm. = 0.219231), norm. avg. (of 2) = 0.19773 fft 19: mflops = 20.6285 (norm. = 0.0584016), norm. avg. (of 2) = 0.0725476 fft 20: mflops = 21.6955 (norm. = 0.0614224), norm. avg. (of 2) = 0.0804797 fft 21: mflops = 127.427 (norm. = 0.360759), norm. avg. (of 2) = 0.387111 fft 22: mflops = 72.9473 (norm. = 0.206522), norm. avg. (of 1) = 0.206522 fft 23: mflops = 76.2631 (norm. = 0.215909), norm. avg. (of 1) = 0.215909 fft 24: mflops = 76.2631 (norm. = 0.215909), norm. avg. (of 1) = 0.215909 fft 25: mflops = 5.37753 (norm. = 0.0152244), norm. avg. (of 1) = 0.0152244 fft 26: mflops = 12.3367 (norm. = 0.0349265), norm. avg. (of 2) = 0.0451694 fft 27: mflops = 12.5834 (norm. = 0.035625), norm. avg. (of 2) = 0.0316656 fft 28: mflops = 21.6955 (norm. = 0.0614224), norm. avg. (of 2) = 0.0850041 fft 29: mflops = 19.0658 (norm. = 0.0539773), norm. avg. (of 2) = 0.0717803 fft 30: mflops = 97.7353 (norm. = 0.276699), norm. avg. (of 2) = 0.403782 fft 31: mflops = 134.223 (norm. = 0.38), norm. avg. (of 2) = 0.45875 fft 32: mflops = 8.61878 (norm. = 0.0244007), norm. avg. (of 1) = 0.0244007 fft 33: mflops = 68.0185 (norm. = 0.192568), norm. avg. (of 1) = 0.192568 fft 34: mflops = 29.9605 (norm. = 0.0848214), norm. avg. 
(of 2) = 0.0797371 fft 35: mflops = 29.608 (norm. = 0.0838235), norm. avg. (of 2) = 0.0787268 fft 36: mflops = 68.0185 (norm. = 0.192568), norm. avg. (of 2) = 0.241554 fft 37: mflops = 51.3609 (norm. = 0.145408), norm. avg. (of 2) = 0.3064 fft 38: mflops = 32.2652 (norm. = 0.0913462), norm. avg. (of 2) = 0.0840659 fft 39: mflops = 20.2958 (norm. = 0.0574597), norm. avg. (of 2) = 0.0536141 fft 40: mflops = 14.9803 (norm. = 0.0424107), norm. avg. (of 2) = 0.0790011 fft 41: mflops = 37.5624 (norm. = 0.106343), norm. avg. (of 2) = 0.0932836 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.44994 s, 2097152 iters, t-(init.)=1.13329 s t(norm)=0.0225164, mflops=222.06 (err=1.2e-16) 1. Arndt DIT: elapsed time t=1.54994 s, 2097152 iters, t-(init.)=1.34995 s t(norm)=0.026821, mflops=186.421 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.13329 s, 524288 iters, t-(init.)=1.08329 s t(norm)=0.0860922, mflops=58.0773 (err=1.1e-16) 3. Arndt 4-step: elapsed time t=1.19995 s, 131072 iters, t-(init.)=1.18329 s t(norm)=0.376156, mflops=13.2923 (err=1.3e-16) 4. Bailey: elapsed time t=1.48327 s, 1048576 iters, t-(init.)=1.38328 s t(norm)=0.0549665, mflops=90.9644 (err=9.8e-17) 5. Beauregard: elapsed time t=1.6166 s, 262144 iters, t-(init.)=1.59994 s t(norm)=0.254303, mflops=19.6616 (err=1.2e-16) 6. Bergland: elapsed time t=1.01663 s, 524288 iters, t-(init.)=0.966628 s t(norm)=0.0768207, mflops=65.0866 (err=1.3e-16) 7. Brenner: elapsed time t=1.53327 s, 524288 iters, t-(init.)=1.48327 s t(norm)=0.11788, mflops=42.416 (err=1.2e-16) 8. Burrus: elapsed time t=1.26662 s, 524288 iters, t-(init.)=1.21662 s t(norm)=0.0966881, mflops=51.7127 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.59994 s, 1048576 iters, t-(init.)=1.49994 s t(norm)=0.0596023, mflops=83.8894 10. CWP (best N) (N=15): elapsed time t=1.36661 s, 524288 iters, t-(init.)=1.29995 s t(norm)=0.103311, mflops=48.3978 11. 
Edelblute: elapsed time t=1.64993 s, 524288 iters, t-(init.)=1.59994 s t(norm)=0.127151, mflops=39.3232 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.19995 s, 1048576 iters, t-(init.)=1.09996 s t(norm)=0.0437083, mflops=114.395 (err=1.2e-16) 13. FFTPACK (f2c): elapsed time t=1.36661 s, 1048576 iters, t-(init.)=1.26662 s t(norm)=0.0503308, mflops=99.3428 (err=1.2e-16) FFTW_MEASURE plan: (cost = 3.973484e-07) FFTW_NOTW 8 14. FFTW: elapsed time t=1.68327 s, 4194304 iters, t-(init.)=1.29995 s t(norm)=0.0129138, mflops=387.182 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.6666 s, 4194304 iters, t-(init.)=0.99996 s t(norm)=0.00993371, mflops=503.337 (err=1.2e-16) 16. Frigo-old: elapsed time t=1.28328 s, 4194304 iters, t-(init.)=0.883298 s t(norm)=0.00877478, mflops=569.815 (err=1.4e-16) 17. Green: elapsed time t=1.6666 s, 2097152 iters, t-(init.)=1.46661 s t(norm)=0.0291389, mflops=171.592 (err=1.4e-16) 18. GSL: elapsed time t=1.39994 s, 1048576 iters, t-(init.)=1.29995 s t(norm)=0.0516553, mflops=96.7955 (err=1.4e-16) 19. GSL DIT: elapsed time t=1.89992 s, 524288 iters, t-(init.)=1.84993 s t(norm)=0.147019, mflops=34.0092 (err=1.2e-16) 20. GSL DIF: elapsed time t=1.78326 s, 524288 iters, t-(init.)=1.73326 s t(norm)=0.137747, mflops=36.2983 (err=1.4e-16) 21. Krukar: elapsed time t=1.44994 s, 2097152 iters, t-(init.)=1.24995 s t(norm)=0.0248343, mflops=201.335 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.31661 s, 1048576 iters, t-(init.)=1.21662 s t(norm)=0.0483441, mflops=103.425 (err=1.2e-16) 23. Mayer (simple): elapsed time t=1.26662 s, 1048576 iters, t-(init.)=1.18329 s t(norm)=0.0470196, mflops=106.339 24. Mayer (lookup): elapsed time t=1.31661 s, 1048576 iters, t-(init.)=1.23328 s t(norm)=0.0490063, mflops=102.028 (err=1.2e-16) 25. Monro: elapsed time t=1.09996 s, 131072 iters, t-(init.)=1.08329 s t(norm)=0.344369, mflops=14.5193 (err=1.1e-08) 26. 
NAPACK (f2c): elapsed time t=1.68327 s, 262144 iters, t-(init.)=1.64993 s t(norm)=0.26225, mflops=19.0658 (err=1.7e-16) 27. Nielsen: elapsed time t=1.03329 s, 262144 iters, t-(init.)=1.01663 s t(norm)=0.161588, mflops=30.9428 (err=6.9e-16) 28. NR (C): elapsed time t=1.93326 s, 524288 iters, t-(init.)=1.88326 s t(norm)=0.149668, mflops=33.4073 (err=1.5e-16) 29. NR (F): elapsed time t=1.04996 s, 262144 iters, t-(init.)=1.03329 s t(norm)=0.164237, mflops=30.4437 (err=1.5e-16) 30. Ooura (C): elapsed time t=1.5666 s, 2097152 iters, t-(init.)=1.36661 s t(norm)=0.0271521, mflops=184.148 (err=1.3e-16) 31. Ooura (F): elapsed time t=1.49994 s, 2097152 iters, t-(init.)=1.31661 s t(norm)=0.0261588, mflops=191.14 (err=1.3e-16) 32. Ransom: elapsed time t=1.31661 s, 131072 iters, t-(init.)=1.29995 s t(norm)=0.413242, mflops=12.0994 (err=3.4e-16) 33. SCIPORT: elapsed time t=1.34995 s, 1048576 iters, t-(init.)=1.24995 s t(norm)=0.0496686, mflops=100.667 (err=1.4e-16) 34. Singleton: elapsed time t=1.96659 s, 524288 iters, t-(init.)=1.91659 s t(norm)=0.152317, mflops=32.8263 (err=1.4e-16) 35. Singleton (f2c): elapsed time t=1.04996 s, 262144 iters, t-(init.)=1.01663 s t(norm)=0.161588, mflops=30.9428 (err=1.4e-16) 36. Sorensen: elapsed time t=1.39994 s, 1048576 iters, t-(init.)=1.29995 s t(norm)=0.0516553, mflops=96.7955 (err=1.5e-16) 37. Sorensen DIT: elapsed time t=1.34995 s, 524288 iters, t-(init.)=1.28328 s t(norm)=0.101986, mflops=49.0263 (err=1.1e-16) 38. Temperton: elapsed time t=1.26662 s, 524288 iters, t-(init.)=1.21662 s t(norm)=0.0966881, mflops=51.7127 (err=4.6e-09) 39. Temperton (f2c): elapsed time t=1.84993 s, 524288 iters, t-(init.)=1.79993 s t(norm)=0.143045, mflops=34.9539 (err=1.4e-16) 40. Valkenburg: elapsed time t=1.91659 s, 262144 iters, t-(init.)=1.88326 s t(norm)=0.299336, mflops=16.7036 (err=1.5e-16) 41. 
DXML: elapsed time t=1.34995 s, 1048576 iters, t-(init.)=1.24995 s t(norm)=0.0496686, mflops=100.667 (err=1.0e-15) Top mflops for N=8 = 569.815 Normalized results and averages for N=8: fft 0: mflops = 222.06 (norm. = 0.389706), norm. avg. (of 3) = 0.502594 fft 1: mflops = 186.421 (norm. = 0.32716), norm. avg. (of 3) = 0.395615 fft 2: mflops = 58.0773 (norm. = 0.101923), norm. avg. (of 3) = 0.229396 fft 3: mflops = 13.2923 (norm. = 0.0233275), norm. avg. (of 3) = 0.0237319 fft 4: mflops = 90.9644 (norm. = 0.159639), norm. avg. (of 3) = 0.158489 fft 5: mflops = 19.6616 (norm. = 0.0345052), norm. avg. (of 3) = 0.0484883 fft 6: mflops = 65.0866 (norm. = 0.114224), norm. avg. (of 3) = 0.101864 fft 7: mflops = 42.416 (norm. = 0.0744382), norm. avg. (of 3) = 0.0931026 fft 8: mflops = 51.7127 (norm. = 0.0907534), norm. avg. (of 3) = 0.182212 fft 9: mflops = 83.8894 (norm. = 0.147222), norm. avg. (of 3) = 0.101663 fft 10: mflops = 48.3978 (norm. = 0.0849359), norm. avg. (of 3) = 0.0653204 fft 11: mflops = 39.3232 (norm. = 0.0690104), norm. avg. (of 2) = 0.0948866 fft 12: mflops = 114.395 (norm. = 0.200758), norm. avg. (of 3) = 0.236322 fft 13: mflops = 99.3428 (norm. = 0.174342), norm. avg. (of 3) = 0.195577 fft 14: mflops = 387.182 (norm. = 0.679487), norm. avg. (of 3) = 0.62557 fft 15: mflops = 503.337 (norm. = 0.883333), norm. avg. (of 3) = 0.847661 fft 16: mflops = 569.815 (norm. = 1), norm. avg. (of 3) = 1 fft 17: mflops = 171.592 (norm. = 0.301136), norm. avg. (of 1) = 0.301136 fft 18: mflops = 96.7955 (norm. = 0.169872), norm. avg. (of 3) = 0.188444 fft 19: mflops = 34.0092 (norm. = 0.0596847), norm. avg. (of 3) = 0.06826 fft 20: mflops = 36.2983 (norm. = 0.0637019), norm. avg. (of 3) = 0.0748871 fft 21: mflops = 201.335 (norm. = 0.353333), norm. avg. (of 3) = 0.375851 fft 22: mflops = 103.425 (norm. = 0.181507), norm. avg. (of 2) = 0.194014 fft 23: mflops = 106.339 (norm. = 0.18662), norm. avg. (of 2) = 0.201264 fft 24: mflops = 102.028 (norm. = 0.179054), norm. 
avg. (of 2) = 0.197482 fft 25: mflops = 14.5193 (norm. = 0.0254808), norm. avg. (of 2) = 0.0203526 fft 26: mflops = 19.0658 (norm. = 0.0334596), norm. avg. (of 3) = 0.0412661 fft 27: mflops = 30.9428 (norm. = 0.0543033), norm. avg. (of 3) = 0.0392115 fft 28: mflops = 33.4073 (norm. = 0.0586283), norm. avg. (of 3) = 0.0762122 fft 29: mflops = 30.4437 (norm. = 0.0534274), norm. avg. (of 3) = 0.0656627 fft 30: mflops = 184.148 (norm. = 0.323171), norm. avg. (of 3) = 0.376911 fft 31: mflops = 191.14 (norm. = 0.335443), norm. avg. (of 3) = 0.417648 fft 32: mflops = 12.0994 (norm. = 0.021234), norm. avg. (of 2) = 0.0228173 fft 33: mflops = 100.667 (norm. = 0.176667), norm. avg. (of 2) = 0.184617 fft 34: mflops = 32.8263 (norm. = 0.0576087), norm. avg. (of 3) = 0.072361 fft 35: mflops = 30.9428 (norm. = 0.0543033), norm. avg. (of 3) = 0.0705856 fft 36: mflops = 96.7955 (norm. = 0.169872), norm. avg. (of 3) = 0.21766 fft 37: mflops = 49.0263 (norm. = 0.086039), norm. avg. (of 3) = 0.232946 fft 38: mflops = 51.7127 (norm. = 0.0907534), norm. avg. (of 3) = 0.0862951 fft 39: mflops = 34.9539 (norm. = 0.0613426), norm. avg. (of 3) = 0.0561903 fft 40: mflops = 16.7036 (norm. = 0.0293142), norm. avg. (of 3) = 0.0624388 fft 41: mflops = 100.667 (norm. = 0.176667), norm. avg. (of 3) = 0.121078 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.73326 s, 524288 iters, t-(init.)=1.64993 s t(norm)=0.0491719, mflops=101.684 (err=1.5e-16) 1. Arndt DIT: elapsed time t=1.74993 s, 524288 iters, t-(init.)=1.69993 s t(norm)=0.0506619, mflops=98.6935 (err=2.2e-16) 2. Arndt Split-Radix: elapsed time t=1.38328 s, 262144 iters, t-(init.)=1.34995 s t(norm)=0.0804631, mflops=62.1403 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.93326 s, 131072 iters, t-(init.)=1.91659 s t(norm)=0.228475, mflops=21.8842 (err=2.0e-16) 4. Bailey: elapsed time t=1.29995 s, 524288 iters, t-(init.)=1.23328 s t(norm)=0.0367547, mflops=136.037 (err=2.0e-16) 5. 
Beauregard: elapsed time t=1.89992 s, 131072 iters, t-(init.)=1.88326 s t(norm)=0.224502, mflops=22.2715 (err=2.7e-16) 6. Bergland: elapsed time t=1.01663 s, 262144 iters, t-(init.)=0.99996 s t(norm)=0.0596023, mflops=83.8894 (err=2.6e-16) 7. Brenner: elapsed time t=1.36661 s, 262144 iters, t-(init.)=1.33328 s t(norm)=0.0794697, mflops=62.9171 (err=2.1e-16) 8. Burrus: elapsed time t=1.53327 s, 262144 iters, t-(init.)=1.51661 s t(norm)=0.0903968, mflops=55.3117 (err=1.4e-16) 9. CWP (min N): elapsed time t=1.23328 s, 524288 iters, t-(init.)=1.16662 s t(norm)=0.034768, mflops=143.81 10. CWP (best N) (N=28): elapsed time t=1.7666 s, 524288 iters, t-(init.)=1.68327 s t(norm)=0.0501652, mflops=99.6706 11. Edelblute: elapsed time t=1.03329 s, 131072 iters, t-(init.)=1.01663 s t(norm)=0.121191, mflops=41.2571 (err=1.4e-16) 12. FFTPACK: elapsed time t=1.49994 s, 1048576 iters, t-(init.)=1.38328 s t(norm)=0.0206124, mflops=242.572 (err=1.8e-16) 13. FFTPACK (f2c): elapsed time t=1.04996 s, 524288 iters, t-(init.)=0.99996 s t(norm)=0.0298011, mflops=167.779 (err=1.8e-16) FFTW_MEASURE plan: (cost = 8.264847e-07) FFTW_NOTW 16 14. FFTW: elapsed time t=1.63327 s, 2097152 iters, t-(init.)=1.38328 s t(norm)=0.0103062, mflops=485.144 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.63327 s, 2097152 iters, t-(init.)=1.24995 s t(norm)=0.00931285, mflops=536.892 (err=1.8e-16) 16. Frigo-old: elapsed time t=1.21662 s, 2097152 iters, t-(init.)=0.966628 s t(norm)=0.00720194, mflops=694.257 (err=1.8e-16) 17. Green: elapsed time t=1.73326 s, 1048576 iters, t-(init.)=1.6166 s t(norm)=0.0240892, mflops=207.561 (err=1.9e-16) 18. GSL: elapsed time t=1.09996 s, 524288 iters, t-(init.)=1.03329 s t(norm)=0.0307945, mflops=162.367 (err=1.8e-16) 19. GSL DIT: elapsed time t=1.6666 s, 262144 iters, t-(init.)=1.64993 s t(norm)=0.0983437, mflops=50.8421 (err=2.1e-16) 20. 
GSL DIF: elapsed time t=1.48327 s, 262144 iters, t-(init.)=1.44994 s t(norm)=0.0864233, mflops=57.8548 (err=2.8e-16) 21. Krukar: elapsed time t=1.46661 s, 1048576 iters, t-(init.)=1.34995 s t(norm)=0.0201158, mflops=248.561 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.88326 s, 524288 iters, t-(init.)=1.81659 s t(norm)=0.0541387, mflops=92.3553 (err=1.7e-16) 23. Mayer (simple): elapsed time t=1.53327 s, 524288 iters, t-(init.)=1.46661 s t(norm)=0.0437083, mflops=114.395 24. Mayer (lookup): elapsed time t=1.58327 s, 524288 iters, t-(init.)=1.51661 s t(norm)=0.0451984, mflops=110.623 (err=1.8e-16) 25. Monro: elapsed time t=1.48327 s, 131072 iters, t-(init.)=1.46661 s t(norm)=0.174833, mflops=28.5987 (err=2.1e-08) 26. NAPACK (f2c): elapsed time t=1.58327 s, 131072 iters, t-(init.)=1.5666 s t(norm)=0.186754, mflops=26.7732 (err=3.3e-16) 27. Nielsen: elapsed time t=1.11662 s, 131072 iters, t-(init.)=1.09996 s t(norm)=0.131125, mflops=38.1316 (err=1.6e-16) 28. NR (C): elapsed time t=1.86659 s, 262144 iters, t-(init.)=1.83326 s t(norm)=0.109271, mflops=45.7579 (err=2.1e-16) 29. NR (F): elapsed time t=1.91659 s, 262144 iters, t-(init.)=1.88326 s t(norm)=0.112251, mflops=44.5431 (err=2.1e-16) 30. Ooura (C): elapsed time t=1.63327 s, 1048576 iters, t-(init.)=1.49994 s t(norm)=0.0223508, mflops=223.705 (err=2.0e-16) 31. Ooura (F): elapsed time t=1.88326 s, 1048576 iters, t-(init.)=1.7666 s t(norm)=0.0263243, mflops=189.938 (err=2.0e-16) 32. Ransom: elapsed time t=1.24995 s, 131072 iters, t-(init.)=1.23328 s t(norm)=0.147019, mflops=34.0092 (err=3.4e-16) 33. SCIPORT: elapsed time t=1.38328 s, 524288 iters, t-(init.)=1.31661 s t(norm)=0.0392382, mflops=127.427 (err=2.8e-16) 34. Singleton: elapsed time t=1.03329 s, 262144 iters, t-(init.)=0.99996 s t(norm)=0.0596023, mflops=83.8894 (err=1.7e-16) 35. Singleton (f2c): elapsed time t=1.06662 s, 262144 iters, t-(init.)=1.03329 s t(norm)=0.061589, mflops=81.1833 (err=1.7e-16) 36. 
Sorensen: elapsed time t=1.39994 s, 524288 iters, t-(init.)=1.33328 s t(norm)=0.0397348, mflops=125.834 (err=1.5e-16) 37. Sorensen DIT: elapsed time t=1.6166 s, 262144 iters, t-(init.)=1.58327 s t(norm)=0.0943702, mflops=52.9828 (err=1.6e-16) 38. Temperton: elapsed time t=1.89992 s, 524288 iters, t-(init.)=1.84993 s t(norm)=0.0551321, mflops=90.6913 (err=1.7e-08) 39. Temperton (f2c): elapsed time t=1.24995 s, 262144 iters, t-(init.)=1.23328 s t(norm)=0.0735095, mflops=68.0185 (err=1.8e-16) 40. Valkenburg: elapsed time t=1.18329 s, 65536 iters, t-(init.)=1.16662 s t(norm)=0.278144, mflops=17.9763 (err=2.9e-16) 41. DXML: elapsed time t=1.09996 s, 524288 iters, t-(init.)=1.03329 s t(norm)=0.0307945, mflops=162.367 (err=2.0e-16) Top mflops for N=16 = 694.257 Normalized results and averages for N=16: fft 0: mflops = 101.684 (norm. = 0.146465), norm. avg. (of 4) = 0.413561 fft 1: mflops = 98.6935 (norm. = 0.142157), norm. avg. (of 4) = 0.33225 fft 2: mflops = 62.1403 (norm. = 0.0895062), norm. avg. (of 4) = 0.194424 fft 3: mflops = 21.8842 (norm. = 0.0315217), norm. avg. (of 4) = 0.0256794 fft 4: mflops = 136.037 (norm. = 0.195946), norm. avg. (of 4) = 0.167853 fft 5: mflops = 22.2715 (norm. = 0.0320796), norm. avg. (of 4) = 0.0443861 fft 6: mflops = 83.8894 (norm. = 0.120833), norm. avg. (of 4) = 0.106607 fft 7: mflops = 62.9171 (norm. = 0.090625), norm. avg. (of 4) = 0.0924832 fft 8: mflops = 55.3117 (norm. = 0.0796703), norm. avg. (of 4) = 0.156577 fft 9: mflops = 143.81 (norm. = 0.207143), norm. avg. (of 4) = 0.128033 fft 10: mflops = 99.6706 (norm. = 0.143564), norm. avg. (of 4) = 0.0848814 fft 11: mflops = 41.2571 (norm. = 0.0594262), norm. avg. (of 3) = 0.0830665 fft 12: mflops = 242.572 (norm. = 0.349398), norm. avg. (of 4) = 0.264591 fft 13: mflops = 167.779 (norm. = 0.241667), norm. avg. (of 4) = 0.207099 fft 14: mflops = 485.144 (norm. = 0.698795), norm. avg. (of 4) = 0.643876 fft 15: mflops = 536.892 (norm. = 0.773333), norm. avg. 
(of 4) = 0.829079 fft 16: mflops = 694.257 (norm. = 1), norm. avg. (of 4) = 1 fft 17: mflops = 207.561 (norm. = 0.298969), norm. avg. (of 2) = 0.300053 fft 18: mflops = 162.367 (norm. = 0.233871), norm. avg. (of 4) = 0.199801 fft 19: mflops = 50.8421 (norm. = 0.0732323), norm. avg. (of 4) = 0.069503 fft 20: mflops = 57.8548 (norm. = 0.0833333), norm. avg. (of 4) = 0.0769987 fft 21: mflops = 248.561 (norm. = 0.358025), norm. avg. (of 4) = 0.371395 fft 22: mflops = 92.3553 (norm. = 0.133028), norm. avg. (of 3) = 0.173685 fft 23: mflops = 114.395 (norm. = 0.164773), norm. avg. (of 3) = 0.189101 fft 24: mflops = 110.623 (norm. = 0.159341), norm. avg. (of 3) = 0.184768 fft 25: mflops = 28.5987 (norm. = 0.0411932), norm. avg. (of 3) = 0.0272994 fft 26: mflops = 26.7732 (norm. = 0.0385638), norm. avg. (of 4) = 0.0405906 fft 27: mflops = 38.1316 (norm. = 0.0549242), norm. avg. (of 4) = 0.0431397 fft 28: mflops = 45.7579 (norm. = 0.0659091), norm. avg. (of 4) = 0.0736364 fft 29: mflops = 44.5431 (norm. = 0.0641593), norm. avg. (of 4) = 0.0652868 fft 30: mflops = 223.705 (norm. = 0.322222), norm. avg. (of 4) = 0.363239 fft 31: mflops = 189.938 (norm. = 0.273585), norm. avg. (of 4) = 0.381632 fft 32: mflops = 34.0092 (norm. = 0.0489865), norm. avg. (of 3) = 0.0315404 fft 33: mflops = 127.427 (norm. = 0.183544), norm. avg. (of 3) = 0.18426 fft 34: mflops = 83.8894 (norm. = 0.120833), norm. avg. (of 4) = 0.0844791 fft 35: mflops = 81.1833 (norm. = 0.116935), norm. avg. (of 4) = 0.0821731 fft 36: mflops = 125.834 (norm. = 0.18125), norm. avg. (of 4) = 0.208557 fft 37: mflops = 52.9828 (norm. = 0.0763158), norm. avg. (of 4) = 0.193789 fft 38: mflops = 90.6913 (norm. = 0.130631), norm. avg. (of 4) = 0.097379 fft 39: mflops = 68.0185 (norm. = 0.097973), norm. avg. (of 4) = 0.0666359 fft 40: mflops = 17.9763 (norm. = 0.0258929), norm. avg. (of 4) = 0.0533023 fft 41: mflops = 162.367 (norm. = 0.233871), norm. avg. (of 4) = 0.149276 Benchmarking for array size = 32 (power of 2): 0. 
Arndt DIF: elapsed time t=1.79993 s, 262144 iters, t-(init.)=1.73326 s t(norm)=0.0413242, mflops=120.994 (err=3.1e-16) 1. Arndt DIT: elapsed time t=1.88326 s, 262144 iters, t-(init.)=1.83326 s t(norm)=0.0437083, mflops=114.395 (err=2.5e-16) 2. Arndt Split-Radix: elapsed time t=1.54994 s, 131072 iters, t-(init.)=1.51661 s t(norm)=0.0723174, mflops=69.1396 (err=2.7e-16) 3. Arndt 4-step: elapsed time t=1.03329 s, 32768 iters, t-(init.)=1.01663 s t(norm)=0.193906, mflops=25.7857 (err=2.8e-16) 4. Bailey: elapsed time t=1.38328 s, 262144 iters, t-(init.)=1.33328 s t(norm)=0.0317879, mflops=157.293 (err=2.7e-16) 5. Beauregard: elapsed time t=1.16662 s, 32768 iters, t-(init.)=1.16662 s t(norm)=0.222515, mflops=22.4704 (err=1.8e-16) 6. Bergland: elapsed time t=1.69993 s, 262144 iters, t-(init.)=1.64993 s t(norm)=0.0393375, mflops=127.105 (err=2.6e-16) 7. Brenner: elapsed time t=1.41661 s, 131072 iters, t-(init.)=1.38328 s t(norm)=0.0659598, mflops=75.8037 (err=2.2e-16) 8. Burrus: elapsed time t=1.73326 s, 131072 iters, t-(init.)=1.7166 s t(norm)=0.0818538, mflops=61.0845 (err=2.9e-16) 9. CWP (min N) (N=33): elapsed time t=1.26662 s, 262144 iters, t-(init.)=1.21662 s t(norm)=0.0290064, mflops=172.376 10. CWP (best N) (N=35): elapsed time t=1.08329 s, 262144 iters, t-(init.)=1.03329 s t(norm)=0.0246356, mflops=202.958 11. Edelblute: elapsed time t=1.14995 s, 65536 iters, t-(init.)=1.13329 s t(norm)=0.108079, mflops=46.2626 (err=2.9e-16) 12. FFTPACK: elapsed time t=1.01663 s, 262144 iters, t-(init.)=0.966628 s t(norm)=0.0230462, mflops=216.955 (err=1.9e-16) 13. FFTPACK (f2c): elapsed time t=1.6666 s, 262144 iters, t-(init.)=1.6166 s t(norm)=0.0385428, mflops=129.726 (err=1.9e-16) FFTW_MEASURE plan: (cost = 1.398666e-06) FFTW_NOTW 32 14. FFTW: elapsed time t=1.58327 s, 1048576 iters, t-(init.)=1.38328 s t(norm)=0.00824498, mflops=606.43 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.54994 s, 1048576 iters, t-(init.)=1.28328 s t(norm)=0.00764896, mflops=653.684 (err=2.1e-16) 16. Frigo-old: elapsed time t=1.6666 s, 1048576 iters, t-(init.)=1.46661 s t(norm)=0.00874166, mflops=571.973 (err=2.2e-16) 17. Green: elapsed time t=1.59994 s, 524288 iters, t-(init.)=1.49994 s t(norm)=0.0178807, mflops=279.631 (err=2.0e-16) 18. GSL: elapsed time t=1.53327 s, 262144 iters, t-(init.)=1.48327 s t(norm)=0.035364, mflops=141.387 (err=2.0e-16) 19. GSL DIT: elapsed time t=1.58327 s, 131072 iters, t-(init.)=1.5666 s t(norm)=0.0747015, mflops=66.9331 (err=2.2e-16) 20. GSL DIF: elapsed time t=1.36661 s, 131072 iters, t-(init.)=1.33328 s t(norm)=0.0635757, mflops=78.6463 (err=2.5e-16) 21. Krukar: elapsed time t=1.83326 s, 524288 iters, t-(init.)=1.73326 s t(norm)=0.0206621, mflops=241.989 (err=2.2e-16) 22. Mayer (Buneman): elapsed time t=1.99992 s, 262144 iters, t-(init.)=1.96659 s t(norm)=0.0468871, mflops=106.639 (err=2.7e-16) 23. Mayer (simple): elapsed time t=1.6166 s, 262144 iters, t-(init.)=1.5666 s t(norm)=0.0373507, mflops=133.866 24. Mayer (lookup): elapsed time t=1.68327 s, 262144 iters, t-(init.)=1.63327 s t(norm)=0.0389401, mflops=128.402 (err=2.5e-16) 25. Monro: elapsed time t=1.04996 s, 65536 iters, t-(init.)=1.03329 s t(norm)=0.0985424, mflops=50.7396 (err=3.7e-08) 26. NAPACK (f2c): elapsed time t=1.69993 s, 65536 iters, t-(init.)=1.69993 s t(norm)=0.162118, mflops=30.8417 (err=5.4e-16) 27. Nielsen: elapsed time t=1.69993 s, 131072 iters, t-(init.)=1.6666 s t(norm)=0.0794697, mflops=62.9171 (err=1.1e-15) 28. NR (C): elapsed time t=1.84993 s, 131072 iters, t-(init.)=1.81659 s t(norm)=0.086622, mflops=57.7221 (err=2.0e-16) 29. NR (F): elapsed time t=1.78326 s, 131072 iters, t-(init.)=1.7666 s t(norm)=0.0842379, mflops=59.3557 (err=2.0e-16) 30. Ooura (C): elapsed time t=1.78326 s, 524288 iters, t-(init.)=1.69993 s t(norm)=0.0202648, mflops=246.734 (err=2.7e-16) 31. 
Ooura (F): elapsed time t=1.03329 s, 262144 iters, t-(init.)=0.983294 s t(norm)=0.0234436, mflops=213.278 (err=2.7e-16) 32. Ransom: elapsed time t=1.39994 s, 65536 iters, t-(init.)=1.38328 s t(norm)=0.13192, mflops=37.9019 (err=7.0e-16) 33. SCIPORT: elapsed time t=1.41661 s, 262144 iters, t-(init.)=1.36661 s t(norm)=0.0325826, mflops=153.456 (err=1.8e-16) 34. Singleton: elapsed time t=1.91659 s, 262144 iters, t-(init.)=1.86659 s t(norm)=0.044503, mflops=112.352 (err=2.2e-16) 35. Singleton (f2c): elapsed time t=1.01663 s, 131072 iters, t-(init.)=0.983294 s t(norm)=0.0468871, mflops=106.639 (err=2.2e-16) 36. Sorensen: elapsed time t=1.34995 s, 262144 iters, t-(init.)=1.29995 s t(norm)=0.0309932, mflops=161.326 (err=2.7e-16) 37. Sorensen DIT: elapsed time t=1.83326 s, 131072 iters, t-(init.)=1.81659 s t(norm)=0.086622, mflops=57.7221 (err=2.6e-16) 38. Temperton: elapsed time t=1.06662 s, 131072 iters, t-(init.)=1.03329 s t(norm)=0.0492712, mflops=101.479 (err=3.1e-08) 39. Temperton (f2c): elapsed time t=1.39994 s, 131072 iters, t-(init.)=1.38328 s t(norm)=0.0659598, mflops=75.8037 (err=2.0e-16) 40. Valkenburg: elapsed time t=1.41661 s, 32768 iters, t-(init.)=1.39994 s t(norm)=0.267018, mflops=18.7253 (err=4.3e-16) 41. DXML: elapsed time t=1.91659 s, 524288 iters, t-(init.)=1.81659 s t(norm)=0.0216555, mflops=230.888 (err=1.1e-15) Top mflops for N=32 = 653.684 Normalized results and averages for N=32: fft 0: mflops = 120.994 (norm. = 0.185096), norm. avg. (of 5) = 0.367868 fft 1: mflops = 114.395 (norm. = 0.175), norm. avg. (of 5) = 0.3008 fft 2: mflops = 69.1396 (norm. = 0.105769), norm. avg. (of 5) = 0.176693 fft 3: mflops = 25.7857 (norm. = 0.0394467), norm. avg. (of 5) = 0.0284328 fft 4: mflops = 157.293 (norm. = 0.240625), norm. avg. (of 5) = 0.182408 fft 5: mflops = 22.4704 (norm. = 0.034375), norm. avg. (of 5) = 0.0423839 fft 6: mflops = 127.105 (norm. = 0.194444), norm. avg. (of 5) = 0.124174 fft 7: mflops = 75.8037 (norm. = 0.115964), norm. avg. 
(of 5) = 0.0971793 fft 8: mflops = 61.0845 (norm. = 0.0934466), norm. avg. (of 5) = 0.143951 fft 9: mflops = 172.376 (norm. = 0.263699), norm. avg. (of 5) = 0.155166 fft 10: mflops = 202.958 (norm. = 0.310484), norm. avg. (of 5) = 0.130002 fft 11: mflops = 46.2626 (norm. = 0.0707721), norm. avg. (of 4) = 0.0799929 fft 12: mflops = 216.955 (norm. = 0.331897), norm. avg. (of 5) = 0.278052 fft 13: mflops = 129.726 (norm. = 0.198454), norm. avg. (of 5) = 0.20537 fft 14: mflops = 606.43 (norm. = 0.927711), norm. avg. (of 5) = 0.700643 fft 15: mflops = 653.684 (norm. = 1), norm. avg. (of 5) = 0.863263 fft 16: mflops = 571.973 (norm. = 0.875), norm. avg. (of 5) = 0.975 fft 17: mflops = 279.631 (norm. = 0.427778), norm. avg. (of 3) = 0.342628 fft 18: mflops = 141.387 (norm. = 0.216292), norm. avg. (of 5) = 0.203099 fft 19: mflops = 66.9331 (norm. = 0.102394), norm. avg. (of 5) = 0.0760812 fft 20: mflops = 78.6463 (norm. = 0.120313), norm. avg. (of 5) = 0.0856614 fft 21: mflops = 241.989 (norm. = 0.370192), norm. avg. (of 5) = 0.371154 fft 22: mflops = 106.639 (norm. = 0.163136), norm. avg. (of 4) = 0.171048 fft 23: mflops = 133.866 (norm. = 0.204787), norm. avg. (of 4) = 0.193022 fft 24: mflops = 128.402 (norm. = 0.196429), norm. avg. (of 4) = 0.187683 fft 25: mflops = 50.7396 (norm. = 0.077621), norm. avg. (of 4) = 0.0398798 fft 26: mflops = 30.8417 (norm. = 0.0471814), norm. avg. (of 5) = 0.0419087 fft 27: mflops = 62.9171 (norm. = 0.09625), norm. avg. (of 5) = 0.0537617 fft 28: mflops = 57.7221 (norm. = 0.0883028), norm. avg. (of 5) = 0.0765697 fft 29: mflops = 59.3557 (norm. = 0.0908019), norm. avg. (of 5) = 0.0703898 fft 30: mflops = 246.734 (norm. = 0.377451), norm. avg. (of 5) = 0.366081 fft 31: mflops = 213.278 (norm. = 0.326271), norm. avg. (of 5) = 0.37056 fft 32: mflops = 37.9019 (norm. = 0.0579819), norm. avg. (of 4) = 0.0381508 fft 33: mflops = 153.456 (norm. = 0.234756), norm. avg. (of 4) = 0.196884 fft 34: mflops = 112.352 (norm. = 0.171875), norm. avg. 
(of 5) = 0.101958 fft 35: mflops = 106.639 (norm. = 0.163136), norm. avg. (of 5) = 0.0983656 fft 36: mflops = 161.326 (norm. = 0.246795), norm. avg. (of 5) = 0.216205 fft 37: mflops = 57.7221 (norm. = 0.0883028), norm. avg. (of 5) = 0.172691 fft 38: mflops = 101.479 (norm. = 0.155242), norm. avg. (of 5) = 0.108952 fft 39: mflops = 75.8037 (norm. = 0.115964), norm. avg. (of 5) = 0.0765015 fft 40: mflops = 18.7253 (norm. = 0.0286458), norm. avg. (of 5) = 0.048371 fft 41: mflops = 230.888 (norm. = 0.353211), norm. avg. (of 5) = 0.190063 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.08329 s, 65536 iters, t-(init.)=1.06662 s t(norm)=0.0423838, mflops=117.97 (err=5.7e-16) 1. Arndt DIT: elapsed time t=1.11662 s, 65536 iters, t-(init.)=1.08329 s t(norm)=0.0430461, mflops=116.155 (err=5.7e-16) 2. Arndt Split-Radix: elapsed time t=1.69993 s, 65536 iters, t-(init.)=1.68327 s t(norm)=0.066887, mflops=74.753 (err=5.7e-16) 3. Arndt 4-step: elapsed time t=1.79993 s, 32768 iters, t-(init.)=1.78326 s t(norm)=0.141721, mflops=35.2806 (err=5.6e-16) 4. Bailey: elapsed time t=1.41661 s, 131072 iters, t-(init.)=1.38328 s t(norm)=0.0274833, mflops=181.929 (err=5.6e-16) 5. Beauregard: elapsed time t=1.41661 s, 16384 iters, t-(init.)=1.41661 s t(norm)=0.225164, mflops=22.206 (err=6.0e-16) 6. Bergland: elapsed time t=1.69993 s, 131072 iters, t-(init.)=1.64993 s t(norm)=0.0327812, mflops=152.526 (err=6.0e-16) 7. Brenner: elapsed time t=1.38328 s, 65536 iters, t-(init.)=1.36661 s t(norm)=0.0543043, mflops=92.0738 (err=5.9e-16) 8. Burrus: elapsed time t=1.89992 s, 65536 iters, t-(init.)=1.88326 s t(norm)=0.0748339, mflops=66.8146 (err=5.7e-16) 9. CWP (min N) (N=65): elapsed time t=1.16662 s, 131072 iters, t-(init.)=1.11662 s t(norm)=0.0221853, mflops=225.375 10. CWP (best N) (N=84): elapsed time t=1.03329 s, 131072 iters, t-(init.)=0.983294 s t(norm)=0.0195363, mflops=255.934 11. 
Edelblute: elapsed time t=1.26662 s, 32768 iters, t-(init.)=1.24995 s t(norm)=0.0993371, mflops=50.3337 (err=5.7e-16) 12. FFTPACK: elapsed time t=1.58327 s, 262144 iters, t-(init.)=1.49994 s t(norm)=0.0149006, mflops=335.558 (err=5.5e-16) 13. FFTPACK (f2c): elapsed time t=1.5666 s, 131072 iters, t-(init.)=1.51661 s t(norm)=0.0301323, mflops=165.935 (err=5.5e-16) FFTW_MEASURE plan: (cost = 3.305939e-06) FFTW_NOTW 64 14. FFTW: elapsed time t=1.79993 s, 524288 iters, t-(init.)=1.63327 s t(norm)=0.00811253, mflops=616.331 (err=5.3e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.23328 s, 262144 iters, t-(init.)=1.13329 s t(norm)=0.0112582, mflops=444.121 (err=5.5e-16) 16. Frigo-old: elapsed time t=1.41661 s, 262144 iters, t-(init.)=1.33328 s t(norm)=0.0132449, mflops=377.502 (err=5.6e-16) 17. Green: elapsed time t=1.46661 s, 262144 iters, t-(init.)=1.38328 s t(norm)=0.0137416, mflops=363.858 (err=5.5e-16) 18. GSL: elapsed time t=1.36661 s, 131072 iters, t-(init.)=1.31661 s t(norm)=0.0261588, mflops=191.14 (err=5.5e-16) 19. GSL DIT: elapsed time t=1.68327 s, 65536 iters, t-(init.)=1.6666 s t(norm)=0.0662247, mflops=75.5005 (err=5.6e-16) 20. GSL DIF: elapsed time t=1.43328 s, 65536 iters, t-(init.)=1.41661 s t(norm)=0.056291, mflops=88.8241 (err=5.4e-16) 21. Krukar: elapsed time t=1.09996 s, 131072 iters, t-(init.)=1.04996 s t(norm)=0.0208608, mflops=239.684 (err=6.0e-16) 22. Mayer (Buneman): elapsed time t=1.21662 s, 65536 iters, t-(init.)=1.19995 s t(norm)=0.0476818, mflops=104.862 (err=5.4e-16) 23. Mayer (simple): elapsed time t=1.89992 s, 131072 iters, t-(init.)=1.86659 s t(norm)=0.0370859, mflops=134.822 24. Mayer (lookup): elapsed time t=1.96659 s, 131072 iters, t-(init.)=1.93326 s t(norm)=0.0384103, mflops=130.173 (err=5.5e-16) 25. Monro: elapsed time t=1.69993 s, 65536 iters, t-(init.)=1.6666 s t(norm)=0.0662247, mflops=75.5005 (err=3.4e-08) 26. 
NAPACK (f2c): elapsed time t=1.68327 s, 32768 iters, t-(init.)=1.6666 s t(norm)=0.132449, mflops=37.7502 (err=1.1e-15) 27. Nielsen: elapsed time t=1.44994 s, 65536 iters, t-(init.)=1.41661 s t(norm)=0.056291, mflops=88.8241 (err=1.8e-15) 28. NR (C): elapsed time t=1.93326 s, 65536 iters, t-(init.)=1.91659 s t(norm)=0.0761584, mflops=65.6526 (err=5.5e-16) 29. NR (F): elapsed time t=1.74993 s, 65536 iters, t-(init.)=1.73326 s t(norm)=0.0688737, mflops=72.5966 (err=5.5e-16) 30. Ooura (C): elapsed time t=1.73326 s, 262144 iters, t-(init.)=1.64993 s t(norm)=0.0163906, mflops=305.052 (err=5.9e-16) 31. Ooura (F): elapsed time t=1.08329 s, 131072 iters, t-(init.)=1.04996 s t(norm)=0.0208608, mflops=239.684 (err=5.9e-16) 32. Ransom: elapsed time t=1.86659 s, 65536 iters, t-(init.)=1.84993 s t(norm)=0.0735095, mflops=68.0185 (err=8.6e-16) 33. SCIPORT: elapsed time t=1.41661 s, 131072 iters, t-(init.)=1.38328 s t(norm)=0.0274833, mflops=181.929 (err=5.9e-16) 34. Singleton: elapsed time t=1.59994 s, 131072 iters, t-(init.)=1.5666 s t(norm)=0.0311256, mflops=160.639 (err=9.2e-16) 35. Singleton (f2c): elapsed time t=1.68327 s, 131072 iters, t-(init.)=1.64993 s t(norm)=0.0327812, mflops=152.526 (err=9.2e-16) 36. Sorensen: elapsed time t=1.31661 s, 131072 iters, t-(init.)=1.26662 s t(norm)=0.0251654, mflops=198.686 (err=5.4e-16) 37. Sorensen DIT: elapsed time t=1.01663 s, 32768 iters, t-(init.)=1.01663 s t(norm)=0.0807942, mflops=61.8856 (err=5.5e-16) 38. Temperton: elapsed time t=1.59994 s, 131072 iters, t-(init.)=1.5666 s t(norm)=0.0311256, mflops=160.639 (err=3.8e-08) 39. Temperton (f2c): elapsed time t=1.04996 s, 65536 iters, t-(init.)=1.01663 s t(norm)=0.0403971, mflops=123.771 (err=5.5e-16) 40. Valkenburg: elapsed time t=1.63327 s, 16384 iters, t-(init.)=1.6166 s t(norm)=0.256952, mflops=19.4589 (err=8.0e-16) 41. 
DXML: elapsed time t=1.01663 s, 131072 iters, t-(init.)=0.966628 s t(norm)=0.0192052, mflops=260.347 (err=2.0e-15) Top mflops for N=64 = 616.331 Normalized results and averages for N=64: fft 0: mflops = 117.97 (norm. = 0.191406), norm. avg. (of 6) = 0.338458 fft 1: mflops = 116.155 (norm. = 0.188462), norm. avg. (of 6) = 0.282077 fft 2: mflops = 74.753 (norm. = 0.121287), norm. avg. (of 6) = 0.167459 fft 3: mflops = 35.2806 (norm. = 0.057243), norm. avg. (of 6) = 0.0332345 fft 4: mflops = 181.929 (norm. = 0.295181), norm. avg. (of 6) = 0.201203 fft 5: mflops = 22.206 (norm. = 0.0360294), norm. avg. (of 6) = 0.0413248 fft 6: mflops = 152.526 (norm. = 0.247475), norm. avg. (of 6) = 0.144724 fft 7: mflops = 92.0738 (norm. = 0.14939), norm. avg. (of 6) = 0.105881 fft 8: mflops = 66.8146 (norm. = 0.108407), norm. avg. (of 6) = 0.138027 fft 9: mflops = 225.375 (norm. = 0.365672), norm. avg. (of 6) = 0.190251 fft 10: mflops = 255.934 (norm. = 0.415254), norm. avg. (of 6) = 0.177544 fft 11: mflops = 50.3337 (norm. = 0.0816667), norm. avg. (of 5) = 0.0803276 fft 12: mflops = 335.558 (norm. = 0.544444), norm. avg. (of 6) = 0.322451 fft 13: mflops = 165.935 (norm. = 0.269231), norm. avg. (of 6) = 0.216014 fft 14: mflops = 616.331 (norm. = 1), norm. avg. (of 6) = 0.750536 fft 15: mflops = 444.121 (norm. = 0.720588), norm. avg. (of 6) = 0.839484 fft 16: mflops = 377.502 (norm. = 0.6125), norm. avg. (of 6) = 0.914583 fft 17: mflops = 363.858 (norm. = 0.590361), norm. avg. (of 4) = 0.404561 fft 18: mflops = 191.14 (norm. = 0.310127), norm. avg. (of 6) = 0.220937 fft 19: mflops = 75.5005 (norm. = 0.1225), norm. avg. (of 6) = 0.0838176 fft 20: mflops = 88.8241 (norm. = 0.144118), norm. avg. (of 6) = 0.0954041 fft 21: mflops = 239.684 (norm. = 0.388889), norm. avg. (of 6) = 0.37411 fft 22: mflops = 104.862 (norm. = 0.170139), norm. avg. (of 5) = 0.170866 fft 23: mflops = 134.822 (norm. = 0.21875), norm. avg. (of 5) = 0.198168 fft 24: mflops = 130.173 (norm. = 0.211207), norm. avg. 
(of 5) = 0.192388 fft 25: mflops = 75.5005 (norm. = 0.1225), norm. avg. (of 5) = 0.0564039 fft 26: mflops = 37.7502 (norm. = 0.06125), norm. avg. (of 6) = 0.0451323 fft 27: mflops = 88.8241 (norm. = 0.144118), norm. avg. (of 6) = 0.0688211 fft 28: mflops = 65.6526 (norm. = 0.106522), norm. avg. (of 6) = 0.0815617 fft 29: mflops = 72.5966 (norm. = 0.117788), norm. avg. (of 6) = 0.0782896 fft 30: mflops = 305.052 (norm. = 0.494949), norm. avg. (of 6) = 0.387559 fft 31: mflops = 239.684 (norm. = 0.388889), norm. avg. (of 6) = 0.373615 fft 32: mflops = 68.0185 (norm. = 0.11036), norm. avg. (of 5) = 0.0525927 fft 33: mflops = 181.929 (norm. = 0.295181), norm. avg. (of 5) = 0.216543 fft 34: mflops = 160.639 (norm. = 0.260638), norm. avg. (of 6) = 0.128405 fft 35: mflops = 152.526 (norm. = 0.247475), norm. avg. (of 6) = 0.123217 fft 36: mflops = 198.686 (norm. = 0.322368), norm. avg. (of 6) = 0.233899 fft 37: mflops = 61.8856 (norm. = 0.10041), norm. avg. (of 6) = 0.160644 fft 38: mflops = 160.639 (norm. = 0.260638), norm. avg. (of 6) = 0.134233 fft 39: mflops = 123.771 (norm. = 0.20082), norm. avg. (of 6) = 0.0972212 fft 40: mflops = 19.4589 (norm. = 0.0315722), norm. avg. (of 6) = 0.0455712 fft 41: mflops = 260.347 (norm. = 0.422414), norm. avg. (of 6) = 0.228788 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.13329 s, 32768 iters, t-(init.)=1.11662 s t(norm)=0.0380319, mflops=131.469 (err=3.5e-16) 1. Arndt DIT: elapsed time t=1.19995 s, 32768 iters, t-(init.)=1.18329 s t(norm)=0.0403025, mflops=124.062 (err=3.3e-16) 2. Arndt Split-Radix: elapsed time t=1.83326 s, 32768 iters, t-(init.)=1.81659 s t(norm)=0.0618728, mflops=80.8109 (err=3.6e-16) 3. Arndt 4-step: elapsed time t=1.08329 s, 8192 iters, t-(init.)=1.06662 s t(norm)=0.145316, mflops=34.4078 (err=3.3e-16) 4. Bailey: elapsed time t=1.28328 s, 65536 iters, t-(init.)=1.23328 s t(norm)=0.0210027, mflops=238.065 (err=3.3e-16) 5. 
Beauregard: elapsed time t=1.6666 s, 8192 iters, t-(init.)=1.6666 s t(norm)=0.227056, mflops=22.021 (err=3.8e-16) 6. Bergland: elapsed time t=1.86659 s, 65536 iters, t-(init.)=1.83326 s t(norm)=0.0312202, mflops=160.153 (err=3.5e-16) 7. Brenner: elapsed time t=1.48327 s, 32768 iters, t-(init.)=1.44994 s t(norm)=0.0493847, mflops=101.246 (err=4.1e-16) 8. Burrus: elapsed time t=2.01659 s, 32768 iters, t-(init.)=1.99992 s t(norm)=0.0681169, mflops=73.4033 (err=3.2e-16) 9. CWP (min N) (N=130): elapsed time t=1.11662 s, 65536 iters, t-(init.)=1.06662 s t(norm)=0.0181645, mflops=275.262 10. CWP (best N) (N=140): elapsed time t=1.63327 s, 131072 iters, t-(init.)=1.54994 s t(norm)=0.0131976, mflops=378.856 11. Edelblute: elapsed time t=1.34995 s, 16384 iters, t-(init.)=1.33328 s t(norm)=0.0908225, mflops=55.0524 (err=3.2e-16) 12. FFTPACK: elapsed time t=1.49994 s, 131072 iters, t-(init.)=1.41661 s t(norm)=0.0120624, mflops=414.513 (err=3.6e-16) 13. FFTPACK (f2c): elapsed time t=1.68327 s, 65536 iters, t-(init.)=1.64993 s t(norm)=0.0280982, mflops=177.947 (err=3.6e-16) FFTW_MEASURE plan: (cost = 8.646301e-06) FFTW_TWIDDLE 4 FFTW_NOTW 32 14. FFTW: elapsed time t=1.14995 s, 131072 iters, t-(init.)=1.06662 s t(norm)=0.00908225, mflops=550.524 (err=3.4e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.14995 s, 131072 iters, t-(init.)=1.06662 s t(norm)=0.00908225, mflops=550.524 (err=3.4e-16) 16. Frigo-old: elapsed time t=1.38328 s, 131072 iters, t-(init.)=1.31661 s t(norm)=0.0112109, mflops=445.994 (err=3.4e-16) 17. Green: elapsed time t=1.68327 s, 131072 iters, t-(init.)=1.59994 s t(norm)=0.0136234, mflops=367.016 (err=4.2e-16) 18. GSL: elapsed time t=1.53327 s, 65536 iters, t-(init.)=1.49994 s t(norm)=0.0255438, mflops=195.742 (err=3.4e-16) 19. GSL DIT: elapsed time t=1.74993 s, 32768 iters, t-(init.)=1.73326 s t(norm)=0.0590346, mflops=84.6961 (err=3.5e-16) 20. 
GSL DIF: elapsed time t=1.44994 s, 32768 iters, t-(init.)=1.43328 s t(norm)=0.0488171, mflops=102.423 (err=3.7e-16) 21. Krukar: elapsed time t=1.43328 s, 65536 iters, t-(init.)=1.38328 s t(norm)=0.0235571, mflops=212.25 (err=3.6e-16) 22. Mayer (Buneman): elapsed time t=1.28328 s, 32768 iters, t-(init.)=1.26662 s t(norm)=0.0431407, mflops=115.9 (err=3.2e-16) 23. Mayer (simple): elapsed time t=1.96659 s, 65536 iters, t-(init.)=1.93326 s t(norm)=0.0329232, mflops=151.869 24. Mayer (lookup): elapsed time t=1.01663 s, 32768 iters, t-(init.)=0.99996 s t(norm)=0.0340584, mflops=146.807 (err=3.3e-16) 25. Monro: elapsed time t=1.54994 s, 32768 iters, t-(init.)=1.53327 s t(norm)=0.0522229, mflops=95.7434 (err=5.2e-08) 26. NAPACK (f2c): elapsed time t=1.89992 s, 16384 iters, t-(init.)=1.88326 s t(norm)=0.128287, mflops=38.9752 (err=1.2e-15) 27. Nielsen: elapsed time t=1.68327 s, 32768 iters, t-(init.)=1.6666 s t(norm)=0.0567641, mflops=88.0839 (err=1.0e-15) 28. NR (C): elapsed time t=1.01663 s, 16384 iters, t-(init.)=0.99996 s t(norm)=0.0681169, mflops=73.4033 (err=3.2e-16) 29. NR (F): elapsed time t=1.78326 s, 32768 iters, t-(init.)=1.7666 s t(norm)=0.0601699, mflops=83.098 (err=3.2e-16) 30. Ooura (C): elapsed time t=1.98325 s, 131072 iters, t-(init.)=1.89992 s t(norm)=0.0161778, mflops=309.066 (err=3.3e-16) 31. Ooura (F): elapsed time t=1.21662 s, 65536 iters, t-(init.)=1.16662 s t(norm)=0.0198674, mflops=251.668 (err=3.3e-16) 32. Ransom: elapsed time t=1.08329 s, 16384 iters, t-(init.)=1.06662 s t(norm)=0.072658, mflops=68.8156 (err=1.0e-15) 33. SCIPORT: elapsed time t=1.49994 s, 65536 iters, t-(init.)=1.46661 s t(norm)=0.0249762, mflops=200.191 (err=3.8e-16) 34. Singleton: elapsed time t=1.94992 s, 65536 iters, t-(init.)=1.89992 s t(norm)=0.0323555, mflops=154.533 (err=4.2e-16) 35. Singleton (f2c): elapsed time t=1.08329 s, 32768 iters, t-(init.)=1.06662 s t(norm)=0.036329, mflops=137.631 (err=4.2e-16) 36. 
Sorensen: elapsed time t=1.28328 s, 65536 iters, t-(init.)=1.24995 s t(norm)=0.0212865, mflops=234.89 (err=3.3e-16) 37. Sorensen DIT: elapsed time t=1.06662 s, 16384 iters, t-(init.)=1.04996 s t(norm)=0.0715227, mflops=69.9079 (err=3.2e-16) 38. Temperton: elapsed time t=1.78326 s, 65536 iters, t-(init.)=1.74993 s t(norm)=0.0298011, mflops=167.779 (err=4.7e-08) 39. Temperton (f2c): elapsed time t=1.21662 s, 32768 iters, t-(init.)=1.19995 s t(norm)=0.0408701, mflops=122.339 (err=3.6e-16) 40. Valkenburg: elapsed time t=1.84993 s, 8192 iters, t-(init.)=1.83326 s t(norm)=0.249762, mflops=20.0191 (err=5.8e-16) 41. DXML: elapsed time t=1.74993 s, 131072 iters, t-(init.)=1.6666 s t(norm)=0.014191, mflops=352.336 (err=1.0e-15) Top mflops for N=128 = 550.524 Normalized results and averages for N=128: fft 0: mflops = 131.469 (norm. = 0.238806), norm. avg. (of 7) = 0.324222 fft 1: mflops = 124.062 (norm. = 0.225352), norm. avg. (of 7) = 0.273974 fft 2: mflops = 80.8109 (norm. = 0.146789), norm. avg. (of 7) = 0.164506 fft 3: mflops = 34.4078 (norm. = 0.0625), norm. avg. (of 7) = 0.0374153 fft 4: mflops = 238.065 (norm. = 0.432432), norm. avg. (of 7) = 0.234236 fft 5: mflops = 22.021 (norm. = 0.04), norm. avg. (of 7) = 0.0411356 fft 6: mflops = 160.153 (norm. = 0.290909), norm. avg. (of 7) = 0.165608 fft 7: mflops = 101.246 (norm. = 0.183908), norm. avg. (of 7) = 0.117028 fft 8: mflops = 73.4033 (norm. = 0.133333), norm. avg. (of 7) = 0.137356 fft 9: mflops = 275.262 (norm. = 0.5), norm. avg. (of 7) = 0.234501 fft 10: mflops = 378.856 (norm. = 0.688172), norm. avg. (of 7) = 0.250491 fft 11: mflops = 55.0524 (norm. = 0.1), norm. avg. (of 6) = 0.0836063 fft 12: mflops = 414.513 (norm. = 0.752941), norm. avg. (of 7) = 0.383949 fft 13: mflops = 177.947 (norm. = 0.323232), norm. avg. (of 7) = 0.231331 fft 14: mflops = 550.524 (norm. = 1), norm. avg. (of 7) = 0.786174 fft 15: mflops = 550.524 (norm. = 1), norm. avg. (of 7) = 0.862415 fft 16: mflops = 445.994 (norm. = 0.810127), norm. 
avg. (of 7) = 0.899661 fft 17: mflops = 367.016 (norm. = 0.666667), norm. avg. (of 5) = 0.456982 fft 18: mflops = 195.742 (norm. = 0.355556), norm. avg. (of 7) = 0.240168 fft 19: mflops = 84.6961 (norm. = 0.153846), norm. avg. (of 7) = 0.0938217 fft 20: mflops = 102.423 (norm. = 0.186047), norm. avg. (of 7) = 0.108353 fft 21: mflops = 212.25 (norm. = 0.385542), norm. avg. (of 7) = 0.375743 fft 22: mflops = 115.9 (norm. = 0.210526), norm. avg. (of 6) = 0.177476 fft 23: mflops = 151.869 (norm. = 0.275862), norm. avg. (of 6) = 0.211117 fft 24: mflops = 146.807 (norm. = 0.266667), norm. avg. (of 6) = 0.204768 fft 25: mflops = 95.7434 (norm. = 0.173913), norm. avg. (of 6) = 0.0759887 fft 26: mflops = 38.9752 (norm. = 0.0707965), norm. avg. (of 7) = 0.0487986 fft 27: mflops = 88.0839 (norm. = 0.16), norm. avg. (of 7) = 0.0818466 fft 28: mflops = 73.4033 (norm. = 0.133333), norm. avg. (of 7) = 0.0889576 fft 29: mflops = 83.098 (norm. = 0.150943), norm. avg. (of 7) = 0.0886687 fft 30: mflops = 309.066 (norm. = 0.561404), norm. avg. (of 7) = 0.412394 fft 31: mflops = 251.668 (norm. = 0.457143), norm. avg. (of 7) = 0.385547 fft 32: mflops = 68.8156 (norm. = 0.125), norm. avg. (of 6) = 0.0646606 fft 33: mflops = 200.191 (norm. = 0.363636), norm. avg. (of 6) = 0.241059 fft 34: mflops = 154.533 (norm. = 0.280702), norm. avg. (of 7) = 0.150162 fft 35: mflops = 137.631 (norm. = 0.25), norm. avg. (of 7) = 0.141329 fft 36: mflops = 234.89 (norm. = 0.426667), norm. avg. (of 7) = 0.261437 fft 37: mflops = 69.9079 (norm. = 0.126984), norm. avg. (of 7) = 0.155836 fft 38: mflops = 167.779 (norm. = 0.304762), norm. avg. (of 7) = 0.158594 fft 39: mflops = 122.339 (norm. = 0.222222), norm. avg. (of 7) = 0.115079 fft 40: mflops = 20.0191 (norm. = 0.0363636), norm. avg. (of 7) = 0.0442558 fft 41: mflops = 352.336 (norm. = 0.64), norm. avg. (of 7) = 0.287533 Benchmarking for array size = 256 (power of 2): 0. 
Arndt DIF: elapsed time t=1.23328 s, 16384 iters, t-(init.)=1.19995 s t(norm)=0.0357614, mflops=139.816 (err=9.7e-16) 1. Arndt DIT: elapsed time t=1.29995 s, 16384 iters, t-(init.)=1.28328 s t(norm)=0.0382448, mflops=130.737 (err=9.9e-16) 2. Arndt Split-Radix: elapsed time t=1.93326 s, 16384 iters, t-(init.)=1.91659 s t(norm)=0.0571188, mflops=87.5368 (err=9.8e-16) 3. Arndt 4-step: elapsed time t=1.14995 s, 4096 iters, t-(init.)=1.14995 s t(norm)=0.137085, mflops=36.4737 (err=1.0e-15) 4. Bailey: elapsed time t=1.51661 s, 32768 iters, t-(init.)=1.48327 s t(norm)=0.0221025, mflops=226.219 (err=9.8e-16) 5. Beauregard: elapsed time t=1.91659 s, 4096 iters, t-(init.)=1.91659 s t(norm)=0.228475, mflops=21.8842 (err=1.1e-15) 6. Bergland: elapsed time t=1.81659 s, 32768 iters, t-(init.)=1.78326 s t(norm)=0.0265727, mflops=188.163 (err=1.0e-15) 7. Brenner: elapsed time t=1.46661 s, 16384 iters, t-(init.)=1.44994 s t(norm)=0.0432116, mflops=115.71 (err=1.1e-15) 8. Burrus: elapsed time t=1.06662 s, 8192 iters, t-(init.)=1.04996 s t(norm)=0.0625824, mflops=79.8947 (err=9.9e-16) 9. CWP (min N) (N=260): elapsed time t=1.01663 s, 32768 iters, t-(init.)=0.983294 s t(norm)=0.0146522, mflops=341.245 10. CWP (best N) (N=280): elapsed time t=1.49994 s, 65536 iters, t-(init.)=1.41661 s t(norm)=0.0105546, mflops=473.729 11. Edelblute: elapsed time t=1.41661 s, 8192 iters, t-(init.)=1.41661 s t(norm)=0.0844365, mflops=59.2161 (err=9.9e-16) 12. FFTPACK: elapsed time t=1.48327 s, 65536 iters, t-(init.)=1.41661 s t(norm)=0.0105546, mflops=473.729 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.6666 s, 32768 iters, t-(init.)=1.63327 s t(norm)=0.0243376, mflops=205.444 (err=1.0e-15) FFTW_MEASURE plan: (cost = 1.830981e-05) FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.23328 s, 65536 iters, t-(init.)=1.16662 s t(norm)=0.008692, mflops=575.242 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.23328 s, 65536 iters, t-(init.)=1.14995 s t(norm)=0.00856782, mflops=583.579 (err=1.1e-15) 16. Frigo-old: elapsed time t=1.69993 s, 65536 iters, t-(init.)=1.6166 s t(norm)=0.0120446, mflops=415.123 (err=1.1e-15) 17. Green: elapsed time t=1.73326 s, 65536 iters, t-(init.)=1.6666 s t(norm)=0.0124171, mflops=402.669 (err=1.1e-15) 18. GSL: elapsed time t=1.49994 s, 32768 iters, t-(init.)=1.46661 s t(norm)=0.0218542, mflops=228.789 (err=1.0e-15) 19. GSL DIT: elapsed time t=1.88326 s, 16384 iters, t-(init.)=1.86659 s t(norm)=0.0556288, mflops=89.8815 (err=1.0e-15) 20. GSL DIF: elapsed time t=1.49994 s, 16384 iters, t-(init.)=1.48327 s t(norm)=0.044205, mflops=113.109 (err=1.1e-15) 21. Krukar: elapsed time t=1.98325 s, 32768 iters, t-(init.)=1.94992 s t(norm)=0.0290561, mflops=172.081 (err=1.1e-15) 22. Mayer (Buneman): elapsed time t=1.43328 s, 16384 iters, t-(init.)=1.41661 s t(norm)=0.0422183, mflops=118.432 (err=9.7e-16) 23. Mayer (simple): elapsed time t=1.09996 s, 16384 iters, t-(init.)=1.08329 s t(norm)=0.0322846, mflops=154.873 24. Mayer (lookup): elapsed time t=1.13329 s, 16384 iters, t-(init.)=1.09996 s t(norm)=0.0327812, mflops=152.526 (err=9.4e-16) 25. Monro: elapsed time t=1.5666 s, 16384 iters, t-(init.)=1.54994 s t(norm)=0.0461918, mflops=108.244 (err=8.5e-08) 26. NAPACK (f2c): elapsed time t=1.01663 s, 4096 iters, t-(init.)=1.01663 s t(norm)=0.121191, mflops=41.2571 (err=3.8e-15) 27. Nielsen: elapsed time t=1.63327 s, 16384 iters, t-(init.)=1.6166 s t(norm)=0.0481785, mflops=103.781 (err=3.8e-15) 28. NR (C): elapsed time t=1.08329 s, 8192 iters, t-(init.)=1.06662 s t(norm)=0.0635757, mflops=78.6463 (err=1.1e-15) 29. NR (F): elapsed time t=1.86659 s, 16384 iters, t-(init.)=1.84993 s t(norm)=0.0551321, mflops=90.6913 (err=1.1e-15) 30. Ooura (C): elapsed time t=1.98325 s, 65536 iters, t-(init.)=1.89992 s t(norm)=0.0141555, mflops=353.219 (err=9.9e-16) 31. 
Ooura (F): elapsed time t=1.28328 s, 32768 iters, t-(init.)=1.24995 s t(norm)=0.0186257, mflops=268.446 (err=9.9e-16) 32. Ransom: elapsed time t=1.79993 s, 16384 iters, t-(init.)=1.78326 s t(norm)=0.0531453, mflops=94.0816 (err=1.9e-15) 33. SCIPORT: elapsed time t=1.58327 s, 32768 iters, t-(init.)=1.54994 s t(norm)=0.0230959, mflops=216.489 (err=1.1e-15) 34. Singleton: elapsed time t=1.63327 s, 32768 iters, t-(init.)=1.58327 s t(norm)=0.0235926, mflops=211.931 (err=1.7e-15) 35. Singleton (f2c): elapsed time t=1.7666 s, 32768 iters, t-(init.)=1.73326 s t(norm)=0.0258276, mflops=193.591 (err=1.7e-15) 36. Sorensen: elapsed time t=1.34995 s, 32768 iters, t-(init.)=1.31661 s t(norm)=0.0196191, mflops=254.854 (err=9.9e-16) 37. Sorensen DIT: elapsed time t=1.14995 s, 8192 iters, t-(init.)=1.13329 s t(norm)=0.0675492, mflops=74.0201 (err=9.9e-16) 38. Temperton: elapsed time t=1.81659 s, 32768 iters, t-(init.)=1.7666 s t(norm)=0.0263243, mflops=189.938 (err=9.5e-08) 39. Temperton (f2c): elapsed time t=1.19995 s, 16384 iters, t-(init.)=1.18329 s t(norm)=0.0352647, mflops=141.785 (err=1.0e-15) 40. Valkenburg: elapsed time t=1.04996 s, 2048 iters, t-(init.)=1.04996 s t(norm)=0.250329, mflops=19.9737 (err=1.2e-15) 41. DXML: elapsed time t=1.91659 s, 65536 iters, t-(init.)=1.84993 s t(norm)=0.013783, mflops=362.765 (err=2.4e-15) Top mflops for N=256 = 583.579 Normalized results and averages for N=256: fft 0: mflops = 139.816 (norm. = 0.239583), norm. avg. (of 8) = 0.313642 fft 1: mflops = 130.737 (norm. = 0.224026), norm. avg. (of 8) = 0.26773 fft 2: mflops = 87.5368 (norm. = 0.15), norm. avg. (of 8) = 0.162693 fft 3: mflops = 36.4737 (norm. = 0.0625), norm. avg. (of 8) = 0.0405509 fft 4: mflops = 226.219 (norm. = 0.38764), norm. avg. (of 8) = 0.253411 fft 5: mflops = 21.8842 (norm. = 0.0375), norm. avg. (of 8) = 0.0406811 fft 6: mflops = 188.163 (norm. = 0.32243), norm. avg. (of 8) = 0.185211 fft 7: mflops = 115.71 (norm. = 0.198276), norm. avg. 
(of 8) = 0.127184 fft 8: mflops = 79.8947 (norm. = 0.136905), norm. avg. (of 8) = 0.1373 fft 9: mflops = 341.245 (norm. = 0.584746), norm. avg. (of 8) = 0.278281 fft 10: mflops = 473.729 (norm. = 0.811765), norm. avg. (of 8) = 0.32065 fft 11: mflops = 59.2161 (norm. = 0.101471), norm. avg. (of 7) = 0.0861584 fft 12: mflops = 473.729 (norm. = 0.811765), norm. avg. (of 8) = 0.437426 fft 13: mflops = 205.444 (norm. = 0.352041), norm. avg. (of 8) = 0.246419 fft 14: mflops = 575.242 (norm. = 0.985714), norm. avg. (of 8) = 0.811116 fft 15: mflops = 583.579 (norm. = 1), norm. avg. (of 8) = 0.879613 fft 16: mflops = 415.123 (norm. = 0.71134), norm. avg. (of 8) = 0.876121 fft 17: mflops = 402.669 (norm. = 0.69), norm. avg. (of 6) = 0.495819 fft 18: mflops = 228.789 (norm. = 0.392045), norm. avg. (of 8) = 0.259153 fft 19: mflops = 89.8815 (norm. = 0.154018), norm. avg. (of 8) = 0.101346 fft 20: mflops = 113.109 (norm. = 0.19382), norm. avg. (of 8) = 0.119036 fft 21: mflops = 172.081 (norm. = 0.294872), norm. avg. (of 8) = 0.365634 fft 22: mflops = 118.432 (norm. = 0.202941), norm. avg. (of 7) = 0.181114 fft 23: mflops = 154.873 (norm. = 0.265385), norm. avg. (of 7) = 0.218869 fft 24: mflops = 152.526 (norm. = 0.261364), norm. avg. (of 7) = 0.212853 fft 25: mflops = 108.244 (norm. = 0.185484), norm. avg. (of 7) = 0.0916309 fft 26: mflops = 41.2571 (norm. = 0.0706967), norm. avg. (of 8) = 0.0515359 fft 27: mflops = 103.781 (norm. = 0.177835), norm. avg. (of 8) = 0.0938452 fft 28: mflops = 78.6463 (norm. = 0.134766), norm. avg. (of 8) = 0.0946836 fft 29: mflops = 90.6913 (norm. = 0.155405), norm. avg. (of 8) = 0.0970108 fft 30: mflops = 353.219 (norm. = 0.605263), norm. avg. (of 8) = 0.436503 fft 31: mflops = 268.446 (norm. = 0.46), norm. avg. (of 8) = 0.394854 fft 32: mflops = 94.0816 (norm. = 0.161215), norm. avg. (of 7) = 0.0784541 fft 33: mflops = 216.489 (norm. = 0.370968), norm. avg. (of 7) = 0.259617 fft 34: mflops = 211.931 (norm. = 0.363158), norm. avg. 
(of 8) = 0.176786 fft 35: mflops = 193.591 (norm. = 0.331731), norm. avg. (of 8) = 0.165129 fft 36: mflops = 254.854 (norm. = 0.436709), norm. avg. (of 8) = 0.283346 fft 37: mflops = 74.0201 (norm. = 0.126838), norm. avg. (of 8) = 0.152211 fft 38: mflops = 189.938 (norm. = 0.325472), norm. avg. (of 8) = 0.179454 fft 39: mflops = 141.785 (norm. = 0.242958), norm. avg. (of 8) = 0.131063 fft 40: mflops = 19.9737 (norm. = 0.0342262), norm. avg. (of 8) = 0.0430021 fft 41: mflops = 362.765 (norm. = 0.621622), norm. avg. (of 8) = 0.329294 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.29995 s, 8192 iters, t-(init.)=1.26662 s t(norm)=0.0335539, mflops=149.014 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.34995 s, 8192 iters, t-(init.)=1.31661 s t(norm)=0.0348784, mflops=143.355 (err=1.1e-15) 2. Arndt Split-Radix: elapsed time t=1.03329 s, 4096 iters, t-(init.)=1.03329 s t(norm)=0.0547458, mflops=91.3312 (err=1.1e-15) 3. Arndt 4-step: elapsed time t=1.23328 s, 2048 iters, t-(init.)=1.21662 s t(norm)=0.128917, mflops=38.7845 (err=9.9e-16) 4. Bailey: elapsed time t=1.49994 s, 16384 iters, t-(init.)=1.46661 s t(norm)=0.0194259, mflops=257.388 (err=1.1e-15) 5. Beauregard: elapsed time t=1.09996 s, 1024 iters, t-(init.)=1.09996 s t(norm)=0.233111, mflops=21.449 (err=1.0e-15) 6. Bergland: elapsed time t=1.89992 s, 16384 iters, t-(init.)=1.86659 s t(norm)=0.0247239, mflops=202.233 (err=1.0e-15) 7. Brenner: elapsed time t=1.13329 s, 4096 iters, t-(init.)=1.13329 s t(norm)=0.0600438, mflops=83.2726 (err=1.0e-15) 8. Burrus: elapsed time t=1.24995 s, 4096 iters, t-(init.)=1.23328 s t(norm)=0.0653417, mflops=76.5208 (err=1.1e-15) 9. CWP (min N) (N=520): elapsed time t=1.99992 s, 32768 iters, t-(init.)=1.93326 s t(norm)=0.0128034, mflops=390.52 10. CWP (best N) (N=560): elapsed time t=1.04996 s, 16384 iters, t-(init.)=0.99996 s t(norm)=0.0132449, mflops=377.502 11. 
Edelblute: elapsed time t=1.48327 s, 4096 iters, t-(init.)=1.46661 s t(norm)=0.0777037, mflops=64.347 (err=1.1e-15) 12. FFTPACK: elapsed time t=1.99992 s, 32768 iters, t-(init.)=1.93326 s t(norm)=0.0128034, mflops=390.52 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.23328 s, 8192 iters, t-(init.)=1.21662 s t(norm)=0.0322294, mflops=155.138 (err=1.0e-15) FFTW_MEASURE plan: (cost = 4.272290e-05) FFTW_TWIDDLE 8 FFTW_NOTW 64 14. FFTW: elapsed time t=1.41661 s, 32768 iters, t-(init.)=1.34995 s t(norm)=0.00894034, mflops=559.263 (err=9.9e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.68327 s, 32768 iters, t-(init.)=1.59994 s t(norm)=0.010596, mflops=471.878 (err=9.6e-16) 16. Frigo-old: elapsed time t=1.09996 s, 16384 iters, t-(init.)=1.06662 s t(norm)=0.0141279, mflops=353.909 (err=9.4e-16) 17. Green: elapsed time t=1.83326 s, 32768 iters, t-(init.)=1.7666 s t(norm)=0.0116997, mflops=427.361 (err=9.6e-16) 18. GSL: elapsed time t=2.03325 s, 16384 iters, t-(init.)=1.99992 s t(norm)=0.0264899, mflops=188.751 (err=1.0e-15) 19. GSL DIT: elapsed time t=1.01663 s, 4096 iters, t-(init.)=1.01663 s t(norm)=0.0538628, mflops=92.8285 (err=1.2e-15) 20. GSL DIF: elapsed time t=1.6166 s, 8192 iters, t-(init.)=1.59994 s t(norm)=0.0423838, mflops=117.97 (err=1.1e-15) 21. Krukar: elapsed time t=1.23328 s, 8192 iters, t-(init.)=1.21662 s t(norm)=0.0322294, mflops=155.138 (err=1.0e-15) 22. Mayer (Buneman): elapsed time t=1.49994 s, 8192 iters, t-(init.)=1.48327 s t(norm)=0.0392933, mflops=127.248 (err=1.0e-15) 23. Mayer (simple): elapsed time t=1.16662 s, 8192 iters, t-(init.)=1.14995 s t(norm)=0.0304634, mflops=164.132 24. Mayer (lookup): elapsed time t=1.18329 s, 8192 iters, t-(init.)=1.16662 s t(norm)=0.0309049, mflops=161.787 (err=1.0e-15) 25. Monro: elapsed time t=1.6166 s, 8192 iters, t-(init.)=1.59994 s t(norm)=0.0423838, mflops=117.97 (err=8.4e-08) 26. 
NAPACK (f2c): elapsed time t=1.31661 s, 2048 iters, t-(init.)=1.29995 s t(norm)=0.137747, mflops=36.2983 (err=7.1e-15) 27. Nielsen: elapsed time t=1.73326 s, 8192 iters, t-(init.)=1.7166 s t(norm)=0.0454743, mflops=109.952 (err=3.7e-15) 28. NR (C): elapsed time t=1.16662 s, 4096 iters, t-(init.)=1.14995 s t(norm)=0.0609268, mflops=82.0658 (err=9.7e-16) 29. NR (F): elapsed time t=1.11662 s, 4096 iters, t-(init.)=1.11662 s t(norm)=0.0591608, mflops=84.5155 (err=9.7e-16) 30. Ooura (C): elapsed time t=1.16662 s, 16384 iters, t-(init.)=1.13329 s t(norm)=0.0150109, mflops=333.09 (err=9.7e-16) 31. Ooura (F): elapsed time t=1.46661 s, 16384 iters, t-(init.)=1.43328 s t(norm)=0.0189844, mflops=263.374 (err=9.7e-16) 32. Ransom: elapsed time t=1.04996 s, 4096 iters, t-(init.)=1.03329 s t(norm)=0.0547458, mflops=91.3312 (err=1.4e-15) 33. SCIPORT: elapsed time t=1.81659 s, 16384 iters, t-(init.)=1.78326 s t(norm)=0.0236202, mflops=211.684 (err=1.0e-15) 34. Singleton: elapsed time t=1.83326 s, 16384 iters, t-(init.)=1.79993 s t(norm)=0.0238409, mflops=209.724 (err=1.2e-15) 35. Singleton (f2c): elapsed time t=1.99992 s, 16384 iters, t-(init.)=1.96659 s t(norm)=0.0260484, mflops=191.95 (err=1.2e-15) 36. Sorensen: elapsed time t=1.43328 s, 16384 iters, t-(init.)=1.39994 s t(norm)=0.0185429, mflops=269.645 (err=1.0e-15) 37. Sorensen DIT: elapsed time t=1.34995 s, 4096 iters, t-(init.)=1.34995 s t(norm)=0.0715227, mflops=69.9079 (err=1.1e-15) 38. Temperton: elapsed time t=1.26662 s, 8192 iters, t-(init.)=1.24995 s t(norm)=0.0331124, mflops=151.001 (err=1.0e-07) 39. Temperton (f2c): elapsed time t=1.58327 s, 8192 iters, t-(init.)=1.5666 s t(norm)=0.0415008, mflops=120.48 (err=1.0e-15) 40. Valkenburg: elapsed time t=1.16662 s, 1024 iters, t-(init.)=1.16662 s t(norm)=0.247239, mflops=20.2233 (err=1.3e-15) 41. 
DXML: elapsed time t=1.01663 s, 16384 iters, t-(init.)=0.983294 s t(norm)=0.0130242, mflops=383.901 (err=2.9e-15) Top mflops for N=512 = 559.263 Normalized results and averages for N=512: fft 0: mflops = 149.014 (norm. = 0.266447), norm. avg. (of 9) = 0.308398 fft 1: mflops = 143.355 (norm. = 0.256329), norm. avg. (of 9) = 0.266463 fft 2: mflops = 91.3312 (norm. = 0.163306), norm. avg. (of 9) = 0.162761 fft 3: mflops = 38.7845 (norm. = 0.0693493), norm. avg. (of 9) = 0.0437507 fft 4: mflops = 257.388 (norm. = 0.460227), norm. avg. (of 9) = 0.276391 fft 5: mflops = 21.449 (norm. = 0.0383523), norm. avg. (of 9) = 0.0404224 fft 6: mflops = 202.233 (norm. = 0.361607), norm. avg. (of 9) = 0.20481 fft 7: mflops = 83.2726 (norm. = 0.148897), norm. avg. (of 9) = 0.129596 fft 8: mflops = 76.5208 (norm. = 0.136824), norm. avg. (of 9) = 0.137247 fft 9: mflops = 390.52 (norm. = 0.698276), norm. avg. (of 9) = 0.324947 fft 10: mflops = 377.502 (norm. = 0.675), norm. avg. (of 9) = 0.360022 fft 11: mflops = 64.347 (norm. = 0.115057), norm. avg. (of 8) = 0.0897707 fft 12: mflops = 390.52 (norm. = 0.698276), norm. avg. (of 9) = 0.466409 fft 13: mflops = 155.138 (norm. = 0.277397), norm. avg. (of 9) = 0.249861 fft 14: mflops = 559.263 (norm. = 1), norm. avg. (of 9) = 0.832103 fft 15: mflops = 471.878 (norm. = 0.84375), norm. avg. (of 9) = 0.875628 fft 16: mflops = 353.909 (norm. = 0.632812), norm. avg. (of 9) = 0.849087 fft 17: mflops = 427.361 (norm. = 0.764151), norm. avg. (of 7) = 0.534152 fft 18: mflops = 188.751 (norm. = 0.3375), norm. avg. (of 9) = 0.267858 fft 19: mflops = 92.8285 (norm. = 0.165984), norm. avg. (of 9) = 0.108528 fft 20: mflops = 117.97 (norm. = 0.210938), norm. avg. (of 9) = 0.129248 fft 21: mflops = 155.138 (norm. = 0.277397), norm. avg. (of 9) = 0.35583 fft 22: mflops = 127.248 (norm. = 0.227528), norm. avg. (of 8) = 0.186916 fft 23: mflops = 164.132 (norm. = 0.293478), norm. avg. (of 8) = 0.228195 fft 24: mflops = 161.787 (norm. = 0.289286), norm. avg. 
(of 8) = 0.222407 fft 25: mflops = 117.97 (norm. = 0.210938), norm. avg. (of 8) = 0.106544 fft 26: mflops = 36.2983 (norm. = 0.0649038), norm. avg. (of 9) = 0.0530212 fft 27: mflops = 109.952 (norm. = 0.196602), norm. avg. (of 9) = 0.105263 fft 28: mflops = 82.0658 (norm. = 0.146739), norm. avg. (of 9) = 0.100468 fft 29: mflops = 84.5155 (norm. = 0.151119), norm. avg. (of 9) = 0.103023 fft 30: mflops = 333.09 (norm. = 0.595588), norm. avg. (of 9) = 0.454179 fft 31: mflops = 263.374 (norm. = 0.47093), norm. avg. (of 9) = 0.403307 fft 32: mflops = 91.3312 (norm. = 0.163306), norm. avg. (of 8) = 0.0890606 fft 33: mflops = 211.684 (norm. = 0.378505), norm. avg. (of 8) = 0.274478 fft 34: mflops = 209.724 (norm. = 0.375), norm. avg. (of 9) = 0.19881 fft 35: mflops = 191.95 (norm. = 0.34322), norm. avg. (of 9) = 0.184917 fft 36: mflops = 269.645 (norm. = 0.482143), norm. avg. (of 9) = 0.305435 fft 37: mflops = 69.9079 (norm. = 0.125), norm. avg. (of 9) = 0.149188 fft 38: mflops = 151.001 (norm. = 0.27), norm. avg. (of 9) = 0.189514 fft 39: mflops = 120.48 (norm. = 0.215426), norm. avg. (of 9) = 0.140437 fft 40: mflops = 20.2233 (norm. = 0.0361607), norm. avg. (of 9) = 0.042242 fft 41: mflops = 383.901 (norm. = 0.686441), norm. avg. (of 9) = 0.368977 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.59994 s, 4096 iters, t-(init.)=1.58327 s t(norm)=0.0377481, mflops=132.457 (err=1.8e-15) 1. Arndt DIT: elapsed time t=1.5666 s, 4096 iters, t-(init.)=1.54994 s t(norm)=0.0369534, mflops=135.306 (err=1.8e-15) 2. Arndt Split-Radix: elapsed time t=1.26662 s, 2048 iters, t-(init.)=1.24995 s t(norm)=0.0596023, mflops=83.8894 (err=1.8e-15) 3. Arndt 4-step: elapsed time t=1.24995 s, 1024 iters, t-(init.)=1.23328 s t(norm)=0.117615, mflops=42.5115 (err=1.8e-15) 4. Bailey: elapsed time t=1.04996 s, 4096 iters, t-(init.)=1.03329 s t(norm)=0.0246356, mflops=202.958 (err=1.9e-15) 5. 
Beauregard: elapsed time t=1.24995 s, 512 iters, t-(init.)=1.23328 s t(norm)=0.23523, mflops=21.2558 (err=2.0e-15) 6. Bergland: elapsed time t=1.18329 s, 4096 iters, t-(init.)=1.16662 s t(norm)=0.0278144, mflops=179.763 (err=2.2e-15) 7. Brenner: elapsed time t=1.16662 s, 2048 iters, t-(init.)=1.14995 s t(norm)=0.0548341, mflops=91.1842 (err=2.0e-15) 8. Burrus: elapsed time t=1.48327 s, 2048 iters, t-(init.)=1.48327 s t(norm)=0.070728, mflops=70.6933 (err=1.8e-15) 9. CWP (min N) (N=1040): elapsed time t=1.34995 s, 8192 iters, t-(init.)=1.31661 s t(norm)=0.0156953, mflops=318.567 10. CWP (best N) (N=1040): elapsed time t=1.33328 s, 8192 iters, t-(init.)=1.28328 s t(norm)=0.0152979, mflops=326.842 11. Edelblute: elapsed time t=1.7166 s, 2048 iters, t-(init.)=1.7166 s t(norm)=0.0818538, mflops=61.0845 (err=1.8e-15) 12. FFTPACK: elapsed time t=1.08329 s, 8192 iters, t-(init.)=1.04996 s t(norm)=0.0125165, mflops=399.474 (err=1.9e-15) 13. FFTPACK (f2c): elapsed time t=1.24995 s, 4096 iters, t-(init.)=1.23328 s t(norm)=0.0294038, mflops=170.046 (err=1.9e-15) FFTW_MEASURE plan: (cost = 1.017212e-04) FFTW_TWIDDLE 32 FFTW_NOTW 32 14. FFTW: elapsed time t=1.63327 s, 16384 iters, t-(init.)=1.5666 s t(norm)=0.00933769, mflops=535.464 (err=2.0e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.81659 s, 16384 iters, t-(init.)=1.74993 s t(norm)=0.0104304, mflops=479.368 (err=2.0e-15) 16. Frigo-old: elapsed time t=1.19995 s, 8192 iters, t-(init.)=1.16662 s t(norm)=0.0139072, mflops=359.526 (err=1.9e-15) 17. Green: elapsed time t=1.08329 s, 8192 iters, t-(init.)=1.04996 s t(norm)=0.0125165, mflops=399.474 (err=2.0e-15) 18. GSL: elapsed time t=1.06662 s, 4096 iters, t-(init.)=1.04996 s t(norm)=0.0250329, mflops=199.737 (err=1.9e-15) 19. GSL DIT: elapsed time t=1.28328 s, 2048 iters, t-(init.)=1.28328 s t(norm)=0.0611917, mflops=81.7105 (err=2.1e-15) 20. 
GSL DIF: elapsed time t=1.06662 s, 2048 iters, t-(init.)=1.06662 s t(norm)=0.0508606, mflops=98.3079 (err=2.2e-15) 21. Krukar: elapsed time t=1.89992 s, 4096 iters, t-(init.)=1.88326 s t(norm)=0.0449004, mflops=111.358 (err=1.9e-15) 22. Mayer (Buneman): elapsed time t=1.74993 s, 4096 iters, t-(init.)=1.73326 s t(norm)=0.0413242, mflops=120.994 (err=1.8e-15) 23. Mayer (simple): elapsed time t=1.39994 s, 4096 iters, t-(init.)=1.38328 s t(norm)=0.0329799, mflops=151.607 24. Mayer (lookup): elapsed time t=1.44994 s, 4096 iters, t-(init.)=1.43328 s t(norm)=0.034172, mflops=146.319 (err=1.8e-15) 25. Monro: elapsed time t=1.88326 s, 4096 iters, t-(init.)=1.86659 s t(norm)=0.044503, mflops=112.352 (err=1.0e-07) 26. NAPACK (f2c): elapsed time t=1.51661 s, 1024 iters, t-(init.)=1.51661 s t(norm)=0.144635, mflops=34.5698 (err=1.7e-14) 27. Nielsen: elapsed time t=1.03329 s, 2048 iters, t-(init.)=1.01663 s t(norm)=0.0484765, mflops=103.143 (err=7.5e-15) 28. NR (C): elapsed time t=1.33328 s, 2048 iters, t-(init.)=1.33328 s t(norm)=0.0635757, mflops=78.6463 (err=1.9e-15) 29. NR (F): elapsed time t=1.38328 s, 2048 iters, t-(init.)=1.36661 s t(norm)=0.0651651, mflops=76.7281 (err=1.9e-15) 30. Ooura (C): elapsed time t=1.34995 s, 8192 iters, t-(init.)=1.31661 s t(norm)=0.0156953, mflops=318.567 (err=2.2e-15) 31. Ooura (F): elapsed time t=1.79993 s, 8192 iters, t-(init.)=1.7666 s t(norm)=0.0210595, mflops=237.423 (err=2.2e-15) 32. Ransom: elapsed time t=1.94992 s, 4096 iters, t-(init.)=1.93326 s t(norm)=0.0460924, mflops=108.478 (err=2.3e-15) 33. SCIPORT: elapsed time t=1.93326 s, 8192 iters, t-(init.)=1.89992 s t(norm)=0.0226489, mflops=220.762 (err=2.0e-15) 34. Singleton: elapsed time t=1.94992 s, 8192 iters, t-(init.)=1.91659 s t(norm)=0.0228475, mflops=218.842 (err=2.8e-15) 35. Singleton (f2c): elapsed time t=1.23328 s, 4096 iters, t-(init.)=1.21662 s t(norm)=0.0290064, mflops=172.376 (err=2.8e-15) 36. 
Sorensen: elapsed time t=1.08329 s, 4096 iters, t-(init.)=1.06662 s t(norm)=0.0254303, mflops=196.616 (err=1.8e-15) 37. Sorensen DIT: elapsed time t=1.6666 s, 2048 iters, t-(init.)=1.64993 s t(norm)=0.078675, mflops=63.5526 (err=1.9e-15) 38. Temperton: elapsed time t=1.19995 s, 4096 iters, t-(init.)=1.18329 s t(norm)=0.0282117, mflops=177.231 (err=1.1e-07) 39. Temperton (f2c): elapsed time t=1.46661 s, 4096 iters, t-(init.)=1.44994 s t(norm)=0.0345693, mflops=144.637 (err=1.9e-15) 40. Valkenburg: elapsed time t=1.44994 s, 512 iters, t-(init.)=1.44994 s t(norm)=0.276554, mflops=18.0796 (err=2.4e-15) 41. DXML: elapsed time t=1.19995 s, 8192 iters, t-(init.)=1.16662 s t(norm)=0.0139072, mflops=359.526 (err=2.8e-15) Top mflops for N=1024 = 535.464 Normalized results and averages for N=1024: fft 0: mflops = 132.457 (norm. = 0.247368), norm. avg. (of 10) = 0.302295 fft 1: mflops = 135.306 (norm. = 0.252688), norm. avg. (of 10) = 0.265086 fft 2: mflops = 83.8894 (norm. = 0.156667), norm. avg. (of 10) = 0.162151 fft 3: mflops = 42.5115 (norm. = 0.0793919), norm. avg. (of 10) = 0.0473148 fft 4: mflops = 202.958 (norm. = 0.379032), norm. avg. (of 10) = 0.286655 fft 5: mflops = 21.2558 (norm. = 0.0396959), norm. avg. (of 10) = 0.0403497 fft 6: mflops = 179.763 (norm. = 0.335714), norm. avg. (of 10) = 0.217901 fft 7: mflops = 91.1842 (norm. = 0.17029), norm. avg. (of 10) = 0.133666 fft 8: mflops = 70.6933 (norm. = 0.132022), norm. avg. (of 10) = 0.136724 fft 9: mflops = 318.567 (norm. = 0.594937), norm. avg. (of 10) = 0.351946 fft 10: mflops = 326.842 (norm. = 0.61039), norm. avg. (of 10) = 0.385059 fft 11: mflops = 61.0845 (norm. = 0.114078), norm. avg. (of 9) = 0.0924715 fft 12: mflops = 399.474 (norm. = 0.746032), norm. avg. (of 10) = 0.494372 fft 13: mflops = 170.046 (norm. = 0.317568), norm. avg. (of 10) = 0.256632 fft 14: mflops = 535.464 (norm. = 1), norm. avg. (of 10) = 0.848893 fft 15: mflops = 479.368 (norm. = 0.895238), norm. avg. 
(of 10) = 0.877589 fft 16: mflops = 359.526 (norm. = 0.671429), norm. avg. (of 10) = 0.831321 fft 17: mflops = 399.474 (norm. = 0.746032), norm. avg. (of 8) = 0.560637 fft 18: mflops = 199.737 (norm. = 0.373016), norm. avg. (of 10) = 0.278374 fft 19: mflops = 81.7105 (norm. = 0.152597), norm. avg. (of 10) = 0.112935 fft 20: mflops = 98.3079 (norm. = 0.183594), norm. avg. (of 10) = 0.134682 fft 21: mflops = 111.358 (norm. = 0.207965), norm. avg. (of 10) = 0.341044 fft 22: mflops = 120.994 (norm. = 0.225962), norm. avg. (of 9) = 0.191254 fft 23: mflops = 151.607 (norm. = 0.283133), norm. avg. (of 9) = 0.2343 fft 24: mflops = 146.319 (norm. = 0.273256), norm. avg. (of 9) = 0.228057 fft 25: mflops = 112.352 (norm. = 0.209821), norm. avg. (of 9) = 0.118019 fft 26: mflops = 34.5698 (norm. = 0.0645604), norm. avg. (of 10) = 0.0541751 fft 27: mflops = 103.143 (norm. = 0.192623), norm. avg. (of 10) = 0.113999 fft 28: mflops = 78.6463 (norm. = 0.146875), norm. avg. (of 10) = 0.105108 fft 29: mflops = 76.7281 (norm. = 0.143293), norm. avg. (of 10) = 0.10705 fft 30: mflops = 318.567 (norm. = 0.594937), norm. avg. (of 10) = 0.468255 fft 31: mflops = 237.423 (norm. = 0.443396), norm. avg. (of 10) = 0.407316 fft 32: mflops = 108.478 (norm. = 0.202586), norm. avg. (of 9) = 0.101675 fft 33: mflops = 220.762 (norm. = 0.412281), norm. avg. (of 9) = 0.289789 fft 34: mflops = 218.842 (norm. = 0.408696), norm. avg. (of 10) = 0.219798 fft 35: mflops = 172.376 (norm. = 0.321918), norm. avg. (of 10) = 0.198617 fft 36: mflops = 196.616 (norm. = 0.367188), norm. avg. (of 10) = 0.31161 fft 37: mflops = 63.5526 (norm. = 0.118687), norm. avg. (of 10) = 0.146138 fft 38: mflops = 177.231 (norm. = 0.330986), norm. avg. (of 10) = 0.203662 fft 39: mflops = 144.637 (norm. = 0.270115), norm. avg. (of 10) = 0.153405 fft 40: mflops = 18.0796 (norm. = 0.0337644), norm. avg. (of 10) = 0.0413942 fft 41: mflops = 359.526 (norm. = 0.671429), norm. avg. 
(of 10) = 0.399222 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.64993 s, 2048 iters, t-(init.)=1.63327 s t(norm)=0.0354001, mflops=141.242 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.59994 s, 2048 iters, t-(init.)=1.58327 s t(norm)=0.0343165, mflops=145.703 (err=1.4e-15) 2. Arndt Split-Radix: elapsed time t=1.33328 s, 1024 iters, t-(init.)=1.31661 s t(norm)=0.0570737, mflops=87.6061 (err=1.5e-15) 3. Arndt 4-step: elapsed time t=1.38328 s, 512 iters, t-(init.)=1.38328 s t(norm)=0.119927, mflops=41.692 (err=1.4e-15) 4. Bailey: elapsed time t=1.49994 s, 2048 iters, t-(init.)=1.46661 s t(norm)=0.0317879, mflops=157.293 (err=1.4e-15) 5. Beauregard: elapsed time t=1.39994 s, 256 iters, t-(init.)=1.39994 s t(norm)=0.242744, mflops=20.5979 (err=1.4e-15) 6. Bergland: elapsed time t=1.16662 s, 2048 iters, t-(init.)=1.14995 s t(norm)=0.0249246, mflops=200.605 (err=1.5e-15) 7. Brenner: elapsed time t=1.31661 s, 1024 iters, t-(init.)=1.31661 s t(norm)=0.0570737, mflops=87.6061 (err=1.4e-15) 8. Burrus: elapsed time t=1.54994 s, 1024 iters, t-(init.)=1.53327 s t(norm)=0.0664656, mflops=75.2269 (err=1.4e-15) 9. CWP (min N) (N=2145): elapsed time t=1.6666 s, 4096 iters, t-(init.)=1.63327 s t(norm)=0.0177001, mflops=282.485 10. CWP (best N) (N=2184): elapsed time t=1.36661 s, 4096 iters, t-(init.)=1.33328 s t(norm)=0.014449, mflops=346.044 11. Edelblute: elapsed time t=1.78326 s, 1024 iters, t-(init.)=1.7666 s t(norm)=0.0765799, mflops=65.2913 (err=1.4e-15) 12. FFTPACK: elapsed time t=1.5666 s, 4096 iters, t-(init.)=1.53327 s t(norm)=0.0166164, mflops=300.908 (err=1.4e-15) 13. FFTPACK (f2c): elapsed time t=1.48327 s, 2048 iters, t-(init.)=1.46661 s t(norm)=0.0317879, mflops=157.293 (err=1.4e-15) FFTW_MEASURE plan: (cost = 2.604063e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 14. 
FFTW: elapsed time t=1.7166 s, 4096 iters, t-(init.)=1.68327 s t(norm)=0.0182419, mflops=274.094 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.7166 s, 4096 iters, t-(init.)=1.6666 s t(norm)=0.0180613, mflops=276.835 (err=1.4e-15) 16. Frigo-old: elapsed time t=1.59994 s, 4096 iters, t-(init.)=1.5666 s t(norm)=0.0169776, mflops=294.505 (err=1.3e-15) 17. Green: elapsed time t=1.16662 s, 4096 iters, t-(init.)=1.13329 s t(norm)=0.0122817, mflops=407.11 (err=1.4e-15) 18. GSL: elapsed time t=1.26662 s, 2048 iters, t-(init.)=1.24995 s t(norm)=0.0270919, mflops=184.557 (err=1.4e-15) 19. GSL DIT: elapsed time t=1.39994 s, 1024 iters, t-(init.)=1.39994 s t(norm)=0.0606859, mflops=82.3914 (err=2.0e-15) 20. GSL DIF: elapsed time t=1.14995 s, 1024 iters, t-(init.)=1.14995 s t(norm)=0.0498492, mflops=100.303 (err=2.3e-15) 21. Krukar: elapsed time t=1.09996 s, 1024 iters, t-(init.)=1.09996 s t(norm)=0.0476818, mflops=104.862 (err=1.4e-15) 22. Mayer (Buneman): elapsed time t=1.94992 s, 2048 iters, t-(init.)=1.93326 s t(norm)=0.0419022, mflops=119.325 (err=1.4e-15) 23. Mayer (simple): elapsed time t=1.6166 s, 2048 iters, t-(init.)=1.59994 s t(norm)=0.0346777, mflops=144.185 24. Mayer (lookup): elapsed time t=1.63327 s, 2048 iters, t-(init.)=1.6166 s t(norm)=0.0350389, mflops=142.699 (err=1.4e-15) 25. Monro: elapsed time t=1.13329 s, 1024 iters, t-(init.)=1.11662 s t(norm)=0.0484043, mflops=103.297 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.74993 s, 512 iters, t-(init.)=1.74993 s t(norm)=0.151715, mflops=32.9566 (err=1.5e-14) 27. Nielsen: elapsed time t=1.11662 s, 1024 iters, t-(init.)=1.09996 s t(norm)=0.0476818, mflops=104.862 (err=1.1e-14) 28. NR (C): elapsed time t=1.44994 s, 1024 iters, t-(init.)=1.44994 s t(norm)=0.0628533, mflops=79.5503 (err=1.4e-15) 29. NR (F): elapsed time t=1.51661 s, 1024 iters, t-(init.)=1.51661 s t(norm)=0.0657431, mflops=76.0536 (err=1.4e-15) 30. 
Ooura (C): elapsed time t=1.68327 s, 4096 iters, t-(init.)=1.63327 s t(norm)=0.0177001, mflops=282.485 (err=1.4e-15) 31. Ooura (F): elapsed time t=1.13329 s, 2048 iters, t-(init.)=1.11662 s t(norm)=0.0242021, mflops=206.593 (err=1.4e-15) 32. Ransom: elapsed time t=1.16662 s, 1024 iters, t-(init.)=1.16662 s t(norm)=0.0505716, mflops=98.8697 (err=2.1e-15) 33. SCIPORT: elapsed time t=1.33328 s, 2048 iters, t-(init.)=1.31661 s t(norm)=0.0285368, mflops=175.212 (err=1.4e-15) 34. Singleton: elapsed time t=1.23328 s, 2048 iters, t-(init.)=1.21662 s t(norm)=0.0263695, mflops=189.613 (err=1.9e-15) 35. Singleton (f2c): elapsed time t=1.53327 s, 2048 iters, t-(init.)=1.51661 s t(norm)=0.0328715, mflops=152.107 (err=1.9e-15) 36. Sorensen: elapsed time t=1.18329 s, 2048 iters, t-(init.)=1.16662 s t(norm)=0.0252858, mflops=197.739 (err=1.4e-15) 37. Sorensen DIT: elapsed time t=1.7666 s, 1024 iters, t-(init.)=1.7666 s t(norm)=0.0765799, mflops=65.2913 (err=1.4e-15) 38. Temperton: elapsed time t=1.44994 s, 2048 iters, t-(init.)=1.43328 s t(norm)=0.0310654, mflops=160.951 (err=1.1e-07) 39. Temperton (f2c): elapsed time t=1.74993 s, 2048 iters, t-(init.)=1.73326 s t(norm)=0.0375675, mflops=133.094 (err=1.4e-15) 40. Valkenburg: elapsed time t=1.69993 s, 256 iters, t-(init.)=1.69993 s t(norm)=0.29476, mflops=16.9629 (err=1.7e-15) 41. DXML: elapsed time t=1.31661 s, 2048 iters, t-(init.)=1.29995 s t(norm)=0.0281756, mflops=177.458 (err=2.9e-15) Top mflops for N=2048 = 407.11 Normalized results and averages for N=2048: fft 0: mflops = 141.242 (norm. = 0.346939), norm. avg. (of 11) = 0.306354 fft 1: mflops = 145.703 (norm. = 0.357895), norm. avg. (of 11) = 0.273523 fft 2: mflops = 87.6061 (norm. = 0.21519), norm. avg. (of 11) = 0.166973 fft 3: mflops = 41.692 (norm. = 0.10241), norm. avg. (of 11) = 0.0523235 fft 4: mflops = 157.293 (norm. = 0.386364), norm. avg. (of 11) = 0.29572 fft 5: mflops = 20.5979 (norm. = 0.0505952), norm. avg. (of 11) = 0.0412811 fft 6: mflops = 200.605 (norm. 
= 0.492754), norm. avg. (of 11) = 0.242887 fft 7: mflops = 87.6061 (norm. = 0.21519), norm. avg. (of 11) = 0.141077 fft 8: mflops = 75.2269 (norm. = 0.184783), norm. avg. (of 11) = 0.141093 fft 9: mflops = 282.485 (norm. = 0.693878), norm. avg. (of 11) = 0.383031 fft 10: mflops = 346.044 (norm. = 0.85), norm. avg. (of 11) = 0.427326 fft 11: mflops = 65.2913 (norm. = 0.160377), norm. avg. (of 10) = 0.0992621 fft 12: mflops = 300.908 (norm. = 0.73913), norm. avg. (of 11) = 0.516622 fft 13: mflops = 157.293 (norm. = 0.386364), norm. avg. (of 11) = 0.268426 fft 14: mflops = 274.094 (norm. = 0.673267), norm. avg. (of 11) = 0.832927 fft 15: mflops = 276.835 (norm. = 0.68), norm. avg. (of 11) = 0.859627 fft 16: mflops = 294.505 (norm. = 0.723404), norm. avg. (of 11) = 0.82151 fft 17: mflops = 407.11 (norm. = 1), norm. avg. (of 9) = 0.609455 fft 18: mflops = 184.557 (norm. = 0.453333), norm. avg. (of 11) = 0.294279 fft 19: mflops = 82.3914 (norm. = 0.202381), norm. avg. (of 11) = 0.121067 fft 20: mflops = 100.303 (norm. = 0.246377), norm. avg. (of 11) = 0.144836 fft 21: mflops = 104.862 (norm. = 0.257576), norm. avg. (of 11) = 0.333456 fft 22: mflops = 119.325 (norm. = 0.293103), norm. avg. (of 10) = 0.201439 fft 23: mflops = 144.185 (norm. = 0.354167), norm. avg. (of 10) = 0.246286 fft 24: mflops = 142.699 (norm. = 0.350515), norm. avg. (of 10) = 0.240303 fft 25: mflops = 103.297 (norm. = 0.253731), norm. avg. (of 10) = 0.131591 fft 26: mflops = 32.9566 (norm. = 0.0809524), norm. avg. (of 11) = 0.0566094 fft 27: mflops = 104.862 (norm. = 0.257576), norm. avg. (of 11) = 0.127051 fft 28: mflops = 79.5503 (norm. = 0.195402), norm. avg. (of 11) = 0.113317 fft 29: mflops = 76.0536 (norm. = 0.186813), norm. avg. (of 11) = 0.114301 fft 30: mflops = 282.485 (norm. = 0.693878), norm. avg. (of 11) = 0.488766 fft 31: mflops = 206.593 (norm. = 0.507463), norm. avg. (of 11) = 0.41642 fft 32: mflops = 98.8697 (norm. = 0.242857), norm. avg. 
(of 10) = 0.115793 fft 33: mflops = 175.212 (norm. = 0.43038), norm. avg. (of 10) = 0.303848 fft 34: mflops = 189.613 (norm. = 0.465753), norm. avg. (of 11) = 0.242158 fft 35: mflops = 152.107 (norm. = 0.373626), norm. avg. (of 11) = 0.214527 fft 36: mflops = 197.739 (norm. = 0.485714), norm. avg. (of 11) = 0.327438 fft 37: mflops = 65.2913 (norm. = 0.160377), norm. avg. (of 11) = 0.147432 fft 38: mflops = 160.951 (norm. = 0.395349), norm. avg. (of 11) = 0.221088 fft 39: mflops = 133.094 (norm. = 0.326923), norm. avg. (of 11) = 0.169179 fft 40: mflops = 16.9629 (norm. = 0.0416667), norm. avg. (of 11) = 0.041419 fft 41: mflops = 177.458 (norm. = 0.435897), norm. avg. (of 11) = 0.402556 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.38328 s, 512 iters, t-(init.)=1.34995 s t(norm)=0.053642, mflops=93.2105 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.34995 s, 512 iters, t-(init.)=1.33328 s t(norm)=0.0529798, mflops=94.3756 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.91659 s, 512 iters, t-(init.)=1.89992 s t(norm)=0.0754962, mflops=66.2285 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.46661 s, 256 iters, t-(init.)=1.46661 s t(norm)=0.116556, mflops=42.898 (err=3.7e-15) 4. Bailey: elapsed time t=1.96659 s, 512 iters, t-(init.)=1.94992 s t(norm)=0.0774829, mflops=64.5303 (err=3.7e-15) 5. Beauregard: elapsed time t=1.54994 s, 128 iters, t-(init.)=1.54994 s t(norm)=0.246356, mflops=20.2958 (err=3.8e-15) 6. Bergland: elapsed time t=1.58327 s, 1024 iters, t-(init.)=1.53327 s t(norm)=0.0304634, mflops=164.132 (err=3.9e-15) 7. Brenner: elapsed time t=1.48327 s, 512 iters, t-(init.)=1.46661 s t(norm)=0.0582778, mflops=85.796 (err=3.8e-15) 8. Burrus: elapsed time t=1.09996 s, 256 iters, t-(init.)=1.09996 s t(norm)=0.0874166, mflops=57.1973 (err=3.7e-15) 9. CWP (min N) (N=4290): elapsed time t=1.04996 s, 1024 iters, t-(init.)=0.99996 s t(norm)=0.0198674, mflops=251.668 10. 
CWP (best N) (N=4368): elapsed time t=1.89992 s, 2048 iters, t-(init.)=1.78326 s t(norm)=0.0177151, mflops=282.245 11. Edelblute: elapsed time t=1.19995 s, 256 iters, t-(init.)=1.19995 s t(norm)=0.0953636, mflops=52.4309 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.74993 s, 1024 iters, t-(init.)=1.69993 s t(norm)=0.0337746, mflops=148.04 (err=3.8e-15) 13. FFTPACK (f2c): elapsed time t=1.04996 s, 512 iters, t-(init.)=1.01663 s t(norm)=0.0403971, mflops=123.771 (err=3.8e-15) FFTW_MEASURE plan: (cost = 1.041625e-03) FFTW_TWIDDLE 2 FFTW_TWIDDLE 32 FFTW_NOTW 64 14. FFTW: elapsed time t=1.24995 s, 1024 iters, t-(init.)=1.19995 s t(norm)=0.0238409, mflops=209.724 (err=3.8e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.16662 s, 1024 iters, t-(init.)=1.11662 s t(norm)=0.0221853, mflops=225.375 (err=3.8e-15) 16. Frigo-old: elapsed time t=1.58327 s, 1024 iters, t-(init.)=1.53327 s t(norm)=0.0304634, mflops=164.132 (err=3.8e-15) 17. Green: elapsed time t=1.04996 s, 1024 iters, t-(init.)=0.99996 s t(norm)=0.0198674, mflops=251.668 (err=3.8e-15) 18. GSL: elapsed time t=1.88326 s, 1024 iters, t-(init.)=1.83326 s t(norm)=0.0364236, mflops=137.274 (err=3.8e-15) 19. GSL DIT: elapsed time t=1.79993 s, 512 iters, t-(init.)=1.7666 s t(norm)=0.0701982, mflops=71.2269 (err=4.1e-15) 20. GSL DIF: elapsed time t=1.49994 s, 512 iters, t-(init.)=1.46661 s t(norm)=0.0582778, mflops=85.796 (err=4.3e-15) 21. Krukar: elapsed time t=1.34995 s, 512 iters, t-(init.)=1.33328 s t(norm)=0.0529798, mflops=94.3756 (err=3.8e-15) 22. Mayer (Buneman): elapsed time t=1.09996 s, 512 iters, t-(init.)=1.08329 s t(norm)=0.0430461, mflops=116.155 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.83326 s, 1024 iters, t-(init.)=1.78326 s t(norm)=0.0354302, mflops=141.122 24. Mayer (lookup): elapsed time t=1.89992 s, 1024 iters, t-(init.)=1.86659 s t(norm)=0.0370859, mflops=134.822 (err=3.7e-15) 25. 
Monro: elapsed time t=1.74993 s, 512 iters, t-(init.)=1.7166 s t(norm)=0.0682115, mflops=73.3014 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.06662 s, 128 iters, t-(init.)=1.06662 s t(norm)=0.169535, mflops=29.4924 (err=4.9e-14) 27. Nielsen: elapsed time t=1.46661 s, 512 iters, t-(init.)=1.44994 s t(norm)=0.0576155, mflops=86.7822 (err=2.6e-14) 28. NR (C): elapsed time t=1.88326 s, 512 iters, t-(init.)=1.86659 s t(norm)=0.0741717, mflops=67.4112 (err=3.9e-15) 29. NR (F): elapsed time t=1.98325 s, 512 iters, t-(init.)=1.96659 s t(norm)=0.0781452, mflops=63.9835 (err=3.9e-15) 30. Ooura (C): elapsed time t=1.01663 s, 1024 iters, t-(init.)=0.966628 s t(norm)=0.0192052, mflops=260.347 (err=3.9e-15) 31. Ooura (F): elapsed time t=1.21662 s, 1024 iters, t-(init.)=1.16662 s t(norm)=0.0231787, mflops=215.716 (err=3.9e-15) 32. Ransom: elapsed time t=1.14995 s, 512 iters, t-(init.)=1.11662 s t(norm)=0.0443706, mflops=112.687 (err=4.4e-15) 33. SCIPORT: elapsed time t=1.23328 s, 512 iters, t-(init.)=1.21662 s t(norm)=0.0483441, mflops=103.425 (err=3.8e-15) 34. Singleton: elapsed time t=1.46661 s, 1024 iters, t-(init.)=1.41661 s t(norm)=0.0281455, mflops=177.648 (err=5.8e-15) 35. Singleton (f2c): elapsed time t=1.79993 s, 1024 iters, t-(init.)=1.74993 s t(norm)=0.034768, mflops=143.81 (err=5.8e-15) 36. Sorensen: elapsed time t=1.13329 s, 512 iters, t-(init.)=1.11662 s t(norm)=0.0443706, mflops=112.687 (err=3.7e-15) 37. Sorensen DIT: elapsed time t=1.24995 s, 256 iters, t-(init.)=1.23328 s t(norm)=0.0980126, mflops=51.0138 (err=3.7e-15) 38. Temperton: elapsed time t=1.91659 s, 1024 iters, t-(init.)=1.86659 s t(norm)=0.0370859, mflops=134.822 (err=1.2e-07) 39. Temperton (f2c): elapsed time t=1.13329 s, 512 iters, t-(init.)=1.09996 s t(norm)=0.0437083, mflops=114.395 (err=3.8e-15) 40. Valkenburg: elapsed time t=1.01663 s, 64 iters, t-(init.)=1.01663 s t(norm)=0.323177, mflops=15.4714 (err=4.0e-15) 41. 
DXML: elapsed time t=1.53327 s, 1024 iters, t-(init.)=1.48327 s t(norm)=0.02947, mflops=169.664 (err=4.7e-15) Top mflops for N=4096 = 282.245 Normalized results and averages for N=4096: fft 0: mflops = 93.2105 (norm. = 0.330247), norm. avg. (of 12) = 0.308345 fft 1: mflops = 94.3756 (norm. = 0.334375), norm. avg. (of 12) = 0.278594 fft 2: mflops = 66.2285 (norm. = 0.234649), norm. avg. (of 12) = 0.172613 fft 3: mflops = 42.898 (norm. = 0.151989), norm. avg. (of 12) = 0.0606289 fft 4: mflops = 64.5303 (norm. = 0.228632), norm. avg. (of 12) = 0.290129 fft 5: mflops = 20.2958 (norm. = 0.0719086), norm. avg. (of 12) = 0.0438334 fft 6: mflops = 164.132 (norm. = 0.581522), norm. avg. (of 12) = 0.271107 fft 7: mflops = 85.796 (norm. = 0.303977), norm. avg. (of 12) = 0.154652 fft 8: mflops = 57.1973 (norm. = 0.202652), norm. avg. (of 12) = 0.146223 fft 9: mflops = 251.668 (norm. = 0.891667), norm. avg. (of 12) = 0.425417 fft 10: mflops = 282.245 (norm. = 1), norm. avg. (of 12) = 0.475049 fft 11: mflops = 52.4309 (norm. = 0.185764), norm. avg. (of 11) = 0.107126 fft 12: mflops = 148.04 (norm. = 0.52451), norm. avg. (of 12) = 0.51728 fft 13: mflops = 123.771 (norm. = 0.438525), norm. avg. (of 12) = 0.282601 fft 14: mflops = 209.724 (norm. = 0.743056), norm. avg. (of 12) = 0.825438 fft 15: mflops = 225.375 (norm. = 0.798507), norm. avg. (of 12) = 0.854533 fft 16: mflops = 164.132 (norm. = 0.581522), norm. avg. (of 12) = 0.801511 fft 17: mflops = 251.668 (norm. = 0.891667), norm. avg. (of 10) = 0.637676 fft 18: mflops = 137.274 (norm. = 0.486364), norm. avg. (of 12) = 0.310286 fft 19: mflops = 71.2269 (norm. = 0.252358), norm. avg. (of 12) = 0.132008 fft 20: mflops = 85.796 (norm. = 0.303977), norm. avg. (of 12) = 0.158098 fft 21: mflops = 94.3756 (norm. = 0.334375), norm. avg. (of 12) = 0.333532 fft 22: mflops = 116.155 (norm. = 0.411538), norm. avg. (of 11) = 0.220539 fft 23: mflops = 141.122 (norm. = 0.5), norm. avg. (of 11) = 0.269351 fft 24: mflops = 134.822 (norm. 
= 0.477679), norm. avg. (of 11) = 0.261882 fft 25: mflops = 73.3014 (norm. = 0.259709), norm. avg. (of 11) = 0.143238 fft 26: mflops = 29.4924 (norm. = 0.104492), norm. avg. (of 12) = 0.0605996 fft 27: mflops = 86.7822 (norm. = 0.307471), norm. avg. (of 12) = 0.142086 fft 28: mflops = 67.4112 (norm. = 0.238839), norm. avg. (of 12) = 0.123777 fft 29: mflops = 63.9835 (norm. = 0.226695), norm. avg. (of 12) = 0.123667 fft 30: mflops = 260.347 (norm. = 0.922414), norm. avg. (of 12) = 0.524903 fft 31: mflops = 215.716 (norm. = 0.764286), norm. avg. (of 12) = 0.445409 fft 32: mflops = 112.687 (norm. = 0.399254), norm. avg. (of 11) = 0.141562 fft 33: mflops = 103.425 (norm. = 0.366438), norm. avg. (of 11) = 0.309538 fft 34: mflops = 177.648 (norm. = 0.629412), norm. avg. (of 12) = 0.274429 fft 35: mflops = 143.81 (norm. = 0.509524), norm. avg. (of 12) = 0.23911 fft 36: mflops = 112.687 (norm. = 0.399254), norm. avg. (of 12) = 0.333422 fft 37: mflops = 51.0138 (norm. = 0.180743), norm. avg. (of 12) = 0.150208 fft 38: mflops = 134.822 (norm. = 0.477679), norm. avg. (of 12) = 0.24247 fft 39: mflops = 114.395 (norm. = 0.405303), norm. avg. (of 12) = 0.188856 fft 40: mflops = 15.4714 (norm. = 0.0548156), norm. avg. (of 12) = 0.0425354 fft 41: mflops = 169.664 (norm. = 0.601124), norm. avg. (of 12) = 0.419103 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.89992 s, 256 iters, t-(init.)=1.86659 s t(norm)=0.0684662, mflops=73.0287 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.93326 s, 256 iters, t-(init.)=1.88326 s t(norm)=0.0690775, mflops=72.3825 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.26662 s, 128 iters, t-(init.)=1.24995 s t(norm)=0.0916958, mflops=54.5281 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.69993 s, 128 iters, t-(init.)=1.68327 s t(norm)=0.123484, mflops=40.4912 (err=3.7e-15) 4. Bailey: elapsed time t=1.23328 s, 128 iters, t-(init.)=1.19995 s t(norm)=0.088028, mflops=56.8001 (err=3.7e-15) 5. 
Beauregard: elapsed time t=1.69993 s, 64 iters, t-(init.)=1.68327 s t(norm)=0.246967, mflops=20.2456 (err=3.7e-15) 6. Bergland: elapsed time t=1.04996 s, 256 iters, t-(init.)=0.99996 s t(norm)=0.0366783, mflops=136.32 (err=3.7e-15) 7. Brenner: elapsed time t=1.84993 s, 256 iters, t-(init.)=1.81659 s t(norm)=0.0666323, mflops=75.0387 (err=3.7e-15) 8. Burrus: elapsed time t=1.48327 s, 128 iters, t-(init.)=1.46661 s t(norm)=0.10759, mflops=46.4728 (err=3.7e-15) 9. CWP (min N) (N=8580): elapsed time t=1.18329 s, 512 iters, t-(init.)=1.08329 s t(norm)=0.0198674, mflops=251.668 10. CWP (best N) (N=9240): elapsed time t=1.06662 s, 512 iters, t-(init.)=0.983294 s t(norm)=0.0180335, mflops=277.262 11. Edelblute: elapsed time t=1.49994 s, 128 iters, t-(init.)=1.46661 s t(norm)=0.10759, mflops=46.4728 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.48327 s, 256 iters, t-(init.)=1.44994 s t(norm)=0.0531836, mflops=94.014 (err=3.7e-15) 13. FFTPACK (f2c): elapsed time t=1.68327 s, 256 iters, t-(init.)=1.64993 s t(norm)=0.0605192, mflops=82.6184 (err=3.7e-15) FFTW_MEASURE plan: (cost = 2.734266e-03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 32 FFTW_NOTW 32 14. FFTW: elapsed time t=1.29995 s, 512 iters, t-(init.)=1.21662 s t(norm)=0.0223126, mflops=224.088 (err=3.7e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.51661 s, 512 iters, t-(init.)=1.43328 s t(norm)=0.0262861, mflops=190.214 (err=3.7e-15) 16. Frigo-old: elapsed time t=1.11662 s, 256 iters, t-(init.)=1.08329 s t(norm)=0.0397348, mflops=125.834 (err=3.7e-15) 17. Green: elapsed time t=1.39994 s, 512 iters, t-(init.)=1.33328 s t(norm)=0.0244522, mflops=204.48 (err=3.7e-15) 18. GSL: elapsed time t=1.39994 s, 256 iters, t-(init.)=1.34995 s t(norm)=0.0495157, mflops=100.978 (err=3.7e-15) 19. GSL DIT: elapsed time t=1.21662 s, 128 iters, t-(init.)=1.19995 s t(norm)=0.088028, mflops=56.8001 (err=4.3e-15) 20. 
GSL DIF: elapsed time t=1.01663 s, 128 iters, t-(init.)=0.99996 s t(norm)=0.0733566, mflops=68.1602 (err=4.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.29995 s, 256 iters, t-(init.)=1.24995 s t(norm)=0.0458479, mflops=109.056 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.11662 s, 256 iters, t-(init.)=1.06662 s t(norm)=0.0391235, mflops=127.8 24. Mayer (lookup): elapsed time t=1.23328 s, 256 iters, t-(init.)=1.19995 s t(norm)=0.044014, mflops=113.6 (err=3.7e-15) 25. Monro: elapsed time t=1.23328 s, 128 iters, t-(init.)=1.21662 s t(norm)=0.0892506, mflops=56.0221 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.26662 s, 64 iters, t-(init.)=1.26662 s t(norm)=0.185837, mflops=26.9053 (err=4.5e-14) 27. Nielsen: elapsed time t=1.23328 s, 128 iters, t-(init.)=1.21662 s t(norm)=0.0892506, mflops=56.0221 (err=1.1e-14) 28. NR (C): elapsed time t=1.24995 s, 128 iters, t-(init.)=1.23328 s t(norm)=0.0904732, mflops=55.265 (err=3.9e-15) 29. NR (F): elapsed time t=1.29995 s, 128 iters, t-(init.)=1.28328 s t(norm)=0.094141, mflops=53.1118 (err=3.9e-15) 30. Ooura (C): elapsed time t=1.49994 s, 512 iters, t-(init.)=1.41661 s t(norm)=0.0259805, mflops=192.452 (err=3.7e-15) 31. Ooura (F): elapsed time t=1.58327 s, 512 iters, t-(init.)=1.49994 s t(norm)=0.0275087, mflops=181.76 (err=3.7e-15) 32. Ransom: elapsed time t=1.46661 s, 256 iters, t-(init.)=1.41661 s t(norm)=0.0519609, mflops=96.2261 (err=4.9e-15) 33. SCIPORT: elapsed time t=1.99992 s, 256 iters, t-(init.)=1.96659 s t(norm)=0.072134, mflops=69.3154 (err=3.7e-15) 34. Singleton: elapsed time t=1.08329 s, 256 iters, t-(init.)=1.03329 s t(norm)=0.0379009, mflops=131.923 (err=5.6e-15) 35. Singleton (f2c): elapsed time t=1.28328 s, 256 iters, t-(init.)=1.24995 s t(norm)=0.0458479, mflops=109.056 (err=5.6e-15) 36. Sorensen: elapsed time t=1.5666 s, 256 iters, t-(init.)=1.53327 s t(norm)=0.0562401, mflops=88.9046 (err=3.7e-15) 37. 
Sorensen DIT: elapsed time t=1.63327 s, 128 iters, t-(init.)=1.6166 s t(norm)=0.118593, mflops=42.1609 (err=3.7e-15) 38. Temperton: elapsed time t=1.38328 s, 256 iters, t-(init.)=1.34995 s t(norm)=0.0495157, mflops=100.978 (err=1.4e-07) 39. Temperton (f2c): elapsed time t=1.6666 s, 256 iters, t-(init.)=1.6166 s t(norm)=0.0592966, mflops=84.3219 (err=3.7e-15) 40. Valkenburg: elapsed time t=1.14995 s, 32 iters, t-(init.)=1.14995 s t(norm)=0.33744, mflops=14.8174 (err=3.8e-15) 41. DXML: elapsed time t=1.13329 s, 256 iters, t-(init.)=1.08329 s t(norm)=0.0397348, mflops=125.834 (err=4.6e-15) Top mflops for N=8192 = 277.262 Normalized results and averages for N=8192: fft 0: mflops = 73.0287 (norm. = 0.263393), norm. avg. (of 13) = 0.304887 fft 1: mflops = 72.3825 (norm. = 0.261062), norm. avg. (of 13) = 0.277245 fft 2: mflops = 54.5281 (norm. = 0.196667), norm. avg. (of 13) = 0.174463 fft 3: mflops = 40.4912 (norm. = 0.14604), norm. avg. (of 13) = 0.0671989 fft 4: mflops = 56.8001 (norm. = 0.204861), norm. avg. (of 13) = 0.28357 fft 5: mflops = 20.2456 (norm. = 0.0730198), norm. avg. (of 13) = 0.0460785 fft 6: mflops = 136.32 (norm. = 0.491667), norm. avg. (of 13) = 0.288073 fft 7: mflops = 75.0387 (norm. = 0.270642), norm. avg. (of 13) = 0.163574 fft 8: mflops = 46.4728 (norm. = 0.167614), norm. avg. (of 13) = 0.147869 fft 9: mflops = 251.668 (norm. = 0.907692), norm. avg. (of 13) = 0.462515 fft 10: mflops = 277.262 (norm. = 1), norm. avg. (of 13) = 0.51543 fft 11: mflops = 46.4728 (norm. = 0.167614), norm. avg. (of 12) = 0.112167 fft 12: mflops = 94.014 (norm. = 0.33908), norm. avg. (of 13) = 0.503572 fft 13: mflops = 82.6184 (norm. = 0.29798), norm. avg. (of 13) = 0.283784 fft 14: mflops = 224.088 (norm. = 0.808219), norm. avg. (of 13) = 0.824113 fft 15: mflops = 190.214 (norm. = 0.686047), norm. avg. (of 13) = 0.841573 fft 16: mflops = 125.834 (norm. = 0.453846), norm. avg. (of 13) = 0.774768 fft 17: mflops = 204.48 (norm. = 0.7375), norm. avg. 
(of 11) = 0.646751 fft 18: mflops = 100.978 (norm. = 0.364198), norm. avg. (of 13) = 0.314433 fft 19: mflops = 56.8001 (norm. = 0.204861), norm. avg. (of 13) = 0.137612 fft 20: mflops = 68.1602 (norm. = 0.245833), norm. avg. (of 13) = 0.164847 fft 21: mflops = -1 (norm. = -0.0036067), norm. avg. (of 12) = 0.333532 fft 22: mflops = 109.056 (norm. = 0.393333), norm. avg. (of 12) = 0.234939 fft 23: mflops = 127.8 (norm. = 0.460938), norm. avg. (of 12) = 0.285317 fft 24: mflops = 113.6 (norm. = 0.409722), norm. avg. (of 12) = 0.274202 fft 25: mflops = 56.0221 (norm. = 0.202055), norm. avg. (of 12) = 0.148139 fft 26: mflops = 26.9053 (norm. = 0.0970395), norm. avg. (of 13) = 0.0634027 fft 27: mflops = 56.0221 (norm. = 0.202055), norm. avg. (of 13) = 0.146699 fft 28: mflops = 55.265 (norm. = 0.199324), norm. avg. (of 13) = 0.129588 fft 29: mflops = 53.1118 (norm. = 0.191558), norm. avg. (of 13) = 0.12889 fft 30: mflops = 192.452 (norm. = 0.694118), norm. avg. (of 13) = 0.53792 fft 31: mflops = 181.76 (norm. = 0.655556), norm. avg. (of 13) = 0.461574 fft 32: mflops = 96.2261 (norm. = 0.347059), norm. avg. (of 12) = 0.158687 fft 33: mflops = 69.3154 (norm. = 0.25), norm. avg. (of 12) = 0.304577 fft 34: mflops = 131.923 (norm. = 0.475806), norm. avg. (of 13) = 0.28992 fft 35: mflops = 109.056 (norm. = 0.393333), norm. avg. (of 13) = 0.250973 fft 36: mflops = 88.9046 (norm. = 0.320652), norm. avg. (of 13) = 0.33244 fft 37: mflops = 42.1609 (norm. = 0.152062), norm. avg. (of 13) = 0.150351 fft 38: mflops = 100.978 (norm. = 0.364198), norm. avg. (of 13) = 0.251834 fft 39: mflops = 84.3219 (norm. = 0.304124), norm. avg. (of 13) = 0.197723 fft 40: mflops = 14.8174 (norm. = 0.053442), norm. avg. (of 13) = 0.0433743 fft 41: mflops = 125.834 (norm. = 0.453846), norm. avg. (of 13) = 0.421776 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.63327 s, 64 iters, t-(init.)=1.59994 s t(norm)=0.108987, mflops=45.877 (err=6.8e-15) 1. 
Arndt DIT: elapsed time t=1.64993 s, 64 iters, t-(init.)=1.6166 s t(norm)=0.110122, mflops=45.4041 (err=6.8e-15) 2. Arndt Split-Radix: elapsed time t=2.01659 s, 64 iters, t-(init.)=1.98325 s t(norm)=0.135098, mflops=37.01 (err=6.8e-15) 3. Arndt 4-step: elapsed time t=1.79993 s, 64 iters, t-(init.)=1.7666 s t(norm)=0.12034, mflops=41.549 (err=6.8e-15) 4. Bailey: elapsed time t=1.69993 s, 64 iters, t-(init.)=1.6666 s t(norm)=0.113528, mflops=44.042 (err=6.8e-15) 5. Beauregard: elapsed time t=1.93326 s, 32 iters, t-(init.)=1.91659 s t(norm)=0.261115, mflops=19.1487 (err=6.8e-15) 6. Bergland: elapsed time t=1.53327 s, 128 iters, t-(init.)=1.46661 s t(norm)=0.0499524, mflops=100.095 (err=6.8e-15) 7. Brenner: elapsed time t=1.23328 s, 64 iters, t-(init.)=1.19995 s t(norm)=0.0817402, mflops=61.1694 (err=6.8e-15) 8. Burrus: elapsed time t=1.09996 s, 32 iters, t-(init.)=1.08329 s t(norm)=0.147587, mflops=33.8784 (err=6.8e-15) 9. CWP (min N) (N=17160): elapsed time t=1.49994 s, 256 iters, t-(init.)=1.34995 s t(norm)=0.0229894, mflops=217.491 10. CWP (best N) (N=17160): elapsed time t=1.49994 s, 256 iters, t-(init.)=1.34995 s t(norm)=0.0229894, mflops=217.491 11. Edelblute: elapsed time t=1.11662 s, 32 iters, t-(init.)=1.09996 s t(norm)=0.149857, mflops=33.3651 (err=6.8e-15) 12. FFTPACK: elapsed time t=1.74993 s, 128 iters, t-(init.)=1.68327 s t(norm)=0.0573317, mflops=87.2118 (err=6.8e-15) 13. FFTPACK (f2c): elapsed time t=1.93326 s, 128 iters, t-(init.)=1.84993 s t(norm)=0.0630081, mflops=79.3549 (err=6.8e-15) FFTW_MEASURE plan: (cost = 6.510156e-03) FFTW_TWIDDLE 16 FFTW_TWIDDLE 32 FFTW_NOTW 32 14. FFTW: elapsed time t=1.81659 s, 256 iters, t-(init.)=1.68327 s t(norm)=0.0286658, mflops=174.424 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.5666 s, 256 iters, t-(init.)=1.43328 s t(norm)=0.0244085, mflops=204.846 (err=6.8e-15) 16. 
Frigo-old: elapsed time t=1.41661 s, 128 iters, t-(init.)=1.34995 s t(norm)=0.0459789, mflops=108.746 (err=6.8e-15) 17. Green: elapsed time t=1.26662 s, 128 iters, t-(init.)=1.19995 s t(norm)=0.0408701, mflops=122.339 (err=6.8e-15) 18. GSL: elapsed time t=1.6666 s, 128 iters, t-(init.)=1.59994 s t(norm)=0.0544935, mflops=91.7541 (err=6.8e-15) 19. GSL DIT: elapsed time t=1.84993 s, 64 iters, t-(init.)=1.79993 s t(norm)=0.12261, mflops=40.7796 (err=7.2e-15) 20. GSL DIF: elapsed time t=1.59994 s, 64 iters, t-(init.)=1.5666 s t(norm)=0.106716, mflops=46.8531 (err=7.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.03329 s, 64 iters, t-(init.)=0.99996 s t(norm)=0.0681169, mflops=73.4033 (err=6.8e-15) 23. Mayer (simple): elapsed time t=1.91659 s, 128 iters, t-(init.)=1.84993 s t(norm)=0.0630081, mflops=79.3549 24. Mayer (lookup): elapsed time t=1.03329 s, 64 iters, t-(init.)=0.99996 s t(norm)=0.0681169, mflops=73.4033 (err=6.8e-15) 25. Monro: elapsed time t=1.86659 s, 64 iters, t-(init.)=1.83326 s t(norm)=0.124881, mflops=40.0381 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.43328 s, 32 iters, t-(init.)=1.41661 s t(norm)=0.192998, mflops=25.907 (err=2.3e-13) 27. Nielsen: elapsed time t=1.7166 s, 64 iters, t-(init.)=1.68327 s t(norm)=0.114663, mflops=43.6059 (err=1.3e-13) 28. NR (C): elapsed time t=1.86659 s, 64 iters, t-(init.)=1.83326 s t(norm)=0.124881, mflops=40.0381 (err=6.9e-15) 29. NR (F): elapsed time t=1.91659 s, 64 iters, t-(init.)=1.88326 s t(norm)=0.128287, mflops=38.9752 (err=6.9e-15) 30. Ooura (C): elapsed time t=1.01663 s, 128 iters, t-(init.)=0.949962 s t(norm)=0.0323555, mflops=154.533 (err=6.8e-15) 31. Ooura (F): elapsed time t=1.03329 s, 128 iters, t-(init.)=0.966628 s t(norm)=0.0329232, mflops=151.869 (err=6.8e-15) 32. Ransom: elapsed time t=1.46661 s, 128 iters, t-(init.)=1.39994 s t(norm)=0.0476818, mflops=104.862 (err=7.4e-15) 33. 
SCIPORT: elapsed time t=1.23328 s, 64 iters, t-(init.)=1.19995 s t(norm)=0.0817402, mflops=61.1694 (err=6.8e-15) 34. Singleton: elapsed time t=1.6666 s, 128 iters, t-(init.)=1.58327 s t(norm)=0.0539259, mflops=92.7199 (err=1.0e-14) 35. Singleton (f2c): elapsed time t=1.89992 s, 128 iters, t-(init.)=1.83326 s t(norm)=0.0624405, mflops=80.0763 (err=1.0e-14) 36. Sorensen: elapsed time t=1.13329 s, 64 iters, t-(init.)=1.09996 s t(norm)=0.0749286, mflops=66.7302 (err=6.8e-15) 37. Sorensen DIT: elapsed time t=1.19995 s, 32 iters, t-(init.)=1.18329 s t(norm)=0.16121, mflops=31.0155 (err=6.8e-15) 38. Temperton: elapsed time t=1.88326 s, 128 iters, t-(init.)=1.81659 s t(norm)=0.0618728, mflops=80.8109 (err=1.5e-07) 39. Temperton (f2c): elapsed time t=1.06662 s, 64 iters, t-(init.)=1.03329 s t(norm)=0.0703874, mflops=71.0354 (err=6.8e-15) 40. Valkenburg: elapsed time t=1.31661 s, 16 iters, t-(init.)=1.31661 s t(norm)=0.358749, mflops=13.9373 (err=6.9e-15) 41. DXML: elapsed time t=1.06662 s, 128 iters, t-(init.)=0.99996 s t(norm)=0.0340584, mflops=146.807 (err=7.6e-15) Top mflops for N=16384 = 217.491 Normalized results and averages for N=16384: fft 0: mflops = 45.877 (norm. = 0.210937), norm. avg. (of 14) = 0.298176 fft 1: mflops = 45.4041 (norm. = 0.208763), norm. avg. (of 14) = 0.272354 fft 2: mflops = 37.01 (norm. = 0.170168), norm. avg. (of 14) = 0.174156 fft 3: mflops = 41.549 (norm. = 0.191038), norm. avg. (of 14) = 0.0760446 fft 4: mflops = 44.042 (norm. = 0.2025), norm. avg. (of 14) = 0.277779 fft 5: mflops = 19.1487 (norm. = 0.0880435), norm. avg. (of 14) = 0.049076 fft 6: mflops = 100.095 (norm. = 0.460227), norm. avg. (of 14) = 0.30037 fft 7: mflops = 61.1694 (norm. = 0.28125), norm. avg. (of 14) = 0.17198 fft 8: mflops = 33.8784 (norm. = 0.155769), norm. avg. (of 14) = 0.148433 fft 9: mflops = 217.491 (norm. = 1), norm. avg. (of 14) = 0.500907 fft 10: mflops = 217.491 (norm. = 1), norm. avg. (of 14) = 0.550042 fft 11: mflops = 33.3651 (norm. = 0.153409), norm. 
avg. (of 13) = 0.115339 fft 12: mflops = 87.2118 (norm. = 0.40099), norm. avg. (of 14) = 0.496245 fft 13: mflops = 79.3549 (norm. = 0.364865), norm. avg. (of 14) = 0.289575 fft 14: mflops = 174.424 (norm. = 0.80198), norm. avg. (of 14) = 0.822532 fft 15: mflops = 204.846 (norm. = 0.94186), norm. avg. (of 14) = 0.848736 fft 16: mflops = 108.746 (norm. = 0.5), norm. avg. (of 14) = 0.755141 fft 17: mflops = 122.339 (norm. = 0.5625), norm. avg. (of 12) = 0.63973 fft 18: mflops = 91.7541 (norm. = 0.421875), norm. avg. (of 14) = 0.322108 fft 19: mflops = 40.7796 (norm. = 0.1875), norm. avg. (of 14) = 0.141175 fft 20: mflops = 46.8531 (norm. = 0.215426), norm. avg. (of 14) = 0.16846 fft 21: mflops = -1 (norm. = -0.00459789), norm. avg. (of 12) = 0.333532 fft 22: mflops = 73.4033 (norm. = 0.3375), norm. avg. (of 13) = 0.242828 fft 23: mflops = 79.3549 (norm. = 0.364865), norm. avg. (of 13) = 0.291436 fft 24: mflops = 73.4033 (norm. = 0.3375), norm. avg. (of 13) = 0.279071 fft 25: mflops = 40.0381 (norm. = 0.184091), norm. avg. (of 13) = 0.150905 fft 26: mflops = 25.907 (norm. = 0.119118), norm. avg. (of 14) = 0.0673823 fft 27: mflops = 43.6059 (norm. = 0.200495), norm. avg. (of 14) = 0.150542 fft 28: mflops = 40.0381 (norm. = 0.184091), norm. avg. (of 14) = 0.133481 fft 29: mflops = 38.9752 (norm. = 0.179204), norm. avg. (of 14) = 0.132483 fft 30: mflops = 154.533 (norm. = 0.710526), norm. avg. (of 14) = 0.550249 fft 31: mflops = 151.869 (norm. = 0.698276), norm. avg. (of 14) = 0.478481 fft 32: mflops = 104.862 (norm. = 0.482143), norm. avg. (of 13) = 0.183568 fft 33: mflops = 61.1694 (norm. = 0.28125), norm. avg. (of 13) = 0.302783 fft 34: mflops = 92.7199 (norm. = 0.426316), norm. avg. (of 14) = 0.299662 fft 35: mflops = 80.0763 (norm. = 0.368182), norm. avg. (of 14) = 0.259346 fft 36: mflops = 66.7302 (norm. = 0.306818), norm. avg. (of 14) = 0.33061 fft 37: mflops = 31.0155 (norm. = 0.142606), norm. avg. (of 14) = 0.149797 fft 38: mflops = 80.8109 (norm. 
= 0.37156), norm. avg. (of 14) = 0.260386 fft 39: mflops = 71.0354 (norm. = 0.326613), norm. avg. (of 14) = 0.206929 fft 40: mflops = 13.9373 (norm. = 0.0640823), norm. avg. (of 14) = 0.0448535 fft 41: mflops = 146.807 (norm. = 0.675), norm. avg. (of 14) = 0.439863 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.83326 s, 32 iters, t-(init.)=1.79993 s t(norm)=0.114436, mflops=43.6924 (err=1.4e-14) 1. Arndt DIT: elapsed time t=1.86659 s, 32 iters, t-(init.)=1.83326 s t(norm)=0.116556, mflops=42.898 (err=1.4e-14) 2. Arndt Split-Radix: elapsed time t=1.18329 s, 16 iters, t-(init.)=1.16662 s t(norm)=0.148343, mflops=33.7056 (err=1.4e-14) 3. Arndt 4-step: elapsed time t=1.11662 s, 16 iters, t-(init.)=1.09996 s t(norm)=0.139867, mflops=35.7483 (err=1.4e-14) 4. Bailey: elapsed time t=1.93326 s, 32 iters, t-(init.)=1.89992 s t(norm)=0.120794, mflops=41.3928 (err=1.4e-14) 5. Beauregard: elapsed time t=1.04996 s, 8 iters, t-(init.)=1.03329 s t(norm)=0.26278, mflops=19.0273 (err=1.4e-14) 6. Bergland: elapsed time t=1.68327 s, 64 iters, t-(init.)=1.6166 s t(norm)=0.0513904, mflops=97.2944 (err=1.4e-14) 7. Brenner: elapsed time t=1.41661 s, 32 iters, t-(init.)=1.38328 s t(norm)=0.0879464, mflops=56.8528 (err=1.4e-14) 8. Burrus: elapsed time t=1.29995 s, 16 iters, t-(init.)=1.28328 s t(norm)=0.163178, mflops=30.6414 (err=1.4e-14) 9. CWP (min N) (N=34320): elapsed time t=1.74993 s, 128 iters, t-(init.)=1.58327 s t(norm)=0.0251654, mflops=198.686 10. CWP (best N) (N=34320): elapsed time t=1.7666 s, 128 iters, t-(init.)=1.6166 s t(norm)=0.0256952, mflops=194.589 11. Edelblute: elapsed time t=1.29995 s, 16 iters, t-(init.)=1.28328 s t(norm)=0.163178, mflops=30.6414 (err=1.4e-14) 12. FFTPACK: elapsed time t=1.03329 s, 32 iters, t-(init.)=0.99996 s t(norm)=0.0635757, mflops=78.6463 (err=1.4e-14) 13. 
FFTPACK (f2c): elapsed time t=1.14995 s, 32 iters, t-(init.)=1.11662 s t(norm)=0.0709929, mflops=70.4296 (err=1.4e-14) FFTW_MEASURE plan: (cost = 1.458275e-02) FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 32 FFTW_NOTW 32 14. FFTW: elapsed time t=1.98325 s, 128 iters, t-(init.)=1.83326 s t(norm)=0.0291389, mflops=171.592 (err=1.4e-14) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.03329 s, 64 iters, t-(init.)=0.949962 s t(norm)=0.0301985, mflops=165.571 (err=1.4e-14) 16. Frigo-old: elapsed time t=1.69993 s, 64 iters, t-(init.)=1.6166 s t(norm)=0.0513904, mflops=97.2944 (err=1.4e-14) 17. Green: elapsed time t=1.41661 s, 64 iters, t-(init.)=1.34995 s t(norm)=0.0429136, mflops=116.513 (err=1.4e-14) 18. GSL: elapsed time t=1.94992 s, 64 iters, t-(init.)=1.88326 s t(norm)=0.0598672, mflops=83.5182 (err=1.4e-14) 19. GSL DIT: elapsed time t=1.03329 s, 16 iters, t-(init.)=0.99996 s t(norm)=0.127151, mflops=39.3232 (err=1.4e-14) 20. GSL DIF: elapsed time t=1.79993 s, 32 iters, t-(init.)=1.7666 s t(norm)=0.112317, mflops=44.5168 (err=1.4e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.53327 s, 32 iters, t-(init.)=1.49994 s t(norm)=0.0953636, mflops=52.4309 (err=1.4e-14) 23. Mayer (simple): elapsed time t=1.44994 s, 32 iters, t-(init.)=1.41661 s t(norm)=0.0900656, mflops=55.5151 24. Mayer (lookup): elapsed time t=1.53327 s, 32 iters, t-(init.)=1.49994 s t(norm)=0.0953636, mflops=52.4309 (err=1.4e-14) 25. Monro: elapsed time t=1.11662 s, 16 iters, t-(init.)=1.09996 s t(norm)=0.139867, mflops=35.7483 (err=1.5e-07) 26. NAPACK (f2c): elapsed time t=1.6666 s, 16 iters, t-(init.)=1.64993 s t(norm)=0.2098, mflops=23.8322 (err=5.6e-13) 27. Nielsen: elapsed time t=1.01663 s, 16 iters, t-(init.)=0.99996 s t(norm)=0.127151, mflops=39.3232 (err=2.3e-13) 28. 
NR (C): elapsed time t=1.04996 s, 16 iters, t-(init.)=1.01663 s t(norm)=0.129271, mflops=38.6785 (err=1.4e-14) 29. NR (F): elapsed time t=1.09996 s, 16 iters, t-(init.)=1.08329 s t(norm)=0.137747, mflops=36.2983 (err=1.4e-14) 30. Ooura (C): elapsed time t=1.24995 s, 64 iters, t-(init.)=1.16662 s t(norm)=0.0370859, mflops=134.822 (err=1.4e-14) 31. Ooura (F): elapsed time t=1.19995 s, 64 iters, t-(init.)=1.11662 s t(norm)=0.0354965, mflops=140.859 (err=1.4e-14) 32. Ransom: elapsed time t=1.81659 s, 64 iters, t-(init.)=1.74993 s t(norm)=0.0556288, mflops=89.8815 (err=1.5e-14) 33. SCIPORT: elapsed time t=1.58327 s, 32 iters, t-(init.)=1.54994 s t(norm)=0.0985424, mflops=50.7396 (err=1.4e-14) 34. Singleton: elapsed time t=1.13329 s, 32 iters, t-(init.)=1.09996 s t(norm)=0.0699333, mflops=71.4967 (err=2.1e-14) 35. Singleton (f2c): elapsed time t=1.28328 s, 32 iters, t-(init.)=1.24995 s t(norm)=0.0794697, mflops=62.9171 (err=2.1e-14) 36. Sorensen: elapsed time t=1.44994 s, 32 iters, t-(init.)=1.41661 s t(norm)=0.0900656, mflops=55.5151 (err=1.4e-14) 37. Sorensen DIT: elapsed time t=1.38328 s, 16 iters, t-(init.)=1.36661 s t(norm)=0.173774, mflops=28.7731 (err=1.4e-14) 38. Temperton: elapsed time t=1.14995 s, 32 iters, t-(init.)=1.11662 s t(norm)=0.0709929, mflops=70.4296 (err=1.5e-07) 39. Temperton (f2c): elapsed time t=1.28328 s, 32 iters, t-(init.)=1.23328 s t(norm)=0.0784101, mflops=63.7673 (err=1.4e-14) 40. Valkenburg: elapsed time t=1.49994 s, 8 iters, t-(init.)=1.48327 s t(norm)=0.377216, mflops=13.255 (err=1.4e-14) 41. DXML: elapsed time t=1.16662 s, 64 iters, t-(init.)=1.09996 s t(norm)=0.0349667, mflops=142.993 (err=1.4e-14) Top mflops for N=32768 = 198.686 Normalized results and averages for N=32768: fft 0: mflops = 43.6924 (norm. = 0.219907), norm. avg. (of 15) = 0.292958 fft 1: mflops = 42.898 (norm. = 0.215909), norm. avg. (of 15) = 0.268591 fft 2: mflops = 33.7056 (norm. = 0.169643), norm. avg. (of 15) = 0.173855 fft 3: mflops = 35.7483 (norm. 
= 0.179924), norm. avg. (of 15) = 0.0829699 fft 4: mflops = 41.3928 (norm. = 0.208333), norm. avg. (of 15) = 0.273149 fft 5: mflops = 19.0273 (norm. = 0.0957661), norm. avg. (of 15) = 0.0521887 fft 6: mflops = 97.2944 (norm. = 0.489691), norm. avg. (of 15) = 0.312991 fft 7: mflops = 56.8528 (norm. = 0.286145), norm. avg. (of 15) = 0.179591 fft 8: mflops = 30.6414 (norm. = 0.154221), norm. avg. (of 15) = 0.148819 fft 9: mflops = 198.686 (norm. = 1), norm. avg. (of 15) = 0.53418 fft 10: mflops = 194.589 (norm. = 0.979381), norm. avg. (of 15) = 0.578665 fft 11: mflops = 30.6414 (norm. = 0.154221), norm. avg. (of 14) = 0.118116 fft 12: mflops = 78.6463 (norm. = 0.395833), norm. avg. (of 15) = 0.489551 fft 13: mflops = 70.4296 (norm. = 0.354478), norm. avg. (of 15) = 0.293902 fft 14: mflops = 171.592 (norm. = 0.863636), norm. avg. (of 15) = 0.825273 fft 15: mflops = 165.571 (norm. = 0.833333), norm. avg. (of 15) = 0.847709 fft 16: mflops = 97.2944 (norm. = 0.489691), norm. avg. (of 15) = 0.737445 fft 17: mflops = 116.513 (norm. = 0.58642), norm. avg. (of 13) = 0.635629 fft 18: mflops = 83.5182 (norm. = 0.420354), norm. avg. (of 15) = 0.328657 fft 19: mflops = 39.3232 (norm. = 0.197917), norm. avg. (of 15) = 0.144958 fft 20: mflops = 44.5168 (norm. = 0.224057), norm. avg. (of 15) = 0.172166 fft 21: mflops = -1 (norm. = -0.00503308), norm. avg. (of 12) = 0.333532 fft 22: mflops = 52.4309 (norm. = 0.263889), norm. avg. (of 14) = 0.244332 fft 23: mflops = 55.5151 (norm. = 0.279412), norm. avg. (of 14) = 0.290577 fft 24: mflops = 52.4309 (norm. = 0.263889), norm. avg. (of 14) = 0.277987 fft 25: mflops = 35.7483 (norm. = 0.179924), norm. avg. (of 14) = 0.152978 fft 26: mflops = 23.8322 (norm. = 0.119949), norm. avg. (of 15) = 0.0708868 fft 27: mflops = 39.3232 (norm. = 0.197917), norm. avg. (of 15) = 0.1537 fft 28: mflops = 38.6785 (norm. = 0.194672), norm. avg. (of 15) = 0.137561 fft 29: mflops = 36.2983 (norm. = 0.182692), norm. avg. 
(of 15) = 0.135831 fft 30: mflops = 134.822 (norm. = 0.678571), norm. avg. (of 15) = 0.558804 fft 31: mflops = 140.859 (norm. = 0.708955), norm. avg. (of 15) = 0.493846 fft 32: mflops = 89.8815 (norm. = 0.452381), norm. avg. (of 14) = 0.202769 fft 33: mflops = 50.7396 (norm. = 0.255376), norm. avg. (of 14) = 0.299396 fft 34: mflops = 71.4967 (norm. = 0.359848), norm. avg. (of 15) = 0.303675 fft 35: mflops = 62.9171 (norm. = 0.316667), norm. avg. (of 15) = 0.263167 fft 36: mflops = 55.5151 (norm. = 0.279412), norm. avg. (of 15) = 0.327197 fft 37: mflops = 28.7731 (norm. = 0.144817), norm. avg. (of 15) = 0.149465 fft 38: mflops = 70.4296 (norm. = 0.354478), norm. avg. (of 15) = 0.266659 fft 39: mflops = 63.7673 (norm. = 0.320946), norm. avg. (of 15) = 0.21453 fft 40: mflops = 13.255 (norm. = 0.0667135), norm. avg. (of 15) = 0.0463108 fft 41: mflops = 142.993 (norm. = 0.719697), norm. avg. (of 15) = 0.458519 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.34995 s, 8 iters, t-(init.)=1.33328 s t(norm)=0.158939, mflops=31.4585 (err=1.7e-14) 1. Arndt DIT: elapsed time t=1.36661 s, 8 iters, t-(init.)=1.33328 s t(norm)=0.158939, mflops=31.4585 (err=1.7e-14) 2. Arndt Split-Radix: elapsed time t=1.7166 s, 8 iters, t-(init.)=1.69993 s t(norm)=0.202648, mflops=24.6734 (err=1.7e-14) 3. Arndt 4-step: elapsed time t=1.08329 s, 8 iters, t-(init.)=1.06662 s t(norm)=0.127151, mflops=39.3232 (err=1.7e-14) 4. Bailey: elapsed time t=1.23328 s, 8 iters, t-(init.)=1.19995 s t(norm)=0.143045, mflops=34.9539 (err=1.7e-14) 5. Beauregard: elapsed time t=1.16662 s, 4 iters, t-(init.)=1.14995 s t(norm)=0.27417, mflops=18.2368 (err=1.7e-14) 6. Bergland: elapsed time t=1.16662 s, 16 iters, t-(init.)=1.11662 s t(norm)=0.0665559, mflops=75.1249 (err=1.7e-14) 7. Brenner: elapsed time t=1.84993 s, 16 iters, t-(init.)=1.81659 s t(norm)=0.108277, mflops=46.1777 (err=1.7e-14) 8. 
Burrus: elapsed time t=1.86659 s, 8 iters, t-(init.)=1.84993 s t(norm)=0.220528, mflops=22.6728 (err=1.7e-14) 9. CWP (min N) (N=72072): elapsed time t=1.03329 s, 32 iters, t-(init.)=0.933296 s t(norm)=0.0278144, mflops=179.763 10. CWP (best N) (N=72072): elapsed time t=1.03329 s, 32 iters, t-(init.)=0.933296 s t(norm)=0.0278144, mflops=179.763 11. Edelblute: elapsed time t=1.83326 s, 8 iters, t-(init.)=1.79993 s t(norm)=0.214568, mflops=23.3026 (err=1.7e-14) 12. FFTPACK: elapsed time t=1.21662 s, 16 iters, t-(init.)=1.16662 s t(norm)=0.069536, mflops=71.9052 (err=1.7e-14) 13. FFTPACK (f2c): elapsed time t=1.26662 s, 16 iters, t-(init.)=1.21662 s t(norm)=0.0725161, mflops=68.9502 (err=1.7e-14) FFTW_MEASURE plan: (cost = 3.749850e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 32 FFTW_NOTW 64 14. FFTW: elapsed time t=1.31661 s, 32 iters, t-(init.)=1.21662 s t(norm)=0.036258, mflops=137.9 (err=1.7e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.31661 s, 32 iters, t-(init.)=1.23328 s t(norm)=0.0367547, mflops=136.037 (err=1.7e-14) 16. Frigo-old: elapsed time t=1.06662 s, 16 iters, t-(init.)=1.01663 s t(norm)=0.0605956, mflops=82.5142 (err=1.7e-14) 17. Green: elapsed time t=1.96659 s, 32 iters, t-(init.)=1.86659 s t(norm)=0.0556288, mflops=89.8815 (err=1.7e-14) 18. GSL: elapsed time t=1.11662 s, 16 iters, t-(init.)=1.08329 s t(norm)=0.0645691, mflops=77.4364 (err=1.7e-14) 19. GSL DIT: elapsed time t=1.46661 s, 8 iters, t-(init.)=1.44994 s t(norm)=0.172847, mflops=28.9274 (err=1.7e-14) 20. GSL DIF: elapsed time t=1.31661 s, 8 iters, t-(init.)=1.28328 s t(norm)=0.152979, mflops=32.6842 (err=1.8e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.03329 s, 8 iters, t-(init.)=1.01663 s t(norm)=0.121191, mflops=41.2571 (err=1.7e-14) 23. Mayer (simple): elapsed time t=1.96659 s, 16 iters, t-(init.)=1.91659 s t(norm)=0.114238, mflops=43.7684 24. 
Mayer (lookup): elapsed time t=1.01663 s, 8 iters, t-(init.)=0.983294 s t(norm)=0.117218, mflops=42.6556 (err=1.7e-14) 25. Monro: elapsed time t=1.51661 s, 8 iters, t-(init.)=1.48327 s t(norm)=0.17682, mflops=28.2773 (err=1.6e-07) 26. NAPACK (f2c): elapsed time t=1.86659 s, 8 iters, t-(init.)=1.83326 s t(norm)=0.218542, mflops=22.8789 (err=8.6e-13) 27. Nielsen: elapsed time t=1.44994 s, 8 iters, t-(init.)=1.43328 s t(norm)=0.17086, mflops=29.2638 (err=2.6e-13) 28. NR (C): elapsed time t=1.46661 s, 8 iters, t-(init.)=1.43328 s t(norm)=0.17086, mflops=29.2638 (err=1.7e-14) 29. NR (F): elapsed time t=1.51661 s, 8 iters, t-(init.)=1.49994 s t(norm)=0.178807, mflops=27.9631 (err=1.7e-14) 30. Ooura (C): elapsed time t=1.54994 s, 32 iters, t-(init.)=1.44994 s t(norm)=0.0432116, mflops=115.71 (err=1.7e-14) 31. Ooura (F): elapsed time t=1.54994 s, 32 iters, t-(init.)=1.46661 s t(norm)=0.0437083, mflops=114.395 (err=1.7e-14) 32. Ransom: elapsed time t=1.39994 s, 16 iters, t-(init.)=1.34995 s t(norm)=0.0804631, mflops=62.1403 (err=1.7e-14) 33. SCIPORT: elapsed time t=1.88326 s, 16 iters, t-(init.)=1.83326 s t(norm)=0.109271, mflops=45.7579 (err=1.7e-14) 34. Singleton: elapsed time t=1.41661 s, 16 iters, t-(init.)=1.36661 s t(norm)=0.0814564, mflops=61.3825 (err=2.3e-14) 35. Singleton (f2c): elapsed time t=1.53327 s, 16 iters, t-(init.)=1.48327 s t(norm)=0.08841, mflops=56.5547 (err=2.3e-14) 36. Sorensen: elapsed time t=1.04996 s, 8 iters, t-(init.)=1.03329 s t(norm)=0.123178, mflops=40.5917 (err=1.7e-14) 37. Sorensen DIT: elapsed time t=1.91659 s, 8 iters, t-(init.)=1.89992 s t(norm)=0.226489, mflops=22.0762 (err=1.7e-14) 38. Temperton: elapsed time t=1.49994 s, 16 iters, t-(init.)=1.44994 s t(norm)=0.0864233, mflops=57.8548 (err=1.7e-07) 39. Temperton (f2c): elapsed time t=1.64993 s, 16 iters, t-(init.)=1.59994 s t(norm)=0.0953636, mflops=52.4309 (err=1.7e-14) 40. 
Valkenburg: elapsed time t=1.69993 s, 4 iters, t-(init.)=1.68327 s t(norm)=0.401322, mflops=12.4588 (err=1.7e-14) 41. DXML: elapsed time t=1.29995 s, 32 iters, t-(init.)=1.19995 s t(norm)=0.0357614, mflops=139.816 (err=1.7e-14) Top mflops for N=65536 = 179.763 Normalized results and averages for N=65536: fft 0: mflops = 31.4585 (norm. = 0.175), norm. avg. (of 16) = 0.285586 fft 1: mflops = 31.4585 (norm. = 0.175), norm. avg. (of 16) = 0.262741 fft 2: mflops = 24.6734 (norm. = 0.137255), norm. avg. (of 16) = 0.171568 fft 3: mflops = 39.3232 (norm. = 0.21875), norm. avg. (of 16) = 0.0914561 fft 4: mflops = 34.9539 (norm. = 0.194444), norm. avg. (of 16) = 0.26823 fft 5: mflops = 18.2368 (norm. = 0.101449), norm. avg. (of 16) = 0.0552675 fft 6: mflops = 75.1249 (norm. = 0.41791), norm. avg. (of 16) = 0.319549 fft 7: mflops = 46.1777 (norm. = 0.256881), norm. avg. (of 16) = 0.184421 fft 8: mflops = 22.6728 (norm. = 0.126126), norm. avg. (of 16) = 0.147401 fft 9: mflops = 179.763 (norm. = 1), norm. avg. (of 16) = 0.563294 fft 10: mflops = 179.763 (norm. = 1), norm. avg. (of 16) = 0.604998 fft 11: mflops = 23.3026 (norm. = 0.12963), norm. avg. (of 15) = 0.118884 fft 12: mflops = 71.9052 (norm. = 0.4), norm. avg. (of 16) = 0.483954 fft 13: mflops = 68.9502 (norm. = 0.383562), norm. avg. (of 16) = 0.299506 fft 14: mflops = 137.9 (norm. = 0.767123), norm. avg. (of 16) = 0.821638 fft 15: mflops = 136.037 (norm. = 0.756757), norm. avg. (of 16) = 0.842025 fft 16: mflops = 82.5142 (norm. = 0.459016), norm. avg. (of 16) = 0.720043 fft 17: mflops = 89.8815 (norm. = 0.5), norm. avg. (of 14) = 0.625941 fft 18: mflops = 77.4364 (norm. = 0.430769), norm. avg. (of 16) = 0.335039 fft 19: mflops = 28.9274 (norm. = 0.16092), norm. avg. (of 16) = 0.145955 fft 20: mflops = 32.6842 (norm. = 0.181818), norm. avg. (of 16) = 0.172769 fft 21: mflops = -1 (norm. = -0.00556288), norm. avg. (of 12) = 0.333532 fft 22: mflops = 41.2571 (norm. = 0.229508), norm. avg. 
(of 15) = 0.243344 fft 23: mflops = 43.7684 (norm. = 0.243478), norm. avg. (of 15) = 0.287437 fft 24: mflops = 42.6556 (norm. = 0.237288), norm. avg. (of 15) = 0.275274 fft 25: mflops = 28.2773 (norm. = 0.157303), norm. avg. (of 15) = 0.153266 fft 26: mflops = 22.8789 (norm. = 0.127273), norm. avg. (of 16) = 0.0744109 fft 27: mflops = 29.2638 (norm. = 0.162791), norm. avg. (of 16) = 0.154268 fft 28: mflops = 29.2638 (norm. = 0.162791), norm. avg. (of 16) = 0.139138 fft 29: mflops = 27.9631 (norm. = 0.155556), norm. avg. (of 16) = 0.137064 fft 30: mflops = 115.71 (norm. = 0.643678), norm. avg. (of 16) = 0.564108 fft 31: mflops = 114.395 (norm. = 0.636364), norm. avg. (of 16) = 0.502754 fft 32: mflops = 62.1403 (norm. = 0.345679), norm. avg. (of 15) = 0.212296 fft 33: mflops = 45.7579 (norm. = 0.254545), norm. avg. (of 15) = 0.296406 fft 34: mflops = 61.3825 (norm. = 0.341463), norm. avg. (of 16) = 0.306037 fft 35: mflops = 56.5547 (norm. = 0.314607), norm. avg. (of 16) = 0.266382 fft 36: mflops = 40.5917 (norm. = 0.225806), norm. avg. (of 16) = 0.32086 fft 37: mflops = 22.0762 (norm. = 0.122807), norm. avg. (of 16) = 0.147799 fft 38: mflops = 57.8548 (norm. = 0.321839), norm. avg. (of 16) = 0.270107 fft 39: mflops = 52.4309 (norm. = 0.291667), norm. avg. (of 16) = 0.219351 fft 40: mflops = 12.4588 (norm. = 0.0693069), norm. avg. (of 16) = 0.0477481 fft 41: mflops = 139.816 (norm. = 0.777778), norm. avg. (of 16) = 0.478473 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.7666 s, 4 iters, t-(init.)=1.73326 s t(norm)=0.194467, mflops=25.7113 (err=3.3e-14) 1. Arndt DIT: elapsed time t=1.78326 s, 4 iters, t-(init.)=1.74993 s t(norm)=0.196337, mflops=25.4664 (err=3.3e-14) 2. Arndt Split-Radix: elapsed time t=1.13329 s, 2 iters, t-(init.)=1.13329 s t(norm)=0.254303, mflops=19.6616 (err=3.3e-14) 3. Arndt 4-step: elapsed time t=1.44994 s, 4 iters, t-(init.)=1.41661 s t(norm)=0.158939, mflops=31.4585 (err=3.3e-14) 4. 
Bailey: elapsed time t=1.44994 s, 4 iters, t-(init.)=1.41661 s t(norm)=0.158939, mflops=31.4585 (err=3.3e-14) 5. Beauregard: elapsed time t=1.29995 s, 2 iters, t-(init.)=1.28328 s t(norm)=0.287961, mflops=17.3635 (err=3.3e-14) 6. Bergland: elapsed time t=1.54994 s, 8 iters, t-(init.)=1.48327 s t(norm)=0.0832094, mflops=60.0893 (err=3.4e-14) 7. Brenner: elapsed time t=1.23328 s, 4 iters, t-(init.)=1.19995 s t(norm)=0.134631, mflops=37.1386 (err=3.3e-14) 8. Burrus: elapsed time t=1.19995 s, 2 iters, t-(init.)=1.19995 s t(norm)=0.269262, mflops=18.5693 (err=3.3e-14) 9. CWP (min N) (N=144144): elapsed time t=1.24995 s, 16 iters, t-(init.)=1.11662 s t(norm)=0.0313204, mflops=159.64 10. CWP (best N) (N=144144): elapsed time t=1.26662 s, 16 iters, t-(init.)=1.13329 s t(norm)=0.0317879, mflops=157.293 11. Edelblute: elapsed time t=1.18329 s, 2 iters, t-(init.)=1.16662 s t(norm)=0.261782, mflops=19.0998 (err=3.3e-14) 12. FFTPACK: elapsed time t=1.64993 s, 8 iters, t-(init.)=1.58327 s t(norm)=0.0888191, mflops=56.2942 (err=3.3e-14) 13. FFTPACK (f2c): elapsed time t=1.78326 s, 8 iters, t-(init.)=1.7166 s t(norm)=0.0962986, mflops=51.9219 (err=3.3e-14) FFTW_MEASURE plan: (cost = 9.166300e-02) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_NOTW 32 14. FFTW: elapsed time t=1.44994 s, 16 iters, t-(init.)=1.33328 s t(norm)=0.0373975, mflops=133.699 (err=3.3e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.63327 s, 16 iters, t-(init.)=1.51661 s t(norm)=0.0425397, mflops=117.537 (err=3.3e-14) 16. Frigo-old: elapsed time t=1.33328 s, 8 iters, t-(init.)=1.28328 s t(norm)=0.0719902, mflops=69.4539 (err=3.3e-14) 17. Green: elapsed time t=1.43328 s, 8 iters, t-(init.)=1.38328 s t(norm)=0.0775998, mflops=64.4332 (err=3.3e-14) 18. GSL: elapsed time t=1.53327 s, 8 iters, t-(init.)=1.46661 s t(norm)=0.0822745, mflops=60.7722 (err=3.3e-14) 19. 
GSL DIT: elapsed time t=1.93326 s, 4 iters, t-(init.)=1.91659 s t(norm)=0.215036, mflops=23.252 (err=3.5e-14) 20. GSL DIF: elapsed time t=1.79993 s, 4 iters, t-(init.)=1.7666 s t(norm)=0.198207, mflops=25.2262 (err=3.5e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.24995 s, 4 iters, t-(init.)=1.21662 s t(norm)=0.136501, mflops=36.6298 (err=3.3e-14) 23. Mayer (simple): elapsed time t=1.19995 s, 4 iters, t-(init.)=1.16662 s t(norm)=0.130891, mflops=38.1997 24. Mayer (lookup): elapsed time t=1.24995 s, 4 iters, t-(init.)=1.21662 s t(norm)=0.136501, mflops=36.6298 (err=3.3e-14) 25. Monro: elapsed time t=2.01659 s, 4 iters, t-(init.)=1.99992 s t(norm)=0.224385, mflops=22.2831 (err=1.7e-07) 26. NAPACK (f2c): elapsed time t=1.11662 s, 2 iters, t-(init.)=1.09996 s t(norm)=0.246823, mflops=20.2574 (err=2.0e-12) 27. Nielsen: elapsed time t=1.6166 s, 4 iters, t-(init.)=1.58327 s t(norm)=0.177638, mflops=28.1471 (err=9.2e-13) 28. NR (C): elapsed time t=1.94992 s, 4 iters, t-(init.)=1.93326 s t(norm)=0.216905, mflops=23.0515 (err=3.4e-14) 29. NR (F): elapsed time t=1.98325 s, 4 iters, t-(init.)=1.96659 s t(norm)=0.220645, mflops=22.6608 (err=3.4e-14) 30. Ooura (C): elapsed time t=1.03329 s, 8 iters, t-(init.)=0.966628 s t(norm)=0.0542264, mflops=92.2061 (err=3.4e-14) 31. Ooura (F): elapsed time t=1.01663 s, 8 iters, t-(init.)=0.966628 s t(norm)=0.0542264, mflops=92.2061 (err=3.4e-14) 32. Ransom: elapsed time t=1.34995 s, 8 iters, t-(init.)=1.28328 s t(norm)=0.0719902, mflops=69.4539 (err=3.3e-14) 33. SCIPORT: elapsed time t=1.18329 s, 4 iters, t-(init.)=1.14995 s t(norm)=0.129021, mflops=38.7533 (err=3.3e-14) 34. Singleton: elapsed time t=1.01663 s, 4 iters, t-(init.)=0.983294 s t(norm)=0.110323, mflops=45.3216 (err=4.8e-14) 35. Singleton (f2c): elapsed time t=1.09996 s, 4 iters, t-(init.)=1.08329 s t(norm)=0.121542, mflops=41.1381 (err=4.8e-14) 36. 
Sorensen: elapsed time t=1.26662 s, 4 iters, t-(init.)=1.23328 s t(norm)=0.138371, mflops=36.1348 (err=3.3e-14) 37. Sorensen DIT: elapsed time t=1.23328 s, 2 iters, t-(init.)=1.21662 s t(norm)=0.273002, mflops=18.3149 (err=3.3e-14) 38. Temperton: elapsed time t=1.04996 s, 4 iters, t-(init.)=1.01663 s t(norm)=0.114062, mflops=43.8357 (err=1.9e-07) 39. Temperton (f2c): elapsed time t=1.14995 s, 4 iters, t-(init.)=1.13329 s t(norm)=0.127151, mflops=39.3232 (err=3.3e-14) 40. Valkenburg: elapsed time t=1.98325 s, 2 iters, t-(init.)=1.96659 s t(norm)=0.44129, mflops=11.3304 (err=3.4e-14) 41. DXML: elapsed time t=1.54994 s, 16 iters, t-(init.)=1.43328 s t(norm)=0.0402023, mflops=124.371 (err=3.4e-14) Top mflops for N=131072 = 159.64 Normalized results and averages for N=131072: fft 0: mflops = 25.7113 (norm. = 0.161058), norm. avg. (of 17) = 0.278261 fft 1: mflops = 25.4664 (norm. = 0.159524), norm. avg. (of 17) = 0.25667 fft 2: mflops = 19.6616 (norm. = 0.123162), norm. avg. (of 17) = 0.16872 fft 3: mflops = 31.4585 (norm. = 0.197059), norm. avg. (of 17) = 0.0976681 fft 4: mflops = 31.4585 (norm. = 0.197059), norm. avg. (of 17) = 0.264044 fft 5: mflops = 17.3635 (norm. = 0.108766), norm. avg. (of 17) = 0.0584145 fft 6: mflops = 60.0893 (norm. = 0.376404), norm. avg. (of 17) = 0.322893 fft 7: mflops = 37.1386 (norm. = 0.232639), norm. avg. (of 17) = 0.187258 fft 8: mflops = 18.5693 (norm. = 0.116319), norm. avg. (of 17) = 0.145572 fft 9: mflops = 159.64 (norm. = 1), norm. avg. (of 17) = 0.588982 fft 10: mflops = 157.293 (norm. = 0.985294), norm. avg. (of 17) = 0.627369 fft 11: mflops = 19.0998 (norm. = 0.119643), norm. avg. (of 16) = 0.118931 fft 12: mflops = 56.2942 (norm. = 0.352632), norm. avg. (of 17) = 0.476229 fft 13: mflops = 51.9219 (norm. = 0.325243), norm. avg. (of 17) = 0.30102 fft 14: mflops = 133.699 (norm. = 0.8375), norm. avg. (of 17) = 0.822571 fft 15: mflops = 117.537 (norm. = 0.736264), norm. avg. (of 17) = 0.835804 fft 16: mflops = 69.4539 (norm. 
= 0.435065), norm. avg. (of 17) = 0.70328 fft 17: mflops = 64.4332 (norm. = 0.403614), norm. avg. (of 15) = 0.61112 fft 18: mflops = 60.7722 (norm. = 0.380682), norm. avg. (of 17) = 0.337724 fft 19: mflops = 23.252 (norm. = 0.145652), norm. avg. (of 17) = 0.145938 fft 20: mflops = 25.2262 (norm. = 0.158019), norm. avg. (of 17) = 0.171902 fft 21: mflops = -1 (norm. = -0.00626408), norm. avg. (of 12) = 0.333532 fft 22: mflops = 36.6298 (norm. = 0.229452), norm. avg. (of 16) = 0.242476 fft 23: mflops = 38.1997 (norm. = 0.239286), norm. avg. (of 16) = 0.284428 fft 24: mflops = 36.6298 (norm. = 0.229452), norm. avg. (of 16) = 0.27241 fft 25: mflops = 22.2831 (norm. = 0.139583), norm. avg. (of 16) = 0.152411 fft 26: mflops = 20.2574 (norm. = 0.126894), norm. avg. (of 17) = 0.0774982 fft 27: mflops = 28.1471 (norm. = 0.176316), norm. avg. (of 17) = 0.155565 fft 28: mflops = 23.0515 (norm. = 0.144397), norm. avg. (of 17) = 0.139447 fft 29: mflops = 22.6608 (norm. = 0.141949), norm. avg. (of 17) = 0.137351 fft 30: mflops = 92.2061 (norm. = 0.577586), norm. avg. (of 17) = 0.564901 fft 31: mflops = 92.2061 (norm. = 0.577586), norm. avg. (of 17) = 0.507155 fft 32: mflops = 69.4539 (norm. = 0.435065), norm. avg. (of 16) = 0.226219 fft 33: mflops = 38.7533 (norm. = 0.242754), norm. avg. (of 16) = 0.293053 fft 34: mflops = 45.3216 (norm. = 0.283898), norm. avg. (of 17) = 0.304734 fft 35: mflops = 41.1381 (norm. = 0.257692), norm. avg. (of 17) = 0.265871 fft 36: mflops = 36.1348 (norm. = 0.226351), norm. avg. (of 17) = 0.3153 fft 37: mflops = 18.3149 (norm. = 0.114726), norm. avg. (of 17) = 0.145854 fft 38: mflops = 43.8357 (norm. = 0.27459), norm. avg. (of 17) = 0.270371 fft 39: mflops = 39.3232 (norm. = 0.246324), norm. avg. (of 17) = 0.220938 fft 40: mflops = 11.3304 (norm. = 0.0709746), norm. avg. (of 17) = 0.0491143 fft 41: mflops = 124.371 (norm. = 0.77907), norm. avg. (of 17) = 0.496155 Benchmarking for array size = 262144 (power of 2): 0. 
Arndt DIF: elapsed time t=1.23328 s, 1 iters, t-(init.)=1.21662 s t(norm)=0.257835, mflops=19.3922 (err=4.3e-14) 1. Arndt DIT: elapsed time t=1.23328 s, 1 iters, t-(init.)=1.21662 s t(norm)=0.257835, mflops=19.3922 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=1.49994 s, 1 iters, t-(init.)=1.48327 s t(norm)=0.314347, mflops=15.906 (err=4.3e-14) 3. Arndt 4-step: elapsed time t=1.38328 s, 2 iters, t-(init.)=1.34995 s t(norm)=0.143045, mflops=34.9539 (err=4.3e-14) 4. Bailey: elapsed time t=1.81659 s, 2 iters, t-(init.)=1.78326 s t(norm)=0.188961, mflops=26.4605 (err=4.3e-14) 5. Beauregard: elapsed time t=1.43328 s, 1 iters, t-(init.)=1.41661 s t(norm)=0.300219, mflops=16.6545 (err=4.4e-14) 6. Bergland: elapsed time t=1.96659 s, 4 iters, t-(init.)=1.89992 s t(norm)=0.100662, mflops=49.6714 (err=4.4e-14) 7. Brenner: elapsed time t=1.63327 s, 2 iters, t-(init.)=1.59994 s t(norm)=0.169535, mflops=29.4924 (err=4.4e-14) 8. Burrus: elapsed time t=1.58327 s, 1 iters, t-(init.)=1.5666 s t(norm)=0.332007, mflops=15.0599 (err=4.3e-14) 9. CWP (min N) (N=360360): elapsed time t=1.08329 s, 4 iters, t-(init.)=0.983294 s t(norm)=0.0520968, mflops=95.9752 10. CWP (best N) (N=360360): elapsed time t=1.08329 s, 4 iters, t-(init.)=0.983294 s t(norm)=0.0520968, mflops=95.9752 11. Edelblute: elapsed time t=1.5666 s, 1 iters, t-(init.)=1.54994 s t(norm)=0.328475, mflops=15.2219 (err=4.3e-14) 12. FFTPACK: elapsed time t=1.93326 s, 4 iters, t-(init.)=1.84993 s t(norm)=0.0980126, mflops=51.0138 (err=4.4e-14) 13. FFTPACK (f2c): elapsed time t=2.04992 s, 4 iters, t-(init.)=1.98325 s t(norm)=0.105077, mflops=47.5843 (err=4.4e-14) FFTW_MEASURE plan: (cost = 2.166580e-01) FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_TWIDDLE 32 FFTW_NOTW 32 14. FFTW: elapsed time t=1.78326 s, 8 iters, t-(init.)=1.64993 s t(norm)=0.0437083, mflops=114.395 (err=4.4e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.78326 s, 8 iters, t-(init.)=1.63327 s t(norm)=0.0432668, mflops=115.562 (err=4.4e-14) 16. Frigo-old: elapsed time t=1.5666 s, 4 iters, t-(init.)=1.49994 s t(norm)=0.0794697, mflops=62.9171 (err=4.4e-14) 17. Green: elapsed time t=1.83326 s, 4 iters, t-(init.)=1.7666 s t(norm)=0.0935976, mflops=53.4202 (err=4.4e-14) 18. GSL: elapsed time t=1.7666 s, 4 iters, t-(init.)=1.68327 s t(norm)=0.0891826, mflops=56.0647 (err=4.4e-14) 19. GSL DIT: elapsed time t=1.28328 s, 1 iters, t-(init.)=1.26662 s t(norm)=0.268431, mflops=18.6268 (err=4.6e-14) 20. GSL DIF: elapsed time t=1.23328 s, 1 iters, t-(init.)=1.21662 s t(norm)=0.257835, mflops=19.3922 (err=4.6e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.7166 s, 2 iters, t-(init.)=1.68327 s t(norm)=0.178365, mflops=28.0324 (err=4.3e-14) 23. Mayer (simple): elapsed time t=1.68327 s, 2 iters, t-(init.)=1.64993 s t(norm)=0.174833, mflops=28.5987 24. Mayer (lookup): elapsed time t=1.7666 s, 2 iters, t-(init.)=1.73326 s t(norm)=0.183663, mflops=27.2237 (err=4.3e-14) 25. Monro: elapsed time t=1.31661 s, 1 iters, t-(init.)=1.29995 s t(norm)=0.275495, mflops=18.1492 (err=1.8e-07) 26. NAPACK (f2c): elapsed time t=1.28328 s, 1 iters, t-(init.)=1.26662 s t(norm)=0.268431, mflops=18.6268 (err=3.7e-12) 27. Nielsen: elapsed time t=1.03329 s, 1 iters, t-(init.)=1.01663 s t(norm)=0.215451, mflops=23.2071 (err=2.1e-12) 28. NR (C): elapsed time t=1.28328 s, 1 iters, t-(init.)=1.26662 s t(norm)=0.268431, mflops=18.6268 (err=4.3e-14) 29. NR (F): elapsed time t=1.29995 s, 1 iters, t-(init.)=1.28328 s t(norm)=0.271963, mflops=18.3849 (err=4.3e-14) 30. Ooura (C): elapsed time t=1.28328 s, 4 iters, t-(init.)=1.21662 s t(norm)=0.0644587, mflops=77.569 (err=4.4e-14) 31. Ooura (F): elapsed time t=1.26662 s, 4 iters, t-(init.)=1.19995 s t(norm)=0.0635757, mflops=78.6463 (err=4.4e-14) 32. 
Ransom: elapsed time t=1.31661 s, 4 iters, t-(init.)=1.24995 s t(norm)=0.0662247, mflops=75.5005 (err=4.3e-14) 33. SCIPORT: elapsed time t=1.44994 s, 2 iters, t-(init.)=1.41661 s t(norm)=0.150109, mflops=33.309 (err=4.4e-14) 34. Singleton: elapsed time t=1.31661 s, 2 iters, t-(init.)=1.28328 s t(norm)=0.135981, mflops=36.7697 (err=6.0e-14) 35. Singleton (f2c): elapsed time t=1.36661 s, 2 iters, t-(init.)=1.33328 s t(norm)=0.141279, mflops=35.3909 (err=6.0e-14) 36. Sorensen: elapsed time t=1.64993 s, 2 iters, t-(init.)=1.6166 s t(norm)=0.171301, mflops=29.1883 (err=4.3e-14) 37. Sorensen DIT: elapsed time t=1.58327 s, 1 iters, t-(init.)=1.5666 s t(norm)=0.332007, mflops=15.0599 (err=4.3e-14) 38. Temperton: elapsed time t=1.31661 s, 2 iters, t-(init.)=1.28328 s t(norm)=0.135981, mflops=36.7697 (err=2.0e-07) 39. Temperton (f2c): elapsed time t=1.36661 s, 2 iters, t-(init.)=1.33328 s t(norm)=0.141279, mflops=35.3909 (err=4.4e-14) 40. Valkenburg: elapsed time t=2.29991 s, 1 iters, t-(init.)=2.28324 s t(norm)=0.483882, mflops=10.3331 (err=4.4e-14) 41. DXML: elapsed time t=1.81659 s, 8 iters, t-(init.)=1.6666 s t(norm)=0.0441498, mflops=113.251 (err=4.4e-14) Top mflops for N=262144 = 115.562 Normalized results and averages for N=262144: fft 0: mflops = 19.3922 (norm. = 0.167808), norm. avg. (of 18) = 0.272125 fft 1: mflops = 19.3922 (norm. = 0.167808), norm. avg. (of 18) = 0.251733 fft 2: mflops = 15.906 (norm. = 0.13764), norm. avg. (of 18) = 0.166994 fft 3: mflops = 34.9539 (norm. = 0.302469), norm. avg. (of 18) = 0.109046 fft 4: mflops = 26.4605 (norm. = 0.228972), norm. avg. (of 18) = 0.262095 fft 5: mflops = 16.6545 (norm. = 0.144118), norm. avg. (of 18) = 0.0631758 fft 6: mflops = 49.6714 (norm. = 0.429825), norm. avg. (of 18) = 0.328834 fft 7: mflops = 29.4924 (norm. = 0.255208), norm. avg. (of 18) = 0.191033 fft 8: mflops = 15.0599 (norm. = 0.130319), norm. avg. (of 18) = 0.144725 fft 9: mflops = 95.9752 (norm. = 0.830508), norm. avg. 
(of 18) = 0.6024 fft 10: mflops = 95.9752 (norm. = 0.830508), norm. avg. (of 18) = 0.638654 fft 11: mflops = 15.2219 (norm. = 0.13172), norm. avg. (of 17) = 0.119684 fft 12: mflops = 51.0138 (norm. = 0.441441), norm. avg. (of 18) = 0.474296 fft 13: mflops = 47.5843 (norm. = 0.411765), norm. avg. (of 18) = 0.307172 fft 14: mflops = 114.395 (norm. = 0.989899), norm. avg. (of 18) = 0.831867 fft 15: mflops = 115.562 (norm. = 1), norm. avg. (of 18) = 0.844926 fft 16: mflops = 62.9171 (norm. = 0.544444), norm. avg. (of 18) = 0.694455 fft 17: mflops = 53.4202 (norm. = 0.462264), norm. avg. (of 16) = 0.601816 fft 18: mflops = 56.0647 (norm. = 0.485149), norm. avg. (of 18) = 0.345915 fft 19: mflops = 18.6268 (norm. = 0.161184), norm. avg. (of 18) = 0.146785 fft 20: mflops = 19.3922 (norm. = 0.167808), norm. avg. (of 18) = 0.171674 fft 21: mflops = -1 (norm. = -0.00865337), norm. avg. (of 12) = 0.333532 fft 22: mflops = 28.0324 (norm. = 0.242574), norm. avg. (of 17) = 0.242482 fft 23: mflops = 28.5987 (norm. = 0.247475), norm. avg. (of 17) = 0.282254 fft 24: mflops = 27.2237 (norm. = 0.235577), norm. avg. (of 17) = 0.270243 fft 25: mflops = 18.1492 (norm. = 0.157051), norm. avg. (of 17) = 0.152684 fft 26: mflops = 18.6268 (norm. = 0.161184), norm. avg. (of 18) = 0.0821474 fft 27: mflops = 23.2071 (norm. = 0.20082), norm. avg. (of 18) = 0.158079 fft 28: mflops = 18.6268 (norm. = 0.161184), norm. avg. (of 18) = 0.140655 fft 29: mflops = 18.3849 (norm. = 0.159091), norm. avg. (of 18) = 0.138559 fft 30: mflops = 77.569 (norm. = 0.671233), norm. avg. (of 18) = 0.570808 fft 31: mflops = 78.6463 (norm. = 0.680556), norm. avg. (of 18) = 0.516789 fft 32: mflops = 75.5005 (norm. = 0.653333), norm. avg. (of 17) = 0.251344 fft 33: mflops = 33.309 (norm. = 0.288235), norm. avg. (of 17) = 0.29277 fft 34: mflops = 36.7697 (norm. = 0.318182), norm. avg. (of 18) = 0.305481 fft 35: mflops = 35.3909 (norm. = 0.30625), norm. avg. (of 18) = 0.268114 fft 36: mflops = 29.1883 (norm. 
= 0.252577), norm. avg. (of 18) = 0.311816 fft 37: mflops = 15.0599 (norm. = 0.130319), norm. avg. (of 18) = 0.144991 fft 38: mflops = 36.7697 (norm. = 0.318182), norm. avg. (of 18) = 0.273027 fft 39: mflops = 35.3909 (norm. = 0.30625), norm. avg. (of 18) = 0.225678 fft 40: mflops = 10.3331 (norm. = 0.0894161), norm. avg. (of 18) = 0.0513533 fft 41: mflops = 113.251 (norm. = 0.98), norm. avg. (of 18) = 0.523035 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. Brenner 1. CWP (min N) 2. CWP (best N) 3. FFTPACK 4. FFTPACK (f2c) 5. FFTW 6. FFTW_ESTIMATE 7. Frigo-old 8. GSL 9. NAPACK (f2c) 10. Nielsen 11. Singleton 12. Singleton (f2c) 13. Temperton 14. Temperton (f2c) 15. Valkenburg 16. DXML Computing normalized averages (17 transforms). Benchmarking for array size = 6: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.01663 s, 524288 iters, t-(init.)=0.966628 s t(norm)=0.118873, mflops=42.0616 2. CWP (best N) (N=15): elapsed time t=1.36661 s, 524288 iters, t-(init.)=1.29995 s t(norm)=0.159864, mflops=31.2766 3. FFTPACK: elapsed time t=1.18329 s, 1048576 iters, t-(init.)=1.09996 s t(norm)=0.0676348, mflops=73.9265 (err=1.0e-16) 4. FFTPACK (f2c): elapsed time t=1.46661 s, 1048576 iters, t-(init.)=1.38328 s t(norm)=0.0850558, mflops=58.7849 (err=1.8e-16) FFTW_MEASURE plan: (cost = 3.337727e-07) FFTW_NOTW 6 5. FFTW: elapsed time t=1.49994 s, 4194304 iters, t-(init.)=1.14995 s t(norm)=0.0176773, mflops=282.849 (err=1.1e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 6. 
FFTW_ESTIMATE: elapsed time t=1.48327 s, 4194304 iters, t-(init.)=0.866632 s t(norm)=0.013322, mflops=375.319 (err=1.1e-16) 7. Frigo-old: elapsed time t=1.29995 s, 524288 iters, t-(init.)=1.26662 s t(norm)=0.155765, mflops=32.0997 (err=3.1e-16) 8. GSL: elapsed time t=1.16662 s, 1048576 iters, t-(init.)=1.08329 s t(norm)=0.06661, mflops=75.0638 (err=1.2e-16) 9. NAPACK (f2c): elapsed time t=1.09996 s, 262144 iters, t-(init.)=1.08329 s t(norm)=0.26644, mflops=18.766 (err=4.7e-16) 10. Nielsen: elapsed time t=1.7166 s, 262144 iters, t-(init.)=1.69993 s t(norm)=0.418106, mflops=11.9587 (err=2.7e-16) 11. Singleton: elapsed time t=1.79993 s, 524288 iters, t-(init.)=1.7666 s t(norm)=0.217251, mflops=23.0149 (err=1.0e-16) 12. Singleton (f2c): elapsed time t=1.79993 s, 524288 iters, t-(init.)=1.7666 s t(norm)=0.217251, mflops=23.0149 (err=1.0e-16) 13. Temperton: elapsed time t=1.46661 s, 524288 iters, t-(init.)=1.43328 s t(norm)=0.17626, mflops=28.3671 (err=3.7e-16) 14. Temperton (f2c): elapsed time t=1.81659 s, 524288 iters, t-(init.)=1.78326 s t(norm)=0.219301, mflops=22.7998 (err=1.0e-16) 15. Valkenburg: elapsed time t=1.26662 s, 262144 iters, t-(init.)=1.24995 s t(norm)=0.307431, mflops=16.2638 (err=3.2e-16) 16. DXML: elapsed time t=1.14995 s, 524288 iters, t-(init.)=1.11662 s t(norm)=0.137319, mflops=36.4116 (err=1.9e-16) Top mflops for N=6 = 375.319 Normalized results and averages for N=6: fft 0: mflops = -1 (norm. = -0.0026644), norm. avg. (of 0) = -1 fft 1: mflops = 42.0616 (norm. = 0.112069), norm. avg. (of 1) = 0.112069 fft 2: mflops = 31.2766 (norm. = 0.0833333), norm. avg. (of 1) = 0.0833333 fft 3: mflops = 73.9265 (norm. = 0.19697), norm. avg. (of 1) = 0.19697 fft 4: mflops = 58.7849 (norm. = 0.156627), norm. avg. (of 1) = 0.156627 fft 5: mflops = 282.849 (norm. = 0.753623), norm. avg. (of 1) = 0.753623 fft 6: mflops = 375.319 (norm. = 1), norm. avg. (of 1) = 1 fft 7: mflops = 32.0997 (norm. = 0.0855263), norm. avg. 
(of 1) = 0.0855263 fft 8: mflops = 75.0638 (norm. = 0.2), norm. avg. (of 1) = 0.2 fft 9: mflops = 18.766 (norm. = 0.05), norm. avg. (of 1) = 0.05 fft 10: mflops = 11.9587 (norm. = 0.0318627), norm. avg. (of 1) = 0.0318627 fft 11: mflops = 23.0149 (norm. = 0.0613208), norm. avg. (of 1) = 0.0613208 fft 12: mflops = 23.0149 (norm. = 0.0613208), norm. avg. (of 1) = 0.0613208 fft 13: mflops = 28.3671 (norm. = 0.0755814), norm. avg. (of 1) = 0.0755814 fft 14: mflops = 22.7998 (norm. = 0.0607477), norm. avg. (of 1) = 0.0607477 fft 15: mflops = 16.2638 (norm. = 0.0433333), norm. avg. (of 1) = 0.0433333 fft 16: mflops = 36.4116 (norm. = 0.0970149), norm. avg. (of 1) = 0.0970149 Benchmarking for array size = 9: 0. Brenner: elapsed time t=1.41661 s, 131072 iters, t-(init.)=1.39994 s t(norm)=0.374377, mflops=13.3555 (err=3.6e-16) 1. CWP (min N): elapsed time t=1.79993 s, 1048576 iters, t-(init.)=1.69993 s t(norm)=0.0568251, mflops=87.9893 2. CWP (best N) (N=15): elapsed time t=1.38328 s, 524288 iters, t-(init.)=1.31661 s t(norm)=0.0880232, mflops=56.8032 3. FFTPACK: elapsed time t=1.41661 s, 1048576 iters, t-(init.)=1.33328 s t(norm)=0.0445687, mflops=112.186 (err=1.4e-16) 4. FFTPACK (f2c): elapsed time t=1.01663 s, 524288 iters, t-(init.)=0.966628 s t(norm)=0.0646246, mflops=77.3699 (err=2.4e-16) FFTW_MEASURE plan: (cost = 5.403938e-07) FFTW_NOTW 9 5. FFTW: elapsed time t=1.34995 s, 2097152 iters, t-(init.)=1.14995 s t(norm)=0.0192203, mflops=260.142 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.18329 s, 2097152 iters, t-(init.)=0.849966 s t(norm)=0.0142063, mflops=351.957 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.26662 s, 262144 iters, t-(init.)=1.24995 s t(norm)=0.167133, mflops=29.9164 (err=3.3e-16) 8. GSL: elapsed time t=1.09996 s, 524288 iters, t-(init.)=1.04996 s t(norm)=0.0701957, mflops=71.2294 (err=1.4e-16) 9. 
NAPACK (f2c): elapsed time t=1.34995 s, 262144 iters, t-(init.)=1.33328 s t(norm)=0.178275, mflops=28.0466 (err=4.3e-16) 10. Nielsen: elapsed time t=1.89992 s, 262144 iters, t-(init.)=1.88326 s t(norm)=0.251813, mflops=19.856 (err=4.7e-16) 11. Singleton: elapsed time t=1.88326 s, 524288 iters, t-(init.)=1.83326 s t(norm)=0.122564, mflops=40.795 (err=1.5e-16) 12. Singleton (f2c): elapsed time t=1.93326 s, 524288 iters, t-(init.)=1.88326 s t(norm)=0.125907, mflops=39.712 (err=1.5e-16) 13. Temperton: elapsed time t=1.78326 s, 524288 iters, t-(init.)=1.73326 s t(norm)=0.115879, mflops=43.1486 (err=1.1e-08) 14. Temperton (f2c): elapsed time t=1.78326 s, 524288 iters, t-(init.)=1.73326 s t(norm)=0.115879, mflops=43.1486 (err=1.4e-16) 15. Valkenburg: elapsed time t=1.04996 s, 131072 iters, t-(init.)=1.03329 s t(norm)=0.276326, mflops=18.0946 (err=3.7e-16) 16. DXML: elapsed time t=1.96659 s, 524288 iters, t-(init.)=1.91659 s t(norm)=0.128135, mflops=39.0213 (err=2.0e-16) Top mflops for N=9 = 351.957 Normalized results and averages for N=9: fft 0: mflops = 13.3555 (norm. = 0.0379464), norm. avg. (of 1) = 0.0379464 fft 1: mflops = 87.9893 (norm. = 0.25), norm. avg. (of 2) = 0.181034 fft 2: mflops = 56.8032 (norm. = 0.161392), norm. avg. (of 2) = 0.122363 fft 3: mflops = 112.186 (norm. = 0.31875), norm. avg. (of 2) = 0.25786 fft 4: mflops = 77.3699 (norm. = 0.219828), norm. avg. (of 2) = 0.188227 fft 5: mflops = 260.142 (norm. = 0.73913), norm. avg. (of 2) = 0.746377 fft 6: mflops = 351.957 (norm. = 1), norm. avg. (of 2) = 1 fft 7: mflops = 29.9164 (norm. = 0.085), norm. avg. (of 2) = 0.0852632 fft 8: mflops = 71.2294 (norm. = 0.202381), norm. avg. (of 2) = 0.20119 fft 9: mflops = 28.0466 (norm. = 0.0796875), norm. avg. (of 2) = 0.0648438 fft 10: mflops = 19.856 (norm. = 0.0564159), norm. avg. (of 2) = 0.0441393 fft 11: mflops = 40.795 (norm. = 0.115909), norm. avg. (of 2) = 0.0886149 fft 12: mflops = 39.712 (norm. = 0.112832), norm. avg. 
(of 2) = 0.0870763 fft 13: mflops = 43.1486 (norm. = 0.122596), norm. avg. (of 2) = 0.0990888 fft 14: mflops = 43.1486 (norm. = 0.122596), norm. avg. (of 2) = 0.0916719 fft 15: mflops = 18.0946 (norm. = 0.0514113), norm. avg. (of 2) = 0.0473723 fft 16: mflops = 39.0213 (norm. = 0.11087), norm. avg. (of 2) = 0.103942 Benchmarking for array size = 12: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.13329 s, 524288 iters, t-(init.)=1.08329 s t(norm)=0.0480296, mflops=104.102 2. CWP (best N) (N=15): elapsed time t=1.36661 s, 524288 iters, t-(init.)=1.29995 s t(norm)=0.0576355, mflops=86.7521 3. FFTPACK: elapsed time t=1.44994 s, 1048576 iters, t-(init.)=1.34995 s t(norm)=0.0299261, mflops=167.078 (err=1.6e-16) 4. FFTPACK (f2c): elapsed time t=1.08329 s, 524288 iters, t-(init.)=1.01663 s t(norm)=0.0450739, mflops=110.929 (err=1.9e-16) FFTW_MEASURE plan: (cost = 6.993332e-07) FFTW_NOTW 12 5. FFTW: elapsed time t=1.54994 s, 2097152 iters, t-(init.)=1.33328 s t(norm)=0.0147783, mflops=338.333 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.54994 s, 2097152 iters, t-(init.)=1.19995 s t(norm)=0.0133005, mflops=375.926 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.11662 s, 262144 iters, t-(init.)=1.09996 s t(norm)=0.097537, mflops=51.2626 (err=2.9e-16) 8. GSL: elapsed time t=1.09996 s, 524288 iters, t-(init.)=1.04996 s t(norm)=0.0465518, mflops=107.407 (err=1.6e-16) 9. NAPACK (f2c): elapsed time t=1.09996 s, 131072 iters, t-(init.)=1.08329 s t(norm)=0.192118, mflops=26.0256 (err=5.5e-16) 10. Nielsen: elapsed time t=1.03329 s, 131072 iters, t-(init.)=1.01663 s t(norm)=0.180296, mflops=27.7322 (err=5.0e-16) 11. Singleton: elapsed time t=1.26662 s, 262144 iters, t-(init.)=1.23328 s t(norm)=0.10936, mflops=45.7207 (err=1.5e-16) 12. Singleton (f2c): elapsed time t=1.36661 s, 262144 iters, t-(init.)=1.33328 s t(norm)=0.118227, mflops=42.2916 (err=1.5e-16) 13. 
Temperton: elapsed time t=1.68327 s, 524288 iters, t-(init.)=1.63327 s t(norm)=0.0724139, mflops=69.0476 (err=5.4e-16) 14. Temperton (f2c): elapsed time t=1.09996 s, 262144 iters, t-(init.)=1.06662 s t(norm)=0.0945814, mflops=52.8645 (err=1.4e-16) 15. Valkenburg: elapsed time t=1.6666 s, 131072 iters, t-(init.)=1.64993 s t(norm)=0.292611, mflops=17.0875 (err=3.9e-16) 16. DXML: elapsed time t=1.29995 s, 524288 iters, t-(init.)=1.24995 s t(norm)=0.0554188, mflops=90.2221 (err=1.3e-16) Top mflops for N=12 = 375.926 Normalized results and averages for N=12: fft 0: mflops = -1 (norm. = -0.0026601), norm. avg. (of 1) = 0.0379464 fft 1: mflops = 104.102 (norm. = 0.276923), norm. avg. (of 3) = 0.212997 fft 2: mflops = 86.7521 (norm. = 0.230769), norm. avg. (of 3) = 0.158498 fft 3: mflops = 167.078 (norm. = 0.444444), norm. avg. (of 3) = 0.320055 fft 4: mflops = 110.929 (norm. = 0.295082), norm. avg. (of 3) = 0.223845 fft 5: mflops = 338.333 (norm. = 0.9), norm. avg. (of 3) = 0.797585 fft 6: mflops = 375.926 (norm. = 1), norm. avg. (of 3) = 1 fft 7: mflops = 51.2626 (norm. = 0.136364), norm. avg. (of 3) = 0.102297 fft 8: mflops = 107.407 (norm. = 0.285714), norm. avg. (of 3) = 0.229365 fft 9: mflops = 26.0256 (norm. = 0.0692308), norm. avg. (of 3) = 0.0663061 fft 10: mflops = 27.7322 (norm. = 0.0737705), norm. avg. (of 3) = 0.0540164 fft 11: mflops = 45.7207 (norm. = 0.121622), norm. avg. (of 3) = 0.0996172 fft 12: mflops = 42.2916 (norm. = 0.1125), norm. avg. (of 3) = 0.0955509 fft 13: mflops = 69.0476 (norm. = 0.183673), norm. avg. (of 3) = 0.127284 fft 14: mflops = 52.8645 (norm. = 0.140625), norm. avg. (of 3) = 0.10799 fft 15: mflops = 17.0875 (norm. = 0.0454545), norm. avg. (of 3) = 0.0467331 fft 16: mflops = 90.2221 (norm. = 0.24), norm. avg. (of 3) = 0.149295 Benchmarking for array size = 15: 0. Brenner: elapsed time t=1.93326 s, 131072 iters, t-(init.)=1.91659 s t(norm)=0.249515, mflops=20.0389 (err=3.3e-16) 1. 
CWP (min N): elapsed time t=1.36661 s, 524288 iters, t-(init.)=1.29995 s t(norm)=0.0423091, mflops=118.178 2. CWP (best N): elapsed time t=1.36661 s, 524288 iters, t-(init.)=1.28328 s t(norm)=0.0417667, mflops=119.713 3. FFTPACK: elapsed time t=1.68327 s, 1048576 iters, t-(init.)=1.53327 s t(norm)=0.0249515, mflops=200.389 (err=2.6e-16) 4. FFTPACK (f2c): elapsed time t=1.31661 s, 524288 iters, t-(init.)=1.23328 s t(norm)=0.0401394, mflops=124.566 (err=4.1e-16) FFTW_MEASURE plan: (cost = 1.144363e-06) FFTW_NOTW 15 5. FFTW: elapsed time t=1.24995 s, 1048576 iters, t-(init.)=1.09996 s t(norm)=0.0179, mflops=279.33 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.26662 s, 1048576 iters, t-(init.)=1.04996 s t(norm)=0.0170864, mflops=292.631 (err=1.8e-16) 7. Frigo-old: elapsed time t=1.06662 s, 131072 iters, t-(init.)=1.04996 s t(norm)=0.136691, mflops=36.5789 (err=4.2e-16) 8. GSL: elapsed time t=1.14995 s, 262144 iters, t-(init.)=1.11662 s t(norm)=0.0726848, mflops=68.7901 (err=2.0e-16) 9. NAPACK (f2c): elapsed time t=1.93326 s, 131072 iters, t-(init.)=1.91659 s t(norm)=0.249515, mflops=20.0389 (err=1.0e-15) 10. Nielsen: elapsed time t=1.18329 s, 131072 iters, t-(init.)=1.16662 s t(norm)=0.151879, mflops=32.921 (err=4.7e-15) 11. Singleton: elapsed time t=1.5666 s, 262144 iters, t-(init.)=1.53327 s t(norm)=0.099806, mflops=50.0972 (err=2.9e-16) 12. Singleton (f2c): elapsed time t=1.58327 s, 262144 iters, t-(init.)=1.54994 s t(norm)=0.100891, mflops=49.5585 (err=2.9e-16) 13. Temperton: elapsed time t=1.06662 s, 262144 iters, t-(init.)=1.03329 s t(norm)=0.0672606, mflops=74.3377 (err=7.9e-16) 14. Temperton (f2c): elapsed time t=1.21662 s, 262144 iters, t-(init.)=1.16662 s t(norm)=0.0759394, mflops=65.842 (err=2.1e-16) 15. Valkenburg: elapsed time t=1.09996 s, 65536 iters, t-(init.)=1.08329 s t(norm)=0.282061, mflops=17.7267 (err=4.0e-16) 16. 
DXML: elapsed time t=1.59994 s, 524288 iters, t-(init.)=1.53327 s t(norm)=0.049903, mflops=100.194 (err=2.1e-16) Top mflops for N=15 = 292.631 Normalized results and averages for N=15: fft 0: mflops = 20.0389 (norm. = 0.0684783), norm. avg. (of 2) = 0.0532123 fft 1: mflops = 118.178 (norm. = 0.403846), norm. avg. (of 4) = 0.26071 fft 2: mflops = 119.713 (norm. = 0.409091), norm. avg. (of 4) = 0.221146 fft 3: mflops = 200.389 (norm. = 0.684783), norm. avg. (of 4) = 0.411237 fft 4: mflops = 124.566 (norm. = 0.425676), norm. avg. (of 4) = 0.274303 fft 5: mflops = 279.33 (norm. = 0.954545), norm. avg. (of 4) = 0.836825 fft 6: mflops = 292.631 (norm. = 1), norm. avg. (of 4) = 1 fft 7: mflops = 36.5789 (norm. = 0.125), norm. avg. (of 4) = 0.107972 fft 8: mflops = 68.7901 (norm. = 0.235075), norm. avg. (of 4) = 0.230792 fft 9: mflops = 20.0389 (norm. = 0.0684783), norm. avg. (of 4) = 0.0668491 fft 10: mflops = 32.921 (norm. = 0.1125), norm. avg. (of 4) = 0.0686373 fft 11: mflops = 50.0972 (norm. = 0.171196), norm. avg. (of 4) = 0.117512 fft 12: mflops = 49.5585 (norm. = 0.169355), norm. avg. (of 4) = 0.114002 fft 13: mflops = 74.3377 (norm. = 0.254032), norm. avg. (of 4) = 0.158971 fft 14: mflops = 65.842 (norm. = 0.225), norm. avg. (of 4) = 0.137242 fft 15: mflops = 17.7267 (norm. = 0.0605769), norm. avg. (of 4) = 0.050194 fft 16: mflops = 100.194 (norm. = 0.342391), norm. avg. (of 4) = 0.197569 Benchmarking for array size = 18: 0. Brenner: elapsed time t=1.34995 s, 65536 iters, t-(init.)=1.34995 s t(norm)=0.274433, mflops=18.2194 (err=4.5e-16) 1. CWP (min N): elapsed time t=1.5666 s, 524288 iters, t-(init.)=1.48327 s t(norm)=0.0376921, mflops=132.654 2. CWP (best N) (N=28): elapsed time t=1.7666 s, 524288 iters, t-(init.)=1.68327 s t(norm)=0.0427742, mflops=116.893 3. FFTPACK: elapsed time t=1.41661 s, 524288 iters, t-(init.)=1.33328 s t(norm)=0.0338806, mflops=147.577 (err=2.6e-16) 4. 
FFTPACK (f2c): elapsed time t=1.16662 s, 262144 iters, t-(init.)=1.13329 s t(norm)=0.057597, mflops=86.8101 (err=2.6e-16) FFTW_MEASURE plan: (cost = 1.652969e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6 5. FFTW: elapsed time t=1.7166 s, 1048576 iters, t-(init.)=1.5666 s t(norm)=0.0199048, mflops=251.195 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.79993 s, 1048576 iters, t-(init.)=1.58327 s t(norm)=0.0201166, mflops=248.551 (err=1.9e-16) 7. Frigo-old: elapsed time t=1.41661 s, 131072 iters, t-(init.)=1.38328 s t(norm)=0.140604, mflops=35.5608 (err=4.5e-16) 8. GSL: elapsed time t=1.53327 s, 524288 iters, t-(init.)=1.46661 s t(norm)=0.0372686, mflops=134.161 (err=2.2e-16) 9. NAPACK (f2c): elapsed time t=1.48327 s, 131072 iters, t-(init.)=1.46661 s t(norm)=0.149075, mflops=33.5403 (err=8.7e-16) 10. Nielsen: elapsed time t=1.86659 s, 131072 iters, t-(init.)=1.83326 s t(norm)=0.186343, mflops=26.8322 (err=7.3e-16) 11. Singleton: elapsed time t=1.6166 s, 262144 iters, t-(init.)=1.5666 s t(norm)=0.0796193, mflops=62.7988 (err=2.1e-16) 12. Singleton (f2c): elapsed time t=1.73326 s, 262144 iters, t-(init.)=1.69993 s t(norm)=0.0863955, mflops=57.8734 (err=2.1e-16) 13. Temperton: elapsed time t=1.49994 s, 262144 iters, t-(init.)=1.46661 s t(norm)=0.0745373, mflops=67.0806 (err=2.7e-08) 14. Temperton (f2c): elapsed time t=1.69993 s, 262144 iters, t-(init.)=1.64993 s t(norm)=0.0838544, mflops=59.6272 (err=2.9e-16) 15. Valkenburg: elapsed time t=1.48327 s, 65536 iters, t-(init.)=1.46661 s t(norm)=0.298149, mflops=16.7701 (err=4.1e-16) 16. DXML: elapsed time t=1.94992 s, 524288 iters, t-(init.)=1.88326 s t(norm)=0.0478563, mflops=104.479 (err=2.4e-16) Top mflops for N=18 = 251.195 Normalized results and averages for N=18: fft 0: mflops = 18.2194 (norm. = 0.0725309), norm. avg. (of 3) = 0.0596519 fft 1: mflops = 132.654 (norm. = 0.52809), norm. avg. (of 5) = 0.314186 fft 2: mflops = 116.893 (norm. = 0.465347), norm. avg. 
(of 5) = 0.269986 fft 3: mflops = 147.577 (norm. = 0.5875), norm. avg. (of 5) = 0.446489 fft 4: mflops = 86.8101 (norm. = 0.345588), norm. avg. (of 5) = 0.28856 fft 5: mflops = 251.195 (norm. = 1), norm. avg. (of 5) = 0.86946 fft 6: mflops = 248.551 (norm. = 0.989474), norm. avg. (of 5) = 0.997895 fft 7: mflops = 35.5608 (norm. = 0.141566), norm. avg. (of 5) = 0.114691 fft 8: mflops = 134.161 (norm. = 0.534091), norm. avg. (of 5) = 0.291452 fft 9: mflops = 33.5403 (norm. = 0.133523), norm. avg. (of 5) = 0.0801839 fft 10: mflops = 26.8322 (norm. = 0.106818), norm. avg. (of 5) = 0.0762735 fft 11: mflops = 62.7988 (norm. = 0.25), norm. avg. (of 5) = 0.144009 fft 12: mflops = 57.8734 (norm. = 0.230392), norm. avg. (of 5) = 0.13728 fft 13: mflops = 67.0806 (norm. = 0.267045), norm. avg. (of 5) = 0.180586 fft 14: mflops = 59.6272 (norm. = 0.237374), norm. avg. (of 5) = 0.157269 fft 15: mflops = 16.7701 (norm. = 0.0667614), norm. avg. (of 5) = 0.0535075 fft 16: mflops = 104.479 (norm. = 0.415929), norm. avg. (of 5) = 0.241241 Benchmarking for array size = 24: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.51661 s, 524288 iters, t-(init.)=1.43328 s t(norm)=0.0248435, mflops=201.26 2. CWP (best N) (N=28): elapsed time t=1.74993 s, 524288 iters, t-(init.)=1.64993 s t(norm)=0.0285989, mflops=174.832 3. FFTPACK: elapsed time t=1.64993 s, 524288 iters, t-(init.)=1.5666 s t(norm)=0.0271545, mflops=184.131 (err=2.2e-16) 4. FFTPACK (f2c): elapsed time t=1.31661 s, 262144 iters, t-(init.)=1.28328 s t(norm)=0.0444872, mflops=112.392 (err=2.4e-16) FFTW_MEASURE plan: (cost = 1.780121e-06) FFTW_TWIDDLE 4 FFTW_NOTW 6 5. FFTW: elapsed time t=1.03329 s, 524288 iters, t-(init.)=0.949962 s t(norm)=0.016466, mflops=303.655 (err=2.3e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.18329 s, 524288 iters, t-(init.)=1.06662 s t(norm)=0.0184882, mflops=270.443 (err=2.1e-16) 7. 
Frigo-old: elapsed time t=1.01663 s, 131072 iters, t-(init.)=0.99996 s t(norm)=0.0693307, mflops=72.1181 (err=3.6e-16) 8. GSL: elapsed time t=1.59994 s, 524288 iters, t-(init.)=1.51661 s t(norm)=0.0262879, mflops=190.202 (err=2.1e-16) 9. NAPACK (f2c): elapsed time t=1.09996 s, 65536 iters, t-(init.)=1.09996 s t(norm)=0.152528, mflops=32.781 (err=8.0e-16) 10. Nielsen: elapsed time t=1.54994 s, 131072 iters, t-(init.)=1.53327 s t(norm)=0.106307, mflops=47.0335 (err=1.5e-15) 11. Singleton: elapsed time t=1.23328 s, 131072 iters, t-(init.)=1.21662 s t(norm)=0.0843524, mflops=59.2752 (err=2.3e-16) 12. Singleton (f2c): elapsed time t=1.36661 s, 131072 iters, t-(init.)=1.34995 s t(norm)=0.0935965, mflops=53.4208 (err=2.3e-16) 13. Temperton: elapsed time t=1.46661 s, 262144 iters, t-(init.)=1.43328 s t(norm)=0.049687, mflops=100.63 (err=4.5e-09) 14. Temperton (f2c): elapsed time t=1.91659 s, 262144 iters, t-(init.)=1.86659 s t(norm)=0.0647087, mflops=77.2694 (err=2.8e-16) 15. Valkenburg: elapsed time t=1.01663 s, 32768 iters, t-(init.)=1.01663 s t(norm)=0.281945, mflops=17.734 (err=5.6e-16) 16. DXML: elapsed time t=1.18329 s, 262144 iters, t-(init.)=1.13329 s t(norm)=0.0392874, mflops=127.267 (err=8.3e-16) Top mflops for N=24 = 303.655 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.00329321), norm. avg. (of 3) = 0.0596519 fft 1: mflops = 201.26 (norm. = 0.662791), norm. avg. (of 6) = 0.372286 fft 2: mflops = 174.832 (norm. = 0.575758), norm. avg. (of 6) = 0.320948 fft 3: mflops = 184.131 (norm. = 0.606383), norm. avg. (of 6) = 0.473138 fft 4: mflops = 112.392 (norm. = 0.37013), norm. avg. (of 6) = 0.302155 fft 5: mflops = 303.655 (norm. = 1), norm. avg. (of 6) = 0.891217 fft 6: mflops = 270.443 (norm. = 0.890625), norm. avg. (of 6) = 0.980016 fft 7: mflops = 72.1181 (norm. = 0.2375), norm. avg. (of 6) = 0.135159 fft 8: mflops = 190.202 (norm. = 0.626374), norm. avg. (of 6) = 0.347272 fft 9: mflops = 32.781 (norm. = 0.107955), norm. avg. 
(of 6) = 0.0848123 fft 10: mflops = 47.0335 (norm. = 0.154891), norm. avg. (of 6) = 0.0893764 fft 11: mflops = 59.2752 (norm. = 0.195205), norm. avg. (of 6) = 0.152542 fft 12: mflops = 53.4208 (norm. = 0.175926), norm. avg. (of 6) = 0.143721 fft 13: mflops = 100.63 (norm. = 0.331395), norm. avg. (of 6) = 0.205721 fft 14: mflops = 77.2694 (norm. = 0.254464), norm. avg. (of 6) = 0.173468 fft 15: mflops = 17.734 (norm. = 0.0584016), norm. avg. (of 6) = 0.0543232 fft 16: mflops = 127.267 (norm. = 0.419118), norm. avg. (of 6) = 0.270887 Benchmarking for array size = 36: 0. Brenner: elapsed time t=1.23328 s, 32768 iters, t-(init.)=1.23328 s t(norm)=0.202221, mflops=24.7254 (err=5.5e-16) 1. CWP (min N): elapsed time t=1.03329 s, 262144 iters, t-(init.)=0.983294 s t(norm)=0.0201538, mflops=248.092 2. CWP (best N): elapsed time t=1.04996 s, 262144 iters, t-(init.)=0.99996 s t(norm)=0.0204954, mflops=243.957 3. FFTPACK: elapsed time t=1.94992 s, 524288 iters, t-(init.)=1.83326 s t(norm)=0.0187874, mflops=266.135 (err=3.9e-16) 4. FFTPACK (f2c): elapsed time t=1.98325 s, 262144 iters, t-(init.)=1.93326 s t(norm)=0.0396244, mflops=126.185 (err=4.5e-16) FFTW_MEASURE plan: (cost = 2.924484e-06) FFTW_TWIDDLE 6 FFTW_NOTW 6 5. FFTW: elapsed time t=1.54994 s, 524288 iters, t-(init.)=1.44994 s t(norm)=0.0148592, mflops=336.493 (err=4.6e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.6666 s, 524288 iters, t-(init.)=1.51661 s t(norm)=0.0155423, mflops=321.702 (err=4.4e-16) 7. Frigo-old: elapsed time t=1.36661 s, 65536 iters, t-(init.)=1.34995 s t(norm)=0.110675, mflops=45.1773 (err=5.4e-16) 8. GSL: elapsed time t=1.16662 s, 262144 iters, t-(init.)=1.11662 s t(norm)=0.0228865, mflops=218.469 (err=4.3e-16) 9. NAPACK (f2c): elapsed time t=1.51661 s, 65536 iters, t-(init.)=1.49994 s t(norm)=0.122972, mflops=40.6596 (err=1.4e-15) 10. 
Nielsen: elapsed time t=1.33328 s, 65536 iters, t-(init.)=1.31661 s t(norm)=0.107942, mflops=46.321 (err=1.0e-15) 11. Singleton: elapsed time t=1.29995 s, 131072 iters, t-(init.)=1.28328 s t(norm)=0.0526048, mflops=95.0483 (err=4.7e-16) 12. Singleton (f2c): elapsed time t=1.41661 s, 131072 iters, t-(init.)=1.38328 s t(norm)=0.0567039, mflops=88.1774 (err=4.7e-16) 13. Temperton: elapsed time t=1.89992 s, 262144 iters, t-(init.)=1.84993 s t(norm)=0.0379165, mflops=131.869 (err=5.1e-08) 14. Temperton (f2c): elapsed time t=1.14995 s, 131072 iters, t-(init.)=1.13329 s t(norm)=0.0464562, mflops=107.628 (err=3.7e-16) 15. Valkenburg: elapsed time t=1.64993 s, 32768 iters, t-(init.)=1.63327 s t(norm)=0.267806, mflops=18.6702 (err=6.2e-16) 16. DXML: elapsed time t=1.26662 s, 131072 iters, t-(init.)=1.23328 s t(norm)=0.0505553, mflops=98.9017 (err=4.0e-16) Top mflops for N=36 = 336.493 Normalized results and averages for N=36: fft 0: mflops = 24.7254 (norm. = 0.0734797), norm. avg. (of 4) = 0.0631088 fft 1: mflops = 248.092 (norm. = 0.737288), norm. avg. (of 7) = 0.42443 fft 2: mflops = 243.957 (norm. = 0.725), norm. avg. (of 7) = 0.37867 fft 3: mflops = 266.135 (norm. = 0.790909), norm. avg. (of 7) = 0.518534 fft 4: mflops = 126.185 (norm. = 0.375), norm. avg. (of 7) = 0.312561 fft 5: mflops = 336.493 (norm. = 1), norm. avg. (of 7) = 0.906757 fft 6: mflops = 321.702 (norm. = 0.956044), norm. avg. (of 7) = 0.976592 fft 7: mflops = 45.1773 (norm. = 0.134259), norm. avg. (of 7) = 0.135031 fft 8: mflops = 218.469 (norm. = 0.649254), norm. avg. (of 7) = 0.390413 fft 9: mflops = 40.6596 (norm. = 0.120833), norm. avg. (of 7) = 0.0899582 fft 10: mflops = 46.321 (norm. = 0.137658), norm. avg. (of 7) = 0.0962738 fft 11: mflops = 95.0483 (norm. = 0.282468), norm. avg. (of 7) = 0.171103 fft 12: mflops = 88.1774 (norm. = 0.262048), norm. avg. (of 7) = 0.160625 fft 13: mflops = 131.869 (norm. = 0.391892), norm. avg. (of 7) = 0.232317 fft 14: mflops = 107.628 (norm. = 0.319853), norm. avg. 
(of 7) = 0.19438 fft 15: mflops = 18.6702 (norm. = 0.0554847), norm. avg. (of 7) = 0.0544891 fft 16: mflops = 98.9017 (norm. = 0.293919), norm. avg. (of 7) = 0.274177 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.89992 s, 32768 iters, t-(init.)=1.89992 s t(norm)=0.114643, mflops=43.6137 (err=4.3e-16) 1. CWP (min N): elapsed time t=1.09996 s, 131072 iters, t-(init.)=1.04996 s t(norm)=0.0158388, mflops=315.68 2. CWP (best N) (N=84): elapsed time t=1.04996 s, 131072 iters, t-(init.)=0.99996 s t(norm)=0.0150846, mflops=331.464 3. FFTPACK: elapsed time t=1.73326 s, 262144 iters, t-(init.)=1.63327 s t(norm)=0.0123191, mflops=405.875 (err=3.2e-16) 4. FFTPACK (f2c): elapsed time t=1.84993 s, 131072 iters, t-(init.)=1.79993 s t(norm)=0.0271522, mflops=184.147 (err=3.8e-16) FFTW_MEASURE plan: (cost = 6.611877e-06) FFTW_TWIDDLE 5 FFTW_NOTW 16 5. FFTW: elapsed time t=1.74993 s, 262144 iters, t-(init.)=1.64993 s t(norm)=0.0124448, mflops=401.775 (err=3.5e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 6. FFTW_ESTIMATE: elapsed time t=1.74993 s, 262144 iters, t-(init.)=1.6166 s t(norm)=0.0121934, mflops=410.059 (err=3.5e-16) 7. Frigo-old: elapsed time t=1.86659 s, 65536 iters, t-(init.)=1.83326 s t(norm)=0.0553101, mflops=90.3994 (err=3.6e-16) 8. GSL: elapsed time t=1.33328 s, 65536 iters, t-(init.)=1.31661 s t(norm)=0.0397227, mflops=125.873 (err=3.3e-16) 9. NAPACK (f2c): elapsed time t=1.54994 s, 16384 iters, t-(init.)=1.53327 s t(norm)=0.185037, mflops=27.0216 (err=5.0e-16) 10. Nielsen: elapsed time t=1.73326 s, 65536 iters, t-(init.)=1.7166 s t(norm)=0.0517904, mflops=96.543 (err=5.0e-15) 11. Singleton: elapsed time t=1.09996 s, 65536 iters, t-(init.)=1.06662 s t(norm)=0.0321804, mflops=155.374 (err=4.4e-16) 12. Singleton (f2c): elapsed time t=1.21662 s, 65536 iters, t-(init.)=1.18329 s t(norm)=0.0357002, mflops=140.055 (err=4.4e-16) 13. 
Temperton: elapsed time t=1.6666 s, 131072 iters, t-(init.)=1.6166 s t(norm)=0.0243867, mflops=205.03 (err=5.3e-08) 14. Temperton (f2c): elapsed time t=1.11662 s, 65536 iters, t-(init.)=1.08329 s t(norm)=0.0326833, mflops=152.984 (err=3.4e-16) 15. Valkenburg: elapsed time t=1.14995 s, 8192 iters, t-(init.)=1.13329 s t(norm)=0.273534, mflops=18.2793 (err=4.6e-16) 16. DXML: elapsed time t=1.64993 s, 131072 iters, t-(init.)=1.59994 s t(norm)=0.0241353, mflops=207.165 (err=3.5e-16) Top mflops for N=80 = 410.059 Normalized results and averages for N=80: fft 0: mflops = 43.6137 (norm. = 0.10636), norm. avg. (of 5) = 0.071759 fft 1: mflops = 315.68 (norm. = 0.769841), norm. avg. (of 8) = 0.467606 fft 2: mflops = 331.464 (norm. = 0.808333), norm. avg. (of 8) = 0.432378 fft 3: mflops = 405.875 (norm. = 0.989796), norm. avg. (of 8) = 0.577442 fft 4: mflops = 184.147 (norm. = 0.449074), norm. avg. (of 8) = 0.329625 fft 5: mflops = 401.775 (norm. = 0.979798), norm. avg. (of 8) = 0.915887 fft 6: mflops = 410.059 (norm. = 1), norm. avg. (of 8) = 0.979518 fft 7: mflops = 90.3994 (norm. = 0.220455), norm. avg. (of 8) = 0.145709 fft 8: mflops = 125.873 (norm. = 0.306962), norm. avg. (of 8) = 0.379981 fft 9: mflops = 27.0216 (norm. = 0.0658967), norm. avg. (of 8) = 0.0869505 fft 10: mflops = 96.543 (norm. = 0.235437), norm. avg. (of 8) = 0.113669 fft 11: mflops = 155.374 (norm. = 0.378906), norm. avg. (of 8) = 0.197078 fft 12: mflops = 140.055 (norm. = 0.341549), norm. avg. (of 8) = 0.18324 fft 13: mflops = 205.03 (norm. = 0.5), norm. avg. (of 8) = 0.265777 fft 14: mflops = 152.984 (norm. = 0.373077), norm. avg. (of 8) = 0.216717 fft 15: mflops = 18.2793 (norm. = 0.0445772), norm. avg. (of 8) = 0.0532501 fft 16: mflops = 207.165 (norm. = 0.505208), norm. avg. (of 8) = 0.303056 Benchmarking for array size = 108: 0. Brenner: elapsed time t=1.06662 s, 8192 iters, t-(init.)=1.06662 s t(norm)=0.178476, mflops=28.015 (err=6.5e-16) 1. 
CWP (min N) (N=110): elapsed time t=1.89992 s, 131072 iters, t-(init.)=1.83326 s t(norm)=0.0191722, mflops=260.794 2. CWP (best N) (N=112): elapsed time t=1.53327 s, 131072 iters, t-(init.)=1.46661 s t(norm)=0.0153378, mflops=325.993 3. FFTPACK: elapsed time t=1.31661 s, 131072 iters, t-(init.)=1.24995 s t(norm)=0.013072, mflops=382.498 (err=4.1e-16) 4. FFTPACK (f2c): elapsed time t=1.68327 s, 65536 iters, t-(init.)=1.64993 s t(norm)=0.03451, mflops=144.886 (err=4.1e-16) FFTW_MEASURE plan: (cost = 1.017212e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 5. FFTW: elapsed time t=1.29995 s, 131072 iters, t-(init.)=1.23328 s t(norm)=0.0128977, mflops=387.667 (err=3.6e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.31661 s, 131072 iters, t-(init.)=1.24995 s t(norm)=0.013072, mflops=382.498 (err=3.6e-16) 7. Frigo-old: elapsed time t=1.46661 s, 16384 iters, t-(init.)=1.44994 s t(norm)=0.121308, mflops=41.2175 (err=5.5e-16) 8. GSL: elapsed time t=1.19995 s, 65536 iters, t-(init.)=1.16662 s t(norm)=0.024401, mflops=204.91 (err=3.9e-16) 9. NAPACK (f2c): elapsed time t=1.21662 s, 16384 iters, t-(init.)=1.19995 s t(norm)=0.100393, mflops=49.8044 (err=3.1e-15) 10. Nielsen: elapsed time t=1.03329 s, 16384 iters, t-(init.)=1.03329 s t(norm)=0.0864493, mflops=57.8374 (err=1.1e-15) 11. Singleton: elapsed time t=1.14995 s, 32768 iters, t-(init.)=1.13329 s t(norm)=0.0474077, mflops=105.468 (err=4.5e-16) 12. Singleton (f2c): elapsed time t=1.33328 s, 32768 iters, t-(init.)=1.31661 s t(norm)=0.0550765, mflops=90.7828 (err=4.5e-16) 13. Temperton: elapsed time t=1.38328 s, 65536 iters, t-(init.)=1.34995 s t(norm)=0.0282354, mflops=177.082 (err=7.4e-08) 14. Temperton (f2c): elapsed time t=1.64993 s, 65536 iters, t-(init.)=1.6166 s t(norm)=0.0338128, mflops=147.873 (err=3.5e-16) 15. Valkenburg: elapsed time t=1.5666 s, 8192 iters, t-(init.)=1.5666 s t(norm)=0.262136, mflops=19.074 (err=7.5e-16) 16. 
DXML: elapsed time t=1.26662 s, 65536 iters, t-(init.)=1.23328 s t(norm)=0.0257953, mflops=193.833 (err=3.6e-16) Top mflops for N=108 = 387.667 Normalized results and averages for N=108: fft 0: mflops = 28.015 (norm. = 0.0722656), norm. avg. (of 6) = 0.0718434 fft 1: mflops = 260.794 (norm. = 0.672727), norm. avg. (of 9) = 0.490397 fft 2: mflops = 325.993 (norm. = 0.840909), norm. avg. (of 9) = 0.47777 fft 3: mflops = 382.498 (norm. = 0.986667), norm. avg. (of 9) = 0.622911 fft 4: mflops = 144.886 (norm. = 0.373737), norm. avg. (of 9) = 0.334527 fft 5: mflops = 387.667 (norm. = 1), norm. avg. (of 9) = 0.925233 fft 6: mflops = 382.498 (norm. = 0.986667), norm. avg. (of 9) = 0.980312 fft 7: mflops = 41.2175 (norm. = 0.106322), norm. avg. (of 9) = 0.141332 fft 8: mflops = 204.91 (norm. = 0.528571), norm. avg. (of 9) = 0.396491 fft 9: mflops = 49.8044 (norm. = 0.128472), norm. avg. (of 9) = 0.091564 fft 10: mflops = 57.8374 (norm. = 0.149194), norm. avg. (of 9) = 0.117616 fft 11: mflops = 105.468 (norm. = 0.272059), norm. avg. (of 9) = 0.205409 fft 12: mflops = 90.7828 (norm. = 0.234177), norm. avg. (of 9) = 0.1889 fft 13: mflops = 177.082 (norm. = 0.45679), norm. avg. (of 9) = 0.287001 fft 14: mflops = 147.873 (norm. = 0.381443), norm. avg. (of 9) = 0.23502 fft 15: mflops = 19.074 (norm. = 0.0492021), norm. avg. (of 9) = 0.0528003 fft 16: mflops = 193.833 (norm. = 0.5), norm. avg. (of 9) = 0.324939 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1.93326 s, 8192 iters, t-(init.)=1.91659 s t(norm)=0.14442, mflops=34.6213 (err=6.8e-16) 1. CWP (min N): elapsed time t=1.51661 s, 65536 iters, t-(init.)=1.44994 s t(norm)=0.0136571, mflops=366.11 2. CWP (best N): elapsed time t=1.49994 s, 65536 iters, t-(init.)=1.43328 s t(norm)=0.0135001, mflops=370.367 3. FFTPACK: elapsed time t=1.51661 s, 32768 iters, t-(init.)=1.48327 s t(norm)=0.0279421, mflops=178.942 (err=4.7e-16) 4. 
FFTPACK (f2c): elapsed time t=1.5666 s, 16384 iters, t-(init.)=1.54994 s t(norm)=0.0583958, mflops=85.6226 (err=6.2e-16) FFTW_MEASURE plan: (cost = 2.644751e-05) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 6 5. FFTW: elapsed time t=1.83326 s, 65536 iters, t-(init.)=1.7666 s t(norm)=0.0166397, mflops=300.487 (err=4.8e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.09996 s, 32768 iters, t-(init.)=1.06662 s t(norm)=0.0200932, mflops=248.841 (err=4.9e-16) 7. Frigo-old: elapsed time t=1.6166 s, 8192 iters, t-(init.)=1.6166 s t(norm)=0.121815, mflops=41.0459 (err=6.3e-16) 8. GSL: elapsed time t=1.88326 s, 32768 iters, t-(init.)=1.84993 s t(norm)=0.0348491, mflops=143.476 (err=6.4e-16) 9. NAPACK (f2c): elapsed time t=1.68327 s, 4096 iters, t-(init.)=1.68327 s t(norm)=0.253676, mflops=19.7102 (err=1.5e-14) 10. Nielsen: elapsed time t=1.89992 s, 16384 iters, t-(init.)=1.88326 s t(norm)=0.070954, mflops=70.4682 (err=7.4e-15) 11. Singleton: elapsed time t=1.43328 s, 16384 iters, t-(init.)=1.41661 s t(norm)=0.0533725, mflops=93.6812 (err=6.4e-16) 12. Singleton (f2c): elapsed time t=1.63327 s, 16384 iters, t-(init.)=1.6166 s t(norm)=0.0609075, mflops=82.0918 (err=6.4e-16) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.96659 s, 4096 iters, t-(init.)=1.94992 s t(norm)=0.293863, mflops=17.0147 (err=7.5e-16) 16. DXML: elapsed time t=1.81659 s, 16384 iters, t-(init.)=1.79993 s t(norm)=0.0678145, mflops=73.7306 (err=7.3e-16) Top mflops for N=210 = 370.367 Normalized results and averages for N=210: fft 0: mflops = 34.6213 (norm. = 0.0934783), norm. avg. (of 7) = 0.0749341 fft 1: mflops = 366.11 (norm. = 0.988506), norm. avg. (of 10) = 0.540208 fft 2: mflops = 370.367 (norm. = 1), norm. avg. (of 10) = 0.529993 fft 3: mflops = 178.942 (norm. = 0.483146), norm. avg. 
(of 10) = 0.608935 fft 4: mflops = 85.6226 (norm. = 0.231183), norm. avg. (of 10) = 0.324192 fft 5: mflops = 300.487 (norm. = 0.811321), norm. avg. (of 10) = 0.913842 fft 6: mflops = 248.841 (norm. = 0.671875), norm. avg. (of 10) = 0.949468 fft 7: mflops = 41.0459 (norm. = 0.110825), norm. avg. (of 10) = 0.138282 fft 8: mflops = 143.476 (norm. = 0.387387), norm. avg. (of 10) = 0.395581 fft 9: mflops = 19.7102 (norm. = 0.0532178), norm. avg. (of 10) = 0.0877294 fft 10: mflops = 70.4682 (norm. = 0.190265), norm. avg. (of 10) = 0.124881 fft 11: mflops = 93.6812 (norm. = 0.252941), norm. avg. (of 10) = 0.210163 fft 12: mflops = 82.0918 (norm. = 0.221649), norm. avg. (of 10) = 0.192175 fft 13: mflops = -1 (norm. = -0.00270002), norm. avg. (of 9) = 0.287001 fft 14: mflops = -1 (norm. = -0.00270002), norm. avg. (of 9) = 0.23502 fft 15: mflops = 17.0147 (norm. = 0.0459402), norm. avg. (of 10) = 0.0521143 fft 16: mflops = 73.7306 (norm. = 0.199074), norm. avg. (of 10) = 0.312352 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.43328 s, 2048 iters, t-(init.)=1.41661 s t(norm)=0.152878, mflops=32.7058 (err=1.5e-15) 1. CWP (min N): elapsed time t=1.34995 s, 32768 iters, t-(init.)=1.28328 s t(norm)=0.00865559, mflops=577.661 2. CWP (best N): elapsed time t=1.34995 s, 32768 iters, t-(init.)=1.26662 s t(norm)=0.00854318, mflops=585.262 3. FFTPACK: elapsed time t=1.89992 s, 16384 iters, t-(init.)=1.86659 s t(norm)=0.0251799, mflops=198.571 (err=1.3e-15) 4. FFTPACK (f2c): elapsed time t=1.14995 s, 4096 iters, t-(init.)=1.13329 s t(norm)=0.0611512, mflops=81.7645 (err=1.3e-15) FFTW_MEASURE plan: (cost = 6.103271e-05) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 5. FFTW: elapsed time t=2.01659 s, 32768 iters, t-(init.)=1.93326 s t(norm)=0.0130396, mflops=383.447 (err=1.2e-15) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 6. 
FFTW_ESTIMATE: elapsed time t=1.01663 s, 16384 iters, t-(init.)=0.983294 s t(norm)=0.0132644, mflops=376.948 (err=1.2e-15) 7. Frigo-old: elapsed time t=1.89992 s, 4096 iters, t-(init.)=1.89992 s t(norm)=0.102518, mflops=48.7718 (err=1.3e-15) 8. GSL: elapsed time t=1.69993 s, 16384 iters, t-(init.)=1.6666 s t(norm)=0.0224821, mflops=222.4 (err=1.3e-15) 9. NAPACK (f2c): elapsed time t=1.83326 s, 2048 iters, t-(init.)=1.81659 s t(norm)=0.196044, mflops=25.5045 (err=4.1e-14) 10. Nielsen: elapsed time t=1.33328 s, 4096 iters, t-(init.)=1.31661 s t(norm)=0.0710433, mflops=70.3796 (err=6.0e-15) 11. Singleton: elapsed time t=1.7166 s, 8192 iters, t-(init.)=1.68327 s t(norm)=0.0454138, mflops=110.099 (err=1.9e-15) 12. Singleton (f2c): elapsed time t=1.96659 s, 8192 iters, t-(init.)=1.94992 s t(norm)=0.052608, mflops=95.0425 (err=1.9e-15) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.29995 s, 1024 iters, t-(init.)=1.29995 s t(norm)=0.280576, mflops=17.8205 (err=1.7e-15) 16. DXML: elapsed time t=1.89992 s, 8192 iters, t-(init.)=1.88326 s t(norm)=0.0508095, mflops=98.4069 (err=2.1e-15) Top mflops for N=504 = 585.262 Normalized results and averages for N=504: fft 0: mflops = 32.7058 (norm. = 0.0558824), norm. avg. (of 8) = 0.0725526 fft 1: mflops = 577.661 (norm. = 0.987013), norm. avg. (of 11) = 0.580827 fft 2: mflops = 585.262 (norm. = 1), norm. avg. (of 11) = 0.572721 fft 3: mflops = 198.571 (norm. = 0.339286), norm. avg. (of 11) = 0.584421 fft 4: mflops = 81.7645 (norm. = 0.139706), norm. avg. (of 11) = 0.307421 fft 5: mflops = 383.447 (norm. = 0.655172), norm. avg. (of 11) = 0.890326 fft 6: mflops = 376.948 (norm. = 0.644068), norm. avg. (of 11) = 0.921705 fft 7: mflops = 48.7718 (norm. = 0.0833333), norm. avg. (of 11) = 0.133286 fft 8: mflops = 222.4 (norm. = 0.38), norm. avg. (of 11) = 0.394164 fft 9: mflops = 25.5045 (norm. = 0.043578), norm. avg. 
(of 11) = 0.0837156 fft 10: mflops = 70.3796 (norm. = 0.120253), norm. avg. (of 11) = 0.124461 fft 11: mflops = 110.099 (norm. = 0.188119), norm. avg. (of 11) = 0.208159 fft 12: mflops = 95.0425 (norm. = 0.162393), norm. avg. (of 11) = 0.189468 fft 13: mflops = -1 (norm. = -0.00170864), norm. avg. (of 9) = 0.287001 fft 14: mflops = -1 (norm. = -0.00170864), norm. avg. (of 9) = 0.23502 fft 15: mflops = 17.8205 (norm. = 0.0304487), norm. avg. (of 11) = 0.0501447 fft 16: mflops = 98.4069 (norm. = 0.168142), norm. avg. (of 11) = 0.299242 Benchmarking for array size = 1000: 0. Brenner: elapsed time t=1.49994 s, 1024 iters, t-(init.)=1.49994 s t(norm)=0.146981, mflops=34.0179 (err=1.1e-15) 1. CWP (min N) (N=1001): elapsed time t=1.39994 s, 8192 iters, t-(init.)=1.36661 s t(norm)=0.0167396, mflops=298.694 2. CWP (best N) (N=1008): elapsed time t=1.99992 s, 16384 iters, t-(init.)=1.91659 s t(norm)=0.0117381, mflops=425.963 3. FFTPACK: elapsed time t=1.23328 s, 8192 iters, t-(init.)=1.19995 s t(norm)=0.0146981, mflops=340.179 (err=9.9e-16) 4. FFTPACK (f2c): elapsed time t=1.31661 s, 4096 iters, t-(init.)=1.29995 s t(norm)=0.031846, mflops=157.006 (err=1.1e-15) FFTW_MEASURE plan: (cost = 1.464785e-04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 5. FFTW: elapsed time t=1.21662 s, 8192 iters, t-(init.)=1.18329 s t(norm)=0.014494, mflops=344.97 (err=9.8e-16) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.21662 s, 8192 iters, t-(init.)=1.18329 s t(norm)=0.014494, mflops=344.97 (err=9.8e-16) 7. Frigo-old: elapsed time t=1.79993 s, 2048 iters, t-(init.)=1.78326 s t(norm)=0.0873723, mflops=57.2264 (err=1.0e-15) 8. GSL: elapsed time t=1.01663 s, 2048 iters, t-(init.)=0.99996 s t(norm)=0.0489938, mflops=102.054 (err=1.0e-15) 9. NAPACK (f2c): elapsed time t=1.39994 s, 512 iters, t-(init.)=1.39994 s t(norm)=0.274365, mflops=18.2239 (err=1.7e-14) 10. 
Nielsen: elapsed time t=1.81659 s, 4096 iters, t-(init.)=1.79993 s t(norm)=0.0440944, mflops=113.393 (err=1.5e-14) 11. Singleton: elapsed time t=1.36661 s, 4096 iters, t-(init.)=1.33328 s t(norm)=0.0326625, mflops=153.081 (err=1.5e-15) 12. Singleton (f2c): elapsed time t=1.64993 s, 4096 iters, t-(init.)=1.63327 s t(norm)=0.0400116, mflops=124.964 (err=1.5e-15) 13. Temperton: elapsed time t=1.91659 s, 8192 iters, t-(init.)=1.88326 s t(norm)=0.0230679, mflops=216.751 (err=1.3e-07) 14. Temperton (f2c): elapsed time t=1.58327 s, 4096 iters, t-(init.)=1.5666 s t(norm)=0.0383785, mflops=130.281 (err=9.9e-16) 15. Valkenburg: elapsed time t=1.69993 s, 512 iters, t-(init.)=1.68327 s t(norm)=0.329892, mflops=15.1565 (err=1.1e-15) 16. DXML: elapsed time t=1.24995 s, 8192 iters, t-(init.)=1.21662 s t(norm)=0.0149023, mflops=335.519 (err=1.8e-15) Top mflops for N=1000 = 425.963 Normalized results and averages for N=1000: fft 0: mflops = 34.0179 (norm. = 0.0798611), norm. avg. (of 9) = 0.0733647 fft 1: mflops = 298.694 (norm. = 0.70122), norm. avg. (of 12) = 0.590859 fft 2: mflops = 425.963 (norm. = 1), norm. avg. (of 12) = 0.608328 fft 3: mflops = 340.179 (norm. = 0.798611), norm. avg. (of 12) = 0.60227 fft 4: mflops = 157.006 (norm. = 0.36859), norm. avg. (of 12) = 0.312518 fft 5: mflops = 344.97 (norm. = 0.809859), norm. avg. (of 12) = 0.883621 fft 6: mflops = 344.97 (norm. = 0.809859), norm. avg. (of 12) = 0.912384 fft 7: mflops = 57.2264 (norm. = 0.134346), norm. avg. (of 12) = 0.133375 fft 8: mflops = 102.054 (norm. = 0.239583), norm. avg. (of 12) = 0.381283 fft 9: mflops = 18.2239 (norm. = 0.0427827), norm. avg. (of 12) = 0.0803046 fft 10: mflops = 113.393 (norm. = 0.266204), norm. avg. (of 12) = 0.136272 fft 11: mflops = 153.081 (norm. = 0.359375), norm. avg. (of 12) = 0.22076 fft 12: mflops = 124.964 (norm. = 0.293367), norm. avg. (of 12) = 0.198126 fft 13: mflops = 216.751 (norm. = 0.50885), norm. avg. (of 10) = 0.309186 fft 14: mflops = 130.281 (norm. 
= 0.305851), norm. avg. (of 10) = 0.242103 fft 15: mflops = 15.1565 (norm. = 0.0355817), norm. avg. (of 12) = 0.0489311 fft 16: mflops = 335.519 (norm. = 0.787671), norm. avg. (of 12) = 0.339945 Benchmarking for array size = 1960: 0. Brenner: elapsed time t=1.54994 s, 512 iters, t-(init.)=1.54994 s t(norm)=0.141223, mflops=35.4051 (err=2.9e-15) 1. CWP (min N) (N=1980): elapsed time t=1.11662 s, 4096 iters, t-(init.)=1.08329 s t(norm)=0.012338, mflops=405.252 2. CWP (best N) (N=1980): elapsed time t=1.13329 s, 4096 iters, t-(init.)=1.09996 s t(norm)=0.0125278, mflops=399.112 3. FFTPACK: elapsed time t=1.5666 s, 2048 iters, t-(init.)=1.54994 s t(norm)=0.0353057, mflops=141.62 (err=2.8e-15) 4. FFTPACK (f2c): elapsed time t=1.83326 s, 1024 iters, t-(init.)=1.83326 s t(norm)=0.0835188, mflops=59.8668 (err=2.8e-15) FFTW_MEASURE plan: (cost = 3.743340e-04) FFTW_TWIDDLE 4 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 14 5. FFTW: elapsed time t=1.5666 s, 4096 iters, t-(init.)=1.53327 s t(norm)=0.017463, mflops=286.319 (err=2.8e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.79993 s, 4096 iters, t-(init.)=1.7666 s t(norm)=0.0201204, mflops=248.504 (err=2.8e-15) 7. Frigo-old: elapsed time t=1.04996 s, 512 iters, t-(init.)=1.03329 s t(norm)=0.0941485, mflops=53.1076 (err=2.8e-15) 8. GSL: elapsed time t=1.49994 s, 2048 iters, t-(init.)=1.48327 s t(norm)=0.0337871, mflops=147.985 (err=2.8e-15) 9. NAPACK (f2c): elapsed time t=1.68327 s, 256 iters, t-(init.)=1.68327 s t(norm)=0.306742, mflops=16.3004 (err=1.3e-13) 10. Nielsen: elapsed time t=1.48327 s, 1024 iters, t-(init.)=1.46661 s t(norm)=0.066815, mflops=74.8335 (err=1.7e-14) 11. Singleton: elapsed time t=1.88326 s, 2048 iters, t-(init.)=1.86659 s t(norm)=0.0425187, mflops=117.595 (err=4.3e-15) 12. Singleton (f2c): elapsed time t=1.09996 s, 1024 iters, t-(init.)=1.08329 s t(norm)=0.049352, mflops=101.313 (err=4.3e-15) 13. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.06662 s, 128 iters, t-(init.)=1.06662 s t(norm)=0.388742, mflops=12.862 (err=2.8e-15) 16. DXML: elapsed time t=1.01663 s, 512 iters, t-(init.)=1.01663 s t(norm)=0.0926299, mflops=53.9782 (err=3.6e-15) Top mflops for N=1960 = 405.252 Normalized results and averages for N=1960: fft 0: mflops = 35.4051 (norm. = 0.0873656), norm. avg. (of 10) = 0.0747648 fft 1: mflops = 405.252 (norm. = 1), norm. avg. (of 13) = 0.622332 fft 2: mflops = 399.112 (norm. = 0.984848), norm. avg. (of 13) = 0.637291 fft 3: mflops = 141.62 (norm. = 0.349462), norm. avg. (of 13) = 0.582824 fft 4: mflops = 59.8668 (norm. = 0.147727), norm. avg. (of 13) = 0.299842 fft 5: mflops = 286.319 (norm. = 0.706522), norm. avg. (of 13) = 0.869998 fft 6: mflops = 248.504 (norm. = 0.613208), norm. avg. (of 13) = 0.889371 fft 7: mflops = 53.1076 (norm. = 0.131048), norm. avg. (of 13) = 0.133196 fft 8: mflops = 147.985 (norm. = 0.365169), norm. avg. (of 13) = 0.380043 fft 9: mflops = 16.3004 (norm. = 0.0402228), norm. avg. (of 13) = 0.0772213 fft 10: mflops = 74.8335 (norm. = 0.184659), norm. avg. (of 13) = 0.139995 fft 11: mflops = 117.595 (norm. = 0.290179), norm. avg. (of 13) = 0.2261 fft 12: mflops = 101.313 (norm. = 0.25), norm. avg. (of 13) = 0.202116 fft 13: mflops = -1 (norm. = -0.0024676), norm. avg. (of 10) = 0.309186 fft 14: mflops = -1 (norm. = -0.0024676), norm. avg. (of 10) = 0.242103 fft 15: mflops = 12.862 (norm. = 0.0317383), norm. avg. (of 13) = 0.0476086 fft 16: mflops = 53.9782 (norm. = 0.133197), norm. avg. (of 13) = 0.324041 Benchmarking for array size = 4725: 0. Brenner: elapsed time t=1.39994 s, 128 iters, t-(init.)=1.38328 s t(norm)=0.187379, mflops=26.6839 (err=1.9e-15) 1. CWP (min N) (N=5005): elapsed time t=1.09996 s, 1024 iters, t-(init.)=1.03329 s t(norm)=0.0174962, mflops=285.776 2. 
CWP (best N) (N=5040): elapsed time t=1.68327 s, 2048 iters, t-(init.)=1.54994 s t(norm)=0.0131222, mflops=381.035 3. FFTPACK: elapsed time t=1.16662 s, 512 iters, t-(init.)=1.13329 s t(norm)=0.0383788, mflops=130.28 (err=1.8e-15) 4. FFTPACK (f2c): elapsed time t=1.99992 s, 512 iters, t-(init.)=1.96659 s t(norm)=0.0665985, mflops=75.0767 (err=1.9e-15) FFTW_MEASURE plan: (cost = 1.432234e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 15 5. FFTW: elapsed time t=1.33328 s, 1024 iters, t-(init.)=1.28328 s t(norm)=0.0217292, mflops=230.105 (err=1.9e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.43328 s, 1024 iters, t-(init.)=1.36661 s t(norm)=0.0231402, mflops=216.074 (err=1.8e-15) 7. Frigo-old: elapsed time t=1.13329 s, 128 iters, t-(init.)=1.11662 s t(norm)=0.151258, mflops=33.0562 (err=1.9e-15) 8. GSL: elapsed time t=1.34995 s, 512 iters, t-(init.)=1.33328 s t(norm)=0.0451516, mflops=110.738 (err=1.9e-15) 9. NAPACK (f2c): elapsed time t=1.98325 s, 128 iters, t-(init.)=1.98325 s t(norm)=0.268652, mflops=18.6115 (err=3.5e-13) 10. Nielsen: elapsed time t=1.13329 s, 256 iters, t-(init.)=1.11662 s t(norm)=0.0756289, mflops=66.1123 (err=4.1e-14) 11. Singleton: elapsed time t=1.41661 s, 512 iters, t-(init.)=1.38328 s t(norm)=0.0468447, mflops=106.736 (err=2.4e-15) 12. Singleton (f2c): elapsed time t=1.69993 s, 512 iters, t-(init.)=1.68327 s t(norm)=0.0570038, mflops=87.7134 (err=2.4e-15) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.39994 s, 64 iters, t-(init.)=1.39994 s t(norm)=0.379273, mflops=13.1831 (err=1.9e-15) 16. DXML: elapsed time t=1.5666 s, 512 iters, t-(init.)=1.53327 s t(norm)=0.0519243, mflops=96.294 (err=1.9e-15) Top mflops for N=4725 = 381.035 Normalized results and averages for N=4725: fft 0: mflops = 26.6839 (norm. = 0.0700301), norm. avg. 
(of 11) = 0.0743344 fft 1: mflops = 285.776 (norm. = 0.75), norm. avg. (of 14) = 0.631451 fft 2: mflops = 381.035 (norm. = 1), norm. avg. (of 14) = 0.663199 fft 3: mflops = 130.28 (norm. = 0.341912), norm. avg. (of 14) = 0.565616 fft 4: mflops = 75.0767 (norm. = 0.197034), norm. avg. (of 14) = 0.292499 fft 5: mflops = 230.105 (norm. = 0.603896), norm. avg. (of 14) = 0.850991 fft 6: mflops = 216.074 (norm. = 0.567073), norm. avg. (of 14) = 0.866349 fft 7: mflops = 33.0562 (norm. = 0.0867537), norm. avg. (of 14) = 0.129878 fft 8: mflops = 110.738 (norm. = 0.290625), norm. avg. (of 14) = 0.373656 fft 9: mflops = 18.6115 (norm. = 0.0488445), norm. avg. (of 14) = 0.0751944 fft 10: mflops = 66.1123 (norm. = 0.173507), norm. avg. (of 14) = 0.142388 fft 11: mflops = 106.736 (norm. = 0.28012), norm. avg. (of 14) = 0.229959 fft 12: mflops = 87.7134 (norm. = 0.230198), norm. avg. (of 14) = 0.204122 fft 13: mflops = -1 (norm. = -0.00262443), norm. avg. (of 10) = 0.309186 fft 14: mflops = -1 (norm. = -0.00262443), norm. avg. (of 10) = 0.242103 fft 15: mflops = 13.1831 (norm. = 0.0345982), norm. avg. (of 14) = 0.0466793 fft 16: mflops = 96.294 (norm. = 0.252717), norm. avg. (of 14) = 0.318946 Benchmarking for array size = 10368: 0. Brenner: elapsed time t=1.7666 s, 64 iters, t-(init.)=1.74993 s t(norm)=0.197695, mflops=25.2915 (err=3.1e-15) 1. CWP (min N) (N=10920): elapsed time t=1.46661 s, 512 iters, t-(init.)=1.31661 s t(norm)=0.0185927, mflops=268.923 2. CWP (best N) (N=11088): elapsed time t=1.46661 s, 512 iters, t-(init.)=1.31661 s t(norm)=0.0185927, mflops=268.923 3. FFTPACK: elapsed time t=1.33328 s, 256 iters, t-(init.)=1.26662 s t(norm)=0.0357733, mflops=139.769 (err=3.0e-15) 4. FFTPACK (f2c): elapsed time t=1.94992 s, 256 iters, t-(init.)=1.88326 s t(norm)=0.0531893, mflops=94.0039 (err=3.0e-15) FFTW_MEASURE plan: (cost = 3.255078e-03) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_TWIDDLE 9 FFTW_NOTW 32 5. 
FFTW: elapsed time t=1.68327 s, 512 iters, t-(init.)=1.5666 s t(norm)=0.022123, mflops=226.009 (err=3.0e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.59994 s, 512 iters, t-(init.)=1.46661 s t(norm)=0.0207109, mflops=241.419 (err=3.0e-15) 7. Frigo-old: elapsed time t=1.96659 s, 128 iters, t-(init.)=1.93326 s t(norm)=0.109203, mflops=45.7864 (err=3.1e-15) 8. GSL: elapsed time t=1.28328 s, 256 iters, t-(init.)=1.21662 s t(norm)=0.0343612, mflops=145.513 (err=3.0e-15) 9. NAPACK (f2c): elapsed time t=1.46661 s, 64 iters, t-(init.)=1.44994 s t(norm)=0.163804, mflops=30.5243 (err=8.1e-14) 10. Nielsen: elapsed time t=1.5666 s, 128 iters, t-(init.)=1.53327 s t(norm)=0.0866091, mflops=57.7307 (err=4.9e-15) 11. Singleton: elapsed time t=1.91659 s, 256 iters, t-(init.)=1.84993 s t(norm)=0.0522479, mflops=95.6977 (err=4.4e-15) 12. Singleton (f2c): elapsed time t=1.08329 s, 128 iters, t-(init.)=1.04996 s t(norm)=0.0593084, mflops=84.3051 (err=4.4e-15) 13. Temperton: elapsed time t=1.6166 s, 256 iters, t-(init.)=1.54994 s t(norm)=0.0437752, mflops=114.22 (err=2.1e-07) 14. Temperton (f2c): elapsed time t=1.84993 s, 256 iters, t-(init.)=1.78326 s t(norm)=0.050365, mflops=99.2752 (err=3.0e-15) 15. Valkenburg: elapsed time t=1.54994 s, 32 iters, t-(init.)=1.53327 s t(norm)=0.346436, mflops=14.4327 (err=3.0e-15) 16. DXML: elapsed time t=1.26662 s, 256 iters, t-(init.)=1.21662 s t(norm)=0.0343612, mflops=145.513 (err=3.2e-15) Top mflops for N=10368 = 268.923 Normalized results and averages for N=10368: fft 0: mflops = 25.2915 (norm. = 0.0940476), norm. avg. (of 12) = 0.0759771 fft 1: mflops = 268.923 (norm. = 1), norm. avg. (of 15) = 0.656021 fft 2: mflops = 268.923 (norm. = 1), norm. avg. (of 15) = 0.685652 fft 3: mflops = 139.769 (norm. = 0.519737), norm. avg. (of 15) = 0.562557 fft 4: mflops = 94.0039 (norm. = 0.349558), norm. avg. (of 15) = 0.296303 fft 5: mflops = 226.009 (norm. 
= 0.840426), norm. avg. (of 15) = 0.850286 fft 6: mflops = 241.419 (norm. = 0.897727), norm. avg. (of 15) = 0.868441 fft 7: mflops = 45.7864 (norm. = 0.170259), norm. avg. (of 15) = 0.13257 fft 8: mflops = 145.513 (norm. = 0.541096), norm. avg. (of 15) = 0.384819 fft 9: mflops = 30.5243 (norm. = 0.113506), norm. avg. (of 15) = 0.0777485 fft 10: mflops = 57.7307 (norm. = 0.214674), norm. avg. (of 15) = 0.147207 fft 11: mflops = 95.6977 (norm. = 0.355856), norm. avg. (of 15) = 0.238352 fft 12: mflops = 84.3051 (norm. = 0.313492), norm. avg. (of 15) = 0.211413 fft 13: mflops = 114.22 (norm. = 0.424731), norm. avg. (of 11) = 0.31969 fft 14: mflops = 99.2752 (norm. = 0.369159), norm. avg. (of 11) = 0.253654 fft 15: mflops = 14.4327 (norm. = 0.0536685), norm. avg. (of 15) = 0.0471452 fft 16: mflops = 145.513 (norm. = 0.541096), norm. avg. (of 15) = 0.333756 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.51661 s, 16 iters, t-(init.)=1.49994 s t(norm)=0.235864, mflops=21.1986 (err=5.6e-15) 1. CWP (min N) (N=27720): elapsed time t=1.18329 s, 128 iters, t-(init.)=1.04996 s t(norm)=0.0206381, mflops=242.27 2. CWP (best N) (N=27720): elapsed time t=1.18329 s, 128 iters, t-(init.)=1.04996 s t(norm)=0.0206381, mflops=242.27 3. FFTPACK: elapsed time t=1.08329 s, 64 iters, t-(init.)=1.01663 s t(norm)=0.0399659, mflops=125.107 (err=5.5e-15) 4. FFTPACK (f2c): elapsed time t=1.44994 s, 64 iters, t-(init.)=1.38328 s t(norm)=0.0543799, mflops=91.9458 (err=5.5e-15) FFTW_MEASURE plan: (cost = 1.197869e-02) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 6 FFTW_TWIDDLE 5 FFTW_NOTW 9 5. FFTW: elapsed time t=1.54994 s, 128 iters, t-(init.)=1.41661 s t(norm)=0.0278451, mflops=179.565 (err=5.6e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.63327 s, 128 iters, t-(init.)=1.49994 s t(norm)=0.0294831, mflops=169.589 (err=5.6e-15) 7. 
Frigo-old: elapsed time t=1.93326 s, 32 iters, t-(init.)=1.89992 s t(norm)=0.149381, mflops=33.4715 (err=5.7e-15) 8. GSL: elapsed time t=1.44994 s, 64 iters, t-(init.)=1.38328 s t(norm)=0.0543799, mflops=91.9458 (err=5.5e-15) 9. NAPACK (f2c): elapsed time t=1.68327 s, 16 iters, t-(init.)=1.6666 s t(norm)=0.262072, mflops=19.0788 (err=1.1e-12) 10. Nielsen: elapsed time t=1.26662 s, 32 iters, t-(init.)=1.23328 s t(norm)=0.0969665, mflops=51.5642 (err=2.0e-13) 11. Singleton: elapsed time t=1.94992 s, 64 iters, t-(init.)=1.88326 s t(norm)=0.0740352, mflops=67.5354 (err=7.6e-15) 12. Singleton (f2c): elapsed time t=1.06662 s, 32 iters, t-(init.)=1.03329 s t(norm)=0.0812422, mflops=61.5444 (err=7.6e-15) 13. Temperton: elapsed time t=1.6666 s, 64 iters, t-(init.)=1.59994 s t(norm)=0.0628972, mflops=79.4948 (err=1.4e-07) 14. Temperton (f2c): elapsed time t=1.01663 s, 32 iters, t-(init.)=0.983294 s t(norm)=0.0773111, mflops=64.6737 (err=5.7e-15) 15. Valkenburg: elapsed time t=1.23328 s, 8 iters, t-(init.)=1.21662 s t(norm)=0.382625, mflops=13.0676 (err=5.4e-15) 16. DXML: elapsed time t=1.59994 s, 64 iters, t-(init.)=1.53327 s t(norm)=0.0602765, mflops=82.9511 (err=5.8e-15) Top mflops for N=27000 = 242.27 Normalized results and averages for N=27000: fft 0: mflops = 21.1986 (norm. = 0.0875), norm. avg. (of 13) = 0.0768635 fft 1: mflops = 242.27 (norm. = 1), norm. avg. (of 16) = 0.67752 fft 2: mflops = 242.27 (norm. = 1), norm. avg. (of 16) = 0.705299 fft 3: mflops = 125.107 (norm. = 0.516393), norm. avg. (of 16) = 0.559672 fft 4: mflops = 91.9458 (norm. = 0.379518), norm. avg. (of 16) = 0.301504 fft 5: mflops = 179.565 (norm. = 0.741176), norm. avg. (of 16) = 0.843467 fft 6: mflops = 169.589 (norm. = 0.7), norm. avg. (of 16) = 0.857914 fft 7: mflops = 33.4715 (norm. = 0.138158), norm. avg. (of 16) = 0.13292 fft 8: mflops = 91.9458 (norm. = 0.379518), norm. avg. (of 16) = 0.384487 fft 9: mflops = 19.0788 (norm. = 0.07875), norm. avg. 
(of 16) = 0.0778111 fft 10: mflops = 51.5642 (norm. = 0.212838), norm. avg. (of 16) = 0.151309 fft 11: mflops = 67.5354 (norm. = 0.278761), norm. avg. (of 16) = 0.240877 fft 12: mflops = 61.5444 (norm. = 0.254032), norm. avg. (of 16) = 0.214077 fft 13: mflops = 79.4948 (norm. = 0.328125), norm. avg. (of 12) = 0.320393 fft 14: mflops = 64.6737 (norm. = 0.266949), norm. avg. (of 12) = 0.254762 fft 15: mflops = 13.0676 (norm. = 0.0539384), norm. avg. (of 16) = 0.0475698 fft 16: mflops = 82.9511 (norm. = 0.342391), norm. avg. (of 16) = 0.334296 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.33328 s, 4 iters, t-(init.)=1.31661 s t(norm)=0.268657, mflops=18.6111 (err=1.1e-14) 1. CWP (min N) (N=80080): elapsed time t=1.16662 s, 32 iters, t-(init.)=1.04996 s t(norm)=0.0267807, mflops=186.702 2. CWP (best N) (N=80080): elapsed time t=1.16662 s, 32 iters, t-(init.)=1.04996 s t(norm)=0.0267807, mflops=186.702 3. FFTPACK: elapsed time t=1.53327 s, 16 iters, t-(init.)=1.48327 s t(norm)=0.0756661, mflops=66.0798 (err=1.0e-14) 4. FFTPACK (f2c): elapsed time t=1.03329 s, 8 iters, t-(init.)=1.01663 s t(norm)=0.103722, mflops=48.2058 (err=1.1e-14) FFTW_MEASURE plan: (cost = 4.374825e-02) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_NOTW 14 5. FFTW: elapsed time t=1.41661 s, 32 iters, t-(init.)=1.31661 s t(norm)=0.0335821, mflops=148.889 (err=1.1e-14) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.44994 s, 32 iters, t-(init.)=1.34995 s t(norm)=0.0344323, mflops=145.212 (err=1.1e-14) 7. Frigo-old: elapsed time t=1.59994 s, 8 iters, t-(init.)=1.5666 s t(norm)=0.159834, mflops=31.2825 (err=1.1e-14) 8. GSL: elapsed time t=1.33328 s, 16 iters, t-(init.)=1.28328 s t(norm)=0.0654639, mflops=76.378 (err=1.1e-14) 9. NAPACK (f2c): elapsed time t=1.44994 s, 4 iters, t-(init.)=1.43328 s t(norm)=0.292462, mflops=17.0962 (err=5.1e-12) 10. 
Nielsen: elapsed time t=1.36661 s, 8 iters, t-(init.)=1.33328 s t(norm)=0.136029, mflops=36.7569 (err=4.8e-13) 11. Singleton: elapsed time t=1.79993 s, 16 iters, t-(init.)=1.74993 s t(norm)=0.0892689, mflops=56.0105 (err=1.5e-14) 12. Singleton (f2c): elapsed time t=1.93326 s, 16 iters, t-(init.)=1.88326 s t(norm)=0.0960704, mflops=52.0452 (err=1.5e-14) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.14995 s, 2 iters, t-(init.)=1.14995 s t(norm)=0.4693, mflops=10.6542 (err=1.1e-14) 16. DXML: elapsed time t=1.31661 s, 4 iters, t-(init.)=1.31661 s t(norm)=0.268657, mflops=18.6111 (err=1.1e-14) Top mflops for N=75600 = 186.702 Normalized results and averages for N=75600: fft 0: mflops = 18.6111 (norm. = 0.0996835), norm. avg. (of 14) = 0.0784935 fft 1: mflops = 186.702 (norm. = 1), norm. avg. (of 17) = 0.696489 fft 2: mflops = 186.702 (norm. = 1), norm. avg. (of 17) = 0.722634 fft 3: mflops = 66.0798 (norm. = 0.353933), norm. avg. (of 17) = 0.547569 fft 4: mflops = 48.2058 (norm. = 0.258197), norm. avg. (of 17) = 0.298956 fft 5: mflops = 148.889 (norm. = 0.797468), norm. avg. (of 17) = 0.840761 fft 6: mflops = 145.212 (norm. = 0.777778), norm. avg. (of 17) = 0.8532 fft 7: mflops = 31.2825 (norm. = 0.167553), norm. avg. (of 17) = 0.134957 fft 8: mflops = 76.378 (norm. = 0.409091), norm. avg. (of 17) = 0.385935 fft 9: mflops = 17.0962 (norm. = 0.0915698), norm. avg. (of 17) = 0.0786204 fft 10: mflops = 36.7569 (norm. = 0.196875), norm. avg. (of 17) = 0.15399 fft 11: mflops = 56.0105 (norm. = 0.3), norm. avg. (of 17) = 0.244355 fft 12: mflops = 52.0452 (norm. = 0.278761), norm. avg. (of 17) = 0.217882 fft 13: mflops = -1 (norm. = -0.00535614), norm. avg. (of 12) = 0.320393 fft 14: mflops = -1 (norm. = -0.00535614), norm. avg. (of 12) = 0.254762 fft 15: mflops = 10.6542 (norm. = 0.0570652), norm. avg. (of 17) = 0.0481284 fft 16: mflops = 18.6111 (norm. = 0.0996835), norm. 
avg. (of 17) = 0.320495 Benchmarking for array size = 165375: 0. Brenner: elapsed time t=1.01663 s, 1 iters, t-(init.)=0.99996 s t(norm)=0.348802, mflops=14.3348 (err=2.7e-14) 1. CWP (min N) (N=180180): elapsed time t=1.69993 s, 16 iters, t-(init.)=1.53327 s t(norm)=0.0334269, mflops=149.58 2. CWP (best N) (N=180180): elapsed time t=1.68327 s, 16 iters, t-(init.)=1.51661 s t(norm)=0.0330636, mflops=151.224 3. FFTPACK: elapsed time t=1.58327 s, 4 iters, t-(init.)=1.54994 s t(norm)=0.135161, mflops=36.9929 (err=2.7e-14) 4. FFTPACK (f2c): elapsed time t=1.08329 s, 2 iters, t-(init.)=1.04996 s t(norm)=0.183121, mflops=27.3043 (err=2.7e-14) FFTW_MEASURE plan: (cost = 1.249950e-01) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 15 5. FFTW: elapsed time t=1.94992 s, 16 iters, t-(init.)=1.78326 s t(norm)=0.0388769, mflops=128.611 (err=2.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.01663 s, 8 iters, t-(init.)=0.949962 s t(norm)=0.0414203, mflops=120.714 (err=2.7e-14) 7. Frigo-old: elapsed time t=1.29995 s, 2 iters, t-(init.)=1.26662 s t(norm)=0.220908, mflops=22.6338 (err=2.7e-14) 8. GSL: elapsed time t=1.78326 s, 8 iters, t-(init.)=1.7166 s t(norm)=0.0748472, mflops=66.8028 (err=2.7e-14) 9. NAPACK (f2c): elapsed time t=1.06662 s, 1 iters, t-(init.)=1.06662 s t(norm)=0.372056, mflops=13.4388 (err=1.6e-11) 10. Nielsen: elapsed time t=1.69993 s, 4 iters, t-(init.)=1.6666 s t(norm)=0.145334, mflops=34.4034 (err=1.6e-12) 11. Singleton: elapsed time t=1.26662 s, 4 iters, t-(init.)=1.23328 s t(norm)=0.107547, mflops=46.4911 (err=4.0e-14) 12. Singleton (f2c): elapsed time t=1.34995 s, 4 iters, t-(init.)=1.31661 s t(norm)=0.114814, mflops=43.5487 (err=4.0e-14) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. 
Valkenburg: elapsed time t=1.39994 s, 1 iters, t-(init.)=1.38328 s t(norm)=0.48251, mflops=10.3625 (err=2.7e-14) 16. DXML: elapsed time t=1.58327 s, 2 iters, t-(init.)=1.5666 s t(norm)=0.273228, mflops=18.2997 (err=2.7e-14) Top mflops for N=165375 = 151.224 Normalized results and averages for N=165375: fft 0: mflops = 14.3348 (norm. = 0.0947917), norm. avg. (of 15) = 0.0795801 fft 1: mflops = 149.58 (norm. = 0.98913), norm. avg. (of 18) = 0.712747 fft 2: mflops = 151.224 (norm. = 1), norm. avg. (of 18) = 0.738043 fft 3: mflops = 36.9929 (norm. = 0.244624), norm. avg. (of 18) = 0.530739 fft 4: mflops = 27.3043 (norm. = 0.180556), norm. avg. (of 18) = 0.292378 fft 5: mflops = 128.611 (norm. = 0.850467), norm. avg. (of 18) = 0.8413 fft 6: mflops = 120.714 (norm. = 0.798246), norm. avg. (of 18) = 0.850147 fft 7: mflops = 22.6338 (norm. = 0.149671), norm. avg. (of 18) = 0.135774 fft 8: mflops = 66.8028 (norm. = 0.441748), norm. avg. (of 18) = 0.389035 fft 9: mflops = 13.4388 (norm. = 0.0888672), norm. avg. (of 18) = 0.0791897 fft 10: mflops = 34.4034 (norm. = 0.2275), norm. avg. (of 18) = 0.158073 fft 11: mflops = 46.4911 (norm. = 0.307432), norm. avg. (of 18) = 0.247859 fft 12: mflops = 43.5487 (norm. = 0.287975), norm. avg. (of 18) = 0.221776 fft 13: mflops = -1 (norm. = -0.00661271), norm. avg. (of 12) = 0.320393 fft 14: mflops = -1 (norm. = -0.00661271), norm. avg. (of 12) = 0.254762 fft 15: mflops = 10.3625 (norm. = 0.0685241), norm. avg. (of 18) = 0.0492615 fft 16: mflops = 18.2997 (norm. = 0.121011), norm. avg. (of 18) = 0.309413 Benchmarking for array size = 362880: 0. Brenner: elapsed time t=2.58323 s, 1 iters, t-(init.)=2.5499 s t(norm)=0.380464, mflops=13.1419 (err=1.1e-13) 1. CWP (min N) (N=720720): elapsed time t=1.26662 s, 2 iters, t-(init.)=1.14995 s t(norm)=0.0857908, mflops=58.2813 2. CWP (best N) (N=720720): elapsed time t=1.26662 s, 2 iters, t-(init.)=1.14995 s t(norm)=0.0857908, mflops=58.2813 3. 
FFTPACK: elapsed time t=1.5666 s, 2 iters, t-(init.)=1.51661 s t(norm)=0.113144, mflops=44.1913 (err=1.1e-13) 4. FFTPACK (f2c): elapsed time t=1.03329 s, 1 iters, t-(init.)=0.99996 s t(norm)=0.149201, mflops=33.5117 (err=1.1e-13) FFTW_MEASURE plan: (cost = 2.833220e-01) FFTW_TWIDDLE 32 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 10 FFTW_NOTW 14 5. FFTW: elapsed time t=1.14995 s, 4 iters, t-(init.)=1.04996 s t(norm)=0.0391654, mflops=127.664 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.24995 s, 4 iters, t-(init.)=1.14995 s t(norm)=0.0428954, mflops=116.563 (err=1.1e-13) 7. Frigo-old: elapsed time t=1.29995 s, 1 iters, t-(init.)=1.26662 s t(norm)=0.188989, mflops=26.4566 (err=1.1e-13) 8. GSL: elapsed time t=1.04996 s, 2 iters, t-(init.)=0.99996 s t(norm)=0.0746007, mflops=67.0235 (err=1.1e-13) 9. NAPACK (f2c): elapsed time t=2.19991 s, 1 iters, t-(init.)=2.16658 s t(norm)=0.32327, mflops=15.467 (err=3.4e-11) 10. Nielsen: elapsed time t=1.28328 s, 1 iters, t-(init.)=1.26662 s t(norm)=0.188989, mflops=26.4566 (err=3.5e-12) 11. Singleton: elapsed time t=1.08329 s, 1 iters, t-(init.)=1.06662 s t(norm)=0.159148, mflops=31.4173 (err=1.6e-13) 12. Singleton (f2c): elapsed time t=1.13329 s, 1 iters, t-(init.)=1.11662 s t(norm)=0.166608, mflops=30.0105 (err=1.6e-13) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=3.58319 s, 1 iters, t-(init.)=3.54986 s t(norm)=0.529665, mflops=9.43993 (err=1.1e-13) 16. DXML: elapsed time t=1.16662 s, 2 iters, t-(init.)=1.11662 s t(norm)=0.0833042, mflops=60.021 (err=1.1e-13) Top mflops for N=362880 = 127.664 Normalized results and averages for N=362880: fft 0: mflops = 13.1419 (norm. = 0.102941), norm. avg. (of 16) = 0.0810401 fft 1: mflops = 58.2813 (norm. = 0.456522), norm. avg. 
(of 19) = 0.699261 fft 2: mflops = 58.2813 (norm. = 0.456522), norm. avg. (of 19) = 0.723226 fft 3: mflops = 44.1913 (norm. = 0.346154), norm. avg. (of 19) = 0.521024 fft 4: mflops = 33.5117 (norm. = 0.2625), norm. avg. (of 19) = 0.290806 fft 5: mflops = 127.664 (norm. = 1), norm. avg. (of 19) = 0.849653 fft 6: mflops = 116.563 (norm. = 0.913043), norm. avg. (of 19) = 0.853457 fft 7: mflops = 26.4566 (norm. = 0.207237), norm. avg. (of 19) = 0.139536 fft 8: mflops = 67.0235 (norm. = 0.525), norm. avg. (of 19) = 0.396191 fft 9: mflops = 15.467 (norm. = 0.121154), norm. avg. (of 19) = 0.0813983 fft 10: mflops = 26.4566 (norm. = 0.207237), norm. avg. (of 19) = 0.160661 fft 11: mflops = 31.4173 (norm. = 0.246094), norm. avg. (of 19) = 0.247766 fft 12: mflops = 30.0105 (norm. = 0.235075), norm. avg. (of 19) = 0.222476 fft 13: mflops = -1 (norm. = -0.00783308), norm. avg. (of 12) = 0.320393 fft 14: mflops = -1 (norm. = -0.00783308), norm. avg. (of 12) = 0.254762 fft 15: mflops = 9.43993 (norm. = 0.0739437), norm. avg. (of 19) = 0.0505605 fft 16: mflops = 60.021 (norm. = 0.470149), norm. avg. (of 19) = 0.317873 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) Maximum array size N = 2097152 Benchmarking FFTs: 0. FFTW 1. HARM 2. HARM (f2c) 3. NR (C) 4. NR (F) 5. PDA 6. PDA (f2c) 7. Singleton 8. Singleton (f2c) 9. Temperton 10. Temperton (f2c) Computing normalized averages (11 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.09996 s, 131072 iters, t-(init.)=1.04996 s t(norm)=0.0208608, mflops=239.684 (err=1.9e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. Skipping fft (all dimensions must be > 4 for HARM). 3. 
NR (C): elapsed time t=1.39994 s, 65536 iters, t-(init.)=1.38328 s t(norm)=0.0549665, mflops=90.9644 (err=2.4e-16) 4. NR (F): elapsed time t=1.89992 s, 65536 iters, t-(init.)=1.88326 s t(norm)=0.0748339, mflops=66.8146 (err=2.4e-16) 5. PDA: elapsed time t=1.33328 s, 32768 iters, t-(init.)=1.33328 s t(norm)=0.10596, mflops=47.1878 (err=2.8e-16) 6. PDA (f2c): elapsed time t=1.7166 s, 32768 iters, t-(init.)=1.69993 s t(norm)=0.135098, mflops=37.01 (err=2.8e-16) 7. Singleton: elapsed time t=1.43328 s, 131072 iters, t-(init.)=1.39994 s t(norm)=0.0278144, mflops=179.763 (err=1.9e-16) 8. Singleton (f2c): elapsed time t=1.48327 s, 131072 iters, t-(init.)=1.44994 s t(norm)=0.0288078, mflops=173.564 (err=1.9e-16) 9. Temperton: elapsed time t=1.43328 s, 131072 iters, t-(init.)=1.39994 s t(norm)=0.0278144, mflops=179.763 (err=1.9e-16) 10. Temperton (f2c): elapsed time t=1.11662 s, 65536 iters, t-(init.)=1.08329 s t(norm)=0.0430461, mflops=116.155 (err=1.9e-16) Top mflops for N=64 = 239.684 Normalized results and averages for N=64: fft 0: mflops = 239.684 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.00417216), norm. avg. (of 0) = -1 fft 2: mflops = -1 (norm. = -0.00417216), norm. avg. (of 0) = -1 fft 3: mflops = 90.9644 (norm. = 0.379518), norm. avg. (of 1) = 0.379518 fft 4: mflops = 66.8146 (norm. = 0.278761), norm. avg. (of 1) = 0.278761 fft 5: mflops = 47.1878 (norm. = 0.196875), norm. avg. (of 1) = 0.196875 fft 6: mflops = 37.01 (norm. = 0.154412), norm. avg. (of 1) = 0.154412 fft 7: mflops = 179.763 (norm. = 0.75), norm. avg. (of 1) = 0.75 fft 8: mflops = 173.564 (norm. = 0.724138), norm. avg. (of 1) = 0.724138 fft 9: mflops = 179.763 (norm. = 0.75), norm. avg. (of 1) = 0.75 fft 10: mflops = 116.155 (norm. = 0.484615), norm. avg. (of 1) = 0.484615 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.46661 s, 32768 iters, t-(init.)=1.38328 s t(norm)=0.00916109, mflops=545.787 (err=3.6e-16) 1. 
HARM: elapsed time t=1.94992 s, 16384 iters, t-(init.)=1.91659 s t(norm)=0.0253861, mflops=196.958 (err=3.8e-16) 2. HARM (f2c): elapsed time t=1.16662 s, 8192 iters, t-(init.)=1.14995 s t(norm)=0.0304634, mflops=164.132 (err=3.8e-16) 3. NR (C): elapsed time t=1.59994 s, 8192 iters, t-(init.)=1.58327 s t(norm)=0.0419423, mflops=119.211 (err=3.6e-16) 4. NR (F): elapsed time t=1.14995 s, 4096 iters, t-(init.)=1.13329 s t(norm)=0.0600438, mflops=83.2726 (err=3.6e-16) 5. PDA: elapsed time t=1.13329 s, 4096 iters, t-(init.)=1.13329 s t(norm)=0.0600438, mflops=83.2726 (err=3.0e-16) 6. PDA (f2c): elapsed time t=1.59994 s, 4096 iters, t-(init.)=1.58327 s t(norm)=0.0838847, mflops=59.6057 (err=3.0e-16) 7. Singleton: elapsed time t=1.29995 s, 8192 iters, t-(init.)=1.28328 s t(norm)=0.0339954, mflops=147.079 (err=3.5e-16) 8. Singleton (f2c): elapsed time t=1.69993 s, 8192 iters, t-(init.)=1.6666 s t(norm)=0.0441498, mflops=113.251 (err=3.5e-16) 9. Temperton: elapsed time t=1.34995 s, 16384 iters, t-(init.)=1.31661 s t(norm)=0.0174392, mflops=286.711 (err=1.3e-08) 10. Temperton (f2c): elapsed time t=1.06662 s, 8192 iters, t-(init.)=1.04996 s t(norm)=0.0278144, mflops=179.763 (err=3.3e-16) Top mflops for N=512 = 545.787 Normalized results and averages for N=512: fft 0: mflops = 545.787 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 196.958 (norm. = 0.36087), norm. avg. (of 1) = 0.36087 fft 2: mflops = 164.132 (norm. = 0.300725), norm. avg. (of 1) = 0.300725 fft 3: mflops = 119.211 (norm. = 0.218421), norm. avg. (of 2) = 0.29897 fft 4: mflops = 83.2726 (norm. = 0.152574), norm. avg. (of 2) = 0.215667 fft 5: mflops = 83.2726 (norm. = 0.152574), norm. avg. (of 2) = 0.174724 fft 6: mflops = 59.6057 (norm. = 0.109211), norm. avg. (of 2) = 0.131811 fft 7: mflops = 147.079 (norm. = 0.269481), norm. avg. (of 2) = 0.50974 fft 8: mflops = 113.251 (norm. = 0.2075), norm. avg. (of 2) = 0.465819 fft 9: mflops = 286.711 (norm. = 0.525316), norm. avg. 
(of 2) = 0.637658 fft 10: mflops = 179.763 (norm. = 0.329365), norm. avg. (of 2) = 0.40699 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.11662 s, 2048 iters, t-(init.)=1.06662 s t(norm)=0.010596, mflops=471.878 (err=4.2e-16) 1. HARM: elapsed time t=1.16662 s, 1024 iters, t-(init.)=1.14995 s t(norm)=0.0228475, mflops=218.842 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.39994 s, 1024 iters, t-(init.)=1.36661 s t(norm)=0.0271521, mflops=184.148 (err=4.0e-16) 3. NR (C): elapsed time t=1.31661 s, 512 iters, t-(init.)=1.31661 s t(norm)=0.0523175, mflops=95.5702 (err=5.0e-16) 4. NR (F): elapsed time t=1.81659 s, 512 iters, t-(init.)=1.81659 s t(norm)=0.072185, mflops=69.2665 (err=5.0e-16) 5. PDA: elapsed time t=1.6666 s, 1024 iters, t-(init.)=1.64993 s t(norm)=0.0327812, mflops=152.526 (err=4.0e-16) 6. PDA (f2c): elapsed time t=1.31661 s, 512 iters, t-(init.)=1.29995 s t(norm)=0.0516553, mflops=96.7955 (err=4.0e-16) 7. Singleton: elapsed time t=1.16662 s, 1024 iters, t-(init.)=1.14995 s t(norm)=0.0228475, mflops=218.842 (err=4.1e-16) 8. Singleton (f2c): elapsed time t=1.59994 s, 1024 iters, t-(init.)=1.58327 s t(norm)=0.0314567, mflops=158.948 (err=4.1e-16) 9. Temperton: elapsed time t=1.48327 s, 2048 iters, t-(init.)=1.44994 s t(norm)=0.0144039, mflops=347.129 (err=6.3e-08) 10. Temperton (f2c): elapsed time t=1.01663 s, 1024 iters, t-(init.)=0.99996 s t(norm)=0.0198674, mflops=251.668 (err=4.6e-16) Top mflops for N=4096 = 471.878 Normalized results and averages for N=4096: fft 0: mflops = 471.878 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 218.842 (norm. = 0.463768), norm. avg. (of 2) = 0.412319 fft 2: mflops = 184.148 (norm. = 0.390244), norm. avg. (of 2) = 0.345484 fft 3: mflops = 95.5702 (norm. = 0.202532), norm. avg. (of 3) = 0.266824 fft 4: mflops = 69.2665 (norm. = 0.146789), norm. avg. (of 3) = 0.192708 fft 5: mflops = 152.526 (norm. = 0.323232), norm. avg. (of 3) = 0.224227 fft 6: mflops = 96.7955 (norm. = 0.205128), norm. 
avg. (of 3) = 0.15625 fft 7: mflops = 218.842 (norm. = 0.463768), norm. avg. (of 3) = 0.494416 fft 8: mflops = 158.948 (norm. = 0.336842), norm. avg. (of 3) = 0.422827 fft 9: mflops = 347.129 (norm. = 0.735632), norm. avg. (of 3) = 0.670316 fft 10: mflops = 251.668 (norm. = 0.533333), norm. avg. (of 3) = 0.449105 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.74993 s, 128 iters, t-(init.)=1.58327 s t(norm)=0.0251654, mflops=198.686 (err=5.2e-16) 1. HARM: elapsed time t=1.53327 s, 64 iters, t-(init.)=1.43328 s t(norm)=0.0455626, mflops=109.739 (err=5.3e-16) 2. HARM (f2c): elapsed time t=1.68327 s, 64 iters, t-(init.)=1.59994 s t(norm)=0.0508606, mflops=98.3079 (err=5.3e-16) 3. NR (C): elapsed time t=1.23328 s, 16 iters, t-(init.)=1.21662 s t(norm)=0.154701, mflops=32.3204 (err=5.9e-16) 4. NR (F): elapsed time t=1.39994 s, 16 iters, t-(init.)=1.38328 s t(norm)=0.175893, mflops=28.4264 (err=5.9e-16) 5. PDA: elapsed time t=1.08329 s, 32 iters, t-(init.)=1.03329 s t(norm)=0.0656949, mflops=76.1094 (err=4.2e-16) 6. PDA (f2c): elapsed time t=1.41661 s, 32 iters, t-(init.)=1.38328 s t(norm)=0.0879464, mflops=56.8528 (err=4.2e-16) 7. Singleton: elapsed time t=1.53327 s, 32 iters, t-(init.)=1.49994 s t(norm)=0.0953636, mflops=52.4309 (err=5.3e-16) 8. Singleton (f2c): elapsed time t=1.68327 s, 32 iters, t-(init.)=1.63327 s t(norm)=0.10384, mflops=48.1508 (err=5.3e-16) 9. Temperton: elapsed time t=1.74993 s, 64 iters, t-(init.)=1.6666 s t(norm)=0.0529798, mflops=94.3756 (err=9.6e-08) 10. Temperton (f2c): elapsed time t=1.88326 s, 64 iters, t-(init.)=1.79993 s t(norm)=0.0572182, mflops=87.3848 (err=4.7e-16) Top mflops for N=32768 = 198.686 Normalized results and averages for N=32768: fft 0: mflops = 198.686 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 109.739 (norm. = 0.552326), norm. avg. (of 3) = 0.458988 fft 2: mflops = 98.3079 (norm. = 0.494792), norm. avg. (of 3) = 0.395253 fft 3: mflops = 32.3204 (norm. = 0.162671), norm. avg. 
(of 4) = 0.240786 fft 4: mflops = 28.4264 (norm. = 0.143072), norm. avg. (of 4) = 0.180299 fft 5: mflops = 76.1094 (norm. = 0.383065), norm. avg. (of 4) = 0.263936 fft 6: mflops = 56.8528 (norm. = 0.286145), norm. avg. (of 4) = 0.188724 fft 7: mflops = 52.4309 (norm. = 0.263889), norm. avg. (of 4) = 0.436784 fft 8: mflops = 48.1508 (norm. = 0.242347), norm. avg. (of 4) = 0.377707 fft 9: mflops = 94.3756 (norm. = 0.475), norm. avg. (of 4) = 0.621487 fft 10: mflops = 87.3848 (norm. = 0.439815), norm. avg. (of 4) = 0.446782 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.34995 s, 8 iters, t-(init.)=1.19995 s t(norm)=0.0317879, mflops=157.293 (err=1.2e-15) 1. HARM: elapsed time t=1.24995 s, 4 iters, t-(init.)=1.18329 s t(norm)=0.0626927, mflops=79.754 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.34995 s, 4 iters, t-(init.)=1.26662 s t(norm)=0.0671077, mflops=74.5071 (err=1.2e-15) 3. NR (C): elapsed time t=1.31661 s, 1 iters, t-(init.)=1.29995 s t(norm)=0.275495, mflops=18.1492 (err=1.2e-15) 4. NR (F): elapsed time t=1.41661 s, 1 iters, t-(init.)=1.39994 s t(norm)=0.296687, mflops=16.8528 (err=1.2e-15) 5. PDA: elapsed time t=1.24995 s, 4 iters, t-(init.)=1.18329 s t(norm)=0.0626927, mflops=79.754 (err=1.3e-15) 6. PDA (f2c): elapsed time t=1.6166 s, 4 iters, t-(init.)=1.54994 s t(norm)=0.0821187, mflops=60.8875 (err=1.3e-15) 7. Singleton: elapsed time t=1.48327 s, 2 iters, t-(init.)=1.44994 s t(norm)=0.153641, mflops=32.5433 (err=1.7e-15) 8. Singleton (f2c): elapsed time t=1.5666 s, 2 iters, t-(init.)=1.53327 s t(norm)=0.162471, mflops=30.7747 (err=1.7e-15) 9. Temperton: elapsed time t=1.11662 s, 4 iters, t-(init.)=1.04996 s t(norm)=0.0556288, mflops=89.8815 (err=1.3e-07) 10. Temperton (f2c): elapsed time t=1.16662 s, 4 iters, t-(init.)=1.09996 s t(norm)=0.0582778, mflops=85.796 (err=1.3e-15) Top mflops for N=262144 = 157.293 Normalized results and averages for N=262144: fft 0: mflops = 157.293 (norm. = 1), norm. avg. 
(of 5) = 1 fft 1: mflops = 79.754 (norm. = 0.507042), norm. avg. (of 4) = 0.471001 fft 2: mflops = 74.5071 (norm. = 0.473684), norm. avg. (of 4) = 0.414861 fft 3: mflops = 18.1492 (norm. = 0.115385), norm. avg. (of 5) = 0.215705 fft 4: mflops = 16.8528 (norm. = 0.107143), norm. avg. (of 5) = 0.165668 fft 5: mflops = 79.754 (norm. = 0.507042), norm. avg. (of 5) = 0.312558 fft 6: mflops = 60.8875 (norm. = 0.387097), norm. avg. (of 5) = 0.228398 fft 7: mflops = 32.5433 (norm. = 0.206897), norm. avg. (of 5) = 0.390807 fft 8: mflops = 30.7747 (norm. = 0.195652), norm. avg. (of 5) = 0.341296 fft 9: mflops = 89.8815 (norm. = 0.571429), norm. avg. (of 5) = 0.611475 fft 10: mflops = 85.796 (norm. = 0.545455), norm. avg. (of 5) = 0.466517 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.6666 s, 4 iters, t-(init.)=1.49994 s t(norm)=0.0376435, mflops=132.825 (err=1.2e-15) 1. HARM: elapsed time t=1.58327 s, 2 iters, t-(init.)=1.49994 s t(norm)=0.0752871, mflops=66.4125 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.64993 s, 2 iters, t-(init.)=1.5666 s t(norm)=0.0786332, mflops=63.5864 (err=1.2e-15) 3. NR (C): elapsed time t=3.31653 s, 1 iters, t-(init.)=3.2832 s t(norm)=0.32959, mflops=15.1704 (err=1.3e-15) 4. NR (F): elapsed time t=3.4332 s, 1 iters, t-(init.)=3.3832 s t(norm)=0.339628, mflops=14.722 (err=1.3e-15) 5. PDA: elapsed time t=1.5666 s, 2 iters, t-(init.)=1.48327 s t(norm)=0.0744505, mflops=67.1587 (err=1.2e-15) 6. PDA (f2c): elapsed time t=1.94992 s, 2 iters, t-(init.)=1.88326 s t(norm)=0.0945271, mflops=52.8949 (err=1.2e-15) 7. Singleton: elapsed time t=2.01659 s, 1 iters, t-(init.)=1.96659 s t(norm)=0.197419, mflops=25.3268 (err=1.7e-15) 8. Singleton (f2c): elapsed time t=2.08325 s, 1 iters, t-(init.)=2.04992 s t(norm)=0.205785, mflops=24.2972 (err=1.7e-15) 9. Temperton: elapsed time t=1.34995 s, 2 iters, t-(init.)=1.26662 s t(norm)=0.0635757, mflops=78.6463 (err=1.5e-07) 10. 
Temperton (f2c): elapsed time t=1.39994 s, 2 iters, t-(init.)=1.31661 s t(norm)=0.0660853, mflops=75.6598 (err=1.3e-15) Top mflops for N=524288 = 132.825 Normalized results and averages for N=524288: fft 0: mflops = 132.825 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 66.4125 (norm. = 0.5), norm. avg. (of 5) = 0.476801 fft 2: mflops = 63.5864 (norm. = 0.478723), norm. avg. (of 5) = 0.427634 fft 3: mflops = 15.1704 (norm. = 0.114213), norm. avg. (of 6) = 0.19879 fft 4: mflops = 14.722 (norm. = 0.110837), norm. avg. (of 6) = 0.156529 fft 5: mflops = 67.1587 (norm. = 0.505618), norm. avg. (of 6) = 0.344734 fft 6: mflops = 52.8949 (norm. = 0.39823), norm. avg. (of 6) = 0.256704 fft 7: mflops = 25.3268 (norm. = 0.190678), norm. avg. (of 6) = 0.357452 fft 8: mflops = 24.2972 (norm. = 0.182927), norm. avg. (of 6) = 0.314901 fft 9: mflops = 78.6463 (norm. = 0.592105), norm. avg. (of 6) = 0.608247 fft 10: mflops = 75.6598 (norm. = 0.56962), norm. avg. (of 6) = 0.483701 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=1.84993 s, 2 iters, t-(init.)=1.68327 s t(norm)=0.0401322, mflops=124.588 (err=2.0e-15) 1. HARM: elapsed time t=1.7166 s, 1 iters, t-(init.)=1.63327 s t(norm)=0.0778803, mflops=64.2011 (err=1.9e-15) 2. HARM (f2c): elapsed time t=1.79993 s, 1 iters, t-(init.)=1.7166 s t(norm)=0.0818538, mflops=61.0845 (err=1.9e-15) 3. NR (C): elapsed time t=7.5497 s, 1 iters, t-(init.)=7.46637 s t(norm)=0.356024, mflops=14.044 (err=1.9e-15) 4. NR (F): elapsed time t=7.81635 s, 1 iters, t-(init.)=7.73302 s t(norm)=0.368739, mflops=13.5597 (err=1.9e-15) 5. PDA: elapsed time t=1.6166 s, 1 iters, t-(init.)=1.51661 s t(norm)=0.0723174, mflops=69.1396 (err=2.0e-15) 6. PDA (f2c): elapsed time t=1.99992 s, 1 iters, t-(init.)=1.91659 s t(norm)=0.0913901, mflops=54.7105 (err=2.0e-15) 7. Singleton: elapsed time t=4.48315 s, 1 iters, t-(init.)=4.39982 s t(norm)=0.2098, mflops=23.8322 (err=2.8e-15) 8. 
Singleton (f2c): elapsed time t=4.61648 s, 1 iters, t-(init.)=4.53315 s t(norm)=0.216158, mflops=23.1313 (err=2.8e-15) 9. Skipping fft (Temperton can't handle dimensions > 256). 10. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 124.588 Normalized results and averages for N=1048576: fft 0: mflops = 124.588 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 64.2011 (norm. = 0.515306), norm. avg. (of 6) = 0.483219 fft 2: mflops = 61.0845 (norm. = 0.490291), norm. avg. (of 6) = 0.438077 fft 3: mflops = 14.044 (norm. = 0.112723), norm. avg. (of 7) = 0.186495 fft 4: mflops = 13.5597 (norm. = 0.108836), norm. avg. (of 7) = 0.149716 fft 5: mflops = 69.1396 (norm. = 0.554945), norm. avg. (of 7) = 0.374764 fft 6: mflops = 54.7105 (norm. = 0.43913), norm. avg. (of 7) = 0.282765 fft 7: mflops = 23.8322 (norm. = 0.191288), norm. avg. (of 7) = 0.333714 fft 8: mflops = 23.1313 (norm. = 0.185662), norm. avg. (of 7) = 0.296438 fft 9: mflops = -1 (norm. = -0.00802644), norm. avg. (of 6) = 0.608247 fft 10: mflops = -1 (norm. = -0.00802644), norm. avg. (of 6) = 0.483701 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=2.08325 s, 1 iters, t-(init.)=1.89992 s t(norm)=0.0431407, mflops=115.9 (err=7.2e-16) 1. HARM: elapsed time t=3.98317 s, 1 iters, t-(init.)=3.79985 s t(norm)=0.0862814, mflops=57.9499 (err=7.0e-16) 2. HARM (f2c): elapsed time t=4.09984 s, 1 iters, t-(init.)=3.93318 s t(norm)=0.0893088, mflops=55.9855 (err=7.0e-16) 3. NR (C): elapsed time t=16.3827 s, 1 iters, t-(init.)=16.1994 s t(norm)=0.367831, mflops=13.5932 (err=7.4e-16) 4. NR (F): elapsed time t=16.966 s, 1 iters, t-(init.)=16.7827 s t(norm)=0.381076, mflops=13.1207 (err=7.4e-16) 5. PDA: elapsed time t=3.4332 s, 1 iters, t-(init.)=3.26654 s t(norm)=0.0741717, mflops=67.4112 (err=7.1e-16) 6. PDA (f2c): elapsed time t=4.28316 s, 1 iters, t-(init.)=4.1165 s t(norm)=0.0934715, mflops=53.4923 (err=7.1e-16) 7. 
Singleton: elapsed time t=12.9162 s, 1 iters, t-(init.)=12.7495 s t(norm)=0.289497, mflops=17.2714 (err=8.4e-16) 8. Singleton (f2c): elapsed time t=13.4661 s, 1 iters, t-(init.)=13.2828 s t(norm)=0.301606, mflops=16.5779 (err=8.4e-16) 9. Temperton: elapsed time t=4.49982 s, 1 iters, t-(init.)=4.33316 s t(norm)=0.098391, mflops=50.8176 (err=1.5e-07) 10. Temperton (f2c): elapsed time t=4.14983 s, 1 iters, t-(init.)=3.96651 s t(norm)=0.0900656, mflops=55.5151 (err=7.4e-16) Top mflops for N=2097152 = 115.9 Normalized results and averages for N=2097152: fft 0: mflops = 115.9 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 57.9499 (norm. = 0.5), norm. avg. (of 7) = 0.485616 fft 2: mflops = 55.9855 (norm. = 0.483051), norm. avg. (of 7) = 0.444501 fft 3: mflops = 13.5932 (norm. = 0.117284), norm. avg. (of 8) = 0.177843 fft 4: mflops = 13.1207 (norm. = 0.113208), norm. avg. (of 8) = 0.145152 fft 5: mflops = 67.4112 (norm. = 0.581633), norm. avg. (of 8) = 0.400623 fft 6: mflops = 53.4923 (norm. = 0.461538), norm. avg. (of 8) = 0.305111 fft 7: mflops = 17.2714 (norm. = 0.14902), norm. avg. (of 8) = 0.310627 fft 8: mflops = 16.5779 (norm. = 0.143036), norm. avg. (of 8) = 0.277263 fft 9: mflops = 50.8176 (norm. = 0.438462), norm. avg. (of 7) = 0.583992 fft 10: mflops = 55.5151 (norm. = 0.478992), norm. avg. 
(of 7) = 0.483028
------------------------------------------------------
@@@@ bench.3d.np2.log
Benchmarking for sizes:
5x5x5 (0.0022583 MB)
6x6x6 (0.00369263 MB)
7x7x7 (0.00567627 MB)
9x9x9 (0.0116577 MB)
10x10x10 (0.0158386 MB)
11x11x11 (0.0209351 MB)
12x12x12 (0.0270386 MB)
13x13x13 (0.0342407 MB)
14x14x14 (0.0426331 MB)
15x15x15 (0.0523071 MB)
24x25x28 (0.257751 MB)
48x48x48 (1.68982 MB)
49x49x49 (1.79755 MB)
60x60x60 (3.29877 MB)
72x60x56 (3.69482 MB)
75x75x75 (6.44086 MB)
80x80x80 (7.81628 MB)
84x84x84 (9.04791 MB)
96x96x96 (13.5045 MB)
105x105x105 (17.6689 MB)
112x112x112 (21.4427 MB)
120x120x120 (26.3728 MB)
144x144x144 (45.5692 MB)
Maximum array size N = 2985984
Benchmarking FFTs:
0. FFTW
1. PDA
2. PDA (f2c)
3. Singleton
4. Singleton (f2c)
5. Temperton
6. Temperton (f2c)
Computing normalized averages (7 transforms).
Benchmarking for array size = 5x5x5:
0. FFTW: elapsed time t=1.01663 s, 65536 iters, t-(init.)=0.966628 s t(norm)=0.0169395, mflops=295.169 (err=3.0e-16)
1. PDA: elapsed time t=1.19995 s, 16384 iters, t-(init.)=1.19995 s t(norm)=0.0841132, mflops=59.4437 (err=2.3e-16)
2. PDA (f2c): elapsed time t=1.7666 s, 16384 iters, t-(init.)=1.74993 s t(norm)=0.122665, mflops=40.7614 (err=2.3e-16)
3. Singleton: elapsed time t=1.89992 s, 131072 iters, t-(init.)=1.81659 s t(norm)=0.0159172, mflops=314.125 (err=3.1e-16)
4. Singleton (f2c): elapsed time t=1.83326 s, 131072 iters, t-(init.)=1.7666 s t(norm)=0.0154792, mflops=323.015 (err=3.1e-16)
5. Temperton: elapsed time t=1.49994 s, 65536 iters, t-(init.)=1.46661 s t(norm)=0.0257012, mflops=194.543 (err=5.3e-16)
6. Temperton (f2c): elapsed time t=1.79993 s, 65536 iters, t-(init.)=1.7666 s t(norm)=0.0309583, mflops=161.508 (err=2.4e-16)
Top mflops for N=125 = 323.015
Normalized results and averages for N=125:
fft 0: mflops = 295.169 (norm. = 0.913793), norm. avg. (of 1) = 0.913793
fft 1: mflops = 59.4437 (norm. = 0.184028), norm. avg. (of 1) = 0.184028
fft 2: mflops = 40.7614 (norm. = 0.12619), norm. avg.
(of 1) = 0.12619 fft 3: mflops = 314.125 (norm. = 0.972477), norm. avg. (of 1) = 0.972477 fft 4: mflops = 323.015 (norm. = 1), norm. avg. (of 1) = 1 fft 5: mflops = 194.543 (norm. = 0.602273), norm. avg. (of 1) = 0.602273 fft 6: mflops = 161.508 (norm. = 0.5), norm. avg. (of 1) = 0.5 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.36661 s, 65536 iters, t-(init.)=1.29995 s t(norm)=0.0118418, mflops=422.234 (err=2.9e-16) 1. PDA: elapsed time t=1.29995 s, 8192 iters, t-(init.)=1.29995 s t(norm)=0.0947342, mflops=52.7793 (err=3.6e-16) 2. PDA (f2c): elapsed time t=1.6666 s, 8192 iters, t-(init.)=1.64993 s t(norm)=0.12024, mflops=41.5837 (err=3.6e-16) 3. Singleton: elapsed time t=1.11662 s, 16384 iters, t-(init.)=1.09996 s t(norm)=0.0400799, mflops=124.751 (err=2.9e-16) 4. Singleton (f2c): elapsed time t=1.24995 s, 16384 iters, t-(init.)=1.23328 s t(norm)=0.044938, mflops=111.264 (err=2.9e-16) 5. Temperton: elapsed time t=1.58327 s, 32768 iters, t-(init.)=1.54994 s t(norm)=0.0282381, mflops=177.066 (err=2.1e-15) 6. Temperton (f2c): elapsed time t=1.86659 s, 32768 iters, t-(init.)=1.83326 s t(norm)=0.0333999, mflops=149.701 (err=3.1e-16) Top mflops for N=216 = 422.234 Normalized results and averages for N=216: fft 0: mflops = 422.234 (norm. = 1), norm. avg. (of 2) = 0.956897 fft 1: mflops = 52.7793 (norm. = 0.125), norm. avg. (of 2) = 0.154514 fft 2: mflops = 41.5837 (norm. = 0.0984848), norm. avg. (of 2) = 0.112338 fft 3: mflops = 124.751 (norm. = 0.295455), norm. avg. (of 2) = 0.633966 fft 4: mflops = 111.264 (norm. = 0.263514), norm. avg. (of 2) = 0.631757 fft 5: mflops = 177.066 (norm. = 0.419355), norm. avg. (of 2) = 0.510814 fft 6: mflops = 149.701 (norm. = 0.354545), norm. avg. (of 2) = 0.427273 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.41661 s, 32768 iters, t-(init.)=1.36661 s t(norm)=0.0144372, mflops=346.328 (err=3.8e-16) 1. 
PDA: elapsed time t=1.43328 s, 2048 iters, t-(init.)=1.41661 s t(norm)=0.239446, mflops=20.8815 (err=4.8e-16) 2. PDA (f2c): elapsed time t=1.39994 s, 2048 iters, t-(init.)=1.39994 s t(norm)=0.236629, mflops=21.1301 (err=4.3e-16) 3. Singleton: elapsed time t=1.68327 s, 16384 iters, t-(init.)=1.64993 s t(norm)=0.0348605, mflops=143.429 (err=5.8e-16) 4. Singleton (f2c): elapsed time t=1.73326 s, 16384 iters, t-(init.)=1.7166 s t(norm)=0.036269, mflops=137.859 (err=5.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 346.328 Normalized results and averages for N=343: fft 0: mflops = 346.328 (norm. = 1), norm. avg. (of 3) = 0.971264 fft 1: mflops = 20.8815 (norm. = 0.0602941), norm. avg. (of 3) = 0.123107 fft 2: mflops = 21.1301 (norm. = 0.0610119), norm. avg. (of 3) = 0.0952291 fft 3: mflops = 143.429 (norm. = 0.414141), norm. avg. (of 3) = 0.560691 fft 4: mflops = 137.859 (norm. = 0.398058), norm. avg. (of 3) = 0.553857 fft 5: mflops = -1 (norm. = -0.00288744), norm. avg. (of 2) = 0.510814 fft 6: mflops = -1 (norm. = -0.00288744), norm. avg. (of 2) = 0.427273 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.53327 s, 16384 iters, t-(init.)=1.48327 s t(norm)=0.0130588, mflops=382.883 (err=5.3e-16) 1. PDA: elapsed time t=1.59994 s, 4096 iters, t-(init.)=1.58327 s t(norm)=0.0557567, mflops=89.6753 (err=4.9e-16) 2. PDA (f2c): elapsed time t=1.26662 s, 2048 iters, t-(init.)=1.24995 s t(norm)=0.0880369, mflops=56.7943 (err=4.9e-16) 3. Singleton: elapsed time t=1.98325 s, 8192 iters, t-(init.)=1.96659 s t(norm)=0.0346279, mflops=144.392 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.14995 s, 4096 iters, t-(init.)=1.13329 s t(norm)=0.0399101, mflops=125.282 (err=4.5e-16) 5. Temperton: elapsed time t=1.11662 s, 8192 iters, t-(init.)=1.08329 s t(norm)=0.0190747, mflops=262.128 (err=6.0e-08) 6. 
Temperton (f2c): elapsed time t=1.34995 s, 8192 iters, t-(init.)=1.33328 s t(norm)=0.0234765, mflops=212.979 (err=5.1e-16) Top mflops for N=729 = 382.883 Normalized results and averages for N=729: fft 0: mflops = 382.883 (norm. = 1), norm. avg. (of 4) = 0.978448 fft 1: mflops = 89.6753 (norm. = 0.234211), norm. avg. (of 4) = 0.150883 fft 2: mflops = 56.7943 (norm. = 0.148333), norm. avg. (of 4) = 0.108505 fft 3: mflops = 144.392 (norm. = 0.377119), norm. avg. (of 4) = 0.514798 fft 4: mflops = 125.282 (norm. = 0.327206), norm. avg. (of 4) = 0.497194 fft 5: mflops = 262.128 (norm. = 0.684615), norm. avg. (of 3) = 0.568748 fft 6: mflops = 212.979 (norm. = 0.55625), norm. avg. (of 3) = 0.470265 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.04996 s, 8192 iters, t-(init.)=1.01663 s t(norm)=0.0124526, mflops=401.523 (err=4.0e-16) 1. PDA: elapsed time t=1.14995 s, 2048 iters, t-(init.)=1.14995 s t(norm)=0.0563429, mflops=88.7424 (err=4.3e-16) 2. PDA (f2c): elapsed time t=1.46661 s, 2048 iters, t-(init.)=1.46661 s t(norm)=0.0718576, mflops=69.5821 (err=4.3e-16) 3. Singleton: elapsed time t=1.29995 s, 4096 iters, t-(init.)=1.28328 s t(norm)=0.0314377, mflops=159.045 (err=4.6e-16) 4. Singleton (f2c): elapsed time t=1.49994 s, 4096 iters, t-(init.)=1.48327 s t(norm)=0.0363371, mflops=137.601 (err=4.6e-16) 5. Temperton: elapsed time t=1.6166 s, 8192 iters, t-(init.)=1.58327 s t(norm)=0.0193934, mflops=257.82 (err=6.3e-16) 6. Temperton (f2c): elapsed time t=1.14995 s, 4096 iters, t-(init.)=1.13329 s t(norm)=0.0277632, mflops=180.095 (err=3.4e-16) Top mflops for N=1000 = 401.523 Normalized results and averages for N=1000: fft 0: mflops = 401.523 (norm. = 1), norm. avg. (of 5) = 0.982759 fft 1: mflops = 88.7424 (norm. = 0.221014), norm. avg. (of 5) = 0.164909 fft 2: mflops = 69.5821 (norm. = 0.173295), norm. avg. (of 5) = 0.121463 fft 3: mflops = 159.045 (norm. = 0.396104), norm. avg. (of 5) = 0.491059 fft 4: mflops = 137.601 (norm. = 0.342697), norm. avg. 
(of 5) = 0.466295 fft 5: mflops = 257.82 (norm. = 0.642105), norm. avg. (of 4) = 0.587087 fft 6: mflops = 180.095 (norm. = 0.448529), norm. avg. (of 4) = 0.464831 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.03329 s, 4096 iters, t-(init.)=1.01663 s t(norm)=0.0179679, mflops=278.274 (err=4.3e-16) 1. PDA: elapsed time t=1.7666 s, 512 iters, t-(init.)=1.7666 s t(norm)=0.249783, mflops=20.0174 (err=5.6e-16) 2. PDA (f2c): elapsed time t=1.64993 s, 512 iters, t-(init.)=1.64993 s t(norm)=0.233288, mflops=21.4327 (err=5.6e-16) 3. Singleton: elapsed time t=1.89992 s, 4096 iters, t-(init.)=1.86659 s t(norm)=0.0329902, mflops=151.56 (err=6.6e-16) 4. Singleton (f2c): elapsed time t=1.7166 s, 4096 iters, t-(init.)=1.68327 s t(norm)=0.0297501, mflops=168.067 (err=6.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 278.274 Normalized results and averages for N=1331: fft 0: mflops = 278.274 (norm. = 1), norm. avg. (of 6) = 0.985632 fft 1: mflops = 20.0174 (norm. = 0.071934), norm. avg. (of 6) = 0.149413 fft 2: mflops = 21.4327 (norm. = 0.0770202), norm. avg. (of 6) = 0.114056 fft 3: mflops = 151.56 (norm. = 0.544643), norm. avg. (of 6) = 0.49999 fft 4: mflops = 168.067 (norm. = 0.60396), norm. avg. (of 6) = 0.489239 fft 5: mflops = -1 (norm. = -0.00359358), norm. avg. (of 4) = 0.587087 fft 6: mflops = -1 (norm. = -0.00359358), norm. avg. (of 4) = 0.464831 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.03329 s, 4096 iters, t-(init.)=0.99996 s t(norm)=0.0131363, mflops=380.625 (err=3.9e-16) 1. PDA: elapsed time t=1.53327 s, 2048 iters, t-(init.)=1.53327 s t(norm)=0.0402847, mflops=124.117 (err=3.9e-16) 2. PDA (f2c): elapsed time t=1.29995 s, 1024 iters, t-(init.)=1.28328 s t(norm)=0.067433, mflops=74.1477 (err=3.8e-16) 3. Singleton: elapsed time t=1.41661 s, 2048 iters, t-(init.)=1.41661 s t(norm)=0.0372195, mflops=134.338 (err=4.0e-16) 4. 
Singleton (f2c): elapsed time t=1.78326 s, 2048 iters, t-(init.)=1.7666 s t(norm)=0.0464149, mflops=107.724 (err=4.0e-16) 5. Temperton: elapsed time t=1.08329 s, 4096 iters, t-(init.)=1.04996 s t(norm)=0.0137931, mflops=362.5 (err=1.8e-15) 6. Temperton (f2c): elapsed time t=1.51661 s, 4096 iters, t-(init.)=1.48327 s t(norm)=0.0194855, mflops=256.601 (err=3.9e-16) Top mflops for N=1728 = 380.625 Normalized results and averages for N=1728: fft 0: mflops = 380.625 (norm. = 1), norm. avg. (of 7) = 0.987685 fft 1: mflops = 124.117 (norm. = 0.326087), norm. avg. (of 7) = 0.174653 fft 2: mflops = 74.1477 (norm. = 0.194805), norm. avg. (of 7) = 0.125592 fft 3: mflops = 134.338 (norm. = 0.352941), norm. avg. (of 7) = 0.478983 fft 4: mflops = 107.724 (norm. = 0.283019), norm. avg. (of 7) = 0.459779 fft 5: mflops = 362.5 (norm. = 0.952381), norm. avg. (of 5) = 0.660146 fft 6: mflops = 256.601 (norm. = 0.674157), norm. avg. (of 5) = 0.506696 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.93326 s, 4096 iters, t-(init.)=1.89992 s t(norm)=0.0190183, mflops=262.905 (err=4.5e-16) 1. PDA: elapsed time t=1.6166 s, 256 iters, t-(init.)=1.6166 s t(norm)=0.258916, mflops=19.3113 (err=9.2e-16) 2. PDA (f2c): elapsed time t=1.49994 s, 256 iters, t-(init.)=1.48327 s t(norm)=0.237562, mflops=21.0471 (err=9.2e-16) 3. Singleton: elapsed time t=1.88326 s, 2048 iters, t-(init.)=1.86659 s t(norm)=0.0373693, mflops=133.8 (err=7.7e-16) 4. Singleton (f2c): elapsed time t=1.7666 s, 2048 iters, t-(init.)=1.74993 s t(norm)=0.0350337, mflops=142.72 (err=7.7e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 262.905 Normalized results and averages for N=2197: fft 0: mflops = 262.905 (norm. = 1), norm. avg. (of 8) = 0.989224 fft 1: mflops = 19.3113 (norm. = 0.0734536), norm. avg. (of 8) = 0.162003 fft 2: mflops = 21.0471 (norm. = 0.0800562), norm. avg. 
(of 8) = 0.1199 fft 3: mflops = 133.8 (norm. = 0.508929), norm. avg. (of 8) = 0.482726 fft 4: mflops = 142.72 (norm. = 0.542857), norm. avg. (of 8) = 0.470164 fft 5: mflops = -1 (norm. = -0.00380366), norm. avg. (of 5) = 0.660146 fft 6: mflops = -1 (norm. = -0.00380366), norm. avg. (of 5) = 0.506696 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.74993 s, 4096 iters, t-(init.)=1.69993 s t(norm)=0.0132417, mflops=377.596 (err=4.1e-16) 1. PDA: elapsed time t=1.91659 s, 512 iters, t-(init.)=1.89992 s t(norm)=0.118396, mflops=42.2311 (err=4.5e-16) 2. PDA (f2c): elapsed time t=1.09996 s, 256 iters, t-(init.)=1.09996 s t(norm)=0.13709, mflops=36.4723 (err=4.5e-16) 3. Singleton: elapsed time t=1.43328 s, 1024 iters, t-(init.)=1.41661 s t(norm)=0.0441389, mflops=113.279 (err=5.6e-16) 4. Singleton (f2c): elapsed time t=1.63327 s, 1024 iters, t-(init.)=1.6166 s t(norm)=0.0503703, mflops=99.2649 (err=5.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 377.596 Normalized results and averages for N=2744: fft 0: mflops = 377.596 (norm. = 1), norm. avg. (of 9) = 0.990421 fft 1: mflops = 42.2311 (norm. = 0.111842), norm. avg. (of 9) = 0.156429 fft 2: mflops = 36.4723 (norm. = 0.0965909), norm. avg. (of 9) = 0.11731 fft 3: mflops = 113.279 (norm. = 0.3), norm. avg. (of 9) = 0.462423 fft 4: mflops = 99.2649 (norm. = 0.262887), norm. avg. (of 9) = 0.447133 fft 5: mflops = -1 (norm. = -0.00264833), norm. avg. (of 5) = 0.660146 fft 6: mflops = -1 (norm. = -0.00264833), norm. avg. (of 5) = 0.506696 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.28328 s, 2048 iters, t-(init.)=1.26662 s t(norm)=0.0156347, mflops=319.802 (err=4.7e-16) 1. PDA: elapsed time t=1.54994 s, 1024 iters, t-(init.)=1.53327 s t(norm)=0.0378524, mflops=132.092 (err=4.8e-16) 2. 
PDA (f2c): elapsed time t=1.24995 s, 512 iters, t-(init.)=1.23328 s t(norm)=0.0608929, mflops=82.1113 (err=4.8e-16) 3. Singleton: elapsed time t=1.41661 s, 1024 iters, t-(init.)=1.39994 s t(norm)=0.0345609, mflops=144.672 (err=6.1e-16) 4. Singleton (f2c): elapsed time t=1.69993 s, 1024 iters, t-(init.)=1.68327 s t(norm)=0.0415553, mflops=120.322 (err=6.1e-16) 5. Temperton: elapsed time t=1.13329 s, 2048 iters, t-(init.)=1.09996 s t(norm)=0.0135775, mflops=368.257 (err=2.0e-15) 6. Temperton (f2c): elapsed time t=1.74993 s, 2048 iters, t-(init.)=1.7166 s t(norm)=0.0211891, mflops=235.97 (err=4.6e-16) Top mflops for N=3375 = 368.257 Normalized results and averages for N=3375: fft 0: mflops = 319.802 (norm. = 0.868421), norm. avg. (of 10) = 0.978221 fft 1: mflops = 132.092 (norm. = 0.358696), norm. avg. (of 10) = 0.176656 fft 2: mflops = 82.1113 (norm. = 0.222973), norm. avg. (of 10) = 0.127876 fft 3: mflops = 144.672 (norm. = 0.392857), norm. avg. (of 10) = 0.455467 fft 4: mflops = 120.322 (norm. = 0.326733), norm. avg. (of 10) = 0.435093 fft 5: mflops = 368.257 (norm. = 1), norm. avg. (of 6) = 0.716788 fft 6: mflops = 235.97 (norm. = 0.640777), norm. avg. (of 6) = 0.529043 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.86659 s, 256 iters, t-(init.)=1.69993 s t(norm)=0.0281601, mflops=177.556 (err=4.5e-16) 1. PDA: elapsed time t=1.49994 s, 128 iters, t-(init.)=1.41661 s t(norm)=0.0469334, mflops=106.534 (err=5.2e-16) 2. PDA (f2c): elapsed time t=1.09996 s, 64 iters, t-(init.)=1.06662 s t(norm)=0.0706762, mflops=70.7451 (err=4.8e-16) 3. Singleton: elapsed time t=1.01663 s, 64 iters, t-(init.)=0.983294 s t(norm)=0.0651546, mflops=76.7405 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.13329 s, 64 iters, t-(init.)=1.09996 s t(norm)=0.0728849, mflops=68.6014 (err=5.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 
Top mflops for N=16800 = 177.556 Normalized results and averages for N=16800: fft 0: mflops = 177.556 (norm. = 1), norm. avg. (of 11) = 0.980201 fft 1: mflops = 106.534 (norm. = 0.6), norm. avg. (of 11) = 0.215142 fft 2: mflops = 70.7451 (norm. = 0.398437), norm. avg. (of 11) = 0.152473 fft 3: mflops = 76.7405 (norm. = 0.432203), norm. avg. (of 11) = 0.453352 fft 4: mflops = 68.6014 (norm. = 0.386364), norm. avg. (of 11) = 0.430663 fft 5: mflops = -1 (norm. = -0.00563201), norm. avg. (of 6) = 0.716788 fft 6: mflops = -1 (norm. = -0.00563201), norm. avg. (of 6) = 0.529043 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.06662 s, 16 iters, t-(init.)=0.983294 s t(norm)=0.0331664, mflops=150.755 (err=6.8e-16) 1. PDA: elapsed time t=1.53327 s, 16 iters, t-(init.)=1.43328 s t(norm)=0.0483442, mflops=103.425 (err=6.3e-16) 2. PDA (f2c): elapsed time t=1.14995 s, 8 iters, t-(init.)=1.09996 s t(norm)=0.0742028, mflops=67.3829 (err=6.2e-16) 3. Singleton: elapsed time t=1.83326 s, 8 iters, t-(init.)=1.78326 s t(norm)=0.120298, mflops=41.5633 (err=6.5e-16) 4. Singleton (f2c): elapsed time t=1.91659 s, 8 iters, t-(init.)=1.86659 s t(norm)=0.12592, mflops=39.7078 (err=6.5e-16) 5. Temperton: elapsed time t=1.53327 s, 16 iters, t-(init.)=1.43328 s t(norm)=0.0483442, mflops=103.425 (err=1.0e-07) 6. Temperton (f2c): elapsed time t=1.64993 s, 16 iters, t-(init.)=1.54994 s t(norm)=0.0522792, mflops=95.6403 (err=7.0e-16) Top mflops for N=110592 = 150.755 Normalized results and averages for N=110592: fft 0: mflops = 150.755 (norm. = 1), norm. avg. (of 12) = 0.981851 fft 1: mflops = 103.425 (norm. = 0.686047), norm. avg. (of 12) = 0.254384 fft 2: mflops = 67.3829 (norm. = 0.44697), norm. avg. (of 12) = 0.177014 fft 3: mflops = 41.5633 (norm. = 0.275701), norm. avg. (of 12) = 0.438547 fft 4: mflops = 39.7078 (norm. = 0.263393), norm. avg. (of 12) = 0.416724 fft 5: mflops = 103.425 (norm. = 0.686047), norm. avg. (of 7) = 0.712397 fft 6: mflops = 95.6403 (norm. 
= 0.634409), norm. avg. (of 7) = 0.544095 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.96659 s, 32 iters, t-(init.)=1.74993 s t(norm)=0.0275952, mflops=181.191 (err=6.4e-16) 1. PDA: elapsed time t=1.6166 s, 8 iters, t-(init.)=1.54994 s t(norm)=0.0977659, mflops=51.1426 (err=7.1e-16) 2. PDA (f2c): elapsed time t=1.11662 s, 4 iters, t-(init.)=1.08329 s t(norm)=0.136662, mflops=36.5866 (err=7.1e-16) 3. Singleton: elapsed time t=1.48327 s, 8 iters, t-(init.)=1.41661 s t(norm)=0.089356, mflops=55.956 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.58327 s, 8 iters, t-(init.)=1.53327 s t(norm)=0.0967147, mflops=51.6985 (err=1.0e-15) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 181.191 Normalized results and averages for N=117649: fft 0: mflops = 181.191 (norm. = 1), norm. avg. (of 13) = 0.983247 fft 1: mflops = 51.1426 (norm. = 0.282258), norm. avg. (of 13) = 0.256528 fft 2: mflops = 36.5866 (norm. = 0.201923), norm. avg. (of 13) = 0.17893 fft 3: mflops = 55.956 (norm. = 0.308824), norm. avg. (of 13) = 0.428569 fft 4: mflops = 51.6985 (norm. = 0.285326), norm. avg. (of 13) = 0.406616 fft 5: mflops = -1 (norm. = -0.00551904), norm. avg. (of 7) = 0.712397 fft 6: mflops = -1 (norm. = -0.00551904), norm. avg. (of 7) = 0.544095 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.08329 s, 8 iters, t-(init.)=0.966628 s t(norm)=0.0315672, mflops=158.392 (err=7.3e-16) 1. PDA: elapsed time t=1.38328 s, 8 iters, t-(init.)=1.28328 s t(norm)=0.0419081, mflops=119.309 (err=7.4e-16) 2. PDA (f2c): elapsed time t=1.03329 s, 4 iters, t-(init.)=0.966628 s t(norm)=0.0631343, mflops=79.1962 (err=7.4e-16) 3. Singleton: elapsed time t=1.08329 s, 2 iters, t-(init.)=1.04996 s t(norm)=0.137154, mflops=36.4554 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.18329 s, 2 iters, t-(init.)=1.16662 s t(norm)=0.152393, mflops=32.8099 (err=1.0e-15) 5. 
Temperton: elapsed time t=1.54994 s, 8 iters, t-(init.)=1.44994 s t(norm)=0.0473507, mflops=105.595 (err=2.0e-15) 6. Temperton (f2c): elapsed time t=1.78326 s, 8 iters, t-(init.)=1.68327 s t(norm)=0.0549704, mflops=90.9581 (err=7.1e-16) Top mflops for N=216000 = 158.392 Normalized results and averages for N=216000: fft 0: mflops = 158.392 (norm. = 1), norm. avg. (of 14) = 0.984444 fft 1: mflops = 119.309 (norm. = 0.753247), norm. avg. (of 14) = 0.292008 fft 2: mflops = 79.1962 (norm. = 0.5), norm. avg. (of 14) = 0.201864 fft 3: mflops = 36.4554 (norm. = 0.230159), norm. avg. (of 14) = 0.414397 fft 4: mflops = 32.8099 (norm. = 0.207143), norm. avg. (of 14) = 0.392368 fft 5: mflops = 105.595 (norm. = 0.666667), norm. avg. (of 8) = 0.70668 fft 6: mflops = 90.9581 (norm. = 0.574257), norm. avg. (of 8) = 0.547866 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.21662 s, 8 iters, t-(init.)=1.08329 s t(norm)=0.0312978, mflops=159.756 (err=7.3e-16) 1. PDA: elapsed time t=1.81659 s, 8 iters, t-(init.)=1.69993 s t(norm)=0.0491135, mflops=101.805 (err=7.8e-16) 2. PDA (f2c): elapsed time t=1.39994 s, 4 iters, t-(init.)=1.34995 s t(norm)=0.0780038, mflops=64.0994 (err=7.8e-16) 3. Singleton: elapsed time t=1.31661 s, 2 iters, t-(init.)=1.28328 s t(norm)=0.148304, mflops=33.7146 (err=9.4e-16) 4. Singleton (f2c): elapsed time t=1.41661 s, 2 iters, t-(init.)=1.38328 s t(norm)=0.15986, mflops=31.2774 (err=9.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 159.756 Normalized results and averages for N=241920: fft 0: mflops = 159.756 (norm. = 1), norm. avg. (of 15) = 0.985481 fft 1: mflops = 101.805 (norm. = 0.637255), norm. avg. (of 15) = 0.315024 fft 2: mflops = 64.0994 (norm. = 0.401235), norm. avg. (of 15) = 0.215155 fft 3: mflops = 33.7146 (norm. = 0.211039), norm. avg. (of 15) = 0.400839 fft 4: mflops = 31.2774 (norm. = 0.195783), norm. avg. 
(of 15) = 0.379263 fft 5: mflops = -1 (norm. = -0.00625957), norm. avg. (of 8) = 0.70668 fft 6: mflops = -1 (norm. = -0.00625957), norm. avg. (of 8) = 0.547866 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.16662 s, 4 iters, t-(init.)=1.03329 s t(norm)=0.0327682, mflops=152.587 (err=7.0e-16) 1. PDA: elapsed time t=1.44994 s, 4 iters, t-(init.)=1.33328 s t(norm)=0.0422815, mflops=118.255 (err=7.3e-16) 2. PDA (f2c): elapsed time t=1.04996 s, 2 iters, t-(init.)=0.983294 s t(norm)=0.0623652, mflops=80.1729 (err=7.3e-16) 3. Singleton: elapsed time t=1.08329 s, 1 iters, t-(init.)=1.06662 s t(norm)=0.135301, mflops=36.9547 (err=9.3e-16) 4. Singleton (f2c): elapsed time t=1.11662 s, 1 iters, t-(init.)=1.08329 s t(norm)=0.137415, mflops=36.3861 (err=9.3e-16) 5. Temperton: elapsed time t=1.51661 s, 4 iters, t-(init.)=1.39994 s t(norm)=0.0443956, mflops=112.624 (err=1.4e-07) 6. Temperton (f2c): elapsed time t=1.83326 s, 4 iters, t-(init.)=1.7166 s t(norm)=0.0544375, mflops=91.8485 (err=9.6e-16) Top mflops for N=421875 = 152.587 Normalized results and averages for N=421875: fft 0: mflops = 152.587 (norm. = 1), norm. avg. (of 16) = 0.986388 fft 1: mflops = 118.255 (norm. = 0.775), norm. avg. (of 16) = 0.343773 fft 2: mflops = 80.1729 (norm. = 0.525424), norm. avg. (of 16) = 0.234547 fft 3: mflops = 36.9547 (norm. = 0.242188), norm. avg. (of 16) = 0.390924 fft 4: mflops = 36.3861 (norm. = 0.238462), norm. avg. (of 16) = 0.370463 fft 5: mflops = 112.624 (norm. = 0.738095), norm. avg. (of 9) = 0.710171 fft 6: mflops = 91.8485 (norm. = 0.601942), norm. avg. (of 9) = 0.553874 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.64993 s, 4 iters, t-(init.)=1.48327 s t(norm)=0.0381874, mflops=130.933 (err=6.3e-16) 1. PDA: elapsed time t=1.04996 s, 2 iters, t-(init.)=0.966628 s t(norm)=0.0497724, mflops=100.457 (err=6.2e-16) 2. PDA (f2c): elapsed time t=1.43328 s, 2 iters, t-(init.)=1.36661 s t(norm)=0.0703679, mflops=71.0551 (err=6.2e-16) 3. 
Singleton: elapsed time t=1.49994 s, 1 iters, t-(init.)=1.46661 s t(norm)=0.151033, mflops=33.1052 (err=8.2e-16) 4. Singleton (f2c): elapsed time t=1.54994 s, 1 iters, t-(init.)=1.51661 s t(norm)=0.156182, mflops=32.0139 (err=8.2e-16) 5. Temperton: elapsed time t=1.18329 s, 2 iters, t-(init.)=1.11662 s t(norm)=0.0574957, mflops=86.963 (err=1.7e-07) 6. Temperton (f2c): elapsed time t=1.29995 s, 2 iters, t-(init.)=1.21662 s t(norm)=0.0626446, mflops=79.8154 (err=6.6e-16) Top mflops for N=512000 = 130.933 Normalized results and averages for N=512000: fft 0: mflops = 130.933 (norm. = 1), norm. avg. (of 17) = 0.987189 fft 1: mflops = 100.457 (norm. = 0.767241), norm. avg. (of 17) = 0.368683 fft 2: mflops = 71.0551 (norm. = 0.542683), norm. avg. (of 17) = 0.252673 fft 3: mflops = 33.1052 (norm. = 0.252841), norm. avg. (of 17) = 0.382801 fft 4: mflops = 32.0139 (norm. = 0.244505), norm. avg. (of 17) = 0.363053 fft 5: mflops = 86.963 (norm. = 0.664179), norm. avg. (of 10) = 0.705572 fft 6: mflops = 79.8154 (norm. = 0.609589), norm. avg. (of 10) = 0.559446 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.74993 s, 4 iters, t-(init.)=1.5666 s t(norm)=0.0344573, mflops=145.107 (err=7.0e-16) 1. PDA: elapsed time t=1.51661 s, 2 iters, t-(init.)=1.41661 s t(norm)=0.0623165, mflops=80.2356 (err=6.6e-16) 2. PDA (f2c): elapsed time t=1.11662 s, 1 iters, t-(init.)=1.06662 s t(norm)=0.0938413, mflops=53.2815 (err=6.6e-16) 3. Singleton: elapsed time t=2.08325 s, 1 iters, t-(init.)=2.03325 s t(norm)=0.178885, mflops=27.9509 (err=8.9e-16) 4. Singleton (f2c): elapsed time t=2.18325 s, 1 iters, t-(init.)=2.14991 s t(norm)=0.189149, mflops=26.4342 (err=8.9e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 145.107 Normalized results and averages for N=592704: fft 0: mflops = 145.107 (norm. = 1), norm. avg. (of 18) = 0.987901 fft 1: mflops = 80.2356 (norm. = 0.552941), norm. avg. 
(of 18) = 0.378919 fft 2: mflops = 53.2815 (norm. = 0.367187), norm. avg. (of 18) = 0.259034 fft 3: mflops = 27.9509 (norm. = 0.192623), norm. avg. (of 18) = 0.372236 fft 4: mflops = 26.4342 (norm. = 0.182171), norm. avg. (of 18) = 0.353004 fft 5: mflops = -1 (norm. = -0.00689147), norm. avg. (of 10) = 0.705572 fft 6: mflops = -1 (norm. = -0.00689147), norm. avg. (of 10) = 0.559446 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=1.53327 s, 2 iters, t-(init.)=1.38328 s t(norm)=0.0395723, mflops=126.351 (err=7.9e-16) 1. PDA: elapsed time t=1.18329 s, 1 iters, t-(init.)=1.09996 s t(norm)=0.0629342, mflops=79.448 (err=6.4e-16) 2. PDA (f2c): elapsed time t=1.58327 s, 1 iters, t-(init.)=1.51661 s t(norm)=0.086773, mflops=57.6216 (err=6.4e-16) 3. Singleton: elapsed time t=4.24983 s, 1 iters, t-(init.)=4.18317 s t(norm)=0.239341, mflops=20.8907 (err=7.0e-16) 4. Singleton (f2c): elapsed time t=4.39982 s, 1 iters, t-(init.)=4.33316 s t(norm)=0.247923, mflops=20.1676 (err=7.0e-16) 5. Temperton: elapsed time t=1.46661 s, 1 iters, t-(init.)=1.39994 s t(norm)=0.0800981, mflops=62.4234 (err=1.6e-07) 6. Temperton (f2c): elapsed time t=1.51661 s, 1 iters, t-(init.)=1.43328 s t(norm)=0.0820052, mflops=60.9717 (err=7.5e-16) Top mflops for N=884736 = 126.351 Normalized results and averages for N=884736: fft 0: mflops = 126.351 (norm. = 1), norm. avg. (of 19) = 0.988538 fft 1: mflops = 79.448 (norm. = 0.628788), norm. avg. (of 19) = 0.39207 fft 2: mflops = 57.6216 (norm. = 0.456044), norm. avg. (of 19) = 0.269403 fft 3: mflops = 20.8907 (norm. = 0.165339), norm. avg. (of 19) = 0.361346 fft 4: mflops = 20.1676 (norm. = 0.159615), norm. avg. (of 19) = 0.342826 fft 5: mflops = 62.4234 (norm. = 0.494048), norm. avg. (of 11) = 0.686342 fft 6: mflops = 60.9717 (norm. = 0.482558), norm. avg. (of 11) = 0.552456 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=1.83326 s, 2 iters, t-(init.)=1.64993 s t(norm)=0.0353794, mflops=141.325 (err=7.4e-16) 1. 
PDA: elapsed time t=1.58327 s, 1 iters, t-(init.)=1.49994 s t(norm)=0.0643261, mflops=77.7289 (err=7.8e-16) 2. PDA (f2c): elapsed time t=2.24991 s, 1 iters, t-(init.)=2.14991 s t(norm)=0.0922008, mflops=54.2295 (err=7.2e-16) 3. Singleton: elapsed time t=3.73318 s, 1 iters, t-(init.)=3.63319 s t(norm)=0.155812, mflops=32.0899 (err=8.0e-16) 4. Singleton (f2c): elapsed time t=3.83318 s, 1 iters, t-(init.)=3.74985 s t(norm)=0.160815, mflops=31.0916 (err=8.0e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 141.325 Normalized results and averages for N=1157625: fft 0: mflops = 141.325 (norm. = 1), norm. avg. (of 20) = 0.989111 fft 1: mflops = 77.7289 (norm. = 0.55), norm. avg. (of 20) = 0.399967 fft 2: mflops = 54.2295 (norm. = 0.383721), norm. avg. (of 20) = 0.275119 fft 3: mflops = 32.0899 (norm. = 0.227064), norm. avg. (of 20) = 0.354632 fft 4: mflops = 31.0916 (norm. = 0.22), norm. avg. (of 20) = 0.336685 fft 5: mflops = -1 (norm. = -0.00707588), norm. avg. (of 11) = 0.686342 fft 6: mflops = -1 (norm. = -0.00707588), norm. avg. (of 11) = 0.552456 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=1.31661 s, 1 iters, t-(init.)=1.19995 s t(norm)=0.0418225, mflops=119.553 (err=5.8e-16) 1. PDA: elapsed time t=2.08325 s, 1 iters, t-(init.)=1.96659 s t(norm)=0.0685425, mflops=72.9475 (err=6.1e-16) 2. PDA (f2c): elapsed time t=2.91655 s, 1 iters, t-(init.)=2.79989 s t(norm)=0.0975859, mflops=51.2369 (err=5.6e-16) 3. Singleton: elapsed time t=5.28312 s, 1 iters, t-(init.)=5.14979 s t(norm)=0.179488, mflops=27.857 (err=6.1e-16) 4. Singleton (f2c): elapsed time t=5.43312 s, 1 iters, t-(init.)=5.31645 s t(norm)=0.185297, mflops=26.9837 (err=6.1e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 
Top mflops for N=1404928 = 119.553 Normalized results and averages for N=1404928: fft 0: mflops = 119.553 (norm. = 1), norm. avg. (of 21) = 0.989629 fft 1: mflops = 72.9475 (norm. = 0.610169), norm. avg. (of 21) = 0.409976 fft 2: mflops = 51.2369 (norm. = 0.428571), norm. avg. (of 21) = 0.282427 fft 3: mflops = 27.857 (norm. = 0.23301), norm. avg. (of 21) = 0.348841 fft 4: mflops = 26.9837 (norm. = 0.225705), norm. avg. (of 21) = 0.3314 fft 5: mflops = -1 (norm. = -0.0083645), norm. avg. (of 11) = 0.686342 fft 6: mflops = -1 (norm. = -0.0083645), norm. avg. (of 11) = 0.552456 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=1.46661 s, 1 iters, t-(init.)=1.31661 s t(norm)=0.0367715, mflops=135.975 (err=7.3e-16) 1. PDA: elapsed time t=1.78326 s, 1 iters, t-(init.)=1.63327 s t(norm)=0.0456152, mflops=109.613 (err=7.9e-16) 2. PDA (f2c): elapsed time t=2.61656 s, 1 iters, t-(init.)=2.46657 s t(norm)=0.0688883, mflops=72.5813 (err=7.8e-16) 3. Singleton: elapsed time t=9.1663 s, 1 iters, t-(init.)=9.03297 s t(norm)=0.25228, mflops=19.8192 (err=9.4e-16) 4. Singleton (f2c): elapsed time t=9.49962 s, 1 iters, t-(init.)=9.36629 s t(norm)=0.261589, mflops=19.1139 (err=9.4e-16) 5. Temperton: elapsed time t=3.04988 s, 1 iters, t-(init.)=2.89988 s t(norm)=0.0809903, mflops=61.7358 (err=1.1e-08) 6. Temperton (f2c): elapsed time t=3.06654 s, 1 iters, t-(init.)=2.91655 s t(norm)=0.0814558, mflops=61.383 (err=6.9e-16) Top mflops for N=1728000 = 135.975 Normalized results and averages for N=1728000: fft 0: mflops = 135.975 (norm. = 1), norm. avg. (of 22) = 0.990101 fft 1: mflops = 109.613 (norm. = 0.806122), norm. avg. (of 22) = 0.427983 fft 2: mflops = 72.5813 (norm. = 0.533784), norm. avg. (of 22) = 0.293852 fft 3: mflops = 19.8192 (norm. = 0.145756), norm. avg. (of 22) = 0.33961 fft 4: mflops = 19.1139 (norm. = 0.140569), norm. avg. (of 22) = 0.322726 fft 5: mflops = 61.7358 (norm. = 0.454023), norm. avg. (of 12) = 0.666982 fft 6: mflops = 61.383 (norm. 
= 0.451429), norm. avg. (of 12) = 0.544037 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=2.78322 s, 1 iters, t-(init.)=2.53323 s t(norm)=0.0394413, mflops=126.771 (err=1.2e-15) 1. PDA: elapsed time t=3.4332 s, 1 iters, t-(init.)=3.18321 s t(norm)=0.0495612, mflops=100.885 (err=1.2e-15) 2. PDA (f2c): elapsed time t=5.03313 s, 1 iters, t-(init.)=4.78314 s t(norm)=0.0744715, mflops=67.1398 (err=1.2e-15) 3. Singleton: elapsed time t=16.4327 s, 1 iters, t-(init.)=16.166 s t(norm)=0.251698, mflops=19.8651 (err=1.6e-15) 4. Singleton (f2c): elapsed time t=16.716 s, 1 iters, t-(init.)=16.466 s t(norm)=0.256369, mflops=19.5032 (err=1.6e-15) 5. Temperton: elapsed time t=4.83314 s, 1 iters, t-(init.)=4.58315 s t(norm)=0.0713577, mflops=70.0695 (err=1.8e-07) 6. Temperton (f2c): elapsed time t=5.0498 s, 1 iters, t-(init.)=4.78314 s t(norm)=0.0744715, mflops=67.1398 (err=1.2e-15) Top mflops for N=2985984 = 126.771 Normalized results and averages for N=2985984: fft 0: mflops = 126.771 (norm. = 1), norm. avg. (of 23) = 0.990531 fft 1: mflops = 100.885 (norm. = 0.795812), norm. avg. (of 23) = 0.443976 fft 2: mflops = 67.1398 (norm. = 0.529617), norm. avg. (of 23) = 0.304102 fft 3: mflops = 19.8651 (norm. = 0.156701), norm. avg. (of 23) = 0.331657 fft 4: mflops = 19.5032 (norm. = 0.153846), norm. avg. (of 23) = 0.315383 fft 5: mflops = 70.0695 (norm. = 0.552727), norm. avg. (of 13) = 0.658193 fft 6: mflops = 67.1398 (norm. = 0.529617), norm. avg. 
(of 13) = 0.542928 ------------------------------------------------------ @@@@ bench.1d.p2.dat N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Nielsen, NR (C), NR (F), Ooura (C), Ooura (F), Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg, DXML 2, 83.8894, 61.3825, 52.4309, 2.53698, 22.0762, 7.96419, 10.4862, 12.7105, 37.01, 8.17105, 7.58037, , 28.5987, 23.7423, 69.9079, 81.1833, 117.055, , 20.6285, 10.1479, 11.6513, 48.3978, , , , , 6.4863, 3.24315, 12.7105, 10.4862, 62.1403, 62.9171, , , 8.73848, 8.61878, 34.0092, 54.7105, 8.98815, 5.82566, 13.5306, 9.39061 4, 141.785, 118.432, 48.8676, 9.25251, 44.9408, 15.1607, 35.9526, 34.0092, 49.3467, 31.0702, 16.3421, 42.6556, 93.2105, 74.0201, 211.931, 341.245, 353.219, , 77.4364, 20.6285, 21.6955, 127.427, 72.9473, 76.2631, 76.2631, 5.37753, 12.3367, 12.5834, 21.6955, 19.0658, 97.7353, 134.223, 8.61878, 68.0185, 29.9605, 29.608, 68.0185, 51.3609, 32.2652, 20.2958, 14.9803, 37.5624 8, 222.06, 186.421, 58.0773, 13.2923, 90.9644, 19.6616, 65.0866, 42.416, 51.7127, 83.8894, 48.3978, 39.3232, 114.395, 99.3428, 387.182, 503.337, 569.815, 171.592, 96.7955, 34.0092, 36.2983, 201.335, 103.425, 106.339, 102.028, 14.5193, 19.0658, 30.9428, 33.4073, 30.4437, 184.148, 191.14, 12.0994, 100.667, 32.8263, 30.9428, 96.7955, 49.0263, 51.7127, 34.9539, 16.7036, 100.667 16, 101.684, 98.6935, 62.1403, 21.8842, 136.037, 22.2715, 83.8894, 62.9171, 55.3117, 143.81, 99.6706, 41.2571, 242.572, 167.779, 485.144, 536.892, 694.257, 207.561, 162.367, 50.8421, 57.8548, 248.561, 92.3553, 114.395, 110.623, 28.5987, 26.7732, 38.1316, 45.7579, 44.5431, 223.705, 189.938, 34.0092, 127.427, 83.8894, 81.1833, 125.834, 52.9828, 90.6913, 68.0185, 
17.9763, 162.367 32, 120.994, 114.395, 69.1396, 25.7857, 157.293, 22.4704, 127.105, 75.8037, 61.0845, 172.376, 202.958, 46.2626, 216.955, 129.726, 606.43, 653.684, 571.973, 279.631, 141.387, 66.9331, 78.6463, 241.989, 106.639, 133.866, 128.402, 50.7396, 30.8417, 62.9171, 57.7221, 59.3557, 246.734, 213.278, 37.9019, 153.456, 112.352, 106.639, 161.326, 57.7221, 101.479, 75.8037, 18.7253, 230.888 64, 117.97, 116.155, 74.753, 35.2806, 181.929, 22.206, 152.526, 92.0738, 66.8146, 225.375, 255.934, 50.3337, 335.558, 165.935, 616.331, 444.121, 377.502, 363.858, 191.14, 75.5005, 88.8241, 239.684, 104.862, 134.822, 130.173, 75.5005, 37.7502, 88.8241, 65.6526, 72.5966, 305.052, 239.684, 68.0185, 181.929, 160.639, 152.526, 198.686, 61.8856, 160.639, 123.771, 19.4589, 260.347 128, 131.469, 124.062, 80.8109, 34.4078, 238.065, 22.021, 160.153, 101.246, 73.4033, 275.262, 378.856, 55.0524, 414.513, 177.947, 550.524, 550.524, 445.994, 367.016, 195.742, 84.6961, 102.423, 212.25, 115.9, 151.869, 146.807, 95.7434, 38.9752, 88.0839, 73.4033, 83.098, 309.066, 251.668, 68.8156, 200.191, 154.533, 137.631, 234.89, 69.9079, 167.779, 122.339, 20.0191, 352.336 256, 139.816, 130.737, 87.5368, 36.4737, 226.219, 21.8842, 188.163, 115.71, 79.8947, 341.245, 473.729, 59.2161, 473.729, 205.444, 575.242, 583.579, 415.123, 402.669, 228.789, 89.8815, 113.109, 172.081, 118.432, 154.873, 152.526, 108.244, 41.2571, 103.781, 78.6463, 90.6913, 353.219, 268.446, 94.0816, 216.489, 211.931, 193.591, 254.854, 74.0201, 189.938, 141.785, 19.9737, 362.765 512, 149.014, 143.355, 91.3312, 38.7845, 257.388, 21.449, 202.233, 83.2726, 76.5208, 390.52, 377.502, 64.347, 390.52, 155.138, 559.263, 471.878, 353.909, 427.361, 188.751, 92.8285, 117.97, 155.138, 127.248, 164.132, 161.787, 117.97, 36.2983, 109.952, 82.0658, 84.5155, 333.09, 263.374, 91.3312, 211.684, 209.724, 191.95, 269.645, 69.9079, 151.001, 120.48, 20.2233, 383.901 1024, 132.457, 135.306, 83.8894, 42.5115, 202.958, 21.2558, 179.763, 91.1842, 70.6933, 318.567, 
326.842, 61.0845, 399.474, 170.046, 535.464, 479.368, 359.526, 399.474, 199.737, 81.7105, 98.3079, 111.358, 120.994, 151.607, 146.319, 112.352, 34.5698, 103.143, 78.6463, 76.7281, 318.567, 237.423, 108.478, 220.762, 218.842, 172.376, 196.616, 63.5526, 177.231, 144.637, 18.0796, 359.526 2048, 141.242, 145.703, 87.6061, 41.692, 157.293, 20.5979, 200.605, 87.6061, 75.2269, 282.485, 346.044, 65.2913, 300.908, 157.293, 274.094, 276.835, 294.505, 407.11, 184.557, 82.3914, 100.303, 104.862, 119.325, 144.185, 142.699, 103.297, 32.9566, 104.862, 79.5503, 76.0536, 282.485, 206.593, 98.8697, 175.212, 189.613, 152.107, 197.739, 65.2913, 160.951, 133.094, 16.9629, 177.458 4096, 93.2105, 94.3756, 66.2285, 42.898, 64.5303, 20.2958, 164.132, 85.796, 57.1973, 251.668, 282.245, 52.4309, 148.04, 123.771, 209.724, 225.375, 164.132, 251.668, 137.274, 71.2269, 85.796, 94.3756, 116.155, 141.122, 134.822, 73.3014, 29.4924, 86.7822, 67.4112, 63.9835, 260.347, 215.716, 112.687, 103.425, 177.648, 143.81, 112.687, 51.0138, 134.822, 114.395, 15.4714, 169.664 8192, 73.0287, 72.3825, 54.5281, 40.4912, 56.8001, 20.2456, 136.32, 75.0387, 46.4728, 251.668, 277.262, 46.4728, 94.014, 82.6184, 224.088, 190.214, 125.834, 204.48, 100.978, 56.8001, 68.1602, , 109.056, 127.8, 113.6, 56.0221, 26.9053, 56.0221, 55.265, 53.1118, 192.452, 181.76, 96.2261, 69.3154, 131.923, 109.056, 88.9046, 42.1609, 100.978, 84.3219, 14.8174, 125.834 16384, 45.877, 45.4041, 37.01, 41.549, 44.042, 19.1487, 100.095, 61.1694, 33.8784, 217.491, 217.491, 33.3651, 87.2118, 79.3549, 174.424, 204.846, 108.746, 122.339, 91.7541, 40.7796, 46.8531, , 73.4033, 79.3549, 73.4033, 40.0381, 25.907, 43.6059, 40.0381, 38.9752, 154.533, 151.869, 104.862, 61.1694, 92.7199, 80.0763, 66.7302, 31.0155, 80.8109, 71.0354, 13.9373, 146.807 32768, 43.6924, 42.898, 33.7056, 35.7483, 41.3928, 19.0273, 97.2944, 56.8528, 30.6414, 198.686, 194.589, 30.6414, 78.6463, 70.4296, 171.592, 165.571, 97.2944, 116.513, 83.5182, 39.3232, 44.5168, , 52.4309, 55.5151, 
52.4309, 35.7483, 23.8322, 39.3232, 38.6785, 36.2983, 134.822, 140.859, 89.8815, 50.7396, 71.4967, 62.9171, 55.5151, 28.7731, 70.4296, 63.7673, 13.255, 142.993 65536, 31.4585, 31.4585, 24.6734, 39.3232, 34.9539, 18.2368, 75.1249, 46.1777, 22.6728, 179.763, 179.763, 23.3026, 71.9052, 68.9502, 137.9, 136.037, 82.5142, 89.8815, 77.4364, 28.9274, 32.6842, , 41.2571, 43.7684, 42.6556, 28.2773, 22.8789, 29.2638, 29.2638, 27.9631, 115.71, 114.395, 62.1403, 45.7579, 61.3825, 56.5547, 40.5917, 22.0762, 57.8548, 52.4309, 12.4588, 139.816 131072, 25.7113, 25.4664, 19.6616, 31.4585, 31.4585, 17.3635, 60.0893, 37.1386, 18.5693, 159.64, 157.293, 19.0998, 56.2942, 51.9219, 133.699, 117.537, 69.4539, 64.4332, 60.7722, 23.252, 25.2262, , 36.6298, 38.1997, 36.6298, 22.2831, 20.2574, 28.1471, 23.0515, 22.6608, 92.2061, 92.2061, 69.4539, 38.7533, 45.3216, 41.1381, 36.1348, 18.3149, 43.8357, 39.3232, 11.3304, 124.371 262144, 19.3922, 19.3922, 15.906, 34.9539, 26.4605, 16.6545, 49.6714, 29.4924, 15.0599, 95.9752, 95.9752, 15.2219, 51.0138, 47.5843, 114.395, 115.562, 62.9171, 53.4202, 56.0647, 18.6268, 19.3922, , 28.0324, 28.5987, 27.2237, 18.1492, 18.6268, 23.2071, 18.6268, 18.3849, 77.569, 78.6463, 75.5005, 33.309, 36.7697, 35.3909, 29.1883, 15.0599, 36.7697, 35.3909, 10.3331, 113.251 Norm. 
Avg., 0.272125, 0.251733, 0.166994, 0.109046, 0.262095, 0.0631758, 0.328834, 0.191033, 0.144725, 0.6024, 0.638654, 0.119684, 0.474296, 0.307172, 0.831867, 0.844926, 0.694455, 0.601816, 0.345915, 0.146785, 0.171674, 0.333532, 0.242482, 0.282254, 0.270243, 0.152684, 0.0821474, 0.158079, 0.140655, 0.138559, 0.570808, 0.516789, 0.251344, 0.29277, 0.305481, 0.268114, 0.311816, 0.144991, 0.273027, 0.225678, 0.0513533, 0.523035 ------------------------------------------------------ @@@@ bench.1d.np2.dat N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Nielsen, Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg, DXML 6, , 42.0616, 31.2766, 73.9265, 58.7849, 282.849, 375.319, 32.0997, 75.0638, 18.766, 11.9587, 23.0149, 23.0149, 28.3671, 22.7998, 16.2638, 36.4116 9, 13.3555, 87.9893, 56.8032, 112.186, 77.3699, 260.142, 351.957, 29.9164, 71.2294, 28.0466, 19.856, 40.795, 39.712, 43.1486, 43.1486, 18.0946, 39.0213 12, , 104.102, 86.7521, 167.078, 110.929, 338.333, 375.926, 51.2626, 107.407, 26.0256, 27.7322, 45.7207, 42.2916, 69.0476, 52.8645, 17.0875, 90.2221 15, 20.0389, 118.178, 119.713, 200.389, 124.566, 279.33, 292.631, 36.5789, 68.7901, 20.0389, 32.921, 50.0972, 49.5585, 74.3377, 65.842, 17.7267, 100.194 18, 18.2194, 132.654, 116.893, 147.577, 86.8101, 251.195, 248.551, 35.5608, 134.161, 33.5403, 26.8322, 62.7988, 57.8734, 67.0806, 59.6272, 16.7701, 104.479 24, , 201.26, 174.832, 184.131, 112.392, 303.655, 270.443, 72.1181, 190.202, 32.781, 47.0335, 59.2752, 53.4208, 100.63, 77.2694, 17.734, 127.267 36, 24.7254, 248.092, 243.957, 266.135, 126.185, 336.493, 321.702, 45.1773, 218.469, 40.6596, 46.321, 95.0483, 88.1774, 131.869, 107.628, 18.6702, 98.9017 80, 43.6137, 315.68, 331.464, 405.875, 184.147, 401.775, 410.059, 90.3994, 125.873, 27.0216, 96.543, 155.374, 140.055, 205.03, 152.984, 18.2793, 207.165 108, 28.015, 260.794, 325.993, 382.498, 144.886, 387.667, 382.498, 41.2175, 204.91, 
49.8044, 57.8374, 105.468, 90.7828, 177.082, 147.873, 19.074, 193.833 210, 34.6213, 366.11, 370.367, 178.942, 85.6226, 300.487, 248.841, 41.0459, 143.476, 19.7102, 70.4682, 93.6812, 82.0918, , , 17.0147, 73.7306 504, 32.7058, 577.661, 585.262, 198.571, 81.7645, 383.447, 376.948, 48.7718, 222.4, 25.5045, 70.3796, 110.099, 95.0425, , , 17.8205, 98.4069 1000, 34.0179, 298.694, 425.963, 340.179, 157.006, 344.97, 344.97, 57.2264, 102.054, 18.2239, 113.393, 153.081, 124.964, 216.751, 130.281, 15.1565, 335.519 1960, 35.4051, 405.252, 399.112, 141.62, 59.8668, 286.319, 248.504, 53.1076, 147.985, 16.3004, 74.8335, 117.595, 101.313, , , 12.862, 53.9782 4725, 26.6839, 285.776, 381.035, 130.28, 75.0767, 230.105, 216.074, 33.0562, 110.738, 18.6115, 66.1123, 106.736, 87.7134, , , 13.1831, 96.294 10368, 25.2915, 268.923, 268.923, 139.769, 94.0039, 226.009, 241.419, 45.7864, 145.513, 30.5243, 57.7307, 95.6977, 84.3051, 114.22, 99.2752, 14.4327, 145.513 27000, 21.1986, 242.27, 242.27, 125.107, 91.9458, 179.565, 169.589, 33.4715, 91.9458, 19.0788, 51.5642, 67.5354, 61.5444, 79.4948, 64.6737, 13.0676, 82.9511 75600, 18.6111, 186.702, 186.702, 66.0798, 48.2058, 148.889, 145.212, 31.2825, 76.378, 17.0962, 36.7569, 56.0105, 52.0452, , , 10.6542, 18.6111 165375, 14.3348, 149.58, 151.224, 36.9929, 27.3043, 128.611, 120.714, 22.6338, 66.8028, 13.4388, 34.4034, 46.4911, 43.5487, , , 10.3625, 18.2997 362880, 13.1419, 58.2813, 58.2813, 44.1913, 33.5117, 127.664, 116.563, 26.4566, 67.0235, 15.467, 26.4566, 31.4173, 30.0105, , , 9.43993, 60.021 Norm. 
Avg., 0.0810401, 0.699261, 0.723226, 0.521024, 0.290806, 0.849653, 0.853457, 0.139536, 0.396191, 0.0813983, 0.160661, 0.247766, 0.222476, 0.320393, 0.254762, 0.0505605, 0.317873
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM, HARM (f2c), NR (C), NR (F), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
4x4x4, 239.684, , , 90.9644, 66.8146, 47.1878, 37.01, 179.763, 173.564, 179.763, 116.155
8x8x8, 545.787, 196.958, 164.132, 119.211, 83.2726, 83.2726, 59.6057, 147.079, 113.251, 286.711, 179.763
16x16x16, 471.878, 218.842, 184.148, 95.5702, 69.2665, 152.526, 96.7955, 218.842, 158.948, 347.129, 251.668
32x32x32, 198.686, 109.739, 98.3079, 32.3204, 28.4264, 76.1094, 56.8528, 52.4309, 48.1508, 94.3756, 87.3848
64x64x64, 157.293, 79.754, 74.5071, 18.1492, 16.8528, 79.754, 60.8875, 32.5433, 30.7747, 89.8815, 85.796
256x64x32, 132.825, 66.4125, 63.5864, 15.1704, 14.722, 67.1587, 52.8949, 25.3268, 24.2972, 78.6463, 75.6598
16x1024x64, 124.588, 64.2011, 61.0845, 14.044, 13.5597, 69.1396, 54.7105, 23.8322, 23.1313, ,
128x128x128, 115.9, 57.9499, 55.9855, 13.5932, 13.1207, 67.4112, 53.4923, 17.2714, 16.5779, 50.8176, 55.5151
Norm. Avg., 1, 0.485616, 0.444501, 0.177843, 0.145152, 0.400623, 0.305111, 0.310627, 0.277263, 0.583992, 0.483028
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
5x5x5, 295.169, 59.4437, 40.7614, 314.125, 323.015, 194.543, 161.508
6x6x6, 422.234, 52.7793, 41.5837, 124.751, 111.264, 177.066, 149.701
7x7x7, 346.328, 20.8815, 21.1301, 143.429, 137.859, ,
9x9x9, 382.883, 89.6753, 56.7943, 144.392, 125.282, 262.128, 212.979
10x10x10, 401.523, 88.7424, 69.5821, 159.045, 137.601, 257.82, 180.095
11x11x11, 278.274, 20.0174, 21.4327, 151.56, 168.067, ,
12x12x12, 380.625, 124.117, 74.1477, 134.338, 107.724, 362.5, 256.601
13x13x13, 262.905, 19.3113, 21.0471, 133.8, 142.72, ,
14x14x14, 377.596, 42.2311, 36.4723, 113.279, 99.2649, ,
15x15x15, 319.802, 132.092, 82.1113, 144.672, 120.322, 368.257, 235.97
24x25x28, 177.556, 106.534, 70.7451, 76.7405, 68.6014, ,
48x48x48, 150.755, 103.425, 67.3829, 41.5633, 39.7078, 103.425, 95.6403
49x49x49, 181.191, 51.1426, 36.5866, 55.956, 51.6985, ,
60x60x60, 158.392, 119.309, 79.1962, 36.4554, 32.8099, 105.595, 90.9581
72x60x56, 159.756, 101.805, 64.0994, 33.7146, 31.2774, ,
75x75x75, 152.587, 118.255, 80.1729, 36.9547, 36.3861, 112.624, 91.8485
80x80x80, 130.933, 100.457, 71.0551, 33.1052, 32.0139, 86.963, 79.8154
84x84x84, 145.107, 80.2356, 53.2815, 27.9509, 26.4342, ,
96x96x96, 126.351, 79.448, 57.6216, 20.8907, 20.1676, 62.4234, 60.9717
105x105x105, 141.325, 77.7289, 54.2295, 32.0899, 31.0916, ,
112x112x112, 119.553, 72.9475, 51.2369, 27.857, 26.9837, ,
120x120x120, 135.975, 109.613, 72.5813, 19.8192, 19.1139, 61.7358, 61.383
144x144x144, 126.771, 100.885, 67.1398, 19.8651, 19.5032, 70.0695, 67.1398
Norm. Avg., 0.990531, 0.443976, 0.304102, 0.331657, 0.315383, 0.658193, 0.542928
@@@@ end