To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ name = Steven G. Johnson
@ email = stevenj@alum.mit.edu
@ organization = MIT
@ computer manufacturer = IBM
@ computer model = RS/6000 Model 3BT
@ CPU manufacturer = IBM
@ CPU model = POWER2
@ CPU speed =
@ RAM = 128 MB
@ L2 cache size =
@ operating system = AIX 3.2
@ C compiler = xlc
@ C compiler flags = -O3 -qarch=pwr2 -qtune=pwr2 -DUSE_ESSL
@ Fortran compiler = xlf
@ Fortran compiler flags = -O3 -qarch=pwr2 -qtune=pwr2
@ remarks =
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
2 (0.000228882 MB)
4 (0.000534058 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)
262144 (25.4987 MB)
Maximum array size = 360360
Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Bailey
5. Beauregard
6. Bergland
7. Brenner
8. Burrus
9. CWP (min N)
10. CWP (best N)
11. Edelblute
12. FFTPACK
13. FFTPACK (f2c)
14. FFTW
15. FFTW_ESTIMATE
16. Frigo-old
17. Green
18. GSL
19. GSL DIT
20. GSL DIF
21. Krukar
22. Mayer (Buneman)
23. Mayer (simple)
24. Mayer (lookup)
25. Monro
26. NAPACK (f2c)
27. Nielsen
28. NR (C)
29. NR (F)
30. Ooura (C)
31. Ooura (F)
32. QFT
33. Ransom
34. SCIPORT
35. Singleton
36. Singleton (f2c)
37. Sorensen
38. Sorensen DIT
39. Temperton
40. Temperton (f2c)
41. Valkenburg
42. ESSL
Computing normalized averages (43 transforms).
Benchmarking for array size = 2 (power of 2): 0. Arndt DIF: elapsed time t=1.36 s, 1048576 iters, t-(init.)=1.11 s t(norm)=0.529289, mflops=9.44663 (err=1.3e-16) 1. 
Arndt DIT: elapsed time t=1.38 s, 1048576 iters, t-(init.)=1.15 s t(norm)=0.548363, mflops=9.11805 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.84 s, 1048576 iters, t-(init.)=1.61 s t(norm)=0.767708, mflops=6.51289 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.65 s, 65536 iters, t-(init.)=1.64 s t(norm)=12.5122, mflops=0.39961 (err=1.3e-16) 4. Bailey: elapsed time t=1.56 s, 524288 iters, t-(init.)=1.44 s t(norm)=1.37329, mflops=3.64089 (err=1.3e-16) 5. Beauregard: elapsed time t=1.14 s, 131072 iters, t-(init.)=1.12 s t(norm)=4.27246, mflops=1.17029 (err=1.3e-16) 6. Bergland: elapsed time t=1.81 s, 262144 iters, t-(init.)=1.75 s t(norm)=3.33786, mflops=1.49797 (err=1.3e-16) 7. Brenner: elapsed time t=1.08 s, 262144 iters, t-(init.)=1.02 s t(norm)=1.9455, mflops=2.57004 (err=1.3e-16) 8. Burrus: elapsed time t=1.01 s, 524288 iters, t-(init.)=0.88 s t(norm)=0.839233, mflops=5.95782 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.71 s, 262144 iters, t-(init.)=1.65 s t(norm)=3.14713, mflops=1.58875 10. CWP (best N) (N=3): elapsed time t=1.78 s, 262144 iters, t-(init.)=1.71 s t(norm)=3.26157, mflops=1.53301 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.33 s, 524288 iters, t-(init.)=1.22 s t(norm)=1.16348, mflops=4.29744 (err=1.3e-16) 13. FFTPACK (f2c): elapsed time t=1.24 s, 524288 iters, t-(init.)=1.11 s t(norm)=1.05858, mflops=4.72332 (err=1.3e-16) FFTW_MEASURE plan: (cost = 9.918213e-07) FFTW_NOTW 2 14. FFTW: elapsed time t=1.18 s, 1048576 iters, t-(init.)=0.94 s t(norm)=0.448227, mflops=11.1551 (err=1.3e-16) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 15. FFTW_ESTIMATE: elapsed time t=1.17 s, 1048576 iters, t-(init.)=0.93 s t(norm)=0.443459, mflops=11.275 (err=1.3e-16) 16. Frigo-old: elapsed time t=1.46 s, 2097152 iters, t-(init.)=0.97 s t(norm)=0.231266, mflops=21.6201 (err=1.3e-16) 17. Skipping fft (Green can't handle this size.). 18. 
GSL: elapsed time t=1.24 s, 524288 iters, t-(init.)=1.12 s t(norm)=1.06812, mflops=4.68114 (err=1.3e-16) 19. GSL DIT: elapsed time t=1.92 s, 524288 iters, t-(init.)=1.8 s t(norm)=1.71661, mflops=2.91271 (err=1.3e-16) 20. GSL DIF: elapsed time t=1.01 s, 262144 iters, t-(init.)=0.96 s t(norm)=1.83105, mflops=2.73067 (err=1.3e-16) 21. Krukar: elapsed time t=1.77 s, 1048576 iters, t-(init.)=1.54 s t(norm)=0.734329, mflops=6.80894 (err=1.3e-16) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.67 s, 262144 iters, t-(init.)=1.61 s t(norm)=3.07083, mflops=1.62822 (err=1.0e-16) 27. Nielsen: elapsed time t=1.47 s, 131072 iters, t-(init.)=1.44 s t(norm)=5.49316, mflops=0.910222 (err=1.3e-16) 28. NR (C): elapsed time t=1.76 s, 524288 iters, t-(init.)=1.65 s t(norm)=1.57356, mflops=3.1775 (err=1.3e-16) 29. NR (F): elapsed time t=1.9 s, 524288 iters, t-(init.)=1.79 s t(norm)=1.70708, mflops=2.92898 (err=1.3e-16) 30. Ooura (C): elapsed time t=1.52 s, 1048576 iters, t-(init.)=1.27 s t(norm)=0.605583, mflops=8.2565 (err=1.3e-16) 31. Ooura (F): elapsed time t=1.43 s, 1048576 iters, t-(init.)=1.19 s t(norm)=0.567436, mflops=8.81156 (err=1.3e-16) 32. Skipping fft (QFT requires N >= 16). 33. Skipping fft (Ransom doesn't work for N=2). 34. Skipping fft (SCIPORT can't handle N < 4). 35. Singleton: elapsed time t=1.26 s, 262144 iters, t-(init.)=1.19 s t(norm)=2.26974, mflops=2.20289 (err=1.3e-16) 36. Singleton (f2c): elapsed time t=1.26 s, 262144 iters, t-(init.)=1.2 s t(norm)=2.28882, mflops=2.18453 (err=1.3e-16) 37. Sorensen: elapsed time t=1.54 s, 1048576 iters, t-(init.)=1.28 s t(norm)=0.610352, mflops=8.192 (err=1.3e-16) 38. Sorensen DIT: elapsed time t=1.05 s, 524288 iters, t-(init.)=0.92 s t(norm)=0.87738, mflops=5.69878 (err=1.3e-16) 39. 
Temperton: elapsed time t=1.51 s, 262144 iters, t-(init.)=1.44 s t(norm)=2.74658, mflops=1.82044 (err=1.3e-16) 40. Temperton (f2c): elapsed time t=1.67 s, 262144 iters, t-(init.)=1.61 s t(norm)=3.07083, mflops=1.62822 (err=1.3e-16) 41. Valkenburg: elapsed time t=1.79 s, 524288 iters, t-(init.)=1.68 s t(norm)=1.60217, mflops=3.12076 (err=1.0e-16) 42. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=2 = 21.6201 Normalized results and averages for N=2: fft 0: mflops = 9.44663 (norm. = 0.436937), norm. avg. (of 1) = 0.436937 fft 1: mflops = 9.11805 (norm. = 0.421739), norm. avg. (of 1) = 0.421739 fft 2: mflops = 6.51289 (norm. = 0.301242), norm. avg. (of 1) = 0.301242 fft 3: mflops = 0.39961 (norm. = 0.0184832), norm. avg. (of 1) = 0.0184832 fft 4: mflops = 3.64089 (norm. = 0.168403), norm. avg. (of 1) = 0.168403 fft 5: mflops = 1.17029 (norm. = 0.0541295), norm. avg. (of 1) = 0.0541295 fft 6: mflops = 1.49797 (norm. = 0.0692857), norm. avg. (of 1) = 0.0692857 fft 7: mflops = 2.57004 (norm. = 0.118873), norm. avg. (of 1) = 0.118873 fft 8: mflops = 5.95782 (norm. = 0.275568), norm. avg. (of 1) = 0.275568 fft 9: mflops = 1.58875 (norm. = 0.0734848), norm. avg. (of 1) = 0.0734848 fft 10: mflops = 1.53301 (norm. = 0.0709064), norm. avg. (of 1) = 0.0709064 fft 11: mflops = -1 (norm. = -0.0462532), norm. avg. (of 0) = -1 fft 12: mflops = 4.29744 (norm. = 0.19877), norm. avg. (of 1) = 0.19877 fft 13: mflops = 4.72332 (norm. = 0.218468), norm. avg. (of 1) = 0.218468 fft 14: mflops = 11.1551 (norm. = 0.515957), norm. avg. (of 1) = 0.515957 fft 15: mflops = 11.275 (norm. = 0.521505), norm. avg. (of 1) = 0.521505 fft 16: mflops = 21.6201 (norm. = 1), norm. avg. (of 1) = 1 fft 17: mflops = -1 (norm. = -0.0462532), norm. avg. (of 0) = -1 fft 18: mflops = 4.68114 (norm. = 0.216518), norm. avg. (of 1) = 0.216518 fft 19: mflops = 2.91271 (norm. = 0.134722), norm. avg. (of 1) = 0.134722 fft 20: mflops = 2.73067 (norm. = 0.126302), norm. avg. 
(of 1) = 0.126302 fft 21: mflops = 6.80894 (norm. = 0.314935), norm. avg. (of 1) = 0.314935 fft 22: mflops = -1 (norm. = -0.0462532), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.0462532), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.0462532), norm. avg. (of 0) = -1 fft 25: mflops = -1 (norm. = -0.0462532), norm. avg. (of 0) = -1 fft 26: mflops = 1.62822 (norm. = 0.0753106), norm. avg. (of 1) = 0.0753106 fft 27: mflops = 0.910222 (norm. = 0.0421007), norm. avg. (of 1) = 0.0421007 fft 28: mflops = 3.1775 (norm. = 0.14697), norm. avg. (of 1) = 0.14697 fft 29: mflops = 2.92898 (norm. = 0.135475), norm. avg. (of 1) = 0.135475 fft 30: mflops = 8.2565 (norm. = 0.38189), norm. avg. (of 1) = 0.38189 fft 31: mflops = 8.81156 (norm. = 0.407563), norm. avg. (of 1) = 0.407563 fft 32: mflops = -1 (norm. = -0.0462532), norm. avg. (of 0) = -1 fft 33: mflops = -1 (norm. = -0.0462532), norm. avg. (of 0) = -1 fft 34: mflops = -1 (norm. = -0.0462532), norm. avg. (of 0) = -1 fft 35: mflops = 2.20289 (norm. = 0.101891), norm. avg. (of 1) = 0.101891 fft 36: mflops = 2.18453 (norm. = 0.101042), norm. avg. (of 1) = 0.101042 fft 37: mflops = 8.192 (norm. = 0.378906), norm. avg. (of 1) = 0.378906 fft 38: mflops = 5.69878 (norm. = 0.263587), norm. avg. (of 1) = 0.263587 fft 39: mflops = 1.82044 (norm. = 0.0842014), norm. avg. (of 1) = 0.0842014 fft 40: mflops = 1.62822 (norm. = 0.0753106), norm. avg. (of 1) = 0.0753106 fft 41: mflops = 3.12076 (norm. = 0.144345), norm. avg. (of 1) = 0.144345 fft 42: mflops = -1 (norm. = -0.0462532), norm. avg. (of 0) = -1 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.21 s, 524288 iters, t-(init.)=1.07 s t(norm)=0.255108, mflops=19.5996 (err=8.5e-17) 1. Arndt DIT: elapsed time t=1.19 s, 524288 iters, t-(init.)=1.05 s t(norm)=0.25034, mflops=19.9729 (err=8.5e-17) 2. Arndt Split-Radix: elapsed time t=1.07 s, 262144 iters, t-(init.)=1 s t(norm)=0.476837, mflops=10.4858 (err=8.5e-17) 3. 
Arndt 4-step: elapsed time t=1.39 s, 65536 iters, t-(init.)=1.37 s t(norm)=2.61307, mflops=1.91346 (err=8.5e-17) 4. Bailey: elapsed time t=1.45 s, 262144 iters, t-(init.)=1.38 s t(norm)=0.658035, mflops=7.59838 (err=8.5e-17) 5. Beauregard: elapsed time t=1.35 s, 131072 iters, t-(init.)=1.31 s t(norm)=1.24931, mflops=4.0022 (err=1.4e-16) 6. Bergland: elapsed time t=1.99 s, 262144 iters, t-(init.)=1.91 s t(norm)=0.910759, mflops=5.48993 (err=1.4e-16) 7. Brenner: elapsed time t=1.71 s, 262144 iters, t-(init.)=1.63 s t(norm)=0.777245, mflops=6.43298 (err=1.4e-16) 8. Burrus: elapsed time t=1.4 s, 262144 iters, t-(init.)=1.31 s t(norm)=0.624657, mflops=8.0044 (err=8.5e-17) 9. CWP (min N): elapsed time t=1.81 s, 262144 iters, t-(init.)=1.74 s t(norm)=0.829697, mflops=6.0263 10. CWP (best N) (N=15): elapsed time t=1.44 s, 131072 iters, t-(init.)=1.36 s t(norm)=1.297, mflops=3.85506 11. Edelblute: elapsed time t=1.36 s, 262144 iters, t-(init.)=1.28 s t(norm)=0.610352, mflops=8.192 (err=8.5e-17) 12. FFTPACK: elapsed time t=1.63 s, 524288 iters, t-(init.)=1.48 s t(norm)=0.352859, mflops=14.1699 (err=1.4e-16) 13. FFTPACK (f2c): elapsed time t=1.61 s, 524288 iters, t-(init.)=1.46 s t(norm)=0.348091, mflops=14.3641 (err=1.4e-16) FFTW_MEASURE plan: (cost = 1.296997e-06) FFTW_NOTW 4 14. FFTW: elapsed time t=1.45 s, 1048576 iters, t-(init.)=1.18 s t(norm)=0.140667, mflops=35.5449 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 15. FFTW_ESTIMATE: elapsed time t=1.46 s, 1048576 iters, t-(init.)=1.16 s t(norm)=0.138283, mflops=36.1578 (err=1.4e-16) 16. Frigo-old: elapsed time t=1.13 s, 1048576 iters, t-(init.)=0.8 s t(norm)=0.0953674, mflops=52.4288 (err=1.4e-16) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.06 s, 262144 iters, t-(init.)=0.99 s t(norm)=0.472069, mflops=10.5917 (err=1.4e-16) 19. GSL DIT: elapsed time t=1.89 s, 262144 iters, t-(init.)=1.82 s t(norm)=0.867844, mflops=5.76141 (err=1.3e-16) 20. 
GSL DIF: elapsed time t=1.82 s, 262144 iters, t-(init.)=1.74 s t(norm)=0.829697, mflops=6.0263 (err=1.3e-16) 21. Krukar: elapsed time t=1.24 s, 524288 iters, t-(init.)=1.1 s t(norm)=0.26226, mflops=19.065 (err=1.4e-16) 22. Mayer (Buneman): elapsed time t=1.95 s, 524288 iters, t-(init.)=1.79 s t(norm)=0.426769, mflops=11.7159 (err=9.8e-17) 23. Mayer (simple): elapsed time t=1.93 s, 524288 iters, t-(init.)=1.77 s t(norm)=0.422001, mflops=11.8483 24. Mayer (lookup): elapsed time t=1.09 s, 262144 iters, t-(init.)=1.02 s t(norm)=0.486374, mflops=10.2802 (err=9.8e-17) 25. Monro: elapsed time t=1.08 s, 65536 iters, t-(init.)=1.06 s t(norm)=2.02179, mflops=2.47306 (err=8.5e-17) 26. NAPACK (f2c): elapsed time t=1.53 s, 131072 iters, t-(init.)=1.49 s t(norm)=1.42097, mflops=3.51871 (err=2.0e-16) 27. Nielsen: elapsed time t=1.51 s, 131072 iters, t-(init.)=1.47 s t(norm)=1.4019, mflops=3.56659 (err=8.5e-17) 28. NR (C): elapsed time t=1.6 s, 262144 iters, t-(init.)=1.53 s t(norm)=0.729561, mflops=6.85344 (err=1.3e-16) 29. NR (F): elapsed time t=1.86 s, 262144 iters, t-(init.)=1.78 s t(norm)=0.84877, mflops=5.89088 (err=1.3e-16) 30. Ooura (C): elapsed time t=1.43 s, 524288 iters, t-(init.)=1.28 s t(norm)=0.305176, mflops=16.384 (err=1.4e-16) 31. Ooura (F): elapsed time t=1.42 s, 524288 iters, t-(init.)=1.28 s t(norm)=0.305176, mflops=16.384 (err=1.4e-16) 32. Skipping fft (QFT requires N >= 16). 33. Ransom: elapsed time t=1.65 s, 65536 iters, t-(init.)=1.63 s t(norm)=3.10898, mflops=1.60825 (err=1.9e-16) 34. SCIPORT: elapsed time t=1.58 s, 524288 iters, t-(init.)=1.44 s t(norm)=0.343323, mflops=14.5636 (err=8.7e-09) 35. Singleton: elapsed time t=1.31 s, 262144 iters, t-(init.)=1.23 s t(norm)=0.58651, mflops=8.52501 (err=1.4e-16) 36. Singleton (f2c): elapsed time t=1.26 s, 262144 iters, t-(init.)=1.19 s t(norm)=0.567436, mflops=8.81156 (err=1.4e-16) 37. Sorensen: elapsed time t=1.81 s, 524288 iters, t-(init.)=1.66 s t(norm)=0.395775, mflops=12.6334 (err=8.5e-17) 38. 
Sorensen DIT: elapsed time t=1.43 s, 262144 iters, t-(init.)=1.36 s t(norm)=0.648499, mflops=7.71012 (err=8.5e-17) 39. Temperton: elapsed time t=1.73 s, 262144 iters, t-(init.)=1.65 s t(norm)=0.786781, mflops=6.35501 (err=1.4e-16) 40. Temperton (f2c): elapsed time t=1 s, 131072 iters, t-(init.)=0.96 s t(norm)=0.915527, mflops=5.46133 (err=1.4e-16) 41. Valkenburg: elapsed time t=1.57 s, 131072 iters, t-(init.)=1.53 s t(norm)=1.45912, mflops=3.42672 (err=2.0e-16) 42. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=4 = 52.4288 Normalized results and averages for N=4: fft 0: mflops = 19.5996 (norm. = 0.373832), norm. avg. (of 2) = 0.405384 fft 1: mflops = 19.9729 (norm. = 0.380952), norm. avg. (of 2) = 0.401346 fft 2: mflops = 10.4858 (norm. = 0.2), norm. avg. (of 2) = 0.250621 fft 3: mflops = 1.91346 (norm. = 0.0364964), norm. avg. (of 2) = 0.0274898 fft 4: mflops = 7.59838 (norm. = 0.144928), norm. avg. (of 2) = 0.156665 fft 5: mflops = 4.0022 (norm. = 0.0763359), norm. avg. (of 2) = 0.0652327 fft 6: mflops = 5.48993 (norm. = 0.104712), norm. avg. (of 2) = 0.0869989 fft 7: mflops = 6.43298 (norm. = 0.122699), norm. avg. (of 2) = 0.120786 fft 8: mflops = 8.0044 (norm. = 0.152672), norm. avg. (of 2) = 0.21412 fft 9: mflops = 6.0263 (norm. = 0.114943), norm. avg. (of 2) = 0.0942137 fft 10: mflops = 3.85506 (norm. = 0.0735294), norm. avg. (of 2) = 0.0722179 fft 11: mflops = 8.192 (norm. = 0.15625), norm. avg. (of 1) = 0.15625 fft 12: mflops = 14.1699 (norm. = 0.27027), norm. avg. (of 2) = 0.23452 fft 13: mflops = 14.3641 (norm. = 0.273973), norm. avg. (of 2) = 0.246221 fft 14: mflops = 35.5449 (norm. = 0.677966), norm. avg. (of 2) = 0.596962 fft 15: mflops = 36.1578 (norm. = 0.689655), norm. avg. (of 2) = 0.60558 fft 16: mflops = 52.4288 (norm. = 1), norm. avg. (of 2) = 1 fft 17: mflops = -1 (norm. = -0.0190735), norm. avg. (of 0) = -1 fft 18: mflops = 10.5917 (norm. = 0.20202), norm. avg. 
(of 2) = 0.209269 fft 19: mflops = 5.76141 (norm. = 0.10989), norm. avg. (of 2) = 0.122306 fft 20: mflops = 6.0263 (norm. = 0.114943), norm. avg. (of 2) = 0.120622 fft 21: mflops = 19.065 (norm. = 0.363636), norm. avg. (of 2) = 0.339286 fft 22: mflops = 11.7159 (norm. = 0.223464), norm. avg. (of 1) = 0.223464 fft 23: mflops = 11.8483 (norm. = 0.225989), norm. avg. (of 1) = 0.225989 fft 24: mflops = 10.2802 (norm. = 0.196078), norm. avg. (of 1) = 0.196078 fft 25: mflops = 2.47306 (norm. = 0.0471698), norm. avg. (of 1) = 0.0471698 fft 26: mflops = 3.51871 (norm. = 0.0671141), norm. avg. (of 2) = 0.0712123 fft 27: mflops = 3.56659 (norm. = 0.0680272), norm. avg. (of 2) = 0.055064 fft 28: mflops = 6.85344 (norm. = 0.130719), norm. avg. (of 2) = 0.138844 fft 29: mflops = 5.89088 (norm. = 0.11236), norm. avg. (of 2) = 0.123917 fft 30: mflops = 16.384 (norm. = 0.3125), norm. avg. (of 2) = 0.347195 fft 31: mflops = 16.384 (norm. = 0.3125), norm. avg. (of 2) = 0.360032 fft 32: mflops = -1 (norm. = -0.0190735), norm. avg. (of 0) = -1 fft 33: mflops = 1.60825 (norm. = 0.0306748), norm. avg. (of 1) = 0.0306748 fft 34: mflops = 14.5636 (norm. = 0.277778), norm. avg. (of 1) = 0.277778 fft 35: mflops = 8.52501 (norm. = 0.162602), norm. avg. (of 2) = 0.132246 fft 36: mflops = 8.81156 (norm. = 0.168067), norm. avg. (of 2) = 0.134554 fft 37: mflops = 12.6334 (norm. = 0.240964), norm. avg. (of 2) = 0.309935 fft 38: mflops = 7.71012 (norm. = 0.147059), norm. avg. (of 2) = 0.205323 fft 39: mflops = 6.35501 (norm. = 0.121212), norm. avg. (of 2) = 0.102707 fft 40: mflops = 5.46133 (norm. = 0.104167), norm. avg. (of 2) = 0.0897386 fft 41: mflops = 3.42672 (norm. = 0.0653595), norm. avg. (of 2) = 0.104852 fft 42: mflops = -1 (norm. = -0.0190735), norm. avg. (of 0) = -1 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.08 s, 262144 iters, t-(init.)=0.98 s t(norm)=0.155767, mflops=32.0993 (err=1.4e-16) 1. 
Arndt DIT: elapsed time t=1.09 s, 262144 iters, t-(init.)=0.98 s t(norm)=0.155767, mflops=32.0993 (err=1.4e-16) 2. Arndt Split-Radix: elapsed time t=1.13 s, 131072 iters, t-(init.)=1.07 s t(norm)=0.340144, mflops=14.6997 (err=1.5e-16) 3. Arndt 4-step: elapsed time t=1.51 s, 32768 iters, t-(init.)=1.5 s t(norm)=1.90735, mflops=2.62144 (err=1.4e-16) 4. Bailey: elapsed time t=1.06 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.32107, mflops=15.5729 (err=1.4e-16) 5. Beauregard: elapsed time t=1.73 s, 65536 iters, t-(init.)=1.71 s t(norm)=1.08719, mflops=4.59902 (err=1.5e-16) 6. Bergland: elapsed time t=1.51 s, 131072 iters, t-(init.)=1.46 s t(norm)=0.464122, mflops=10.773 (err=1.5e-16) 7. Brenner: elapsed time t=1.7 s, 131072 iters, t-(init.)=1.64 s t(norm)=0.521342, mflops=9.59063 (err=1.7e-16) 8. Burrus: elapsed time t=1.68 s, 131072 iters, t-(init.)=1.64 s t(norm)=0.521342, mflops=9.59063 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.04 s, 131072 iters, t-(init.)=0.99 s t(norm)=0.314713, mflops=15.8875 10. CWP (best N) (N=15): elapsed time t=1.45 s, 131072 iters, t-(init.)=1.38 s t(norm)=0.43869, mflops=11.3976 11. Edelblute: elapsed time t=1.72 s, 131072 iters, t-(init.)=1.68 s t(norm)=0.534058, mflops=9.36229 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.3 s, 262144 iters, t-(init.)=1.19 s t(norm)=0.189145, mflops=26.4347 (err=1.7e-16) 13. FFTPACK (f2c): elapsed time t=1.31 s, 262144 iters, t-(init.)=1.21 s t(norm)=0.192324, mflops=25.9978 (err=1.7e-16) FFTW_MEASURE plan: (cost = 2.212524e-06) FFTW_NOTW 8 14. FFTW: elapsed time t=1.21 s, 524288 iters, t-(init.)=0.99 s t(norm)=0.0786781, mflops=63.5501 (err=1.5e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.17 s, 524288 iters, t-(init.)=0.97 s t(norm)=0.0770887, mflops=64.8604 (err=1.5e-16) 16. Frigo-old: elapsed time t=1.86 s, 1048576 iters, t-(init.)=1.45 s t(norm)=0.0576178, mflops=86.7787 (err=1.5e-16) 17. 
Green: elapsed time t=1.14 s, 262144 iters, t-(init.)=1.03 s t(norm)=0.163714, mflops=30.541 (err=1.3e-16) 18. GSL: elapsed time t=1.72 s, 262144 iters, t-(init.)=1.61 s t(norm)=0.255903, mflops=19.5387 (err=1.5e-16) 19. GSL DIT: elapsed time t=1.45 s, 131072 iters, t-(init.)=1.4 s t(norm)=0.445048, mflops=11.2347 (err=2.1e-16) 20. GSL DIF: elapsed time t=1.44 s, 131072 iters, t-(init.)=1.39 s t(norm)=0.441869, mflops=11.3156 (err=2.3e-16) 21. Krukar: elapsed time t=1 s, 262144 iters, t-(init.)=0.9 s t(norm)=0.143051, mflops=34.9525 (err=2.1e-16) 22. Mayer (Buneman): elapsed time t=1.66 s, 262144 iters, t-(init.)=1.56 s t(norm)=0.247955, mflops=20.1649 (err=1.2e-16) 23. Mayer (simple): elapsed time t=1.6 s, 262144 iters, t-(init.)=1.49 s t(norm)=0.236829, mflops=21.1123 24. Mayer (lookup): elapsed time t=1.76 s, 262144 iters, t-(init.)=1.65 s t(norm)=0.26226, mflops=19.065 (err=1.2e-16) 25. Monro: elapsed time t=1.27 s, 65536 iters, t-(init.)=1.23 s t(norm)=0.782013, mflops=6.39376 (err=1.5e-16) 26. NAPACK (f2c): elapsed time t=1.25 s, 65536 iters, t-(init.)=1.23 s t(norm)=0.782013, mflops=6.39376 (err=2.8e-16) 27. Nielsen: elapsed time t=1.75 s, 131072 iters, t-(init.)=1.69 s t(norm)=0.537237, mflops=9.30689 (err=1.1e-15) 28. NR (C): elapsed time t=1.27 s, 131072 iters, t-(init.)=1.22 s t(norm)=0.387828, mflops=12.8923 (err=1.6e-16) 29. NR (F): elapsed time t=1.55 s, 131072 iters, t-(init.)=1.49 s t(norm)=0.473658, mflops=10.5561 (err=1.6e-16) 30. Ooura (C): elapsed time t=1.08 s, 262144 iters, t-(init.)=0.98 s t(norm)=0.155767, mflops=32.0993 (err=1.1e-16) 31. Ooura (F): elapsed time t=1.14 s, 262144 iters, t-(init.)=1.03 s t(norm)=0.163714, mflops=30.541 (err=1.1e-16) 32. Skipping fft (QFT requires N >= 16). 33. Ransom: elapsed time t=1.06 s, 16384 iters, t-(init.)=1.06 s t(norm)=2.69572, mflops=1.85479 (err=2.7e-16) 34. SCIPORT: elapsed time t=1.49 s, 262144 iters, t-(init.)=1.37 s t(norm)=0.217756, mflops=22.9615 (err=2.2e-08) 35. 
Singleton: elapsed time t=1.8 s, 131072 iters, t-(init.)=1.76 s t(norm)=0.559489, mflops=8.93673 (err=2.0e-16) 36. Singleton (f2c): elapsed time t=1.71 s, 131072 iters, t-(init.)=1.66 s t(norm)=0.5277, mflops=9.47508 (err=1.7e-16) 37. Sorensen: elapsed time t=1.76 s, 262144 iters, t-(init.)=1.66 s t(norm)=0.26385, mflops=18.9502 (err=1.5e-16) 38. Sorensen DIT: elapsed time t=1.68 s, 131072 iters, t-(init.)=1.63 s t(norm)=0.518163, mflops=9.64947 (err=1.3e-16) 39. Temperton: elapsed time t=1.35 s, 131072 iters, t-(init.)=1.29 s t(norm)=0.41008, mflops=12.1927 (err=1.5e-16) 40. Temperton (f2c): elapsed time t=1.77 s, 131072 iters, t-(init.)=1.71 s t(norm)=0.543594, mflops=9.19804 (err=1.9e-16) 41. Valkenburg: elapsed time t=1.06 s, 32768 iters, t-(init.)=1.05 s t(norm)=1.33514, mflops=3.74491 (err=2.6e-16) 42. ESSL: elapsed time t=1.63 s, 262144 iters, t-(init.)=1.51 s t(norm)=0.240008, mflops=20.8326 (err=1.5e-16) Top mflops for N=8 = 86.7787 Normalized results and averages for N=8: fft 0: mflops = 32.0993 (norm. = 0.369898), norm. avg. (of 3) = 0.393556 fft 1: mflops = 32.0993 (norm. = 0.369898), norm. avg. (of 3) = 0.390863 fft 2: mflops = 14.6997 (norm. = 0.169393), norm. avg. (of 3) = 0.223545 fft 3: mflops = 2.62144 (norm. = 0.0302083), norm. avg. (of 3) = 0.028396 fft 4: mflops = 15.5729 (norm. = 0.179455), norm. avg. (of 3) = 0.164262 fft 5: mflops = 4.59902 (norm. = 0.0529971), norm. avg. (of 3) = 0.0611541 fft 6: mflops = 10.773 (norm. = 0.124144), norm. avg. (of 3) = 0.0993805 fft 7: mflops = 9.59063 (norm. = 0.110518), norm. avg. (of 3) = 0.117363 fft 8: mflops = 9.59063 (norm. = 0.110518), norm. avg. (of 3) = 0.179586 fft 9: mflops = 15.8875 (norm. = 0.183081), norm. avg. (of 3) = 0.123836 fft 10: mflops = 11.3976 (norm. = 0.131341), norm. avg. (of 3) = 0.0919255 fft 11: mflops = 9.36229 (norm. = 0.107887), norm. avg. (of 2) = 0.132068 fft 12: mflops = 26.4347 (norm. = 0.304622), norm. avg. (of 3) = 0.257888 fft 13: mflops = 25.9978 (norm. 
= 0.299587), norm. avg. (of 3) = 0.264009 fft 14: mflops = 63.5501 (norm. = 0.732323), norm. avg. (of 3) = 0.642082 fft 15: mflops = 64.8604 (norm. = 0.747423), norm. avg. (of 3) = 0.652861 fft 16: mflops = 86.7787 (norm. = 1), norm. avg. (of 3) = 1 fft 17: mflops = 30.541 (norm. = 0.351942), norm. avg. (of 1) = 0.351942 fft 18: mflops = 19.5387 (norm. = 0.225155), norm. avg. (of 3) = 0.214564 fft 19: mflops = 11.2347 (norm. = 0.129464), norm. avg. (of 3) = 0.124692 fft 20: mflops = 11.3156 (norm. = 0.130396), norm. avg. (of 3) = 0.12388 fft 21: mflops = 34.9525 (norm. = 0.402778), norm. avg. (of 3) = 0.36045 fft 22: mflops = 20.1649 (norm. = 0.232372), norm. avg. (of 2) = 0.227918 fft 23: mflops = 21.1123 (norm. = 0.243289), norm. avg. (of 2) = 0.234639 fft 24: mflops = 19.065 (norm. = 0.219697), norm. avg. (of 2) = 0.207888 fft 25: mflops = 6.39376 (norm. = 0.0736789), norm. avg. (of 2) = 0.0604243 fft 26: mflops = 6.39376 (norm. = 0.0736789), norm. avg. (of 3) = 0.0720345 fft 27: mflops = 9.30689 (norm. = 0.107249), norm. avg. (of 3) = 0.0724588 fft 28: mflops = 12.8923 (norm. = 0.148566), norm. avg. (of 3) = 0.142085 fft 29: mflops = 10.5561 (norm. = 0.121644), norm. avg. (of 3) = 0.12316 fft 30: mflops = 32.0993 (norm. = 0.369898), norm. avg. (of 3) = 0.354763 fft 31: mflops = 30.541 (norm. = 0.351942), norm. avg. (of 3) = 0.357335 fft 32: mflops = -1 (norm. = -0.0115236), norm. avg. (of 0) = -1 fft 33: mflops = 1.85479 (norm. = 0.0213738), norm. avg. (of 2) = 0.0260243 fft 34: mflops = 22.9615 (norm. = 0.264599), norm. avg. (of 2) = 0.271188 fft 35: mflops = 8.93673 (norm. = 0.102983), norm. avg. (of 3) = 0.122492 fft 36: mflops = 9.47508 (norm. = 0.109187), norm. avg. (of 3) = 0.126099 fft 37: mflops = 18.9502 (norm. = 0.218373), norm. avg. (of 3) = 0.279415 fft 38: mflops = 9.64947 (norm. = 0.111196), norm. avg. (of 3) = 0.173947 fft 39: mflops = 12.1927 (norm. = 0.140504), norm. avg. (of 3) = 0.115306 fft 40: mflops = 9.19804 (norm. = 0.105994), norm. avg. 
(of 3) = 0.0951571 fft 41: mflops = 3.74491 (norm. = 0.0431548), norm. avg. (of 3) = 0.0842865 fft 42: mflops = 20.8326 (norm. = 0.240066), norm. avg. (of 1) = 0.240066 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.71 s, 131072 iters, t-(init.)=1.62 s t(norm)=0.193119, mflops=25.8908 (err=2.0e-16) 1. Arndt DIT: elapsed time t=1.82 s, 131072 iters, t-(init.)=1.73 s t(norm)=0.206232, mflops=24.2445 (err=1.7e-16) 2. Arndt Split-Radix: elapsed time t=1.22 s, 65536 iters, t-(init.)=1.17 s t(norm)=0.27895, mflops=17.9244 (err=1.7e-16) 3. Arndt 4-step: elapsed time t=1.07 s, 16384 iters, t-(init.)=1.06 s t(norm)=1.01089, mflops=4.94611 (err=2.0e-16) 4. Bailey: elapsed time t=1.89 s, 131072 iters, t-(init.)=1.79 s t(norm)=0.213385, mflops=23.4319 (err=1.9e-16) 5. Beauregard: elapsed time t=1.81 s, 32768 iters, t-(init.)=1.78 s t(norm)=0.84877, mflops=5.89088 (err=2.3e-16) 6. Bergland: elapsed time t=1.02 s, 65536 iters, t-(init.)=0.97 s t(norm)=0.231266, mflops=21.6201 (err=2.3e-16) 7. Brenner: elapsed time t=1.48 s, 65536 iters, t-(init.)=1.44 s t(norm)=0.343323, mflops=14.5636 (err=2.0e-16) 8. Burrus: elapsed time t=1.88 s, 65536 iters, t-(init.)=1.84 s t(norm)=0.43869, mflops=11.3976 (err=1.7e-16) 9. CWP (min N): elapsed time t=1.41 s, 131072 iters, t-(init.)=1.32 s t(norm)=0.157356, mflops=31.775 10. CWP (best N) (N=28): elapsed time t=1.93 s, 131072 iters, t-(init.)=1.78 s t(norm)=0.212193, mflops=23.5635 11. Edelblute: elapsed time t=1.85 s, 65536 iters, t-(init.)=1.81 s t(norm)=0.431538, mflops=11.5865 (err=1.7e-16) 12. FFTPACK: elapsed time t=1.87 s, 262144 iters, t-(init.)=1.68 s t(norm)=0.100136, mflops=49.9322 (err=2.0e-16) 13. FFTPACK (f2c): elapsed time t=1 s, 131072 iters, t-(init.)=0.92 s t(norm)=0.109673, mflops=45.5903 (err=2.0e-16) FFTW_MEASURE plan: (cost = 3.967285e-06) FFTW_NOTW 16 14. 
FFTW: elapsed time t=1.04 s, 262144 iters, t-(init.)=0.86 s t(norm)=0.05126, mflops=97.542 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.04 s, 262144 iters, t-(init.)=0.86 s t(norm)=0.05126, mflops=97.542 (err=2.1e-16) 16. Frigo-old: elapsed time t=1.04 s, 262144 iters, t-(init.)=0.86 s t(norm)=0.05126, mflops=97.542 (err=1.9e-16) 17. Green: elapsed time t=1.2 s, 131072 iters, t-(init.)=1.11 s t(norm)=0.132322, mflops=37.7865 (err=1.6e-16) 18. GSL: elapsed time t=1.5 s, 131072 iters, t-(init.)=1.41 s t(norm)=0.168085, mflops=29.7468 (err=2.0e-16) 19. GSL DIT: elapsed time t=1.29 s, 65536 iters, t-(init.)=1.25 s t(norm)=0.298023, mflops=16.7772 (err=2.3e-16) 20. GSL DIF: elapsed time t=1.22 s, 65536 iters, t-(init.)=1.17 s t(norm)=0.27895, mflops=17.9244 (err=1.6e-16) 21. Krukar: elapsed time t=1.84 s, 262144 iters, t-(init.)=1.66 s t(norm)=0.0989437, mflops=50.5338 (err=2.1e-16) 22. Mayer (Buneman): elapsed time t=1.11 s, 65536 iters, t-(init.)=1.07 s t(norm)=0.255108, mflops=19.5996 (err=2.1e-16) 23. Mayer (simple): elapsed time t=1.9 s, 131072 iters, t-(init.)=1.82 s t(norm)=0.216961, mflops=23.0456 24. Mayer (lookup): elapsed time t=1.03 s, 65536 iters, t-(init.)=0.99 s t(norm)=0.236034, mflops=21.1834 (err=1.7e-16) 25. Monro: elapsed time t=1.81 s, 65536 iters, t-(init.)=1.77 s t(norm)=0.422001, mflops=11.8483 (err=1.9e-16) 26. NAPACK (f2c): elapsed time t=1.16 s, 32768 iters, t-(init.)=1.14 s t(norm)=0.543594, mflops=9.19804 (err=2.8e-16) 27. Nielsen: elapsed time t=1.98 s, 65536 iters, t-(init.)=1.94 s t(norm)=0.462532, mflops=10.8101 (err=1.9e-16) 28. NR (C): elapsed time t=1.08 s, 65536 iters, t-(init.)=1.04 s t(norm)=0.247955, mflops=20.1649 (err=2.1e-16) 29. NR (F): elapsed time t=1.37 s, 65536 iters, t-(init.)=1.32 s t(norm)=0.314713, mflops=15.8875 (err=2.1e-16) 30. Ooura (C): elapsed time t=1.01 s, 131072 iters, t-(init.)=0.93 s t(norm)=0.110865, mflops=45.1 (err=1.9e-16) 31. 
Ooura (F): elapsed time t=1.02 s, 131072 iters, t-(init.)=0.93 s t(norm)=0.110865, mflops=45.1 (err=1.9e-16) 32. QFT: elapsed time t=1.37 s, 131072 iters, t-(init.)=1.28 s t(norm)=0.152588, mflops=32.768 (err=2.0e-16) 33. Ransom: elapsed time t=1.54 s, 32768 iters, t-(init.)=1.52 s t(norm)=0.724792, mflops=6.89853 (err=3.1e-16) 34. SCIPORT: elapsed time t=1.38 s, 131072 iters, t-(init.)=1.29 s t(norm)=0.15378, mflops=32.514 (err=2.6e-08) 35. Singleton: elapsed time t=1.76 s, 131072 iters, t-(init.)=1.67 s t(norm)=0.19908, mflops=25.1156 (err=2.4e-16) 36. Singleton (f2c): elapsed time t=1.78 s, 131072 iters, t-(init.)=1.69 s t(norm)=0.201464, mflops=24.8184 (err=2.4e-16) 37. Sorensen: elapsed time t=1.8 s, 131072 iters, t-(init.)=1.72 s t(norm)=0.20504, mflops=24.3855 (err=1.8e-16) 38. Sorensen DIT: elapsed time t=1.88 s, 65536 iters, t-(init.)=1.84 s t(norm)=0.43869, mflops=11.3976 (err=1.9e-16) 39. Temperton: elapsed time t=1.18 s, 65536 iters, t-(init.)=1.13 s t(norm)=0.269413, mflops=18.5589 (err=1.9e-16) 40. Temperton (f2c): elapsed time t=1.62 s, 65536 iters, t-(init.)=1.57 s t(norm)=0.374317, mflops=13.3577 (err=1.9e-16) 41. Valkenburg: elapsed time t=1.33 s, 16384 iters, t-(init.)=1.32 s t(norm)=1.25885, mflops=3.97188 (err=3.1e-16) 42. ESSL: elapsed time t=1.06 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.115633, mflops=43.2402 (err=1.9e-16) Top mflops for N=16 = 97.542 Normalized results and averages for N=16: fft 0: mflops = 25.8908 (norm. = 0.265432), norm. avg. (of 4) = 0.361525 fft 1: mflops = 24.2445 (norm. = 0.248555), norm. avg. (of 4) = 0.355286 fft 2: mflops = 17.9244 (norm. = 0.183761), norm. avg. (of 4) = 0.213599 fft 3: mflops = 4.94611 (norm. = 0.0507075), norm. avg. (of 4) = 0.0339739 fft 4: mflops = 23.4319 (norm. = 0.240223), norm. avg. (of 4) = 0.183252 fft 5: mflops = 5.89088 (norm. = 0.0603933), norm. avg. (of 4) = 0.0609639 fft 6: mflops = 21.6201 (norm. = 0.221649), norm. avg. (of 4) = 0.129948 fft 7: mflops = 14.5636 (norm. 
= 0.149306), norm. avg. (of 4) = 0.125349 fft 8: mflops = 11.3976 (norm. = 0.116848), norm. avg. (of 4) = 0.163902 fft 9: mflops = 31.775 (norm. = 0.325758), norm. avg. (of 4) = 0.174316 fft 10: mflops = 23.5635 (norm. = 0.241573), norm. avg. (of 4) = 0.129337 fft 11: mflops = 11.5865 (norm. = 0.118785), norm. avg. (of 3) = 0.12764 fft 12: mflops = 49.9322 (norm. = 0.511905), norm. avg. (of 4) = 0.321392 fft 13: mflops = 45.5903 (norm. = 0.467391), norm. avg. (of 4) = 0.314855 fft 14: mflops = 97.542 (norm. = 1), norm. avg. (of 4) = 0.731562 fft 15: mflops = 97.542 (norm. = 1), norm. avg. (of 4) = 0.739646 fft 16: mflops = 97.542 (norm. = 1), norm. avg. (of 4) = 1 fft 17: mflops = 37.7865 (norm. = 0.387387), norm. avg. (of 2) = 0.369665 fft 18: mflops = 29.7468 (norm. = 0.304965), norm. avg. (of 4) = 0.237164 fft 19: mflops = 16.7772 (norm. = 0.172), norm. avg. (of 4) = 0.136519 fft 20: mflops = 17.9244 (norm. = 0.183761), norm. avg. (of 4) = 0.13885 fft 21: mflops = 50.5338 (norm. = 0.518072), norm. avg. (of 4) = 0.399855 fft 22: mflops = 19.5996 (norm. = 0.200935), norm. avg. (of 3) = 0.218923 fft 23: mflops = 23.0456 (norm. = 0.236264), norm. avg. (of 3) = 0.23518 fft 24: mflops = 21.1834 (norm. = 0.217172), norm. avg. (of 3) = 0.210982 fft 25: mflops = 11.8483 (norm. = 0.121469), norm. avg. (of 3) = 0.0807725 fft 26: mflops = 9.19804 (norm. = 0.0942982), norm. avg. (of 4) = 0.0776004 fft 27: mflops = 10.8101 (norm. = 0.110825), norm. avg. (of 4) = 0.0820503 fft 28: mflops = 20.1649 (norm. = 0.206731), norm. avg. (of 4) = 0.158246 fft 29: mflops = 15.8875 (norm. = 0.162879), norm. avg. (of 4) = 0.133089 fft 30: mflops = 45.1 (norm. = 0.462366), norm. avg. (of 4) = 0.381663 fft 31: mflops = 45.1 (norm. = 0.462366), norm. avg. (of 4) = 0.383593 fft 32: mflops = 32.768 (norm. = 0.335938), norm. avg. (of 1) = 0.335938 fft 33: mflops = 6.89853 (norm. = 0.0707237), norm. avg. (of 3) = 0.0409241 fft 34: mflops = 32.514 (norm. = 0.333333), norm. avg. 
(of 3) = 0.291903 fft 35: mflops = 25.1156 (norm. = 0.257485), norm. avg. (of 4) = 0.15624 fft 36: mflops = 24.8184 (norm. = 0.254438), norm. avg. (of 4) = 0.158183 fft 37: mflops = 24.3855 (norm. = 0.25), norm. avg. (of 4) = 0.272061 fft 38: mflops = 11.3976 (norm. = 0.116848), norm. avg. (of 4) = 0.159672 fft 39: mflops = 18.5589 (norm. = 0.190265), norm. avg. (of 4) = 0.134046 fft 40: mflops = 13.3577 (norm. = 0.136943), norm. avg. (of 4) = 0.105604 fft 41: mflops = 3.97188 (norm. = 0.0407197), norm. avg. (of 4) = 0.0733948 fft 42: mflops = 43.2402 (norm. = 0.443299), norm. avg. (of 2) = 0.341683 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.84 s, 65536 iters, t-(init.)=1.76 s t(norm)=0.167847, mflops=29.7891 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.93 s, 65536 iters, t-(init.)=1.85 s t(norm)=0.17643, mflops=28.3399 (err=1.4e-16) 2. Arndt Split-Radix: elapsed time t=1.26 s, 32768 iters, t-(init.)=1.22 s t(norm)=0.232697, mflops=21.4872 (err=1.6e-16) 3. Arndt 4-step: elapsed time t=1.05 s, 8192 iters, t-(init.)=1.04 s t(norm)=0.793457, mflops=6.30154 (err=1.8e-16) 4. Bailey: elapsed time t=1.43 s, 65536 iters, t-(init.)=1.35 s t(norm)=0.128746, mflops=38.8361 (err=1.5e-16) 5. Beauregard: elapsed time t=1.08 s, 8192 iters, t-(init.)=1.07 s t(norm)=0.816345, mflops=6.12486 (err=1.9e-16) 6. Bergland: elapsed time t=1.69 s, 65536 iters, t-(init.)=1.62 s t(norm)=0.154495, mflops=32.3635 (err=1.8e-16) 7. Brenner: elapsed time t=1.41 s, 32768 iters, t-(init.)=1.37 s t(norm)=0.261307, mflops=19.1346 (err=2.4e-16) 8. Burrus: elapsed time t=1.99 s, 32768 iters, t-(init.)=1.95 s t(norm)=0.371933, mflops=13.4433 (err=1.3e-16) 9. CWP (min N) (N=33): elapsed time t=1.35 s, 65536 iters, t-(init.)=1.27 s t(norm)=0.121117, mflops=41.2825 10. CWP (best N) (N=35): elapsed time t=1.15 s, 65536 iters, t-(init.)=1.06 s t(norm)=0.101089, mflops=49.4611 11. 
Edelblute: elapsed time t=1.95 s, 32768 iters, t-(init.)=1.91 s t(norm)=0.364304, mflops=13.7248 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.04 s, 65536 iters, t-(init.)=0.96 s t(norm)=0.0915527, mflops=54.6133 (err=2.0e-16) 13. FFTPACK (f2c): elapsed time t=1.04 s, 65536 iters, t-(init.)=0.97 s t(norm)=0.0925064, mflops=54.0503 (err=2.0e-16) FFTW_MEASURE plan: (cost = 7.934570e-06) FFTW_NOTW 32 14. FFTW: elapsed time t=1.04 s, 131072 iters, t-(init.)=0.88 s t(norm)=0.0419617, mflops=119.156 (err=2.0e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.04 s, 131072 iters, t-(init.)=0.88 s t(norm)=0.0419617, mflops=119.156 (err=2.0e-16) 16. Frigo-old: elapsed time t=1.91 s, 262144 iters, t-(init.)=1.59 s t(norm)=0.0379086, mflops=131.896 (err=2.3e-16) 17. Green: elapsed time t=1.98 s, 131072 iters, t-(init.)=1.83 s t(norm)=0.0872612, mflops=57.2992 (err=1.9e-16) 18. GSL: elapsed time t=1.76 s, 65536 iters, t-(init.)=1.69 s t(norm)=0.161171, mflops=31.023 (err=2.0e-16) 19. GSL DIT: elapsed time t=1.26 s, 32768 iters, t-(init.)=1.22 s t(norm)=0.232697, mflops=21.4872 (err=2.9e-16) 20. GSL DIF: elapsed time t=1.14 s, 32768 iters, t-(init.)=1.1 s t(norm)=0.209808, mflops=23.8313 (err=2.8e-16) 21. Krukar: elapsed time t=1.94 s, 131072 iters, t-(init.)=1.78 s t(norm)=0.084877, mflops=58.9088 (err=2.3e-16) 22. Mayer (Buneman): elapsed time t=1.13 s, 32768 iters, t-(init.)=1.09 s t(norm)=0.207901, mflops=24.0499 (err=1.6e-16) 23. Mayer (simple): elapsed time t=1.8 s, 65536 iters, t-(init.)=1.73 s t(norm)=0.164986, mflops=30.3057 24. Mayer (lookup): elapsed time t=1.91 s, 65536 iters, t-(init.)=1.83 s t(norm)=0.174522, mflops=28.6496 (err=1.4e-16) 25. Monro: elapsed time t=1.39 s, 32768 iters, t-(init.)=1.35 s t(norm)=0.257492, mflops=19.4181 (err=1.5e-16) 26. NAPACK (f2c): elapsed time t=1.1 s, 16384 iters, t-(init.)=1.08 s t(norm)=0.411987, mflops=12.1363 (err=4.0e-16) 27. 
Nielsen: elapsed time t=1.61 s, 32768 iters, t-(init.)=1.57 s t(norm)=0.299454, mflops=16.6971 (err=7.4e-16) 28. NR (C): elapsed time t=1.93 s, 65536 iters, t-(init.)=1.85 s t(norm)=0.17643, mflops=28.3399 (err=2.3e-16) 29. NR (F): elapsed time t=1.26 s, 32768 iters, t-(init.)=1.22 s t(norm)=0.232697, mflops=21.4872 (err=2.3e-16) 30. Ooura (C): elapsed time t=1.9 s, 131072 iters, t-(init.)=1.74 s t(norm)=0.0829697, mflops=60.263 (err=2.3e-16) 31. Ooura (F): elapsed time t=1.94 s, 131072 iters, t-(init.)=1.78 s t(norm)=0.084877, mflops=58.9088 (err=2.2e-16) 32. QFT: elapsed time t=1.68 s, 65536 iters, t-(init.)=1.6 s t(norm)=0.152588, mflops=32.768 (err=1.6e-16) 33. Ransom: elapsed time t=1.97 s, 16384 iters, t-(init.)=1.95 s t(norm)=0.743866, mflops=6.72164 (err=4.9e-16) 34. SCIPORT: elapsed time t=1.3 s, 65536 iters, t-(init.)=1.22 s t(norm)=0.116348, mflops=42.9744 (err=3.3e-08) 35. Singleton: elapsed time t=1.64 s, 65536 iters, t-(init.)=1.56 s t(norm)=0.148773, mflops=33.6082 (err=2.2e-16) 36. Singleton (f2c): elapsed time t=1.63 s, 65536 iters, t-(init.)=1.55 s t(norm)=0.14782, mflops=33.825 (err=2.2e-16) 37. Sorensen: elapsed time t=1.57 s, 65536 iters, t-(init.)=1.49 s t(norm)=0.142097, mflops=35.1871 (err=1.5e-16) 38. Sorensen DIT: elapsed time t=1.02 s, 16384 iters, t-(init.)=1 s t(norm)=0.38147, mflops=13.1072 (err=1.4e-16) 39. Temperton: elapsed time t=1.33 s, 32768 iters, t-(init.)=1.29 s t(norm)=0.246048, mflops=20.3212 (err=2.1e-16) 40. Temperton (f2c): elapsed time t=1.76 s, 32768 iters, t-(init.)=1.72 s t(norm)=0.328064, mflops=15.2409 (err=2.1e-16) 41. Valkenburg: elapsed time t=1.58 s, 8192 iters, t-(init.)=1.57 s t(norm)=1.19781, mflops=4.17427 (err=3.1e-16) 42. ESSL: elapsed time t=1.66 s, 131072 iters, t-(init.)=1.49 s t(norm)=0.0710487, mflops=70.3742 (err=2.2e-16) Top mflops for N=32 = 131.896 Normalized results and averages for N=32: fft 0: mflops = 29.7891 (norm. = 0.225852), norm. avg. (of 5) = 0.33439 fft 1: mflops = 28.3399 (norm. 
= 0.214865), norm. avg. (of 5) = 0.327202 fft 2: mflops = 21.4872 (norm. = 0.16291), norm. avg. (of 5) = 0.203461 fft 3: mflops = 6.30154 (norm. = 0.0477764), norm. avg. (of 5) = 0.0367344 fft 4: mflops = 38.8361 (norm. = 0.294444), norm. avg. (of 5) = 0.205491 fft 5: mflops = 6.12486 (norm. = 0.0464369), norm. avg. (of 5) = 0.0580585 fft 6: mflops = 32.3635 (norm. = 0.24537), norm. avg. (of 5) = 0.153032 fft 7: mflops = 19.1346 (norm. = 0.145073), norm. avg. (of 5) = 0.129294 fft 8: mflops = 13.4433 (norm. = 0.101923), norm. avg. (of 5) = 0.151506 fft 9: mflops = 41.2825 (norm. = 0.312992), norm. avg. (of 5) = 0.202052 fft 10: mflops = 49.4611 (norm. = 0.375), norm. avg. (of 5) = 0.17847 fft 11: mflops = 13.7248 (norm. = 0.104058), norm. avg. (of 4) = 0.121745 fft 12: mflops = 54.6133 (norm. = 0.414062), norm. avg. (of 5) = 0.339926 fft 13: mflops = 54.0503 (norm. = 0.409794), norm. avg. (of 5) = 0.333843 fft 14: mflops = 119.156 (norm. = 0.903409), norm. avg. (of 5) = 0.765931 fft 15: mflops = 119.156 (norm. = 0.903409), norm. avg. (of 5) = 0.772398 fft 16: mflops = 131.896 (norm. = 1), norm. avg. (of 5) = 1 fft 17: mflops = 57.2992 (norm. = 0.434426), norm. avg. (of 3) = 0.391252 fft 18: mflops = 31.023 (norm. = 0.235207), norm. avg. (of 5) = 0.236773 fft 19: mflops = 21.4872 (norm. = 0.16291), norm. avg. (of 5) = 0.141797 fft 20: mflops = 23.8313 (norm. = 0.180682), norm. avg. (of 5) = 0.147217 fft 21: mflops = 58.9088 (norm. = 0.446629), norm. avg. (of 5) = 0.40921 fft 22: mflops = 24.0499 (norm. = 0.182339), norm. avg. (of 4) = 0.209777 fft 23: mflops = 30.3057 (norm. = 0.229769), norm. avg. (of 4) = 0.233827 fft 24: mflops = 28.6496 (norm. = 0.217213), norm. avg. (of 4) = 0.21254 fft 25: mflops = 19.4181 (norm. = 0.147222), norm. avg. (of 4) = 0.097385 fft 26: mflops = 12.1363 (norm. = 0.0920139), norm. avg. (of 5) = 0.0804831 fft 27: mflops = 16.6971 (norm. = 0.126592), norm. avg. (of 5) = 0.0909587 fft 28: mflops = 28.3399 (norm. = 0.214865), norm. avg. 
(of 5) = 0.16957 fft 29: mflops = 21.4872 (norm. = 0.16291), norm. avg. (of 5) = 0.139053 fft 30: mflops = 60.263 (norm. = 0.456897), norm. avg. (of 5) = 0.39671 fft 31: mflops = 58.9088 (norm. = 0.446629), norm. avg. (of 5) = 0.3962 fft 32: mflops = 32.768 (norm. = 0.248438), norm. avg. (of 2) = 0.292188 fft 33: mflops = 6.72164 (norm. = 0.0509615), norm. avg. (of 4) = 0.0434335 fft 34: mflops = 42.9744 (norm. = 0.32582), norm. avg. (of 4) = 0.300382 fft 35: mflops = 33.6082 (norm. = 0.254808), norm. avg. (of 5) = 0.175954 fft 36: mflops = 33.825 (norm. = 0.256452), norm. avg. (of 5) = 0.177837 fft 37: mflops = 35.1871 (norm. = 0.266779), norm. avg. (of 5) = 0.271004 fft 38: mflops = 13.1072 (norm. = 0.099375), norm. avg. (of 5) = 0.147613 fft 39: mflops = 20.3212 (norm. = 0.15407), norm. avg. (of 5) = 0.138051 fft 40: mflops = 15.2409 (norm. = 0.115552), norm. avg. (of 5) = 0.107593 fft 41: mflops = 4.17427 (norm. = 0.0316481), norm. avg. (of 5) = 0.0650455 fft 42: mflops = 70.3742 (norm. = 0.533557), norm. avg. (of 3) = 0.405641 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1 s, 16384 iters, t-(init.)=0.96 s t(norm)=0.152588, mflops=32.768 (err=5.2e-16) 1. Arndt DIT: elapsed time t=1.06 s, 16384 iters, t-(init.)=1.02 s t(norm)=0.162125, mflops=30.8405 (err=5.0e-16) 2. Arndt Split-Radix: elapsed time t=1.33 s, 16384 iters, t-(init.)=1.29 s t(norm)=0.20504, mflops=24.3855 (err=4.9e-16) 3. Arndt 4-step: elapsed time t=1.69 s, 8192 iters, t-(init.)=1.68 s t(norm)=0.534058, mflops=9.36229 (err=4.8e-16) 4. Bailey: elapsed time t=1.42 s, 32768 iters, t-(init.)=1.35 s t(norm)=0.107288, mflops=46.6034 (err=5.0e-16) 5. Beauregard: elapsed time t=1.26 s, 4096 iters, t-(init.)=1.25 s t(norm)=0.794729, mflops=6.29146 (err=4.3e-16) 6. Bergland: elapsed time t=1.55 s, 32768 iters, t-(init.)=1.48 s t(norm)=0.11762, mflops=42.5098 (err=4.4e-16) 7. 
Brenner: elapsed time t=1.37 s, 16384 iters, t-(init.)=1.34 s t(norm)=0.212987, mflops=23.4756 (err=4.2e-16) 8. Burrus: elapsed time t=1.04 s, 8192 iters, t-(init.)=1.02 s t(norm)=0.324249, mflops=15.4202 (err=5.1e-16) 9. CWP (min N) (N=65): elapsed time t=1.27 s, 32768 iters, t-(init.)=1.19 s t(norm)=0.0945727, mflops=52.8694 10. CWP (best N) (N=84): elapsed time t=1.08 s, 32768 iters, t-(init.)=0.98 s t(norm)=0.0778834, mflops=64.1985 11. Edelblute: elapsed time t=1.03 s, 8192 iters, t-(init.)=1.01 s t(norm)=0.32107, mflops=15.5729 (err=5.0e-16) 12. FFTPACK: elapsed time t=1.8 s, 65536 iters, t-(init.)=1.64 s t(norm)=0.0651677, mflops=76.7251 (err=3.8e-16) 13. FFTPACK (f2c): elapsed time t=1.93 s, 65536 iters, t-(init.)=1.78 s t(norm)=0.0707308, mflops=70.6905 (err=3.8e-16) FFTW_MEASURE plan: (cost = 1.770020e-05) FFTW_NOTW 64 14. FFTW: elapsed time t=1.17 s, 65536 iters, t-(init.)=1.02 s t(norm)=0.0405312, mflops=123.362 (err=4.2e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.42 s, 65536 iters, t-(init.)=1.27 s t(norm)=0.0504653, mflops=99.078 (err=4.1e-16) 16. Frigo-old: elapsed time t=1.02 s, 32768 iters, t-(init.)=0.95 s t(norm)=0.0754992, mflops=66.2259 (err=4.5e-16) 17. Green: elapsed time t=1.58 s, 65536 iters, t-(init.)=1.43 s t(norm)=0.0568231, mflops=87.9924 (err=4.5e-16) 18. GSL: elapsed time t=1.68 s, 32768 iters, t-(init.)=1.6 s t(norm)=0.127157, mflops=39.3216 (err=3.8e-16) 19. GSL DIT: elapsed time t=1.31 s, 16384 iters, t-(init.)=1.27 s t(norm)=0.201861, mflops=24.7695 (err=4.6e-16) 20. GSL DIF: elapsed time t=1.13 s, 16384 iters, t-(init.)=1.1 s t(norm)=0.17484, mflops=28.5975 (err=5.0e-16) 21. Krukar: elapsed time t=1.1 s, 32768 iters, t-(init.)=1.03 s t(norm)=0.081857, mflops=61.0821 (err=4.3e-16) 22. Mayer (Buneman): elapsed time t=1.26 s, 16384 iters, t-(init.)=1.22 s t(norm)=0.193914, mflops=25.7847 (err=4.6e-16) 23. 
Mayer (simple): elapsed time t=1.91 s, 32768 iters, t-(init.)=1.83 s t(norm)=0.145435, mflops=34.3795 24. Mayer (lookup): elapsed time t=1.02 s, 16384 iters, t-(init.)=0.98 s t(norm)=0.155767, mflops=32.0993 (err=4.7e-16) 25. Monro: elapsed time t=1.23 s, 16384 iters, t-(init.)=1.2 s t(norm)=0.190735, mflops=26.2144 (err=4.2e-16) 26. NAPACK (f2c): elapsed time t=1.11 s, 8192 iters, t-(init.)=1.09 s t(norm)=0.346502, mflops=14.4299 (err=8.8e-16) 27. Nielsen: elapsed time t=1.47 s, 16384 iters, t-(init.)=1.43 s t(norm)=0.227292, mflops=21.9981 (err=2.6e-15) 28. NR (C): elapsed time t=1.84 s, 32768 iters, t-(init.)=1.76 s t(norm)=0.139872, mflops=35.7469 (err=4.4e-16) 29. NR (F): elapsed time t=1.24 s, 16384 iters, t-(init.)=1.21 s t(norm)=0.192324, mflops=25.9978 (err=4.4e-16) 30. Ooura (C): elapsed time t=1.98 s, 65536 iters, t-(init.)=1.83 s t(norm)=0.0727177, mflops=68.7591 (err=4.4e-16) 31. Ooura (F): elapsed time t=1.94 s, 65536 iters, t-(init.)=1.79 s t(norm)=0.0711282, mflops=70.2956 (err=4.4e-16) 32. QFT: elapsed time t=1.96 s, 32768 iters, t-(init.)=1.88 s t(norm)=0.149409, mflops=33.4652 (err=4.8e-16) 33. Ransom: elapsed time t=1.11 s, 8192 iters, t-(init.)=1.09 s t(norm)=0.346502, mflops=14.4299 (err=9.4e-16) 34. SCIPORT: elapsed time t=1.25 s, 32768 iters, t-(init.)=1.17 s t(norm)=0.0929832, mflops=53.7731 (err=6.4e-08) 35. Singleton: elapsed time t=1.44 s, 32768 iters, t-(init.)=1.36 s t(norm)=0.108083, mflops=46.2607 (err=6.6e-16) 36. Singleton (f2c): elapsed time t=1.44 s, 32768 iters, t-(init.)=1.36 s t(norm)=0.108083, mflops=46.2607 (err=6.7e-16) 37. Sorensen: elapsed time t=1.46 s, 32768 iters, t-(init.)=1.38 s t(norm)=0.109673, mflops=45.5903 (err=4.5e-16) 38. Sorensen DIT: elapsed time t=1.06 s, 8192 iters, t-(init.)=1.04 s t(norm)=0.330607, mflops=15.1237 (err=4.6e-16) 39. Temperton: elapsed time t=1.21 s, 16384 iters, t-(init.)=1.18 s t(norm)=0.187556, mflops=26.6587 (err=4.0e-16) 40. 
Temperton (f2c): elapsed time t=1.46 s, 16384 iters, t-(init.)=1.42 s t(norm)=0.225703, mflops=22.153 (err=4.0e-16) 41. Valkenburg: elapsed time t=1.84 s, 4096 iters, t-(init.)=1.83 s t(norm)=1.16348, mflops=4.29744 (err=5.3e-16) 42. ESSL: elapsed time t=1.73 s, 65536 iters, t-(init.)=1.58 s t(norm)=0.0627836, mflops=79.6387 (err=4.2e-16) Top mflops for N=64 = 123.362 Normalized results and averages for N=64: fft 0: mflops = 32.768 (norm. = 0.265625), norm. avg. (of 6) = 0.322929 fft 1: mflops = 30.8405 (norm. = 0.25), norm. avg. (of 6) = 0.314335 fft 2: mflops = 24.3855 (norm. = 0.197674), norm. avg. (of 6) = 0.202497 fft 3: mflops = 9.36229 (norm. = 0.0758929), norm. avg. (of 6) = 0.0432608 fft 4: mflops = 46.6034 (norm. = 0.377778), norm. avg. (of 6) = 0.234205 fft 5: mflops = 6.29146 (norm. = 0.051), norm. avg. (of 6) = 0.0568821 fft 6: mflops = 42.5098 (norm. = 0.344595), norm. avg. (of 6) = 0.184959 fft 7: mflops = 23.4756 (norm. = 0.190299), norm. avg. (of 6) = 0.139461 fft 8: mflops = 15.4202 (norm. = 0.125), norm. avg. (of 6) = 0.147088 fft 9: mflops = 52.8694 (norm. = 0.428571), norm. avg. (of 6) = 0.239805 fft 10: mflops = 64.1985 (norm. = 0.520408), norm. avg. (of 6) = 0.23546 fft 11: mflops = 15.5729 (norm. = 0.126238), norm. avg. (of 5) = 0.122643 fft 12: mflops = 76.7251 (norm. = 0.621951), norm. avg. (of 6) = 0.38693 fft 13: mflops = 70.6905 (norm. = 0.573034), norm. avg. (of 6) = 0.373708 fft 14: mflops = 123.362 (norm. = 1), norm. avg. (of 6) = 0.804943 fft 15: mflops = 99.078 (norm. = 0.80315), norm. avg. (of 6) = 0.777524 fft 16: mflops = 66.2259 (norm. = 0.536842), norm. avg. (of 6) = 0.922807 fft 17: mflops = 87.9924 (norm. = 0.713287), norm. avg. (of 4) = 0.471761 fft 18: mflops = 39.3216 (norm. = 0.31875), norm. avg. (of 6) = 0.250436 fft 19: mflops = 24.7695 (norm. = 0.200787), norm. avg. (of 6) = 0.151629 fft 20: mflops = 28.5975 (norm. = 0.231818), norm. avg. (of 6) = 0.161317 fft 21: mflops = 61.0821 (norm. = 0.495146), norm. avg. 
(of 6) = 0.423533 fft 22: mflops = 25.7847 (norm. = 0.209016), norm. avg. (of 5) = 0.209625 fft 23: mflops = 34.3795 (norm. = 0.278689), norm. avg. (of 5) = 0.2428 fft 24: mflops = 32.0993 (norm. = 0.260204), norm. avg. (of 5) = 0.222073 fft 25: mflops = 26.2144 (norm. = 0.2125), norm. avg. (of 5) = 0.120408 fft 26: mflops = 14.4299 (norm. = 0.116972), norm. avg. (of 6) = 0.0865647 fft 27: mflops = 21.9981 (norm. = 0.178322), norm. avg. (of 6) = 0.105519 fft 28: mflops = 35.7469 (norm. = 0.289773), norm. avg. (of 6) = 0.189604 fft 29: mflops = 25.9978 (norm. = 0.210744), norm. avg. (of 6) = 0.151002 fft 30: mflops = 68.7591 (norm. = 0.557377), norm. avg. (of 6) = 0.423488 fft 31: mflops = 70.2956 (norm. = 0.569832), norm. avg. (of 6) = 0.425139 fft 32: mflops = 33.4652 (norm. = 0.271277), norm. avg. (of 3) = 0.285217 fft 33: mflops = 14.4299 (norm. = 0.116972), norm. avg. (of 5) = 0.0581413 fft 34: mflops = 53.7731 (norm. = 0.435897), norm. avg. (of 5) = 0.327485 fft 35: mflops = 46.2607 (norm. = 0.375), norm. avg. (of 6) = 0.209128 fft 36: mflops = 46.2607 (norm. = 0.375), norm. avg. (of 6) = 0.210698 fft 37: mflops = 45.5903 (norm. = 0.369565), norm. avg. (of 6) = 0.287431 fft 38: mflops = 15.1237 (norm. = 0.122596), norm. avg. (of 6) = 0.143444 fft 39: mflops = 26.6587 (norm. = 0.216102), norm. avg. (of 6) = 0.151059 fft 40: mflops = 22.153 (norm. = 0.179577), norm. avg. (of 6) = 0.119591 fft 41: mflops = 4.29744 (norm. = 0.0348361), norm. avg. (of 6) = 0.0600106 fft 42: mflops = 79.6387 (norm. = 0.64557), norm. avg. (of 4) = 0.465623 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.08 s, 8192 iters, t-(init.)=1.04 s t(norm)=0.141689, mflops=35.2886 (err=4.2e-16) 1. Arndt DIT: elapsed time t=1.15 s, 8192 iters, t-(init.)=1.11 s t(norm)=0.151225, mflops=33.0632 (err=4.0e-16) 2. Arndt Split-Radix: elapsed time t=1.38 s, 8192 iters, t-(init.)=1.34 s t(norm)=0.182561, mflops=27.3882 (err=4.2e-16) 3. 
Arndt 4-step: elapsed time t=1.86 s, 4096 iters, t-(init.)=1.84 s t(norm)=0.50136, mflops=9.97287 (err=4.2e-16) 4. Bailey: elapsed time t=1.26 s, 16384 iters, t-(init.)=1.19 s t(norm)=0.0810623, mflops=61.6809 (err=4.2e-16) 5. Beauregard: elapsed time t=1.44 s, 2048 iters, t-(init.)=1.43 s t(norm)=0.779288, mflops=6.41611 (err=4.7e-16) 6. Bergland: elapsed time t=1.59 s, 16384 iters, t-(init.)=1.52 s t(norm)=0.103542, mflops=48.2897 (err=4.6e-16) 7. Brenner: elapsed time t=1.4 s, 8192 iters, t-(init.)=1.36 s t(norm)=0.185285, mflops=26.9854 (err=4.5e-16) 8. Burrus: elapsed time t=1.08 s, 4096 iters, t-(init.)=1.06 s t(norm)=0.288827, mflops=17.3114 (err=4.1e-16) 9. CWP (min N) (N=130): elapsed time t=1.21 s, 16384 iters, t-(init.)=1.14 s t(norm)=0.0776563, mflops=64.3862 10. CWP (best N) (N=140): elapsed time t=1.84 s, 32768 iters, t-(init.)=1.68 s t(norm)=0.0572205, mflops=87.3813 11. Edelblute: elapsed time t=1.06 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.283378, mflops=17.6443 (err=3.9e-16) 12. FFTPACK: elapsed time t=1.68 s, 32768 iters, t-(init.)=1.54 s t(norm)=0.0524521, mflops=95.3251 (err=4.5e-16) 13. FFTPACK (f2c): elapsed time t=1.96 s, 32768 iters, t-(init.)=1.81 s t(norm)=0.0616482, mflops=81.1053 (err=4.5e-16) FFTW_MEASURE plan: (cost = 4.638672e-05) FFTW_TWIDDLE 2 FFTW_NOTW 64 14. FFTW: elapsed time t=1.52 s, 32768 iters, t-(init.)=1.38 s t(norm)=0.0470025, mflops=106.377 (err=4.3e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.57 s, 32768 iters, t-(init.)=1.42 s t(norm)=0.0483649, mflops=103.381 (err=4.4e-16) 16. Frigo-old: elapsed time t=1.03 s, 16384 iters, t-(init.)=0.96 s t(norm)=0.0653948, mflops=76.4587 (err=4.5e-16) 17. Green: elapsed time t=1.71 s, 32768 iters, t-(init.)=1.56 s t(norm)=0.0531333, mflops=94.103 (err=4.8e-16) 18. GSL: elapsed time t=1.75 s, 16384 iters, t-(init.)=1.67 s t(norm)=0.11376, mflops=43.9523 (err=4.5e-16) 19. 
GSL DIT: elapsed time t=1.39 s, 8192 iters, t-(init.)=1.35 s t(norm)=0.183923, mflops=27.1853 (err=5.0e-16) 20. GSL DIF: elapsed time t=1.15 s, 8192 iters, t-(init.)=1.11 s t(norm)=0.151225, mflops=33.0632 (err=5.3e-16) 21. Krukar: elapsed time t=1.24 s, 16384 iters, t-(init.)=1.17 s t(norm)=0.0796999, mflops=62.7353 (err=4.9e-16) 22. Mayer (Buneman): elapsed time t=1.31 s, 8192 iters, t-(init.)=1.28 s t(norm)=0.174386, mflops=28.672 (err=4.0e-16) 23. Mayer (simple): elapsed time t=1.92 s, 16384 iters, t-(init.)=1.85 s t(norm)=0.126021, mflops=39.6758 24. Mayer (lookup): elapsed time t=1.02 s, 8192 iters, t-(init.)=0.99 s t(norm)=0.134877, mflops=37.0709 (err=4.1e-16) 25. Monro: elapsed time t=1.15 s, 8192 iters, t-(init.)=1.12 s t(norm)=0.152588, mflops=32.768 (err=4.3e-16) 26. NAPACK (f2c): elapsed time t=1.16 s, 4096 iters, t-(init.)=1.14 s t(norm)=0.310625, mflops=16.0966 (err=1.3e-15) 27. Nielsen: elapsed time t=1.58 s, 8192 iters, t-(init.)=1.55 s t(norm)=0.211171, mflops=23.6775 (err=1.4e-15) 28. NR (C): elapsed time t=1.84 s, 16384 iters, t-(init.)=1.76 s t(norm)=0.11989, mflops=41.7047 (err=4.5e-16) 29. NR (F): elapsed time t=1.24 s, 8192 iters, t-(init.)=1.2 s t(norm)=0.163487, mflops=30.5835 (err=4.5e-16) 30. Ooura (C): elapsed time t=1.04 s, 16384 iters, t-(init.)=0.97 s t(norm)=0.066076, mflops=75.6704 (err=4.3e-16) 31. Ooura (F): elapsed time t=1.03 s, 16384 iters, t-(init.)=0.96 s t(norm)=0.0653948, mflops=76.4587 (err=4.2e-16) 32. QFT: elapsed time t=1.12 s, 8192 iters, t-(init.)=1.09 s t(norm)=0.148501, mflops=33.6699 (err=5.0e-16) 33. Ransom: elapsed time t=1.37 s, 4096 iters, t-(init.)=1.35 s t(norm)=0.367846, mflops=13.5927 (err=1.1e-15) 34. SCIPORT: elapsed time t=1.28 s, 16384 iters, t-(init.)=1.21 s t(norm)=0.0824247, mflops=60.6614 (err=8.4e-08) 35. Singleton: elapsed time t=1.66 s, 16384 iters, t-(init.)=1.59 s t(norm)=0.10831, mflops=46.1637 (err=5.6e-16) 36. 
Singleton (f2c): elapsed time t=1.66 s, 16384 iters, t-(init.)=1.59 s t(norm)=0.10831, mflops=46.1637 (err=5.5e-16) 37. Sorensen: elapsed time t=1.34 s, 16384 iters, t-(init.)=1.26 s t(norm)=0.0858307, mflops=58.2542 (err=4.1e-16) 38. Sorensen DIT: elapsed time t=1.11 s, 4096 iters, t-(init.)=1.09 s t(norm)=0.297001, mflops=16.8349 (err=4.1e-16) 39. Temperton: elapsed time t=1.35 s, 8192 iters, t-(init.)=1.31 s t(norm)=0.178473, mflops=28.0154 (err=4.6e-16) 40. Temperton (f2c): elapsed time t=1.8 s, 8192 iters, t-(init.)=1.77 s t(norm)=0.241143, mflops=20.7346 (err=4.6e-16) 41. Valkenburg: elapsed time t=1.04 s, 1024 iters, t-(init.)=1.04 s t(norm)=1.13351, mflops=4.41108 (err=5.6e-16) 42. ESSL: elapsed time t=1.55 s, 32768 iters, t-(init.)=1.4 s t(norm)=0.0476837, mflops=104.858 (err=3.9e-16) Top mflops for N=128 = 106.377 Normalized results and averages for N=128: fft 0: mflops = 35.2886 (norm. = 0.331731), norm. avg. (of 7) = 0.324187 fft 1: mflops = 33.0632 (norm. = 0.310811), norm. avg. (of 7) = 0.313831 fft 2: mflops = 27.3882 (norm. = 0.257463), norm. avg. (of 7) = 0.210349 fft 3: mflops = 9.97287 (norm. = 0.09375), norm. avg. (of 7) = 0.0504735 fft 4: mflops = 61.6809 (norm. = 0.579832), norm. avg. (of 7) = 0.28358 fft 5: mflops = 6.41611 (norm. = 0.0603147), norm. avg. (of 7) = 0.0573725 fft 6: mflops = 48.2897 (norm. = 0.453947), norm. avg. (of 7) = 0.223386 fft 7: mflops = 26.9854 (norm. = 0.253676), norm. avg. (of 7) = 0.155778 fft 8: mflops = 17.3114 (norm. = 0.162736), norm. avg. (of 7) = 0.149324 fft 9: mflops = 64.3862 (norm. = 0.605263), norm. avg. (of 7) = 0.292013 fft 10: mflops = 87.3813 (norm. = 0.821429), norm. avg. (of 7) = 0.319169 fft 11: mflops = 17.6443 (norm. = 0.165865), norm. avg. (of 6) = 0.129847 fft 12: mflops = 95.3251 (norm. = 0.896104), norm. avg. (of 7) = 0.459669 fft 13: mflops = 81.1053 (norm. = 0.762431), norm. avg. (of 7) = 0.42924 fft 14: mflops = 106.377 (norm. = 1), norm. avg. 
(of 7) = 0.832808 fft 15: mflops = 103.381 (norm. = 0.971831), norm. avg. (of 7) = 0.805282 fft 16: mflops = 76.4587 (norm. = 0.71875), norm. avg. (of 7) = 0.893656 fft 17: mflops = 94.103 (norm. = 0.884615), norm. avg. (of 5) = 0.554331 fft 18: mflops = 43.9523 (norm. = 0.413174), norm. avg. (of 7) = 0.273684 fft 19: mflops = 27.1853 (norm. = 0.255556), norm. avg. (of 7) = 0.166476 fft 20: mflops = 33.0632 (norm. = 0.310811), norm. avg. (of 7) = 0.182673 fft 21: mflops = 62.7353 (norm. = 0.589744), norm. avg. (of 7) = 0.447277 fft 22: mflops = 28.672 (norm. = 0.269531), norm. avg. (of 6) = 0.21961 fft 23: mflops = 39.6758 (norm. = 0.372973), norm. avg. (of 6) = 0.264495 fft 24: mflops = 37.0709 (norm. = 0.348485), norm. avg. (of 6) = 0.243142 fft 25: mflops = 32.768 (norm. = 0.308036), norm. avg. (of 6) = 0.151679 fft 26: mflops = 16.0966 (norm. = 0.151316), norm. avg. (of 7) = 0.0958148 fft 27: mflops = 23.6775 (norm. = 0.222581), norm. avg. (of 7) = 0.122242 fft 28: mflops = 41.7047 (norm. = 0.392045), norm. avg. (of 7) = 0.218524 fft 29: mflops = 30.5835 (norm. = 0.2875), norm. avg. (of 7) = 0.170502 fft 30: mflops = 75.6704 (norm. = 0.71134), norm. avg. (of 7) = 0.46461 fft 31: mflops = 76.4587 (norm. = 0.71875), norm. avg. (of 7) = 0.467083 fft 32: mflops = 33.6699 (norm. = 0.316514), norm. avg. (of 4) = 0.293041 fft 33: mflops = 13.5927 (norm. = 0.127778), norm. avg. (of 6) = 0.0697474 fft 34: mflops = 60.6614 (norm. = 0.570248), norm. avg. (of 6) = 0.367946 fft 35: mflops = 46.1637 (norm. = 0.433962), norm. avg. (of 7) = 0.241247 fft 36: mflops = 46.1637 (norm. = 0.433962), norm. avg. (of 7) = 0.242592 fft 37: mflops = 58.2542 (norm. = 0.547619), norm. avg. (of 7) = 0.324601 fft 38: mflops = 16.8349 (norm. = 0.158257), norm. avg. (of 7) = 0.14556 fft 39: mflops = 28.0154 (norm. = 0.263359), norm. avg. (of 7) = 0.167102 fft 40: mflops = 20.7346 (norm. = 0.194915), norm. avg. (of 7) = 0.130351 fft 41: mflops = 4.41108 (norm. = 0.0414663), norm. avg. 
(of 7) = 0.0573614 fft 42: mflops = 104.858 (norm. = 0.985714), norm. avg. (of 5) = 0.569641 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.09 s, 4096 iters, t-(init.)=1.05 s t(norm)=0.12517, mflops=39.9458 (err=7.1e-16) 1. Arndt DIT: elapsed time t=1.19 s, 4096 iters, t-(init.)=1.16 s t(norm)=0.138283, mflops=36.1578 (err=7.0e-16) 2. Arndt Split-Radix: elapsed time t=1.45 s, 4096 iters, t-(init.)=1.41 s t(norm)=0.168085, mflops=29.7468 (err=6.5e-16) 3. Arndt 4-step: elapsed time t=1.82 s, 2048 iters, t-(init.)=1.81 s t(norm)=0.431538, mflops=11.5865 (err=6.7e-16) 4. Bailey: elapsed time t=1.41 s, 8192 iters, t-(init.)=1.34 s t(norm)=0.0798702, mflops=62.6016 (err=7.0e-16) 5. Beauregard: elapsed time t=1.64 s, 1024 iters, t-(init.)=1.64 s t(norm)=0.782013, mflops=6.39376 (err=7.0e-16) 6. Bergland: elapsed time t=1.55 s, 8192 iters, t-(init.)=1.47 s t(norm)=0.0876188, mflops=57.0654 (err=6.7e-16) 7. Brenner: elapsed time t=1.42 s, 4096 iters, t-(init.)=1.39 s t(norm)=0.165701, mflops=30.1748 (err=7.5e-16) 8. Burrus: elapsed time t=1.11 s, 2048 iters, t-(init.)=1.09 s t(norm)=0.259876, mflops=19.2399 (err=6.4e-16) 9. CWP (min N) (N=260): elapsed time t=1.18 s, 8192 iters, t-(init.)=1.11 s t(norm)=0.0661612, mflops=75.573 10. CWP (best N) (N=280): elapsed time t=1.65 s, 16384 iters, t-(init.)=1.5 s t(norm)=0.0447035, mflops=111.848 11. Edelblute: elapsed time t=1.1 s, 2048 iters, t-(init.)=1.08 s t(norm)=0.257492, mflops=19.4181 (err=6.4e-16) 12. FFTPACK: elapsed time t=1.65 s, 16384 iters, t-(init.)=1.5 s t(norm)=0.0447035, mflops=111.848 (err=6.5e-16) 13. FFTPACK (f2c): elapsed time t=1.02 s, 8192 iters, t-(init.)=0.94 s t(norm)=0.0560284, mflops=89.2405 (err=6.5e-16) FFTW_MEASURE plan: (cost = 9.765625e-05) FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.6 s, 16384 iters, t-(init.)=1.46 s t(norm)=0.0435114, mflops=114.912 (err=6.9e-16) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.61 s, 16384 iters, t-(init.)=1.47 s t(norm)=0.0438094, mflops=114.131 (err=6.9e-16) 16. Frigo-old: elapsed time t=1.93 s, 16384 iters, t-(init.)=1.78 s t(norm)=0.0530481, mflops=94.254 (err=6.9e-16) 17. Green: elapsed time t=1.75 s, 16384 iters, t-(init.)=1.61 s t(norm)=0.0479817, mflops=104.206 (err=7.4e-16) 18. GSL: elapsed time t=1.77 s, 8192 iters, t-(init.)=1.7 s t(norm)=0.101328, mflops=49.3448 (err=6.5e-16) 19. GSL DIT: elapsed time t=1.51 s, 4096 iters, t-(init.)=1.47 s t(norm)=0.175238, mflops=28.5327 (err=6.9e-16) 20. GSL DIF: elapsed time t=1.2 s, 4096 iters, t-(init.)=1.16 s t(norm)=0.138283, mflops=36.1578 (err=6.8e-16) 21. Krukar: elapsed time t=1.53 s, 8192 iters, t-(init.)=1.45 s t(norm)=0.0864267, mflops=57.8525 (err=6.5e-16) 22. Mayer (Buneman): elapsed time t=1.39 s, 4096 iters, t-(init.)=1.36 s t(norm)=0.162125, mflops=30.8405 (err=6.5e-16) 23. Mayer (simple): elapsed time t=1.02 s, 4096 iters, t-(init.)=0.98 s t(norm)=0.116825, mflops=42.799 24. Mayer (lookup): elapsed time t=1.09 s, 4096 iters, t-(init.)=1.05 s t(norm)=0.12517, mflops=39.9458 (err=6.5e-16) 25. Monro: elapsed time t=1.14 s, 4096 iters, t-(init.)=1.11 s t(norm)=0.132322, mflops=37.7865 (err=6.4e-16) 26. NAPACK (f2c): elapsed time t=1.25 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.293255, mflops=17.05 (err=3.1e-15) 27. Nielsen: elapsed time t=1.64 s, 4096 iters, t-(init.)=1.61 s t(norm)=0.191927, mflops=26.0516 (err=3.1e-15) 28. NR (C): elapsed time t=1.91 s, 8192 iters, t-(init.)=1.84 s t(norm)=0.109673, mflops=45.5903 (err=7.2e-16) 29. NR (F): elapsed time t=1.28 s, 4096 iters, t-(init.)=1.25 s t(norm)=0.149012, mflops=33.5544 (err=7.2e-16) 30. Ooura (C): elapsed time t=1.1 s, 8192 iters, t-(init.)=1.02 s t(norm)=0.0607967, mflops=82.2413 (err=6.5e-16) 31. Ooura (F): elapsed time t=1.06 s, 8192 iters, t-(init.)=0.99 s t(norm)=0.0590086, mflops=84.7334 (err=6.5e-16) 32. 
QFT: elapsed time t=1.26 s, 4096 iters, t-(init.)=1.23 s t(norm)=0.146627, mflops=34.1 (err=8.8e-16) 33. Ransom: elapsed time t=1 s, 2048 iters, t-(init.)=0.99 s t(norm)=0.236034, mflops=21.1834 (err=1.7e-15) 34. SCIPORT: elapsed time t=1.34 s, 8192 iters, t-(init.)=1.27 s t(norm)=0.0756979, mflops=66.052 (err=1.3e-07) 35. Singleton: elapsed time t=1.51 s, 8192 iters, t-(init.)=1.44 s t(norm)=0.0858307, mflops=58.2542 (err=1.2e-15) 36. Singleton (f2c): elapsed time t=1.55 s, 8192 iters, t-(init.)=1.48 s t(norm)=0.0882149, mflops=56.6798 (err=1.1e-15) 37. Sorensen: elapsed time t=1.3 s, 8192 iters, t-(init.)=1.22 s t(norm)=0.0727177, mflops=68.7591 (err=6.7e-16) 38. Sorensen DIT: elapsed time t=1.14 s, 2048 iters, t-(init.)=1.12 s t(norm)=0.267029, mflops=18.7246 (err=6.8e-16) 39. Temperton: elapsed time t=1.43 s, 4096 iters, t-(init.)=1.39 s t(norm)=0.165701, mflops=30.1748 (err=6.7e-16) 40. Temperton (f2c): elapsed time t=1.77 s, 4096 iters, t-(init.)=1.73 s t(norm)=0.206232, mflops=24.2445 (err=6.7e-16) 41. Valkenburg: elapsed time t=1.14 s, 512 iters, t-(init.)=1.13 s t(norm)=1.07765, mflops=4.63972 (err=7.9e-16) 42. ESSL: elapsed time t=1.8 s, 16384 iters, t-(init.)=1.66 s t(norm)=0.0494719, mflops=101.068 (err=7.1e-16) Top mflops for N=256 = 114.912 Normalized results and averages for N=256: fft 0: mflops = 39.9458 (norm. = 0.347619), norm. avg. (of 8) = 0.327116 fft 1: mflops = 36.1578 (norm. = 0.314655), norm. avg. (of 8) = 0.313934 fft 2: mflops = 29.7468 (norm. = 0.258865), norm. avg. (of 8) = 0.216413 fft 3: mflops = 11.5865 (norm. = 0.100829), norm. avg. (of 8) = 0.0567679 fft 4: mflops = 62.6016 (norm. = 0.544776), norm. avg. (of 8) = 0.31623 fft 5: mflops = 6.39376 (norm. = 0.0556402), norm. avg. (of 8) = 0.0571559 fft 6: mflops = 57.0654 (norm. = 0.496599), norm. avg. (of 8) = 0.257538 fft 7: mflops = 30.1748 (norm. = 0.26259), norm. avg. (of 8) = 0.169129 fft 8: mflops = 19.2399 (norm. = 0.167431), norm. avg. 
(of 8) = 0.151587 fft 9: mflops = 75.573 (norm. = 0.657658), norm. avg. (of 8) = 0.337719 fft 10: mflops = 111.848 (norm. = 0.973333), norm. avg. (of 8) = 0.40094 fft 11: mflops = 19.4181 (norm. = 0.168981), norm. avg. (of 7) = 0.135438 fft 12: mflops = 111.848 (norm. = 0.973333), norm. avg. (of 8) = 0.523877 fft 13: mflops = 89.2405 (norm. = 0.776596), norm. avg. (of 8) = 0.472659 fft 14: mflops = 114.912 (norm. = 1), norm. avg. (of 8) = 0.853707 fft 15: mflops = 114.131 (norm. = 0.993197), norm. avg. (of 8) = 0.828771 fft 16: mflops = 94.254 (norm. = 0.820225), norm. avg. (of 8) = 0.884477 fft 17: mflops = 104.206 (norm. = 0.906832), norm. avg. (of 6) = 0.613082 fft 18: mflops = 49.3448 (norm. = 0.429412), norm. avg. (of 8) = 0.29315 fft 19: mflops = 28.5327 (norm. = 0.248299), norm. avg. (of 8) = 0.176704 fft 20: mflops = 36.1578 (norm. = 0.314655), norm. avg. (of 8) = 0.199171 fft 21: mflops = 57.8525 (norm. = 0.503448), norm. avg. (of 8) = 0.454299 fft 22: mflops = 30.8405 (norm. = 0.268382), norm. avg. (of 7) = 0.226577 fft 23: mflops = 42.799 (norm. = 0.372449), norm. avg. (of 7) = 0.279917 fft 24: mflops = 39.9458 (norm. = 0.347619), norm. avg. (of 7) = 0.258067 fft 25: mflops = 37.7865 (norm. = 0.328829), norm. avg. (of 7) = 0.176986 fft 26: mflops = 17.05 (norm. = 0.148374), norm. avg. (of 8) = 0.102385 fft 27: mflops = 26.0516 (norm. = 0.226708), norm. avg. (of 8) = 0.1353 fft 28: mflops = 45.5903 (norm. = 0.396739), norm. avg. (of 8) = 0.240801 fft 29: mflops = 33.5544 (norm. = 0.292), norm. avg. (of 8) = 0.185689 fft 30: mflops = 82.2413 (norm. = 0.715686), norm. avg. (of 8) = 0.495994 fft 31: mflops = 84.7334 (norm. = 0.737374), norm. avg. (of 8) = 0.500869 fft 32: mflops = 34.1 (norm. = 0.296748), norm. avg. (of 5) = 0.293783 fft 33: mflops = 21.1834 (norm. = 0.184343), norm. avg. (of 7) = 0.0861182 fft 34: mflops = 66.052 (norm. = 0.574803), norm. avg. (of 7) = 0.397497 fft 35: mflops = 58.2542 (norm. = 0.506944), norm. avg. 
(of 8) = 0.274459 fft 36: mflops = 56.6798 (norm. = 0.493243), norm. avg. (of 8) = 0.273924 fft 37: mflops = 68.7591 (norm. = 0.598361), norm. avg. (of 8) = 0.358821 fft 38: mflops = 18.7246 (norm. = 0.162946), norm. avg. (of 8) = 0.147733 fft 39: mflops = 30.1748 (norm. = 0.26259), norm. avg. (of 8) = 0.179038 fft 40: mflops = 24.2445 (norm. = 0.210983), norm. avg. (of 8) = 0.14043 fft 41: mflops = 4.63972 (norm. = 0.0403761), norm. avg. (of 8) = 0.0552382 fft 42: mflops = 101.068 (norm. = 0.879518), norm. avg. (of 6) = 0.621287 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.19 s, 2048 iters, t-(init.)=1.15 s t(norm)=0.121858, mflops=41.0312 (err=7.0e-16) 1. Arndt DIT: elapsed time t=1.29 s, 2048 iters, t-(init.)=1.25 s t(norm)=0.132455, mflops=37.7487 (err=7.4e-16) 2. Arndt Split-Radix: elapsed time t=1.52 s, 2048 iters, t-(init.)=1.49 s t(norm)=0.157886, mflops=31.6684 (err=7.0e-16) 3. Arndt 4-step: elapsed time t=1.88 s, 1024 iters, t-(init.)=1.86 s t(norm)=0.394185, mflops=12.6844 (err=7.0e-16) 4. Bailey: elapsed time t=1.33 s, 4096 iters, t-(init.)=1.25 s t(norm)=0.0662274, mflops=75.4975 (err=7.3e-16) 5. Beauregard: elapsed time t=1.83 s, 512 iters, t-(init.)=1.82 s t(norm)=0.771417, mflops=6.48158 (err=7.9e-16) 6. Bergland: elapsed time t=1.56 s, 4096 iters, t-(init.)=1.49 s t(norm)=0.078943, mflops=63.3368 (err=6.9e-16) 7. Brenner: elapsed time t=1.47 s, 2048 iters, t-(init.)=1.44 s t(norm)=0.152588, mflops=32.768 (err=7.3e-16) 8. Burrus: elapsed time t=1.14 s, 1024 iters, t-(init.)=1.12 s t(norm)=0.237359, mflops=21.0651 (err=7.0e-16) 9. CWP (min N) (N=520): elapsed time t=1.14 s, 4096 iters, t-(init.)=1.07 s t(norm)=0.0566906, mflops=88.198 10. CWP (best N) (N=560): elapsed time t=1.96 s, 8192 iters, t-(init.)=1.8 s t(norm)=0.0476837, mflops=104.858 11. Edelblute: elapsed time t=1.13 s, 1024 iters, t-(init.)=1.11 s t(norm)=0.23524, mflops=21.2549 (err=7.2e-16) 12. 
FFTPACK: elapsed time t=1.07 s, 4096 iters, t-(init.)=1 s t(norm)=0.0529819, mflops=94.3718 (err=7.6e-16) 13. FFTPACK (f2c): elapsed time t=1.23 s, 4096 iters, t-(init.)=1.16 s t(norm)=0.061459, mflops=81.355 (err=7.6e-16) FFTW_MEASURE plan: (cost = 2.050781e-04) FFTW_TWIDDLE 16 FFTW_NOTW 32 14. FFTW: elapsed time t=1.74 s, 8192 iters, t-(init.)=1.6 s t(norm)=0.0423855, mflops=117.965 (err=7.4e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.74 s, 8192 iters, t-(init.)=1.6 s t(norm)=0.0423855, mflops=117.965 (err=7.4e-16) 16. Frigo-old: elapsed time t=1.17 s, 4096 iters, t-(init.)=1.1 s t(norm)=0.0582801, mflops=85.7926 (err=7.0e-16) 17. Green: elapsed time t=1.74 s, 8192 iters, t-(init.)=1.6 s t(norm)=0.0423855, mflops=117.965 (err=7.6e-16) 18. GSL: elapsed time t=1.08 s, 2048 iters, t-(init.)=1.04 s t(norm)=0.110202, mflops=45.3711 (err=7.4e-16) 19. GSL DIT: elapsed time t=1.65 s, 2048 iters, t-(init.)=1.62 s t(norm)=0.171661, mflops=29.1271 (err=8.7e-16) 20. GSL DIF: elapsed time t=1.28 s, 2048 iters, t-(init.)=1.24 s t(norm)=0.131395, mflops=38.0532 (err=8.3e-16) 21. Krukar: elapsed time t=1.63 s, 4096 iters, t-(init.)=1.56 s t(norm)=0.0826518, mflops=60.4948 (err=7.8e-16) 22. Mayer (Buneman): elapsed time t=1.44 s, 2048 iters, t-(init.)=1.41 s t(norm)=0.149409, mflops=33.4652 (err=6.9e-16) 23. Mayer (simple): elapsed time t=1.05 s, 2048 iters, t-(init.)=1.02 s t(norm)=0.108083, mflops=46.2607 24. Mayer (lookup): elapsed time t=1.12 s, 2048 iters, t-(init.)=1.09 s t(norm)=0.115501, mflops=43.2898 (err=7.0e-16) 25. Monro: elapsed time t=1.17 s, 2048 iters, t-(init.)=1.13 s t(norm)=0.119739, mflops=41.7575 (err=7.6e-16) 26. NAPACK (f2c): elapsed time t=1.33 s, 1024 iters, t-(init.)=1.31 s t(norm)=0.277625, mflops=18.0099 (err=6.9e-15) 27. Nielsen: elapsed time t=1.73 s, 2048 iters, t-(init.)=1.69 s t(norm)=0.179079, mflops=27.9207 (err=3.2e-15) 28. 
NR (C): elapsed time t=1 s, 2048 iters, t-(init.)=0.97 s t(norm)=0.102785, mflops=48.6453 (err=7.2e-16) 29. NR (F): elapsed time t=1.32 s, 2048 iters, t-(init.)=1.28 s t(norm)=0.135634, mflops=36.864 (err=7.2e-16) 30. Ooura (C): elapsed time t=1.2 s, 4096 iters, t-(init.)=1.13 s t(norm)=0.0598696, mflops=83.5149 (err=6.9e-16) 31. Ooura (F): elapsed time t=1.16 s, 4096 iters, t-(init.)=1.09 s t(norm)=0.0577503, mflops=86.5797 (err=6.9e-16) 32. QFT: elapsed time t=1.38 s, 2048 iters, t-(init.)=1.34 s t(norm)=0.141992, mflops=35.2134 (err=9.3e-16) 33. Ransom: elapsed time t=1.22 s, 1024 iters, t-(init.)=1.2 s t(norm)=0.254313, mflops=19.6608 (err=1.8e-15) 34. SCIPORT: elapsed time t=1.43 s, 4096 iters, t-(init.)=1.36 s t(norm)=0.0720554, mflops=69.3911 (err=1.3e-07) 35. Singleton: elapsed time t=1.63 s, 4096 iters, t-(init.)=1.55 s t(norm)=0.082122, mflops=60.8851 (err=1.1e-15) 36. Singleton (f2c): elapsed time t=1.68 s, 4096 iters, t-(init.)=1.61 s t(norm)=0.0853009, mflops=58.616 (err=1.1e-15) 37. Sorensen: elapsed time t=1.28 s, 4096 iters, t-(init.)=1.21 s t(norm)=0.0641081, mflops=77.9933 (err=6.9e-16) 38. Sorensen DIT: elapsed time t=1.18 s, 1024 iters, t-(init.)=1.16 s t(norm)=0.245836, mflops=20.3388 (err=7.2e-16) 39. Temperton: elapsed time t=1.74 s, 2048 iters, t-(init.)=1.7 s t(norm)=0.180138, mflops=27.7564 (err=7.4e-16) 40. Temperton (f2c): elapsed time t=1.1 s, 1024 iters, t-(init.)=1.09 s t(norm)=0.231001, mflops=21.6449 (err=7.4e-16) 41. Valkenburg: elapsed time t=1.29 s, 256 iters, t-(init.)=1.29 s t(norm)=1.09355, mflops=4.57228 (err=8.8e-16) 42. ESSL: elapsed time t=1.76 s, 8192 iters, t-(init.)=1.62 s t(norm)=0.0429153, mflops=116.508 (err=6.9e-16) Top mflops for N=512 = 117.965 Normalized results and averages for N=512: fft 0: mflops = 41.0312 (norm. = 0.347826), norm. avg. (of 9) = 0.329417 fft 1: mflops = 37.7487 (norm. = 0.32), norm. avg. (of 9) = 0.314608 fft 2: mflops = 31.6684 (norm. = 0.268456), norm. avg. 
(of 9) = 0.222196 fft 3: mflops = 12.6844 (norm. = 0.107527), norm. avg. (of 9) = 0.0624078 fft 4: mflops = 75.4975 (norm. = 0.64), norm. avg. (of 9) = 0.352204 fft 5: mflops = 6.48158 (norm. = 0.0549451), norm. avg. (of 9) = 0.0569103 fft 6: mflops = 63.3368 (norm. = 0.536913), norm. avg. (of 9) = 0.288579 fft 7: mflops = 32.768 (norm. = 0.277778), norm. avg. (of 9) = 0.181201 fft 8: mflops = 21.0651 (norm. = 0.178571), norm. avg. (of 9) = 0.154585 fft 9: mflops = 88.198 (norm. = 0.747664), norm. avg. (of 9) = 0.383268 fft 10: mflops = 104.858 (norm. = 0.888889), norm. avg. (of 9) = 0.455156 fft 11: mflops = 21.2549 (norm. = 0.18018), norm. avg. (of 8) = 0.14103 fft 12: mflops = 94.3718 (norm. = 0.8), norm. avg. (of 9) = 0.554558 fft 13: mflops = 81.355 (norm. = 0.689655), norm. avg. (of 9) = 0.49677 fft 14: mflops = 117.965 (norm. = 1), norm. avg. (of 9) = 0.869962 fft 15: mflops = 117.965 (norm. = 1), norm. avg. (of 9) = 0.847797 fft 16: mflops = 85.7926 (norm. = 0.727273), norm. avg. (of 9) = 0.86701 fft 17: mflops = 117.965 (norm. = 1), norm. avg. (of 7) = 0.668356 fft 18: mflops = 45.3711 (norm. = 0.384615), norm. avg. (of 9) = 0.303313 fft 19: mflops = 29.1271 (norm. = 0.246914), norm. avg. (of 9) = 0.184505 fft 20: mflops = 38.0532 (norm. = 0.322581), norm. avg. (of 9) = 0.212883 fft 21: mflops = 60.4948 (norm. = 0.512821), norm. avg. (of 9) = 0.460801 fft 22: mflops = 33.4652 (norm. = 0.283688), norm. avg. (of 8) = 0.233716 fft 23: mflops = 46.2607 (norm. = 0.392157), norm. avg. (of 8) = 0.293947 fft 24: mflops = 43.2898 (norm. = 0.366972), norm. avg. (of 8) = 0.27168 fft 25: mflops = 41.7575 (norm. = 0.353982), norm. avg. (of 8) = 0.199111 fft 26: mflops = 18.0099 (norm. = 0.152672), norm. avg. (of 9) = 0.107972 fft 27: mflops = 27.9207 (norm. = 0.236686), norm. avg. (of 9) = 0.146566 fft 28: mflops = 48.6453 (norm. = 0.412371), norm. avg. (of 9) = 0.259864 fft 29: mflops = 36.864 (norm. = 0.3125), norm. avg. 
(of 9) = 0.199779 fft 30: mflops = 83.5149 (norm. = 0.707965), norm. avg. (of 9) = 0.519546 fft 31: mflops = 86.5797 (norm. = 0.733945), norm. avg. (of 9) = 0.526767 fft 32: mflops = 35.2134 (norm. = 0.298507), norm. avg. (of 6) = 0.29457 fft 33: mflops = 19.6608 (norm. = 0.166667), norm. avg. (of 8) = 0.0961868 fft 34: mflops = 69.3911 (norm. = 0.588235), norm. avg. (of 8) = 0.421339 fft 35: mflops = 60.8851 (norm. = 0.516129), norm. avg. (of 9) = 0.301312 fft 36: mflops = 58.616 (norm. = 0.496894), norm. avg. (of 9) = 0.298698 fft 37: mflops = 77.9933 (norm. = 0.661157), norm. avg. (of 9) = 0.392414 fft 38: mflops = 20.3388 (norm. = 0.172414), norm. avg. (of 9) = 0.150475 fft 39: mflops = 27.7564 (norm. = 0.235294), norm. avg. (of 9) = 0.185289 fft 40: mflops = 21.6449 (norm. = 0.183486), norm. avg. (of 9) = 0.145214 fft 41: mflops = 4.57228 (norm. = 0.0387597), norm. avg. (of 9) = 0.0534073 fft 42: mflops = 116.508 (norm. = 0.987654), norm. avg. (of 7) = 0.673626 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.18 s, 1024 iters, t-(init.)=1.14 s t(norm)=0.108719, mflops=45.9902 (err=1.1e-15) 1. Arndt DIT: elapsed time t=1.31 s, 1024 iters, t-(init.)=1.27 s t(norm)=0.121117, mflops=41.2825 (err=1.1e-15) 2. Arndt Split-Radix: elapsed time t=1.58 s, 1024 iters, t-(init.)=1.55 s t(norm)=0.14782, mflops=33.825 (err=1.1e-15) 3. Arndt 4-step: elapsed time t=1.81 s, 512 iters, t-(init.)=1.79 s t(norm)=0.341415, mflops=14.6449 (err=1.1e-15) 4. Bailey: elapsed time t=1.55 s, 2048 iters, t-(init.)=1.48 s t(norm)=0.0705719, mflops=70.8497 (err=1.1e-15) 5. Beauregard: elapsed time t=1.02 s, 128 iters, t-(init.)=1.02 s t(norm)=0.778198, mflops=6.4251 (err=1.0e-15) 6. Bergland: elapsed time t=1.7 s, 2048 iters, t-(init.)=1.62 s t(norm)=0.0772476, mflops=64.7269 (err=1.0e-15) 7. Brenner: elapsed time t=1.51 s, 1024 iters, t-(init.)=1.48 s t(norm)=0.141144, mflops=35.4249 (err=1.0e-15) 8. 
Burrus: elapsed time t=1.16 s, 512 iters, t-(init.)=1.14 s t(norm)=0.217438, mflops=22.9951 (err=1.1e-15) 9. CWP (min N) (N=1040): elapsed time t=1.3 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.058651, mflops=85.2501 10. CWP (best N) (N=1040): elapsed time t=1.3 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.058651, mflops=85.2501 11. Edelblute: elapsed time t=1.16 s, 512 iters, t-(init.)=1.14 s t(norm)=0.217438, mflops=22.9951 (err=1.1e-15) 12. FFTPACK: elapsed time t=1.09 s, 2048 iters, t-(init.)=1.02 s t(norm)=0.0486374, mflops=102.802 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.28 s, 2048 iters, t-(init.)=1.22 s t(norm)=0.0581741, mflops=85.9489 (err=1.0e-15) FFTW_MEASURE plan: (cost = 4.687500e-04) FFTW_TWIDDLE 16 FFTW_NOTW 64 14. FFTW: elapsed time t=1.9 s, 4096 iters, t-(init.)=1.76 s t(norm)=0.0419617, mflops=119.156 (err=1.0e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.07 s, 2048 iters, t-(init.)=1.01 s t(norm)=0.0481606, mflops=103.819 (err=1.0e-15) 16. Frigo-old: elapsed time t=1.52 s, 2048 iters, t-(init.)=1.45 s t(norm)=0.0691414, mflops=72.3156 (err=1.0e-15) 17. Green: elapsed time t=1.82 s, 4096 iters, t-(init.)=1.68 s t(norm)=0.0400543, mflops=124.83 (err=1.1e-15) 18. GSL: elapsed time t=1.11 s, 1024 iters, t-(init.)=1.07 s t(norm)=0.102043, mflops=48.9989 (err=1.0e-15) 19. GSL DIT: elapsed time t=1.78 s, 1024 iters, t-(init.)=1.75 s t(norm)=0.166893, mflops=29.9593 (err=1.2e-15) 20. GSL DIF: elapsed time t=1.36 s, 1024 iters, t-(init.)=1.32 s t(norm)=0.125885, mflops=39.7188 (err=1.3e-15) 21. Krukar: elapsed time t=1.66 s, 1024 iters, t-(init.)=1.62 s t(norm)=0.154495, mflops=32.3635 (err=1.0e-15) 22. Mayer (Buneman): elapsed time t=1.51 s, 1024 iters, t-(init.)=1.47 s t(norm)=0.14019, mflops=35.6659 (err=1.1e-15) 23. Mayer (simple): elapsed time t=1.11 s, 1024 iters, t-(init.)=1.07 s t(norm)=0.102043, mflops=48.9989 24. 
Mayer (lookup): elapsed time t=1.2 s, 1024 iters, t-(init.)=1.17 s t(norm)=0.11158, mflops=44.8109 (err=1.1e-15) 25. Monro: elapsed time t=1.19 s, 1024 iters, t-(init.)=1.15 s t(norm)=0.109673, mflops=45.5903 (err=1.1e-15) 26. NAPACK (f2c): elapsed time t=1.46 s, 512 iters, t-(init.)=1.44 s t(norm)=0.274658, mflops=18.2044 (err=1.4e-14) 27. Nielsen: elapsed time t=1.86 s, 1024 iters, t-(init.)=1.83 s t(norm)=0.174522, mflops=28.6496 (err=5.9e-15) 28. NR (C): elapsed time t=1.06 s, 1024 iters, t-(init.)=1.03 s t(norm)=0.0982285, mflops=50.9017 (err=1.1e-15) 29. NR (F): elapsed time t=1.39 s, 1024 iters, t-(init.)=1.35 s t(norm)=0.128746, mflops=38.8361 (err=1.1e-15) 30. Ooura (C): elapsed time t=1.28 s, 2048 iters, t-(init.)=1.21 s t(norm)=0.0576973, mflops=86.6592 (err=1.0e-15) 31. Ooura (F): elapsed time t=1.2 s, 2048 iters, t-(init.)=1.13 s t(norm)=0.0538826, mflops=92.7943 (err=1.0e-15) 32. QFT: elapsed time t=1.54 s, 1024 iters, t-(init.)=1.5 s t(norm)=0.143051, mflops=34.9525 (err=1.5e-15) 33. Ransom: elapsed time t=1.99 s, 1024 iters, t-(init.)=1.95 s t(norm)=0.185966, mflops=26.8866 (err=2.4e-15) 34. SCIPORT: elapsed time t=1.55 s, 2048 iters, t-(init.)=1.48 s t(norm)=0.0705719, mflops=70.8497 (err=1.4e-07) 35. Singleton: elapsed time t=1.67 s, 2048 iters, t-(init.)=1.61 s t(norm)=0.0767708, mflops=65.1289 (err=2.8e-15) 36. Singleton (f2c): elapsed time t=1.74 s, 2048 iters, t-(init.)=1.67 s t(norm)=0.0796318, mflops=62.789 (err=1.6e-15) 37. Sorensen: elapsed time t=1.33 s, 2048 iters, t-(init.)=1.26 s t(norm)=0.0600815, mflops=83.2203 (err=1.1e-15) 38. Sorensen DIT: elapsed time t=1.2 s, 512 iters, t-(init.)=1.18 s t(norm)=0.225067, mflops=22.2156 (err=1.1e-15) 39. Temperton: elapsed time t=1.72 s, 1024 iters, t-(init.)=1.69 s t(norm)=0.161171, mflops=31.023 (err=1.0e-15) 40. Temperton (f2c): elapsed time t=1.02 s, 512 iters, t-(init.)=1 s t(norm)=0.190735, mflops=26.2144 (err=1.0e-15) 41. 
Valkenburg: elapsed time t=1.42 s, 128 iters, t-(init.)=1.42 s t(norm)=1.08337, mflops=4.61521 (err=1.1e-15) 42. ESSL: elapsed time t=1.93 s, 4096 iters, t-(init.)=1.79 s t(norm)=0.0426769, mflops=117.159 (err=9.9e-16) Top mflops for N=1024 = 124.83 Normalized results and averages for N=1024: fft 0: mflops = 45.9902 (norm. = 0.368421), norm. avg. (of 10) = 0.333317 fft 1: mflops = 41.2825 (norm. = 0.330709), norm. avg. (of 10) = 0.316218 fft 2: mflops = 33.825 (norm. = 0.270968), norm. avg. (of 10) = 0.227073 fft 3: mflops = 14.6449 (norm. = 0.117318), norm. avg. (of 10) = 0.0678989 fft 4: mflops = 70.8497 (norm. = 0.567568), norm. avg. (of 10) = 0.373741 fft 5: mflops = 6.4251 (norm. = 0.0514706), norm. avg. (of 10) = 0.0563663 fft 6: mflops = 64.7269 (norm. = 0.518519), norm. avg. (of 10) = 0.311573 fft 7: mflops = 35.4249 (norm. = 0.283784), norm. avg. (of 10) = 0.19146 fft 8: mflops = 22.9951 (norm. = 0.184211), norm. avg. (of 10) = 0.157548 fft 9: mflops = 85.2501 (norm. = 0.682927), norm. avg. (of 10) = 0.413234 fft 10: mflops = 85.2501 (norm. = 0.682927), norm. avg. (of 10) = 0.477934 fft 11: mflops = 22.9951 (norm. = 0.184211), norm. avg. (of 9) = 0.145828 fft 12: mflops = 102.802 (norm. = 0.823529), norm. avg. (of 10) = 0.581455 fft 13: mflops = 85.9489 (norm. = 0.688525), norm. avg. (of 10) = 0.515945 fft 14: mflops = 119.156 (norm. = 0.954545), norm. avg. (of 10) = 0.87842 fft 15: mflops = 103.819 (norm. = 0.831683), norm. avg. (of 10) = 0.846185 fft 16: mflops = 72.3156 (norm. = 0.57931), norm. avg. (of 10) = 0.83824 fft 17: mflops = 124.83 (norm. = 1), norm. avg. (of 8) = 0.709811 fft 18: mflops = 48.9989 (norm. = 0.392523), norm. avg. (of 10) = 0.312234 fft 19: mflops = 29.9593 (norm. = 0.24), norm. avg. (of 10) = 0.190054 fft 20: mflops = 39.7188 (norm. = 0.318182), norm. avg. (of 10) = 0.223413 fft 21: mflops = 32.3635 (norm. = 0.259259), norm. avg. (of 10) = 0.440647 fft 22: mflops = 35.6659 (norm. = 0.285714), norm. avg. 
(of 9) = 0.239494 fft 23: mflops = 48.9989 (norm. = 0.392523), norm. avg. (of 9) = 0.3049 fft 24: mflops = 44.8109 (norm. = 0.358974), norm. avg. (of 9) = 0.281379 fft 25: mflops = 45.5903 (norm. = 0.365217), norm. avg. (of 9) = 0.217567 fft 26: mflops = 18.2044 (norm. = 0.145833), norm. avg. (of 10) = 0.111758 fft 27: mflops = 28.6496 (norm. = 0.229508), norm. avg. (of 10) = 0.15486 fft 28: mflops = 50.9017 (norm. = 0.407767), norm. avg. (of 10) = 0.274655 fft 29: mflops = 38.8361 (norm. = 0.311111), norm. avg. (of 10) = 0.210912 fft 30: mflops = 86.6592 (norm. = 0.694215), norm. avg. (of 10) = 0.537013 fft 31: mflops = 92.7943 (norm. = 0.743363), norm. avg. (of 10) = 0.548426 fft 32: mflops = 34.9525 (norm. = 0.28), norm. avg. (of 7) = 0.292489 fft 33: mflops = 26.8866 (norm. = 0.215385), norm. avg. (of 9) = 0.109431 fft 34: mflops = 70.8497 (norm. = 0.567568), norm. avg. (of 9) = 0.437587 fft 35: mflops = 65.1289 (norm. = 0.521739), norm. avg. (of 10) = 0.323354 fft 36: mflops = 62.789 (norm. = 0.502994), norm. avg. (of 10) = 0.319128 fft 37: mflops = 83.2203 (norm. = 0.666667), norm. avg. (of 10) = 0.419839 fft 38: mflops = 22.2156 (norm. = 0.177966), norm. avg. (of 10) = 0.153224 fft 39: mflops = 31.023 (norm. = 0.248521), norm. avg. (of 10) = 0.191612 fft 40: mflops = 26.2144 (norm. = 0.21), norm. avg. (of 10) = 0.151693 fft 41: mflops = 4.61521 (norm. = 0.0369718), norm. avg. (of 10) = 0.0517637 fft 42: mflops = 117.159 (norm. = 0.938547), norm. avg. (of 8) = 0.706741 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.28 s, 512 iters, t-(init.)=1.25 s t(norm)=0.108372, mflops=46.1373 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.42 s, 512 iters, t-(init.)=1.39 s t(norm)=0.12051, mflops=41.4904 (err=1.5e-15) 2. Arndt Split-Radix: elapsed time t=1.62 s, 512 iters, t-(init.)=1.58 s t(norm)=0.136982, mflops=36.5011 (err=1.5e-15) 3. 
Arndt 4-step: elapsed time t=1.93 s, 256 iters, t-(init.)=1.91 s t(norm)=0.331185, mflops=15.0973 (err=1.4e-15) 4. Bailey: elapsed time t=1.02 s, 512 iters, t-(init.)=0.99 s t(norm)=0.0858307, mflops=58.2542 (err=1.5e-15) 5. Beauregard: elapsed time t=1.09 s, 64 iters, t-(init.)=1.08 s t(norm)=0.749068, mflops=6.67496 (err=1.5e-15) 6. Bergland: elapsed time t=1.73 s, 1024 iters, t-(init.)=1.65 s t(norm)=0.0715256, mflops=69.9051 (err=1.4e-15) 7. Brenner: elapsed time t=1.58 s, 512 iters, t-(init.)=1.54 s t(norm)=0.133514, mflops=37.4491 (err=1.5e-15) 8. Burrus: elapsed time t=1.19 s, 256 iters, t-(init.)=1.17 s t(norm)=0.202873, mflops=24.646 (err=1.4e-15) 9. CWP (min N) (N=2145): elapsed time t=1.67 s, 1024 iters, t-(init.)=1.6 s t(norm)=0.0693581, mflops=72.0896 10. CWP (best N) (N=2184): elapsed time t=1.33 s, 1024 iters, t-(init.)=1.25 s t(norm)=0.054186, mflops=92.2747 11. Edelblute: elapsed time t=1.18 s, 256 iters, t-(init.)=1.16 s t(norm)=0.201139, mflops=24.8585 (err=1.4e-15) 12. FFTPACK: elapsed time t=1.8 s, 1024 iters, t-(init.)=1.73 s t(norm)=0.0749935, mflops=66.6725 (err=1.5e-15) 13. FFTPACK (f2c): elapsed time t=1.93 s, 1024 iters, t-(init.)=1.86 s t(norm)=0.0806288, mflops=62.0126 (err=1.5e-15) FFTW_MEASURE plan: (cost = 1.562500e-03) FFTW_TWIDDLE 64 FFTW_NOTW 32 14. FFTW: elapsed time t=1.6 s, 1024 iters, t-(init.)=1.53 s t(norm)=0.0663237, mflops=75.3878 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.62 s, 1024 iters, t-(init.)=1.55 s t(norm)=0.0671907, mflops=74.4151 (err=1.4e-15) 16. Frigo-old: elapsed time t=1.12 s, 512 iters, t-(init.)=1.08 s t(norm)=0.0936335, mflops=53.3997 (err=1.4e-15) 17. Green: elapsed time t=1.02 s, 1024 iters, t-(init.)=0.95 s t(norm)=0.0411814, mflops=121.414 (err=1.5e-15) 18. GSL: elapsed time t=1.44 s, 512 iters, t-(init.)=1.4 s t(norm)=0.121377, mflops=41.1941 (err=1.4e-15) 19. 
GSL DIT: elapsed time t=1.93 s, 512 iters, t-(init.)=1.9 s t(norm)=0.164726, mflops=30.3535 (err=1.9e-15) 20. GSL DIF: elapsed time t=1.45 s, 512 iters, t-(init.)=1.41 s t(norm)=0.122244, mflops=40.9019 (err=2.3e-15) 21. Krukar: elapsed time t=1.83 s, 512 iters, t-(init.)=1.79 s t(norm)=0.155189, mflops=32.2188 (err=1.4e-15) 22. Mayer (Buneman): elapsed time t=1.55 s, 512 iters, t-(init.)=1.51 s t(norm)=0.130913, mflops=38.1932 (err=1.4e-15) 23. Mayer (simple): elapsed time t=1.14 s, 512 iters, t-(init.)=1.1 s t(norm)=0.0953674, mflops=52.4288 24. Mayer (lookup): elapsed time t=1.22 s, 512 iters, t-(init.)=1.18 s t(norm)=0.102303, mflops=48.8743 (err=1.5e-15) 25. Monro: elapsed time t=1.24 s, 512 iters, t-(init.)=1.21 s t(norm)=0.104904, mflops=47.6625 (err=1.5e-15) 26. NAPACK (f2c): elapsed time t=1.63 s, 256 iters, t-(init.)=1.61 s t(norm)=0.279166, mflops=17.9105 (err=1.5e-14) 27. Nielsen: elapsed time t=1.02 s, 256 iters, t-(init.)=1 s t(norm)=0.173395, mflops=28.8358 (err=1.2e-14) 28. NR (C): elapsed time t=1.12 s, 512 iters, t-(init.)=1.09 s t(norm)=0.0945005, mflops=52.9098 (err=1.6e-15) 29. NR (F): elapsed time t=1.46 s, 512 iters, t-(init.)=1.42 s t(norm)=0.123111, mflops=40.6139 (err=1.6e-15) 30. Ooura (C): elapsed time t=1.37 s, 1024 iters, t-(init.)=1.3 s t(norm)=0.0563535, mflops=88.7257 (err=1.4e-15) 31. Ooura (F): elapsed time t=1.33 s, 1024 iters, t-(init.)=1.26 s t(norm)=0.0546195, mflops=91.5423 (err=1.4e-15) 32. QFT: elapsed time t=1.09 s, 256 iters, t-(init.)=1.07 s t(norm)=0.185533, mflops=26.9494 (err=2.1e-15) 33. Ransom: elapsed time t=1.21 s, 256 iters, t-(init.)=1.2 s t(norm)=0.208074, mflops=24.0299 (err=2.9e-15) 34. SCIPORT: elapsed time t=1.79 s, 256 iters, t-(init.)=1.77 s t(norm)=0.30691, mflops=16.2914 (err=1.7e-07) 35. Singleton: elapsed time t=1.91 s, 1024 iters, t-(init.)=1.84 s t(norm)=0.0797619, mflops=62.6866 (err=2.7e-15) 36. 
Singleton (f2c): elapsed time t=1.97 s, 1024 iters, t-(init.)=1.9 s t(norm)=0.0823628, mflops=60.707 (err=2.1e-15) 37. Sorensen: elapsed time t=1.37 s, 1024 iters, t-(init.)=1.3 s t(norm)=0.0563535, mflops=88.7257 (err=1.4e-15) 38. Sorensen DIT: elapsed time t=1.24 s, 256 iters, t-(init.)=1.22 s t(norm)=0.211542, mflops=23.6359 (err=1.5e-15) 39. Temperton: elapsed time t=1.91 s, 512 iters, t-(init.)=1.88 s t(norm)=0.162992, mflops=30.6764 (err=1.5e-15) 40. Temperton (f2c): elapsed time t=1.24 s, 256 iters, t-(init.)=1.23 s t(norm)=0.213276, mflops=23.4438 (err=1.5e-15) 41. Valkenburg: elapsed time t=1.58 s, 64 iters, t-(init.)=1.57 s t(norm)=1.08892, mflops=4.59169 (err=1.5e-15) 42. ESSL: elapsed time t=1.63 s, 1024 iters, t-(init.)=1.56 s t(norm)=0.0676242, mflops=73.9381 (err=1.4e-15) Top mflops for N=2048 = 121.414 Normalized results and averages for N=2048: fft 0: mflops = 46.1373 (norm. = 0.38), norm. avg. (of 11) = 0.337561 fft 1: mflops = 41.4904 (norm. = 0.341727), norm. avg. (of 11) = 0.318537 fft 2: mflops = 36.5011 (norm. = 0.300633), norm. avg. (of 11) = 0.23376 fft 3: mflops = 15.0973 (norm. = 0.124346), norm. avg. (of 11) = 0.0730304 fft 4: mflops = 58.2542 (norm. = 0.479798), norm. avg. (of 11) = 0.383382 fft 5: mflops = 6.67496 (norm. = 0.0549769), norm. avg. (of 11) = 0.05624 fft 6: mflops = 69.9051 (norm. = 0.575758), norm. avg. (of 11) = 0.33559 fft 7: mflops = 37.4491 (norm. = 0.308442), norm. avg. (of 11) = 0.202094 fft 8: mflops = 24.646 (norm. = 0.202991), norm. avg. (of 11) = 0.161679 fft 9: mflops = 72.0896 (norm. = 0.59375), norm. avg. (of 11) = 0.429645 fft 10: mflops = 92.2747 (norm. = 0.76), norm. avg. (of 11) = 0.503576 fft 11: mflops = 24.8585 (norm. = 0.204741), norm. avg. (of 10) = 0.15172 fft 12: mflops = 66.6725 (norm. = 0.549133), norm. avg. (of 11) = 0.578516 fft 13: mflops = 62.0126 (norm. = 0.510753), norm. avg. (of 11) = 0.515473 fft 14: mflops = 75.3878 (norm. = 0.620915), norm. avg. 
(of 11) = 0.855011 fft 15: mflops = 74.4151 (norm. = 0.612903), norm. avg. (of 11) = 0.824978 fft 16: mflops = 53.3997 (norm. = 0.439815), norm. avg. (of 11) = 0.80202 fft 17: mflops = 121.414 (norm. = 1), norm. avg. (of 9) = 0.742054 fft 18: mflops = 41.1941 (norm. = 0.339286), norm. avg. (of 11) = 0.314693 fft 19: mflops = 30.3535 (norm. = 0.25), norm. avg. (of 11) = 0.195504 fft 20: mflops = 40.9019 (norm. = 0.336879), norm. avg. (of 11) = 0.233728 fft 21: mflops = 32.2188 (norm. = 0.265363), norm. avg. (of 11) = 0.424712 fft 22: mflops = 38.1932 (norm. = 0.31457), norm. avg. (of 10) = 0.247001 fft 23: mflops = 52.4288 (norm. = 0.431818), norm. avg. (of 10) = 0.317592 fft 24: mflops = 48.8743 (norm. = 0.402542), norm. avg. (of 10) = 0.293496 fft 25: mflops = 47.6625 (norm. = 0.392562), norm. avg. (of 10) = 0.235067 fft 26: mflops = 17.9105 (norm. = 0.147516), norm. avg. (of 11) = 0.115009 fft 27: mflops = 28.8358 (norm. = 0.2375), norm. avg. (of 11) = 0.162373 fft 28: mflops = 52.9098 (norm. = 0.43578), norm. avg. (of 11) = 0.289302 fft 29: mflops = 40.6139 (norm. = 0.334507), norm. avg. (of 11) = 0.222148 fft 30: mflops = 88.7257 (norm. = 0.730769), norm. avg. (of 11) = 0.554627 fft 31: mflops = 91.5423 (norm. = 0.753968), norm. avg. (of 11) = 0.567112 fft 32: mflops = 26.9494 (norm. = 0.221963), norm. avg. (of 8) = 0.283673 fft 33: mflops = 24.0299 (norm. = 0.197917), norm. avg. (of 10) = 0.11828 fft 34: mflops = 16.2914 (norm. = 0.134181), norm. avg. (of 10) = 0.407246 fft 35: mflops = 62.6866 (norm. = 0.516304), norm. avg. (of 11) = 0.340895 fft 36: mflops = 60.707 (norm. = 0.5), norm. avg. (of 11) = 0.335571 fft 37: mflops = 88.7257 (norm. = 0.730769), norm. avg. (of 11) = 0.448105 fft 38: mflops = 23.6359 (norm. = 0.194672), norm. avg. (of 11) = 0.156992 fft 39: mflops = 30.6764 (norm. = 0.25266), norm. avg. (of 11) = 0.197162 fft 40: mflops = 23.4438 (norm. = 0.193089), norm. avg. (of 11) = 0.155456 fft 41: mflops = 4.59169 (norm. = 0.0378185), norm. avg. 
(of 11) = 0.050496 fft 42: mflops = 73.9381 (norm. = 0.608974), norm. avg. (of 9) = 0.695878 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.39 s, 256 iters, t-(init.)=1.35 s t(norm)=0.107288, mflops=46.6034 (err=2.6e-15) 1. Arndt DIT: elapsed time t=1.55 s, 256 iters, t-(init.)=1.51 s t(norm)=0.120004, mflops=41.6653 (err=2.6e-15) 2. Arndt Split-Radix: elapsed time t=1.9 s, 256 iters, t-(init.)=1.86 s t(norm)=0.14782, mflops=33.825 (err=2.6e-15) 3. Arndt 4-step: elapsed time t=1.01 s, 64 iters, t-(init.)=1 s t(norm)=0.317891, mflops=15.7286 (err=2.6e-15) 4. Bailey: elapsed time t=1.79 s, 128 iters, t-(init.)=1.77 s t(norm)=0.281334, mflops=17.7725 (err=2.6e-15) 5. Beauregard: elapsed time t=1.23 s, 32 iters, t-(init.)=1.22 s t(norm)=0.775655, mflops=6.44616 (err=2.7e-15) 6. Bergland: elapsed time t=1.96 s, 512 iters, t-(init.)=1.89 s t(norm)=0.0751019, mflops=66.5763 (err=2.6e-15) 7. Brenner: elapsed time t=1.7 s, 256 iters, t-(init.)=1.66 s t(norm)=0.131925, mflops=37.9003 (err=2.6e-15) 8. Burrus: elapsed time t=1.32 s, 128 iters, t-(init.)=1.3 s t(norm)=0.206629, mflops=24.1979 (err=2.6e-15) 9. CWP (min N) (N=4290): elapsed time t=1.92 s, 512 iters, t-(init.)=1.8 s t(norm)=0.0715256, mflops=69.9051 10. CWP (best N) (N=4368): elapsed time t=1.64 s, 512 iters, t-(init.)=1.49 s t(norm)=0.0592073, mflops=84.4491 11. Edelblute: elapsed time t=1.34 s, 128 iters, t-(init.)=1.32 s t(norm)=0.209808, mflops=23.8313 (err=2.6e-15) 12. FFTPACK: elapsed time t=1.02 s, 128 iters, t-(init.)=1 s t(norm)=0.158946, mflops=31.4573 (err=2.7e-15) 13. FFTPACK (f2c): elapsed time t=1.17 s, 128 iters, t-(init.)=1.15 s t(norm)=0.182788, mflops=27.3542 (err=2.7e-15) FFTW_MEASURE plan: (cost = 4.218750e-03) FFTW_TWIDDLE 64 FFTW_NOTW 64 14. FFTW: elapsed time t=1.11 s, 256 iters, t-(init.)=1.07 s t(norm)=0.085036, mflops=58.7987 (err=2.6e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.11 s, 256 iters, t-(init.)=1.07 s t(norm)=0.085036, mflops=58.7987 (err=2.7e-15) 16. Frigo-old: elapsed time t=1.74 s, 256 iters, t-(init.)=1.7 s t(norm)=0.135104, mflops=37.0086 (err=2.6e-15) 17. Green: elapsed time t=1.35 s, 512 iters, t-(init.)=1.28 s t(norm)=0.0508626, mflops=98.304 (err=2.7e-15) 18. GSL: elapsed time t=1.1 s, 128 iters, t-(init.)=1.08 s t(norm)=0.171661, mflops=29.1271 (err=2.7e-15) 19. GSL DIT: elapsed time t=1.05 s, 128 iters, t-(init.)=1.03 s t(norm)=0.163714, mflops=30.541 (err=3.0e-15) 20. GSL DIF: elapsed time t=1.59 s, 256 iters, t-(init.)=1.55 s t(norm)=0.123183, mflops=40.59 (err=3.1e-15) 21. Krukar: elapsed time t=1.11 s, 128 iters, t-(init.)=1.09 s t(norm)=0.173251, mflops=28.8599 (err=2.7e-15) 22. Mayer (Buneman): elapsed time t=1.65 s, 256 iters, t-(init.)=1.61 s t(norm)=0.127951, mflops=39.0774 (err=2.6e-15) 23. Mayer (simple): elapsed time t=1.23 s, 256 iters, t-(init.)=1.2 s t(norm)=0.0953674, mflops=52.4288 24. Mayer (lookup): elapsed time t=1.4 s, 256 iters, t-(init.)=1.36 s t(norm)=0.108083, mflops=46.2607 (err=2.6e-15) 25. Monro: elapsed time t=1.32 s, 256 iters, t-(init.)=1.28 s t(norm)=0.101725, mflops=49.152 (err=2.6e-15) 26. NAPACK (f2c): elapsed time t=1.21 s, 64 iters, t-(init.)=1.2 s t(norm)=0.38147, mflops=13.1072 (err=5.0e-14) 27. Nielsen: elapsed time t=1.28 s, 128 iters, t-(init.)=1.26 s t(norm)=0.200272, mflops=24.9661 (err=2.3e-14) 28. NR (C): elapsed time t=1.22 s, 256 iters, t-(init.)=1.18 s t(norm)=0.093778, mflops=53.3174 (err=2.6e-15) 29. NR (F): elapsed time t=1.58 s, 256 iters, t-(init.)=1.54 s t(norm)=0.122388, mflops=40.8536 (err=2.6e-15) 30. Ooura (C): elapsed time t=1.71 s, 512 iters, t-(init.)=1.63 s t(norm)=0.0647704, mflops=77.1958 (err=2.6e-15) 31. Ooura (F): elapsed time t=1.6 s, 512 iters, t-(init.)=1.52 s t(norm)=0.0603994, mflops=82.7823 (err=2.6e-15) 32. QFT: elapsed time t=1.32 s, 128 iters, t-(init.)=1.3 s t(norm)=0.206629, mflops=24.1979 (err=3.3e-15) 33. 
Ransom: elapsed time t=1.1 s, 128 iters, t-(init.)=1.08 s t(norm)=0.171661, mflops=29.1271 (err=3.3e-15) 34. SCIPORT: elapsed time t=1.49 s, 64 iters, t-(init.)=1.48 s t(norm)=0.470479, mflops=10.6275 (err=1.9e-07) 35. Singleton: elapsed time t=1.97 s, 512 iters, t-(init.)=1.89 s t(norm)=0.0751019, mflops=66.5763 (err=3.8e-15) 36. Singleton (f2c): elapsed time t=1.03 s, 256 iters, t-(init.)=0.99 s t(norm)=0.0786781, mflops=63.5501 (err=4.0e-15) 37. Sorensen: elapsed time t=1.07 s, 256 iters, t-(init.)=1.03 s t(norm)=0.081857, mflops=61.0821 (err=2.6e-15) 38. Sorensen DIT: elapsed time t=1.37 s, 128 iters, t-(init.)=1.35 s t(norm)=0.214577, mflops=23.3017 (err=2.6e-15) 39. Temperton: elapsed time t=1.19 s, 128 iters, t-(init.)=1.17 s t(norm)=0.185966, mflops=26.8866 (err=2.7e-15) 40. Temperton (f2c): elapsed time t=1.48 s, 128 iters, t-(init.)=1.46 s t(norm)=0.232061, mflops=21.5461 (err=2.7e-15) 41. Valkenburg: elapsed time t=1.86 s, 32 iters, t-(init.)=1.85 s t(norm)=1.1762, mflops=4.25098 (err=2.8e-15) 42. ESSL: elapsed time t=1.12 s, 256 iters, t-(init.)=1.09 s t(norm)=0.0866254, mflops=57.7198 (err=2.6e-15) Top mflops for N=4096 = 98.304 Normalized results and averages for N=4096: fft 0: mflops = 46.6034 (norm. = 0.474074), norm. avg. (of 12) = 0.348937 fft 1: mflops = 41.6653 (norm. = 0.423841), norm. avg. (of 12) = 0.327313 fft 2: mflops = 33.825 (norm. = 0.344086), norm. avg. (of 12) = 0.242954 fft 3: mflops = 15.7286 (norm. = 0.16), norm. avg. (of 12) = 0.0802779 fft 4: mflops = 17.7725 (norm. = 0.180791), norm. avg. (of 12) = 0.3665 fft 5: mflops = 6.44616 (norm. = 0.0655738), norm. avg. (of 12) = 0.0570178 fft 6: mflops = 66.5763 (norm. = 0.677249), norm. avg. (of 12) = 0.364062 fft 7: mflops = 37.9003 (norm. = 0.385542), norm. avg. (of 12) = 0.217382 fft 8: mflops = 24.1979 (norm. = 0.246154), norm. avg. (of 12) = 0.168719 fft 9: mflops = 69.9051 (norm. = 0.711111), norm. avg. (of 12) = 0.4531 fft 10: mflops = 84.4491 (norm. = 0.85906), norm. avg. 
(of 12) = 0.5332 fft 11: mflops = 23.8313 (norm. = 0.242424), norm. avg. (of 11) = 0.159965 fft 12: mflops = 31.4573 (norm. = 0.32), norm. avg. (of 12) = 0.556973 fft 13: mflops = 27.3542 (norm. = 0.278261), norm. avg. (of 12) = 0.495706 fft 14: mflops = 58.7987 (norm. = 0.598131), norm. avg. (of 12) = 0.833604 fft 15: mflops = 58.7987 (norm. = 0.598131), norm. avg. (of 12) = 0.806074 fft 16: mflops = 37.0086 (norm. = 0.376471), norm. avg. (of 12) = 0.766557 fft 17: mflops = 98.304 (norm. = 1), norm. avg. (of 10) = 0.767849 fft 18: mflops = 29.1271 (norm. = 0.296296), norm. avg. (of 12) = 0.31316 fft 19: mflops = 30.541 (norm. = 0.31068), norm. avg. (of 12) = 0.205102 fft 20: mflops = 40.59 (norm. = 0.412903), norm. avg. (of 12) = 0.248659 fft 21: mflops = 28.8599 (norm. = 0.293578), norm. avg. (of 12) = 0.413784 fft 22: mflops = 39.0774 (norm. = 0.397516), norm. avg. (of 11) = 0.260684 fft 23: mflops = 52.4288 (norm. = 0.533333), norm. avg. (of 11) = 0.337205 fft 24: mflops = 46.2607 (norm. = 0.470588), norm. avg. (of 11) = 0.309595 fft 25: mflops = 49.152 (norm. = 0.5), norm. avg. (of 11) = 0.259151 fft 26: mflops = 13.1072 (norm. = 0.133333), norm. avg. (of 12) = 0.116536 fft 27: mflops = 24.9661 (norm. = 0.253968), norm. avg. (of 12) = 0.170006 fft 28: mflops = 53.3174 (norm. = 0.542373), norm. avg. (of 12) = 0.310391 fft 29: mflops = 40.8536 (norm. = 0.415584), norm. avg. (of 12) = 0.238268 fft 30: mflops = 77.1958 (norm. = 0.785276), norm. avg. (of 12) = 0.573848 fft 31: mflops = 82.7823 (norm. = 0.842105), norm. avg. (of 12) = 0.590028 fft 32: mflops = 24.1979 (norm. = 0.246154), norm. avg. (of 9) = 0.279504 fft 33: mflops = 29.1271 (norm. = 0.296296), norm. avg. (of 11) = 0.134463 fft 34: mflops = 10.6275 (norm. = 0.108108), norm. avg. (of 11) = 0.380052 fft 35: mflops = 66.5763 (norm. = 0.677249), norm. avg. (of 12) = 0.368925 fft 36: mflops = 63.5501 (norm. = 0.646465), norm. avg. (of 12) = 0.361479 fft 37: mflops = 61.0821 (norm. = 0.621359), norm. avg. 
(of 12) = 0.462543 fft 38: mflops = 23.3017 (norm. = 0.237037), norm. avg. (of 12) = 0.163663 fft 39: mflops = 26.8866 (norm. = 0.273504), norm. avg. (of 12) = 0.203523 fft 40: mflops = 21.5461 (norm. = 0.219178), norm. avg. (of 12) = 0.160766 fft 41: mflops = 4.25098 (norm. = 0.0432432), norm. avg. (of 12) = 0.0498916 fft 42: mflops = 57.7198 (norm. = 0.587156), norm. avg. (of 10) = 0.685006 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.45 s, 32 iters, t-(init.)=1.42 s t(norm)=0.416682, mflops=11.9995 (err=3.0e-15) 1. Arndt DIT: elapsed time t=1.48 s, 32 iters, t-(init.)=1.45 s t(norm)=0.425485, mflops=11.7513 (err=3.0e-15) 2. Arndt Split-Radix: elapsed time t=1.85 s, 32 iters, t-(init.)=1.81 s t(norm)=0.531123, mflops=9.41401 (err=3.0e-15) 3. Arndt 4-step: elapsed time t=1.43 s, 32 iters, t-(init.)=1.4 s t(norm)=0.410814, mflops=12.171 (err=3.0e-15) 4. Bailey: elapsed time t=1.09 s, 32 iters, t-(init.)=1.06 s t(norm)=0.311045, mflops=16.0749 (err=3.0e-15) 5. Beauregard: elapsed time t=1.41 s, 16 iters, t-(init.)=1.4 s t(norm)=0.821627, mflops=6.08549 (err=3.0e-15) 6. Bergland: elapsed time t=1.6 s, 64 iters, t-(init.)=1.53 s t(norm)=0.22448, mflops=22.2737 (err=2.9e-15) 7. Brenner: elapsed time t=1.11 s, 32 iters, t-(init.)=1.08 s t(norm)=0.316913, mflops=15.7772 (err=3.0e-15) 8. Burrus: elapsed time t=1.06 s, 16 iters, t-(init.)=1.05 s t(norm)=0.61622, mflops=8.11398 (err=3.0e-15) 9. CWP (min N) (N=8580): elapsed time t=1.42 s, 128 iters, t-(init.)=1.28 s t(norm)=0.0939002, mflops=53.248 10. CWP (best N) (N=9240): elapsed time t=1.3 s, 128 iters, t-(init.)=1.15 s t(norm)=0.0843635, mflops=59.2673 11. Edelblute: elapsed time t=1.03 s, 16 iters, t-(init.)=1.02 s t(norm)=0.598614, mflops=8.35263 (err=3.0e-15) 12. FFTPACK: elapsed time t=1.31 s, 64 iters, t-(init.)=1.24 s t(norm)=0.181932, mflops=27.4828 (err=2.9e-15) 13. 
FFTPACK (f2c): elapsed time t=1.47 s, 64 iters, t-(init.)=1.4 s t(norm)=0.205407, mflops=24.3419 (err=2.9e-15) FFTW_MEASURE plan: (cost = 1.093750e-02) FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_NOTW 32 14. FFTW: elapsed time t=1.41 s, 128 iters, t-(init.)=1.28 s t(norm)=0.0939002, mflops=53.248 (err=2.9e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.99 s, 128 iters, t-(init.)=1.86 s t(norm)=0.136449, mflops=36.6438 (err=2.9e-15) 16. Frigo-old: elapsed time t=1.42 s, 64 iters, t-(init.)=1.35 s t(norm)=0.198071, mflops=25.2435 (err=2.9e-15) 17. Green: elapsed time t=1.43 s, 64 iters, t-(init.)=1.37 s t(norm)=0.201005, mflops=24.875 (err=3.0e-15) 18. GSL: elapsed time t=1.37 s, 64 iters, t-(init.)=1.31 s t(norm)=0.192202, mflops=26.0143 (err=2.9e-15) 19. GSL DIT: elapsed time t=1.69 s, 32 iters, t-(init.)=1.66 s t(norm)=0.487107, mflops=10.2647 (err=3.7e-15) 20. GSL DIF: elapsed time t=1.67 s, 32 iters, t-(init.)=1.63 s t(norm)=0.478304, mflops=10.4536 (err=3.8e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.89 s, 128 iters, t-(init.)=1.76 s t(norm)=0.129113, mflops=38.7258 (err=3.0e-15) 23. Mayer (simple): elapsed time t=1.42 s, 128 iters, t-(init.)=1.29 s t(norm)=0.0946338, mflops=52.8352 24. Mayer (lookup): elapsed time t=1.85 s, 128 iters, t-(init.)=1.72 s t(norm)=0.126178, mflops=39.6264 (err=3.0e-15) 25. Monro: elapsed time t=1.6 s, 32 iters, t-(init.)=1.56 s t(norm)=0.457764, mflops=10.9227 (err=3.1e-15) 26. NAPACK (f2c): elapsed time t=1.32 s, 32 iters, t-(init.)=1.29 s t(norm)=0.378535, mflops=13.2088 (err=4.5e-14) 27. Nielsen: elapsed time t=1.52 s, 32 iters, t-(init.)=1.48 s t(norm)=0.434289, mflops=11.5131 (err=1.1e-14) 28. NR (C): elapsed time t=1.56 s, 32 iters, t-(init.)=1.52 s t(norm)=0.446026, mflops=11.2101 (err=3.1e-15) 29. NR (F): elapsed time t=1.62 s, 32 iters, t-(init.)=1.58 s t(norm)=0.463632, mflops=10.7844 (err=3.1e-15) 30. 
Ooura (C): elapsed time t=1.73 s, 128 iters, t-(init.)=1.6 s t(norm)=0.117375, mflops=42.5984 (err=2.9e-15) 31. Ooura (F): elapsed time t=1.71 s, 128 iters, t-(init.)=1.58 s t(norm)=0.115908, mflops=43.1376 (err=2.9e-15) 32. QFT: elapsed time t=1.78 s, 64 iters, t-(init.)=1.71 s t(norm)=0.25089, mflops=19.9291 (err=4.1e-15) 33. Ransom: elapsed time t=1.65 s, 64 iters, t-(init.)=1.58 s t(norm)=0.231816, mflops=21.5688 (err=4.2e-15) 34. SCIPORT: elapsed time t=1.57 s, 32 iters, t-(init.)=1.54 s t(norm)=0.451895, mflops=11.0645 (err=2.0e-07) 35. Singleton: elapsed time t=1.84 s, 64 iters, t-(init.)=1.77 s t(norm)=0.259693, mflops=19.2535 (err=4.4e-15) 36. Singleton (f2c): elapsed time t=1.87 s, 64 iters, t-(init.)=1.81 s t(norm)=0.265562, mflops=18.828 (err=4.4e-15) 37. Sorensen: elapsed time t=1.09 s, 32 iters, t-(init.)=1.06 s t(norm)=0.311045, mflops=16.0749 (err=3.0e-15) 38. Sorensen DIT: elapsed time t=1.05 s, 16 iters, t-(init.)=1.03 s t(norm)=0.604483, mflops=8.27153 (err=3.0e-15) 39. Temperton: elapsed time t=1.2 s, 32 iters, t-(init.)=1.16 s t(norm)=0.340388, mflops=14.6891 (err=2.9e-15) 40. Temperton (f2c): elapsed time t=1.34 s, 32 iters, t-(init.)=1.31 s t(norm)=0.384404, mflops=13.0071 (err=2.9e-15) 41. Valkenburg: elapsed time t=1.08 s, 8 iters, t-(init.)=1.08 s t(norm)=1.26765, mflops=3.9443 (err=3.0e-15) 42. ESSL: elapsed time t=1.29 s, 128 iters, t-(init.)=1.16 s t(norm)=0.0850971, mflops=58.7564 (err=2.9e-15) Top mflops for N=8192 = 59.2673 Normalized results and averages for N=8192: fft 0: mflops = 11.9995 (norm. = 0.202465), norm. avg. (of 13) = 0.33767 fft 1: mflops = 11.7513 (norm. = 0.198276), norm. avg. (of 13) = 0.317387 fft 2: mflops = 9.41401 (norm. = 0.15884), norm. avg. (of 13) = 0.236484 fft 3: mflops = 12.171 (norm. = 0.205357), norm. avg. (of 13) = 0.0898993 fft 4: mflops = 16.0749 (norm. = 0.271226), norm. avg. (of 13) = 0.359171 fft 5: mflops = 6.08549 (norm. = 0.102679), norm. avg. (of 13) = 0.0605302 fft 6: mflops = 22.2737 (norm. 
= 0.375817), norm. avg. (of 13) = 0.364966 fft 7: mflops = 15.7772 (norm. = 0.266204), norm. avg. (of 13) = 0.221137 fft 8: mflops = 8.11398 (norm. = 0.136905), norm. avg. (of 13) = 0.166271 fft 9: mflops = 53.248 (norm. = 0.898438), norm. avg. (of 13) = 0.487357 fft 10: mflops = 59.2673 (norm. = 1), norm. avg. (of 13) = 0.569107 fft 11: mflops = 8.35263 (norm. = 0.140931), norm. avg. (of 12) = 0.158379 fft 12: mflops = 27.4828 (norm. = 0.46371), norm. avg. (of 13) = 0.549799 fft 13: mflops = 24.3419 (norm. = 0.410714), norm. avg. (of 13) = 0.489168 fft 14: mflops = 53.248 (norm. = 0.898438), norm. avg. (of 13) = 0.838591 fft 15: mflops = 36.6438 (norm. = 0.61828), norm. avg. (of 13) = 0.791628 fft 16: mflops = 25.2435 (norm. = 0.425926), norm. avg. (of 13) = 0.740355 fft 17: mflops = 24.875 (norm. = 0.419708), norm. avg. (of 11) = 0.7362 fft 18: mflops = 26.0143 (norm. = 0.438931), norm. avg. (of 13) = 0.322835 fft 19: mflops = 10.2647 (norm. = 0.173193), norm. avg. (of 13) = 0.202647 fft 20: mflops = 10.4536 (norm. = 0.17638), norm. avg. (of 13) = 0.243099 fft 21: mflops = -1 (norm. = -0.0168727), norm. avg. (of 12) = 0.413784 fft 22: mflops = 38.7258 (norm. = 0.653409), norm. avg. (of 12) = 0.293411 fft 23: mflops = 52.8352 (norm. = 0.891473), norm. avg. (of 12) = 0.383394 fft 24: mflops = 39.6264 (norm. = 0.668605), norm. avg. (of 12) = 0.339513 fft 25: mflops = 10.9227 (norm. = 0.184295), norm. avg. (of 12) = 0.252913 fft 26: mflops = 13.2088 (norm. = 0.222868), norm. avg. (of 13) = 0.124715 fft 27: mflops = 11.5131 (norm. = 0.194257), norm. avg. (of 13) = 0.171871 fft 28: mflops = 11.2101 (norm. = 0.189145), norm. avg. (of 13) = 0.301065 fft 29: mflops = 10.7844 (norm. = 0.181962), norm. avg. (of 13) = 0.233937 fft 30: mflops = 42.5984 (norm. = 0.71875), norm. avg. (of 13) = 0.584994 fft 31: mflops = 43.1376 (norm. = 0.727848), norm. avg. (of 13) = 0.60063 fft 32: mflops = 19.9291 (norm. = 0.336257), norm. avg. 
(of 10) = 0.285179 fft 33: mflops = 21.5688 (norm. = 0.363924), norm. avg. (of 12) = 0.153585 fft 34: mflops = 11.0645 (norm. = 0.186688), norm. avg. (of 12) = 0.363938 fft 35: mflops = 19.2535 (norm. = 0.324859), norm. avg. (of 13) = 0.365535 fft 36: mflops = 18.828 (norm. = 0.31768), norm. avg. (of 13) = 0.358109 fft 37: mflops = 16.0749 (norm. = 0.271226), norm. avg. (of 13) = 0.447827 fft 38: mflops = 8.27153 (norm. = 0.139563), norm. avg. (of 13) = 0.161809 fft 39: mflops = 14.6891 (norm. = 0.247845), norm. avg. (of 13) = 0.206933 fft 40: mflops = 13.0071 (norm. = 0.219466), norm. avg. (of 13) = 0.165282 fft 41: mflops = 3.9443 (norm. = 0.0665509), norm. avg. (of 13) = 0.0511731 fft 42: mflops = 58.7564 (norm. = 0.991379), norm. avg. (of 11) = 0.712858 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.6 s, 16 iters, t-(init.)=1.57 s t(norm)=0.427791, mflops=11.6879 (err=5.7e-15) 1. Arndt DIT: elapsed time t=1.61 s, 16 iters, t-(init.)=1.58 s t(norm)=0.430516, mflops=11.614 (err=5.7e-15) 2. Arndt Split-Radix: elapsed time t=1.07 s, 8 iters, t-(init.)=1.06 s t(norm)=0.577654, mflops=8.6557 (err=5.7e-15) 3. Arndt 4-step: elapsed time t=1.46 s, 16 iters, t-(init.)=1.42 s t(norm)=0.386919, mflops=12.9226 (err=5.7e-15) 4. Bailey: elapsed time t=1.2 s, 16 iters, t-(init.)=1.17 s t(norm)=0.3188, mflops=15.6838 (err=5.7e-15) 5. Beauregard: elapsed time t=1.52 s, 8 iters, t-(init.)=1.5 s t(norm)=0.817435, mflops=6.11669 (err=5.7e-15) 6. Bergland: elapsed time t=1.76 s, 32 iters, t-(init.)=1.69 s t(norm)=0.230244, mflops=21.7161 (err=5.7e-15) 7. Brenner: elapsed time t=1.19 s, 16 iters, t-(init.)=1.16 s t(norm)=0.316075, mflops=15.819 (err=5.7e-15) 8. Burrus: elapsed time t=1.21 s, 8 iters, t-(init.)=1.2 s t(norm)=0.653948, mflops=7.64587 (err=5.7e-15) 9. CWP (min N) (N=17160): elapsed time t=1.5 s, 64 iters, t-(init.)=1.36 s t(norm)=0.0926426, mflops=53.9708 10. 
CWP (best N) (N=17160): elapsed time t=1.49 s, 64 iters, t-(init.)=1.36 s t(norm)=0.0926426, mflops=53.9708 11. Edelblute: elapsed time t=1.17 s, 8 iters, t-(init.)=1.15 s t(norm)=0.6267, mflops=7.9783 (err=5.7e-15) 12. FFTPACK: elapsed time t=1.57 s, 32 iters, t-(init.)=1.5 s t(norm)=0.204359, mflops=24.4668 (err=5.7e-15) 13. FFTPACK (f2c): elapsed time t=1.8 s, 32 iters, t-(init.)=1.74 s t(norm)=0.237056, mflops=21.092 (err=5.7e-15) FFTW_MEASURE plan: (cost = 2.250000e-02) FFTW_TWIDDLE 2 FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_NOTW 64 14. FFTW: elapsed time t=1.48 s, 64 iters, t-(init.)=1.34 s t(norm)=0.0912803, mflops=54.7764 (err=5.7e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.07 s, 32 iters, t-(init.)=1.01 s t(norm)=0.137602, mflops=36.3368 (err=5.7e-15) 16. Frigo-old: elapsed time t=1.77 s, 32 iters, t-(init.)=1.7 s t(norm)=0.231607, mflops=21.5883 (err=5.7e-15) 17. Green: elapsed time t=1.63 s, 32 iters, t-(init.)=1.56 s t(norm)=0.212533, mflops=23.5257 (err=5.8e-15) 18. GSL: elapsed time t=1.38 s, 32 iters, t-(init.)=1.31 s t(norm)=0.178473, mflops=28.0154 (err=5.7e-15) 19. GSL DIT: elapsed time t=1.87 s, 16 iters, t-(init.)=1.84 s t(norm)=0.50136, mflops=9.97287 (err=6.3e-15) 20. GSL DIF: elapsed time t=1.84 s, 16 iters, t-(init.)=1.81 s t(norm)=0.493186, mflops=10.1382 (err=6.4e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.28 s, 16 iters, t-(init.)=1.24 s t(norm)=0.337873, mflops=14.7985 (err=5.7e-15) 23. Mayer (simple): elapsed time t=1.2 s, 16 iters, t-(init.)=1.17 s t(norm)=0.3188, mflops=15.6838 24. Mayer (lookup): elapsed time t=1.25 s, 16 iters, t-(init.)=1.22 s t(norm)=0.332424, mflops=15.041 (err=5.7e-15) 25. Monro: elapsed time t=1.69 s, 16 iters, t-(init.)=1.65 s t(norm)=0.449589, mflops=11.1213 (err=6.0e-15) 26. 
NAPACK (f2c): elapsed time t=1.37 s, 16 iters, t-(init.)=1.34 s t(norm)=0.365121, mflops=13.6941 (err=2.3e-13) 27. Nielsen: elapsed time t=1.57 s, 16 iters, t-(init.)=1.54 s t(norm)=0.419617, mflops=11.9156 (err=1.3e-13) 28. NR (C): elapsed time t=1.73 s, 16 iters, t-(init.)=1.7 s t(norm)=0.463213, mflops=10.7942 (err=5.6e-15) 29. NR (F): elapsed time t=1.81 s, 16 iters, t-(init.)=1.78 s t(norm)=0.485012, mflops=10.309 (err=5.6e-15) 30. Ooura (C): elapsed time t=1.83 s, 64 iters, t-(init.)=1.7 s t(norm)=0.115803, mflops=43.1767 (err=5.6e-15) 31. Ooura (F): elapsed time t=1.77 s, 64 iters, t-(init.)=1.65 s t(norm)=0.112397, mflops=44.485 (err=5.6e-15) 32. QFT: elapsed time t=1.22 s, 16 iters, t-(init.)=1.19 s t(norm)=0.324249, mflops=15.4202 (err=7.0e-15) 33. Ransom: elapsed time t=1.42 s, 32 iters, t-(init.)=1.36 s t(norm)=0.185285, mflops=26.9854 (err=6.1e-15) 34. SCIPORT: elapsed time t=1.99 s, 16 iters, t-(init.)=1.95 s t(norm)=0.531333, mflops=9.4103 (err=2.1e-07) 35. Singleton: elapsed time t=1.9 s, 32 iters, t-(init.)=1.84 s t(norm)=0.25068, mflops=19.9457 (err=1.4e-14) 36. Singleton (f2c): elapsed time t=1.92 s, 32 iters, t-(init.)=1.86 s t(norm)=0.253405, mflops=19.7313 (err=8.4e-15) 37. Sorensen: elapsed time t=1.34 s, 16 iters, t-(init.)=1.31 s t(norm)=0.356947, mflops=14.0077 (err=5.7e-15) 38. Sorensen DIT: elapsed time t=1.2 s, 8 iters, t-(init.)=1.18 s t(norm)=0.643049, mflops=7.77546 (err=5.7e-15) 39. Temperton: elapsed time t=1.25 s, 16 iters, t-(init.)=1.22 s t(norm)=0.332424, mflops=15.041 (err=5.7e-15) 40. Temperton (f2c): elapsed time t=1.35 s, 16 iters, t-(init.)=1.32 s t(norm)=0.359671, mflops=13.9016 (err=5.7e-15) 41. Valkenburg: elapsed time t=1.17 s, 4 iters, t-(init.)=1.17 s t(norm)=1.2752, mflops=3.92096 (err=5.8e-15) 42. ESSL: elapsed time t=1.36 s, 64 iters, t-(init.)=1.23 s t(norm)=0.0837871, mflops=59.6751 (err=5.6e-15) Top mflops for N=16384 = 59.6751 Normalized results and averages for N=16384: fft 0: mflops = 11.6879 (norm. 
= 0.19586), norm. avg. (of 14) = 0.327541 fft 1: mflops = 11.614 (norm. = 0.19462), norm. avg. (of 14) = 0.308618 fft 2: mflops = 8.6557 (norm. = 0.145047), norm. avg. (of 14) = 0.229953 fft 3: mflops = 12.9226 (norm. = 0.216549), norm. avg. (of 14) = 0.0989458 fft 4: mflops = 15.6838 (norm. = 0.262821), norm. avg. (of 14) = 0.352289 fft 5: mflops = 6.11669 (norm. = 0.1025), norm. avg. (of 14) = 0.063528 fft 6: mflops = 21.7161 (norm. = 0.363905), norm. avg. (of 14) = 0.36489 fft 7: mflops = 15.819 (norm. = 0.265086), norm. avg. (of 14) = 0.224276 fft 8: mflops = 7.64587 (norm. = 0.128125), norm. avg. (of 14) = 0.163547 fft 9: mflops = 53.9708 (norm. = 0.904412), norm. avg. (of 14) = 0.517146 fft 10: mflops = 53.9708 (norm. = 0.904412), norm. avg. (of 14) = 0.593058 fft 11: mflops = 7.9783 (norm. = 0.133696), norm. avg. (of 13) = 0.156481 fft 12: mflops = 24.4668 (norm. = 0.41), norm. avg. (of 14) = 0.539814 fft 13: mflops = 21.092 (norm. = 0.353448), norm. avg. (of 14) = 0.479474 fft 14: mflops = 54.7764 (norm. = 0.91791), norm. avg. (of 14) = 0.844257 fft 15: mflops = 36.3368 (norm. = 0.608911), norm. avg. (of 14) = 0.778577 fft 16: mflops = 21.5883 (norm. = 0.361765), norm. avg. (of 14) = 0.713313 fft 17: mflops = 23.5257 (norm. = 0.394231), norm. avg. (of 12) = 0.707702 fft 18: mflops = 28.0154 (norm. = 0.469466), norm. avg. (of 14) = 0.333308 fft 19: mflops = 9.97287 (norm. = 0.16712), norm. avg. (of 14) = 0.20011 fft 20: mflops = 10.1382 (norm. = 0.16989), norm. avg. (of 14) = 0.23787 fft 21: mflops = -1 (norm. = -0.0167574), norm. avg. (of 12) = 0.413784 fft 22: mflops = 14.7985 (norm. = 0.247984), norm. avg. (of 13) = 0.289917 fft 23: mflops = 15.6838 (norm. = 0.262821), norm. avg. (of 13) = 0.374119 fft 24: mflops = 15.041 (norm. = 0.252049), norm. avg. (of 13) = 0.332785 fft 25: mflops = 11.1213 (norm. = 0.186364), norm. avg. (of 13) = 0.247794 fft 26: mflops = 13.6941 (norm. = 0.229478), norm. avg. (of 14) = 0.132198 fft 27: mflops = 11.9156 (norm. 
= 0.199675), norm. avg. (of 14) = 0.173857 fft 28: mflops = 10.7942 (norm. = 0.180882), norm. avg. (of 14) = 0.29248 fft 29: mflops = 10.309 (norm. = 0.172753), norm. avg. (of 14) = 0.229566 fft 30: mflops = 43.1767 (norm. = 0.723529), norm. avg. (of 14) = 0.59489 fft 31: mflops = 44.485 (norm. = 0.745455), norm. avg. (of 14) = 0.610974 fft 32: mflops = 15.4202 (norm. = 0.258403), norm. avg. (of 11) = 0.282745 fft 33: mflops = 26.9854 (norm. = 0.452206), norm. avg. (of 13) = 0.176556 fft 34: mflops = 9.4103 (norm. = 0.157692), norm. avg. (of 13) = 0.348073 fft 35: mflops = 19.9457 (norm. = 0.334239), norm. avg. (of 14) = 0.3633 fft 36: mflops = 19.7313 (norm. = 0.330645), norm. avg. (of 14) = 0.356148 fft 37: mflops = 14.0077 (norm. = 0.234733), norm. avg. (of 14) = 0.432606 fft 38: mflops = 7.77546 (norm. = 0.130297), norm. avg. (of 14) = 0.159558 fft 39: mflops = 15.041 (norm. = 0.252049), norm. avg. (of 14) = 0.210155 fft 40: mflops = 13.9016 (norm. = 0.232955), norm. avg. (of 14) = 0.170115 fft 41: mflops = 3.92096 (norm. = 0.0657051), norm. avg. (of 14) = 0.0522111 fft 42: mflops = 59.6751 (norm. = 1), norm. avg. (of 12) = 0.736786 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.81 s, 8 iters, t-(init.)=1.78 s t(norm)=0.452677, mflops=11.0454 (err=5.1e-15) 1. Arndt DIT: elapsed time t=1.86 s, 8 iters, t-(init.)=1.83 s t(norm)=0.465393, mflops=10.7436 (err=5.1e-15) 2. Arndt Split-Radix: elapsed time t=1.16 s, 4 iters, t-(init.)=1.14 s t(norm)=0.579834, mflops=8.62316 (err=5.1e-15) 3. Arndt 4-step: elapsed time t=1.82 s, 8 iters, t-(init.)=1.79 s t(norm)=0.455221, mflops=10.9837 (err=5.2e-15) 4. Bailey: elapsed time t=1.18 s, 8 iters, t-(init.)=1.14 s t(norm)=0.289917, mflops=17.2463 (err=5.1e-15) 5. Beauregard: elapsed time t=1.63 s, 4 iters, t-(init.)=1.61 s t(norm)=0.818888, mflops=6.10584 (err=5.1e-15) 6. Bergland: elapsed time t=1.83 s, 16 iters, t-(init.)=1.76 s t(norm)=0.223796, mflops=22.3418 (err=5.0e-15) 7. 
Brenner: elapsed time t=1.27 s, 8 iters, t-(init.)=1.23 s t(norm)=0.312805, mflops=15.9844 (err=5.2e-15) 8. Burrus: elapsed time t=1.31 s, 4 iters, t-(init.)=1.3 s t(norm)=0.661214, mflops=7.56185 (err=5.1e-15) 9. CWP (min N) (N=34320): elapsed time t=1.65 s, 32 iters, t-(init.)=1.52 s t(norm)=0.096639, mflops=51.7389 10. CWP (best N) (N=34320): elapsed time t=1.64 s, 32 iters, t-(init.)=1.5 s t(norm)=0.0953674, mflops=52.4288 11. Edelblute: elapsed time t=1.26 s, 4 iters, t-(init.)=1.24 s t(norm)=0.630697, mflops=7.92774 (err=5.1e-15) 12. FFTPACK: elapsed time t=1.47 s, 16 iters, t-(init.)=1.4 s t(norm)=0.178019, mflops=28.0869 (err=5.1e-15) 13. FFTPACK (f2c): elapsed time t=1.71 s, 16 iters, t-(init.)=1.65 s t(norm)=0.209808, mflops=23.8313 (err=5.1e-15) FFTW_MEASURE plan: (cost = 4.875000e-02) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_NOTW 64 14. FFTW: elapsed time t=1.57 s, 32 iters, t-(init.)=1.44 s t(norm)=0.0915527, mflops=54.6133 (err=5.1e-15) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.05 s, 16 iters, t-(init.)=0.98 s t(norm)=0.124613, mflops=40.1241 (err=5.1e-15) 16. Frigo-old: elapsed time t=1.87 s, 16 iters, t-(init.)=1.8 s t(norm)=0.228882, mflops=21.8453 (err=5.1e-15) 17. Green: elapsed time t=1.72 s, 16 iters, t-(init.)=1.65 s t(norm)=0.209808, mflops=23.8313 (err=5.1e-15) 18. GSL: elapsed time t=1.41 s, 16 iters, t-(init.)=1.34 s t(norm)=0.17039, mflops=29.3445 (err=5.1e-15) 19. GSL DIT: elapsed time t=1.02 s, 4 iters, t-(init.)=1 s t(norm)=0.508626, mflops=9.8304 (err=5.8e-15) 20. GSL DIF: elapsed time t=1 s, 4 iters, t-(init.)=0.98 s t(norm)=0.498454, mflops=10.031 (err=5.9e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.4 s, 8 iters, t-(init.)=1.36 s t(norm)=0.345866, mflops=14.4565 (err=5.2e-15) 23. Mayer (simple): elapsed time t=1.31 s, 8 iters, t-(init.)=1.28 s t(norm)=0.325521, mflops=15.36 24. 
Mayer (lookup): elapsed time t=1.37 s, 8 iters, t-(init.)=1.34 s t(norm)=0.34078, mflops=14.6722 (err=5.2e-15) 25. Monro: elapsed time t=1.98 s, 8 iters, t-(init.)=1.95 s t(norm)=0.495911, mflops=10.0825 (err=6.1e-15) 26. NAPACK (f2c): elapsed time t=1.5 s, 8 iters, t-(init.)=1.47 s t(norm)=0.37384, mflops=13.3747 (err=5.7e-13) 27. Nielsen: elapsed time t=1.7 s, 8 iters, t-(init.)=1.67 s t(norm)=0.424703, mflops=11.7729 (err=2.3e-13) 28. NR (C): elapsed time t=1.91 s, 8 iters, t-(init.)=1.88 s t(norm)=0.478109, mflops=10.4579 (err=5.3e-15) 29. NR (F): elapsed time t=1.98 s, 8 iters, t-(init.)=1.95 s t(norm)=0.495911, mflops=10.0825 (err=5.3e-15) 30. Ooura (C): elapsed time t=1.96 s, 32 iters, t-(init.)=1.83 s t(norm)=0.116348, mflops=42.9744 (err=5.0e-15) 31. Ooura (F): elapsed time t=1.93 s, 32 iters, t-(init.)=1.8 s t(norm)=0.114441, mflops=43.6907 (err=5.0e-15) 32. QFT: elapsed time t=1.57 s, 8 iters, t-(init.)=1.54 s t(norm)=0.391642, mflops=12.7668 (err=7.5e-15) 33. Ransom: elapsed time t=1.67 s, 16 iters, t-(init.)=1.6 s t(norm)=0.203451, mflops=24.576 (err=6.5e-15) 34. SCIPORT: elapsed time t=1.02 s, 4 iters, t-(init.)=1 s t(norm)=0.508626, mflops=9.8304 (err=2.3e-07) 35. Singleton: elapsed time t=1.23 s, 8 iters, t-(init.)=1.19 s t(norm)=0.302633, mflops=16.5217 (err=7.8e-15) 36. Singleton (f2c): elapsed time t=1.24 s, 8 iters, t-(init.)=1.21 s t(norm)=0.307719, mflops=16.2486 (err=7.2e-15) 37. Sorensen: elapsed time t=1.69 s, 8 iters, t-(init.)=1.66 s t(norm)=0.42216, mflops=11.8439 (err=5.2e-15) 38. Sorensen DIT: elapsed time t=1.29 s, 4 iters, t-(init.)=1.27 s t(norm)=0.645955, mflops=7.74047 (err=5.1e-15) 39. Temperton: elapsed time t=1.42 s, 8 iters, t-(init.)=1.39 s t(norm)=0.353495, mflops=14.1445 (err=5.1e-15) 40. Temperton (f2c): elapsed time t=1.59 s, 8 iters, t-(init.)=1.56 s t(norm)=0.396729, mflops=12.6031 (err=5.1e-15) 41. Valkenburg: elapsed time t=1.26 s, 2 iters, t-(init.)=1.26 s t(norm)=1.28174, mflops=3.90095 (err=5.0e-15) 42. 
ESSL: elapsed time t=1.66 s, 32 iters, t-(init.)=1.53 s t(norm)=0.0972748, mflops=51.4008 (err=5.0e-15) Top mflops for N=32768 = 54.6133 Normalized results and averages for N=32768: fft 0: mflops = 11.0454 (norm. = 0.202247), norm. avg. (of 15) = 0.319188 fft 1: mflops = 10.7436 (norm. = 0.196721), norm. avg. (of 15) = 0.301158 fft 2: mflops = 8.62316 (norm. = 0.157895), norm. avg. (of 15) = 0.225149 fft 3: mflops = 10.9837 (norm. = 0.201117), norm. avg. (of 15) = 0.105757 fft 4: mflops = 17.2463 (norm. = 0.315789), norm. avg. (of 15) = 0.349855 fft 5: mflops = 6.10584 (norm. = 0.111801), norm. avg. (of 15) = 0.0667462 fft 6: mflops = 22.3418 (norm. = 0.409091), norm. avg. (of 15) = 0.367837 fft 7: mflops = 15.9844 (norm. = 0.292683), norm. avg. (of 15) = 0.228837 fft 8: mflops = 7.56185 (norm. = 0.138462), norm. avg. (of 15) = 0.161874 fft 9: mflops = 51.7389 (norm. = 0.947368), norm. avg. (of 15) = 0.545828 fft 10: mflops = 52.4288 (norm. = 0.96), norm. avg. (of 15) = 0.61752 fft 11: mflops = 7.92774 (norm. = 0.145161), norm. avg. (of 14) = 0.155672 fft 12: mflops = 28.0869 (norm. = 0.514286), norm. avg. (of 15) = 0.538112 fft 13: mflops = 23.8313 (norm. = 0.436364), norm. avg. (of 15) = 0.4766 fft 14: mflops = 54.6133 (norm. = 1), norm. avg. (of 15) = 0.85464 fft 15: mflops = 40.1241 (norm. = 0.734694), norm. avg. (of 15) = 0.775651 fft 16: mflops = 21.8453 (norm. = 0.4), norm. avg. (of 15) = 0.692425 fft 17: mflops = 23.8313 (norm. = 0.436364), norm. avg. (of 13) = 0.68683 fft 18: mflops = 29.3445 (norm. = 0.537313), norm. avg. (of 15) = 0.346909 fft 19: mflops = 9.8304 (norm. = 0.18), norm. avg. (of 15) = 0.198769 fft 20: mflops = 10.031 (norm. = 0.183673), norm. avg. (of 15) = 0.234257 fft 21: mflops = -1 (norm. = -0.0183105), norm. avg. (of 12) = 0.413784 fft 22: mflops = 14.4565 (norm. = 0.264706), norm. avg. (of 14) = 0.288116 fft 23: mflops = 15.36 (norm. = 0.28125), norm. avg. (of 14) = 0.367485 fft 24: mflops = 14.6722 (norm. = 0.268657), norm. avg. 
(of 14) = 0.328204 fft 25: mflops = 10.0825 (norm. = 0.184615), norm. avg. (of 14) = 0.243281 fft 26: mflops = 13.3747 (norm. = 0.244898), norm. avg. (of 15) = 0.139712 fft 27: mflops = 11.7729 (norm. = 0.215569), norm. avg. (of 15) = 0.176638 fft 28: mflops = 10.4579 (norm. = 0.191489), norm. avg. (of 15) = 0.285748 fft 29: mflops = 10.0825 (norm. = 0.184615), norm. avg. (of 15) = 0.22657 fft 30: mflops = 42.9744 (norm. = 0.786885), norm. avg. (of 15) = 0.60769 fft 31: mflops = 43.6907 (norm. = 0.8), norm. avg. (of 15) = 0.623576 fft 32: mflops = 12.7668 (norm. = 0.233766), norm. avg. (of 12) = 0.278664 fft 33: mflops = 24.576 (norm. = 0.45), norm. avg. (of 14) = 0.196087 fft 34: mflops = 9.8304 (norm. = 0.18), norm. avg. (of 14) = 0.336068 fft 35: mflops = 16.5217 (norm. = 0.302521), norm. avg. (of 15) = 0.359248 fft 36: mflops = 16.2486 (norm. = 0.297521), norm. avg. (of 15) = 0.352239 fft 37: mflops = 11.8439 (norm. = 0.216867), norm. avg. (of 15) = 0.418223 fft 38: mflops = 7.74047 (norm. = 0.141732), norm. avg. (of 15) = 0.15837 fft 39: mflops = 14.1445 (norm. = 0.258993), norm. avg. (of 15) = 0.213411 fft 40: mflops = 12.6031 (norm. = 0.230769), norm. avg. (of 15) = 0.174159 fft 41: mflops = 3.90095 (norm. = 0.0714286), norm. avg. (of 15) = 0.0534922 fft 42: mflops = 51.4008 (norm. = 0.941176), norm. avg. (of 13) = 0.752509 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.93 s, 4 iters, t-(init.)=1.89 s t(norm)=0.450611, mflops=11.096 (err=1.6e-14) 1. Arndt DIT: elapsed time t=1.96 s, 4 iters, t-(init.)=1.93 s t(norm)=0.460148, mflops=10.8661 (err=1.6e-14) 2. Arndt Split-Radix: elapsed time t=1.27 s, 2 iters, t-(init.)=1.26 s t(norm)=0.600815, mflops=8.32203 (err=1.6e-14) 3. Arndt 4-step: elapsed time t=1.54 s, 4 iters, t-(init.)=1.5 s t(norm)=0.357628, mflops=13.981 (err=1.6e-14) 4. Bailey: elapsed time t=1.3 s, 4 iters, t-(init.)=1.26 s t(norm)=0.300407, mflops=16.6441 (err=1.6e-14) 5. 
Beauregard: elapsed time t=1.74 s, 2 iters, t-(init.)=1.72 s t(norm)=0.82016, mflops=6.09637 (err=1.6e-14) 6. Bergland: elapsed time t=1.04 s, 4 iters, t-(init.)=1 s t(norm)=0.238419, mflops=20.9715 (err=1.6e-14) 7. Brenner: elapsed time t=1.34 s, 4 iters, t-(init.)=1.31 s t(norm)=0.312328, mflops=16.0088 (err=1.6e-14) 8. Burrus: elapsed time t=1.42 s, 2 iters, t-(init.)=1.4 s t(norm)=0.667572, mflops=7.48983 (err=1.6e-14) 9. CWP (min N) (N=72072): elapsed time t=1.72 s, 16 iters, t-(init.)=1.57 s t(norm)=0.0935793, mflops=53.4306 10. CWP (best N) (N=72072): elapsed time t=1.72 s, 16 iters, t-(init.)=1.57 s t(norm)=0.0935793, mflops=53.4306 11. Edelblute: elapsed time t=1.36 s, 2 iters, t-(init.)=1.35 s t(norm)=0.64373, mflops=7.76723 (err=1.6e-14) 12. FFTPACK: elapsed time t=1.78 s, 8 iters, t-(init.)=1.71 s t(norm)=0.203848, mflops=24.5281 (err=1.6e-14) 13. FFTPACK (f2c): elapsed time t=1.04 s, 4 iters, t-(init.)=1.01 s t(norm)=0.240803, mflops=20.7639 (err=1.6e-14) FFTW_MEASURE plan: (cost = 1.100000e-01) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_NOTW 64 14. FFTW: elapsed time t=1.74 s, 16 iters, t-(init.)=1.61 s t(norm)=0.0959635, mflops=52.1032 (err=1.6e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.12 s, 8 iters, t-(init.)=1.05 s t(norm)=0.12517, mflops=39.9458 (err=1.6e-14) 16. Frigo-old: elapsed time t=1.06 s, 4 iters, t-(init.)=1.02 s t(norm)=0.243187, mflops=20.5603 (err=1.6e-14) 17. Green: elapsed time t=1.94 s, 8 iters, t-(init.)=1.87 s t(norm)=0.222921, mflops=22.4294 (err=1.6e-14) 18. GSL: elapsed time t=1.44 s, 8 iters, t-(init.)=1.37 s t(norm)=0.163317, mflops=30.6154 (err=1.6e-14) 19. GSL DIT: elapsed time t=1.1 s, 2 iters, t-(init.)=1.08 s t(norm)=0.514984, mflops=9.70904 (err=1.7e-14) 20. GSL DIF: elapsed time t=1.08 s, 2 iters, t-(init.)=1.07 s t(norm)=0.510216, mflops=9.79978 (err=1.7e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. 
Mayer (Buneman): elapsed time t=1.51 s, 4 iters, t-(init.)=1.47 s t(norm)=0.350475, mflops=14.2663 (err=1.6e-14) 23. Mayer (simple): elapsed time t=1.43 s, 4 iters, t-(init.)=1.39 s t(norm)=0.331402, mflops=15.0874 24. Mayer (lookup): elapsed time t=1.5 s, 4 iters, t-(init.)=1.47 s t(norm)=0.350475, mflops=14.2663 (err=1.6e-14) 25. Monro: elapsed time t=1.03 s, 2 iters, t-(init.)=1.02 s t(norm)=0.486374, mflops=10.2802 (err=1.8e-14) 26. NAPACK (f2c): elapsed time t=1.55 s, 4 iters, t-(init.)=1.52 s t(norm)=0.362396, mflops=13.7971 (err=8.7e-13) 27. Nielsen: elapsed time t=1.93 s, 4 iters, t-(init.)=1.9 s t(norm)=0.452995, mflops=11.0376 (err=2.6e-13) 28. NR (C): elapsed time t=1.03 s, 2 iters, t-(init.)=1.01 s t(norm)=0.481606, mflops=10.3819 (err=1.6e-14) 29. NR (F): elapsed time t=1.07 s, 2 iters, t-(init.)=1.06 s t(norm)=0.505447, mflops=9.89223 (err=1.6e-14) 30. Ooura (C): elapsed time t=1.02 s, 8 iters, t-(init.)=0.96 s t(norm)=0.114441, mflops=43.6907 (err=1.6e-14) 31. Ooura (F): elapsed time t=1.96 s, 16 iters, t-(init.)=1.83 s t(norm)=0.109076, mflops=45.8394 (err=1.6e-14) 32. QFT: elapsed time t=1.71 s, 4 iters, t-(init.)=1.68 s t(norm)=0.400543, mflops=12.483 (err=1.9e-14) 33. Ransom: elapsed time t=1.81 s, 8 iters, t-(init.)=1.74 s t(norm)=0.207424, mflops=24.1052 (err=1.7e-14) 34. SCIPORT: elapsed time t=1.18 s, 2 iters, t-(init.)=1.16 s t(norm)=0.553131, mflops=9.03945 (err=2.4e-07) 35. Singleton: elapsed time t=1.09 s, 4 iters, t-(init.)=1.06 s t(norm)=0.252724, mflops=19.7845 (err=2.3e-14) 36. Singleton (f2c): elapsed time t=1.09 s, 4 iters, t-(init.)=1.05 s t(norm)=0.25034, mflops=19.9729 (err=2.4e-14) 37. Sorensen: elapsed time t=1.95 s, 4 iters, t-(init.)=1.92 s t(norm)=0.457764, mflops=10.9227 (err=1.6e-14) 38. Sorensen DIT: elapsed time t=1.4 s, 2 iters, t-(init.)=1.39 s t(norm)=0.662804, mflops=7.54371 (err=1.6e-14) 39. Temperton: elapsed time t=1.41 s, 4 iters, t-(init.)=1.37 s t(norm)=0.326633, mflops=15.3077 (err=1.6e-14) 40. 
Temperton (f2c): elapsed time t=1.6 s, 4 iters, t-(init.)=1.57 s t(norm)=0.374317, mflops=13.3577 (err=1.6e-14) 41. Valkenburg: elapsed time t=1.34 s, 1 iters, t-(init.)=1.33 s t(norm)=1.26839, mflops=3.94202 (err=1.6e-14) 42. ESSL: elapsed time t=1.79 s, 16 iters, t-(init.)=1.65 s t(norm)=0.0983477, mflops=50.84 (err=1.6e-14) Top mflops for N=65536 = 53.4306 Normalized results and averages for N=65536: fft 0: mflops = 11.096 (norm. = 0.207672), norm. avg. (of 16) = 0.312218 fft 1: mflops = 10.8661 (norm. = 0.203368), norm. avg. (of 16) = 0.295046 fft 2: mflops = 8.32203 (norm. = 0.155754), norm. avg. (of 16) = 0.220812 fft 3: mflops = 13.981 (norm. = 0.261667), norm. avg. (of 16) = 0.115502 fft 4: mflops = 16.6441 (norm. = 0.311508), norm. avg. (of 16) = 0.347459 fft 5: mflops = 6.09637 (norm. = 0.114099), norm. avg. (of 16) = 0.0697058 fft 6: mflops = 20.9715 (norm. = 0.3925), norm. avg. (of 16) = 0.369378 fft 7: mflops = 16.0088 (norm. = 0.299618), norm. avg. (of 16) = 0.233261 fft 8: mflops = 7.48983 (norm. = 0.140179), norm. avg. (of 16) = 0.160518 fft 9: mflops = 53.4306 (norm. = 1), norm. avg. (of 16) = 0.574214 fft 10: mflops = 53.4306 (norm. = 1), norm. avg. (of 16) = 0.641425 fft 11: mflops = 7.76723 (norm. = 0.14537), norm. avg. (of 15) = 0.154985 fft 12: mflops = 24.5281 (norm. = 0.459064), norm. avg. (of 16) = 0.533171 fft 13: mflops = 20.7639 (norm. = 0.388614), norm. avg. (of 16) = 0.4711 fft 14: mflops = 52.1032 (norm. = 0.975155), norm. avg. (of 16) = 0.862172 fft 15: mflops = 39.9458 (norm. = 0.747619), norm. avg. (of 16) = 0.773899 fft 16: mflops = 20.5603 (norm. = 0.384804), norm. avg. (of 16) = 0.673199 fft 17: mflops = 22.4294 (norm. = 0.419786), norm. avg. (of 14) = 0.667756 fft 18: mflops = 30.6154 (norm. = 0.572993), norm. avg. (of 16) = 0.361039 fft 19: mflops = 9.70904 (norm. = 0.181713), norm. avg. (of 16) = 0.197703 fft 20: mflops = 9.79978 (norm. = 0.183411), norm. avg. (of 16) = 0.231079 fft 21: mflops = -1 (norm. = -0.0187159), norm. 
avg. (of 12) = 0.413784 fft 22: mflops = 14.2663 (norm. = 0.267007), norm. avg. (of 15) = 0.286709 fft 23: mflops = 15.0874 (norm. = 0.282374), norm. avg. (of 15) = 0.361811 fft 24: mflops = 14.2663 (norm. = 0.267007), norm. avg. (of 15) = 0.324124 fft 25: mflops = 10.2802 (norm. = 0.192402), norm. avg. (of 15) = 0.239889 fft 26: mflops = 13.7971 (norm. = 0.258224), norm. avg. (of 16) = 0.147119 fft 27: mflops = 11.0376 (norm. = 0.206579), norm. avg. (of 16) = 0.178509 fft 28: mflops = 10.3819 (norm. = 0.194307), norm. avg. (of 16) = 0.280033 fft 29: mflops = 9.89223 (norm. = 0.185142), norm. avg. (of 16) = 0.22398 fft 30: mflops = 43.6907 (norm. = 0.817708), norm. avg. (of 16) = 0.620816 fft 31: mflops = 45.8394 (norm. = 0.857923), norm. avg. (of 16) = 0.638223 fft 32: mflops = 12.483 (norm. = 0.233631), norm. avg. (of 13) = 0.2752 fft 33: mflops = 24.1052 (norm. = 0.451149), norm. avg. (of 15) = 0.213091 fft 34: mflops = 9.03945 (norm. = 0.169181), norm. avg. (of 15) = 0.324942 fft 35: mflops = 19.7845 (norm. = 0.370283), norm. avg. (of 16) = 0.359937 fft 36: mflops = 19.9729 (norm. = 0.37381), norm. avg. (of 16) = 0.353587 fft 37: mflops = 10.9227 (norm. = 0.204427), norm. avg. (of 16) = 0.404861 fft 38: mflops = 7.54371 (norm. = 0.141187), norm. avg. (of 16) = 0.157296 fft 39: mflops = 15.3077 (norm. = 0.286496), norm. avg. (of 16) = 0.217979 fft 40: mflops = 13.3577 (norm. = 0.25), norm. avg. (of 16) = 0.178899 fft 41: mflops = 3.94202 (norm. = 0.0737782), norm. avg. (of 16) = 0.0547601 fft 42: mflops = 50.84 (norm. = 0.951515), norm. avg. (of 14) = 0.766723 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.08 s, 1 iters, t-(init.)=1.06 s t(norm)=0.475715, mflops=10.5105 (err=3.9e-14) 1. Arndt DIT: elapsed time t=1.1 s, 1 iters, t-(init.)=1.08 s t(norm)=0.484691, mflops=10.3159 (err=3.9e-14) 2. Arndt Split-Radix: elapsed time t=1.36 s, 1 iters, t-(init.)=1.34 s t(norm)=0.601376, mflops=8.31427 (err=3.9e-14) 3. 
Arndt 4-step: elapsed time t=1.9 s, 2 iters, t-(init.)=1.87 s t(norm)=0.419617, mflops=11.9156 (err=3.9e-14) 4. Bailey: elapsed time t=1.46 s, 2 iters, t-(init.)=1.43 s t(norm)=0.320883, mflops=15.582 (err=3.9e-14) 5. Beauregard: elapsed time t=1.84 s, 1 iters, t-(init.)=1.82 s t(norm)=0.816794, mflops=6.12149 (err=3.9e-14) 6. Bergland: elapsed time t=1.11 s, 2 iters, t-(init.)=1.07 s t(norm)=0.240102, mflops=20.8245 (err=3.8e-14) 7. Brenner: elapsed time t=1.41 s, 2 iters, t-(init.)=1.37 s t(norm)=0.30742, mflops=16.2644 (err=3.9e-14) 8. Burrus: elapsed time t=1.53 s, 1 iters, t-(init.)=1.51 s t(norm)=0.67767, mflops=7.37823 (err=3.9e-14) 9. CWP (min N) (N=144144): elapsed time t=1.87 s, 8 iters, t-(init.)=1.72 s t(norm)=0.0964894, mflops=51.8192 10. CWP (best N) (N=144144): elapsed time t=1.87 s, 8 iters, t-(init.)=1.73 s t(norm)=0.0970504, mflops=51.5196 11. Edelblute: elapsed time t=1.47 s, 1 iters, t-(init.)=1.46 s t(norm)=0.65523, mflops=7.6309 (err=3.9e-14) 12. FFTPACK: elapsed time t=1.04 s, 2 iters, t-(init.)=1 s t(norm)=0.224394, mflops=22.2822 (err=3.9e-14) 13. FFTPACK (f2c): elapsed time t=1.23 s, 2 iters, t-(init.)=1.2 s t(norm)=0.269273, mflops=18.5685 (err=3.9e-14) FFTW_MEASURE plan: (cost = 2.300000e-01) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_NOTW 64 14. FFTW: elapsed time t=1.77 s, 8 iters, t-(init.)=1.64 s t(norm)=0.0920015, mflops=54.3469 (err=3.9e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.5 s, 4 iters, t-(init.)=1.43 s t(norm)=0.160442, mflops=31.164 (err=3.9e-14) 16. Frigo-old: elapsed time t=1.41 s, 2 iters, t-(init.)=1.38 s t(norm)=0.309664, mflops=16.1466 (err=3.9e-14) 17. Green: elapsed time t=1.07 s, 2 iters, t-(init.)=1.04 s t(norm)=0.23337, mflops=21.4252 (err=3.9e-14) 18. GSL: elapsed time t=1.83 s, 4 iters, t-(init.)=1.77 s t(norm)=0.198589, mflops=25.1777 (err=3.9e-14) 19. 
GSL DIT: elapsed time t=1.19 s, 1 iters, t-(init.)=1.18 s t(norm)=0.52957, mflops=9.44163 (err=4.0e-14) 20. GSL DIF: elapsed time t=1.16 s, 1 iters, t-(init.)=1.15 s t(norm)=0.516106, mflops=9.68793 (err=4.2e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.64 s, 2 iters, t-(init.)=1.6 s t(norm)=0.35903, mflops=13.9264 (err=3.9e-14) 23. Mayer (simple): elapsed time t=1.55 s, 2 iters, t-(init.)=1.51 s t(norm)=0.338835, mflops=14.7565 24. Mayer (lookup): elapsed time t=1.62 s, 2 iters, t-(init.)=1.59 s t(norm)=0.356786, mflops=14.014 (err=3.9e-14) 25. Monro: elapsed time t=1.18 s, 1 iters, t-(init.)=1.17 s t(norm)=0.525082, mflops=9.52232 (err=4.2e-14) 26. NAPACK (f2c): elapsed time t=1.68 s, 2 iters, t-(init.)=1.64 s t(norm)=0.368006, mflops=13.5867 (err=2.1e-12) 27. Nielsen: elapsed time t=1.01 s, 1 iters, t-(init.)=1 s t(norm)=0.448788, mflops=11.1411 (err=9.2e-13) 28. NR (C): elapsed time t=1.1 s, 1 iters, t-(init.)=1.08 s t(norm)=0.484691, mflops=10.3159 (err=3.9e-14) 29. NR (F): elapsed time t=1.14 s, 1 iters, t-(init.)=1.12 s t(norm)=0.502642, mflops=9.94743 (err=3.9e-14) 30. Ooura (C): elapsed time t=1.08 s, 4 iters, t-(init.)=1.01 s t(norm)=0.113319, mflops=44.1232 (err=3.8e-14) 31. Ooura (F): elapsed time t=1.06 s, 4 iters, t-(init.)=1 s t(norm)=0.112197, mflops=44.5645 (err=3.8e-14) 32. QFT: elapsed time t=1.85 s, 2 iters, t-(init.)=1.82 s t(norm)=0.408397, mflops=12.243 (err=4.2e-14) 33. Ransom: elapsed time t=1.02 s, 2 iters, t-(init.)=0.98 s t(norm)=0.219906, mflops=22.737 (err=3.9e-14) 34. SCIPORT: elapsed time t=1.3 s, 1 iters, t-(init.)=1.29 s t(norm)=0.578936, mflops=8.63653 (err=2.6e-07) 35. Singleton: elapsed time t=1.26 s, 2 iters, t-(init.)=1.23 s t(norm)=0.276005, mflops=18.1156 (err=5.3e-14) 36. Singleton (f2c): elapsed time t=1.25 s, 2 iters, t-(init.)=1.21 s t(norm)=0.271517, mflops=18.4151 (err=5.8e-14) 37. 
Sorensen: elapsed time t=1.11 s, 1 iters, t-(init.)=1.09 s t(norm)=0.489179, mflops=10.2212 (err=3.9e-14) 38. Sorensen DIT: elapsed time t=1.48 s, 1 iters, t-(init.)=1.46 s t(norm)=0.65523, mflops=7.6309 (err=3.9e-14) 39. Temperton: elapsed time t=1.64 s, 2 iters, t-(init.)=1.61 s t(norm)=0.361274, mflops=13.8399 (err=3.9e-14) 40. Temperton (f2c): elapsed time t=1.8 s, 2 iters, t-(init.)=1.76 s t(norm)=0.394933, mflops=12.6604 (err=3.9e-14) 41. Valkenburg: elapsed time t=2.92 s, 1 iters, t-(init.)=2.9 s t(norm)=1.30148, mflops=3.84177 (err=3.8e-14) 42. ESSL: elapsed time t=1.9 s, 8 iters, t-(init.)=1.76 s t(norm)=0.0987333, mflops=50.6415 (err=3.8e-14) Top mflops for N=131072 = 54.3469 Normalized results and averages for N=131072: fft 0: mflops = 10.5105 (norm. = 0.193396), norm. avg. (of 17) = 0.305229 fft 1: mflops = 10.3159 (norm. = 0.189815), norm. avg. (of 17) = 0.288856 fft 2: mflops = 8.31427 (norm. = 0.152985), norm. avg. (of 17) = 0.216822 fft 3: mflops = 11.9156 (norm. = 0.219251), norm. avg. (of 17) = 0.121604 fft 4: mflops = 15.582 (norm. = 0.286713), norm. avg. (of 17) = 0.343886 fft 5: mflops = 6.12149 (norm. = 0.112637), norm. avg. (of 17) = 0.0722312 fft 6: mflops = 20.8245 (norm. = 0.383178), norm. avg. (of 17) = 0.37019 fft 7: mflops = 16.2644 (norm. = 0.29927), norm. avg. (of 17) = 0.237144 fft 8: mflops = 7.37823 (norm. = 0.135762), norm. avg. (of 17) = 0.159062 fft 9: mflops = 51.8192 (norm. = 0.953488), norm. avg. (of 17) = 0.596524 fft 10: mflops = 51.5196 (norm. = 0.947977), norm. avg. (of 17) = 0.659458 fft 11: mflops = 7.6309 (norm. = 0.140411), norm. avg. (of 16) = 0.154074 fft 12: mflops = 22.2822 (norm. = 0.41), norm. avg. (of 17) = 0.525926 fft 13: mflops = 18.5685 (norm. = 0.341667), norm. avg. (of 17) = 0.463487 fft 14: mflops = 54.3469 (norm. = 1), norm. avg. (of 17) = 0.870279 fft 15: mflops = 31.164 (norm. = 0.573427), norm. avg. (of 17) = 0.762107 fft 16: mflops = 16.1466 (norm. = 0.297101), norm. avg. 
(of 17) = 0.651075 fft 17: mflops = 21.4252 (norm. = 0.394231), norm. avg. (of 15) = 0.649521 fft 18: mflops = 25.1777 (norm. = 0.463277), norm. avg. (of 17) = 0.367053 fft 19: mflops = 9.44163 (norm. = 0.173729), norm. avg. (of 17) = 0.196293 fft 20: mflops = 9.68793 (norm. = 0.178261), norm. avg. (of 17) = 0.227972 fft 21: mflops = -1 (norm. = -0.0184003), norm. avg. (of 12) = 0.413784 fft 22: mflops = 13.9264 (norm. = 0.25625), norm. avg. (of 16) = 0.284805 fft 23: mflops = 14.7565 (norm. = 0.271523), norm. avg. (of 16) = 0.356168 fft 24: mflops = 14.014 (norm. = 0.257862), norm. avg. (of 16) = 0.319983 fft 25: mflops = 9.52232 (norm. = 0.175214), norm. avg. (of 16) = 0.235847 fft 26: mflops = 13.5867 (norm. = 0.25), norm. avg. (of 17) = 0.153171 fft 27: mflops = 11.1411 (norm. = 0.205), norm. avg. (of 17) = 0.180067 fft 28: mflops = 10.3159 (norm. = 0.189815), norm. avg. (of 17) = 0.274726 fft 29: mflops = 9.94743 (norm. = 0.183036), norm. avg. (of 17) = 0.221572 fft 30: mflops = 44.1232 (norm. = 0.811881), norm. avg. (of 17) = 0.632055 fft 31: mflops = 44.5645 (norm. = 0.82), norm. avg. (of 17) = 0.648915 fft 32: mflops = 12.243 (norm. = 0.225275), norm. avg. (of 14) = 0.271634 fft 33: mflops = 22.737 (norm. = 0.418367), norm. avg. (of 16) = 0.225921 fft 34: mflops = 8.63653 (norm. = 0.158915), norm. avg. (of 16) = 0.314565 fft 35: mflops = 18.1156 (norm. = 0.333333), norm. avg. (of 17) = 0.358372 fft 36: mflops = 18.4151 (norm. = 0.338843), norm. avg. (of 17) = 0.35272 fft 37: mflops = 10.2212 (norm. = 0.188073), norm. avg. (of 17) = 0.392109 fft 38: mflops = 7.6309 (norm. = 0.140411), norm. avg. (of 17) = 0.156303 fft 39: mflops = 13.8399 (norm. = 0.254658), norm. avg. (of 17) = 0.220137 fft 40: mflops = 12.6604 (norm. = 0.232955), norm. avg. (of 17) = 0.182079 fft 41: mflops = 3.84177 (norm. = 0.0706897), norm. avg. (of 17) = 0.0556971 fft 42: mflops = 50.6415 (norm. = 0.931818), norm. avg. 
(of 15) = 0.77773 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=2.46 s, 1 iters, t-(init.)=2.43 s t(norm)=0.514984, mflops=9.70904 (err=4.3e-14) 1. Arndt DIT: elapsed time t=2.49 s, 1 iters, t-(init.)=2.45 s t(norm)=0.519223, mflops=9.62978 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=3.12 s, 1 iters, t-(init.)=3.08 s t(norm)=0.652737, mflops=7.66005 (err=4.3e-14) 3. Arndt 4-step: elapsed time t=1.79 s, 1 iters, t-(init.)=1.76 s t(norm)=0.372993, mflops=13.4051 (err=4.3e-14) 4. Bailey: elapsed time t=2.08 s, 1 iters, t-(init.)=2.05 s t(norm)=0.434452, mflops=11.5088 (err=4.3e-14) 5. Beauregard: elapsed time t=3.93 s, 1 iters, t-(init.)=3.9 s t(norm)=0.826518, mflops=6.04948 (err=4.2e-14) 6. Bergland: elapsed time t=1.24 s, 1 iters, t-(init.)=1.2 s t(norm)=0.254313, mflops=19.6608 (err=4.3e-14) 7. Brenner: elapsed time t=1.64 s, 1 iters, t-(init.)=1.61 s t(norm)=0.341203, mflops=14.654 (err=4.3e-14) 8. Burrus: elapsed time t=3.5 s, 1 iters, t-(init.)=3.47 s t(norm)=0.735389, mflops=6.79912 (err=4.3e-14) 9. CWP (min N) (N=360360): elapsed time t=1.44 s, 2 iters, t-(init.)=1.35 s t(norm)=0.143051, mflops=34.9525 10. CWP (best N) (N=360360): elapsed time t=1.43 s, 2 iters, t-(init.)=1.34 s t(norm)=0.141992, mflops=35.2134 11. Edelblute: elapsed time t=3.33 s, 1 iters, t-(init.)=3.3 s t(norm)=0.699361, mflops=7.14938 (err=4.3e-14) 12. FFTPACK: elapsed time t=1.49 s, 1 iters, t-(init.)=1.45 s t(norm)=0.307295, mflops=16.271 (err=4.2e-14) 13. FFTPACK (f2c): elapsed time t=1.7 s, 1 iters, t-(init.)=1.66 s t(norm)=0.3518, mflops=14.2126 (err=4.2e-14) FFTW_MEASURE plan: (cost = 5.600000e-01) FFTW_TWIDDLE 2 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_NOTW 64 14. FFTW: elapsed time t=1.14 s, 2 iters, t-(init.)=1.07 s t(norm)=0.113381, mflops=44.099 (err=4.3e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.81 s, 2 iters, t-(init.)=1.75 s t(norm)=0.185437, mflops=26.9634 (err=4.2e-14) 16. Frigo-old: elapsed time t=1.67 s, 1 iters, t-(init.)=1.63 s t(norm)=0.345442, mflops=14.4742 (err=4.2e-14) 17. Green: elapsed time t=1.22 s, 1 iters, t-(init.)=1.19 s t(norm)=0.252194, mflops=19.826 (err=4.3e-14) 18. GSL: elapsed time t=1.16 s, 1 iters, t-(init.)=1.13 s t(norm)=0.239478, mflops=20.8787 (err=4.2e-14) 19. GSL DIT: elapsed time t=2.63 s, 1 iters, t-(init.)=2.6 s t(norm)=0.551012, mflops=9.07422 (err=4.5e-14) 20. GSL DIF: elapsed time t=2.59 s, 1 iters, t-(init.)=2.56 s t(norm)=0.542535, mflops=9.216 (err=4.7e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.77 s, 1 iters, t-(init.)=1.74 s t(norm)=0.368754, mflops=13.5592 (err=4.3e-14) 23. Mayer (simple): elapsed time t=1.68 s, 1 iters, t-(init.)=1.65 s t(norm)=0.349681, mflops=14.2988 24. Mayer (lookup): elapsed time t=1.77 s, 1 iters, t-(init.)=1.74 s t(norm)=0.368754, mflops=13.5592 (err=4.3e-14) 25. Monro: elapsed time t=2.54 s, 1 iters, t-(init.)=2.5 s t(norm)=0.529819, mflops=9.43718 (err=5.1e-14) 26. NAPACK (f2c): elapsed time t=1.75 s, 1 iters, t-(init.)=1.72 s t(norm)=0.364516, mflops=13.7168 (err=3.7e-12) 27. Nielsen: elapsed time t=2.33 s, 1 iters, t-(init.)=2.29 s t(norm)=0.485314, mflops=10.3026 (err=2.2e-12) 28. NR (C): elapsed time t=2.47 s, 1 iters, t-(init.)=2.44 s t(norm)=0.517103, mflops=9.66925 (err=4.3e-14) 29. NR (F): elapsed time t=2.56 s, 1 iters, t-(init.)=2.52 s t(norm)=0.534058, mflops=9.36229 (err=4.3e-14) 30. Ooura (C): elapsed time t=1.24 s, 2 iters, t-(init.)=1.17 s t(norm)=0.123978, mflops=40.3298 (err=4.3e-14) 31. Ooura (F): elapsed time t=1.2 s, 2 iters, t-(init.)=1.14 s t(norm)=0.120799, mflops=41.3912 (err=4.3e-14) 32. QFT: elapsed time t=2.01 s, 1 iters, t-(init.)=1.98 s t(norm)=0.419617, mflops=11.9156 (err=4.7e-14) 33. 
Ransom: elapsed time t=1.86 s, 2 iters, t-(init.)=1.79 s t(norm)=0.189675, mflops=26.3608 (err=4.3e-14) 34. SCIPORT: elapsed time t=3.15 s, 1 iters, t-(init.)=3.12 s t(norm)=0.661214, mflops=7.56185 (err=2.8e-07) 35. Singleton: elapsed time t=1.42 s, 1 iters, t-(init.)=1.39 s t(norm)=0.294579, mflops=16.9734 (err=6.5e-14) 36. Singleton (f2c): elapsed time t=1.41 s, 1 iters, t-(init.)=1.38 s t(norm)=0.29246, mflops=17.0963 (err=5.9e-14) 37. Sorensen: elapsed time t=2.55 s, 1 iters, t-(init.)=2.51 s t(norm)=0.531938, mflops=9.39959 (err=4.3e-14) 38. Sorensen DIT: elapsed time t=3.4 s, 1 iters, t-(init.)=3.37 s t(norm)=0.714196, mflops=7.00088 (err=4.3e-14) 39. Temperton: elapsed time t=1.84 s, 1 iters, t-(init.)=1.8 s t(norm)=0.38147, mflops=13.1072 (err=4.2e-14) 40. Temperton (f2c): elapsed time t=2.01 s, 1 iters, t-(init.)=1.98 s t(norm)=0.419617, mflops=11.9156 (err=4.2e-14) 41. Valkenburg: elapsed time t=6.71 s, 1 iters, t-(init.)=6.67 s t(norm)=1.41356, mflops=3.53718 (err=4.3e-14) 42. ESSL: elapsed time t=1 s, 2 iters, t-(init.)=0.93 s t(norm)=0.0985463, mflops=50.7375 (err=4.3e-14) Top mflops for N=262144 = 50.7375 Normalized results and averages for N=262144: fft 0: mflops = 9.70904 (norm. = 0.191358), norm. avg. (of 18) = 0.298903 fft 1: mflops = 9.62978 (norm. = 0.189796), norm. avg. (of 18) = 0.283353 fft 2: mflops = 7.66005 (norm. = 0.150974), norm. avg. (of 18) = 0.213164 fft 3: mflops = 13.4051 (norm. = 0.264205), norm. avg. (of 18) = 0.129527 fft 4: mflops = 11.5088 (norm. = 0.226829), norm. avg. (of 18) = 0.337382 fft 5: mflops = 6.04948 (norm. = 0.119231), norm. avg. (of 18) = 0.0748423 fft 6: mflops = 19.6608 (norm. = 0.3875), norm. avg. (of 18) = 0.371152 fft 7: mflops = 14.654 (norm. = 0.28882), norm. avg. (of 18) = 0.240014 fft 8: mflops = 6.79912 (norm. = 0.134006), norm. avg. (of 18) = 0.15767 fft 9: mflops = 34.9525 (norm. = 0.688889), norm. avg. (of 18) = 0.601655 fft 10: mflops = 35.2134 (norm. = 0.69403), norm. avg. 
(of 18) = 0.661379 fft 11: mflops = 7.14938 (norm. = 0.140909), norm. avg. (of 17) = 0.1533 fft 12: mflops = 16.271 (norm. = 0.32069), norm. avg. (of 18) = 0.514524 fft 13: mflops = 14.2126 (norm. = 0.28012), norm. avg. (of 18) = 0.4533 fft 14: mflops = 44.099 (norm. = 0.869159), norm. avg. (of 18) = 0.870217 fft 15: mflops = 26.9634 (norm. = 0.531429), norm. avg. (of 18) = 0.749291 fft 16: mflops = 14.4742 (norm. = 0.285276), norm. avg. (of 18) = 0.630753 fft 17: mflops = 19.826 (norm. = 0.390756), norm. avg. (of 16) = 0.633348 fft 18: mflops = 20.8787 (norm. = 0.411504), norm. avg. (of 18) = 0.369523 fft 19: mflops = 9.07422 (norm. = 0.178846), norm. avg. (of 18) = 0.195323 fft 20: mflops = 9.216 (norm. = 0.181641), norm. avg. (of 18) = 0.225398 fft 21: mflops = -1 (norm. = -0.0197093), norm. avg. (of 12) = 0.413784 fft 22: mflops = 13.5592 (norm. = 0.267241), norm. avg. (of 17) = 0.283772 fft 23: mflops = 14.2988 (norm. = 0.281818), norm. avg. (of 17) = 0.351795 fft 24: mflops = 13.5592 (norm. = 0.267241), norm. avg. (of 17) = 0.31688 fft 25: mflops = 9.43718 (norm. = 0.186), norm. avg. (of 17) = 0.232915 fft 26: mflops = 13.7168 (norm. = 0.270349), norm. avg. (of 18) = 0.15968 fft 27: mflops = 10.3026 (norm. = 0.203057), norm. avg. (of 18) = 0.181345 fft 28: mflops = 9.66925 (norm. = 0.190574), norm. avg. (of 18) = 0.270051 fft 29: mflops = 9.36229 (norm. = 0.184524), norm. avg. (of 18) = 0.219514 fft 30: mflops = 40.3298 (norm. = 0.794872), norm. avg. (of 18) = 0.6411 fft 31: mflops = 41.3912 (norm. = 0.815789), norm. avg. (of 18) = 0.658186 fft 32: mflops = 11.9156 (norm. = 0.234848), norm. avg. (of 15) = 0.269181 fft 33: mflops = 26.3608 (norm. = 0.519553), norm. avg. (of 17) = 0.243194 fft 34: mflops = 7.56185 (norm. = 0.149038), norm. avg. (of 17) = 0.304828 fft 35: mflops = 16.9734 (norm. = 0.334532), norm. avg. (of 18) = 0.357048 fft 36: mflops = 17.0963 (norm. = 0.336957), norm. avg. (of 18) = 0.351844 fft 37: mflops = 9.39959 (norm. = 0.185259), norm. 
avg. (of 18) = 0.380617 fft 38: mflops = 7.00088 (norm. = 0.137982), norm. avg. (of 18) = 0.155285 fft 39: mflops = 13.1072 (norm. = 0.258333), norm. avg. (of 18) = 0.222259 fft 40: mflops = 11.9156 (norm. = 0.234848), norm. avg. (of 18) = 0.18501 fft 41: mflops = 3.53718 (norm. = 0.0697151), norm. avg. (of 18) = 0.0564759 fft 42: mflops = 50.7375 (norm. = 1), norm. avg. (of 16) = 0.791622 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. Brenner 1. CWP (min N) 2. CWP (best N) 3. FFTPACK 4. FFTPACK (f2c) 5. FFTW 6. FFTW_ESTIMATE 7. Frigo-old 8. GSL 9. NAPACK (f2c) 10. Nielsen 11. Singleton 12. Singleton (f2c) 13. Temperton 14. Temperton (f2c) 15. Valkenburg 16. ESSL Computing normalized averages (17 transforms). Benchmarking for array size = 6: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.08 s, 131072 iters, t-(init.)=1.04 s t(norm)=0.511585, mflops=9.77354 2. CWP (best N) (N=15): elapsed time t=1.42 s, 131072 iters, t-(init.)=1.34 s t(norm)=0.659158, mflops=7.58544 3. FFTPACK: elapsed time t=1.24 s, 262144 iters, t-(init.)=1.15 s t(norm)=0.282848, mflops=17.6774 (err=7.1e-17) 4. FFTPACK (f2c): elapsed time t=1.18 s, 262144 iters, t-(init.)=1.09 s t(norm)=0.26809, mflops=18.6504 (err=2.2e-16) FFTW_MEASURE plan: (cost = 1.831055e-06) FFTW_NOTW 6 5. FFTW: elapsed time t=1.01 s, 524288 iters, t-(init.)=0.83 s t(norm)=0.102071, mflops=48.9855 (err=1.3e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 6. 
FFTW_ESTIMATE: elapsed time t=1.01 s, 524288 iters, t-(init.)=0.83 s t(norm)=0.102071, mflops=48.9855 (err=1.3e-16) 7. Frigo-old: elapsed time t=1.12 s, 131072 iters, t-(init.)=1.07 s t(norm)=0.526342, mflops=9.49952 (err=2.5e-16) 8. GSL: elapsed time t=1.68 s, 262144 iters, t-(init.)=1.59 s t(norm)=0.391067, mflops=12.7855 (err=1.1e-16) 9. NAPACK (f2c): elapsed time t=1.78 s, 131072 iters, t-(init.)=1.73 s t(norm)=0.851002, mflops=5.87543 (err=4.8e-16) 10. Nielsen: elapsed time t=1.43 s, 65536 iters, t-(init.)=1.41 s t(norm)=1.38718, mflops=3.60443 (err=7.7e-16) 11. Singleton: elapsed time t=1.61 s, 131072 iters, t-(init.)=1.57 s t(norm)=0.772297, mflops=6.47419 (err=7.6e-17) 12. Singleton (f2c): elapsed time t=1.56 s, 131072 iters, t-(init.)=1.52 s t(norm)=0.747701, mflops=6.68716 (err=1.7e-16) 13. Temperton: elapsed time t=1.37 s, 131072 iters, t-(init.)=1.32 s t(norm)=0.64932, mflops=7.70037 (err=7.8e-09) 14. Temperton (f2c): elapsed time t=1.63 s, 131072 iters, t-(init.)=1.58 s t(norm)=0.777216, mflops=6.43322 (err=7.6e-17) 15. Valkenburg: elapsed time t=1.23 s, 65536 iters, t-(init.)=1.21 s t(norm)=1.19042, mflops=4.2002 (err=3.5e-16) 16. ESSL: elapsed time t=1.56 s, 262144 iters, t-(init.)=1.47 s t(norm)=0.361553, mflops=13.8292 (err=7.6e-17) Top mflops for N=6 = 48.9855 Normalized results and averages for N=6: fft 0: mflops = -1 (norm. = -0.0204142), norm. avg. (of 0) = -1 fft 1: mflops = 9.77354 (norm. = 0.199519), norm. avg. (of 1) = 0.199519 fft 2: mflops = 7.58544 (norm. = 0.154851), norm. avg. (of 1) = 0.154851 fft 3: mflops = 17.6774 (norm. = 0.36087), norm. avg. (of 1) = 0.36087 fft 4: mflops = 18.6504 (norm. = 0.380734), norm. avg. (of 1) = 0.380734 fft 5: mflops = 48.9855 (norm. = 1), norm. avg. (of 1) = 1 fft 6: mflops = 48.9855 (norm. = 1), norm. avg. (of 1) = 1 fft 7: mflops = 9.49952 (norm. = 0.193925), norm. avg. (of 1) = 0.193925 fft 8: mflops = 12.7855 (norm. = 0.261006), norm. avg. (of 1) = 0.261006 fft 9: mflops = 5.87543 (norm. 
= 0.119942), norm. avg. (of 1) = 0.119942 fft 10: mflops = 3.60443 (norm. = 0.0735816), norm. avg. (of 1) = 0.0735816 fft 11: mflops = 6.47419 (norm. = 0.132166), norm. avg. (of 1) = 0.132166 fft 12: mflops = 6.68716 (norm. = 0.136513), norm. avg. (of 1) = 0.136513 fft 13: mflops = 7.70037 (norm. = 0.157197), norm. avg. (of 1) = 0.157197 fft 14: mflops = 6.43322 (norm. = 0.131329), norm. avg. (of 1) = 0.131329 fft 15: mflops = 4.2002 (norm. = 0.0857438), norm. avg. (of 1) = 0.0857438 fft 16: mflops = 13.8292 (norm. = 0.282313), norm. avg. (of 1) = 0.282313 Benchmarking for array size = 9: 0. Brenner: elapsed time t=1.22 s, 32768 iters, t-(init.)=1.2 s t(norm)=1.28363, mflops=3.8952 (err=4.2e-16) 1. CWP (min N): elapsed time t=1.13 s, 131072 iters, t-(init.)=1.06 s t(norm)=0.283468, mflops=17.6387 2. CWP (best N) (N=15): elapsed time t=1.42 s, 131072 iters, t-(init.)=1.34 s t(norm)=0.358347, mflops=13.953 3. FFTPACK: elapsed time t=1.51 s, 262144 iters, t-(init.)=1.38 s t(norm)=0.184522, mflops=27.0971 (err=3.9e-16) 4. FFTPACK (f2c): elapsed time t=1.47 s, 262144 iters, t-(init.)=1.35 s t(norm)=0.18051, mflops=27.6992 (err=2.4e-16) FFTW_MEASURE plan: (cost = 2.899170e-06) FFTW_NOTW 9 5. FFTW: elapsed time t=1.58 s, 524288 iters, t-(init.)=1.32 s t(norm)=0.0882496, mflops=56.6575 (err=2.4e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.57 s, 524288 iters, t-(init.)=1.31 s t(norm)=0.087581, mflops=57.09 (err=2.4e-16) 7. Frigo-old: elapsed time t=1.08 s, 65536 iters, t-(init.)=1.05 s t(norm)=0.561588, mflops=8.90332 (err=4.0e-16) 8. GSL: elapsed time t=1.23 s, 131072 iters, t-(init.)=1.16 s t(norm)=0.310211, mflops=16.1181 (err=3.9e-16) 9. NAPACK (f2c): elapsed time t=1.08 s, 65536 iters, t-(init.)=1.05 s t(norm)=0.561588, mflops=8.90332 (err=3.3e-16) 10. Nielsen: elapsed time t=1.56 s, 65536 iters, t-(init.)=1.53 s t(norm)=0.818314, mflops=6.11012 (err=1.0e-15) 11. 
Singleton: elapsed time t=1.48 s, 131072 iters, t-(init.)=1.42 s t(norm)=0.379741, mflops=13.1669 (err=3.7e-16) 12. Singleton (f2c): elapsed time t=1.51 s, 131072 iters, t-(init.)=1.45 s t(norm)=0.387763, mflops=12.8945 (err=3.7e-16) 13. Temperton: elapsed time t=1.43 s, 131072 iters, t-(init.)=1.37 s t(norm)=0.366369, mflops=13.6474 (err=1.3e-08) 14. Temperton (f2c): elapsed time t=1.94 s, 131072 iters, t-(init.)=1.87 s t(norm)=0.500081, mflops=9.99838 (err=3.0e-16) 15. Valkenburg: elapsed time t=1.02 s, 32768 iters, t-(init.)=1.01 s t(norm)=1.08039, mflops=4.62796 (err=4.6e-16) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=9 = 57.09 Normalized results and averages for N=9: fft 0: mflops = 3.8952 (norm. = 0.0682292), norm. avg. (of 1) = 0.0682292 fft 1: mflops = 17.6387 (norm. = 0.308962), norm. avg. (of 2) = 0.254241 fft 2: mflops = 13.953 (norm. = 0.244403), norm. avg. (of 2) = 0.199627 fft 3: mflops = 27.0971 (norm. = 0.474638), norm. avg. (of 2) = 0.417754 fft 4: mflops = 27.6992 (norm. = 0.485185), norm. avg. (of 2) = 0.43296 fft 5: mflops = 56.6575 (norm. = 0.992424), norm. avg. (of 2) = 0.996212 fft 6: mflops = 57.09 (norm. = 1), norm. avg. (of 2) = 1 fft 7: mflops = 8.90332 (norm. = 0.155952), norm. avg. (of 2) = 0.174939 fft 8: mflops = 16.1181 (norm. = 0.282328), norm. avg. (of 2) = 0.271667 fft 9: mflops = 8.90332 (norm. = 0.155952), norm. avg. (of 2) = 0.137947 fft 10: mflops = 6.11012 (norm. = 0.107026), norm. avg. (of 2) = 0.0903039 fft 11: mflops = 13.1669 (norm. = 0.230634), norm. avg. (of 2) = 0.1814 fft 12: mflops = 12.8945 (norm. = 0.225862), norm. avg. (of 2) = 0.181188 fft 13: mflops = 13.6474 (norm. = 0.239051), norm. avg. (of 2) = 0.198124 fft 14: mflops = 9.99838 (norm. = 0.175134), norm. avg. (of 2) = 0.153231 fft 15: mflops = 4.62796 (norm. = 0.0810644), norm. avg. (of 2) = 0.0834041 fft 16: mflops = -1 (norm. = -0.0175162), norm. avg. (of 1) = 0.282313 Benchmarking for array size = 12: 0. 
Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.3 s, 131072 iters, t-(init.)=1.23 s t(norm)=0.218137, mflops=22.9214 2. CWP (best N) (N=15): elapsed time t=1.41 s, 131072 iters, t-(init.)=1.32 s t(norm)=0.234098, mflops=21.3586 3. FFTPACK: elapsed time t=1.64 s, 262144 iters, t-(init.)=1.5 s t(norm)=0.13301, mflops=37.5911 (err=1.9e-16) 4. FFTPACK (f2c): elapsed time t=1.64 s, 262144 iters, t-(init.)=1.49 s t(norm)=0.132124, mflops=37.8433 (err=2.6e-16) FFTW_MEASURE plan: (cost = 3.204346e-06) FFTW_NOTW 12 5. FFTW: elapsed time t=1.75 s, 524288 iters, t-(init.)=1.45 s t(norm)=0.0642883, mflops=77.7746 (err=1.5e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.75 s, 524288 iters, t-(init.)=1.45 s t(norm)=0.0642883, mflops=77.7746 (err=1.5e-16) 7. Frigo-old: elapsed time t=1.05 s, 65536 iters, t-(init.)=1.01 s t(norm)=0.358241, mflops=13.9571 (err=2.6e-16) 8. GSL: elapsed time t=1.41 s, 131072 iters, t-(init.)=1.34 s t(norm)=0.237645, mflops=21.0398 (err=1.8e-16) 9. NAPACK (f2c): elapsed time t=1.65 s, 65536 iters, t-(init.)=1.62 s t(norm)=0.574605, mflops=8.70163 (err=5.0e-16) 10. Nielsen: elapsed time t=1.74 s, 65536 iters, t-(init.)=1.71 s t(norm)=0.606527, mflops=8.24365 (err=7.1e-16) 11. Singleton: elapsed time t=1.08 s, 65536 iters, t-(init.)=1.04 s t(norm)=0.368882, mflops=13.5545 (err=1.1e-16) 12. Singleton (f2c): elapsed time t=1.04 s, 65536 iters, t-(init.)=1.01 s t(norm)=0.358241, mflops=13.9571 (err=1.1e-16) 13. Temperton: elapsed time t=1.74 s, 131072 iters, t-(init.)=1.67 s t(norm)=0.29617, mflops=16.8822 (err=5.6e-09) 14. Temperton (f2c): elapsed time t=1.98 s, 131072 iters, t-(init.)=1.9 s t(norm)=0.33696, mflops=14.8386 (err=1.2e-16) 15. Valkenburg: elapsed time t=1.64 s, 32768 iters, t-(init.)=1.62 s t(norm)=1.14921, mflops=4.35082 (err=3.6e-16) 16. 
ESSL: elapsed time t=1.86 s, 262144 iters, t-(init.)=1.71 s t(norm)=0.151632, mflops=32.9746 (err=9.9e-17) Top mflops for N=12 = 77.7746 Normalized results and averages for N=12: fft 0: mflops = -1 (norm. = -0.0128577), norm. avg. (of 1) = 0.0682292 fft 1: mflops = 22.9214 (norm. = 0.294715), norm. avg. (of 3) = 0.267732 fft 2: mflops = 21.3586 (norm. = 0.274621), norm. avg. (of 3) = 0.224625 fft 3: mflops = 37.5911 (norm. = 0.483333), norm. avg. (of 3) = 0.439614 fft 4: mflops = 37.8433 (norm. = 0.486577), norm. avg. (of 3) = 0.450832 fft 5: mflops = 77.7746 (norm. = 1), norm. avg. (of 3) = 0.997475 fft 6: mflops = 77.7746 (norm. = 1), norm. avg. (of 3) = 1 fft 7: mflops = 13.9571 (norm. = 0.179455), norm. avg. (of 3) = 0.176444 fft 8: mflops = 21.0398 (norm. = 0.270522), norm. avg. (of 3) = 0.271285 fft 9: mflops = 8.70163 (norm. = 0.111883), norm. avg. (of 3) = 0.129259 fft 10: mflops = 8.24365 (norm. = 0.105994), norm. avg. (of 3) = 0.095534 fft 11: mflops = 13.5545 (norm. = 0.174279), norm. avg. (of 3) = 0.179026 fft 12: mflops = 13.9571 (norm. = 0.179455), norm. avg. (of 3) = 0.18061 fft 13: mflops = 16.8822 (norm. = 0.217066), norm. avg. (of 3) = 0.204438 fft 14: mflops = 14.8386 (norm. = 0.190789), norm. avg. (of 3) = 0.165751 fft 15: mflops = 4.35082 (norm. = 0.0559414), norm. avg. (of 3) = 0.0742498 fft 16: mflops = 32.9746 (norm. = 0.423977), norm. avg. (of 2) = 0.353145 Benchmarking for array size = 15: 0. Brenner: elapsed time t=1.67 s, 32768 iters, t-(init.)=1.65 s t(norm)=0.859234, mflops=5.81914 (err=4.1e-16) 1. CWP (min N): elapsed time t=1.42 s, 131072 iters, t-(init.)=1.33 s t(norm)=0.173149, mflops=28.8769 2. CWP (best N): elapsed time t=1.41 s, 131072 iters, t-(init.)=1.32 s t(norm)=0.171847, mflops=29.0957 3. FFTPACK: elapsed time t=1.96 s, 262144 iters, t-(init.)=1.79 s t(norm)=0.116517, mflops=42.9121 (err=1.9e-16) 4. 
FFTPACK (f2c): elapsed time t=1.04 s, 131072 iters, t-(init.)=0.95 s t(norm)=0.123678, mflops=40.4277 (err=2.9e-16) FFTW_MEASURE plan: (cost = 3.967285e-06) FFTW_NOTW 15 5. FFTW: elapsed time t=1.1 s, 262144 iters, t-(init.)=0.93 s t(norm)=0.0605369, mflops=82.5942 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.1 s, 262144 iters, t-(init.)=0.93 s t(norm)=0.0605369, mflops=82.5942 (err=1.4e-16) 7. Frigo-old: elapsed time t=1.68 s, 65536 iters, t-(init.)=1.64 s t(norm)=0.427013, mflops=11.7092 (err=3.1e-16) 8. GSL: elapsed time t=1.05 s, 65536 iters, t-(init.)=1.01 s t(norm)=0.262978, mflops=19.013 (err=1.8e-16) 9. NAPACK (f2c): elapsed time t=1.27 s, 32768 iters, t-(init.)=1.25 s t(norm)=0.650935, mflops=7.68126 (err=7.8e-16) 10. Nielsen: elapsed time t=1.97 s, 65536 iters, t-(init.)=1.92 s t(norm)=0.499918, mflops=10.0016 (err=6.5e-15) 11. Singleton: elapsed time t=1.38 s, 65536 iters, t-(init.)=1.33 s t(norm)=0.346297, mflops=14.4385 (err=2.3e-16) 12. Singleton (f2c): elapsed time t=1.32 s, 65536 iters, t-(init.)=1.27 s t(norm)=0.330675, mflops=15.1206 (err=2.4e-16) 13. Temperton: elapsed time t=1.03 s, 65536 iters, t-(init.)=0.99 s t(norm)=0.25777, mflops=19.3971 (err=8.0e-09) 14. Temperton (f2c): elapsed time t=1.3 s, 65536 iters, t-(init.)=1.26 s t(norm)=0.328071, mflops=15.2406 (err=1.5e-16) 15. Valkenburg: elapsed time t=1.06 s, 16384 iters, t-(init.)=1.05 s t(norm)=1.09357, mflops=4.57218 (err=3.2e-16) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=15 = 82.5942 Normalized results and averages for N=15: fft 0: mflops = 5.81914 (norm. = 0.0704545), norm. avg. (of 2) = 0.0693419 fft 1: mflops = 28.8769 (norm. = 0.349624), norm. avg. (of 4) = 0.288205 fft 2: mflops = 29.0957 (norm. = 0.352273), norm. avg. (of 4) = 0.256537 fft 3: mflops = 42.9121 (norm. = 0.519553), norm. avg. (of 4) = 0.459598 fft 4: mflops = 40.4277 (norm. = 0.489474), norm. avg. 
(of 4) = 0.460492 fft 5: mflops = 82.5942 (norm. = 1), norm. avg. (of 4) = 0.998106 fft 6: mflops = 82.5942 (norm. = 1), norm. avg. (of 4) = 1 fft 7: mflops = 11.7092 (norm. = 0.141768), norm. avg. (of 4) = 0.167775 fft 8: mflops = 19.013 (norm. = 0.230198), norm. avg. (of 4) = 0.261014 fft 9: mflops = 7.68126 (norm. = 0.093), norm. avg. (of 4) = 0.120194 fft 10: mflops = 10.0016 (norm. = 0.121094), norm. avg. (of 4) = 0.101924 fft 11: mflops = 14.4385 (norm. = 0.174812), norm. avg. (of 4) = 0.177973 fft 12: mflops = 15.1206 (norm. = 0.183071), norm. avg. (of 4) = 0.181225 fft 13: mflops = 19.3971 (norm. = 0.234848), norm. avg. (of 4) = 0.212041 fft 14: mflops = 15.2406 (norm. = 0.184524), norm. avg. (of 4) = 0.170444 fft 15: mflops = 4.57218 (norm. = 0.0553571), norm. avg. (of 4) = 0.0695267 fft 16: mflops = -1 (norm. = -0.0121074), norm. avg. (of 2) = 0.353145 Benchmarking for array size = 18: 0. Brenner: elapsed time t=1.15 s, 16384 iters, t-(init.)=1.14 s t(norm)=0.927009, mflops=5.39369 (err=3.5e-16) 1. CWP (min N): elapsed time t=1.65 s, 131072 iters, t-(init.)=1.54 s t(norm)=0.156534, mflops=31.9418 2. CWP (best N) (N=28): elapsed time t=1.94 s, 131072 iters, t-(init.)=1.79 s t(norm)=0.181946, mflops=27.4807 3. FFTPACK: elapsed time t=1.51 s, 131072 iters, t-(init.)=1.41 s t(norm)=0.143321, mflops=34.8868 (err=1.9e-16) 4. FFTPACK (f2c): elapsed time t=1.4 s, 131072 iters, t-(init.)=1.29 s t(norm)=0.131123, mflops=38.1321 (err=2.8e-16) FFTW_MEASURE plan: (cost = 7.934570e-06) FFTW_TWIDDLE 2 FFTW_NOTW 9 5. FFTW: elapsed time t=1.07 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.0985964, mflops=50.7118 (err=2.2e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.08 s, 131072 iters, t-(init.)=0.98 s t(norm)=0.0996129, mflops=50.1943 (err=2.2e-16) 7. Frigo-old: elapsed time t=1.25 s, 32768 iters, t-(init.)=1.23 s t(norm)=0.500097, mflops=9.99806 (err=3.1e-16) 8. 
GSL: elapsed time t=1.04 s, 65536 iters, t-(init.)=0.99 s t(norm)=0.201259, mflops=24.8437 (err=2.1e-16) 9. NAPACK (f2c): elapsed time t=1.06 s, 32768 iters, t-(init.)=1.03 s t(norm)=0.418781, mflops=11.9394 (err=8.4e-16) 10. Nielsen: elapsed time t=1.52 s, 32768 iters, t-(init.)=1.5 s t(norm)=0.609875, mflops=8.19841 (err=9.0e-16) 11. Singleton: elapsed time t=1.27 s, 65536 iters, t-(init.)=1.22 s t(norm)=0.248016, mflops=20.16 (err=2.4e-16) 12. Singleton (f2c): elapsed time t=1.26 s, 65536 iters, t-(init.)=1.21 s t(norm)=0.245983, mflops=20.3266 (err=2.3e-16) 13. Temperton: elapsed time t=1.31 s, 65536 iters, t-(init.)=1.26 s t(norm)=0.256147, mflops=19.52 (err=1.0e-08) 14. Temperton (f2c): elapsed time t=1.66 s, 65536 iters, t-(init.)=1.61 s t(norm)=0.327299, mflops=15.2765 (err=2.6e-16) 15. Valkenburg: elapsed time t=1.32 s, 16384 iters, t-(init.)=1.31 s t(norm)=1.06525, mflops=4.69374 (err=4.2e-16) 16. ESSL: elapsed time t=1.22 s, 131072 iters, t-(init.)=1.12 s t(norm)=0.113843, mflops=43.92 (err=2.0e-16) Top mflops for N=18 = 50.7118 Normalized results and averages for N=18: fft 0: mflops = 5.39369 (norm. = 0.10636), norm. avg. (of 3) = 0.0816811 fft 1: mflops = 31.9418 (norm. = 0.62987), norm. avg. (of 5) = 0.356538 fft 2: mflops = 27.4807 (norm. = 0.541899), norm. avg. (of 5) = 0.313609 fft 3: mflops = 34.8868 (norm. = 0.687943), norm. avg. (of 5) = 0.505267 fft 4: mflops = 38.1321 (norm. = 0.751938), norm. avg. (of 5) = 0.518782 fft 5: mflops = 50.7118 (norm. = 1), norm. avg. (of 5) = 0.998485 fft 6: mflops = 50.1943 (norm. = 0.989796), norm. avg. (of 5) = 0.997959 fft 7: mflops = 9.99806 (norm. = 0.197154), norm. avg. (of 5) = 0.173651 fft 8: mflops = 24.8437 (norm. = 0.489899), norm. avg. (of 5) = 0.306791 fft 9: mflops = 11.9394 (norm. = 0.235437), norm. avg. (of 5) = 0.143243 fft 10: mflops = 8.19841 (norm. = 0.161667), norm. avg. (of 5) = 0.113872 fft 11: mflops = 20.16 (norm. = 0.397541), norm. avg. (of 5) = 0.221886 fft 12: mflops = 20.3266 (norm. 
= 0.400826), norm. avg. (of 5) = 0.225146 fft 13: mflops = 19.52 (norm. = 0.384921), norm. avg. (of 5) = 0.246617 fft 14: mflops = 15.2765 (norm. = 0.301242), norm. avg. (of 5) = 0.196604 fft 15: mflops = 4.69374 (norm. = 0.0925573), norm. avg. (of 5) = 0.0741328 fft 16: mflops = 43.92 (norm. = 0.866071), norm. avg. (of 3) = 0.52412 Benchmarking for array size = 24: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.65 s, 131072 iters, t-(init.)=1.53 s t(norm)=0.10608, mflops=47.1341 2. CWP (best N) (N=28): elapsed time t=1.95 s, 131072 iters, t-(init.)=1.8 s t(norm)=0.1248, mflops=40.064 3. FFTPACK: elapsed time t=1.73 s, 131072 iters, t-(init.)=1.6 s t(norm)=0.110934, mflops=45.072 (err=2.4e-16) 4. FFTPACK (f2c): elapsed time t=1.66 s, 131072 iters, t-(init.)=1.53 s t(norm)=0.10608, mflops=47.1341 (err=2.4e-16) FFTW_MEASURE plan: (cost = 9.155273e-06) FFTW_TWIDDLE 2 FFTW_NOTW 12 5. FFTW: elapsed time t=1.22 s, 131072 iters, t-(init.)=1.09 s t(norm)=0.0755735, mflops=66.1608 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.21 s, 131072 iters, t-(init.)=1.08 s t(norm)=0.0748802, mflops=66.7734 (err=1.8e-16) 7. Frigo-old: elapsed time t=1 s, 32768 iters, t-(init.)=0.97 s t(norm)=0.269014, mflops=18.5864 (err=2.5e-16) 8. GSL: elapsed time t=1.22 s, 65536 iters, t-(init.)=1.16 s t(norm)=0.160854, mflops=31.0841 (err=2.1e-16) 9. NAPACK (f2c): elapsed time t=1.48 s, 32768 iters, t-(init.)=1.45 s t(norm)=0.402134, mflops=12.4337 (err=7.1e-16) 10. Nielsen: elapsed time t=1.38 s, 32768 iters, t-(init.)=1.35 s t(norm)=0.374401, mflops=13.3547 (err=1.0e-15) 11. Singleton: elapsed time t=1.07 s, 32768 iters, t-(init.)=1.04 s t(norm)=0.288427, mflops=17.3354 (err=2.0e-16) 12. Singleton (f2c): elapsed time t=1.02 s, 32768 iters, t-(init.)=0.99 s t(norm)=0.274561, mflops=18.2109 (err=2.1e-16) 13. 
Temperton: elapsed time t=1.58 s, 65536 iters, t-(init.)=1.52 s t(norm)=0.210774, mflops=23.7221 (err=4.4e-09) 14. Temperton (f2c): elapsed time t=1.8 s, 65536 iters, t-(init.)=1.73 s t(norm)=0.239894, mflops=20.8426 (err=1.5e-16) 15. Valkenburg: elapsed time t=1.01 s, 8192 iters, t-(init.)=1 s t(norm)=1.10934, mflops=4.5072 (err=3.2e-16) 16. ESSL: elapsed time t=1.47 s, 131072 iters, t-(init.)=1.35 s t(norm)=0.0936002, mflops=53.4187 (err=1.4e-16) Top mflops for N=24 = 66.7734 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.014976), norm. avg. (of 3) = 0.0816811 fft 1: mflops = 47.1341 (norm. = 0.705882), norm. avg. (of 6) = 0.414762 fft 2: mflops = 40.064 (norm. = 0.6), norm. avg. (of 6) = 0.361341 fft 3: mflops = 45.072 (norm. = 0.675), norm. avg. (of 6) = 0.533556 fft 4: mflops = 47.1341 (norm. = 0.705882), norm. avg. (of 6) = 0.549965 fft 5: mflops = 66.1608 (norm. = 0.990826), norm. avg. (of 6) = 0.997208 fft 6: mflops = 66.7734 (norm. = 1), norm. avg. (of 6) = 0.998299 fft 7: mflops = 18.5864 (norm. = 0.278351), norm. avg. (of 6) = 0.191101 fft 8: mflops = 31.0841 (norm. = 0.465517), norm. avg. (of 6) = 0.333245 fft 9: mflops = 12.4337 (norm. = 0.186207), norm. avg. (of 6) = 0.150404 fft 10: mflops = 13.3547 (norm. = 0.2), norm. avg. (of 6) = 0.128227 fft 11: mflops = 17.3354 (norm. = 0.259615), norm. avg. (of 6) = 0.228174 fft 12: mflops = 18.2109 (norm. = 0.272727), norm. avg. (of 6) = 0.233076 fft 13: mflops = 23.7221 (norm. = 0.355263), norm. avg. (of 6) = 0.264724 fft 14: mflops = 20.8426 (norm. = 0.312139), norm. avg. (of 6) = 0.21586 fft 15: mflops = 4.5072 (norm. = 0.0675), norm. avg. (of 6) = 0.0730273 fft 16: mflops = 53.4187 (norm. = 0.8), norm. avg. (of 4) = 0.59309 Benchmarking for array size = 36: 0. Brenner: elapsed time t=1.08 s, 8192 iters, t-(init.)=1.07 s t(norm)=0.70179, mflops=7.12464 (err=4.4e-16) 1. CWP (min N): elapsed time t=1.17 s, 65536 iters, t-(init.)=1.08 s t(norm)=0.0885436, mflops=56.4694 2. 
CWP (best N): elapsed time t=1.17 s, 65536 iters, t-(init.)=1.07 s t(norm)=0.0877237, mflops=56.9971 3. FFTPACK: elapsed time t=1.18 s, 65536 iters, t-(init.)=1.09 s t(norm)=0.0893634, mflops=55.9513 (err=2.4e-16) 4. FFTPACK (f2c): elapsed time t=1.17 s, 65536 iters, t-(init.)=1.08 s t(norm)=0.0885436, mflops=56.4694 (err=2.9e-16) FFTW_MEASURE plan: (cost = 1.342773e-05) FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.86 s, 131072 iters, t-(init.)=1.68 s t(norm)=0.0688672, mflops=72.6035 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.86 s, 131072 iters, t-(init.)=1.68 s t(norm)=0.0688672, mflops=72.6035 (err=1.8e-16) 7. Frigo-old: elapsed time t=1.23 s, 16384 iters, t-(init.)=1.21 s t(norm)=0.396806, mflops=12.6006 (err=3.1e-16) 8. GSL: elapsed time t=1.87 s, 65536 iters, t-(init.)=1.78 s t(norm)=0.145933, mflops=34.2623 (err=2.1e-16) 9. NAPACK (f2c): elapsed time t=1.04 s, 16384 iters, t-(init.)=1.01 s t(norm)=0.331219, mflops=15.0958 (err=6.8e-16) 10. Nielsen: elapsed time t=1.1 s, 16384 iters, t-(init.)=1.08 s t(norm)=0.354174, mflops=14.1173 (err=7.8e-16) 11. Singleton: elapsed time t=1.01 s, 32768 iters, t-(init.)=0.96 s t(norm)=0.157411, mflops=31.764 (err=1.9e-16) 12. Singleton (f2c): elapsed time t=1.03 s, 32768 iters, t-(init.)=0.99 s t(norm)=0.16233, mflops=30.8015 (err=2.1e-16) 13. Temperton: elapsed time t=1 s, 32768 iters, t-(init.)=0.96 s t(norm)=0.157411, mflops=31.764 (err=9.8e-09) 14. Temperton (f2c): elapsed time t=1.16 s, 32768 iters, t-(init.)=1.11 s t(norm)=0.182006, mflops=27.4716 (err=2.3e-16) 15. Valkenburg: elapsed time t=1.61 s, 8192 iters, t-(init.)=1.6 s t(norm)=1.04941, mflops=4.7646 (err=4.9e-16) 16. ESSL: elapsed time t=1.11 s, 65536 iters, t-(init.)=1.02 s t(norm)=0.0836245, mflops=59.7911 (err=1.9e-16) Top mflops for N=36 = 72.6035 Normalized results and averages for N=36: fft 0: mflops = 7.12464 (norm. = 0.0981308), norm. avg. 
(of 4) = 0.0857936 fft 1: mflops = 56.4694 (norm. = 0.777778), norm. avg. (of 7) = 0.466622 fft 2: mflops = 56.9971 (norm. = 0.785047), norm. avg. (of 7) = 0.421871 fft 3: mflops = 55.9513 (norm. = 0.770642), norm. avg. (of 7) = 0.567426 fft 4: mflops = 56.4694 (norm. = 0.777778), norm. avg. (of 7) = 0.58251 fft 5: mflops = 72.6035 (norm. = 1), norm. avg. (of 7) = 0.997607 fft 6: mflops = 72.6035 (norm. = 1), norm. avg. (of 7) = 0.998542 fft 7: mflops = 12.6006 (norm. = 0.173554), norm. avg. (of 7) = 0.188594 fft 8: mflops = 34.2623 (norm. = 0.47191), norm. avg. (of 7) = 0.353054 fft 9: mflops = 15.0958 (norm. = 0.207921), norm. avg. (of 7) = 0.15862 fft 10: mflops = 14.1173 (norm. = 0.194444), norm. avg. (of 7) = 0.137687 fft 11: mflops = 31.764 (norm. = 0.4375), norm. avg. (of 7) = 0.258078 fft 12: mflops = 30.8015 (norm. = 0.424242), norm. avg. (of 7) = 0.260385 fft 13: mflops = 31.764 (norm. = 0.4375), norm. avg. (of 7) = 0.289407 fft 14: mflops = 27.4716 (norm. = 0.378378), norm. avg. (of 7) = 0.239076 fft 15: mflops = 4.7646 (norm. = 0.065625), norm. avg. (of 7) = 0.0719698 fft 16: mflops = 59.7911 (norm. = 0.823529), norm. avg. (of 5) = 0.639178 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.7 s, 8192 iters, t-(init.)=1.67 s t(norm)=0.403076, mflops=12.4046 (err=1.0e-15) 1. CWP (min N): elapsed time t=1.13 s, 32768 iters, t-(init.)=1.04 s t(norm)=0.0627544, mflops=79.6757 2. CWP (best N) (N=84): elapsed time t=1.08 s, 32768 iters, t-(init.)=0.98 s t(norm)=0.0591339, mflops=84.5539 3. FFTPACK: elapsed time t=1.12 s, 32768 iters, t-(init.)=1.03 s t(norm)=0.0621509, mflops=80.4493 (err=7.2e-16) 4. FFTPACK (f2c): elapsed time t=1.27 s, 32768 iters, t-(init.)=1.18 s t(norm)=0.0712021, mflops=70.2227 (err=8.1e-16) FFTW_MEASURE plan: (cost = 3.173828e-05) FFTW_TWIDDLE 5 FFTW_NOTW 16 5. 
FFTW: elapsed time t=1.04 s, 32768 iters, t-(init.)=0.94 s t(norm)=0.0567203, mflops=88.1519 (err=7.2e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 6. FFTW_ESTIMATE: elapsed time t=1.04 s, 32768 iters, t-(init.)=0.95 s t(norm)=0.0573237, mflops=87.224 (err=7.2e-16) 7. Frigo-old: elapsed time t=1.61 s, 16384 iters, t-(init.)=1.56 s t(norm)=0.188263, mflops=26.5586 (err=6.4e-16) 8. GSL: elapsed time t=1.32 s, 16384 iters, t-(init.)=1.27 s t(norm)=0.153265, mflops=32.6231 (err=6.7e-16) 9. NAPACK (f2c): elapsed time t=1.99 s, 8192 iters, t-(init.)=1.97 s t(norm)=0.475485, mflops=10.5156 (err=9.2e-16) 10. Nielsen: elapsed time t=1.8 s, 16384 iters, t-(init.)=1.75 s t(norm)=0.211193, mflops=23.6751 (err=5.7e-15) 11. Singleton: elapsed time t=1.93 s, 32768 iters, t-(init.)=1.84 s t(norm)=0.111027, mflops=45.0341 (err=8.7e-16) 12. Singleton (f2c): elapsed time t=1.86 s, 32768 iters, t-(init.)=1.77 s t(norm)=0.106803, mflops=46.8151 (err=9.1e-16) 13. Temperton: elapsed time t=1.27 s, 16384 iters, t-(init.)=1.22 s t(norm)=0.147231, mflops=33.9602 (err=7.3e-09) 14. Temperton (f2c): elapsed time t=1.66 s, 16384 iters, t-(init.)=1.62 s t(norm)=0.195504, mflops=25.5749 (err=7.0e-16) 15. Valkenburg: elapsed time t=1.12 s, 2048 iters, t-(init.)=1.11 s t(norm)=1.07165, mflops=4.6657 (err=7.9e-16) 16. ESSL: elapsed time t=1.83 s, 65536 iters, t-(init.)=1.64 s t(norm)=0.0494794, mflops=101.052 (err=6.2e-16) Top mflops for N=80 = 101.052 Normalized results and averages for N=80: fft 0: mflops = 12.4046 (norm. = 0.122754), norm. avg. (of 5) = 0.0931857 fft 1: mflops = 79.6757 (norm. = 0.788462), norm. avg. (of 8) = 0.506852 fft 2: mflops = 84.5539 (norm. = 0.836735), norm. avg. (of 8) = 0.473729 fft 3: mflops = 80.4493 (norm. = 0.796117), norm. avg. (of 8) = 0.596012 fft 4: mflops = 70.2227 (norm. = 0.694915), norm. avg. (of 8) = 0.59656 fft 5: mflops = 88.1519 (norm. = 0.87234), norm. avg. (of 8) = 0.981949 fft 6: mflops = 87.224 (norm. = 0.863158), norm. 
avg. (of 8) = 0.981619 fft 7: mflops = 26.5586 (norm. = 0.262821), norm. avg. (of 8) = 0.197873 fft 8: mflops = 32.6231 (norm. = 0.322835), norm. avg. (of 8) = 0.349277 fft 9: mflops = 10.5156 (norm. = 0.104061), norm. avg. (of 8) = 0.1518 fft 10: mflops = 23.6751 (norm. = 0.234286), norm. avg. (of 8) = 0.149762 fft 11: mflops = 45.0341 (norm. = 0.445652), norm. avg. (of 8) = 0.281525 fft 12: mflops = 46.8151 (norm. = 0.463277), norm. avg. (of 8) = 0.285747 fft 13: mflops = 33.9602 (norm. = 0.336066), norm. avg. (of 8) = 0.295239 fft 14: mflops = 25.5749 (norm. = 0.253086), norm. avg. (of 8) = 0.240828 fft 15: mflops = 4.6657 (norm. = 0.0461712), norm. avg. (of 8) = 0.068745 fft 16: mflops = 101.052 (norm. = 1), norm. avg. (of 6) = 0.699315 Benchmarking for array size = 108: 0. Brenner: elapsed time t=1.82 s, 4096 iters, t-(init.)=1.81 s t(norm)=0.605727, mflops=8.25455 (err=5.8e-16) 1. CWP (min N) (N=110): elapsed time t=1.96 s, 32768 iters, t-(init.)=1.83 s t(norm)=0.0765525, mflops=65.3147 2. CWP (best N) (N=112): elapsed time t=1.48 s, 32768 iters, t-(init.)=1.37 s t(norm)=0.0573098, mflops=87.2451 3. FFTPACK: elapsed time t=1.56 s, 32768 iters, t-(init.)=1.43 s t(norm)=0.0598197, mflops=83.5845 (err=4.2e-16) 4. FFTPACK (f2c): elapsed time t=1.73 s, 32768 iters, t-(init.)=1.6 s t(norm)=0.0669311, mflops=74.7037 (err=4.6e-16) FFTW_MEASURE plan: (cost = 4.882813e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 5. FFTW: elapsed time t=1.57 s, 32768 iters, t-(init.)=1.45 s t(norm)=0.0606563, mflops=82.4316 (err=3.7e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.57 s, 32768 iters, t-(init.)=1.45 s t(norm)=0.0606563, mflops=82.4316 (err=3.7e-16) 7. Frigo-old: elapsed time t=1.26 s, 4096 iters, t-(init.)=1.24 s t(norm)=0.414973, mflops=12.049 (err=4.5e-16) 8. GSL: elapsed time t=1.65 s, 16384 iters, t-(init.)=1.59 s t(norm)=0.133026, mflops=37.5867 (err=4.1e-16) 9. 
NAPACK (f2c): elapsed time t=1.54 s, 8192 iters, t-(init.)=1.51 s t(norm)=0.252665, mflops=19.789 (err=2.8e-15) 10. Nielsen: elapsed time t=1.65 s, 8192 iters, t-(init.)=1.62 s t(norm)=0.271071, mflops=18.4453 (err=1.2e-15) 11. Singleton: elapsed time t=1.67 s, 16384 iters, t-(init.)=1.61 s t(norm)=0.134699, mflops=37.1198 (err=5.3e-16) 12. Singleton (f2c): elapsed time t=1.64 s, 16384 iters, t-(init.)=1.58 s t(norm)=0.132189, mflops=37.8246 (err=5.1e-16) 13. Temperton: elapsed time t=1.46 s, 16384 iters, t-(init.)=1.4 s t(norm)=0.117129, mflops=42.6878 (err=1.5e-08) 14. Temperton (f2c): elapsed time t=1.72 s, 16384 iters, t-(init.)=1.66 s t(norm)=0.138882, mflops=36.0018 (err=3.9e-16) 15. Valkenburg: elapsed time t=1.52 s, 2048 iters, t-(init.)=1.52 s t(norm)=1.01735, mflops=4.91471 (err=6.2e-16) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=108 = 87.2451 Normalized results and averages for N=108: fft 0: mflops = 8.25455 (norm. = 0.0946133), norm. avg. (of 6) = 0.0934237 fft 1: mflops = 65.3147 (norm. = 0.748634), norm. avg. (of 9) = 0.533716 fft 2: mflops = 87.2451 (norm. = 1), norm. avg. (of 9) = 0.532203 fft 3: mflops = 83.5845 (norm. = 0.958042), norm. avg. (of 9) = 0.636238 fft 4: mflops = 74.7037 (norm. = 0.85625), norm. avg. (of 9) = 0.625415 fft 5: mflops = 82.4316 (norm. = 0.944828), norm. avg. (of 9) = 0.977824 fft 6: mflops = 82.4316 (norm. = 0.944828), norm. avg. (of 9) = 0.977531 fft 7: mflops = 12.049 (norm. = 0.138105), norm. avg. (of 9) = 0.191232 fft 8: mflops = 37.5867 (norm. = 0.430818), norm. avg. (of 9) = 0.358337 fft 9: mflops = 19.789 (norm. = 0.226821), norm. avg. (of 9) = 0.160136 fft 10: mflops = 18.4453 (norm. = 0.21142), norm. avg. (of 9) = 0.156612 fft 11: mflops = 37.1198 (norm. = 0.425466), norm. avg. (of 9) = 0.297518 fft 12: mflops = 37.8246 (norm. = 0.433544), norm. avg. (of 9) = 0.302169 fft 13: mflops = 42.6878 (norm. = 0.489286), norm. avg. 
(of 9) = 0.3168 fft 14: mflops = 36.0018 (norm. = 0.412651), norm. avg. (of 9) = 0.259919 fft 15: mflops = 4.91471 (norm. = 0.0563322), norm. avg. (of 9) = 0.0673658 fft 16: mflops = -1 (norm. = -0.011462), norm. avg. (of 6) = 0.699315 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1.49 s, 2048 iters, t-(init.)=1.47 s t(norm)=0.443072, mflops=11.2848 (err=9.0e-16) 1. CWP (min N): elapsed time t=1.4 s, 16384 iters, t-(init.)=1.28 s t(norm)=0.0482256, mflops=103.679 2. CWP (best N): elapsed time t=1.4 s, 16384 iters, t-(init.)=1.28 s t(norm)=0.0482256, mflops=103.679 3. FFTPACK: elapsed time t=1.1 s, 8192 iters, t-(init.)=1.04 s t(norm)=0.0783665, mflops=63.8027 (err=4.9e-16) 4. FFTPACK (f2c): elapsed time t=1.22 s, 8192 iters, t-(init.)=1.16 s t(norm)=0.0874088, mflops=57.2025 (err=5.9e-16) FFTW_MEASURE plan: (cost = 1.074219e-04) FFTW_TWIDDLE 2 FFTW_TWIDDLE 7 FFTW_NOTW 15 5. FFTW: elapsed time t=1.79 s, 16384 iters, t-(init.)=1.67 s t(norm)=0.0629193, mflops=79.4669 (err=5.5e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.87 s, 16384 iters, t-(init.)=1.75 s t(norm)=0.0659334, mflops=75.8341 (err=5.3e-16) 7. Frigo-old: elapsed time t=1.17 s, 2048 iters, t-(init.)=1.16 s t(norm)=0.349635, mflops=14.3006 (err=5.5e-16) 8. GSL: elapsed time t=1.85 s, 8192 iters, t-(init.)=1.79 s t(norm)=0.134881, mflops=37.0698 (err=6.4e-16) 9. NAPACK (f2c): elapsed time t=1.77 s, 2048 iters, t-(init.)=1.75 s t(norm)=0.527467, mflops=9.47926 (err=1.4e-14) 10. Nielsen: elapsed time t=1.56 s, 4096 iters, t-(init.)=1.53 s t(norm)=0.230578, mflops=21.6846 (err=7.3e-15) 11. Singleton: elapsed time t=1.3 s, 4096 iters, t-(init.)=1.28 s t(norm)=0.192902, mflops=25.9199 (err=6.1e-16) 12. Singleton (f2c): elapsed time t=1.22 s, 4096 iters, t-(init.)=1.19 s t(norm)=0.179339, mflops=27.8802 (err=6.0e-16) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.89 s, 1024 iters, t-(init.)=1.88 s t(norm)=1.1333, mflops=4.41189 (err=7.4e-16) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=210 = 103.679 Normalized results and averages for N=210: fft 0: mflops = 11.2848 (norm. = 0.108844), norm. avg. (of 7) = 0.0956265 fft 1: mflops = 103.679 (norm. = 1), norm. avg. (of 10) = 0.580345 fft 2: mflops = 103.679 (norm. = 1), norm. avg. (of 10) = 0.578983 fft 3: mflops = 63.8027 (norm. = 0.615385), norm. avg. (of 10) = 0.634152 fft 4: mflops = 57.2025 (norm. = 0.551724), norm. avg. (of 10) = 0.618046 fft 5: mflops = 79.4669 (norm. = 0.766467), norm. avg. (of 10) = 0.956689 fft 6: mflops = 75.8341 (norm. = 0.731429), norm. avg. (of 10) = 0.952921 fft 7: mflops = 14.3006 (norm. = 0.137931), norm. avg. (of 10) = 0.185902 fft 8: mflops = 37.0698 (norm. = 0.357542), norm. avg. (of 10) = 0.358257 fft 9: mflops = 9.47926 (norm. = 0.0914286), norm. avg. (of 10) = 0.153265 fft 10: mflops = 21.6846 (norm. = 0.20915), norm. avg. (of 10) = 0.161866 fft 11: mflops = 25.9199 (norm. = 0.25), norm. avg. (of 10) = 0.292766 fft 12: mflops = 27.8802 (norm. = 0.268908), norm. avg. (of 10) = 0.298843 fft 13: mflops = -1 (norm. = -0.00964511), norm. avg. (of 9) = 0.3168 fft 14: mflops = -1 (norm. = -0.00964511), norm. avg. (of 9) = 0.259919 fft 15: mflops = 4.41189 (norm. = 0.0425532), norm. avg. (of 10) = 0.0648846 fft 16: mflops = -1 (norm. = -0.00964511), norm. avg. (of 6) = 0.699315 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.88 s, 1024 iters, t-(init.)=1.87 s t(norm)=0.403614, mflops=12.3881 (err=1.1e-15) 1. CWP (min N): elapsed time t=1.54 s, 8192 iters, t-(init.)=1.4 s t(norm)=0.0377714, mflops=132.375 2. CWP (best N): elapsed time t=1.55 s, 8192 iters, t-(init.)=1.41 s t(norm)=0.0380412, mflops=131.437 3. 
FFTPACK: elapsed time t=1.49 s, 4096 iters, t-(init.)=1.42 s t(norm)=0.0766219, mflops=65.2555 (err=9.5e-16) 4. FFTPACK (f2c): elapsed time t=1.64 s, 4096 iters, t-(init.)=1.57 s t(norm)=0.0847158, mflops=59.0209 (err=1.0e-15) FFTW_MEASURE plan: (cost = 2.832031e-04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 6 FFTW_NOTW 12 5. FFTW: elapsed time t=1.19 s, 4096 iters, t-(init.)=1.12 s t(norm)=0.0604342, mflops=82.7346 (err=9.6e-16) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.21 s, 4096 iters, t-(init.)=1.14 s t(norm)=0.0615134, mflops=81.2831 (err=9.0e-16) 7. Frigo-old: elapsed time t=1.43 s, 1024 iters, t-(init.)=1.41 s t(norm)=0.304329, mflops=16.4296 (err=1.0e-15) 8. GSL: elapsed time t=1.05 s, 2048 iters, t-(init.)=1.01 s t(norm)=0.108997, mflops=45.8727 (err=1.0e-15) 9. NAPACK (f2c): elapsed time t=1.82 s, 1024 iters, t-(init.)=1.8 s t(norm)=0.388506, mflops=12.8698 (err=3.8e-14) 10. Nielsen: elapsed time t=1.08 s, 1024 iters, t-(init.)=1.06 s t(norm)=0.228787, mflops=21.8544 (err=5.5e-15) 11. Singleton: elapsed time t=1.31 s, 2048 iters, t-(init.)=1.28 s t(norm)=0.138135, mflops=36.1964 (err=1.2e-15) 12. Singleton (f2c): elapsed time t=1.31 s, 2048 iters, t-(init.)=1.28 s t(norm)=0.138135, mflops=36.1964 (err=1.3e-15) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.25 s, 256 iters, t-(init.)=1.25 s t(norm)=1.07918, mflops=4.63314 (err=1.1e-15) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=504 = 132.375 Normalized results and averages for N=504: fft 0: mflops = 12.3881 (norm. = 0.0935829), norm. avg. (of 8) = 0.095371 fft 1: mflops = 132.375 (norm. = 1), norm. avg. (of 11) = 0.618495 fft 2: mflops = 131.437 (norm. = 0.992908), norm. avg. (of 11) = 0.616612 fft 3: mflops = 65.2555 (norm. = 0.492958), norm. avg. 
(of 11) = 0.621316 fft 4: mflops = 59.0209 (norm. = 0.44586), norm. avg. (of 11) = 0.602392 fft 5: mflops = 82.7346 (norm. = 0.625), norm. avg. (of 11) = 0.926535 fft 6: mflops = 81.2831 (norm. = 0.614035), norm. avg. (of 11) = 0.922113 fft 7: mflops = 16.4296 (norm. = 0.124113), norm. avg. (of 11) = 0.180285 fft 8: mflops = 45.8727 (norm. = 0.346535), norm. avg. (of 11) = 0.357192 fft 9: mflops = 12.8698 (norm. = 0.0972222), norm. avg. (of 11) = 0.14817 fft 10: mflops = 21.8544 (norm. = 0.165094), norm. avg. (of 11) = 0.16216 fft 11: mflops = 36.1964 (norm. = 0.273437), norm. avg. (of 11) = 0.291009 fft 12: mflops = 36.1964 (norm. = 0.273437), norm. avg. (of 11) = 0.296533 fft 13: mflops = -1 (norm. = -0.00755427), norm. avg. (of 9) = 0.3168 fft 14: mflops = -1 (norm. = -0.00755427), norm. avg. (of 9) = 0.259919 fft 15: mflops = 4.63314 (norm. = 0.035), norm. avg. (of 11) = 0.0621678 fft 16: mflops = -1 (norm. = -0.00755427), norm. avg. (of 6) = 0.699315 Benchmarking for array size = 1000: 0. Brenner: elapsed time t=1.92 s, 512 iters, t-(init.)=1.9 s t(norm)=0.372368, mflops=13.4276 (err=3.2e-15) 1. CWP (min N) (N=1001): elapsed time t=1.49 s, 2048 iters, t-(init.)=1.42 s t(norm)=0.069574, mflops=71.8659 2. CWP (best N) (N=1008): elapsed time t=1.85 s, 4096 iters, t-(init.)=1.71 s t(norm)=0.0418914, mflops=119.356 3. FFTPACK: elapsed time t=1.18 s, 2048 iters, t-(init.)=1.11 s t(norm)=0.0543853, mflops=91.9366 (err=3.1e-15) 4. FFTPACK (f2c): elapsed time t=1.34 s, 2048 iters, t-(init.)=1.27 s t(norm)=0.0622246, mflops=80.354 (err=3.1e-15) FFTW_MEASURE plan: (cost = 6.445313e-04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 5. FFTW: elapsed time t=1.3 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.0602648, mflops=82.9672 (err=3.0e-15) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.31 s, 2048 iters, t-(init.)=1.24 s t(norm)=0.0607548, mflops=82.2981 (err=3.0e-15) 7. 
Frigo-old: elapsed time t=1.38 s, 512 iters, t-(init.)=1.37 s t(norm)=0.268497, mflops=18.6222 (err=3.1e-15) 8. GSL: elapsed time t=1.57 s, 1024 iters, t-(init.)=1.54 s t(norm)=0.150907, mflops=33.133 (err=3.0e-15) 9. NAPACK (f2c): elapsed time t=1.32 s, 256 iters, t-(init.)=1.31 s t(norm)=0.513476, mflops=9.73756 (err=1.7e-14) 10. Nielsen: elapsed time t=1.67 s, 1024 iters, t-(init.)=1.64 s t(norm)=0.160706, mflops=31.1127 (err=1.3e-14) 11. Singleton: elapsed time t=1.05 s, 1024 iters, t-(init.)=1.02 s t(norm)=0.0999514, mflops=50.0243 (err=3.2e-15) 12. Singleton (f2c): elapsed time t=1.93 s, 2048 iters, t-(init.)=1.86 s t(norm)=0.0911321, mflops=54.8654 (err=4.9e-15) 13. Temperton: elapsed time t=1.44 s, 1024 iters, t-(init.)=1.41 s t(norm)=0.138168, mflops=36.1878 (err=2.4e-08) 14. Temperton (f2c): elapsed time t=1.71 s, 1024 iters, t-(init.)=1.68 s t(norm)=0.164626, mflops=30.3719 (err=3.0e-15) 15. Valkenburg: elapsed time t=1.37 s, 128 iters, t-(init.)=1.37 s t(norm)=1.07399, mflops=4.65555 (err=3.0e-15) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=1000 = 119.356 Normalized results and averages for N=1000: fft 0: mflops = 13.4276 (norm. = 0.1125), norm. avg. (of 9) = 0.0972743 fft 1: mflops = 71.8659 (norm. = 0.602113), norm. avg. (of 12) = 0.61713 fft 2: mflops = 119.356 (norm. = 1), norm. avg. (of 12) = 0.648561 fft 3: mflops = 91.9366 (norm. = 0.77027), norm. avg. (of 12) = 0.633729 fft 4: mflops = 80.354 (norm. = 0.673228), norm. avg. (of 12) = 0.608295 fft 5: mflops = 82.9672 (norm. = 0.695122), norm. avg. (of 12) = 0.907251 fft 6: mflops = 82.2981 (norm. = 0.689516), norm. avg. (of 12) = 0.90273 fft 7: mflops = 18.6222 (norm. = 0.156022), norm. avg. (of 12) = 0.178263 fft 8: mflops = 33.133 (norm. = 0.277597), norm. avg. (of 12) = 0.350559 fft 9: mflops = 9.73756 (norm. = 0.081584), norm. avg. (of 12) = 0.142622 fft 10: mflops = 31.1127 (norm. = 0.260671), norm. avg. 
(of 12) = 0.170369 fft 11: mflops = 50.0243 (norm. = 0.419118), norm. avg. (of 12) = 0.301685 fft 12: mflops = 54.8654 (norm. = 0.459677), norm. avg. (of 12) = 0.310128 fft 13: mflops = 36.1878 (norm. = 0.303191), norm. avg. (of 10) = 0.315439 fft 14: mflops = 30.3719 (norm. = 0.254464), norm. avg. (of 10) = 0.259374 fft 15: mflops = 4.65555 (norm. = 0.0390055), norm. avg. (of 12) = 0.0602376 fft 16: mflops = -1 (norm. = -0.00837828), norm. avg. (of 6) = 0.699315 Benchmarking for array size = 1960: 0. Brenner: elapsed time t=1.87 s, 256 iters, t-(init.)=1.85 s t(norm)=0.337126, mflops=14.8313 (err=1.7e-15) 1. CWP (min N) (N=1980): elapsed time t=1.22 s, 1024 iters, t-(init.)=1.15 s t(norm)=0.0523912, mflops=95.436 2. CWP (best N) (N=1980): elapsed time t=1.22 s, 1024 iters, t-(init.)=1.15 s t(norm)=0.0523912, mflops=95.436 3. FFTPACK: elapsed time t=1.07 s, 512 iters, t-(init.)=1.04 s t(norm)=0.0947597, mflops=52.7651 (err=1.4e-15) 4. FFTPACK (f2c): elapsed time t=1.23 s, 512 iters, t-(init.)=1.19 s t(norm)=0.108427, mflops=46.114 (err=1.5e-15) FFTW_MEASURE plan: (cost = 1.718750e-03) FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 7 FFTW_NOTW 14 5. FFTW: elapsed time t=1.91 s, 1024 iters, t-(init.)=1.84 s t(norm)=0.0838258, mflops=59.6475 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.97 s, 1024 iters, t-(init.)=1.91 s t(norm)=0.0870149, mflops=57.4614 (err=1.4e-15) 7. Frigo-old: elapsed time t=1.57 s, 256 iters, t-(init.)=1.55 s t(norm)=0.282457, mflops=17.7018 (err=1.5e-15) 8. GSL: elapsed time t=1.57 s, 512 iters, t-(init.)=1.53 s t(norm)=0.139406, mflops=35.8665 (err=1.5e-15) 9. NAPACK (f2c): elapsed time t=1.56 s, 128 iters, t-(init.)=1.56 s t(norm)=0.568558, mflops=8.79418 (err=1.4e-13) 10. Nielsen: elapsed time t=1.15 s, 256 iters, t-(init.)=1.14 s t(norm)=0.207742, mflops=24.0683 (err=1.7e-14) 11. 
Singleton: elapsed time t=1.4 s, 512 iters, t-(init.)=1.36 s t(norm)=0.123916, mflops=40.3498 (err=3.9e-15) 12. Singleton (f2c): elapsed time t=1.43 s, 512 iters, t-(init.)=1.39 s t(norm)=0.12665, mflops=39.4789 (err=2.1e-15) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.68 s, 64 iters, t-(init.)=1.68 s t(norm)=1.22459, mflops=4.08301 (err=1.5e-15) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=1960 = 95.436 Normalized results and averages for N=1960: fft 0: mflops = 14.8313 (norm. = 0.155405), norm. avg. (of 10) = 0.103087 fft 1: mflops = 95.436 (norm. = 1), norm. avg. (of 13) = 0.646581 fft 2: mflops = 95.436 (norm. = 1), norm. avg. (of 13) = 0.675595 fft 3: mflops = 52.7651 (norm. = 0.552885), norm. avg. (of 13) = 0.62751 fft 4: mflops = 46.114 (norm. = 0.483193), norm. avg. (of 13) = 0.598672 fft 5: mflops = 59.6475 (norm. = 0.625), norm. avg. (of 13) = 0.885539 fft 6: mflops = 57.4614 (norm. = 0.602094), norm. avg. (of 13) = 0.879604 fft 7: mflops = 17.7018 (norm. = 0.185484), norm. avg. (of 13) = 0.178818 fft 8: mflops = 35.8665 (norm. = 0.375817), norm. avg. (of 13) = 0.352502 fft 9: mflops = 8.79418 (norm. = 0.0921474), norm. avg. (of 13) = 0.138739 fft 10: mflops = 24.0683 (norm. = 0.252193), norm. avg. (of 13) = 0.176663 fft 11: mflops = 40.3498 (norm. = 0.422794), norm. avg. (of 13) = 0.311001 fft 12: mflops = 39.4789 (norm. = 0.413669), norm. avg. (of 13) = 0.318093 fft 13: mflops = -1 (norm. = -0.0104782), norm. avg. (of 10) = 0.315439 fft 14: mflops = -1 (norm. = -0.0104782), norm. avg. (of 10) = 0.259374 fft 15: mflops = 4.08301 (norm. = 0.0427827), norm. avg. (of 13) = 0.0588949 fft 16: mflops = -1 (norm. = -0.0104782), norm. avg. (of 6) = 0.699315 Benchmarking for array size = 4725: 0. Brenner: elapsed time t=1.19 s, 32 iters, t-(init.)=1.18 s t(norm)=0.639372, mflops=7.82018 (err=2.6e-15) 1. 
CWP (min N) (N=5005): elapsed time t=1.24 s, 256 iters, t-(init.)=1.08 s t(norm)=0.0731484, mflops=68.3542 2. CWP (best N) (N=5040): elapsed time t=1.98 s, 512 iters, t-(init.)=1.67 s t(norm)=0.0565546, mflops=88.4102 3. FFTPACK: elapsed time t=1.33 s, 128 iters, t-(init.)=1.27 s t(norm)=0.172034, mflops=29.064 (err=2.4e-15) 4. FFTPACK (f2c): elapsed time t=1.38 s, 128 iters, t-(init.)=1.32 s t(norm)=0.178807, mflops=27.9631 (err=2.5e-15) FFTW_MEASURE plan: (cost = 5.312500e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.35 s, 256 iters, t-(init.)=1.23 s t(norm)=0.083308, mflops=60.0183 (err=2.4e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.61 s, 256 iters, t-(init.)=1.49 s t(norm)=0.100918, mflops=49.5453 (err=2.4e-15) 7. Frigo-old: elapsed time t=1.75 s, 64 iters, t-(init.)=1.72 s t(norm)=0.465983, mflops=10.73 (err=2.5e-15) 8. GSL: elapsed time t=1.3 s, 128 iters, t-(init.)=1.24 s t(norm)=0.167971, mflops=29.7671 (err=2.5e-15) 9. NAPACK (f2c): elapsed time t=1.01 s, 32 iters, t-(init.)=1 s t(norm)=0.54184, mflops=9.22781 (err=3.8e-13) 10. Nielsen: elapsed time t=1.02 s, 64 iters, t-(init.)=0.99 s t(norm)=0.268211, mflops=18.642 (err=4.7e-14) 11. Singleton: elapsed time t=1.11 s, 128 iters, t-(init.)=1.06 s t(norm)=0.143588, mflops=34.8219 (err=4.3e-15) 12. Singleton (f2c): elapsed time t=1.09 s, 128 iters, t-(init.)=1.03 s t(norm)=0.139524, mflops=35.8362 (err=3.5e-15) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.09 s, 16 iters, t-(init.)=1.08 s t(norm)=1.17038, mflops=4.27213 (err=2.4e-15) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=4725 = 88.4102 Normalized results and averages for N=4725: fft 0: mflops = 7.82018 (norm. = 0.0884534), norm. avg. 
(of 11) = 0.101757 fft 1: mflops = 68.3542 (norm. = 0.773148), norm. avg. (of 14) = 0.655622 fft 2: mflops = 88.4102 (norm. = 1), norm. avg. (of 14) = 0.698767 fft 3: mflops = 29.064 (norm. = 0.32874), norm. avg. (of 14) = 0.60617 fft 4: mflops = 27.9631 (norm. = 0.316288), norm. avg. (of 14) = 0.578502 fft 5: mflops = 60.0183 (norm. = 0.678862), norm. avg. (of 14) = 0.870776 fft 6: mflops = 49.5453 (norm. = 0.560403), norm. avg. (of 14) = 0.856804 fft 7: mflops = 10.73 (norm. = 0.121366), norm. avg. (of 14) = 0.174714 fft 8: mflops = 29.7671 (norm. = 0.336694), norm. avg. (of 14) = 0.351373 fft 9: mflops = 9.22781 (norm. = 0.104375), norm. avg. (of 14) = 0.136284 fft 10: mflops = 18.642 (norm. = 0.210859), norm. avg. (of 14) = 0.179106 fft 11: mflops = 34.8219 (norm. = 0.393868), norm. avg. (of 14) = 0.31692 fft 12: mflops = 35.8362 (norm. = 0.40534), norm. avg. (of 14) = 0.324325 fft 13: mflops = -1 (norm. = -0.0113109), norm. avg. (of 10) = 0.315439 fft 14: mflops = -1 (norm. = -0.0113109), norm. avg. (of 10) = 0.259374 fft 15: mflops = 4.27213 (norm. = 0.0483218), norm. avg. (of 14) = 0.0581397 fft 16: mflops = -1 (norm. = -0.0113109), norm. avg. (of 6) = 0.699315 Benchmarking for array size = 10368: 0. Brenner: elapsed time t=1.46 s, 16 iters, t-(init.)=1.44 s t(norm)=0.650724, mflops=7.68375 (err=4.8e-15) 1. CWP (min N) (N=10920): elapsed time t=1.65 s, 128 iters, t-(init.)=1.48 s t(norm)=0.0835999, mflops=59.8087 2. CWP (best N) (N=11088): elapsed time t=1.62 s, 128 iters, t-(init.)=1.44 s t(norm)=0.0813405, mflops=61.47 3. FFTPACK: elapsed time t=1.2 s, 64 iters, t-(init.)=1.11 s t(norm)=0.1254, mflops=39.8725 (err=4.7e-15) 4. FFTPACK (f2c): elapsed time t=1.32 s, 64 iters, t-(init.)=1.24 s t(norm)=0.140086, mflops=35.6923 (err=4.7e-15) FFTW_MEASURE plan: (cost = 1.375000e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 6 FFTW_TWIDDLE 9 FFTW_NOTW 12 5. 
FFTW: elapsed time t=1.79 s, 128 iters, t-(init.)=1.63 s t(norm)=0.0920729, mflops=54.3048 (err=4.7e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.82 s, 128 iters, t-(init.)=1.65 s t(norm)=0.0932026, mflops=53.6466 (err=4.7e-15) 7. Frigo-old: elapsed time t=1.68 s, 32 iters, t-(init.)=1.63 s t(norm)=0.368291, mflops=13.5762 (err=4.8e-15) 8. GSL: elapsed time t=1.47 s, 64 iters, t-(init.)=1.39 s t(norm)=0.157032, mflops=31.8406 (err=4.7e-15) 9. NAPACK (f2c): elapsed time t=1.51 s, 32 iters, t-(init.)=1.47 s t(norm)=0.33214, mflops=15.0539 (err=8.0e-14) 10. Nielsen: elapsed time t=1.48 s, 32 iters, t-(init.)=1.44 s t(norm)=0.325362, mflops=15.3675 (err=1.1e-14) 11. Singleton: elapsed time t=1.05 s, 32 iters, t-(init.)=1.01 s t(norm)=0.228205, mflops=21.9101 (err=7.1e-15) 12. Singleton (f2c): elapsed time t=1.07 s, 32 iters, t-(init.)=1.03 s t(norm)=0.232724, mflops=21.4847 (err=6.7e-15) 13. Temperton: elapsed time t=1.59 s, 32 iters, t-(init.)=1.55 s t(norm)=0.350216, mflops=14.2769 (err=2.1e-08) 14. Temperton (f2c): elapsed time t=1.6 s, 32 iters, t-(init.)=1.56 s t(norm)=0.352475, mflops=14.1854 (err=4.7e-15) 15. Valkenburg: elapsed time t=1.41 s, 8 iters, t-(init.)=1.4 s t(norm)=1.2653, mflops=3.95164 (err=4.7e-15) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=10368 = 61.47 Normalized results and averages for N=10368: fft 0: mflops = 7.68375 (norm. = 0.125), norm. avg. (of 12) = 0.103694 fft 1: mflops = 59.8087 (norm. = 0.972973), norm. avg. (of 15) = 0.676779 fft 2: mflops = 61.47 (norm. = 1), norm. avg. (of 15) = 0.718849 fft 3: mflops = 39.8725 (norm. = 0.648649), norm. avg. (of 15) = 0.609002 fft 4: mflops = 35.6923 (norm. = 0.580645), norm. avg. (of 15) = 0.578645 fft 5: mflops = 54.3048 (norm. = 0.883436), norm. avg. (of 15) = 0.87162 fft 6: mflops = 53.6466 (norm. = 0.872727), norm. avg. 
(of 15) = 0.857866 fft 7: mflops = 13.5762 (norm. = 0.220859), norm. avg. (of 15) = 0.177791 fft 8: mflops = 31.8406 (norm. = 0.517986), norm. avg. (of 15) = 0.36248 fft 9: mflops = 15.0539 (norm. = 0.244898), norm. avg. (of 15) = 0.143525 fft 10: mflops = 15.3675 (norm. = 0.25), norm. avg. (of 15) = 0.183832 fft 11: mflops = 21.9101 (norm. = 0.356436), norm. avg. (of 15) = 0.319554 fft 12: mflops = 21.4847 (norm. = 0.349515), norm. avg. (of 15) = 0.326004 fft 13: mflops = 14.2769 (norm. = 0.232258), norm. avg. (of 11) = 0.307877 fft 14: mflops = 14.1854 (norm. = 0.230769), norm. avg. (of 11) = 0.256773 fft 15: mflops = 3.95164 (norm. = 0.0642857), norm. avg. (of 15) = 0.0585494 fft 16: mflops = -1 (norm. = -0.0162681), norm. avg. (of 6) = 0.699315 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.16 s, 4 iters, t-(init.)=1.15 s t(norm)=0.723347, mflops=6.91232 (err=7.5e-15) 1. CWP (min N) (N=27720): elapsed time t=1.11 s, 32 iters, t-(init.)=0.99 s t(norm)=0.0778384, mflops=64.2357 2. CWP (best N) (N=27720): elapsed time t=1.12 s, 32 iters, t-(init.)=1.01 s t(norm)=0.0794109, mflops=62.9637 3. FFTPACK: elapsed time t=1.64 s, 32 iters, t-(init.)=1.53 s t(norm)=0.120296, mflops=41.5642 (err=7.3e-15) 4. FFTPACK (f2c): elapsed time t=1.84 s, 32 iters, t-(init.)=1.73 s t(norm)=0.136021, mflops=36.7591 (err=7.3e-15) FFTW_MEASURE plan: (cost = 4.125000e-02) FFTW_TWIDDLE 9 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 10 FFTW_NOTW 15 5. FFTW: elapsed time t=1.33 s, 32 iters, t-(init.)=1.22 s t(norm)=0.0959221, mflops=52.1257 (err=7.3e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.36 s, 32 iters, t-(init.)=1.25 s t(norm)=0.0982808, mflops=50.8746 (err=7.4e-15) 7. Frigo-old: elapsed time t=1.45 s, 8 iters, t-(init.)=1.43 s t(norm)=0.449733, mflops=11.1177 (err=7.4e-15) 8. 
GSL: elapsed time t=1.14 s, 16 iters, t-(init.)=1.09 s t(norm)=0.171402, mflops=29.1712 (err=7.3e-15) 9. NAPACK (f2c): elapsed time t=1.58 s, 8 iters, t-(init.)=1.55 s t(norm)=0.487473, mflops=10.257 (err=1.0e-12) 10. Nielsen: elapsed time t=1.07 s, 8 iters, t-(init.)=1.04 s t(norm)=0.327078, mflops=15.2869 (err=2.0e-13) 11. Singleton: elapsed time t=1.69 s, 16 iters, t-(init.)=1.63 s t(norm)=0.256316, mflops=19.5071 (err=1.4e-14) 12. Singleton (f2c): elapsed time t=1.67 s, 16 iters, t-(init.)=1.61 s t(norm)=0.253171, mflops=19.7495 (err=1.1e-14) 13. Temperton: elapsed time t=1.16 s, 8 iters, t-(init.)=1.13 s t(norm)=0.355383, mflops=14.0693 (err=1.8e-08) 14. Temperton (f2c): elapsed time t=1.2 s, 8 iters, t-(init.)=1.17 s t(norm)=0.367963, mflops=13.5883 (err=7.4e-15) 15. Valkenburg: elapsed time t=1 s, 2 iters, t-(init.)=1 s t(norm)=1.25799, mflops=3.97458 (err=7.4e-15) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=27000 = 64.2357 Normalized results and averages for N=27000: fft 0: mflops = 6.91232 (norm. = 0.107609), norm. avg. (of 13) = 0.103995 fft 1: mflops = 64.2357 (norm. = 1), norm. avg. (of 16) = 0.69698 fft 2: mflops = 62.9637 (norm. = 0.980198), norm. avg. (of 16) = 0.735183 fft 3: mflops = 41.5642 (norm. = 0.647059), norm. avg. (of 16) = 0.61138 fft 4: mflops = 36.7591 (norm. = 0.572254), norm. avg. (of 16) = 0.578245 fft 5: mflops = 52.1257 (norm. = 0.811475), norm. avg. (of 16) = 0.867861 fft 6: mflops = 50.8746 (norm. = 0.792), norm. avg. (of 16) = 0.853749 fft 7: mflops = 11.1177 (norm. = 0.173077), norm. avg. (of 16) = 0.177496 fft 8: mflops = 29.1712 (norm. = 0.454128), norm. avg. (of 16) = 0.368208 fft 9: mflops = 10.257 (norm. = 0.159677), norm. avg. (of 16) = 0.144535 fft 10: mflops = 15.2869 (norm. = 0.237981), norm. avg. (of 16) = 0.187216 fft 11: mflops = 19.5071 (norm. = 0.303681), norm. avg. (of 16) = 0.318562 fft 12: mflops = 19.7495 (norm. = 0.307453), norm. avg. 
(of 16) = 0.324845 fft 13: mflops = 14.0693 (norm. = 0.219027), norm. avg. (of 12) = 0.300473 fft 14: mflops = 13.5883 (norm. = 0.211538), norm. avg. (of 12) = 0.253004 fft 15: mflops = 3.97458 (norm. = 0.061875), norm. avg. (of 16) = 0.0587573 fft 16: mflops = -1 (norm. = -0.0155677), norm. avg. (of 6) = 0.699315 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.8 s, 2 iters, t-(init.)=1.78 s t(norm)=0.726423, mflops=6.88304 (err=9.3e-15) 1. CWP (min N) (N=80080): elapsed time t=1 s, 8 iters, t-(init.)=0.92 s t(norm)=0.0938637, mflops=53.2687 2. CWP (best N) (N=80080): elapsed time t=1.01 s, 8 iters, t-(init.)=0.93 s t(norm)=0.0948839, mflops=52.696 3. FFTPACK: elapsed time t=1.91 s, 8 iters, t-(init.)=1.83 s t(norm)=0.186707, mflops=26.7799 (err=9.2e-15) 4. FFTPACK (f2c): elapsed time t=1.04 s, 4 iters, t-(init.)=1.01 s t(norm)=0.206092, mflops=24.261 (err=9.2e-15) FFTW_MEASURE plan: (cost = 1.150000e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 15 5. FFTW: elapsed time t=1.94 s, 16 iters, t-(init.)=1.79 s t(norm)=0.091313, mflops=54.7567 (err=9.2e-15) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.02 s, 8 iters, t-(init.)=0.94 s t(norm)=0.0959042, mflops=52.1354 (err=9.3e-15) 7. Frigo-old: elapsed time t=1.11 s, 2 iters, t-(init.)=1.1 s t(norm)=0.448913, mflops=11.138 (err=9.2e-15) 8. GSL: elapsed time t=1.8 s, 8 iters, t-(init.)=1.72 s t(norm)=0.175484, mflops=28.4926 (err=9.2e-15) 9. NAPACK (f2c): elapsed time t=1.21 s, 2 iters, t-(init.)=1.19 s t(norm)=0.485643, mflops=10.2956 (err=5.1e-12) 10. Nielsen: elapsed time t=1.77 s, 4 iters, t-(init.)=1.73 s t(norm)=0.353009, mflops=14.1639 (err=4.7e-13) 11. Singleton: elapsed time t=1.47 s, 4 iters, t-(init.)=1.43 s t(norm)=0.291794, mflops=17.1354 (err=2.1e-14) 12. 
Singleton (f2c): elapsed time t=1.47 s, 4 iters, t-(init.)=1.43 s t(norm)=0.291794, mflops=17.1354 (err=1.3e-14) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.66 s, 1 iters, t-(init.)=1.65 s t(norm)=1.34674, mflops=3.71267 (err=9.5e-15) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=75600 = 54.7567 Normalized results and averages for N=75600: fft 0: mflops = 6.88304 (norm. = 0.125702), norm. avg. (of 14) = 0.105546 fft 1: mflops = 53.2687 (norm. = 0.972826), norm. avg. (of 17) = 0.713206 fft 2: mflops = 52.696 (norm. = 0.962366), norm. avg. (of 17) = 0.748547 fft 3: mflops = 26.7799 (norm. = 0.489071), norm. avg. (of 17) = 0.604185 fft 4: mflops = 24.261 (norm. = 0.443069), norm. avg. (of 17) = 0.570294 fft 5: mflops = 54.7567 (norm. = 1), norm. avg. (of 17) = 0.875634 fft 6: mflops = 52.1354 (norm. = 0.952128), norm. avg. (of 17) = 0.859536 fft 7: mflops = 11.138 (norm. = 0.203409), norm. avg. (of 17) = 0.17902 fft 8: mflops = 28.4926 (norm. = 0.520349), norm. avg. (of 17) = 0.377158 fft 9: mflops = 10.2956 (norm. = 0.188025), norm. avg. (of 17) = 0.147093 fft 10: mflops = 14.1639 (norm. = 0.258671), norm. avg. (of 17) = 0.191419 fft 11: mflops = 17.1354 (norm. = 0.312937), norm. avg. (of 17) = 0.318232 fft 12: mflops = 17.1354 (norm. = 0.312937), norm. avg. (of 17) = 0.324144 fft 13: mflops = -1 (norm. = -0.0182626), norm. avg. (of 12) = 0.300473 fft 14: mflops = -1 (norm. = -0.0182626), norm. avg. (of 12) = 0.253004 fft 15: mflops = 3.71267 (norm. = 0.067803), norm. avg. (of 17) = 0.0592894 fft 16: mflops = -1 (norm. = -0.0182626), norm. avg. (of 6) = 0.699315 Benchmarking for array size = 165375: 0. Brenner: elapsed time t=2.58 s, 1 iters, t-(init.)=2.56 s t(norm)=0.89297, mflops=5.59929 (err=3.7e-14) 1. CWP (min N) (N=180180): elapsed time t=1.28 s, 4 iters, t-(init.)=1.19 s t(norm)=0.103773, mflops=48.1822 2. 
CWP (best N) (N=180180): elapsed time t=1.27 s, 4 iters, t-(init.)=1.18 s t(norm)=0.102901, mflops=48.5905 3. FFTPACK: elapsed time t=1.02 s, 1 iters, t-(init.)=1 s t(norm)=0.348816, mflops=14.3342 (err=3.7e-14) 4. FFTPACK (f2c): elapsed time t=1.03 s, 1 iters, t-(init.)=1.01 s t(norm)=0.352304, mflops=14.1923 (err=3.7e-14) FFTW_MEASURE plan: (cost = 3.200000e-01) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.26 s, 4 iters, t-(init.)=1.18 s t(norm)=0.102901, mflops=48.5905 (err=3.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.35 s, 4 iters, t-(init.)=1.27 s t(norm)=0.110749, mflops=45.1471 (err=3.7e-14) 7. Frigo-old: elapsed time t=1.86 s, 1 iters, t-(init.)=1.84 s t(norm)=0.641822, mflops=7.79032 (err=3.7e-14) 8. GSL: elapsed time t=1.96 s, 4 iters, t-(init.)=1.88 s t(norm)=0.163944, mflops=30.4983 (err=3.7e-14) 9. NAPACK (f2c): elapsed time t=1.68 s, 1 iters, t-(init.)=1.66 s t(norm)=0.579035, mflops=8.63506 (err=1.6e-11) 10. Nielsen: elapsed time t=1.23 s, 1 iters, t-(init.)=1.2 s t(norm)=0.41858, mflops=11.9452 (err=1.6e-12) 11. Singleton: elapsed time t=1.76 s, 2 iters, t-(init.)=1.72 s t(norm)=0.299982, mflops=16.6677 (err=5.5e-14) 12. Singleton (f2c): elapsed time t=1.77 s, 2 iters, t-(init.)=1.73 s t(norm)=0.301726, mflops=16.5713 (err=5.6e-14) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=4.05 s, 1 iters, t-(init.)=4.03 s t(norm)=1.40573, mflops=3.55687 (err=3.7e-14) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=165375 = 48.5905 Normalized results and averages for N=165375: fft 0: mflops = 5.59929 (norm. = 0.115234), norm. avg. (of 15) = 0.106191 fft 1: mflops = 48.1822 (norm. = 0.991597), norm. avg. 
(of 18) = 0.728672 fft 2: mflops = 48.5905 (norm. = 1), norm. avg. (of 18) = 0.762517 fft 3: mflops = 14.3342 (norm. = 0.295), norm. avg. (of 18) = 0.587009 fft 4: mflops = 14.1923 (norm. = 0.292079), norm. avg. (of 18) = 0.554837 fft 5: mflops = 48.5905 (norm. = 1), norm. avg. (of 18) = 0.882543 fft 6: mflops = 45.1471 (norm. = 0.929134), norm. avg. (of 18) = 0.863403 fft 7: mflops = 7.79032 (norm. = 0.160326), norm. avg. (of 18) = 0.177982 fft 8: mflops = 30.4983 (norm. = 0.62766), norm. avg. (of 18) = 0.391074 fft 9: mflops = 8.63506 (norm. = 0.177711), norm. avg. (of 18) = 0.148794 fft 10: mflops = 11.9452 (norm. = 0.245833), norm. avg. (of 18) = 0.194442 fft 11: mflops = 16.6677 (norm. = 0.343023), norm. avg. (of 18) = 0.319609 fft 12: mflops = 16.5713 (norm. = 0.34104), norm. avg. (of 18) = 0.325083 fft 13: mflops = -1 (norm. = -0.0205802), norm. avg. (of 12) = 0.300473 fft 14: mflops = -1 (norm. = -0.0205802), norm. avg. (of 12) = 0.253004 fft 15: mflops = 3.55687 (norm. = 0.073201), norm. avg. (of 18) = 0.0600622 fft 16: mflops = -1 (norm. = -0.0205802), norm. avg. (of 6) = 0.699315 Benchmarking for array size = 362880: 0. Brenner: elapsed time t=5.77 s, 1 iters, t-(init.)=5.72 s t(norm)=0.853467, mflops=5.85846 (err=7.1e-14) 1. CWP (min N) (N=720720): elapsed time t=1.62 s, 1 iters, t-(init.)=1.53 s t(norm)=0.228287, mflops=21.9022 2. CWP (best N) (N=720720): elapsed time t=1.63 s, 1 iters, t-(init.)=1.54 s t(norm)=0.229779, mflops=21.76 3. FFTPACK: elapsed time t=1.32 s, 1 iters, t-(init.)=1.27 s t(norm)=0.189493, mflops=26.3861 (err=7.1e-14) 4. FFTPACK (f2c): elapsed time t=1.42 s, 1 iters, t-(init.)=1.37 s t(norm)=0.204414, mflops=24.4601 (err=7.1e-14) FFTW_MEASURE plan: (cost = 7.100000e-01) FFTW_TWIDDLE 64 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 6 FFTW_NOTW 15 5. 
FFTW: elapsed time t=1.42 s, 2 iters, t-(init.)=1.33 s t(norm)=0.0992229, mflops=50.3916 (err=7.1e-14) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.58 s, 2 iters, t-(init.)=1.49 s t(norm)=0.11116, mflops=44.9804 (err=7.1e-14) 7. Frigo-old: elapsed time t=3.47 s, 1 iters, t-(init.)=3.43 s t(norm)=0.511781, mflops=9.76979 (err=7.1e-14) 8. GSL: elapsed time t=1.08 s, 1 iters, t-(init.)=1.03 s t(norm)=0.153684, mflops=32.5344 (err=7.1e-14) 9. NAPACK (f2c): elapsed time t=2.99 s, 1 iters, t-(init.)=2.95 s t(norm)=0.440162, mflops=11.3595 (err=3.4e-11) 10. Nielsen: elapsed time t=3.32 s, 1 iters, t-(init.)=3.28 s t(norm)=0.4894, mflops=10.2166 (err=3.5e-12) 11. Singleton: elapsed time t=2.89 s, 1 iters, t-(init.)=2.85 s t(norm)=0.425241, mflops=11.758 (err=1.1e-13) 12. Singleton (f2c): elapsed time t=2.91 s, 1 iters, t-(init.)=2.86 s t(norm)=0.426733, mflops=11.7169 (err=1.0e-13) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=10.09 s, 1 iters, t-(init.)=10.04 s t(norm)=1.49804, mflops=3.33769 (err=7.1e-14) 16. Skipping fft (ESSL only works for N = x * 2^m, where x < 10 and m >= 1). Top mflops for N=362880 = 50.3916 Normalized results and averages for N=362880: fft 0: mflops = 5.85846 (norm. = 0.116259), norm. avg. (of 16) = 0.106821 fft 1: mflops = 21.9022 (norm. = 0.434641), norm. avg. (of 19) = 0.713197 fft 2: mflops = 21.76 (norm. = 0.431818), norm. avg. (of 19) = 0.745111 fft 3: mflops = 26.3861 (norm. = 0.523622), norm. avg. (of 19) = 0.583672 fft 4: mflops = 24.4601 (norm. = 0.485401), norm. avg. (of 19) = 0.551183 fft 5: mflops = 50.3916 (norm. = 1), norm. avg. (of 19) = 0.888725 fft 6: mflops = 44.9804 (norm. = 0.892617), norm. avg. (of 19) = 0.86494 fft 7: mflops = 9.76979 (norm. = 0.193878), norm. avg. 
(of 19) = 0.178818 fft 8: mflops = 32.5344 (norm. = 0.645631), norm. avg. (of 19) = 0.404472 fft 9: mflops = 11.3595 (norm. = 0.225424), norm. avg. (of 19) = 0.152827 fft 10: mflops = 10.2166 (norm. = 0.202744), norm. avg. (of 19) = 0.194879 fft 11: mflops = 11.758 (norm. = 0.233333), norm. avg. (of 19) = 0.315068 fft 12: mflops = 11.7169 (norm. = 0.232517), norm. avg. (of 19) = 0.320211 fft 13: mflops = -1 (norm. = -0.0198446), norm. avg. (of 12) = 0.300473 fft 14: mflops = -1 (norm. = -0.0198446), norm. avg. (of 12) = 0.253004 fft 15: mflops = 3.33769 (norm. = 0.0662351), norm. avg. (of 19) = 0.0603871 fft 16: mflops = -1 (norm. = -0.0198446), norm. avg. (of 6) = 0.699315 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) Maximum array size N = 2097152 Benchmarking FFTs: 0. FFTW 1. HARM 2. HARM (f2c) 3. NR (C) 4. NR (F) 5. PDA 6. PDA (f2c) 7. Singleton 8. Singleton (f2c) 9. Temperton 10. Temperton (f2c) Computing normalized averages (11 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.06 s, 32768 iters, t-(init.)=0.98 s t(norm)=0.0778834, mflops=64.1985 (err=2.1e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. Skipping fft (all dimensions must be > 4 for HARM). 3. NR (C): elapsed time t=1.07 s, 16384 iters, t-(init.)=1.04 s t(norm)=0.165304, mflops=30.2474 (err=2.3e-16) 4. NR (F): elapsed time t=1.3 s, 16384 iters, t-(init.)=1.26 s t(norm)=0.200272, mflops=24.9661 (err=2.3e-16) 5. PDA: elapsed time t=1.62 s, 8192 iters, t-(init.)=1.6 s t(norm)=0.508626, mflops=9.8304 (err=1.7e-16) 6. PDA (f2c): elapsed time t=1.71 s, 8192 iters, t-(init.)=1.69 s t(norm)=0.537237, mflops=9.30689 (err=1.7e-16) 7. 
Singleton: elapsed time t=1.26 s, 32768 iters, t-(init.)=1.19 s t(norm)=0.0945727, mflops=52.8694 (err=2.1e-16) 8. Singleton (f2c): elapsed time t=1.33 s, 32768 iters, t-(init.)=1.26 s t(norm)=0.100136, mflops=49.9322 (err=2.1e-16) 9. Temperton: elapsed time t=1.73 s, 32768 iters, t-(init.)=1.65 s t(norm)=0.13113, mflops=38.13 (err=2.1e-16) 10. Temperton (f2c): elapsed time t=1 s, 16384 iters, t-(init.)=0.96 s t(norm)=0.152588, mflops=32.768 (err=2.1e-16) Top mflops for N=64 = 64.1985 Normalized results and averages for N=64: fft 0: mflops = 64.1985 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.0155767), norm. avg. (of 0) = -1 fft 2: mflops = -1 (norm. = -0.0155767), norm. avg. (of 0) = -1 fft 3: mflops = 30.2474 (norm. = 0.471154), norm. avg. (of 1) = 0.471154 fft 4: mflops = 24.9661 (norm. = 0.388889), norm. avg. (of 1) = 0.388889 fft 5: mflops = 9.8304 (norm. = 0.153125), norm. avg. (of 1) = 0.153125 fft 6: mflops = 9.30689 (norm. = 0.14497), norm. avg. (of 1) = 0.14497 fft 7: mflops = 52.8694 (norm. = 0.823529), norm. avg. (of 1) = 0.823529 fft 8: mflops = 49.9322 (norm. = 0.777778), norm. avg. (of 1) = 0.777778 fft 9: mflops = 38.13 (norm. = 0.593939), norm. avg. (of 1) = 0.593939 fft 10: mflops = 32.768 (norm. = 0.510417), norm. avg. (of 1) = 0.510417 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.11 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.0551012, mflops=90.7422 (err=3.3e-16) 1. HARM: elapsed time t=1.48 s, 4096 iters, t-(init.)=1.41 s t(norm)=0.0747045, mflops=66.9304 (err=3.3e-16) 2. HARM (f2c): elapsed time t=1.87 s, 4096 iters, t-(init.)=1.79 s t(norm)=0.0948376, mflops=52.7217 (err=3.3e-16) 3. NR (C): elapsed time t=1.16 s, 2048 iters, t-(init.)=1.13 s t(norm)=0.119739, mflops=41.7575 (err=3.4e-16) 4. NR (F): elapsed time t=1.23 s, 2048 iters, t-(init.)=1.19 s t(norm)=0.126097, mflops=39.652 (err=3.4e-16) 5. 
PDA: elapsed time t=1.17 s, 1024 iters, t-(init.)=1.16 s t(norm)=0.245836, mflops=20.3388 (err=3.0e-16) 6. PDA (f2c): elapsed time t=1.2 s, 1024 iters, t-(init.)=1.18 s t(norm)=0.250075, mflops=19.994 (err=3.0e-16) 7. Singleton: elapsed time t=1.06 s, 2048 iters, t-(init.)=1.03 s t(norm)=0.109143, mflops=45.8116 (err=3.3e-16) 8. Singleton (f2c): elapsed time t=2 s, 4096 iters, t-(init.)=1.93 s t(norm)=0.102255, mflops=48.8973 (err=3.2e-16) 9. Temperton: elapsed time t=1.17 s, 4096 iters, t-(init.)=1.1 s t(norm)=0.0582801, mflops=85.7926 (err=3.3e-16) 10. Temperton (f2c): elapsed time t=1.39 s, 4096 iters, t-(init.)=1.32 s t(norm)=0.0699361, mflops=71.4938 (err=3.3e-16) Top mflops for N=512 = 90.7422 Normalized results and averages for N=512: fft 0: mflops = 90.7422 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 66.9304 (norm. = 0.737589), norm. avg. (of 1) = 0.737589 fft 2: mflops = 52.7217 (norm. = 0.581006), norm. avg. (of 1) = 0.581006 fft 3: mflops = 41.7575 (norm. = 0.460177), norm. avg. (of 2) = 0.465665 fft 4: mflops = 39.652 (norm. = 0.436975), norm. avg. (of 2) = 0.412932 fft 5: mflops = 20.3388 (norm. = 0.224138), norm. avg. (of 2) = 0.188631 fft 6: mflops = 19.994 (norm. = 0.220339), norm. avg. (of 2) = 0.182655 fft 7: mflops = 45.8116 (norm. = 0.504854), norm. avg. (of 2) = 0.664192 fft 8: mflops = 48.8973 (norm. = 0.53886), norm. avg. (of 2) = 0.658319 fft 9: mflops = 85.7926 (norm. = 0.945455), norm. avg. (of 2) = 0.769697 fft 10: mflops = 71.4938 (norm. = 0.787879), norm. avg. (of 2) = 0.649148 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.24 s, 512 iters, t-(init.)=1.17 s t(norm)=0.0464916, mflops=107.546 (err=4.5e-16) 1. HARM: elapsed time t=1.54 s, 512 iters, t-(init.)=1.46 s t(norm)=0.0580152, mflops=86.1843 (err=4.5e-16) 2. HARM (f2c): elapsed time t=1.9 s, 512 iters, t-(init.)=1.83 s t(norm)=0.0727177, mflops=68.7591 (err=4.5e-16) 3. 
NR (C): elapsed time t=1.5 s, 256 iters, t-(init.)=1.46 s t(norm)=0.11603, mflops=43.0922 (err=5.4e-16) 4. NR (F): elapsed time t=1.52 s, 256 iters, t-(init.)=1.48 s t(norm)=0.11762, mflops=42.5098 (err=5.4e-16) 5. PDA: elapsed time t=1.78 s, 256 iters, t-(init.)=1.74 s t(norm)=0.138283, mflops=36.1578 (err=4.6e-16) 6. PDA (f2c): elapsed time t=1.93 s, 256 iters, t-(init.)=1.9 s t(norm)=0.150998, mflops=33.1129 (err=4.6e-16) 7. Singleton: elapsed time t=2.01 s, 512 iters, t-(init.)=1.94 s t(norm)=0.0770887, mflops=64.8604 (err=5.3e-16) 8. Singleton (f2c): elapsed time t=1.05 s, 256 iters, t-(init.)=1.02 s t(norm)=0.0810623, mflops=61.6809 (err=5.3e-16) 9. Temperton: elapsed time t=1.18 s, 512 iters, t-(init.)=1.11 s t(norm)=0.0441074, mflops=113.36 (err=4.5e-16) 10. Temperton (f2c): elapsed time t=1.45 s, 512 iters, t-(init.)=1.37 s t(norm)=0.0544389, mflops=91.8461 (err=4.5e-16) Top mflops for N=4096 = 113.36 Normalized results and averages for N=4096: fft 0: mflops = 107.546 (norm. = 0.948718), norm. avg. (of 3) = 0.982906 fft 1: mflops = 86.1843 (norm. = 0.760274), norm. avg. (of 2) = 0.748931 fft 2: mflops = 68.7591 (norm. = 0.606557), norm. avg. (of 2) = 0.593781 fft 3: mflops = 43.0922 (norm. = 0.380137), norm. avg. (of 3) = 0.437156 fft 4: mflops = 42.5098 (norm. = 0.375), norm. avg. (of 3) = 0.400288 fft 5: mflops = 36.1578 (norm. = 0.318966), norm. avg. (of 3) = 0.232076 fft 6: mflops = 33.1129 (norm. = 0.292105), norm. avg. (of 3) = 0.219138 fft 7: mflops = 64.8604 (norm. = 0.572165), norm. avg. (of 3) = 0.633516 fft 8: mflops = 61.6809 (norm. = 0.544118), norm. avg. (of 3) = 0.620252 fft 9: mflops = 113.36 (norm. = 1), norm. avg. (of 3) = 0.846465 fft 10: mflops = 91.8461 (norm. = 0.810219), norm. avg. (of 3) = 0.702838 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.67 s, 32 iters, t-(init.)=1.54 s t(norm)=0.0979106, mflops=51.067 (err=4.5e-16) 1. 
HARM: elapsed time t=1.9 s, 32 iters, t-(init.)=1.77 s t(norm)=0.112534, mflops=44.4312 (err=4.6e-16) 2. HARM (f2c): elapsed time t=1.02 s, 16 iters, t-(init.)=0.96 s t(norm)=0.12207, mflops=40.96 (err=4.6e-16) 3. NR (C): elapsed time t=1.02 s, 4 iters, t-(init.)=1 s t(norm)=0.508626, mflops=9.8304 (err=6.5e-16) 4. NR (F): elapsed time t=1.06 s, 4 iters, t-(init.)=1.04 s t(norm)=0.528971, mflops=9.45231 (err=6.5e-16) 5. PDA: elapsed time t=1.71 s, 16 iters, t-(init.)=1.64 s t(norm)=0.208537, mflops=23.9766 (err=3.8e-16) 6. PDA (f2c): elapsed time t=1.71 s, 16 iters, t-(init.)=1.64 s t(norm)=0.208537, mflops=23.9766 (err=3.8e-16) 7. Singleton: elapsed time t=1.36 s, 8 iters, t-(init.)=1.32 s t(norm)=0.335693, mflops=14.8945 (err=4.6e-16) 8. Singleton (f2c): elapsed time t=1.37 s, 8 iters, t-(init.)=1.34 s t(norm)=0.34078, mflops=14.6722 (err=4.6e-16) 9. Temperton: elapsed time t=1.49 s, 16 iters, t-(init.)=1.42 s t(norm)=0.180562, mflops=27.6913 (err=4.5e-16) 10. Temperton (f2c): elapsed time t=1.53 s, 16 iters, t-(init.)=1.47 s t(norm)=0.18692, mflops=26.7494 (err=4.5e-16) Top mflops for N=32768 = 51.067 Normalized results and averages for N=32768: fft 0: mflops = 51.067 (norm. = 1), norm. avg. (of 4) = 0.987179 fft 1: mflops = 44.4312 (norm. = 0.870056), norm. avg. (of 3) = 0.789306 fft 2: mflops = 40.96 (norm. = 0.802083), norm. avg. (of 3) = 0.663215 fft 3: mflops = 9.8304 (norm. = 0.1925), norm. avg. (of 4) = 0.375992 fft 4: mflops = 9.45231 (norm. = 0.185096), norm. avg. (of 4) = 0.34649 fft 5: mflops = 23.9766 (norm. = 0.469512), norm. avg. (of 4) = 0.291435 fft 6: mflops = 23.9766 (norm. = 0.469512), norm. avg. (of 4) = 0.281732 fft 7: mflops = 14.8945 (norm. = 0.291667), norm. avg. (of 4) = 0.548054 fft 8: mflops = 14.6722 (norm. = 0.287313), norm. avg. (of 4) = 0.537017 fft 9: mflops = 27.6913 (norm. = 0.542254), norm. avg. (of 4) = 0.770412 fft 10: mflops = 26.7494 (norm. = 0.52381), norm. avg. 
(of 4) = 0.658081 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.04 s, 2 iters, t-(init.)=0.97 s t(norm)=0.102785, mflops=48.6453 (err=1.0e-15) 1. HARM: elapsed time t=1.08 s, 2 iters, t-(init.)=1.02 s t(norm)=0.108083, mflops=46.2607 (err=1.0e-15) 2. HARM (f2c): elapsed time t=1.16 s, 2 iters, t-(init.)=1.09 s t(norm)=0.115501, mflops=43.2898 (err=1.0e-15) 3. NR (C): elapsed time t=2.63 s, 1 iters, t-(init.)=2.59 s t(norm)=0.548893, mflops=9.10925 (err=1.1e-15) 4. NR (F): elapsed time t=2.74 s, 1 iters, t-(init.)=2.71 s t(norm)=0.574324, mflops=8.70589 (err=1.1e-15) 5. PDA: elapsed time t=1.88 s, 2 iters, t-(init.)=1.82 s t(norm)=0.192854, mflops=25.9263 (err=1.0e-15) 6. PDA (f2c): elapsed time t=1.97 s, 2 iters, t-(init.)=1.9 s t(norm)=0.201331, mflops=24.8347 (err=1.0e-15) 7. Singleton: elapsed time t=1.53 s, 1 iters, t-(init.)=1.49 s t(norm)=0.315772, mflops=15.8342 (err=1.5e-15) 8. Singleton (f2c): elapsed time t=1.53 s, 1 iters, t-(init.)=1.49 s t(norm)=0.315772, mflops=15.8342 (err=1.5e-15) 9. Temperton: elapsed time t=1.66 s, 2 iters, t-(init.)=1.59 s t(norm)=0.168482, mflops=29.6767 (err=1.0e-15) 10. Temperton (f2c): elapsed time t=1.69 s, 2 iters, t-(init.)=1.63 s t(norm)=0.172721, mflops=28.9484 (err=1.0e-15) Top mflops for N=262144 = 48.6453 Normalized results and averages for N=262144: fft 0: mflops = 48.6453 (norm. = 1), norm. avg. (of 5) = 0.989744 fft 1: mflops = 46.2607 (norm. = 0.95098), norm. avg. (of 4) = 0.829725 fft 2: mflops = 43.2898 (norm. = 0.889908), norm. avg. (of 4) = 0.719889 fft 3: mflops = 9.10925 (norm. = 0.187259), norm. avg. (of 5) = 0.338245 fft 4: mflops = 8.70589 (norm. = 0.178967), norm. avg. (of 5) = 0.312985 fft 5: mflops = 25.9263 (norm. = 0.532967), norm. avg. (of 5) = 0.339742 fft 6: mflops = 24.8347 (norm. = 0.510526), norm. avg. (of 5) = 0.327491 fft 7: mflops = 15.8342 (norm. = 0.325503), norm. avg. (of 5) = 0.503544 fft 8: mflops = 15.8342 (norm. = 0.325503), norm. avg. 
(of 5) = 0.494714 fft 9: mflops = 29.6767 (norm. = 0.610063), norm. avg. (of 5) = 0.738342 fft 10: mflops = 28.9484 (norm. = 0.595092), norm. avg. (of 5) = 0.645483 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.33 s, 1 iters, t-(init.)=1.27 s t(norm)=0.127491, mflops=39.2184 (err=9.2e-16) 1. HARM: elapsed time t=1.23 s, 1 iters, t-(init.)=1.16 s t(norm)=0.116449, mflops=42.9374 (err=9.2e-16) 2. HARM (f2c): elapsed time t=1.32 s, 1 iters, t-(init.)=1.26 s t(norm)=0.126487, mflops=39.5297 (err=9.2e-16) 3. NR (C): elapsed time t=6.9 s, 1 iters, t-(init.)=6.83 s t(norm)=0.685642, mflops=7.29244 (err=9.8e-16) 4. NR (F): elapsed time t=7.04 s, 1 iters, t-(init.)=6.97 s t(norm)=0.699696, mflops=7.14596 (err=9.8e-16) 5. PDA: elapsed time t=2.28 s, 1 iters, t-(init.)=2.21 s t(norm)=0.221855, mflops=22.5373 (err=8.9e-16) 6. PDA (f2c): elapsed time t=2.34 s, 1 iters, t-(init.)=2.27 s t(norm)=0.227878, mflops=21.9416 (err=8.9e-16) 7. Singleton: elapsed time t=4.03 s, 1 iters, t-(init.)=3.97 s t(norm)=0.398535, mflops=12.5459 (err=1.4e-15) 8. Singleton (f2c): elapsed time t=4.03 s, 1 iters, t-(init.)=3.96 s t(norm)=0.397532, mflops=12.5776 (err=1.3e-15) 9. Temperton: elapsed time t=2.16 s, 1 iters, t-(init.)=2.09 s t(norm)=0.209808, mflops=23.8313 (err=9.3e-16) 10. Temperton (f2c): elapsed time t=2.13 s, 1 iters, t-(init.)=2.06 s t(norm)=0.206797, mflops=24.1783 (err=9.3e-16) Top mflops for N=524288 = 42.9374 Normalized results and averages for N=524288: fft 0: mflops = 39.2184 (norm. = 0.913386), norm. avg. (of 6) = 0.977017 fft 1: mflops = 42.9374 (norm. = 1), norm. avg. (of 5) = 0.86378 fft 2: mflops = 39.5297 (norm. = 0.920635), norm. avg. (of 5) = 0.760038 fft 3: mflops = 7.29244 (norm. = 0.169839), norm. avg. (of 6) = 0.310178 fft 4: mflops = 7.14596 (norm. = 0.166428), norm. avg. (of 6) = 0.288559 fft 5: mflops = 22.5373 (norm. = 0.524887), norm. avg. (of 6) = 0.370599 fft 6: mflops = 21.9416 (norm. = 0.511013), norm. avg. 
(of 6) = 0.358078 fft 7: mflops = 12.5459 (norm. = 0.292191), norm. avg. (of 6) = 0.468318 fft 8: mflops = 12.5776 (norm. = 0.292929), norm. avg. (of 6) = 0.461084 fft 9: mflops = 23.8313 (norm. = 0.555024), norm. avg. (of 6) = 0.707789 fft 10: mflops = 24.1783 (norm. = 0.563107), norm. avg. (of 6) = 0.631754 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=3.37 s, 1 iters, t-(init.)=3.24 s t(norm)=0.154495, mflops=32.3635 (err=1.2e-15) 1. HARM: elapsed time t=2.63 s, 1 iters, t-(init.)=2.49 s t(norm)=0.118732, mflops=42.1115 (err=1.2e-15) 2. HARM (f2c): elapsed time t=2.77 s, 1 iters, t-(init.)=2.63 s t(norm)=0.125408, mflops=39.8698 (err=1.2e-15) 3. NR (C): elapsed time t=14.8 s, 1 iters, t-(init.)=14.66 s t(norm)=0.699043, mflops=7.15263 (err=1.4e-15) 4. NR (F): elapsed time t=15.11 s, 1 iters, t-(init.)=14.98 s t(norm)=0.714302, mflops=6.99984 (err=1.4e-15) 5. PDA: elapsed time t=5.78 s, 1 iters, t-(init.)=5.65 s t(norm)=0.269413, mflops=18.5589 (err=1.2e-15) 6. PDA (f2c): elapsed time t=5.97 s, 1 iters, t-(init.)=5.83 s t(norm)=0.277996, mflops=17.9859 (err=1.2e-15) 7. Singleton: elapsed time t=8.39 s, 1 iters, t-(init.)=8.26 s t(norm)=0.393867, mflops=12.6946 (err=2.8e-15) 8. Singleton (f2c): elapsed time t=8.42 s, 1 iters, t-(init.)=8.29 s t(norm)=0.395298, mflops=12.6487 (err=1.7e-15) 9. Skipping fft (Temperton can't handle dimensions > 256). 10. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 42.1115 Normalized results and averages for N=1048576: fft 0: mflops = 32.3635 (norm. = 0.768519), norm. avg. (of 7) = 0.947232 fft 1: mflops = 42.1115 (norm. = 1), norm. avg. (of 6) = 0.886483 fft 2: mflops = 39.8698 (norm. = 0.946768), norm. avg. (of 6) = 0.79116 fft 3: mflops = 7.15263 (norm. = 0.16985), norm. avg. (of 7) = 0.290131 fft 4: mflops = 6.99984 (norm. = 0.166222), norm. avg. (of 7) = 0.271082 fft 5: mflops = 18.5589 (norm. = 0.440708), norm. avg. 
(of 7) = 0.380615 fft 6: mflops = 17.9859 (norm. = 0.427101), norm. avg. (of 7) = 0.367938 fft 7: mflops = 12.6946 (norm. = 0.301453), norm. avg. (of 7) = 0.44448 fft 8: mflops = 12.6487 (norm. = 0.300362), norm. avg. (of 7) = 0.438123 fft 9: mflops = -1 (norm. = -0.0237465), norm. avg. (of 6) = 0.707789 fft 10: mflops = -1 (norm. = -0.0237465), norm. avg. (of 6) = 0.631754 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=7.42 s, 1 iters, t-(init.)=7.15 s t(norm)=0.162352, mflops=30.7973 (err=8.5e-16) 1. HARM: elapsed time t=5.44 s, 1 iters, t-(init.)=5.17 s t(norm)=0.117393, mflops=42.5921 (err=8.4e-16) 2. HARM (f2c): elapsed time t=5.79 s, 1 iters, t-(init.)=5.52 s t(norm)=0.12534, mflops=39.8915 (err=8.4e-16) 3. NR (C): elapsed time t=31.25 s, 1 iters, t-(init.)=30.98 s t(norm)=0.703448, mflops=7.10784 (err=9.6e-16) 4. NR (F): elapsed time t=31.94 s, 1 iters, t-(init.)=31.67 s t(norm)=0.719116, mflops=6.95298 (err=9.6e-16) 5. PDA: elapsed time t=11.13 s, 1 iters, t-(init.)=10.86 s t(norm)=0.246593, mflops=20.2763 (err=7.9e-16) 6. PDA (f2c): elapsed time t=11.56 s, 1 iters, t-(init.)=11.3 s t(norm)=0.256584, mflops=19.4868 (err=7.9e-16) 7. Singleton: elapsed time t=24.31 s, 1 iters, t-(init.)=24.05 s t(norm)=0.546092, mflops=9.15597 (err=1.1e-15) 8. Singleton (f2c): elapsed time t=24.26 s, 1 iters, t-(init.)=24 s t(norm)=0.544957, mflops=9.17504 (err=1.1e-15) 9. Temperton: elapsed time t=12.47 s, 1 iters, t-(init.)=12.22 s t(norm)=0.277474, mflops=18.0197 (err=8.5e-16) 10. Temperton (f2c): elapsed time t=14.4 s, 1 iters, t-(init.)=14.13 s t(norm)=0.320843, mflops=15.5839 (err=8.5e-16) Top mflops for N=2097152 = 42.5921 Normalized results and averages for N=2097152: fft 0: mflops = 30.7973 (norm. = 0.723077), norm. avg. (of 8) = 0.919212 fft 1: mflops = 42.5921 (norm. = 1), norm. avg. (of 7) = 0.9027 fft 2: mflops = 39.8915 (norm. = 0.936594), norm. avg. (of 7) = 0.811936 fft 3: mflops = 7.10784 (norm. = 0.166882), norm. avg. 
(of 8) = 0.274725 fft 4: mflops = 6.95298 (norm. = 0.163246), norm. avg. (of 8) = 0.257603 fft 5: mflops = 20.2763 (norm. = 0.476059), norm. avg. (of 8) = 0.392545 fft 6: mflops = 19.4868 (norm. = 0.457522), norm. avg. (of 8) = 0.379136 fft 7: mflops = 9.15597 (norm. = 0.214969), norm. avg. (of 8) = 0.415791 fft 8: mflops = 9.17504 (norm. = 0.215417), norm. avg. (of 8) = 0.410285 fft 9: mflops = 18.0197 (norm. = 0.423077), norm. avg. (of 7) = 0.667116 fft 10: mflops = 15.5839 (norm. = 0.365888), norm. avg. (of 7) = 0.593773 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) 120x120x120 (26.3728 MB) 144x144x144 (45.5692 MB) Maximum array size N = 2985984 Benchmarking FFTs: 0. FFTW 1. PDA 2. PDA (f2c) 3. Singleton 4. Singleton (f2c) 5. Temperton 6. Temperton (f2c) Computing normalized averages (7 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1.38 s, 16384 iters, t-(init.)=1.31 s t(norm)=0.0918272, mflops=54.4501 (err=2.5e-16) 1. PDA: elapsed time t=1.41 s, 4096 iters, t-(init.)=1.39 s t(norm)=0.38974, mflops=12.8291 (err=2.0e-16) 2. PDA (f2c): elapsed time t=1.55 s, 4096 iters, t-(init.)=1.53 s t(norm)=0.428994, mflops=11.6552 (err=2.0e-16) 3. Singleton: elapsed time t=1.93 s, 32768 iters, t-(init.)=1.79 s t(norm)=0.0627369, mflops=79.6979 (err=2.7e-16) 4. Singleton (f2c): elapsed time t=1.77 s, 32768 iters, t-(init.)=1.63 s t(norm)=0.0571291, mflops=87.521 (err=2.7e-16) 5. 
Temperton: elapsed time t=1.4 s, 16384 iters, t-(init.)=1.33 s t(norm)=0.0932291, mflops=53.6313 (err=2.4e-08) 6. Temperton (f2c): elapsed time t=1.96 s, 16384 iters, t-(init.)=1.89 s t(norm)=0.132484, mflops=37.7405 (err=2.5e-16) Top mflops for N=125 = 87.521 Normalized results and averages for N=125: fft 0: mflops = 54.4501 (norm. = 0.622137), norm. avg. (of 1) = 0.622137 fft 1: mflops = 12.8291 (norm. = 0.146583), norm. avg. (of 1) = 0.146583 fft 2: mflops = 11.6552 (norm. = 0.13317), norm. avg. (of 1) = 0.13317 fft 3: mflops = 79.6979 (norm. = 0.910615), norm. avg. (of 1) = 0.910615 fft 4: mflops = 87.521 (norm. = 1), norm. avg. (of 1) = 1 fft 5: mflops = 53.6313 (norm. = 0.612782), norm. avg. (of 1) = 0.612782 fft 6: mflops = 37.7405 (norm. = 0.431217), norm. avg. (of 1) = 0.431217 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1 s, 8192 iters, t-(init.)=0.94 s t(norm)=0.0685029, mflops=72.9897 (err=3.1e-16) 1. PDA: elapsed time t=1.26 s, 2048 iters, t-(init.)=1.24 s t(norm)=0.361462, mflops=13.8327 (err=3.3e-16) 2. PDA (f2c): elapsed time t=1.24 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.358547, mflops=13.9452 (err=3.3e-16) 3. Singleton: elapsed time t=1.86 s, 8192 iters, t-(init.)=1.8 s t(norm)=0.131176, mflops=38.1168 (err=3.3e-16) 4. Singleton (f2c): elapsed time t=1.84 s, 8192 iters, t-(init.)=1.78 s t(norm)=0.129718, mflops=38.5451 (err=3.4e-16) 5. Temperton: elapsed time t=1.21 s, 8192 iters, t-(init.)=1.15 s t(norm)=0.0838067, mflops=59.6611 (err=1.8e-08) 6. Temperton (f2c): elapsed time t=1.5 s, 8192 iters, t-(init.)=1.44 s t(norm)=0.104941, mflops=47.646 (err=3.5e-16) Top mflops for N=216 = 72.9897 Normalized results and averages for N=216: fft 0: mflops = 72.9897 (norm. = 1), norm. avg. (of 2) = 0.811069 fft 1: mflops = 13.8327 (norm. = 0.189516), norm. avg. (of 2) = 0.168049 fft 2: mflops = 13.9452 (norm. = 0.191057), norm. avg. (of 2) = 0.162113 fft 3: mflops = 38.1168 (norm. = 0.522222), norm. avg. 
(of 2) = 0.716418 fft 4: mflops = 38.5451 (norm. = 0.52809), norm. avg. (of 2) = 0.764045 fft 5: mflops = 59.6611 (norm. = 0.817391), norm. avg. (of 2) = 0.715087 fft 6: mflops = 47.646 (norm. = 0.652778), norm. avg. (of 2) = 0.541997 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=2 s, 8192 iters, t-(init.)=1.91 s t(norm)=0.0807106, mflops=61.9497 (err=3.5e-16) 1. PDA: elapsed time t=1.7 s, 1024 iters, t-(init.)=1.69 s t(norm)=0.571313, mflops=8.75177 (err=4.0e-16) 2. PDA (f2c): elapsed time t=1.87 s, 1024 iters, t-(init.)=1.86 s t(norm)=0.628782, mflops=7.95188 (err=4.0e-16) 3. Singleton: elapsed time t=1.41 s, 4096 iters, t-(init.)=1.37 s t(norm)=0.115784, mflops=43.1839 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.6 s, 4096 iters, t-(init.)=1.55 s t(norm)=0.130996, mflops=38.169 (err=5.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 61.9497 Normalized results and averages for N=343: fft 0: mflops = 61.9497 (norm. = 1), norm. avg. (of 3) = 0.874046 fft 1: mflops = 8.75177 (norm. = 0.141272), norm. avg. (of 3) = 0.159124 fft 2: mflops = 7.95188 (norm. = 0.12836), norm. avg. (of 3) = 0.150862 fft 3: mflops = 43.1839 (norm. = 0.69708), norm. avg. (of 3) = 0.709972 fft 4: mflops = 38.169 (norm. = 0.616129), norm. avg. (of 3) = 0.71474 fft 5: mflops = -1 (norm. = -0.0161421), norm. avg. (of 2) = 0.715087 fft 6: mflops = -1 (norm. = -0.0161421), norm. avg. (of 2) = 0.541997 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.01 s, 2048 iters, t-(init.)=0.96 s t(norm)=0.0676151, mflops=73.948 (err=4.5e-16) 1. PDA: elapsed time t=1.67 s, 1024 iters, t-(init.)=1.65 s t(norm)=0.232427, mflops=21.5121 (err=4.9e-16) 2. PDA (f2c): elapsed time t=1.63 s, 1024 iters, t-(init.)=1.61 s t(norm)=0.226792, mflops=22.0466 (err=5.0e-16) 3. Singleton: elapsed time t=1.25 s, 2048 iters, t-(init.)=1.2 s t(norm)=0.0845188, mflops=59.1584 (err=4.7e-16) 4. 
Singleton (f2c): elapsed time t=1.31 s, 2048 iters, t-(init.)=1.26 s t(norm)=0.0887448, mflops=56.3413 (err=5.4e-16) 5. Temperton: elapsed time t=1.77 s, 4096 iters, t-(init.)=1.67 s t(norm)=0.058811, mflops=85.0181 (err=3.0e-08) 6. Temperton (f2c): elapsed time t=1.99 s, 4096 iters, t-(init.)=1.89 s t(norm)=0.0665586, mflops=75.1218 (err=4.8e-16) Top mflops for N=729 = 85.0181 Normalized results and averages for N=729: fft 0: mflops = 73.948 (norm. = 0.869792), norm. avg. (of 4) = 0.872982 fft 1: mflops = 21.5121 (norm. = 0.25303), norm. avg. (of 4) = 0.1826 fft 2: mflops = 22.0466 (norm. = 0.259317), norm. avg. (of 4) = 0.177976 fft 3: mflops = 59.1584 (norm. = 0.695833), norm. avg. (of 4) = 0.706438 fft 4: mflops = 56.3413 (norm. = 0.662698), norm. avg. (of 4) = 0.701729 fft 5: mflops = 85.0181 (norm. = 1), norm. avg. (of 3) = 0.810058 fft 6: mflops = 75.1218 (norm. = 0.883598), norm. avg. (of 3) = 0.655864 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.37 s, 2048 iters, t-(init.)=1.3 s t(norm)=0.0636945, mflops=78.4997 (err=3.5e-16) 1. PDA: elapsed time t=1.06 s, 512 iters, t-(init.)=1.04 s t(norm)=0.203822, mflops=24.5312 (err=3.7e-16) 2. PDA (f2c): elapsed time t=1.14 s, 512 iters, t-(init.)=1.13 s t(norm)=0.221461, mflops=22.5774 (err=3.7e-16) 3. Singleton: elapsed time t=1.14 s, 1024 iters, t-(init.)=1.11 s t(norm)=0.108771, mflops=45.9683 (err=4.4e-16) 4. Singleton (f2c): elapsed time t=1.04 s, 1024 iters, t-(init.)=1.01 s t(norm)=0.0989715, mflops=50.5196 (err=4.6e-16) 5. Temperton: elapsed time t=1.17 s, 2048 iters, t-(init.)=1.1 s t(norm)=0.0538953, mflops=92.7724 (err=2.3e-08) 6. Temperton (f2c): elapsed time t=1.67 s, 2048 iters, t-(init.)=1.6 s t(norm)=0.0783932, mflops=63.781 (err=3.6e-16) Top mflops for N=1000 = 92.7724 Normalized results and averages for N=1000: fft 0: mflops = 78.4997 (norm. = 0.846154), norm. avg. (of 5) = 0.867617 fft 1: mflops = 24.5312 (norm. = 0.264423), norm. avg. 
(of 5) = 0.198965 fft 2: mflops = 22.5774 (norm. = 0.243363), norm. avg. (of 5) = 0.191053 fft 3: mflops = 45.9683 (norm. = 0.495495), norm. avg. (of 5) = 0.664249 fft 4: mflops = 50.5196 (norm. = 0.544554), norm. avg. (of 5) = 0.670294 fft 5: mflops = 92.7724 (norm. = 1), norm. avg. (of 4) = 0.857543 fft 6: mflops = 63.781 (norm. = 0.6875), norm. avg. (of 4) = 0.663773 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.42 s, 1024 iters, t-(init.)=1.37 s t(norm)=0.0968538, mflops=51.6242 (err=3.7e-16) 1. PDA: elapsed time t=1.74 s, 256 iters, t-(init.)=1.73 s t(norm)=0.489218, mflops=10.2204 (err=5.8e-16) 2. PDA (f2c): elapsed time t=1.82 s, 256 iters, t-(init.)=1.8 s t(norm)=0.509013, mflops=9.82294 (err=5.8e-16) 3. Singleton: elapsed time t=1.72 s, 1024 iters, t-(init.)=1.67 s t(norm)=0.118063, mflops=42.3504 (err=6.4e-16) 4. Singleton (f2c): elapsed time t=1.89 s, 1024 iters, t-(init.)=1.84 s t(norm)=0.130081, mflops=38.4376 (err=6.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 51.6242 Normalized results and averages for N=1331: fft 0: mflops = 51.6242 (norm. = 1), norm. avg. (of 6) = 0.88968 fft 1: mflops = 10.2204 (norm. = 0.197977), norm. avg. (of 6) = 0.1988 fft 2: mflops = 9.82294 (norm. = 0.190278), norm. avg. (of 6) = 0.190924 fft 3: mflops = 42.3504 (norm. = 0.820359), norm. avg. (of 6) = 0.690268 fft 4: mflops = 38.4376 (norm. = 0.744565), norm. avg. (of 6) = 0.682673 fft 5: mflops = -1 (norm. = -0.0193708), norm. avg. (of 4) = 0.857543 fft 6: mflops = -1 (norm. = -0.0193708), norm. avg. (of 4) = 0.663773 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.04 s, 1024 iters, t-(init.)=0.98 s t(norm)=0.0514964, mflops=97.0942 (err=3.6e-16) 1. PDA: elapsed time t=1.62 s, 512 iters, t-(init.)=1.59 s t(norm)=0.1671, mflops=29.9221 (err=3.4e-16) 2. 
PDA (f2c): elapsed time t=1.67 s, 512 iters, t-(init.)=1.64 s t(norm)=0.172355, mflops=29.0099 (err=3.4e-16) 3. Singleton: elapsed time t=1.78 s, 1024 iters, t-(init.)=1.72 s t(norm)=0.0903814, mflops=55.3211 (err=3.6e-16) 4. Singleton (f2c): elapsed time t=1.75 s, 1024 iters, t-(init.)=1.69 s t(norm)=0.0888049, mflops=56.3032 (err=3.8e-16) 5. Temperton: elapsed time t=1.69 s, 2048 iters, t-(init.)=1.57 s t(norm)=0.0412496, mflops=121.213 (err=1.6e-08) 6. Temperton (f2c): elapsed time t=1.92 s, 2048 iters, t-(init.)=1.8 s t(norm)=0.0472926, mflops=105.725 (err=3.2e-16) Top mflops for N=1728 = 121.213 Normalized results and averages for N=1728: fft 0: mflops = 97.0942 (norm. = 0.80102), norm. avg. (of 7) = 0.877015 fft 1: mflops = 29.9221 (norm. = 0.246855), norm. avg. (of 7) = 0.205665 fft 2: mflops = 29.0099 (norm. = 0.239329), norm. avg. (of 7) = 0.197839 fft 3: mflops = 55.3211 (norm. = 0.456395), norm. avg. (of 7) = 0.656857 fft 4: mflops = 56.3032 (norm. = 0.464497), norm. avg. (of 7) = 0.651505 fft 5: mflops = 121.213 (norm. = 1), norm. avg. (of 5) = 0.886035 fft 6: mflops = 105.725 (norm. = 0.872222), norm. avg. (of 5) = 0.705463 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.42 s, 512 iters, t-(init.)=1.38 s t(norm)=0.110511, mflops=45.2445 (err=4.5e-16) 1. PDA: elapsed time t=1.53 s, 128 iters, t-(init.)=1.52 s t(norm)=0.486888, mflops=10.2693 (err=9.4e-16) 2. PDA (f2c): elapsed time t=1.57 s, 128 iters, t-(init.)=1.56 s t(norm)=0.499701, mflops=10.006 (err=9.4e-16) 3. Singleton: elapsed time t=1.6 s, 512 iters, t-(init.)=1.57 s t(norm)=0.125726, mflops=39.769 (err=4.8e-16) 4. Singleton (f2c): elapsed time t=1.75 s, 512 iters, t-(init.)=1.71 s t(norm)=0.136937, mflops=36.5131 (err=4.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 45.2445 Normalized results and averages for N=2197: fft 0: mflops = 45.2445 (norm. = 1), norm. avg. 
(of 8) = 0.892388 fft 1: mflops = 10.2693 (norm. = 0.226974), norm. avg. (of 8) = 0.208329 fft 2: mflops = 10.006 (norm. = 0.221154), norm. avg. (of 8) = 0.200753 fft 3: mflops = 39.769 (norm. = 0.878981), norm. avg. (of 8) = 0.684623 fft 4: mflops = 36.5131 (norm. = 0.807018), norm. avg. (of 8) = 0.670944 fft 5: mflops = -1 (norm. = -0.0221021), norm. avg. (of 5) = 0.886035 fft 6: mflops = -1 (norm. = -0.0221021), norm. avg. (of 5) = 0.705463 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.11 s, 512 iters, t-(init.)=1.06 s t(norm)=0.0660552, mflops=75.6942 (err=4.3e-16) 1. PDA: elapsed time t=1.25 s, 128 iters, t-(init.)=1.24 s t(norm)=0.309089, mflops=16.1766 (err=4.3e-16) 2. PDA (f2c): elapsed time t=1.29 s, 128 iters, t-(init.)=1.28 s t(norm)=0.319059, mflops=15.6711 (err=4.3e-16) 3. Singleton: elapsed time t=1.14 s, 256 iters, t-(init.)=1.12 s t(norm)=0.139588, mflops=35.8196 (err=5.8e-16) 4. Singleton (f2c): elapsed time t=1.19 s, 256 iters, t-(init.)=1.17 s t(norm)=0.14582, mflops=34.2888 (err=5.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 75.6942 Normalized results and averages for N=2744: fft 0: mflops = 75.6942 (norm. = 1), norm. avg. (of 9) = 0.904345 fft 1: mflops = 16.1766 (norm. = 0.21371), norm. avg. (of 9) = 0.208927 fft 2: mflops = 15.6711 (norm. = 0.207031), norm. avg. (of 9) = 0.201451 fft 3: mflops = 35.8196 (norm. = 0.473214), norm. avg. (of 9) = 0.661133 fft 4: mflops = 34.2888 (norm. = 0.452991), norm. avg. (of 9) = 0.646727 fft 5: mflops = -1 (norm. = -0.013211), norm. avg. (of 5) = 0.886035 fft 6: mflops = -1 (norm. = -0.013211), norm. avg. (of 5) = 0.705463 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.1 s, 512 iters, t-(init.)=1.04 s t(norm)=0.0513496, mflops=97.3717 (err=3.8e-16) 1. PDA: elapsed time t=1.51 s, 256 iters, t-(init.)=1.49 s t(norm)=0.147136, mflops=33.9821 (err=4.9e-16) 2. 
PDA (f2c): elapsed time t=1.58 s, 256 iters, t-(init.)=1.55 s t(norm)=0.153061, mflops=32.6666 (err=4.9e-16) 3. Singleton: elapsed time t=1.01 s, 256 iters, t-(init.)=0.98 s t(norm)=0.0967743, mflops=51.6666 (err=5.1e-16) 4. Singleton (f2c): elapsed time t=1.91 s, 512 iters, t-(init.)=1.85 s t(norm)=0.091343, mflops=54.7387 (err=4.9e-16) 5. Temperton: elapsed time t=1.82 s, 1024 iters, t-(init.)=1.7 s t(norm)=0.0419684, mflops=119.137 (err=1.9e-08) 6. Temperton (f2c): elapsed time t=1.26 s, 512 iters, t-(init.)=1.2 s t(norm)=0.0592495, mflops=84.3888 (err=4.0e-16) Top mflops for N=3375 = 119.137 Normalized results and averages for N=3375: fft 0: mflops = 97.3717 (norm. = 0.817308), norm. avg. (of 10) = 0.895641 fft 1: mflops = 33.9821 (norm. = 0.285235), norm. avg. (of 10) = 0.216557 fft 2: mflops = 32.6666 (norm. = 0.274194), norm. avg. (of 10) = 0.208725 fft 3: mflops = 51.6666 (norm. = 0.433673), norm. avg. (of 10) = 0.638387 fft 4: mflops = 54.7387 (norm. = 0.459459), norm. avg. (of 10) = 0.628 fft 5: mflops = 119.137 (norm. = 1), norm. avg. (of 6) = 0.905029 fft 6: mflops = 84.3888 (norm. = 0.708333), norm. avg. (of 6) = 0.705941 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1 s, 32 iters, t-(init.)=0.93 s t(norm)=0.123247, mflops=40.5691 (err=4.8e-16) 1. PDA: elapsed time t=1.25 s, 32 iters, t-(init.)=1.18 s t(norm)=0.156377, mflops=31.9739 (err=5.2e-16) 2. PDA (f2c): elapsed time t=1.31 s, 32 iters, t-(init.)=1.24 s t(norm)=0.164329, mflops=30.4268 (err=5.2e-16) 3. Singleton: elapsed time t=1.85 s, 32 iters, t-(init.)=1.78 s t(norm)=0.235891, mflops=21.1962 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.87 s, 32 iters, t-(init.)=1.8 s t(norm)=0.238542, mflops=20.9607 (err=5.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 40.5691 Normalized results and averages for N=16800: fft 0: mflops = 40.5691 (norm. = 1), norm. avg. 
(of 11) = 0.905128 fft 1: mflops = 31.9739 (norm. = 0.788136), norm. avg. (of 11) = 0.268519 fft 2: mflops = 30.4268 (norm. = 0.75), norm. avg. (of 11) = 0.257932 fft 3: mflops = 21.1962 (norm. = 0.522472), norm. avg. (of 11) = 0.627849 fft 4: mflops = 20.9607 (norm. = 0.516667), norm. avg. (of 11) = 0.617879 fft 5: mflops = -1 (norm. = -0.0246493), norm. avg. (of 6) = 0.905029 fft 6: mflops = -1 (norm. = -0.0246493), norm. avg. (of 6) = 0.705941 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=2 s, 8 iters, t-(init.)=1.89 s t(norm)=0.127499, mflops=39.216 (err=6.4e-16) 1. PDA: elapsed time t=1.41 s, 4 iters, t-(init.)=1.35 s t(norm)=0.182141, mflops=27.4512 (err=6.5e-16) 2. PDA (f2c): elapsed time t=1.42 s, 4 iters, t-(init.)=1.36 s t(norm)=0.183491, mflops=27.2494 (err=6.5e-16) 3. Singleton: elapsed time t=1.34 s, 2 iters, t-(init.)=1.31 s t(norm)=0.353489, mflops=14.1447 (err=7.4e-16) 4. Singleton (f2c): elapsed time t=1.34 s, 2 iters, t-(init.)=1.31 s t(norm)=0.353489, mflops=14.1447 (err=8.1e-16) 5. Temperton: elapsed time t=1.04 s, 4 iters, t-(init.)=0.99 s t(norm)=0.13357, mflops=37.4335 (err=1.6e-08) 6. Temperton (f2c): elapsed time t=1.06 s, 4 iters, t-(init.)=1 s t(norm)=0.13492, mflops=37.0591 (err=6.3e-16) Top mflops for N=110592 = 39.216 Normalized results and averages for N=110592: fft 0: mflops = 39.216 (norm. = 1), norm. avg. (of 12) = 0.913034 fft 1: mflops = 27.4512 (norm. = 0.7), norm. avg. (of 12) = 0.304476 fft 2: mflops = 27.2494 (norm. = 0.694853), norm. avg. (of 12) = 0.294342 fft 3: mflops = 14.1447 (norm. = 0.360687), norm. avg. (of 12) = 0.605586 fft 4: mflops = 14.1447 (norm. = 0.360687), norm. avg. (of 12) = 0.596446 fft 5: mflops = 37.4335 (norm. = 0.954545), norm. avg. (of 7) = 0.912103 fft 6: mflops = 37.0591 (norm. = 0.945), norm. avg. (of 7) = 0.740093 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.76 s, 8 iters, t-(init.)=1.64 s t(norm)=0.103447, mflops=48.334 (err=1.0e-15) 1. 
PDA: elapsed time t=1.76 s, 4 iters, t-(init.)=1.7 s t(norm)=0.214463, mflops=23.3141 (err=1.1e-15) 2. PDA (f2c): elapsed time t=1.86 s, 4 iters, t-(init.)=1.8 s t(norm)=0.227078, mflops=22.0188 (err=1.1e-15) 3. Singleton: elapsed time t=1.09 s, 2 iters, t-(init.)=1.06 s t(norm)=0.267448, mflops=18.6952 (err=1.4e-15) 4. Singleton (f2c): elapsed time t=1.16 s, 2 iters, t-(init.)=1.13 s t(norm)=0.285109, mflops=17.5371 (err=1.7e-15) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 48.334 Normalized results and averages for N=117649: fft 0: mflops = 48.334 (norm. = 1), norm. avg. (of 13) = 0.919724 fft 1: mflops = 23.3141 (norm. = 0.482353), norm. avg. (of 13) = 0.318159 fft 2: mflops = 22.0188 (norm. = 0.455556), norm. avg. (of 13) = 0.306743 fft 3: mflops = 18.6952 (norm. = 0.386792), norm. avg. (of 13) = 0.588755 fft 4: mflops = 17.5371 (norm. = 0.362832), norm. avg. (of 13) = 0.578476 fft 5: mflops = -1 (norm. = -0.0206894), norm. avg. (of 7) = 0.912103 fft 6: mflops = -1 (norm. = -0.0206894), norm. avg. (of 7) = 0.740093 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.49 s, 4 iters, t-(init.)=1.38 s t(norm)=0.0901333, mflops=55.4734 (err=5.9e-16) 1. PDA: elapsed time t=1.71 s, 4 iters, t-(init.)=1.6 s t(norm)=0.104502, mflops=47.8458 (err=6.0e-16) 2. PDA (f2c): elapsed time t=1.82 s, 4 iters, t-(init.)=1.71 s t(norm)=0.111687, mflops=44.768 (err=6.0e-16) 3. Singleton: elapsed time t=1.49 s, 1 iters, t-(init.)=1.46 s t(norm)=0.381434, mflops=13.1084 (err=6.5e-16) 4. Singleton (f2c): elapsed time t=1.5 s, 1 iters, t-(init.)=1.48 s t(norm)=0.386659, mflops=12.9313 (err=7.4e-16) 5. Temperton: elapsed time t=1.04 s, 2 iters, t-(init.)=0.98 s t(norm)=0.128015, mflops=39.0578 (err=1.8e-08) 6. 
Temperton (f2c): elapsed time t=1.12 s, 2 iters, t-(init.)=1.07 s t(norm)=0.139772, mflops=35.7726 (err=5.3e-16) Top mflops for N=216000 = 55.4734 Normalized results and averages for N=216000: fft 0: mflops = 55.4734 (norm. = 1), norm. avg. (of 14) = 0.925458 fft 1: mflops = 47.8458 (norm. = 0.8625), norm. avg. (of 14) = 0.35704 fft 2: mflops = 44.768 (norm. = 0.807018), norm. avg. (of 14) = 0.342477 fft 3: mflops = 13.1084 (norm. = 0.236301), norm. avg. (of 14) = 0.56358 fft 4: mflops = 12.9313 (norm. = 0.233108), norm. avg. (of 14) = 0.553807 fft 5: mflops = 39.0578 (norm. = 0.704082), norm. avg. (of 8) = 0.8861 fft 6: mflops = 35.7726 (norm. = 0.64486), norm. avg. (of 8) = 0.728188 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.71 s, 4 iters, t-(init.)=1.59 s t(norm)=0.0918748, mflops=54.4219 (err=5.9e-16) 1. PDA: elapsed time t=1.39 s, 2 iters, t-(init.)=1.33 s t(norm)=0.153703, mflops=32.5304 (err=6.9e-16) 2. PDA (f2c): elapsed time t=1.45 s, 2 iters, t-(init.)=1.38 s t(norm)=0.159481, mflops=31.3517 (err=6.9e-16) 3. Singleton: elapsed time t=1.78 s, 1 iters, t-(init.)=1.75 s t(norm)=0.40448, mflops=12.3615 (err=7.0e-16) 4. Singleton (f2c): elapsed time t=1.8 s, 1 iters, t-(init.)=1.77 s t(norm)=0.409103, mflops=12.2219 (err=7.3e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 54.4219 Normalized results and averages for N=241920: fft 0: mflops = 54.4219 (norm. = 1), norm. avg. (of 15) = 0.930427 fft 1: mflops = 32.5304 (norm. = 0.597744), norm. avg. (of 15) = 0.373087 fft 2: mflops = 31.3517 (norm. = 0.576087), norm. avg. (of 15) = 0.358051 fft 3: mflops = 12.3615 (norm. = 0.227143), norm. avg. (of 15) = 0.541151 fft 4: mflops = 12.2219 (norm. = 0.224576), norm. avg. (of 15) = 0.531858 fft 5: mflops = -1 (norm. = -0.018375), norm. avg. (of 8) = 0.8861 fft 6: mflops = -1 (norm. = -0.018375), norm. avg. 
(of 8) = 0.728188 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.46 s, 2 iters, t-(init.)=1.36 s t(norm)=0.0862578, mflops=57.9658 (err=9.6e-16) 1. PDA: elapsed time t=1.61 s, 2 iters, t-(init.)=1.5 s t(norm)=0.0951372, mflops=52.5557 (err=1.1e-15) 2. PDA (f2c): elapsed time t=1.66 s, 2 iters, t-(init.)=1.55 s t(norm)=0.0983085, mflops=50.8603 (err=1.0e-15) 3. Singleton: elapsed time t=2.63 s, 1 iters, t-(init.)=2.58 s t(norm)=0.327272, mflops=15.2778 (err=1.1e-15) 4. Singleton (f2c): elapsed time t=2.61 s, 1 iters, t-(init.)=2.56 s t(norm)=0.324735, mflops=15.3972 (err=1.3e-15) 5. Temperton: elapsed time t=1.01 s, 1 iters, t-(init.)=0.95 s t(norm)=0.120507, mflops=41.4913 (err=3.3e-08) 6. Temperton (f2c): elapsed time t=1.12 s, 1 iters, t-(init.)=1.07 s t(norm)=0.135729, mflops=36.8381 (err=9.4e-16) Top mflops for N=421875 = 57.9658 Normalized results and averages for N=421875: fft 0: mflops = 57.9658 (norm. = 1), norm. avg. (of 16) = 0.934776 fft 1: mflops = 52.5557 (norm. = 0.906667), norm. avg. (of 16) = 0.406436 fft 2: mflops = 50.8603 (norm. = 0.877419), norm. avg. (of 16) = 0.390512 fft 3: mflops = 15.2778 (norm. = 0.263566), norm. avg. (of 16) = 0.523802 fft 4: mflops = 15.3972 (norm. = 0.265625), norm. avg. (of 16) = 0.515219 fft 5: mflops = 41.4913 (norm. = 0.715789), norm. avg. (of 9) = 0.867177 fft 6: mflops = 36.8381 (norm. = 0.635514), norm. avg. (of 9) = 0.717891 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.22 s, 1 iters, t-(init.)=1.15 s t(norm)=0.118429, mflops=42.2195 (err=1.5e-15) 1. PDA: elapsed time t=1.58 s, 1 iters, t-(init.)=1.51 s t(norm)=0.155502, mflops=32.1539 (err=1.5e-15) 2. PDA (f2c): elapsed time t=1.67 s, 1 iters, t-(init.)=1.6 s t(norm)=0.16477, mflops=30.3453 (err=1.5e-15) 3. Singleton: elapsed time t=3.78 s, 1 iters, t-(init.)=3.71 s t(norm)=0.382061, mflops=13.0869 (err=1.7e-15) 4. 
Singleton (f2c): elapsed time t=3.79 s, 1 iters, t-(init.)=3.72 s t(norm)=0.383091, mflops=13.0517 (err=2.3e-15) 5. Temperton: elapsed time t=1.55 s, 1 iters, t-(init.)=1.48 s t(norm)=0.152413, mflops=32.8057 (err=2.4e-08) 6. Temperton (f2c): elapsed time t=1.68 s, 1 iters, t-(init.)=1.62 s t(norm)=0.16683, mflops=29.9706 (err=1.6e-15) Top mflops for N=512000 = 42.2195 Normalized results and averages for N=512000: fft 0: mflops = 42.2195 (norm. = 1), norm. avg. (of 17) = 0.938612 fft 1: mflops = 32.1539 (norm. = 0.761589), norm. avg. (of 17) = 0.427327 fft 2: mflops = 30.3453 (norm. = 0.71875), norm. avg. (of 17) = 0.40982 fft 3: mflops = 13.0869 (norm. = 0.309973), norm. avg. (of 17) = 0.511224 fft 4: mflops = 13.0517 (norm. = 0.30914), norm. avg. (of 17) = 0.503096 fft 5: mflops = 32.8057 (norm. = 0.777027), norm. avg. (of 10) = 0.858162 fft 6: mflops = 29.9706 (norm. = 0.709877), norm. avg. (of 10) = 0.71709 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.08 s, 1 iters, t-(init.)=1 s t(norm)=0.0879797, mflops=56.8313 (err=7.1e-16) 1. PDA: elapsed time t=1.57 s, 1 iters, t-(init.)=1.49 s t(norm)=0.13109, mflops=38.1418 (err=7.0e-16) 2. PDA (f2c): elapsed time t=1.63 s, 1 iters, t-(init.)=1.55 s t(norm)=0.136369, mflops=36.6653 (err=7.0e-16) 3. Singleton: elapsed time t=4.76 s, 1 iters, t-(init.)=4.69 s t(norm)=0.412625, mflops=12.1175 (err=9.9e-16) 4. Singleton (f2c): elapsed time t=4.82 s, 1 iters, t-(init.)=4.74 s t(norm)=0.417024, mflops=11.9897 (err=9.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 56.8313 Normalized results and averages for N=592704: fft 0: mflops = 56.8313 (norm. = 1), norm. avg. (of 18) = 0.942023 fft 1: mflops = 38.1418 (norm. = 0.671141), norm. avg. (of 18) = 0.440872 fft 2: mflops = 36.6653 (norm. = 0.645161), norm. avg. (of 18) = 0.422894 fft 3: mflops = 12.1175 (norm. = 0.21322), norm. avg. 
(of 18) = 0.494668 fft 4: mflops = 11.9897 (norm. = 0.21097), norm. avg. (of 18) = 0.486867 fft 5: mflops = -1 (norm. = -0.0175959), norm. avg. (of 10) = 0.858162 fft 6: mflops = -1 (norm. = -0.0175959), norm. avg. (of 10) = 0.71709 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=2 s, 1 iters, t-(init.)=1.89 s t(norm)=0.108137, mflops=46.2377 (err=7.3e-16) 1. PDA: elapsed time t=2.85 s, 1 iters, t-(init.)=2.74 s t(norm)=0.15677, mflops=31.8939 (err=6.7e-16) 2. PDA (f2c): elapsed time t=3.03 s, 1 iters, t-(init.)=2.92 s t(norm)=0.167069, mflops=29.9278 (err=6.7e-16) 3. Singleton: elapsed time t=8.03 s, 1 iters, t-(init.)=7.91 s t(norm)=0.452573, mflops=11.048 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=8.04 s, 1 iters, t-(init.)=7.93 s t(norm)=0.453717, mflops=11.0201 (err=8.9e-16) 5. Temperton: elapsed time t=3.55 s, 1 iters, t-(init.)=3.44 s t(norm)=0.19682, mflops=25.4039 (err=1.6e-08) 6. Temperton (f2c): elapsed time t=3.58 s, 1 iters, t-(init.)=3.46 s t(norm)=0.197965, mflops=25.257 (err=7.2e-16) Top mflops for N=884736 = 46.2377 Normalized results and averages for N=884736: fft 0: mflops = 46.2377 (norm. = 1), norm. avg. (of 19) = 0.945074 fft 1: mflops = 31.8939 (norm. = 0.689781), norm. avg. (of 19) = 0.453973 fft 2: mflops = 29.9278 (norm. = 0.64726), norm. avg. (of 19) = 0.434703 fft 3: mflops = 11.048 (norm. = 0.238938), norm. avg. (of 19) = 0.481208 fft 4: mflops = 11.0201 (norm. = 0.238335), norm. avg. (of 19) = 0.473786 fft 5: mflops = 25.4039 (norm. = 0.549419), norm. avg. (of 11) = 0.830094 fft 6: mflops = 25.257 (norm. = 0.546243), norm. avg. (of 11) = 0.701558 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=2.04 s, 1 iters, t-(init.)=1.9 s t(norm)=0.081483, mflops=61.3625 (err=7.4e-16) 1. PDA: elapsed time t=3 s, 1 iters, t-(init.)=2.85 s t(norm)=0.122225, mflops=40.9083 (err=7.6e-16) 2. PDA (f2c): elapsed time t=3.08 s, 1 iters, t-(init.)=2.93 s t(norm)=0.125655, mflops=39.7914 (err=7.8e-16) 3. 
Singleton: elapsed time t=8.46 s, 1 iters, t-(init.)=8.31 s t(norm)=0.356381, mflops=14.0299 (err=9.1e-16) 4. Singleton (f2c): elapsed time t=8.51 s, 1 iters, t-(init.)=8.36 s t(norm)=0.358525, mflops=13.946 (err=9.7e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 61.3625 Normalized results and averages for N=1157625: fft 0: mflops = 61.3625 (norm. = 1), norm. avg. (of 20) = 0.947821 fft 1: mflops = 40.9083 (norm. = 0.666667), norm. avg. (of 20) = 0.464608 fft 2: mflops = 39.7914 (norm. = 0.648464), norm. avg. (of 20) = 0.445391 fft 3: mflops = 14.0299 (norm. = 0.22864), norm. avg. (of 20) = 0.46858 fft 4: mflops = 13.946 (norm. = 0.227273), norm. avg. (of 20) = 0.461461 fft 5: mflops = -1 (norm. = -0.0162966), norm. avg. (of 11) = 0.830094 fft 6: mflops = -1 (norm. = -0.0162966), norm. avg. (of 11) = 0.701558 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=3.42 s, 1 iters, t-(init.)=3.24 s t(norm)=0.112925, mflops=44.2771 (err=5.4e-16) 1. PDA: elapsed time t=5.21 s, 1 iters, t-(init.)=5.04 s t(norm)=0.175662, mflops=28.4638 (err=6.0e-16) 2. PDA (f2c): elapsed time t=5.4 s, 1 iters, t-(init.)=5.22 s t(norm)=0.181935, mflops=27.4823 (err=6.0e-16) 3. Singleton: elapsed time t=11.29 s, 1 iters, t-(init.)=11.11 s t(norm)=0.387222, mflops=12.9125 (err=6.9e-16) 4. Singleton (f2c): elapsed time t=11.43 s, 1 iters, t-(init.)=11.25 s t(norm)=0.392102, mflops=12.7518 (err=6.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 44.2771 Normalized results and averages for N=1404928: fft 0: mflops = 44.2771 (norm. = 1), norm. avg. (of 21) = 0.950305 fft 1: mflops = 28.4638 (norm. = 0.642857), norm. avg. (of 21) = 0.473096 fft 2: mflops = 27.4823 (norm. = 0.62069), norm. avg. (of 21) = 0.453739 fft 3: mflops = 12.9125 (norm. = 0.291629), norm. avg. 
(of 21) = 0.460154 fft 4: mflops = 12.7518 (norm. = 0.288), norm. avg. (of 21) = 0.453201 fft 5: mflops = -1 (norm. = -0.0225851), norm. avg. (of 11) = 0.830094 fft 6: mflops = -1 (norm. = -0.0225851), norm. avg. (of 11) = 0.701558 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=4.01 s, 1 iters, t-(init.)=3.8 s t(norm)=0.106129, mflops=47.1123 (err=6.3e-16) 1. PDA: elapsed time t=5.4 s, 1 iters, t-(init.)=5.18 s t(norm)=0.144671, mflops=34.5611 (err=7.4e-16) 2. PDA (f2c): elapsed time t=5.78 s, 1 iters, t-(init.)=5.56 s t(norm)=0.155284, mflops=32.199 (err=7.4e-16) 3. Singleton: elapsed time t=19.84 s, 1 iters, t-(init.)=19.62 s t(norm)=0.547963, mflops=9.1247 (err=7.4e-16) 4. Singleton (f2c): elapsed time t=19.92 s, 1 iters, t-(init.)=19.7 s t(norm)=0.550198, mflops=9.08764 (err=8.0e-16) 5. Temperton: elapsed time t=5.9 s, 1 iters, t-(init.)=5.67 s t(norm)=0.158356, mflops=31.5744 (err=1.8e-08) 6. Temperton (f2c): elapsed time t=6.23 s, 1 iters, t-(init.)=6.01 s t(norm)=0.167852, mflops=29.7881 (err=5.6e-16) Top mflops for N=1728000 = 47.1123 Normalized results and averages for N=1728000: fft 0: mflops = 47.1123 (norm. = 1), norm. avg. (of 22) = 0.952564 fft 1: mflops = 34.5611 (norm. = 0.733591), norm. avg. (of 22) = 0.484936 fft 2: mflops = 32.199 (norm. = 0.683453), norm. avg. (of 22) = 0.46418 fft 3: mflops = 9.1247 (norm. = 0.19368), norm. avg. (of 22) = 0.448041 fft 4: mflops = 9.08764 (norm. = 0.192893), norm. avg. (of 22) = 0.441369 fft 5: mflops = 31.5744 (norm. = 0.670194), norm. avg. (of 12) = 0.816769 fft 6: mflops = 29.7881 (norm. = 0.63228), norm. avg. (of 12) = 0.695785 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=7.3 s, 1 iters, t-(init.)=6.92 s t(norm)=0.107741, mflops=46.4074 (err=1.1e-15) 1. PDA: elapsed time t=9.39 s, 1 iters, t-(init.)=9.01 s t(norm)=0.140282, mflops=35.6425 (err=1.2e-15) 2. PDA (f2c): elapsed time t=9.96 s, 1 iters, t-(init.)=9.58 s t(norm)=0.149156, mflops=33.5218 (err=1.2e-15) 3. 
Singleton: elapsed time t=31.05 s, 1 iters, t-(init.)=30.67 s t(norm)=0.477519, mflops=10.4708 (err=1.3e-15) 4. Singleton (f2c): elapsed time t=31.29 s, 1 iters, t-(init.)=30.91 s t(norm)=0.481255, mflops=10.3895 (err=1.6e-15) 5. Temperton: elapsed time t=12.42 s, 1 iters, t-(init.)=12.04 s t(norm)=0.187458, mflops=26.6727 (err=3.0e-08) 6. Temperton (f2c): elapsed time t=12.58 s, 1 iters, t-(init.)=12.2 s t(norm)=0.189949, mflops=26.3229 (err=1.2e-15) Top mflops for N=2985984 = 46.4074 Normalized results and averages for N=2985984: fft 0: mflops = 46.4074 (norm. = 1), norm. avg. (of 23) = 0.954627 fft 1: mflops = 35.6425 (norm. = 0.768036), norm. avg. (of 23) = 0.497245 fft 2: mflops = 33.5218 (norm. = 0.722338), norm. avg. (of 23) = 0.475404 fft 3: mflops = 10.4708 (norm. = 0.225628), norm. avg. (of 23) = 0.438371 fft 4: mflops = 10.3895 (norm. = 0.223876), norm. avg. (of 23) = 0.431912 fft 5: mflops = 26.6727 (norm. = 0.574751), norm. avg. (of 13) = 0.798152 fft 6: mflops = 26.3229 (norm. = 0.567213), norm. avg. 
(of 13) = 0.685895
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Nielsen, NR (C), NR (F), Ooura (C), Ooura (F), QFT, Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg, ESSL
2, 9.44663, 9.11805, 6.51289, 0.39961, 3.64089, 1.17029, 1.49797, 2.57004, 5.95782, 1.58875, 1.53301, , 4.29744, 4.72332, 11.1551, 11.275, 21.6201, , 4.68114, 2.91271, 2.73067, 6.80894, , , , , 1.62822, 0.910222, 3.1775, 2.92898, 8.2565, 8.81156, , , , 2.20289, 2.18453, 8.192, 5.69878, 1.82044, 1.62822, 3.12076,
4, 19.5996, 19.9729, 10.4858, 1.91346, 7.59838, 4.0022, 5.48993, 6.43298, 8.0044, 6.0263, 3.85506, 8.192, 14.1699, 14.3641, 35.5449, 36.1578, 52.4288, , 10.5917, 5.76141, 6.0263, 19.065, 11.7159, 11.8483, 10.2802, 2.47306, 3.51871, 3.56659, 6.85344, 5.89088, 16.384, 16.384, , 1.60825, 14.5636, 8.52501, 8.81156, 12.6334, 7.71012, 6.35501, 5.46133, 3.42672,
8, 32.0993, 32.0993, 14.6997, 2.62144, 15.5729, 4.59902, 10.773, 9.59063, 9.59063, 15.8875, 11.3976, 9.36229, 26.4347, 25.9978, 63.5501, 64.8604, 86.7787, 30.541, 19.5387, 11.2347, 11.3156, 34.9525, 20.1649, 21.1123, 19.065, 6.39376, 6.39376, 9.30689, 12.8923, 10.5561, 32.0993, 30.541, , 1.85479, 22.9615, 8.93673, 9.47508, 18.9502, 9.64947, 12.1927, 9.19804, 3.74491, 20.8326
16, 25.8908, 24.2445, 17.9244, 4.94611, 23.4319, 5.89088, 21.6201, 14.5636, 11.3976, 31.775, 23.5635, 11.5865, 49.9322, 45.5903, 97.542, 97.542, 97.542, 37.7865, 29.7468, 16.7772, 17.9244, 50.5338, 19.5996, 23.0456, 21.1834, 11.8483, 9.19804, 10.8101, 20.1649, 15.8875, 45.1, 45.1, 32.768, 6.89853, 32.514, 25.1156, 24.8184, 24.3855, 11.3976, 18.5589, 13.3577, 3.97188, 43.2402
32, 29.7891, 28.3399, 21.4872, 6.30154, 38.8361, 6.12486, 32.3635, 19.1346, 13.4433, 41.2825, 49.4611, 13.7248, 54.6133, 54.0503, 119.156, 119.156, 131.896, 57.2992, 31.023, 21.4872, 23.8313, 58.9088, 24.0499, 30.3057, 28.6496, 19.4181, 12.1363, 16.6971, 28.3399, 21.4872, 60.263, 58.9088, 32.768, 6.72164, 42.9744, 33.6082, 33.825, 35.1871, 13.1072, 20.3212, 15.2409, 4.17427, 70.3742
64, 32.768, 30.8405, 24.3855, 9.36229, 46.6034, 6.29146, 42.5098, 23.4756, 15.4202, 52.8694, 64.1985, 15.5729, 76.7251, 70.6905, 123.362, 99.078, 66.2259, 87.9924, 39.3216, 24.7695, 28.5975, 61.0821, 25.7847, 34.3795, 32.0993, 26.2144, 14.4299, 21.9981, 35.7469, 25.9978, 68.7591, 70.2956, 33.4652, 14.4299, 53.7731, 46.2607, 46.2607, 45.5903, 15.1237, 26.6587, 22.153, 4.29744, 79.6387
128, 35.2886, 33.0632, 27.3882, 9.97287, 61.6809, 6.41611, 48.2897, 26.9854, 17.3114, 64.3862, 87.3813, 17.6443, 95.3251, 81.1053, 106.377, 103.381, 76.4587, 94.103, 43.9523, 27.1853, 33.0632, 62.7353, 28.672, 39.6758, 37.0709, 32.768, 16.0966, 23.6775, 41.7047, 30.5835, 75.6704, 76.4587, 33.6699, 13.5927, 60.6614, 46.1637, 46.1637, 58.2542, 16.8349, 28.0154, 20.7346, 4.41108, 104.858
256, 39.9458, 36.1578, 29.7468, 11.5865, 62.6016, 6.39376, 57.0654, 30.1748, 19.2399, 75.573, 111.848, 19.4181, 111.848, 89.2405, 114.912, 114.131, 94.254, 104.206, 49.3448, 28.5327, 36.1578, 57.8525, 30.8405, 42.799, 39.9458, 37.7865, 17.05, 26.0516, 45.5903, 33.5544, 82.2413, 84.7334, 34.1, 21.1834, 66.052, 58.2542, 56.6798, 68.7591, 18.7246, 30.1748, 24.2445, 4.63972, 101.068
512, 41.0312, 37.7487, 31.6684, 12.6844, 75.4975, 6.48158, 63.3368, 32.768, 21.0651, 88.198, 104.858, 21.2549, 94.3718, 81.355, 117.965, 117.965, 85.7926, 117.965, 45.3711, 29.1271, 38.0532, 60.4948, 33.4652, 46.2607, 43.2898, 41.7575, 18.0099, 27.9207, 48.6453, 36.864, 83.5149, 86.5797, 35.2134, 19.6608, 69.3911, 60.8851, 58.616, 77.9933, 20.3388, 27.7564, 21.6449, 4.57228, 116.508
1024, 45.9902, 41.2825, 33.825, 14.6449, 70.8497, 6.4251, 64.7269, 35.4249, 22.9951, 85.2501, 85.2501, 22.9951, 102.802, 85.9489, 119.156, 103.819, 72.3156, 124.83, 48.9989, 29.9593, 39.7188, 32.3635, 35.6659, 48.9989, 44.8109, 45.5903, 18.2044, 28.6496, 50.9017, 38.8361, 86.6592, 92.7943, 34.9525, 26.8866, 70.8497, 65.1289, 62.789, 83.2203, 22.2156, 31.023, 26.2144, 4.61521, 117.159
2048, 46.1373, 41.4904, 36.5011, 15.0973, 58.2542, 6.67496, 69.9051, 37.4491, 24.646, 72.0896, 92.2747, 24.8585, 66.6725, 62.0126, 75.3878, 74.4151, 53.3997, 121.414, 41.1941, 30.3535, 40.9019, 32.2188, 38.1932, 52.4288, 48.8743, 47.6625, 17.9105, 28.8358, 52.9098, 40.6139, 88.7257, 91.5423, 26.9494, 24.0299, 16.2914, 62.6866, 60.707, 88.7257, 23.6359, 30.6764, 23.4438, 4.59169, 73.9381
4096, 46.6034, 41.6653, 33.825, 15.7286, 17.7725, 6.44616, 66.5763, 37.9003, 24.1979, 69.9051, 84.4491, 23.8313, 31.4573, 27.3542, 58.7987, 58.7987, 37.0086, 98.304, 29.1271, 30.541, 40.59, 28.8599, 39.0774, 52.4288, 46.2607, 49.152, 13.1072, 24.9661, 53.3174, 40.8536, 77.1958, 82.7823, 24.1979, 29.1271, 10.6275, 66.5763, 63.5501, 61.0821, 23.3017, 26.8866, 21.5461, 4.25098, 57.7198
8192, 11.9995, 11.7513, 9.41401, 12.171, 16.0749, 6.08549, 22.2737, 15.7772, 8.11398, 53.248, 59.2673, 8.35263, 27.4828, 24.3419, 53.248, 36.6438, 25.2435, 24.875, 26.0143, 10.2647, 10.4536, , 38.7258, 52.8352, 39.6264, 10.9227, 13.2088, 11.5131, 11.2101, 10.7844, 42.5984, 43.1376, 19.9291, 21.5688, 11.0645, 19.2535, 18.828, 16.0749, 8.27153, 14.6891, 13.0071, 3.9443, 58.7564
16384, 11.6879, 11.614, 8.6557, 12.9226, 15.6838, 6.11669, 21.7161, 15.819, 7.64587, 53.9708, 53.9708, 7.9783, 24.4668, 21.092, 54.7764, 36.3368, 21.5883, 23.5257, 28.0154, 9.97287, 10.1382, , 14.7985, 15.6838, 15.041, 11.1213, 13.6941, 11.9156, 10.7942, 10.309, 43.1767, 44.485, 15.4202, 26.9854, 9.4103, 19.9457, 19.7313, 14.0077, 7.77546, 15.041, 13.9016, 3.92096, 59.6751
32768, 11.0454, 10.7436, 8.62316, 10.9837, 17.2463, 6.10584, 22.3418, 15.9844, 7.56185, 51.7389, 52.4288, 7.92774, 28.0869, 23.8313, 54.6133, 40.1241, 21.8453, 23.8313, 29.3445, 9.8304, 10.031, , 14.4565, 15.36, 14.6722, 10.0825, 13.3747, 11.7729, 10.4579, 10.0825, 42.9744, 43.6907, 12.7668, 24.576, 9.8304, 16.5217, 16.2486, 11.8439, 7.74047, 14.1445, 12.6031, 3.90095, 51.4008
65536, 11.096, 10.8661, 8.32203, 13.981, 16.6441, 6.09637, 20.9715, 16.0088, 7.48983, 53.4306, 53.4306, 7.76723, 24.5281, 20.7639, 52.1032, 39.9458, 20.5603, 22.4294, 30.6154, 9.70904, 9.79978, , 14.2663, 15.0874, 14.2663, 10.2802, 13.7971, 11.0376, 10.3819, 9.89223, 43.6907, 45.8394, 12.483, 24.1052, 9.03945, 19.7845, 19.9729, 10.9227, 7.54371, 15.3077, 13.3577, 3.94202, 50.84
131072, 10.5105, 10.3159, 8.31427, 11.9156, 15.582, 6.12149, 20.8245, 16.2644, 7.37823, 51.8192, 51.5196, 7.6309, 22.2822, 18.5685, 54.3469, 31.164, 16.1466, 21.4252, 25.1777, 9.44163, 9.68793, , 13.9264, 14.7565, 14.014, 9.52232, 13.5867, 11.1411, 10.3159, 9.94743, 44.1232, 44.5645, 12.243, 22.737, 8.63653, 18.1156, 18.4151, 10.2212, 7.6309, 13.8399, 12.6604, 3.84177, 50.6415
262144, 9.70904, 9.62978, 7.66005, 13.4051, 11.5088, 6.04948, 19.6608, 14.654, 6.79912, 34.9525, 35.2134, 7.14938, 16.271, 14.2126, 44.099, 26.9634, 14.4742, 19.826, 20.8787, 9.07422, 9.216, , 13.5592, 14.2988, 13.5592, 9.43718, 13.7168, 10.3026, 9.66925, 9.36229, 40.3298, 41.3912, 11.9156, 26.3608, 7.56185, 16.9734, 17.0963, 9.39959, 7.00088, 13.1072, 11.9156, 3.53718, 50.7375
Norm. Avg., 0.298903, 0.283353, 0.213164, 0.129527, 0.337382, 0.0748423, 0.371152, 0.240014, 0.15767, 0.601655, 0.661379, 0.1533, 0.514524, 0.4533, 0.870217, 0.749291, 0.630753, 0.633348, 0.369523, 0.195323, 0.225398, 0.413784, 0.283772, 0.351795, 0.31688, 0.232915, 0.15968, 0.181345, 0.270051, 0.219514, 0.6411, 0.658186, 0.269181, 0.243194, 0.304828, 0.357048, 0.351844, 0.380617, 0.155285, 0.222259, 0.18501, 0.0564759, 0.791622
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Nielsen, Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg, ESSL
6, , 9.77354, 7.58544, 17.6774, 18.6504, 48.9855, 48.9855, 9.49952, 12.7855, 5.87543, 3.60443, 6.47419, 6.68716, 7.70037, 6.43322, 4.2002, 13.8292
9, 3.8952, 17.6387, 13.953, 27.0971, 27.6992, 56.6575, 57.09, 8.90332, 16.1181, 8.90332, 6.11012, 13.1669, 12.8945, 13.6474, 9.99838, 4.62796,
12, , 22.9214, 21.3586, 37.5911, 37.8433, 77.7746, 77.7746, 13.9571, 21.0398, 8.70163, 8.24365, 13.5545, 13.9571, 16.8822, 14.8386, 4.35082, 32.9746
15, 5.81914, 28.8769, 29.0957, 42.9121, 40.4277, 82.5942, 82.5942, 11.7092, 19.013, 7.68126, 10.0016, 14.4385, 15.1206, 19.3971, 15.2406, 4.57218,
18, 5.39369, 31.9418, 27.4807, 34.8868, 38.1321, 50.7118, 50.1943, 9.99806, 24.8437, 11.9394, 8.19841, 20.16, 20.3266, 19.52, 15.2765, 4.69374, 43.92
24, , 47.1341, 40.064, 45.072, 47.1341, 66.1608, 66.7734, 18.5864, 31.0841, 12.4337, 13.3547, 17.3354, 18.2109, 23.7221, 20.8426, 4.5072, 53.4187
36, 7.12464, 56.4694, 56.9971, 55.9513, 56.4694, 72.6035, 72.6035, 12.6006, 34.2623, 15.0958, 14.1173, 31.764, 30.8015, 31.764, 27.4716, 4.7646, 59.7911
80, 12.4046, 79.6757, 84.5539, 80.4493, 70.2227, 88.1519, 87.224, 26.5586, 32.6231, 10.5156, 23.6751, 45.0341, 46.8151, 33.9602, 25.5749, 4.6657, 101.052
108, 8.25455, 65.3147, 87.2451, 83.5845, 74.7037, 82.4316, 82.4316, 12.049, 37.5867, 19.789, 18.4453, 37.1198, 37.8246, 42.6878, 36.0018, 4.91471,
210, 11.2848, 103.679, 103.679, 63.8027, 57.2025, 79.4669, 75.8341, 14.3006, 37.0698, 9.47926, 21.6846, 25.9199, 27.8802, , , 4.41189,
504, 12.3881, 132.375, 131.437, 65.2555, 59.0209, 82.7346, 81.2831, 16.4296, 45.8727, 12.8698, 21.8544, 36.1964, 36.1964, , , 4.63314,
1000, 13.4276, 71.8659, 119.356, 91.9366, 80.354, 82.9672, 82.2981, 18.6222, 33.133, 9.73756, 31.1127, 50.0243, 54.8654, 36.1878, 30.3719, 4.65555,
1960, 14.8313, 95.436, 95.436, 52.7651, 46.114, 59.6475, 57.4614, 17.7018, 35.8665, 8.79418, 24.0683, 40.3498, 39.4789, , , 4.08301,
4725, 7.82018, 68.3542, 88.4102, 29.064, 27.9631, 60.0183, 49.5453, 10.73, 29.7671, 9.22781, 18.642, 34.8219, 35.8362, , , 4.27213,
10368, 7.68375, 59.8087, 61.47, 39.8725, 35.6923, 54.3048, 53.6466, 13.5762, 31.8406, 15.0539, 15.3675, 21.9101, 21.4847, 14.2769, 14.1854, 3.95164,
27000, 6.91232, 64.2357, 62.9637, 41.5642, 36.7591, 52.1257, 50.8746, 11.1177, 29.1712, 10.257, 15.2869, 19.5071, 19.7495, 14.0693, 13.5883, 3.97458,
75600, 6.88304, 53.2687, 52.696, 26.7799, 24.261, 54.7567, 52.1354, 11.138, 28.4926, 10.2956, 14.1639, 17.1354, 17.1354, , , 3.71267,
165375, 5.59929, 48.1822, 48.5905, 14.3342, 14.1923, 48.5905, 45.1471, 7.79032, 30.4983, 8.63506, 11.9452, 16.6677, 16.5713, , , 3.55687,
362880, 5.85846, 21.9022, 21.76, 26.3861, 24.4601, 50.3916, 44.9804, 9.76979, 32.5344, 11.3595, 10.2166, 11.758, 11.7169, , , 3.33769,
Norm. Avg., 0.106821, 0.713197, 0.745111, 0.583672, 0.551183, 0.888725, 0.86494, 0.178818, 0.404472, 0.152827, 0.194879, 0.315068, 0.320211, 0.300473, 0.253004, 0.0603871, 0.699315
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM, HARM (f2c), NR (C), NR (F), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
4x4x4, 64.1985, , , 30.2474, 24.9661, 9.8304, 9.30689, 52.8694, 49.9322, 38.13, 32.768
8x8x8, 90.7422, 66.9304, 52.7217, 41.7575, 39.652, 20.3388, 19.994, 45.8116, 48.8973, 85.7926, 71.4938
16x16x16, 107.546, 86.1843, 68.7591, 43.0922, 42.5098, 36.1578, 33.1129, 64.8604, 61.6809, 113.36, 91.8461
32x32x32, 51.067, 44.4312, 40.96, 9.8304, 9.45231, 23.9766, 23.9766, 14.8945, 14.6722, 27.6913, 26.7494
64x64x64, 48.6453, 46.2607, 43.2898, 9.10925, 8.70589, 25.9263, 24.8347, 15.8342, 15.8342, 29.6767, 28.9484
256x64x32, 39.2184, 42.9374, 39.5297, 7.29244, 7.14596, 22.5373, 21.9416, 12.5459, 12.5776, 23.8313, 24.1783
16x1024x64, 32.3635, 42.1115, 39.8698, 7.15263, 6.99984, 18.5589, 17.9859, 12.6946, 12.6487, ,
128x128x128, 30.7973, 42.5921, 39.8915, 7.10784, 6.95298, 20.2763, 19.4868, 9.15597, 9.17504, 18.0197, 15.5839
Norm. Avg., 0.919212, 0.9027, 0.811936, 0.274725, 0.257603, 0.392545, 0.379136, 0.415791, 0.410285, 0.667116, 0.593773
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
5x5x5, 54.4501, 12.8291, 11.6552, 79.6979, 87.521, 53.6313, 37.7405
6x6x6, 72.9897, 13.8327, 13.9452, 38.1168, 38.5451, 59.6611, 47.646
7x7x7, 61.9497, 8.75177, 7.95188, 43.1839, 38.169, ,
9x9x9, 73.948, 21.5121, 22.0466, 59.1584, 56.3413, 85.0181, 75.1218
10x10x10, 78.4997, 24.5312, 22.5774, 45.9683, 50.5196, 92.7724, 63.781
11x11x11, 51.6242, 10.2204, 9.82294, 42.3504, 38.4376, ,
12x12x12, 97.0942, 29.9221, 29.0099, 55.3211, 56.3032, 121.213, 105.725
13x13x13, 45.2445, 10.2693, 10.006, 39.769, 36.5131, ,
14x14x14, 75.6942, 16.1766, 15.6711, 35.8196, 34.2888, ,
15x15x15, 97.3717, 33.9821, 32.6666, 51.6666, 54.7387, 119.137, 84.3888
24x25x28, 40.5691, 31.9739, 30.4268, 21.1962, 20.9607, ,
48x48x48, 39.216, 27.4512, 27.2494, 14.1447, 14.1447, 37.4335, 37.0591
49x49x49, 48.334, 23.3141, 22.0188, 18.6952, 17.5371, ,
60x60x60, 55.4734, 47.8458, 44.768, 13.1084, 12.9313, 39.0578, 35.7726
72x60x56, 54.4219, 32.5304, 31.3517, 12.3615, 12.2219, ,
75x75x75, 57.9658, 52.5557, 50.8603, 15.2778, 15.3972, 41.4913, 36.8381
80x80x80, 42.2195, 32.1539, 30.3453, 13.0869, 13.0517, 32.8057, 29.9706
84x84x84, 56.8313, 38.1418, 36.6653, 12.1175, 11.9897, ,
96x96x96, 46.2377, 31.8939, 29.9278, 11.048, 11.0201, 25.4039, 25.257
105x105x105, 61.3625, 40.9083, 39.7914, 14.0299, 13.946, ,
112x112x112, 44.2771, 28.4638, 27.4823, 12.9125, 12.7518, ,
120x120x120, 47.1123, 34.5611, 32.199, 9.1247, 9.08764, 31.5744, 29.7881
144x144x144, 46.4074, 35.6425, 33.5218, 10.4708, 10.3895, 26.6727, 26.3229
Norm. Avg., 0.954627, 0.497245, 0.475404, 0.438371, 0.431912, 0.798152, 0.685895
@@@@ end