To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Joachim Wesner
@ submitter email = joachim.wesner@frankfurt.netsurf.de
@ submitter organization = NONE
@ computer manufacturer = Digital
@ computer model = AlphaPC 164LX
@ CPU manufacturer = Digital/Samsung
@ CPU model = 21164A
@ CPU speed = 533 MHz
@ RAM = 64 MB
@ L2 cache size = 2MB
@ operating system = Linux 2.0.30 (Redhat 4.2)
@ C compiler = gcc 2.7.2
@ C compiler flags = -pedantic -ansi -O2 -Wall
@ Fortran compiler = fort77 (using f2c, patched so INTEGER = INTEGER*4)
@ Fortran compiler flags = -O2
@ remarks = -O3 seems to give slightly inferior results for small/intermediate sizes
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
  2 (0.000228882 MB)
  4 (0.000534058 MB)
  8 (0.000839233 MB)
  16 (0.00164795 MB)
  32 (0.00297546 MB)
  64 (0.00616455 MB)
  128 (0.0119019 MB)
  256 (0.0238037 MB)
  512 (0.0476074 MB)
  1024 (0.0939941 MB)
  2048 (0.189575 MB)
  4096 (0.37915 MB)
  8192 (0.765991 MB)
  16384 (1.51184 MB)
  32768 (3.02368 MB)
  65536 (6.09973 MB)
  131072 (12.1995 MB)
  262144 (25.4987 MB)
  524288 (50.9973 MB)
Maximum array size = 720720
Benchmarking FFTs:
  0. Arndt DIF
  1. Arndt DIT
  2. Arndt Split-Radix
  3. Arndt 4-step
  4. Bailey
  5. Beauregard
  6. Bergland
  7. Brenner
  8. Burrus
  9. CWP (min N)
  10. CWP (best N)
  11. Edelblute
  12. FFTPACK
  13. FFTPACK (f2c)
  14. FFTW
  15. FFTW_ESTIMATE
  16. Frigo-old
  17. Green
  18. GSL
  19. GSL DIT
  20. GSL DIF
  21. Krukar
  22. Mayer (Buneman)
  23. Mayer (simple)
  24. Mayer (lookup)
  25. Monro
  26. NAPACK (f2c)
  27. Nielsen
  28. NR (C)
  29. Ooura (C)
  30. Ooura (F)
  31. Ransom
  32. SCIPORT
  33. Singleton
  34. Singleton (f2c)
  35. Sorensen
  36. Sorensen DIT
  37. Temperton
  38. Temperton (f2c)
  39. Valkenburg
Computing normalized averages (40 transforms).
Benchmarking for array size = 2 (power of 2):
0.
Arndt DIF: elapsed time t=1.0127 s, 4194304 iters, t-(init.)=0.740234 s t(norm)=0.0882428, mflops=56.6618 (err=4.7e-17) 1. Arndt DIT: elapsed time t=1.36914 s, 4194304 iters, t-(init.)=1.09766 s t(norm)=0.130851, mflops=38.2115 (err=4.7e-17) 2. Arndt Split-Radix: elapsed time t=1.3584 s, 4194304 iters, t-(init.)=1.08105 s t(norm)=0.128872, mflops=38.7983 (err=4.7e-17) 3. Arndt 4-step: elapsed time t=1.01172 s, 262144 iters, t-(init.)=0.994141 s t(norm)=1.89617, mflops=2.63689 (err=4.7e-17) 4. Bailey: elapsed time t=1.22754 s, 1048576 iters, t-(init.)=1.1582 s t(norm)=0.552274, mflops=9.05347 (err=4.7e-17) 5. Beauregard: elapsed time t=1.00195 s, 524288 iters, t-(init.)=0.966797 s t(norm)=0.922009, mflops=5.42294 (err=1.6e-17) 6. Bergland: elapsed time t=1.79004 s, 2097152 iters, t-(init.)=1.65039 s t(norm)=0.393484, mflops=12.707 (err=1.6e-17) 7. Brenner: elapsed time t=1.46973 s, 1048576 iters, t-(init.)=1.29688 s t(norm)=0.618398, mflops=8.08541 (err=1.6e-17) 8. Burrus: elapsed time t=1.79395 s, 4194304 iters, t-(init.)=1.51953 s t(norm)=0.181142, mflops=27.6026 (err=4.7e-17) 9. CWP (min N): elapsed time t=1.12305 s, 524288 iters, t-(init.)=1.08887 s t(norm)=1.03842, mflops=4.81499 10. CWP (best N) (N=3): elapsed time t=1.20313 s, 524288 iters, t-(init.)=1.16016 s t(norm)=1.10641, mflops=4.51912 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.96582 s, 2097152 iters, t-(init.)=1.82129 s t(norm)=0.434229, mflops=11.5147 (err=1.6e-17) 13. FFTPACK (f2c): elapsed time t=1.90039 s, 2097152 iters, t-(init.)=1.76367 s t(norm)=0.420492, mflops=11.8908 (err=1.6e-17) FFTW_MEASURE plan: (cost = 1.811422e-07) FFTW_NOTW 2 14. FFTW: elapsed time t=1.84473 s, 8388608 iters, t-(init.)=1.29199 s t(norm)=0.0770087, mflops=64.9277 (err=1.6e-17) FFTW_ESTIMATE plan: (cost = 1.480000e+01) FFTW_NOTW 2 15. FFTW_ESTIMATE: elapsed time t=1.83301 s, 8388608 iters, t-(init.)=1.29102 s t(norm)=0.0769505, mflops=64.9768 (err=1.6e-17) 16. 
Frigo-old: elapsed time t=1.4375 s, 1048576 iters, t-(init.)=1.36523 s t(norm)=0.650994, mflops=7.68056 (err=1.6e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.47363 s, 1048576 iters, t-(init.)=1.4043 s t(norm)=0.669621, mflops=7.46691 (err=1.6e-17) 19. GSL DIT: elapsed time t=1.50391 s, 1048576 iters, t-(init.)=1.42773 s t(norm)=0.680797, mflops=7.34434 (err=1.6e-17) 20. GSL DIF: elapsed time t=1.59473 s, 1048576 iters, t-(init.)=1.52734 s t(norm)=0.728294, mflops=6.86536 (err=1.6e-17) 21. Krukar: elapsed time t=1.09375 s, 4194304 iters, t-(init.)=0.821289 s t(norm)=0.0979053, mflops=51.0698 (err=1.6e-17) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.21387 s, 262144 iters, t-(init.)=1.19727 s t(norm)=2.2836, mflops=2.18952 (err=1.6e-17) 27. Nielsen: elapsed time t=1.63281 s, 262144 iters, t-(init.)=1.61523 s t(norm)=3.08082, mflops=1.62295 (err=4.7e-17) 28. NR (C): elapsed time t=1.43848 s, 1048576 iters, t-(init.)=1.37109 s t(norm)=0.653788, mflops=7.64773 (err=1.6e-17) 29. Ooura (C): elapsed time t=1.01172 s, 4194304 iters, t-(init.)=0.740234 s t(norm)=0.0882428, mflops=56.6618 (err=1.6e-17) 30. Ooura (F): elapsed time t=1.51855 s, 4194304 iters, t-(init.)=1.23535 s t(norm)=0.147265, mflops=33.9523 (err=1.6e-17) 31. Skipping fft (Ransom doesn't work for N=2). 32. Skipping fft (SCIPORT can't handle N < 4). 33. Singleton: elapsed time t=1.01367 s, 524288 iters, t-(init.)=0.981445 s t(norm)=0.935979, mflops=5.342 (err=1.6e-17) 34. Singleton (f2c): elapsed time t=1.74805 s, 1048576 iters, t-(init.)=1.68066 s t(norm)=0.801403, mflops=6.23906 (err=1.6e-17) 35. Sorensen: elapsed time t=1.16406 s, 2097152 iters, t-(init.)=1.02832 s t(norm)=0.245171, mflops=20.394 (err=4.7e-17) 36. 
Sorensen DIT: elapsed time t=1.04297 s, 2097152 iters, t-(init.)=0.907227 s t(norm)=0.2163, mflops=23.1161 (err=4.7e-17) 37. Temperton: elapsed time t=1.14551 s, 524288 iters, t-(init.)=1.1123 s t(norm)=1.06078, mflops=4.71353 (err=1.6e-17) 38. Temperton (f2c): elapsed time t=1.96094 s, 1048576 iters, t-(init.)=1.8916 s t(norm)=0.901986, mflops=5.54332 (err=1.6e-17) 39. Valkenburg: elapsed time t=1.6123 s, 524288 iters, t-(init.)=1.5791 s t(norm)=1.50595, mflops=3.32017 (err=1.6e-17) Top mflops for N=2 = 64.9768 Normalized results and averages for N=2: fft 0: mflops = 56.6618 (norm. = 0.872032), norm. avg. (of 1) = 0.872032 fft 1: mflops = 38.2115 (norm. = 0.588078), norm. avg. (of 1) = 0.588078 fft 2: mflops = 38.7983 (norm. = 0.597109), norm. avg. (of 1) = 0.597109 fft 3: mflops = 2.63689 (norm. = 0.040582), norm. avg. (of 1) = 0.040582 fft 4: mflops = 9.05347 (norm. = 0.139334), norm. avg. (of 1) = 0.139334 fft 5: mflops = 5.42294 (norm. = 0.0834596), norm. avg. (of 1) = 0.0834596 fft 6: mflops = 12.707 (norm. = 0.195562), norm. avg. (of 1) = 0.195562 fft 7: mflops = 8.08541 (norm. = 0.124435), norm. avg. (of 1) = 0.124435 fft 8: mflops = 27.6026 (norm. = 0.424807), norm. avg. (of 1) = 0.424807 fft 9: mflops = 4.81499 (norm. = 0.0741031), norm. avg. (of 1) = 0.0741031 fft 10: mflops = 4.51912 (norm. = 0.0695497), norm. avg. (of 1) = 0.0695497 fft 11: mflops = -1 (norm. = -0.0153901), norm. avg. (of 0) = -1 fft 12: mflops = 11.5147 (norm. = 0.177212), norm. avg. (of 1) = 0.177212 fft 13: mflops = 11.8908 (norm. = 0.183001), norm. avg. (of 1) = 0.183001 fft 14: mflops = 64.9277 (norm. = 0.999244), norm. avg. (of 1) = 0.999244 fft 15: mflops = 64.9768 (norm. = 1), norm. avg. (of 1) = 1 fft 16: mflops = 7.68056 (norm. = 0.118205), norm. avg. (of 1) = 0.118205 fft 17: mflops = -1 (norm. = -0.0153901), norm. avg. (of 0) = -1 fft 18: mflops = 7.46691 (norm. = 0.114917), norm. avg. (of 1) = 0.114917 fft 19: mflops = 7.34434 (norm. = 0.11303), norm. avg. 
(of 1) = 0.11303 fft 20: mflops = 6.86536 (norm. = 0.105659), norm. avg. (of 1) = 0.105659 fft 21: mflops = 51.0698 (norm. = 0.785969), norm. avg. (of 1) = 0.785969 fft 22: mflops = -1 (norm. = -0.0153901), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.0153901), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.0153901), norm. avg. (of 0) = -1 fft 25: mflops = -1 (norm. = -0.0153901), norm. avg. (of 0) = -1 fft 26: mflops = 2.18952 (norm. = 0.033697), norm. avg. (of 1) = 0.033697 fft 27: mflops = 1.62295 (norm. = 0.0249773), norm. avg. (of 1) = 0.0249773 fft 28: mflops = 7.64773 (norm. = 0.117699), norm. avg. (of 1) = 0.117699 fft 29: mflops = 56.6618 (norm. = 0.872032), norm. avg. (of 1) = 0.872032 fft 30: mflops = 33.9523 (norm. = 0.52253), norm. avg. (of 1) = 0.52253 fft 31: mflops = -1 (norm. = -0.0153901), norm. avg. (of 0) = -1 fft 32: mflops = -1 (norm. = -0.0153901), norm. avg. (of 0) = -1 fft 33: mflops = 5.342 (norm. = 0.0822139), norm. avg. (of 1) = 0.0822139 fft 34: mflops = 6.23906 (norm. = 0.0960198), norm. avg. (of 1) = 0.0960198 fft 35: mflops = 20.394 (norm. = 0.313865), norm. avg. (of 1) = 0.313865 fft 36: mflops = 23.1161 (norm. = 0.355759), norm. avg. (of 1) = 0.355759 fft 37: mflops = 4.71353 (norm. = 0.0725417), norm. avg. (of 1) = 0.0725417 fft 38: mflops = 5.54332 (norm. = 0.0853123), norm. avg. (of 1) = 0.0853123 fft 39: mflops = 3.32017 (norm. = 0.0510977), norm. avg. (of 1) = 0.0510977 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.74316 s, 4194304 iters, t-(init.)=1.40137 s t(norm)=0.041764, mflops=119.72 (err=7.3e-17) 1. Arndt DIT: elapsed time t=1.10352 s, 2097152 iters, t-(init.)=0.932617 s t(norm)=0.0555883, mflops=89.947 (err=7.3e-17) 2. Arndt Split-Radix: elapsed time t=1.04883 s, 1048576 iters, t-(init.)=0.962891 s t(norm)=0.114786, mflops=43.5595 (err=7.3e-17) 3. Arndt 4-step: elapsed time t=1.19824 s, 262144 iters, t-(init.)=1.17676 s t(norm)=0.561122, mflops=8.91072 (err=7.3e-17) 4. 
Bailey: elapsed time t=1.77539 s, 131072 iters, t-(init.)=1.76367 s t(norm)=1.68197, mflops=2.97271 (err=7.3e-17) 5. Beauregard: elapsed time t=1.99023 s, 524288 iters, t-(init.)=1.94531 s t(norm)=0.463799, mflops=10.7805 (err=3.9e-17) 6. Bergland: elapsed time t=1.0918 s, 1048576 iters, t-(init.)=1.00586 s t(norm)=0.119908, mflops=41.6987 (err=3.9e-17) 7. Brenner: elapsed time t=1.31738 s, 524288 iters, t-(init.)=1.2207 s t(norm)=0.291038, mflops=17.1799 (err=3.9e-17) 8. Burrus: elapsed time t=1.0127 s, 524288 iters, t-(init.)=0.96875 s t(norm)=0.230968, mflops=21.648 (err=7.3e-17) 9. CWP (min N): elapsed time t=1.23047 s, 524288 iters, t-(init.)=1.18652 s t(norm)=0.282889, mflops=17.6748 10. CWP (best N) (N=15): elapsed time t=1.26172 s, 262144 iters, t-(init.)=1.21777 s t(norm)=0.58068, mflops=8.6106 11. Edelblute: elapsed time t=1.2793 s, 1048576 iters, t-(init.)=1.19141 s t(norm)=0.142027, mflops=35.2046 (err=7.3e-17) 12. FFTPACK: elapsed time t=1.45117 s, 262144 iters, t-(init.)=1.42969 s t(norm)=0.681728, mflops=7.3343 (err=3.9e-17) 13. FFTPACK (f2c): elapsed time t=1.49414 s, 262144 iters, t-(init.)=1.47266 s t(norm)=0.702217, mflops=7.1203 (err=3.9e-17) FFTW_MEASURE plan: (cost = 2.384186e-07) FFTW_NOTW 4 14. FFTW: elapsed time t=1.16113 s, 4194304 iters, t-(init.)=0.821289 s t(norm)=0.0244763, mflops=204.279 (err=3.9e-17) FFTW_ESTIMATE plan: (cost = 1.840000e+01) FFTW_NOTW 4 15. FFTW_ESTIMATE: elapsed time t=1.16406 s, 4194304 iters, t-(init.)=0.827148 s t(norm)=0.0246509, mflops=202.832 (err=3.9e-17) 16. Frigo-old: elapsed time t=1.86621 s, 8388608 iters, t-(init.)=1.18262 s t(norm)=0.0176224, mflops=283.73 (err=3.9e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.31055 s, 1048576 iters, t-(init.)=1.22461 s t(norm)=0.145985, mflops=34.2501 (err=3.9e-17) 19. GSL DIT: elapsed time t=1.35645 s, 524288 iters, t-(init.)=1.31445 s t(norm)=0.31339, mflops=15.9546 (err=7.5e-17) 20. 
GSL DIF: elapsed time t=1.39063 s, 524288 iters, t-(init.)=1.34863 s t(norm)=0.321539, mflops=15.5502 (err=7.5e-17) 21. Krukar: elapsed time t=1.32813 s, 4194304 iters, t-(init.)=0.986328 s t(norm)=0.0293949, mflops=170.098 (err=3.9e-17) 22. Mayer (Buneman): elapsed time t=1.4082 s, 2097152 iters, t-(init.)=1.23633 s t(norm)=0.0736909, mflops=67.851 (err=8.5e-17) 23. Mayer (simple): elapsed time t=1.43359 s, 2097152 iters, t-(init.)=1.26172 s t(norm)=0.0752043, mflops=66.4856 24. Mayer (lookup): elapsed time t=1.51758 s, 2097152 iters, t-(init.)=1.34668 s t(norm)=0.0802684, mflops=62.291 (err=8.5e-17) 25. Monro: elapsed time t=1.08301 s, 262144 iters, t-(init.)=1.06152 s t(norm)=0.506174, mflops=9.87803 (err=7.3e-17) 26. NAPACK (f2c): elapsed time t=1.41113 s, 262144 iters, t-(init.)=1.38965 s t(norm)=0.662636, mflops=7.54562 (err=1.3e-16) 27. Nielsen: elapsed time t=1.08887 s, 262144 iters, t-(init.)=1.06836 s t(norm)=0.509433, mflops=9.81482 (err=7.3e-17) 28. NR (C): elapsed time t=1.24219 s, 524288 iters, t-(init.)=1.19922 s t(norm)=0.285916, mflops=17.4877 (err=1.0e-16) 29. Ooura (C): elapsed time t=1.03027 s, 2097152 iters, t-(init.)=0.859375 s t(norm)=0.0512227, mflops=97.6129 (err=3.9e-17) 30. Ooura (F): elapsed time t=1.6416 s, 2097152 iters, t-(init.)=1.46973 s t(norm)=0.0876025, mflops=57.076 (err=3.9e-17) 31. Ransom: elapsed time t=1.73926 s, 262144 iters, t-(init.)=1.71777 s t(norm)=0.819098, mflops=6.10427 (err=1.0e-16) 32. SCIPORT: elapsed time t=1.65039 s, 1048576 iters, t-(init.)=1.52051 s t(norm)=0.181259, mflops=27.5849 (err=3.9e-17) 33. Singleton: elapsed time t=1.08301 s, 524288 iters, t-(init.)=1.04004 s t(norm)=0.247965, mflops=20.1642 (err=3.9e-17) 34. Singleton (f2c): elapsed time t=1.60156 s, 1048576 iters, t-(init.)=1.51563 s t(norm)=0.180677, mflops=27.6738 (err=3.9e-17) 35. Sorensen: elapsed time t=1.31348 s, 1048576 iters, t-(init.)=1.22656 s t(norm)=0.146218, mflops=34.1956 (err=7.3e-17) 36. 
Sorensen DIT: elapsed time t=1.03027 s, 524288 iters, t-(init.)=0.988281 s t(norm)=0.235625, mflops=21.2202 (err=7.3e-17) 37. Temperton: elapsed time t=1.40723 s, 524288 iters, t-(init.)=1.36426 s t(norm)=0.325264, mflops=15.3721 (err=3.9e-17) 38. Temperton (f2c): elapsed time t=1.27051 s, 524288 iters, t-(init.)=1.22754 s t(norm)=0.292668, mflops=17.0842 (err=3.9e-17) 39. Valkenburg: elapsed time t=1.12598 s, 262144 iters, t-(init.)=1.10449 s t(norm)=0.526663, mflops=9.49374 (err=1.3e-16) Top mflops for N=4 = 283.73 Normalized results and averages for N=4: fft 0: mflops = 119.72 (norm. = 0.421951), norm. avg. (of 2) = 0.646991 fft 1: mflops = 89.947 (norm. = 0.317016), norm. avg. (of 2) = 0.452547 fft 2: mflops = 43.5595 (norm. = 0.153524), norm. avg. (of 2) = 0.375317 fft 3: mflops = 8.91072 (norm. = 0.0314056), norm. avg. (of 2) = 0.0359938 fft 4: mflops = 2.97271 (norm. = 0.0104772), norm. avg. (of 2) = 0.0749056 fft 5: mflops = 10.7805 (norm. = 0.0379957), norm. avg. (of 2) = 0.0607277 fft 6: mflops = 41.6987 (norm. = 0.146966), norm. avg. (of 2) = 0.171264 fft 7: mflops = 17.1799 (norm. = 0.06055), norm. avg. (of 2) = 0.0924926 fft 8: mflops = 21.648 (norm. = 0.0762979), norm. avg. (of 2) = 0.250553 fft 9: mflops = 17.6748 (norm. = 0.0622942), norm. avg. (of 2) = 0.0681987 fft 10: mflops = 8.6106 (norm. = 0.0303478), norm. avg. (of 2) = 0.0499487 fft 11: mflops = 35.2046 (norm. = 0.124078), norm. avg. (of 1) = 0.124078 fft 12: mflops = 7.3343 (norm. = 0.0258496), norm. avg. (of 2) = 0.101531 fft 13: mflops = 7.1203 (norm. = 0.0250953), norm. avg. (of 2) = 0.104048 fft 14: mflops = 204.279 (norm. = 0.719976), norm. avg. (of 2) = 0.85961 fft 15: mflops = 202.832 (norm. = 0.714876), norm. avg. (of 2) = 0.857438 fft 16: mflops = 283.73 (norm. = 1), norm. avg. (of 2) = 0.559102 fft 17: mflops = -1 (norm. = -0.00352447), norm. avg. (of 0) = -1 fft 18: mflops = 34.2501 (norm. = 0.120714), norm. avg. (of 2) = 0.117815 fft 19: mflops = 15.9546 (norm. 
= 0.0562314), norm. avg. (of 2) = 0.0846308 fft 20: mflops = 15.5502 (norm. = 0.0548063), norm. avg. (of 2) = 0.0802324 fft 21: mflops = 170.098 (norm. = 0.599505), norm. avg. (of 2) = 0.692737 fft 22: mflops = 67.851 (norm. = 0.239139), norm. avg. (of 1) = 0.239139 fft 23: mflops = 66.4856 (norm. = 0.234327), norm. avg. (of 1) = 0.234327 fft 24: mflops = 62.291 (norm. = 0.219543), norm. avg. (of 1) = 0.219543 fft 25: mflops = 9.87803 (norm. = 0.0348149), norm. avg. (of 1) = 0.0348149 fft 26: mflops = 7.54562 (norm. = 0.0265943), norm. avg. (of 2) = 0.0301457 fft 27: mflops = 9.81482 (norm. = 0.0345921), norm. avg. (of 2) = 0.0297847 fft 28: mflops = 17.4877 (norm. = 0.0616348), norm. avg. (of 2) = 0.0896671 fft 29: mflops = 97.6129 (norm. = 0.344034), norm. avg. (of 2) = 0.608033 fft 30: mflops = 57.076 (norm. = 0.201163), norm. avg. (of 2) = 0.361846 fft 31: mflops = 6.10427 (norm. = 0.0215144), norm. avg. (of 1) = 0.0215144 fft 32: mflops = 27.5849 (norm. = 0.0972222), norm. avg. (of 1) = 0.0972222 fft 33: mflops = 20.1642 (norm. = 0.0710681), norm. avg. (of 2) = 0.076641 fft 34: mflops = 27.6738 (norm. = 0.0975354), norm. avg. (of 2) = 0.0967776 fft 35: mflops = 34.1956 (norm. = 0.120521), norm. avg. (of 2) = 0.217193 fft 36: mflops = 21.2202 (norm. = 0.07479), norm. avg. (of 2) = 0.215274 fft 37: mflops = 15.3721 (norm. = 0.0541786), norm. avg. (of 2) = 0.0633602 fft 38: mflops = 17.0842 (norm. = 0.0602128), norm. avg. (of 2) = 0.0727626 fft 39: mflops = 9.49374 (norm. = 0.0334604), norm. avg. (of 2) = 0.0422791 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.52051 s, 2097152 iters, t-(init.)=1.29102 s t(norm)=0.0256502, mflops=194.93 (err=1.8e-16) 1. Arndt DIT: elapsed time t=1.75195 s, 2097152 iters, t-(init.)=1.51953 s t(norm)=0.0301904, mflops=165.616 (err=1.8e-16) 2. Arndt Split-Radix: elapsed time t=1.32227 s, 524288 iters, t-(init.)=1.26465 s t(norm)=0.100505, mflops=49.7487 (err=1.5e-16) 3. 
Arndt 4-step: elapsed time t=1.44141 s, 131072 iters, t-(init.)=1.42676 s t(norm)=0.453554, mflops=11.024 (err=1.2e-16) 4. Bailey: elapsed time t=1.18652 s, 262144 iters, t-(init.)=1.15723 s t(norm)=0.183936, mflops=27.1833 (err=1.5e-16) 5. Beauregard: elapsed time t=1.54492 s, 131072 iters, t-(init.)=1.53027 s t(norm)=0.486461, mflops=10.2783 (err=1.9e-16) 6. Bergland: elapsed time t=1.16309 s, 524288 iters, t-(init.)=1.10449 s t(norm)=0.0877772, mflops=56.9624 (err=1.9e-16) 7. Brenner: elapsed time t=1.39746 s, 262144 iters, t-(init.)=1.34473 s t(norm)=0.213739, mflops=23.3931 (err=1.7e-16) 8. Burrus: elapsed time t=1.54297 s, 262144 iters, t-(init.)=1.51367 s t(norm)=0.240592, mflops=20.7821 (err=1.5e-16) 9. CWP (min N): elapsed time t=1.3916 s, 524288 iters, t-(init.)=1.33398 s t(norm)=0.106016, mflops=47.1629 10. CWP (best N) (N=15): elapsed time t=1.25391 s, 262144 iters, t-(init.)=1.21094 s t(norm)=0.192473, mflops=25.9776 11. Edelblute: elapsed time t=1.08203 s, 262144 iters, t-(init.)=1.05273 s t(norm)=0.167328, mflops=29.8815 (err=1.5e-16) 12. FFTPACK: elapsed time t=1.50098 s, 524288 iters, t-(init.)=1.44238 s t(norm)=0.11463, mflops=43.6185 (err=1.9e-16) 13. FFTPACK (f2c): elapsed time t=1.50879 s, 524288 iters, t-(init.)=1.45117 s t(norm)=0.115329, mflops=43.3543 (err=1.9e-16) FFTW_MEASURE plan: (cost = 3.837049e-07) FFTW_NOTW 8 14. FFTW: elapsed time t=1.81152 s, 4194304 iters, t-(init.)=1.34863 s t(norm)=0.0133975, mflops=373.205 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 1.120000e+01) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.8125 s, 4194304 iters, t-(init.)=1.35254 s t(norm)=0.0134363, mflops=372.127 (err=1.8e-16) 16. Frigo-old: elapsed time t=1.45605 s, 4194304 iters, t-(init.)=0.991211 s t(norm)=0.0098468, mflops=507.779 (err=1.8e-16) 17. Green: elapsed time t=1.77148 s, 2097152 iters, t-(init.)=1.53809 s t(norm)=0.030559, mflops=163.618 (err=1.8e-16) 18. 
GSL: elapsed time t=1.29688 s, 524288 iters, t-(init.)=1.23828 s t(norm)=0.0984098, mflops=50.808 (err=1.9e-16) 19. GSL DIT: elapsed time t=1.10742 s, 262144 iters, t-(init.)=1.07813 s t(norm)=0.171363, mflops=29.1778 (err=1.5e-16) 20. GSL DIF: elapsed time t=1.09082 s, 262144 iters, t-(init.)=1.06152 s t(norm)=0.168725, mflops=29.6341 (err=1.5e-16) 21. Krukar: elapsed time t=1.5293 s, 2097152 iters, t-(init.)=1.2959 s t(norm)=0.0257472, mflops=194.196 (err=1.7e-16) 22. Mayer (Buneman): elapsed time t=1.43848 s, 1048576 iters, t-(init.)=1.32129 s t(norm)=0.0525033, mflops=95.2321 (err=1.7e-16) 23. Mayer (simple): elapsed time t=1.43164 s, 1048576 iters, t-(init.)=1.31543 s t(norm)=0.0522705, mflops=95.6563 24. Mayer (lookup): elapsed time t=1.48242 s, 1048576 iters, t-(init.)=1.36523 s t(norm)=0.0542495, mflops=92.1667 (err=1.7e-16) 25. Monro: elapsed time t=1.45313 s, 262144 iters, t-(init.)=1.42383 s t(norm)=0.226311, mflops=22.0935 (err=1.3e-16) 26. NAPACK (f2c): elapsed time t=1.22168 s, 131072 iters, t-(init.)=1.20703 s t(norm)=0.383705, mflops=13.0308 (err=2.5e-16) 27. Nielsen: elapsed time t=1.32422 s, 262144 iters, t-(init.)=1.29492 s t(norm)=0.205822, mflops=24.2928 (err=6.5e-16) 28. NR (C): elapsed time t=1.00586 s, 262144 iters, t-(init.)=0.976563 s t(norm)=0.15522, mflops=32.2123 (err=3.8e-16) 29. Ooura (C): elapsed time t=1.6875 s, 2097152 iters, t-(init.)=1.4541 s t(norm)=0.0288904, mflops=173.068 (err=1.9e-16) 30. Ooura (F): elapsed time t=1.64648 s, 1048576 iters, t-(init.)=1.53027 s t(norm)=0.0608076, mflops=82.2266 (err=1.9e-16) 31. Ransom: elapsed time t=1.77832 s, 131072 iters, t-(init.)=1.76367 s t(norm)=0.560656, mflops=8.91812 (err=2.9e-16) 32. SCIPORT: elapsed time t=1.06543 s, 262144 iters, t-(init.)=1.02441 s t(norm)=0.162826, mflops=30.7076 (err=1.9e-16) 33. Singleton: elapsed time t=1.62402 s, 262144 iters, t-(init.)=1.59473 s t(norm)=0.253475, mflops=19.7258 (err=1.4e-16) 34. 
Singleton (f2c): elapsed time t=1.24805 s, 262144 iters, t-(init.)=1.21875 s t(norm)=0.193715, mflops=25.8111 (err=1.4e-16) 35. Sorensen: elapsed time t=1.22266 s, 524288 iters, t-(init.)=1.16504 s t(norm)=0.092589, mflops=54.0021 (err=1.7e-16) 36. Sorensen DIT: elapsed time t=1.55273 s, 262144 iters, t-(init.)=1.52344 s t(norm)=0.242144, mflops=20.6489 (err=1.6e-16) 37. Temperton: elapsed time t=1.31641 s, 262144 iters, t-(init.)=1.28711 s t(norm)=0.204581, mflops=24.4403 (err=1.8e-16) 38. Temperton (f2c): elapsed time t=1.19922 s, 262144 iters, t-(init.)=1.1709 s t(norm)=0.186109, mflops=26.8659 (err=1.8e-16) 39. Valkenburg: elapsed time t=1.59277 s, 131072 iters, t-(init.)=1.57813 s t(norm)=0.501672, mflops=9.96666 (err=1.9e-16) Top mflops for N=8 = 507.779 Normalized results and averages for N=8: fft 0: mflops = 194.93 (norm. = 0.383888), norm. avg. (of 3) = 0.55929 fft 1: mflops = 165.616 (norm. = 0.326157), norm. avg. (of 3) = 0.410417 fft 2: mflops = 49.7487 (norm. = 0.097973), norm. avg. (of 3) = 0.282869 fft 3: mflops = 11.024 (norm. = 0.0217103), norm. avg. (of 3) = 0.0312326 fft 4: mflops = 27.1833 (norm. = 0.0535338), norm. avg. (of 3) = 0.0677816 fft 5: mflops = 10.2783 (norm. = 0.0202417), norm. avg. (of 3) = 0.0472323 fft 6: mflops = 56.9624 (norm. = 0.112179), norm. avg. (of 3) = 0.151569 fft 7: mflops = 23.3931 (norm. = 0.0460694), norm. avg. (of 3) = 0.0770182 fft 8: mflops = 20.7821 (norm. = 0.0409274), norm. avg. (of 3) = 0.180678 fft 9: mflops = 47.1629 (norm. = 0.0928807), norm. avg. (of 3) = 0.076426 fft 10: mflops = 25.9776 (norm. = 0.0511593), norm. avg. (of 3) = 0.0503523 fft 11: mflops = 29.8815 (norm. = 0.0588474), norm. avg. (of 2) = 0.0914626 fft 12: mflops = 43.6185 (norm. = 0.0859005), norm. avg. (of 3) = 0.0963206 fft 13: mflops = 43.3543 (norm. = 0.0853802), norm. avg. (of 3) = 0.0978255 fft 14: mflops = 373.205 (norm. = 0.734975), norm. avg. (of 3) = 0.818065 fft 15: mflops = 372.127 (norm. = 0.732852), norm. avg. 
(of 3) = 0.815909 fft 16: mflops = 507.779 (norm. = 1), norm. avg. (of 3) = 0.706068 fft 17: mflops = 163.618 (norm. = 0.322222), norm. avg. (of 1) = 0.322222 fft 18: mflops = 50.808 (norm. = 0.100059), norm. avg. (of 3) = 0.111896 fft 19: mflops = 29.1778 (norm. = 0.0574615), norm. avg. (of 3) = 0.0755743 fft 20: mflops = 29.6341 (norm. = 0.0583602), norm. avg. (of 3) = 0.0729417 fft 21: mflops = 194.196 (norm. = 0.382442), norm. avg. (of 3) = 0.589305 fft 22: mflops = 95.2321 (norm. = 0.187546), norm. avg. (of 2) = 0.213343 fft 23: mflops = 95.6563 (norm. = 0.188382), norm. avg. (of 2) = 0.211354 fft 24: mflops = 92.1667 (norm. = 0.181509), norm. avg. (of 2) = 0.200526 fft 25: mflops = 22.0935 (norm. = 0.0435099), norm. avg. (of 2) = 0.0391624 fft 26: mflops = 13.0308 (norm. = 0.0256624), norm. avg. (of 3) = 0.0286512 fft 27: mflops = 24.2928 (norm. = 0.0478413), norm. avg. (of 3) = 0.0358036 fft 28: mflops = 32.2123 (norm. = 0.0634375), norm. avg. (of 3) = 0.0809239 fft 29: mflops = 173.068 (norm. = 0.340833), norm. avg. (of 3) = 0.518966 fft 30: mflops = 82.2266 (norm. = 0.161934), norm. avg. (of 3) = 0.295209 fft 31: mflops = 8.91812 (norm. = 0.017563), norm. avg. (of 2) = 0.0195387 fft 32: mflops = 30.7076 (norm. = 0.0604743), norm. avg. (of 2) = 0.0788482 fft 33: mflops = 19.7258 (norm. = 0.0388472), norm. avg. (of 3) = 0.0640431 fft 34: mflops = 25.8111 (norm. = 0.0508313), norm. avg. (of 3) = 0.0814622 fft 35: mflops = 54.0021 (norm. = 0.10635), norm. avg. (of 3) = 0.180245 fft 36: mflops = 20.6489 (norm. = 0.0406651), norm. avg. (of 3) = 0.157071 fft 37: mflops = 24.4403 (norm. = 0.0481316), norm. avg. (of 3) = 0.058284 fft 38: mflops = 26.8659 (norm. = 0.0529087), norm. avg. (of 3) = 0.0661446 fft 39: mflops = 9.96666 (norm. = 0.0196279), norm. avg. (of 3) = 0.0347287 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.10352 s, 262144 iters, t-(init.)=1.05859 s t(norm)=0.0630971, mflops=79.2429 (err=1.9e-16) 1. 
Arndt DIT: elapsed time t=1.19727 s, 262144 iters, t-(init.)=1.15137 s t(norm)=0.0686268, mflops=72.8578 (err=2.0e-16) 2. Arndt Split-Radix: elapsed time t=1.65137 s, 262144 iters, t-(init.)=1.60645 s t(norm)=0.0957516, mflops=52.2184 (err=1.9e-16) 3. Arndt 4-step: elapsed time t=1.24121 s, 65536 iters, t-(init.)=1.22949 s t(norm)=0.293134, mflops=17.0571 (err=1.8e-16) 4. Bailey: elapsed time t=1.3291 s, 131072 iters, t-(init.)=1.30664 s t(norm)=0.155764, mflops=32.0999 (err=2.1e-16) 5. Beauregard: elapsed time t=1.69531 s, 65536 iters, t-(init.)=1.68457 s t(norm)=0.401633, mflops=12.4492 (err=1.8e-16) 6. Bergland: elapsed time t=1.05664 s, 262144 iters, t-(init.)=1.01172 s t(norm)=0.0603031, mflops=82.9144 (err=1.9e-16) 7. Brenner: elapsed time t=1.29785 s, 131072 iters, t-(init.)=1.26367 s t(norm)=0.150641, mflops=33.1914 (err=1.9e-16) 8. Burrus: elapsed time t=1.95508 s, 131072 iters, t-(init.)=1.93262 s t(norm)=0.230386, mflops=21.7027 (err=1.7e-16) 9. CWP (min N): elapsed time t=1.00195 s, 262144 iters, t-(init.)=0.957031 s t(norm)=0.0570435, mflops=87.6524 10. CWP (best N) (N=28): elapsed time t=1.55273 s, 262144 iters, t-(init.)=1.48438 s t(norm)=0.0884756, mflops=56.5127 11. Edelblute: elapsed time t=1.43457 s, 131072 iters, t-(init.)=1.41211 s t(norm)=0.168337, mflops=29.7024 (err=1.7e-16) 12. FFTPACK: elapsed time t=1.36523 s, 262144 iters, t-(init.)=1.32031 s t(norm)=0.0786968, mflops=63.535 (err=1.7e-16) 13. FFTPACK (f2c): elapsed time t=1.44922 s, 262144 iters, t-(init.)=1.4043 s t(norm)=0.0837026, mflops=59.7353 (err=1.7e-16) FFTW_MEASURE plan: (cost = 8.903444e-07) FFTW_NOTW 16 14. FFTW: elapsed time t=1.09668 s, 1048576 iters, t-(init.)=0.916992 s t(norm)=0.0136642, mflops=365.918 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 7.360000e+01) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.09668 s, 1048576 iters, t-(init.)=0.917969 s t(norm)=0.0136788, mflops=365.529 (err=1.7e-16) 16. 
Frigo-old: elapsed time t=1.11035 s, 1048576 iters, t-(init.)=0.929688 s t(norm)=0.0138534, mflops=360.922 (err=1.7e-16) 17. Green: elapsed time t=1.89844 s, 1048576 iters, t-(init.)=1.71875 s t(norm)=0.0256114, mflops=195.226 (err=2.3e-16) 18. GSL: elapsed time t=1.82324 s, 524288 iters, t-(init.)=1.7334 s t(norm)=0.0516593, mflops=96.788 (err=1.7e-16) 19. GSL DIT: elapsed time t=1.88086 s, 262144 iters, t-(init.)=1.83594 s t(norm)=0.10943, mflops=45.6911 (err=1.9e-16) 20. GSL DIF: elapsed time t=1.78711 s, 262144 iters, t-(init.)=1.74219 s t(norm)=0.103842, mflops=48.1499 (err=1.7e-16) 21. Krukar: elapsed time t=1.67676 s, 1048576 iters, t-(init.)=1.49609 s t(norm)=0.0222935, mflops=224.28 (err=1.8e-16) 22. Mayer (Buneman): elapsed time t=1.97949 s, 524288 iters, t-(init.)=1.89063 s t(norm)=0.056345, mflops=88.739 (err=1.6e-16) 23. Mayer (simple): elapsed time t=1.62793 s, 524288 iters, t-(init.)=1.53809 s t(norm)=0.0458385, mflops=109.079 24. Mayer (lookup): elapsed time t=1.71484 s, 524288 iters, t-(init.)=1.625 s t(norm)=0.0484288, mflops=103.244 (err=1.8e-16) 25. Monro: elapsed time t=1.21387 s, 131072 iters, t-(init.)=1.19141 s t(norm)=0.142027, mflops=35.2046 (err=1.8e-16) 26. NAPACK (f2c): elapsed time t=1.05566 s, 65536 iters, t-(init.)=1.04395 s t(norm)=0.248896, mflops=20.0887 (err=3.3e-16) 27. Nielsen: elapsed time t=1.60059 s, 131072 iters, t-(init.)=1.57813 s t(norm)=0.188127, mflops=26.5778 (err=1.7e-16) 28. NR (C): elapsed time t=1.70117 s, 262144 iters, t-(init.)=1.65625 s t(norm)=0.0987202, mflops=50.6482 (err=5.3e-16) 29. Ooura (C): elapsed time t=1.72754 s, 1048576 iters, t-(init.)=1.5498 s t(norm)=0.0230939, mflops=216.507 (err=1.7e-16) 30. Ooura (F): elapsed time t=1.75586 s, 524288 iters, t-(init.)=1.66602 s t(norm)=0.0496511, mflops=100.703 (err=1.7e-16) 31. Ransom: elapsed time t=1.50195 s, 131072 iters, t-(init.)=1.47949 s t(norm)=0.176369, mflops=28.3496 (err=3.2e-16) 32. 
SCIPORT: elapsed time t=1.32031 s, 131072 iters, t-(init.)=1.29297 s t(norm)=0.154134, mflops=32.4393 (err=1.8e-16) 33. Singleton: elapsed time t=1.77148 s, 262144 iters, t-(init.)=1.72656 s t(norm)=0.102911, mflops=48.5856 (err=1.6e-16) 34. Singleton (f2c): elapsed time t=1.15039 s, 262144 iters, t-(init.)=1.10547 s t(norm)=0.0658911, mflops=75.8828 (err=1.6e-16) 35. Sorensen: elapsed time t=1.30859 s, 262144 iters, t-(init.)=1.26367 s t(norm)=0.0753207, mflops=66.3828 (err=1.8e-16) 36. Sorensen DIT: elapsed time t=1.94922 s, 131072 iters, t-(init.)=1.92676 s t(norm)=0.229687, mflops=21.7687 (err=2.1e-16) 37. Temperton: elapsed time t=1.98828 s, 262144 iters, t-(init.)=1.94336 s t(norm)=0.115833, mflops=43.1655 (err=1.7e-16) 38. Temperton (f2c): elapsed time t=1.84277 s, 262144 iters, t-(init.)=1.79785 s t(norm)=0.10716, mflops=46.6591 (err=1.7e-16) 39. Valkenburg: elapsed time t=1.03516 s, 32768 iters, t-(init.)=1.0293 s t(norm)=0.490807, mflops=10.1873 (err=3.3e-16) Top mflops for N=16 = 365.918 Normalized results and averages for N=16: fft 0: mflops = 79.2429 (norm. = 0.216559), norm. avg. (of 4) = 0.473607 fft 1: mflops = 72.8578 (norm. = 0.199109), norm. avg. (of 4) = 0.35759 fft 2: mflops = 52.2184 (norm. = 0.142705), norm. avg. (of 4) = 0.247828 fft 3: mflops = 17.0571 (norm. = 0.0466144), norm. avg. (of 4) = 0.0350781 fft 4: mflops = 32.0999 (norm. = 0.0877242), norm. avg. (of 4) = 0.0727673 fft 5: mflops = 12.4492 (norm. = 0.0340217), norm. avg. (of 4) = 0.0439297 fft 6: mflops = 82.9144 (norm. = 0.226593), norm. avg. (of 4) = 0.170325 fft 7: mflops = 33.1914 (norm. = 0.0907071), norm. avg. (of 4) = 0.0804404 fft 8: mflops = 21.7027 (norm. = 0.0593103), norm. avg. (of 4) = 0.150336 fft 9: mflops = 87.6524 (norm. = 0.239541), norm. avg. (of 4) = 0.117205 fft 10: mflops = 56.5127 (norm. = 0.154441), norm. avg. (of 4) = 0.0763744 fft 11: mflops = 29.7024 (norm. = 0.0811722), norm. avg. (of 3) = 0.0880325 fft 12: mflops = 63.535 (norm. = 0.173632), norm. avg. 
(of 4) = 0.115648 fft 13: mflops = 59.7353 (norm. = 0.163248), norm. avg. (of 4) = 0.114181 fft 14: mflops = 365.918 (norm. = 1), norm. avg. (of 4) = 0.863549 fft 15: mflops = 365.529 (norm. = 0.998936), norm. avg. (of 4) = 0.861666 fft 16: mflops = 360.922 (norm. = 0.986345), norm. avg. (of 4) = 0.776137 fft 17: mflops = 195.226 (norm. = 0.533523), norm. avg. (of 2) = 0.427872 fft 18: mflops = 96.788 (norm. = 0.264507), norm. avg. (of 4) = 0.150049 fft 19: mflops = 45.6911 (norm. = 0.124867), norm. avg. (of 4) = 0.0878975 fft 20: mflops = 48.1499 (norm. = 0.131586), norm. avg. (of 4) = 0.0876028 fft 21: mflops = 224.28 (norm. = 0.612924), norm. avg. (of 4) = 0.59521 fft 22: mflops = 88.739 (norm. = 0.24251), norm. avg. (of 3) = 0.223065 fft 23: mflops = 109.079 (norm. = 0.298095), norm. avg. (of 3) = 0.240268 fft 24: mflops = 103.244 (norm. = 0.282151), norm. avg. (of 3) = 0.227735 fft 25: mflops = 35.2046 (norm. = 0.096209), norm. avg. (of 3) = 0.0581779 fft 26: mflops = 20.0887 (norm. = 0.0548994), norm. avg. (of 4) = 0.0352133 fft 27: mflops = 26.5778 (norm. = 0.072633), norm. avg. (of 4) = 0.0450109 fft 28: mflops = 50.6482 (norm. = 0.138414), norm. avg. (of 4) = 0.0952964 fft 29: mflops = 216.507 (norm. = 0.591682), norm. avg. (of 4) = 0.537145 fft 30: mflops = 100.703 (norm. = 0.275205), norm. avg. (of 4) = 0.290208 fft 31: mflops = 28.3496 (norm. = 0.0774752), norm. avg. (of 3) = 0.0388509 fft 32: mflops = 32.4393 (norm. = 0.0886518), norm. avg. (of 3) = 0.0821161 fft 33: mflops = 48.5856 (norm. = 0.132777), norm. avg. (of 4) = 0.0812266 fft 34: mflops = 75.8828 (norm. = 0.207376), norm. avg. (of 4) = 0.112941 fft 35: mflops = 66.3828 (norm. = 0.181414), norm. avg. (of 4) = 0.180538 fft 36: mflops = 21.7687 (norm. = 0.0594906), norm. avg. (of 4) = 0.132676 fft 37: mflops = 43.1655 (norm. = 0.117965), norm. avg. (of 4) = 0.0732042 fft 38: mflops = 46.6591 (norm. = 0.127512), norm. avg. (of 4) = 0.0814865 fft 39: mflops = 10.1873 (norm. = 0.0278404), norm. 
avg. (of 4) = 0.0330066 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.17871 s, 131072 iters, t-(init.)=1.14063 s t(norm)=0.0543892, mflops=91.93 (err=2.4e-16) 1. Arndt DIT: elapsed time t=1.23828 s, 131072 iters, t-(init.)=1.2002 s t(norm)=0.0572298, mflops=87.3671 (err=2.3e-16) 2. Arndt Split-Radix: elapsed time t=1.89063 s, 131072 iters, t-(init.)=1.85254 s t(norm)=0.0883359, mflops=56.6021 (err=2.2e-16) 3. Arndt 4-step: elapsed time t=1.33008 s, 32768 iters, t-(init.)=1.32031 s t(norm)=0.25183, mflops=19.8547 (err=2.4e-16) 4. Bailey: elapsed time t=1.21094 s, 65536 iters, t-(init.)=1.19141 s t(norm)=0.113621, mflops=44.0058 (err=2.3e-16) 5. Beauregard: elapsed time t=1.97852 s, 32768 iters, t-(init.)=1.96875 s t(norm)=0.375509, mflops=13.3153 (err=2.7e-16) 6. Bergland: elapsed time t=1.85449 s, 262144 iters, t-(init.)=1.77832 s t(norm)=0.0423985, mflops=117.929 (err=3.1e-16) 7. Brenner: elapsed time t=1.2998 s, 65536 iters, t-(init.)=1.27539 s t(norm)=0.121631, mflops=41.108 (err=2.4e-16) 8. Burrus: elapsed time t=1.10449 s, 32768 iters, t-(init.)=1.09473 s t(norm)=0.208803, mflops=23.9461 (err=2.2e-16) 9. CWP (min N) (N=33): elapsed time t=1.05273 s, 131072 iters, t-(init.)=1.01367 s t(norm)=0.0483356, mflops=103.443 10. CWP (best N) (N=35): elapsed time t=1.93945 s, 262144 iters, t-(init.)=1.85645 s t(norm)=0.0442611, mflops=112.966 11. Edelblute: elapsed time t=1.65137 s, 65536 iters, t-(init.)=1.63184 s t(norm)=0.155624, mflops=32.1287 (err=2.2e-16) 12. FFTPACK: elapsed time t=1.06445 s, 65536 iters, t-(init.)=1.0459 s t(norm)=0.0997446, mflops=50.128 (err=2.6e-16) 13. FFTPACK (f2c): elapsed time t=1.08984 s, 65536 iters, t-(init.)=1.07129 s t(norm)=0.102166, mflops=48.9399 (err=2.6e-16) FFTW_MEASURE plan: (cost = 2.525747e-06) FFTW_TWIDDLE 4 FFTW_NOTW 8 14. 
FFTW: elapsed time t=1.41504 s, 524288 iters, t-(init.)=1.26172 s t(norm)=0.0150409, mflops=332.428 (err=2.6e-16) FFTW_ESTIMATE plan: (cost = 1.280000e+02) FFTW_TWIDDLE 4 FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.39844 s, 524288 iters, t-(init.)=1.24512 s t(norm)=0.014843, mflops=336.86 (err=2.6e-16) 16. Frigo-old: elapsed time t=1.38086 s, 524288 iters, t-(init.)=1.22754 s t(norm)=0.0146334, mflops=341.684 (err=2.9e-16) 17. Green: elapsed time t=1.75391 s, 524288 iters, t-(init.)=1.60059 s t(norm)=0.0190805, mflops=262.048 (err=2.4e-16) 18. GSL: elapsed time t=1.09863 s, 131072 iters, t-(init.)=1.05957 s t(norm)=0.0505242, mflops=98.9624 (err=2.9e-16) 19. GSL DIT: elapsed time t=1.73145 s, 131072 iters, t-(init.)=1.69238 s t(norm)=0.0806991, mflops=61.9586 (err=3.3e-16) 20. GSL DIF: elapsed time t=1.58496 s, 131072 iters, t-(init.)=1.54688 s t(norm)=0.0737607, mflops=67.7867 (err=3.0e-16) 21. Krukar: elapsed time t=1.03027 s, 262144 iters, t-(init.)=0.954102 s t(norm)=0.0227476, mflops=219.804 (err=2.6e-16) 22. Mayer (Buneman): elapsed time t=1.09375 s, 131072 iters, t-(init.)=1.05469 s t(norm)=0.0502914, mflops=99.4205 (err=2.2e-16) 23. Mayer (simple): elapsed time t=1.71582 s, 262144 iters, t-(init.)=1.63867 s t(norm)=0.039069, mflops=127.979 24. Mayer (lookup): elapsed time t=1.76074 s, 262144 iters, t-(init.)=1.68457 s t(norm)=0.0401633, mflops=124.492 (err=2.7e-16) 25. Monro: elapsed time t=1.08594 s, 65536 iters, t-(init.)=1.06641 s t(norm)=0.1017, mflops=49.164 (err=3.0e-16) 26. NAPACK (f2c): elapsed time t=1.01074 s, 32768 iters, t-(init.)=1.00098 s t(norm)=0.190921, mflops=26.1888 (err=5.9e-16) 27. Nielsen: elapsed time t=1.27734 s, 65536 iters, t-(init.)=1.25781 s t(norm)=0.119954, mflops=41.6825 (err=1.0e-15) 28. NR (C): elapsed time t=1.50488 s, 131072 iters, t-(init.)=1.46582 s t(norm)=0.0698958, mflops=71.5351 (err=9.8e-16) 29. 
Ooura (C): elapsed time t=1.79102 s, 524288 iters, t-(init.)=1.63867 s t(norm)=0.0195345, mflops=255.958 (err=2.5e-16) 30. Ooura (F): elapsed time t=1.02637 s, 131072 iters, t-(init.)=0.988281 s t(norm)=0.0471249, mflops=106.101 (err=2.5e-16) 31. Ransom: elapsed time t=1.74805 s, 65536 iters, t-(init.)=1.72949 s t(norm)=0.164937, mflops=30.3146 (err=7.2e-16) 32. SCIPORT: elapsed time t=1.57715 s, 65536 iters, t-(init.)=1.55566 s t(norm)=0.14836, mflops=33.7019 (err=2.7e-16) 33. Singleton: elapsed time t=1.79492 s, 131072 iters, t-(init.)=1.75586 s t(norm)=0.0837259, mflops=59.7187 (err=3.2e-16) 34. Singleton (f2c): elapsed time t=1.08984 s, 131072 iters, t-(init.)=1.05176 s t(norm)=0.0501517, mflops=99.6975 (err=3.2e-16) 35. Sorensen: elapsed time t=1.30078 s, 131072 iters, t-(init.)=1.2627 s t(norm)=0.06021, mflops=83.0427 (err=2.4e-16) 36. Sorensen DIT: elapsed time t=1.10352 s, 32768 iters, t-(init.)=1.09375 s t(norm)=0.208616, mflops=23.9675 (err=2.1e-16) 37. Temperton: elapsed time t=1.10742 s, 65536 iters, t-(init.)=1.08789 s t(norm)=0.103749, mflops=48.1931 (err=2.8e-16) 38. Temperton (f2c): elapsed time t=1.08496 s, 65536 iters, t-(init.)=1.06543 s t(norm)=0.101607, mflops=49.2091 (err=2.8e-16) 39. Valkenburg: elapsed time t=1.30078 s, 16384 iters, t-(init.)=1.2959 s t(norm)=0.494346, mflops=10.1144 (err=3.8e-16) Top mflops for N=32 = 341.684 Normalized results and averages for N=32: fft 0: mflops = 91.93 (norm. = 0.26905), norm. avg. (of 5) = 0.432696 fft 1: mflops = 87.3671 (norm. = 0.255696), norm. avg. (of 5) = 0.337211 fft 2: mflops = 56.6021 (norm. = 0.165656), norm. avg. (of 5) = 0.231394 fft 3: mflops = 19.8547 (norm. = 0.0581084), norm. avg. (of 5) = 0.0396841 fft 4: mflops = 44.0058 (norm. = 0.128791), norm. avg. (of 5) = 0.083972 fft 5: mflops = 13.3153 (norm. = 0.0389695), norm. avg. (of 5) = 0.0429377 fft 6: mflops = 117.929 (norm. = 0.34514), norm. avg. (of 5) = 0.205288 fft 7: mflops = 41.108 (norm. = 0.12031), norm. avg. 
(of 5) = 0.0884144 fft 8: mflops = 23.9461 (norm. = 0.0700825), norm. avg. (of 5) = 0.134285 fft 9: mflops = 103.443 (norm. = 0.302746), norm. avg. (of 5) = 0.154313 fft 10: mflops = 112.966 (norm. = 0.330615), norm. avg. (of 5) = 0.127223 fft 11: mflops = 32.1287 (norm. = 0.0940305), norm. avg. (of 4) = 0.089532 fft 12: mflops = 50.128 (norm. = 0.146709), norm. avg. (of 5) = 0.12186 fft 13: mflops = 48.9399 (norm. = 0.143232), norm. avg. (of 5) = 0.119991 fft 14: mflops = 332.428 (norm. = 0.97291), norm. avg. (of 5) = 0.885421 fft 15: mflops = 336.86 (norm. = 0.985882), norm. avg. (of 5) = 0.886509 fft 16: mflops = 341.684 (norm. = 1), norm. avg. (of 5) = 0.82091 fft 17: mflops = 262.048 (norm. = 0.766931), norm. avg. (of 3) = 0.540892 fft 18: mflops = 98.9624 (norm. = 0.289631), norm. avg. (of 5) = 0.177966 fft 19: mflops = 61.9586 (norm. = 0.181333), norm. avg. (of 5) = 0.106585 fft 20: mflops = 67.7867 (norm. = 0.19839), norm. avg. (of 5) = 0.10976 fft 21: mflops = 219.804 (norm. = 0.643296), norm. avg. (of 5) = 0.604827 fft 22: mflops = 99.4205 (norm. = 0.290972), norm. avg. (of 4) = 0.240042 fft 23: mflops = 127.979 (norm. = 0.374553), norm. avg. (of 4) = 0.273839 fft 24: mflops = 124.492 (norm. = 0.364348), norm. avg. (of 4) = 0.261888 fft 25: mflops = 49.164 (norm. = 0.143887), norm. avg. (of 4) = 0.0796053 fft 26: mflops = 26.1888 (norm. = 0.0766463), norm. avg. (of 5) = 0.0434999 fft 27: mflops = 41.6825 (norm. = 0.121991), norm. avg. (of 5) = 0.060407 fft 28: mflops = 71.5351 (norm. = 0.20936), norm. avg. (of 5) = 0.118109 fft 29: mflops = 255.958 (norm. = 0.749106), norm. avg. (of 5) = 0.579537 fft 30: mflops = 106.101 (norm. = 0.310524), norm. avg. (of 5) = 0.294271 fft 31: mflops = 30.3146 (norm. = 0.0887211), norm. avg. (of 4) = 0.0513184 fft 32: mflops = 33.7019 (norm. = 0.0986347), norm. avg. (of 4) = 0.0862457 fft 33: mflops = 59.7187 (norm. = 0.174778), norm. avg. (of 5) = 0.0999368 fft 34: mflops = 99.6975 (norm. = 0.291783), norm. avg. 
(of 5) = 0.148709 fft 35: mflops = 83.0427 (norm. = 0.243039), norm. avg. (of 5) = 0.193038 fft 36: mflops = 23.9675 (norm. = 0.0701451), norm. avg. (of 5) = 0.12017 fft 37: mflops = 48.1931 (norm. = 0.141046), norm. avg. (of 5) = 0.0867725 fft 38: mflops = 49.2091 (norm. = 0.144019), norm. avg. (of 5) = 0.0939931 fft 39: mflops = 10.1144 (norm. = 0.0296015), norm. avg. (of 5) = 0.0323256 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.40625 s, 65536 iters, t-(init.)=1.37109 s t(norm)=0.0544824, mflops=91.7728 (err=6.1e-16) 1. Arndt DIT: elapsed time t=1.50195 s, 65536 iters, t-(init.)=1.4668 s t(norm)=0.0582853, mflops=85.785 (err=6.0e-16) 2. Arndt Split-Radix: elapsed time t=1.01855 s, 32768 iters, t-(init.)=1.00195 s t(norm)=0.0796281, mflops=62.7919 (err=5.9e-16) 3. Arndt 4-step: elapsed time t=1.16699 s, 16384 iters, t-(init.)=1.1582 s t(norm)=0.184091, mflops=27.1604 (err=5.3e-16) 4. Bailey: elapsed time t=1.39063 s, 32768 iters, t-(init.)=1.37305 s t(norm)=0.10912, mflops=45.8211 (err=5.8e-16) 5. Beauregard: elapsed time t=1.1748 s, 8192 iters, t-(init.)=1.1709 s t(norm)=0.372219, mflops=13.433 (err=5.8e-16) 6. Bergland: elapsed time t=1.88574 s, 131072 iters, t-(init.)=1.81543 s t(norm)=0.0360693, mflops=138.622 (err=4.9e-16) 7. Brenner: elapsed time t=1.30762 s, 32768 iters, t-(init.)=1.28711 s t(norm)=0.10229, mflops=48.8805 (err=5.7e-16) 8. Burrus: elapsed time t=1.19824 s, 16384 iters, t-(init.)=1.18945 s t(norm)=0.189058, mflops=26.4468 (err=5.5e-16) 9. CWP (min N) (N=65): elapsed time t=1.83691 s, 131072 iters, t-(init.)=1.76563 s t(norm)=0.0350798, mflops=142.532 10. CWP (best N) (N=84): elapsed time t=1.91309 s, 131072 iters, t-(init.)=1.82324 s t(norm)=0.0362246, mflops=138.028 11. Edelblute: elapsed time t=1.79395 s, 32768 iters, t-(init.)=1.77637 s t(norm)=0.141173, mflops=35.4175 (err=5.5e-16) 12. FFTPACK: elapsed time t=1.08984 s, 32768 iters, t-(init.)=1.07227 s t(norm)=0.085216, mflops=58.6744 (err=5.6e-16) 13. 
FFTPACK (f2c): elapsed time t=1.1084 s, 32768 iters, t-(init.)=1.09082 s t(norm)=0.0866906, mflops=57.6764 (err=5.6e-16) FFTW_MEASURE plan: (cost = 5.170703e-06) FFTW_TWIDDLE 8 FFTW_NOTW 8 14. FFTW: elapsed time t=1.40527 s, 262144 iters, t-(init.)=1.26465 s t(norm)=0.0125632, mflops=397.989 (err=5.3e-16) FFTW_ESTIMATE plan: (cost = 1.536000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.40527 s, 262144 iters, t-(init.)=1.26563 s t(norm)=0.0125729, mflops=397.682 (err=5.3e-16) 16. Frigo-old: elapsed time t=1.7832 s, 262144 iters, t-(init.)=1.64258 s t(norm)=0.0163175, mflops=306.419 (err=5.3e-16) 17. Green: elapsed time t=1.61133 s, 262144 iters, t-(init.)=1.47168 s t(norm)=0.0146198, mflops=342.001 (err=5.2e-16) 18. GSL: elapsed time t=1.89844 s, 131072 iters, t-(init.)=1.82813 s t(norm)=0.0363216, mflops=137.659 (err=5.6e-16) 19. GSL DIT: elapsed time t=1.75098 s, 65536 iters, t-(init.)=1.71582 s t(norm)=0.0681806, mflops=73.3347 (err=5.6e-16) 20. GSL DIF: elapsed time t=1.53906 s, 65536 iters, t-(init.)=1.50391 s t(norm)=0.0597599, mflops=83.6682 (err=5.7e-16) 21. Krukar: elapsed time t=1.27441 s, 131072 iters, t-(init.)=1.2041 s t(norm)=0.0239233, mflops=209.001 (err=5.7e-16) 22. Mayer (Buneman): elapsed time t=1.25 s, 65536 iters, t-(init.)=1.21484 s t(norm)=0.0482736, mflops=103.576 (err=5.5e-16) 23. Mayer (simple): elapsed time t=1.95215 s, 131072 iters, t-(init.)=1.88184 s t(norm)=0.0373887, mflops=133.73 24. Mayer (lookup): elapsed time t=1.96191 s, 131072 iters, t-(init.)=1.89063 s t(norm)=0.0375633, mflops=133.108 (err=5.5e-16) 25. Monro: elapsed time t=1.0459 s, 32768 iters, t-(init.)=1.02832 s t(norm)=0.0817236, mflops=61.1819 (err=5.6e-16) 26. NAPACK (f2c): elapsed time t=2.00098 s, 32768 iters, t-(init.)=1.9834 s t(norm)=0.157626, mflops=31.7206 (err=1.1e-15) 27. Nielsen: elapsed time t=1.12305 s, 32768 iters, t-(init.)=1.10645 s t(norm)=0.0879324, mflops=56.8619 (err=2.0e-15) 28. 
NR (C): elapsed time t=1.4502 s, 65536 iters, t-(init.)=1.41504 s t(norm)=0.0562286, mflops=88.9227 (err=1.1e-15) 29. Ooura (C): elapsed time t=1.74121 s, 262144 iters, t-(init.)=1.60059 s t(norm)=0.0159004, mflops=314.458 (err=4.8e-16) 30. Ooura (F): elapsed time t=1.07422 s, 65536 iters, t-(init.)=1.03906 s t(norm)=0.0412886, mflops=121.099 (err=4.8e-16) 31. Ransom: elapsed time t=1.06348 s, 32768 iters, t-(init.)=1.0459 s t(norm)=0.0831205, mflops=60.1536 (err=7.7e-16) 32. SCIPORT: elapsed time t=1.88574 s, 32768 iters, t-(init.)=1.86719 s t(norm)=0.148391, mflops=33.6948 (err=6.1e-16) 33. Singleton: elapsed time t=1.69434 s, 65536 iters, t-(init.)=1.65918 s t(norm)=0.0659299, mflops=75.8382 (err=7.7e-16) 34. Singleton (f2c): elapsed time t=1.89941 s, 131072 iters, t-(init.)=1.8291 s t(norm)=0.036341, mflops=137.586 (err=7.7e-16) 35. Sorensen: elapsed time t=1.38672 s, 65536 iters, t-(init.)=1.35156 s t(norm)=0.0537063, mflops=93.099 (err=5.8e-16) 36. Sorensen DIT: elapsed time t=1.2002 s, 16384 iters, t-(init.)=1.19141 s t(norm)=0.189369, mflops=26.4035 (err=5.3e-16) 37. Temperton: elapsed time t=1.85449 s, 65536 iters, t-(init.)=1.81934 s t(norm)=0.0722939, mflops=69.1621 (err=5.6e-16) 38. Temperton (f2c): elapsed time t=1.80273 s, 65536 iters, t-(init.)=1.76758 s t(norm)=0.0702372, mflops=71.1873 (err=5.6e-16) 39. Valkenburg: elapsed time t=1.51855 s, 8192 iters, t-(init.)=1.51367 s t(norm)=0.481183, mflops=10.391 (err=6.4e-16) Top mflops for N=64 = 397.989 Normalized results and averages for N=64: fft 0: mflops = 91.7728 (norm. = 0.230591), norm. avg. (of 6) = 0.399012 fft 1: mflops = 85.785 (norm. = 0.215546), norm. avg. (of 6) = 0.316934 fft 2: mflops = 62.7919 (norm. = 0.157773), norm. avg. (of 6) = 0.219123 fft 3: mflops = 27.1604 (norm. = 0.0682441), norm. avg. (of 6) = 0.0444441 fft 4: mflops = 45.8211 (norm. = 0.115132), norm. avg. (of 6) = 0.0891653 fft 5: mflops = 13.433 (norm. = 0.0337521), norm. avg. 
(of 6) = 0.0414067 fft 6: mflops = 138.622 (norm. = 0.348306), norm. avg. (of 6) = 0.229124 fft 7: mflops = 48.8805 (norm. = 0.122819), norm. avg. (of 6) = 0.0941484 fft 8: mflops = 26.4468 (norm. = 0.0664511), norm. avg. (of 6) = 0.122979 fft 9: mflops = 142.532 (norm. = 0.358131), norm. avg. (of 6) = 0.188283 fft 10: mflops = 138.028 (norm. = 0.346813), norm. avg. (of 6) = 0.163821 fft 11: mflops = 35.4175 (norm. = 0.0889912), norm. avg. (of 5) = 0.0894238 fft 12: mflops = 58.6744 (norm. = 0.147427), norm. avg. (of 6) = 0.126122 fft 13: mflops = 57.6764 (norm. = 0.144919), norm. avg. (of 6) = 0.124146 fft 14: mflops = 397.989 (norm. = 1), norm. avg. (of 6) = 0.904518 fft 15: mflops = 397.682 (norm. = 0.999228), norm. avg. (of 6) = 0.905296 fft 16: mflops = 306.419 (norm. = 0.769917), norm. avg. (of 6) = 0.812411 fft 17: mflops = 342.001 (norm. = 0.859323), norm. avg. (of 4) = 0.6205 fft 18: mflops = 137.659 (norm. = 0.345887), norm. avg. (of 6) = 0.205952 fft 19: mflops = 73.3347 (norm. = 0.184263), norm. avg. (of 6) = 0.119531 fft 20: mflops = 83.6682 (norm. = 0.210227), norm. avg. (of 6) = 0.126505 fft 21: mflops = 209.001 (norm. = 0.525142), norm. avg. (of 6) = 0.591546 fft 22: mflops = 103.576 (norm. = 0.260249), norm. avg. (of 5) = 0.244083 fft 23: mflops = 133.73 (norm. = 0.336015), norm. avg. (of 5) = 0.286274 fft 24: mflops = 133.108 (norm. = 0.334452), norm. avg. (of 5) = 0.276401 fft 25: mflops = 61.1819 (norm. = 0.153727), norm. avg. (of 5) = 0.0944297 fft 26: mflops = 31.7206 (norm. = 0.0797021), norm. avg. (of 6) = 0.0495336 fft 27: mflops = 56.8619 (norm. = 0.142873), norm. avg. (of 6) = 0.0741513 fft 28: mflops = 88.9227 (norm. = 0.22343), norm. avg. (of 6) = 0.135663 fft 29: mflops = 314.458 (norm. = 0.790116), norm. avg. (of 6) = 0.614634 fft 30: mflops = 121.099 (norm. = 0.304276), norm. avg. (of 6) = 0.295939 fft 31: mflops = 60.1536 (norm. = 0.151144), norm. avg. (of 5) = 0.0712835 fft 32: mflops = 33.6948 (norm. = 0.0846627), norm. avg. 
(of 5) = 0.0859291 fft 33: mflops = 75.8382 (norm. = 0.190553), norm. avg. (of 6) = 0.11504 fft 34: mflops = 137.586 (norm. = 0.345702), norm. avg. (of 6) = 0.181541 fft 35: mflops = 93.099 (norm. = 0.233923), norm. avg. (of 6) = 0.199852 fft 36: mflops = 26.4035 (norm. = 0.0663422), norm. avg. (of 6) = 0.111199 fft 37: mflops = 69.1621 (norm. = 0.173779), norm. avg. (of 6) = 0.101274 fft 38: mflops = 71.1873 (norm. = 0.178867), norm. avg. (of 6) = 0.108139 fft 39: mflops = 10.391 (norm. = 0.0261089), norm. avg. (of 6) = 0.0312895 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.44922 s, 32768 iters, t-(init.)=1.41602 s t(norm)=0.0482292, mflops=103.672 (err=3.7e-16) 1. Arndt DIT: elapsed time t=1.5166 s, 32768 iters, t-(init.)=1.4834 s t(norm)=0.0505242, mflops=98.9624 (err=3.5e-16) 2. Arndt Split-Radix: elapsed time t=1.08984 s, 16384 iters, t-(init.)=1.07324 s t(norm)=0.0731088, mflops=68.3912 (err=3.7e-16) 3. Arndt 4-step: elapsed time t=1.34473 s, 8192 iters, t-(init.)=1.33594 s t(norm)=0.182007, mflops=27.4715 (err=3.7e-16) 4. Bailey: elapsed time t=1.36719 s, 16384 iters, t-(init.)=1.35059 s t(norm)=0.0920014, mflops=54.347 (err=3.8e-16) 5. Beauregard: elapsed time t=1.375 s, 4096 iters, t-(init.)=1.37012 s t(norm)=0.373327, mflops=13.3931 (err=3.6e-16) 6. Bergland: elapsed time t=1.00684 s, 32768 iters, t-(init.)=0.973633 s t(norm)=0.0331617, mflops=150.776 (err=3.5e-16) 7. Brenner: elapsed time t=1.3916 s, 16384 iters, t-(init.)=1.37305 s t(norm)=0.0935314, mflops=53.458 (err=4.0e-16) 8. Burrus: elapsed time t=1.30176 s, 8192 iters, t-(init.)=1.29297 s t(norm)=0.176153, mflops=28.3844 (err=3.8e-16) 9. CWP (min N) (N=130): elapsed time t=1.77246 s, 65536 iters, t-(init.)=1.70508 s t(norm)=0.0290373, mflops=172.192 10. CWP (best N) (N=140): elapsed time t=1.55957 s, 65536 iters, t-(init.)=1.4873 s t(norm)=0.0253286, mflops=197.405 11. 
Edelblute: elapsed time t=1.87891 s, 16384 iters, t-(init.)=1.8623 s t(norm)=0.126859, mflops=39.4137 (err=3.8e-16) 12. FFTPACK: elapsed time t=1.14453 s, 16384 iters, t-(init.)=1.12793 s t(norm)=0.0768341, mflops=65.0753 (err=3.8e-16) 13. FFTPACK (f2c): elapsed time t=1.16113 s, 16384 iters, t-(init.)=1.14453 s t(norm)=0.077965, mflops=64.1313 (err=3.8e-16) FFTW_MEASURE plan: (cost = 1.099706e-05) FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.62207 s, 131072 iters, t-(init.)=1.48828 s t(norm)=0.0126726, mflops=394.551 (err=3.8e-16) FFTW_ESTIMATE plan: (cost = 7.168000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.62012 s, 131072 iters, t-(init.)=1.48633 s t(norm)=0.012656, mflops=395.069 (err=3.8e-16) 16. Frigo-old: elapsed time t=1.10449 s, 65536 iters, t-(init.)=1.03809 s t(norm)=0.0176785, mflops=282.829 (err=3.5e-16) 17. Green: elapsed time t=1.89844 s, 131072 iters, t-(init.)=1.76367 s t(norm)=0.0150176, mflops=332.943 (err=3.8e-16) 18. GSL: elapsed time t=1.99414 s, 65536 iters, t-(init.)=1.92773 s t(norm)=0.0328291, mflops=152.304 (err=3.7e-16) 19. GSL DIT: elapsed time t=1.80664 s, 32768 iters, t-(init.)=1.77344 s t(norm)=0.0604029, mflops=82.7775 (err=3.9e-16) 20. GSL DIF: elapsed time t=1.55664 s, 32768 iters, t-(init.)=1.52344 s t(norm)=0.051888, mflops=96.3614 (err=4.0e-16) 21. Krukar: elapsed time t=1.57227 s, 65536 iters, t-(init.)=1.50586 s t(norm)=0.0256446, mflops=194.973 (err=3.4e-16) 22. Mayer (Buneman): elapsed time t=1.32227 s, 32768 iters, t-(init.)=1.28906 s t(norm)=0.0439052, mflops=113.882 (err=3.5e-16) 23. Mayer (simple): elapsed time t=1.02246 s, 32768 iters, t-(init.)=0.989258 s t(norm)=0.0336939, mflops=148.395 24. Mayer (lookup): elapsed time t=1.03027 s, 32768 iters, t-(init.)=0.99707 s t(norm)=0.03396, mflops=147.232 (err=3.4e-16) 25. Monro: elapsed time t=1.14648 s, 16384 iters, t-(init.)=1.12988 s t(norm)=0.0769672, mflops=64.9628 (err=3.7e-16) 26. 
NAPACK (f2c): elapsed time t=1.07129 s, 8192 iters, t-(init.)=1.0625 s t(norm)=0.144754, mflops=34.5413 (err=1.4e-15) 27. Nielsen: elapsed time t=1.33691 s, 16384 iters, t-(init.)=1.32031 s t(norm)=0.0899392, mflops=55.5931 (err=1.0e-15) 28. NR (C): elapsed time t=1.47266 s, 32768 iters, t-(init.)=1.43945 s t(norm)=0.0490275, mflops=101.984 (err=1.5e-15) 29. Ooura (C): elapsed time t=1.0127 s, 65536 iters, t-(init.)=0.946289 s t(norm)=0.0161152, mflops=310.266 (err=3.3e-16) 30. Ooura (F): elapsed time t=1.27148 s, 32768 iters, t-(init.)=1.2373 s t(norm)=0.0421423, mflops=118.646 (err=3.3e-16) 31. Ransom: elapsed time t=1.26953 s, 16384 iters, t-(init.)=1.25293 s t(norm)=0.0853491, mflops=58.583 (err=9.4e-16) 32. SCIPORT: elapsed time t=1.08398 s, 8192 iters, t-(init.)=1.0752 s t(norm)=0.146484, mflops=34.1335 (err=3.9e-16) 33. Singleton: elapsed time t=1.08203 s, 16384 iters, t-(init.)=1.06543 s t(norm)=0.0725766, mflops=68.8927 (err=3.9e-16) 34. Singleton (f2c): elapsed time t=1.13574 s, 32768 iters, t-(init.)=1.10156 s t(norm)=0.037519, mflops=133.266 (err=3.9e-16) 35. Sorensen: elapsed time t=1.47168 s, 32768 iters, t-(init.)=1.43848 s t(norm)=0.0489942, mflops=102.053 (err=3.3e-16) 36. Sorensen DIT: elapsed time t=1.27734 s, 8192 iters, t-(init.)=1.26953 s t(norm)=0.17296, mflops=28.9084 (err=3.8e-16) 37. Temperton: elapsed time t=1.05176 s, 16384 iters, t-(init.)=1.03516 s t(norm)=0.0705144, mflops=70.9075 (err=3.4e-16) 38. Temperton (f2c): elapsed time t=1.08496 s, 16384 iters, t-(init.)=1.06836 s t(norm)=0.0727762, mflops=68.7038 (err=3.4e-16) 39. Valkenburg: elapsed time t=1.74707 s, 4096 iters, t-(init.)=1.74316 s t(norm)=0.474975, mflops=10.5269 (err=5.4e-16) Top mflops for N=128 = 395.069 Normalized results and averages for N=128: fft 0: mflops = 103.672 (norm. = 0.262414), norm. avg. (of 7) = 0.379498 fft 1: mflops = 98.9624 (norm. = 0.250494), norm. avg. (of 7) = 0.307442 fft 2: mflops = 68.3912 (norm. = 0.173112), norm. avg. 
(of 7) = 0.21255 fft 3: mflops = 27.4715 (norm. = 0.0695358), norm. avg. (of 7) = 0.0480287 fft 4: mflops = 54.347 (norm. = 0.137563), norm. avg. (of 7) = 0.0960793 fft 5: mflops = 13.3931 (norm. = 0.0339006), norm. avg. (of 7) = 0.0403344 fft 6: mflops = 150.776 (norm. = 0.381645), norm. avg. (of 7) = 0.250913 fft 7: mflops = 53.458 (norm. = 0.135313), norm. avg. (of 7) = 0.100029 fft 8: mflops = 28.3844 (norm. = 0.0718467), norm. avg. (of 7) = 0.115675 fft 9: mflops = 172.192 (norm. = 0.435853), norm. avg. (of 7) = 0.22365 fft 10: mflops = 197.405 (norm. = 0.499672), norm. avg. (of 7) = 0.2118 fft 11: mflops = 39.4137 (norm. = 0.099764), norm. avg. (of 6) = 0.0911472 fft 12: mflops = 65.0753 (norm. = 0.164719), norm. avg. (of 7) = 0.131635 fft 13: mflops = 64.1313 (norm. = 0.162329), norm. avg. (of 7) = 0.129601 fft 14: mflops = 394.551 (norm. = 0.998688), norm. avg. (of 7) = 0.91797 fft 15: mflops = 395.069 (norm. = 1), norm. avg. (of 7) = 0.918825 fft 16: mflops = 282.829 (norm. = 0.715898), norm. avg. (of 7) = 0.798623 fft 17: mflops = 332.943 (norm. = 0.842746), norm. avg. (of 5) = 0.664949 fft 18: mflops = 152.304 (norm. = 0.385512), norm. avg. (of 7) = 0.231604 fft 19: mflops = 82.7775 (norm. = 0.209526), norm. avg. (of 7) = 0.132387 fft 20: mflops = 96.3614 (norm. = 0.24391), norm. avg. (of 7) = 0.143277 fft 21: mflops = 194.973 (norm. = 0.493515), norm. avg. (of 7) = 0.577542 fft 22: mflops = 113.882 (norm. = 0.288258), norm. avg. (of 6) = 0.251446 fft 23: mflops = 148.395 (norm. = 0.375617), norm. avg. (of 6) = 0.301165 fft 24: mflops = 147.232 (norm. = 0.372674), norm. avg. (of 6) = 0.292446 fft 25: mflops = 64.9628 (norm. = 0.164434), norm. avg. (of 6) = 0.106097 fft 26: mflops = 34.5413 (norm. = 0.0874311), norm. avg. (of 7) = 0.0549475 fft 27: mflops = 55.5931 (norm. = 0.140717), norm. avg. (of 7) = 0.0836608 fft 28: mflops = 101.984 (norm. = 0.258141), norm. avg. (of 7) = 0.15316 fft 29: mflops = 310.266 (norm. = 0.785346), norm. avg. 
(of 7) = 0.639021 fft 30: mflops = 118.646 (norm. = 0.300316), norm. avg. (of 7) = 0.296564 fft 31: mflops = 58.583 (norm. = 0.148285), norm. avg. (of 6) = 0.0841171 fft 32: mflops = 34.1335 (norm. = 0.0863987), norm. avg. (of 6) = 0.0860074 fft 33: mflops = 68.8927 (norm. = 0.174381), norm. avg. (of 7) = 0.123517 fft 34: mflops = 133.266 (norm. = 0.337323), norm. avg. (of 7) = 0.203796 fft 35: mflops = 102.053 (norm. = 0.258316), norm. avg. (of 7) = 0.208204 fft 36: mflops = 28.9084 (norm. = 0.0731731), norm. avg. (of 7) = 0.105766 fft 37: mflops = 70.9075 (norm. = 0.179481), norm. avg. (of 7) = 0.112446 fft 38: mflops = 68.7038 (norm. = 0.173903), norm. avg. (of 7) = 0.117534 fft 39: mflops = 10.5269 (norm. = 0.0266457), norm. avg. (of 7) = 0.0306261 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.60156 s, 16384 iters, t-(init.)=1.56934 s t(norm)=0.0467699, mflops=106.906 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.70898 s, 16384 iters, t-(init.)=1.67578 s t(norm)=0.0499422, mflops=100.116 (err=1.0e-15) 2. Arndt Split-Radix: elapsed time t=1.1582 s, 8192 iters, t-(init.)=1.1416 s t(norm)=0.0680448, mflops=73.481 (err=9.9e-16) 3. Arndt 4-step: elapsed time t=1.38672 s, 4096 iters, t-(init.)=1.37793 s t(norm)=0.164262, mflops=30.4392 (err=9.9e-16) 4. Bailey: elapsed time t=1.62988 s, 8192 iters, t-(init.)=1.61328 s t(norm)=0.0961591, mflops=51.9972 (err=9.9e-16) 5. Beauregard: elapsed time t=1.58203 s, 2048 iters, t-(init.)=1.57813 s t(norm)=0.376254, mflops=13.2889 (err=1.1e-15) 6. Bergland: elapsed time t=1.97949 s, 32768 iters, t-(init.)=1.91406 s t(norm)=0.0285218, mflops=175.305 (err=9.9e-16) 7. Brenner: elapsed time t=1.46094 s, 8192 iters, t-(init.)=1.44336 s t(norm)=0.0860309, mflops=58.1186 (err=1.1e-15) 8. Burrus: elapsed time t=1.35547 s, 4096 iters, t-(init.)=1.34668 s t(norm)=0.160537, mflops=31.1455 (err=9.8e-16) 9. 
CWP (min N) (N=260): elapsed time t=1.57813 s, 32768 iters, t-(init.)=1.51172 s t(norm)=0.0225264, mflops=221.962 10. CWP (best N) (N=280): elapsed time t=1.45313 s, 32768 iters, t-(init.)=1.38281 s t(norm)=0.0206055, mflops=242.654 11. Edelblute: elapsed time t=1.97656 s, 8192 iters, t-(init.)=1.95996 s t(norm)=0.116823, mflops=42.7999 (err=9.8e-16) 12. FFTPACK: elapsed time t=1.20703 s, 8192 iters, t-(init.)=1.19043 s t(norm)=0.0709551, mflops=70.4671 (err=1.1e-15) 13. FFTPACK (f2c): elapsed time t=1.20996 s, 8192 iters, t-(init.)=1.19336 s t(norm)=0.0711298, mflops=70.2941 (err=1.1e-15) FFTW_MEASURE plan: (cost = 2.777576e-05) FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_NOTW 8 14. FFTW: elapsed time t=1.86621 s, 65536 iters, t-(init.)=1.73633 s t(norm)=0.0129367, mflops=386.499 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 1.280000e+03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.93359 s, 65536 iters, t-(init.)=1.80371 s t(norm)=0.0134387, mflops=372.06 (err=1.1e-15) 16. Frigo-old: elapsed time t=1.23047 s, 32768 iters, t-(init.)=1.16504 s t(norm)=0.0173604, mflops=288.011 (err=1.1e-15) 17. Green: elapsed time t=1.92676 s, 65536 iters, t-(init.)=1.79688 s t(norm)=0.0133878, mflops=373.475 (err=1.1e-15) 18. GSL: elapsed time t=1.8916 s, 32768 iters, t-(init.)=1.82715 s t(norm)=0.0272266, mflops=183.644 (err=1.1e-15) 19. GSL DIT: elapsed time t=1.93848 s, 16384 iters, t-(init.)=1.90527 s t(norm)=0.0567816, mflops=88.0567 (err=1.1e-15) 20. GSL DIF: elapsed time t=1.64746 s, 16384 iters, t-(init.)=1.61523 s t(norm)=0.0481377, mflops=103.869 (err=1.1e-15) 21. Krukar: elapsed time t=1.14551 s, 16384 iters, t-(init.)=1.11328 s t(norm)=0.0331784, mflops=150.701 (err=1.1e-15) 22. Mayer (Buneman): elapsed time t=1.42285 s, 16384 iters, t-(init.)=1.39063 s t(norm)=0.0414439, mflops=120.645 (err=1.0e-15) 23. Mayer (simple): elapsed time t=1.10156 s, 16384 iters, t-(init.)=1.06934 s t(norm)=0.0318687, mflops=156.894 24. 
Mayer (lookup): elapsed time t=1.10742 s, 16384 iters, t-(init.)=1.0752 s t(norm)=0.0320433, mflops=156.039 (err=9.8e-16) 25. Monro: elapsed time t=1.15137 s, 8192 iters, t-(init.)=1.13477 s t(norm)=0.0676373, mflops=73.9237 (err=9.9e-16) 26. NAPACK (f2c): elapsed time t=1.11133 s, 4096 iters, t-(init.)=1.10352 s t(norm)=0.131549, mflops=38.0086 (err=3.5e-15) 27. Nielsen: elapsed time t=1.3252 s, 8192 iters, t-(init.)=1.30957 s t(norm)=0.0780565, mflops=64.0562 (err=3.2e-15) 28. NR (C): elapsed time t=1.55566 s, 16384 iters, t-(init.)=1.52344 s t(norm)=0.045402, mflops=110.127 (err=2.1e-15) 29. Ooura (C): elapsed time t=1.9873 s, 65536 iters, t-(init.)=1.85742 s t(norm)=0.0138389, mflops=361.301 (err=1.0e-15) 30. Ooura (F): elapsed time t=1.3418 s, 16384 iters, t-(init.)=1.30859 s t(norm)=0.0389991, mflops=128.208 (err=1.0e-15) 31. Ransom: elapsed time t=1.00781 s, 8192 iters, t-(init.)=0.991211 s t(norm)=0.0590808, mflops=84.6299 (err=1.5e-15) 32. SCIPORT: elapsed time t=1.2666 s, 4096 iters, t-(init.)=1.25879 s t(norm)=0.150059, mflops=33.3201 (err=1.1e-15) 33. Singleton: elapsed time t=1.93848 s, 16384 iters, t-(init.)=1.90625 s t(norm)=0.0568107, mflops=88.0116 (err=1.4e-15) 34. Singleton (f2c): elapsed time t=1.99609 s, 32768 iters, t-(init.)=1.93066 s t(norm)=0.0287691, mflops=173.797 (err=1.4e-15) 35. Sorensen: elapsed time t=1.64453 s, 16384 iters, t-(init.)=1.61133 s t(norm)=0.0480213, mflops=104.12 (err=9.3e-16) 36. Sorensen DIT: elapsed time t=1.36523 s, 4096 iters, t-(init.)=1.35645 s t(norm)=0.161701, mflops=30.9213 (err=9.4e-16) 37. Temperton: elapsed time t=1.0166 s, 8192 iters, t-(init.)=1 s t(norm)=0.0596046, mflops=83.8861 (err=1.1e-15) 38. Temperton (f2c): elapsed time t=1.01758 s, 8192 iters, t-(init.)=1.00195 s t(norm)=0.0597211, mflops=83.7226 (err=1.1e-15) 39. 
Valkenburg: elapsed time t=1.98926 s, 2048 iters, t-(init.)=1.98438 s t(norm)=0.473112, mflops=10.5683 (err=1.1e-15) Top mflops for N=256 = 386.499 Normalized results and averages for N=256: fft 0: mflops = 106.906 (norm. = 0.276602), norm. avg. (of 8) = 0.366636 fft 1: mflops = 100.116 (norm. = 0.259033), norm. avg. (of 8) = 0.301391 fft 2: mflops = 73.481 (norm. = 0.19012), norm. avg. (of 8) = 0.209747 fft 3: mflops = 30.4392 (norm. = 0.0787562), norm. avg. (of 8) = 0.0518696 fft 4: mflops = 51.9972 (norm. = 0.134534), norm. avg. (of 8) = 0.100886 fft 5: mflops = 13.2889 (norm. = 0.0343827), norm. avg. (of 8) = 0.0395905 fft 6: mflops = 175.305 (norm. = 0.453571), norm. avg. (of 8) = 0.276245 fft 7: mflops = 58.1186 (norm. = 0.150372), norm. avg. (of 8) = 0.106322 fft 8: mflops = 31.1455 (norm. = 0.0805838), norm. avg. (of 8) = 0.111288 fft 9: mflops = 221.962 (norm. = 0.574289), norm. avg. (of 8) = 0.26748 fft 10: mflops = 242.654 (norm. = 0.627825), norm. avg. (of 8) = 0.263803 fft 11: mflops = 42.7999 (norm. = 0.110737), norm. avg. (of 7) = 0.0939458 fft 12: mflops = 70.4671 (norm. = 0.182322), norm. avg. (of 8) = 0.137971 fft 13: mflops = 70.2941 (norm. = 0.181874), norm. avg. (of 8) = 0.136135 fft 14: mflops = 386.499 (norm. = 1), norm. avg. (of 8) = 0.928224 fft 15: mflops = 372.06 (norm. = 0.962642), norm. avg. (of 8) = 0.924302 fft 16: mflops = 288.011 (norm. = 0.74518), norm. avg. (of 8) = 0.791943 fft 17: mflops = 373.475 (norm. = 0.966304), norm. avg. (of 6) = 0.715175 fft 18: mflops = 183.644 (norm. = 0.475147), norm. avg. (of 8) = 0.262047 fft 19: mflops = 88.0567 (norm. = 0.227832), norm. avg. (of 8) = 0.144318 fft 20: mflops = 103.869 (norm. = 0.268742), norm. avg. (of 8) = 0.15896 fft 21: mflops = 150.701 (norm. = 0.389912), norm. avg. (of 8) = 0.554088 fft 22: mflops = 120.645 (norm. = 0.312149), norm. avg. (of 7) = 0.260118 fft 23: mflops = 156.894 (norm. = 0.405936), norm. avg. (of 7) = 0.316132 fft 24: mflops = 156.039 (norm. 
= 0.403724), norm. avg. (of 7) = 0.308343 fft 25: mflops = 73.9237 (norm. = 0.191265), norm. avg. (of 7) = 0.118264 fft 26: mflops = 38.0086 (norm. = 0.0983407), norm. avg. (of 8) = 0.0603717 fft 27: mflops = 64.0562 (norm. = 0.165735), norm. avg. (of 8) = 0.09392 fft 28: mflops = 110.127 (norm. = 0.284936), norm. avg. (of 8) = 0.169632 fft 29: mflops = 361.301 (norm. = 0.934805), norm. avg. (of 8) = 0.675994 fft 30: mflops = 128.208 (norm. = 0.331716), norm. avg. (of 8) = 0.300958 fft 31: mflops = 84.6299 (norm. = 0.218966), norm. avg. (of 7) = 0.103381 fft 32: mflops = 33.3201 (norm. = 0.0862102), norm. avg. (of 7) = 0.0860364 fft 33: mflops = 88.0116 (norm. = 0.227715), norm. avg. (of 8) = 0.136542 fft 34: mflops = 173.797 (norm. = 0.449671), norm. avg. (of 8) = 0.23453 fft 35: mflops = 104.12 (norm. = 0.269394), norm. avg. (of 8) = 0.215853 fft 36: mflops = 30.9213 (norm. = 0.0800036), norm. avg. (of 8) = 0.102546 fft 37: mflops = 83.8861 (norm. = 0.217041), norm. avg. (of 8) = 0.12552 fft 38: mflops = 83.7226 (norm. = 0.216618), norm. avg. (of 8) = 0.129919 fft 39: mflops = 10.5683 (norm. = 0.0273438), norm. avg. (of 8) = 0.0302158 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.64551 s, 8192 iters, t-(init.)=1.61328 s t(norm)=0.0427374, mflops=116.994 (err=9.9e-16) 1. Arndt DIT: elapsed time t=1.7334 s, 8192 iters, t-(init.)=1.70215 s t(norm)=0.0450915, mflops=110.886 (err=9.5e-16) 2. Arndt Split-Radix: elapsed time t=1.21582 s, 4096 iters, t-(init.)=1.2002 s t(norm)=0.0635886, mflops=78.6304 (err=1.0e-15) 3. Arndt 4-step: elapsed time t=1.47461 s, 2048 iters, t-(init.)=1.4668 s t(norm)=0.155427, mflops=32.1694 (err=9.0e-16) 4. Bailey: elapsed time t=1.62988 s, 4096 iters, t-(init.)=1.61426 s t(norm)=0.0855265, mflops=58.4614 (err=1.0e-15) 5. Beauregard: elapsed time t=1.79883 s, 1024 iters, t-(init.)=1.79492 s t(norm)=0.380394, mflops=13.1443 (err=9.7e-16) 6. 
Bergland: elapsed time t=1.02734 s, 8192 iters, t-(init.)=0.995117 s t(norm)=0.0263616, mflops=189.67 (err=1.0e-15) 7. Brenner: elapsed time t=1.59375 s, 4096 iters, t-(init.)=1.57715 s t(norm)=0.0835603, mflops=59.837 (err=9.3e-16) 8. Burrus: elapsed time t=1.43945 s, 2048 iters, t-(init.)=1.43164 s t(norm)=0.151702, mflops=32.9593 (err=9.7e-16) 9. CWP (min N) (N=520): elapsed time t=1.65234 s, 16384 iters, t-(init.)=1.58691 s t(norm)=0.0210194, mflops=237.875 10. CWP (best N) (N=560): elapsed time t=1.69824 s, 16384 iters, t-(init.)=1.62793 s t(norm)=0.0215627, mflops=231.882 11. Edelblute: elapsed time t=1.02246 s, 2048 iters, t-(init.)=1.01465 s t(norm)=0.107516, mflops=46.5047 (err=9.7e-16) 12. FFTPACK: elapsed time t=1.64844 s, 4096 iters, t-(init.)=1.63184 s t(norm)=0.0864578, mflops=57.8317 (err=9.5e-16) 13. FFTPACK (f2c): elapsed time t=1.64941 s, 4096 iters, t-(init.)=1.63281 s t(norm)=0.0865095, mflops=57.7971 (err=9.5e-16) FFTW_MEASURE plan: (cost = 5.960464e-05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 8 14. FFTW: elapsed time t=1 s, 16384 iters, t-(init.)=0.935547 s t(norm)=0.0123918, mflops=403.494 (err=9.3e-16) FFTW_ESTIMATE plan: (cost = 1.740800e+03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.98926 s, 32768 iters, t-(init.)=1.86035 s t(norm)=0.0123206, mflops=405.824 (err=9.3e-16) 16. Frigo-old: elapsed time t=1.66602 s, 16384 iters, t-(init.)=1.60156 s t(norm)=0.0212135, mflops=235.699 (err=8.9e-16) 17. Green: elapsed time t=1.00391 s, 16384 iters, t-(init.)=0.94043 s t(norm)=0.0124564, mflops=401.399 (err=9.9e-16) 18. GSL: elapsed time t=1.19043 s, 8192 iters, t-(init.)=1.1582 s t(norm)=0.0306819, mflops=162.963 (err=9.1e-16) 19. GSL DIT: elapsed time t=1.04883 s, 4096 iters, t-(init.)=1.03223 s t(norm)=0.0546893, mflops=91.4255 (err=1.1e-15) 20. GSL DIF: elapsed time t=1.76465 s, 8192 iters, t-(init.)=1.73242 s t(norm)=0.0458935, mflops=108.948 (err=1.2e-15) 21. 
Krukar: elapsed time t=1.4541 s, 8192 iters, t-(init.)=1.42188 s t(norm)=0.0376668, mflops=132.743 (err=9.2e-16) 22. Mayer (Buneman): elapsed time t=1.51367 s, 8192 iters, t-(init.)=1.48145 s t(norm)=0.0392449, mflops=127.405 (err=9.3e-16) 23. Mayer (simple): elapsed time t=1.1709 s, 8192 iters, t-(init.)=1.13867 s t(norm)=0.0301645, mflops=165.758 24. Mayer (lookup): elapsed time t=1.17676 s, 8192 iters, t-(init.)=1.14453 s t(norm)=0.0303197, mflops=164.909 (err=9.4e-16) 25. Monro: elapsed time t=1.26172 s, 4096 iters, t-(init.)=1.24512 s t(norm)=0.0659687, mflops=75.7935 (err=9.5e-16) 26. NAPACK (f2c): elapsed time t=1.375 s, 2048 iters, t-(init.)=1.36621 s t(norm)=0.144769, mflops=34.5378 (err=7.5e-15) 27. Nielsen: elapsed time t=1.33594 s, 4096 iters, t-(init.)=1.31934 s t(norm)=0.0699009, mflops=71.5298 (err=3.8e-15) 28. NR (C): elapsed time t=1.66602 s, 8192 iters, t-(init.)=1.63379 s t(norm)=0.0432806, mflops=115.525 (err=2.3e-15) 29. Ooura (C): elapsed time t=1.16797 s, 16384 iters, t-(init.)=1.10449 s t(norm)=0.0146295, mflops=341.775 (err=9.0e-16) 30. Ooura (F): elapsed time t=1.56445 s, 8192 iters, t-(init.)=1.53223 s t(norm)=0.0405901, mflops=123.183 (err=9.0e-16) 31. Ransom: elapsed time t=1.19629 s, 4096 iters, t-(init.)=1.18066 s t(norm)=0.0625538, mflops=79.9312 (err=1.3e-15) 32. SCIPORT: elapsed time t=1.46387 s, 2048 iters, t-(init.)=1.45605 s t(norm)=0.154289, mflops=32.4067 (err=9.8e-16) 33. Singleton: elapsed time t=1.07031 s, 4096 iters, t-(init.)=1.05371 s t(norm)=0.0558276, mflops=89.5614 (err=1.2e-15) 34. Singleton (f2c): elapsed time t=1.10547 s, 8192 iters, t-(init.)=1.07324 s t(norm)=0.0284312, mflops=175.863 (err=1.2e-15) 35. Sorensen: elapsed time t=1.80859 s, 8192 iters, t-(init.)=1.77637 s t(norm)=0.0470577, mflops=106.253 (err=9.1e-16) 36. Sorensen DIT: elapsed time t=1.42578 s, 2048 iters, t-(init.)=1.41797 s t(norm)=0.150253, mflops=33.2771 (err=1.0e-15) 37. 
Temperton: elapsed time t=1.31152 s, 4096 iters, t-(init.)=1.2959 s t(norm)=0.0686592, mflops=72.8235 (err=9.3e-16) 38. Temperton (f2c): elapsed time t=1.2959 s, 4096 iters, t-(init.)=1.28027 s t(norm)=0.0678313, mflops=73.7123 (err=9.3e-16) 39. Valkenburg: elapsed time t=1.12207 s, 512 iters, t-(init.)=1.12012 s t(norm)=0.474768, mflops=10.5315 (err=1.3e-15) Top mflops for N=512 = 405.824 Normalized results and averages for N=512: fft 0: mflops = 116.994 (norm. = 0.288287), norm. avg. (of 9) = 0.35793 fft 1: mflops = 110.886 (norm. = 0.273236), norm. avg. (of 9) = 0.298263 fft 2: mflops = 78.6304 (norm. = 0.193755), norm. avg. (of 9) = 0.20797 fft 3: mflops = 32.1694 (norm. = 0.0792693), norm. avg. (of 9) = 0.054914 fft 4: mflops = 58.4614 (norm. = 0.144056), norm. avg. (of 9) = 0.105683 fft 5: mflops = 13.1443 (norm. = 0.0323891), norm. avg. (of 9) = 0.0387903 fft 6: mflops = 189.67 (norm. = 0.46737), norm. avg. (of 9) = 0.297481 fft 7: mflops = 59.837 (norm. = 0.147446), norm. avg. (of 9) = 0.110891 fft 8: mflops = 32.9593 (norm. = 0.0812159), norm. avg. (of 9) = 0.107947 fft 9: mflops = 237.875 (norm. = 0.586154), norm. avg. (of 9) = 0.302888 fft 10: mflops = 231.882 (norm. = 0.571386), norm. avg. (of 9) = 0.297979 fft 11: mflops = 46.5047 (norm. = 0.114593), norm. avg. (of 8) = 0.0965268 fft 12: mflops = 57.8317 (norm. = 0.142504), norm. avg. (of 9) = 0.138475 fft 13: mflops = 57.7971 (norm. = 0.142419), norm. avg. (of 9) = 0.136833 fft 14: mflops = 403.494 (norm. = 0.994259), norm. avg. (of 9) = 0.935561 fft 15: mflops = 405.824 (norm. = 1), norm. avg. (of 9) = 0.932713 fft 16: mflops = 235.699 (norm. = 0.580793), norm. avg. (of 9) = 0.768482 fft 17: mflops = 401.399 (norm. = 0.989097), norm. avg. (of 7) = 0.754307 fft 18: mflops = 162.963 (norm. = 0.40156), norm. avg. (of 9) = 0.277548 fft 19: mflops = 91.4255 (norm. = 0.225284), norm. avg. (of 9) = 0.153314 fft 20: mflops = 108.948 (norm. = 0.268461), norm. avg. 
(of 9) = 0.171127 fft 21: mflops = 132.743 (norm. = 0.327095), norm. avg. (of 9) = 0.528867 fft 22: mflops = 127.405 (norm. = 0.313942), norm. avg. (of 8) = 0.266846 fft 23: mflops = 165.758 (norm. = 0.408448), norm. avg. (of 8) = 0.327671 fft 24: mflops = 164.909 (norm. = 0.406357), norm. avg. (of 8) = 0.320595 fft 25: mflops = 75.7935 (norm. = 0.186765), norm. avg. (of 8) = 0.126827 fft 26: mflops = 34.5378 (norm. = 0.0851054), norm. avg. (of 9) = 0.0631199 fft 27: mflops = 71.5298 (norm. = 0.176258), norm. avg. (of 9) = 0.103069 fft 28: mflops = 115.525 (norm. = 0.284668), norm. avg. (of 9) = 0.182413 fft 29: mflops = 341.775 (norm. = 0.842175), norm. avg. (of 9) = 0.694459 fft 30: mflops = 123.183 (norm. = 0.303537), norm. avg. (of 9) = 0.301245 fft 31: mflops = 79.9312 (norm. = 0.19696), norm. avg. (of 8) = 0.115079 fft 32: mflops = 32.4067 (norm. = 0.0798541), norm. avg. (of 8) = 0.0852636 fft 33: mflops = 89.5614 (norm. = 0.22069), norm. avg. (of 9) = 0.145892 fft 34: mflops = 175.863 (norm. = 0.433348), norm. avg. (of 9) = 0.256621 fft 35: mflops = 106.253 (norm. = 0.26182), norm. avg. (of 9) = 0.22096 fft 36: mflops = 33.2771 (norm. = 0.081999), norm. avg. (of 9) = 0.100263 fft 37: mflops = 72.8235 (norm. = 0.179446), norm. avg. (of 9) = 0.131512 fft 38: mflops = 73.7123 (norm. = 0.181636), norm. avg. (of 9) = 0.135666 fft 39: mflops = 10.5315 (norm. = 0.0259509), norm. avg. (of 9) = 0.0297419 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.93652 s, 4096 iters, t-(init.)=1.9043 s t(norm)=0.045402, mflops=110.127 (err=1.8e-15) 1. Arndt DIT: elapsed time t=1.03711 s, 2048 iters, t-(init.)=1.02051 s t(norm)=0.0486616, mflops=102.75 (err=1.8e-15) 2. Arndt Split-Radix: elapsed time t=1.46191 s, 2048 iters, t-(init.)=1.44629 s t(norm)=0.0689644, mflops=72.5011 (err=1.8e-15) 3. Arndt 4-step: elapsed time t=1.48145 s, 1024 iters, t-(init.)=1.47363 s t(norm)=0.140537, mflops=35.5779 (err=1.7e-15) 4. 
Bailey: elapsed time t=1.96973 s, 2048 iters, t-(init.)=1.9541 s t(norm)=0.0931788, mflops=53.6603 (err=1.8e-15) 5. Beauregard: elapsed time t=1.01758 s, 256 iters, t-(init.)=1.01563 s t(norm)=0.38743, mflops=12.9056 (err=1.9e-15) 6. Bergland: elapsed time t=1.22852 s, 4096 iters, t-(init.)=1.19727 s t(norm)=0.028545, mflops=175.162 (err=1.7e-15) 7. Brenner: elapsed time t=1.80176 s, 2048 iters, t-(init.)=1.78613 s t(norm)=0.0851694, mflops=58.7065 (err=1.9e-15) 8. Burrus: elapsed time t=1.57324 s, 1024 iters, t-(init.)=1.56543 s t(norm)=0.149291, mflops=33.4916 (err=1.8e-15) 9. CWP (min N) (N=1040): elapsed time t=1.88379 s, 8192 iters, t-(init.)=1.81934 s t(norm)=0.0216882, mflops=230.54 10. CWP (best N) (N=1040): elapsed time t=1.87988 s, 8192 iters, t-(init.)=1.81445 s t(norm)=0.02163, mflops=231.161 11. Edelblute: elapsed time t=1.14453 s, 1024 iters, t-(init.)=1.13574 s t(norm)=0.108313, mflops=46.1626 (err=1.8e-15) 12. FFTPACK: elapsed time t=1.75586 s, 2048 iters, t-(init.)=1.73926 s t(norm)=0.0829343, mflops=60.2887 (err=1.8e-15) 13. FFTPACK (f2c): elapsed time t=1.75781 s, 2048 iters, t-(init.)=1.74219 s t(norm)=0.083074, mflops=60.1873 (err=1.8e-15) FFTW_MEASURE plan: (cost = 1.368523e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.15918 s, 8192 iters, t-(init.)=1.09473 s t(norm)=0.0130502, mflops=383.137 (err=1.8e-15) FFTW_ESTIMATE plan: (cost = 6.758400e+03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.1582 s, 8192 iters, t-(init.)=1.09473 s t(norm)=0.0130502, mflops=383.137 (err=1.8e-15) 16. Frigo-old: elapsed time t=1.83203 s, 8192 iters, t-(init.)=1.76855 s t(norm)=0.0210828, mflops=237.16 (err=1.8e-15) 17. Green: elapsed time t=1.30371 s, 8192 iters, t-(init.)=1.24023 s t(norm)=0.0147847, mflops=338.186 (err=1.8e-15) 18. GSL: elapsed time t=1.54492 s, 4096 iters, t-(init.)=1.5127 s t(norm)=0.0360655, mflops=138.637 (err=1.8e-15) 19. 
GSL DIT: elapsed time t=1.35645 s, 2048 iters, t-(init.)=1.34082 s t(norm)=0.0639353, mflops=78.2041 (err=2.0e-15) 20. GSL DIF: elapsed time t=1.09375 s, 2048 iters, t-(init.)=1.07813 s t(norm)=0.051409, mflops=97.2592 (err=2.0e-15) 21. Krukar: elapsed time t=1.18359 s, 2048 iters, t-(init.)=1.16699 s t(norm)=0.0556465, mflops=89.8529 (err=1.9e-15) 22. Mayer (Buneman): elapsed time t=1.73242 s, 4096 iters, t-(init.)=1.7002 s t(norm)=0.0405358, mflops=123.348 (err=1.7e-15) 23. Mayer (simple): elapsed time t=1.3877 s, 4096 iters, t-(init.)=1.35547 s t(norm)=0.0323169, mflops=154.718 24. Mayer (lookup): elapsed time t=1.3916 s, 4096 iters, t-(init.)=1.35938 s t(norm)=0.03241, mflops=154.273 (err=1.7e-15) 25. Monro: elapsed time t=1.50781 s, 2048 iters, t-(init.)=1.49121 s t(norm)=0.0711065, mflops=70.3171 (err=1.7e-15) 26. NAPACK (f2c): elapsed time t=1.55762 s, 1024 iters, t-(init.)=1.5498 s t(norm)=0.147801, mflops=33.8293 (err=1.6e-14) 27. Nielsen: elapsed time t=1.71582 s, 2048 iters, t-(init.)=1.69922 s t(norm)=0.0810251, mflops=61.7093 (err=6.4e-15) 28. NR (C): elapsed time t=1.125 s, 2048 iters, t-(init.)=1.10938 s t(norm)=0.0528991, mflops=94.5195 (err=3.0e-15) 29. Ooura (C): elapsed time t=1.36816 s, 8192 iters, t-(init.)=1.30371 s t(norm)=0.0155414, mflops=321.72 (err=1.7e-15) 30. Ooura (F): elapsed time t=1.76074 s, 4096 iters, t-(init.)=1.72949 s t(norm)=0.0412343, mflops=121.258 (err=1.7e-15) 31. Ransom: elapsed time t=1.1416 s, 2048 iters, t-(init.)=1.12598 s t(norm)=0.0536907, mflops=93.1259 (err=2.1e-15) 32. SCIPORT: elapsed time t=1.68262 s, 1024 iters, t-(init.)=1.6748 s t(norm)=0.159722, mflops=31.3044 (err=1.9e-15) 33. Singleton: elapsed time t=1.23926 s, 2048 iters, t-(init.)=1.22363 s t(norm)=0.0583474, mflops=85.6937 (err=2.6e-15) 34. Singleton (f2c): elapsed time t=1.26953 s, 4096 iters, t-(init.)=1.23828 s t(norm)=0.0295229, mflops=169.36 (err=2.6e-15) 35. 
Sorensen: elapsed time t=1.23145 s, 2048 iters, t-(init.)=1.21582 s t(norm)=0.0579748, mflops=86.2443 (err=1.7e-15) 36. Sorensen DIT: elapsed time t=1.63281 s, 1024 iters, t-(init.)=1.625 s t(norm)=0.154972, mflops=32.2639 (err=1.7e-15) 37. Temperton: elapsed time t=1.35059 s, 2048 iters, t-(init.)=1.33496 s t(norm)=0.0636559, mflops=78.5473 (err=1.8e-15) 38. Temperton (f2c): elapsed time t=1.33789 s, 2048 iters, t-(init.)=1.32227 s t(norm)=0.0630505, mflops=79.3015 (err=1.8e-15) 39. Valkenburg: elapsed time t=1.26465 s, 256 iters, t-(init.)=1.2627 s t(norm)=0.48168, mflops=10.3803 (err=1.9e-15) Top mflops for N=1024 = 383.137 Normalized results and averages for N=1024: fft 0: mflops = 110.127 (norm. = 0.287436), norm. avg. (of 10) = 0.350881 fft 1: mflops = 102.75 (norm. = 0.268182), norm. avg. (of 10) = 0.295255 fft 2: mflops = 72.5011 (norm. = 0.18923), norm. avg. (of 10) = 0.206096 fft 3: mflops = 35.5779 (norm. = 0.0928595), norm. avg. (of 10) = 0.0587086 fft 4: mflops = 53.6603 (norm. = 0.140055), norm. avg. (of 10) = 0.10912 fft 5: mflops = 12.9056 (norm. = 0.0336839), norm. avg. (of 10) = 0.0382797 fft 6: mflops = 175.162 (norm. = 0.457178), norm. avg. (of 10) = 0.313451 fft 7: mflops = 58.7065 (norm. = 0.153226), norm. avg. (of 10) = 0.115125 fft 8: mflops = 33.4916 (norm. = 0.0874142), norm. avg. (of 10) = 0.105894 fft 9: mflops = 230.54 (norm. = 0.601718), norm. avg. (of 10) = 0.332771 fft 10: mflops = 231.161 (norm. = 0.603337), norm. avg. (of 10) = 0.328515 fft 11: mflops = 46.1626 (norm. = 0.120486), norm. avg. (of 9) = 0.0991889 fft 12: mflops = 60.2887 (norm. = 0.157355), norm. avg. (of 10) = 0.140363 fft 13: mflops = 60.1873 (norm. = 0.157091), norm. avg. (of 10) = 0.138859 fft 14: mflops = 383.137 (norm. = 1), norm. avg. (of 10) = 0.942005 fft 15: mflops = 383.137 (norm. = 1), norm. avg. (of 10) = 0.939442 fft 16: mflops = 237.16 (norm. = 0.618995), norm. avg. (of 10) = 0.753533 fft 17: mflops = 338.186 (norm. = 0.882677), norm. avg. 
(of 8) = 0.770353 fft 18: mflops = 138.637 (norm. = 0.361846), norm. avg. (of 10) = 0.285978 fft 19: mflops = 78.2041 (norm. = 0.204115), norm. avg. (of 10) = 0.158394 fft 20: mflops = 97.2592 (norm. = 0.25385), norm. avg. (of 10) = 0.179399 fft 21: mflops = 89.8529 (norm. = 0.234519), norm. avg. (of 10) = 0.499432 fft 22: mflops = 123.348 (norm. = 0.321941), norm. avg. (of 9) = 0.272967 fft 23: mflops = 154.718 (norm. = 0.403818), norm. avg. (of 9) = 0.336132 fft 24: mflops = 154.273 (norm. = 0.402658), norm. avg. (of 9) = 0.329713 fft 25: mflops = 70.3171 (norm. = 0.18353), norm. avg. (of 9) = 0.133127 fft 26: mflops = 33.8293 (norm. = 0.0882955), norm. avg. (of 10) = 0.0656374 fft 27: mflops = 61.7093 (norm. = 0.161063), norm. avg. (of 10) = 0.108868 fft 28: mflops = 94.5195 (norm. = 0.246699), norm. avg. (of 10) = 0.188842 fft 29: mflops = 321.72 (norm. = 0.8397), norm. avg. (of 10) = 0.708983 fft 30: mflops = 121.258 (norm. = 0.316488), norm. avg. (of 10) = 0.302769 fft 31: mflops = 93.1259 (norm. = 0.243062), norm. avg. (of 9) = 0.129299 fft 32: mflops = 31.3044 (norm. = 0.0817055), norm. avg. (of 9) = 0.0848682 fft 33: mflops = 85.6937 (norm. = 0.223663), norm. avg. (of 10) = 0.153669 fft 34: mflops = 169.36 (norm. = 0.442035), norm. avg. (of 10) = 0.275162 fft 35: mflops = 86.2443 (norm. = 0.2251), norm. avg. (of 10) = 0.221374 fft 36: mflops = 32.2639 (norm. = 0.0842097), norm. avg. (of 10) = 0.0986577 fft 37: mflops = 78.5473 (norm. = 0.205011), norm. avg. (of 10) = 0.138862 fft 38: mflops = 79.3015 (norm. = 0.206979), norm. avg. (of 10) = 0.142797 fft 39: mflops = 10.3803 (norm. = 0.027093), norm. avg. (of 10) = 0.029477 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.0127 s, 1024 iters, t-(init.)=0.99707 s t(norm)=0.0432218, mflops=115.682 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.05078 s, 1024 iters, t-(init.)=1.03516 s t(norm)=0.0448728, mflops=111.426 (err=1.4e-15) 2. 
Arndt Split-Radix: elapsed time t=1.55664 s, 1024 iters, t-(init.)=1.54102 s t(norm)=0.0668012, mflops=74.8489 (err=1.4e-15) 3. Arndt 4-step: elapsed time t=1.66309 s, 512 iters, t-(init.)=1.65527 s t(norm)=0.143508, mflops=34.8412 (err=1.3e-15) 4. Bailey: elapsed time t=1.4082 s, 512 iters, t-(init.)=1.40039 s t(norm)=0.121411, mflops=41.1826 (err=1.4e-15) 5. Beauregard: elapsed time t=1.13086 s, 128 iters, t-(init.)=1.12891 s t(norm)=0.391494, mflops=12.7716 (err=1.4e-15) 6. Bergland: elapsed time t=1.21289 s, 2048 iters, t-(init.)=1.18066 s t(norm)=0.0255902, mflops=195.387 (err=1.5e-15) 7. Brenner: elapsed time t=1.95898 s, 1024 iters, t-(init.)=1.94336 s t(norm)=0.0842424, mflops=59.3526 (err=1.4e-15) 8. Burrus: elapsed time t=1.66211 s, 512 iters, t-(init.)=1.6543 s t(norm)=0.143424, mflops=34.8617 (err=1.3e-15) 9. CWP (min N) (N=2145): elapsed time t=1.27051 s, 2048 iters, t-(init.)=1.22852 s t(norm)=0.0266274, mflops=187.777 10. CWP (best N) (N=2184): elapsed time t=1.12695 s, 2048 iters, t-(init.)=1.08203 s t(norm)=0.0234524, mflops=213.198 11. Edelblute: elapsed time t=1.19434 s, 512 iters, t-(init.)=1.18652 s t(norm)=0.102869, mflops=48.6056 (err=1.3e-15) 12. FFTPACK: elapsed time t=1.0166 s, 512 iters, t-(init.)=1.00879 s t(norm)=0.0874597, mflops=57.1692 (err=1.4e-15) 13. FFTPACK (f2c): elapsed time t=1.02344 s, 512 iters, t-(init.)=1.01563 s t(norm)=0.0880523, mflops=56.7844 (err=1.4e-15) FFTW_MEASURE plan: (cost = 3.519058e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_NOTW 8 14. FFTW: elapsed time t=1.16895 s, 2048 iters, t-(init.)=1.13672 s t(norm)=0.0246377, mflops=202.941 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.228800e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.14551 s, 2048 iters, t-(init.)=1.11426 s t(norm)=0.0241509, mflops=207.032 (err=1.4e-15) 16. Frigo-old: elapsed time t=1.41992 s, 2048 iters, t-(init.)=1.3877 s t(norm)=0.0300775, mflops=166.237 (err=1.4e-15) 17. 
Green: elapsed time t=1.43359 s, 4096 iters, t-(init.)=1.37012 s t(norm)=0.0148482, mflops=336.74 (err=1.5e-15) 18. GSL: elapsed time t=1.18262 s, 1024 iters, t-(init.)=1.16602 s t(norm)=0.0505454, mflops=98.9209 (err=1.4e-15) 19. GSL DIT: elapsed time t=1.48633 s, 1024 iters, t-(init.)=1.4707 s t(norm)=0.0637533, mflops=78.4274 (err=1.9e-15) 20. GSL DIF: elapsed time t=1.1875 s, 1024 iters, t-(init.)=1.1709 s t(norm)=0.0507571, mflops=98.5084 (err=2.3e-15) 21. Krukar: elapsed time t=1.44238 s, 1024 iters, t-(init.)=1.42676 s t(norm)=0.0618483, mflops=80.843 (err=1.4e-15) 22. Mayer (Buneman): elapsed time t=1.00586 s, 1024 iters, t-(init.)=0.989258 s t(norm)=0.0428832, mflops=116.596 (err=1.3e-15) 23. Mayer (simple): elapsed time t=1.66797 s, 2048 iters, t-(init.)=1.63672 s t(norm)=0.0354749, mflops=140.945 24. Mayer (lookup): elapsed time t=1.66504 s, 2048 iters, t-(init.)=1.63184 s t(norm)=0.0353691, mflops=141.366 (err=1.3e-15) 25. Monro: elapsed time t=1.73535 s, 1024 iters, t-(init.)=1.71875 s t(norm)=0.0745058, mflops=67.1089 (err=1.4e-15) 26. NAPACK (f2c): elapsed time t=1.9541 s, 512 iters, t-(init.)=1.94629 s t(norm)=0.168739, mflops=29.6316 (err=1.5e-14) 27. Nielsen: elapsed time t=1.85254 s, 1024 iters, t-(init.)=1.83594 s t(norm)=0.0795857, mflops=62.8253 (err=1.1e-14) 28. NR (C): elapsed time t=1.2334 s, 1024 iters, t-(init.)=1.2168 s t(norm)=0.0527467, mflops=94.7926 (err=3.2e-15) 29. Ooura (C): elapsed time t=1.82031 s, 4096 iters, t-(init.)=1.75586 s t(norm)=0.0190286, mflops=262.762 (err=1.4e-15) 30. Ooura (F): elapsed time t=1.0498 s, 1024 iters, t-(init.)=1.0332 s t(norm)=0.0447881, mflops=111.637 (err=1.4e-15) 31. Ransom: elapsed time t=1.42676 s, 1024 iters, t-(init.)=1.41016 s t(norm)=0.0611286, mflops=81.7947 (err=2.0e-15) 32. SCIPORT: elapsed time t=1.00781 s, 256 iters, t-(init.)=1.00391 s t(norm)=0.174073, mflops=28.7236 (err=1.5e-15) 33. 
Singleton: elapsed time t=1.52148 s, 1024 iters, t-(init.)=1.50586 s t(norm)=0.0652772, mflops=76.5964 (err=1.9e-15) 34. Singleton (f2c): elapsed time t=1.54395 s, 2048 iters, t-(init.)=1.5127 s t(norm)=0.0327868, mflops=152.5 (err=1.9e-15) 35. Sorensen: elapsed time t=1.39355 s, 1024 iters, t-(init.)=1.37793 s t(norm)=0.0597316, mflops=83.7077 (err=1.3e-15) 36. Sorensen DIT: elapsed time t=1.73828 s, 512 iters, t-(init.)=1.73047 s t(norm)=0.150028, mflops=33.3272 (err=1.4e-15) 37. Temperton: elapsed time t=1.73145 s, 1024 iters, t-(init.)=1.71582 s t(norm)=0.0743788, mflops=67.2234 (err=1.4e-15) 38. Temperton (f2c): elapsed time t=1.75781 s, 1024 iters, t-(init.)=1.74219 s t(norm)=0.0755218, mflops=66.2061 (err=1.4e-15) 39. Valkenburg: elapsed time t=1.46289 s, 128 iters, t-(init.)=1.46094 s t(norm)=0.506639, mflops=9.86895 (err=1.7e-15) Top mflops for N=2048 = 336.74 Normalized results and averages for N=2048: fft 0: mflops = 115.682 (norm. = 0.343536), norm. avg. (of 11) = 0.350213 fft 1: mflops = 111.426 (norm. = 0.330896), norm. avg. (of 11) = 0.298495 fft 2: mflops = 74.8489 (norm. = 0.222275), norm. avg. (of 11) = 0.207567 fft 3: mflops = 34.8412 (norm. = 0.103466), norm. avg. (of 11) = 0.0627774 fft 4: mflops = 41.1826 (norm. = 0.122298), norm. avg. (of 11) = 0.110318 fft 5: mflops = 12.7716 (norm. = 0.0379271), norm. avg. (of 11) = 0.0382476 fft 6: mflops = 195.387 (norm. = 0.580232), norm. avg. (of 11) = 0.337704 fft 7: mflops = 59.3526 (norm. = 0.176256), norm. avg. (of 11) = 0.120682 fft 8: mflops = 34.8617 (norm. = 0.103527), norm. avg. (of 11) = 0.105679 fft 9: mflops = 187.777 (norm. = 0.557631), norm. avg. (of 11) = 0.353213 fft 10: mflops = 213.198 (norm. = 0.633123), norm. avg. (of 11) = 0.356206 fft 11: mflops = 48.6056 (norm. = 0.144342), norm. avg. (of 10) = 0.103704 fft 12: mflops = 57.1692 (norm. = 0.169773), norm. avg. (of 11) = 0.143037 fft 13: mflops = 56.7844 (norm. = 0.16863), norm. avg. (of 11) = 0.141565 fft 14: mflops = 202.941 (norm. 
= 0.602663), norm. avg. (of 11) = 0.911156 fft 15: mflops = 207.032 (norm. = 0.614812), norm. avg. (of 11) = 0.90993 fft 16: mflops = 166.237 (norm. = 0.493666), norm. avg. (of 11) = 0.729909 fft 17: mflops = 336.74 (norm. = 1), norm. avg. (of 9) = 0.795869 fft 18: mflops = 98.9209 (norm. = 0.29376), norm. avg. (of 11) = 0.286685 fft 19: mflops = 78.4274 (norm. = 0.232902), norm. avg. (of 11) = 0.165168 fft 20: mflops = 98.5084 (norm. = 0.292535), norm. avg. (of 11) = 0.189684 fft 21: mflops = 80.843 (norm. = 0.240075), norm. avg. (of 11) = 0.475854 fft 22: mflops = 116.596 (norm. = 0.346249), norm. avg. (of 10) = 0.280296 fft 23: mflops = 140.945 (norm. = 0.418556), norm. avg. (of 10) = 0.344375 fft 24: mflops = 141.366 (norm. = 0.419808), norm. avg. (of 10) = 0.338723 fft 25: mflops = 67.1089 (norm. = 0.19929), norm. avg. (of 10) = 0.139743 fft 26: mflops = 29.6316 (norm. = 0.0879955), norm. avg. (of 11) = 0.06767 fft 27: mflops = 62.8253 (norm. = 0.186569), norm. avg. (of 11) = 0.115932 fft 28: mflops = 94.7926 (norm. = 0.281501), norm. avg. (of 11) = 0.197266 fft 29: mflops = 262.762 (norm. = 0.780311), norm. avg. (of 11) = 0.715467 fft 30: mflops = 111.637 (norm. = 0.331522), norm. avg. (of 11) = 0.305383 fft 31: mflops = 81.7947 (norm. = 0.242902), norm. avg. (of 10) = 0.140659 fft 32: mflops = 28.7236 (norm. = 0.0852991), norm. avg. (of 10) = 0.0849113 fft 33: mflops = 76.5964 (norm. = 0.227464), norm. avg. (of 11) = 0.160377 fft 34: mflops = 152.5 (norm. = 0.452873), norm. avg. (of 11) = 0.291318 fft 35: mflops = 83.7077 (norm. = 0.248583), norm. avg. (of 11) = 0.223848 fft 36: mflops = 33.3272 (norm. = 0.0989701), norm. avg. (of 11) = 0.0986861 fft 37: mflops = 67.2234 (norm. = 0.19963), norm. avg. (of 11) = 0.144386 fft 38: mflops = 66.2061 (norm. = 0.196609), norm. avg. (of 11) = 0.147689 fft 39: mflops = 9.86895 (norm. = 0.0293073), norm. avg. (of 11) = 0.0294616 Benchmarking for array size = 4096 (power of 2): 0. 
Arndt DIF: elapsed time t=1.47852 s, 512 iters, t-(init.)=1.45215 s t(norm)=0.0577032, mflops=86.6503 (err=3.8e-15) 1. Arndt DIT: elapsed time t=1.58008 s, 512 iters, t-(init.)=1.55273 s t(norm)=0.0617001, mflops=81.0371 (err=3.8e-15) 2. Arndt Split-Radix: elapsed time t=1.03027 s, 256 iters, t-(init.)=1.0166 s t(norm)=0.0807922, mflops=61.8871 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.68652 s, 256 iters, t-(init.)=1.67383 s t(norm)=0.133024, mflops=37.5872 (err=3.7e-15) 4. Bailey: elapsed time t=1.92676 s, 256 iters, t-(init.)=1.91309 s t(norm)=0.152038, mflops=32.8864 (err=3.8e-15) 5. Beauregard: elapsed time t=1.25098 s, 64 iters, t-(init.)=1.24707 s t(norm)=0.396433, mflops=12.6125 (err=3.9e-15) 6. Bergland: elapsed time t=1.63965 s, 1024 iters, t-(init.)=1.58594 s t(norm)=0.0315097, mflops=158.681 (err=4.0e-15) 7. Brenner: elapsed time t=1.11621 s, 256 iters, t-(init.)=1.10254 s t(norm)=0.0876219, mflops=57.0633 (err=4.0e-15) 8. Burrus: elapsed time t=1.95313 s, 256 iters, t-(init.)=1.93945 s t(norm)=0.154134, mflops=32.4393 (err=3.8e-15) 9. CWP (min N) (N=4290): elapsed time t=1.47168 s, 1024 iters, t-(init.)=1.41602 s t(norm)=0.0281337, mflops=177.723 10. CWP (best N) (N=4368): elapsed time t=1.24805 s, 1024 iters, t-(init.)=1.19141 s t(norm)=0.0236711, mflops=211.228 11. Edelblute: elapsed time t=1.45313 s, 256 iters, t-(init.)=1.44043 s t(norm)=0.114475, mflops=43.6776 (err=3.8e-15) 12. FFTPACK: elapsed time t=1.21484 s, 256 iters, t-(init.)=1.20117 s t(norm)=0.0954606, mflops=52.3776 (err=4.0e-15) 13. FFTPACK (f2c): elapsed time t=1.21191 s, 256 iters, t-(init.)=1.19824 s t(norm)=0.0952277, mflops=52.5057 (err=4.0e-15) FFTW_MEASURE plan: (cost = 1.171112e-03) FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_NOTW 8 14. FFTW: elapsed time t=1.53223 s, 1024 iters, t-(init.)=1.47754 s t(norm)=0.0293561, mflops=170.323 (err=3.9e-15) FFTW_ESTIMATE plan: (cost = 1.802240e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 8 15. 
FFTW_ESTIMATE: elapsed time t=1.98047 s, 1024 iters, t-(init.)=1.92676 s t(norm)=0.0382812, mflops=130.612 (err=3.9e-15) 16. Frigo-old: elapsed time t=1.07715 s, 512 iters, t-(init.)=1.0498 s t(norm)=0.0417155, mflops=119.86 (err=3.9e-15) 17. Green: elapsed time t=1.1582 s, 1024 iters, t-(init.)=1.10449 s t(norm)=0.0219443, mflops=227.85 (err=4.0e-15) 18. GSL: elapsed time t=1.37207 s, 512 iters, t-(init.)=1.3457 s t(norm)=0.0534734, mflops=93.5044 (err=4.0e-15) 19. GSL DIT: elapsed time t=1.9668 s, 512 iters, t-(init.)=1.94043 s t(norm)=0.0771057, mflops=64.846 (err=4.3e-15) 20. GSL DIF: elapsed time t=1.62305 s, 512 iters, t-(init.)=1.59668 s t(norm)=0.0634464, mflops=78.8067 (err=4.5e-15) 21. Krukar: elapsed time t=1.54688 s, 512 iters, t-(init.)=1.52051 s t(norm)=0.0604196, mflops=82.7547 (err=4.0e-15) 22. Mayer (Buneman): elapsed time t=1.09277 s, 512 iters, t-(init.)=1.06641 s t(norm)=0.0423752, mflops=117.994 (err=3.8e-15) 23. Mayer (simple): elapsed time t=1.8457 s, 1024 iters, t-(init.)=1.79199 s t(norm)=0.0356037, mflops=140.435 24. Mayer (lookup): elapsed time t=1.9873 s, 1024 iters, t-(init.)=1.93359 s t(norm)=0.0384171, mflops=130.151 (err=3.8e-15) 25. Monro: elapsed time t=1.09082 s, 256 iters, t-(init.)=1.07715 s t(norm)=0.0856041, mflops=58.4084 (err=3.9e-15) 26. NAPACK (f2c): elapsed time t=1.08496 s, 128 iters, t-(init.)=1.0791 s t(norm)=0.171519, mflops=29.1514 (err=5.8e-14) 27. Nielsen: elapsed time t=1.28516 s, 256 iters, t-(init.)=1.27148 s t(norm)=0.101048, mflops=49.4812 (err=2.6e-14) 28. NR (C): elapsed time t=1.68164 s, 512 iters, t-(init.)=1.6543 s t(norm)=0.0657359, mflops=76.062 (err=5.5e-15) 29. Ooura (C): elapsed time t=1.0332 s, 1024 iters, t-(init.)=0.979492 s t(norm)=0.0194608, mflops=256.927 (err=4.0e-15) 30. Ooura (F): elapsed time t=1.17773 s, 512 iters, t-(init.)=1.15137 s t(norm)=0.0457512, mflops=109.287 (err=4.0e-15) 31. 
Ransom: elapsed time t=1.29004 s, 512 iters, t-(init.)=1.26367 s t(norm)=0.0502138, mflops=99.5742 (err=4.5e-15) 32. SCIPORT: elapsed time t=1.2627 s, 128 iters, t-(init.)=1.25586 s t(norm)=0.199613, mflops=25.0484 (err=4.0e-15) 33. Singleton: elapsed time t=1.68848 s, 512 iters, t-(init.)=1.66113 s t(norm)=0.0660075, mflops=75.749 (err=6.5e-15) 34. Singleton (f2c): elapsed time t=1.86523 s, 1024 iters, t-(init.)=1.81055 s t(norm)=0.0359723, mflops=138.996 (err=6.5e-15) 35. Sorensen: elapsed time t=1.02441 s, 256 iters, t-(init.)=1.01074 s t(norm)=0.0803266, mflops=62.2459 (err=3.8e-15) 36. Sorensen DIT: elapsed time t=1.04688 s, 128 iters, t-(init.)=1.04004 s t(norm)=0.16531, mflops=30.2462 (err=3.8e-15) 37. Temperton: elapsed time t=1.83496 s, 512 iters, t-(init.)=1.80859 s t(norm)=0.0718671, mflops=69.5729 (err=4.0e-15) 38. Temperton (f2c): elapsed time t=1.8418 s, 512 iters, t-(init.)=1.81543 s t(norm)=0.0721387, mflops=69.3109 (err=4.0e-15) 39. Valkenburg: elapsed time t=1.72656 s, 64 iters, t-(init.)=1.72266 s t(norm)=0.547618, mflops=9.13046 (err=4.2e-15) Top mflops for N=4096 = 256.927 Normalized results and averages for N=4096: fft 0: mflops = 86.6503 (norm. = 0.337256), norm. avg. (of 12) = 0.349133 fft 1: mflops = 81.0371 (norm. = 0.315409), norm. avg. (of 12) = 0.299904 fft 2: mflops = 61.8871 (norm. = 0.240874), norm. avg. (of 12) = 0.210342 fft 3: mflops = 37.5872 (norm. = 0.146295), norm. avg. (of 12) = 0.0697372 fft 4: mflops = 32.8864 (norm. = 0.127999), norm. avg. (of 12) = 0.111791 fft 5: mflops = 12.6125 (norm. = 0.0490897), norm. avg. (of 12) = 0.0391511 fft 6: mflops = 158.681 (norm. = 0.617611), norm. avg. (of 12) = 0.361029 fft 7: mflops = 57.0633 (norm. = 0.222099), norm. avg. (of 12) = 0.129134 fft 8: mflops = 32.4393 (norm. = 0.126259), norm. avg. (of 12) = 0.107394 fft 9: mflops = 177.723 (norm. = 0.691724), norm. avg. (of 12) = 0.381422 fft 10: mflops = 211.228 (norm. = 0.822131), norm. avg. 
(of 12) = 0.395033 fft 11: mflops = 43.6776 (norm. = 0.17), norm. avg. (of 11) = 0.109731 fft 12: mflops = 52.3776 (norm. = 0.203862), norm. avg. (of 12) = 0.148105 fft 13: mflops = 52.5057 (norm. = 0.20436), norm. avg. (of 12) = 0.146798 fft 14: mflops = 170.323 (norm. = 0.662921), norm. avg. (of 12) = 0.89047 fft 15: mflops = 130.612 (norm. = 0.508363), norm. avg. (of 12) = 0.876466 fft 16: mflops = 119.86 (norm. = 0.466512), norm. avg. (of 12) = 0.707959 fft 17: mflops = 227.85 (norm. = 0.886826), norm. avg. (of 10) = 0.804965 fft 18: mflops = 93.5044 (norm. = 0.363933), norm. avg. (of 12) = 0.293123 fft 19: mflops = 64.846 (norm. = 0.252391), norm. avg. (of 12) = 0.172436 fft 20: mflops = 78.8067 (norm. = 0.306728), norm. avg. (of 12) = 0.199438 fft 21: mflops = 82.7547 (norm. = 0.322094), norm. avg. (of 12) = 0.463041 fft 22: mflops = 117.994 (norm. = 0.459249), norm. avg. (of 11) = 0.296564 fft 23: mflops = 140.435 (norm. = 0.546594), norm. avg. (of 11) = 0.362758 fft 24: mflops = 130.151 (norm. = 0.506566), norm. avg. (of 11) = 0.353981 fft 25: mflops = 58.4084 (norm. = 0.227335), norm. avg. (of 11) = 0.147706 fft 26: mflops = 29.1514 (norm. = 0.113462), norm. avg. (of 12) = 0.0714859 fft 27: mflops = 49.4812 (norm. = 0.192588), norm. avg. (of 12) = 0.12232 fft 28: mflops = 76.062 (norm. = 0.296045), norm. avg. (of 12) = 0.205497 fft 29: mflops = 256.927 (norm. = 1), norm. avg. (of 12) = 0.739178 fft 30: mflops = 109.287 (norm. = 0.42536), norm. avg. (of 12) = 0.315381 fft 31: mflops = 99.5742 (norm. = 0.387558), norm. avg. (of 11) = 0.163105 fft 32: mflops = 25.0484 (norm. = 0.0974922), norm. avg. (of 11) = 0.0860551 fft 33: mflops = 75.749 (norm. = 0.294827), norm. avg. (of 12) = 0.171582 fft 34: mflops = 138.996 (norm. = 0.540992), norm. avg. (of 12) = 0.312124 fft 35: mflops = 62.2459 (norm. = 0.242271), norm. avg. (of 12) = 0.225383 fft 36: mflops = 30.2462 (norm. = 0.117723), norm. avg. (of 12) = 0.100273 fft 37: mflops = 69.5729 (norm. 
= 0.270788), norm. avg. (of 12) = 0.15492 fft 38: mflops = 69.3109 (norm. = 0.269769), norm. avg. (of 12) = 0.157862 fft 39: mflops = 9.13046 (norm. = 0.0355371), norm. avg. (of 12) = 0.0299679 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.11523 s, 128 iters, t-(init.)=1.0918 s t(norm)=0.0800937, mflops=62.4269 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.13672 s, 128 iters, t-(init.)=1.11328 s t(norm)=0.0816698, mflops=61.2221 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.48438 s, 128 iters, t-(init.)=1.45996 s t(norm)=0.107102, mflops=46.6844 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.96582 s, 128 iters, t-(init.)=1.94238 s t(norm)=0.142492, mflops=35.0896 (err=3.8e-15) 4. Bailey: elapsed time t=1.18066 s, 64 iters, t-(init.)=1.16797 s t(norm)=0.171363, mflops=29.1778 (err=3.7e-15) 5. Beauregard: elapsed time t=1.39355 s, 32 iters, t-(init.)=1.3877 s t(norm)=0.407203, mflops=12.2789 (err=3.8e-15) 6. Bergland: elapsed time t=1.14258 s, 256 iters, t-(init.)=1.0957 s t(norm)=0.0401902, mflops=124.409 (err=3.7e-15) 7. Brenner: elapsed time t=1.33789 s, 128 iters, t-(init.)=1.31445 s t(norm)=0.0964277, mflops=51.8523 (err=3.8e-15) 8. Burrus: elapsed time t=1.21582 s, 64 iters, t-(init.)=1.2041 s t(norm)=0.176665, mflops=28.3022 (err=3.7e-15) 9. CWP (min N) (N=8580): elapsed time t=1.70313 s, 512 iters, t-(init.)=1.60254 s t(norm)=0.0293904, mflops=170.124 10. CWP (best N) (N=9240): elapsed time t=1.66016 s, 512 iters, t-(init.)=1.55371 s t(norm)=0.0284949, mflops=175.47 11. Edelblute: elapsed time t=1.91016 s, 128 iters, t-(init.)=1.88672 s t(norm)=0.138409, mflops=36.1249 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.65527 s, 128 iters, t-(init.)=1.63184 s t(norm)=0.119711, mflops=41.7673 (err=3.8e-15) 13. FFTPACK (f2c): elapsed time t=1.64063 s, 128 iters, t-(init.)=1.61719 s t(norm)=0.118636, mflops=42.1457 (err=3.8e-15) FFTW_MEASURE plan: (cost = 3.036499e-03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. 
FFTW: elapsed time t=1.60547 s, 512 iters, t-(init.)=1.51172 s t(norm)=0.0277248, mflops=180.344 (err=3.8e-15) FFTW_ESTIMATE plan: (cost = 6.225920e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.60547 s, 512 iters, t-(init.)=1.5127 s t(norm)=0.0277427, mflops=180.228 (err=3.8e-15) 16. Frigo-old: elapsed time t=1.45996 s, 256 iters, t-(init.)=1.41211 s t(norm)=0.0517959, mflops=96.5328 (err=3.8e-15) 17. Green: elapsed time t=1.71582 s, 512 iters, t-(init.)=1.62305 s t(norm)=0.0297665, mflops=167.974 (err=3.8e-15) 18. GSL: elapsed time t=1.0127 s, 128 iters, t-(init.)=0.989258 s t(norm)=0.0725715, mflops=68.8976 (err=3.8e-15) 19. GSL DIT: elapsed time t=1.38184 s, 128 iters, t-(init.)=1.3584 s t(norm)=0.0996515, mflops=50.1749 (err=4.5e-15) 20. GSL DIF: elapsed time t=1.14355 s, 128 iters, t-(init.)=1.12012 s t(norm)=0.0821713, mflops=60.8485 (err=4.7e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.47168 s, 256 iters, t-(init.)=1.4248 s t(norm)=0.0522615, mflops=95.6727 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.29785 s, 256 iters, t-(init.)=1.25 s t(norm)=0.0458497, mflops=109.052 24. Mayer (lookup): elapsed time t=1.33496 s, 256 iters, t-(init.)=1.28809 s t(norm)=0.0472467, mflops=105.827 (err=3.7e-15) 25. Monro: elapsed time t=1.48828 s, 128 iters, t-(init.)=1.46484 s t(norm)=0.10746, mflops=46.5288 (err=3.9e-15) 26. NAPACK (f2c): elapsed time t=1.4248 s, 64 iters, t-(init.)=1.41309 s t(norm)=0.207327, mflops=24.1165 (err=4.6e-14) 27. Nielsen: elapsed time t=1.9209 s, 128 iters, t-(init.)=1.89746 s t(norm)=0.139197, mflops=35.9203 (err=1.1e-14) 28. NR (C): elapsed time t=1.22266 s, 128 iters, t-(init.)=1.19922 s t(norm)=0.0879742, mflops=56.8349 (err=6.1e-15) 29. Ooura (C): elapsed time t=1.61523 s, 512 iters, t-(init.)=1.52148 s t(norm)=0.0279039, mflops=179.187 (err=3.6e-15) 30. 
Ooura (F): elapsed time t=1.51074 s, 256 iters, t-(init.)=1.46289 s t(norm)=0.0536585, mflops=93.1819 (err=3.6e-15) 31. Ransom: elapsed time t=1.74512 s, 256 iters, t-(init.)=1.69727 s t(norm)=0.0622553, mflops=80.3144 (err=4.8e-15) 32. SCIPORT: elapsed time t=1.66602 s, 64 iters, t-(init.)=1.6543 s t(norm)=0.242717, mflops=20.6001 (err=3.8e-15) 33. Singleton: elapsed time t=1.09863 s, 128 iters, t-(init.)=1.0752 s t(norm)=0.0788759, mflops=63.3908 (err=5.8e-15) 34. Singleton (f2c): elapsed time t=1.41797 s, 256 iters, t-(init.)=1.37207 s t(norm)=0.0503272, mflops=99.3498 (err=5.8e-15) 35. Sorensen: elapsed time t=1.30859 s, 128 iters, t-(init.)=1.28516 s t(norm)=0.0942785, mflops=53.0344 (err=3.7e-15) 36. Sorensen DIT: elapsed time t=1.28223 s, 64 iters, t-(init.)=1.27051 s t(norm)=0.186408, mflops=26.8229 (err=3.7e-15) 37. Temperton: elapsed time t=1.25098 s, 128 iters, t-(init.)=1.22754 s t(norm)=0.0900517, mflops=55.5236 (err=3.8e-15) 38. Temperton (f2c): elapsed time t=1.25684 s, 128 iters, t-(init.)=1.2334 s t(norm)=0.0904816, mflops=55.2599 (err=3.8e-15) 39. Valkenburg: elapsed time t=1.86328 s, 32 iters, t-(init.)=1.85742 s t(norm)=0.545039, mflops=9.17366 (err=3.8e-15) Top mflops for N=8192 = 180.344 Normalized results and averages for N=8192: fft 0: mflops = 62.4269 (norm. = 0.346154), norm. avg. (of 13) = 0.348904 fft 1: mflops = 61.2221 (norm. = 0.339474), norm. avg. (of 13) = 0.302948 fft 2: mflops = 46.6844 (norm. = 0.258863), norm. avg. (of 13) = 0.214075 fft 3: mflops = 35.0896 (norm. = 0.19457), norm. avg. (of 13) = 0.0793398 fft 4: mflops = 29.1778 (norm. = 0.161789), norm. avg. (of 13) = 0.115637 fft 5: mflops = 12.2789 (norm. = 0.0680859), norm. avg. (of 13) = 0.0413769 fft 6: mflops = 124.409 (norm. = 0.68984), norm. avg. (of 13) = 0.386322 fft 7: mflops = 51.8523 (norm. = 0.287519), norm. avg. (of 13) = 0.141317 fft 8: mflops = 28.3022 (norm. = 0.156934), norm. avg. (of 13) = 0.111204 fft 9: mflops = 170.124 (norm. = 0.943327), norm. avg. 
(of 13) = 0.424646 fft 10: mflops = 175.47 (norm. = 0.972973), norm. avg. (of 13) = 0.43949 fft 11: mflops = 36.1249 (norm. = 0.200311), norm. avg. (of 12) = 0.117279 fft 12: mflops = 41.7673 (norm. = 0.231598), norm. avg. (of 13) = 0.154528 fft 13: mflops = 42.1457 (norm. = 0.233696), norm. avg. (of 13) = 0.153483 fft 14: mflops = 180.344 (norm. = 1), norm. avg. (of 13) = 0.898895 fft 15: mflops = 180.228 (norm. = 0.999354), norm. avg. (of 13) = 0.885919 fft 16: mflops = 96.5328 (norm. = 0.53527), norm. avg. (of 13) = 0.694675 fft 17: mflops = 167.974 (norm. = 0.931408), norm. avg. (of 11) = 0.81646 fft 18: mflops = 68.8976 (norm. = 0.382034), norm. avg. (of 13) = 0.299962 fft 19: mflops = 50.1749 (norm. = 0.278217), norm. avg. (of 13) = 0.180573 fft 20: mflops = 60.8485 (norm. = 0.337402), norm. avg. (of 13) = 0.210051 fft 21: mflops = -1 (norm. = -0.00554495), norm. avg. (of 12) = 0.463041 fft 22: mflops = 95.6727 (norm. = 0.5305), norm. avg. (of 12) = 0.316059 fft 23: mflops = 109.052 (norm. = 0.604688), norm. avg. (of 12) = 0.382919 fft 24: mflops = 105.827 (norm. = 0.586808), norm. avg. (of 12) = 0.373383 fft 25: mflops = 46.5288 (norm. = 0.258), norm. avg. (of 12) = 0.156897 fft 26: mflops = 24.1165 (norm. = 0.133725), norm. avg. (of 13) = 0.0762736 fft 27: mflops = 35.9203 (norm. = 0.199177), norm. avg. (of 13) = 0.128232 fft 28: mflops = 56.8349 (norm. = 0.315147), norm. avg. (of 13) = 0.213932 fft 29: mflops = 179.187 (norm. = 0.993582), norm. avg. (of 13) = 0.758748 fft 30: mflops = 93.1819 (norm. = 0.516689), norm. avg. (of 13) = 0.330866 fft 31: mflops = 80.3144 (norm. = 0.445339), norm. avg. (of 12) = 0.186624 fft 32: mflops = 20.6001 (norm. = 0.114227), norm. avg. (of 12) = 0.0884027 fft 33: mflops = 63.3908 (norm. = 0.351499), norm. avg. (of 13) = 0.185421 fft 34: mflops = 99.3498 (norm. = 0.55089), norm. avg. (of 13) = 0.330491 fft 35: mflops = 53.0344 (norm. = 0.294073), norm. avg. (of 13) = 0.230667 fft 36: mflops = 26.8229 (norm. 
= 0.148732), norm. avg. (of 13) = 0.104 fft 37: mflops = 55.5236 (norm. = 0.307876), norm. avg. (of 13) = 0.166686 fft 38: mflops = 55.2599 (norm. = 0.306413), norm. avg. (of 13) = 0.169289 fft 39: mflops = 9.17366 (norm. = 0.0508675), norm. avg. (of 13) = 0.0315755 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.50293 s, 64 iters, t-(init.)=1.47266 s t(norm)=0.100317, mflops=49.8421 (err=6.8e-15) 1. Arndt DIT: elapsed time t=1.55273 s, 64 iters, t-(init.)=1.52246 s t(norm)=0.103709, mflops=48.2116 (err=6.8e-15) 2. Arndt Split-Radix: elapsed time t=1.92285 s, 64 iters, t-(init.)=1.89258 s t(norm)=0.128922, mflops=38.7832 (err=6.8e-15) 3. Arndt 4-step: elapsed time t=1.95898 s, 64 iters, t-(init.)=1.92969 s t(norm)=0.13145, mflops=38.0374 (err=6.8e-15) 4. Bailey: elapsed time t=1.65918 s, 32 iters, t-(init.)=1.64453 s t(norm)=0.22405, mflops=22.3165 (err=6.9e-15) 5. Beauregard: elapsed time t=1.52832 s, 16 iters, t-(init.)=1.52051 s t(norm)=0.414305, mflops=12.0684 (err=6.9e-15) 6. Bergland: elapsed time t=1.37891 s, 128 iters, t-(init.)=1.31934 s t(norm)=0.0449363, mflops=111.269 (err=6.9e-15) 7. Brenner: elapsed time t=1.53906 s, 64 iters, t-(init.)=1.50977 s t(norm)=0.102845, mflops=48.617 (err=7.0e-15) 8. Burrus: elapsed time t=1.43945 s, 32 iters, t-(init.)=1.42383 s t(norm)=0.193981, mflops=25.7757 (err=6.8e-15) 9. CWP (min N) (N=17160): elapsed time t=1.00488 s, 128 iters, t-(init.)=0.929688 s t(norm)=0.031665, mflops=157.903 10. CWP (best N) (N=17160): elapsed time t=1.00684 s, 128 iters, t-(init.)=0.932617 s t(norm)=0.0317648, mflops=157.407 11. Edelblute: elapsed time t=1.17285 s, 32 iters, t-(init.)=1.1582 s t(norm)=0.157793, mflops=31.6872 (err=6.8e-15) 12. FFTPACK: elapsed time t=1.93066 s, 64 iters, t-(init.)=1.90039 s t(norm)=0.129454, mflops=38.6238 (err=6.9e-15) 13. 
FFTPACK (f2c): elapsed time t=1.91602 s, 64 iters, t-(init.)=1.88574 s t(norm)=0.128456, mflops=38.9238 (err=6.9e-15) FFTW_MEASURE plan: (cost = 8.148193e-03) FFTW_TWIDDLE 4 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_NOTW 8 14. FFTW: elapsed time t=1.93555 s, 256 iters, t-(init.)=1.81445 s t(norm)=0.0309, mflops=161.813 (err=6.9e-15) FFTW_ESTIMATE plan: (cost = 1.146880e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.76367 s, 256 iters, t-(init.)=1.64258 s t(norm)=0.0279729, mflops=178.744 (err=6.9e-15) 16. Frigo-old: elapsed time t=1.70605 s, 128 iters, t-(init.)=1.64551 s t(norm)=0.0560457, mflops=89.213 (err=6.9e-15) 17. Green: elapsed time t=1.19141 s, 128 iters, t-(init.)=1.13184 s t(norm)=0.0385501, mflops=129.701 (err=6.9e-15) 18. GSL: elapsed time t=1.15039 s, 64 iters, t-(init.)=1.12012 s t(norm)=0.0763019, mflops=65.5291 (err=6.9e-15) 19. GSL DIT: elapsed time t=1.6543 s, 64 iters, t-(init.)=1.62402 s t(norm)=0.110628, mflops=45.1966 (err=7.3e-15) 20. GSL DIF: elapsed time t=1.41211 s, 64 iters, t-(init.)=1.38281 s t(norm)=0.0941966, mflops=53.0805 (err=7.4e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.03516 s, 64 iters, t-(init.)=1.00488 s t(norm)=0.0684522, mflops=73.0437 (err=6.8e-15) 23. Mayer (simple): elapsed time t=1.90234 s, 128 iters, t-(init.)=1.8418 s t(norm)=0.0627312, mflops=79.7051 24. Mayer (lookup): elapsed time t=1.03613 s, 64 iters, t-(init.)=1.00586 s t(norm)=0.0685187, mflops=72.9727 (err=6.8e-15) 25. Monro: elapsed time t=1.80566 s, 64 iters, t-(init.)=1.77539 s t(norm)=0.120939, mflops=41.3432 (err=7.0e-15) 26. NAPACK (f2c): elapsed time t=1.53418 s, 32 iters, t-(init.)=1.51953 s t(norm)=0.20702, mflops=24.1523 (err=2.4e-13) 27. Nielsen: elapsed time t=1.07031 s, 32 iters, t-(init.)=1.05469 s t(norm)=0.14369, mflops=34.7972 (err=1.4e-13) 28. 
NR (C): elapsed time t=1.45996 s, 64 iters, t-(init.)=1.42969 s t(norm)=0.0973897, mflops=51.3401 (err=8.6e-15) 29. Ooura (C): elapsed time t=1.87402 s, 256 iters, t-(init.)=1.75293 s t(norm)=0.0298522, mflops=167.492 (err=6.9e-15) 30. Ooura (F): elapsed time t=1.69434 s, 128 iters, t-(init.)=1.63379 s t(norm)=0.0556465, mflops=89.8529 (err=6.9e-15) 31. Ransom: elapsed time t=1.55957 s, 128 iters, t-(init.)=1.49902 s t(norm)=0.0510564, mflops=97.9309 (err=7.7e-15) 32. SCIPORT: elapsed time t=1.91602 s, 32 iters, t-(init.)=1.90137 s t(norm)=0.259041, mflops=19.302 (err=6.9e-15) 33. Singleton: elapsed time t=1.29297 s, 64 iters, t-(init.)=1.2627 s t(norm)=0.0860143, mflops=58.1299 (err=1.1e-14) 34. Singleton (f2c): elapsed time t=1.7168 s, 128 iters, t-(init.)=1.65625 s t(norm)=0.0564115, mflops=88.6343 (err=1.1e-14) 35. Sorensen: elapsed time t=1.7373 s, 64 iters, t-(init.)=1.70703 s t(norm)=0.116282, mflops=42.9988 (err=6.8e-15) 36. Sorensen DIT: elapsed time t=1.5293 s, 32 iters, t-(init.)=1.51465 s t(norm)=0.206354, mflops=24.2302 (err=6.8e-15) 37. Temperton: elapsed time t=1.35742 s, 64 iters, t-(init.)=1.32715 s t(norm)=0.0904048, mflops=55.3068 (err=6.9e-15) 38. Temperton (f2c): elapsed time t=1.35352 s, 64 iters, t-(init.)=1.32324 s t(norm)=0.0901387, mflops=55.4701 (err=6.9e-15) 39. Valkenburg: elapsed time t=1.03613 s, 8 iters, t-(init.)=1.03223 s t(norm)=0.562519, mflops=8.88859 (err=7.0e-15) Top mflops for N=16384 = 178.744 Normalized results and averages for N=16384: fft 0: mflops = 49.8421 (norm. = 0.278846), norm. avg. (of 14) = 0.3439 fft 1: mflops = 48.2116 (norm. = 0.269724), norm. avg. (of 14) = 0.300575 fft 2: mflops = 38.7832 (norm. = 0.216976), norm. avg. (of 14) = 0.214282 fft 3: mflops = 38.0374 (norm. = 0.212804), norm. avg. (of 14) = 0.0888729 fft 4: mflops = 22.3165 (norm. = 0.124852), norm. avg. (of 14) = 0.116296 fft 5: mflops = 12.0684 (norm. = 0.0675177), norm. avg. (of 14) = 0.0432441 fft 6: mflops = 111.269 (norm. = 0.622502), norm. 
avg. (of 14) = 0.403192 fft 7: mflops = 48.617 (norm. = 0.271992), norm. avg. (of 14) = 0.150651 fft 8: mflops = 25.7757 (norm. = 0.144204), norm. avg. (of 14) = 0.113562 fft 9: mflops = 157.903 (norm. = 0.883403), norm. avg. (of 14) = 0.457414 fft 10: mflops = 157.407 (norm. = 0.880628), norm. avg. (of 14) = 0.471 fft 11: mflops = 31.6872 (norm. = 0.177277), norm. avg. (of 13) = 0.121894 fft 12: mflops = 38.6238 (norm. = 0.216084), norm. avg. (of 14) = 0.158925 fft 13: mflops = 38.9238 (norm. = 0.217763), norm. avg. (of 14) = 0.158074 fft 14: mflops = 161.813 (norm. = 0.905274), norm. avg. (of 14) = 0.899351 fft 15: mflops = 178.744 (norm. = 1), norm. avg. (of 14) = 0.894068 fft 16: mflops = 89.213 (norm. = 0.49911), norm. avg. (of 14) = 0.680706 fft 17: mflops = 129.701 (norm. = 0.725626), norm. avg. (of 12) = 0.80889 fft 18: mflops = 65.5291 (norm. = 0.366609), norm. avg. (of 14) = 0.304723 fft 19: mflops = 45.1966 (norm. = 0.252856), norm. avg. (of 14) = 0.185736 fft 20: mflops = 53.0805 (norm. = 0.296963), norm. avg. (of 14) = 0.216259 fft 21: mflops = -1 (norm. = -0.00559459), norm. avg. (of 12) = 0.463041 fft 22: mflops = 73.0437 (norm. = 0.408649), norm. avg. (of 13) = 0.323181 fft 23: mflops = 79.7051 (norm. = 0.445917), norm. avg. (of 13) = 0.387765 fft 24: mflops = 72.9727 (norm. = 0.408252), norm. avg. (of 13) = 0.376065 fft 25: mflops = 41.3432 (norm. = 0.231298), norm. avg. (of 13) = 0.16262 fft 26: mflops = 24.1523 (norm. = 0.135122), norm. avg. (of 14) = 0.080477 fft 27: mflops = 34.7972 (norm. = 0.194676), norm. avg. (of 14) = 0.132978 fft 28: mflops = 51.3401 (norm. = 0.287227), norm. avg. (of 14) = 0.219167 fft 29: mflops = 167.492 (norm. = 0.937047), norm. avg. (of 14) = 0.771484 fft 30: mflops = 89.8529 (norm. = 0.50269), norm. avg. (of 14) = 0.343139 fft 31: mflops = 97.9309 (norm. = 0.547883), norm. avg. (of 13) = 0.214413 fft 32: mflops = 19.302 (norm. = 0.107987), norm. avg. (of 13) = 0.0899091 fft 33: mflops = 58.1299 (norm. 
= 0.325213), norm. avg. (of 14) = 0.195406 fft 34: mflops = 88.6343 (norm. = 0.495873), norm. avg. (of 14) = 0.342304 fft 35: mflops = 42.9988 (norm. = 0.240561), norm. avg. (of 14) = 0.231374 fft 36: mflops = 24.2302 (norm. = 0.135558), norm. avg. (of 14) = 0.106254 fft 37: mflops = 55.3068 (norm. = 0.309419), norm. avg. (of 14) = 0.176881 fft 38: mflops = 55.4701 (norm. = 0.310332), norm. avg. (of 14) = 0.179364 fft 39: mflops = 8.88859 (norm. = 0.049728), norm. avg. (of 14) = 0.0328721 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.16016 s, 16 iters, t-(init.)=1.13574 s t(norm)=0.144417, mflops=34.6219 (err=1.4e-14) 1. Arndt DIT: elapsed time t=1.19824 s, 16 iters, t-(init.)=1.1748 s t(norm)=0.149384, mflops=33.4708 (err=1.4e-14) 2. Arndt Split-Radix: elapsed time t=1.54004 s, 16 iters, t-(init.)=1.5166 s t(norm)=0.192846, mflops=25.9274 (err=1.4e-14) 3. Arndt 4-step: elapsed time t=1.31445 s, 16 iters, t-(init.)=1.29102 s t(norm)=0.164161, mflops=30.4579 (err=1.4e-14) 4. Bailey: elapsed time t=1.78027 s, 16 iters, t-(init.)=1.75684 s t(norm)=0.223393, mflops=22.3821 (err=1.4e-14) 5. Beauregard: elapsed time t=1.71484 s, 8 iters, t-(init.)=1.70313 s t(norm)=0.433127, mflops=11.544 (err=1.4e-14) 6. Bergland: elapsed time t=1.03711 s, 32 iters, t-(init.)=0.990234 s t(norm)=0.0629574, mflops=79.4188 (err=1.4e-14) 7. Brenner: elapsed time t=1.05078 s, 16 iters, t-(init.)=1.02734 s t(norm)=0.130634, mflops=38.275 (err=1.4e-14) 8. Burrus: elapsed time t=1.01465 s, 8 iters, t-(init.)=1.00293 s t(norm)=0.255058, mflops=19.6034 (err=1.4e-14) 9. CWP (min N) (N=34320): elapsed time t=1.17773 s, 64 iters, t-(init.)=1.0791 s t(norm)=0.0343037, mflops=145.757 10. CWP (best N) (N=34320): elapsed time t=1.17773 s, 64 iters, t-(init.)=1.0791 s t(norm)=0.0343037, mflops=145.757 11. Edelblute: elapsed time t=1.7627 s, 16 iters, t-(init.)=1.73926 s t(norm)=0.221158, mflops=22.6083 (err=1.4e-14) 12. 
FFTPACK: elapsed time t=1.04199 s, 16 iters, t-(init.)=1.01758 s t(norm)=0.129392, mflops=38.6423 (err=1.4e-14) 13. FFTPACK (f2c): elapsed time t=1.03906 s, 16 iters, t-(init.)=1.01563 s t(norm)=0.129143, mflops=38.7167 (err=1.4e-14) FFTW_MEASURE plan: (cost = 2.044678e-02) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_NOTW 8 14. FFTW: elapsed time t=1.21973 s, 64 iters, t-(init.)=1.125 s t(norm)=0.0357628, mflops=139.81 (err=1.4e-14) FFTW_ESTIMATE plan: (cost = 1.769472e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.28711 s, 64 iters, t-(init.)=1.19141 s t(norm)=0.0378738, mflops=132.017 (err=1.4e-14) 16. Frigo-old: elapsed time t=1.02832 s, 32 iters, t-(init.)=0.980469 s t(norm)=0.0623365, mflops=80.2098 (err=1.4e-14) 17. Green: elapsed time t=1.82422 s, 64 iters, t-(init.)=1.72949 s t(norm)=0.0549791, mflops=90.9437 (err=1.4e-14) 18. GSL: elapsed time t=1.38672 s, 32 iters, t-(init.)=1.33984 s t(norm)=0.085185, mflops=58.6958 (err=1.4e-14) 19. GSL DIT: elapsed time t=1.31543 s, 16 iters, t-(init.)=1.29199 s t(norm)=0.164285, mflops=30.4349 (err=1.4e-14) 20. GSL DIF: elapsed time t=1.18066 s, 16 iters, t-(init.)=1.15527 s t(norm)=0.146901, mflops=34.0366 (err=1.4e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.41504 s, 32 iters, t-(init.)=1.36816 s t(norm)=0.0869855, mflops=57.4808 (err=1.4e-14) 23. Mayer (simple): elapsed time t=1.32227 s, 32 iters, t-(init.)=1.27539 s t(norm)=0.0810872, mflops=61.662 24. Mayer (lookup): elapsed time t=1.39063 s, 32 iters, t-(init.)=1.34277 s t(norm)=0.0853712, mflops=58.5677 (err=1.4e-14) 25. Monro: elapsed time t=1.46582 s, 16 iters, t-(init.)=1.44238 s t(norm)=0.183408, mflops=27.2616 (err=1.4e-14) 26. NAPACK (f2c): elapsed time t=1.93555 s, 16 iters, t-(init.)=1.91113 s t(norm)=0.243013, mflops=20.575 (err=5.8e-13) 27. 
Nielsen: elapsed time t=1.2002 s, 16 iters, t-(init.)=1.17578 s t(norm)=0.149508, mflops=33.443 (err=2.3e-13) 28. NR (C): elapsed time t=1.18652 s, 16 iters, t-(init.)=1.16211 s t(norm)=0.14777, mflops=33.8364 (err=1.5e-14) 29. Ooura (C): elapsed time t=1.41699 s, 64 iters, t-(init.)=1.32227 s t(norm)=0.0420337, mflops=118.952 (err=1.4e-14) 30. Ooura (F): elapsed time t=1.12891 s, 32 iters, t-(init.)=1.08105 s t(norm)=0.0687316, mflops=72.7467 (err=1.4e-14) 31. Ransom: elapsed time t=1.13672 s, 32 iters, t-(init.)=1.08984 s t(norm)=0.0692904, mflops=72.1601 (err=1.5e-14) 32. SCIPORT: elapsed time t=1.19141 s, 8 iters, t-(init.)=1.17969 s t(norm)=0.30001, mflops=16.6661 (err=1.4e-14) 33. Singleton: elapsed time t=1.0166 s, 16 iters, t-(init.)=0.993164 s t(norm)=0.126287, mflops=39.5923 (err=2.1e-14) 34. Singleton (f2c): elapsed time t=1.57715 s, 32 iters, t-(init.)=1.5293 s t(norm)=0.0972301, mflops=51.4244 (err=2.1e-14) 35. Sorensen: elapsed time t=1.13184 s, 16 iters, t-(init.)=1.10742 s t(norm)=0.140816, mflops=35.5073 (err=1.4e-14) 36. Sorensen DIT: elapsed time t=1.03613 s, 8 iters, t-(init.)=1.02441 s t(norm)=0.260522, mflops=19.1922 (err=1.4e-14) 37. Temperton: elapsed time t=1.98145 s, 32 iters, t-(init.)=1.93457 s t(norm)=0.122997, mflops=40.6515 (err=1.4e-14) 38. Temperton (f2c): elapsed time t=1.97656 s, 32 iters, t-(init.)=1.92969 s t(norm)=0.122686, mflops=40.7544 (err=1.4e-14) 39. Valkenburg: elapsed time t=1.21875 s, 4 iters, t-(init.)=1.21289 s t(norm)=0.616908, mflops=8.10494 (err=1.4e-14) Top mflops for N=32768 = 145.757 Normalized results and averages for N=32768: fft 0: mflops = 34.6219 (norm. = 0.237532), norm. avg. (of 15) = 0.336809 fft 1: mflops = 33.4708 (norm. = 0.229634), norm. avg. (of 15) = 0.295846 fft 2: mflops = 25.9274 (norm. = 0.177882), norm. avg. (of 15) = 0.211855 fft 3: mflops = 30.4579 (norm. = 0.208964), norm. avg. (of 15) = 0.096879 fft 4: mflops = 22.3821 (norm. = 0.153558), norm. avg. 
(of 15) = 0.11878 fft 5: mflops = 11.544 (norm. = 0.0792001), norm. avg. (of 15) = 0.0456411 fft 6: mflops = 79.4188 (norm. = 0.544872), norm. avg. (of 15) = 0.412638 fft 7: mflops = 38.275 (norm. = 0.262595), norm. avg. (of 15) = 0.158114 fft 8: mflops = 19.6034 (norm. = 0.134494), norm. avg. (of 15) = 0.114957 fft 9: mflops = 145.757 (norm. = 1), norm. avg. (of 15) = 0.493586 fft 10: mflops = 145.757 (norm. = 1), norm. avg. (of 15) = 0.506267 fft 11: mflops = 22.6083 (norm. = 0.155109), norm. avg. (of 14) = 0.124267 fft 12: mflops = 38.6423 (norm. = 0.265115), norm. avg. (of 15) = 0.166004 fft 13: mflops = 38.7167 (norm. = 0.265625), norm. avg. (of 15) = 0.165244 fft 14: mflops = 139.81 (norm. = 0.959201), norm. avg. (of 15) = 0.903341 fft 15: mflops = 132.017 (norm. = 0.905738), norm. avg. (of 15) = 0.894846 fft 16: mflops = 80.2098 (norm. = 0.550299), norm. avg. (of 15) = 0.672013 fft 17: mflops = 90.9437 (norm. = 0.623941), norm. avg. (of 13) = 0.794663 fft 18: mflops = 58.6958 (norm. = 0.402697), norm. avg. (of 15) = 0.311254 fft 19: mflops = 30.4349 (norm. = 0.208806), norm. avg. (of 15) = 0.187274 fft 20: mflops = 34.0366 (norm. = 0.233516), norm. avg. (of 15) = 0.217409 fft 21: mflops = -1 (norm. = -0.00686074), norm. avg. (of 12) = 0.463041 fft 22: mflops = 57.4808 (norm. = 0.394361), norm. avg. (of 14) = 0.328265 fft 23: mflops = 61.662 (norm. = 0.423047), norm. avg. (of 14) = 0.390285 fft 24: mflops = 58.5677 (norm. = 0.401818), norm. avg. (of 14) = 0.377905 fft 25: mflops = 27.2616 (norm. = 0.187035), norm. avg. (of 14) = 0.164364 fft 26: mflops = 20.575 (norm. = 0.14116), norm. avg. (of 15) = 0.0845226 fft 27: mflops = 33.443 (norm. = 0.229444), norm. avg. (of 15) = 0.139409 fft 28: mflops = 33.8364 (norm. = 0.232143), norm. avg. (of 15) = 0.220032 fft 29: mflops = 118.952 (norm. = 0.8161), norm. avg. (of 15) = 0.774458 fft 30: mflops = 72.7467 (norm. = 0.499097), norm. avg. (of 15) = 0.353536 fft 31: mflops = 72.1601 (norm. = 0.495072), norm. avg. 
(of 14) = 0.23446 fft 32: mflops = 16.6661 (norm. = 0.114342), norm. avg. (of 14) = 0.0916543 fft 33: mflops = 39.5923 (norm. = 0.271632), norm. avg. (of 15) = 0.200488 fft 34: mflops = 51.4244 (norm. = 0.35281), norm. avg. (of 15) = 0.343004 fft 35: mflops = 35.5073 (norm. = 0.243607), norm. avg. (of 15) = 0.232189 fft 36: mflops = 19.1922 (norm. = 0.131673), norm. avg. (of 15) = 0.107949 fft 37: mflops = 40.6515 (norm. = 0.2789), norm. avg. (of 15) = 0.183682 fft 38: mflops = 40.7544 (norm. = 0.279605), norm. avg. (of 15) = 0.186046 fft 39: mflops = 8.10494 (norm. = 0.0556059), norm. avg. (of 15) = 0.0343877 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.53711 s, 8 iters, t-(init.)=1.50879 s t(norm)=0.179862, mflops=27.7991 (err=1.6e-14) 1. Arndt DIT: elapsed time t=1.5791 s, 8 iters, t-(init.)=1.55078 s t(norm)=0.184868, mflops=27.0464 (err=1.6e-14) 2. Arndt Split-Radix: elapsed time t=1.99023 s, 8 iters, t-(init.)=1.96094 s t(norm)=0.233762, mflops=21.3893 (err=1.6e-14) 3. Arndt 4-step: elapsed time t=1.29785 s, 8 iters, t-(init.)=1.26855 s t(norm)=0.151224, mflops=33.0636 (err=1.6e-14) 4. Bailey: elapsed time t=1.12793 s, 4 iters, t-(init.)=1.11328 s t(norm)=0.265427, mflops=18.8376 (err=1.6e-14) 5. Beauregard: elapsed time t=1.88281 s, 4 iters, t-(init.)=1.86914 s t(norm)=0.445638, mflops=11.2199 (err=1.7e-14) 6. Bergland: elapsed time t=1.33691 s, 16 iters, t-(init.)=1.2793 s t(norm)=0.076252, mflops=65.572 (err=1.6e-14) 7. Brenner: elapsed time t=1.25391 s, 8 iters, t-(init.)=1.22559 s t(norm)=0.146101, mflops=34.2228 (err=1.7e-14) 8. Burrus: elapsed time t=1.2627 s, 4 iters, t-(init.)=1.24902 s t(norm)=0.29779, mflops=16.7903 (err=1.6e-14) 9. CWP (min N) (N=72072): elapsed time t=1.35059 s, 32 iters, t-(init.)=1.22363 s t(norm)=0.0364671, mflops=137.11 10. CWP (best N) (N=72072): elapsed time t=1.35059 s, 32 iters, t-(init.)=1.22363 s t(norm)=0.0364671, mflops=137.11 11. 
Edelblute: elapsed time t=1.12793 s, 4 iters, t-(init.)=1.11426 s t(norm)=0.26566, mflops=18.8211 (err=1.6e-14) 12. FFTPACK: elapsed time t=1.2168 s, 8 iters, t-(init.)=1.1875 s t(norm)=0.141561, mflops=35.3205 (err=1.7e-14) 13. FFTPACK (f2c): elapsed time t=1.20801 s, 8 iters, t-(init.)=1.17969 s t(norm)=0.14063, mflops=35.5544 (err=1.7e-14) FFTW_MEASURE plan: (cost = 5.102539e-02) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.50781 s, 32 iters, t-(init.)=1.3916 s t(norm)=0.041473, mflops=120.56 (err=1.7e-14) FFTW_ESTIMATE plan: (cost = 5.636096e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.68164 s, 32 iters, t-(init.)=1.56641 s t(norm)=0.0466825, mflops=107.106 (err=1.7e-14) 16. Frigo-old: elapsed time t=1.35156 s, 16 iters, t-(init.)=1.29297 s t(norm)=0.0770669, mflops=64.8787 (err=1.7e-14) 17. Green: elapsed time t=1.16016 s, 16 iters, t-(init.)=1.10254 s t(norm)=0.0657164, mflops=76.0845 (err=1.7e-14) 18. GSL: elapsed time t=1.62988 s, 16 iters, t-(init.)=1.57227 s t(norm)=0.0937143, mflops=53.3536 (err=1.7e-14) 19. GSL DIT: elapsed time t=1.6377 s, 8 iters, t-(init.)=1.60938 s t(norm)=0.191852, mflops=26.0617 (err=1.7e-14) 20. GSL DIF: elapsed time t=1.50098 s, 8 iters, t-(init.)=1.47266 s t(norm)=0.175554, mflops=28.4812 (err=1.8e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.07813 s, 8 iters, t-(init.)=1.04883 s t(norm)=0.12503, mflops=39.9904 (err=1.6e-14) 23. Mayer (simple): elapsed time t=1.0332 s, 8 iters, t-(init.)=1.00488 s t(norm)=0.119791, mflops=41.7392 24. Mayer (lookup): elapsed time t=1.08301 s, 8 iters, t-(init.)=1.05469 s t(norm)=0.125729, mflops=39.7682 (err=1.6e-14) 25. Monro: elapsed time t=1.84082 s, 8 iters, t-(init.)=1.8125 s t(norm)=0.216067, mflops=23.141 (err=1.8e-14) 26. 
NAPACK (f2c): elapsed time t=1.12793 s, 4 iters, t-(init.)=1.11328 s t(norm)=0.265427, mflops=18.8376 (err=8.7e-13) 27. Nielsen: elapsed time t=1.54492 s, 8 iters, t-(init.)=1.51563 s t(norm)=0.180677, mflops=27.6738 (err=2.6e-13) 28. NR (C): elapsed time t=1.53027 s, 8 iters, t-(init.)=1.50195 s t(norm)=0.179047, mflops=27.9257 (err=1.7e-14) 29. Ooura (C): elapsed time t=1.68066 s, 32 iters, t-(init.)=1.56543 s t(norm)=0.0466534, mflops=107.173 (err=1.6e-14) 30. Ooura (F): elapsed time t=1.27832 s, 16 iters, t-(init.)=1.2207 s t(norm)=0.0727596, mflops=68.7195 (err=1.6e-14) 31. Ransom: elapsed time t=1.25586 s, 16 iters, t-(init.)=1.19824 s t(norm)=0.0714208, mflops=70.0076 (err=1.7e-14) 32. SCIPORT: elapsed time t=1.51074 s, 4 iters, t-(init.)=1.49609 s t(norm)=0.356697, mflops=14.0175 (err=1.7e-14) 33. Singleton: elapsed time t=1.13281 s, 8 iters, t-(init.)=1.10352 s t(norm)=0.131549, mflops=38.0086 (err=2.3e-14) 34. Singleton (f2c): elapsed time t=1.74414 s, 16 iters, t-(init.)=1.68652 s t(norm)=0.100525, mflops=49.7391 (err=2.3e-14) 35. Sorensen: elapsed time t=1.45801 s, 8 iters, t-(init.)=1.42969 s t(norm)=0.170432, mflops=29.3372 (err=1.6e-14) 36. Sorensen DIT: elapsed time t=1.27539 s, 4 iters, t-(init.)=1.26074 s t(norm)=0.300584, mflops=16.6343 (err=1.6e-14) 37. Temperton: elapsed time t=1.12988 s, 8 iters, t-(init.)=1.10156 s t(norm)=0.131316, mflops=38.076 (err=1.7e-14) 38. Temperton (f2c): elapsed time t=1.12793 s, 8 iters, t-(init.)=1.09961 s t(norm)=0.131084, mflops=38.1436 (err=1.7e-14) 39. Valkenburg: elapsed time t=1.36621 s, 2 iters, t-(init.)=1.3584 s t(norm)=0.647735, mflops=7.71921 (err=1.6e-14) Top mflops for N=65536 = 137.11 Normalized results and averages for N=65536: fft 0: mflops = 27.7991 (norm. = 0.202751), norm. avg. (of 16) = 0.32843 fft 1: mflops = 27.0464 (norm. = 0.197261), norm. avg. (of 16) = 0.289684 fft 2: mflops = 21.3893 (norm. = 0.156001), norm. avg. (of 16) = 0.208364 fft 3: mflops = 33.0636 (norm. = 0.241147), norm. avg. 
(of 16) = 0.105896 fft 4: mflops = 18.8376 (norm. = 0.13739), norm. avg. (of 16) = 0.119943 fft 5: mflops = 11.2199 (norm. = 0.0818312), norm. avg. (of 16) = 0.047903 fft 6: mflops = 65.572 (norm. = 0.478244), norm. avg. (of 16) = 0.416738 fft 7: mflops = 34.2228 (norm. = 0.249602), norm. avg. (of 16) = 0.163832 fft 8: mflops = 16.7903 (norm. = 0.122459), norm. avg. (of 16) = 0.115426 fft 9: mflops = 137.11 (norm. = 1), norm. avg. (of 16) = 0.525237 fft 10: mflops = 137.11 (norm. = 1), norm. avg. (of 16) = 0.537125 fft 11: mflops = 18.8211 (norm. = 0.13727), norm. avg. (of 15) = 0.125134 fft 12: mflops = 35.3205 (norm. = 0.257607), norm. avg. (of 16) = 0.171729 fft 13: mflops = 35.5544 (norm. = 0.259313), norm. avg. (of 16) = 0.171123 fft 14: mflops = 120.56 (norm. = 0.879298), norm. avg. (of 16) = 0.901838 fft 15: mflops = 107.106 (norm. = 0.781172), norm. avg. (of 16) = 0.887741 fft 16: mflops = 64.8787 (norm. = 0.473187), norm. avg. (of 16) = 0.659586 fft 17: mflops = 76.0845 (norm. = 0.554916), norm. avg. (of 14) = 0.777539 fft 18: mflops = 53.3536 (norm. = 0.38913), norm. avg. (of 16) = 0.316121 fft 19: mflops = 26.0617 (norm. = 0.190079), norm. avg. (of 16) = 0.18745 fft 20: mflops = 28.4812 (norm. = 0.207725), norm. avg. (of 16) = 0.216804 fft 21: mflops = -1 (norm. = -0.00729342), norm. avg. (of 12) = 0.463041 fft 22: mflops = 39.9904 (norm. = 0.291667), norm. avg. (of 15) = 0.325825 fft 23: mflops = 41.7392 (norm. = 0.304422), norm. avg. (of 15) = 0.384561 fft 24: mflops = 39.7682 (norm. = 0.290046), norm. avg. (of 15) = 0.372048 fft 25: mflops = 23.141 (norm. = 0.168777), norm. avg. (of 15) = 0.164658 fft 26: mflops = 18.8376 (norm. = 0.13739), norm. avg. (of 16) = 0.0878268 fft 27: mflops = 27.6738 (norm. = 0.201836), norm. avg. (of 16) = 0.143311 fft 28: mflops = 27.9257 (norm. = 0.203674), norm. avg. (of 16) = 0.21901 fft 29: mflops = 107.173 (norm. = 0.781659), norm. avg. (of 16) = 0.774908 fft 30: mflops = 68.7195 (norm. = 0.5012), norm. avg. 
(of 16) = 0.362765 fft 31: mflops = 70.0076 (norm. = 0.510595), norm. avg. (of 15) = 0.252869 fft 32: mflops = 14.0175 (norm. = 0.102236), norm. avg. (of 15) = 0.0923598 fft 33: mflops = 38.0086 (norm. = 0.277212), norm. avg. (of 16) = 0.205283 fft 34: mflops = 49.7391 (norm. = 0.362768), norm. avg. (of 16) = 0.344239 fft 35: mflops = 29.3372 (norm. = 0.213969), norm. avg. (of 16) = 0.23105 fft 36: mflops = 16.6343 (norm. = 0.121321), norm. avg. (of 16) = 0.108785 fft 37: mflops = 38.076 (norm. = 0.277704), norm. avg. (of 16) = 0.189559 fft 38: mflops = 38.1436 (norm. = 0.278197), norm. avg. (of 16) = 0.191806 fft 39: mflops = 7.71921 (norm. = 0.0562994), norm. avg. (of 16) = 0.0357572 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.93164 s, 4 iters, t-(init.)=1.89746 s t(norm)=0.212889, mflops=23.4864 (err=3.3e-14) 1. Arndt DIT: elapsed time t=1.9707 s, 4 iters, t-(init.)=1.9375 s t(norm)=0.217382, mflops=23.001 (err=3.3e-14) 2. Arndt Split-Radix: elapsed time t=1.2627 s, 2 iters, t-(init.)=1.24609 s t(norm)=0.279616, mflops=17.8817 (err=3.3e-14) 3. Arndt 4-step: elapsed time t=1.73633 s, 4 iters, t-(init.)=1.70117 s t(norm)=0.190866, mflops=26.1963 (err=3.3e-14) 4. Bailey: elapsed time t=1.24414 s, 2 iters, t-(init.)=1.22754 s t(norm)=0.275452, mflops=18.152 (err=3.3e-14) 5. Beauregard: elapsed time t=1.03027 s, 1 iters, t-(init.)=1.02246 s t(norm)=0.458868, mflops=10.8964 (err=3.4e-14) 6. Bergland: elapsed time t=1.66699 s, 8 iters, t-(init.)=1.59961 s t(norm)=0.0897357, mflops=55.7192 (err=3.3e-14) 7. Brenner: elapsed time t=1.51953 s, 4 iters, t-(init.)=1.48633 s t(norm)=0.166762, mflops=29.9829 (err=3.4e-14) 8. Burrus: elapsed time t=1.54199 s, 2 iters, t-(init.)=1.52539 s t(norm)=0.342288, mflops=14.6076 (err=3.3e-14) 9. CWP (min N) (N=144144): elapsed time t=1.58984 s, 16 iters, t-(init.)=1.43359 s t(norm)=0.0402112, mflops=124.343 10. 
CWP (best N) (N=144144): elapsed time t=1.58887 s, 16 iters, t-(init.)=1.43262 s t(norm)=0.0401838, mflops=124.428 11. Edelblute: elapsed time t=1.39258 s, 2 iters, t-(init.)=1.375 s t(norm)=0.308542, mflops=16.2053 (err=3.3e-14) 12. FFTPACK: elapsed time t=1.58691 s, 4 iters, t-(init.)=1.55176 s t(norm)=0.174103, mflops=28.7187 (err=3.4e-14) 13. FFTPACK (f2c): elapsed time t=1.58398 s, 4 iters, t-(init.)=1.5498 s t(norm)=0.173883, mflops=28.7549 (err=3.4e-14) FFTW_MEASURE plan: (cost = 1.308594e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.0293 s, 8 iters, t-(init.)=0.958984 s t(norm)=0.0537976, mflops=92.941 (err=3.4e-14) FFTW_ESTIMATE plan: (cost = 1.048576e+06) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.95801 s, 16 iters, t-(init.)=1.81934 s t(norm)=0.051031, mflops=97.9797 (err=3.4e-14) 16. Frigo-old: elapsed time t=1.67773 s, 8 iters, t-(init.)=1.60938 s t(norm)=0.0902835, mflops=55.3811 (err=3.4e-14) 17. Green: elapsed time t=1.57129 s, 8 iters, t-(init.)=1.50391 s t(norm)=0.0843669, mflops=59.265 (err=3.4e-14) 18. GSL: elapsed time t=1.11523 s, 4 iters, t-(init.)=1.08008 s t(norm)=0.121182, mflops=41.2604 (err=3.4e-14) 19. GSL DIT: elapsed time t=1.02637 s, 2 iters, t-(init.)=1.00879 s t(norm)=0.226366, mflops=22.0881 (err=3.5e-14) 20. GSL DIF: elapsed time t=1.92676 s, 4 iters, t-(init.)=1.89258 s t(norm)=0.212342, mflops=23.547 (err=3.6e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.41309 s, 4 iters, t-(init.)=1.37891 s t(norm)=0.154709, mflops=32.3187 (err=3.3e-14) 23. Mayer (simple): elapsed time t=1.375 s, 4 iters, t-(init.)=1.3418 s t(norm)=0.150546, mflops=33.2125 24. Mayer (lookup): elapsed time t=1.44238 s, 4 iters, t-(init.)=1.40918 s t(norm)=0.158106, mflops=31.6244 (err=3.3e-14) 25. 
Monro: elapsed time t=1.1709 s, 2 iters, t-(init.)=1.15332 s t(norm)=0.258798, mflops=19.3201 (err=3.6e-14) 26. NAPACK (f2c): elapsed time t=1.40234 s, 2 iters, t-(init.)=1.38477 s t(norm)=0.310733, mflops=16.091 (err=2.1e-12) 27. Nielsen: elapsed time t=1.90332 s, 4 iters, t-(init.)=1.86816 s t(norm)=0.209602, mflops=23.8547 (err=9.3e-13) 28. NR (C): elapsed time t=1.93945 s, 4 iters, t-(init.)=1.90527 s t(norm)=0.213766, mflops=23.3901 (err=3.4e-14) 29. Ooura (C): elapsed time t=1.09863 s, 8 iters, t-(init.)=1.0293 s t(norm)=0.057742, mflops=86.5921 (err=3.3e-14) 30. Ooura (F): elapsed time t=1.59082 s, 8 iters, t-(init.)=1.52148 s t(norm)=0.085353, mflops=58.5803 (err=3.3e-14) 31. Ransom: elapsed time t=1.60938 s, 8 iters, t-(init.)=1.54102 s t(norm)=0.0864486, mflops=57.8378 (err=3.4e-14) 32. SCIPORT: elapsed time t=1.86523 s, 2 iters, t-(init.)=1.84766 s t(norm)=0.414603, mflops=12.0597 (err=3.4e-14) 33. Singleton: elapsed time t=1.4502 s, 4 iters, t-(init.)=1.41602 s t(norm)=0.158873, mflops=31.4717 (err=4.9e-14) 34. Singleton (f2c): elapsed time t=1.16895 s, 4 iters, t-(init.)=1.13477 s t(norm)=0.127317, mflops=39.272 (err=4.9e-14) 35. Sorensen: elapsed time t=1.75098 s, 4 iters, t-(init.)=1.71777 s t(norm)=0.192729, mflops=25.9432 (err=3.3e-14) 36. Sorensen DIT: elapsed time t=1.54004 s, 2 iters, t-(init.)=1.52344 s t(norm)=0.34185, mflops=14.6263 (err=3.3e-14) 37. Temperton: elapsed time t=1.4873 s, 4 iters, t-(init.)=1.45313 s t(norm)=0.163036, mflops=30.668 (err=3.4e-14) 38. Temperton (f2c): elapsed time t=1.48535 s, 4 iters, t-(init.)=1.45117 s t(norm)=0.162817, mflops=30.7093 (err=3.4e-14) 39. Valkenburg: elapsed time t=1.54395 s, 1 iters, t-(init.)=1.53516 s t(norm)=0.68896, mflops=7.25732 (err=3.3e-14) Top mflops for N=131072 = 124.428 Normalized results and averages for N=131072: fft 0: mflops = 23.4864 (norm. = 0.188755), norm. avg. (of 17) = 0.320214 fft 1: mflops = 23.001 (norm. = 0.184854), norm. avg. 
(of 17) = 0.283518 fft 2: mflops = 17.8817 (norm. = 0.143711), norm. avg. (of 17) = 0.204561 fft 3: mflops = 26.1963 (norm. = 0.210534), norm. avg. (of 17) = 0.112051 fft 4: mflops = 18.152 (norm. = 0.145883), norm. avg. (of 17) = 0.121469 fft 5: mflops = 10.8964 (norm. = 0.0875716), norm. avg. (of 17) = 0.0502365 fft 6: mflops = 55.7192 (norm. = 0.447802), norm. avg. (of 17) = 0.418565 fft 7: mflops = 29.9829 (norm. = 0.240966), norm. avg. (of 17) = 0.168369 fft 8: mflops = 14.6076 (norm. = 0.117398), norm. avg. (of 17) = 0.115542 fft 9: mflops = 124.343 (norm. = 0.999319), norm. avg. (of 17) = 0.553124 fft 10: mflops = 124.428 (norm. = 1), norm. avg. (of 17) = 0.564353 fft 11: mflops = 16.2053 (norm. = 0.130238), norm. avg. (of 16) = 0.125453 fft 12: mflops = 28.7187 (norm. = 0.230806), norm. avg. (of 17) = 0.175204 fft 13: mflops = 28.7549 (norm. = 0.231096), norm. avg. (of 17) = 0.174651 fft 14: mflops = 92.941 (norm. = 0.746945), norm. avg. (of 17) = 0.892727 fft 15: mflops = 97.9797 (norm. = 0.78744), norm. avg. (of 17) = 0.881841 fft 16: mflops = 55.3811 (norm. = 0.445085), norm. avg. (of 17) = 0.646968 fft 17: mflops = 59.265 (norm. = 0.476299), norm. avg. (of 15) = 0.757456 fft 18: mflops = 41.2604 (norm. = 0.3316), norm. avg. (of 17) = 0.317032 fft 19: mflops = 22.0881 (norm. = 0.177517), norm. avg. (of 17) = 0.186865 fft 20: mflops = 23.547 (norm. = 0.189241), norm. avg. (of 17) = 0.215183 fft 21: mflops = -1 (norm. = -0.00803677), norm. avg. (of 12) = 0.463041 fft 22: mflops = 32.3187 (norm. = 0.259738), norm. avg. (of 16) = 0.321695 fft 23: mflops = 33.2125 (norm. = 0.266921), norm. avg. (of 16) = 0.377208 fft 24: mflops = 31.6244 (norm. = 0.254158), norm. avg. (of 16) = 0.36468 fft 25: mflops = 19.3201 (norm. = 0.155271), norm. avg. (of 16) = 0.164072 fft 26: mflops = 16.091 (norm. = 0.129319), norm. avg. (of 17) = 0.0902675 fft 27: mflops = 23.8547 (norm. = 0.191715), norm. avg. (of 17) = 0.146158 fft 28: mflops = 23.3901 (norm. = 0.187981), norm. 
avg. (of 17) = 0.217184 fft 29: mflops = 86.5921 (norm. = 0.69592), norm. avg. (of 17) = 0.770262 fft 30: mflops = 58.5803 (norm. = 0.470796), norm. avg. (of 17) = 0.36912 fft 31: mflops = 57.8378 (norm. = 0.464829), norm. avg. (of 16) = 0.266117 fft 32: mflops = 12.0597 (norm. = 0.0969212), norm. avg. (of 16) = 0.0926449 fft 33: mflops = 31.4717 (norm. = 0.252931), norm. avg. (of 17) = 0.208086 fft 34: mflops = 39.272 (norm. = 0.31562), norm. avg. (of 17) = 0.342556 fft 35: mflops = 25.9432 (norm. = 0.208499), norm. avg. (of 17) = 0.229724 fft 36: mflops = 14.6263 (norm. = 0.117548), norm. avg. (of 17) = 0.1093 fft 37: mflops = 30.668 (norm. = 0.246472), norm. avg. (of 17) = 0.192906 fft 38: mflops = 30.7093 (norm. = 0.246803), norm. avg. (of 17) = 0.195041 fft 39: mflops = 7.25732 (norm. = 0.0583254), norm. avg. (of 17) = 0.0370848 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=1.16797 s, 1 iters, t-(init.)=1.14844 s t(norm)=0.243386, mflops=20.5435 (err=4.4e-14) 1. Arndt DIT: elapsed time t=1.17383 s, 1 iters, t-(init.)=1.1543 s t(norm)=0.244627, mflops=20.4392 (err=4.4e-14) 2. Arndt Split-Radix: elapsed time t=1.51367 s, 1 iters, t-(init.)=1.49316 s t(norm)=0.316443, mflops=15.8006 (err=4.4e-14) 3. Arndt 4-step: elapsed time t=1.57422 s, 2 iters, t-(init.)=1.53223 s t(norm)=0.162361, mflops=30.7957 (err=4.3e-14) 4. Bailey: elapsed time t=1.44922 s, 1 iters, t-(init.)=1.42871 s t(norm)=0.302783, mflops=16.5135 (err=4.4e-14) 5. Beauregard: elapsed time t=2.27539 s, 1 iters, t-(init.)=2.25391 s t(norm)=0.477665, mflops=10.4676 (err=4.4e-14) 6. Bergland: elapsed time t=1.94727 s, 4 iters, t-(init.)=1.86426 s t(norm)=0.0987719, mflops=50.6217 (err=4.4e-14) 7. Brenner: elapsed time t=1.79688 s, 2 iters, t-(init.)=1.75684 s t(norm)=0.186161, mflops=26.8585 (err=4.4e-14) 8. Burrus: elapsed time t=1.79492 s, 1 iters, t-(init.)=1.77441 s t(norm)=0.376047, mflops=13.2962 (err=4.4e-14) 9. 
CWP (min N) (N=360360): elapsed time t=1.28125 s, 4 iters, t-(init.)=1.16113 s t(norm)=0.061519, mflops=81.2757 10. CWP (best N) (N=360360): elapsed time t=1.28418 s, 4 iters, t-(init.)=1.16309 s t(norm)=0.0616225, mflops=81.1392 11. Edelblute: elapsed time t=1.64551 s, 1 iters, t-(init.)=1.62598 s t(norm)=0.344589, mflops=14.51 (err=4.4e-14) 12. FFTPACK: elapsed time t=1.72559 s, 2 iters, t-(init.)=1.68359 s t(norm)=0.1784, mflops=28.0269 (err=4.4e-14) 13. FFTPACK (f2c): elapsed time t=1.72266 s, 2 iters, t-(init.)=1.68164 s t(norm)=0.178193, mflops=28.0595 (err=4.4e-14) FFTW_MEASURE plan: (cost = 2.919922e-01) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 8 14. FFTW: elapsed time t=1.01758 s, 4 iters, t-(init.)=0.932617 s t(norm)=0.0494118, mflops=101.19 (err=4.4e-14) FFTW_ESTIMATE plan: (cost = 1.677722e+06) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.04492 s, 4 iters, t-(init.)=0.960938 s t(norm)=0.0509123, mflops=98.2081 (err=4.4e-14) 16. Frigo-old: elapsed time t=1.93848 s, 4 iters, t-(init.)=1.85547 s t(norm)=0.0983063, mflops=50.8615 (err=4.4e-14) 17. Green: elapsed time t=1.85156 s, 4 iters, t-(init.)=1.76953 s t(norm)=0.0937531, mflops=53.3315 (err=4.4e-14) 18. GSL: elapsed time t=1.16504 s, 2 iters, t-(init.)=1.12305 s t(norm)=0.119002, mflops=42.016 (err=4.4e-14) 19. GSL DIT: elapsed time t=1.2334 s, 1 iters, t-(init.)=1.21387 s t(norm)=0.257252, mflops=19.4362 (err=4.6e-14) 20. GSL DIF: elapsed time t=1.18262 s, 1 iters, t-(init.)=1.16211 s t(norm)=0.246283, mflops=20.3018 (err=4.6e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.73633 s, 2 iters, t-(init.)=1.69531 s t(norm)=0.179642, mflops=27.8332 (err=4.3e-14) 23. Mayer (simple): elapsed time t=1.69922 s, 2 iters, t-(init.)=1.65918 s t(norm)=0.175813, mflops=28.4393 24. 
Mayer (lookup): elapsed time t=1.78906 s, 2 iters, t-(init.)=1.74902 s t(norm)=0.185333, mflops=26.9784 (err=4.3e-14) 25. Monro: elapsed time t=1.31836 s, 1 iters, t-(init.)=1.29785 s t(norm)=0.275051, mflops=18.1785 (err=5.2e-14) 26. NAPACK (f2c): elapsed time t=1.52441 s, 1 iters, t-(init.)=1.50391 s t(norm)=0.318719, mflops=15.6878 (err=3.7e-12) 27. Nielsen: elapsed time t=1.12793 s, 1 iters, t-(init.)=1.10547 s t(norm)=0.234279, mflops=21.342 (err=2.2e-12) 28. NR (C): elapsed time t=1.18652 s, 1 iters, t-(init.)=1.16602 s t(norm)=0.247111, mflops=20.2338 (err=4.4e-14) 29. Ooura (C): elapsed time t=1.26172 s, 4 iters, t-(init.)=1.17773 s t(norm)=0.0623986, mflops=80.13 (err=4.4e-14) 30. Ooura (F): elapsed time t=1.7793 s, 4 iters, t-(init.)=1.69629 s t(norm)=0.0898726, mflops=55.6343 (err=4.4e-14) 31. Ransom: elapsed time t=1.4707 s, 4 iters, t-(init.)=1.38672 s t(norm)=0.073471, mflops=68.0541 (err=4.3e-14) 32. SCIPORT: elapsed time t=2.1543 s, 1 iters, t-(init.)=2.13379 s t(norm)=0.452209, mflops=11.0568 (err=4.4e-14) 33. Singleton: elapsed time t=1.63965 s, 2 iters, t-(init.)=1.59766 s t(norm)=0.169294, mflops=29.5345 (err=6.1e-14) 34. Singleton (f2c): elapsed time t=1.34961 s, 2 iters, t-(init.)=1.30762 s t(norm)=0.13856, mflops=36.0854 (err=6.1e-14) 35. Sorensen: elapsed time t=1.00293 s, 1 iters, t-(init.)=0.983398 s t(norm)=0.208409, mflops=23.9913 (err=4.3e-14) 36. Sorensen DIT: elapsed time t=1.77734 s, 1 iters, t-(init.)=1.75781 s t(norm)=0.372529, mflops=13.4218 (err=4.4e-14) 37. Temperton: elapsed time t=1.63477 s, 2 iters, t-(init.)=1.59375 s t(norm)=0.16888, mflops=29.6069 (err=4.4e-14) 38. Temperton (f2c): elapsed time t=1.62793 s, 2 iters, t-(init.)=1.58691 s t(norm)=0.168155, mflops=29.7344 (err=4.4e-14) 39. Valkenburg: elapsed time t=3.37109 s, 1 iters, t-(init.)=3.35059 s t(norm)=0.710082, mflops=7.04144 (err=4.4e-14) Top mflops for N=262144 = 101.19 Normalized results and averages for N=262144: fft 0: mflops = 20.5435 (norm. 
= 0.203019), norm. avg. (of 18) = 0.313703 fft 1: mflops = 20.4392 (norm. = 0.201988), norm. avg. (of 18) = 0.278988 fft 2: mflops = 15.8006 (norm. = 0.156148), norm. avg. (of 18) = 0.201872 fft 3: mflops = 30.7957 (norm. = 0.304334), norm. avg. (of 18) = 0.122733 fft 4: mflops = 16.5135 (norm. = 0.163192), norm. avg. (of 18) = 0.123787 fft 5: mflops = 10.4676 (norm. = 0.103445), norm. avg. (of 18) = 0.0531925 fft 6: mflops = 50.6217 (norm. = 0.500262), norm. avg. (of 18) = 0.423104 fft 7: mflops = 26.8585 (norm. = 0.265425), norm. avg. (of 18) = 0.173761 fft 8: mflops = 13.2962 (norm. = 0.131398), norm. avg. (of 18) = 0.116423 fft 9: mflops = 81.2757 (norm. = 0.803196), norm. avg. (of 18) = 0.567017 fft 10: mflops = 81.1392 (norm. = 0.801847), norm. avg. (of 18) = 0.577547 fft 11: mflops = 14.51 (norm. = 0.143393), norm. avg. (of 17) = 0.126508 fft 12: mflops = 28.0269 (norm. = 0.276972), norm. avg. (of 18) = 0.180858 fft 13: mflops = 28.0595 (norm. = 0.277294), norm. avg. (of 18) = 0.180354 fft 14: mflops = 101.19 (norm. = 1), norm. avg. (of 18) = 0.898686 fft 15: mflops = 98.2081 (norm. = 0.970528), norm. avg. (of 18) = 0.886768 fft 16: mflops = 50.8615 (norm. = 0.502632), norm. avg. (of 18) = 0.63895 fft 17: mflops = 53.3315 (norm. = 0.527042), norm. avg. (of 16) = 0.743055 fft 18: mflops = 42.016 (norm. = 0.415217), norm. avg. (of 18) = 0.322487 fft 19: mflops = 19.4362 (norm. = 0.192076), norm. avg. (of 18) = 0.187155 fft 20: mflops = 20.3018 (norm. = 0.20063), norm. avg. (of 18) = 0.214374 fft 21: mflops = -1 (norm. = -0.00988237), norm. avg. (of 12) = 0.463041 fft 22: mflops = 27.8332 (norm. = 0.275058), norm. avg. (of 17) = 0.318952 fft 23: mflops = 28.4393 (norm. = 0.281048), norm. avg. (of 17) = 0.371552 fft 24: mflops = 26.9784 (norm. = 0.266611), norm. avg. (of 17) = 0.358911 fft 25: mflops = 18.1785 (norm. = 0.179646), norm. avg. (of 17) = 0.164988 fft 26: mflops = 15.6878 (norm. = 0.155032), norm. avg. 
(of 18) = 0.0938656 fft 27: mflops = 21.342 (norm. = 0.21091), norm. avg. (of 18) = 0.149755 fft 28: mflops = 20.2338 (norm. = 0.199958), norm. avg. (of 18) = 0.216227 fft 29: mflops = 80.13 (norm. = 0.791874), norm. avg. (of 18) = 0.771462 fft 30: mflops = 55.6343 (norm. = 0.549799), norm. avg. (of 18) = 0.379158 fft 31: mflops = 68.0541 (norm. = 0.672535), norm. avg. (of 17) = 0.290024 fft 32: mflops = 11.0568 (norm. = 0.109268), norm. avg. (of 17) = 0.0936227 fft 33: mflops = 29.5345 (norm. = 0.29187), norm. avg. (of 18) = 0.212741 fft 34: mflops = 36.0854 (norm. = 0.356609), norm. avg. (of 18) = 0.343337 fft 35: mflops = 23.9913 (norm. = 0.23709), norm. avg. (of 18) = 0.230133 fft 36: mflops = 13.4218 (norm. = 0.132639), norm. avg. (of 18) = 0.110597 fft 37: mflops = 29.6069 (norm. = 0.292586), norm. avg. (of 18) = 0.198444 fft 38: mflops = 29.7344 (norm. = 0.293846), norm. avg. (of 18) = 0.20053 fft 39: mflops = 7.04144 (norm. = 0.0695861), norm. avg. (of 18) = 0.0388904 Benchmarking for array size = 524288 (power of 2): 0. Arndt DIF: elapsed time t=2.7168 s, 1 iters, t-(init.)=2.67188 s t(norm)=0.268221, mflops=18.6414 (err=1.1e-13) 1. Arndt DIT: elapsed time t=2.74316 s, 1 iters, t-(init.)=2.69922 s t(norm)=0.270966, mflops=18.4525 (err=1.1e-13) 2. Arndt Split-Radix: elapsed time t=3.60254 s, 1 iters, t-(init.)=3.55762 s t(norm)=0.357138, mflops=14.0002 (err=1.1e-13) 3. Arndt 4-step: elapsed time t=2.18066 s, 1 iters, t-(init.)=2.13477 s t(norm)=0.214302, mflops=23.3315 (err=1.1e-13) 4. Bailey: elapsed time t=3.02246 s, 1 iters, t-(init.)=2.97949 s t(norm)=0.299102, mflops=16.7167 (err=1.1e-13) 5. Beauregard: elapsed time t=4.84961 s, 1 iters, t-(init.)=4.80469 s t(norm)=0.482327, mflops=10.3664 (err=1.1e-13) 6. Bergland: elapsed time t=1.14551 s, 1 iters, t-(init.)=1.10156 s t(norm)=0.110582, mflops=45.2152 (err=1.1e-13) 7. Brenner: elapsed time t=2.02539 s, 1 iters, t-(init.)=1.98145 s t(norm)=0.198911, mflops=25.1369 (err=1.1e-13) 8. 
Burrus: elapsed time t=4.12891 s, 1 iters, t-(init.)=4.08398 s t(norm)=0.409978, mflops=12.1958 (err=1.1e-13) 9. CWP (min N) (N=720720): elapsed time t=1.39648 s, 2 iters, t-(init.)=1.27051 s t(norm)=0.0637711, mflops=78.4054 10. CWP (best N) (N=720720): elapsed time t=1.39453 s, 2 iters, t-(init.)=1.27051 s t(norm)=0.0637711, mflops=78.4054 11. Edelblute: elapsed time t=3.83398 s, 1 iters, t-(init.)=3.78906 s t(norm)=0.380372, mflops=13.145 (err=1.1e-13) 12. FFTPACK: elapsed time t=1.79004 s, 1 iters, t-(init.)=1.74707 s t(norm)=0.175383, mflops=28.5091 (err=1.1e-13) 13. FFTPACK (f2c): elapsed time t=1.78125 s, 1 iters, t-(init.)=1.73828 s t(norm)=0.1745, mflops=28.6532 (err=1.1e-13) FFTW_MEASURE plan: (cost = 6.621094e-01) FFTW_TWIDDLE 32 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.16895 s, 2 iters, t-(init.)=1.0791 s t(norm)=0.0541638, mflops=92.3126 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 5.033165e+06) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.26172 s, 2 iters, t-(init.)=1.17285 s t(norm)=0.0588694, mflops=84.9338 (err=1.1e-13) 16. Frigo-old: elapsed time t=1.0957 s, 1 iters, t-(init.)=1.05273 s t(norm)=0.105681, mflops=47.3124 (err=1.1e-13) 17. Green: elapsed time t=1.05664 s, 1 iters, t-(init.)=1.0127 s t(norm)=0.101661, mflops=49.183 (err=1.1e-13) 18. GSL: elapsed time t=1.25781 s, 1 iters, t-(init.)=1.21387 s t(norm)=0.121856, mflops=41.032 (err=1.1e-13) 19. GSL DIT: elapsed time t=2.82813 s, 1 iters, t-(init.)=2.7832 s t(norm)=0.279397, mflops=17.8957 (err=1.1e-13) 20. GSL DIF: elapsed time t=2.72852 s, 1 iters, t-(init.)=2.68359 s t(norm)=0.269397, mflops=18.5599 (err=1.1e-13) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=2.11816 s, 1 iters, t-(init.)=2.07324 s t(norm)=0.208126, mflops=24.0239 (err=1.1e-13) 23. 
Mayer (simple): elapsed time t=2.09863 s, 1 iters, t-(init.)=2.05371 s t(norm)=0.206165, mflops=24.2524 24. Mayer (lookup): elapsed time t=2.18555 s, 1 iters, t-(init.)=2.14355 s t(norm)=0.215185, mflops=23.2359 (err=1.1e-13) 25. Monro: elapsed time t=3.12012 s, 1 iters, t-(init.)=3.0752 s t(norm)=0.308709, mflops=16.1965 (err=1.3e-13) 26. NAPACK (f2c): elapsed time t=3.36133 s, 1 iters, t-(init.)=3.31836 s t(norm)=0.333119, mflops=15.0096 (err=7.9e-12) 27. Nielsen: elapsed time t=2.69336 s, 1 iters, t-(init.)=2.64746 s t(norm)=0.26577, mflops=18.8133 (err=4.5e-12) 28. NR (C): elapsed time t=2.70703 s, 1 iters, t-(init.)=2.66211 s t(norm)=0.267241, mflops=18.7097 (err=1.1e-13) 29. Ooura (C): elapsed time t=1.48047 s, 2 iters, t-(init.)=1.39063 s t(norm)=0.0698002, mflops=71.6331 (err=1.1e-13) 30. Ooura (F): elapsed time t=1.01465 s, 1 iters, t-(init.)=0.969727 s t(norm)=0.0973477, mflops=51.3623 (err=1.1e-13) 31. Ransom: elapsed time t=1.93848 s, 2 iters, t-(init.)=1.84863 s t(norm)=0.0927891, mflops=53.8856 (err=1.1e-13) 32. SCIPORT: elapsed time t=4.60449 s, 1 iters, t-(init.)=4.56055 s t(norm)=0.457819, mflops=10.9214 (err=1.1e-13) 33. Singleton: elapsed time t=2.11914 s, 1 iters, t-(init.)=2.07422 s t(norm)=0.208224, mflops=24.0126 (err=1.6e-13) 34. Singleton (f2c): elapsed time t=1.7998 s, 1 iters, t-(init.)=1.75391 s t(norm)=0.176069, mflops=28.398 (err=1.6e-13) 35. Sorensen: elapsed time t=2.29199 s, 1 iters, t-(init.)=2.24805 s t(norm)=0.225674, mflops=22.1558 (err=1.1e-13) 36. Sorensen DIT: elapsed time t=4.07715 s, 1 iters, t-(init.)=4.0332 s t(norm)=0.40488, mflops=12.3493 (err=1.1e-13) 37. Temperton: elapsed time t=1.94336 s, 1 iters, t-(init.)=1.89941 s t(norm)=0.190676, mflops=26.2225 (err=1.1e-13) 38. Temperton (f2c): elapsed time t=1.93848 s, 1 iters, t-(init.)=1.89453 s t(norm)=0.190186, mflops=26.2901 (err=1.1e-13) 39. 
Valkenburg: elapsed time t=7.44922 s, 1 iters, t-(init.)=7.40625 s t(norm)=0.74349, mflops=6.72504 (err=1.1e-13) Top mflops for N=524288 = 92.3126 Normalized results and averages for N=524288: fft 0: mflops = 18.6414 (norm. = 0.201937), norm. avg. (of 19) = 0.307821 fft 1: mflops = 18.4525 (norm. = 0.199891), norm. avg. (of 19) = 0.274825 fft 2: mflops = 14.0002 (norm. = 0.151661), norm. avg. (of 19) = 0.199229 fft 3: mflops = 23.3315 (norm. = 0.252745), norm. avg. (of 19) = 0.129576 fft 4: mflops = 16.7167 (norm. = 0.181088), norm. avg. (of 19) = 0.126803 fft 5: mflops = 10.3664 (norm. = 0.112297), norm. avg. (of 19) = 0.0563032 fft 6: mflops = 45.2152 (norm. = 0.489805), norm. avg. (of 19) = 0.426615 fft 7: mflops = 25.1369 (norm. = 0.272302), norm. avg. (of 19) = 0.178948 fft 8: mflops = 12.1958 (norm. = 0.132114), norm. avg. (of 19) = 0.117249 fft 9: mflops = 78.4054 (norm. = 0.849347), norm. avg. (of 19) = 0.581877 fft 10: mflops = 78.4054 (norm. = 0.849347), norm. avg. (of 19) = 0.591852 fft 11: mflops = 13.145 (norm. = 0.142397), norm. avg. (of 18) = 0.127391 fft 12: mflops = 28.5091 (norm. = 0.308832), norm. avg. (of 19) = 0.187594 fft 13: mflops = 28.6532 (norm. = 0.310393), norm. avg. (of 19) = 0.187198 fft 14: mflops = 92.3126 (norm. = 1), norm. avg. (of 19) = 0.904019 fft 15: mflops = 84.9338 (norm. = 0.920067), norm. avg. (of 19) = 0.888521 fft 16: mflops = 47.3124 (norm. = 0.512523), norm. avg. (of 19) = 0.632296 fft 17: mflops = 49.183 (norm. = 0.532787), norm. avg. (of 17) = 0.730686 fft 18: mflops = 41.032 (norm. = 0.444489), norm. avg. (of 19) = 0.328908 fft 19: mflops = 17.8957 (norm. = 0.19386), norm. avg. (of 19) = 0.187508 fft 20: mflops = 18.5599 (norm. = 0.201055), norm. avg. (of 19) = 0.213673 fft 21: mflops = -1 (norm. = -0.0108328), norm. avg. (of 12) = 0.463041 fft 22: mflops = 24.0239 (norm. = 0.260245), norm. avg. (of 18) = 0.31569 fft 23: mflops = 24.2524 (norm. = 0.26272), norm. avg. (of 18) = 0.365506 fft 24: mflops = 23.2359 (norm. 
= 0.251708), norm. avg. (of 18) = 0.352955
fft 25: mflops = 16.1965 (norm. = 0.175453), norm. avg. (of 18) = 0.165569
fft 26: mflops = 15.0096 (norm. = 0.162596), norm. avg. (of 19) = 0.097483
fft 27: mflops = 18.8133 (norm. = 0.203799), norm. avg. (of 19) = 0.1526
fft 28: mflops = 18.7097 (norm. = 0.202678), norm. avg. (of 19) = 0.215514
fft 29: mflops = 71.6331 (norm. = 0.775983), norm. avg. (of 19) = 0.7717
fft 30: mflops = 51.3623 (norm. = 0.556395), norm. avg. (of 19) = 0.388486
fft 31: mflops = 53.8856 (norm. = 0.58373), norm. avg. (of 18) = 0.306341
fft 32: mflops = 10.9214 (norm. = 0.118308), norm. avg. (of 18) = 0.0949941
fft 33: mflops = 24.0126 (norm. = 0.260122), norm. avg. (of 19) = 0.215235
fft 34: mflops = 28.398 (norm. = 0.307628), norm. avg. (of 19) = 0.341457
fft 35: mflops = 22.1558 (norm. = 0.240009), norm. avg. (of 19) = 0.230653
fft 36: mflops = 12.3493 (norm. = 0.133777), norm. avg. (of 19) = 0.111817
fft 37: mflops = 26.2225 (norm. = 0.284062), norm. avg. (of 19) = 0.20295
fft 38: mflops = 26.2901 (norm. = 0.284794), norm. avg. (of 19) = 0.204965
fft 39: mflops = 6.72504 (norm. = 0.0728507), norm. avg. (of 19) = 0.0406778
------------------------------------------------------
@@@@ bench.1d.np2.log
Benchmarking for sizes:
6 (0.000686646 MB)
9 (0.000915527 MB)
12 (0.00114441 MB)
15 (0.00137329 MB)
18 (0.00180054 MB)
24 (0.0022583 MB)
36 (0.0032959 MB)
80 (0.00738525 MB)
108 (0.00994873 MB)
210 (0.0192261 MB)
504 (0.0461426 MB)
1000 (0.0916748 MB)
1960 (0.179749 MB)
4725 (0.437393 MB)
10368 (0.960205 MB)
27000 (2.48291 MB)
75600 (6.98975 MB)
165375 (15.3664 MB)
362880 (38.6829 MB)
Maximum array size = 720720
Benchmarking FFTs:
0. Brenner
1. CWP (min N)
2. CWP (best N)
3. FFTPACK
4. FFTPACK (f2c)
5. FFTW
6. FFTW_ESTIMATE
7. Frigo-old
8. GSL
9. Nielsen
10. Singleton
11. Singleton (f2c)
12. Temperton
13. Temperton (f2c)
14. Valkenburg
Computing normalized averages (15 transforms).
Benchmarking for array size = 6:
0.
Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.78809 s, 524288 iters, t-(init.)=1.7373 s t(norm)=0.213649, mflops=23.4029 2. CWP (best N) (N=15): elapsed time t=1.25879 s, 262144 iters, t-(init.)=1.21582 s t(norm)=0.299036, mflops=16.7204 3. FFTPACK: elapsed time t=1.24023 s, 524288 iters, t-(init.)=1.18945 s t(norm)=0.146276, mflops=34.182 (err=1.2e-16) 4. FFTPACK (f2c): elapsed time t=1.25 s, 524288 iters, t-(init.)=1.19922 s t(norm)=0.147477, mflops=33.9037 (err=2.2e-16) FFTW_MEASURE plan: (cost = 3.241003e-07) FFTW_NOTW 6 5. FFTW: elapsed time t=1.54199 s, 4194304 iters, t-(init.)=1.1377 s t(norm)=0.0174888, mflops=285.897 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 1.560000e+01) FFTW_NOTW 6 6. FFTW_ESTIMATE: elapsed time t=1.83984 s, 4194304 iters, t-(init.)=1.43555 s t(norm)=0.0220674, mflops=226.578 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.69043 s, 524288 iters, t-(init.)=1.64063 s t(norm)=0.201759, mflops=24.782 (err=1.6e-16) 8. GSL: elapsed time t=1.06738 s, 524288 iters, t-(init.)=1.0166 s t(norm)=0.125019, mflops=39.994 (err=9.5e-17) 9. Nielsen: elapsed time t=1.31152 s, 131072 iters, t-(init.)=1.29883 s t(norm)=0.638905, mflops=7.82589 (err=7.0e-16) 10. Singleton: elapsed time t=1.5332 s, 262144 iters, t-(init.)=1.50879 s t(norm)=0.371093, mflops=13.4737 (err=1.3e-16) 11. Singleton (f2c): elapsed time t=1.16016 s, 262144 iters, t-(init.)=1.13574 s t(norm)=0.279341, mflops=17.8993 (err=1.3e-16) 12. Temperton: elapsed time t=1.25586 s, 262144 iters, t-(init.)=1.23047 s t(norm)=0.302639, mflops=16.5213 (err=5.9e-16) 13. Temperton (f2c): elapsed time t=1.12305 s, 262144 iters, t-(init.)=1.09766 s t(norm)=0.269973, mflops=18.5203 (err=1.1e-16) 14. Valkenburg: elapsed time t=1.99316 s, 262144 iters, t-(init.)=1.96875 s t(norm)=0.484223, mflops=10.3258 (err=3.5e-16) Top mflops for N=6 = 285.897 Normalized results and averages for N=6: fft 0: mflops = -1 (norm. = -0.00349776), norm. avg. 
(of 0) = -1 fft 1: mflops = 23.4029 (norm. = 0.0818578), norm. avg. (of 1) = 0.0818578 fft 2: mflops = 16.7204 (norm. = 0.0584839), norm. avg. (of 1) = 0.0584839 fft 3: mflops = 34.182 (norm. = 0.119561), norm. avg. (of 1) = 0.119561 fft 4: mflops = 33.9037 (norm. = 0.118587), norm. avg. (of 1) = 0.118587 fft 5: mflops = 285.897 (norm. = 1), norm. avg. (of 1) = 1 fft 6: mflops = 226.578 (norm. = 0.792517), norm. avg. (of 1) = 0.792517 fft 7: mflops = 24.782 (norm. = 0.0866815), norm. avg. (of 1) = 0.0866815 fft 8: mflops = 39.994 (norm. = 0.13989), norm. avg. (of 1) = 0.13989 fft 9: mflops = 7.82589 (norm. = 0.0273731), norm. avg. (of 1) = 0.0273731 fft 10: mflops = 13.4737 (norm. = 0.0471278), norm. avg. (of 1) = 0.0471278 fft 11: mflops = 17.8993 (norm. = 0.0626075), norm. avg. (of 1) = 0.0626075 fft 12: mflops = 16.5213 (norm. = 0.0577877), norm. avg. (of 1) = 0.0577877 fft 13: mflops = 18.5203 (norm. = 0.0647798), norm. avg. (of 1) = 0.0647798 fft 14: mflops = 10.3258 (norm. = 0.0361173), norm. avg. (of 1) = 0.0361173 Benchmarking for array size = 9: 0. Brenner: elapsed time t=1.20215 s, 65536 iters, t-(init.)=1.18848 s t(norm)=0.635652, mflops=7.86594 (err=3.9e-16) 1. CWP (min N): elapsed time t=1.61719 s, 524288 iters, t-(init.)=1.55469 s t(norm)=0.10394, mflops=48.1048 2. CWP (best N) (N=15): elapsed time t=1.25879 s, 262144 iters, t-(init.)=1.2168 s t(norm)=0.1627, mflops=30.7315 3. FFTPACK: elapsed time t=1.73535 s, 524288 iters, t-(init.)=1.6709 s t(norm)=0.111709, mflops=44.7591 (err=2.5e-16) 4. FFTPACK (f2c): elapsed time t=1.76074 s, 524288 iters, t-(init.)=1.69629 s t(norm)=0.113407, mflops=44.0891 (err=2.4e-16) FFTW_MEASURE plan: (cost = 6.463379e-07) FFTW_NOTW 9 5. FFTW: elapsed time t=1.42285 s, 2097152 iters, t-(init.)=1.16797 s t(norm)=0.0195213, mflops=256.13 (err=2.3e-16) FFTW_ESTIMATE plan: (cost = 9.900000e+00) FFTW_NOTW 9 6. 
FFTW_ESTIMATE: elapsed time t=1.4248 s, 2097152 iters, t-(init.)=1.1709 s t(norm)=0.0195703, mflops=255.489 (err=2.3e-16) 7. Frigo-old: elapsed time t=1.53516 s, 262144 iters, t-(init.)=1.50293 s t(norm)=0.200959, mflops=24.8807 (err=4.4e-16) 8. GSL: elapsed time t=1.97559 s, 524288 iters, t-(init.)=1.91211 s t(norm)=0.127835, mflops=39.1128 (err=2.5e-16) 9. Nielsen: elapsed time t=1.29492 s, 131072 iters, t-(init.)=1.2793 s t(norm)=0.342113, mflops=14.615 (err=1.4e-15) 10. Singleton: elapsed time t=1.45703 s, 262144 iters, t-(init.)=1.42578 s t(norm)=0.190643, mflops=26.227 (err=2.7e-16) 11. Singleton (f2c): elapsed time t=1.12891 s, 262144 iters, t-(init.)=1.09766 s t(norm)=0.146769, mflops=34.0671 (err=2.7e-16) 12. Temperton: elapsed time t=1.49707 s, 262144 iters, t-(init.)=1.46582 s t(norm)=0.195997, mflops=25.5106 (err=1.6e-15) 13. Temperton (f2c): elapsed time t=1.3584 s, 262144 iters, t-(init.)=1.32617 s t(norm)=0.177324, mflops=28.1969 (err=2.5e-16) 14. Valkenburg: elapsed time t=1.78906 s, 131072 iters, t-(init.)=1.77344 s t(norm)=0.474258, mflops=10.5428 (err=3.9e-16) Top mflops for N=9 = 256.13 Normalized results and averages for N=9: fft 0: mflops = 7.86594 (norm. = 0.0307108), norm. avg. (of 1) = 0.0307108 fft 1: mflops = 48.1048 (norm. = 0.187814), norm. avg. (of 2) = 0.134836 fft 2: mflops = 30.7315 (norm. = 0.119984), norm. avg. (of 2) = 0.0892339 fft 3: mflops = 44.7591 (norm. = 0.174752), norm. avg. (of 2) = 0.147156 fft 4: mflops = 44.0891 (norm. = 0.172136), norm. avg. (of 2) = 0.145361 fft 5: mflops = 256.13 (norm. = 1), norm. avg. (of 2) = 1 fft 6: mflops = 255.489 (norm. = 0.997498), norm. avg. (of 2) = 0.895007 fft 7: mflops = 24.8807 (norm. = 0.097141), norm. avg. (of 2) = 0.0919113 fft 8: mflops = 39.1128 (norm. = 0.152707), norm. avg. (of 2) = 0.146298 fft 9: mflops = 14.615 (norm. = 0.0570611), norm. avg. (of 2) = 0.0422171 fft 10: mflops = 26.227 (norm. = 0.102397), norm. avg. (of 2) = 0.0747625 fft 11: mflops = 34.0671 (norm. 
= 0.133007), norm. avg. (of 2) = 0.0978073 fft 12: mflops = 25.5106 (norm. = 0.0996003), norm. avg. (of 2) = 0.078694 fft 13: mflops = 28.1969 (norm. = 0.110088), norm. avg. (of 2) = 0.0874341 fft 14: mflops = 10.5428 (norm. = 0.0411619), norm. avg. (of 2) = 0.0386396 Benchmarking for array size = 12: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.12109 s, 262144 iters, t-(init.)=1.08398 s t(norm)=0.0961208, mflops=52.0179 2. CWP (best N) (N=15): elapsed time t=1.25293 s, 262144 iters, t-(init.)=1.20996 s t(norm)=0.107292, mflops=46.602 3. FFTPACK: elapsed time t=1.07813 s, 262144 iters, t-(init.)=1.04102 s t(norm)=0.0923106, mflops=54.165 (err=2.8e-16) 4. FFTPACK (f2c): elapsed time t=1.09277 s, 262144 iters, t-(init.)=1.05566 s t(norm)=0.0936095, mflops=53.4134 (err=3.3e-16) FFTW_MEASURE plan: (cost = 6.388873e-07) FFTW_NOTW 12 5. FFTW: elapsed time t=1.44531 s, 2097152 iters, t-(init.)=1.14844 s t(norm)=0.0127295, mflops=392.788 (err=2.5e-16) FFTW_ESTIMATE plan: (cost = 1.680000e+01) FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.58008 s, 2097152 iters, t-(init.)=1.2832 s t(norm)=0.0142233, mflops=351.536 (err=2.5e-16) 7. Frigo-old: elapsed time t=1.40527 s, 262144 iters, t-(init.)=1.36816 s t(norm)=0.12132, mflops=41.2133 (err=3.0e-16) 8. GSL: elapsed time t=1.01465 s, 262144 iters, t-(init.)=0.977539 s t(norm)=0.0866819, mflops=57.6822 (err=2.8e-16) 9. Nielsen: elapsed time t=1.45508 s, 131072 iters, t-(init.)=1.43652 s t(norm)=0.254763, mflops=19.6261 (err=5.3e-16) 10. Singleton: elapsed time t=1.08594 s, 131072 iters, t-(init.)=1.06836 s t(norm)=0.18947, mflops=26.3893 (err=2.1e-16) 11. Singleton (f2c): elapsed time t=1.64453 s, 262144 iters, t-(init.)=1.60742 s t(norm)=0.142536, mflops=35.0789 (err=2.1e-16) 12. Temperton: elapsed time t=1.64355 s, 262144 iters, t-(init.)=1.60742 s t(norm)=0.142536, mflops=35.0789 (err=7.4e-16) 13. 
Temperton (f2c): elapsed time t=1.55566 s, 262144 iters, t-(init.)=1.51855 s t(norm)=0.134656, mflops=37.1317 (err=1.8e-16) 14. Valkenburg: elapsed time t=1.34375 s, 65536 iters, t-(init.)=1.33496 s t(norm)=0.473503, mflops=10.5596 (err=4.1e-16) Top mflops for N=12 = 392.788 Normalized results and averages for N=12: fft 0: mflops = -1 (norm. = -0.0025459), norm. avg. (of 1) = 0.0307108 fft 1: mflops = 52.0179 (norm. = 0.132432), norm. avg. (of 3) = 0.134035 fft 2: mflops = 46.602 (norm. = 0.118644), norm. avg. (of 3) = 0.0990373 fft 3: mflops = 54.165 (norm. = 0.137899), norm. avg. (of 3) = 0.14407 fft 4: mflops = 53.4134 (norm. = 0.135985), norm. avg. (of 3) = 0.142236 fft 5: mflops = 392.788 (norm. = 1), norm. avg. (of 3) = 1 fft 6: mflops = 351.536 (norm. = 0.894977), norm. avg. (of 3) = 0.894997 fft 7: mflops = 41.2133 (norm. = 0.104925), norm. avg. (of 3) = 0.0962492 fft 8: mflops = 57.6822 (norm. = 0.146853), norm. avg. (of 3) = 0.146483 fft 9: mflops = 19.6261 (norm. = 0.049966), norm. avg. (of 3) = 0.0448001 fft 10: mflops = 26.3893 (norm. = 0.0671846), norm. avg. (of 3) = 0.0722366 fft 11: mflops = 35.0789 (norm. = 0.0893074), norm. avg. (of 3) = 0.094974 fft 12: mflops = 35.0789 (norm. = 0.0893074), norm. avg. (of 3) = 0.0822318 fft 13: mflops = 37.1317 (norm. = 0.0945338), norm. avg. (of 3) = 0.0898006 fft 14: mflops = 10.5596 (norm. = 0.0268837), norm. avg. (of 3) = 0.034721 Benchmarking for array size = 15: 0. Brenner: elapsed time t=1.8125 s, 65536 iters, t-(init.)=1.7959 s t(norm)=0.467605, mflops=10.6928 (err=4.0e-16) 1. CWP (min N): elapsed time t=1.25488 s, 262144 iters, t-(init.)=1.21289 s t(norm)=0.0789513, mflops=63.3302 2. CWP (best N): elapsed time t=1.25293 s, 262144 iters, t-(init.)=1.20996 s t(norm)=0.0787606, mflops=63.4835 3. FFTPACK: elapsed time t=1.42969 s, 262144 iters, t-(init.)=1.38672 s t(norm)=0.0902664, mflops=55.3916 (err=2.6e-16) 4. 
FFTPACK (f2c): elapsed time t=1.4502 s, 262144 iters, t-(init.)=1.4082 s t(norm)=0.0916649, mflops=54.5465 (err=3.8e-16) FFTW_MEASURE plan: (cost = 1.084059e-06) FFTW_NOTW 15 5. FFTW: elapsed time t=1.26465 s, 1048576 iters, t-(init.)=1.09375 s t(norm)=0.017799, mflops=280.915 (err=2.7e-16) FFTW_ESTIMATE plan: (cost = 5.250000e+01) FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.3291 s, 1048576 iters, t-(init.)=1.15918 s t(norm)=0.0188638, mflops=265.058 (err=2.7e-16) 7. Frigo-old: elapsed time t=1.25293 s, 131072 iters, t-(init.)=1.23145 s t(norm)=0.160318, mflops=31.188 (err=3.1e-16) 8. GSL: elapsed time t=1.79492 s, 262144 iters, t-(init.)=1.75098 s t(norm)=0.113977, mflops=43.8684 (err=2.3e-16) 9. Nielsen: elapsed time t=1.63086 s, 131072 iters, t-(init.)=1.60938 s t(norm)=0.20952, mflops=23.8641 (err=4.2e-15) 10. Singleton: elapsed time t=1.36914 s, 131072 iters, t-(init.)=1.34766 s t(norm)=0.175447, mflops=28.4986 (err=3.1e-16) 11. Singleton (f2c): elapsed time t=1.85645 s, 262144 iters, t-(init.)=1.81348 s t(norm)=0.118046, mflops=42.3565 (err=3.1e-16) 12. Temperton: elapsed time t=1.91797 s, 262144 iters, t-(init.)=1.87402 s t(norm)=0.121987, mflops=40.9881 (err=9.9e-16) 13. Temperton (f2c): elapsed time t=1.74121 s, 262144 iters, t-(init.)=1.69824 s t(norm)=0.110545, mflops=45.2306 (err=2.6e-16) 14. Valkenburg: elapsed time t=1.03809 s, 32768 iters, t-(init.)=1.03223 s t(norm)=0.53753, mflops=9.30181 (err=2.9e-16) Top mflops for N=15 = 280.915 Normalized results and averages for N=15: fft 0: mflops = 10.6928 (norm. = 0.0380642), norm. avg. (of 2) = 0.0343875 fft 1: mflops = 63.3302 (norm. = 0.225443), norm. avg. (of 4) = 0.156887 fft 2: mflops = 63.4835 (norm. = 0.225989), norm. avg. (of 4) = 0.130775 fft 3: mflops = 55.3916 (norm. = 0.197183), norm. avg. (of 4) = 0.157349 fft 4: mflops = 54.5465 (norm. = 0.194175), norm. avg. (of 4) = 0.155221 fft 5: mflops = 280.915 (norm. = 1), norm. avg. (of 4) = 1 fft 6: mflops = 265.058 (norm. = 0.943555), norm. avg. 
(of 4) = 0.907137 fft 7: mflops = 31.188 (norm. = 0.111023), norm. avg. (of 4) = 0.0999426 fft 8: mflops = 43.8684 (norm. = 0.156163), norm. avg. (of 4) = 0.148903 fft 9: mflops = 23.8641 (norm. = 0.0849515), norm. avg. (of 4) = 0.0548379 fft 10: mflops = 28.4986 (norm. = 0.101449), norm. avg. (of 4) = 0.0795398 fft 11: mflops = 42.3565 (norm. = 0.150781), norm. avg. (of 4) = 0.108926 fft 12: mflops = 40.9881 (norm. = 0.145909), norm. avg. (of 4) = 0.0981512 fft 13: mflops = 45.2306 (norm. = 0.161012), norm. avg. (of 4) = 0.107604 fft 14: mflops = 9.30181 (norm. = 0.0331126), norm. avg. (of 4) = 0.0343189 Benchmarking for array size = 18: 0. Brenner: elapsed time t=1.25 s, 32768 iters, t-(init.)=1.24121 s t(norm)=0.504655, mflops=9.90775 (err=4.9e-16) 1. CWP (min N): elapsed time t=1.44434 s, 262144 iters, t-(init.)=1.39551 s t(norm)=0.0709237, mflops=70.4983 2. CWP (best N) (N=28): elapsed time t=1.55371 s, 262144 iters, t-(init.)=1.48535 s t(norm)=0.0754899, mflops=66.2341 3. FFTPACK: elapsed time t=1.27148 s, 131072 iters, t-(init.)=1.24707 s t(norm)=0.126759, mflops=39.4448 (err=2.1e-16) 4. FFTPACK (f2c): elapsed time t=1.27441 s, 131072 iters, t-(init.)=1.25 s t(norm)=0.127057, mflops=39.3523 (err=2.8e-16) FFTW_MEASURE plan: (cost = 1.735985e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6 5. FFTW: elapsed time t=1.86914 s, 1048576 iters, t-(init.)=1.67578 s t(norm)=0.021292, mflops=234.83 (err=2.4e-16) FFTW_ESTIMATE plan: (cost = 1.026000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.12207 s, 524288 iters, t-(init.)=1.02441 s t(norm)=0.0260318, mflops=192.072 (err=2.8e-16) 7. Frigo-old: elapsed time t=1.76563 s, 131072 iters, t-(init.)=1.74121 s t(norm)=0.176987, mflops=28.2507 (err=4.6e-16) 8. GSL: elapsed time t=1.39063 s, 262144 iters, t-(init.)=1.3418 s t(norm)=0.068194, mflops=73.3202 (err=2.4e-16) 9. Nielsen: elapsed time t=1.3418 s, 65536 iters, t-(init.)=1.3291 s t(norm)=0.270195, mflops=18.5051 (err=1.6e-15) 10. 
Singleton: elapsed time t=1.37598 s, 131072 iters, t-(init.)=1.35156 s t(norm)=0.137381, mflops=36.3952 (err=3.0e-16) 11. Singleton (f2c): elapsed time t=1.93164 s, 262144 iters, t-(init.)=1.88281 s t(norm)=0.09569, mflops=52.2521 (err=3.0e-16) 12. Temperton: elapsed time t=1.38477 s, 131072 iters, t-(init.)=1.35938 s t(norm)=0.138175, mflops=36.1861 (err=1.0e-15) 13. Temperton (f2c): elapsed time t=1.36426 s, 131072 iters, t-(init.)=1.33984 s t(norm)=0.136189, mflops=36.7136 (err=2.2e-16) 14. Valkenburg: elapsed time t=1.17188 s, 32768 iters, t-(init.)=1.16504 s t(norm)=0.473685, mflops=10.5555 (err=4.5e-16) Top mflops for N=18 = 234.83 Normalized results and averages for N=18: fft 0: mflops = 9.90775 (norm. = 0.0421912), norm. avg. (of 3) = 0.0369887 fft 1: mflops = 70.4983 (norm. = 0.30021), norm. avg. (of 5) = 0.185551 fft 2: mflops = 66.2341 (norm. = 0.282051), norm. avg. (of 5) = 0.16103 fft 3: mflops = 39.4448 (norm. = 0.167972), norm. avg. (of 5) = 0.159473 fft 4: mflops = 39.3523 (norm. = 0.167578), norm. avg. (of 5) = 0.157692 fft 5: mflops = 234.83 (norm. = 1), norm. avg. (of 5) = 1 fft 6: mflops = 192.072 (norm. = 0.817922), norm. avg. (of 5) = 0.889294 fft 7: mflops = 28.2507 (norm. = 0.120303), norm. avg. (of 5) = 0.104015 fft 8: mflops = 73.3202 (norm. = 0.312227), norm. avg. (of 5) = 0.181568 fft 9: mflops = 18.5051 (norm. = 0.0788024), norm. avg. (of 5) = 0.0596308 fft 10: mflops = 36.3952 (norm. = 0.154986), norm. avg. (of 5) = 0.0946289 fft 11: mflops = 52.2521 (norm. = 0.22251), norm. avg. (of 5) = 0.131643 fft 12: mflops = 36.1861 (norm. = 0.154095), norm. avg. (of 5) = 0.10934 fft 13: mflops = 36.7136 (norm. = 0.156341), norm. avg. (of 5) = 0.117351 fft 14: mflops = 10.5555 (norm. = 0.0449497), norm. avg. (of 5) = 0.036445 Benchmarking for array size = 24: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.46387 s, 262144 iters, t-(init.)=1.40234 s t(norm)=0.0486147, mflops=102.85 2. 
CWP (best N) (N=28): elapsed time t=1.55469 s, 262144 iters, t-(init.)=1.48633 s t(norm)=0.0515262, mflops=97.0381 3. FFTPACK: elapsed time t=1.61914 s, 131072 iters, t-(init.)=1.58887 s t(norm)=0.110162, mflops=45.3878 (err=2.8e-16) 4. FFTPACK (f2c): elapsed time t=1.62988 s, 131072 iters, t-(init.)=1.59961 s t(norm)=0.110906, mflops=45.083 (err=3.3e-16) FFTW_MEASURE plan: (cost = 2.019107e-06) FFTW_TWIDDLE 4 FFTW_NOTW 6 5. FFTW: elapsed time t=1.13086 s, 524288 iters, t-(init.)=1.00977 s t(norm)=0.0175026, mflops=285.671 (err=2.6e-16) FFTW_ESTIMATE plan: (cost = 1.176000e+02) FFTW_TWIDDLE 3 FFTW_NOTW 8 6. FFTW_ESTIMATE: elapsed time t=1.13867 s, 524288 iters, t-(init.)=1.01563 s t(norm)=0.0176042, mflops=284.023 (err=2.9e-16) 7. Frigo-old: elapsed time t=1.35156 s, 131072 iters, t-(init.)=1.32129 s t(norm)=0.0916096, mflops=54.5794 (err=3.6e-16) 8. GSL: elapsed time t=1.46582 s, 262144 iters, t-(init.)=1.4043 s t(norm)=0.0486824, mflops=102.707 (err=2.9e-16) 9. Nielsen: elapsed time t=1.09668 s, 65536 iters, t-(init.)=1.08203 s t(norm)=0.150042, mflops=33.324 (err=1.4e-15) 10. Singleton: elapsed time t=1.11914 s, 65536 iters, t-(init.)=1.10449 s t(norm)=0.153157, mflops=32.6463 (err=3.2e-16) 11. Singleton (f2c): elapsed time t=1.51367 s, 131072 iters, t-(init.)=1.4834 s t(norm)=0.102849, mflops=48.6149 (err=3.2e-16) 12. Temperton: elapsed time t=1.50488 s, 131072 iters, t-(init.)=1.47461 s t(norm)=0.10224, mflops=48.9046 (err=7.1e-16) 13. Temperton (f2c): elapsed time t=1.45117 s, 131072 iters, t-(init.)=1.4209 s t(norm)=0.0985158, mflops=50.7533 (err=3.0e-16) 14. Valkenburg: elapsed time t=1.72461 s, 32768 iters, t-(init.)=1.71777 s t(norm)=0.476397, mflops=10.4955 (err=4.6e-16) Top mflops for N=24 = 285.671 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.00350053), norm. avg. (of 3) = 0.0369887 fft 1: mflops = 102.85 (norm. = 0.360028), norm. avg. (of 6) = 0.214631 fft 2: mflops = 97.0381 (norm. = 0.339685), norm. avg. 
(of 6) = 0.190806 fft 3: mflops = 45.3878 (norm. = 0.158881), norm. avg. (of 6) = 0.159375 fft 4: mflops = 45.083 (norm. = 0.157814), norm. avg. (of 6) = 0.157713 fft 5: mflops = 285.671 (norm. = 1), norm. avg. (of 6) = 1 fft 6: mflops = 284.023 (norm. = 0.994231), norm. avg. (of 6) = 0.906783 fft 7: mflops = 54.5794 (norm. = 0.191057), norm. avg. (of 6) = 0.118522 fft 8: mflops = 102.707 (norm. = 0.359527), norm. avg. (of 6) = 0.211228 fft 9: mflops = 33.324 (norm. = 0.116652), norm. avg. (of 6) = 0.0691343 fft 10: mflops = 32.6463 (norm. = 0.114279), norm. avg. (of 6) = 0.097904 fft 11: mflops = 48.6149 (norm. = 0.170178), norm. avg. (of 6) = 0.138065 fft 12: mflops = 48.9046 (norm. = 0.171192), norm. avg. (of 6) = 0.119649 fft 13: mflops = 50.7533 (norm. = 0.177663), norm. avg. (of 6) = 0.127403 fft 14: mflops = 10.4955 (norm. = 0.0367396), norm. avg. (of 6) = 0.0364941 Benchmarking for array size = 36: 0. Brenner: elapsed time t=1.25684 s, 16384 iters, t-(init.)=1.25 s t(norm)=0.409924, mflops=12.1974 (err=6.6e-16) 1. CWP (min N): elapsed time t=1.97754 s, 262144 iters, t-(init.)=1.89258 s t(norm)=0.0387907, mflops=128.897 2. CWP (best N): elapsed time t=1.9541 s, 262144 iters, t-(init.)=1.87012 s t(norm)=0.0383303, mflops=130.445 3. FFTPACK: elapsed time t=1.22559 s, 65536 iters, t-(init.)=1.2041 s t(norm)=0.098718, mflops=50.6493 (err=5.0e-16) 4. FFTPACK (f2c): elapsed time t=1.23145 s, 65536 iters, t-(init.)=1.20996 s t(norm)=0.0991984, mflops=50.404 (err=5.2e-16) FFTW_MEASURE plan: (cost = 3.188848e-06) FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.74805 s, 524288 iters, t-(init.)=1.5791 s t(norm)=0.0161828, mflops=308.97 (err=4.7e-16) FFTW_ESTIMATE plan: (cost = 1.332000e+02) FFTW_TWIDDLE 4 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.93555 s, 524288 iters, t-(init.)=1.7666 s t(norm)=0.0181043, mflops=276.177 (err=5.2e-16) 7. Frigo-old: elapsed time t=1.73535 s, 65536 iters, t-(init.)=1.71484 s t(norm)=0.140591, mflops=35.5641 (err=5.2e-16) 8. 
GSL: elapsed time t=1.09961 s, 131072 iters, t-(init.)=1.05762 s t(norm)=0.0433543, mflops=115.329 (err=4.6e-16) 9. Nielsen: elapsed time t=1.90625 s, 65536 iters, t-(init.)=1.88477 s t(norm)=0.154522, mflops=32.3578 (err=8.5e-16) 10. Singleton: elapsed time t=1.14551 s, 65536 iters, t-(init.)=1.12402 s t(norm)=0.0921528, mflops=54.2577 (err=5.5e-16) 11. Singleton (f2c): elapsed time t=1.47266 s, 131072 iters, t-(init.)=1.42969 s t(norm)=0.0586063, mflops=85.315 (err=5.5e-16) 12. Temperton: elapsed time t=1.0332 s, 65536 iters, t-(init.)=1.0127 s t(norm)=0.0830256, mflops=60.2224 (err=1.2e-15) 13. Temperton (f2c): elapsed time t=1.03711 s, 65536 iters, t-(init.)=1.0166 s t(norm)=0.0833459, mflops=59.991 (err=4.8e-16) 14. Valkenburg: elapsed time t=1.50879 s, 16384 iters, t-(init.)=1.50293 s t(norm)=0.49287, mflops=10.1447 (err=5.7e-16) Top mflops for N=36 = 308.97 Normalized results and averages for N=36: fft 0: mflops = 12.1974 (norm. = 0.0394775), norm. avg. (of 4) = 0.0376109 fft 1: mflops = 128.897 (norm. = 0.417183), norm. avg. (of 7) = 0.243567 fft 2: mflops = 130.445 (norm. = 0.422193), norm. avg. (of 7) = 0.223861 fft 3: mflops = 50.6493 (norm. = 0.163929), norm. avg. (of 7) = 0.160025 fft 4: mflops = 50.404 (norm. = 0.163136), norm. avg. (of 7) = 0.158487 fft 5: mflops = 308.97 (norm. = 1), norm. avg. (of 7) = 1 fft 6: mflops = 276.177 (norm. = 0.893864), norm. avg. (of 7) = 0.904938 fft 7: mflops = 35.5641 (norm. = 0.115105), norm. avg. (of 7) = 0.118034 fft 8: mflops = 115.329 (norm. = 0.373269), norm. avg. (of 7) = 0.234376 fft 9: mflops = 32.3578 (norm. = 0.104728), norm. avg. (of 7) = 0.0742191 fft 10: mflops = 54.2577 (norm. = 0.175608), norm. avg. (of 7) = 0.109005 fft 11: mflops = 85.315 (norm. = 0.276127), norm. avg. (of 7) = 0.157788 fft 12: mflops = 60.2224 (norm. = 0.194913), norm. avg. (of 7) = 0.130401 fft 13: mflops = 59.991 (norm. = 0.194164), norm. avg. (of 7) = 0.13694 fft 14: mflops = 10.1447 (norm. = 0.0328338), norm. avg. 
(of 7) = 0.0359712 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.06934 s, 8192 iters, t-(init.)=1.06348 s t(norm)=0.256684, mflops=19.4792 (err=3.8e-16) 1. CWP (min N): elapsed time t=1.88672 s, 131072 iters, t-(init.)=1.80078 s t(norm)=0.0271651, mflops=184.06 2. CWP (best N) (N=84): elapsed time t=1.90234 s, 131072 iters, t-(init.)=1.8125 s t(norm)=0.0273419, mflops=182.87 3. FFTPACK: elapsed time t=1.44727 s, 32768 iters, t-(init.)=1.42578 s t(norm)=0.0860327, mflops=58.1175 (err=3.0e-16) 4. FFTPACK (f2c): elapsed time t=1.44824 s, 32768 iters, t-(init.)=1.42676 s t(norm)=0.0860916, mflops=58.0777 (err=3.6e-16) FFTW_MEASURE plan: (cost = 7.718801e-06) FFTW_TWIDDLE 5 FFTW_NOTW 16 5. FFTW: elapsed time t=1.05859 s, 131072 iters, t-(init.)=0.972656 s t(norm)=0.0146727, mflops=340.769 (err=3.3e-16) FFTW_ESTIMATE plan: (cost = 1.600000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.23633 s, 131072 iters, t-(init.)=1.15039 s t(norm)=0.0173539, mflops=288.12 (err=3.4e-16) 7. Frigo-old: elapsed time t=1.23926 s, 32768 iters, t-(init.)=1.21777 s t(norm)=0.0734813, mflops=68.0445 (err=3.3e-16) 8. GSL: elapsed time t=1.99121 s, 65536 iters, t-(init.)=1.94824 s t(norm)=0.0587792, mflops=85.0641 (err=3.1e-16) 9. Nielsen: elapsed time t=1.42188 s, 32768 iters, t-(init.)=1.40039 s t(norm)=0.0845006, mflops=59.1712 (err=5.7e-15) 10. Singleton: elapsed time t=1.20996 s, 32768 iters, t-(init.)=1.18848 s t(norm)=0.0717135, mflops=69.7218 (err=4.0e-16) 11. Singleton (f2c): elapsed time t=1.2959 s, 65536 iters, t-(init.)=1.25293 s t(norm)=0.0378013, mflops=132.27 (err=4.0e-16) 12. Temperton: elapsed time t=1.94043 s, 65536 iters, t-(init.)=1.89746 s t(norm)=0.0572471, mflops=87.3407 (err=4.0e-16) 13. Temperton (f2c): elapsed time t=1.02539 s, 32768 iters, t-(init.)=1.00391 s t(norm)=0.0605764, mflops=82.5404 (err=3.4e-16) 14. 
Valkenburg: elapsed time t=1.08887 s, 4096 iters, t-(init.)=1.08594 s t(norm)=0.52421, mflops=9.53816 (err=3.8e-16) Top mflops for N=80 = 340.769 Normalized results and averages for N=80: fft 0: mflops = 19.4792 (norm. = 0.0571625), norm. avg. (of 5) = 0.0415212 fft 1: mflops = 184.06 (norm. = 0.54013), norm. avg. (of 8) = 0.280637 fft 2: mflops = 182.87 (norm. = 0.536638), norm. avg. (of 8) = 0.262958 fft 3: mflops = 58.1175 (norm. = 0.170548), norm. avg. (of 8) = 0.161341 fft 4: mflops = 58.0777 (norm. = 0.170431), norm. avg. (of 8) = 0.15998 fft 5: mflops = 340.769 (norm. = 1), norm. avg. (of 8) = 1 fft 6: mflops = 288.12 (norm. = 0.845501), norm. avg. (of 8) = 0.897508 fft 7: mflops = 68.0445 (norm. = 0.199679), norm. avg. (of 8) = 0.128239 fft 8: mflops = 85.0641 (norm. = 0.249624), norm. avg. (of 8) = 0.236282 fft 9: mflops = 59.1712 (norm. = 0.17364), norm. avg. (of 8) = 0.0866467 fft 10: mflops = 69.7218 (norm. = 0.204601), norm. avg. (of 8) = 0.120954 fft 11: mflops = 132.27 (norm. = 0.388153), norm. avg. (of 8) = 0.186584 fft 12: mflops = 87.3407 (norm. = 0.256305), norm. avg. (of 8) = 0.146139 fft 13: mflops = 82.5404 (norm. = 0.242218), norm. avg. (of 8) = 0.1501 fft 14: mflops = 9.53816 (norm. = 0.0279901), norm. avg. (of 8) = 0.0349736 Benchmarking for array size = 108: 0. Brenner: elapsed time t=1.16211 s, 4096 iters, t-(init.)=1.1582 s t(norm)=0.387599, mflops=12.8999 (err=5.3e-16) 1. CWP (min N) (N=110): elapsed time t=1.54492 s, 65536 iters, t-(init.)=1.4873 s t(norm)=0.0311084, mflops=160.728 2. CWP (best N) (N=112): elapsed time t=1.19043 s, 65536 iters, t-(init.)=1.13184 s t(norm)=0.0236735, mflops=211.207 3. FFTPACK: elapsed time t=1.97168 s, 32768 iters, t-(init.)=1.94336 s t(norm)=0.0812945, mflops=61.5048 (err=3.6e-16) 4. FFTPACK (f2c): elapsed time t=1.97266 s, 32768 iters, t-(init.)=1.94434 s t(norm)=0.0813354, mflops=61.4739 (err=5.0e-16) FFTW_MEASURE plan: (cost = 1.287460e-05) FFTW_TWIDDLE 3 FFTW_TWIDDLE 3 FFTW_NOTW 12 5. 
FFTW: elapsed time t=1.87109 s, 131072 iters, t-(init.)=1.75781 s t(norm)=0.0183832, mflops=271.988 (err=3.2e-16) FFTW_ESTIMATE plan: (cost = 2.700000e+02) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.0293 s, 65536 iters, t-(init.)=0.972656 s t(norm)=0.0203441, mflops=245.772 (err=3.5e-16) 7. Frigo-old: elapsed time t=1.77539 s, 16384 iters, t-(init.)=1.76172 s t(norm)=0.147392, mflops=33.9231 (err=4.8e-16) 8. GSL: elapsed time t=1.97168 s, 65536 iters, t-(init.)=1.91602 s t(norm)=0.0400753, mflops=124.765 (err=3.4e-16) 9. Nielsen: elapsed time t=1.4707 s, 16384 iters, t-(init.)=1.45605 s t(norm)=0.121819, mflops=41.0444 (err=9.6e-16) 10. Singleton: elapsed time t=1.13379 s, 16384 iters, t-(init.)=1.11914 s t(norm)=0.0936317, mflops=53.4007 (err=4.1e-16) 11. Singleton (f2c): elapsed time t=1.27344 s, 32768 iters, t-(init.)=1.24414 s t(norm)=0.0520448, mflops=96.071 (err=4.1e-16) 12. Temperton: elapsed time t=1.72461 s, 32768 iters, t-(init.)=1.69238 s t(norm)=0.0707957, mflops=70.6258 (err=1.5e-15) 13. Temperton (f2c): elapsed time t=1.64453 s, 32768 iters, t-(init.)=1.61621 s t(norm)=0.0676093, mflops=73.9544 (err=3.9e-16) 14. Valkenburg: elapsed time t=1.48438 s, 4096 iters, t-(init.)=1.48047 s t(norm)=0.495447, mflops=10.0919 (err=5.6e-16) Top mflops for N=108 = 271.988 Normalized results and averages for N=108: fft 0: mflops = 12.8999 (norm. = 0.0474283), norm. avg. (of 6) = 0.0425058 fft 1: mflops = 160.728 (norm. = 0.590939), norm. avg. (of 9) = 0.315115 fft 2: mflops = 211.207 (norm. = 0.776531), norm. avg. (of 9) = 0.320022 fft 3: mflops = 61.5048 (norm. = 0.226131), norm. avg. (of 9) = 0.168539 fft 4: mflops = 61.4739 (norm. = 0.226017), norm. avg. (of 9) = 0.167318 fft 5: mflops = 271.988 (norm. = 1), norm. avg. (of 9) = 1 fft 6: mflops = 245.772 (norm. = 0.903614), norm. avg. (of 9) = 0.898187 fft 7: mflops = 33.9231 (norm. = 0.124723), norm. avg. (of 9) = 0.127849 fft 8: mflops = 124.765 (norm. = 0.458716), norm. avg. 
(of 9) = 0.260997 fft 9: mflops = 41.0444 (norm. = 0.150905), norm. avg. (of 9) = 0.0937866 fft 10: mflops = 53.4007 (norm. = 0.196335), norm. avg. (of 9) = 0.12933 fft 11: mflops = 96.071 (norm. = 0.353218), norm. avg. (of 9) = 0.205099 fft 12: mflops = 70.6258 (norm. = 0.259665), norm. avg. (of 9) = 0.158753 fft 13: mflops = 73.9544 (norm. = 0.271903), norm. avg. (of 9) = 0.163634 fft 14: mflops = 10.0919 (norm. = 0.0371042), norm. avg. (of 9) = 0.0352103 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1.04297 s, 2048 iters, t-(init.)=1.04004 s t(norm)=0.313478, mflops=15.9501 (err=8.2e-16) 1. CWP (min N): elapsed time t=1.35352 s, 32768 iters, t-(init.)=1.2998 s t(norm)=0.0244859, mflops=204.199 2. CWP (best N): elapsed time t=1.35254 s, 32768 iters, t-(init.)=1.2998 s t(norm)=0.0244859, mflops=204.199 3. FFTPACK: elapsed time t=1.67773 s, 8192 iters, t-(init.)=1.66406 s t(norm)=0.125391, mflops=39.8752 (err=5.4e-16) 4. FFTPACK (f2c): elapsed time t=1.7002 s, 8192 iters, t-(init.)=1.68652 s t(norm)=0.127084, mflops=39.3442 (err=6.7e-16) FFTW_MEASURE plan: (cost = 3.683567e-05) FFTW_TWIDDLE 6 FFTW_TWIDDLE 5 FFTW_NOTW 7 5. FFTW: elapsed time t=1.21484 s, 32768 iters, t-(init.)=1.16113 s t(norm)=0.0218735, mflops=228.587 (err=5.0e-16) FFTW_ESTIMATE plan: (cost = 1.092000e+03) FFTW_TWIDDLE 6 FFTW_TWIDDLE 5 FFTW_NOTW 7 6. FFTW_ESTIMATE: elapsed time t=1.20801 s, 32768 iters, t-(init.)=1.15527 s t(norm)=0.0217632, mflops=229.746 (err=5.0e-16) 7. Frigo-old: elapsed time t=1.84375 s, 8192 iters, t-(init.)=1.83008 s t(norm)=0.137901, mflops=36.2579 (err=6.6e-16) 8. GSL: elapsed time t=1.6709 s, 16384 iters, t-(init.)=1.6416 s t(norm)=0.0618493, mflops=80.8416 (err=7.1e-16) 9. Nielsen: elapsed time t=1.40137 s, 8192 iters, t-(init.)=1.3877 s t(norm)=0.104566, mflops=47.8166 (err=7.5e-15) 10. Singleton: elapsed time t=1.66992 s, 8192 iters, t-(init.)=1.65625 s t(norm)=0.124802, mflops=40.0633 (err=6.7e-16) 11. 
Singleton (f2c): elapsed time t=1.9248 s, 16384 iters, t-(init.)=1.89746 s t(norm)=0.0714892, mflops=69.9407 (err=6.7e-16) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.06543 s, 1024 iters, t-(init.)=1.06348 s t(norm)=0.641084, mflops=7.79928 (err=6.4e-16) Top mflops for N=210 = 229.746 Normalized results and averages for N=210: fft 0: mflops = 15.9501 (norm. = 0.0694249), norm. avg. (of 7) = 0.0463513 fft 1: mflops = 204.199 (norm. = 0.888805), norm. avg. (of 10) = 0.372484 fft 2: mflops = 204.199 (norm. = 0.888805), norm. avg. (of 10) = 0.3769 fft 3: mflops = 39.8752 (norm. = 0.173562), norm. avg. (of 10) = 0.169042 fft 4: mflops = 39.3442 (norm. = 0.171251), norm. avg. (of 10) = 0.167711 fft 5: mflops = 228.587 (norm. = 0.994954), norm. avg. (of 10) = 0.999495 fft 6: mflops = 229.746 (norm. = 1), norm. avg. (of 10) = 0.908368 fft 7: mflops = 36.2579 (norm. = 0.157818), norm. avg. (of 10) = 0.130846 fft 8: mflops = 80.8416 (norm. = 0.351874), norm. avg. (of 10) = 0.270085 fft 9: mflops = 47.8166 (norm. = 0.208128), norm. avg. (of 10) = 0.105221 fft 10: mflops = 40.0633 (norm. = 0.174381), norm. avg. (of 10) = 0.133835 fft 11: mflops = 69.9407 (norm. = 0.304426), norm. avg. (of 10) = 0.215032 fft 12: mflops = -1 (norm. = -0.00435263), norm. avg. (of 9) = 0.158753 fft 13: mflops = -1 (norm. = -0.00435263), norm. avg. (of 9) = 0.163634 fft 14: mflops = 7.79928 (norm. = 0.0339474), norm. avg. (of 10) = 0.035084 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.35449 s, 1024 iters, t-(init.)=1.35059 s t(norm)=0.291506, mflops=17.1523 (err=1.5e-15) 1. CWP (min N): elapsed time t=1.38379 s, 16384 iters, t-(init.)=1.32129 s t(norm)=0.0178239, mflops=280.522 2. CWP (best N): elapsed time t=1.38281 s, 16384 iters, t-(init.)=1.31934 s t(norm)=0.0177975, mflops=280.938 3. 
FFTPACK: elapsed time t=1.19336 s, 2048 iters, t-(init.)=1.18555 s t(norm)=0.127942, mflops=39.0802 (err=1.3e-15) 4. FFTPACK (f2c): elapsed time t=1.2002 s, 2048 iters, t-(init.)=1.19238 s t(norm)=0.12868, mflops=38.8561 (err=1.4e-15) FFTW_MEASURE plan: (cost = 9.965897e-05) FFTW_TWIDDLE 3 FFTW_TWIDDLE 3 FFTW_TWIDDLE 8 FFTW_NOTW 7 5. FFTW: elapsed time t=1.64258 s, 16384 iters, t-(init.)=1.58008 s t(norm)=0.0213149, mflops=234.578 (err=1.3e-15) FFTW_ESTIMATE plan: (cost = 1.612800e+03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 7 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.75977 s, 16384 iters, t-(init.)=1.69629 s t(norm)=0.0228826, mflops=218.507 (err=1.3e-15) 7. Frigo-old: elapsed time t=1.08398 s, 2048 iters, t-(init.)=1.07617 s t(norm)=0.116139, mflops=43.052 (err=1.4e-15) 8. GSL: elapsed time t=1.53125 s, 8192 iters, t-(init.)=1.5 s t(norm)=0.0404693, mflops=123.55 (err=1.4e-15) 9. Nielsen: elapsed time t=1.91309 s, 4096 iters, t-(init.)=1.89746 s t(norm)=0.102385, mflops=48.8351 (err=6.2e-15) 10. Singleton: elapsed time t=1.91504 s, 4096 iters, t-(init.)=1.89941 s t(norm)=0.102491, mflops=48.7849 (err=2.0e-15) 11. Singleton (f2c): elapsed time t=1.10156 s, 4096 iters, t-(init.)=1.08594 s t(norm)=0.0585962, mflops=85.3297 (err=2.0e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.36523 s, 512 iters, t-(init.)=1.36328 s t(norm)=0.588492, mflops=8.4963 (err=1.5e-15) Top mflops for N=504 = 280.938 Normalized results and averages for N=504: fft 0: mflops = 17.1523 (norm. = 0.0610539), norm. avg. (of 8) = 0.0481892 fft 1: mflops = 280.522 (norm. = 0.998522), norm. avg. (of 11) = 0.429397 fft 2: mflops = 280.938 (norm. = 1), norm. avg. (of 11) = 0.433546 fft 3: mflops = 39.0802 (norm. = 0.139106), norm. avg. (of 11) = 0.16632 fft 4: mflops = 38.8561 (norm. = 0.138309), norm. avg. (of 11) = 0.165038 fft 5: mflops = 234.578 (norm. = 0.834981), norm. avg. 
(of 11) = 0.98454 fft 6: mflops = 218.507 (norm. = 0.777778), norm. avg. (of 11) = 0.896496 fft 7: mflops = 43.052 (norm. = 0.153244), norm. avg. (of 11) = 0.132882 fft 8: mflops = 123.55 (norm. = 0.439779), norm. avg. (of 11) = 0.285512 fft 9: mflops = 48.8351 (norm. = 0.173829), norm. avg. (of 11) = 0.111458 fft 10: mflops = 48.7849 (norm. = 0.17365), norm. avg. (of 11) = 0.137455 fft 11: mflops = 85.3297 (norm. = 0.303732), norm. avg. (of 11) = 0.223095 fft 12: mflops = -1 (norm. = -0.00355951), norm. avg. (of 9) = 0.158753 fft 13: mflops = -1 (norm. = -0.00355951), norm. avg. (of 9) = 0.163634 fft 14: mflops = 8.4963 (norm. = 0.0302427), norm. avg. (of 11) = 0.0346439 Benchmarking for array size = 1000: 0. Brenner: elapsed time t=1.45508 s, 512 iters, t-(init.)=1.45117 s t(norm)=0.284405, mflops=17.5806 (err=1.0e-15) 1. CWP (min N) (N=1001): elapsed time t=1.9873 s, 8192 iters, t-(init.)=1.9248 s t(norm)=0.0235768, mflops=212.073 2. CWP (best N) (N=1008): elapsed time t=1.60449 s, 8192 iters, t-(init.)=1.54199 s t(norm)=0.0188878, mflops=264.722 3. FFTPACK: elapsed time t=1.8418 s, 2048 iters, t-(init.)=1.82617 s t(norm)=0.0894747, mflops=55.8817 (err=8.8e-16) 4. FFTPACK (f2c): elapsed time t=1.80469 s, 2048 iters, t-(init.)=1.78906 s t(norm)=0.0876565, mflops=57.0408 (err=9.8e-16) FFTW_MEASURE plan: (cost = 1.721382e-04) FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 8 5. FFTW: elapsed time t=1.40918 s, 8192 iters, t-(init.)=1.34668 s t(norm)=0.0164954, mflops=303.115 (err=9.0e-16) FFTW_ESTIMATE plan: (cost = 3.800000e+03) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.0293 s, 4096 iters, t-(init.)=0.998047 s t(norm)=0.02445, mflops=204.499 (err=8.9e-16) 7. Frigo-old: elapsed time t=1.17188 s, 1024 iters, t-(init.)=1.16406 s t(norm)=0.114068, mflops=43.8334 (err=9.3e-16) 8. GSL: elapsed time t=1.49219 s, 2048 iters, t-(init.)=1.47656 s t(norm)=0.0723453, mflops=69.113 (err=9.1e-16) 9. 
Nielsen: elapsed time t=1.40723 s, 2048 iters, t-(init.)=1.3916 s t(norm)=0.0681826, mflops=73.3325 (err=1.4e-14) 10. Singleton: elapsed time t=1.5791 s, 2048 iters, t-(init.)=1.56348 s t(norm)=0.0766037, mflops=65.271 (err=1.3e-15) 11. Singleton (f2c): elapsed time t=1.66016 s, 4096 iters, t-(init.)=1.62891 s t(norm)=0.0399048, mflops=125.298 (err=1.3e-15) 12. Temperton: elapsed time t=1.13281 s, 2048 iters, t-(init.)=1.11719 s t(norm)=0.0547375, mflops=91.3451 (err=1.1e-15) 13. Temperton (f2c): elapsed time t=1.16406 s, 2048 iters, t-(init.)=1.14844 s t(norm)=0.0562686, mflops=88.8595 (err=9.3e-16) 14. Valkenburg: elapsed time t=1.52344 s, 256 iters, t-(init.)=1.52148 s t(norm)=0.59637, mflops=8.38405 (err=1.1e-15) Top mflops for N=1000 = 303.115 Normalized results and averages for N=1000: fft 0: mflops = 17.5806 (norm. = 0.0579997), norm. avg. (of 9) = 0.0492792 fft 1: mflops = 212.073 (norm. = 0.699645), norm. avg. (of 12) = 0.451917 fft 2: mflops = 264.722 (norm. = 0.873338), norm. avg. (of 12) = 0.470195 fft 3: mflops = 55.8817 (norm. = 0.184358), norm. avg. (of 12) = 0.167824 fft 4: mflops = 57.0408 (norm. = 0.188182), norm. avg. (of 12) = 0.166967 fft 5: mflops = 303.115 (norm. = 1), norm. avg. (of 12) = 0.985828 fft 6: mflops = 204.499 (norm. = 0.674658), norm. avg. (of 12) = 0.87801 fft 7: mflops = 43.8334 (norm. = 0.14461), norm. avg. (of 12) = 0.133859 fft 8: mflops = 69.113 (norm. = 0.228009), norm. avg. (of 12) = 0.28072 fft 9: mflops = 73.3325 (norm. = 0.24193), norm. avg. (of 12) = 0.122331 fft 10: mflops = 65.271 (norm. = 0.215334), norm. avg. (of 12) = 0.143945 fft 11: mflops = 125.298 (norm. = 0.413369), norm. avg. (of 12) = 0.238951 fft 12: mflops = 91.3451 (norm. = 0.301355), norm. avg. (of 10) = 0.173013 fft 13: mflops = 88.8595 (norm. = 0.293155), norm. avg. (of 10) = 0.176586 fft 14: mflops = 8.38405 (norm. = 0.0276597), norm. avg. (of 12) = 0.0340619 Benchmarking for array size = 1960: 0. 
Brenner: elapsed time t=1.53809 s, 256 iters, t-(init.)=1.53418 s t(norm)=0.279574, mflops=17.8844 (err=2.7e-15) 1. CWP (min N) (N=1980): elapsed time t=1.92969 s, 4096 iters, t-(init.)=1.86816 s t(norm)=0.0212772, mflops=234.993 2. CWP (best N) (N=1980): elapsed time t=1.92969 s, 4096 iters, t-(init.)=1.86816 s t(norm)=0.0212772, mflops=234.993 3. FFTPACK: elapsed time t=1.78906 s, 512 iters, t-(init.)=1.78125 s t(norm)=0.162299, mflops=30.8074 (err=2.6e-15) 4. FFTPACK (f2c): elapsed time t=1.77734 s, 512 iters, t-(init.)=1.77051 s t(norm)=0.16132, mflops=30.9943 (err=2.6e-15) FFTW_MEASURE plan: (cost = 5.931854e-04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 8 FFTW_NOTW 7 5. FFTW: elapsed time t=1.31641 s, 2048 iters, t-(init.)=1.28613 s t(norm)=0.0292965, mflops=170.669 (err=2.6e-15) FFTW_ESTIMATE plan: (cost = 1.078000e+04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 8 6. FFTW_ESTIMATE: elapsed time t=1.31152 s, 2048 iters, t-(init.)=1.28125 s t(norm)=0.0291853, mflops=171.319 (err=2.6e-15) 7. Frigo-old: elapsed time t=1.29395 s, 512 iters, t-(init.)=1.28613 s t(norm)=0.117186, mflops=42.6672 (err=2.7e-15) 8. GSL: elapsed time t=1.2168 s, 1024 iters, t-(init.)=1.20117 s t(norm)=0.0547224, mflops=91.3702 (err=2.7e-15) 9. Nielsen: elapsed time t=1.10938 s, 512 iters, t-(init.)=1.10156 s t(norm)=0.100369, mflops=49.8162 (err=1.7e-14) 10. Singleton: elapsed time t=1.23047 s, 512 iters, t-(init.)=1.22363 s t(norm)=0.111491, mflops=44.8465 (err=4.1e-15) 11. Singleton (f2c): elapsed time t=1.39941 s, 1024 iters, t-(init.)=1.38477 s t(norm)=0.0630865, mflops=79.2563 (err=4.1e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.86816 s, 128 iters, t-(init.)=1.86621 s t(norm)=0.68016, mflops=7.35122 (err=2.6e-15) Top mflops for N=1960 = 234.993 Normalized results and averages for N=1960: fft 0: mflops = 17.8844 (norm. = 0.076106), norm. avg. 
(of 10) = 0.0519619 fft 1: mflops = 234.993 (norm. = 1), norm. avg. (of 13) = 0.494078 fft 2: mflops = 234.993 (norm. = 1), norm. avg. (of 13) = 0.510949 fft 3: mflops = 30.8074 (norm. = 0.131099), norm. avg. (of 13) = 0.164999 fft 4: mflops = 30.9943 (norm. = 0.131895), norm. avg. (of 13) = 0.164269 fft 5: mflops = 170.669 (norm. = 0.726272), norm. avg. (of 13) = 0.965862 fft 6: mflops = 171.319 (norm. = 0.72904), norm. avg. (of 13) = 0.86655 fft 7: mflops = 42.6672 (norm. = 0.181568), norm. avg. (of 13) = 0.137529 fft 8: mflops = 91.3702 (norm. = 0.388821), norm. avg. (of 13) = 0.289035 fft 9: mflops = 49.8162 (norm. = 0.21199), norm. avg. (of 13) = 0.129227 fft 10: mflops = 44.8465 (norm. = 0.190842), norm. avg. (of 13) = 0.147552 fft 11: mflops = 79.2563 (norm. = 0.337271), norm. avg. (of 13) = 0.246514 fft 12: mflops = -1 (norm. = -0.00425545), norm. avg. (of 10) = 0.173013 fft 13: mflops = -1 (norm. = -0.00425545), norm. avg. (of 10) = 0.176586 fft 14: mflops = 7.35122 (norm. = 0.0312827), norm. avg. (of 13) = 0.0338481 Benchmarking for array size = 4725: 0. Brenner: elapsed time t=1.47461 s, 64 iters, t-(init.)=1.46973 s t(norm)=0.398179, mflops=12.5572 (err=2.2e-15) 1. CWP (min N) (N=5005): elapsed time t=1.65039 s, 1024 iters, t-(init.)=1.56738 s t(norm)=0.0265397, mflops=188.397 2. CWP (best N) (N=5040): elapsed time t=1.43457 s, 1024 iters, t-(init.)=1.35059 s t(norm)=0.0228688, mflops=218.638 3. FFTPACK: elapsed time t=1.87109 s, 256 iters, t-(init.)=1.85059 s t(norm)=0.12534, mflops=39.8914 (err=2.1e-15) 4. FFTPACK (f2c): elapsed time t=1.87402 s, 256 iters, t-(init.)=1.85352 s t(norm)=0.125539, mflops=39.8284 (err=2.0e-15) FFTW_MEASURE plan: (cost = 1.747131e-03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_TWIDDLE 3 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.96191 s, 1024 iters, t-(init.)=1.87891 s t(norm)=0.0318146, mflops=157.161 (err=2.1e-15) FFTW_ESTIMATE plan: (cost = 3.591000e+04) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 6. 
FFTW_ESTIMATE: elapsed time t=1.99023 s, 1024 iters, t-(init.)=1.90723 s t(norm)=0.0322941, mflops=154.827 (err=2.0e-15) 7. Frigo-old: elapsed time t=1.29785 s, 128 iters, t-(init.)=1.28809 s t(norm)=0.174484, mflops=28.6559 (err=2.0e-15) 8. GSL: elapsed time t=1.10742 s, 256 iters, t-(init.)=1.08691 s t(norm)=0.0736167, mflops=67.9193 (err=2.1e-15) 9. Nielsen: elapsed time t=1.64551 s, 256 iters, t-(init.)=1.625 s t(norm)=0.110061, mflops=45.4292 (err=4.5e-14) 10. Singleton: elapsed time t=1.7627 s, 256 iters, t-(init.)=1.74316 s t(norm)=0.118065, mflops=42.3497 (err=2.6e-15) 11. Singleton (f2c): elapsed time t=1.95508 s, 512 iters, t-(init.)=1.91406 s t(norm)=0.0648198, mflops=77.137 (err=2.6e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.26465 s, 32 iters, t-(init.)=1.2627 s t(norm)=0.684179, mflops=7.30803 (err=2.1e-15) Top mflops for N=4725 = 218.638 Normalized results and averages for N=4725: fft 0: mflops = 12.5572 (norm. = 0.0574336), norm. avg. (of 11) = 0.0524593 fft 1: mflops = 188.397 (norm. = 0.861682), norm. avg. (of 14) = 0.520335 fft 2: mflops = 218.638 (norm. = 1), norm. avg. (of 14) = 0.545882 fft 3: mflops = 39.8914 (norm. = 0.182454), norm. avg. (of 14) = 0.166245 fft 4: mflops = 39.8284 (norm. = 0.182165), norm. avg. (of 14) = 0.165547 fft 5: mflops = 157.161 (norm. = 0.718815), norm. avg. (of 14) = 0.948216 fft 6: mflops = 154.827 (norm. = 0.708141), norm. avg. (of 14) = 0.855235 fft 7: mflops = 28.6559 (norm. = 0.131065), norm. avg. (of 14) = 0.137067 fft 8: mflops = 67.9193 (norm. = 0.310647), norm. avg. (of 14) = 0.290579 fft 9: mflops = 45.4292 (norm. = 0.207782), norm. avg. (of 14) = 0.134838 fft 10: mflops = 42.3497 (norm. = 0.193697), norm. avg. (of 14) = 0.150848 fft 11: mflops = 77.137 (norm. = 0.352806), norm. avg. (of 14) = 0.254107 fft 12: mflops = -1 (norm. = -0.00457376), norm. avg. 
(of 10) = 0.173013 fft 13: mflops = -1 (norm. = -0.00457376), norm. avg. (of 10) = 0.176586 fft 14: mflops = 7.30803 (norm. = 0.0334252), norm. avg. (of 14) = 0.0338179 Benchmarking for array size = 10368: 0. Brenner: elapsed time t=1.53125 s, 32 iters, t-(init.)=1.52344 s t(norm)=0.344214, mflops=14.5258 (err=3.2e-15) 1. CWP (min N) (N=10920): elapsed time t=1.0625 s, 256 iters, t-(init.)=0.994141 s t(norm)=0.0280777, mflops=178.077 2. CWP (best N) (N=11088): elapsed time t=1.98535 s, 512 iters, t-(init.)=1.84473 s t(norm)=0.0260505, mflops=191.935 3. FFTPACK: elapsed time t=1.84766 s, 128 iters, t-(init.)=1.81641 s t(norm)=0.102602, mflops=48.7319 (err=3.1e-15) 4. FFTPACK (f2c): elapsed time t=1.85742 s, 128 iters, t-(init.)=1.82617 s t(norm)=0.103154, mflops=48.4713 (err=3.1e-15) FFTW_MEASURE plan: (cost = 3.585815e-03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 3 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 12 5. FFTW: elapsed time t=1.84863 s, 512 iters, t-(init.)=1.72266 s t(norm)=0.0243267, mflops=205.536 (err=3.1e-15) FFTW_ESTIMATE plan: (cost = 7.257600e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 8 6. FFTW_ESTIMATE: elapsed time t=1.94238 s, 512 iters, t-(init.)=1.81641 s t(norm)=0.0256506, mflops=194.927 (err=3.1e-15) 7. Frigo-old: elapsed time t=1.17773 s, 64 iters, t-(init.)=1.16113 s t(norm)=0.131176, mflops=38.1166 (err=3.2e-15) 8. GSL: elapsed time t=1.62695 s, 256 iters, t-(init.)=1.56348 s t(norm)=0.0441576, mflops=113.231 (err=3.1e-15) 9. Nielsen: elapsed time t=1.0498 s, 64 iters, t-(init.)=1.0332 s t(norm)=0.116724, mflops=42.8361 (err=8.5e-15) 10. Singleton: elapsed time t=1.81641 s, 128 iters, t-(init.)=1.78516 s t(norm)=0.100837, mflops=49.5849 (err=4.5e-15) 11. Singleton (f2c): elapsed time t=1.07813 s, 128 iters, t-(init.)=1.0459 s t(norm)=0.0590791, mflops=84.6323 (err=4.5e-15) 12. Temperton: elapsed time t=1.5293 s, 128 iters, t-(init.)=1.49805 s t(norm)=0.0846193, mflops=59.0882 (err=4.1e-15) 13. 
Temperton (f2c): elapsed time t=1.52246 s, 128 iters, t-(init.)=1.49121 s t(norm)=0.0842332, mflops=59.359 (err=3.1e-15) 14. Valkenburg: elapsed time t=1.22852 s, 16 iters, t-(init.)=1.22461 s t(norm)=0.55339, mflops=9.03521 (err=3.0e-15) Top mflops for N=10368 = 205.536 Normalized results and averages for N=10368: fft 0: mflops = 14.5258 (norm. = 0.0706731), norm. avg. (of 12) = 0.0539771 fft 1: mflops = 178.077 (norm. = 0.866405), norm. avg. (of 15) = 0.543406 fft 2: mflops = 191.935 (norm. = 0.933827), norm. avg. (of 15) = 0.571745 fft 3: mflops = 48.7319 (norm. = 0.237097), norm. avg. (of 15) = 0.170969 fft 4: mflops = 48.4713 (norm. = 0.235829), norm. avg. (of 15) = 0.170233 fft 5: mflops = 205.536 (norm. = 1), norm. avg. (of 15) = 0.951668 fft 6: mflops = 194.927 (norm. = 0.948387), norm. avg. (of 15) = 0.861446 fft 7: mflops = 38.1166 (norm. = 0.18545), norm. avg. (of 15) = 0.140293 fft 8: mflops = 113.231 (norm. = 0.550906), norm. avg. (of 15) = 0.307934 fft 9: mflops = 42.8361 (norm. = 0.208412), norm. avg. (of 15) = 0.139743 fft 10: mflops = 49.5849 (norm. = 0.241247), norm. avg. (of 15) = 0.156875 fft 11: mflops = 84.6323 (norm. = 0.411765), norm. avg. (of 15) = 0.264617 fft 12: mflops = 59.0882 (norm. = 0.287484), norm. avg. (of 11) = 0.183419 fft 13: mflops = 59.359 (norm. = 0.288802), norm. avg. (of 11) = 0.186787 fft 14: mflops = 9.03521 (norm. = 0.0439593), norm. avg. (of 15) = 0.034494 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.42188 s, 8 iters, t-(init.)=1.41406 s t(norm)=0.444721, mflops=11.243 (err=5.7e-15) 1. CWP (min N) (N=27720): elapsed time t=1.57813 s, 128 iters, t-(init.)=1.45313 s t(norm)=0.0285629, mflops=175.053 2. CWP (best N) (N=27720): elapsed time t=1.57617 s, 128 iters, t-(init.)=1.45117 s t(norm)=0.0285245, mflops=175.288 3. FFTPACK: elapsed time t=1.45215 s, 32 iters, t-(init.)=1.42188 s t(norm)=0.111794, mflops=44.725 (err=5.6e-15) 4. 
FFTPACK (f2c): elapsed time t=1.43945 s, 32 iters, t-(init.)=1.4082 s t(norm)=0.110719, mflops=45.1592 (err=5.6e-15) FFTW_MEASURE plan: (cost = 1.391602e-02) FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 6 FFTW_TWIDDLE 3 FFTW_TWIDDLE 5 FFTW_NOTW 12 5. FFTW: elapsed time t=1.81348 s, 128 iters, t-(init.)=1.69043 s t(norm)=0.0332274, mflops=150.478 (err=5.7e-15) FFTW_ESTIMATE plan: (cost = 1.836000e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 10 FFTW_TWIDDLE 5 FFTW_TWIDDLE 6 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.01758 s, 64 iters, t-(init.)=0.956055 s t(norm)=0.0375847, mflops=133.033 (err=5.6e-15) 7. Frigo-old: elapsed time t=1.15723 s, 16 iters, t-(init.)=1.1416 s t(norm)=0.179516, mflops=27.8527 (err=5.8e-15) 8. GSL: elapsed time t=1.00293 s, 32 iters, t-(init.)=0.97168 s t(norm)=0.076398, mflops=65.4468 (err=5.6e-15) 9. Nielsen: elapsed time t=1.64355 s, 32 iters, t-(init.)=1.6123 s t(norm)=0.126767, mflops=39.4425 (err=2.2e-13) 10. Singleton: elapsed time t=1.53125 s, 32 iters, t-(init.)=1.50098 s t(norm)=0.118014, mflops=42.368 (err=7.7e-15) 11. Singleton (f2c): elapsed time t=1.93848 s, 64 iters, t-(init.)=1.87695 s t(norm)=0.0737874, mflops=67.7623 (err=7.7e-15) 12. Temperton: elapsed time t=1.21191 s, 32 iters, t-(init.)=1.18164 s t(norm)=0.0929061, mflops=53.8178 (err=6.1e-15) 13. Temperton (f2c): elapsed time t=1.2041 s, 32 iters, t-(init.)=1.17383 s t(norm)=0.0922918, mflops=54.176 (err=5.8e-15) 14. Valkenburg: elapsed time t=1.06055 s, 4 iters, t-(init.)=1.05664 s t(norm)=0.664624, mflops=7.52305 (err=5.7e-15) Top mflops for N=27000 = 175.288 Normalized results and averages for N=27000: fft 0: mflops = 11.243 (norm. = 0.0641402), norm. avg. (of 13) = 0.0547589 fft 1: mflops = 175.053 (norm. = 0.998656), norm. avg. (of 16) = 0.571859 fft 2: mflops = 175.288 (norm. = 1), norm. avg. (of 16) = 0.598511 fft 3: mflops = 44.725 (norm. = 0.255151), norm. avg. (of 16) = 0.17623 fft 4: mflops = 45.1592 (norm. = 0.257628), norm. avg. 
(of 16) = 0.175695 fft 5: mflops = 150.478 (norm. = 0.858463), norm. avg. (of 16) = 0.945843 fft 6: mflops = 133.033 (norm. = 0.758938), norm. avg. (of 16) = 0.855039 fft 7: mflops = 27.8527 (norm. = 0.158896), norm. avg. (of 16) = 0.141456 fft 8: mflops = 65.4468 (norm. = 0.373367), norm. avg. (of 16) = 0.312024 fft 9: mflops = 39.4425 (norm. = 0.225015), norm. avg. (of 16) = 0.145073 fft 10: mflops = 42.368 (norm. = 0.241705), norm. avg. (of 16) = 0.162177 fft 11: mflops = 67.7623 (norm. = 0.386576), norm. avg. (of 16) = 0.27224 fft 12: mflops = 53.8178 (norm. = 0.307025), norm. avg. (of 12) = 0.19372 fft 13: mflops = 54.176 (norm. = 0.309068), norm. avg. (of 12) = 0.196977 fft 14: mflops = 7.52305 (norm. = 0.0429182), norm. avg. (of 16) = 0.0350205 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.22852 s, 2 iters, t-(init.)=1.2207 s t(norm)=0.498173, mflops=10.0367 (err=1.1e-14) 1. CWP (min N) (N=80080): elapsed time t=1.58789 s, 32 iters, t-(init.)=1.44043 s t(norm)=0.0367402, mflops=136.091 2. CWP (best N) (N=80080): elapsed time t=1.58691 s, 32 iters, t-(init.)=1.44043 s t(norm)=0.0367402, mflops=136.091 3. FFTPACK: elapsed time t=1.72461 s, 8 iters, t-(init.)=1.69141 s t(norm)=0.172567, mflops=28.9743 (err=1.0e-14) 4. FFTPACK (f2c): elapsed time t=1.72363 s, 8 iters, t-(init.)=1.68945 s t(norm)=0.172368, mflops=29.0078 (err=1.1e-14) FFTW_MEASURE plan: (cost = 5.639648e-02) FFTW_TWIDDLE 7 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 12 5. FFTW: elapsed time t=1.73828 s, 32 iters, t-(init.)=1.60352 s t(norm)=0.0409, mflops=122.249 (err=1.1e-14) FFTW_ESTIMATE plan: (cost = 4.838400e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.92871 s, 32 iters, t-(init.)=1.79395 s t(norm)=0.0457571, mflops=109.273 (err=1.0e-14) 7. Frigo-old: elapsed time t=1.93066 s, 8 iters, t-(init.)=1.89648 s t(norm)=0.19349, mflops=25.8411 (err=1.0e-14) 8. 
GSL: elapsed time t=1.89258 s, 16 iters, t-(init.)=1.8252 s t(norm)=0.0931084, mflops=53.7008 (err=1.0e-14) 9. Nielsen: elapsed time t=1.62402 s, 8 iters, t-(init.)=1.58984 s t(norm)=0.162205, mflops=30.8252 (err=4.9e-13) 10. Singleton: elapsed time t=1.56836 s, 8 iters, t-(init.)=1.53613 s t(norm)=0.156725, mflops=31.903 (err=1.5e-14) 11. Singleton (f2c): elapsed time t=1.12988 s, 8 iters, t-(init.)=1.09668 s t(norm)=0.11189, mflops=44.6869 (err=1.5e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.83301 s, 2 iters, t-(init.)=1.82422 s t(norm)=0.744469, mflops=6.7162 (err=1.1e-14) Top mflops for N=75600 = 136.091 Normalized results and averages for N=75600: fft 0: mflops = 10.0367 (norm. = 0.07375), norm. avg. (of 14) = 0.0561154 fft 1: mflops = 136.091 (norm. = 1), norm. avg. (of 17) = 0.597044 fft 2: mflops = 136.091 (norm. = 1), norm. avg. (of 17) = 0.622128 fft 3: mflops = 28.9743 (norm. = 0.212904), norm. avg. (of 17) = 0.178387 fft 4: mflops = 29.0078 (norm. = 0.21315), norm. avg. (of 17) = 0.177898 fft 5: mflops = 122.249 (norm. = 0.898295), norm. avg. (of 17) = 0.943046 fft 6: mflops = 109.273 (norm. = 0.80294), norm. avg. (of 17) = 0.851974 fft 7: mflops = 25.8411 (norm. = 0.189882), norm. avg. (of 17) = 0.144304 fft 8: mflops = 53.7008 (norm. = 0.394596), norm. avg. (of 17) = 0.316881 fft 9: mflops = 30.8252 (norm. = 0.226505), norm. avg. (of 17) = 0.149863 fft 10: mflops = 31.903 (norm. = 0.234425), norm. avg. (of 17) = 0.166426 fft 11: mflops = 44.6869 (norm. = 0.328362), norm. avg. (of 17) = 0.275541 fft 12: mflops = -1 (norm. = -0.00734805), norm. avg. (of 12) = 0.19372 fft 13: mflops = -1 (norm. = -0.00734805), norm. avg. (of 12) = 0.196977 fft 14: mflops = 6.7162 (norm. = 0.0493509), norm. avg. (of 17) = 0.0358635 Benchmarking for array size = 165375: 0. 
Brenner: elapsed time t=1.74902 s, 1 iters, t-(init.)=1.7373 s t(norm)=0.606, mflops=8.25082 (err=2.7e-14) 1. CWP (min N) (N=180180): elapsed time t=1.12891 s, 8 iters, t-(init.)=1.02539 s t(norm)=0.0447091, mflops=111.834 2. CWP (best N) (N=180180): elapsed time t=1.12891 s, 8 iters, t-(init.)=1.02539 s t(norm)=0.0447091, mflops=111.834 3. FFTPACK: elapsed time t=1.60938 s, 2 iters, t-(init.)=1.58496 s t(norm)=0.27643, mflops=18.0878 (err=2.7e-14) 4. FFTPACK (f2c): elapsed time t=1.5957 s, 2 iters, t-(init.)=1.57227 s t(norm)=0.274216, mflops=18.2338 (err=2.7e-14) FFTW_MEASURE plan: (cost = 1.660156e-01) FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 3 FFTW_TWIDDLE 7 FFTW_NOTW 7 5. FFTW: elapsed time t=1.2793 s, 8 iters, t-(init.)=1.18457 s t(norm)=0.0516497, mflops=96.806 (err=2.7e-14) FFTW_ESTIMATE plan: (cost = 1.752975e+06) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.29492 s, 8 iters, t-(init.)=1.19922 s t(norm)=0.0522884, mflops=95.6235 (err=2.7e-14) 7. Frigo-old: elapsed time t=1.58789 s, 2 iters, t-(init.)=1.56445 s t(norm)=0.272853, mflops=18.3249 (err=2.7e-14) 8. GSL: elapsed time t=1.29492 s, 4 iters, t-(init.)=1.24805 s t(norm)=0.108835, mflops=45.9412 (err=2.7e-14) 9. Nielsen: elapsed time t=1.08398 s, 2 iters, t-(init.)=1.05859 s t(norm)=0.184627, mflops=27.0816 (err=1.6e-12) 10. Singleton: elapsed time t=1.0791 s, 2 iters, t-(init.)=1.05762 s t(norm)=0.184457, mflops=27.1066 (err=4.0e-14) 11. Singleton (f2c): elapsed time t=1.5791 s, 4 iters, t-(init.)=1.53418 s t(norm)=0.133787, mflops=37.3729 (err=4.0e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=2.44141 s, 1 iters, t-(init.)=2.42969 s t(norm)=0.847515, mflops=5.8996 (err=2.7e-14) Top mflops for N=165375 = 111.834 Normalized results and averages for N=165375: fft 0: mflops = 8.25082 (norm. 
= 0.0737774), norm. avg. (of 15) = 0.0572929 fft 1: mflops = 111.834 (norm. = 1), norm. avg. (of 18) = 0.619431 fft 2: mflops = 111.834 (norm. = 1), norm. avg. (of 18) = 0.643121 fft 3: mflops = 18.0878 (norm. = 0.161738), norm. avg. (of 18) = 0.177462 fft 4: mflops = 18.2338 (norm. = 0.163043), norm. avg. (of 18) = 0.177073 fft 5: mflops = 96.806 (norm. = 0.865622), norm. avg. (of 18) = 0.938745 fft 6: mflops = 95.6235 (norm. = 0.855049), norm. avg. (of 18) = 0.852145 fft 7: mflops = 18.3249 (norm. = 0.163858), norm. avg. (of 18) = 0.14539 fft 8: mflops = 45.9412 (norm. = 0.410798), norm. avg. (of 18) = 0.322098 fft 9: mflops = 27.0816 (norm. = 0.242159), norm. avg. (of 18) = 0.154991 fft 10: mflops = 27.1066 (norm. = 0.242382), norm. avg. (of 18) = 0.170646 fft 11: mflops = 37.3729 (norm. = 0.334182), norm. avg. (of 18) = 0.278799 fft 12: mflops = -1 (norm. = -0.00894182), norm. avg. (of 12) = 0.19372 fft 13: mflops = -1 (norm. = -0.00894182), norm. avg. (of 12) = 0.196977 fft 14: mflops = 5.8996 (norm. = 0.0527532), norm. avg. (of 18) = 0.0368018 Benchmarking for array size = 362880: 0. Brenner: elapsed time t=3.89258 s, 1 iters, t-(init.)=3.86328 s t(norm)=0.57643, mflops=8.67408 (err=1.1e-13) 1. CWP (min N) (N=720720): elapsed time t=1.3916 s, 2 iters, t-(init.)=1.2666 s t(norm)=0.0944932, mflops=52.9139 2. CWP (best N) (N=720720): elapsed time t=1.39258 s, 2 iters, t-(init.)=1.26758 s t(norm)=0.094566, mflops=52.8731 3. FFTPACK: elapsed time t=1.44531 s, 1 iters, t-(init.)=1.41602 s t(norm)=0.21128, mflops=23.6653 (err=1.1e-13) 4. FFTPACK (f2c): elapsed time t=1.43945 s, 1 iters, t-(init.)=1.41016 s t(norm)=0.210406, mflops=23.7636 (err=1.1e-13) FFTW_MEASURE plan: (cost = 3.662109e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 5 FFTW_TWIDDLE 6 FFTW_TWIDDLE 3 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 7 5. 
FFTW: elapsed time t=1.37109 s, 4 iters, t-(init.)=1.25 s t(norm)=0.0466273, mflops=107.233 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 2.286144e+06) FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 7 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.49805 s, 4 iters, t-(init.)=1.37598 s t(norm)=0.0513265, mflops=97.4156 (err=1.1e-13) 7. Frigo-old: elapsed time t=1.54004 s, 1 iters, t-(init.)=1.51074 s t(norm)=0.225414, mflops=22.1814 (err=1.1e-13) 8. GSL: elapsed time t=1.2998 s, 2 iters, t-(init.)=1.24023 s t(norm)=0.0925261, mflops=54.0388 (err=1.1e-13) 9. Nielsen: elapsed time t=1.4707 s, 1 iters, t-(init.)=1.43848 s t(norm)=0.214631, mflops=23.2958 (err=3.6e-12) 10. Singleton: elapsed time t=1.4541 s, 1 iters, t-(init.)=1.42969 s t(norm)=0.21332, mflops=23.439 (err=1.6e-13) 11. Singleton (f2c): elapsed time t=1.15137 s, 1 iters, t-(init.)=1.12598 s t(norm)=0.168004, mflops=29.7612 (err=1.6e-13) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=5.54688 s, 1 iters, t-(init.)=5.51758 s t(norm)=0.823264, mflops=6.07339 (err=1.1e-13) Top mflops for N=362880 = 107.233 Normalized results and averages for N=362880: fft 0: mflops = 8.67408 (norm. = 0.0808898), norm. avg. (of 16) = 0.0587677 fft 1: mflops = 52.9139 (norm. = 0.493446), norm. avg. (of 19) = 0.6128 fft 2: mflops = 52.8731 (norm. = 0.493066), norm. avg. (of 19) = 0.635223 fft 3: mflops = 23.6653 (norm. = 0.22069), norm. avg. (of 19) = 0.179738 fft 4: mflops = 23.7636 (norm. = 0.221607), norm. avg. (of 19) = 0.179417 fft 5: mflops = 107.233 (norm. = 1), norm. avg. (of 19) = 0.941969 fft 6: mflops = 97.4156 (norm. = 0.908446), norm. avg. (of 19) = 0.855108 fft 7: mflops = 22.1814 (norm. = 0.206852), norm. avg. (of 19) = 0.148625 fft 8: mflops = 54.0388 (norm. = 0.503937), norm. avg. (of 19) = 0.331669 fft 9: mflops = 23.2958 (norm. = 0.217244), norm. avg. 
(of 19) = 0.158267
fft 10: mflops = 23.439 (norm. = 0.218579), norm. avg. (of 19) = 0.173169
fft 11: mflops = 29.7612 (norm. = 0.277537), norm. avg. (of 19) = 0.278732
fft 12: mflops = -1 (norm. = -0.00932546), norm. avg. (of 12) = 0.19372
fft 13: mflops = -1 (norm. = -0.00932546), norm. avg. (of 12) = 0.196977
fft 14: mflops = 6.07339 (norm. = 0.0566372), norm. avg. (of 19) = 0.0378458
------------------------------------------------------
@@@@ bench.3d.p2.log
Benchmarking for sizes:
4x4x4 (0.00128174 MB)
8x8x8 (0.00830078 MB)
16x16x16 (0.0633545 MB)
32x32x32 (0.501587 MB)
64x64x64 (4.00305 MB)
256x64x32 (8.01184 MB)
16x1024x64 (16.047 MB)
128x128x128 (32.006 MB)
Maximum array size N = 2097152
Benchmarking FFTs:
0. FFTW
1. HARM
2. HARM (f2c)
3. NR (C)
4. PDA
5. PDA (f2c)
6. Singleton
7. Singleton (f2c)
8. Temperton
9. Temperton (f2c)
Computing normalized averages (10 transforms).
Benchmarking for array size = 4x4x4 (power of 2):
0. FFTW: elapsed time t=1.87793 s, 262144 iters, t-(init.)=1.7373 s t(norm)=0.0172586, mflops=289.711 (err=2.1e-16)
1. Skipping fft (all dimensions must be > 4 for HARM).
2. Skipping fft (all dimensions must be > 4 for HARM).
3. NR (C): elapsed time t=1.52441 s, 65536 iters, t-(init.)=1.48926 s t(norm)=0.0591778, mflops=84.4912 (err=4.5e-16)
4. PDA: elapsed time t=1.6582 s, 16384 iters, t-(init.)=1.64941 s t(norm)=0.262167, mflops=19.0718 (err=1.9e-16)
5. PDA (f2c): elapsed time t=1.55957 s, 16384 iters, t-(init.)=1.55078 s t(norm)=0.24649, mflops=20.2848 (err=1.9e-16)
6. Singleton: elapsed time t=1.46484 s, 65536 iters, t-(init.)=1.42969 s t(norm)=0.0568107, mflops=88.0116 (err=2.1e-16)
7. Singleton (f2c): elapsed time t=1.74219 s, 131072 iters, t-(init.)=1.67188 s t(norm)=0.0332172, mflops=150.525 (err=2.1e-16)
8. Temperton: elapsed time t=1.56348 s, 65536 iters, t-(init.)=1.52832 s t(norm)=0.06073, mflops=82.3316 (err=2.1e-16)
9.
Temperton (f2c): elapsed time t=1.56152 s, 65536 iters, t-(init.)=1.52637 s t(norm)=0.0606524, mflops=82.437 (err=2.1e-16) Top mflops for N=64 = 289.711 Normalized results and averages for N=64: fft 0: mflops = 289.711 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.00345171), norm. avg. (of 0) = -1 fft 2: mflops = -1 (norm. = -0.00345171), norm. avg. (of 0) = -1 fft 3: mflops = 84.4912 (norm. = 0.291639), norm. avg. (of 1) = 0.291639 fft 4: mflops = 19.0718 (norm. = 0.0658304), norm. avg. (of 1) = 0.0658304 fft 5: mflops = 20.2848 (norm. = 0.0700173), norm. avg. (of 1) = 0.0700173 fft 6: mflops = 88.0116 (norm. = 0.303791), norm. avg. (of 1) = 0.303791 fft 7: mflops = 150.525 (norm. = 0.519568), norm. avg. (of 1) = 0.519568 fft 8: mflops = 82.3316 (norm. = 0.284185), norm. avg. (of 1) = 0.284185 fft 9: mflops = 82.437 (norm. = 0.284549), norm. avg. (of 1) = 0.284549 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.73535 s, 32768 iters, t-(init.)=1.60645 s t(norm)=0.0106391, mflops=469.966 (err=3.8e-16) 1. HARM: elapsed time t=1.98926 s, 8192 iters, t-(init.)=1.95703 s t(norm)=0.0518436, mflops=96.4439 (err=3.7e-16) 2. HARM (f2c): elapsed time t=1.00098 s, 4096 iters, t-(init.)=0.985352 s t(norm)=0.0522058, mflops=95.7748 (err=3.7e-16) 3. NR (C): elapsed time t=1.55273 s, 8192 iters, t-(init.)=1.52148 s t(norm)=0.0403056, mflops=124.052 (err=1.5e-15) 4. PDA: elapsed time t=1.47949 s, 2048 iters, t-(init.)=1.4707 s t(norm)=0.155841, mflops=32.0839 (err=3.2e-16) 5. PDA (f2c): elapsed time t=1.4834 s, 2048 iters, t-(init.)=1.47559 s t(norm)=0.156359, mflops=31.9778 (err=3.2e-16) 6. Singleton: elapsed time t=1.42578 s, 4096 iters, t-(init.)=1.40918 s t(norm)=0.074661, mflops=66.9693 (err=3.8e-16) 7. Singleton (f2c): elapsed time t=1.55957 s, 8192 iters, t-(init.)=1.52734 s t(norm)=0.0404608, mflops=123.576 (err=3.8e-16) 8. 
Temperton: elapsed time t=1.67969 s, 8192 iters, t-(init.)=1.64746 s t(norm)=0.0436428, mflops=114.566 (err=3.5e-16) 9. Temperton (f2c): elapsed time t=1.56836 s, 8192 iters, t-(init.)=1.53613 s t(norm)=0.0406936, mflops=122.869 (err=3.5e-16) Top mflops for N=512 = 469.966 Normalized results and averages for N=512: fft 0: mflops = 469.966 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 96.4439 (norm. = 0.205215), norm. avg. (of 1) = 0.205215 fft 2: mflops = 95.7748 (norm. = 0.203791), norm. avg. (of 1) = 0.203791 fft 3: mflops = 124.052 (norm. = 0.26396), norm. avg. (of 2) = 0.2778 fft 4: mflops = 32.0839 (norm. = 0.0682686), norm. avg. (of 2) = 0.0670495 fft 5: mflops = 31.9778 (norm. = 0.0680427), norm. avg. (of 2) = 0.06903 fft 6: mflops = 66.9693 (norm. = 0.142498), norm. avg. (of 2) = 0.223145 fft 7: mflops = 123.576 (norm. = 0.262948), norm. avg. (of 2) = 0.391258 fft 8: mflops = 114.566 (norm. = 0.243776), norm. avg. (of 2) = 0.263981 fft 9: mflops = 122.869 (norm. = 0.261443), norm. avg. (of 2) = 0.272996 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.71387 s, 2048 iters, t-(init.)=1.60352 s t(norm)=0.0159295, mflops=313.883 (err=4.0e-16) 1. HARM: elapsed time t=1.18066 s, 512 iters, t-(init.)=1.15332 s t(norm)=0.0458288, mflops=109.102 (err=4.2e-16) 2. HARM (f2c): elapsed time t=1.28223 s, 512 iters, t-(init.)=1.25488 s t(norm)=0.0498646, mflops=100.272 (err=4.2e-16) 3. NR (C): elapsed time t=1.76563 s, 512 iters, t-(init.)=1.73828 s t(norm)=0.0690731, mflops=72.3871 (err=2.0e-15) 4. PDA: elapsed time t=1.62109 s, 256 iters, t-(init.)=1.60742 s t(norm)=0.127746, mflops=39.14 (err=4.3e-16) 5. PDA (f2c): elapsed time t=1.49902 s, 256 iters, t-(init.)=1.48535 s t(norm)=0.118045, mflops=42.3567 (err=4.3e-16) 6. Singleton: elapsed time t=1.89063 s, 512 iters, t-(init.)=1.86328 s t(norm)=0.0740401, mflops=67.5309 (err=4.0e-16) 7. 
Singleton (f2c): elapsed time t=1.07324 s, 512 iters, t-(init.)=1.0459 s t(norm)=0.0415603, mflops=120.307 (err=4.0e-16) 8. Temperton: elapsed time t=1.24707 s, 512 iters, t-(init.)=1.21973 s t(norm)=0.0484676, mflops=103.162 (err=4.6e-16) 9. Temperton (f2c): elapsed time t=1.14844 s, 512 iters, t-(init.)=1.12109 s t(norm)=0.0445483, mflops=112.238 (err=4.6e-16) Top mflops for N=4096 = 313.883 Normalized results and averages for N=4096: fft 0: mflops = 313.883 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 109.102 (norm. = 0.347587), norm. avg. (of 2) = 0.276401 fft 2: mflops = 100.272 (norm. = 0.319455), norm. avg. (of 2) = 0.261623 fft 3: mflops = 72.3871 (norm. = 0.230618), norm. avg. (of 3) = 0.262073 fft 4: mflops = 39.14 (norm. = 0.124696), norm. avg. (of 3) = 0.0862651 fft 5: mflops = 42.3567 (norm. = 0.134944), norm. avg. (of 3) = 0.0910014 fft 6: mflops = 67.5309 (norm. = 0.215147), norm. avg. (of 3) = 0.220479 fft 7: mflops = 120.307 (norm. = 0.383287), norm. avg. (of 3) = 0.388601 fft 8: mflops = 103.162 (norm. = 0.328663), norm. avg. (of 3) = 0.285541 fft 9: mflops = 112.238 (norm. = 0.357578), norm. avg. (of 3) = 0.30119 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.08203 s, 64 iters, t-(init.)=1.00195 s t(norm)=0.0318512, mflops=156.98 (err=5.1e-16) 1. HARM: elapsed time t=1.14355 s, 32 iters, t-(init.)=1.10352 s t(norm)=0.0701596, mflops=71.2661 (err=5.3e-16) 2. HARM (f2c): elapsed time t=1.18945 s, 32 iters, t-(init.)=1.14941 s t(norm)=0.0730778, mflops=68.4203 (err=5.3e-16) 3. NR (C): elapsed time t=1.0791 s, 16 iters, t-(init.)=1.05859 s t(norm)=0.134607, mflops=37.1451 (err=2.9e-15) 4. PDA: elapsed time t=1.3291 s, 16 iters, t-(init.)=1.30859 s t(norm)=0.166396, mflops=30.0487 (err=4.3e-16) 5. PDA (f2c): elapsed time t=1.19629 s, 16 iters, t-(init.)=1.17676 s t(norm)=0.149632, mflops=33.4152 (err=4.3e-16) 6. 
Singleton: elapsed time t=1.01855 s, 16 iters, t-(init.)=0.998047 s t(norm)=0.126908, mflops=39.3986 (err=5.2e-16) 7. Singleton (f2c): elapsed time t=1.5166 s, 32 iters, t-(init.)=1.47559 s t(norm)=0.0938152, mflops=53.2963 (err=5.2e-16) 8. Temperton: elapsed time t=1.17969 s, 32 iters, t-(init.)=1.13965 s t(norm)=0.0724569, mflops=69.0065 (err=4.6e-16) 9. Temperton (f2c): elapsed time t=1.16309 s, 32 iters, t-(init.)=1.12305 s t(norm)=0.0714014, mflops=70.0266 (err=4.6e-16) Top mflops for N=32768 = 156.98 Normalized results and averages for N=32768: fft 0: mflops = 156.98 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 71.2661 (norm. = 0.453982), norm. avg. (of 3) = 0.335595 fft 2: mflops = 68.4203 (norm. = 0.435854), norm. avg. (of 3) = 0.3197 fft 3: mflops = 37.1451 (norm. = 0.236624), norm. avg. (of 4) = 0.25571 fft 4: mflops = 30.0487 (norm. = 0.191418), norm. avg. (of 4) = 0.112553 fft 5: mflops = 33.4152 (norm. = 0.212863), norm. avg. (of 4) = 0.121467 fft 6: mflops = 39.3986 (norm. = 0.250978), norm. avg. (of 4) = 0.228104 fft 7: mflops = 53.2963 (norm. = 0.33951), norm. avg. (of 4) = 0.376328 fft 8: mflops = 69.0065 (norm. = 0.439589), norm. avg. (of 4) = 0.324053 fft 9: mflops = 70.0266 (norm. = 0.446087), norm. avg. (of 4) = 0.337414 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.64551 s, 8 iters, t-(init.)=1.48047 s t(norm)=0.039219, mflops=127.489 (err=1.2e-15) 1. HARM: elapsed time t=1.56445 s, 4 iters, t-(init.)=1.48145 s t(norm)=0.0784898, mflops=63.7025 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.63867 s, 4 iters, t-(init.)=1.55566 s t(norm)=0.082422, mflops=60.6634 (err=1.2e-15) 3. NR (C): elapsed time t=1.24316 s, 1 iters, t-(init.)=1.22363 s t(norm)=0.259322, mflops=19.2811 (err=3.6e-15) 4. PDA: elapsed time t=1.95898 s, 2 iters, t-(init.)=1.91895 s t(norm)=0.203339, mflops=24.5895 (err=1.3e-15) 5. 
PDA (f2c): elapsed time t=1.4209 s, 2 iters, t-(init.)=1.38086 s t(norm)=0.146321, mflops=34.1714 (err=1.3e-15) 6. Singleton: elapsed time t=1.81738 s, 2 iters, t-(init.)=1.77734 s t(norm)=0.188334, mflops=26.5486 (err=1.7e-15) 7. Singleton (f2c): elapsed time t=1.49902 s, 2 iters, t-(init.)=1.45801 s t(norm)=0.154496, mflops=32.3633 (err=1.7e-15) 8. Temperton: elapsed time t=1.55078 s, 4 iters, t-(init.)=1.46875 s t(norm)=0.0778172, mflops=64.2532 (err=1.3e-15) 9. Temperton (f2c): elapsed time t=1.54199 s, 4 iters, t-(init.)=1.45996 s t(norm)=0.0773515, mflops=64.64 (err=1.3e-15) Top mflops for N=262144 = 127.489 Normalized results and averages for N=262144: fft 0: mflops = 127.489 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 63.7025 (norm. = 0.49967), norm. avg. (of 4) = 0.376614 fft 2: mflops = 60.6634 (norm. = 0.475832), norm. avg. (of 4) = 0.358733 fft 3: mflops = 19.2811 (norm. = 0.151237), norm. avg. (of 5) = 0.234816 fft 4: mflops = 24.5895 (norm. = 0.192875), norm. avg. (of 5) = 0.128618 fft 5: mflops = 34.1714 (norm. = 0.268034), norm. avg. (of 5) = 0.15078 fft 6: mflops = 26.5486 (norm. = 0.208242), norm. avg. (of 5) = 0.224131 fft 7: mflops = 32.3633 (norm. = 0.253851), norm. avg. (of 5) = 0.351833 fft 8: mflops = 64.2532 (norm. = 0.503989), norm. avg. (of 5) = 0.36004 fft 9: mflops = 64.64 (norm. = 0.507023), norm. avg. (of 5) = 0.371336 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.02344 s, 2 iters, t-(init.)=0.933594 s t(norm)=0.0468602, mflops=106.7 (err=1.2e-15) 1. HARM: elapsed time t=1.9668 s, 2 iters, t-(init.)=1.87695 s t(norm)=0.0942106, mflops=53.0726 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.01563 s, 1 iters, t-(init.)=0.970703 s t(norm)=0.0974458, mflops=51.3106 (err=1.2e-15) 3. NR (C): elapsed time t=2.87598 s, 1 iters, t-(init.)=2.83203 s t(norm)=0.284298, mflops=17.5872 (err=4.2e-15) 4. PDA: elapsed time t=1.91602 s, 1 iters, t-(init.)=1.87109 s t(norm)=0.187833, mflops=26.6194 (err=1.2e-15) 5. 
PDA (f2c): elapsed time t=1.63086 s, 1 iters, t-(init.)=1.58691 s t(norm)=0.159305, mflops=31.3863 (err=1.2e-15) 6. Singleton: elapsed time t=2.22852 s, 1 iters, t-(init.)=2.18359 s t(norm)=0.219204, mflops=22.8098 (err=1.8e-15) 7. Singleton (f2c): elapsed time t=1.90137 s, 1 iters, t-(init.)=1.85645 s t(norm)=0.186363, mflops=26.8294 (err=1.8e-15) 8. Temperton: elapsed time t=1.75879 s, 2 iters, t-(init.)=1.66895 s t(norm)=0.08377, mflops=59.6872 (err=1.2e-15) 9. Temperton (f2c): elapsed time t=1.75488 s, 2 iters, t-(init.)=1.66602 s t(norm)=0.083623, mflops=59.7922 (err=1.2e-15) Top mflops for N=524288 = 106.7 Normalized results and averages for N=524288: fft 0: mflops = 106.7 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 53.0726 (norm. = 0.497399), norm. avg. (of 5) = 0.400771 fft 2: mflops = 51.3106 (norm. = 0.480885), norm. avg. (of 5) = 0.383163 fft 3: mflops = 17.5872 (norm. = 0.164828), norm. avg. (of 6) = 0.223151 fft 4: mflops = 26.6194 (norm. = 0.249478), norm. avg. (of 6) = 0.148761 fft 5: mflops = 31.3863 (norm. = 0.294154), norm. avg. (of 6) = 0.174676 fft 6: mflops = 22.8098 (norm. = 0.213775), norm. avg. (of 6) = 0.222405 fft 7: mflops = 26.8294 (norm. = 0.251447), norm. avg. (of 6) = 0.335102 fft 8: mflops = 59.6872 (norm. = 0.559391), norm. avg. (of 6) = 0.393266 fft 9: mflops = 59.7922 (norm. = 0.560375), norm. avg. (of 6) = 0.402843 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=1.02148 s, 1 iters, t-(init.)=0.930664 s t(norm)=0.0443775, mflops=112.67 (err=2.0e-15) 1. HARM: elapsed time t=2.03906 s, 1 iters, t-(init.)=1.94727 s t(norm)=0.0928529, mflops=53.8486 (err=2.0e-15) 2. HARM (f2c): elapsed time t=2.1084 s, 1 iters, t-(init.)=2.0166 s t(norm)=0.0961591, mflops=51.9972 (err=2.0e-15) 3. NR (C): elapsed time t=6.24316 s, 1 iters, t-(init.)=6.15137 s t(norm)=0.29332, mflops=17.0462 (err=4.6e-15) 4. PDA: elapsed time t=3.82324 s, 1 iters, t-(init.)=3.7334 s t(norm)=0.178022, mflops=28.0864 (err=2.0e-15) 5. 
PDA (f2c): elapsed time t=3.34766 s, 1 iters, t-(init.)=3.25684 s t(norm)=0.155298, mflops=32.1962 (err=2.0e-15) 6. Singleton: elapsed time t=4.4834 s, 1 iters, t-(init.)=4.3916 s t(norm)=0.209408, mflops=23.8768 (err=2.8e-15) 7. Singleton (f2c): elapsed time t=3.7998 s, 1 iters, t-(init.)=3.70801 s t(norm)=0.176812, mflops=28.2787 (err=2.8e-15) 8. Skipping fft (Temperton can't handle dimensions > 256). 9. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 112.67 Normalized results and averages for N=1048576: fft 0: mflops = 112.67 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 53.8486 (norm. = 0.477934), norm. avg. (of 6) = 0.413631 fft 2: mflops = 51.9972 (norm. = 0.461501), norm. avg. (of 6) = 0.39622 fft 3: mflops = 17.0462 (norm. = 0.151294), norm. avg. (of 7) = 0.212886 fft 4: mflops = 28.0864 (norm. = 0.249281), norm. avg. (of 7) = 0.163121 fft 5: mflops = 32.1962 (norm. = 0.285757), norm. avg. (of 7) = 0.190545 fft 6: mflops = 23.8768 (norm. = 0.211919), norm. avg. (of 7) = 0.220907 fft 7: mflops = 28.2787 (norm. = 0.250988), norm. avg. (of 7) = 0.323085 fft 8: mflops = -1 (norm. = -0.0088755), norm. avg. (of 6) = 0.393266 fft 9: mflops = -1 (norm. = -0.0088755), norm. avg. (of 6) = 0.402843 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=2.33203 s, 1 iters, t-(init.)=2.15039 s t(norm)=0.0488279, mflops=102.4 (err=7.4e-16) 1. HARM: elapsed time t=4.71875 s, 1 iters, t-(init.)=4.53613 s t(norm)=0.103, mflops=48.5438 (err=6.8e-16) 2. HARM (f2c): elapsed time t=4.8623 s, 1 iters, t-(init.)=4.67969 s t(norm)=0.106259, mflops=47.0546 (err=6.8e-16) 3. NR (C): elapsed time t=13.8027 s, 1 iters, t-(init.)=13.6201 s t(norm)=0.309266, mflops=16.1673 (err=4.9e-15) 4. PDA: elapsed time t=8.5127 s, 1 iters, t-(init.)=8.33008 s t(norm)=0.189147, mflops=26.4344 (err=7.1e-16) 5. PDA (f2c): elapsed time t=7.22363 s, 1 iters, t-(init.)=7.04199 s t(norm)=0.159899, mflops=31.2697 (err=7.1e-16) 6. 
Singleton: elapsed time t=13.0381 s, 1 iters, t-(init.)=12.8555 s t(norm)=0.291903, mflops=17.129 (err=8.4e-16)
7. Singleton (f2c): elapsed time t=11.4492 s, 1 iters, t-(init.)=11.2666 s t(norm)=0.255825, mflops=19.5446 (err=8.4e-16)
8. Temperton: elapsed time t=5.93945 s, 1 iters, t-(init.)=5.75781 s t(norm)=0.13074, mflops=38.2439 (err=7.5e-16)
9. Temperton (f2c): elapsed time t=5.9248 s, 1 iters, t-(init.)=5.74219 s t(norm)=0.130385, mflops=38.3479 (err=7.5e-16)
Top mflops for N=2097152 = 102.4
Normalized results and averages for N=2097152:
fft 0: mflops = 102.4 (norm. = 1), norm. avg. (of 8) = 1
fft 1: mflops = 48.5438 (norm. = 0.474058), norm. avg. (of 7) = 0.422264
fft 2: mflops = 47.0546 (norm. = 0.459516), norm. avg. (of 7) = 0.405262
fft 3: mflops = 16.1673 (norm. = 0.157883), norm. avg. (of 8) = 0.20601
fft 4: mflops = 26.4344 (norm. = 0.258148), norm. avg. (of 8) = 0.174999
fft 5: mflops = 31.2697 (norm. = 0.305367), norm. avg. (of 8) = 0.204897
fft 6: mflops = 17.129 (norm. = 0.167274), norm. avg. (of 8) = 0.214203
fft 7: mflops = 19.5446 (norm. = 0.190864), norm. avg. (of 8) = 0.306558
fft 8: mflops = 38.2439 (norm. = 0.373474), norm. avg. (of 7) = 0.390438
fft 9: mflops = 38.3479 (norm. = 0.37449), norm. avg. (of 7) = 0.398792
------------------------------------------------------
@@@@ bench.3d.np2.log
Benchmarking for sizes:
5x5x5 (0.0022583 MB)
6x6x6 (0.00369263 MB)
7x7x7 (0.00567627 MB)
9x9x9 (0.0116577 MB)
10x10x10 (0.0158386 MB)
11x11x11 (0.0209351 MB)
12x12x12 (0.0270386 MB)
13x13x13 (0.0342407 MB)
14x14x14 (0.0426331 MB)
15x15x15 (0.0523071 MB)
24x25x28 (0.257751 MB)
48x48x48 (1.68982 MB)
49x49x49 (1.79755 MB)
60x60x60 (3.29877 MB)
72x60x56 (3.69482 MB)
75x75x75 (6.44086 MB)
80x80x80 (7.81628 MB)
84x84x84 (9.04791 MB)
96x96x96 (13.5045 MB)
105x105x105 (17.6689 MB)
112x112x112 (21.4427 MB)
120x120x120 (26.3728 MB)
144x144x144 (45.5692 MB)
Maximum array size N = 2985984
Benchmarking FFTs:
0. FFTW
1. PDA
2. PDA (f2c)
3. Singleton
4.
Singleton (f2c) 5. Temperton 6. Temperton (f2c) Computing normalized averages (7 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1.3125 s, 65536 iters, t-(init.)=1.24707 s t(norm)=0.021854, mflops=228.791 (err=2.6e-16) 1. PDA: elapsed time t=1.6543 s, 8192 iters, t-(init.)=1.64551 s t(norm)=0.230691, mflops=21.674 (err=2.9e-16) 2. PDA (f2c): elapsed time t=1.52734 s, 8192 iters, t-(init.)=1.51953 s t(norm)=0.213029, mflops=23.4709 (err=2.9e-16) 3. Singleton: elapsed time t=1.01855 s, 32768 iters, t-(init.)=0.985352 s t(norm)=0.0345351, mflops=144.78 (err=3.0e-16) 4. Singleton (f2c): elapsed time t=1.25391 s, 65536 iters, t-(init.)=1.18848 s t(norm)=0.0208272, mflops=240.071 (err=3.0e-16) 5. Temperton: elapsed time t=1.3418 s, 32768 iters, t-(init.)=1.30957 s t(norm)=0.0458985, mflops=108.936 (err=5.2e-16) 6. Temperton (f2c): elapsed time t=1.39844 s, 32768 iters, t-(init.)=1.36621 s t(norm)=0.0478837, mflops=104.42 (err=2.5e-16) Top mflops for N=125 = 240.071 Normalized results and averages for N=125: fft 0: mflops = 228.791 (norm. = 0.953015), norm. avg. (of 1) = 0.953015 fft 1: mflops = 21.674 (norm. = 0.0902819), norm. avg. (of 1) = 0.0902819 fft 2: mflops = 23.4709 (norm. = 0.0977667), norm. avg. (of 1) = 0.0977667 fft 3: mflops = 144.78 (norm. = 0.603072), norm. avg. (of 1) = 0.603072 fft 4: mflops = 240.071 (norm. = 1), norm. avg. (of 1) = 1 fft 5: mflops = 108.936 (norm. = 0.453766), norm. avg. (of 1) = 0.453766 fft 6: mflops = 104.42 (norm. = 0.434954), norm. avg. (of 1) = 0.434954 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.64551 s, 65536 iters, t-(init.)=1.53516 s t(norm)=0.0139844, mflops=357.542 (err=3.5e-16) 1. PDA: elapsed time t=1.41602 s, 4096 iters, t-(init.)=1.4082 s t(norm)=0.205247, mflops=24.3609 (err=4.4e-16) 2. PDA (f2c): elapsed time t=1.3916 s, 4096 iters, t-(init.)=1.38477 s t(norm)=0.201831, mflops=24.7732 (err=4.4e-16) 3. 
Singleton: elapsed time t=1.14258 s, 8192 iters, t-(init.)=1.12891 s t(norm)=0.0822695, mflops=60.7759 (err=3.0e-16) 4. Singleton (f2c): elapsed time t=1.30566 s, 16384 iters, t-(init.)=1.27832 s t(norm)=0.046579, mflops=107.344 (err=3.0e-16) 5. Temperton: elapsed time t=1.62988 s, 16384 iters, t-(init.)=1.60254 s t(norm)=0.0583928, mflops=85.627 (err=2.0e-15) 6. Temperton (f2c): elapsed time t=1.61035 s, 16384 iters, t-(init.)=1.58301 s t(norm)=0.0576811, mflops=86.6834 (err=2.8e-16) Top mflops for N=216 = 357.542 Normalized results and averages for N=216: fft 0: mflops = 357.542 (norm. = 1), norm. avg. (of 2) = 0.976507 fft 1: mflops = 24.3609 (norm. = 0.0681345), norm. avg. (of 2) = 0.0792082 fft 2: mflops = 24.7732 (norm. = 0.0692877), norm. avg. (of 2) = 0.0835272 fft 3: mflops = 60.7759 (norm. = 0.169983), norm. avg. (of 2) = 0.386528 fft 4: mflops = 107.344 (norm. = 0.300229), norm. avg. (of 2) = 0.650115 fft 5: mflops = 85.627 (norm. = 0.239488), norm. avg. (of 2) = 0.346627 fft 6: mflops = 86.6834 (norm. = 0.242443), norm. avg. (of 2) = 0.338698 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.35059 s, 16384 iters, t-(init.)=1.30762 s t(norm)=0.0276279, mflops=180.976 (err=4.2e-16) 1. PDA: elapsed time t=1.02051 s, 1024 iters, t-(init.)=1.01758 s t(norm)=0.343997, mflops=14.535 (err=5.0e-16) 2. PDA (f2c): elapsed time t=1.02344 s, 1024 iters, t-(init.)=1.02051 s t(norm)=0.344988, mflops=14.4933 (err=5.0e-16) 3. Singleton: elapsed time t=1.21777 s, 4096 iters, t-(init.)=1.20703 s t(norm)=0.102011, mflops=49.0145 (err=6.2e-16) 4. Singleton (f2c): elapsed time t=1.63477 s, 8192 iters, t-(init.)=1.61328 s t(norm)=0.0681722, mflops=73.3437 (err=6.2e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 180.976 Normalized results and averages for N=343: fft 0: mflops = 180.976 (norm. = 1), norm. avg. (of 3) = 0.984338 fft 1: mflops = 14.535 (norm. 
= 0.0803143), norm. avg. (of 3) = 0.0795769 fft 2: mflops = 14.4933 (norm. = 0.0800837), norm. avg. (of 3) = 0.0823794 fft 3: mflops = 49.0145 (norm. = 0.270833), norm. avg. (of 3) = 0.347963 fft 4: mflops = 73.3437 (norm. = 0.405266), norm. avg. (of 3) = 0.568499 fft 5: mflops = -1 (norm. = -0.00552558), norm. avg. (of 2) = 0.346627 fft 6: mflops = -1 (norm. = -0.00552558), norm. avg. (of 2) = 0.338698 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.03613 s, 8192 iters, t-(init.)=0.990234 s t(norm)=0.0174361, mflops=286.761 (err=4.6e-16) 1. PDA: elapsed time t=1.07422 s, 1024 iters, t-(init.)=1.06836 s t(norm)=0.150494, mflops=33.2239 (err=4.5e-16) 2. PDA (f2c): elapsed time t=1.04883 s, 1024 iters, t-(init.)=1.04297 s t(norm)=0.146918, mflops=34.0327 (err=4.5e-16) 3. Singleton: elapsed time t=1.97559 s, 4096 iters, t-(init.)=1.95215 s t(norm)=0.0687472, mflops=72.7302 (err=3.9e-16) 4. Singleton (f2c): elapsed time t=1.09375 s, 4096 iters, t-(init.)=1.07129 s t(norm)=0.0377267, mflops=132.532 (err=3.9e-16) 5. Temperton: elapsed time t=1.48047 s, 4096 iters, t-(init.)=1.45801 s t(norm)=0.0513455, mflops=97.3796 (err=3.5e-15) 6. Temperton (f2c): elapsed time t=1.30469 s, 4096 iters, t-(init.)=1.28223 s t(norm)=0.0451551, mflops=110.729 (err=4.3e-16) Top mflops for N=729 = 286.761 Normalized results and averages for N=729: fft 0: mflops = 286.761 (norm. = 1), norm. avg. (of 4) = 0.988254 fft 1: mflops = 33.2239 (norm. = 0.115859), norm. avg. (of 4) = 0.0886475 fft 2: mflops = 34.0327 (norm. = 0.11868), norm. avg. (of 4) = 0.0914545 fft 3: mflops = 72.7302 (norm. = 0.253627), norm. avg. (of 4) = 0.324379 fft 4: mflops = 132.532 (norm. = 0.46217), norm. avg. (of 4) = 0.541916 fft 5: mflops = 97.3796 (norm. = 0.339585), norm. avg. (of 3) = 0.34428 fft 6: mflops = 110.729 (norm. = 0.386139), norm. avg. (of 3) = 0.354512 Benchmarking for array size = 10x10x10: 0. 
FFTW: elapsed time t=1.65918 s, 8192 iters, t-(init.)=1.59766 s t(norm)=0.0195696, mflops=255.498 (err=3.5e-16) 1. PDA: elapsed time t=1.56641 s, 1024 iters, t-(init.)=1.55859 s t(norm)=0.152729, mflops=32.7377 (err=4.1e-16) 2. PDA (f2c): elapsed time t=1.47363 s, 1024 iters, t-(init.)=1.46582 s t(norm)=0.143638, mflops=34.8097 (err=4.1e-16) 3. Singleton: elapsed time t=1.37793 s, 2048 iters, t-(init.)=1.3623 s t(norm)=0.0667472, mflops=74.9095 (err=4.1e-16) 4. Singleton (f2c): elapsed time t=1.58984 s, 4096 iters, t-(init.)=1.55859 s t(norm)=0.0381822, mflops=130.951 (err=4.1e-16) 5. Temperton: elapsed time t=1.78027 s, 4096 iters, t-(init.)=1.74902 s t(norm)=0.0428474, mflops=116.693 (err=5.9e-16) 6. Temperton (f2c): elapsed time t=1.7627 s, 4096 iters, t-(init.)=1.73145 s t(norm)=0.0424167, mflops=117.878 (err=3.2e-16) Top mflops for N=1000 = 255.498 Normalized results and averages for N=1000: fft 0: mflops = 255.498 (norm. = 1), norm. avg. (of 5) = 0.990603 fft 1: mflops = 32.7377 (norm. = 0.128133), norm. avg. (of 5) = 0.0965446 fft 2: mflops = 34.8097 (norm. = 0.136243), norm. avg. (of 5) = 0.100412 fft 3: mflops = 74.9095 (norm. = 0.29319), norm. avg. (of 5) = 0.318141 fft 4: mflops = 130.951 (norm. = 0.512531), norm. avg. (of 5) = 0.536039 fft 5: mflops = 116.693 (norm. = 0.456728), norm. avg. (of 4) = 0.372392 fft 6: mflops = 117.878 (norm. = 0.461365), norm. avg. (of 4) = 0.381225 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.64258 s, 2048 iters, t-(init.)=1.62207 s t(norm)=0.0573371, mflops=87.2036 (err=4.2e-16) 1. PDA: elapsed time t=1.28223 s, 256 iters, t-(init.)=1.28027 s t(norm)=0.362042, mflops=13.8106 (err=5.3e-16) 2. PDA (f2c): elapsed time t=1.23242 s, 256 iters, t-(init.)=1.22949 s t(norm)=0.347682, mflops=14.381 (err=5.3e-16) 3. Singleton: elapsed time t=1.65332 s, 1024 iters, t-(init.)=1.64355 s t(norm)=0.116193, mflops=43.0318 (err=6.3e-16) 4. 
Singleton (f2c): elapsed time t=1.12793 s, 1024 iters, t-(init.)=1.11816 s t(norm)=0.0790499, mflops=63.2512 (err=6.3e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 87.2036 Normalized results and averages for N=1331: fft 0: mflops = 87.2036 (norm. = 1), norm. avg. (of 6) = 0.992169 fft 1: mflops = 13.8106 (norm. = 0.158371), norm. avg. (of 6) = 0.106849 fft 2: mflops = 14.381 (norm. = 0.164913), norm. avg. (of 6) = 0.111162 fft 3: mflops = 43.0318 (norm. = 0.493464), norm. avg. (of 6) = 0.347362 fft 4: mflops = 63.2512 (norm. = 0.725328), norm. avg. (of 6) = 0.567587 fft 5: mflops = -1 (norm. = -0.0114674), norm. avg. (of 4) = 0.372392 fft 6: mflops = -1 (norm. = -0.0114674), norm. avg. (of 4) = 0.381225 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.85645 s, 8192 iters, t-(init.)=1.74902 s t(norm)=0.0114883, mflops=435.225 (err=4.3e-16) 1. PDA: elapsed time t=1.22559 s, 512 iters, t-(init.)=1.21875 s t(norm)=0.128084, mflops=39.0369 (err=3.7e-16) 2. PDA (f2c): elapsed time t=1.14648 s, 512 iters, t-(init.)=1.13965 s t(norm)=0.119771, mflops=41.7464 (err=3.7e-16) 3. Singleton: elapsed time t=1.57129 s, 1024 iters, t-(init.)=1.55762 s t(norm)=0.0818486, mflops=61.0884 (err=4.0e-16) 4. Singleton (f2c): elapsed time t=1.63672 s, 2048 iters, t-(init.)=1.60938 s t(norm)=0.0422842, mflops=118.248 (err=4.0e-16) 5. Temperton: elapsed time t=1.53223 s, 2048 iters, t-(init.)=1.50488 s t(norm)=0.0395388, mflops=126.458 (err=1.7e-15) 6. Temperton (f2c): elapsed time t=1.47266 s, 2048 iters, t-(init.)=1.44629 s t(norm)=0.0379993, mflops=131.581 (err=3.8e-16) Top mflops for N=1728 = 435.225 Normalized results and averages for N=1728: fft 0: mflops = 435.225 (norm. = 1), norm. avg. (of 7) = 0.993288 fft 1: mflops = 39.0369 (norm. = 0.0896935), norm. avg. (of 7) = 0.104398 fft 2: mflops = 41.7464 (norm. = 0.095919), norm. avg. 
(of 7) = 0.108985 fft 3: mflops = 61.0884 (norm. = 0.140361), norm. avg. (of 7) = 0.31779 fft 4: mflops = 118.248 (norm. = 0.271693), norm. avg. (of 7) = 0.525317 fft 5: mflops = 126.458 (norm. = 0.290558), norm. avg. (of 5) = 0.356025 fft 6: mflops = 131.581 (norm. = 0.30233), norm. avg. (of 5) = 0.365446 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.64746 s, 1024 iters, t-(init.)=1.63086 s t(norm)=0.0652998, mflops=76.5699 (err=4.8e-16) 1. PDA: elapsed time t=1.16699 s, 128 iters, t-(init.)=1.16504 s t(norm)=0.373186, mflops=13.3981 (err=9.3e-16) 2. PDA (f2c): elapsed time t=1.125 s, 128 iters, t-(init.)=1.12305 s t(norm)=0.359735, mflops=13.8991 (err=9.3e-16) 3. Singleton: elapsed time t=1.56934 s, 512 iters, t-(init.)=1.56055 s t(norm)=0.124969, mflops=40.0099 (err=7.5e-16) 4. Singleton (f2c): elapsed time t=1.07617 s, 512 iters, t-(init.)=1.06836 s t(norm)=0.0855545, mflops=58.4423 (err=7.5e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 76.5699 Normalized results and averages for N=2197: fft 0: mflops = 76.5699 (norm. = 1), norm. avg. (of 8) = 0.994127 fft 1: mflops = 13.3981 (norm. = 0.174979), norm. avg. (of 8) = 0.113221 fft 2: mflops = 13.8991 (norm. = 0.181522), norm. avg. (of 8) = 0.118052 fft 3: mflops = 40.0099 (norm. = 0.522528), norm. avg. (of 8) = 0.343382 fft 4: mflops = 58.4423 (norm. = 0.763254), norm. avg. (of 8) = 0.555059 fft 5: mflops = -1 (norm. = -0.01306), norm. avg. (of 5) = 0.356025 fft 6: mflops = -1 (norm. = -0.01306), norm. avg. (of 5) = 0.365446 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.1543 s, 1024 iters, t-(init.)=1.13281 s t(norm)=0.0352963, mflops=141.658 (err=3.9e-16) 1. PDA: elapsed time t=1.97852 s, 256 iters, t-(init.)=1.97363 s t(norm)=0.245979, mflops=20.327 (err=4.5e-16) 2. 
PDA (f2c): elapsed time t=1.84375 s, 256 iters, t-(init.)=1.83887 s t(norm)=0.229183, mflops=21.8167 (err=4.5e-16) 3. Singleton: elapsed time t=1.83887 s, 512 iters, t-(init.)=1.82813 s t(norm)=0.113922, mflops=43.8897 (err=5.2e-16) 4. Singleton (f2c): elapsed time t=1.13867 s, 512 iters, t-(init.)=1.12793 s t(norm)=0.0702883, mflops=71.1355 (err=5.2e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 141.658 Normalized results and averages for N=2744: fft 0: mflops = 141.658 (norm. = 1), norm. avg. (of 9) = 0.994779 fft 1: mflops = 20.327 (norm. = 0.143493), norm. avg. (of 9) = 0.116584 fft 2: mflops = 21.8167 (norm. = 0.15401), norm. avg. (of 9) = 0.122047 fft 3: mflops = 43.8897 (norm. = 0.309829), norm. avg. (of 9) = 0.339654 fft 4: mflops = 71.1355 (norm. = 0.502165), norm. avg. (of 9) = 0.549182 fft 5: mflops = -1 (norm. = -0.00705926), norm. avg. (of 5) = 0.356025 fft 6: mflops = -1 (norm. = -0.00705926), norm. avg. (of 5) = 0.365446 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.3418 s, 2048 iters, t-(init.)=1.28906 s t(norm)=0.0159117, mflops=314.233 (err=4.9e-16) 1. PDA: elapsed time t=1.26563 s, 256 iters, t-(init.)=1.25879 s t(norm)=0.124304, mflops=40.2238 (err=4.8e-16) 2. PDA (f2c): elapsed time t=1.17969 s, 256 iters, t-(init.)=1.17285 s t(norm)=0.115818, mflops=43.1711 (err=4.8e-16) 3. Singleton: elapsed time t=1.51563 s, 512 iters, t-(init.)=1.50293 s t(norm)=0.0742066, mflops=67.3795 (err=5.9e-16) 4. Singleton (f2c): elapsed time t=1.66699 s, 1024 iters, t-(init.)=1.64063 s t(norm)=0.0405026, mflops=123.449 (err=5.9e-16) 5. Temperton: elapsed time t=1.64844 s, 1024 iters, t-(init.)=1.62207 s t(norm)=0.0400446, mflops=124.861 (err=2.0e-15) 6. 
Temperton (f2c): elapsed time t=1.5957 s, 1024 iters, t-(init.)=1.56934 s t(norm)=0.0387427, mflops=129.057 (err=4.7e-16) Top mflops for N=3375 = 314.233 Normalized results and averages for N=3375: fft 0: mflops = 314.233 (norm. = 1), norm. avg. (of 10) = 0.995301 fft 1: mflops = 40.2238 (norm. = 0.128006), norm. avg. (of 10) = 0.117727 fft 2: mflops = 43.1711 (norm. = 0.137386), norm. avg. (of 10) = 0.123581 fft 3: mflops = 67.3795 (norm. = 0.214425), norm. avg. (of 10) = 0.327131 fft 4: mflops = 123.449 (norm. = 0.392857), norm. avg. (of 10) = 0.533549 fft 5: mflops = 124.861 (norm. = 0.397351), norm. avg. (of 6) = 0.362913 fft 6: mflops = 129.057 (norm. = 0.410703), norm. avg. (of 6) = 0.372989 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.03711 s, 128 iters, t-(init.)=0.973633 s t(norm)=0.0322572, mflops=155.004 (err=4.8e-16) 1. PDA: elapsed time t=1.43457 s, 32 iters, t-(init.)=1.41797 s t(norm)=0.187914, mflops=26.6079 (err=4.6e-16) 2. PDA (f2c): elapsed time t=1.03809 s, 32 iters, t-(init.)=1.02246 s t(norm)=0.1355, mflops=36.9004 (err=4.6e-16) 3. Singleton: elapsed time t=1.66699 s, 64 iters, t-(init.)=1.63477 s t(norm)=0.108322, mflops=46.1586 (err=5.9e-16) 4. Singleton (f2c): elapsed time t=1.06152 s, 64 iters, t-(init.)=1.0293 s t(norm)=0.0682029, mflops=73.3107 (err=5.9e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 155.004 Normalized results and averages for N=16800: fft 0: mflops = 155.004 (norm. = 1), norm. avg. (of 11) = 0.995729 fft 1: mflops = 26.6079 (norm. = 0.17166), norm. avg. (of 11) = 0.12263 fft 2: mflops = 36.9004 (norm. = 0.238061), norm. avg. (of 11) = 0.133988 fft 3: mflops = 46.1586 (norm. = 0.29779), norm. avg. (of 11) = 0.324464 fft 4: mflops = 73.3107 (norm. = 0.47296), norm. avg. (of 11) = 0.528041 fft 5: mflops = -1 (norm. = -0.00645145), norm. avg. (of 6) = 0.362913 fft 6: mflops = -1 (norm. = -0.00645145), norm. 
avg. (of 6) = 0.372989 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.08984 s, 16 iters, t-(init.)=0.979492 s t(norm)=0.0330382, mflops=151.34 (err=6.8e-16) 1. PDA: elapsed time t=1.81445 s, 4 iters, t-(init.)=1.78711 s t(norm)=0.241116, mflops=20.7369 (err=6.2e-16) 2. PDA (f2c): elapsed time t=1.03027 s, 4 iters, t-(init.)=1.00391 s t(norm)=0.135447, mflops=36.9149 (err=6.2e-16) 3. Singleton: elapsed time t=1.20801 s, 4 iters, t-(init.)=1.18066 s t(norm)=0.159295, mflops=31.3884 (err=6.7e-16) 4. Singleton (f2c): elapsed time t=1.86426 s, 8 iters, t-(init.)=1.80957 s t(norm)=0.122073, mflops=40.959 (err=6.7e-16) 5. Temperton: elapsed time t=1.05859 s, 8 iters, t-(init.)=1.00391 s t(norm)=0.0677233, mflops=73.8299 (err=1.8e-15) 6. Temperton (f2c): elapsed time t=1.05469 s, 8 iters, t-(init.)=1 s t(norm)=0.0674598, mflops=74.1183 (err=7.1e-16) Top mflops for N=110592 = 151.34 Normalized results and averages for N=110592: fft 0: mflops = 151.34 (norm. = 1), norm. avg. (of 12) = 0.996085 fft 1: mflops = 20.7369 (norm. = 0.137022), norm. avg. (of 12) = 0.123829 fft 2: mflops = 36.9149 (norm. = 0.24392), norm. avg. (of 12) = 0.143149 fft 3: mflops = 31.3884 (norm. = 0.207403), norm. avg. (of 12) = 0.314709 fft 4: mflops = 40.959 (norm. = 0.270642), norm. avg. (of 12) = 0.506591 fft 5: mflops = 73.8299 (norm. = 0.48784), norm. avg. (of 7) = 0.380759 fft 6: mflops = 74.1183 (norm. = 0.489746), norm. avg. (of 7) = 0.389668 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.61035 s, 16 iters, t-(init.)=1.49023 s t(norm)=0.047, mflops=106.383 (err=6.2e-16) 1. PDA: elapsed time t=1.06641 s, 2 iters, t-(init.)=1.05176 s t(norm)=0.265368, mflops=18.8417 (err=7.1e-16) 2. PDA (f2c): elapsed time t=1.00195 s, 2 iters, t-(init.)=0.987305 s t(norm)=0.249106, mflops=20.0718 (err=7.1e-16) 3. Singleton: elapsed time t=1.4541 s, 4 iters, t-(init.)=1.4248 s t(norm)=0.179746, mflops=27.8171 (err=1.0e-15) 4. 
Singleton (f2c): elapsed time t=1.01953 s, 4 iters, t-(init.)=0.989258 s t(norm)=0.124799, mflops=40.0643 (err=1.0e-15) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 106.383 Normalized results and averages for N=117649: fft 0: mflops = 106.383 (norm. = 1), norm. avg. (of 13) = 0.996386 fft 1: mflops = 18.8417 (norm. = 0.177112), norm. avg. (of 13) = 0.127928 fft 2: mflops = 20.0718 (norm. = 0.188675), norm. avg. (of 13) = 0.146651 fft 3: mflops = 27.8171 (norm. = 0.26148), norm. avg. (of 13) = 0.310614 fft 4: mflops = 40.0643 (norm. = 0.376604), norm. avg. (of 13) = 0.496592 fft 5: mflops = -1 (norm. = -0.0094), norm. avg. (of 7) = 0.380759 fft 6: mflops = -1 (norm. = -0.0094), norm. avg. (of 7) = 0.389668 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.12793 s, 8 iters, t-(init.)=0.993164 s t(norm)=0.0324337, mflops=154.16 (err=7.5e-16) 1. PDA: elapsed time t=1.37891 s, 2 iters, t-(init.)=1.3457 s t(norm)=0.175786, mflops=28.4436 (err=7.4e-16) 2. PDA (f2c): elapsed time t=1.98828 s, 4 iters, t-(init.)=1.92188 s t(norm)=0.125525, mflops=39.8326 (err=7.4e-16) 3. Singleton: elapsed time t=1.52734 s, 2 iters, t-(init.)=1.49512 s t(norm)=0.195304, mflops=25.6011 (err=9.9e-16) 4. Singleton (f2c): elapsed time t=1.20215 s, 2 iters, t-(init.)=1.1709 s t(norm)=0.152952, mflops=32.69 (err=9.9e-16) 5. Temperton: elapsed time t=1.12793 s, 4 iters, t-(init.)=1.06152 s t(norm)=0.0693323, mflops=72.1165 (err=2.1e-15) 6. Temperton (f2c): elapsed time t=1.11914 s, 4 iters, t-(init.)=1.05273 s t(norm)=0.0687583, mflops=72.7185 (err=7.3e-16) Top mflops for N=216000 = 154.16 Normalized results and averages for N=216000: fft 0: mflops = 154.16 (norm. = 1), norm. avg. (of 14) = 0.996644 fft 1: mflops = 28.4436 (norm. = 0.184507), norm. avg. (of 14) = 0.131969 fft 2: mflops = 39.8326 (norm. = 0.258384), norm. avg. (of 14) = 0.154632 fft 3: mflops = 25.6011 (norm. 
= 0.166068), norm. avg. (of 14) = 0.300289 fft 4: mflops = 32.69 (norm. = 0.212052), norm. avg. (of 14) = 0.476268 fft 5: mflops = 72.1165 (norm. = 0.467801), norm. avg. (of 8) = 0.39164 fft 6: mflops = 72.7185 (norm. = 0.471707), norm. avg. (of 8) = 0.399923 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.35449 s, 8 iters, t-(init.)=1.20117 s t(norm)=0.0347036, mflops=144.077 (err=7.1e-16) 1. PDA: elapsed time t=1.5332 s, 2 iters, t-(init.)=1.49707 s t(norm)=0.17301, mflops=28.9 (err=7.8e-16) 2. PDA (f2c): elapsed time t=1.27539 s, 2 iters, t-(init.)=1.23828 s t(norm)=0.143103, mflops=34.9399 (err=7.8e-16) 3. Singleton: elapsed time t=1.85645 s, 2 iters, t-(init.)=1.81934 s t(norm)=0.210253, mflops=23.7809 (err=9.3e-16) 4. Singleton (f2c): elapsed time t=1.48633 s, 2 iters, t-(init.)=1.44922 s t(norm)=0.16748, mflops=29.8543 (err=9.3e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 144.077 Normalized results and averages for N=241920: fft 0: mflops = 144.077 (norm. = 1), norm. avg. (of 15) = 0.996868 fft 1: mflops = 28.9 (norm. = 0.200587), norm. avg. (of 15) = 0.136544 fft 2: mflops = 34.9399 (norm. = 0.242508), norm. avg. (of 15) = 0.16049 fft 3: mflops = 23.7809 (norm. = 0.165056), norm. avg. (of 15) = 0.291274 fft 4: mflops = 29.8543 (norm. = 0.20721), norm. avg. (of 15) = 0.458331 fft 5: mflops = -1 (norm. = -0.00694072), norm. avg. (of 8) = 0.39164 fft 6: mflops = -1 (norm. = -0.00694072), norm. avg. (of 8) = 0.399923 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.27832 s, 4 iters, t-(init.)=1.13672 s t(norm)=0.0360481, mflops=138.704 (err=7.2e-16) 1. PDA: elapsed time t=1.3584 s, 1 iters, t-(init.)=1.32422 s t(norm)=0.167977, mflops=29.766 (err=7.2e-16) 2. PDA (f2c): elapsed time t=1.01758 s, 1 iters, t-(init.)=0.983398 s t(norm)=0.124744, mflops=40.0822 (err=7.2e-16) 3. 
Singleton: elapsed time t=1.41113 s, 1 iters, t-(init.)=1.37598 s t(norm)=0.174542, mflops=28.6464 (err=9.2e-16) 4. Singleton (f2c): elapsed time t=1.11719 s, 1 iters, t-(init.)=1.08203 s t(norm)=0.137255, mflops=36.4285 (err=9.2e-16) 5. Temperton: elapsed time t=1.05664 s, 2 iters, t-(init.)=0.986328 s t(norm)=0.0625577, mflops=79.9262 (err=2.3e-15) 6. Temperton (f2c): elapsed time t=1.04395 s, 2 iters, t-(init.)=0.973633 s t(norm)=0.0617525, mflops=80.9684 (err=9.4e-16) Top mflops for N=421875 = 138.704 Normalized results and averages for N=421875: fft 0: mflops = 138.704 (norm. = 1), norm. avg. (of 16) = 0.997063 fft 1: mflops = 29.766 (norm. = 0.214602), norm. avg. (of 16) = 0.141422 fft 2: mflops = 40.0822 (norm. = 0.288977), norm. avg. (of 16) = 0.168521 fft 3: mflops = 28.6464 (norm. = 0.206529), norm. avg. (of 16) = 0.285977 fft 4: mflops = 36.4285 (norm. = 0.262635), norm. avg. (of 16) = 0.4461 fft 5: mflops = 79.9262 (norm. = 0.576238), norm. avg. (of 9) = 0.412151 fft 6: mflops = 80.9684 (norm. = 0.583751), norm. avg. (of 9) = 0.420349 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.66211 s, 4 iters, t-(init.)=1.48828 s t(norm)=0.0383164, mflops=130.493 (err=6.4e-16) 1. PDA: elapsed time t=1.50781 s, 1 iters, t-(init.)=1.46582 s t(norm)=0.150952, mflops=33.123 (err=5.9e-16) 2. PDA (f2c): elapsed time t=1.32715 s, 1 iters, t-(init.)=1.28418 s t(norm)=0.132247, mflops=37.8081 (err=5.9e-16) 3. Singleton: elapsed time t=1.88379 s, 1 iters, t-(init.)=1.84082 s t(norm)=0.18957, mflops=26.3754 (err=8.3e-16) 4. Singleton (f2c): elapsed time t=1.54102 s, 1 iters, t-(init.)=1.49805 s t(norm)=0.154271, mflops=32.4105 (err=8.3e-16) 5. Temperton: elapsed time t=1.5752 s, 2 iters, t-(init.)=1.48926 s t(norm)=0.076683, mflops=65.2035 (err=8.0e-16) 6. 
Temperton (f2c): elapsed time t=1.57031 s, 2 iters, t-(init.)=1.48438 s t(norm)=0.0764316, mflops=65.418 (err=6.7e-16) Top mflops for N=512000 = 130.493 Normalized results and averages for N=512000: fft 0: mflops = 130.493 (norm. = 1), norm. avg. (of 17) = 0.997236 fft 1: mflops = 33.123 (norm. = 0.253831), norm. avg. (of 17) = 0.148035 fft 2: mflops = 37.8081 (norm. = 0.289734), norm. avg. (of 17) = 0.175651 fft 3: mflops = 26.3754 (norm. = 0.202122), norm. avg. (of 17) = 0.281045 fft 4: mflops = 32.4105 (norm. = 0.24837), norm. avg. (of 17) = 0.434469 fft 5: mflops = 65.2035 (norm. = 0.499672), norm. avg. (of 10) = 0.420903 fft 6: mflops = 65.418 (norm. = 0.501316), norm. avg. (of 10) = 0.428445 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.08496 s, 2 iters, t-(init.)=0.983398 s t(norm)=0.0432596, mflops=115.581 (err=6.8e-16) 1. PDA: elapsed time t=2.18945 s, 1 iters, t-(init.)=2.14063 s t(norm)=0.188332, mflops=26.5489 (err=6.8e-16) 2. PDA (f2c): elapsed time t=2.00684 s, 1 iters, t-(init.)=1.95801 s t(norm)=0.172265, mflops=29.0251 (err=6.8e-16) 3. Singleton: elapsed time t=2.82227 s, 1 iters, t-(init.)=2.77441 s t(norm)=0.244092, mflops=20.4841 (err=8.9e-16) 4. Singleton (f2c): elapsed time t=2.24707 s, 1 iters, t-(init.)=2.2002 s t(norm)=0.193573, mflops=25.8301 (err=8.9e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 115.581 Normalized results and averages for N=592704: fft 0: mflops = 115.581 (norm. = 1), norm. avg. (of 18) = 0.99739 fft 1: mflops = 26.5489 (norm. = 0.229699), norm. avg. (of 18) = 0.152571 fft 2: mflops = 29.0251 (norm. = 0.251122), norm. avg. (of 18) = 0.179844 fft 3: mflops = 20.4841 (norm. = 0.177226), norm. avg. (of 18) = 0.275277 fft 4: mflops = 25.8301 (norm. = 0.22348), norm. avg. (of 18) = 0.422747 fft 5: mflops = -1 (norm. = -0.00865191), norm. avg. (of 10) = 0.420903 fft 6: mflops = -1 (norm. = -0.00865191), norm. 
avg. (of 10) = 0.428445 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=1.55176 s, 2 iters, t-(init.)=1.39941 s t(norm)=0.0400339, mflops=124.894 (err=7.9e-16) 1. PDA: elapsed time t=2.77637 s, 1 iters, t-(init.)=2.7002 s t(norm)=0.154492, mflops=32.3641 (err=6.5e-16) 2. PDA (f2c): elapsed time t=2.59961 s, 1 iters, t-(init.)=2.52344 s t(norm)=0.144379, mflops=34.6311 (err=6.5e-16) 3. Singleton: elapsed time t=4.68164 s, 1 iters, t-(init.)=4.60742 s t(norm)=0.263615, mflops=18.9671 (err=6.9e-16) 4. Singleton (f2c): elapsed time t=4.02344 s, 1 iters, t-(init.)=3.9502 s t(norm)=0.226011, mflops=22.1228 (err=6.9e-16) 5. Temperton: elapsed time t=1.78027 s, 1 iters, t-(init.)=1.7041 s t(norm)=0.0975006, mflops=51.2817 (err=1.9e-15) 6. Temperton (f2c): elapsed time t=1.77441 s, 1 iters, t-(init.)=1.69922 s t(norm)=0.0972212, mflops=51.4291 (err=7.7e-16) Top mflops for N=884736 = 124.894 Normalized results and averages for N=884736: fft 0: mflops = 124.894 (norm. = 1), norm. avg. (of 19) = 0.997527 fft 1: mflops = 32.3641 (norm. = 0.259132), norm. avg. (of 19) = 0.15818 fft 2: mflops = 34.6311 (norm. = 0.277283), norm. avg. (of 19) = 0.184972 fft 3: mflops = 18.9671 (norm. = 0.151865), norm. avg. (of 19) = 0.268782 fft 4: mflops = 22.1228 (norm. = 0.177132), norm. avg. (of 19) = 0.40982 fft 5: mflops = 51.2817 (norm. = 0.410602), norm. avg. (of 11) = 0.419966 fft 6: mflops = 51.4291 (norm. = 0.411782), norm. avg. (of 11) = 0.42693 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=1.10742 s, 1 iters, t-(init.)=1.00684 s t(norm)=0.043179, mflops=115.797 (err=7.2e-16) 1. PDA: elapsed time t=5.3584 s, 1 iters, t-(init.)=5.25879 s t(norm)=0.225527, mflops=22.1703 (err=7.1e-16) 2. PDA (f2c): elapsed time t=4.11719 s, 1 iters, t-(init.)=4.0166 s t(norm)=0.172255, mflops=29.0267 (err=7.1e-16) 3. Singleton: elapsed time t=5.01465 s, 1 iters, t-(init.)=4.91504 s t(norm)=0.210785, mflops=23.7208 (err=8.0e-16) 4. 
Singleton (f2c): elapsed time t=3.93555 s, 1 iters, t-(init.)=3.83594 s t(norm)=0.164507, mflops=30.3938 (err=8.0e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 115.797 Normalized results and averages for N=1157625: fft 0: mflops = 115.797 (norm. = 1), norm. avg. (of 20) = 0.997651 fft 1: mflops = 22.1703 (norm. = 0.191458), norm. avg. (of 20) = 0.159844 fft 2: mflops = 29.0267 (norm. = 0.250669), norm. avg. (of 20) = 0.188257 fft 3: mflops = 23.7208 (norm. = 0.204848), norm. avg. (of 20) = 0.265585 fft 4: mflops = 30.3938 (norm. = 0.262475), norm. avg. (of 20) = 0.402453 fft 5: mflops = -1 (norm. = -0.0086358), norm. avg. (of 11) = 0.419966 fft 6: mflops = -1 (norm. = -0.0086358), norm. avg. (of 11) = 0.42693 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=1.5127 s, 1 iters, t-(init.)=1.39063 s t(norm)=0.0484681, mflops=103.161 (err=5.3e-16) 1. PDA: elapsed time t=6.56934 s, 1 iters, t-(init.)=6.44824 s t(norm)=0.224744, mflops=22.2476 (err=5.7e-16) 2. PDA (f2c): elapsed time t=5.19922 s, 1 iters, t-(init.)=5.07715 s t(norm)=0.176956, mflops=28.2556 (err=5.7e-16) 3. Singleton: elapsed time t=6.65039 s, 1 iters, t-(init.)=6.52734 s t(norm)=0.227501, mflops=21.978 (err=6.5e-16) 4. Singleton (f2c): elapsed time t=5.3916 s, 1 iters, t-(init.)=5.26855 s t(norm)=0.183628, mflops=27.229 (err=6.5e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 103.161 Normalized results and averages for N=1404928: fft 0: mflops = 103.161 (norm. = 1), norm. avg. (of 21) = 0.997763 fft 1: mflops = 22.2476 (norm. = 0.21566), norm. avg. (of 21) = 0.162502 fft 2: mflops = 28.2556 (norm. = 0.273899), norm. avg. (of 21) = 0.192335 fft 3: mflops = 21.978 (norm. = 0.213046), norm. avg. (of 21) = 0.263083 fft 4: mflops = 27.229 (norm. = 0.263948), norm. avg. 
(of 21) = 0.395857 fft 5: mflops = -1 (norm. = -0.00969363), norm. avg. (of 11) = 0.419966 fft 6: mflops = -1 (norm. = -0.00969363), norm. avg. (of 11) = 0.42693 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=1.40039 s, 1 iters, t-(init.)=1.25 s t(norm)=0.034911, mflops=143.221 (err=7.2e-16) 1. PDA: elapsed time t=5.53613 s, 1 iters, t-(init.)=5.38574 s t(norm)=0.150417, mflops=33.2408 (err=8.0e-16) 2. PDA (f2c): elapsed time t=4.56055 s, 1 iters, t-(init.)=4.41016 s t(norm)=0.12317, mflops=40.5942 (err=8.0e-16) 3. Singleton: elapsed time t=9.75488 s, 1 iters, t-(init.)=9.60547 s t(norm)=0.268269, mflops=18.638 (err=9.4e-16) 4. Singleton (f2c): elapsed time t=8.30957 s, 1 iters, t-(init.)=8.15918 s t(norm)=0.227876, mflops=21.9417 (err=9.4e-16) 5. Temperton: elapsed time t=3.34375 s, 1 iters, t-(init.)=3.19336 s t(norm)=0.0891867, mflops=56.0622 (err=2.0e-15) 6. Temperton (f2c): elapsed time t=3.31641 s, 1 iters, t-(init.)=3.16602 s t(norm)=0.088423, mflops=56.5463 (err=7.0e-16) Top mflops for N=1728000 = 143.221 Normalized results and averages for N=1728000: fft 0: mflops = 143.221 (norm. = 1), norm. avg. (of 22) = 0.997864 fft 1: mflops = 33.2408 (norm. = 0.232094), norm. avg. (of 22) = 0.165665 fft 2: mflops = 40.5942 (norm. = 0.283437), norm. avg. (of 22) = 0.196476 fft 3: mflops = 18.638 (norm. = 0.130134), norm. avg. (of 22) = 0.25704 fft 4: mflops = 21.9417 (norm. = 0.153202), norm. avg. (of 22) = 0.384827 fft 5: mflops = 56.0622 (norm. = 0.391437), norm. avg. (of 12) = 0.417589 fft 6: mflops = 56.5463 (norm. = 0.394818), norm. avg. (of 12) = 0.424254 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=2.78906 s, 1 iters, t-(init.)=2.52832 s t(norm)=0.0393649, mflops=127.017 (err=1.2e-15) 1. PDA: elapsed time t=10.3525 s, 1 iters, t-(init.)=10.0928 s t(norm)=0.15714, mflops=31.8187 (err=1.2e-15) 2. PDA (f2c): elapsed time t=8.48145 s, 1 iters, t-(init.)=8.22266 s t(norm)=0.128023, mflops=39.0554 (err=1.2e-15) 3. 
Singleton: elapsed time t=17.1221 s, 1 iters, t-(init.)=16.8613 s t(norm)=0.262524, mflops=19.0459 (err=1.6e-15)
4. Singleton (f2c): elapsed time t=14.6064 s, 1 iters, t-(init.)=14.3457 s t(norm)=0.223356, mflops=22.3857 (err=1.6e-15)
5. Temperton: elapsed time t=6.20215 s, 1 iters, t-(init.)=5.94238 s t(norm)=0.0925204, mflops=54.0422 (err=3.4e-15)
6. Temperton (f2c): elapsed time t=6.19336 s, 1 iters, t-(init.)=5.93359 s t(norm)=0.0923835, mflops=54.1222 (err=1.2e-15)
Top mflops for N=2985984 = 127.017
Normalized results and averages for N=2985984:
fft 0: mflops = 127.017 (norm. = 1), norm. avg. (of 23) = 0.997957
fft 1: mflops = 31.8187 (norm. = 0.250508), norm. avg. (of 23) = 0.169354
fft 2: mflops = 39.0554 (norm. = 0.307482), norm. avg. (of 23) = 0.201303
fft 3: mflops = 19.0459 (norm. = 0.149948), norm. avg. (of 23) = 0.252384
fft 4: mflops = 22.3857 (norm. = 0.176242), norm. avg. (of 23) = 0.375758
fft 5: mflops = 54.0422 (norm. = 0.425472), norm. avg. (of 13) = 0.418195
fft 6: mflops = 54.1222 (norm. = 0.426103), norm. avg. (of 13) = 0.424397
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Nielsen, NR (C), Ooura (C), Ooura (F), Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg
2, 56.6618, 38.2115, 38.7983, 2.63689, 9.05347, 5.42294, 12.707, 8.08541, 27.6026, 4.81499, 4.51912, , 11.5147, 11.8908, 64.9277, 64.9768, 7.68056, , 7.46691, 7.34434, 6.86536, 51.0698, , , , , 2.18952, 1.62295, 7.64773, 56.6618, 33.9523, , , 5.342, 6.23906, 20.394, 23.1161, 4.71353, 5.54332, 3.32017
4, 119.72, 89.947, 43.5595, 8.91072, 2.97271, 10.7805, 41.6987, 17.1799, 21.648, 17.6748, 8.6106, 35.2046, 7.3343, 7.1203, 204.279, 202.832, 283.73, , 34.2501, 15.9546, 15.5502, 170.098, 67.851, 66.4856, 62.291, 9.87803, 7.54562, 9.81482, 17.4877, 97.6129, 57.076, 6.10427, 27.5849, 20.1642, 27.6738, 34.1956, 21.2202, 15.3721, 17.0842, 9.49374
8, 194.93, 165.616, 49.7487, 11.024, 27.1833, 10.2783, 56.9624, 23.3931, 20.7821, 47.1629, 25.9776, 29.8815, 43.6185, 43.3543, 373.205, 372.127, 507.779, 163.618, 50.808, 29.1778, 29.6341, 194.196, 95.2321, 95.6563, 92.1667, 22.0935, 13.0308, 24.2928, 32.2123, 173.068, 82.2266, 8.91812, 30.7076, 19.7258, 25.8111, 54.0021, 20.6489, 24.4403, 26.8659, 9.96666
16, 79.2429, 72.8578, 52.2184, 17.0571, 32.0999, 12.4492, 82.9144, 33.1914, 21.7027, 87.6524, 56.5127, 29.7024, 63.535, 59.7353, 365.918, 365.529, 360.922, 195.226, 96.788, 45.6911, 48.1499, 224.28, 88.739, 109.079, 103.244, 35.2046, 20.0887, 26.5778, 50.6482, 216.507, 100.703, 28.3496, 32.4393, 48.5856, 75.8828, 66.3828, 21.7687, 43.1655, 46.6591, 10.1873
32, 91.93, 87.3671, 56.6021, 19.8547, 44.0058, 13.3153, 117.929, 41.108, 23.9461, 103.443, 112.966, 32.1287, 50.128, 48.9399, 332.428, 336.86, 341.684, 262.048, 98.9624, 61.9586, 67.7867, 219.804, 99.4205, 127.979, 124.492, 49.164, 26.1888, 41.6825, 71.5351, 255.958, 106.101, 30.3146, 33.7019, 59.7187, 99.6975, 83.0427, 23.9675, 48.1931, 49.2091, 10.1144
64, 91.7728, 85.785, 62.7919, 27.1604, 45.8211, 13.433, 138.622, 48.8805, 26.4468, 142.532, 138.028, 35.4175, 58.6744, 57.6764, 397.989, 397.682, 306.419, 342.001, 137.659, 73.3347, 83.6682, 209.001, 103.576, 133.73, 133.108, 61.1819, 31.7206, 56.8619, 88.9227, 314.458, 121.099, 60.1536, 33.6948, 75.8382, 137.586, 93.099, 26.4035, 69.1621, 71.1873, 10.391
128, 103.672, 98.9624, 68.3912, 27.4715, 54.347, 13.3931, 150.776, 53.458, 28.3844, 172.192, 197.405, 39.4137, 65.0753, 64.1313, 394.551, 395.069, 282.829, 332.943, 152.304, 82.7775, 96.3614, 194.973, 113.882, 148.395, 147.232, 64.9628, 34.5413, 55.5931, 101.984, 310.266, 118.646, 58.583, 34.1335, 68.8927, 133.266, 102.053, 28.9084, 70.9075, 68.7038, 10.5269
256, 106.906, 100.116, 73.481, 30.4392, 51.9972, 13.2889, 175.305, 58.1186, 31.1455, 221.962, 242.654, 42.7999, 70.4671, 70.2941, 386.499, 372.06, 288.011, 373.475, 183.644, 88.0567, 103.869, 150.701, 120.645, 156.894, 156.039, 73.9237, 38.0086, 64.0562, 110.127, 361.301, 128.208, 84.6299, 33.3201, 88.0116, 173.797, 104.12, 30.9213, 83.8861, 83.7226, 10.5683
512, 116.994, 110.886, 78.6304, 32.1694, 58.4614, 13.1443, 189.67, 59.837, 32.9593, 237.875, 231.882, 46.5047, 57.8317, 57.7971, 403.494, 405.824, 235.699, 401.399, 162.963, 91.4255, 108.948, 132.743, 127.405, 165.758, 164.909, 75.7935, 34.5378, 71.5298, 115.525, 341.775, 123.183, 79.9312, 32.4067, 89.5614, 175.863, 106.253, 33.2771, 72.8235, 73.7123, 10.5315
1024, 110.127, 102.75, 72.5011, 35.5779, 53.6603, 12.9056, 175.162, 58.7065, 33.4916, 230.54, 231.161, 46.1626, 60.2887, 60.1873, 383.137, 383.137, 237.16, 338.186, 138.637, 78.2041, 97.2592, 89.8529, 123.348, 154.718, 154.273, 70.3171, 33.8293, 61.7093, 94.5195, 321.72, 121.258, 93.1259, 31.3044, 85.6937, 169.36, 86.2443, 32.2639, 78.5473, 79.3015, 10.3803
2048, 115.682, 111.426, 74.8489, 34.8412, 41.1826, 12.7716, 195.387, 59.3526, 34.8617, 187.777, 213.198, 48.6056, 57.1692, 56.7844, 202.941, 207.032, 166.237, 336.74, 98.9209, 78.4274, 98.5084, 80.843, 116.596, 140.945, 141.366, 67.1089, 29.6316, 62.8253, 94.7926, 262.762, 111.637, 81.7947, 28.7236, 76.5964, 152.5, 83.7077, 33.3272, 67.2234, 66.2061, 9.86895
4096, 86.6503, 81.0371, 61.8871, 37.5872, 32.8864, 12.6125, 158.681, 57.0633, 32.4393, 177.723, 211.228, 43.6776, 52.3776, 52.5057, 170.323, 130.612, 119.86, 227.85, 93.5044, 64.846, 78.8067, 82.7547, 117.994, 140.435, 130.151, 58.4084, 29.1514, 49.4812, 76.062, 256.927, 109.287, 99.5742, 25.0484, 75.749, 138.996, 62.2459, 30.2462, 69.5729, 69.3109, 9.13046
8192, 62.4269, 61.2221, 46.6844, 35.0896, 29.1778, 12.2789, 124.409, 51.8523, 28.3022, 170.124, 175.47, 36.1249, 41.7673, 42.1457, 180.344, 180.228, 96.5328, 167.974, 68.8976, 50.1749, 60.8485, , 95.6727, 109.052, 105.827, 46.5288, 24.1165, 35.9203, 56.8349, 179.187, 93.1819, 80.3144, 20.6001, 63.3908, 99.3498, 53.0344, 26.8229, 55.5236, 55.2599, 9.17366
16384, 49.8421, 48.2116, 38.7832, 38.0374, 22.3165, 12.0684, 111.269, 48.617, 25.7757, 157.903, 157.407, 31.6872, 38.6238, 38.9238, 161.813, 178.744, 89.213, 129.701, 65.5291, 45.1966, 53.0805, , 73.0437, 79.7051, 72.9727, 41.3432, 24.1523, 34.7972, 51.3401, 167.492, 89.8529, 97.9309, 19.302, 58.1299, 88.6343, 42.9988, 24.2302, 55.3068, 55.4701, 8.88859
32768, 34.6219, 33.4708, 25.9274, 30.4579, 22.3821, 11.544, 79.4188, 38.275, 19.6034, 145.757, 145.757, 22.6083, 38.6423, 38.7167, 139.81, 132.017, 80.2098, 90.9437, 58.6958, 30.4349, 34.0366, , 57.4808, 61.662, 58.5677, 27.2616, 20.575, 33.443, 33.8364, 118.952, 72.7467, 72.1601, 16.6661, 39.5923, 51.4244, 35.5073, 19.1922, 40.6515, 40.7544, 8.10494
65536, 27.7991, 27.0464, 21.3893, 33.0636, 18.8376, 11.2199, 65.572, 34.2228, 16.7903, 137.11, 137.11, 18.8211, 35.3205, 35.5544, 120.56, 107.106, 64.8787, 76.0845, 53.3536, 26.0617, 28.4812, , 39.9904, 41.7392, 39.7682, 23.141, 18.8376, 27.6738, 27.9257, 107.173, 68.7195, 70.0076, 14.0175, 38.0086, 49.7391, 29.3372, 16.6343, 38.076, 38.1436, 7.71921
131072, 23.4864, 23.001, 17.8817, 26.1963, 18.152, 10.8964, 55.7192, 29.9829, 14.6076, 124.343, 124.428, 16.2053, 28.7187, 28.7549, 92.941, 97.9797, 55.3811, 59.265, 41.2604, 22.0881, 23.547, , 32.3187, 33.2125, 31.6244, 19.3201, 16.091, 23.8547, 23.3901, 86.5921, 58.5803, 57.8378, 12.0597, 31.4717, 39.272, 25.9432, 14.6263, 30.668, 30.7093, 7.25732
262144, 20.5435, 20.4392, 15.8006, 30.7957, 16.5135, 10.4676, 50.6217, 26.8585, 13.2962, 81.2757, 81.1392, 14.51, 28.0269, 28.0595, 101.19, 98.2081, 50.8615, 53.3315, 42.016, 19.4362, 20.3018, , 27.8332, 28.4393, 26.9784, 18.1785, 15.6878, 21.342, 20.2338, 80.13, 55.6343, 68.0541, 11.0568, 29.5345, 36.0854, 23.9913, 13.4218, 29.6069, 29.7344, 7.04144
524288, 18.6414, 18.4525, 14.0002, 23.3315, 16.7167, 10.3664, 45.2152, 25.1369, 12.1958, 78.4054, 78.4054, 13.145, 28.5091, 28.6532, 92.3126, 84.9338, 47.3124, 49.183, 41.032, 17.8957, 18.5599, , 24.0239, 24.2524, 23.2359, 16.1965, 15.0096, 18.8133, 18.7097, 71.6331, 51.3623, 53.8856, 10.9214, 24.0126, 28.398, 22.1558, 12.3493, 26.2225, 26.2901, 6.72504
Norm. Avg., 0.307821, 0.274825, 0.199229, 0.129576, 0.126803, 0.0563032, 0.426615, 0.178948, 0.117249, 0.581877, 0.591852, 0.127391, 0.187594, 0.187198, 0.904019, 0.888521, 0.632296, 0.730686, 0.328908, 0.187508, 0.213673, 0.463041, 0.31569, 0.365506, 0.352955, 0.165569, 0.097483, 0.1526, 0.215514, 0.7717, 0.388486, 0.306341, 0.0949941, 0.215235, 0.341457, 0.230653, 0.111817, 0.20295, 0.204965, 0.0406778
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, Nielsen, Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg
6, , 23.4029, 16.7204, 34.182, 33.9037, 285.897, 226.578, 24.782, 39.994, 7.82589, 13.4737, 17.8993, 16.5213, 18.5203, 10.3258
9, 7.86594, 48.1048, 30.7315, 44.7591, 44.0891, 256.13, 255.489, 24.8807, 39.1128, 14.615, 26.227, 34.0671, 25.5106, 28.1969, 10.5428
12, , 52.0179, 46.602, 54.165, 53.4134, 392.788, 351.536, 41.2133, 57.6822, 19.6261, 26.3893, 35.0789, 35.0789, 37.1317, 10.5596
15, 10.6928, 63.3302, 63.4835, 55.3916, 54.5465, 280.915, 265.058, 31.188, 43.8684, 23.8641, 28.4986, 42.3565, 40.9881, 45.2306, 9.30181
18, 9.90775, 70.4983, 66.2341, 39.4448, 39.3523, 234.83, 192.072, 28.2507, 73.3202, 18.5051, 36.3952, 52.2521, 36.1861, 36.7136, 10.5555
24, , 102.85, 97.0381, 45.3878, 45.083, 285.671, 284.023, 54.5794, 102.707, 33.324, 32.6463, 48.6149, 48.9046, 50.7533, 10.4955
36, 12.1974, 128.897, 130.445, 50.6493, 50.404, 308.97, 276.177, 35.5641, 115.329, 32.3578, 54.2577, 85.315, 60.2224, 59.991, 10.1447
80, 19.4792, 184.06, 182.87, 58.1175, 58.0777, 340.769, 288.12, 68.0445, 85.0641, 59.1712, 69.7218, 132.27, 87.3407, 82.5404, 9.53816
108, 12.8999, 160.728, 211.207, 61.5048, 61.4739, 271.988, 245.772, 33.9231, 124.765, 41.0444, 53.4007, 96.071, 70.6258, 73.9544, 10.0919
210, 15.9501, 204.199, 204.199, 39.8752, 39.3442, 228.587, 229.746, 36.2579, 80.8416, 47.8166, 40.0633, 69.9407, , , 7.79928
504, 17.1523, 280.522, 280.938, 39.0802, 38.8561, 234.578, 218.507, 43.052, 123.55, 48.8351, 48.7849, 85.3297, , , 8.4963
1000, 17.5806, 212.073, 264.722, 55.8817, 57.0408, 303.115, 204.499, 43.8334, 69.113, 73.3325, 65.271, 125.298, 91.3451, 88.8595, 8.38405
1960, 17.8844, 234.993, 234.993, 30.8074, 30.9943, 170.669, 171.319, 42.6672, 91.3702, 49.8162, 44.8465, 79.2563, , , 7.35122
4725, 12.5572, 188.397, 218.638, 39.8914, 39.8284, 157.161, 154.827, 28.6559, 67.9193, 45.4292, 42.3497, 77.137, , , 7.30803
10368, 14.5258, 178.077, 191.935, 48.7319, 48.4713, 205.536, 194.927, 38.1166, 113.231, 42.8361, 49.5849, 84.6323, 59.0882, 59.359, 9.03521
27000, 11.243, 175.053, 175.288, 44.725, 45.1592, 150.478, 133.033, 27.8527, 65.4468, 39.4425, 42.368, 67.7623, 53.8178, 54.176, 7.52305
75600, 10.0367, 136.091, 136.091, 28.9743, 29.0078, 122.249, 109.273, 25.8411, 53.7008, 30.8252, 31.903, 44.6869, , , 6.7162
165375, 8.25082, 111.834, 111.834, 18.0878, 18.2338, 96.806, 95.6235, 18.3249, 45.9412, 27.0816, 27.1066, 37.3729, , , 5.8996
362880, 8.67408, 52.9139, 52.8731, 23.6653, 23.7636, 107.233, 97.4156, 22.1814, 54.0388, 23.2958, 23.439, 29.7612, , , 6.07339
Norm. Avg., 0.0587677, 0.6128, 0.635223, 0.179738, 0.179417, 0.941969, 0.855108, 0.148625, 0.331669, 0.158267, 0.173169, 0.278732, 0.19372, 0.196977, 0.0378458
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM, HARM (f2c), NR (C), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
4x4x4, 289.711, , , 84.4912, 19.0718, 20.2848, 88.0116, 150.525, 82.3316, 82.437
8x8x8, 469.966, 96.4439, 95.7748, 124.052, 32.0839, 31.9778, 66.9693, 123.576, 114.566, 122.869
16x16x16, 313.883, 109.102, 100.272, 72.3871, 39.14, 42.3567, 67.5309, 120.307, 103.162, 112.238
32x32x32, 156.98, 71.2661, 68.4203, 37.1451, 30.0487, 33.4152, 39.3986, 53.2963, 69.0065, 70.0266
64x64x64, 127.489, 63.7025, 60.6634, 19.2811, 24.5895, 34.1714, 26.5486, 32.3633, 64.2532, 64.64
256x64x32, 106.7, 53.0726, 51.3106, 17.5872, 26.6194, 31.3863, 22.8098, 26.8294, 59.6872, 59.7922
16x1024x64, 112.67, 53.8486, 51.9972, 17.0462, 28.0864, 32.1962, 23.8768, 28.2787, ,
128x128x128, 102.4, 48.5438, 47.0546, 16.1673, 26.4344, 31.2697, 17.129, 19.5446, 38.2439, 38.3479
Norm. Avg., 1, 0.422264, 0.405262, 0.20601, 0.174999, 0.204897, 0.214203, 0.306558, 0.390438, 0.398792
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
5x5x5, 228.791, 21.674, 23.4709, 144.78, 240.071, 108.936, 104.42
6x6x6, 357.542, 24.3609, 24.7732, 60.7759, 107.344, 85.627, 86.6834
7x7x7, 180.976, 14.535, 14.4933, 49.0145, 73.3437, ,
9x9x9, 286.761, 33.2239, 34.0327, 72.7302, 132.532, 97.3796, 110.729
10x10x10, 255.498, 32.7377, 34.8097, 74.9095, 130.951, 116.693, 117.878
11x11x11, 87.2036, 13.8106, 14.381, 43.0318, 63.2512, ,
12x12x12, 435.225, 39.0369, 41.7464, 61.0884, 118.248, 126.458, 131.581
13x13x13, 76.5699, 13.3981, 13.8991, 40.0099, 58.4423, ,
14x14x14, 141.658, 20.327, 21.8167, 43.8897, 71.1355, ,
15x15x15, 314.233, 40.2238, 43.1711, 67.3795, 123.449, 124.861, 129.057
24x25x28, 155.004, 26.6079, 36.9004, 46.1586, 73.3107, ,
48x48x48, 151.34, 20.7369, 36.9149, 31.3884, 40.959, 73.8299, 74.1183
49x49x49, 106.383, 18.8417, 20.0718, 27.8171, 40.0643, ,
60x60x60, 154.16, 28.4436, 39.8326, 25.6011, 32.69, 72.1165, 72.7185
72x60x56, 144.077, 28.9, 34.9399, 23.7809, 29.8543, ,
75x75x75, 138.704, 29.766, 40.0822, 28.6464, 36.4285, 79.9262, 80.9684
80x80x80, 130.493, 33.123, 37.8081, 26.3754, 32.4105, 65.2035, 65.418
84x84x84, 115.581, 26.5489, 29.0251, 20.4841, 25.8301, ,
96x96x96, 124.894, 32.3641, 34.6311, 18.9671, 22.1228, 51.2817, 51.4291
105x105x105, 115.797, 22.1703, 29.0267, 23.7208, 30.3938, ,
112x112x112, 103.161, 22.2476, 28.2556, 21.978, 27.229, ,
120x120x120, 143.221, 33.2408, 40.5942, 18.638, 21.9417, 56.0622, 56.5463
144x144x144, 127.017, 31.8187, 39.0554, 19.0459, 22.3857, 54.0422, 54.1222
Norm. Avg., 0.997957, 0.169354, 0.201303, 0.252384, 0.375758, 0.418195, 0.424397
@@@@ end