To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Jose Miguel Garrido
@ submitter email = angmar@dali.eis.uva.es
@ submitter organization = Universidad de Valladolid
@ computer manufacturer =
@ computer model =
@ CPU manufacturer = Intel
@ CPU model = Pentium II
@ CPU speed = 233 MHz
@ RAM = 128 MB
@ L2 cache size =
@ operating system = Red Hat Linux 4.2
@ C compiler =
@ C compiler flags = -pedantic -ansi -O6 -fomit-frame-pointer -Wall
@ Fortran compiler =
@ Fortran compiler flags = -O6 -fomit-frame-pointer
@ remarks =
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
2 (0.000228882 MB)
4 (0.000534058 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)
262144 (25.4987 MB)
524288 (50.9973 MB)
1048576 (96 MB)
Maximum array size = 1048576
Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Bailey
5. Beauregard
6. Bergland
7. Brenner
8. Burrus
9. CWP (min N)
10. CWP (best N)
11. Edelblute
12. FFTPACK
13. FFTPACK (f2c)
14. FFTW
15. FFTW_ESTIMATE
16. Frigo-old
17. Green
18. GSL
19. GSL DIT
20. GSL DIF
21. Krukar
22. Mayer (Buneman)
23. Mayer (simple)
24. Mayer (lookup)
25. Monro
26. NAPACK (f2c)
27. Nielsen
28. NR (C)
29. NR (F)
30. Ooura (C)
31. Ooura (F)
32. QFT
33. Ransom
34. SCIPORT
35. Singleton
36. Singleton (f2c)
37. Sorensen
38. Sorensen DIT
39. Temperton
40. Temperton (f2c)
41. Valkenburg
Computing normalized averages (42 transforms).
Benchmarking for array size = 2 (power of 2):
0. Arndt DIF: elapsed time t=1.66 s, 4194304 iters, t-(init.)=1.26 s t(norm)=0.150204, mflops=33.2881 (err=5.6e-17)
1. 
Arndt DIT: elapsed time t=1.62 s, 4194304 iters, t-(init.)=1.21 s t(norm)=0.144243, mflops=34.6637 (err=5.6e-17) 2. Arndt Split-Radix: elapsed time t=1.16 s, 2097152 iters, t-(init.)=0.96 s t(norm)=0.228882, mflops=21.8453 (err=5.6e-17) 3. Arndt 4-step: elapsed time t=1.78 s, 262144 iters, t-(init.)=1.75 s t(norm)=3.33786, mflops=1.49797 (err=5.6e-17) 4. Bailey: elapsed time t=1.75 s, 1048576 iters, t-(init.)=1.66 s t(norm)=0.79155, mflops=6.31672 (err=5.6e-17) 5. Beauregard: elapsed time t=1.13 s, 524288 iters, t-(init.)=1.08 s t(norm)=1.02997, mflops=4.85452 (err=8.4e-17) 6. Bergland: elapsed time t=1.93 s, 1048576 iters, t-(init.)=1.83 s t(norm)=0.872612, mflops=5.72992 (err=8.4e-17) 7. Brenner: elapsed time t=1.76 s, 1048576 iters, t-(init.)=1.66 s t(norm)=0.79155, mflops=6.31672 (err=8.4e-17) 8. Burrus: elapsed time t=1.22 s, 2097152 iters, t-(init.)=1.03 s t(norm)=0.245571, mflops=20.3607 (err=5.6e-17) 9. CWP (min N): elapsed time t=1.24 s, 524288 iters, t-(init.)=1.19 s t(norm)=1.13487, mflops=4.40578 10. CWP (best N) (N=3): elapsed time t=1.39 s, 524288 iters, t-(init.)=1.33 s t(norm)=1.26839, mflops=3.94202 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.94 s, 2097152 iters, t-(init.)=1.75 s t(norm)=0.417233, mflops=11.9837 (err=8.4e-17) 13. FFTPACK (f2c): elapsed time t=1.14 s, 1048576 iters, t-(init.)=1.04 s t(norm)=0.495911, mflops=10.0825 (err=8.4e-17) FFTW_MEASURE plan: (cost = 3.623962e-07) FFTW_NOTW 2 14. FFTW: elapsed time t=1.41 s, 4194304 iters, t-(init.)=1.02 s t(norm)=0.121593, mflops=41.1206 (err=8.4e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 15. FFTW_ESTIMATE: elapsed time t=1.39 s, 4194304 iters, t-(init.)=0.97 s t(norm)=0.115633, mflops=43.2402 (err=8.4e-17) 16. Frigo-old: elapsed time t=1.13 s, 4194304 iters, t-(init.)=0.75 s t(norm)=0.089407, mflops=55.9241 (err=8.4e-17) 17. Skipping fft (Green can't handle this size.). 18. 
GSL: elapsed time t=1.97 s, 1048576 iters, t-(init.)=1.87 s t(norm)=0.891685, mflops=5.60736 (err=8.4e-17) 19. GSL DIT: elapsed time t=1.89 s, 1048576 iters, t-(init.)=1.79 s t(norm)=0.853539, mflops=5.85797 (err=8.4e-17) 20. GSL DIF: elapsed time t=1.03 s, 524288 iters, t-(init.)=0.98 s t(norm)=0.934601, mflops=5.34988 (err=8.4e-17) 21. Krukar: elapsed time t=1.95 s, 4194304 iters, t-(init.)=1.55 s t(norm)=0.184774, mflops=27.06 (err=8.4e-17) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.74 s, 524288 iters, t-(init.)=1.69 s t(norm)=1.61171, mflops=3.1023 (err=8.3e-17) 27. Nielsen: elapsed time t=1.36 s, 262144 iters, t-(init.)=1.34 s t(norm)=2.55585, mflops=1.9563 (err=5.6e-17) 28. NR (C): elapsed time t=1.75 s, 1048576 iters, t-(init.)=1.65 s t(norm)=0.786781, mflops=6.35501 (err=8.4e-17) 29. NR (F): elapsed time t=1.97 s, 1048576 iters, t-(init.)=1.87 s t(norm)=0.891685, mflops=5.60736 (err=8.4e-17) 30. Ooura (C): elapsed time t=1.44 s, 4194304 iters, t-(init.)=1.06 s t(norm)=0.126362, mflops=39.5689 (err=8.4e-17) 31. Ooura (F): elapsed time t=1.58 s, 4194304 iters, t-(init.)=1.18 s t(norm)=0.140667, mflops=35.5449 (err=8.4e-17) 32. Skipping fft (QFT requires N >= 16). 33. Skipping fft (Ransom doesn't work for N=2). 34. Skipping fft (SCIPORT can't handle N < 4). 35. Singleton: elapsed time t=1.04 s, 524288 iters, t-(init.)=0.99 s t(norm)=0.944138, mflops=5.29584 (err=8.4e-17) 36. Singleton (f2c): elapsed time t=1.07 s, 524288 iters, t-(init.)=1.02 s t(norm)=0.972748, mflops=5.14008 (err=8.4e-17) 37. Sorensen: elapsed time t=1.36 s, 2097152 iters, t-(init.)=1.16 s t(norm)=0.276566, mflops=18.0789 (err=5.6e-17) 38. Sorensen DIT: elapsed time t=1.33 s, 2097152 iters, t-(init.)=1.13 s t(norm)=0.269413, mflops=18.5589 (err=5.6e-17) 39. 
Temperton: elapsed time t=1.45 s, 524288 iters, t-(init.)=1.4 s t(norm)=1.33514, mflops=3.74491 (err=8.4e-17) 40. Temperton (f2c): elapsed time t=1.46 s, 524288 iters, t-(init.)=1.41 s t(norm)=1.34468, mflops=3.71835 (err=8.4e-17) 41. Valkenburg: elapsed time t=1.01 s, 524288 iters, t-(init.)=0.96 s t(norm)=0.915527, mflops=5.46133 (err=8.3e-17) Top mflops for N=2 = 55.9241 Normalized results and averages for N=2: fft 0: mflops = 33.2881 (norm. = 0.595238), norm. avg. (of 1) = 0.595238 fft 1: mflops = 34.6637 (norm. = 0.619835), norm. avg. (of 1) = 0.619835 fft 2: mflops = 21.8453 (norm. = 0.390625), norm. avg. (of 1) = 0.390625 fft 3: mflops = 1.49797 (norm. = 0.0267857), norm. avg. (of 1) = 0.0267857 fft 4: mflops = 6.31672 (norm. = 0.112952), norm. avg. (of 1) = 0.112952 fft 5: mflops = 4.85452 (norm. = 0.0868056), norm. avg. (of 1) = 0.0868056 fft 6: mflops = 5.72992 (norm. = 0.102459), norm. avg. (of 1) = 0.102459 fft 7: mflops = 6.31672 (norm. = 0.112952), norm. avg. (of 1) = 0.112952 fft 8: mflops = 20.3607 (norm. = 0.364078), norm. avg. (of 1) = 0.364078 fft 9: mflops = 4.40578 (norm. = 0.0787815), norm. avg. (of 1) = 0.0787815 fft 10: mflops = 3.94202 (norm. = 0.0704887), norm. avg. (of 1) = 0.0704887 fft 11: mflops = -1 (norm. = -0.0178814), norm. avg. (of 0) = -1 fft 12: mflops = 11.9837 (norm. = 0.214286), norm. avg. (of 1) = 0.214286 fft 13: mflops = 10.0825 (norm. = 0.180288), norm. avg. (of 1) = 0.180288 fft 14: mflops = 41.1206 (norm. = 0.735294), norm. avg. (of 1) = 0.735294 fft 15: mflops = 43.2402 (norm. = 0.773196), norm. avg. (of 1) = 0.773196 fft 16: mflops = 55.9241 (norm. = 1), norm. avg. (of 1) = 1 fft 17: mflops = -1 (norm. = -0.0178814), norm. avg. (of 0) = -1 fft 18: mflops = 5.60736 (norm. = 0.100267), norm. avg. (of 1) = 0.100267 fft 19: mflops = 5.85797 (norm. = 0.104749), norm. avg. (of 1) = 0.104749 fft 20: mflops = 5.34988 (norm. = 0.0956633), norm. avg. (of 1) = 0.0956633 fft 21: mflops = 27.06 (norm. = 0.483871), norm. avg. 
(of 1) = 0.483871 fft 22: mflops = -1 (norm. = -0.0178814), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.0178814), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.0178814), norm. avg. (of 0) = -1 fft 25: mflops = -1 (norm. = -0.0178814), norm. avg. (of 0) = -1 fft 26: mflops = 3.1023 (norm. = 0.0554734), norm. avg. (of 1) = 0.0554734 fft 27: mflops = 1.9563 (norm. = 0.0349813), norm. avg. (of 1) = 0.0349813 fft 28: mflops = 6.35501 (norm. = 0.113636), norm. avg. (of 1) = 0.113636 fft 29: mflops = 5.60736 (norm. = 0.100267), norm. avg. (of 1) = 0.100267 fft 30: mflops = 39.5689 (norm. = 0.707547), norm. avg. (of 1) = 0.707547 fft 31: mflops = 35.5449 (norm. = 0.635593), norm. avg. (of 1) = 0.635593 fft 32: mflops = -1 (norm. = -0.0178814), norm. avg. (of 0) = -1 fft 33: mflops = -1 (norm. = -0.0178814), norm. avg. (of 0) = -1 fft 34: mflops = -1 (norm. = -0.0178814), norm. avg. (of 0) = -1 fft 35: mflops = 5.29584 (norm. = 0.094697), norm. avg. (of 1) = 0.094697 fft 36: mflops = 5.14008 (norm. = 0.0919118), norm. avg. (of 1) = 0.0919118 fft 37: mflops = 18.0789 (norm. = 0.323276), norm. avg. (of 1) = 0.323276 fft 38: mflops = 18.5589 (norm. = 0.331858), norm. avg. (of 1) = 0.331858 fft 39: mflops = 3.74491 (norm. = 0.0669643), norm. avg. (of 1) = 0.0669643 fft 40: mflops = 3.71835 (norm. = 0.0664894), norm. avg. (of 1) = 0.0664894 fft 41: mflops = 5.46133 (norm. = 0.0976563), norm. avg. (of 1) = 0.0976563 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.59 s, 2097152 iters, t-(init.)=1.29 s t(norm)=0.07689, mflops=65.028 (err=9.6e-17) 1. Arndt DIT: elapsed time t=1.72 s, 2097152 iters, t-(init.)=1.42 s t(norm)=0.0846386, mflops=59.0747 (err=9.6e-17) 2. Arndt Split-Radix: elapsed time t=1.8 s, 1048576 iters, t-(init.)=1.65 s t(norm)=0.196695, mflops=25.42 (err=1.5e-16) 3. Arndt 4-step: elapsed time t=1.83 s, 262144 iters, t-(init.)=1.79 s t(norm)=0.853539, mflops=5.85797 (err=1.5e-16) 4. 
Bailey: elapsed time t=1.91 s, 524288 iters, t-(init.)=1.83 s t(norm)=0.436306, mflops=11.4598 (err=1.5e-16) 5. Beauregard: elapsed time t=1.01 s, 131072 iters, t-(init.)=0.99 s t(norm)=0.944138, mflops=5.29584 (err=1.4e-16) 6. Bergland: elapsed time t=1.18 s, 524288 iters, t-(init.)=1.1 s t(norm)=0.26226, mflops=19.065 (err=7.6e-17) 7. Brenner: elapsed time t=1.68 s, 524288 iters, t-(init.)=1.6 s t(norm)=0.38147, mflops=13.1072 (err=1.4e-16) 8. Burrus: elapsed time t=1.31 s, 524288 iters, t-(init.)=1.24 s t(norm)=0.295639, mflops=16.9125 (err=1.5e-16) 9. CWP (min N): elapsed time t=1.4 s, 524288 iters, t-(init.)=1.32 s t(norm)=0.314713, mflops=15.8875 10. CWP (best N) (N=15): elapsed time t=1.68 s, 262144 iters, t-(init.)=1.56 s t(norm)=0.743866, mflops=6.72164 11. Edelblute: elapsed time t=1.06 s, 524288 iters, t-(init.)=0.98 s t(norm)=0.23365, mflops=21.3995 (err=1.5e-16) 12. FFTPACK: elapsed time t=1.56 s, 1048576 iters, t-(init.)=1.41 s t(norm)=0.168085, mflops=29.7468 (err=1.2e-16) 13. FFTPACK (f2c): elapsed time t=1.7 s, 1048576 iters, t-(init.)=1.55 s t(norm)=0.184774, mflops=27.06 (err=1.2e-16) FFTW_MEASURE plan: (cost = 4.959106e-07) FFTW_NOTW 4 14. FFTW: elapsed time t=1.08 s, 2097152 iters, t-(init.)=0.77 s t(norm)=0.0458956, mflops=108.943 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 15. FFTW_ESTIMATE: elapsed time t=1.09 s, 2097152 iters, t-(init.)=0.79 s t(norm)=0.0470877, mflops=106.185 (err=1.4e-16) 16. Frigo-old: elapsed time t=1.07 s, 2097152 iters, t-(init.)=0.76 s t(norm)=0.0452995, mflops=110.376 (err=1.4e-16) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.53 s, 524288 iters, t-(init.)=1.46 s t(norm)=0.348091, mflops=14.3641 (err=1.4e-16) 19. GSL DIT: elapsed time t=1.97 s, 524288 iters, t-(init.)=1.9 s t(norm)=0.452995, mflops=11.0376 (err=1.4e-16) 20. GSL DIF: elapsed time t=1.02 s, 262144 iters, t-(init.)=0.98 s t(norm)=0.4673, mflops=10.6998 (err=1.8e-16) 21. 
Krukar: elapsed time t=1.34 s, 2097152 iters, t-(init.)=1.03 s t(norm)=0.0613928, mflops=81.4428 (err=1.4e-16) 22. Mayer (Buneman): elapsed time t=1.34 s, 1048576 iters, t-(init.)=1.18 s t(norm)=0.140667, mflops=35.5449 (err=8.1e-17) 23. Mayer (simple): elapsed time t=1.36 s, 1048576 iters, t-(init.)=1.2 s t(norm)=0.143051, mflops=34.9525 24. Mayer (lookup): elapsed time t=1.34 s, 1048576 iters, t-(init.)=1.19 s t(norm)=0.141859, mflops=35.2463 (err=8.1e-17) 25. Monro: elapsed time t=1.97 s, 262144 iters, t-(init.)=1.93 s t(norm)=0.920296, mflops=5.43304 (err=8.1e-17) 26. NAPACK (f2c): elapsed time t=1.86 s, 262144 iters, t-(init.)=1.82 s t(norm)=0.867844, mflops=5.76141 (err=1.8e-16) 27. Nielsen: elapsed time t=1.48 s, 262144 iters, t-(init.)=1.45 s t(norm)=0.691414, mflops=7.23156 (err=1.5e-16) 28. NR (C): elapsed time t=1.9 s, 524288 iters, t-(init.)=1.83 s t(norm)=0.436306, mflops=11.4598 (err=1.4e-16) 29. NR (F): elapsed time t=1.12 s, 262144 iters, t-(init.)=1.08 s t(norm)=0.514984, mflops=9.70904 (err=1.4e-16) 30. Ooura (C): elapsed time t=1.8 s, 2097152 iters, t-(init.)=1.49 s t(norm)=0.0888109, mflops=56.2994 (err=9.8e-17) 31. Ooura (F): elapsed time t=1.01 s, 1048576 iters, t-(init.)=0.86 s t(norm)=0.10252, mflops=48.771 (err=9.8e-17) 32. Skipping fft (QFT requires N >= 16). 33. Ransom: elapsed time t=1.13 s, 131072 iters, t-(init.)=1.11 s t(norm)=1.05858, mflops=4.72332 (err=2.1e-16) 34. SCIPORT: elapsed time t=1.3 s, 524288 iters, t-(init.)=1.23 s t(norm)=0.293255, mflops=17.05 (err=8.0e-09) 35. Singleton: elapsed time t=1.8 s, 524288 iters, t-(init.)=1.72 s t(norm)=0.41008, mflops=12.1927 (err=1.5e-16) 36. Singleton (f2c): elapsed time t=1.45 s, 524288 iters, t-(init.)=1.38 s t(norm)=0.329018, mflops=15.1968 (err=1.1e-16) 37. Sorensen: elapsed time t=1.8 s, 1048576 iters, t-(init.)=1.64 s t(norm)=0.195503, mflops=25.575 (err=1.5e-16) 38. Sorensen DIT: elapsed time t=1.31 s, 524288 iters, t-(init.)=1.23 s t(norm)=0.293255, mflops=17.05 (err=8.1e-17) 39. 
Temperton: elapsed time t=1.62 s, 524288 iters, t-(init.)=1.54 s t(norm)=0.367165, mflops=13.6179 (err=1.7e-16) 40. Temperton (f2c): elapsed time t=1.76 s, 524288 iters, t-(init.)=1.68 s t(norm)=0.400543, mflops=12.483 (err=1.7e-16) 41. Valkenburg: elapsed time t=1.93 s, 262144 iters, t-(init.)=1.89 s t(norm)=0.901222, mflops=5.54802 (err=1.8e-16) Top mflops for N=4 = 110.376 Normalized results and averages for N=4: fft 0: mflops = 65.028 (norm. = 0.589147), norm. avg. (of 2) = 0.592193 fft 1: mflops = 59.0747 (norm. = 0.535211), norm. avg. (of 2) = 0.577523 fft 2: mflops = 25.42 (norm. = 0.230303), norm. avg. (of 2) = 0.310464 fft 3: mflops = 5.85797 (norm. = 0.0530726), norm. avg. (of 2) = 0.0399292 fft 4: mflops = 11.4598 (norm. = 0.103825), norm. avg. (of 2) = 0.108388 fft 5: mflops = 5.29584 (norm. = 0.0479798), norm. avg. (of 2) = 0.0673927 fft 6: mflops = 19.065 (norm. = 0.172727), norm. avg. (of 2) = 0.137593 fft 7: mflops = 13.1072 (norm. = 0.11875), norm. avg. (of 2) = 0.115851 fft 8: mflops = 16.9125 (norm. = 0.153226), norm. avg. (of 2) = 0.258652 fft 9: mflops = 15.8875 (norm. = 0.143939), norm. avg. (of 2) = 0.11136 fft 10: mflops = 6.72164 (norm. = 0.0608974), norm. avg. (of 2) = 0.0656931 fft 11: mflops = 21.3995 (norm. = 0.193878), norm. avg. (of 1) = 0.193878 fft 12: mflops = 29.7468 (norm. = 0.269504), norm. avg. (of 2) = 0.241895 fft 13: mflops = 27.06 (norm. = 0.245161), norm. avg. (of 2) = 0.212725 fft 14: mflops = 108.943 (norm. = 0.987013), norm. avg. (of 2) = 0.861154 fft 15: mflops = 106.185 (norm. = 0.962025), norm. avg. (of 2) = 0.867611 fft 16: mflops = 110.376 (norm. = 1), norm. avg. (of 2) = 1 fft 17: mflops = -1 (norm. = -0.00905991), norm. avg. (of 0) = -1 fft 18: mflops = 14.3641 (norm. = 0.130137), norm. avg. (of 2) = 0.115202 fft 19: mflops = 11.0376 (norm. = 0.1), norm. avg. (of 2) = 0.102374 fft 20: mflops = 10.6998 (norm. = 0.0969388), norm. avg. (of 2) = 0.096301 fft 21: mflops = 81.4428 (norm. = 0.737864), norm. avg. 
(of 2) = 0.610868 fft 22: mflops = 35.5449 (norm. = 0.322034), norm. avg. (of 1) = 0.322034 fft 23: mflops = 34.9525 (norm. = 0.316667), norm. avg. (of 1) = 0.316667 fft 24: mflops = 35.2463 (norm. = 0.319328), norm. avg. (of 1) = 0.319328 fft 25: mflops = 5.43304 (norm. = 0.0492228), norm. avg. (of 1) = 0.0492228 fft 26: mflops = 5.76141 (norm. = 0.0521978), norm. avg. (of 2) = 0.0538356 fft 27: mflops = 7.23156 (norm. = 0.0655172), norm. avg. (of 2) = 0.0502493 fft 28: mflops = 11.4598 (norm. = 0.103825), norm. avg. (of 2) = 0.108731 fft 29: mflops = 9.70904 (norm. = 0.087963), norm. avg. (of 2) = 0.0941152 fft 30: mflops = 56.2994 (norm. = 0.510067), norm. avg. (of 2) = 0.608807 fft 31: mflops = 48.771 (norm. = 0.44186), norm. avg. (of 2) = 0.538727 fft 32: mflops = -1 (norm. = -0.00905991), norm. avg. (of 0) = -1 fft 33: mflops = 4.72332 (norm. = 0.0427928), norm. avg. (of 1) = 0.0427928 fft 34: mflops = 17.05 (norm. = 0.154472), norm. avg. (of 1) = 0.154472 fft 35: mflops = 12.1927 (norm. = 0.110465), norm. avg. (of 2) = 0.102581 fft 36: mflops = 15.1968 (norm. = 0.137681), norm. avg. (of 2) = 0.114796 fft 37: mflops = 25.575 (norm. = 0.231707), norm. avg. (of 2) = 0.277492 fft 38: mflops = 17.05 (norm. = 0.154472), norm. avg. (of 2) = 0.243165 fft 39: mflops = 13.6179 (norm. = 0.123377), norm. avg. (of 2) = 0.0951705 fft 40: mflops = 12.483 (norm. = 0.113095), norm. avg. (of 2) = 0.0897923 fft 41: mflops = 5.54802 (norm. = 0.0502646), norm. avg. (of 2) = 0.0739604 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.35 s, 524288 iters, t-(init.)=1.2 s t(norm)=0.0953674, mflops=52.4288 (err=1.6e-16) 1. Arndt DIT: elapsed time t=1.43 s, 524288 iters, t-(init.)=1.29 s t(norm)=0.10252, mflops=48.771 (err=1.6e-16) 2. Arndt Split-Radix: elapsed time t=1.24 s, 262144 iters, t-(init.)=1.16 s t(norm)=0.184377, mflops=27.1183 (err=2.0e-16) 3. 
Arndt 4-step: elapsed time t=1.13 s, 65536 iters, t-(init.)=1.11 s t(norm)=0.705719, mflops=7.08497 (err=2.1e-16) 4. Bailey: elapsed time t=1.67 s, 262144 iters, t-(init.)=1.6 s t(norm)=0.254313, mflops=19.6608 (err=1.6e-16) 5. Beauregard: elapsed time t=1.06 s, 65536 iters, t-(init.)=1.04 s t(norm)=0.661214, mflops=7.56185 (err=1.3e-16) 6. Bergland: elapsed time t=1.66 s, 262144 iters, t-(init.)=1.59 s t(norm)=0.252724, mflops=19.7845 (err=2.0e-16) 7. Brenner: elapsed time t=1.02 s, 131072 iters, t-(init.)=0.99 s t(norm)=0.314713, mflops=15.8875 (err=1.3e-16) 8. Burrus: elapsed time t=1.01 s, 131072 iters, t-(init.)=0.98 s t(norm)=0.311534, mflops=16.0496 (err=1.9e-16) 9. CWP (min N): elapsed time t=1.79 s, 524288 iters, t-(init.)=1.64 s t(norm)=0.130335, mflops=38.3625 10. CWP (best N) (N=15): elapsed time t=1.68 s, 262144 iters, t-(init.)=1.57 s t(norm)=0.249545, mflops=20.0365 11. Edelblute: elapsed time t=1.72 s, 262144 iters, t-(init.)=1.65 s t(norm)=0.26226, mflops=19.065 (err=1.9e-16) 12. FFTPACK: elapsed time t=1.44 s, 524288 iters, t-(init.)=1.3 s t(norm)=0.103315, mflops=48.3958 (err=1.8e-16) 13. FFTPACK (f2c): elapsed time t=1.62 s, 524288 iters, t-(init.)=1.47 s t(norm)=0.116825, mflops=42.799 (err=1.8e-16) FFTW_MEASURE plan: (cost = 9.918213e-07) FFTW_NOTW 8 14. FFTW: elapsed time t=1.16 s, 1048576 iters, t-(init.)=0.86 s t(norm)=0.0341733, mflops=146.313 (err=1.0e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.16 s, 1048576 iters, t-(init.)=0.87 s t(norm)=0.0345707, mflops=144.631 (err=1.0e-16) 16. Frigo-old: elapsed time t=1.81 s, 2097152 iters, t-(init.)=1.19 s t(norm)=0.0236432, mflops=211.478 (err=1.0e-16) 17. Green: elapsed time t=1.17 s, 524288 iters, t-(init.)=1.02 s t(norm)=0.0810623, mflops=61.6809 (err=1.2e-16) 18. GSL: elapsed time t=1.34 s, 262144 iters, t-(init.)=1.27 s t(norm)=0.201861, mflops=24.7695 (err=1.2e-16) 19. 
GSL DIT: elapsed time t=1.78 s, 262144 iters, t-(init.)=1.71 s t(norm)=0.271797, mflops=18.3961 (err=1.2e-16) 20. GSL DIF: elapsed time t=1.9 s, 262144 iters, t-(init.)=1.83 s t(norm)=0.290871, mflops=17.1898 (err=1.6e-16) 21. Krukar: elapsed time t=1.31 s, 1048576 iters, t-(init.)=1.02 s t(norm)=0.0405312, mflops=123.362 (err=1.2e-16) 22. Mayer (Buneman): elapsed time t=1.56 s, 524288 iters, t-(init.)=1.41 s t(norm)=0.112057, mflops=44.6203 (err=1.5e-16) 23. Mayer (simple): elapsed time t=1.55 s, 524288 iters, t-(init.)=1.4 s t(norm)=0.111262, mflops=44.939 24. Mayer (lookup): elapsed time t=1.58 s, 524288 iters, t-(init.)=1.44 s t(norm)=0.114441, mflops=43.6907 (err=1.5e-16) 25. Monro: elapsed time t=1.34 s, 131072 iters, t-(init.)=1.31 s t(norm)=0.416438, mflops=12.0066 (err=1.3e-08) 26. NAPACK (f2c): elapsed time t=1.78 s, 131072 iters, t-(init.)=1.74 s t(norm)=0.553131, mflops=9.03945 (err=2.8e-16) 27. Nielsen: elapsed time t=1.19 s, 131072 iters, t-(init.)=1.15 s t(norm)=0.365575, mflops=13.6771 (err=1.3e-15) 28. NR (C): elapsed time t=1.8 s, 262144 iters, t-(init.)=1.72 s t(norm)=0.273387, mflops=18.2891 (err=1.2e-16) 29. NR (F): elapsed time t=1.09 s, 131072 iters, t-(init.)=1.05 s t(norm)=0.333786, mflops=14.9797 (err=1.5e-16) 30. Ooura (C): elapsed time t=1.59 s, 1048576 iters, t-(init.)=1.28 s t(norm)=0.0508626, mflops=98.304 (err=1.6e-16) 31. Ooura (F): elapsed time t=1.02 s, 524288 iters, t-(init.)=0.87 s t(norm)=0.0691414, mflops=72.3156 (err=1.6e-16) 32. Skipping fft (QFT requires N >= 16). 33. Ransom: elapsed time t=1.37 s, 65536 iters, t-(init.)=1.35 s t(norm)=0.858307, mflops=5.82542 (err=7.5e-16) 34. SCIPORT: elapsed time t=1.01 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.308355, mflops=16.2151 (err=4.5e-08) 35. Singleton: elapsed time t=1.04 s, 131072 iters, t-(init.)=1 s t(norm)=0.317891, mflops=15.7286 (err=1.2e-16) 36. Singleton (f2c): elapsed time t=1.84 s, 262144 iters, t-(init.)=1.76 s t(norm)=0.279744, mflops=17.8735 (err=1.2e-16) 37. 
Sorensen: elapsed time t=1.85 s, 524288 iters, t-(init.)=1.7 s t(norm)=0.135104, mflops=37.0086 (err=2.4e-16) 38. Sorensen DIT: elapsed time t=1.05 s, 131072 iters, t-(init.)=1.02 s t(norm)=0.324249, mflops=15.4202 (err=1.8e-16) 39. Temperton: elapsed time t=1.47 s, 262144 iters, t-(init.)=1.4 s t(norm)=0.222524, mflops=22.4695 (err=7.5e-09) 40. Temperton (f2c): elapsed time t=1.65 s, 262144 iters, t-(init.)=1.57 s t(norm)=0.249545, mflops=20.0365 (err=1.4e-16) 41. Valkenburg: elapsed time t=1.41 s, 65536 iters, t-(init.)=1.39 s t(norm)=0.883738, mflops=5.65778 (err=1.3e-16) Top mflops for N=8 = 211.478 Normalized results and averages for N=8: fft 0: mflops = 52.4288 (norm. = 0.247917), norm. avg. (of 3) = 0.477434 fft 1: mflops = 48.771 (norm. = 0.23062), norm. avg. (of 3) = 0.461889 fft 2: mflops = 27.1183 (norm. = 0.128233), norm. avg. (of 3) = 0.24972 fft 3: mflops = 7.08497 (norm. = 0.0335023), norm. avg. (of 3) = 0.0377869 fft 4: mflops = 19.6608 (norm. = 0.0929688), norm. avg. (of 3) = 0.103249 fft 5: mflops = 7.56185 (norm. = 0.0357572), norm. avg. (of 3) = 0.0568475 fft 6: mflops = 19.7845 (norm. = 0.0935535), norm. avg. (of 3) = 0.122913 fft 7: mflops = 15.8875 (norm. = 0.0751263), norm. avg. (of 3) = 0.102276 fft 8: mflops = 16.0496 (norm. = 0.0758929), norm. avg. (of 3) = 0.197732 fft 9: mflops = 38.3625 (norm. = 0.181402), norm. avg. (of 3) = 0.134708 fft 10: mflops = 20.0365 (norm. = 0.0947452), norm. avg. (of 3) = 0.0753771 fft 11: mflops = 19.065 (norm. = 0.0901515), norm. avg. (of 2) = 0.142015 fft 12: mflops = 48.3958 (norm. = 0.228846), norm. avg. (of 3) = 0.237545 fft 13: mflops = 42.799 (norm. = 0.202381), norm. avg. (of 3) = 0.209277 fft 14: mflops = 146.313 (norm. = 0.69186), norm. avg. (of 3) = 0.804723 fft 15: mflops = 144.631 (norm. = 0.683908), norm. avg. (of 3) = 0.806376 fft 16: mflops = 211.478 (norm. = 1), norm. avg. (of 3) = 1 fft 17: mflops = 61.6809 (norm. = 0.291667), norm. avg. (of 1) = 0.291667 fft 18: mflops = 24.7695 (norm. 
= 0.117126), norm. avg. (of 3) = 0.115843 fft 19: mflops = 18.3961 (norm. = 0.0869883), norm. avg. (of 3) = 0.0972456 fft 20: mflops = 17.1898 (norm. = 0.0812842), norm. avg. (of 3) = 0.0912954 fft 21: mflops = 123.362 (norm. = 0.583333), norm. avg. (of 3) = 0.601689 fft 22: mflops = 44.6203 (norm. = 0.210993), norm. avg. (of 2) = 0.266513 fft 23: mflops = 44.939 (norm. = 0.2125), norm. avg. (of 2) = 0.264583 fft 24: mflops = 43.6907 (norm. = 0.206597), norm. avg. (of 2) = 0.262962 fft 25: mflops = 12.0066 (norm. = 0.0567748), norm. avg. (of 2) = 0.0529988 fft 26: mflops = 9.03945 (norm. = 0.0427443), norm. avg. (of 3) = 0.0501385 fft 27: mflops = 13.6771 (norm. = 0.0646739), norm. avg. (of 3) = 0.0550575 fft 28: mflops = 18.2891 (norm. = 0.0864826), norm. avg. (of 3) = 0.101315 fft 29: mflops = 14.9797 (norm. = 0.0708333), norm. avg. (of 3) = 0.0863546 fft 30: mflops = 98.304 (norm. = 0.464844), norm. avg. (of 3) = 0.560819 fft 31: mflops = 72.3156 (norm. = 0.341954), norm. avg. (of 3) = 0.473136 fft 32: mflops = -1 (norm. = -0.00472864), norm. avg. (of 0) = -1 fft 33: mflops = 5.82542 (norm. = 0.0275463), norm. avg. (of 2) = 0.0351695 fft 34: mflops = 16.2151 (norm. = 0.0766753), norm. avg. (of 2) = 0.115573 fft 35: mflops = 15.7286 (norm. = 0.074375), norm. avg. (of 3) = 0.093179 fft 36: mflops = 17.8735 (norm. = 0.084517), norm. avg. (of 3) = 0.104703 fft 37: mflops = 37.0086 (norm. = 0.175), norm. avg. (of 3) = 0.243328 fft 38: mflops = 15.4202 (norm. = 0.0729167), norm. avg. (of 3) = 0.186416 fft 39: mflops = 22.4695 (norm. = 0.10625), norm. avg. (of 3) = 0.0988636 fft 40: mflops = 20.0365 (norm. = 0.0947452), norm. avg. (of 3) = 0.0914433 fft 41: mflops = 5.65778 (norm. = 0.0267536), norm. avg. (of 3) = 0.0582248 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.16 s, 131072 iters, t-(init.)=1.1 s t(norm)=0.13113, mflops=38.13 (err=1.6e-16) 1. 
Arndt DIT: elapsed time t=1.14 s, 131072 iters, t-(init.)=1.08 s t(norm)=0.128746, mflops=38.8361 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.56 s, 131072 iters, t-(init.)=1.5 s t(norm)=0.178814, mflops=27.962 (err=1.4e-16) 3. Arndt 4-step: elapsed time t=1.85 s, 65536 iters, t-(init.)=1.82 s t(norm)=0.433922, mflops=11.5228 (err=1.3e-16) 4. Bailey: elapsed time t=1.69 s, 131072 iters, t-(init.)=1.63 s t(norm)=0.194311, mflops=25.7319 (err=1.4e-16) 5. Beauregard: elapsed time t=1.25 s, 32768 iters, t-(init.)=1.24 s t(norm)=0.591278, mflops=8.45626 (err=1.9e-16) 6. Bergland: elapsed time t=1.65 s, 131072 iters, t-(init.)=1.59 s t(norm)=0.189543, mflops=26.3793 (err=2.0e-16) 7. Brenner: elapsed time t=1.06 s, 65536 iters, t-(init.)=1.03 s t(norm)=0.245571, mflops=20.3607 (err=1.5e-16) 8. Burrus: elapsed time t=1.28 s, 65536 iters, t-(init.)=1.25 s t(norm)=0.298023, mflops=16.7772 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.31 s, 262144 iters, t-(init.)=1.19 s t(norm)=0.0709295, mflops=70.4925 10. CWP (best N) (N=28): elapsed time t=1.28 s, 131072 iters, t-(init.)=1.18 s t(norm)=0.140667, mflops=35.5449 11. Edelblute: elapsed time t=1.14 s, 65536 iters, t-(init.)=1.11 s t(norm)=0.264645, mflops=18.8933 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.45 s, 262144 iters, t-(init.)=1.33 s t(norm)=0.0792742, mflops=63.0722 (err=1.5e-16) 13. FFTPACK (f2c): elapsed time t=1.32 s, 262144 iters, t-(init.)=1.2 s t(norm)=0.0715256, mflops=69.9051 (err=1.3e-16) FFTW_MEASURE plan: (cost = 2.899170e-06) FFTW_NOTW 16 14. FFTW: elapsed time t=1.54 s, 524288 iters, t-(init.)=1.3 s t(norm)=0.038743, mflops=129.056 (err=1.6e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.56 s, 524288 iters, t-(init.)=1.32 s t(norm)=0.0393391, mflops=127.1 (err=1.6e-16) 16. Frigo-old: elapsed time t=1.06 s, 524288 iters, t-(init.)=0.8 s t(norm)=0.0238419, mflops=209.715 (err=1.6e-16) 17. 
Green: elapsed time t=1.36 s, 262144 iters, t-(init.)=1.24 s t(norm)=0.0739098, mflops=67.6501 (err=1.9e-16) 18. GSL: elapsed time t=1.34 s, 131072 iters, t-(init.)=1.28 s t(norm)=0.152588, mflops=32.768 (err=1.5e-16) 19. GSL DIT: elapsed time t=1.71 s, 131072 iters, t-(init.)=1.66 s t(norm)=0.197887, mflops=25.2669 (err=1.6e-16) 20. GSL DIF: elapsed time t=1.76 s, 131072 iters, t-(init.)=1.71 s t(norm)=0.203848, mflops=24.5281 (err=1.9e-16) 21. Krukar: elapsed time t=1.44 s, 524288 iters, t-(init.)=1.2 s t(norm)=0.0357628, mflops=139.81 (err=1.7e-16) 22. Mayer (Buneman): elapsed time t=1.2 s, 131072 iters, t-(init.)=1.13 s t(norm)=0.134706, mflops=37.1177 (err=1.3e-16) 23. Mayer (simple): elapsed time t=1.03 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.115633, mflops=43.2402 24. Mayer (lookup): elapsed time t=1.86 s, 262144 iters, t-(init.)=1.74 s t(norm)=0.103712, mflops=48.2104 (err=1.4e-16) 25. Monro: elapsed time t=1.15 s, 65536 iters, t-(init.)=1.12 s t(norm)=0.267029, mflops=18.7246 (err=1.2e-08) 26. NAPACK (f2c): elapsed time t=1.6 s, 65536 iters, t-(init.)=1.57 s t(norm)=0.374317, mflops=13.3577 (err=3.5e-16) 27. Nielsen: elapsed time t=1.33 s, 65536 iters, t-(init.)=1.3 s t(norm)=0.309944, mflops=16.1319 (err=1.3e-16) 28. NR (C): elapsed time t=1.62 s, 131072 iters, t-(init.)=1.56 s t(norm)=0.185966, mflops=26.8866 (err=1.6e-16) 29. NR (F): elapsed time t=1.01 s, 65536 iters, t-(init.)=0.98 s t(norm)=0.23365, mflops=21.3995 (err=1.7e-16) 30. Ooura (C): elapsed time t=1.84 s, 524288 iters, t-(init.)=1.58 s t(norm)=0.0470877, mflops=106.185 (err=1.4e-16) 31. Ooura (F): elapsed time t=1.13 s, 262144 iters, t-(init.)=1.01 s t(norm)=0.0602007, mflops=83.0555 (err=1.4e-16) 32. QFT: elapsed time t=1.03 s, 262144 iters, t-(init.)=0.91 s t(norm)=0.0542402, mflops=92.1825 (err=1.3e-16) 33. Ransom: elapsed time t=1.42 s, 65536 iters, t-(init.)=1.39 s t(norm)=0.331402, mflops=15.0874 (err=6.0e-16) 34. 
SCIPORT: elapsed time t=1.39 s, 65536 iters, t-(init.)=1.36 s t(norm)=0.324249, mflops=15.4202 (err=5.2e-08) 35. Singleton: elapsed time t=1.08 s, 65536 iters, t-(init.)=1.05 s t(norm)=0.25034, mflops=19.9729 (err=1.7e-16) 36. Singleton (f2c): elapsed time t=1.32 s, 131072 iters, t-(init.)=1.26 s t(norm)=0.150204, mflops=33.2881 (err=1.9e-16) 37. Sorensen: elapsed time t=1.89 s, 262144 iters, t-(init.)=1.77 s t(norm)=0.1055, mflops=47.3933 (err=1.3e-16) 38. Sorensen DIT: elapsed time t=1.34 s, 65536 iters, t-(init.)=1.31 s t(norm)=0.312328, mflops=16.0088 (err=1.4e-16) 39. Temperton: elapsed time t=1.22 s, 131072 iters, t-(init.)=1.16 s t(norm)=0.138283, mflops=36.1578 (err=2.9e-08) 40. Temperton (f2c): elapsed time t=1.5 s, 131072 iters, t-(init.)=1.44 s t(norm)=0.171661, mflops=29.1271 (err=1.5e-16) 41. Valkenburg: elapsed time t=1.83 s, 32768 iters, t-(init.)=1.82 s t(norm)=0.867844, mflops=5.76141 (err=3.0e-16) Top mflops for N=16 = 209.715 Normalized results and averages for N=16: fft 0: mflops = 38.13 (norm. = 0.181818), norm. avg. (of 4) = 0.40353 fft 1: mflops = 38.8361 (norm. = 0.185185), norm. avg. (of 4) = 0.392713 fft 2: mflops = 27.962 (norm. = 0.133333), norm. avg. (of 4) = 0.220624 fft 3: mflops = 11.5228 (norm. = 0.0549451), norm. avg. (of 4) = 0.0420764 fft 4: mflops = 25.7319 (norm. = 0.122699), norm. avg. (of 4) = 0.108111 fft 5: mflops = 8.45626 (norm. = 0.0403226), norm. avg. (of 4) = 0.0527163 fft 6: mflops = 26.3793 (norm. = 0.125786), norm. avg. (of 4) = 0.123631 fft 7: mflops = 20.3607 (norm. = 0.0970874), norm. avg. (of 4) = 0.100979 fft 8: mflops = 16.7772 (norm. = 0.08), norm. avg. (of 4) = 0.168299 fft 9: mflops = 70.4925 (norm. = 0.336134), norm. avg. (of 4) = 0.185064 fft 10: mflops = 35.5449 (norm. = 0.169492), norm. avg. (of 4) = 0.0989057 fft 11: mflops = 18.8933 (norm. = 0.0900901), norm. avg. (of 3) = 0.124706 fft 12: mflops = 63.0722 (norm. = 0.300752), norm. avg. (of 4) = 0.253347 fft 13: mflops = 69.9051 (norm. 
= 0.333333), norm. avg. (of 4) = 0.240291 fft 14: mflops = 129.056 (norm. = 0.615385), norm. avg. (of 4) = 0.757388 fft 15: mflops = 127.1 (norm. = 0.606061), norm. avg. (of 4) = 0.756297 fft 16: mflops = 209.715 (norm. = 1), norm. avg. (of 4) = 1 fft 17: mflops = 67.6501 (norm. = 0.322581), norm. avg. (of 2) = 0.307124 fft 18: mflops = 32.768 (norm. = 0.15625), norm. avg. (of 4) = 0.125945 fft 19: mflops = 25.2669 (norm. = 0.120482), norm. avg. (of 4) = 0.103055 fft 20: mflops = 24.5281 (norm. = 0.116959), norm. avg. (of 4) = 0.0977113 fft 21: mflops = 139.81 (norm. = 0.666667), norm. avg. (of 4) = 0.617934 fft 22: mflops = 37.1177 (norm. = 0.176991), norm. avg. (of 3) = 0.236673 fft 23: mflops = 43.2402 (norm. = 0.206186), norm. avg. (of 3) = 0.245117 fft 24: mflops = 48.2104 (norm. = 0.229885), norm. avg. (of 3) = 0.251937 fft 25: mflops = 18.7246 (norm. = 0.0892857), norm. avg. (of 3) = 0.0650944 fft 26: mflops = 13.3577 (norm. = 0.0636943), norm. avg. (of 4) = 0.0535274 fft 27: mflops = 16.1319 (norm. = 0.0769231), norm. avg. (of 4) = 0.0605239 fft 28: mflops = 26.8866 (norm. = 0.128205), norm. avg. (of 4) = 0.108037 fft 29: mflops = 21.3995 (norm. = 0.102041), norm. avg. (of 4) = 0.0902761 fft 30: mflops = 106.185 (norm. = 0.506329), norm. avg. (of 4) = 0.547197 fft 31: mflops = 83.0555 (norm. = 0.39604), norm. avg. (of 4) = 0.453862 fft 32: mflops = 92.1825 (norm. = 0.43956), norm. avg. (of 1) = 0.43956 fft 33: mflops = 15.0874 (norm. = 0.0719424), norm. avg. (of 3) = 0.0474272 fft 34: mflops = 15.4202 (norm. = 0.0735294), norm. avg. (of 3) = 0.101559 fft 35: mflops = 19.9729 (norm. = 0.0952381), norm. avg. (of 4) = 0.0936938 fft 36: mflops = 33.2881 (norm. = 0.15873), norm. avg. (of 4) = 0.11821 fft 37: mflops = 47.3933 (norm. = 0.225989), norm. avg. (of 4) = 0.238993 fft 38: mflops = 16.0088 (norm. = 0.0763359), norm. avg. (of 4) = 0.158896 fft 39: mflops = 36.1578 (norm. = 0.172414), norm. avg. (of 4) = 0.117251 fft 40: mflops = 29.1271 (norm. 
= 0.138889), norm. avg. (of 4) = 0.103305 fft 41: mflops = 5.76141 (norm. = 0.0274725), norm. avg. (of 4) = 0.0505367 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.47 s, 65536 iters, t-(init.)=1.42 s t(norm)=0.135422, mflops=36.9217 (err=6.0e-16) 1. Arndt DIT: elapsed time t=1.48 s, 65536 iters, t-(init.)=1.42 s t(norm)=0.135422, mflops=36.9217 (err=5.5e-16) 2. Arndt Split-Radix: elapsed time t=1.76 s, 65536 iters, t-(init.)=1.71 s t(norm)=0.163078, mflops=30.6601 (err=3.6e-16) 3. Arndt 4-step: elapsed time t=1.1 s, 16384 iters, t-(init.)=1.09 s t(norm)=0.415802, mflops=12.025 (err=3.2e-16) 4. Bailey: elapsed time t=1.54 s, 65536 iters, t-(init.)=1.49 s t(norm)=0.142097, mflops=35.1871 (err=6.7e-16) 5. Beauregard: elapsed time t=1.48 s, 16384 iters, t-(init.)=1.47 s t(norm)=0.56076, mflops=8.91646 (err=6.6e-16) 6. Bergland: elapsed time t=1.63 s, 65536 iters, t-(init.)=1.58 s t(norm)=0.150681, mflops=33.1828 (err=6.4e-16) 7. Brenner: elapsed time t=1.12 s, 32768 iters, t-(init.)=1.1 s t(norm)=0.209808, mflops=23.8313 (err=6.0e-16) 8. Burrus: elapsed time t=1.47 s, 32768 iters, t-(init.)=1.44 s t(norm)=0.274658, mflops=18.2044 (err=3.5e-16) 9. CWP (min N) (N=33): elapsed time t=1.9 s, 131072 iters, t-(init.)=1.79 s t(norm)=0.0853539, mflops=58.5797 10. CWP (best N) (N=35): elapsed time t=1.7 s, 131072 iters, t-(init.)=1.59 s t(norm)=0.0758171, mflops=65.9482 11. Edelblute: elapsed time t=1.29 s, 32768 iters, t-(init.)=1.26 s t(norm)=0.240326, mflops=20.8051 (err=3.5e-16) 12. FFTPACK: elapsed time t=1.92 s, 131072 iters, t-(init.)=1.81 s t(norm)=0.0863075, mflops=57.9324 (err=4.4e-16) 13. FFTPACK (f2c): elapsed time t=1.01 s, 65536 iters, t-(init.)=0.96 s t(norm)=0.0915527, mflops=54.6133 (err=5.3e-16) FFTW_MEASURE plan: (cost = 8.239746e-06) FFTW_TWIDDLE 8 FFTW_NOTW 4 14. 
FFTW: elapsed time t=1.14 s, 131072 iters, t-(init.)=1.04 s t(norm)=0.0495911, mflops=100.825 (err=6.1e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.41 s, 131072 iters, t-(init.)=1.3 s t(norm)=0.0619888, mflops=80.6597 (err=6.1e-16) 16. Frigo-old: elapsed time t=1.27 s, 262144 iters, t-(init.)=1.03 s t(norm)=0.0245571, mflops=203.607 (err=5.5e-16) 17. Green: elapsed time t=1.5 s, 131072 iters, t-(init.)=1.39 s t(norm)=0.0662804, mflops=75.4371 (err=6.9e-16) 18. GSL: elapsed time t=1.57 s, 65536 iters, t-(init.)=1.52 s t(norm)=0.144958, mflops=34.4926 (err=5.0e-16) 19. GSL DIT: elapsed time t=1.56 s, 65536 iters, t-(init.)=1.51 s t(norm)=0.144005, mflops=34.7211 (err=6.1e-16) 20. GSL DIF: elapsed time t=1.61 s, 65536 iters, t-(init.)=1.55 s t(norm)=0.14782, mflops=33.825 (err=4.3e-16) 21. Krukar: elapsed time t=1.64 s, 262144 iters, t-(init.)=1.43 s t(norm)=0.0340939, mflops=146.654 (err=8.6e-16) 22. Mayer (Buneman): elapsed time t=1.41 s, 65536 iters, t-(init.)=1.35 s t(norm)=0.128746, mflops=38.8361 (err=3.5e-16) 23. Mayer (simple): elapsed time t=1.1 s, 65536 iters, t-(init.)=1.04 s t(norm)=0.0991821, mflops=50.4123 24. Mayer (lookup): elapsed time t=1.02 s, 65536 iters, t-(init.)=0.97 s t(norm)=0.0925064, mflops=54.0503 (err=5.9e-16) 25. Monro: elapsed time t=1.02 s, 32768 iters, t-(init.)=0.99 s t(norm)=0.188828, mflops=26.4792 (err=1.2e-07) 26. NAPACK (f2c): elapsed time t=1.59 s, 32768 iters, t-(init.)=1.57 s t(norm)=0.299454, mflops=16.6971 (err=9.3e-16) 27. Nielsen: elapsed time t=1.3 s, 32768 iters, t-(init.)=1.27 s t(norm)=0.242233, mflops=20.6413 (err=3.1e-15) 28. NR (C): elapsed time t=1.46 s, 65536 iters, t-(init.)=1.41 s t(norm)=0.134468, mflops=37.1835 (err=6.1e-16) 29. NR (F): elapsed time t=1.87 s, 65536 iters, t-(init.)=1.82 s t(norm)=0.173569, mflops=28.807 (err=7.0e-16) 30. Ooura (C): elapsed time t=1 s, 131072 iters, t-(init.)=0.88 s t(norm)=0.0419617, mflops=119.156 (err=4.3e-16) 31. 
Ooura (F): elapsed time t=1.27 s, 131072 iters, t-(init.)=1.17 s t(norm)=0.0557899, mflops=89.6219 (err=4.3e-16) 32. QFT: elapsed time t=1.52 s, 131072 iters, t-(init.)=1.41 s t(norm)=0.067234, mflops=74.3671 (err=4.6e-16) 33. Ransom: elapsed time t=1.67 s, 32768 iters, t-(init.)=1.65 s t(norm)=0.314713, mflops=15.8875 (err=3.8e-15) 34. SCIPORT: elapsed time t=1.76 s, 32768 iters, t-(init.)=1.73 s t(norm)=0.329971, mflops=15.1528 (err=3.1e-07) 35. Singleton: elapsed time t=1.09 s, 32768 iters, t-(init.)=1.06 s t(norm)=0.202179, mflops=24.7306 (err=7.1e-16) 36. Singleton (f2c): elapsed time t=1.32 s, 65536 iters, t-(init.)=1.27 s t(norm)=0.121117, mflops=41.2825 (err=5.8e-16) 37. Sorensen: elapsed time t=1.81 s, 131072 iters, t-(init.)=1.71 s t(norm)=0.0815392, mflops=61.3202 (err=3.4e-16) 38. Sorensen DIT: elapsed time t=1.54 s, 32768 iters, t-(init.)=1.52 s t(norm)=0.289917, mflops=17.2463 (err=5.1e-16) 39. Temperton: elapsed time t=1.36 s, 65536 iters, t-(init.)=1.31 s t(norm)=0.124931, mflops=40.022 (err=1.8e-07) 40. Temperton (f2c): elapsed time t=1.83 s, 65536 iters, t-(init.)=1.78 s t(norm)=0.169754, mflops=29.4544 (err=5.1e-16) 41. Valkenburg: elapsed time t=1.12 s, 8192 iters, t-(init.)=1.12 s t(norm)=0.854492, mflops=5.85143 (err=8.4e-16) Top mflops for N=32 = 203.607 Normalized results and averages for N=32: fft 0: mflops = 36.9217 (norm. = 0.181338), norm. avg. (of 5) = 0.359092 fft 1: mflops = 36.9217 (norm. = 0.181338), norm. avg. (of 5) = 0.350438 fft 2: mflops = 30.6601 (norm. = 0.150585), norm. avg. (of 5) = 0.206616 fft 3: mflops = 12.025 (norm. = 0.0590596), norm. avg. (of 5) = 0.0454731 fft 4: mflops = 35.1871 (norm. = 0.172819), norm. avg. (of 5) = 0.121053 fft 5: mflops = 8.91646 (norm. = 0.0437925), norm. avg. (of 5) = 0.0509315 fft 6: mflops = 33.1828 (norm. = 0.162975), norm. avg. (of 5) = 0.1315 fft 7: mflops = 23.8313 (norm. = 0.117045), norm. avg. (of 5) = 0.104192 fft 8: mflops = 18.2044 (norm. = 0.0894097), norm. avg. 
(of 5) = 0.152521 fft 9: mflops = 58.5797 (norm. = 0.287709), norm. avg. (of 5) = 0.205593 fft 10: mflops = 65.9482 (norm. = 0.323899), norm. avg. (of 5) = 0.143904 fft 11: mflops = 20.8051 (norm. = 0.102183), norm. avg. (of 4) = 0.119075 fft 12: mflops = 57.9324 (norm. = 0.28453), norm. avg. (of 5) = 0.259584 fft 13: mflops = 54.6133 (norm. = 0.268229), norm. avg. (of 5) = 0.245879 fft 14: mflops = 100.825 (norm. = 0.495192), norm. avg. (of 5) = 0.704949 fft 15: mflops = 80.6597 (norm. = 0.396154), norm. avg. (of 5) = 0.684269 fft 16: mflops = 203.607 (norm. = 1), norm. avg. (of 5) = 1 fft 17: mflops = 75.4371 (norm. = 0.370504), norm. avg. (of 3) = 0.32825 fft 18: mflops = 34.4926 (norm. = 0.169408), norm. avg. (of 5) = 0.134638 fft 19: mflops = 34.7211 (norm. = 0.17053), norm. avg. (of 5) = 0.11655 fft 20: mflops = 33.825 (norm. = 0.166129), norm. avg. (of 5) = 0.111395 fft 21: mflops = 146.654 (norm. = 0.72028), norm. avg. (of 5) = 0.638403 fft 22: mflops = 38.8361 (norm. = 0.190741), norm. avg. (of 4) = 0.22519 fft 23: mflops = 50.4123 (norm. = 0.247596), norm. avg. (of 4) = 0.245737 fft 24: mflops = 54.0503 (norm. = 0.265464), norm. avg. (of 4) = 0.255318 fft 25: mflops = 26.4792 (norm. = 0.130051), norm. avg. (of 4) = 0.0813335 fft 26: mflops = 16.6971 (norm. = 0.0820064), norm. avg. (of 5) = 0.0592232 fft 27: mflops = 20.6413 (norm. = 0.101378), norm. avg. (of 5) = 0.0686947 fft 28: mflops = 37.1835 (norm. = 0.182624), norm. avg. (of 5) = 0.122955 fft 29: mflops = 28.807 (norm. = 0.141484), norm. avg. (of 5) = 0.100518 fft 30: mflops = 119.156 (norm. = 0.585227), norm. avg. (of 5) = 0.554803 fft 31: mflops = 89.6219 (norm. = 0.440171), norm. avg. (of 5) = 0.451124 fft 32: mflops = 74.3671 (norm. = 0.365248), norm. avg. (of 2) = 0.402404 fft 33: mflops = 15.8875 (norm. = 0.0780303), norm. avg. (of 4) = 0.055078 fft 34: mflops = 15.1528 (norm. = 0.074422), norm. avg. (of 4) = 0.0947745 fft 35: mflops = 24.7306 (norm. = 0.121462), norm. avg. 
(of 5) = 0.0992475 fft 36: mflops = 41.2825 (norm. = 0.202756), norm. avg. (of 5) = 0.135119 fft 37: mflops = 61.3202 (norm. = 0.30117), norm. avg. (of 5) = 0.251428 fft 38: mflops = 17.2463 (norm. = 0.0847039), norm. avg. (of 5) = 0.144057 fft 39: mflops = 40.022 (norm. = 0.196565), norm. avg. (of 5) = 0.133114 fft 40: mflops = 29.4544 (norm. = 0.144663), norm. avg. (of 5) = 0.111576 fft 41: mflops = 5.85143 (norm. = 0.0287388), norm. avg. (of 5) = 0.0461772 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.51 s, 32768 iters, t-(init.)=1.46 s t(norm)=0.11603, mflops=43.0922 (err=2.3e-16) 1. Arndt DIT: elapsed time t=1.5 s, 32768 iters, t-(init.)=1.45 s t(norm)=0.115236, mflops=43.3894 (err=3.4e-16) 2. Arndt Split-Radix: elapsed time t=1.93 s, 32768 iters, t-(init.)=1.88 s t(norm)=0.149409, mflops=33.4652 (err=4.4e-16) 3. Arndt 4-step: elapsed time t=1 s, 8192 iters, t-(init.)=0.99 s t(norm)=0.314713, mflops=15.8875 (err=3.6e-16) 4. Bailey: elapsed time t=1.59 s, 32768 iters, t-(init.)=1.54 s t(norm)=0.122388, mflops=40.8536 (err=4.8e-16) 5. Beauregard: elapsed time t=1.79 s, 8192 iters, t-(init.)=1.78 s t(norm)=0.565847, mflops=8.83631 (err=3.8e-16) 6. Bergland: elapsed time t=1 s, 16384 iters, t-(init.)=0.97 s t(norm)=0.154177, mflops=32.4302 (err=2.5e-16) 7. Brenner: elapsed time t=1.18 s, 16384 iters, t-(init.)=1.16 s t(norm)=0.184377, mflops=27.1183 (err=3.8e-16) 8. Burrus: elapsed time t=1.61 s, 16384 iters, t-(init.)=1.59 s t(norm)=0.252724, mflops=19.7845 (err=4.2e-16) 9. CWP (min N) (N=65): elapsed time t=1.7 s, 65536 iters, t-(init.)=1.6 s t(norm)=0.0635783, mflops=78.6432 10. CWP (best N) (N=84): elapsed time t=1.98 s, 65536 iters, t-(init.)=1.86 s t(norm)=0.0739098, mflops=67.6501 11. Edelblute: elapsed time t=1.44 s, 16384 iters, t-(init.)=1.42 s t(norm)=0.225703, mflops=22.153 (err=3.6e-16) 12. FFTPACK: elapsed time t=1.04 s, 32768 iters, t-(init.)=0.99 s t(norm)=0.0786781, mflops=63.5501 (err=4.2e-16) 13. 
FFTPACK (f2c): elapsed time t=1.93 s, 65536 iters, t-(init.)=1.84 s t(norm)=0.073115, mflops=68.3854 (err=4.5e-16) FFTW_MEASURE plan: (cost = 1.708984e-05) FFTW_TWIDDLE 16 FFTW_NOTW 4 14. FFTW: elapsed time t=1.15 s, 65536 iters, t-(init.)=1.05 s t(norm)=0.0417233, mflops=119.837 (err=3.1e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.5 s, 65536 iters, t-(init.)=1.4 s t(norm)=0.055631, mflops=89.8779 (err=2.6e-16) 16. Frigo-old: elapsed time t=1.15 s, 32768 iters, t-(init.)=1.09 s t(norm)=0.0866254, mflops=57.7198 (err=4.5e-16) 17. Green: elapsed time t=1.57 s, 65536 iters, t-(init.)=1.47 s t(norm)=0.0584126, mflops=85.598 (err=3.7e-16) 18. GSL: elapsed time t=1.7 s, 32768 iters, t-(init.)=1.66 s t(norm)=0.131925, mflops=37.9003 (err=3.8e-16) 19. GSL DIT: elapsed time t=1.65 s, 32768 iters, t-(init.)=1.6 s t(norm)=0.127157, mflops=39.3216 (err=3.2e-16) 20. GSL DIF: elapsed time t=1.69 s, 32768 iters, t-(init.)=1.64 s t(norm)=0.130335, mflops=38.3625 (err=3.1e-16) 21. Krukar: elapsed time t=1.25 s, 32768 iters, t-(init.)=1.2 s t(norm)=0.0953674, mflops=52.4288 (err=5.3e-16) 22. Mayer (Buneman): elapsed time t=1.71 s, 32768 iters, t-(init.)=1.65 s t(norm)=0.13113, mflops=38.13 (err=2.0e-16) 23. Mayer (simple): elapsed time t=1.3 s, 32768 iters, t-(init.)=1.25 s t(norm)=0.0993411, mflops=50.3316 24. Mayer (lookup): elapsed time t=1.16 s, 32768 iters, t-(init.)=1.11 s t(norm)=0.0882149, mflops=56.6798 (err=3.4e-16) 25. Monro: elapsed time t=1.03 s, 16384 iters, t-(init.)=1.01 s t(norm)=0.160535, mflops=31.1458 (err=4.9e-08) 26. NAPACK (f2c): elapsed time t=1.53 s, 16384 iters, t-(init.)=1.51 s t(norm)=0.240008, mflops=20.8326 (err=1.0e-15) 27. Nielsen: elapsed time t=1.36 s, 16384 iters, t-(init.)=1.34 s t(norm)=0.212987, mflops=23.4756 (err=6.5e-15) 28. NR (C): elapsed time t=1.37 s, 32768 iters, t-(init.)=1.32 s t(norm)=0.104904, mflops=47.6625 (err=3.2e-16) 29. 
NR (F): elapsed time t=1.81 s, 32768 iters, t-(init.)=1.77 s t(norm)=0.140667, mflops=35.5449 (err=3.4e-16) 30. Ooura (C): elapsed time t=1.18 s, 65536 iters, t-(init.)=1.06 s t(norm)=0.0421206, mflops=118.707 (err=2.9e-16) 31. Ooura (F): elapsed time t=1.46 s, 65536 iters, t-(init.)=1.36 s t(norm)=0.0540415, mflops=92.5214 (err=2.9e-16) 32. QFT: elapsed time t=1.01 s, 32768 iters, t-(init.)=0.96 s t(norm)=0.0762939, mflops=65.536 (err=5.9e-16) 33. Ransom: elapsed time t=1.17 s, 16384 iters, t-(init.)=1.14 s t(norm)=0.181198, mflops=27.5941 (err=2.5e-15) 34. SCIPORT: elapsed time t=1.08 s, 8192 iters, t-(init.)=1.07 s t(norm)=0.340144, mflops=14.6997 (err=2.0e-07) 35. Singleton: elapsed time t=1.3 s, 16384 iters, t-(init.)=1.28 s t(norm)=0.203451, mflops=24.576 (err=3.5e-16) 36. Singleton (f2c): elapsed time t=1.35 s, 32768 iters, t-(init.)=1.3 s t(norm)=0.103315, mflops=48.3958 (err=4.6e-16) 37. Sorensen: elapsed time t=1.84 s, 65536 iters, t-(init.)=1.74 s t(norm)=0.0691414, mflops=72.3156 (err=3.4e-16) 38. Sorensen DIT: elapsed time t=1.68 s, 16384 iters, t-(init.)=1.65 s t(norm)=0.26226, mflops=19.065 (err=5.4e-16) 39. Temperton: elapsed time t=1.22 s, 32768 iters, t-(init.)=1.17 s t(norm)=0.0929832, mflops=53.7731 (err=1.2e-07) 40. Temperton (f2c): elapsed time t=1.68 s, 32768 iters, t-(init.)=1.63 s t(norm)=0.129541, mflops=38.5979 (err=3.7e-16) 41. Valkenburg: elapsed time t=1.32 s, 4096 iters, t-(init.)=1.32 s t(norm)=0.839233, mflops=5.95782 (err=8.5e-16) Top mflops for N=64 = 119.837 Normalized results and averages for N=64: fft 0: mflops = 43.0922 (norm. = 0.359589), norm. avg. (of 6) = 0.359175 fft 1: mflops = 43.3894 (norm. = 0.362069), norm. avg. (of 6) = 0.352376 fft 2: mflops = 33.4652 (norm. = 0.279255), norm. avg. (of 6) = 0.218722 fft 3: mflops = 15.8875 (norm. = 0.132576), norm. avg. (of 6) = 0.0599902 fft 4: mflops = 40.8536 (norm. = 0.340909), norm. avg. (of 6) = 0.157695 fft 5: mflops = 8.83631 (norm. = 0.073736), norm. avg. 
(of 6) = 0.0547323 fft 6: mflops = 32.4302 (norm. = 0.270619), norm. avg. (of 6) = 0.154687 fft 7: mflops = 27.1183 (norm. = 0.226293), norm. avg. (of 6) = 0.124542 fft 8: mflops = 19.7845 (norm. = 0.165094), norm. avg. (of 6) = 0.154617 fft 9: mflops = 78.6432 (norm. = 0.65625), norm. avg. (of 6) = 0.280703 fft 10: mflops = 67.6501 (norm. = 0.564516), norm. avg. (of 6) = 0.214006 fft 11: mflops = 22.153 (norm. = 0.184859), norm. avg. (of 5) = 0.132232 fft 12: mflops = 63.5501 (norm. = 0.530303), norm. avg. (of 6) = 0.304703 fft 13: mflops = 68.3854 (norm. = 0.570652), norm. avg. (of 6) = 0.300008 fft 14: mflops = 119.837 (norm. = 1), norm. avg. (of 6) = 0.754124 fft 15: mflops = 89.8779 (norm. = 0.75), norm. avg. (of 6) = 0.695224 fft 16: mflops = 57.7198 (norm. = 0.481651), norm. avg. (of 6) = 0.913609 fft 17: mflops = 85.598 (norm. = 0.714286), norm. avg. (of 4) = 0.424759 fft 18: mflops = 37.9003 (norm. = 0.316265), norm. avg. (of 6) = 0.164909 fft 19: mflops = 39.3216 (norm. = 0.328125), norm. avg. (of 6) = 0.151812 fft 20: mflops = 38.3625 (norm. = 0.320122), norm. avg. (of 6) = 0.146183 fft 21: mflops = 52.4288 (norm. = 0.4375), norm. avg. (of 6) = 0.604919 fft 22: mflops = 38.13 (norm. = 0.318182), norm. avg. (of 5) = 0.243788 fft 23: mflops = 50.3316 (norm. = 0.42), norm. avg. (of 5) = 0.28059 fft 24: mflops = 56.6798 (norm. = 0.472973), norm. avg. (of 5) = 0.298849 fft 25: mflops = 31.1458 (norm. = 0.259901), norm. avg. (of 5) = 0.117047 fft 26: mflops = 20.8326 (norm. = 0.173841), norm. avg. (of 6) = 0.0783262 fft 27: mflops = 23.4756 (norm. = 0.195896), norm. avg. (of 6) = 0.0898948 fft 28: mflops = 47.6625 (norm. = 0.397727), norm. avg. (of 6) = 0.16875 fft 29: mflops = 35.5449 (norm. = 0.29661), norm. avg. (of 6) = 0.1332 fft 30: mflops = 118.707 (norm. = 0.990566), norm. avg. (of 6) = 0.62743 fft 31: mflops = 92.5214 (norm. = 0.772059), norm. avg. (of 6) = 0.504613 fft 32: mflops = 65.536 (norm. = 0.546875), norm. avg. 
(of 3) = 0.450561 fft 33: mflops = 27.5941 (norm. = 0.230263), norm. avg. (of 5) = 0.090115 fft 34: mflops = 14.6997 (norm. = 0.122664), norm. avg. (of 5) = 0.100352 fft 35: mflops = 24.576 (norm. = 0.205078), norm. avg. (of 6) = 0.116886 fft 36: mflops = 48.3958 (norm. = 0.403846), norm. avg. (of 6) = 0.179907 fft 37: mflops = 72.3156 (norm. = 0.603448), norm. avg. (of 6) = 0.310098 fft 38: mflops = 19.065 (norm. = 0.159091), norm. avg. (of 6) = 0.146563 fft 39: mflops = 53.7731 (norm. = 0.448718), norm. avg. (of 6) = 0.185715 fft 40: mflops = 38.5979 (norm. = 0.322086), norm. avg. (of 6) = 0.146661 fft 41: mflops = 5.95782 (norm. = 0.0497159), norm. avg. (of 6) = 0.0467669 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.82 s, 16384 iters, t-(init.)=1.77 s t(norm)=0.120572, mflops=41.4691 (err=3.8e-16) 1. Arndt DIT: elapsed time t=1.84 s, 16384 iters, t-(init.)=1.8 s t(norm)=0.122615, mflops=40.778 (err=5.1e-16) 2. Arndt Split-Radix: elapsed time t=1.02 s, 8192 iters, t-(init.)=1 s t(norm)=0.136239, mflops=36.7002 (err=6.1e-16) 3. Arndt 4-step: elapsed time t=1.25 s, 4096 iters, t-(init.)=1.23 s t(norm)=0.335148, mflops=14.9188 (err=3.3e-16) 4. Bailey: elapsed time t=1.55 s, 16384 iters, t-(init.)=1.51 s t(norm)=0.102861, mflops=48.6095 (err=6.1e-16) 5. Beauregard: elapsed time t=1.04 s, 2048 iters, t-(init.)=1.04 s t(norm)=0.566755, mflops=8.82215 (err=9.3e-16) 6. Bergland: elapsed time t=1.07 s, 8192 iters, t-(init.)=1.05 s t(norm)=0.143051, mflops=34.9525 (err=6.2e-16) 7. Brenner: elapsed time t=1.27 s, 8192 iters, t-(init.)=1.24 s t(norm)=0.168937, mflops=29.5969 (err=6.6e-16) 8. Burrus: elapsed time t=1.7 s, 8192 iters, t-(init.)=1.68 s t(norm)=0.228882, mflops=21.8453 (err=5.3e-16) 9. CWP (min N) (N=130): elapsed time t=1.67 s, 32768 iters, t-(init.)=1.57 s t(norm)=0.0534739, mflops=93.5036 10. CWP (best N) (N=140): elapsed time t=1.57 s, 32768 iters, t-(init.)=1.47 s t(norm)=0.0500679, mflops=99.8644 11. 
Edelblute: elapsed time t=1.52 s, 8192 iters, t-(init.)=1.5 s t(norm)=0.204359, mflops=24.4668 (err=6.7e-16) 12. FFTPACK: elapsed time t=1.07 s, 16384 iters, t-(init.)=1.02 s t(norm)=0.069482, mflops=71.9611 (err=5.3e-16) 13. FFTPACK (f2c): elapsed time t=1.02 s, 16384 iters, t-(init.)=0.97 s t(norm)=0.066076, mflops=75.6704 (err=6.2e-16) FFTW_MEASURE plan: (cost = 4.150391e-05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 8 14. FFTW: elapsed time t=1.38 s, 32768 iters, t-(init.)=1.29 s t(norm)=0.0439371, mflops=113.799 (err=5.2e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.68 s, 32768 iters, t-(init.)=1.59 s t(norm)=0.0541551, mflops=92.3274 (err=3.6e-16) 16. Frigo-old: elapsed time t=1.45 s, 16384 iters, t-(init.)=1.4 s t(norm)=0.0953674, mflops=52.4288 (err=4.4e-16) 17. Green: elapsed time t=1.87 s, 32768 iters, t-(init.)=1.77 s t(norm)=0.0602858, mflops=82.9382 (err=6.9e-16) 18. GSL: elapsed time t=1.82 s, 16384 iters, t-(init.)=1.78 s t(norm)=0.121253, mflops=41.2361 (err=8.2e-16) 19. GSL DIT: elapsed time t=1.73 s, 16384 iters, t-(init.)=1.68 s t(norm)=0.114441, mflops=43.6907 (err=7.5e-16) 20. GSL DIF: elapsed time t=1.77 s, 16384 iters, t-(init.)=1.72 s t(norm)=0.117166, mflops=42.6746 (err=7.6e-16) 21. Krukar: elapsed time t=1.05 s, 8192 iters, t-(init.)=1.03 s t(norm)=0.140326, mflops=35.6312 (err=6.5e-16) 22. Mayer (Buneman): elapsed time t=1.86 s, 16384 iters, t-(init.)=1.81 s t(norm)=0.123296, mflops=40.5527 (err=3.1e-16) 23. Mayer (simple): elapsed time t=1.36 s, 16384 iters, t-(init.)=1.31 s t(norm)=0.0892367, mflops=56.0308 24. Mayer (lookup): elapsed time t=1.26 s, 16384 iters, t-(init.)=1.21 s t(norm)=0.0824247, mflops=60.6614 (err=3.5e-16) 25. Monro: elapsed time t=1.04 s, 8192 iters, t-(init.)=1.02 s t(norm)=0.138964, mflops=35.9805 (err=8.3e-08) 26. NAPACK (f2c): elapsed time t=1.66 s, 8192 iters, t-(init.)=1.64 s t(norm)=0.223432, mflops=22.3781 (err=1.6e-15) 27. 
Nielsen: elapsed time t=1.44 s, 8192 iters, t-(init.)=1.42 s t(norm)=0.19346, mflops=25.8452 (err=1.7e-15) 28. NR (C): elapsed time t=1.34 s, 16384 iters, t-(init.)=1.29 s t(norm)=0.0878743, mflops=56.8995 (err=7.5e-16) 29. NR (F): elapsed time t=1.8 s, 16384 iters, t-(init.)=1.76 s t(norm)=0.11989, mflops=41.7047 (err=6.9e-16) 30. Ooura (C): elapsed time t=1.27 s, 32768 iters, t-(init.)=1.15 s t(norm)=0.0391688, mflops=127.653 (err=6.7e-16) 31. Ooura (F): elapsed time t=1.62 s, 32768 iters, t-(init.)=1.53 s t(norm)=0.0521115, mflops=95.9481 (err=6.7e-16) 32. QFT: elapsed time t=1.24 s, 16384 iters, t-(init.)=1.2 s t(norm)=0.0817435, mflops=61.1669 (err=4.9e-16) 33. Ransom: elapsed time t=1.39 s, 8192 iters, t-(init.)=1.37 s t(norm)=0.186648, mflops=26.7884 (err=1.7e-15) 34. SCIPORT: elapsed time t=1.27 s, 4096 iters, t-(init.)=1.26 s t(norm)=0.343323, mflops=14.5636 (err=1.6e-07) 35. Singleton: elapsed time t=1.18 s, 8192 iters, t-(init.)=1.16 s t(norm)=0.158037, mflops=31.6381 (err=6.2e-16) 36. Singleton (f2c): elapsed time t=1.38 s, 16384 iters, t-(init.)=1.34 s t(norm)=0.0912803, mflops=54.7764 (err=5.7e-16) 37. Sorensen: elapsed time t=1.86 s, 32768 iters, t-(init.)=1.77 s t(norm)=0.0602858, mflops=82.9382 (err=4.3e-16) 38. Sorensen DIT: elapsed time t=1.8 s, 8192 iters, t-(init.)=1.78 s t(norm)=0.242506, mflops=20.6181 (err=4.0e-16) 39. Temperton: elapsed time t=1.36 s, 16384 iters, t-(init.)=1.32 s t(norm)=0.0899179, mflops=55.6063 (err=9.9e-08) 40. Temperton (f2c): elapsed time t=1.01 s, 8192 iters, t-(init.)=0.98 s t(norm)=0.133514, mflops=37.4491 (err=7.7e-16) 41. Valkenburg: elapsed time t=1.51 s, 2048 iters, t-(init.)=1.5 s t(norm)=0.817435, mflops=6.11669 (err=8.6e-16) Top mflops for N=128 = 127.653 Normalized results and averages for N=128: fft 0: mflops = 41.4691 (norm. = 0.324859), norm. avg. (of 7) = 0.354272 fft 1: mflops = 40.778 (norm. = 0.319444), norm. avg. (of 7) = 0.347672 fft 2: mflops = 36.7002 (norm. = 0.2875), norm. avg. 
(of 7) = 0.228548 fft 3: mflops = 14.9188 (norm. = 0.11687), norm. avg. (of 7) = 0.0681159 fft 4: mflops = 48.6095 (norm. = 0.380795), norm. avg. (of 7) = 0.189567 fft 5: mflops = 8.82215 (norm. = 0.0691106), norm. avg. (of 7) = 0.0567863 fft 6: mflops = 34.9525 (norm. = 0.27381), norm. avg. (of 7) = 0.171704 fft 7: mflops = 29.5969 (norm. = 0.231855), norm. avg. (of 7) = 0.139873 fft 8: mflops = 21.8453 (norm. = 0.171131), norm. avg. (of 7) = 0.156976 fft 9: mflops = 93.5036 (norm. = 0.732484), norm. avg. (of 7) = 0.345243 fft 10: mflops = 99.8644 (norm. = 0.782313), norm. avg. (of 7) = 0.295193 fft 11: mflops = 24.4668 (norm. = 0.191667), norm. avg. (of 6) = 0.142138 fft 12: mflops = 71.9611 (norm. = 0.563725), norm. avg. (of 7) = 0.341707 fft 13: mflops = 75.6704 (norm. = 0.592784), norm. avg. (of 7) = 0.341833 fft 14: mflops = 113.799 (norm. = 0.891473), norm. avg. (of 7) = 0.773745 fft 15: mflops = 92.3274 (norm. = 0.72327), norm. avg. (of 7) = 0.699231 fft 16: mflops = 52.4288 (norm. = 0.410714), norm. avg. (of 7) = 0.841767 fft 17: mflops = 82.9382 (norm. = 0.649718), norm. avg. (of 5) = 0.469751 fft 18: mflops = 41.2361 (norm. = 0.323034), norm. avg. (of 7) = 0.187498 fft 19: mflops = 43.6907 (norm. = 0.342262), norm. avg. (of 7) = 0.179019 fft 20: mflops = 42.6746 (norm. = 0.334302), norm. avg. (of 7) = 0.173057 fft 21: mflops = 35.6312 (norm. = 0.279126), norm. avg. (of 7) = 0.558377 fft 22: mflops = 40.5527 (norm. = 0.31768), norm. avg. (of 6) = 0.256103 fft 23: mflops = 56.0308 (norm. = 0.438931), norm. avg. (of 6) = 0.30698 fft 24: mflops = 60.6614 (norm. = 0.475207), norm. avg. (of 6) = 0.328242 fft 25: mflops = 35.9805 (norm. = 0.281863), norm. avg. (of 6) = 0.144516 fft 26: mflops = 22.3781 (norm. = 0.175305), norm. avg. (of 7) = 0.0921803 fft 27: mflops = 25.8452 (norm. = 0.202465), norm. avg. (of 7) = 0.105976 fft 28: mflops = 56.8995 (norm. = 0.445736), norm. avg. (of 7) = 0.20832 fft 29: mflops = 41.7047 (norm. = 0.326705), norm. avg. 
(of 7) = 0.160843 fft 30: mflops = 127.653 (norm. = 1), norm. avg. (of 7) = 0.680654 fft 31: mflops = 95.9481 (norm. = 0.751634), norm. avg. (of 7) = 0.539902 fft 32: mflops = 61.1669 (norm. = 0.479167), norm. avg. (of 4) = 0.457713 fft 33: mflops = 26.7884 (norm. = 0.209854), norm. avg. (of 6) = 0.110072 fft 34: mflops = 14.5636 (norm. = 0.114087), norm. avg. (of 6) = 0.102642 fft 35: mflops = 31.6381 (norm. = 0.247845), norm. avg. (of 7) = 0.135594 fft 36: mflops = 54.7764 (norm. = 0.429104), norm. avg. (of 7) = 0.215507 fft 37: mflops = 82.9382 (norm. = 0.649718), norm. avg. (of 7) = 0.358615 fft 38: mflops = 20.6181 (norm. = 0.161517), norm. avg. (of 7) = 0.148699 fft 39: mflops = 55.6063 (norm. = 0.435606), norm. avg. (of 7) = 0.221413 fft 40: mflops = 37.4491 (norm. = 0.293367), norm. avg. (of 7) = 0.167619 fft 41: mflops = 6.11669 (norm. = 0.0479167), norm. avg. (of 7) = 0.0469312 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.74 s, 8192 iters, t-(init.)=1.69 s t(norm)=0.100732, mflops=49.6367 (err=4.8e-16) 1. Arndt DIT: elapsed time t=1.71 s, 8192 iters, t-(init.)=1.66 s t(norm)=0.0989437, mflops=50.5338 (err=5.1e-16) 2. Arndt Split-Radix: elapsed time t=1.06 s, 4096 iters, t-(init.)=1.03 s t(norm)=0.122786, mflops=40.7214 (err=5.5e-16) 3. Arndt 4-step: elapsed time t=1.24 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.293255, mflops=17.05 (err=5.7e-16) 4. Bailey: elapsed time t=1.73 s, 8192 iters, t-(init.)=1.69 s t(norm)=0.100732, mflops=49.6367 (err=5.5e-16) 5. Beauregard: elapsed time t=1.19 s, 1024 iters, t-(init.)=1.18 s t(norm)=0.562668, mflops=8.88624 (err=4.8e-16) 6. Bergland: elapsed time t=1.11 s, 4096 iters, t-(init.)=1.09 s t(norm)=0.129938, mflops=38.4799 (err=5.7e-16) 7. Brenner: elapsed time t=1.34 s, 4096 iters, t-(init.)=1.32 s t(norm)=0.157356, mflops=31.775 (err=4.8e-16) 8. Burrus: elapsed time t=1.76 s, 4096 iters, t-(init.)=1.73 s t(norm)=0.206232, mflops=24.2445 (err=5.4e-16) 9. 
CWP (min N) (N=260): elapsed time t=1.67 s, 16384 iters, t-(init.)=1.58 s t(norm)=0.0470877, mflops=106.185 10. CWP (best N) (N=280): elapsed time t=1.6 s, 16384 iters, t-(init.)=1.5 s t(norm)=0.0447035, mflops=111.848 11. Edelblute: elapsed time t=1.58 s, 4096 iters, t-(init.)=1.56 s t(norm)=0.185966, mflops=26.8866 (err=5.9e-16) 12. FFTPACK: elapsed time t=1.24 s, 8192 iters, t-(init.)=1.19 s t(norm)=0.0709295, mflops=70.4925 (err=4.4e-16) 13. FFTPACK (f2c): elapsed time t=1.1 s, 8192 iters, t-(init.)=1.05 s t(norm)=0.0625849, mflops=79.8915 (err=4.5e-16) FFTW_MEASURE plan: (cost = 8.544922e-05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 8 14. FFTW: elapsed time t=1.42 s, 16384 iters, t-(init.)=1.33 s t(norm)=0.0396371, mflops=126.144 (err=4.2e-16) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.74 s, 16384 iters, t-(init.)=1.65 s t(norm)=0.0491738, mflops=101.68 (err=4.6e-16) 16. Frigo-old: elapsed time t=1.57 s, 8192 iters, t-(init.)=1.52 s t(norm)=0.0905991, mflops=55.1882 (err=4.5e-16) 17. Green: elapsed time t=1.05 s, 8192 iters, t-(init.)=1 s t(norm)=0.0596046, mflops=83.8861 (err=4.8e-16) 18. GSL: elapsed time t=1 s, 4096 iters, t-(init.)=0.97 s t(norm)=0.115633, mflops=43.2402 (err=4.7e-16) 19. GSL DIT: elapsed time t=1.83 s, 8192 iters, t-(init.)=1.79 s t(norm)=0.106692, mflops=46.8637 (err=5.0e-16) 20. GSL DIF: elapsed time t=1.85 s, 8192 iters, t-(init.)=1.8 s t(norm)=0.107288, mflops=46.6034 (err=4.9e-16) 21. Krukar: elapsed time t=1.68 s, 8192 iters, t-(init.)=1.63 s t(norm)=0.0971556, mflops=51.4639 (err=5.0e-16) 22. Mayer (Buneman): elapsed time t=1.02 s, 4096 iters, t-(init.)=0.99 s t(norm)=0.118017, mflops=42.3667 (err=4.7e-16) 23. Mayer (simple): elapsed time t=1.53 s, 8192 iters, t-(init.)=1.48 s t(norm)=0.0882149, mflops=56.6798 24. Mayer (lookup): elapsed time t=1.42 s, 8192 iters, t-(init.)=1.37 s t(norm)=0.0816584, mflops=61.2307 (err=5.7e-16) 25. 
Monro: elapsed time t=1.1 s, 4096 iters, t-(init.)=1.07 s t(norm)=0.127554, mflops=39.1991 (err=8.2e-08) 26. NAPACK (f2c): elapsed time t=1.7 s, 4096 iters, t-(init.)=1.68 s t(norm)=0.200272, mflops=24.9661 (err=3.9e-15) 27. Nielsen: elapsed time t=1.59 s, 4096 iters, t-(init.)=1.57 s t(norm)=0.187159, mflops=26.7153 (err=3.8e-15) 28. NR (C): elapsed time t=1.38 s, 8192 iters, t-(init.)=1.33 s t(norm)=0.0792742, mflops=63.0722 (err=4.9e-16) 29. NR (F): elapsed time t=1.87 s, 8192 iters, t-(init.)=1.83 s t(norm)=0.109076, mflops=45.8394 (err=4.5e-16) 30. Ooura (C): elapsed time t=1.42 s, 16384 iters, t-(init.)=1.31 s t(norm)=0.039041, mflops=128.07 (err=5.0e-16) 31. Ooura (F): elapsed time t=1.78 s, 16384 iters, t-(init.)=1.69 s t(norm)=0.0503659, mflops=99.2735 (err=5.0e-16) 32. QFT: elapsed time t=1.48 s, 8192 iters, t-(init.)=1.44 s t(norm)=0.0858307, mflops=58.2542 (err=7.0e-16) 33. Ransom: elapsed time t=1.14 s, 4096 iters, t-(init.)=1.11 s t(norm)=0.132322, mflops=37.7865 (err=2.0e-15) 34. SCIPORT: elapsed time t=1.47 s, 2048 iters, t-(init.)=1.46 s t(norm)=0.348091, mflops=14.3641 (err=1.4e-07) 35. Singleton: elapsed time t=1.57 s, 4096 iters, t-(init.)=1.54 s t(norm)=0.183582, mflops=27.2357 (err=5.0e-16) 36. Singleton (f2c): elapsed time t=1.51 s, 8192 iters, t-(init.)=1.46 s t(norm)=0.0870228, mflops=57.4562 (err=5.4e-16) 37. Sorensen: elapsed time t=1.94 s, 16384 iters, t-(init.)=1.85 s t(norm)=0.0551343, mflops=90.6877 (err=6.0e-16) 38. Sorensen DIT: elapsed time t=1.9 s, 4096 iters, t-(init.)=1.88 s t(norm)=0.224113, mflops=22.3101 (err=5.7e-16) 39. Temperton: elapsed time t=1.41 s, 8192 iters, t-(init.)=1.36 s t(norm)=0.0810623, mflops=61.6809 (err=9.1e-08) 40. Temperton (f2c): elapsed time t=1.09 s, 4096 iters, t-(init.)=1.07 s t(norm)=0.127554, mflops=39.1991 (err=4.5e-16) 41. 
Valkenburg: elapsed time t=1.7 s, 1024 iters, t-(init.)=1.69 s t(norm)=0.805855, mflops=6.20459 (err=6.4e-16) Top mflops for N=256 = 128.07 Normalized results and averages for N=256: fft 0: mflops = 49.6367 (norm. = 0.387574), norm. avg. (of 8) = 0.358435 fft 1: mflops = 50.5338 (norm. = 0.394578), norm. avg. (of 8) = 0.353535 fft 2: mflops = 40.7214 (norm. = 0.317961), norm. avg. (of 8) = 0.239724 fft 3: mflops = 17.05 (norm. = 0.13313), norm. avg. (of 8) = 0.0762426 fft 4: mflops = 49.6367 (norm. = 0.387574), norm. avg. (of 8) = 0.214318 fft 5: mflops = 8.88624 (norm. = 0.0693856), norm. avg. (of 8) = 0.0583612 fft 6: mflops = 38.4799 (norm. = 0.300459), norm. avg. (of 8) = 0.187798 fft 7: mflops = 31.775 (norm. = 0.248106), norm. avg. (of 8) = 0.153402 fft 8: mflops = 24.2445 (norm. = 0.189306), norm. avg. (of 8) = 0.161017 fft 9: mflops = 106.185 (norm. = 0.829114), norm. avg. (of 8) = 0.405727 fft 10: mflops = 111.848 (norm. = 0.873333), norm. avg. (of 8) = 0.367461 fft 11: mflops = 26.8866 (norm. = 0.209936), norm. avg. (of 7) = 0.151823 fft 12: mflops = 70.4925 (norm. = 0.55042), norm. avg. (of 8) = 0.367796 fft 13: mflops = 79.8915 (norm. = 0.62381), norm. avg. (of 8) = 0.37708 fft 14: mflops = 126.144 (norm. = 0.984962), norm. avg. (of 8) = 0.800147 fft 15: mflops = 101.68 (norm. = 0.793939), norm. avg. (of 8) = 0.711069 fft 16: mflops = 55.1882 (norm. = 0.430921), norm. avg. (of 8) = 0.790411 fft 17: mflops = 83.8861 (norm. = 0.655), norm. avg. (of 6) = 0.500626 fft 18: mflops = 43.2402 (norm. = 0.337629), norm. avg. (of 8) = 0.206264 fft 19: mflops = 46.8637 (norm. = 0.365922), norm. avg. (of 8) = 0.202382 fft 20: mflops = 46.6034 (norm. = 0.363889), norm. avg. (of 8) = 0.196911 fft 21: mflops = 51.4639 (norm. = 0.40184), norm. avg. (of 8) = 0.53881 fft 22: mflops = 42.3667 (norm. = 0.330808), norm. avg. (of 7) = 0.266775 fft 23: mflops = 56.6798 (norm. = 0.442568), norm. avg. (of 7) = 0.32635 fft 24: mflops = 61.2307 (norm. = 0.478102), norm. avg. 
(of 7) = 0.349651 fft 25: mflops = 39.1991 (norm. = 0.306075), norm. avg. (of 7) = 0.167596 fft 26: mflops = 24.9661 (norm. = 0.19494), norm. avg. (of 8) = 0.105025 fft 27: mflops = 26.7153 (norm. = 0.208599), norm. avg. (of 8) = 0.118804 fft 28: mflops = 63.0722 (norm. = 0.492481), norm. avg. (of 8) = 0.24384 fft 29: mflops = 45.8394 (norm. = 0.357923), norm. avg. (of 8) = 0.185478 fft 30: mflops = 128.07 (norm. = 1), norm. avg. (of 8) = 0.720573 fft 31: mflops = 99.2735 (norm. = 0.775148), norm. avg. (of 8) = 0.569307 fft 32: mflops = 58.2542 (norm. = 0.454861), norm. avg. (of 5) = 0.457142 fft 33: mflops = 37.7865 (norm. = 0.295045), norm. avg. (of 7) = 0.136496 fft 34: mflops = 14.3641 (norm. = 0.112158), norm. avg. (of 7) = 0.104001 fft 35: mflops = 27.2357 (norm. = 0.212662), norm. avg. (of 8) = 0.145228 fft 36: mflops = 57.4562 (norm. = 0.44863), norm. avg. (of 8) = 0.244647 fft 37: mflops = 90.6877 (norm. = 0.708108), norm. avg. (of 8) = 0.402302 fft 38: mflops = 22.3101 (norm. = 0.174202), norm. avg. (of 8) = 0.151887 fft 39: mflops = 61.6809 (norm. = 0.481618), norm. avg. (of 8) = 0.253939 fft 40: mflops = 39.1991 (norm. = 0.306075), norm. avg. (of 8) = 0.184926 fft 41: mflops = 6.20459 (norm. = 0.0484467), norm. avg. (of 8) = 0.0471206 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.03 s, 2048 iters, t-(init.)=1.01 s t(norm)=0.107023, mflops=46.7187 (err=5.4e-16) 1. Arndt DIT: elapsed time t=1.03 s, 2048 iters, t-(init.)=1.01 s t(norm)=0.107023, mflops=46.7187 (err=5.5e-16) 2. Arndt Split-Radix: elapsed time t=1.17 s, 2048 iters, t-(init.)=1.15 s t(norm)=0.121858, mflops=41.0312 (err=6.0e-16) 3. Arndt 4-step: elapsed time t=1.4 s, 1024 iters, t-(init.)=1.38 s t(norm)=0.29246, mflops=17.0963 (err=5.5e-16) 4. Bailey: elapsed time t=1.94 s, 4096 iters, t-(init.)=1.89 s t(norm)=0.100136, mflops=49.9322 (err=5.7e-16) 5. 
Beauregard: elapsed time t=1.34 s, 512 iters, t-(init.)=1.33 s t(norm)=0.563727, mflops=8.86953 (err=6.6e-16) 6. Bergland: elapsed time t=1.27 s, 2048 iters, t-(init.)=1.25 s t(norm)=0.132455, mflops=37.7487 (err=5.5e-16) 7. Brenner: elapsed time t=1.45 s, 2048 iters, t-(init.)=1.42 s t(norm)=0.150469, mflops=33.2295 (err=5.7e-16) 8. Burrus: elapsed time t=1.87 s, 2048 iters, t-(init.)=1.85 s t(norm)=0.196033, mflops=25.5059 (err=5.8e-16) 9. CWP (min N) (N=520): elapsed time t=1.74 s, 8192 iters, t-(init.)=1.65 s t(norm)=0.0437101, mflops=114.39 10. CWP (best N) (N=560): elapsed time t=1.7 s, 8192 iters, t-(init.)=1.6 s t(norm)=0.0423855, mflops=117.965 11. Edelblute: elapsed time t=1.7 s, 2048 iters, t-(init.)=1.68 s t(norm)=0.178019, mflops=28.0869 (err=6.1e-16) 12. FFTPACK: elapsed time t=1.71 s, 4096 iters, t-(init.)=1.67 s t(norm)=0.0884798, mflops=56.5101 (err=5.5e-16) 13. FFTPACK (f2c): elapsed time t=1.76 s, 4096 iters, t-(init.)=1.71 s t(norm)=0.0905991, mflops=55.1882 (err=5.4e-16) FFTW_MEASURE plan: (cost = 2.636719e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 8 14. FFTW: elapsed time t=1.06 s, 4096 iters, t-(init.)=1.01 s t(norm)=0.0535117, mflops=93.4375 (err=5.4e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.09 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.0551012, mflops=90.7422 (err=5.4e-16) 16. Frigo-old: elapsed time t=1.5 s, 4096 iters, t-(init.)=1.44 s t(norm)=0.0762939, mflops=65.536 (err=5.4e-16) 17. Green: elapsed time t=1.12 s, 4096 iters, t-(init.)=1.08 s t(norm)=0.0572205, mflops=87.3813 (err=5.9e-16) 18. GSL: elapsed time t=1.18 s, 2048 iters, t-(init.)=1.15 s t(norm)=0.121858, mflops=41.0312 (err=6.1e-16) 19. GSL DIT: elapsed time t=1.97 s, 4096 iters, t-(init.)=1.93 s t(norm)=0.102255, mflops=48.8973 (err=5.9e-16) 20. GSL DIF: elapsed time t=1 s, 2048 iters, t-(init.)=0.97 s t(norm)=0.102785, mflops=48.6453 (err=5.6e-16) 21. 
Krukar: elapsed time t=1.08 s, 2048 iters, t-(init.)=1.05 s t(norm)=0.111262, mflops=44.939 (err=5.6e-16) 22. Mayer (Buneman): elapsed time t=1.05 s, 2048 iters, t-(init.)=1.02 s t(norm)=0.108083, mflops=46.2607 (err=5.7e-16) 23. Mayer (simple): elapsed time t=1.54 s, 4096 iters, t-(init.)=1.48 s t(norm)=0.0784132, mflops=63.7648 24. Mayer (lookup): elapsed time t=1.44 s, 4096 iters, t-(init.)=1.39 s t(norm)=0.0736448, mflops=67.8934 (err=5.1e-16) 25. Monro: elapsed time t=1.26 s, 2048 iters, t-(init.)=1.24 s t(norm)=0.131395, mflops=38.0532 (err=7.5e-08) 26. NAPACK (f2c): elapsed time t=1.92 s, 2048 iters, t-(init.)=1.9 s t(norm)=0.201331, mflops=24.8347 (err=6.0e-15) 27. Nielsen: elapsed time t=1.78 s, 2048 iters, t-(init.)=1.76 s t(norm)=0.186496, mflops=26.8102 (err=3.0e-15) 28. NR (C): elapsed time t=1.47 s, 4096 iters, t-(init.)=1.42 s t(norm)=0.0752343, mflops=66.459 (err=5.9e-16) 29. NR (F): elapsed time t=1.95 s, 4096 iters, t-(init.)=1.91 s t(norm)=0.101195, mflops=49.4093 (err=5.8e-16) 30. Ooura (C): elapsed time t=1.52 s, 8192 iters, t-(init.)=1.42 s t(norm)=0.0376172, mflops=132.918 (err=5.3e-16) 31. Ooura (F): elapsed time t=1.91 s, 8192 iters, t-(init.)=1.81 s t(norm)=0.0479486, mflops=104.278 (err=5.3e-16) 32. QFT: elapsed time t=1.09 s, 2048 iters, t-(init.)=1.06 s t(norm)=0.112322, mflops=44.515 (err=7.4e-16) 33. Ransom: elapsed time t=1.32 s, 2048 iters, t-(init.)=1.3 s t(norm)=0.137753, mflops=36.2969 (err=1.6e-15) 34. SCIPORT: elapsed time t=1.77 s, 1024 iters, t-(init.)=1.75 s t(norm)=0.370873, mflops=13.4817 (err=1.3e-07) 35. Singleton: elapsed time t=1.64 s, 2048 iters, t-(init.)=1.62 s t(norm)=0.171661, mflops=29.1271 (err=7.9e-16) 36. Singleton (f2c): elapsed time t=1.6 s, 4096 iters, t-(init.)=1.56 s t(norm)=0.0826518, mflops=60.4948 (err=8.0e-16) 37. Sorensen: elapsed time t=1.11 s, 4096 iters, t-(init.)=1.06 s t(norm)=0.0561608, mflops=89.03 (err=5.7e-16) 38. 
Sorensen DIT: elapsed time t=2 s, 2048 iters, t-(init.)=1.97 s t(norm)=0.208749, mflops=23.9522 (err=5.5e-16) 39. Temperton: elapsed time t=1.67 s, 4096 iters, t-(init.)=1.63 s t(norm)=0.0863605, mflops=57.8968 (err=9.2e-08) 40. Temperton (f2c): elapsed time t=1.34 s, 2048 iters, t-(init.)=1.32 s t(norm)=0.139872, mflops=35.7469 (err=6.1e-16) 41. Valkenburg: elapsed time t=1.94 s, 512 iters, t-(init.)=1.94 s t(norm)=0.822279, mflops=6.08066 (err=6.6e-16) Top mflops for N=512 = 132.918 Normalized results and averages for N=512: fft 0: mflops = 46.7187 (norm. = 0.351485), norm. avg. (of 9) = 0.357663 fft 1: mflops = 46.7187 (norm. = 0.351485), norm. avg. (of 9) = 0.353307 fft 2: mflops = 41.0312 (norm. = 0.308696), norm. avg. (of 9) = 0.247388 fft 3: mflops = 17.0963 (norm. = 0.128623), norm. avg. (of 9) = 0.0820627 fft 4: mflops = 49.9322 (norm. = 0.375661), norm. avg. (of 9) = 0.232245 fft 5: mflops = 8.86953 (norm. = 0.0667293), norm. avg. (of 9) = 0.059291 fft 6: mflops = 37.7487 (norm. = 0.284), norm. avg. (of 9) = 0.198487 fft 7: mflops = 33.2295 (norm. = 0.25), norm. avg. (of 9) = 0.164135 fft 8: mflops = 25.5059 (norm. = 0.191892), norm. avg. (of 9) = 0.164448 fft 9: mflops = 114.39 (norm. = 0.860606), norm. avg. (of 9) = 0.456269 fft 10: mflops = 117.965 (norm. = 0.8875), norm. avg. (of 9) = 0.425243 fft 11: mflops = 28.0869 (norm. = 0.21131), norm. avg. (of 8) = 0.159259 fft 12: mflops = 56.5101 (norm. = 0.42515), norm. avg. (of 9) = 0.374168 fft 13: mflops = 55.1882 (norm. = 0.415205), norm. avg. (of 9) = 0.381316 fft 14: mflops = 93.4375 (norm. = 0.70297), norm. avg. (of 9) = 0.78935 fft 15: mflops = 90.7422 (norm. = 0.682692), norm. avg. (of 9) = 0.707916 fft 16: mflops = 65.536 (norm. = 0.493056), norm. avg. (of 9) = 0.757371 fft 17: mflops = 87.3813 (norm. = 0.657407), norm. avg. (of 7) = 0.523023 fft 18: mflops = 41.0312 (norm. = 0.308696), norm. avg. (of 9) = 0.217646 fft 19: mflops = 48.8973 (norm. = 0.367876), norm. avg. 
(of 9) = 0.22077 fft 20: mflops = 48.6453 (norm. = 0.365979), norm. avg. (of 9) = 0.215696 fft 21: mflops = 44.939 (norm. = 0.338095), norm. avg. (of 9) = 0.516509 fft 22: mflops = 46.2607 (norm. = 0.348039), norm. avg. (of 8) = 0.276933 fft 23: mflops = 63.7648 (norm. = 0.47973), norm. avg. (of 8) = 0.345522 fft 24: mflops = 67.8934 (norm. = 0.510791), norm. avg. (of 8) = 0.369793 fft 25: mflops = 38.0532 (norm. = 0.28629), norm. avg. (of 8) = 0.182433 fft 26: mflops = 24.8347 (norm. = 0.186842), norm. avg. (of 9) = 0.114116 fft 27: mflops = 26.8102 (norm. = 0.201705), norm. avg. (of 9) = 0.128015 fft 28: mflops = 66.459 (norm. = 0.5), norm. avg. (of 9) = 0.272302 fft 29: mflops = 49.4093 (norm. = 0.371728), norm. avg. (of 9) = 0.206173 fft 30: mflops = 132.918 (norm. = 1), norm. avg. (of 9) = 0.75162 fft 31: mflops = 104.278 (norm. = 0.78453), norm. avg. (of 9) = 0.593221 fft 32: mflops = 44.515 (norm. = 0.334906), norm. avg. (of 6) = 0.43677 fft 33: mflops = 36.2969 (norm. = 0.273077), norm. avg. (of 8) = 0.153569 fft 34: mflops = 13.4817 (norm. = 0.101429), norm. avg. (of 8) = 0.103679 fft 35: mflops = 29.1271 (norm. = 0.219136), norm. avg. (of 9) = 0.15344 fft 36: mflops = 60.4948 (norm. = 0.455128), norm. avg. (of 9) = 0.268034 fft 37: mflops = 89.03 (norm. = 0.669811), norm. avg. (of 9) = 0.432025 fft 38: mflops = 23.9522 (norm. = 0.180203), norm. avg. (of 9) = 0.155033 fft 39: mflops = 57.8968 (norm. = 0.435583), norm. avg. (of 9) = 0.274122 fft 40: mflops = 35.7469 (norm. = 0.268939), norm. avg. (of 9) = 0.194261 fft 41: mflops = 6.08066 (norm. = 0.0457474), norm. avg. (of 9) = 0.0469681 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.98 s, 2048 iters, t-(init.)=1.93 s t(norm)=0.0920296, mflops=54.3304 (err=5.2e-16) 1. Arndt DIT: elapsed time t=1.92 s, 2048 iters, t-(init.)=1.87 s t(norm)=0.0891685, mflops=56.0736 (err=4.9e-16) 2.
Arndt Split-Radix: elapsed time t=1.24 s, 1024 iters, t-(init.)=1.22 s t(norm)=0.116348, mflops=42.9744 (err=5.1e-16) 3. Arndt 4-step: elapsed time t=1.39 s, 512 iters, t-(init.)=1.37 s t(norm)=0.261307, mflops=19.1346 (err=4.4e-16) 4. Bailey: elapsed time t=1.01 s, 512 iters, t-(init.)=0.99 s t(norm)=0.188828, mflops=26.4792 (err=5.6e-16) 5. Beauregard: elapsed time t=1.51 s, 256 iters, t-(init.)=1.5 s t(norm)=0.572205, mflops=8.73813 (err=5.1e-16) 6. Bergland: elapsed time t=1.39 s, 1024 iters, t-(init.)=1.36 s t(norm)=0.1297, mflops=38.5506 (err=5.0e-16) 7. Brenner: elapsed time t=1.56 s, 1024 iters, t-(init.)=1.53 s t(norm)=0.145912, mflops=34.2672 (err=5.1e-16) 8. Burrus: elapsed time t=1.95 s, 1024 iters, t-(init.)=1.93 s t(norm)=0.184059, mflops=27.1652 (err=5.2e-16) 9. CWP (min N) (N=1040): elapsed time t=1.89 s, 4096 iters, t-(init.)=1.76 s t(norm)=0.0419617, mflops=119.156 10. CWP (best N) (N=1040): elapsed time t=1.9 s, 4096 iters, t-(init.)=1.78 s t(norm)=0.0424385, mflops=117.818 11. Edelblute: elapsed time t=1.77 s, 1024 iters, t-(init.)=1.74 s t(norm)=0.165939, mflops=30.1315 (err=5.2e-16) 12. FFTPACK: elapsed time t=1.48 s, 1024 iters, t-(init.)=1.45 s t(norm)=0.138283, mflops=36.1578 (err=4.9e-16) 13. FFTPACK (f2c): elapsed time t=1.3 s, 1024 iters, t-(init.)=1.27 s t(norm)=0.121117, mflops=41.2825 (err=4.7e-16) FFTW_MEASURE plan: (cost = 6.250000e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 4 FFTW_NOTW 8 14. FFTW: elapsed time t=1.27 s, 2048 iters, t-(init.)=1.21 s t(norm)=0.0576973, mflops=86.6592 (err=4.9e-16) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.35 s, 2048 iters, t-(init.)=1.3 s t(norm)=0.0619888, mflops=80.6597 (err=4.8e-16) 16. Frigo-old: elapsed time t=1.97 s, 2048 iters, t-(init.)=1.91 s t(norm)=0.0910759, mflops=54.8993 (err=4.7e-16) 17. 
Green: elapsed time t=1.35 s, 2048 iters, t-(init.)=1.3 s t(norm)=0.0619888, mflops=80.6597 (err=5.9e-16) 18. GSL: elapsed time t=1.79 s, 1024 iters, t-(init.)=1.76 s t(norm)=0.167847, mflops=29.7891 (err=4.9e-16) 19. GSL DIT: elapsed time t=1.13 s, 1024 iters, t-(init.)=1.11 s t(norm)=0.105858, mflops=47.2332 (err=5.1e-16) 20. GSL DIF: elapsed time t=1.18 s, 1024 iters, t-(init.)=1.16 s t(norm)=0.110626, mflops=45.1972 (err=4.9e-16) 21. Krukar: elapsed time t=1.51 s, 1024 iters, t-(init.)=1.49 s t(norm)=0.142097, mflops=35.1871 (err=5.4e-16) 22. Mayer (Buneman): elapsed time t=1.15 s, 1024 iters, t-(init.)=1.13 s t(norm)=0.107765, mflops=46.3972 (err=4.6e-16) 23. Mayer (simple): elapsed time t=1.75 s, 2048 iters, t-(init.)=1.7 s t(norm)=0.0810623, mflops=61.6809 24. Mayer (lookup): elapsed time t=1.65 s, 2048 iters, t-(init.)=1.59 s t(norm)=0.0758171, mflops=65.9482 (err=4.6e-16) 25. Monro: elapsed time t=1.34 s, 1024 iters, t-(init.)=1.31 s t(norm)=0.124931, mflops=40.022 (err=8.4e-08) 26. NAPACK (f2c): elapsed time t=1.17 s, 512 iters, t-(init.)=1.15 s t(norm)=0.219345, mflops=22.7951 (err=1.5e-14) 27. Nielsen: elapsed time t=1.01 s, 512 iters, t-(init.)=1 s t(norm)=0.190735, mflops=26.2144 (err=6.2e-15) 28. NR (C): elapsed time t=1.74 s, 2048 iters, t-(init.)=1.68 s t(norm)=0.0801086, mflops=62.4152 (err=5.1e-16) 29. NR (F): elapsed time t=1.09 s, 1024 iters, t-(init.)=1.07 s t(norm)=0.102043, mflops=48.9989 (err=5.0e-16) 30. Ooura (C): elapsed time t=1.81 s, 4096 iters, t-(init.)=1.7 s t(norm)=0.0405312, mflops=123.362 (err=4.6e-16) 31. Ooura (F): elapsed time t=1.09 s, 2048 iters, t-(init.)=1.03 s t(norm)=0.0491142, mflops=101.803 (err=4.6e-16) 32. QFT: elapsed time t=1.37 s, 1024 iters, t-(init.)=1.34 s t(norm)=0.127792, mflops=39.126 (err=9.5e-16) 33. Ransom: elapsed time t=1.18 s, 1024 iters, t-(init.)=1.16 s t(norm)=0.110626, mflops=45.1972 (err=1.8e-15) 34. 
SCIPORT: elapsed time t=1.09 s, 256 iters, t-(init.)=1.09 s t(norm)=0.415802, mflops=12.025 (err=1.4e-07) 35. Singleton: elapsed time t=1.92 s, 1024 iters, t-(init.)=1.89 s t(norm)=0.180244, mflops=27.7401 (err=6.0e-16) 36. Singleton (f2c): elapsed time t=1.8 s, 2048 iters, t-(init.)=1.74 s t(norm)=0.0829697, mflops=60.263 (err=6.0e-16) 37. Sorensen: elapsed time t=1.35 s, 2048 iters, t-(init.)=1.3 s t(norm)=0.0619888, mflops=80.6597 (err=4.9e-16) 38. Sorensen DIT: elapsed time t=1.07 s, 512 iters, t-(init.)=1.05 s t(norm)=0.200272, mflops=24.9661 (err=4.9e-16) 39. Temperton: elapsed time t=1.97 s, 2048 iters, t-(init.)=1.91 s t(norm)=0.0910759, mflops=54.8993 (err=9.8e-08) 40. Temperton (f2c): elapsed time t=1.42 s, 1024 iters, t-(init.)=1.39 s t(norm)=0.132561, mflops=37.7186 (err=4.8e-16) 41. Valkenburg: elapsed time t=1.12 s, 128 iters, t-(init.)=1.11 s t(norm)=0.846863, mflops=5.90414 (err=8.2e-16) Top mflops for N=1024 = 123.362 Normalized results and averages for N=1024: fft 0: mflops = 54.3304 (norm. = 0.440415), norm. avg. (of 10) = 0.365938 fft 1: mflops = 56.0736 (norm. = 0.454545), norm. avg. (of 10) = 0.363431 fft 2: mflops = 42.9744 (norm. = 0.348361), norm. avg. (of 10) = 0.257485 fft 3: mflops = 19.1346 (norm. = 0.155109), norm. avg. (of 10) = 0.0893674 fft 4: mflops = 26.4792 (norm. = 0.214646), norm. avg. (of 10) = 0.230485 fft 5: mflops = 8.73813 (norm. = 0.0708333), norm. avg. (of 10) = 0.0604452 fft 6: mflops = 38.5506 (norm. = 0.3125), norm. avg. (of 10) = 0.209889 fft 7: mflops = 34.2672 (norm. = 0.277778), norm. avg. (of 10) = 0.175499 fft 8: mflops = 27.1652 (norm. = 0.220207), norm. avg. (of 10) = 0.170024 fft 9: mflops = 119.156 (norm. = 0.965909), norm. avg. (of 10) = 0.507233 fft 10: mflops = 117.818 (norm. = 0.955056), norm. avg. (of 10) = 0.478224 fft 11: mflops = 30.1315 (norm. = 0.244253), norm. avg. (of 9) = 0.168703 fft 12: mflops = 36.1578 (norm. = 0.293103), norm. avg. (of 10) = 0.366062 fft 13: mflops = 41.2825 (norm. 
= 0.334646), norm. avg. (of 10) = 0.376649 fft 14: mflops = 86.6592 (norm. = 0.702479), norm. avg. (of 10) = 0.780663 fft 15: mflops = 80.6597 (norm. = 0.653846), norm. avg. (of 10) = 0.702509 fft 16: mflops = 54.8993 (norm. = 0.445026), norm. avg. (of 10) = 0.726137 fft 17: mflops = 80.6597 (norm. = 0.653846), norm. avg. (of 8) = 0.539376 fft 18: mflops = 29.7891 (norm. = 0.241477), norm. avg. (of 10) = 0.220029 fft 19: mflops = 47.2332 (norm. = 0.382883), norm. avg. (of 10) = 0.236982 fft 20: mflops = 45.1972 (norm. = 0.366379), norm. avg. (of 10) = 0.230765 fft 21: mflops = 35.1871 (norm. = 0.285235), norm. avg. (of 10) = 0.493381 fft 22: mflops = 46.3972 (norm. = 0.376106), norm. avg. (of 9) = 0.287953 fft 23: mflops = 61.6809 (norm. = 0.5), norm. avg. (of 9) = 0.362686 fft 24: mflops = 65.9482 (norm. = 0.534591), norm. avg. (of 9) = 0.388104 fft 25: mflops = 40.022 (norm. = 0.324427), norm. avg. (of 9) = 0.19821 fft 26: mflops = 22.7951 (norm. = 0.184783), norm. avg. (of 10) = 0.121183 fft 27: mflops = 26.2144 (norm. = 0.2125), norm. avg. (of 10) = 0.136464 fft 28: mflops = 62.4152 (norm. = 0.505952), norm. avg. (of 10) = 0.295667 fft 29: mflops = 48.9989 (norm. = 0.397196), norm. avg. (of 10) = 0.225275 fft 30: mflops = 123.362 (norm. = 1), norm. avg. (of 10) = 0.776458 fft 31: mflops = 101.803 (norm. = 0.825243), norm. avg. (of 10) = 0.616423 fft 32: mflops = 39.126 (norm. = 0.317164), norm. avg. (of 7) = 0.419683 fft 33: mflops = 45.1972 (norm. = 0.366379), norm. avg. (of 9) = 0.177214 fft 34: mflops = 12.025 (norm. = 0.0974771), norm. avg. (of 9) = 0.10299 fft 35: mflops = 27.7401 (norm. = 0.224868), norm. avg. (of 10) = 0.160583 fft 36: mflops = 60.263 (norm. = 0.488506), norm. avg. (of 10) = 0.290081 fft 37: mflops = 80.6597 (norm. = 0.653846), norm. avg. (of 10) = 0.454207 fft 38: mflops = 24.9661 (norm. = 0.202381), norm. avg. (of 10) = 0.159768 fft 39: mflops = 54.8993 (norm. = 0.445026), norm. avg. (of 10) = 0.291212 fft 40: mflops = 37.7186 (norm.
= 0.305755), norm. avg. (of 10) = 0.20541 fft 41: mflops = 5.90414 (norm. = 0.0478604), norm. avg. (of 10) = 0.0470573 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.02 s, 256 iters, t-(init.)=0.98 s t(norm)=0.169927, mflops=29.4243 (err=4.6e-16) 1. Arndt DIT: elapsed time t=1.05 s, 256 iters, t-(init.)=1.02 s t(norm)=0.176863, mflops=28.2704 (err=4.6e-16) 2. Arndt Split-Radix: elapsed time t=1.32 s, 256 iters, t-(init.)=1.29 s t(norm)=0.22368, mflops=22.3534 (err=4.8e-16) 3. Arndt 4-step: elapsed time t=1.82 s, 256 iters, t-(init.)=1.78 s t(norm)=0.308644, mflops=16.1999 (err=4.7e-16) 4. Bailey: elapsed time t=1.24 s, 256 iters, t-(init.)=1.2 s t(norm)=0.208074, mflops=24.0299 (err=5.0e-16) 5. Beauregard: elapsed time t=1.72 s, 128 iters, t-(init.)=1.7 s t(norm)=0.589544, mflops=8.48113 (err=4.8e-16) 6. Bergland: elapsed time t=1.01 s, 256 iters, t-(init.)=0.97 s t(norm)=0.168193, mflops=29.7277 (err=4.9e-16) 7. Brenner: elapsed time t=1.17 s, 256 iters, t-(init.)=1.13 s t(norm)=0.195937, mflops=25.5184 (err=5.0e-16) 8. Burrus: elapsed time t=1.63 s, 256 iters, t-(init.)=1.59 s t(norm)=0.275699, mflops=18.1357 (err=4.6e-16) 9. CWP (min N) (N=2145): elapsed time t=1.67 s, 1024 iters, t-(init.)=1.51 s t(norm)=0.0654567, mflops=76.3863 10. CWP (best N) (N=2184): elapsed time t=1.68 s, 1024 iters, t-(init.)=1.52 s t(norm)=0.0658902, mflops=75.8838 11. Edelblute: elapsed time t=1.59 s, 256 iters, t-(init.)=1.55 s t(norm)=0.268763, mflops=18.6038 (err=4.7e-16) 12. FFTPACK: elapsed time t=1.6 s, 512 iters, t-(init.)=1.52 s t(norm)=0.13178, mflops=37.9419 (err=4.5e-16) 13. FFTPACK (f2c): elapsed time t=1.44 s, 512 iters, t-(init.)=1.37 s t(norm)=0.118776, mflops=42.0961 (err=4.6e-16) FFTW_MEASURE plan: (cost = 1.562500e-03) FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 8 14. 
FFTW: elapsed time t=1.67 s, 1024 iters, t-(init.)=1.52 s t(norm)=0.0658902, mflops=75.8838 (err=4.6e-16) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.93 s, 1024 iters, t-(init.)=1.78 s t(norm)=0.0771609, mflops=64.7996 (err=4.4e-16) 16. Frigo-old: elapsed time t=1.25 s, 512 iters, t-(init.)=1.17 s t(norm)=0.101436, mflops=49.292 (err=4.6e-16) 17. Green: elapsed time t=1.29 s, 512 iters, t-(init.)=1.21 s t(norm)=0.104904, mflops=47.6625 (err=5.8e-16) 18. GSL: elapsed time t=1.92 s, 512 iters, t-(init.)=1.84 s t(norm)=0.159524, mflops=31.3433 (err=4.7e-16) 19. GSL DIT: elapsed time t=1.33 s, 256 iters, t-(init.)=1.3 s t(norm)=0.225414, mflops=22.1814 (err=4.5e-16) 20. GSL DIF: elapsed time t=1.3 s, 256 iters, t-(init.)=1.26 s t(norm)=0.218478, mflops=22.8856 (err=4.4e-16) 21. Krukar: elapsed time t=1.78 s, 512 iters, t-(init.)=1.7 s t(norm)=0.147386, mflops=33.9245 (err=5.0e-16) 22. Mayer (Buneman): elapsed time t=1.31 s, 512 iters, t-(init.)=1.23 s t(norm)=0.106638, mflops=46.8875 (err=4.5e-16) 23. Mayer (simple): elapsed time t=1.02 s, 512 iters, t-(init.)=0.94 s t(norm)=0.0814958, mflops=61.3529 24. Mayer (lookup): elapsed time t=1.08 s, 512 iters, t-(init.)=1.01 s t(norm)=0.0875646, mflops=57.1007 (err=4.5e-16) 25. Monro: elapsed time t=1.16 s, 256 iters, t-(init.)=1.12 s t(norm)=0.194203, mflops=25.7463 (err=1.0e-07) 26. NAPACK (f2c): elapsed time t=1.33 s, 256 iters, t-(init.)=1.29 s t(norm)=0.22368, mflops=22.3534 (err=1.5e-14) 27. Nielsen: elapsed time t=1.41 s, 256 iters, t-(init.)=1.37 s t(norm)=0.237552, mflops=21.0481 (err=1.1e-14) 28. NR (C): elapsed time t=1.17 s, 256 iters, t-(init.)=1.13 s t(norm)=0.195937, mflops=25.5184 (err=4.5e-16) 29. NR (F): elapsed time t=1.26 s, 256 iters, t-(init.)=1.22 s t(norm)=0.211542, mflops=23.6359 (err=4.5e-16) 30. Ooura (C): elapsed time t=1.53 s, 1024 iters, t-(init.)=1.38 s t(norm)=0.0598214, mflops=83.5821 (err=4.6e-16) 31. 
Ooura (F): elapsed time t=1.65 s, 1024 iters, t-(init.)=1.5 s t(norm)=0.0650232, mflops=76.8956 (err=4.6e-16) 32. QFT: elapsed time t=1.68 s, 512 iters, t-(init.)=1.6 s t(norm)=0.138716, mflops=36.0448 (err=1.2e-15) 33. Ransom: elapsed time t=1.74 s, 512 iters, t-(init.)=1.66 s t(norm)=0.143918, mflops=34.742 (err=2.1e-15) 34. SCIPORT: elapsed time t=1.21 s, 128 iters, t-(init.)=1.19 s t(norm)=0.412681, mflops=12.1159 (err=1.6e-07) 35. Singleton: elapsed time t=1.36 s, 256 iters, t-(init.)=1.32 s t(norm)=0.228882, mflops=21.8453 (err=5.9e-16) 36. Singleton (f2c): elapsed time t=1.95 s, 512 iters, t-(init.)=1.87 s t(norm)=0.162125, mflops=30.8405 (err=5.9e-16) 37. Sorensen: elapsed time t=1.68 s, 512 iters, t-(init.)=1.6 s t(norm)=0.138716, mflops=36.0448 (err=4.5e-16) 38. Sorensen DIT: elapsed time t=1.79 s, 256 iters, t-(init.)=1.75 s t(norm)=0.303442, mflops=16.4776 (err=4.4e-16) 39. Temperton: elapsed time t=1.62 s, 512 iters, t-(init.)=1.54 s t(norm)=0.133514, mflops=37.4491 (err=1.0e-07) 40. Temperton (f2c): elapsed time t=1.95 s, 512 iters, t-(init.)=1.88 s t(norm)=0.162992, mflops=30.6764 (err=4.7e-16) 41. Valkenburg: elapsed time t=1.26 s, 64 iters, t-(init.)=1.25 s t(norm)=0.866977, mflops=5.76717 (err=7.4e-16) Top mflops for N=2048 = 83.5821 Normalized results and averages for N=2048: fft 0: mflops = 29.4243 (norm. = 0.352041), norm. avg. (of 11) = 0.364675 fft 1: mflops = 28.2704 (norm. = 0.338235), norm. avg. (of 11) = 0.361141 fft 2: mflops = 22.3534 (norm. = 0.267442), norm. avg. (of 11) = 0.25839 fft 3: mflops = 16.1999 (norm. = 0.19382), norm. avg. (of 11) = 0.0988631 fft 4: mflops = 24.0299 (norm. = 0.2875), norm. avg. (of 11) = 0.235668 fft 5: mflops = 8.48113 (norm. = 0.101471), norm. avg. (of 11) = 0.0641748 fft 6: mflops = 29.7277 (norm. = 0.35567), norm. avg. (of 11) = 0.223142 fft 7: mflops = 25.5184 (norm. = 0.30531), norm. avg. (of 11) = 0.1873 fft 8: mflops = 18.1357 (norm. = 0.216981), norm. avg. 
(of 11) = 0.174293 fft 9: mflops = 76.3863 (norm. = 0.913907), norm. avg. (of 11) = 0.544203 fft 10: mflops = 75.8838 (norm. = 0.907895), norm. avg. (of 11) = 0.517285 fft 11: mflops = 18.6038 (norm. = 0.222581), norm. avg. (of 10) = 0.174091 fft 12: mflops = 37.9419 (norm. = 0.453947), norm. avg. (of 11) = 0.374052 fft 13: mflops = 42.0961 (norm. = 0.50365), norm. avg. (of 11) = 0.388194 fft 14: mflops = 75.8838 (norm. = 0.907895), norm. avg. (of 11) = 0.792229 fft 15: mflops = 64.7996 (norm. = 0.775281), norm. avg. (of 11) = 0.709125 fft 16: mflops = 49.292 (norm. = 0.589744), norm. avg. (of 11) = 0.713737 fft 17: mflops = 47.6625 (norm. = 0.570248), norm. avg. (of 9) = 0.542806 fft 18: mflops = 31.3433 (norm. = 0.375), norm. avg. (of 11) = 0.234117 fft 19: mflops = 22.1814 (norm. = 0.265385), norm. avg. (of 11) = 0.239564 fft 20: mflops = 22.8856 (norm. = 0.27381), norm. avg. (of 11) = 0.234678 fft 21: mflops = 33.9245 (norm. = 0.405882), norm. avg. (of 11) = 0.485427 fft 22: mflops = 46.8875 (norm. = 0.560976), norm. avg. (of 10) = 0.315255 fft 23: mflops = 61.3529 (norm. = 0.734043), norm. avg. (of 10) = 0.399822 fft 24: mflops = 57.1007 (norm. = 0.683168), norm. avg. (of 10) = 0.417611 fft 25: mflops = 25.7463 (norm. = 0.308036), norm. avg. (of 10) = 0.209193 fft 26: mflops = 22.3534 (norm. = 0.267442), norm. avg. (of 11) = 0.134479 fft 27: mflops = 21.0481 (norm. = 0.251825), norm. avg. (of 11) = 0.146951 fft 28: mflops = 25.5184 (norm. = 0.30531), norm. avg. (of 11) = 0.296544 fft 29: mflops = 23.6359 (norm. = 0.282787), norm. avg. (of 11) = 0.230503 fft 30: mflops = 83.5821 (norm. = 1), norm. avg. (of 11) = 0.79678 fft 31: mflops = 76.8956 (norm. = 0.92), norm. avg. (of 11) = 0.644021 fft 32: mflops = 36.0448 (norm. = 0.43125), norm. avg. (of 8) = 0.421129 fft 33: mflops = 34.742 (norm. = 0.415663), norm. avg. (of 10) = 0.201059 fft 34: mflops = 12.1159 (norm. = 0.144958), norm. avg. (of 10) = 0.107187 fft 35: mflops = 21.8453 (norm. = 0.261364), norm.
avg. (of 11) = 0.169745 fft 36: mflops = 30.8405 (norm. = 0.368984), norm. avg. (of 11) = 0.297254 fft 37: mflops = 36.0448 (norm. = 0.43125), norm. avg. (of 11) = 0.45212 fft 38: mflops = 16.4776 (norm. = 0.197143), norm. avg. (of 11) = 0.163166 fft 39: mflops = 37.4491 (norm. = 0.448052), norm. avg. (of 11) = 0.30547 fft 40: mflops = 30.6764 (norm. = 0.367021), norm. avg. (of 11) = 0.220102 fft 41: mflops = 5.76717 (norm. = 0.069), norm. avg. (of 11) = 0.0490521 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.14 s, 128 iters, t-(init.)=1.1 s t(norm)=0.17484, mflops=28.5975 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.16 s, 128 iters, t-(init.)=1.12 s t(norm)=0.178019, mflops=28.0869 (err=1.1e-15) 2. Arndt Split-Radix: elapsed time t=1.54 s, 128 iters, t-(init.)=1.5 s t(norm)=0.238419, mflops=20.9715 (err=1.0e-15) 3. Arndt 4-step: elapsed time t=1.81 s, 128 iters, t-(init.)=1.78 s t(norm)=0.282923, mflops=17.6726 (err=1.0e-15) 4. Bailey: elapsed time t=1.4 s, 128 iters, t-(init.)=1.36 s t(norm)=0.216166, mflops=23.1304 (err=1.0e-15) 5. Beauregard: elapsed time t=1.88 s, 64 iters, t-(init.)=1.86 s t(norm)=0.591278, mflops=8.45626 (err=1.0e-15) 6. Bergland: elapsed time t=1.12 s, 128 iters, t-(init.)=1.08 s t(norm)=0.171661, mflops=29.1271 (err=1.1e-15) 7. Brenner: elapsed time t=1.24 s, 128 iters, t-(init.)=1.21 s t(norm)=0.192324, mflops=25.9978 (err=1.1e-15) 8. Burrus: elapsed time t=1.81 s, 128 iters, t-(init.)=1.77 s t(norm)=0.281334, mflops=17.7725 (err=1.0e-15) 9. CWP (min N) (N=4290): elapsed time t=1.97 s, 512 iters, t-(init.)=1.81 s t(norm)=0.0719229, mflops=69.5189 10. CWP (best N) (N=4368): elapsed time t=1.8 s, 512 iters, t-(init.)=1.64 s t(norm)=0.0651677, mflops=76.7251 11. Edelblute: elapsed time t=1.78 s, 128 iters, t-(init.)=1.74 s t(norm)=0.276566, mflops=18.0789 (err=1.1e-15) 12. FFTPACK: elapsed time t=1.83 s, 256 iters, t-(init.)=1.75 s t(norm)=0.139078, mflops=35.9512 (err=1.0e-15) 13. 
FFTPACK (f2c): elapsed time t=1.6 s, 256 iters, t-(init.)=1.52 s t(norm)=0.120799, mflops=41.3912 (err=1.0e-15) FFTW_MEASURE plan: (cost = 3.125000e-03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 8 14. FFTW: elapsed time t=1.77 s, 512 iters, t-(init.)=1.61 s t(norm)=0.0639757, mflops=78.1547 (err=1.0e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.01 s, 256 iters, t-(init.)=0.93 s t(norm)=0.0739098, mflops=67.6501 (err=1.1e-15) 16. Frigo-old: elapsed time t=1.54 s, 256 iters, t-(init.)=1.47 s t(norm)=0.116825, mflops=42.799 (err=1.1e-15) 17. Green: elapsed time t=1.35 s, 256 iters, t-(init.)=1.27 s t(norm)=0.100931, mflops=49.539 (err=1.1e-15) 18. GSL: elapsed time t=1.11 s, 128 iters, t-(init.)=1.07 s t(norm)=0.170072, mflops=29.3993 (err=1.0e-15) 19. GSL DIT: elapsed time t=1.48 s, 128 iters, t-(init.)=1.45 s t(norm)=0.230471, mflops=21.6947 (err=1.0e-15) 20. GSL DIF: elapsed time t=1.45 s, 128 iters, t-(init.)=1.41 s t(norm)=0.224113, mflops=22.3101 (err=1.0e-15) 21. Krukar: elapsed time t=1.92 s, 256 iters, t-(init.)=1.84 s t(norm)=0.14623, mflops=34.1927 (err=1.1e-15) 22. Mayer (Buneman): elapsed time t=1.07 s, 128 iters, t-(init.)=1.03 s t(norm)=0.163714, mflops=30.541 (err=1.1e-15) 23. Mayer (simple): elapsed time t=1.87 s, 256 iters, t-(init.)=1.8 s t(norm)=0.143051, mflops=34.9525 24. Mayer (lookup): elapsed time t=1.9 s, 256 iters, t-(init.)=1.82 s t(norm)=0.144641, mflops=34.5684 (err=1.1e-15) 25. Monro: elapsed time t=1.27 s, 128 iters, t-(init.)=1.23 s t(norm)=0.195503, mflops=25.575 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.37 s, 128 iters, t-(init.)=1.33 s t(norm)=0.211398, mflops=23.6521 (err=4.5e-14) 27. Nielsen: elapsed time t=1.52 s, 128 iters, t-(init.)=1.48 s t(norm)=0.23524, mflops=21.2549 (err=2.2e-14) 28. NR (C): elapsed time t=1.31 s, 128 iters, t-(init.)=1.27 s t(norm)=0.201861, mflops=24.7695 (err=1.0e-15) 29. 
NR (F): elapsed time t=1.39 s, 128 iters, t-(init.)=1.36 s t(norm)=0.216166, mflops=23.1304 (err=1.0e-15) 30. Ooura (C): elapsed time t=1.6 s, 512 iters, t-(init.)=1.44 s t(norm)=0.0572205, mflops=87.3813 (err=1.1e-15) 31. Ooura (F): elapsed time t=1.75 s, 512 iters, t-(init.)=1.6 s t(norm)=0.0635783, mflops=78.6432 (err=1.1e-15) 32. QFT: elapsed time t=1.02 s, 128 iters, t-(init.)=0.98 s t(norm)=0.155767, mflops=32.0993 (err=1.9e-15) 33. Ransom: elapsed time t=1.41 s, 256 iters, t-(init.)=1.33 s t(norm)=0.105699, mflops=47.3042 (err=2.6e-15) 34. SCIPORT: elapsed time t=1.34 s, 64 iters, t-(init.)=1.32 s t(norm)=0.419617, mflops=11.9156 (err=1.7e-07) 35. Singleton: elapsed time t=1.56 s, 128 iters, t-(init.)=1.53 s t(norm)=0.243187, mflops=20.5603 (err=1.6e-15) 36. Singleton (f2c): elapsed time t=1.94 s, 256 iters, t-(init.)=1.86 s t(norm)=0.14782, mflops=33.825 (err=1.6e-15) 37. Sorensen: elapsed time t=1.93 s, 256 iters, t-(init.)=1.85 s t(norm)=0.147025, mflops=34.0079 (err=1.1e-15) 38. Sorensen DIT: elapsed time t=1.95 s, 128 iters, t-(init.)=1.91 s t(norm)=0.303586, mflops=16.4698 (err=1.0e-15) 39. Temperton: elapsed time t=1.66 s, 256 iters, t-(init.)=1.59 s t(norm)=0.126362, mflops=39.5689 (err=1.2e-07) 40. Temperton (f2c): elapsed time t=1.03 s, 128 iters, t-(init.)=0.99 s t(norm)=0.157356, mflops=31.775 (err=1.0e-15) 41. Valkenburg: elapsed time t=1.37 s, 32 iters, t-(init.)=1.36 s t(norm)=0.864665, mflops=5.78259 (err=1.1e-15) Top mflops for N=4096 = 87.3813 Normalized results and averages for N=4096: fft 0: mflops = 28.5975 (norm. = 0.327273), norm. avg. (of 12) = 0.361558 fft 1: mflops = 28.0869 (norm. = 0.321429), norm. avg. (of 12) = 0.357831 fft 2: mflops = 20.9715 (norm. = 0.24), norm. avg. (of 12) = 0.256858 fft 3: mflops = 17.6726 (norm. = 0.202247), norm. avg. (of 12) = 0.107478 fft 4: mflops = 23.1304 (norm. = 0.264706), norm. avg. (of 12) = 0.238088 fft 5: mflops = 8.45626 (norm. = 0.0967742), norm. avg. 
(of 12) = 0.0668914 fft 6: mflops = 29.1271 (norm. = 0.333333), norm. avg. (of 12) = 0.232324 fft 7: mflops = 25.9978 (norm. = 0.297521), norm. avg. (of 12) = 0.196485 fft 8: mflops = 17.7725 (norm. = 0.20339), norm. avg. (of 12) = 0.176717 fft 9: mflops = 69.5189 (norm. = 0.79558), norm. avg. (of 12) = 0.565151 fft 10: mflops = 76.7251 (norm. = 0.878049), norm. avg. (of 12) = 0.547349 fft 11: mflops = 18.0789 (norm. = 0.206897), norm. avg. (of 11) = 0.177073 fft 12: mflops = 35.9512 (norm. = 0.411429), norm. avg. (of 12) = 0.377166 fft 13: mflops = 41.3912 (norm. = 0.473684), norm. avg. (of 12) = 0.395319 fft 14: mflops = 78.1547 (norm. = 0.89441), norm. avg. (of 12) = 0.800745 fft 15: mflops = 67.6501 (norm. = 0.774194), norm. avg. (of 12) = 0.714547 fft 16: mflops = 42.799 (norm. = 0.489796), norm. avg. (of 12) = 0.695076 fft 17: mflops = 49.539 (norm. = 0.566929), norm. avg. (of 10) = 0.545218 fft 18: mflops = 29.3993 (norm. = 0.336449), norm. avg. (of 12) = 0.242645 fft 19: mflops = 21.6947 (norm. = 0.248276), norm. avg. (of 12) = 0.24029 fft 20: mflops = 22.3101 (norm. = 0.255319), norm. avg. (of 12) = 0.236398 fft 21: mflops = 34.1927 (norm. = 0.391304), norm. avg. (of 12) = 0.477583 fft 22: mflops = 30.541 (norm. = 0.349515), norm. avg. (of 11) = 0.318369 fft 23: mflops = 34.9525 (norm. = 0.4), norm. avg. (of 11) = 0.399838 fft 24: mflops = 34.5684 (norm. = 0.395604), norm. avg. (of 11) = 0.41561 fft 25: mflops = 25.575 (norm. = 0.292683), norm. avg. (of 11) = 0.216783 fft 26: mflops = 23.6521 (norm. = 0.270677), norm. avg. (of 12) = 0.145829 fft 27: mflops = 21.2549 (norm. = 0.243243), norm. avg. (of 12) = 0.154975 fft 28: mflops = 24.7695 (norm. = 0.283465), norm. avg. (of 12) = 0.295454 fft 29: mflops = 23.1304 (norm. = 0.264706), norm. avg. (of 12) = 0.233354 fft 30: mflops = 87.3813 (norm. = 1), norm. avg. (of 12) = 0.813715 fft 31: mflops = 78.6432 (norm. = 0.9), norm. avg. (of 12) = 0.665353 fft 32: mflops = 32.0993 (norm. = 0.367347), norm. avg. 
(of 9) = 0.415153 fft 33: mflops = 47.3042 (norm. = 0.541353), norm. avg. (of 11) = 0.231995 fft 34: mflops = 11.9156 (norm. = 0.136364), norm. avg. (of 11) = 0.109839 fft 35: mflops = 20.5603 (norm. = 0.235294), norm. avg. (of 12) = 0.175207 fft 36: mflops = 33.825 (norm. = 0.387097), norm. avg. (of 12) = 0.304741 fft 37: mflops = 34.0079 (norm. = 0.389189), norm. avg. (of 12) = 0.446876 fft 38: mflops = 16.4698 (norm. = 0.188482), norm. avg. (of 12) = 0.165275 fft 39: mflops = 39.5689 (norm. = 0.45283), norm. avg. (of 12) = 0.31775 fft 40: mflops = 31.775 (norm. = 0.363636), norm. avg. (of 12) = 0.232064 fft 41: mflops = 5.78259 (norm. = 0.0661765), norm. avg. (of 12) = 0.0504791 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.27 s, 64 iters, t-(init.)=1.24 s t(norm)=0.181932, mflops=27.4828 (err=1.3e-15) 1. Arndt DIT: elapsed time t=1.3 s, 64 iters, t-(init.)=1.27 s t(norm)=0.186333, mflops=26.8336 (err=1.3e-15) 2. Arndt Split-Radix: elapsed time t=1.65 s, 64 iters, t-(init.)=1.61 s t(norm)=0.236218, mflops=21.1669 (err=1.3e-15) 3. Arndt 4-step: elapsed time t=1.04 s, 32 iters, t-(init.)=1.02 s t(norm)=0.299307, mflops=16.7053 (err=1.4e-15) 4. Bailey: elapsed time t=1.51 s, 64 iters, t-(init.)=1.47 s t(norm)=0.215677, mflops=23.1828 (err=1.3e-15) 5. Beauregard: elapsed time t=1.03 s, 16 iters, t-(init.)=1.02 s t(norm)=0.598614, mflops=8.35263 (err=1.3e-15) 6. Bergland: elapsed time t=1.22 s, 64 iters, t-(init.)=1.18 s t(norm)=0.173129, mflops=28.8803 (err=1.4e-15) 7. Brenner: elapsed time t=1.35 s, 64 iters, t-(init.)=1.31 s t(norm)=0.192202, mflops=26.0143 (err=1.4e-15) 8. Burrus: elapsed time t=1.93 s, 64 iters, t-(init.)=1.89 s t(norm)=0.277299, mflops=18.0311 (err=1.3e-15) 9. CWP (min N) (N=8580): elapsed time t=1.01 s, 128 iters, t-(init.)=0.93 s t(norm)=0.0682244, mflops=73.2876 10. CWP (best N) (N=9240): elapsed time t=1.02 s, 128 iters, t-(init.)=0.94 s t(norm)=0.068958, mflops=72.5079 11. 
Edelblute: elapsed time t=1.91 s, 64 iters, t-(init.)=1.88 s t(norm)=0.275832, mflops=18.127 (err=1.3e-15) 12. FFTPACK: elapsed time t=1.1 s, 64 iters, t-(init.)=1.06 s t(norm)=0.155522, mflops=32.1497 (err=1.3e-15) 13. FFTPACK (f2c): elapsed time t=1.01 s, 64 iters, t-(init.)=0.98 s t(norm)=0.143785, mflops=34.7742 (err=1.3e-15) FFTW_MEASURE plan: (cost = 7.812500e-03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 8 14. FFTW: elapsed time t=1.05 s, 128 iters, t-(init.)=0.97 s t(norm)=0.0711588, mflops=70.2654 (err=1.3e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.12 s, 128 iters, t-(init.)=1.04 s t(norm)=0.0762939, mflops=65.536 (err=1.4e-15) 16. Frigo-old: elapsed time t=1.57 s, 128 iters, t-(init.)=1.49 s t(norm)=0.109306, mflops=45.7432 (err=1.4e-15) 17. Green: elapsed time t=1.5 s, 128 iters, t-(init.)=1.42 s t(norm)=0.104171, mflops=47.9982 (err=1.4e-15) 18. GSL: elapsed time t=1.27 s, 64 iters, t-(init.)=1.23 s t(norm)=0.180465, mflops=27.7063 (err=1.3e-15) 19. GSL DIT: elapsed time t=1.6 s, 64 iters, t-(init.)=1.56 s t(norm)=0.228882, mflops=21.8453 (err=1.3e-15) 20. GSL DIF: elapsed time t=1.58 s, 64 iters, t-(init.)=1.54 s t(norm)=0.225947, mflops=22.129 (err=1.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.16 s, 64 iters, t-(init.)=1.12 s t(norm)=0.164325, mflops=30.4274 (err=1.3e-15) 23. Mayer (simple): elapsed time t=1.03 s, 64 iters, t-(init.)=1 s t(norm)=0.146719, mflops=34.0787 24. Mayer (lookup): elapsed time t=1.04 s, 64 iters, t-(init.)=1 s t(norm)=0.146719, mflops=34.0787 (err=1.4e-15) 25. Monro: elapsed time t=1.41 s, 64 iters, t-(init.)=1.37 s t(norm)=0.201005, mflops=24.875 (err=1.2e-07) 26. NAPACK (f2c): elapsed time t=1.59 s, 64 iters, t-(init.)=1.55 s t(norm)=0.227415, mflops=21.9863 (err=4.1e-14) 27. 
Nielsen: elapsed time t=1.65 s, 64 iters, t-(init.)=1.61 s t(norm)=0.236218, mflops=21.1669 (err=1.1e-14) 28. NR (C): elapsed time t=1.43 s, 64 iters, t-(init.)=1.39 s t(norm)=0.20394, mflops=24.5171 (err=1.3e-15) 29. NR (F): elapsed time t=1.52 s, 64 iters, t-(init.)=1.48 s t(norm)=0.217144, mflops=23.0262 (err=1.3e-15) 30. Ooura (C): elapsed time t=1.77 s, 256 iters, t-(init.)=1.62 s t(norm)=0.0594212, mflops=84.145 (err=1.4e-15) 31. Ooura (F): elapsed time t=1.92 s, 256 iters, t-(init.)=1.76 s t(norm)=0.0645564, mflops=77.4516 (err=1.4e-15) 32. QFT: elapsed time t=1.2 s, 64 iters, t-(init.)=1.16 s t(norm)=0.170194, mflops=29.3782 (err=2.8e-15) 33. Ransom: elapsed time t=1.71 s, 128 iters, t-(init.)=1.64 s t(norm)=0.12031, mflops=41.5594 (err=3.2e-15) 34. SCIPORT: elapsed time t=1.49 s, 32 iters, t-(init.)=1.47 s t(norm)=0.431354, mflops=11.5914 (err=1.9e-07) 35. Singleton: elapsed time t=1.67 s, 64 iters, t-(init.)=1.63 s t(norm)=0.239152, mflops=20.9072 (err=2.0e-15) 36. Singleton (f2c): elapsed time t=1.09 s, 64 iters, t-(init.)=1.05 s t(norm)=0.154055, mflops=32.4559 (err=2.0e-15) 37. Sorensen: elapsed time t=1.12 s, 64 iters, t-(init.)=1.08 s t(norm)=0.158457, mflops=31.5544 (err=1.4e-15) 38. Sorensen DIT: elapsed time t=1.05 s, 32 iters, t-(init.)=1.03 s t(norm)=0.302241, mflops=16.5431 (err=1.3e-15) 39. Temperton: elapsed time t=1.91 s, 128 iters, t-(init.)=1.83 s t(norm)=0.134248, mflops=37.2445 (err=1.4e-07) 40. Temperton (f2c): elapsed time t=1.18 s, 64 iters, t-(init.)=1.15 s t(norm)=0.168727, mflops=29.6337 (err=1.3e-15) 41. Valkenburg: elapsed time t=1.51 s, 16 iters, t-(init.)=1.5 s t(norm)=0.880315, mflops=5.67979 (err=1.4e-15) Top mflops for N=8192 = 84.145 Normalized results and averages for N=8192: fft 0: mflops = 27.4828 (norm. = 0.326613), norm. avg. (of 13) = 0.35887 fft 1: mflops = 26.8336 (norm. = 0.318898), norm. avg. (of 13) = 0.354836 fft 2: mflops = 21.1669 (norm. = 0.251553), norm. avg. (of 13) = 0.25645 fft 3: mflops = 16.7053 (norm. 
= 0.198529), norm. avg. (of 13) = 0.114482 fft 4: mflops = 23.1828 (norm. = 0.27551), norm. avg. (of 13) = 0.240967 fft 5: mflops = 8.35263 (norm. = 0.0992647), norm. avg. (of 13) = 0.0693817 fft 6: mflops = 28.8803 (norm. = 0.34322), norm. avg. (of 13) = 0.240855 fft 7: mflops = 26.0143 (norm. = 0.30916), norm. avg. (of 13) = 0.205153 fft 8: mflops = 18.0311 (norm. = 0.214286), norm. avg. (of 13) = 0.179607 fft 9: mflops = 73.2876 (norm. = 0.870968), norm. avg. (of 13) = 0.588676 fft 10: mflops = 72.5079 (norm. = 0.861702), norm. avg. (of 13) = 0.57153 fft 11: mflops = 18.127 (norm. = 0.215426), norm. avg. (of 12) = 0.180269 fft 12: mflops = 32.1497 (norm. = 0.382075), norm. avg. (of 13) = 0.377544 fft 13: mflops = 34.7742 (norm. = 0.413265), norm. avg. (of 13) = 0.396699 fft 14: mflops = 70.2654 (norm. = 0.835052), norm. avg. (of 13) = 0.803384 fft 15: mflops = 65.536 (norm. = 0.778846), norm. avg. (of 13) = 0.719493 fft 16: mflops = 45.7432 (norm. = 0.543624), norm. avg. (of 13) = 0.683426 fft 17: mflops = 47.9982 (norm. = 0.570423), norm. avg. (of 11) = 0.54751 fft 18: mflops = 27.7063 (norm. = 0.329268), norm. avg. (of 13) = 0.249308 fft 19: mflops = 21.8453 (norm. = 0.259615), norm. avg. (of 13) = 0.241776 fft 20: mflops = 22.129 (norm. = 0.262987), norm. avg. (of 13) = 0.238443 fft 21: mflops = -1 (norm. = -0.0118842), norm. avg. (of 12) = 0.477583 fft 22: mflops = 30.4274 (norm. = 0.361607), norm. avg. (of 12) = 0.321973 fft 23: mflops = 34.0787 (norm. = 0.405), norm. avg. (of 12) = 0.400268 fft 24: mflops = 34.0787 (norm. = 0.405), norm. avg. (of 12) = 0.414726 fft 25: mflops = 24.875 (norm. = 0.29562), norm. avg. (of 12) = 0.223352 fft 26: mflops = 21.9863 (norm. = 0.26129), norm. avg. (of 13) = 0.15471 fft 27: mflops = 21.1669 (norm. = 0.251553), norm. avg. (of 13) = 0.162404 fft 28: mflops = 24.5171 (norm. = 0.291367), norm. avg. (of 13) = 0.295139 fft 29: mflops = 23.0262 (norm. = 0.273649), norm. avg. (of 13) = 0.236453 fft 30: mflops = 84.145 (norm. 
= 1), norm. avg. (of 13) = 0.828045 fft 31: mflops = 77.4516 (norm. = 0.920455), norm. avg. (of 13) = 0.684976 fft 32: mflops = 29.3782 (norm. = 0.349138), norm. avg. (of 10) = 0.408552 fft 33: mflops = 41.5594 (norm. = 0.493902), norm. avg. (of 12) = 0.253821 fft 34: mflops = 11.5914 (norm. = 0.137755), norm. avg. (of 12) = 0.112166 fft 35: mflops = 20.9072 (norm. = 0.248466), norm. avg. (of 13) = 0.180842 fft 36: mflops = 32.4559 (norm. = 0.385714), norm. avg. (of 13) = 0.31097 fft 37: mflops = 31.5544 (norm. = 0.375), norm. avg. (of 13) = 0.441347 fft 38: mflops = 16.5431 (norm. = 0.196602), norm. avg. (of 13) = 0.167685 fft 39: mflops = 37.2445 (norm. = 0.442623), norm. avg. (of 13) = 0.327356 fft 40: mflops = 29.6337 (norm. = 0.352174), norm. avg. (of 13) = 0.241303 fft 41: mflops = 5.67979 (norm. = 0.0675), norm. avg. (of 13) = 0.0517884 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.43 s, 32 iters, t-(init.)=1.39 s t(norm)=0.189372, mflops=26.403 (err=1.7e-15) 1. Arndt DIT: elapsed time t=1.45 s, 32 iters, t-(init.)=1.41 s t(norm)=0.192097, mflops=26.0285 (err=1.8e-15) 2. Arndt Split-Radix: elapsed time t=1.85 s, 32 iters, t-(init.)=1.81 s t(norm)=0.246593, mflops=20.2763 (err=1.8e-15) 3. Arndt 4-step: elapsed time t=1.98 s, 32 iters, t-(init.)=1.94 s t(norm)=0.264304, mflops=18.9176 (err=1.8e-15) 4. Bailey: elapsed time t=1.03 s, 16 iters, t-(init.)=1.01 s t(norm)=0.275203, mflops=18.1684 (err=1.7e-15) 5. Beauregard: elapsed time t=1.13 s, 8 iters, t-(init.)=1.12 s t(norm)=0.610352, mflops=8.192 (err=1.8e-15) 6. Bergland: elapsed time t=1.35 s, 32 iters, t-(init.)=1.31 s t(norm)=0.178473, mflops=28.0154 (err=1.8e-15) 7. Brenner: elapsed time t=1.49 s, 32 iters, t-(init.)=1.45 s t(norm)=0.197547, mflops=25.3105 (err=1.8e-15) 8. Burrus: elapsed time t=1.06 s, 16 iters, t-(init.)=1.04 s t(norm)=0.283378, mflops=17.6443 (err=1.8e-15) 9. 
CWP (min N) (N=17160): elapsed time t=1.05 s, 64 iters, t-(init.)=0.96 s t(norm)=0.0653948, mflops=76.4587 10. CWP (best N) (N=17160): elapsed time t=1.06 s, 64 iters, t-(init.)=0.98 s t(norm)=0.0667572, mflops=74.8983 11. Edelblute: elapsed time t=1.05 s, 16 iters, t-(init.)=1.03 s t(norm)=0.280653, mflops=17.8156 (err=1.8e-15) 12. FFTPACK: elapsed time t=1.38 s, 32 iters, t-(init.)=1.34 s t(norm)=0.182561, mflops=27.3882 (err=1.8e-15) 13. FFTPACK (f2c): elapsed time t=1.29 s, 32 iters, t-(init.)=1.25 s t(norm)=0.170299, mflops=29.3601 (err=1.8e-15) FFTW_MEASURE plan: (cost = 1.937500e-02) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 2 FFTW_NOTW 16 14. FFTW: elapsed time t=1.25 s, 64 iters, t-(init.)=1.16 s t(norm)=0.0790187, mflops=63.2761 (err=1.8e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.49 s, 64 iters, t-(init.)=1.41 s t(norm)=0.0960486, mflops=52.057 (err=1.8e-15) 16. Frigo-old: elapsed time t=1.12 s, 32 iters, t-(init.)=1.08 s t(norm)=0.147138, mflops=33.9816 (err=1.9e-15) 17. Green: elapsed time t=1.85 s, 64 iters, t-(init.)=1.77 s t(norm)=0.120572, mflops=41.4691 (err=1.8e-15) 18. GSL: elapsed time t=1.6 s, 32 iters, t-(init.)=1.56 s t(norm)=0.212533, mflops=23.5257 (err=1.8e-15) 19. GSL DIT: elapsed time t=1.86 s, 32 iters, t-(init.)=1.82 s t(norm)=0.247955, mflops=20.1649 (err=1.8e-15) 20. GSL DIF: elapsed time t=1.88 s, 32 iters, t-(init.)=1.84 s t(norm)=0.25068, mflops=19.9457 (err=1.8e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.29 s, 32 iters, t-(init.)=1.24 s t(norm)=0.168937, mflops=29.5969 (err=1.8e-15) 23. Mayer (simple): elapsed time t=1.15 s, 32 iters, t-(init.)=1.11 s t(norm)=0.151225, mflops=33.0632 24. Mayer (lookup): elapsed time t=1.18 s, 32 iters, t-(init.)=1.13 s t(norm)=0.15395, mflops=32.478 (err=1.9e-15) 25. 
Monro: elapsed time t=1.57 s, 32 iters, t-(init.)=1.53 s t(norm)=0.208446, mflops=23.987 (err=1.4e-07) 26. NAPACK (f2c): elapsed time t=1.12 s, 16 iters, t-(init.)=1.1 s t(norm)=0.299726, mflops=16.6819 (err=2.3e-13) 27. Nielsen: elapsed time t=1.87 s, 32 iters, t-(init.)=1.83 s t(norm)=0.249318, mflops=20.0547 (err=1.3e-13) 28. NR (C): elapsed time t=1.7 s, 32 iters, t-(init.)=1.66 s t(norm)=0.226157, mflops=22.1085 (err=1.8e-15) 29. NR (F): elapsed time t=1.78 s, 32 iters, t-(init.)=1.74 s t(norm)=0.237056, mflops=21.092 (err=1.8e-15) 30. Ooura (C): elapsed time t=1.96 s, 128 iters, t-(init.)=1.79 s t(norm)=0.060967, mflops=82.0115 (err=1.9e-15) 31. Ooura (F): elapsed time t=1.07 s, 64 iters, t-(init.)=0.99 s t(norm)=0.0674384, mflops=74.1417 (err=1.9e-15) 32. QFT: elapsed time t=1.56 s, 32 iters, t-(init.)=1.52 s t(norm)=0.207084, mflops=24.1448 (err=3.8e-15) 33. Ransom: elapsed time t=1.59 s, 64 iters, t-(init.)=1.51 s t(norm)=0.102861, mflops=48.6095 (err=4.0e-15) 34. SCIPORT: elapsed time t=1 s, 8 iters, t-(init.)=0.99 s t(norm)=0.539507, mflops=9.26772 (err=2.1e-07) 35. Singleton: elapsed time t=1.88 s, 32 iters, t-(init.)=1.84 s t(norm)=0.25068, mflops=19.9457 (err=2.5e-15) 36. Singleton (f2c): elapsed time t=1.19 s, 32 iters, t-(init.)=1.14 s t(norm)=0.155313, mflops=32.1931 (err=2.5e-15) 37. Sorensen: elapsed time t=1.3 s, 32 iters, t-(init.)=1.25 s t(norm)=0.170299, mflops=29.3601 (err=1.9e-15) 38. Sorensen DIT: elapsed time t=1.15 s, 16 iters, t-(init.)=1.13 s t(norm)=0.307901, mflops=16.239 (err=1.8e-15) 39. Temperton: elapsed time t=1.04 s, 32 iters, t-(init.)=1 s t(norm)=0.136239, mflops=36.7002 (err=1.5e-07) 40. Temperton (f2c): elapsed time t=1.31 s, 32 iters, t-(init.)=1.27 s t(norm)=0.173024, mflops=28.8978 (err=1.8e-15) 41. Valkenburg: elapsed time t=1.7 s, 8 iters, t-(init.)=1.69 s t(norm)=0.920977, mflops=5.42902 (err=1.7e-15) Top mflops for N=16384 = 82.0115 Normalized results and averages for N=16384: fft 0: mflops = 26.403 (norm. 
= 0.321942), norm. avg. (of 14) = 0.356232 fft 1: mflops = 26.0285 (norm. = 0.317376), norm. avg. (of 14) = 0.352161 fft 2: mflops = 20.2763 (norm. = 0.247238), norm. avg. (of 14) = 0.255792 fft 3: mflops = 18.9176 (norm. = 0.23067), norm. avg. (of 14) = 0.122781 fft 4: mflops = 18.1684 (norm. = 0.221535), norm. avg. (of 14) = 0.239579 fft 5: mflops = 8.192 (norm. = 0.0998884), norm. avg. (of 14) = 0.0715607 fft 6: mflops = 28.0154 (norm. = 0.341603), norm. avg. (of 14) = 0.248051 fft 7: mflops = 25.3105 (norm. = 0.308621), norm. avg. (of 14) = 0.212543 fft 8: mflops = 17.6443 (norm. = 0.215144), norm. avg. (of 14) = 0.182146 fft 9: mflops = 76.4587 (norm. = 0.932292), norm. avg. (of 14) = 0.61322 fft 10: mflops = 74.8983 (norm. = 0.913265), norm. avg. (of 14) = 0.595939 fft 11: mflops = 17.8156 (norm. = 0.217233), norm. avg. (of 13) = 0.183112 fft 12: mflops = 27.3882 (norm. = 0.333955), norm. avg. (of 14) = 0.37443 fft 13: mflops = 29.3601 (norm. = 0.358), norm. avg. (of 14) = 0.393935 fft 14: mflops = 63.2761 (norm. = 0.771552), norm. avg. (of 14) = 0.80111 fft 15: mflops = 52.057 (norm. = 0.634752), norm. avg. (of 14) = 0.71344 fft 16: mflops = 33.9816 (norm. = 0.414352), norm. avg. (of 14) = 0.664206 fft 17: mflops = 41.4691 (norm. = 0.50565), norm. avg. (of 12) = 0.544021 fft 18: mflops = 23.5257 (norm. = 0.286859), norm. avg. (of 14) = 0.25199 fft 19: mflops = 20.1649 (norm. = 0.245879), norm. avg. (of 14) = 0.242069 fft 20: mflops = 19.9457 (norm. = 0.243207), norm. avg. (of 14) = 0.238783 fft 21: mflops = -1 (norm. = -0.0121934), norm. avg. (of 12) = 0.477583 fft 22: mflops = 29.5969 (norm. = 0.360887), norm. avg. (of 13) = 0.324966 fft 23: mflops = 33.0632 (norm. = 0.403153), norm. avg. (of 13) = 0.40049 fft 24: mflops = 32.478 (norm. = 0.396018), norm. avg. (of 13) = 0.413287 fft 25: mflops = 23.987 (norm. = 0.292484), norm. avg. (of 13) = 0.22867 fft 26: mflops = 16.6819 (norm. = 0.203409), norm. avg. (of 14) = 0.158189 fft 27: mflops = 20.0547 (norm. 
= 0.244536), norm. avg. (of 14) = 0.168271 fft 28: mflops = 22.1085 (norm. = 0.269578), norm. avg. (of 14) = 0.293314 fft 29: mflops = 21.092 (norm. = 0.257184), norm. avg. (of 14) = 0.237934 fft 30: mflops = 82.0115 (norm. = 1), norm. avg. (of 14) = 0.840327 fft 31: mflops = 74.1417 (norm. = 0.90404), norm. avg. (of 14) = 0.700623 fft 32: mflops = 24.1448 (norm. = 0.294408), norm. avg. (of 11) = 0.398175 fft 33: mflops = 48.6095 (norm. = 0.592715), norm. avg. (of 13) = 0.27989 fft 34: mflops = 9.26772 (norm. = 0.113005), norm. avg. (of 13) = 0.11223 fft 35: mflops = 19.9457 (norm. = 0.243207), norm. avg. (of 14) = 0.185297 fft 36: mflops = 32.1931 (norm. = 0.392544), norm. avg. (of 14) = 0.316796 fft 37: mflops = 29.3601 (norm. = 0.358), norm. avg. (of 14) = 0.435394 fft 38: mflops = 16.239 (norm. = 0.198009), norm. avg. (of 14) = 0.169851 fft 39: mflops = 36.7002 (norm. = 0.4475), norm. avg. (of 14) = 0.335938 fft 40: mflops = 28.8978 (norm. = 0.352362), norm. avg. (of 14) = 0.249236 fft 41: mflops = 5.42902 (norm. = 0.0661982), norm. avg. (of 14) = 0.0528177 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.72 s, 16 iters, t-(init.)=1.67 s t(norm)=0.212351, mflops=23.5459 (err=2.1e-15) 1. Arndt DIT: elapsed time t=1.72 s, 16 iters, t-(init.)=1.67 s t(norm)=0.212351, mflops=23.5459 (err=2.1e-15) 2. Arndt Split-Radix: elapsed time t=1.08 s, 8 iters, t-(init.)=1.06 s t(norm)=0.269572, mflops=18.5479 (err=2.1e-15) 3. Arndt 4-step: elapsed time t=1.18 s, 8 iters, t-(init.)=1.16 s t(norm)=0.295003, mflops=16.949 (err=2.1e-15) 4. Bailey: elapsed time t=1.35 s, 8 iters, t-(init.)=1.33 s t(norm)=0.338236, mflops=14.7826 (err=2.1e-15) 5. Beauregard: elapsed time t=1.25 s, 4 iters, t-(init.)=1.24 s t(norm)=0.630697, mflops=7.92774 (err=2.2e-15) 6. Bergland: elapsed time t=1.49 s, 16 iters, t-(init.)=1.44 s t(norm)=0.183105, mflops=27.3067 (err=2.2e-15) 7. 
Brenner: elapsed time t=1.74 s, 16 iters, t-(init.)=1.69 s t(norm)=0.214895, mflops=23.2672 (err=2.2e-15) 8. Burrus: elapsed time t=1.22 s, 8 iters, t-(init.)=1.2 s t(norm)=0.305176, mflops=16.384 (err=2.1e-15) 9. CWP (min N) (N=34320): elapsed time t=1.18 s, 32 iters, t-(init.)=1.08 s t(norm)=0.0686646, mflops=72.8178 10. CWP (best N) (N=34320): elapsed time t=1.18 s, 32 iters, t-(init.)=1.08 s t(norm)=0.0686646, mflops=72.8178 11. Edelblute: elapsed time t=1.22 s, 8 iters, t-(init.)=1.2 s t(norm)=0.305176, mflops=16.384 (err=2.1e-15) 12. FFTPACK: elapsed time t=1.64 s, 16 iters, t-(init.)=1.59 s t(norm)=0.202179, mflops=24.7306 (err=2.1e-15) 13. FFTPACK (f2c): elapsed time t=1.55 s, 16 iters, t-(init.)=1.5 s t(norm)=0.190735, mflops=26.2144 (err=2.1e-15) FFTW_MEASURE plan: (cost = 5.000000e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 8 14. FFTW: elapsed time t=1.72 s, 32 iters, t-(init.)=1.63 s t(norm)=0.103633, mflops=48.2474 (err=2.1e-15) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.91 s, 32 iters, t-(init.)=1.81 s t(norm)=0.115077, mflops=43.4493 (err=2.1e-15) 16. Frigo-old: elapsed time t=1.49 s, 16 iters, t-(init.)=1.44 s t(norm)=0.183105, mflops=27.3067 (err=2.2e-15) 17. Green: elapsed time t=1.04 s, 16 iters, t-(init.)=0.99 s t(norm)=0.125885, mflops=39.7188 (err=2.2e-15) 18. GSL: elapsed time t=1.9 s, 16 iters, t-(init.)=1.85 s t(norm)=0.23524, mflops=21.2549 (err=2.2e-15) 19. GSL DIT: elapsed time t=1.1 s, 8 iters, t-(init.)=1.07 s t(norm)=0.272115, mflops=18.3746 (err=2.2e-15) 20. GSL DIF: elapsed time t=1.13 s, 8 iters, t-(init.)=1.11 s t(norm)=0.282288, mflops=17.7124 (err=2.2e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.42 s, 16 iters, t-(init.)=1.37 s t(norm)=0.174205, mflops=28.7019 (err=2.1e-15) 23. 
Mayer (simple): elapsed time t=1.26 s, 16 iters, t-(init.)=1.21 s t(norm)=0.153859, mflops=32.4972 24. Mayer (lookup): elapsed time t=1.36 s, 16 iters, t-(init.)=1.31 s t(norm)=0.166575, mflops=30.0165 (err=2.1e-15) 25. Monro: elapsed time t=1.93 s, 16 iters, t-(init.)=1.88 s t(norm)=0.239054, mflops=20.9157 (err=1.5e-07) 26. NAPACK (f2c): elapsed time t=1.7 s, 8 iters, t-(init.)=1.68 s t(norm)=0.427246, mflops=11.7029 (err=5.7e-13) 27. Nielsen: elapsed time t=1.26 s, 8 iters, t-(init.)=1.23 s t(norm)=0.312805, mflops=15.9844 (err=2.3e-13) 28. NR (C): elapsed time t=1.03 s, 8 iters, t-(init.)=1.01 s t(norm)=0.256856, mflops=19.4661 (err=2.2e-15) 29. NR (F): elapsed time t=1.06 s, 8 iters, t-(init.)=1.03 s t(norm)=0.261943, mflops=19.0882 (err=2.2e-15) 30. Ooura (C): elapsed time t=1.29 s, 32 iters, t-(init.)=1.19 s t(norm)=0.0756582, mflops=66.0867 (err=2.2e-15) 31. Ooura (F): elapsed time t=1.39 s, 32 iters, t-(init.)=1.29 s t(norm)=0.082016, mflops=60.9637 (err=2.2e-15) 32. QFT: elapsed time t=1.96 s, 16 iters, t-(init.)=1.91 s t(norm)=0.242869, mflops=20.5872 (err=4.9e-15) 33. Ransom: elapsed time t=1.07 s, 16 iters, t-(init.)=1.02 s t(norm)=0.1297, mflops=38.5506 (err=3.6e-15) 34. SCIPORT: elapsed time t=1.32 s, 4 iters, t-(init.)=1.31 s t(norm)=0.6663, mflops=7.50412 (err=2.3e-07) 35. Singleton: elapsed time t=1.07 s, 8 iters, t-(init.)=1.04 s t(norm)=0.264486, mflops=18.9046 (err=3.2e-15) 36. Singleton (f2c): elapsed time t=1.56 s, 16 iters, t-(init.)=1.52 s t(norm)=0.193278, mflops=25.8695 (err=3.2e-15) 37. Sorensen: elapsed time t=1.66 s, 16 iters, t-(init.)=1.61 s t(norm)=0.204722, mflops=24.4234 (err=2.1e-15) 38. Sorensen DIT: elapsed time t=1.33 s, 8 iters, t-(init.)=1.31 s t(norm)=0.33315, mflops=15.0082 (err=2.1e-15) 39. Temperton: elapsed time t=1.34 s, 16 iters, t-(init.)=1.29 s t(norm)=0.164032, mflops=30.4819 (err=1.5e-07) 40. Temperton (f2c): elapsed time t=1.65 s, 16 iters, t-(init.)=1.61 s t(norm)=0.204722, mflops=24.4234 (err=2.2e-15) 41. 
Valkenburg: elapsed time t=1 s, 2 iters, t-(init.)=0.99 s t(norm)=1.00708, mflops=4.96485 (err=2.3e-15) Top mflops for N=32768 = 72.8178 Normalized results and averages for N=32768: fft 0: mflops = 23.5459 (norm. = 0.323353), norm. avg. (of 15) = 0.35404 fft 1: mflops = 23.5459 (norm. = 0.323353), norm. avg. (of 15) = 0.35024 fft 2: mflops = 18.5479 (norm. = 0.254717), norm. avg. (of 15) = 0.25572 fft 3: mflops = 16.949 (norm. = 0.232759), norm. avg. (of 15) = 0.130113 fft 4: mflops = 14.7826 (norm. = 0.203008), norm. avg. (of 15) = 0.237141 fft 5: mflops = 7.92774 (norm. = 0.108871), norm. avg. (of 15) = 0.0740481 fft 6: mflops = 27.3067 (norm. = 0.375), norm. avg. (of 15) = 0.256514 fft 7: mflops = 23.2672 (norm. = 0.319527), norm. avg. (of 15) = 0.219675 fft 8: mflops = 16.384 (norm. = 0.225), norm. avg. (of 15) = 0.185003 fft 9: mflops = 72.8178 (norm. = 1), norm. avg. (of 15) = 0.639005 fft 10: mflops = 72.8178 (norm. = 1), norm. avg. (of 15) = 0.622877 fft 11: mflops = 16.384 (norm. = 0.225), norm. avg. (of 14) = 0.186104 fft 12: mflops = 24.7306 (norm. = 0.339623), norm. avg. (of 15) = 0.37211 fft 13: mflops = 26.2144 (norm. = 0.36), norm. avg. (of 15) = 0.391673 fft 14: mflops = 48.2474 (norm. = 0.662577), norm. avg. (of 15) = 0.791874 fft 15: mflops = 43.4493 (norm. = 0.596685), norm. avg. (of 15) = 0.705657 fft 16: mflops = 27.3067 (norm. = 0.375), norm. avg. (of 15) = 0.644926 fft 17: mflops = 39.7188 (norm. = 0.545455), norm. avg. (of 13) = 0.544132 fft 18: mflops = 21.2549 (norm. = 0.291892), norm. avg. (of 15) = 0.25465 fft 19: mflops = 18.3746 (norm. = 0.252336), norm. avg. (of 15) = 0.242754 fft 20: mflops = 17.7124 (norm. = 0.243243), norm. avg. (of 15) = 0.239081 fft 21: mflops = -1 (norm. = -0.0137329), norm. avg. (of 12) = 0.477583 fft 22: mflops = 28.7019 (norm. = 0.394161), norm. avg. (of 14) = 0.329908 fft 23: mflops = 32.4972 (norm. = 0.446281), norm. avg. (of 14) = 0.403761 fft 24: mflops = 30.0165 (norm. = 0.412214), norm. avg. 
(of 14) = 0.41321 fft 25: mflops = 20.9157 (norm. = 0.287234), norm. avg. (of 14) = 0.232853 fft 26: mflops = 11.7029 (norm. = 0.160714), norm. avg. (of 15) = 0.158357 fft 27: mflops = 15.9844 (norm. = 0.219512), norm. avg. (of 15) = 0.171687 fft 28: mflops = 19.4661 (norm. = 0.267327), norm. avg. (of 15) = 0.291581 fft 29: mflops = 19.0882 (norm. = 0.262136), norm. avg. (of 15) = 0.239547 fft 30: mflops = 66.0867 (norm. = 0.907563), norm. avg. (of 15) = 0.84481 fft 31: mflops = 60.9637 (norm. = 0.837209), norm. avg. (of 15) = 0.709729 fft 32: mflops = 20.5872 (norm. = 0.282723), norm. avg. (of 12) = 0.388554 fft 33: mflops = 38.5506 (norm. = 0.529412), norm. avg. (of 14) = 0.297713 fft 34: mflops = 7.50412 (norm. = 0.103053), norm. avg. (of 14) = 0.111575 fft 35: mflops = 18.9046 (norm. = 0.259615), norm. avg. (of 15) = 0.190251 fft 36: mflops = 25.8695 (norm. = 0.355263), norm. avg. (of 15) = 0.319361 fft 37: mflops = 24.4234 (norm. = 0.335404), norm. avg. (of 15) = 0.428728 fft 38: mflops = 15.0082 (norm. = 0.206107), norm. avg. (of 15) = 0.172268 fft 39: mflops = 30.4819 (norm. = 0.418605), norm. avg. (of 15) = 0.341449 fft 40: mflops = 24.4234 (norm. = 0.335404), norm. avg. (of 15) = 0.25498 fft 41: mflops = 4.96485 (norm. = 0.0681818), norm. avg. (of 15) = 0.053842 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.31 s, 4 iters, t-(init.)=1.27 s t(norm)=0.302792, mflops=16.513 (err=4.0e-15) 1. Arndt DIT: elapsed time t=1.31 s, 4 iters, t-(init.)=1.28 s t(norm)=0.305176, mflops=16.384 (err=4.1e-15) 2. Arndt Split-Radix: elapsed time t=1.68 s, 4 iters, t-(init.)=1.65 s t(norm)=0.393391, mflops=12.71 (err=4.1e-15) 3. Arndt 4-step: elapsed time t=1.22 s, 4 iters, t-(init.)=1.19 s t(norm)=0.283718, mflops=17.6231 (err=4.2e-15) 4. Bailey: elapsed time t=1.62 s, 4 iters, t-(init.)=1.58 s t(norm)=0.376701, mflops=13.2731 (err=4.0e-15) 5. 
Beauregard: elapsed time t=1.49 s, 2 iters, t-(init.)=1.47 s t(norm)=0.700951, mflops=7.13317 (err=4.2e-15) 6. Bergland: elapsed time t=1.03 s, 4 iters, t-(init.)=1 s t(norm)=0.238419, mflops=20.9715 (err=4.3e-15) 7. Brenner: elapsed time t=1.2 s, 4 iters, t-(init.)=1.16 s t(norm)=0.276566, mflops=18.0789 (err=4.3e-15) 8. Burrus: elapsed time t=1.81 s, 4 iters, t-(init.)=1.78 s t(norm)=0.424385, mflops=11.7818 (err=4.1e-15) 9. CWP (min N) (N=72072): elapsed time t=1.7 s, 16 iters, t-(init.)=1.54 s t(norm)=0.0917912, mflops=54.4715 10. CWP (best N) (N=72072): elapsed time t=1.7 s, 16 iters, t-(init.)=1.54 s t(norm)=0.0917912, mflops=54.4715 11. Edelblute: elapsed time t=1.81 s, 4 iters, t-(init.)=1.78 s t(norm)=0.424385, mflops=11.7818 (err=4.1e-15) 12. FFTPACK: elapsed time t=1.83 s, 8 iters, t-(init.)=1.76 s t(norm)=0.209808, mflops=23.8313 (err=4.2e-15) 13. FFTPACK (f2c): elapsed time t=1.73 s, 8 iters, t-(init.)=1.66 s t(norm)=0.197887, mflops=25.2669 (err=4.2e-15) FFTW_MEASURE plan: (cost = 1.200000e-01) FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 32 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 8 14. FFTW: elapsed time t=1.85 s, 16 iters, t-(init.)=1.7 s t(norm)=0.101328, mflops=49.3448 (err=4.3e-15) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.2 s, 8 iters, t-(init.)=1.13 s t(norm)=0.134706, mflops=37.1177 (err=4.3e-15) 16. Frigo-old: elapsed time t=1.57 s, 8 iters, t-(init.)=1.5 s t(norm)=0.178814, mflops=27.962 (err=4.4e-15) 17. Green: elapsed time t=1.55 s, 8 iters, t-(init.)=1.48 s t(norm)=0.17643, mflops=28.3399 (err=4.3e-15) 18. GSL: elapsed time t=1.03 s, 4 iters, t-(init.)=0.99 s t(norm)=0.236034, mflops=21.1834 (err=4.2e-15) 19. GSL DIT: elapsed time t=1.71 s, 4 iters, t-(init.)=1.68 s t(norm)=0.400543, mflops=12.483 (err=4.2e-15) 20. GSL DIF: elapsed time t=1.85 s, 4 iters, t-(init.)=1.82 s t(norm)=0.433922, mflops=11.5228 (err=4.2e-15) 21. 
Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.75 s, 8 iters, t-(init.)=1.68 s t(norm)=0.200272, mflops=24.9661 (err=4.2e-15) 23. Mayer (simple): elapsed time t=1.6 s, 8 iters, t-(init.)=1.53 s t(norm)=0.18239, mflops=27.4138 24. Mayer (lookup): elapsed time t=1.71 s, 8 iters, t-(init.)=1.64 s t(norm)=0.195503, mflops=25.575 (err=4.2e-15) 25. Monro: elapsed time t=1.42 s, 4 iters, t-(init.)=1.39 s t(norm)=0.331402, mflops=15.0874 (err=1.6e-07) 26. NAPACK (f2c): elapsed time t=1.82 s, 4 iters, t-(init.)=1.79 s t(norm)=0.426769, mflops=11.7159 (err=8.9e-13) 27. Nielsen: elapsed time t=1.58 s, 4 iters, t-(init.)=1.55 s t(norm)=0.369549, mflops=13.53 (err=2.7e-13) 28. NR (C): elapsed time t=1.66 s, 4 iters, t-(init.)=1.63 s t(norm)=0.388622, mflops=12.866 (err=4.2e-15) 29. NR (F): elapsed time t=1.7 s, 4 iters, t-(init.)=1.67 s t(norm)=0.398159, mflops=12.5578 (err=4.2e-15) 30. Ooura (C): elapsed time t=1.97 s, 16 iters, t-(init.)=1.83 s t(norm)=0.109076, mflops=45.8394 (err=4.4e-15) 31. Ooura (F): elapsed time t=1.05 s, 8 iters, t-(init.)=0.98 s t(norm)=0.116825, mflops=42.799 (err=4.4e-15) 32. QFT: elapsed time t=1.21 s, 4 iters, t-(init.)=1.17 s t(norm)=0.27895, mflops=17.9244 (err=7.9e-15) 33. Ransom: elapsed time t=1.36 s, 8 iters, t-(init.)=1.29 s t(norm)=0.15378, mflops=32.514 (err=6.9e-15) 34. SCIPORT: elapsed time t=1.47 s, 2 iters, t-(init.)=1.45 s t(norm)=0.691414, mflops=7.23156 (err=2.5e-07) 35. Singleton: elapsed time t=1.42 s, 4 iters, t-(init.)=1.38 s t(norm)=0.329018, mflops=15.1968 (err=5.6e-15) 36. Singleton (f2c): elapsed time t=1.02 s, 4 iters, t-(init.)=0.98 s t(norm)=0.23365, mflops=21.3995 (err=5.6e-15) 37. Sorensen: elapsed time t=1.21 s, 4 iters, t-(init.)=1.17 s t(norm)=0.27895, mflops=17.9244 (err=4.2e-15) 38. Sorensen DIT: elapsed time t=1.88 s, 4 iters, t-(init.)=1.85 s t(norm)=0.441074, mflops=11.336 (err=4.1e-15) 39. 
Temperton: elapsed time t=1.94 s, 8 iters, t-(init.)=1.87 s t(norm)=0.222921, mflops=22.4294 (err=1.8e-07) 40. Temperton (f2c): elapsed time t=1.14 s, 4 iters, t-(init.)=1.11 s t(norm)=0.264645, mflops=18.8933 (err=4.2e-15) 41. Valkenburg: elapsed time t=1.22 s, 1 iters, t-(init.)=1.21 s t(norm)=1.15395, mflops=4.33296 (err=4.0e-15) Top mflops for N=65536 = 54.4715 Normalized results and averages for N=65536: fft 0: mflops = 16.513 (norm. = 0.30315), norm. avg. (of 16) = 0.350859 fft 1: mflops = 16.384 (norm. = 0.300781), norm. avg. (of 16) = 0.347149 fft 2: mflops = 12.71 (norm. = 0.233333), norm. avg. (of 16) = 0.254321 fft 3: mflops = 17.6231 (norm. = 0.323529), norm. avg. (of 16) = 0.142202 fft 4: mflops = 13.2731 (norm. = 0.243671), norm. avg. (of 16) = 0.237549 fft 5: mflops = 7.13317 (norm. = 0.130952), norm. avg. (of 16) = 0.0776046 fft 6: mflops = 20.9715 (norm. = 0.385), norm. avg. (of 16) = 0.264545 fft 7: mflops = 18.0789 (norm. = 0.331897), norm. avg. (of 16) = 0.226689 fft 8: mflops = 11.7818 (norm. = 0.216292), norm. avg. (of 16) = 0.186958 fft 9: mflops = 54.4715 (norm. = 1), norm. avg. (of 16) = 0.661567 fft 10: mflops = 54.4715 (norm. = 1), norm. avg. (of 16) = 0.646447 fft 11: mflops = 11.7818 (norm. = 0.216292), norm. avg. (of 15) = 0.188117 fft 12: mflops = 23.8313 (norm. = 0.4375), norm. avg. (of 16) = 0.376197 fft 13: mflops = 25.2669 (norm. = 0.463855), norm. avg. (of 16) = 0.396184 fft 14: mflops = 49.3448 (norm. = 0.905882), norm. avg. (of 16) = 0.799 fft 15: mflops = 37.1177 (norm. = 0.681416), norm. avg. (of 16) = 0.704142 fft 16: mflops = 27.962 (norm. = 0.513333), norm. avg. (of 16) = 0.636701 fft 17: mflops = 28.3399 (norm. = 0.52027), norm. avg. (of 14) = 0.542427 fft 18: mflops = 21.1834 (norm. = 0.388889), norm. avg. (of 16) = 0.26304 fft 19: mflops = 12.483 (norm. = 0.229167), norm. avg. (of 16) = 0.241905 fft 20: mflops = 11.5228 (norm. = 0.211538), norm. avg. (of 16) = 0.237359 fft 21: mflops = -1 (norm. = -0.0183582), norm. 
avg. (of 12) = 0.477583 fft 22: mflops = 24.9661 (norm. = 0.458333), norm. avg. (of 15) = 0.33847 fft 23: mflops = 27.4138 (norm. = 0.503268), norm. avg. (of 15) = 0.410395 fft 24: mflops = 25.575 (norm. = 0.469512), norm. avg. (of 15) = 0.416964 fft 25: mflops = 15.0874 (norm. = 0.276978), norm. avg. (of 15) = 0.235795 fft 26: mflops = 11.7159 (norm. = 0.215084), norm. avg. (of 16) = 0.161903 fft 27: mflops = 13.53 (norm. = 0.248387), norm. avg. (of 16) = 0.176481 fft 28: mflops = 12.866 (norm. = 0.236196), norm. avg. (of 16) = 0.28812 fft 29: mflops = 12.5578 (norm. = 0.230539), norm. avg. (of 16) = 0.238984 fft 30: mflops = 45.8394 (norm. = 0.84153), norm. avg. (of 16) = 0.844605 fft 31: mflops = 42.799 (norm. = 0.785714), norm. avg. (of 16) = 0.714478 fft 32: mflops = 17.9244 (norm. = 0.32906), norm. avg. (of 13) = 0.383977 fft 33: mflops = 32.514 (norm. = 0.596899), norm. avg. (of 15) = 0.317658 fft 34: mflops = 7.23156 (norm. = 0.132759), norm. avg. (of 15) = 0.112987 fft 35: mflops = 15.1968 (norm. = 0.278986), norm. avg. (of 16) = 0.195797 fft 36: mflops = 21.3995 (norm. = 0.392857), norm. avg. (of 16) = 0.323954 fft 37: mflops = 17.9244 (norm. = 0.32906), norm. avg. (of 16) = 0.422498 fft 38: mflops = 11.336 (norm. = 0.208108), norm. avg. (of 16) = 0.174508 fft 39: mflops = 22.4294 (norm. = 0.411765), norm. avg. (of 16) = 0.345843 fft 40: mflops = 18.8933 (norm. = 0.346847), norm. avg. (of 16) = 0.260722 fft 41: mflops = 4.33296 (norm. = 0.0795455), norm. avg. (of 16) = 0.0554484 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.52 s, 2 iters, t-(init.)=1.49 s t(norm)=0.334347, mflops=14.9545 (err=2.8e-15) 1. Arndt DIT: elapsed time t=1.52 s, 2 iters, t-(init.)=1.48 s t(norm)=0.332103, mflops=15.0556 (err=2.8e-15) 2. Arndt Split-Radix: elapsed time t=1 s, 1 iters, t-(init.)=0.98 s t(norm)=0.439812, mflops=11.3685 (err=2.8e-15) 3. 
Arndt 4-step: elapsed time t=1.48 s, 2 iters, t-(init.)=1.44 s t(norm)=0.323127, mflops=15.4738 (err=2.8e-15) 4. Bailey: elapsed time t=1.64 s, 2 iters, t-(init.)=1.61 s t(norm)=0.361274, mflops=13.8399 (err=2.8e-15) 5. Beauregard: elapsed time t=1.61 s, 1 iters, t-(init.)=1.59 s t(norm)=0.713573, mflops=7.00699 (err=2.9e-15) 6. Bergland: elapsed time t=1.16 s, 2 iters, t-(init.)=1.12 s t(norm)=0.251321, mflops=19.8949 (err=2.9e-15) 7. Brenner: elapsed time t=1.4 s, 2 iters, t-(init.)=1.36 s t(norm)=0.305176, mflops=16.384 (err=2.9e-15) 8. Burrus: elapsed time t=1.07 s, 1 iters, t-(init.)=1.05 s t(norm)=0.471227, mflops=10.6106 (err=2.8e-15) 9. CWP (min N) (N=144144): elapsed time t=1.82 s, 8 iters, t-(init.)=1.65 s t(norm)=0.0925625, mflops=54.0176 10. CWP (best N) (N=144144): elapsed time t=1.83 s, 8 iters, t-(init.)=1.66 s t(norm)=0.0931235, mflops=53.6921 11. Edelblute: elapsed time t=1.07 s, 1 iters, t-(init.)=1.06 s t(norm)=0.475715, mflops=10.5105 (err=2.8e-15) 12. FFTPACK: elapsed time t=1.09 s, 2 iters, t-(init.)=1.05 s t(norm)=0.235614, mflops=21.2212 (err=2.9e-15) 13. FFTPACK (f2c): elapsed time t=1.03 s, 2 iters, t-(init.)=0.99 s t(norm)=0.22215, mflops=22.5073 (err=2.9e-15) FFTW_MEASURE plan: (cost = 2.600000e-01) FFTW_TWIDDLE 2 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 8 14. FFTW: elapsed time t=1.97 s, 8 iters, t-(init.)=1.82 s t(norm)=0.102099, mflops=48.972 (err=2.9e-15) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.24 s, 4 iters, t-(init.)=1.17 s t(norm)=0.13127, mflops=38.0893 (err=2.9e-15) 16. Frigo-old: elapsed time t=1.78 s, 4 iters, t-(init.)=1.71 s t(norm)=0.191857, mflops=26.0611 (err=2.8e-15) 17. Green: elapsed time t=1.87 s, 4 iters, t-(init.)=1.8 s t(norm)=0.201955, mflops=24.758 (err=2.9e-15) 18. 
GSL: elapsed time t=1.17 s, 2 iters, t-(init.)=1.13 s t(norm)=0.253565, mflops=19.7188 (err=2.9e-15) 19. GSL DIT: elapsed time t=1.99 s, 2 iters, t-(init.)=1.95 s t(norm)=0.437568, mflops=11.4268 (err=2.9e-15) 20. GSL DIF: elapsed time t=1.07 s, 1 iters, t-(init.)=1.05 s t(norm)=0.471227, mflops=10.6106 (err=2.9e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.3 s, 2 iters, t-(init.)=1.27 s t(norm)=0.28498, mflops=17.5451 (err=2.8e-15) 23. Mayer (simple): elapsed time t=1.23 s, 2 iters, t-(init.)=1.19 s t(norm)=0.267029, mflops=18.7246 24. Mayer (lookup): elapsed time t=1.27 s, 2 iters, t-(init.)=1.23 s t(norm)=0.276005, mflops=18.1156 (err=2.8e-15) 25. Monro: elapsed time t=1.68 s, 2 iters, t-(init.)=1.64 s t(norm)=0.368006, mflops=13.5867 (err=1.7e-07) 26. NAPACK (f2c): elapsed time t=1.99 s, 2 iters, t-(init.)=1.95 s t(norm)=0.437568, mflops=11.4268 (err=2.1e-12) 27. Nielsen: elapsed time t=1.7 s, 2 iters, t-(init.)=1.66 s t(norm)=0.372494, mflops=13.423 (err=9.6e-13) 28. NR (C): elapsed time t=1.92 s, 2 iters, t-(init.)=1.88 s t(norm)=0.421861, mflops=11.8523 (err=2.9e-15) 29. NR (F): elapsed time t=1.96 s, 2 iters, t-(init.)=1.92 s t(norm)=0.430836, mflops=11.6053 (err=2.9e-15) 30. Ooura (C): elapsed time t=1.16 s, 4 iters, t-(init.)=1.09 s t(norm)=0.122295, mflops=40.8848 (err=2.8e-15) 31. Ooura (F): elapsed time t=1.23 s, 4 iters, t-(init.)=1.15 s t(norm)=0.129027, mflops=38.7517 (err=2.8e-15) 32. QFT: elapsed time t=1.45 s, 2 iters, t-(init.)=1.42 s t(norm)=0.318639, mflops=15.6917 (err=8.6e-15) 33. Ransom: elapsed time t=1.7 s, 4 iters, t-(init.)=1.62 s t(norm)=0.181759, mflops=27.5089 (err=4.0e-15) 34. SCIPORT: elapsed time t=1.57 s, 1 iters, t-(init.)=1.56 s t(norm)=0.700109, mflops=7.14174 (err=2.7e-07) 35. Singleton: elapsed time t=1.64 s, 2 iters, t-(init.)=1.6 s t(norm)=0.35903, mflops=13.9264 (err=4.3e-15) 36. 
Singleton (f2c): elapsed time t=1.23 s, 2 iters, t-(init.)=1.19 s t(norm)=0.267029, mflops=18.7246 (err=4.2e-15) 37. Sorensen: elapsed time t=1.37 s, 2 iters, t-(init.)=1.34 s t(norm)=0.300688, mflops=16.6285 (err=2.8e-15) 38. Sorensen DIT: elapsed time t=1.08 s, 1 iters, t-(init.)=1.06 s t(norm)=0.475715, mflops=10.5105 (err=2.8e-15) 39. Temperton: elapsed time t=1.18 s, 2 iters, t-(init.)=1.14 s t(norm)=0.255809, mflops=19.5458 (err=2.0e-07) 40. Temperton (f2c): elapsed time t=1.38 s, 2 iters, t-(init.)=1.35 s t(norm)=0.302932, mflops=16.5054 (err=2.9e-15) 41. Valkenburg: elapsed time t=2.6 s, 1 iters, t-(init.)=2.59 s t(norm)=1.16236, mflops=4.30159 (err=3.1e-15) Top mflops for N=131072 = 54.0176 Normalized results and averages for N=131072: fft 0: mflops = 14.9545 (norm. = 0.276846), norm. avg. (of 17) = 0.346506 fft 1: mflops = 15.0556 (norm. = 0.278716), norm. avg. (of 17) = 0.343124 fft 2: mflops = 11.3685 (norm. = 0.210459), norm. avg. (of 17) = 0.251741 fft 3: mflops = 15.4738 (norm. = 0.286458), norm. avg. (of 17) = 0.150687 fft 4: mflops = 13.8399 (norm. = 0.256211), norm. avg. (of 17) = 0.238646 fft 5: mflops = 7.00699 (norm. = 0.129717), norm. avg. (of 17) = 0.08067 fft 6: mflops = 19.8949 (norm. = 0.368304), norm. avg. (of 17) = 0.270648 fft 7: mflops = 16.384 (norm. = 0.303309), norm. avg. (of 17) = 0.231196 fft 8: mflops = 10.6106 (norm. = 0.196429), norm. avg. (of 17) = 0.187515 fft 9: mflops = 54.0176 (norm. = 1), norm. avg. (of 17) = 0.681475 fft 10: mflops = 53.6921 (norm. = 0.993976), norm. avg. (of 17) = 0.66689 fft 11: mflops = 10.5105 (norm. = 0.194575), norm. avg. (of 16) = 0.188521 fft 12: mflops = 21.2212 (norm. = 0.392857), norm. avg. (of 17) = 0.377177 fft 13: mflops = 22.5073 (norm. = 0.416667), norm. avg. (of 17) = 0.397389 fft 14: mflops = 48.972 (norm. = 0.906593), norm. avg. (of 17) = 0.805329 fft 15: mflops = 38.0893 (norm. = 0.705128), norm. avg. (of 17) = 0.7042 fft 16: mflops = 26.0611 (norm. = 0.482456), norm. avg. 
(of 17) = 0.627628 fft 17: mflops = 24.758 (norm. = 0.458333), norm. avg. (of 15) = 0.536821 fft 18: mflops = 19.7188 (norm. = 0.365044), norm. avg. (of 17) = 0.269041 fft 19: mflops = 11.4268 (norm. = 0.211538), norm. avg. (of 17) = 0.240118 fft 20: mflops = 10.6106 (norm. = 0.196429), norm. avg. (of 17) = 0.234952 fft 21: mflops = -1 (norm. = -0.0185125), norm. avg. (of 12) = 0.477583 fft 22: mflops = 17.5451 (norm. = 0.324803), norm. avg. (of 16) = 0.337616 fft 23: mflops = 18.7246 (norm. = 0.346639), norm. avg. (of 16) = 0.40641 fft 24: mflops = 18.1156 (norm. = 0.335366), norm. avg. (of 16) = 0.411864 fft 25: mflops = 13.5867 (norm. = 0.251524), norm. avg. (of 16) = 0.236778 fft 26: mflops = 11.4268 (norm. = 0.211538), norm. avg. (of 17) = 0.164822 fft 27: mflops = 13.423 (norm. = 0.248494), norm. avg. (of 17) = 0.180717 fft 28: mflops = 11.8523 (norm. = 0.219415), norm. avg. (of 17) = 0.284078 fft 29: mflops = 11.6053 (norm. = 0.214844), norm. avg. (of 17) = 0.237564 fft 30: mflops = 40.8848 (norm. = 0.756881), norm. avg. (of 17) = 0.839444 fft 31: mflops = 38.7517 (norm. = 0.717391), norm. avg. (of 17) = 0.71465 fft 32: mflops = 15.6917 (norm. = 0.290493), norm. avg. (of 14) = 0.3773 fft 33: mflops = 27.5089 (norm. = 0.509259), norm. avg. (of 16) = 0.329633 fft 34: mflops = 7.14174 (norm. = 0.132212), norm. avg. (of 16) = 0.114189 fft 35: mflops = 13.9264 (norm. = 0.257813), norm. avg. (of 17) = 0.199445 fft 36: mflops = 18.7246 (norm. = 0.346639), norm. avg. (of 17) = 0.325289 fft 37: mflops = 16.6285 (norm. = 0.307836), norm. avg. (of 17) = 0.415754 fft 38: mflops = 10.5105 (norm. = 0.194575), norm. avg. (of 17) = 0.175689 fft 39: mflops = 19.5458 (norm. = 0.361842), norm. avg. (of 17) = 0.346785 fft 40: mflops = 16.5054 (norm. = 0.305556), norm. avg. (of 17) = 0.263359 fft 41: mflops = 4.30159 (norm. = 0.0796332), norm. avg. (of 17) = 0.0568711 Benchmarking for array size = 262144 (power of 2): 0. 
Arndt DIF: elapsed time t=1.67 s, 1 iters, t-(init.)=1.63 s t(norm)=0.345442, mflops=14.4742 (err=6.7e-15) 1. Arndt DIT: elapsed time t=1.66 s, 1 iters, t-(init.)=1.62 s t(norm)=0.343323, mflops=14.5636 (err=6.7e-15) 2. Arndt Split-Radix: elapsed time t=2.2 s, 1 iters, t-(init.)=2.16 s t(norm)=0.457764, mflops=10.9227 (err=6.7e-15) 3. Arndt 4-step: elapsed time t=1.32 s, 1 iters, t-(init.)=1.29 s t(norm)=0.273387, mflops=18.2891 (err=6.8e-15) 4. Bailey: elapsed time t=1.79 s, 1 iters, t-(init.)=1.76 s t(norm)=0.372993, mflops=13.4051 (err=6.7e-15) 5. Beauregard: elapsed time t=3.42 s, 1 iters, t-(init.)=3.38 s t(norm)=0.716315, mflops=6.98017 (err=6.8e-15) 6. Bergland: elapsed time t=1.22 s, 1 iters, t-(init.)=1.18 s t(norm)=0.250075, mflops=19.994 (err=6.8e-15) 7. Brenner: elapsed time t=1.44 s, 1 iters, t-(init.)=1.4 s t(norm)=0.296699, mflops=16.8521 (err=6.9e-15) 8. Burrus: elapsed time t=2.36 s, 1 iters, t-(init.)=2.32 s t(norm)=0.491672, mflops=10.1694 (err=6.7e-15) 9. CWP (min N) (N=360360): elapsed time t=1.35 s, 2 iters, t-(init.)=1.25 s t(norm)=0.132455, mflops=37.7487 10. CWP (best N) (N=360360): elapsed time t=1.35 s, 2 iters, t-(init.)=1.25 s t(norm)=0.132455, mflops=37.7487 11. Edelblute: elapsed time t=2.35 s, 1 iters, t-(init.)=2.32 s t(norm)=0.491672, mflops=10.1694 (err=6.7e-15) 12. FFTPACK: elapsed time t=1.12 s, 1 iters, t-(init.)=1.08 s t(norm)=0.228882, mflops=21.8453 (err=6.8e-15) 13. FFTPACK (f2c): elapsed time t=1.07 s, 1 iters, t-(init.)=1.03 s t(norm)=0.218285, mflops=22.9058 (err=6.8e-15) FFTW_MEASURE plan: (cost = 5.100000e-01) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 4 FFTW_NOTW 8 14. FFTW: elapsed time t=1.96 s, 4 iters, t-(init.)=1.81 s t(norm)=0.0958973, mflops=52.1391 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.39 s, 2 iters, t-(init.)=1.32 s t(norm)=0.139872, mflops=35.7469 (err=6.8e-15) 16. Frigo-old: elapsed time t=1.79 s, 2 iters, t-(init.)=1.71 s t(norm)=0.181198, mflops=27.5941 (err=6.9e-15) 17. Green: elapsed time t=1.95 s, 2 iters, t-(init.)=1.87 s t(norm)=0.198152, mflops=25.2331 (err=6.9e-15) 18. GSL: elapsed time t=1.24 s, 1 iters, t-(init.)=1.2 s t(norm)=0.254313, mflops=19.6608 (err=6.8e-15) 19. GSL DIT: elapsed time t=2.16 s, 1 iters, t-(init.)=2.13 s t(norm)=0.451406, mflops=11.0765 (err=6.8e-15) 20. GSL DIF: elapsed time t=2.35 s, 1 iters, t-(init.)=2.31 s t(norm)=0.489553, mflops=10.2134 (err=6.8e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.5 s, 1 iters, t-(init.)=1.46 s t(norm)=0.309414, mflops=16.1596 (err=6.8e-15) 23. Mayer (simple): elapsed time t=1.44 s, 1 iters, t-(init.)=1.4 s t(norm)=0.296699, mflops=16.8521 24. Mayer (lookup): elapsed time t=1.48 s, 1 iters, t-(init.)=1.44 s t(norm)=0.305176, mflops=16.384 (err=6.8e-15) 25. Monro: elapsed time t=1.77 s, 1 iters, t-(init.)=1.73 s t(norm)=0.366635, mflops=13.6375 (err=1.8e-07) 26. NAPACK (f2c): elapsed time t=2 s, 1 iters, t-(init.)=1.97 s t(norm)=0.417497, mflops=11.9761 (err=3.7e-12) 27. Nielsen: elapsed time t=1.78 s, 1 iters, t-(init.)=1.74 s t(norm)=0.368754, mflops=13.5592 (err=2.2e-12) 28. NR (C): elapsed time t=2.08 s, 1 iters, t-(init.)=2.04 s t(norm)=0.432332, mflops=11.5652 (err=6.8e-15) 29. NR (F): elapsed time t=2.12 s, 1 iters, t-(init.)=2.08 s t(norm)=0.440809, mflops=11.3428 (err=6.8e-15) 30. Ooura (C): elapsed time t=1.16 s, 2 iters, t-(init.)=1.08 s t(norm)=0.114441, mflops=43.6907 (err=6.9e-15) 31. Ooura (F): elapsed time t=1.25 s, 2 iters, t-(init.)=1.17 s t(norm)=0.123978, mflops=40.3298 (err=6.9e-15) 32. QFT: elapsed time t=1.62 s, 1 iters, t-(init.)=1.58 s t(norm)=0.334846, mflops=14.9323 (err=1.4e-14) 33. 
Ransom: elapsed time t=1.42 s, 2 iters, t-(init.)=1.34 s t(norm)=0.141992, mflops=35.2134 (err=8.2e-15) 34. SCIPORT: elapsed time t=3.37 s, 1 iters, t-(init.)=3.33 s t(norm)=0.705719, mflops=7.08497 (err=2.8e-07) 35. Singleton: elapsed time t=1.7 s, 1 iters, t-(init.)=1.66 s t(norm)=0.3518, mflops=14.2126 (err=1.0e-14) 36. Singleton (f2c): elapsed time t=1.26 s, 1 iters, t-(init.)=1.23 s t(norm)=0.260671, mflops=19.1813 (err=1.0e-14) 37. Sorensen: elapsed time t=1.48 s, 1 iters, t-(init.)=1.44 s t(norm)=0.305176, mflops=16.384 (err=6.8e-15) 38. Sorensen DIT: elapsed time t=2.34 s, 1 iters, t-(init.)=2.31 s t(norm)=0.489553, mflops=10.2134 (err=6.7e-15) 39. Temperton: elapsed time t=1.2 s, 1 iters, t-(init.)=1.16 s t(norm)=0.245836, mflops=20.3388 (err=2.0e-07) 40. Temperton (f2c): elapsed time t=1.37 s, 1 iters, t-(init.)=1.33 s t(norm)=0.281864, mflops=17.7391 (err=6.8e-15) 41. Valkenburg: elapsed time t=5.56 s, 1 iters, t-(init.)=5.52 s t(norm)=1.16984, mflops=4.27409 (err=6.8e-15) Top mflops for N=262144 = 52.1391 Normalized results and averages for N=262144: fft 0: mflops = 14.4742 (norm. = 0.277607), norm. avg. (of 18) = 0.342678 fft 1: mflops = 14.5636 (norm. = 0.279321), norm. avg. (of 18) = 0.339579 fft 2: mflops = 10.9227 (norm. = 0.209491), norm. avg. (of 18) = 0.249394 fft 3: mflops = 18.2891 (norm. = 0.350775), norm. avg. (of 18) = 0.161803 fft 4: mflops = 13.4051 (norm. = 0.257102), norm. avg. (of 18) = 0.239672 fft 5: mflops = 6.98017 (norm. = 0.133876), norm. avg. (of 18) = 0.0836259 fft 6: mflops = 19.994 (norm. = 0.383475), norm. avg. (of 18) = 0.276916 fft 7: mflops = 16.8521 (norm. = 0.323214), norm. avg. (of 18) = 0.236308 fft 8: mflops = 10.1694 (norm. = 0.195043), norm. avg. (of 18) = 0.187933 fft 9: mflops = 37.7487 (norm. = 0.724), norm. avg. (of 18) = 0.683838 fft 10: mflops = 37.7487 (norm. = 0.724), norm. avg. (of 18) = 0.670063 fft 11: mflops = 10.1694 (norm. = 0.195043), norm. avg. (of 17) = 0.188904 fft 12: mflops = 21.8453 (norm. 
= 0.418981), norm. avg. (of 18) = 0.379499 fft 13: mflops = 22.9058 (norm. = 0.43932), norm. avg. (of 18) = 0.399718 fft 14: mflops = 52.1391 (norm. = 1), norm. avg. (of 18) = 0.816144 fft 15: mflops = 35.7469 (norm. = 0.685606), norm. avg. (of 18) = 0.703167 fft 16: mflops = 27.5941 (norm. = 0.52924), norm. avg. (of 18) = 0.622162 fft 17: mflops = 25.2331 (norm. = 0.483957), norm. avg. (of 16) = 0.533517 fft 18: mflops = 19.6608 (norm. = 0.377083), norm. avg. (of 18) = 0.275043 fft 19: mflops = 11.0765 (norm. = 0.212441), norm. avg. (of 18) = 0.238581 fft 20: mflops = 10.2134 (norm. = 0.195887), norm. avg. (of 18) = 0.232781 fft 21: mflops = -1 (norm. = -0.0191795), norm. avg. (of 12) = 0.477583 fft 22: mflops = 16.1596 (norm. = 0.309932), norm. avg. (of 17) = 0.335987 fft 23: mflops = 16.8521 (norm. = 0.323214), norm. avg. (of 17) = 0.401516 fft 24: mflops = 16.384 (norm. = 0.314236), norm. avg. (of 17) = 0.406121 fft 25: mflops = 13.6375 (norm. = 0.261561), norm. avg. (of 17) = 0.238236 fft 26: mflops = 11.9761 (norm. = 0.229695), norm. avg. (of 18) = 0.168427 fft 27: mflops = 13.5592 (norm. = 0.260057), norm. avg. (of 18) = 0.185125 fft 28: mflops = 11.5652 (norm. = 0.221814), norm. avg. (of 18) = 0.280619 fft 29: mflops = 11.3428 (norm. = 0.217548), norm. avg. (of 18) = 0.236452 fft 30: mflops = 43.6907 (norm. = 0.837963), norm. avg. (of 18) = 0.839362 fft 31: mflops = 40.3298 (norm. = 0.773504), norm. avg. (of 18) = 0.717919 fft 32: mflops = 14.9323 (norm. = 0.286392), norm. avg. (of 15) = 0.371239 fft 33: mflops = 35.2134 (norm. = 0.675373), norm. avg. (of 17) = 0.349971 fft 34: mflops = 7.08497 (norm. = 0.135886), norm. avg. (of 17) = 0.115465 fft 35: mflops = 14.2126 (norm. = 0.27259), norm. avg. (of 18) = 0.203509 fft 36: mflops = 19.1813 (norm. = 0.367886), norm. avg. (of 18) = 0.327655 fft 37: mflops = 16.384 (norm. = 0.314236), norm. avg. (of 18) = 0.410114 fft 38: mflops = 10.2134 (norm. = 0.195887), norm. avg. 
(of 18) = 0.176811 fft 39: mflops = 20.3388 (norm. = 0.390086), norm. avg. (of 18) = 0.34919 fft 40: mflops = 17.7391 (norm. = 0.340226), norm. avg. (of 18) = 0.267629 fft 41: mflops = 4.27409 (norm. = 0.0819746), norm. avg. (of 18) = 0.0582657 Benchmarking for array size = 524288 (power of 2): 0. Arndt DIF: elapsed time t=3.47 s, 1 iters, t-(init.)=3.39 s t(norm)=0.340311, mflops=14.6924 (err=1.0e-14) 1. Arndt DIT: elapsed time t=3.47 s, 1 iters, t-(init.)=3.4 s t(norm)=0.341315, mflops=14.6492 (err=1.0e-14) 2. Arndt Split-Radix: elapsed time t=4.68 s, 1 iters, t-(init.)=4.61 s t(norm)=0.462783, mflops=10.8042 (err=1.0e-14) 3. Arndt 4-step: elapsed time t=3.19 s, 1 iters, t-(init.)=3.11 s t(norm)=0.312203, mflops=16.0152 (err=1.0e-14) 4. Bailey: elapsed time t=3.57 s, 1 iters, t-(init.)=3.5 s t(norm)=0.351354, mflops=14.2307 (err=1.0e-14) 5. Beauregard: elapsed time t=7.23 s, 1 iters, t-(init.)=7.16 s t(norm)=0.718769, mflops=6.95634 (err=1.0e-14) 6. Bergland: elapsed time t=2.66 s, 1 iters, t-(init.)=2.59 s t(norm)=0.260002, mflops=19.2306 (err=1.0e-14) 7. Brenner: elapsed time t=3.13 s, 1 iters, t-(init.)=3.05 s t(norm)=0.30618, mflops=16.3303 (err=1.0e-14) 8. Burrus: elapsed time t=4.99 s, 1 iters, t-(init.)=4.92 s t(norm)=0.493903, mflops=10.1234 (err=1.0e-14) 9. CWP (min N) (N=720720): elapsed time t=1.37 s, 1 iters, t-(init.)=1.26 s t(norm)=0.126487, mflops=39.5297 10. CWP (best N) (N=720720): elapsed time t=1.37 s, 1 iters, t-(init.)=1.26 s t(norm)=0.126487, mflops=39.5297 11. Edelblute: elapsed time t=4.99 s, 1 iters, t-(init.)=4.92 s t(norm)=0.493903, mflops=10.1234 (err=1.0e-14) 12. FFTPACK: elapsed time t=2.29 s, 1 iters, t-(init.)=2.22 s t(norm)=0.222859, mflops=22.4357 (err=1.0e-14) 13. 
FFTPACK (f2c): elapsed time t=2.14 s, 1 iters, t-(init.)=2.07 s t(norm)=0.207801, mflops=24.0615 (err=1.0e-14) FFTW_MEASURE plan: (cost = 1.170000e+00) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 32 FFTW_TWIDDLE 2 FFTW_NOTW 8 14. FFTW: elapsed time t=1.11 s, 1 iters, t-(init.)=1.03 s t(norm)=0.103398, mflops=48.3567 (err=1.0e-14) FFTW_ESTIMATE plan: (cost = 5.976883e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.48 s, 1 iters, t-(init.)=1.41 s t(norm)=0.141545, mflops=35.3244 (err=1.0e-14) 16. Frigo-old: elapsed time t=2.06 s, 1 iters, t-(init.)=1.98 s t(norm)=0.198766, mflops=25.1552 (err=1.0e-14) 17. Green: elapsed time t=2.1 s, 1 iters, t-(init.)=2.02 s t(norm)=0.202781, mflops=24.6571 (err=1.0e-14) 18. GSL: elapsed time t=2.53 s, 1 iters, t-(init.)=2.45 s t(norm)=0.245948, mflops=20.3295 (err=1.0e-14) 19. GSL DIT: elapsed time t=4.56 s, 1 iters, t-(init.)=4.48 s t(norm)=0.449733, mflops=11.1177 (err=1.0e-14) 20. GSL DIF: elapsed time t=4.95 s, 1 iters, t-(init.)=4.88 s t(norm)=0.489887, mflops=10.2064 (err=1.0e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=3.11 s, 1 iters, t-(init.)=3.03 s t(norm)=0.304172, mflops=16.4381 (err=1.0e-14) 23. Mayer (simple): elapsed time t=2.97 s, 1 iters, t-(init.)=2.89 s t(norm)=0.290118, mflops=17.2344 24. Mayer (lookup): elapsed time t=3.07 s, 1 iters, t-(init.)=3 s t(norm)=0.30116, mflops=16.6025 (err=1.0e-14) 25. Monro: elapsed time t=3.83 s, 1 iters, t-(init.)=3.76 s t(norm)=0.377454, mflops=13.2466 (err=2.0e-07) 26. NAPACK (f2c): elapsed time t=4.29 s, 1 iters, t-(init.)=4.22 s t(norm)=0.423632, mflops=11.8027 (err=8.0e-12) 27. Nielsen: elapsed time t=4.02 s, 1 iters, t-(init.)=3.95 s t(norm)=0.396528, mflops=12.6095 (err=4.5e-12) 28. NR (C): elapsed time t=4.38 s, 1 iters, t-(init.)=4.3 s t(norm)=0.431663, mflops=11.5831 (err=1.0e-14) 29. 
NR (F): elapsed time t=4.46 s, 1 iters, t-(init.)=4.39 s t(norm)=0.440698, mflops=11.3456 (err=1.0e-14) 30. Ooura (C): elapsed time t=1.27 s, 1 iters, t-(init.)=1.19 s t(norm)=0.11946, mflops=41.8549 (err=1.0e-14) 31. Ooura (F): elapsed time t=1.35 s, 1 iters, t-(init.)=1.28 s t(norm)=0.128495, mflops=38.912 (err=1.0e-14) 32. QFT: elapsed time t=3.53 s, 1 iters, t-(init.)=3.45 s t(norm)=0.346334, mflops=14.4369 (err=1.9e-14) 33. Ransom: elapsed time t=1.8 s, 1 iters, t-(init.)=1.73 s t(norm)=0.173669, mflops=28.7904 (err=1.0e-14) 34. SCIPORT: elapsed time t=7 s, 1 iters, t-(init.)=6.92 s t(norm)=0.694676, mflops=7.1976 (err=3.0e-07) 35. Singleton: elapsed time t=3.88 s, 1 iters, t-(init.)=3.8 s t(norm)=0.38147, mflops=13.1072 (err=1.6e-14) 36. Singleton (f2c): elapsed time t=3.07 s, 1 iters, t-(init.)=2.99 s t(norm)=0.300156, mflops=16.658 (err=1.6e-14) 37. Sorensen: elapsed time t=3.15 s, 1 iters, t-(init.)=3.07 s t(norm)=0.308187, mflops=16.2239 (err=1.0e-14) 38. Sorensen DIT: elapsed time t=4.95 s, 1 iters, t-(init.)=4.88 s t(norm)=0.489887, mflops=10.2064 (err=1.0e-14) 39. Temperton: elapsed time t=2.7 s, 1 iters, t-(init.)=2.63 s t(norm)=0.264017, mflops=18.9382 (err=2.1e-07) 40. Temperton (f2c): elapsed time t=3.12 s, 1 iters, t-(init.)=3.04 s t(norm)=0.305176, mflops=16.384 (err=1.0e-14) 41. Valkenburg: elapsed time t=11.94 s, 1 iters, t-(init.)=11.86 s t(norm)=1.19059, mflops=4.19961 (err=1.0e-14) Top mflops for N=524288 = 48.3567 Normalized results and averages for N=524288: fft 0: mflops = 14.6924 (norm. = 0.303835), norm. avg. (of 19) = 0.340634 fft 1: mflops = 14.6492 (norm. = 0.302941), norm. avg. (of 19) = 0.337651 fft 2: mflops = 10.8042 (norm. = 0.223427), norm. avg. (of 19) = 0.248027 fft 3: mflops = 16.0152 (norm. = 0.33119), norm. avg. (of 19) = 0.170719 fft 4: mflops = 14.2307 (norm. = 0.294286), norm. avg. (of 19) = 0.242546 fft 5: mflops = 6.95634 (norm. = 0.143855), norm. avg. (of 19) = 0.0867958 fft 6: mflops = 19.2306 (norm. 
= 0.397683), norm. avg. (of 19) = 0.283272 fft 7: mflops = 16.3303 (norm. = 0.337705), norm. avg. (of 19) = 0.241645 fft 8: mflops = 10.1234 (norm. = 0.20935), norm. avg. (of 19) = 0.189061 fft 9: mflops = 39.5297 (norm. = 0.81746), norm. avg. (of 19) = 0.69087 fft 10: mflops = 39.5297 (norm. = 0.81746), norm. avg. (of 19) = 0.67782 fft 11: mflops = 10.1234 (norm. = 0.20935), norm. avg. (of 18) = 0.19004 fft 12: mflops = 22.4357 (norm. = 0.463964), norm. avg. (of 19) = 0.383945 fft 13: mflops = 24.0615 (norm. = 0.497585), norm. avg. (of 19) = 0.404869 fft 14: mflops = 48.3567 (norm. = 1), norm. avg. (of 19) = 0.825821 fft 15: mflops = 35.3244 (norm. = 0.730496), norm. avg. (of 19) = 0.704605 fft 16: mflops = 25.1552 (norm. = 0.520202), norm. avg. (of 19) = 0.616796 fft 17: mflops = 24.6571 (norm. = 0.509901), norm. avg. (of 17) = 0.532128 fft 18: mflops = 20.3295 (norm. = 0.420408), norm. avg. (of 19) = 0.282694 fft 19: mflops = 11.1177 (norm. = 0.229911), norm. avg. (of 19) = 0.238124 fft 20: mflops = 10.2064 (norm. = 0.211066), norm. avg. (of 19) = 0.231639 fft 21: mflops = -1 (norm. = -0.0206797), norm. avg. (of 12) = 0.477583 fft 22: mflops = 16.4381 (norm. = 0.339934), norm. avg. (of 18) = 0.336207 fft 23: mflops = 17.2344 (norm. = 0.356401), norm. avg. (of 18) = 0.39901 fft 24: mflops = 16.6025 (norm. = 0.343333), norm. avg. (of 18) = 0.402633 fft 25: mflops = 13.2466 (norm. = 0.273936), norm. avg. (of 18) = 0.240219 fft 26: mflops = 11.8027 (norm. = 0.244076), norm. avg. (of 19) = 0.172408 fft 27: mflops = 12.6095 (norm. = 0.260759), norm. avg. (of 19) = 0.189105 fft 28: mflops = 11.5831 (norm. = 0.239535), norm. avg. (of 19) = 0.278457 fft 29: mflops = 11.3456 (norm. = 0.234624), norm. avg. (of 19) = 0.236356 fft 30: mflops = 41.8549 (norm. = 0.865546), norm. avg. (of 19) = 0.84074 fft 31: mflops = 38.912 (norm. = 0.804687), norm. avg. (of 19) = 0.722486 fft 32: mflops = 14.4369 (norm. = 0.298551), norm. avg. 
(of 16) = 0.366696 fft 33: mflops = 28.7904 (norm. = 0.595376), norm. avg. (of 18) = 0.363605 fft 34: mflops = 7.1976 (norm. = 0.148844), norm. avg. (of 18) = 0.117319 fft 35: mflops = 13.1072 (norm. = 0.271053), norm. avg. (of 19) = 0.207064 fft 36: mflops = 16.658 (norm. = 0.344482), norm. avg. (of 19) = 0.328541 fft 37: mflops = 16.2239 (norm. = 0.335505), norm. avg. (of 19) = 0.406187 fft 38: mflops = 10.2064 (norm. = 0.211066), norm. avg. (of 19) = 0.178614 fft 39: mflops = 18.9382 (norm. = 0.391635), norm. avg. (of 19) = 0.351424 fft 40: mflops = 16.384 (norm. = 0.338816), norm. avg. (of 19) = 0.271376 fft 41: mflops = 4.19961 (norm. = 0.0868465), norm. avg. (of 19) = 0.05977 Benchmarking for array size = 1048576 (power of 2): 0. Arndt DIF: elapsed time t=7.66 s, 1 iters, t-(init.)=7.51 s t(norm)=0.358105, mflops=13.9624 (err=4.5e-14) 1. Arndt DIT: elapsed time t=7.64 s, 1 iters, t-(init.)=7.49 s t(norm)=0.357151, mflops=13.9997 (err=4.5e-14) 2. Arndt Split-Radix: elapsed time t=10.09 s, 1 iters, t-(init.)=9.94 s t(norm)=0.473976, mflops=10.5491 (err=4.5e-14) 3. Arndt 4-step: elapsed time t=5.67 s, 1 iters, t-(init.)=5.52 s t(norm)=0.263214, mflops=18.9959 (err=4.5e-14) 4. Bailey: elapsed time t=7.79 s, 1 iters, t-(init.)=7.64 s t(norm)=0.364304, mflops=13.7248 (err=4.5e-14) 5. Beauregard: elapsed time t=15.24 s, 1 iters, t-(init.)=15.09 s t(norm)=0.719547, mflops=6.94881 (err=4.7e-14) 6. Bergland: elapsed time t=5.61 s, 1 iters, t-(init.)=5.46 s t(norm)=0.260353, mflops=19.2047 (err=4.7e-14) 7. Brenner: elapsed time t=6.5 s, 1 iters, t-(init.)=6.35 s t(norm)=0.302792, mflops=16.513 (err=4.8e-14) 8. Burrus: elapsed time t=10.69 s, 1 iters, t-(init.)=10.54 s t(norm)=0.502586, mflops=9.94854 (err=4.5e-14) 9. Skipping fft (this transform size is too big for CWP). 10. Skipping fft (this transform size is too big for CWP). 11. Edelblute: elapsed time t=10.64 s, 1 iters, t-(init.)=10.49 s t(norm)=0.500202, mflops=9.99596 (err=4.5e-14) 12. 
FFTPACK: elapsed time t=4.86 s, 1 iters, t-(init.)=4.71 s t(norm)=0.22459, mflops=22.2628 (err=4.7e-14) 13. FFTPACK (f2c): elapsed time t=4.6 s, 1 iters, t-(init.)=4.45 s t(norm)=0.212193, mflops=23.5635 (err=4.7e-14) FFTW_MEASURE plan: (cost = 2.570000e+00) FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_TWIDDLE 2 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=2.49 s, 1 iters, t-(init.)=2.34 s t(norm)=0.11158, mflops=44.8109 (err=4.7e-14) FFTW_ESTIMATE plan: (cost = 1.195377e+07) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=3.55 s, 1 iters, t-(init.)=3.4 s t(norm)=0.162125, mflops=30.8405 (err=4.8e-14) 16. Frigo-old: elapsed time t=5.61 s, 1 iters, t-(init.)=5.46 s t(norm)=0.260353, mflops=19.2047 (err=4.8e-14) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=5.3 s, 1 iters, t-(init.)=5.14 s t(norm)=0.245094, mflops=20.4003 (err=4.7e-14) 19. GSL DIT: elapsed time t=9.95 s, 1 iters, t-(init.)=9.8 s t(norm)=0.4673, mflops=10.6998 (err=4.7e-14) 20. GSL DIF: elapsed time t=10.71 s, 1 iters, t-(init.)=10.56 s t(norm)=0.50354, mflops=9.9297 (err=4.7e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 23. Mayer (simple): elapsed time t=6.58 s, 1 iters, t-(init.)=6.42 s t(norm)=0.306129, mflops=16.333 24. Mayer (lookup): elapsed time t=6.76 s, 1 iters, t-(init.)=6.61 s t(norm)=0.315189, mflops=15.8635 (err=4.5e-14) 25. Monro: elapsed time t=8.17 s, 1 iters, t-(init.)=8.02 s t(norm)=0.382423, mflops=13.0745 (err=2.0e-07) 26. NAPACK (f2c): elapsed time t=8.67 s, 1 iters, t-(init.)=8.52 s t(norm)=0.406265, mflops=12.3072 (err=1.5e-11) 27. Nielsen: elapsed time t=8.45 s, 1 iters, t-(init.)=8.3 s t(norm)=0.395775, mflops=12.6334 (err=8.2e-12) 28. NR (C): elapsed time t=9.53 s, 1 iters, t-(init.)=9.38 s t(norm)=0.447273, mflops=11.1788 (err=4.7e-14) 29. 
NR (F): elapsed time t=9.68 s, 1 iters, t-(init.)=9.53 s t(norm)=0.454426, mflops=11.0029 (err=4.7e-14) 30. Ooura (C): elapsed time t=2.55 s, 1 iters, t-(init.)=2.4 s t(norm)=0.114441, mflops=43.6907 (err=4.7e-14) 31. Ooura (F): elapsed time t=2.77 s, 1 iters, t-(init.)=2.61 s t(norm)=0.124454, mflops=40.1753 (err=4.7e-14) 32. QFT: elapsed time t=7.78 s, 1 iters, t-(init.)=7.63 s t(norm)=0.363827, mflops=13.7428 (err=5.4e-14) 33. Ransom: elapsed time t=3.12 s, 1 iters, t-(init.)=2.97 s t(norm)=0.141621, mflops=35.3056 (err=5.0e-14) 34. SCIPORT: elapsed time t=15.09 s, 1 iters, t-(init.)=14.94 s t(norm)=0.712395, mflops=7.01858 (err=3.2e-07) 35. Singleton: elapsed time t=7.77 s, 1 iters, t-(init.)=7.62 s t(norm)=0.36335, mflops=13.7608 (err=6.3e-14) 36. Singleton (f2c): elapsed time t=5.72 s, 1 iters, t-(init.)=5.57 s t(norm)=0.265598, mflops=18.8254 (err=6.3e-14) 37. Sorensen: elapsed time t=6.71 s, 1 iters, t-(init.)=6.56 s t(norm)=0.312805, mflops=15.9844 (err=4.5e-14) 38. Sorensen DIT: elapsed time t=10.62 s, 1 iters, t-(init.)=10.47 s t(norm)=0.499249, mflops=10.0151 (err=4.5e-14) 39. Temperton: elapsed time t=5.48 s, 1 iters, t-(init.)=5.33 s t(norm)=0.254154, mflops=19.6731 (err=2.3e-07) 40. Temperton (f2c): elapsed time t=6.32 s, 1 iters, t-(init.)=6.17 s t(norm)=0.294209, mflops=16.9947 (err=4.7e-14) 41. Valkenburg: elapsed time t=25.44 s, 1 iters, t-(init.)=25.29 s t(norm)=1.20592, mflops=4.14621 (err=4.7e-14) Top mflops for N=1048576 = 44.8109 Normalized results and averages for N=1048576: fft 0: mflops = 13.9624 (norm. = 0.311585), norm. avg. (of 20) = 0.339181 fft 1: mflops = 13.9997 (norm. = 0.312417), norm. avg. (of 20) = 0.336389 fft 2: mflops = 10.5491 (norm. = 0.235412), norm. avg. (of 20) = 0.247396 fft 3: mflops = 18.9959 (norm. = 0.423913), norm. avg. (of 20) = 0.183378 fft 4: mflops = 13.7248 (norm. = 0.306283), norm. avg. (of 20) = 0.245733 fft 5: mflops = 6.94881 (norm. = 0.15507), norm. avg. (of 20) = 0.0902095 fft 6: mflops = 19.2047 (norm. 
= 0.428571), norm. avg. (of 20) = 0.290537 fft 7: mflops = 16.513 (norm. = 0.368504), norm. avg. (of 20) = 0.247988 fft 8: mflops = 9.94854 (norm. = 0.222011), norm. avg. (of 20) = 0.190708 fft 9: mflops = -1 (norm. = -0.022316), norm. avg. (of 19) = 0.69087 fft 10: mflops = -1 (norm. = -0.022316), norm. avg. (of 19) = 0.67782 fft 11: mflops = 9.99596 (norm. = 0.22307), norm. avg. (of 19) = 0.191778 fft 12: mflops = 22.2628 (norm. = 0.496815), norm. avg. (of 20) = 0.389588 fft 13: mflops = 23.5635 (norm. = 0.525843), norm. avg. (of 20) = 0.410918 fft 14: mflops = 44.8109 (norm. = 1), norm. avg. (of 20) = 0.834529 fft 15: mflops = 30.8405 (norm. = 0.688235), norm. avg. (of 20) = 0.703787 fft 16: mflops = 19.2047 (norm. = 0.428571), norm. avg. (of 20) = 0.607384 fft 17: mflops = -1 (norm. = -0.022316), norm. avg. (of 17) = 0.532128 fft 18: mflops = 20.4003 (norm. = 0.455253), norm. avg. (of 20) = 0.291322 fft 19: mflops = 10.6998 (norm. = 0.238776), norm. avg. (of 20) = 0.238157 fft 20: mflops = 9.9297 (norm. = 0.221591), norm. avg. (of 20) = 0.231136 fft 21: mflops = -1 (norm. = -0.022316), norm. avg. (of 12) = 0.477583 fft 22: mflops = -1 (norm. = -0.022316), norm. avg. (of 18) = 0.336207 fft 23: mflops = 16.333 (norm. = 0.364486), norm. avg. (of 19) = 0.397193 fft 24: mflops = 15.8635 (norm. = 0.354009), norm. avg. (of 19) = 0.400074 fft 25: mflops = 13.0745 (norm. = 0.291771), norm. avg. (of 19) = 0.242932 fft 26: mflops = 12.3072 (norm. = 0.274648), norm. avg. (of 20) = 0.17752 fft 27: mflops = 12.6334 (norm. = 0.281928), norm. avg. (of 20) = 0.193747 fft 28: mflops = 11.1788 (norm. = 0.249467), norm. avg. (of 20) = 0.277007 fft 29: mflops = 11.0029 (norm. = 0.24554), norm. avg. (of 20) = 0.236815 fft 30: mflops = 43.6907 (norm. = 0.975), norm. avg. (of 20) = 0.847453 fft 31: mflops = 40.1753 (norm. = 0.896552), norm. avg. (of 20) = 0.731189 fft 32: mflops = 13.7428 (norm. = 0.306684), norm. avg. (of 17) = 0.363166 fft 33: mflops = 35.3056 (norm. 
= 0.787879), norm. avg. (of 19) = 0.385935 fft 34: mflops = 7.01858 (norm. = 0.156627), norm. avg. (of 19) = 0.119388 fft 35: mflops = 13.7608 (norm. = 0.307087), norm. avg. (of 20) = 0.212065 fft 36: mflops = 18.8254 (norm. = 0.420108), norm. avg. (of 20) = 0.333119 fft 37: mflops = 15.9844 (norm. = 0.356707), norm. avg. (of 20) = 0.403713 fft 38: mflops = 10.0151 (norm. = 0.223496), norm. avg. (of 20) = 0.180858 fft 39: mflops = 19.6731 (norm. = 0.439024), norm. avg. (of 20) = 0.355804 fft 40: mflops = 16.9947 (norm. = 0.379254), norm. avg. (of 20) = 0.27677 fft 41: mflops = 4.14621 (norm. = 0.0925267), norm. avg. (of 20) = 0.0614078 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. Brenner 1. CWP (min N) 2. CWP (best N) 3. FFTPACK 4. FFTPACK (f2c) 5. FFTW 6. FFTW_ESTIMATE 7. Frigo-old 8. GSL 9. Nielsen 10. Singleton 11. Singleton (f2c) 12. Temperton 13. Temperton (f2c) 14. Valkenburg Computing normalized averages (15 transforms). Benchmarking for array size = 6: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.01 s, 262144 iters, t-(init.)=0.94 s t(norm)=0.231197, mflops=21.6266 2. CWP (best N) (N=15): elapsed time t=1.67 s, 262144 iters, t-(init.)=1.56 s t(norm)=0.383689, mflops=13.0314 3. FFTPACK: elapsed time t=1.28 s, 524288 iters, t-(init.)=1.16 s t(norm)=0.142654, mflops=35.05 (err=1.2e-16) 4. FFTPACK (f2c): elapsed time t=1.49 s, 524288 iters, t-(init.)=1.37 s t(norm)=0.168479, mflops=29.6773 (err=1.5e-16) FFTW_MEASURE plan: (cost = 9.918213e-07) FFTW_NOTW 6 5.
FFTW: elapsed time t=1.13 s, 1048576 iters, t-(init.)=0.88 s t(norm)=0.05411, mflops=92.4044 (err=8.7e-17) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 6. FFTW_ESTIMATE: elapsed time t=1.12 s, 1048576 iters, t-(init.)=0.87 s t(norm)=0.0534951, mflops=93.4665 (err=8.7e-17) 7. Frigo-old: elapsed time t=1.1 s, 262144 iters, t-(init.)=1.04 s t(norm)=0.255793, mflops=19.5471 (err=2.6e-16) 8. GSL: elapsed time t=1.81 s, 524288 iters, t-(init.)=1.68 s t(norm)=0.206602, mflops=24.2012 (err=8.2e-17) 9. Nielsen: elapsed time t=1.63 s, 131072 iters, t-(init.)=1.6 s t(norm)=0.787054, mflops=6.3528 (err=5.7e-16) 10. Singleton: elapsed time t=1.91 s, 262144 iters, t-(init.)=1.85 s t(norm)=0.455016, mflops=10.9886 (err=1.2e-16) 11. Singleton (f2c): elapsed time t=1.64 s, 262144 iters, t-(init.)=1.58 s t(norm)=0.388608, mflops=12.8664 (err=1.2e-16) 12. Temperton: elapsed time t=1.63 s, 262144 iters, t-(init.)=1.57 s t(norm)=0.386148, mflops=12.9484 (err=3.9e-09) 13. Temperton (f2c): elapsed time t=1.7 s, 262144 iters, t-(init.)=1.64 s t(norm)=0.403365, mflops=12.3957 (err=2.1e-16) 14. Valkenburg: elapsed time t=1.83 s, 131072 iters, t-(init.)=1.8 s t(norm)=0.885436, mflops=5.64694 (err=2.4e-16) Top mflops for N=6 = 93.4665 Normalized results and averages for N=6: fft 0: mflops = -1 (norm. = -0.010699), norm. avg. (of 0) = -1 fft 1: mflops = 21.6266 (norm. = 0.231383), norm. avg. (of 1) = 0.231383 fft 2: mflops = 13.0314 (norm. = 0.139423), norm. avg. (of 1) = 0.139423 fft 3: mflops = 35.05 (norm. = 0.375), norm. avg. (of 1) = 0.375 fft 4: mflops = 29.6773 (norm. = 0.317518), norm. avg. (of 1) = 0.317518 fft 5: mflops = 92.4044 (norm. = 0.988636), norm. avg. (of 1) = 0.988636 fft 6: mflops = 93.4665 (norm. = 1), norm. avg. (of 1) = 1 fft 7: mflops = 19.5471 (norm. = 0.209135), norm. avg. (of 1) = 0.209135 fft 8: mflops = 24.2012 (norm. = 0.258929), norm. avg. (of 1) = 0.258929 fft 9: mflops = 6.3528 (norm. = 0.0679688), norm. avg. 
(of 1) = 0.0679688 fft 10: mflops = 10.9886 (norm. = 0.117568), norm. avg. (of 1) = 0.117568 fft 11: mflops = 12.8664 (norm. = 0.137658), norm. avg. (of 1) = 0.137658 fft 12: mflops = 12.9484 (norm. = 0.138535), norm. avg. (of 1) = 0.138535 fft 13: mflops = 12.3957 (norm. = 0.132622), norm. avg. (of 1) = 0.132622 fft 14: mflops = 5.64694 (norm. = 0.0604167), norm. avg. (of 1) = 0.0604167 Benchmarking for array size = 9: 0. Brenner: elapsed time t=1.7 s, 65536 iters, t-(init.)=1.68 s t(norm)=0.898541, mflops=5.56458 (err=4.8e-16) 1. CWP (min N): elapsed time t=1.16 s, 262144 iters, t-(init.)=1.08 s t(norm)=0.144408, mflops=34.624 2. CWP (best N) (N=15): elapsed time t=1.67 s, 262144 iters, t-(init.)=1.56 s t(norm)=0.20859, mflops=23.9705 3. FFTPACK: elapsed time t=1.91 s, 524288 iters, t-(init.)=1.75 s t(norm)=0.116998, mflops=42.736 (err=1.1e-16) 4. FFTPACK (f2c): elapsed time t=1.05 s, 262144 iters, t-(init.)=0.97 s t(norm)=0.1297, mflops=38.5505 (err=2.5e-16) FFTW_MEASURE plan: (cost = 1.678467e-06) FFTW_NOTW 9 5. FFTW: elapsed time t=1.73 s, 1048576 iters, t-(init.)=1.41 s t(norm)=0.0471333, mflops=106.082 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.7 s, 1048576 iters, t-(init.)=1.39 s t(norm)=0.0464647, mflops=107.609 (err=2.1e-16) 7. Frigo-old: elapsed time t=1.22 s, 131072 iters, t-(init.)=1.17 s t(norm)=0.312885, mflops=15.9803 (err=3.1e-16) 8. GSL: elapsed time t=1.72 s, 262144 iters, t-(init.)=1.65 s t(norm)=0.220624, mflops=22.663 (err=1.7e-16) 9. Nielsen: elapsed time t=1.01 s, 65536 iters, t-(init.)=0.99 s t(norm)=0.529497, mflops=9.44292 (err=9.6e-16) 10. Singleton: elapsed time t=1.09 s, 131072 iters, t-(init.)=1.05 s t(norm)=0.280794, mflops=17.8066 (err=1.5e-16) 11. Singleton (f2c): elapsed time t=1.74 s, 262144 iters, t-(init.)=1.66 s t(norm)=0.221961, mflops=22.5265 (err=1.5e-16) 12. 
Temperton: elapsed time t=1.05 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.270097, mflops=18.5119 (err=1.9e-08) 13. Temperton (f2c): elapsed time t=1.05 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.270097, mflops=18.5119 (err=1.3e-16) 14. Valkenburg: elapsed time t=1.66 s, 65536 iters, t-(init.)=1.64 s t(norm)=0.877147, mflops=5.7003 (err=4.0e-16) Top mflops for N=9 = 107.609 Normalized results and averages for N=9: fft 0: mflops = 5.56458 (norm. = 0.0517113), norm. avg. (of 1) = 0.0517113 fft 1: mflops = 34.624 (norm. = 0.321759), norm. avg. (of 2) = 0.276571 fft 2: mflops = 23.9705 (norm. = 0.222756), norm. avg. (of 2) = 0.18109 fft 3: mflops = 42.736 (norm. = 0.397143), norm. avg. (of 2) = 0.386071 fft 4: mflops = 38.5505 (norm. = 0.358247), norm. avg. (of 2) = 0.337883 fft 5: mflops = 106.082 (norm. = 0.985816), norm. avg. (of 2) = 0.987226 fft 6: mflops = 107.609 (norm. = 1), norm. avg. (of 2) = 1 fft 7: mflops = 15.9803 (norm. = 0.148504), norm. avg. (of 2) = 0.178819 fft 8: mflops = 22.663 (norm. = 0.210606), norm. avg. (of 2) = 0.234767 fft 9: mflops = 9.44292 (norm. = 0.0877525), norm. avg. (of 2) = 0.0778606 fft 10: mflops = 17.8066 (norm. = 0.165476), norm. avg. (of 2) = 0.141522 fft 11: mflops = 22.5265 (norm. = 0.209337), norm. avg. (of 2) = 0.173498 fft 12: mflops = 18.5119 (norm. = 0.17203), norm. avg. (of 2) = 0.155282 fft 13: mflops = 18.5119 (norm. = 0.17203), norm. avg. (of 2) = 0.152326 fft 14: mflops = 5.7003 (norm. = 0.0529726), norm. avg. (of 2) = 0.0566946 Benchmarking for array size = 12: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.42 s, 262144 iters, t-(init.)=1.32 s t(norm)=0.117049, mflops=42.7171 2. CWP (best N) (N=15): elapsed time t=1.67 s, 262144 iters, t-(init.)=1.56 s t(norm)=0.138331, mflops=36.1452 3. FFTPACK: elapsed time t=1.15 s, 262144 iters, t-(init.)=1.05 s t(norm)=0.0931073, mflops=53.7015 (err=2.0e-16) 4. 
FFTPACK (f2c): elapsed time t=1.2 s, 262144 iters, t-(init.)=1.1 s t(norm)=0.0975409, mflops=51.2605 (err=2.6e-16) FFTW_MEASURE plan: (cost = 1.983643e-06) FFTW_NOTW 12 5. FFTW: elapsed time t=1.05 s, 524288 iters, t-(init.)=0.86 s t(norm)=0.0381296, mflops=131.132 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.08 s, 524288 iters, t-(init.)=0.89 s t(norm)=0.0394597, mflops=126.711 (err=1.4e-16) 7. Frigo-old: elapsed time t=1.07 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.179121, mflops=27.9142 (err=2.9e-16) 8. GSL: elapsed time t=1.65 s, 262144 iters, t-(init.)=1.56 s t(norm)=0.138331, mflops=36.1452 (err=3.2e-16) 9. Nielsen: elapsed time t=1.16 s, 65536 iters, t-(init.)=1.14 s t(norm)=0.404351, mflops=12.3655 (err=6.0e-16) 10. Singleton: elapsed time t=1.45 s, 131072 iters, t-(init.)=1.4 s t(norm)=0.248286, mflops=20.1381 (err=2.5e-16) 11. Singleton (f2c): elapsed time t=1.2 s, 131072 iters, t-(init.)=1.15 s t(norm)=0.203949, mflops=24.5159 (err=2.5e-16) 12. Temperton: elapsed time t=1.06 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.179121, mflops=27.9142 (err=8.2e-09) 13. Temperton (f2c): elapsed time t=1.22 s, 131072 iters, t-(init.)=1.17 s t(norm)=0.207496, mflops=24.0968 (err=1.7e-16) 14. Valkenburg: elapsed time t=1.23 s, 32768 iters, t-(init.)=1.21 s t(norm)=0.85836, mflops=5.82506 (err=4.2e-16) Top mflops for N=12 = 131.132 Normalized results and averages for N=12: fft 0: mflops = -1 (norm. = -0.00762593), norm. avg. (of 1) = 0.0517113 fft 1: mflops = 42.7171 (norm. = 0.325758), norm. avg. (of 3) = 0.292967 fft 2: mflops = 36.1452 (norm. = 0.275641), norm. avg. (of 3) = 0.212607 fft 3: mflops = 53.7015 (norm. = 0.409524), norm. avg. (of 3) = 0.393889 fft 4: mflops = 51.2605 (norm. = 0.390909), norm. avg. (of 3) = 0.355558 fft 5: mflops = 131.132 (norm. = 1), norm. avg. (of 3) = 0.991484 fft 6: mflops = 126.711 (norm. = 0.966292), norm. avg. (of 3) = 0.988764 fft 7: mflops = 27.9142 (norm. 
= 0.212871), norm. avg. (of 3) = 0.19017 fft 8: mflops = 36.1452 (norm. = 0.275641), norm. avg. (of 3) = 0.248392 fft 9: mflops = 12.3655 (norm. = 0.0942982), norm. avg. (of 3) = 0.0833398 fft 10: mflops = 20.1381 (norm. = 0.153571), norm. avg. (of 3) = 0.145538 fft 11: mflops = 24.5159 (norm. = 0.186957), norm. avg. (of 3) = 0.177984 fft 12: mflops = 27.9142 (norm. = 0.212871), norm. avg. (of 3) = 0.174479 fft 13: mflops = 24.0968 (norm. = 0.183761), norm. avg. (of 3) = 0.162804 fft 14: mflops = 5.82506 (norm. = 0.0444215), norm. avg. (of 3) = 0.0526036 Benchmarking for array size = 15: 0. Brenner: elapsed time t=1.26 s, 32768 iters, t-(init.)=1.25 s t(norm)=0.650935, mflops=7.68126 (err=4.1e-16) 1. CWP (min N): elapsed time t=1.67 s, 262144 iters, t-(init.)=1.56 s t(norm)=0.101546, mflops=49.2388 2. CWP (best N): elapsed time t=1.68 s, 262144 iters, t-(init.)=1.57 s t(norm)=0.102197, mflops=48.9252 3. FFTPACK: elapsed time t=1.63 s, 262144 iters, t-(init.)=1.52 s t(norm)=0.0989421, mflops=50.5346 (err=1.4e-16) 4. FFTPACK (f2c): elapsed time t=1.63 s, 262144 iters, t-(init.)=1.52 s t(norm)=0.0989421, mflops=50.5346 (err=3.0e-16) FFTW_MEASURE plan: (cost = 3.662109e-06) FFTW_NOTW 15 5. FFTW: elapsed time t=1.91 s, 524288 iters, t-(init.)=1.69 s t(norm)=0.055004, mflops=90.9025 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.91 s, 524288 iters, t-(init.)=1.68 s t(norm)=0.0546785, mflops=91.4436 (err=1.9e-16) 7. Frigo-old: elapsed time t=1.13 s, 65536 iters, t-(init.)=1.1 s t(norm)=0.286411, mflops=17.4574 (err=2.6e-16) 8. GSL: elapsed time t=1.83 s, 131072 iters, t-(init.)=1.77 s t(norm)=0.230431, mflops=21.6985 (err=1.4e-16) 9. Nielsen: elapsed time t=1.36 s, 65536 iters, t-(init.)=1.33 s t(norm)=0.346297, mflops=14.4385 (err=4.3e-15) 10. Singleton: elapsed time t=1.1 s, 65536 iters, t-(init.)=1.07 s t(norm)=0.2786, mflops=17.9469 (err=2.2e-16) 11. 
Singleton (f2c): elapsed time t=1.67 s, 131072 iters, t-(init.)=1.61 s t(norm)=0.209601, mflops=23.8548 (err=2.2e-16) 12. Temperton: elapsed time t=1.26 s, 131072 iters, t-(init.)=1.2 s t(norm)=0.156224, mflops=32.0052 (err=1.0e-08) 13. Temperton (f2c): elapsed time t=1.39 s, 131072 iters, t-(init.)=1.33 s t(norm)=0.173149, mflops=28.8769 (err=1.8e-16) 14. Valkenburg: elapsed time t=1.89 s, 32768 iters, t-(init.)=1.87 s t(norm)=0.973799, mflops=5.13453 (err=2.2e-16) Top mflops for N=15 = 91.4436 Normalized results and averages for N=15: fft 0: mflops = 7.68126 (norm. = 0.084), norm. avg. (of 2) = 0.0678557 fft 1: mflops = 49.2388 (norm. = 0.538462), norm. avg. (of 4) = 0.35434 fft 2: mflops = 48.9252 (norm. = 0.535032), norm. avg. (of 4) = 0.293213 fft 3: mflops = 50.5346 (norm. = 0.552632), norm. avg. (of 4) = 0.433575 fft 4: mflops = 50.5346 (norm. = 0.552632), norm. avg. (of 4) = 0.404827 fft 5: mflops = 90.9025 (norm. = 0.994083), norm. avg. (of 4) = 0.992134 fft 6: mflops = 91.4436 (norm. = 1), norm. avg. (of 4) = 0.991573 fft 7: mflops = 17.4574 (norm. = 0.190909), norm. avg. (of 4) = 0.190355 fft 8: mflops = 21.6985 (norm. = 0.237288), norm. avg. (of 4) = 0.245616 fft 9: mflops = 14.4385 (norm. = 0.157895), norm. avg. (of 4) = 0.101979 fft 10: mflops = 17.9469 (norm. = 0.196262), norm. avg. (of 4) = 0.158219 fft 11: mflops = 23.8548 (norm. = 0.26087), norm. avg. (of 4) = 0.198705 fft 12: mflops = 32.0052 (norm. = 0.35), norm. avg. (of 4) = 0.218359 fft 13: mflops = 28.8769 (norm. = 0.315789), norm. avg. (of 4) = 0.20105 fft 14: mflops = 5.13453 (norm. = 0.0561497), norm. avg. (of 4) = 0.0534901 Benchmarking for array size = 18: 0. Brenner: elapsed time t=1.76 s, 32768 iters, t-(init.)=1.74 s t(norm)=0.707455, mflops=7.06759 (err=4.1e-16) 1. CWP (min N): elapsed time t=1.07 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.102662, mflops=48.7034 2. CWP (best N) (N=28): elapsed time t=1.27 s, 131072 iters, t-(init.)=1.18 s t(norm)=0.119942, mflops=41.6868 3. 
FFTPACK: elapsed time t=1.22 s, 131072 iters, t-(init.)=1.15 s t(norm)=0.116893, mflops=42.7743 (err=2.5e-16) 4. FFTPACK (f2c): elapsed time t=1.38 s, 131072 iters, t-(init.)=1.32 s t(norm)=0.134172, mflops=37.2655 (err=2.8e-16) FFTW_MEASURE plan: (cost = 6.713867e-06) FFTW_TWIDDLE 9 FFTW_NOTW 2 5. FFTW: elapsed time t=1.77 s, 262144 iters, t-(init.)=1.64 s t(norm)=0.0833495, mflops=59.9883 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.64 s, 262144 iters, t-(init.)=1.51 s t(norm)=0.0767426, mflops=65.1529 (err=2.2e-16) 7. Frigo-old: elapsed time t=1.38 s, 65536 iters, t-(init.)=1.35 s t(norm)=0.274444, mflops=18.2187 (err=3.5e-16) 8. GSL: elapsed time t=1.28 s, 131072 iters, t-(init.)=1.22 s t(norm)=0.124008, mflops=40.32 (err=2.1e-16) 9. Nielsen: elapsed time t=1.08 s, 32768 iters, t-(init.)=1.06 s t(norm)=0.430978, mflops=11.6015 (err=8.7e-16) 10. Singleton: elapsed time t=1.07 s, 65536 iters, t-(init.)=1.04 s t(norm)=0.211423, mflops=23.6492 (err=2.1e-16) 11. Singleton (f2c): elapsed time t=1.64 s, 131072 iters, t-(init.)=1.57 s t(norm)=0.159584, mflops=31.3315 (err=2.1e-16) 12. Temperton: elapsed time t=1 s, 65536 iters, t-(init.)=0.97 s t(norm)=0.197193, mflops=25.3559 (err=4.5e-08) 13. Temperton (f2c): elapsed time t=1.12 s, 65536 iters, t-(init.)=1.09 s t(norm)=0.221588, mflops=22.5644 (err=2.6e-16) 14. Valkenburg: elapsed time t=1.06 s, 16384 iters, t-(init.)=1.05 s t(norm)=0.853824, mflops=5.856 (err=4.2e-16) Top mflops for N=18 = 65.1529 Normalized results and averages for N=18: fft 0: mflops = 7.06759 (norm. = 0.108477), norm. avg. (of 3) = 0.0813961 fft 1: mflops = 48.7034 (norm. = 0.747525), norm. avg. (of 5) = 0.432977 fft 2: mflops = 41.6868 (norm. = 0.639831), norm. avg. (of 5) = 0.362537 fft 3: mflops = 42.7743 (norm. = 0.656522), norm. avg. (of 5) = 0.478164 fft 4: mflops = 37.2655 (norm. = 0.57197), norm. avg. (of 5) = 0.438255 fft 5: mflops = 59.9883 (norm. 
= 0.920732), norm. avg. (of 5) = 0.977853 fft 6: mflops = 65.1529 (norm. = 1), norm. avg. (of 5) = 0.993258 fft 7: mflops = 18.2187 (norm. = 0.27963), norm. avg. (of 5) = 0.20821 fft 8: mflops = 40.32 (norm. = 0.618852), norm. avg. (of 5) = 0.320263 fft 9: mflops = 11.6015 (norm. = 0.178066), norm. avg. (of 5) = 0.117196 fft 10: mflops = 23.6492 (norm. = 0.362981), norm. avg. (of 5) = 0.199172 fft 11: mflops = 31.3315 (norm. = 0.480892), norm. avg. (of 5) = 0.255143 fft 12: mflops = 25.3559 (norm. = 0.389175), norm. avg. (of 5) = 0.252522 fft 13: mflops = 22.5644 (norm. = 0.34633), norm. avg. (of 5) = 0.230106 fft 14: mflops = 5.856 (norm. = 0.089881), norm. avg. (of 5) = 0.0607683 Benchmarking for array size = 24: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.15 s, 131072 iters, t-(init.)=1.07 s t(norm)=0.0741868, mflops=67.3974 2. CWP (best N) (N=28): elapsed time t=1.27 s, 131072 iters, t-(init.)=1.17 s t(norm)=0.0811202, mflops=61.6369 3. FFTPACK: elapsed time t=1.52 s, 131072 iters, t-(init.)=1.44 s t(norm)=0.0998402, mflops=50.08 (err=1.7e-16) 4. FFTPACK (f2c): elapsed time t=1.7 s, 131072 iters, t-(init.)=1.62 s t(norm)=0.11232, mflops=44.5156 (err=2.5e-16) FFTW_MEASURE plan: (cost = 7.934570e-06) FFTW_TWIDDLE 8 FFTW_NOTW 3 5. FFTW: elapsed time t=1.06 s, 131072 iters, t-(init.)=0.98 s t(norm)=0.0679468, mflops=73.587 (err=2.3e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1 s, 131072 iters, t-(init.)=0.92 s t(norm)=0.0637868, mflops=78.3861 (err=2.1e-16) 7. Frigo-old: elapsed time t=1.4 s, 65536 iters, t-(init.)=1.35 s t(norm)=0.1872, mflops=26.7093 (err=3.8e-16) 8. GSL: elapsed time t=1.82 s, 131072 iters, t-(init.)=1.74 s t(norm)=0.12064, mflops=41.4455 (err=2.1e-16) 9. Nielsen: elapsed time t=1.08 s, 32768 iters, t-(init.)=1.06 s t(norm)=0.293974, mflops=17.0083 (err=1.7e-15) 10. 
Singleton: elapsed time t=1.53 s, 65536 iters, t-(init.)=1.49 s t(norm)=0.206614, mflops=24.1997 (err=2.1e-16) 11. Singleton (f2c): elapsed time t=1.22 s, 65536 iters, t-(init.)=1.18 s t(norm)=0.163627, mflops=30.5573 (err=2.1e-16) 12. Temperton: elapsed time t=1.02 s, 65536 iters, t-(init.)=0.98 s t(norm)=0.135894, mflops=36.7935 (err=8.0e-09) 13. Temperton (f2c): elapsed time t=1.22 s, 65536 iters, t-(init.)=1.18 s t(norm)=0.163627, mflops=30.5573 (err=2.4e-16) 14. Valkenburg: elapsed time t=1.54 s, 16384 iters, t-(init.)=1.53 s t(norm)=0.848642, mflops=5.89177 (err=6.1e-16) Top mflops for N=24 = 78.3861 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.0127574), norm. avg. (of 3) = 0.0813961 fft 1: mflops = 67.3974 (norm. = 0.859813), norm. avg. (of 6) = 0.504117 fft 2: mflops = 61.6369 (norm. = 0.786325), norm. avg. (of 6) = 0.433168 fft 3: mflops = 50.08 (norm. = 0.638889), norm. avg. (of 6) = 0.504951 fft 4: mflops = 44.5156 (norm. = 0.567901), norm. avg. (of 6) = 0.459863 fft 5: mflops = 73.587 (norm. = 0.938776), norm. avg. (of 6) = 0.97134 fft 6: mflops = 78.3861 (norm. = 1), norm. avg. (of 6) = 0.994382 fft 7: mflops = 26.7093 (norm. = 0.340741), norm. avg. (of 6) = 0.230298 fft 8: mflops = 41.4455 (norm. = 0.528736), norm. avg. (of 6) = 0.355009 fft 9: mflops = 17.0083 (norm. = 0.216981), norm. avg. (of 6) = 0.133827 fft 10: mflops = 24.1997 (norm. = 0.308725), norm. avg. (of 6) = 0.21743 fft 11: mflops = 30.5573 (norm. = 0.389831), norm. avg. (of 6) = 0.277591 fft 12: mflops = 36.7935 (norm. = 0.469388), norm. avg. (of 6) = 0.288667 fft 13: mflops = 30.5573 (norm. = 0.389831), norm. avg. (of 6) = 0.256727 fft 14: mflops = 5.89177 (norm. = 0.0751634), norm. avg. (of 6) = 0.0631675 Benchmarking for array size = 36: 0. Brenner: elapsed time t=1.8 s, 16384 iters, t-(init.)=1.78 s t(norm)=0.583732, mflops=8.56558 (err=1.4e-15) 1. CWP (min N): elapsed time t=1.78 s, 131072 iters, t-(init.)=1.67 s t(norm)=0.0684573, mflops=73.0382 2. 
CWP (best N): elapsed time t=1.79 s, 131072 iters, t-(init.)=1.68 s t(norm)=0.0688672, mflops=72.6035 3. FFTPACK: elapsed time t=1.21 s, 65536 iters, t-(init.)=1.15 s t(norm)=0.0942825, mflops=53.0321 (err=4.3e-16) 4. FFTPACK (f2c): elapsed time t=1.3 s, 65536 iters, t-(init.)=1.24 s t(norm)=0.101661, mflops=49.183 (err=1.1e-15) FFTW_MEASURE plan: (cost = 1.159668e-05) FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_NOTW 9 5. FFTW: elapsed time t=1.53 s, 131072 iters, t-(init.)=1.42 s t(norm)=0.0582092, mflops=85.8971 (err=6.2e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.98 s, 131072 iters, t-(init.)=1.86 s t(norm)=0.0762459, mflops=65.5773 (err=6.2e-16) 7. Frigo-old: elapsed time t=1.39 s, 32768 iters, t-(init.)=1.36 s t(norm)=0.222999, mflops=22.4217 (err=6.6e-16) 8. GSL: elapsed time t=1.05 s, 65536 iters, t-(init.)=0.99 s t(norm)=0.0811649, mflops=61.6029 (err=4.2e-16) 9. Nielsen: elapsed time t=1.76 s, 32768 iters, t-(init.)=1.73 s t(norm)=0.283667, mflops=17.6263 (err=1.6e-15) 10. Singleton: elapsed time t=1.12 s, 32768 iters, t-(init.)=1.09 s t(norm)=0.178727, mflops=27.9756 (err=4.2e-16) 11. Singleton (f2c): elapsed time t=1.61 s, 65536 iters, t-(init.)=1.55 s t(norm)=0.127076, mflops=39.3464 (err=4.1e-16) 12. Temperton: elapsed time t=1.65 s, 65536 iters, t-(init.)=1.59 s t(norm)=0.130356, mflops=38.3566 (err=6.4e-08) 13. Temperton (f2c): elapsed time t=1.96 s, 65536 iters, t-(init.)=1.9 s t(norm)=0.155771, mflops=32.0984 (err=3.5e-16) 14. Valkenburg: elapsed time t=1.29 s, 8192 iters, t-(init.)=1.28 s t(norm)=0.839524, mflops=5.95575 (err=8.4e-16) Top mflops for N=36 = 85.8971 Normalized results and averages for N=36: fft 0: mflops = 8.56558 (norm. = 0.0997191), norm. avg. (of 4) = 0.0859769 fft 1: mflops = 73.0382 (norm. = 0.850299), norm. avg. (of 7) = 0.553571 fft 2: mflops = 72.6035 (norm. = 0.845238), norm. avg. (of 7) = 0.492035 fft 3: mflops = 53.0321 (norm. = 0.617391), norm. avg. 
(of 7) = 0.521014 fft 4: mflops = 49.183 (norm. = 0.572581), norm. avg. (of 7) = 0.475965 fft 5: mflops = 85.8971 (norm. = 1), norm. avg. (of 7) = 0.975435 fft 6: mflops = 65.5773 (norm. = 0.763441), norm. avg. (of 7) = 0.96139 fft 7: mflops = 22.4217 (norm. = 0.261029), norm. avg. (of 7) = 0.234688 fft 8: mflops = 61.6029 (norm. = 0.717172), norm. avg. (of 7) = 0.406746 fft 9: mflops = 17.6263 (norm. = 0.205202), norm. avg. (of 7) = 0.144023 fft 10: mflops = 27.9756 (norm. = 0.325688), norm. avg. (of 7) = 0.232896 fft 11: mflops = 39.3464 (norm. = 0.458065), norm. avg. (of 7) = 0.303373 fft 12: mflops = 38.3566 (norm. = 0.446541), norm. avg. (of 7) = 0.31122 fft 13: mflops = 32.0984 (norm. = 0.373684), norm. avg. (of 7) = 0.273435 fft 14: mflops = 5.95575 (norm. = 0.0693359), norm. avg. (of 7) = 0.0640487 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.54 s, 8192 iters, t-(init.)=1.52 s t(norm)=0.366872, mflops=13.6287 (err=5.1e-16) 1. CWP (min N): elapsed time t=1.53 s, 65536 iters, t-(init.)=1.41 s t(norm)=0.0425402, mflops=117.536 2. CWP (best N) (N=84): elapsed time t=1.97 s, 65536 iters, t-(init.)=1.85 s t(norm)=0.0558152, mflops=89.5814 3. FFTPACK: elapsed time t=1.39 s, 32768 iters, t-(init.)=1.33 s t(norm)=0.0802532, mflops=62.3028 (err=4.3e-16) 4. FFTPACK (f2c): elapsed time t=1.33 s, 32768 iters, t-(init.)=1.27 s t(norm)=0.0766327, mflops=65.2463 (err=4.7e-16) FFTW_MEASURE plan: (cost = 3.051758e-05) FFTW_TWIDDLE 2 FFTW_TWIDDLE 5 FFTW_NOTW 8 5. FFTW: elapsed time t=1 s, 32768 iters, t-(init.)=0.94 s t(norm)=0.0567203, mflops=88.1519 (err=2.6e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 6. FFTW_ESTIMATE: elapsed time t=1.26 s, 32768 iters, t-(init.)=1.2 s t(norm)=0.0724089, mflops=69.0523 (err=4.6e-16) 7. Frigo-old: elapsed time t=1.18 s, 16384 iters, t-(init.)=1.15 s t(norm)=0.138784, mflops=36.0273 (err=3.3e-16) 8. 
GSL: elapsed time t=1.48 s, 16384 iters, t-(init.)=1.45 s t(norm)=0.174988, mflops=28.5734 (err=4.1e-16) 9. Nielsen: elapsed time t=1.61 s, 16384 iters, t-(init.)=1.58 s t(norm)=0.190677, mflops=26.2224 (err=8.1e-15) 10. Singleton: elapsed time t=1.33 s, 16384 iters, t-(init.)=1.3 s t(norm)=0.156886, mflops=31.8703 (err=4.3e-16) 11. Singleton (f2c): elapsed time t=1.86 s, 32768 iters, t-(init.)=1.8 s t(norm)=0.108613, mflops=46.0349 (err=3.5e-16) 12. Temperton: elapsed time t=1.27 s, 32768 iters, t-(init.)=1.21 s t(norm)=0.0730123, mflops=68.4816 (err=1.7e-07) 13. Temperton (f2c): elapsed time t=1.88 s, 32768 iters, t-(init.)=1.82 s t(norm)=0.10982, mflops=45.529 (err=4.0e-16) 14. Valkenburg: elapsed time t=1.84 s, 4096 iters, t-(init.)=1.83 s t(norm)=0.883388, mflops=5.66003 (err=5.4e-16) Top mflops for N=80 = 117.536 Normalized results and averages for N=80: fft 0: mflops = 13.6287 (norm. = 0.115954), norm. avg. (of 5) = 0.0919723 fft 1: mflops = 117.536 (norm. = 1), norm. avg. (of 8) = 0.609375 fft 2: mflops = 89.5814 (norm. = 0.762162), norm. avg. (of 8) = 0.525801 fft 3: mflops = 62.3028 (norm. = 0.530075), norm. avg. (of 8) = 0.522147 fft 4: mflops = 65.2463 (norm. = 0.555118), norm. avg. (of 8) = 0.48586 fft 5: mflops = 88.1519 (norm. = 0.75), norm. avg. (of 8) = 0.947255 fft 6: mflops = 69.0523 (norm. = 0.5875), norm. avg. (of 8) = 0.914654 fft 7: mflops = 36.0273 (norm. = 0.306522), norm. avg. (of 8) = 0.243668 fft 8: mflops = 28.5734 (norm. = 0.243103), norm. avg. (of 8) = 0.386291 fft 9: mflops = 26.2224 (norm. = 0.223101), norm. avg. (of 8) = 0.153908 fft 10: mflops = 31.8703 (norm. = 0.271154), norm. avg. (of 8) = 0.237678 fft 11: mflops = 46.0349 (norm. = 0.391667), norm. avg. (of 8) = 0.314409 fft 12: mflops = 68.4816 (norm. = 0.582645), norm. avg. (of 8) = 0.345148 fft 13: mflops = 45.529 (norm. = 0.387363), norm. avg. (of 8) = 0.287676 fft 14: mflops = 5.66003 (norm. = 0.0481557), norm. avg. (of 8) = 0.0620621 Benchmarking for array size = 108: 0. 
Brenner: elapsed time t=1.61 s, 4096 iters, t-(init.)=1.6 s t(norm)=0.535449, mflops=9.33796 (err=8.7e-16) 1. CWP (min N) (N=110): elapsed time t=1.4 s, 32768 iters, t-(init.)=1.32 s t(norm)=0.0552182, mflops=90.5499 2. CWP (best N) (N=112): elapsed time t=1.15 s, 32768 iters, t-(init.)=1.07 s t(norm)=0.0447602, mflops=111.706 3. FFTPACK: elapsed time t=1.99 s, 32768 iters, t-(init.)=1.91 s t(norm)=0.079899, mflops=62.579 (err=3.4e-16) 4. FFTPACK (f2c): elapsed time t=1.06 s, 16384 iters, t-(init.)=1.02 s t(norm)=0.0853372, mflops=58.5911 (err=7.1e-16) FFTW_MEASURE plan: (cost = 4.150391e-05) FFTW_TWIDDLE 4 FFTW_TWIDDLE 3 FFTW_NOTW 9 5. FFTW: elapsed time t=1.39 s, 32768 iters, t-(init.)=1.31 s t(norm)=0.0547999, mflops=91.2411 (err=3.6e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.55 s, 32768 iters, t-(init.)=1.48 s t(norm)=0.0619113, mflops=80.7607 (err=3.0e-16) 7. Frigo-old: elapsed time t=1.49 s, 8192 iters, t-(init.)=1.47 s t(norm)=0.245972, mflops=20.3275 (err=5.6e-16) 8. GSL: elapsed time t=1.06 s, 16384 iters, t-(init.)=1.02 s t(norm)=0.0853372, mflops=58.5911 (err=3.2e-16) 9. Nielsen: elapsed time t=1.47 s, 8192 iters, t-(init.)=1.45 s t(norm)=0.242625, mflops=20.6079 (err=1.2e-15) 10. Singleton: elapsed time t=1.74 s, 16384 iters, t-(init.)=1.7 s t(norm)=0.142229, mflops=35.1547 (err=3.3e-16) 11. Singleton (f2c): elapsed time t=1.21 s, 16384 iters, t-(init.)=1.17 s t(norm)=0.0978868, mflops=51.0794 (err=3.3e-16) 12. Temperton: elapsed time t=1.57 s, 16384 iters, t-(init.)=1.53 s t(norm)=0.128006, mflops=39.0607 (err=1.0e-07) 13. Temperton (f2c): elapsed time t=1.91 s, 16384 iters, t-(init.)=1.87 s t(norm)=0.156452, mflops=31.9588 (err=3.1e-16) 14. Valkenburg: elapsed time t=1.24 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.823253, mflops=6.07347 (err=6.6e-16) Top mflops for N=108 = 111.706 Normalized results and averages for N=108: fft 0: mflops = 9.33796 (norm. = 0.0835937), norm. avg. 
(of 6) = 0.0905759 fft 1: mflops = 90.5499 (norm. = 0.810606), norm. avg. (of 9) = 0.631734 fft 2: mflops = 111.706 (norm. = 1), norm. avg. (of 9) = 0.57849 fft 3: mflops = 62.579 (norm. = 0.560209), norm. avg. (of 9) = 0.526376 fft 4: mflops = 58.5911 (norm. = 0.52451), norm. avg. (of 9) = 0.490154 fft 5: mflops = 91.2411 (norm. = 0.816794), norm. avg. (of 9) = 0.93276 fft 6: mflops = 80.7607 (norm. = 0.722973), norm. avg. (of 9) = 0.893356 fft 7: mflops = 20.3275 (norm. = 0.181973), norm. avg. (of 9) = 0.236813 fft 8: mflops = 58.5911 (norm. = 0.52451), norm. avg. (of 9) = 0.401649 fft 9: mflops = 20.6079 (norm. = 0.184483), norm. avg. (of 9) = 0.157305 fft 10: mflops = 35.1547 (norm. = 0.314706), norm. avg. (of 9) = 0.246237 fft 11: mflops = 51.0794 (norm. = 0.457265), norm. avg. (of 9) = 0.330282 fft 12: mflops = 39.0607 (norm. = 0.349673), norm. avg. (of 9) = 0.345651 fft 13: mflops = 31.9588 (norm. = 0.286096), norm. avg. (of 9) = 0.287501 fft 14: mflops = 6.07347 (norm. = 0.0543699), norm. avg. (of 9) = 0.0612074 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1.36 s, 2048 iters, t-(init.)=1.35 s t(norm)=0.406903, mflops=12.2879 (err=6.2e-16) 1. CWP (min N): elapsed time t=1.42 s, 16384 iters, t-(init.)=1.35 s t(norm)=0.0508629, mflops=98.3035 2. CWP (best N): elapsed time t=1.4 s, 16384 iters, t-(init.)=1.32 s t(norm)=0.0497326, mflops=100.538 3. FFTPACK: elapsed time t=1.36 s, 8192 iters, t-(init.)=1.32 s t(norm)=0.0994652, mflops=50.2688 (err=3.7e-16) 4. FFTPACK (f2c): elapsed time t=1.81 s, 8192 iters, t-(init.)=1.77 s t(norm)=0.133374, mflops=37.4886 (err=4.2e-16) FFTW_MEASURE plan: (cost = 1.123047e-04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_NOTW 10 5. FFTW: elapsed time t=1.82 s, 16384 iters, t-(init.)=1.74 s t(norm)=0.0655566, mflops=76.2699 (err=3.0e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. 
FFTW_ESTIMATE: elapsed time t=1.87 s, 16384 iters, t-(init.)=1.8 s t(norm)=0.0678172, mflops=73.7276 (err=3.5e-16) 7. Frigo-old: elapsed time t=1.84 s, 4096 iters, t-(init.)=1.82 s t(norm)=0.274283, mflops=18.2294 (err=4.6e-16) 8. GSL: elapsed time t=1.29 s, 4096 iters, t-(init.)=1.27 s t(norm)=0.191395, mflops=26.124 (err=3.6e-16) 9. Nielsen: elapsed time t=1.59 s, 4096 iters, t-(init.)=1.57 s t(norm)=0.236607, mflops=21.1321 (err=8.6e-15) 10. Singleton: elapsed time t=1.06 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.156733, mflops=31.9014 (err=3.4e-16) 11. Singleton (f2c): elapsed time t=1.92 s, 8192 iters, t-(init.)=1.89 s t(norm)=0.142416, mflops=35.1084 (err=3.4e-16) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.73 s, 1024 iters, t-(init.)=1.72 s t(norm)=1.03685, mflops=4.8223 (err=6.1e-16) Top mflops for N=210 = 100.538 Normalized results and averages for N=210: fft 0: mflops = 12.2879 (norm. = 0.122222), norm. avg. (of 7) = 0.0950968 fft 1: mflops = 98.3035 (norm. = 0.977778), norm. avg. (of 10) = 0.666338 fft 2: mflops = 100.538 (norm. = 1), norm. avg. (of 10) = 0.620641 fft 3: mflops = 50.2688 (norm. = 0.5), norm. avg. (of 10) = 0.523738 fft 4: mflops = 37.4886 (norm. = 0.372881), norm. avg. (of 10) = 0.478427 fft 5: mflops = 76.2699 (norm. = 0.758621), norm. avg. (of 10) = 0.915346 fft 6: mflops = 73.7276 (norm. = 0.733333), norm. avg. (of 10) = 0.877354 fft 7: mflops = 18.2294 (norm. = 0.181319), norm. avg. (of 10) = 0.231263 fft 8: mflops = 26.124 (norm. = 0.259843), norm. avg. (of 10) = 0.387468 fft 9: mflops = 21.1321 (norm. = 0.210191), norm. avg. (of 10) = 0.162594 fft 10: mflops = 31.9014 (norm. = 0.317308), norm. avg. (of 10) = 0.253344 fft 11: mflops = 35.1084 (norm. = 0.349206), norm. avg. (of 10) = 0.332175 fft 12: mflops = -1 (norm. = -0.00994652), norm. avg. (of 9) = 0.345651 fft 13: mflops = -1 (norm. = -0.00994652), norm. avg. 
(of 9) = 0.287501 fft 14: mflops = 4.8223 (norm. = 0.0479651), norm. avg. (of 10) = 0.0598832 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.8 s, 1024 iters, t-(init.)=1.79 s t(norm)=0.386347, mflops=12.9417 (err=6.6e-16) 1. CWP (min N): elapsed time t=1.74 s, 8192 iters, t-(init.)=1.65 s t(norm)=0.0445163, mflops=112.319 2. CWP (best N): elapsed time t=1.74 s, 8192 iters, t-(init.)=1.65 s t(norm)=0.0445163, mflops=112.319 3. FFTPACK: elapsed time t=1.02 s, 2048 iters, t-(init.)=1 s t(norm)=0.107918, mflops=46.3314 (err=4.6e-16) 4. FFTPACK (f2c): elapsed time t=1.25 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.132739, mflops=37.6678 (err=6.0e-16) FFTW_MEASURE plan: (cost = 2.832031e-04) FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_NOTW 14 5. FFTW: elapsed time t=1.22 s, 4096 iters, t-(init.)=1.17 s t(norm)=0.0631322, mflops=79.1989 (err=4.8e-16) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.34 s, 4096 iters, t-(init.)=1.29 s t(norm)=0.0696072, mflops=71.8316 (err=4.5e-16) 7. Frigo-old: elapsed time t=1.19 s, 1024 iters, t-(init.)=1.18 s t(norm)=0.254687, mflops=19.6319 (err=6.0e-16) 8. GSL: elapsed time t=1.38 s, 2048 iters, t-(init.)=1.36 s t(norm)=0.146769, mflops=34.0672 (err=5.7e-16) 9. Nielsen: elapsed time t=1.15 s, 1024 iters, t-(init.)=1.14 s t(norm)=0.246054, mflops=20.3208 (err=5.4e-15) 10. Singleton: elapsed time t=1.23 s, 2048 iters, t-(init.)=1.21 s t(norm)=0.130581, mflops=38.2904 (err=6.6e-16) 11. Singleton (f2c): elapsed time t=1.11 s, 2048 iters, t-(init.)=1.09 s t(norm)=0.117631, mflops=42.5059 (err=6.6e-16) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.11 s, 256 iters, t-(init.)=1.11 s t(norm)=0.958314, mflops=5.2175 (err=7.6e-16) Top mflops for N=504 = 112.319 Normalized results and averages for N=504: fft 0: mflops = 12.9417 (norm. 
= 0.115223), norm. avg. (of 8) = 0.0976126 fft 1: mflops = 112.319 (norm. = 1), norm. avg. (of 11) = 0.696671 fft 2: mflops = 112.319 (norm. = 1), norm. avg. (of 11) = 0.655128 fft 3: mflops = 46.3314 (norm. = 0.4125), norm. avg. (of 11) = 0.513626 fft 4: mflops = 37.6678 (norm. = 0.335366), norm. avg. (of 11) = 0.465421 fft 5: mflops = 79.1989 (norm. = 0.705128), norm. avg. (of 11) = 0.896235 fft 6: mflops = 71.8316 (norm. = 0.639535), norm. avg. (of 11) = 0.855734 fft 7: mflops = 19.6319 (norm. = 0.174788), norm. avg. (of 11) = 0.226129 fft 8: mflops = 34.0672 (norm. = 0.303309), norm. avg. (of 11) = 0.379817 fft 9: mflops = 20.3208 (norm. = 0.180921), norm. avg. (of 11) = 0.16426 fft 10: mflops = 38.2904 (norm. = 0.340909), norm. avg. (of 11) = 0.261304 fft 11: mflops = 42.5059 (norm. = 0.37844), norm. avg. (of 11) = 0.336381 fft 12: mflops = -1 (norm. = -0.00890325), norm. avg. (of 9) = 0.345651 fft 13: mflops = -1 (norm. = -0.00890325), norm. avg. (of 9) = 0.287501 fft 14: mflops = 5.2175 (norm. = 0.0464527), norm. avg. (of 11) = 0.0586622 Benchmarking for array size = 1000: 0. Brenner: elapsed time t=1 s, 256 iters, t-(init.)=1 s t(norm)=0.391966, mflops=12.7562 (err=8.0e-16) 1. CWP (min N) (N=1001): elapsed time t=1.16 s, 2048 iters, t-(init.)=1.1 s t(norm)=0.0538953, mflops=92.7724 2. CWP (best N) (N=1008): elapsed time t=1.88 s, 4096 iters, t-(init.)=1.77 s t(norm)=0.0433613, mflops=115.31 3. FFTPACK: elapsed time t=1.18 s, 1024 iters, t-(init.)=1.16 s t(norm)=0.11367, mflops=43.9869 (err=6.1e-16) 4. FFTPACK (f2c): elapsed time t=1.22 s, 1024 iters, t-(init.)=1.19 s t(norm)=0.11661, mflops=42.878 (err=7.8e-16) FFTW_MEASURE plan: (cost = 8.203125e-04) FFTW_TWIDDLE 2 FFTW_TWIDDLE 5 FFTW_TWIDDLE 2 FFTW_TWIDDLE 5 FFTW_NOTW 10 5. FFTW: elapsed time t=1.66 s, 2048 iters, t-(init.)=1.6 s t(norm)=0.0783932, mflops=63.781 (err=5.9e-16) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. 
FFTW_ESTIMATE: elapsed time t=1.62 s, 2048 iters, t-(init.)=1.57 s t(norm)=0.0769234, mflops=64.9998 (err=6.3e-16) 7. Frigo-old: elapsed time t=1.31 s, 512 iters, t-(init.)=1.3 s t(norm)=0.254778, mflops=19.6249 (err=6.3e-16) 8. GSL: elapsed time t=1.07 s, 512 iters, t-(init.)=1.06 s t(norm)=0.207742, mflops=24.0683 (err=6.3e-16) 9. Nielsen: elapsed time t=1.7 s, 1024 iters, t-(init.)=1.67 s t(norm)=0.163646, mflops=30.5538 (err=1.3e-14) 10. Singleton: elapsed time t=1.36 s, 1024 iters, t-(init.)=1.34 s t(norm)=0.131309, mflops=38.0782 (err=9.0e-16) 11. Singleton (f2c): elapsed time t=1.88 s, 2048 iters, t-(init.)=1.82 s t(norm)=0.0891723, mflops=56.0712 (err=9.1e-16) 12. Temperton: elapsed time t=1.52 s, 2048 iters, t-(init.)=1.47 s t(norm)=0.0720238, mflops=69.4215 (err=1.1e-07) 13. Temperton (f2c): elapsed time t=1.07 s, 1024 iters, t-(init.)=1.05 s t(norm)=0.102891, mflops=48.5951 (err=6.4e-16) 14. Valkenburg: elapsed time t=1.26 s, 128 iters, t-(init.)=1.26 s t(norm)=0.987755, mflops=5.06199 (err=7.2e-16) Top mflops for N=1000 = 115.31 Normalized results and averages for N=1000: fft 0: mflops = 12.7562 (norm. = 0.110625), norm. avg. (of 9) = 0.0990584 fft 1: mflops = 92.7724 (norm. = 0.804545), norm. avg. (of 12) = 0.705661 fft 2: mflops = 115.31 (norm. = 1), norm. avg. (of 12) = 0.683867 fft 3: mflops = 43.9869 (norm. = 0.381466), norm. avg. (of 12) = 0.502613 fft 4: mflops = 42.878 (norm. = 0.371849), norm. avg. (of 12) = 0.457623 fft 5: mflops = 63.781 (norm. = 0.553125), norm. avg. (of 12) = 0.867642 fft 6: mflops = 64.9998 (norm. = 0.563694), norm. avg. (of 12) = 0.831397 fft 7: mflops = 19.6249 (norm. = 0.170192), norm. avg. (of 12) = 0.221468 fft 8: mflops = 24.0683 (norm. = 0.208726), norm. avg. (of 12) = 0.36556 fft 9: mflops = 30.5538 (norm. = 0.26497), norm. avg. (of 12) = 0.172652 fft 10: mflops = 38.0782 (norm. = 0.330224), norm. avg. (of 12) = 0.267048 fft 11: mflops = 56.0712 (norm. = 0.486264), norm. avg. 
(of 12) = 0.348871 fft 12: mflops = 69.4215 (norm. = 0.602041), norm. avg. (of 10) = 0.37129 fft 13: mflops = 48.5951 (norm. = 0.421429), norm. avg. (of 10) = 0.300893 fft 14: mflops = 5.06199 (norm. = 0.0438988), norm. avg. (of 12) = 0.0574319 Benchmarking for array size = 1960: 0. Brenner: elapsed time t=1.04 s, 128 iters, t-(init.)=1.02 s t(norm)=0.371749, mflops=13.4499 (err=7.3e-16) 1. CWP (min N) (N=1980): elapsed time t=1.39 s, 1024 iters, t-(init.)=1.24 s t(norm)=0.0564913, mflops=88.5092 2. CWP (best N) (N=1980): elapsed time t=1.4 s, 1024 iters, t-(init.)=1.26 s t(norm)=0.0574025, mflops=87.1042 3. FFTPACK: elapsed time t=1.94 s, 512 iters, t-(init.)=1.87 s t(norm)=0.170385, mflops=29.3453 (err=5.6e-16) 4. FFTPACK (f2c): elapsed time t=1.11 s, 256 iters, t-(init.)=1.07 s t(norm)=0.194986, mflops=25.6428 (err=6.3e-16) FFTW_MEASURE plan: (cost = 1.718750e-03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_NOTW 14 5. FFTW: elapsed time t=1.84 s, 1024 iters, t-(init.)=1.69 s t(norm)=0.0769922, mflops=64.9416 (err=5.6e-16) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.1 s, 512 iters, t-(init.)=1.02 s t(norm)=0.0929374, mflops=53.7997 (err=5.6e-16) 7. Frigo-old: elapsed time t=1.53 s, 256 iters, t-(init.)=1.49 s t(norm)=0.271523, mflops=18.4147 (err=6.9e-16) 8. GSL: elapsed time t=1.38 s, 256 iters, t-(init.)=1.34 s t(norm)=0.244188, mflops=20.476 (err=7.0e-16) 9. Nielsen: elapsed time t=1.43 s, 256 iters, t-(init.)=1.39 s t(norm)=0.2533, mflops=19.7395 (err=1.5e-14) 10. Singleton: elapsed time t=1.67 s, 512 iters, t-(init.)=1.6 s t(norm)=0.145784, mflops=34.2973 (err=7.7e-16) 11. Singleton (f2c): elapsed time t=1.8 s, 512 iters, t-(init.)=1.73 s t(norm)=0.157629, mflops=31.72 (err=7.7e-16) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. 
Valkenburg: elapsed time t=1.54 s, 64 iters, t-(init.)=1.53 s t(norm)=1.11525, mflops=4.48331 (err=6.1e-16) Top mflops for N=1960 = 88.5092 Normalized results and averages for N=1960: fft 0: mflops = 13.4499 (norm. = 0.151961), norm. avg. (of 10) = 0.104349 fft 1: mflops = 88.5092 (norm. = 1), norm. avg. (of 13) = 0.728302 fft 2: mflops = 87.1042 (norm. = 0.984127), norm. avg. (of 13) = 0.706964 fft 3: mflops = 29.3453 (norm. = 0.331551), norm. avg. (of 13) = 0.489454 fft 4: mflops = 25.6428 (norm. = 0.28972), norm. avg. (of 13) = 0.444708 fft 5: mflops = 64.9416 (norm. = 0.733728), norm. avg. (of 13) = 0.857341 fft 6: mflops = 53.7997 (norm. = 0.607843), norm. avg. (of 13) = 0.814201 fft 7: mflops = 18.4147 (norm. = 0.208054), norm. avg. (of 13) = 0.220436 fft 8: mflops = 20.476 (norm. = 0.231343), norm. avg. (of 13) = 0.355235 fft 9: mflops = 19.7395 (norm. = 0.223022), norm. avg. (of 13) = 0.176527 fft 10: mflops = 34.2973 (norm. = 0.3875), norm. avg. (of 13) = 0.276313 fft 11: mflops = 31.72 (norm. = 0.358382), norm. avg. (of 13) = 0.349602 fft 12: mflops = -1 (norm. = -0.0112983), norm. avg. (of 10) = 0.37129 fft 13: mflops = -1 (norm. = -0.0112983), norm. avg. (of 10) = 0.300893 fft 14: mflops = 4.48331 (norm. = 0.0506536), norm. avg. (of 13) = 0.0569105 Benchmarking for array size = 4725: 0. Brenner: elapsed time t=1.69 s, 64 iters, t-(init.)=1.66 s t(norm)=0.449727, mflops=11.1178 (err=1.4e-15) 1. CWP (min N) (N=5005): elapsed time t=1.05 s, 256 iters, t-(init.)=0.95 s t(norm)=0.0643435, mflops=77.7079 2. CWP (best N) (N=5040): elapsed time t=1.86 s, 512 iters, t-(init.)=1.67 s t(norm)=0.0565546, mflops=88.4102 3. FFTPACK: elapsed time t=1.14 s, 128 iters, t-(init.)=1.1 s t(norm)=0.149006, mflops=33.5557 (err=1.2e-15) 4. FFTPACK (f2c): elapsed time t=1.23 s, 128 iters, t-(init.)=1.19 s t(norm)=0.161198, mflops=31.0179 (err=1.3e-15) FFTW_MEASURE plan: (cost = 5.000000e-03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 3 FFTW_TWIDDLE 3 FFTW_NOTW 15 5.
FFTW: elapsed time t=1.22 s, 256 iters, t-(init.)=1.13 s t(norm)=0.0765349, mflops=65.3296 (err=1.2e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.56 s, 256 iters, t-(init.)=1.47 s t(norm)=0.0995632, mflops=50.2194 (err=1.2e-15) 7. Frigo-old: elapsed time t=1.26 s, 64 iters, t-(init.)=1.24 s t(norm)=0.335941, mflops=14.8836 (err=1.3e-15) 8. GSL: elapsed time t=1.55 s, 128 iters, t-(init.)=1.51 s t(norm)=0.204545, mflops=24.4445 (err=1.3e-15) 9. Nielsen: elapsed time t=1.9 s, 128 iters, t-(init.)=1.86 s t(norm)=0.251956, mflops=19.8448 (err=4.3e-14) 10. Singleton: elapsed time t=1.3 s, 128 iters, t-(init.)=1.25 s t(norm)=0.169325, mflops=29.529 (err=1.8e-15) 11. Singleton (f2c): elapsed time t=1.21 s, 128 iters, t-(init.)=1.17 s t(norm)=0.158488, mflops=31.5481 (err=1.8e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.94 s, 32 iters, t-(init.)=1.93 s t(norm)=1.04575, mflops=4.78125 (err=1.3e-15) Top mflops for N=4725 = 88.4102 Normalized results and averages for N=4725: fft 0: mflops = 11.1178 (norm. = 0.125753), norm. avg. (of 11) = 0.106295 fft 1: mflops = 77.7079 (norm. = 0.878947), norm. avg. (of 14) = 0.739063 fft 2: mflops = 88.4102 (norm. = 1), norm. avg. (of 14) = 0.727895 fft 3: mflops = 33.5557 (norm. = 0.379545), norm. avg. (of 14) = 0.481603 fft 4: mflops = 31.0179 (norm. = 0.35084), norm. avg. (of 14) = 0.438003 fft 5: mflops = 65.3296 (norm. = 0.738938), norm. avg. (of 14) = 0.848884 fft 6: mflops = 50.2194 (norm. = 0.568027), norm. avg. (of 14) = 0.796617 fft 7: mflops = 14.8836 (norm. = 0.168347), norm. avg. (of 14) = 0.216715 fft 8: mflops = 24.4445 (norm. = 0.27649), norm. avg. (of 14) = 0.349611 fft 9: mflops = 19.8448 (norm. = 0.224462), norm. avg. (of 14) = 0.179951 fft 10: mflops = 29.529 (norm. = 0.334), norm. avg. 
(of 14) = 0.280434 fft 11: mflops = 31.5481 (norm. = 0.356838), norm. avg. (of 14) = 0.350119 fft 12: mflops = -1 (norm. = -0.0113109), norm. avg. (of 10) = 0.37129 fft 13: mflops = -1 (norm. = -0.0113109), norm. avg. (of 10) = 0.300893 fft 14: mflops = 4.78125 (norm. = 0.0540803), norm. avg. (of 14) = 0.0567084 Benchmarking for array size = 10368: 0. Brenner: elapsed time t=1.02 s, 16 iters, t-(init.)=1.01 s t(norm)=0.45641, mflops=10.9551 (err=1.1e-15) 1. CWP (min N) (N=10920): elapsed time t=1.24 s, 128 iters, t-(init.)=1.14 s t(norm)=0.0643945, mflops=77.6464 2. CWP (best N) (N=11088): elapsed time t=1.22 s, 128 iters, t-(init.)=1.12 s t(norm)=0.0632648, mflops=79.0329 3. FFTPACK: elapsed time t=1.17 s, 64 iters, t-(init.)=1.12 s t(norm)=0.12653, mflops=39.5164 (err=9.8e-16) 4. FFTPACK (f2c): elapsed time t=1.16 s, 64 iters, t-(init.)=1.11 s t(norm)=0.1254, mflops=39.8725 (err=1.1e-15) FFTW_MEASURE plan: (cost = 9.687500e-03) FFTW_TWIDDLE 32 FFTW_TWIDDLE 3 FFTW_TWIDDLE 3 FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.23 s, 128 iters, t-(init.)=1.13 s t(norm)=0.0638297, mflops=78.3335 (err=9.1e-16) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.56 s, 128 iters, t-(init.)=1.46 s t(norm)=0.0824702, mflops=60.628 (err=9.4e-16) 7. Frigo-old: elapsed time t=1 s, 32 iters, t-(init.)=0.97 s t(norm)=0.219167, mflops=22.8136 (err=1.0e-15) 8. GSL: elapsed time t=1.94 s, 128 iters, t-(init.)=1.85 s t(norm)=0.1045, mflops=47.8469 (err=9.4e-16) 9. Nielsen: elapsed time t=1.16 s, 32 iters, t-(init.)=1.14 s t(norm)=0.257578, mflops=19.4116 (err=1.1e-14) 10. Singleton: elapsed time t=1.75 s, 64 iters, t-(init.)=1.7 s t(norm)=0.192054, mflops=26.0344 (err=1.3e-15) 11. Singleton (f2c): elapsed time t=1.57 s, 64 iters, t-(init.)=1.52 s t(norm)=0.171719, mflops=29.1174 (err=1.3e-15) 12. 
Temperton: elapsed time t=1.59 s, 64 iters, t-(init.)=1.55 s t(norm)=0.175108, mflops=28.5538 (err=2.2e-07) 13. Temperton (f2c): elapsed time t=1.64 s, 64 iters, t-(init.)=1.59 s t(norm)=0.179627, mflops=27.8355 (err=9.7e-16) 14. Valkenburg: elapsed time t=1 s, 8 iters, t-(init.)=1 s t(norm)=0.903783, mflops=5.5323 (err=1.3e-15) Top mflops for N=10368 = 79.0329 Normalized results and averages for N=10368: fft 0: mflops = 10.9551 (norm. = 0.138614), norm. avg. (of 12) = 0.108988 fft 1: mflops = 77.6464 (norm. = 0.982456), norm. avg. (of 15) = 0.755289 fft 2: mflops = 79.0329 (norm. = 1), norm. avg. (of 15) = 0.746036 fft 3: mflops = 39.5164 (norm. = 0.5), norm. avg. (of 15) = 0.48283 fft 4: mflops = 39.8725 (norm. = 0.504505), norm. avg. (of 15) = 0.442436 fft 5: mflops = 78.3335 (norm. = 0.99115), norm. avg. (of 15) = 0.858368 fft 6: mflops = 60.628 (norm. = 0.767123), norm. avg. (of 15) = 0.794651 fft 7: mflops = 22.8136 (norm. = 0.28866), norm. avg. (of 15) = 0.221512 fft 8: mflops = 47.8469 (norm. = 0.605405), norm. avg. (of 15) = 0.366664 fft 9: mflops = 19.4116 (norm. = 0.245614), norm. avg. (of 15) = 0.184329 fft 10: mflops = 26.0344 (norm. = 0.329412), norm. avg. (of 15) = 0.283699 fft 11: mflops = 29.1174 (norm. = 0.368421), norm. avg. (of 15) = 0.351339 fft 12: mflops = 28.5538 (norm. = 0.36129), norm. avg. (of 11) = 0.370381 fft 13: mflops = 27.8355 (norm. = 0.352201), norm. avg. (of 11) = 0.305558 fft 14: mflops = 5.5323 (norm. = 0.07), norm. avg. (of 15) = 0.0575945 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.83 s, 8 iters, t-(init.)=1.81 s t(norm)=0.569242, mflops=8.78361 (err=3.6e-15) 1. CWP (min N) (N=27720): elapsed time t=1.84 s, 64 iters, t-(init.)=1.67 s t(norm)=0.0656516, mflops=76.1596 2. CWP (best N) (N=27720): elapsed time t=1.84 s, 64 iters, t-(init.)=1.67 s t(norm)=0.0656516, mflops=76.1596 3. FFTPACK: elapsed time t=1.07 s, 16 iters, t-(init.)=1.03 s t(norm)=0.161967, mflops=30.8705 (err=3.4e-15) 4. 
FFTPACK (f2c): elapsed time t=1.11 s, 16 iters, t-(init.)=1.07 s t(norm)=0.168257, mflops=29.7165 (err=3.5e-15) FFTW_MEASURE plan: (cost = 4.250000e-02) FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 3 FFTW_TWIDDLE 5 FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.36 s, 32 iters, t-(init.)=1.28 s t(norm)=0.10064, mflops=49.6823 (err=3.5e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.27 s, 32 iters, t-(init.)=1.19 s t(norm)=0.0935633, mflops=53.4397 (err=3.5e-15) 7. Frigo-old: elapsed time t=1.07 s, 8 iters, t-(init.)=1.05 s t(norm)=0.330223, mflops=15.1413 (err=3.6e-15) 8. GSL: elapsed time t=1.21 s, 16 iters, t-(init.)=1.17 s t(norm)=0.183982, mflops=27.1766 (err=3.4e-15) 9. Nielsen: elapsed time t=1.57 s, 16 iters, t-(init.)=1.53 s t(norm)=0.240591, mflops=20.7821 (err=2.0e-13) 10. Singleton: elapsed time t=1.35 s, 16 iters, t-(init.)=1.31 s t(norm)=0.205997, mflops=24.2723 (err=5.0e-15) 11. Singleton (f2c): elapsed time t=1.18 s, 16 iters, t-(init.)=1.14 s t(norm)=0.179264, mflops=27.8918 (err=5.0e-15) 12. Temperton: elapsed time t=1.16 s, 16 iters, t-(init.)=1.12 s t(norm)=0.176119, mflops=28.3899 (err=1.4e-07) 13. Temperton (f2c): elapsed time t=1.16 s, 16 iters, t-(init.)=1.12 s t(norm)=0.176119, mflops=28.3899 (err=3.6e-15) 14. Valkenburg: elapsed time t=1.69 s, 4 iters, t-(init.)=1.68 s t(norm)=1.05672, mflops=4.73164 (err=3.4e-15) Top mflops for N=27000 = 76.1596 Normalized results and averages for N=27000: fft 0: mflops = 8.78361 (norm. = 0.115331), norm. avg. (of 13) = 0.109476 fft 1: mflops = 76.1596 (norm. = 1), norm. avg. (of 16) = 0.770583 fft 2: mflops = 76.1596 (norm. = 1), norm. avg. (of 16) = 0.761908 fft 3: mflops = 30.8705 (norm. = 0.40534), norm. avg. (of 16) = 0.477987 fft 4: mflops = 29.7165 (norm. = 0.390187), norm. avg. (of 16) = 0.439171 fft 5: mflops = 49.6823 (norm. = 0.652344), norm. avg. 
(of 16) = 0.845492 fft 6: mflops = 53.4397 (norm. = 0.701681), norm. avg. (of 16) = 0.78884 fft 7: mflops = 15.1413 (norm. = 0.19881), norm. avg. (of 16) = 0.220093 fft 8: mflops = 27.1766 (norm. = 0.356838), norm. avg. (of 16) = 0.366049 fft 9: mflops = 20.7821 (norm. = 0.272876), norm. avg. (of 16) = 0.189863 fft 10: mflops = 24.2723 (norm. = 0.318702), norm. avg. (of 16) = 0.285887 fft 11: mflops = 27.8918 (norm. = 0.366228), norm. avg. (of 16) = 0.35227 fft 12: mflops = 28.3899 (norm. = 0.372768), norm. avg. (of 12) = 0.37058 fft 13: mflops = 28.3899 (norm. = 0.372768), norm. avg. (of 12) = 0.311159 fft 14: mflops = 4.73164 (norm. = 0.062128), norm. avg. (of 16) = 0.0578778 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.51 s, 2 iters, t-(init.)=1.49 s t(norm)=0.608073, mflops=8.22269 (err=4.7e-15) 1. CWP (min N) (N=80080): elapsed time t=1.82 s, 16 iters, t-(init.)=1.65 s t(norm)=0.0841712, mflops=59.4027 2. CWP (best N) (N=80080): elapsed time t=1.82 s, 16 iters, t-(init.)=1.65 s t(norm)=0.0841712, mflops=59.4027 3. FFTPACK: elapsed time t=1.18 s, 4 iters, t-(init.)=1.14 s t(norm)=0.232619, mflops=21.4944 (err=4.7e-15) 4. FFTPACK (f2c): elapsed time t=1.24 s, 4 iters, t-(init.)=1.2 s t(norm)=0.244862, mflops=20.4197 (err=4.7e-15) FFTW_MEASURE plan: (cost = 1.300000e-01) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 8 FFTW_TWIDDLE 3 FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_NOTW 15 5. FFTW: elapsed time t=1.02 s, 8 iters, t-(init.)=0.94 s t(norm)=0.0959042, mflops=52.1354 (err=4.7e-15) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 8 FFTW_TWIDDLE 7 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.03 s, 8 iters, t-(init.)=0.95 s t(norm)=0.0969245, mflops=51.5866 (err=4.7e-15) 7. Frigo-old: elapsed time t=1.73 s, 4 iters, t-(init.)=1.69 s t(norm)=0.344847, mflops=14.4992 (err=4.7e-15) 8. GSL: elapsed time t=1.12 s, 4 iters, t-(init.)=1.08 s t(norm)=0.220376, mflops=22.6885 (err=4.7e-15) 9. 
Nielsen: elapsed time t=1.54 s, 4 iters, t-(init.)=1.5 s t(norm)=0.306077, mflops=16.3357 (err=4.8e-13) 10. Singleton: elapsed time t=1.37 s, 4 iters, t-(init.)=1.33 s t(norm)=0.271388, mflops=18.4238 (err=6.1e-15) 11. Singleton (f2c): elapsed time t=1.27 s, 4 iters, t-(init.)=1.23 s t(norm)=0.250983, mflops=19.9216 (err=6.1e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.45 s, 1 iters, t-(init.)=1.44 s t(norm)=1.17534, mflops=4.2541 (err=4.5e-15) Top mflops for N=75600 = 59.4027 Normalized results and averages for N=75600: fft 0: mflops = 8.22269 (norm. = 0.138423), norm. avg. (of 14) = 0.111543 fft 1: mflops = 59.4027 (norm. = 1), norm. avg. (of 17) = 0.784078 fft 2: mflops = 59.4027 (norm. = 1), norm. avg. (of 17) = 0.775914 fft 3: mflops = 21.4944 (norm. = 0.361842), norm. avg. (of 17) = 0.471155 fft 4: mflops = 20.4197 (norm. = 0.34375), norm. avg. (of 17) = 0.433558 fft 5: mflops = 52.1354 (norm. = 0.87766), norm. avg. (of 17) = 0.847384 fft 6: mflops = 51.5866 (norm. = 0.868421), norm. avg. (of 17) = 0.793521 fft 7: mflops = 14.4992 (norm. = 0.244083), norm. avg. (of 17) = 0.221504 fft 8: mflops = 22.6885 (norm. = 0.381944), norm. avg. (of 17) = 0.366984 fft 9: mflops = 16.3357 (norm. = 0.275), norm. avg. (of 17) = 0.194871 fft 10: mflops = 18.4238 (norm. = 0.31015), norm. avg. (of 17) = 0.287314 fft 11: mflops = 19.9216 (norm. = 0.335366), norm. avg. (of 17) = 0.351276 fft 12: mflops = -1 (norm. = -0.0168342), norm. avg. (of 12) = 0.37058 fft 13: mflops = -1 (norm. = -0.0168342), norm. avg. (of 12) = 0.311159 fft 14: mflops = 4.2541 (norm. = 0.0716146), norm. avg. (of 17) = 0.0586859 Benchmarking for array size = 165375: 0. Brenner: elapsed time t=2.01 s, 1 iters, t-(init.)=1.99 s t(norm)=0.694144, mflops=7.20311 (err=1.2e-14) 1. CWP (min N) (N=180180): elapsed time t=1.28 s, 4 iters, t-(init.)=1.18 s t(norm)=0.102901, mflops=48.5905 2. 
CWP (best N) (N=180180): elapsed time t=1.27 s, 4 iters, t-(init.)=1.17 s t(norm)=0.102029, mflops=49.0058 3. FFTPACK: elapsed time t=1.1 s, 1 iters, t-(init.)=1.08 s t(norm)=0.376722, mflops=13.2724 (err=1.2e-14) 4. FFTPACK (f2c): elapsed time t=1.18 s, 1 iters, t-(init.)=1.16 s t(norm)=0.404627, mflops=12.3571 (err=1.2e-14) FFTW_MEASURE plan: (cost = 3.400000e-01) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 3 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_NOTW 15 5. FFTW: elapsed time t=1.31 s, 4 iters, t-(init.)=1.22 s t(norm)=0.106389, mflops=46.9974 (err=1.2e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.49 s, 4 iters, t-(init.)=1.4 s t(norm)=0.122086, mflops=40.9548 (err=1.2e-14) 7. Frigo-old: elapsed time t=1.4 s, 1 iters, t-(init.)=1.37 s t(norm)=0.477878, mflops=10.4629 (err=1.2e-14) 8. GSL: elapsed time t=1.63 s, 2 iters, t-(init.)=1.59 s t(norm)=0.277309, mflops=18.0304 (err=1.2e-14) 9. Nielsen: elapsed time t=1.93 s, 2 iters, t-(init.)=1.88 s t(norm)=0.327887, mflops=15.2491 (err=1.7e-12) 10. Singleton: elapsed time t=1.73 s, 2 iters, t-(init.)=1.69 s t(norm)=0.29475, mflops=16.9635 (err=1.8e-14) 11. Singleton (f2c): elapsed time t=1.7 s, 2 iters, t-(init.)=1.65 s t(norm)=0.287773, mflops=17.3748 (err=1.8e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=3.63 s, 1 iters, t-(init.)=3.61 s t(norm)=1.25923, mflops=3.97069 (err=1.2e-14) Top mflops for N=165375 = 49.0058 Normalized results and averages for N=165375: fft 0: mflops = 7.20311 (norm. = 0.146985), norm. avg. (of 15) = 0.113906 fft 1: mflops = 48.5905 (norm. = 0.991525), norm. avg. (of 18) = 0.795603 fft 2: mflops = 49.0058 (norm. = 1), norm. avg. (of 18) = 0.788363 fft 3: mflops = 13.2724 (norm. = 0.270833), norm. avg. (of 18) = 0.460026 fft 4: mflops = 12.3571 (norm. 
= 0.252155), norm. avg. (of 18) = 0.42348 fft 5: mflops = 46.9974 (norm. = 0.959016), norm. avg. (of 18) = 0.853586 fft 6: mflops = 40.9548 (norm. = 0.835714), norm. avg. (of 18) = 0.795865 fft 7: mflops = 10.4629 (norm. = 0.213504), norm. avg. (of 18) = 0.221059 fft 8: mflops = 18.0304 (norm. = 0.367925), norm. avg. (of 18) = 0.367037 fft 9: mflops = 15.2491 (norm. = 0.31117), norm. avg. (of 18) = 0.201332 fft 10: mflops = 16.9635 (norm. = 0.346154), norm. avg. (of 18) = 0.290583 fft 11: mflops = 17.3748 (norm. = 0.354545), norm. avg. (of 18) = 0.351457 fft 12: mflops = -1 (norm. = -0.0204058), norm. avg. (of 12) = 0.37058 fft 13: mflops = -1 (norm. = -0.0204058), norm. avg. (of 12) = 0.311159 fft 14: mflops = 3.97069 (norm. = 0.0810249), norm. avg. (of 18) = 0.0599269 Benchmarking for array size = 362880: 0. Brenner: elapsed time t=4.27 s, 1 iters, t-(init.)=4.22 s t(norm)=0.629655, mflops=7.94085 (err=7.7e-15) 1. CWP (min N) (N=720720): elapsed time t=1.35 s, 1 iters, t-(init.)=1.25 s t(norm)=0.186509, mflops=26.8083 2. CWP (best N) (N=720720): elapsed time t=1.35 s, 1 iters, t-(init.)=1.25 s t(norm)=0.186509, mflops=26.8083 3. FFTPACK: elapsed time t=1.67 s, 1 iters, t-(init.)=1.62 s t(norm)=0.241716, mflops=20.6854 (err=7.5e-15) 4. FFTPACK (f2c): elapsed time t=1.75 s, 1 iters, t-(init.)=1.71 s t(norm)=0.255145, mflops=19.5967 (err=7.5e-15) FFTW_MEASURE plan: (cost = 6.800000e-01) FFTW_TWIDDLE 32 FFTW_TWIDDLE 5 FFTW_TWIDDLE 3 FFTW_TWIDDLE 3 FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.3 s, 2 iters, t-(init.)=1.2 s t(norm)=0.0895245, mflops=55.8507 (err=7.6e-15) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.51 s, 2 iters, t-(init.)=1.41 s t(norm)=0.105191, mflops=47.5325 (err=7.6e-15) 7. Frigo-old: elapsed time t=2.41 s, 1 iters, t-(init.)=2.36 s t(norm)=0.35213, mflops=14.1993 (err=7.6e-15) 8. 
GSL: elapsed time t=1.34 s, 1 iters, t-(init.)=1.29 s t(norm)=0.192478, mflops=25.9771 (err=7.5e-15) 9. Nielsen: elapsed time t=2.27 s, 1 iters, t-(init.)=2.22 s t(norm)=0.331241, mflops=15.0948 (err=3.5e-12) 10. Singleton: elapsed time t=2.31 s, 1 iters, t-(init.)=2.26 s t(norm)=0.337209, mflops=14.8276 (err=1.1e-14) 11. Singleton (f2c): elapsed time t=2.2 s, 1 iters, t-(init.)=2.15 s t(norm)=0.320796, mflops=15.5862 (err=1.1e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=8.3 s, 1 iters, t-(init.)=8.25 s t(norm)=1.23096, mflops=4.06187 (err=7.9e-15) Top mflops for N=362880 = 55.8507 Normalized results and averages for N=362880: fft 0: mflops = 7.94085 (norm. = 0.14218), norm. avg. (of 16) = 0.115673 fft 1: mflops = 26.8083 (norm. = 0.48), norm. avg. (of 19) = 0.778992 fft 2: mflops = 26.8083 (norm. = 0.48), norm. avg. (of 19) = 0.772133 fft 3: mflops = 20.6854 (norm. = 0.37037), norm. avg. (of 19) = 0.455307 fft 4: mflops = 19.5967 (norm. = 0.350877), norm. avg. (of 19) = 0.419659 fft 5: mflops = 55.8507 (norm. = 1), norm. avg. (of 19) = 0.861292 fft 6: mflops = 47.5325 (norm. = 0.851064), norm. avg. (of 19) = 0.798771 fft 7: mflops = 14.1993 (norm. = 0.254237), norm. avg. (of 19) = 0.222806 fft 8: mflops = 25.9771 (norm. = 0.465116), norm. avg. (of 19) = 0.372199 fft 9: mflops = 15.0948 (norm. = 0.27027), norm. avg. (of 19) = 0.20496 fft 10: mflops = 14.8276 (norm. = 0.265487), norm. avg. (of 19) = 0.289262 fft 11: mflops = 15.5862 (norm. = 0.27907), norm. avg. (of 19) = 0.347647 fft 12: mflops = -1 (norm. = -0.0179049), norm. avg. (of 12) = 0.37058 fft 13: mflops = -1 (norm. = -0.0179049), norm. avg. (of 12) = 0.311159 fft 14: mflops = 4.06187 (norm. = 0.0727273), norm. avg. 
(of 19) = 0.0606006 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) 512x128x64 (64.0236 MB) Maximum array size N = 4194304 Benchmarking FFTs: 0. FFTW 1. HARM 2. HARM (f2c) 3. NR (C) 4. NR (F) 5. PDA 6. PDA (f2c) 7. Singleton 8. Singleton (f2c) 9. Temperton 10. Temperton (f2c) Computing normalized averages (11 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.42 s, 65536 iters, t-(init.)=1.33 s t(norm)=0.0528495, mflops=94.6084 (err=3.0e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. Skipping fft (all dimensions must be > 4 for HARM). 3. NR (C): elapsed time t=1.18 s, 32768 iters, t-(init.)=1.13 s t(norm)=0.0898043, mflops=55.6766 (err=3.0e-16) 4. NR (F): elapsed time t=1.85 s, 32768 iters, t-(init.)=1.8 s t(norm)=0.143051, mflops=34.9525 (err=3.0e-16) 5. PDA: elapsed time t=1.63 s, 16384 iters, t-(init.)=1.6 s t(norm)=0.254313, mflops=19.6608 (err=2.9e-16) 6. PDA (f2c): elapsed time t=1.92 s, 16384 iters, t-(init.)=1.89 s t(norm)=0.300407, mflops=16.6441 (err=3.3e-16) 7. Singleton: elapsed time t=1.2 s, 32768 iters, t-(init.)=1.15 s t(norm)=0.0913938, mflops=54.7083 (err=3.0e-16) 8. Singleton (f2c): elapsed time t=1.27 s, 32768 iters, t-(init.)=1.22 s t(norm)=0.0969569, mflops=51.5693 (err=2.2e-16) 9. Temperton: elapsed time t=1.76 s, 32768 iters, t-(init.)=1.71 s t(norm)=0.135899, mflops=36.7921 (err=4.1e-16) 10. Temperton (f2c): elapsed time t=1.64 s, 32768 iters, t-(init.)=1.59 s t(norm)=0.126362, mflops=39.5689 (err=3.0e-16) Top mflops for N=64 = 94.6084 Normalized results and averages for N=64: fft 0: mflops = 94.6084 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.0105699), norm. avg. (of 0) = -1 fft 2: mflops = -1 (norm. = -0.0105699), norm. avg. 
(of 0) = -1 fft 3: mflops = 55.6766 (norm. = 0.588496), norm. avg. (of 1) = 0.588496 fft 4: mflops = 34.9525 (norm. = 0.369444), norm. avg. (of 1) = 0.369444 fft 5: mflops = 19.6608 (norm. = 0.207812), norm. avg. (of 1) = 0.207812 fft 6: mflops = 16.6441 (norm. = 0.175926), norm. avg. (of 1) = 0.175926 fft 7: mflops = 54.7083 (norm. = 0.578261), norm. avg. (of 1) = 0.578261 fft 8: mflops = 51.5693 (norm. = 0.545082), norm. avg. (of 1) = 0.545082 fft 9: mflops = 36.7921 (norm. = 0.388889), norm. avg. (of 1) = 0.388889 fft 10: mflops = 39.5689 (norm. = 0.418239), norm. avg. (of 1) = 0.418239 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.05 s, 4096 iters, t-(init.)=1 s t(norm)=0.0529819, mflops=94.3718 (err=2.8e-16) 1. HARM: elapsed time t=1.32 s, 4096 iters, t-(init.)=1.27 s t(norm)=0.067287, mflops=74.3085 (err=3.3e-16) 2. HARM (f2c): elapsed time t=1.07 s, 2048 iters, t-(init.)=1.05 s t(norm)=0.111262, mflops=44.939 (err=3.2e-16) 3. NR (C): elapsed time t=1.23 s, 4096 iters, t-(init.)=1.19 s t(norm)=0.0630485, mflops=79.3041 (err=3.1e-16) 4. NR (F): elapsed time t=1.79 s, 4096 iters, t-(init.)=1.74 s t(norm)=0.0921885, mflops=54.2367 (err=3.1e-16) 5. PDA: elapsed time t=1.43 s, 2048 iters, t-(init.)=1.41 s t(norm)=0.149409, mflops=33.4652 (err=2.5e-16) 6. PDA (f2c): elapsed time t=1.67 s, 2048 iters, t-(init.)=1.64 s t(norm)=0.173781, mflops=28.7719 (err=2.6e-16) 7. Singleton: elapsed time t=1.88 s, 4096 iters, t-(init.)=1.82 s t(norm)=0.0964271, mflops=51.8527 (err=3.4e-16) 8. Singleton (f2c): elapsed time t=1.18 s, 4096 iters, t-(init.)=1.13 s t(norm)=0.0598696, mflops=83.5149 (err=3.4e-16) 9. Temperton: elapsed time t=1 s, 2048 iters, t-(init.)=0.97 s t(norm)=0.102785, mflops=48.6453 (err=1.1e-08) 10. Temperton (f2c): elapsed time t=1.74 s, 4096 iters, t-(init.)=1.7 s t(norm)=0.0900692, mflops=55.5128 (err=3.0e-16) Top mflops for N=512 = 94.3718 Normalized results and averages for N=512: fft 0: mflops = 94.3718 (norm. = 1), norm. 
avg. (of 2) = 1 fft 1: mflops = 74.3085 (norm. = 0.787402), norm. avg. (of 1) = 0.787402 fft 2: mflops = 44.939 (norm. = 0.47619), norm. avg. (of 1) = 0.47619 fft 3: mflops = 79.3041 (norm. = 0.840336), norm. avg. (of 2) = 0.714416 fft 4: mflops = 54.2367 (norm. = 0.574713), norm. avg. (of 2) = 0.472079 fft 5: mflops = 33.4652 (norm. = 0.35461), norm. avg. (of 2) = 0.281211 fft 6: mflops = 28.7719 (norm. = 0.304878), norm. avg. (of 2) = 0.240402 fft 7: mflops = 51.8527 (norm. = 0.549451), norm. avg. (of 2) = 0.563856 fft 8: mflops = 83.5149 (norm. = 0.884956), norm. avg. (of 2) = 0.715019 fft 9: mflops = 48.6453 (norm. = 0.515464), norm. avg. (of 2) = 0.452176 fft 10: mflops = 55.5128 (norm. = 0.588235), norm. avg. (of 2) = 0.503237 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.78 s, 512 iters, t-(init.)=1.63 s t(norm)=0.0647704, mflops=77.1958 (err=3.1e-16) 1. HARM: elapsed time t=1.82 s, 512 iters, t-(init.)=1.67 s t(norm)=0.0663598, mflops=75.3468 (err=3.2e-16) 2. HARM (f2c): elapsed time t=1.49 s, 256 iters, t-(init.)=1.41 s t(norm)=0.112057, mflops=44.6203 (err=3.1e-16) 3. NR (C): elapsed time t=1.38 s, 128 iters, t-(init.)=1.34 s t(norm)=0.212987, mflops=23.4756 (err=3.5e-16) 4. NR (F): elapsed time t=1.51 s, 128 iters, t-(init.)=1.47 s t(norm)=0.23365, mflops=21.3995 (err=3.5e-16) 5. PDA: elapsed time t=1.75 s, 256 iters, t-(init.)=1.67 s t(norm)=0.13272, mflops=37.6734 (err=2.9e-16) 6. PDA (f2c): elapsed time t=1.86 s, 256 iters, t-(init.)=1.78 s t(norm)=0.141462, mflops=35.3453 (err=2.9e-16) 7. Singleton: elapsed time t=1.88 s, 256 iters, t-(init.)=1.8 s t(norm)=0.143051, mflops=34.9525 (err=3.5e-16) 8. Singleton (f2c): elapsed time t=1.03 s, 128 iters, t-(init.)=1 s t(norm)=0.158946, mflops=31.4573 (err=3.3e-16) 9. Temperton: elapsed time t=1.7 s, 256 iters, t-(init.)=1.62 s t(norm)=0.128746, mflops=38.8361 (err=6.0e-08) 10. 
Temperton (f2c): elapsed time t=1.52 s, 256 iters, t-(init.)=1.44 s t(norm)=0.114441, mflops=43.6907 (err=3.1e-16) Top mflops for N=4096 = 77.1958 Normalized results and averages for N=4096: fft 0: mflops = 77.1958 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 75.3468 (norm. = 0.976048), norm. avg. (of 2) = 0.881725 fft 2: mflops = 44.6203 (norm. = 0.578014), norm. avg. (of 2) = 0.527102 fft 3: mflops = 23.4756 (norm. = 0.304104), norm. avg. (of 3) = 0.577645 fft 4: mflops = 21.3995 (norm. = 0.277211), norm. avg. (of 3) = 0.407123 fft 5: mflops = 37.6734 (norm. = 0.488024), norm. avg. (of 3) = 0.350149 fft 6: mflops = 35.3453 (norm. = 0.457865), norm. avg. (of 3) = 0.31289 fft 7: mflops = 34.9525 (norm. = 0.452778), norm. avg. (of 3) = 0.52683 fft 8: mflops = 31.4573 (norm. = 0.4075), norm. avg. (of 3) = 0.612513 fft 9: mflops = 38.8361 (norm. = 0.503086), norm. avg. (of 3) = 0.469146 fft 10: mflops = 43.6907 (norm. = 0.565972), norm. avg. (of 3) = 0.524149 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.46 s, 32 iters, t-(init.)=1.36 s t(norm)=0.0864665, mflops=57.8259 (err=4.1e-16) 1. HARM: elapsed time t=1.78 s, 32 iters, t-(init.)=1.67 s t(norm)=0.106176, mflops=47.0917 (err=4.6e-16) 2. HARM (f2c): elapsed time t=1.22 s, 16 iters, t-(init.)=1.16 s t(norm)=0.147502, mflops=33.8979 (err=4.3e-16) 3. NR (C): elapsed time t=1.21 s, 8 iters, t-(init.)=1.18 s t(norm)=0.30009, mflops=16.6617 (err=4.2e-16) 4. NR (F): elapsed time t=1.3 s, 8 iters, t-(init.)=1.27 s t(norm)=0.322978, mflops=15.4809 (err=4.2e-16) 5. PDA: elapsed time t=1.29 s, 16 iters, t-(init.)=1.23 s t(norm)=0.156403, mflops=31.9688 (err=3.5e-16) 6. PDA (f2c): elapsed time t=1.35 s, 16 iters, t-(init.)=1.3 s t(norm)=0.165304, mflops=30.2474 (err=3.6e-16) 7. Singleton: elapsed time t=1.84 s, 16 iters, t-(init.)=1.79 s t(norm)=0.22761, mflops=21.9674 (err=4.4e-16) 8. 
Singleton (f2c): elapsed time t=1.87 s, 16 iters, t-(init.)=1.82 s t(norm)=0.231425, mflops=21.6053 (err=4.2e-16) 9. Temperton: elapsed time t=1.37 s, 16 iters, t-(init.)=1.32 s t(norm)=0.167847, mflops=29.7891 (err=9.6e-08) 10. Temperton (f2c): elapsed time t=1.32 s, 16 iters, t-(init.)=1.27 s t(norm)=0.161489, mflops=30.9619 (err=4.0e-16) Top mflops for N=32768 = 57.8259 Normalized results and averages for N=32768: fft 0: mflops = 57.8259 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 47.0917 (norm. = 0.814371), norm. avg. (of 3) = 0.859274 fft 2: mflops = 33.8979 (norm. = 0.586207), norm. avg. (of 3) = 0.546804 fft 3: mflops = 16.6617 (norm. = 0.288136), norm. avg. (of 4) = 0.505268 fft 4: mflops = 15.4809 (norm. = 0.267717), norm. avg. (of 4) = 0.372271 fft 5: mflops = 31.9688 (norm. = 0.552846), norm. avg. (of 4) = 0.400823 fft 6: mflops = 30.2474 (norm. = 0.523077), norm. avg. (of 4) = 0.365437 fft 7: mflops = 21.9674 (norm. = 0.379888), norm. avg. (of 4) = 0.490094 fft 8: mflops = 21.6053 (norm. = 0.373626), norm. avg. (of 4) = 0.552791 fft 9: mflops = 29.7891 (norm. = 0.515152), norm. avg. (of 4) = 0.480648 fft 10: mflops = 30.9619 (norm. = 0.535433), norm. avg. (of 4) = 0.52697 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.25 s, 2 iters, t-(init.)=1.17 s t(norm)=0.123978, mflops=40.3298 (err=4.3e-16) 1. HARM: elapsed time t=1.3 s, 2 iters, t-(init.)=1.23 s t(norm)=0.130335, mflops=38.3625 (err=4.9e-16) 2. HARM (f2c): elapsed time t=1.65 s, 2 iters, t-(init.)=1.58 s t(norm)=0.167423, mflops=29.8645 (err=4.3e-16) 3. NR (C): elapsed time t=2.13 s, 1 iters, t-(init.)=2.1 s t(norm)=0.445048, mflops=11.2347 (err=4.9e-16) 4. NR (F): elapsed time t=2.22 s, 1 iters, t-(init.)=2.19 s t(norm)=0.464122, mflops=10.773 (err=4.9e-16) 5. PDA: elapsed time t=1.7 s, 2 iters, t-(init.)=1.63 s t(norm)=0.172721, mflops=28.9484 (err=4.4e-16) 6. 
PDA (f2c): elapsed time t=1.72 s, 2 iters, t-(init.)=1.65 s t(norm)=0.17484, mflops=28.5975 (err=4.4e-16) 7. Singleton: elapsed time t=1.4 s, 1 iters, t-(init.)=1.36 s t(norm)=0.288222, mflops=17.3478 (err=5.0e-16) 8. Singleton (f2c): elapsed time t=1.44 s, 1 iters, t-(init.)=1.4 s t(norm)=0.296699, mflops=16.8521 (err=4.8e-16) 9. Temperton: elapsed time t=1.85 s, 2 iters, t-(init.)=1.78 s t(norm)=0.188616, mflops=26.5089 (err=1.4e-07) 10. Temperton (f2c): elapsed time t=1.78 s, 2 iters, t-(init.)=1.71 s t(norm)=0.181198, mflops=27.5941 (err=4.4e-16) Top mflops for N=262144 = 40.3298 Normalized results and averages for N=262144: fft 0: mflops = 40.3298 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 38.3625 (norm. = 0.95122), norm. avg. (of 4) = 0.88226 fft 2: mflops = 29.8645 (norm. = 0.740506), norm. avg. (of 4) = 0.595229 fft 3: mflops = 11.2347 (norm. = 0.278571), norm. avg. (of 5) = 0.459929 fft 4: mflops = 10.773 (norm. = 0.267123), norm. avg. (of 5) = 0.351242 fft 5: mflops = 28.9484 (norm. = 0.717791), norm. avg. (of 5) = 0.464217 fft 6: mflops = 28.5975 (norm. = 0.709091), norm. avg. (of 5) = 0.434167 fft 7: mflops = 17.3478 (norm. = 0.430147), norm. avg. (of 5) = 0.478105 fft 8: mflops = 16.8521 (norm. = 0.417857), norm. avg. (of 5) = 0.525804 fft 9: mflops = 26.5089 (norm. = 0.657303), norm. avg. (of 5) = 0.515979 fft 10: mflops = 27.5941 (norm. = 0.684211), norm. avg. (of 5) = 0.558418 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.22 s, 1 iters, t-(init.)=1.14 s t(norm)=0.114441, mflops=43.6907 (err=5.7e-16) 1. HARM: elapsed time t=1.44 s, 1 iters, t-(init.)=1.37 s t(norm)=0.13753, mflops=36.3557 (err=6.1e-16) 2. HARM (f2c): elapsed time t=1.81 s, 1 iters, t-(init.)=1.74 s t(norm)=0.174673, mflops=28.6249 (err=5.9e-16) 3. NR (C): elapsed time t=4.55 s, 1 iters, t-(init.)=4.48 s t(norm)=0.449733, mflops=11.1177 (err=5.7e-16) 4. 
NR (F): elapsed time t=4.71 s, 1 iters, t-(init.)=4.64 s t(norm)=0.465795, mflops=10.7343 (err=5.7e-16) 5. PDA: elapsed time t=1.65 s, 1 iters, t-(init.)=1.57 s t(norm)=0.157607, mflops=31.7244 (err=4.9e-16) 6. PDA (f2c): elapsed time t=1.66 s, 1 iters, t-(init.)=1.58 s t(norm)=0.158611, mflops=31.5236 (err=5.1e-16) 7. Singleton: elapsed time t=3.09 s, 1 iters, t-(init.)=3.02 s t(norm)=0.303168, mflops=16.4925 (err=7.2e-16) 8. Singleton (f2c): elapsed time t=3.2 s, 1 iters, t-(init.)=3.13 s t(norm)=0.314211, mflops=15.9129 (err=7.1e-16) 9. Temperton: elapsed time t=2 s, 1 iters, t-(init.)=1.93 s t(norm)=0.193746, mflops=25.8069 (err=1.5e-07) 10. Temperton (f2c): elapsed time t=1.89 s, 1 iters, t-(init.)=1.82 s t(norm)=0.182704, mflops=27.3667 (err=5.5e-16) Top mflops for N=524288 = 43.6907 Normalized results and averages for N=524288: fft 0: mflops = 43.6907 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 36.3557 (norm. = 0.832117), norm. avg. (of 5) = 0.872231 fft 2: mflops = 28.6249 (norm. = 0.655172), norm. avg. (of 5) = 0.607218 fft 3: mflops = 11.1177 (norm. = 0.254464), norm. avg. (of 6) = 0.425685 fft 4: mflops = 10.7343 (norm. = 0.24569), norm. avg. (of 6) = 0.33365 fft 5: mflops = 31.7244 (norm. = 0.726115), norm. avg. (of 6) = 0.507866 fft 6: mflops = 31.5236 (norm. = 0.721519), norm. avg. (of 6) = 0.482059 fft 7: mflops = 16.4925 (norm. = 0.377483), norm. avg. (of 6) = 0.461335 fft 8: mflops = 15.9129 (norm. = 0.364217), norm. avg. (of 6) = 0.498873 fft 9: mflops = 25.8069 (norm. = 0.590674), norm. avg. (of 6) = 0.528428 fft 10: mflops = 27.3667 (norm. = 0.626374), norm. avg. (of 6) = 0.569744 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=2.65 s, 1 iters, t-(init.)=2.5 s t(norm)=0.119209, mflops=41.943 (err=5.9e-16) 1. HARM: elapsed time t=2.92 s, 1 iters, t-(init.)=2.78 s t(norm)=0.132561, mflops=37.7186 (err=5.8e-16) 2. 
HARM (f2c): elapsed time t=3.74 s, 1 iters, t-(init.)=3.6 s t(norm)=0.171661, mflops=29.1271 (err=5.5e-16) 3. NR (C): elapsed time t=9.89 s, 1 iters, t-(init.)=9.74 s t(norm)=0.464439, mflops=10.7657 (err=7.3e-16) 4. NR (F): elapsed time t=10.2 s, 1 iters, t-(init.)=10.06 s t(norm)=0.479698, mflops=10.4232 (err=7.3e-16) 5. PDA: elapsed time t=4.42 s, 1 iters, t-(init.)=4.28 s t(norm)=0.204086, mflops=24.4994 (err=6.0e-16) 6. PDA (f2c): elapsed time t=4.37 s, 1 iters, t-(init.)=4.22 s t(norm)=0.201225, mflops=24.8478 (err=6.0e-16) 7. Singleton: elapsed time t=6.18 s, 1 iters, t-(init.)=6.04 s t(norm)=0.28801, mflops=17.3605 (err=7.1e-16) 8. Singleton (f2c): elapsed time t=6.22 s, 1 iters, t-(init.)=6.07 s t(norm)=0.28944, mflops=17.2747 (err=6.9e-16) 9. Skipping fft (Temperton can't handle dimensions > 256). 10. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 41.943 Normalized results and averages for N=1048576: fft 0: mflops = 41.943 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 37.7186 (norm. = 0.899281), norm. avg. (of 6) = 0.87674 fft 2: mflops = 29.1271 (norm. = 0.694444), norm. avg. (of 6) = 0.621756 fft 3: mflops = 10.7657 (norm. = 0.256674), norm. avg. (of 7) = 0.40154 fft 4: mflops = 10.4232 (norm. = 0.248509), norm. avg. (of 7) = 0.321487 fft 5: mflops = 24.4994 (norm. = 0.584112), norm. avg. (of 7) = 0.518759 fft 6: mflops = 24.8478 (norm. = 0.592417), norm. avg. (of 7) = 0.497825 fft 7: mflops = 17.3605 (norm. = 0.413907), norm. avg. (of 7) = 0.454559 fft 8: mflops = 17.2747 (norm. = 0.411862), norm. avg. (of 7) = 0.486443 fft 9: mflops = -1 (norm. = -0.0238419), norm. avg. (of 6) = 0.528428 fft 10: mflops = -1 (norm. = -0.0238419), norm. avg. (of 6) = 0.569744 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=5.53 s, 1 iters, t-(init.)=5.23 s t(norm)=0.118755, mflops=42.1034 (err=6.5e-16) 1. 
HARM: elapsed time t=6.41 s, 1 iters, t-(init.)=6.12 s t(norm)=0.138964, mflops=35.9805 (err=6.3e-16) 2. HARM (f2c): elapsed time t=7.89 s, 1 iters, t-(init.)=7.6 s t(norm)=0.17257, mflops=28.9738 (err=6.1e-16) 3. NR (C): elapsed time t=21.85 s, 1 iters, t-(init.)=21.56 s t(norm)=0.489553, mflops=10.2134 (err=7.5e-16) 4. NR (F): elapsed time t=22.4 s, 1 iters, t-(init.)=22.11 s t(norm)=0.502041, mflops=9.95934 (err=7.5e-16) 5. PDA: elapsed time t=7.21 s, 1 iters, t-(init.)=6.92 s t(norm)=0.157129, mflops=31.8209 (err=6.7e-16) 6. PDA (f2c): elapsed time t=7.21 s, 1 iters, t-(init.)=6.92 s t(norm)=0.157129, mflops=31.8209 (err=6.7e-16) 7. Singleton: elapsed time t=18.61 s, 1 iters, t-(init.)=18.32 s t(norm)=0.415984, mflops=12.0197 (err=8.0e-16) 8. Singleton (f2c): elapsed time t=18.33 s, 1 iters, t-(init.)=18.04 s t(norm)=0.409626, mflops=12.2063 (err=7.9e-16) 9. Temperton: elapsed time t=12.19 s, 1 iters, t-(init.)=11.89 s t(norm)=0.269981, mflops=18.5198 (err=1.5e-07) 10. Temperton (f2c): elapsed time t=11.82 s, 1 iters, t-(init.)=11.53 s t(norm)=0.261806, mflops=19.0981 (err=7.1e-16) Top mflops for N=2097152 = 42.1034 Normalized results and averages for N=2097152: fft 0: mflops = 42.1034 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 35.9805 (norm. = 0.854575), norm. avg. (of 7) = 0.873573 fft 2: mflops = 28.9738 (norm. = 0.688158), norm. avg. (of 7) = 0.631242 fft 3: mflops = 10.2134 (norm. = 0.242579), norm. avg. (of 8) = 0.38167 fft 4: mflops = 9.95934 (norm. = 0.236545), norm. avg. (of 8) = 0.310869 fft 5: mflops = 31.8209 (norm. = 0.75578), norm. avg. (of 8) = 0.548386 fft 6: mflops = 31.8209 (norm. = 0.75578), norm. avg. (of 8) = 0.530069 fft 7: mflops = 12.0197 (norm. = 0.28548), norm. avg. (of 8) = 0.433424 fft 8: mflops = 12.2063 (norm. = 0.289911), norm. avg. (of 8) = 0.461876 fft 9: mflops = 18.5198 (norm. = 0.439865), norm. avg. (of 7) = 0.515776 fft 10: mflops = 19.0981 (norm. = 0.453599), norm. avg. 
(of 7) = 0.553152 Benchmarking for array size = 512x128x64 (power of 2): 0. FFTW: elapsed time t=11.25 s, 1 iters, t-(init.)=10.65 s t(norm)=0.115416, mflops=43.3214 (err=7.2e-16) 1. HARM: elapsed time t=13.26 s, 1 iters, t-(init.)=12.66 s t(norm)=0.137199, mflops=36.4434 (err=7.3e-16) 2. HARM (f2c): elapsed time t=16.43 s, 1 iters, t-(init.)=15.84 s t(norm)=0.171661, mflops=29.1271 (err=7.0e-16) 3. NR (C): elapsed time t=48.21 s, 1 iters, t-(init.)=47.62 s t(norm)=0.516068, mflops=9.68865 (err=8.1e-16) 4. NR (F): elapsed time t=49.13 s, 1 iters, t-(init.)=48.53 s t(norm)=0.52593, mflops=9.50697 (err=8.1e-16) 5. PDA: elapsed time t=16.17 s, 1 iters, t-(init.)=15.58 s t(norm)=0.168844, mflops=29.6132 (err=7.6e-16) 6. PDA (f2c): elapsed time t=16.22 s, 1 iters, t-(init.)=15.63 s t(norm)=0.169386, mflops=29.5185 (err=7.6e-16) 7. Singleton: elapsed time t=34.07 s, 1 iters, t-(init.)=33.48 s t(norm)=0.36283, mflops=13.7806 (err=9.6e-16) 8. Singleton (f2c): elapsed time t=35.17 s, 1 iters, t-(init.)=34.57 s t(norm)=0.374642, mflops=13.3461 (err=9.4e-16) 9. Skipping fft (Temperton can't handle dimensions > 256). 10. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=4194304 = 43.3214 Normalized results and averages for N=4194304: fft 0: mflops = 43.3214 (norm. = 1), norm. avg. (of 9) = 1 fft 1: mflops = 36.4434 (norm. = 0.841232), norm. avg. (of 8) = 0.869531 fft 2: mflops = 29.1271 (norm. = 0.672348), norm. avg. (of 8) = 0.63638 fft 3: mflops = 9.68865 (norm. = 0.223646), norm. avg. (of 9) = 0.364112 fft 4: mflops = 9.50697 (norm. = 0.219452), norm. avg. (of 9) = 0.300711 fft 5: mflops = 29.6132 (norm. = 0.683569), norm. avg. (of 9) = 0.563407 fft 6: mflops = 29.5185 (norm. = 0.681382), norm. avg. (of 9) = 0.546882 fft 7: mflops = 13.7806 (norm. = 0.3181), norm. avg. (of 9) = 0.420611 fft 8: mflops = 13.3461 (norm. = 0.308071), norm. avg. (of 9) = 0.444787 fft 9: mflops = -1 (norm. = -0.0230833), norm. avg. 
(of 7) = 0.515776 fft 10: mflops = -1 (norm. = -0.0230833), norm. avg. (of 7) = 0.553152 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) 120x120x120 (26.3728 MB) 144x144x144 (45.5692 MB) 180x180x180 (88.9976 MB) Maximum array size N = 5832000 Benchmarking FFTs: 0. FFTW 1. PDA 2. PDA (f2c) 3. Singleton 4. Singleton (f2c) 5. Temperton 6. Temperton (f2c) Computing normalized averages (7 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1 s, 16384 iters, t-(init.)=0.95 s t(norm)=0.0665922, mflops=75.0838 (err=2.9e-16) 1. PDA: elapsed time t=1.67 s, 8192 iters, t-(init.)=1.65 s t(norm)=0.23132, mflops=21.615 (err=2.5e-16) 2. PDA (f2c): elapsed time t=1.01 s, 4096 iters, t-(init.)=1 s t(norm)=0.280388, mflops=17.8324 (err=2.4e-16) 3. Singleton: elapsed time t=1.48 s, 16384 iters, t-(init.)=1.43 s t(norm)=0.100239, mflops=49.8809 (err=3.2e-16) 4. Singleton (f2c): elapsed time t=1.18 s, 16384 iters, t-(init.)=1.13 s t(norm)=0.0792097, mflops=63.1236 (err=2.6e-16) 5. Temperton: elapsed time t=1.02 s, 16384 iters, t-(init.)=0.97 s t(norm)=0.0679942, mflops=73.5357 (err=3.8e-08) 6. Temperton (f2c): elapsed time t=1.49 s, 16384 iters, t-(init.)=1.44 s t(norm)=0.10094, mflops=49.5345 (err=1.7e-16) Top mflops for N=125 = 75.0838 Normalized results and averages for N=125: fft 0: mflops = 75.0838 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = 21.615 (norm. = 0.287879), norm. avg. (of 1) = 0.287879 fft 2: mflops = 17.8324 (norm. 
= 0.2375), norm. avg. (of 1) = 0.2375 fft 3: mflops = 49.8809 (norm. = 0.664336), norm. avg. (of 1) = 0.664336 fft 4: mflops = 63.1236 (norm. = 0.840708), norm. avg. (of 1) = 0.840708 fft 5: mflops = 73.5357 (norm. = 0.979381), norm. avg. (of 1) = 0.979381 fft 6: mflops = 49.5345 (norm. = 0.659722), norm. avg. (of 1) = 0.659722 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.41 s, 16384 iters, t-(init.)=1.33 s t(norm)=0.0484621, mflops=103.173 (err=2.3e-16) 1. PDA: elapsed time t=1.37 s, 4096 iters, t-(init.)=1.35 s t(norm)=0.196764, mflops=25.4112 (err=3.7e-16) 2. PDA (f2c): elapsed time t=1.69 s, 4096 iters, t-(init.)=1.67 s t(norm)=0.243404, mflops=20.542 (err=3.8e-16) 3. Singleton: elapsed time t=1.69 s, 8192 iters, t-(init.)=1.64 s t(norm)=0.119516, mflops=41.8355 (err=3.0e-16) 4. Singleton (f2c): elapsed time t=1.22 s, 8192 iters, t-(init.)=1.18 s t(norm)=0.0859929, mflops=58.1443 (err=3.0e-16) 5. Temperton: elapsed time t=1.01 s, 8192 iters, t-(init.)=0.97 s t(norm)=0.0706891, mflops=70.7322 (err=1.7e-08) 6. Temperton (f2c): elapsed time t=1.79 s, 8192 iters, t-(init.)=1.75 s t(norm)=0.127532, mflops=39.2059 (err=2.9e-16) Top mflops for N=216 = 103.173 Normalized results and averages for N=216: fft 0: mflops = 103.173 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 25.4112 (norm. = 0.246296), norm. avg. (of 2) = 0.267088 fft 2: mflops = 20.542 (norm. = 0.199102), norm. avg. (of 2) = 0.218301 fft 3: mflops = 41.8355 (norm. = 0.405488), norm. avg. (of 2) = 0.534912 fft 4: mflops = 58.1443 (norm. = 0.563559), norm. avg. (of 2) = 0.702134 fft 5: mflops = 70.7322 (norm. = 0.685567), norm. avg. (of 2) = 0.832474 fft 6: mflops = 39.2059 (norm. = 0.38), norm. avg. (of 2) = 0.519861 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.48 s, 8192 iters, t-(init.)=1.42 s t(norm)=0.0600048, mflops=83.3267 (err=2.1e-16) 1. PDA: elapsed time t=1.96 s, 2048 iters, t-(init.)=1.94 s t(norm)=0.327913, mflops=15.2479 (err=3.7e-16) 2. 
PDA (f2c): elapsed time t=1.15 s, 1024 iters, t-(init.)=1.15 s t(norm)=0.388763, mflops=12.8613 (err=3.4e-16) 3. Singleton: elapsed time t=1.09 s, 4096 iters, t-(init.)=1.05 s t(norm)=0.0887394, mflops=56.3447 (err=4.2e-16) 4. Singleton (f2c): elapsed time t=1.84 s, 4096 iters, t-(init.)=1.81 s t(norm)=0.15297, mflops=32.6862 (err=4.2e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 83.3267 Normalized results and averages for N=343: fft 0: mflops = 83.3267 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 15.2479 (norm. = 0.18299), norm. avg. (of 3) = 0.239055 fft 2: mflops = 12.8613 (norm. = 0.154348), norm. avg. (of 3) = 0.196983 fft 3: mflops = 56.3447 (norm. = 0.67619), norm. avg. (of 3) = 0.582005 fft 4: mflops = 32.6862 (norm. = 0.392265), norm. avg. (of 3) = 0.598844 fft 5: mflops = -1 (norm. = -0.012001), norm. avg. (of 2) = 0.832474 fft 6: mflops = -1 (norm. = -0.012001), norm. avg. (of 2) = 0.519861 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.34 s, 4096 iters, t-(init.)=1.28 s t(norm)=0.0450767, mflops=110.922 (err=4.1e-16) 1. PDA: elapsed time t=1.02 s, 1024 iters, t-(init.)=1 s t(norm)=0.140865, mflops=35.495 (err=3.2e-16) 2. PDA (f2c): elapsed time t=1.32 s, 1024 iters, t-(init.)=1.3 s t(norm)=0.183124, mflops=27.3039 (err=3.6e-16) 3. Singleton: elapsed time t=1.75 s, 2048 iters, t-(init.)=1.71 s t(norm)=0.120439, mflops=41.5147 (err=3.6e-16) 4. Singleton (f2c): elapsed time t=1.18 s, 2048 iters, t-(init.)=1.14 s t(norm)=0.0802929, mflops=62.272 (err=3.6e-16) 5. Temperton: elapsed time t=1.02 s, 2048 iters, t-(init.)=0.99 s t(norm)=0.069728, mflops=71.7072 (err=5.3e-08) 6. Temperton (f2c): elapsed time t=1.48 s, 2048 iters, t-(init.)=1.45 s t(norm)=0.102127, mflops=48.9587 (err=3.5e-16) Top mflops for N=729 = 110.922 Normalized results and averages for N=729: fft 0: mflops = 110.922 (norm. = 1), norm. avg. 
(of 4) = 1 fft 1: mflops = 35.495 (norm. = 0.32), norm. avg. (of 4) = 0.259291 fft 2: mflops = 27.3039 (norm. = 0.246154), norm. avg. (of 4) = 0.209276 fft 3: mflops = 41.5147 (norm. = 0.374269), norm. avg. (of 4) = 0.530071 fft 4: mflops = 62.272 (norm. = 0.561404), norm. avg. (of 4) = 0.589484 fft 5: mflops = 71.7072 (norm. = 0.646465), norm. avg. (of 3) = 0.770471 fft 6: mflops = 48.9587 (norm. = 0.441379), norm. avg. (of 3) = 0.493701 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.03 s, 2048 iters, t-(init.)=0.97 s t(norm)=0.0475259, mflops=105.206 (err=2.6e-16) 1. PDA: elapsed time t=1.52 s, 1024 iters, t-(init.)=1.49 s t(norm)=0.146007, mflops=34.2448 (err=2.8e-16) 2. PDA (f2c): elapsed time t=1.82 s, 1024 iters, t-(init.)=1.79 s t(norm)=0.175405, mflops=28.5055 (err=3.1e-16) 3. Singleton: elapsed time t=1.27 s, 1024 iters, t-(init.)=1.24 s t(norm)=0.12151, mflops=41.149 (err=3.8e-16) 4. Singleton (f2c): elapsed time t=1.02 s, 1024 iters, t-(init.)=1 s t(norm)=0.0979915, mflops=51.0248 (err=3.7e-16) 5. Temperton: elapsed time t=1.08 s, 2048 iters, t-(init.)=1.03 s t(norm)=0.0504656, mflops=99.0773 (err=2.5e-08) 6. Temperton (f2c): elapsed time t=1.09 s, 1024 iters, t-(init.)=1.06 s t(norm)=0.103871, mflops=48.1366 (err=2.8e-16) Top mflops for N=1000 = 105.206 Normalized results and averages for N=1000: fft 0: mflops = 105.206 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 34.2448 (norm. = 0.325503), norm. avg. (of 5) = 0.272534 fft 2: mflops = 28.5055 (norm. = 0.27095), norm. avg. (of 5) = 0.221611 fft 3: mflops = 41.149 (norm. = 0.391129), norm. avg. (of 5) = 0.502282 fft 4: mflops = 51.0248 (norm. = 0.485), norm. avg. (of 5) = 0.568587 fft 5: mflops = 99.0773 (norm. = 0.941748), norm. avg. (of 4) = 0.81329 fft 6: mflops = 48.1366 (norm. = 0.457547), norm. avg. (of 4) = 0.484662 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.35 s, 1024 iters, t-(init.)=1.25 s t(norm)=0.0883702, mflops=56.5801 (err=2.4e-16) 1. 
PDA: elapsed time t=1.13 s, 256 iters, t-(init.)=1.11 s t(norm)=0.313891, mflops=15.9291 (err=4.2e-16) 2. PDA (f2c): elapsed time t=1.34 s, 256 iters, t-(init.)=1.31 s t(norm)=0.370448, mflops=13.4972 (err=4.3e-16) 3. Singleton: elapsed time t=1.6 s, 1024 iters, t-(init.)=1.5 s t(norm)=0.106044, mflops=47.1501 (err=3.8e-16) 4. Singleton (f2c): elapsed time t=1.4 s, 512 iters, t-(init.)=1.35 s t(norm)=0.19088, mflops=26.1945 (err=3.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 56.5801 Normalized results and averages for N=1331: fft 0: mflops = 56.5801 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 15.9291 (norm. = 0.281532), norm. avg. (of 6) = 0.274033 fft 2: mflops = 13.4972 (norm. = 0.23855), norm. avg. (of 6) = 0.224434 fft 3: mflops = 47.1501 (norm. = 0.833333), norm. avg. (of 6) = 0.557458 fft 4: mflops = 26.1945 (norm. = 0.462963), norm. avg. (of 6) = 0.550983 fft 5: mflops = -1 (norm. = -0.017674), norm. avg. (of 4) = 0.81329 fft 6: mflops = -1 (norm. = -0.017674), norm. avg. (of 4) = 0.484662 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.99 s, 2048 iters, t-(init.)=1.73 s t(norm)=0.0454534, mflops=110.003 (err=3.3e-16) 1. PDA: elapsed time t=1.2 s, 512 iters, t-(init.)=1.14 s t(norm)=0.119808, mflops=41.7335 (err=3.1e-16) 2. PDA (f2c): elapsed time t=1.5 s, 512 iters, t-(init.)=1.44 s t(norm)=0.151336, mflops=33.039 (err=3.5e-16) 3. Singleton: elapsed time t=1.37 s, 512 iters, t-(init.)=1.31 s t(norm)=0.137674, mflops=36.3177 (err=3.7e-16) 4. Singleton (f2c): elapsed time t=1.22 s, 512 iters, t-(init.)=1.16 s t(norm)=0.12191, mflops=41.0139 (err=3.7e-16) 5. Temperton: elapsed time t=1.38 s, 1024 iters, t-(init.)=1.25 s t(norm)=0.0656841, mflops=76.1219 (err=1.6e-08) 6. 
Temperton (f2c): elapsed time t=1.07 s, 512 iters, t-(init.)=1 s t(norm)=0.105095, mflops=47.5762 (err=3.9e-16) Top mflops for N=1728 = 110.003 Normalized results and averages for N=1728: fft 0: mflops = 110.003 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 41.7335 (norm. = 0.379386), norm. avg. (of 7) = 0.289084 fft 2: mflops = 33.039 (norm. = 0.300347), norm. avg. (of 7) = 0.235279 fft 3: mflops = 36.3177 (norm. = 0.330153), norm. avg. (of 7) = 0.524985 fft 4: mflops = 41.0139 (norm. = 0.372845), norm. avg. (of 7) = 0.525535 fft 5: mflops = 76.1219 (norm. = 0.692), norm. avg. (of 5) = 0.789032 fft 6: mflops = 47.5762 (norm. = 0.4325), norm. avg. (of 5) = 0.47423 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.4 s, 512 iters, t-(init.)=1.31 s t(norm)=0.104905, mflops=47.6621 (err=2.3e-16) 1. PDA: elapsed time t=1.04 s, 128 iters, t-(init.)=1.02 s t(norm)=0.326727, mflops=15.3033 (err=8.7e-16) 2. PDA (f2c): elapsed time t=1.24 s, 128 iters, t-(init.)=1.22 s t(norm)=0.390792, mflops=12.7945 (err=8.4e-16) 3. Singleton: elapsed time t=1.62 s, 512 iters, t-(init.)=1.54 s t(norm)=0.123324, mflops=40.5437 (err=5.3e-16) 4. Singleton (f2c): elapsed time t=1.41 s, 256 iters, t-(init.)=1.37 s t(norm)=0.21942, mflops=22.7874 (err=5.3e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 47.6621 Normalized results and averages for N=2197: fft 0: mflops = 47.6621 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 15.3033 (norm. = 0.321078), norm. avg. (of 8) = 0.293083 fft 2: mflops = 12.7945 (norm. = 0.268443), norm. avg. (of 8) = 0.239424 fft 3: mflops = 40.5437 (norm. = 0.850649), norm. avg. (of 8) = 0.565693 fft 4: mflops = 22.7874 (norm. = 0.478102), norm. avg. (of 8) = 0.519606 fft 5: mflops = -1 (norm. = -0.020981), norm. avg. (of 5) = 0.789032 fft 6: mflops = -1 (norm. = -0.020981), norm. avg. (of 5) = 0.47423 Benchmarking for array size = 14x14x14: 0. 
FFTW: elapsed time t=1.18 s, 256 iters, t-(init.)=1.13 s t(norm)=0.140835, mflops=35.5026 (err=2.6e-16) 1. PDA: elapsed time t=1.72 s, 256 iters, t-(init.)=1.67 s t(norm)=0.208136, mflops=24.0227 (err=3.7e-16) 2. PDA (f2c): elapsed time t=1.02 s, 128 iters, t-(init.)=0.99 s t(norm)=0.246772, mflops=20.2616 (err=3.6e-16) 3. Singleton: elapsed time t=1.13 s, 256 iters, t-(init.)=1.08 s t(norm)=0.134603, mflops=37.1462 (err=4.1e-16) 4. Singleton (f2c): elapsed time t=1.42 s, 256 iters, t-(init.)=1.36 s t(norm)=0.1695, mflops=29.4985 (err=4.1e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 37.1462 Normalized results and averages for N=2744: fft 0: mflops = 35.5026 (norm. = 0.955752), norm. avg. (of 9) = 0.995084 fft 1: mflops = 24.0227 (norm. = 0.646707), norm. avg. (of 9) = 0.332375 fft 2: mflops = 20.2616 (norm. = 0.545455), norm. avg. (of 9) = 0.273427 fft 3: mflops = 37.1462 (norm. = 1), norm. avg. (of 9) = 0.61395 fft 4: mflops = 29.4985 (norm. = 0.794118), norm. avg. (of 9) = 0.550107 fft 5: mflops = -1 (norm. = -0.0269206), norm. avg. (of 5) = 0.789032 fft 6: mflops = -1 (norm. = -0.0269206), norm. avg. (of 5) = 0.47423 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.37 s, 512 iters, t-(init.)=1.25 s t(norm)=0.0617183, mflops=81.0133 (err=3.7e-16) 1. PDA: elapsed time t=1.29 s, 256 iters, t-(init.)=1.22 s t(norm)=0.120474, mflops=41.5027 (err=3.5e-16) 2. PDA (f2c): elapsed time t=1.6 s, 256 iters, t-(init.)=1.54 s t(norm)=0.152074, mflops=32.8788 (err=3.4e-16) 3. Singleton: elapsed time t=1.8 s, 256 iters, t-(init.)=1.74 s t(norm)=0.171824, mflops=29.0996 (err=4.2e-16) 4. Singleton (f2c): elapsed time t=1.49 s, 256 iters, t-(init.)=1.43 s t(norm)=0.141211, mflops=35.4079 (err=4.1e-16) 5. Temperton: elapsed time t=1.63 s, 512 iters, t-(init.)=1.5 s t(norm)=0.0740619, mflops=67.5111 (err=1.8e-08) 6. 
Temperton (f2c): elapsed time t=1.05 s, 256 iters, t-(init.)=0.99 s t(norm)=0.0977617, mflops=51.1447 (err=3.8e-16) Top mflops for N=3375 = 81.0133 Normalized results and averages for N=3375: fft 0: mflops = 81.0133 (norm. = 1), norm. avg. (of 10) = 0.995575 fft 1: mflops = 41.5027 (norm. = 0.512295), norm. avg. (of 10) = 0.350367 fft 2: mflops = 32.8788 (norm. = 0.405844), norm. avg. (of 10) = 0.286669 fft 3: mflops = 29.0996 (norm. = 0.359195), norm. avg. (of 10) = 0.588474 fft 4: mflops = 35.4079 (norm. = 0.437063), norm. avg. (of 10) = 0.538803 fft 5: mflops = 67.5111 (norm. = 0.833333), norm. avg. (of 6) = 0.796416 fft 6: mflops = 51.1447 (norm. = 0.631313), norm. avg. (of 6) = 0.50041 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.98 s, 64 iters, t-(init.)=1.9 s t(norm)=0.125897, mflops=39.715 (err=3.3e-16) 1. PDA: elapsed time t=1.04 s, 32 iters, t-(init.)=1 s t(norm)=0.132523, mflops=37.7292 (err=3.8e-16) 2. PDA (f2c): elapsed time t=1.24 s, 32 iters, t-(init.)=1.2 s t(norm)=0.159028, mflops=31.441 (err=3.8e-16) 3. Singleton: elapsed time t=1.34 s, 32 iters, t-(init.)=1.3 s t(norm)=0.17228, mflops=29.0225 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.35 s, 32 iters, t-(init.)=1.31 s t(norm)=0.173605, mflops=28.8009 (err=4.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 39.715 Normalized results and averages for N=16800: fft 0: mflops = 39.715 (norm. = 1), norm. avg. (of 11) = 0.995977 fft 1: mflops = 37.7292 (norm. = 0.95), norm. avg. (of 11) = 0.404879 fft 2: mflops = 31.441 (norm. = 0.791667), norm. avg. (of 11) = 0.332578 fft 3: mflops = 29.0225 (norm. = 0.730769), norm. avg. (of 11) = 0.60141 fft 4: mflops = 28.8009 (norm. = 0.725191), norm. avg. (of 11) = 0.555747 fft 5: mflops = -1 (norm. = -0.0251794), norm. avg. (of 6) = 0.796416 fft 6: mflops = -1 (norm. = -0.0251794), norm. avg. 
(of 6) = 0.50041 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.65 s, 8 iters, t-(init.)=1.53 s t(norm)=0.103213, mflops=48.4433 (err=4.0e-16) 1. PDA: elapsed time t=1.02 s, 4 iters, t-(init.)=0.96 s t(norm)=0.129523, mflops=38.6033 (err=4.0e-16) 2. PDA (f2c): elapsed time t=1.19 s, 4 iters, t-(init.)=1.13 s t(norm)=0.152459, mflops=32.7957 (err=4.2e-16) 3. Singleton: elapsed time t=1.25 s, 2 iters, t-(init.)=1.22 s t(norm)=0.329204, mflops=15.1882 (err=3.7e-16) 4. Singleton (f2c): elapsed time t=1.06 s, 2 iters, t-(init.)=1.03 s t(norm)=0.277934, mflops=17.9899 (err=3.6e-16) 5. Temperton: elapsed time t=1 s, 4 iters, t-(init.)=0.94 s t(norm)=0.126824, mflops=39.4246 (err=1.1e-07) 6. Temperton (f2c): elapsed time t=1.28 s, 4 iters, t-(init.)=1.22 s t(norm)=0.164602, mflops=30.3763 (err=4.8e-16) Top mflops for N=110592 = 48.4433 Normalized results and averages for N=110592: fft 0: mflops = 48.4433 (norm. = 1), norm. avg. (of 12) = 0.996313 fft 1: mflops = 38.6033 (norm. = 0.796875), norm. avg. (of 12) = 0.437545 fft 2: mflops = 32.7957 (norm. = 0.676991), norm. avg. (of 12) = 0.361279 fft 3: mflops = 15.1882 (norm. = 0.313525), norm. avg. (of 12) = 0.57742 fft 4: mflops = 17.9899 (norm. = 0.371359), norm. avg. (of 12) = 0.540381 fft 5: mflops = 39.4246 (norm. = 0.81383), norm. avg. (of 7) = 0.798903 fft 6: mflops = 30.3763 (norm. = 0.627049), norm. avg. (of 7) = 0.518502 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.14 s, 4 iters, t-(init.)=1.07 s t(norm)=0.134985, mflops=37.041 (err=4.3e-16) 1. PDA: elapsed time t=1.79 s, 4 iters, t-(init.)=1.72 s t(norm)=0.216986, mflops=23.043 (err=5.7e-16) 2. PDA (f2c): elapsed time t=1.09 s, 2 iters, t-(init.)=1.05 s t(norm)=0.264925, mflops=18.8733 (err=5.7e-16) 3. Singleton: elapsed time t=1.01 s, 2 iters, t-(init.)=0.97 s t(norm)=0.24474, mflops=20.4298 (err=6.8e-16) 4. Singleton (f2c): elapsed time t=1.19 s, 2 iters, t-(init.)=1.16 s t(norm)=0.292679, mflops=17.0836 (err=6.8e-16) 5. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 37.041 Normalized results and averages for N=117649: fft 0: mflops = 37.041 (norm. = 1), norm. avg. (of 13) = 0.996596 fft 1: mflops = 23.043 (norm. = 0.622093), norm. avg. (of 13) = 0.451741 fft 2: mflops = 18.8733 (norm. = 0.509524), norm. avg. (of 13) = 0.372683 fft 3: mflops = 20.4298 (norm. = 0.551546), norm. avg. (of 13) = 0.575429 fft 4: mflops = 17.0836 (norm. = 0.461207), norm. avg. (of 13) = 0.534291 fft 5: mflops = -1 (norm. = -0.0269971), norm. avg. (of 7) = 0.798903 fft 6: mflops = -1 (norm. = -0.0269971), norm. avg. (of 7) = 0.518502 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.9 s, 4 iters, t-(init.)=1.78 s t(norm)=0.116259, mflops=43.0075 (err=4.8e-16) 1. PDA: elapsed time t=1.07 s, 2 iters, t-(init.)=1.01 s t(norm)=0.131934, mflops=37.8977 (err=4.4e-16) 2. PDA (f2c): elapsed time t=1.27 s, 2 iters, t-(init.)=1.21 s t(norm)=0.15806, mflops=31.6336 (err=4.4e-16) 3. Singleton: elapsed time t=1.35 s, 1 iters, t-(init.)=1.32 s t(norm)=0.344858, mflops=14.4987 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.24 s, 1 iters, t-(init.)=1.21 s t(norm)=0.31612, mflops=15.8168 (err=5.4e-16) 5. Temperton: elapsed time t=1.9 s, 4 iters, t-(init.)=1.78 s t(norm)=0.116259, mflops=43.0075 (err=1.9e-08) 6. Temperton (f2c): elapsed time t=1.22 s, 2 iters, t-(init.)=1.16 s t(norm)=0.151528, mflops=32.9971 (err=5.4e-16) Top mflops for N=216000 = 43.0075 Normalized results and averages for N=216000: fft 0: mflops = 43.0075 (norm. = 1), norm. avg. (of 14) = 0.996839 fft 1: mflops = 37.8977 (norm. = 0.881188), norm. avg. (of 14) = 0.482416 fft 2: mflops = 31.6336 (norm. = 0.735537), norm. avg. (of 14) = 0.398601 fft 3: mflops = 14.4987 (norm. = 0.337121), norm. avg. (of 14) = 0.558407 fft 4: mflops = 15.8168 (norm. = 0.367769), norm. avg. (of 14) = 0.522397 fft 5: mflops = 43.0075 (norm. = 1), norm. avg. 
(of 8) = 0.82404 fft 6: mflops = 32.9971 (norm. = 0.767241), norm. avg. (of 8) = 0.549594 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.08 s, 2 iters, t-(init.)=1.02 s t(norm)=0.117877, mflops=42.417 (err=4.0e-16) 1. PDA: elapsed time t=1.3 s, 2 iters, t-(init.)=1.23 s t(norm)=0.142146, mflops=35.1751 (err=4.7e-16) 2. PDA (f2c): elapsed time t=1.53 s, 2 iters, t-(init.)=1.46 s t(norm)=0.168726, mflops=29.6338 (err=4.7e-16) 3. Singleton: elapsed time t=1.62 s, 1 iters, t-(init.)=1.58 s t(norm)=0.365188, mflops=13.6916 (err=5.2e-16) 4. Singleton (f2c): elapsed time t=1.54 s, 1 iters, t-(init.)=1.51 s t(norm)=0.349009, mflops=14.3263 (err=5.2e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 42.417 Normalized results and averages for N=241920: fft 0: mflops = 42.417 (norm. = 1), norm. avg. (of 15) = 0.99705 fft 1: mflops = 35.1751 (norm. = 0.829268), norm. avg. (of 15) = 0.505539 fft 2: mflops = 29.6338 (norm. = 0.69863), norm. avg. (of 15) = 0.418603 fft 3: mflops = 13.6916 (norm. = 0.322785), norm. avg. (of 15) = 0.542699 fft 4: mflops = 14.3263 (norm. = 0.337748), norm. avg. (of 15) = 0.510087 fft 5: mflops = -1 (norm. = -0.0235754), norm. avg. (of 8) = 0.82404 fft 6: mflops = -1 (norm. = -0.0235754), norm. avg. (of 8) = 0.549594 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.11 s, 1 iters, t-(init.)=1.05 s t(norm)=0.133192, mflops=37.5398 (err=5.2e-16) 1. PDA: elapsed time t=1.17 s, 1 iters, t-(init.)=1.11 s t(norm)=0.140803, mflops=35.5106 (err=5.8e-16) 2. PDA (f2c): elapsed time t=1.25 s, 1 iters, t-(init.)=1.2 s t(norm)=0.15222, mflops=32.8473 (err=5.8e-16) 3. Singleton: elapsed time t=2.45 s, 1 iters, t-(init.)=2.39 s t(norm)=0.303171, mflops=16.4924 (err=6.6e-16) 4. Singleton (f2c): elapsed time t=2.29 s, 1 iters, t-(init.)=2.24 s t(norm)=0.284143, mflops=17.5968 (err=6.6e-16) 5. 
Temperton: elapsed time t=1.73 s, 2 iters, t-(init.)=1.62 s t(norm)=0.102748, mflops=48.6626 (err=1.4e-07) 6. Temperton (f2c): elapsed time t=1.1 s, 1 iters, t-(init.)=1.05 s t(norm)=0.133192, mflops=37.5398 (err=7.0e-16) Top mflops for N=421875 = 48.6626 Normalized results and averages for N=421875: fft 0: mflops = 37.5398 (norm. = 0.771429), norm. avg. (of 16) = 0.982949 fft 1: mflops = 35.5106 (norm. = 0.72973), norm. avg. (of 16) = 0.519551 fft 2: mflops = 32.8473 (norm. = 0.675), norm. avg. (of 16) = 0.434628 fft 3: mflops = 16.4924 (norm. = 0.338912), norm. avg. (of 16) = 0.529963 fft 4: mflops = 17.5968 (norm. = 0.361607), norm. avg. (of 16) = 0.500807 fft 5: mflops = 48.6626 (norm. = 1), norm. avg. (of 9) = 0.843592 fft 6: mflops = 37.5398 (norm. = 0.771429), norm. avg. (of 9) = 0.574242 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.21 s, 1 iters, t-(init.)=1.14 s t(norm)=0.117399, mflops=42.5898 (err=3.7e-16) 1. PDA: elapsed time t=1.4 s, 1 iters, t-(init.)=1.33 s t(norm)=0.136965, mflops=36.5056 (err=3.8e-16) 2. PDA (f2c): elapsed time t=1.58 s, 1 iters, t-(init.)=1.51 s t(norm)=0.155502, mflops=32.1539 (err=3.7e-16) 3. Singleton: elapsed time t=3.41 s, 1 iters, t-(init.)=3.34 s t(norm)=0.343958, mflops=14.5366 (err=4.8e-16) 4. Singleton (f2c): elapsed time t=2.93 s, 1 iters, t-(init.)=2.86 s t(norm)=0.294527, mflops=16.9764 (err=4.7e-16) 5. Temperton: elapsed time t=1.34 s, 1 iters, t-(init.)=1.27 s t(norm)=0.130787, mflops=38.2302 (err=1.7e-07) 6. Temperton (f2c): elapsed time t=1.72 s, 1 iters, t-(init.)=1.65 s t(norm)=0.169919, mflops=29.4257 (err=5.2e-16) Top mflops for N=512000 = 42.5898 Normalized results and averages for N=512000: fft 0: mflops = 42.5898 (norm. = 1), norm. avg. (of 17) = 0.983952 fft 1: mflops = 36.5056 (norm. = 0.857143), norm. avg. (of 17) = 0.53941 fft 2: mflops = 32.1539 (norm. = 0.754967), norm. avg. (of 17) = 0.453471 fft 3: mflops = 14.5366 (norm. = 0.341317), norm. avg. 
(of 17) = 0.518866 fft 4: mflops = 16.9764 (norm. = 0.398601), norm. avg. (of 17) = 0.494795 fft 5: mflops = 38.2302 (norm. = 0.897638), norm. avg. (of 10) = 0.848996 fft 6: mflops = 29.4257 (norm. = 0.690909), norm. avg. (of 10) = 0.585909 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.59 s, 1 iters, t-(init.)=1.51 s t(norm)=0.132849, mflops=37.6366 (err=5.5e-16) 1. PDA: elapsed time t=1.89 s, 1 iters, t-(init.)=1.81 s t(norm)=0.159243, mflops=31.3985 (err=4.7e-16) 2. PDA (f2c): elapsed time t=2.32 s, 1 iters, t-(init.)=2.24 s t(norm)=0.197075, mflops=25.3711 (err=5.0e-16) 3. Singleton: elapsed time t=4.21 s, 1 iters, t-(init.)=4.13 s t(norm)=0.363356, mflops=13.7606 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=4.27 s, 1 iters, t-(init.)=4.19 s t(norm)=0.368635, mflops=13.5636 (err=5.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 37.6366 Normalized results and averages for N=592704: fft 0: mflops = 37.6366 (norm. = 1), norm. avg. (of 18) = 0.984843 fft 1: mflops = 31.3985 (norm. = 0.834254), norm. avg. (of 18) = 0.55579 fft 2: mflops = 25.3711 (norm. = 0.674107), norm. avg. (of 18) = 0.465729 fft 3: mflops = 13.7606 (norm. = 0.365617), norm. avg. (of 18) = 0.510352 fft 4: mflops = 13.5636 (norm. = 0.360382), norm. avg. (of 18) = 0.487327 fft 5: mflops = -1 (norm. = -0.0265699), norm. avg. (of 10) = 0.848996 fft 6: mflops = -1 (norm. = -0.0265699), norm. avg. (of 10) = 0.585909 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=2.15 s, 1 iters, t-(init.)=2.03 s t(norm)=0.116147, mflops=43.0489 (err=4.8e-16) 1. PDA: elapsed time t=2.69 s, 1 iters, t-(init.)=2.56 s t(norm)=0.146471, mflops=34.1364 (err=4.9e-16) 2. PDA (f2c): elapsed time t=2.96 s, 1 iters, t-(init.)=2.83 s t(norm)=0.161919, mflops=30.8796 (err=4.8e-16) 3. Singleton: elapsed time t=7.98 s, 1 iters, t-(init.)=7.85 s t(norm)=0.44914, mflops=11.1324 (err=5.5e-16) 4. 
Singleton (f2c): elapsed time t=7.16 s, 1 iters, t-(init.)=7.03 s t(norm)=0.402223, mflops=12.4309 (err=5.5e-16) 5. Temperton: elapsed time t=2.87 s, 1 iters, t-(init.)=2.75 s t(norm)=0.157342, mflops=31.7779 (err=1.6e-07) 6. Temperton (f2c): elapsed time t=3.59 s, 1 iters, t-(init.)=3.47 s t(norm)=0.198537, mflops=25.1842 (err=5.2e-16) Top mflops for N=884736 = 43.0489 Normalized results and averages for N=884736: fft 0: mflops = 43.0489 (norm. = 1), norm. avg. (of 19) = 0.985641 fft 1: mflops = 34.1364 (norm. = 0.792969), norm. avg. (of 19) = 0.568273 fft 2: mflops = 30.8796 (norm. = 0.717314), norm. avg. (of 19) = 0.47897 fft 3: mflops = 11.1324 (norm. = 0.258599), norm. avg. (of 19) = 0.497102 fft 4: mflops = 12.4309 (norm. = 0.288762), norm. avg. (of 19) = 0.476876 fft 5: mflops = 31.7779 (norm. = 0.738182), norm. avg. (of 11) = 0.838922 fft 6: mflops = 25.1842 (norm. = 0.585014), norm. avg. (of 11) = 0.585828 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=2.94 s, 1 iters, t-(init.)=2.77 s t(norm)=0.118794, mflops=42.0898 (err=4.8e-16) 1. PDA: elapsed time t=4.07 s, 1 iters, t-(init.)=3.9 s t(norm)=0.167255, mflops=29.8945 (err=5.5e-16) 2. PDA (f2c): elapsed time t=4.69 s, 1 iters, t-(init.)=4.53 s t(norm)=0.194273, mflops=25.737 (err=5.5e-16) 3. Singleton: elapsed time t=7.49 s, 1 iters, t-(init.)=7.32 s t(norm)=0.313924, mflops=15.9274 (err=6.5e-16) 4. Singleton (f2c): elapsed time t=7.58 s, 1 iters, t-(init.)=7.42 s t(norm)=0.318213, mflops=15.7128 (err=6.5e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 42.0898 Normalized results and averages for N=1157625: fft 0: mflops = 42.0898 (norm. = 1), norm. avg. (of 20) = 0.986359 fft 1: mflops = 29.8945 (norm. = 0.710256), norm. avg. (of 20) = 0.575372 fft 2: mflops = 25.737 (norm. = 0.611479), norm. avg. (of 20) = 0.485595 fft 3: mflops = 15.9274 (norm. = 0.378415), norm. avg. 
(of 20) = 0.491167 fft 4: mflops = 15.7128 (norm. = 0.373315), norm. avg. (of 20) = 0.471698 fft 5: mflops = -1 (norm. = -0.0237587), norm. avg. (of 11) = 0.838922 fft 6: mflops = -1 (norm. = -0.0237587), norm. avg. (of 11) = 0.585828 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=3.55 s, 1 iters, t-(init.)=3.35 s t(norm)=0.116759, mflops=42.8232 (err=5.3e-16) 1. PDA: elapsed time t=4.88 s, 1 iters, t-(init.)=4.68 s t(norm)=0.163114, mflops=30.6533 (err=5.6e-16) 2. PDA (f2c): elapsed time t=5.76 s, 1 iters, t-(init.)=5.55 s t(norm)=0.193437, mflops=25.8482 (err=5.7e-16) 3. Singleton: elapsed time t=9.88 s, 1 iters, t-(init.)=9.68 s t(norm)=0.337382, mflops=14.82 (err=6.3e-16) 4. Singleton (f2c): elapsed time t=9.48 s, 1 iters, t-(init.)=9.28 s t(norm)=0.32344, mflops=15.4588 (err=6.3e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 42.8232 Normalized results and averages for N=1404928: fft 0: mflops = 42.8232 (norm. = 1), norm. avg. (of 21) = 0.987009 fft 1: mflops = 30.6533 (norm. = 0.715812), norm. avg. (of 21) = 0.58206 fft 2: mflops = 25.8482 (norm. = 0.603604), norm. avg. (of 21) = 0.491215 fft 3: mflops = 14.82 (norm. = 0.346074), norm. avg. (of 21) = 0.484258 fft 4: mflops = 15.4588 (norm. = 0.360991), norm. avg. (of 21) = 0.466427 fft 5: mflops = -1 (norm. = -0.0233518), norm. avg. (of 11) = 0.838922 fft 6: mflops = -1 (norm. = -0.0233518), norm. avg. (of 11) = 0.585828 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=4.43 s, 1 iters, t-(init.)=4.18 s t(norm)=0.116742, mflops=42.8293 (err=4.6e-16) 1. PDA: elapsed time t=4.78 s, 1 iters, t-(init.)=4.53 s t(norm)=0.126518, mflops=39.5202 (err=4.8e-16) 2. PDA (f2c): elapsed time t=5.44 s, 1 iters, t-(init.)=5.19 s t(norm)=0.144951, mflops=34.4945 (err=4.9e-16) 3. Singleton: elapsed time t=15.57 s, 1 iters, t-(init.)=15.33 s t(norm)=0.428149, mflops=11.6782 (err=5.6e-16) 4. 
Singleton (f2c): elapsed time t=14.49 s, 1 iters, t-(init.)=14.24 s t(norm)=0.397706, mflops=12.5721 (err=5.6e-16) 5. Temperton: elapsed time t=5.12 s, 1 iters, t-(init.)=4.88 s t(norm)=0.136293, mflops=36.6858 (err=2.0e-08) 6. Temperton (f2c): elapsed time t=6.06 s, 1 iters, t-(init.)=5.82 s t(norm)=0.162546, mflops=30.7606 (err=5.7e-16) Top mflops for N=1728000 = 42.8293 Normalized results and averages for N=1728000: fft 0: mflops = 42.8293 (norm. = 1), norm. avg. (of 22) = 0.987599 fft 1: mflops = 39.5202 (norm. = 0.922737), norm. avg. (of 22) = 0.597545 fft 2: mflops = 34.4945 (norm. = 0.805395), norm. avg. (of 22) = 0.505496 fft 3: mflops = 11.6782 (norm. = 0.272668), norm. avg. (of 22) = 0.474641 fft 4: mflops = 12.5721 (norm. = 0.293539), norm. avg. (of 22) = 0.458568 fft 5: mflops = 36.6858 (norm. = 0.856557), norm. avg. (of 12) = 0.840392 fft 6: mflops = 30.7606 (norm. = 0.718213), norm. avg. (of 12) = 0.59686 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=7.87 s, 1 iters, t-(init.)=7.45 s t(norm)=0.115993, mflops=43.1059 (err=6.7e-16) 1. PDA: elapsed time t=8.87 s, 1 iters, t-(init.)=8.44 s t(norm)=0.131407, mflops=38.0497 (err=5.7e-16) 2. PDA (f2c): elapsed time t=10.06 s, 1 iters, t-(init.)=9.64 s t(norm)=0.150091, mflops=33.3132 (err=5.8e-16) 3. Singleton: elapsed time t=26.52 s, 1 iters, t-(init.)=26.09 s t(norm)=0.40621, mflops=12.3089 (err=6.0e-16) 4. Singleton (f2c): elapsed time t=23.69 s, 1 iters, t-(init.)=23.26 s t(norm)=0.362148, mflops=13.8065 (err=5.8e-16) 5. Temperton: elapsed time t=10.37 s, 1 iters, t-(init.)=9.94 s t(norm)=0.154762, mflops=32.3078 (err=1.8e-07) 6. Temperton (f2c): elapsed time t=12.41 s, 1 iters, t-(init.)=11.98 s t(norm)=0.186523, mflops=26.8063 (err=6.6e-16) Top mflops for N=2985984 = 43.1059 Normalized results and averages for N=2985984: fft 0: mflops = 43.1059 (norm. = 1), norm. avg. (of 23) = 0.988138 fft 1: mflops = 38.0497 (norm. = 0.882701), norm. avg. 
(of 23) = 0.609943 fft 2: mflops = 33.3132 (norm. = 0.772822), norm. avg. (of 23) = 0.517119 fft 3: mflops = 12.3089 (norm. = 0.28555), norm. avg. (of 23) = 0.466419 fft 4: mflops = 13.8065 (norm. = 0.320292), norm. avg. (of 23) = 0.452556 fft 5: mflops = 32.3078 (norm. = 0.749497), norm. avg. (of 13) = 0.8334 fft 6: mflops = 26.8063 (norm. = 0.62187), norm. avg. (of 13) = 0.598784 Benchmarking for array size = 180x180x180: 0. FFTW: elapsed time t=14.84 s, 1 iters, t-(init.)=14.01 s t(norm)=0.106883, mflops=46.78 (err=8.7e-16) 1. PDA: elapsed time t=17.8 s, 1 iters, t-(init.)=16.97 s t(norm)=0.129465, mflops=38.6203 (err=8.5e-16) 2. PDA (f2c): elapsed time t=20.34 s, 1 iters, t-(init.)=19.51 s t(norm)=0.148843, mflops=33.5924 (err=8.5e-16) 3. Singleton: elapsed time t=54.03 s, 1 iters, t-(init.)=53.2 s t(norm)=0.405867, mflops=12.3193 (err=7.1e-16) 4. Singleton (f2c): elapsed time t=50.34 s, 1 iters, t-(init.)=49.51 s t(norm)=0.377716, mflops=13.2375 (err=7.1e-16) 5. Temperton: elapsed time t=16.72 s, 1 iters, t-(init.)=15.89 s t(norm)=0.121226, mflops=41.2453 (err=1.0e-07) 6. Temperton (f2c): elapsed time t=20.82 s, 1 iters, t-(init.)=19.99 s t(norm)=0.152505, mflops=32.7858 (err=7.9e-16) Top mflops for N=5832000 = 46.78 Normalized results and averages for N=5832000: fft 0: mflops = 46.78 (norm. = 1), norm. avg. (of 24) = 0.988633 fft 1: mflops = 38.6203 (norm. = 0.825575), norm. avg. (of 24) = 0.618928 fft 2: mflops = 33.5924 (norm. = 0.718093), norm. avg. (of 24) = 0.525493 fft 3: mflops = 12.3193 (norm. = 0.263346), norm. avg. (of 24) = 0.457958 fft 4: mflops = 13.2375 (norm. = 0.282973), norm. avg. (of 24) = 0.44549 fft 5: mflops = 41.2453 (norm. = 0.881687), norm. avg. (of 14) = 0.836849 fft 6: mflops = 32.7858 (norm. = 0.70085), norm. avg. 
(of 14) = 0.606074 ------------------------------------------------------ @@@@ bench.1d.p2.dat N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Nielsen, NR (C), NR (F), Ooura (C), Ooura (F), QFT, Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg 2, 33.2881, 34.6637, 21.8453, 1.49797, 6.31672, 4.85452, 5.72992, 6.31672, 20.3607, 4.40578, 3.94202, , 11.9837, 10.0825, 41.1206, 43.2402, 55.9241, , 5.60736, 5.85797, 5.34988, 27.06, , , , , 3.1023, 1.9563, 6.35501, 5.60736, 39.5689, 35.5449, , , , 5.29584, 5.14008, 18.0789, 18.5589, 3.74491, 3.71835, 5.46133 4, 65.028, 59.0747, 25.42, 5.85797, 11.4598, 5.29584, 19.065, 13.1072, 16.9125, 15.8875, 6.72164, 21.3995, 29.7468, 27.06, 108.943, 106.185, 110.376, , 14.3641, 11.0376, 10.6998, 81.4428, 35.5449, 34.9525, 35.2463, 5.43304, 5.76141, 7.23156, 11.4598, 9.70904, 56.2994, 48.771, , 4.72332, 17.05, 12.1927, 15.1968, 25.575, 17.05, 13.6179, 12.483, 5.54802 8, 52.4288, 48.771, 27.1183, 7.08497, 19.6608, 7.56185, 19.7845, 15.8875, 16.0496, 38.3625, 20.0365, 19.065, 48.3958, 42.799, 146.313, 144.631, 211.478, 61.6809, 24.7695, 18.3961, 17.1898, 123.362, 44.6203, 44.939, 43.6907, 12.0066, 9.03945, 13.6771, 18.2891, 14.9797, 98.304, 72.3156, , 5.82542, 16.2151, 15.7286, 17.8735, 37.0086, 15.4202, 22.4695, 20.0365, 5.65778 16, 38.13, 38.8361, 27.962, 11.5228, 25.7319, 8.45626, 26.3793, 20.3607, 16.7772, 70.4925, 35.5449, 18.8933, 63.0722, 69.9051, 129.056, 127.1, 209.715, 67.6501, 32.768, 25.2669, 24.5281, 139.81, 37.1177, 43.2402, 48.2104, 18.7246, 13.3577, 16.1319, 26.8866, 21.3995, 106.185, 83.0555, 92.1825, 15.0874, 15.4202, 19.9729, 33.2881, 47.3933, 16.0088, 36.1578, 29.1271, 5.76141 32, 36.9217, 36.9217, 
30.6601, 12.025, 35.1871, 8.91646, 33.1828, 23.8313, 18.2044, 58.5797, 65.9482, 20.8051, 57.9324, 54.6133, 100.825, 80.6597, 203.607, 75.4371, 34.4926, 34.7211, 33.825, 146.654, 38.8361, 50.4123, 54.0503, 26.4792, 16.6971, 20.6413, 37.1835, 28.807, 119.156, 89.6219, 74.3671, 15.8875, 15.1528, 24.7306, 41.2825, 61.3202, 17.2463, 40.022, 29.4544, 5.85143 64, 43.0922, 43.3894, 33.4652, 15.8875, 40.8536, 8.83631, 32.4302, 27.1183, 19.7845, 78.6432, 67.6501, 22.153, 63.5501, 68.3854, 119.837, 89.8779, 57.7198, 85.598, 37.9003, 39.3216, 38.3625, 52.4288, 38.13, 50.3316, 56.6798, 31.1458, 20.8326, 23.4756, 47.6625, 35.5449, 118.707, 92.5214, 65.536, 27.5941, 14.6997, 24.576, 48.3958, 72.3156, 19.065, 53.7731, 38.5979, 5.95782 128, 41.4691, 40.778, 36.7002, 14.9188, 48.6095, 8.82215, 34.9525, 29.5969, 21.8453, 93.5036, 99.8644, 24.4668, 71.9611, 75.6704, 113.799, 92.3274, 52.4288, 82.9382, 41.2361, 43.6907, 42.6746, 35.6312, 40.5527, 56.0308, 60.6614, 35.9805, 22.3781, 25.8452, 56.8995, 41.7047, 127.653, 95.9481, 61.1669, 26.7884, 14.5636, 31.6381, 54.7764, 82.9382, 20.6181, 55.6063, 37.4491, 6.11669 256, 49.6367, 50.5338, 40.7214, 17.05, 49.6367, 8.88624, 38.4799, 31.775, 24.2445, 106.185, 111.848, 26.8866, 70.4925, 79.8915, 126.144, 101.68, 55.1882, 83.8861, 43.2402, 46.8637, 46.6034, 51.4639, 42.3667, 56.6798,61.2307, 39.1991, 24.9661, 26.7153, 63.0722, 45.8394, 128.07, 99.2735, 58.2542, 37.7865, 14.3641, 27.2357, 57.4562, 90.6877, 22.3101, 61.6809, 39.1991, 6.20459 512, 46.7187, 46.7187, 41.0312, 17.0963, 49.9322, 8.86953, 37.7487, 33.2295, 25.5059, 114.39, 117.965, 28.0869, 56.5101, 55.1882, 93.4375, 90.7422, 65.536, 87.3813, 41.0312, 48.8973, 48.6453, 44.939, 46.2607, 63.7648, 67.8934, 38.0532, 24.8347, 26.8102, 66.459, 49.4093, 132.918, 104.278, 44.515, 36.2969, 13.4817, 29.1271, 60.4948, 89.03, 23.9522, 57.8968, 35.7469, 6.08066 1024, 54.3304, 56.0736, 42.9744, 19.1346, 26.4792, 8.73813, 38.5506, 34.2672, 27.1652, 119.156, 117.818, 30.1315, 36.1578, 41.2825, 
86.6592, 80.6597, 54.8993, 80.6597, 29.7891, 47.2332, 45.1972, 35.1871, 46.3972, 61.6809, 65.9482, 40.022, 22.7951, 26.2144, 62.4152, 48.9989, 123.362, 101.803, 39.126, 45.1972, 12.025, 27.7401, 60.263, 80.6597, 24.9661, 54.8993, 37.7186, 5.90414 2048, 29.4243, 28.2704, 22.3534, 16.1999, 24.0299, 8.48113, 29.7277, 25.5184, 18.1357, 76.3863, 75.8838, 18.6038, 37.9419, 42.0961, 75.8838, 64.7996, 49.292, 47.6625, 31.3433, 22.1814, 22.8856, 33.9245, 46.8875, 61.3529, 57.1007, 25.7463, 22.3534, 21.0481, 25.5184, 23.6359, 83.5821, 76.8956, 36.0448, 34.742, 12.1159, 21.8453, 30.8405, 36.0448, 16.4776, 37.4491, 30.6764, 5.76717 4096, 28.5975, 28.0869, 20.9715, 17.6726, 23.1304, 8.45626, 29.1271, 25.9978, 17.7725, 69.5189, 76.7251, 18.0789, 35.9512, 41.3912, 78.1547,67.6501, 42.799, 49.539, 29.3993, 21.6947, 22.3101, 34.1927, 30.541,34.9525, 34.5684, 25.575, 23.6521, 21.2549, 24.7695, 23.1304, 87.3813, 78.6432, 32.0993, 47.3042, 11.9156, 20.5603,33.825, 34.0079, 16.4698, 39.5689, 31.775, 5.78259 8192, 27.4828, 26.8336, 21.1669, 16.7053, 23.1828, 8.35263, 28.8803,26.0143, 18.0311, 73.2876, 72.5079, 18.127, 32.1497, 34.7742, 70.2654,65.536, 45.7432, 47.9982, 27.7063, 21.8453, 22.129, , 30.4274, 34.0787,34.0787, 24.875, 21.9863, 21.1669, 24.5171,23.0262, 84.145, 77.4516, 29.3782, 41.5594, 11.5914, 20.9072, 32.4559,31.5544, 16.5431, 37.2445, 29.6337, 5.67979 16384, 26.403, 26.0285, 20.2763, 18.9176, 18.1684, 8.192, 28.0154, 25.3105,17.6443, 76.4587, 74.8983, 17.8156, 27.3882, 29.3601, 63.2761, 52.057,33.9816, 41.4691, 23.5257, 20.1649, 19.9457, , 29.5969, 33.0632, 32.478,23.987, 16.6819, 20.0547, 22.1085,21.092, 82.0115, 74.1417, 24.1448, 48.6095, 9.26772, 19.9457, 32.1931,29.3601, 16.239, 36.7002, 28.8978, 5.42902 32768, 23.5459, 23.5459, 18.5479, 16.949, 14.7826, 7.92774, 27.3067,23.2672, 16.384, 72.8178, 72.8178, 16.384, 24.7306, 26.2144, 48.2474,43.4493, 27.3067, 39.7188, 21.2549, 18.3746, 17.7124, , 28.7019, 32.4972,30.0165, 20.9157, 11.7029, 15.9844, 19.4661, 19.0882, 
66.0867, 60.9637, 20.5872, 38.5506, 7.50412, 18.9046, 25.8695,24.4234, 15.0082, 30.4819, 24.4234, 4.96485 65536, 16.513, 16.384, 12.71, 17.6231, 13.2731, 7.13317, 20.9715, 18.0789,11.7818, 54.4715, 54.4715, 11.7818, 23.8313, 25.2669, 49.3448, 37.1177,27.962, 28.3399, 21.1834, 12.483, 11.5228, , 24.9661, 27.4138, 25.575,15.0874, 11.7159, 13.53, 12.866, 12.5578, 45.8394, 42.799, 17.9244, 32.514, 7.23156, 15.1968, 21.3995, 17.9244,11.336, 22.4294, 18.8933, 4.33296 131072, 14.9545, 15.0556, 11.3685, 15.4738, 13.8399, 7.00699, 19.8949,16.384, 10.6106, 54.0176, 53.6921, 10.5105, 21.2212, 22.5073, 48.972,38.0893, 26.0611, 24.758, 19.7188, 11.4268, 10.6106, , 17.5451, 18.7246,18.1156, 13.5867, 11.4268, 13.423, 11.8523, 11.6053, 40.8848, 38.7517, 15.6917, 27.5089, 7.14174, 13.9264, 18.7246,16.6285, 10.5105, 19.5458, 16.5054, 4.30159 262144, 14.4742, 14.5636, 10.9227, 18.2891, 13.4051, 6.98017, 19.994,16.8521, 10.1694, 37.7487, 37.7487, 10.1694, 21.8453, 22.9058, 52.1391,35.7469, 27.5941, 25.2331, 19.6608, 11.0765, 10.2134, , 16.1596, 16.8521,16.384, 13.6375, 11.9761, 13.5592, 11.5652, 11.3428, 43.6907, 40.3298, 14.9323, 35.2134, 7.08497, 14.2126, 19.1813,16.384, 10.2134, 20.3388, 17.7391, 4.27409 524288, 14.6924, 14.6492, 10.8042, 16.0152, 14.2307, 6.95634, 19.2306,16.3303, 10.1234, 39.5297, 39.5297, 10.1234, 22.4357, 24.0615, 48.3567,35.3244, 25.1552, 24.6571, 20.3295, 11.1177, 10.2064, , 16.4381, 17.2344,16.6025, 13.2466, 11.8027, 12.6095, 11.5831, 11.3456, 41.8549, 38.912, 14.4369, 28.7904, 7.1976, 13.1072, 16.658,16.2239, 10.2064, 18.9382, 16.384, 4.19961 1048576, 13.9624, 13.9997, 10.5491, 18.9959, 13.7248, 6.94881, 19.2047,16.513, 9.94854, , , 9.99596, 22.2628, 23.5635, 44.8109, 30.8405, 19.2047, ,20.4003, 10.6998, 9.9297, , , 16.333, 15.8635, 13.0745, 12.3072, 12.6334,11.1788, 11.0029, 43.6907, 40.1753, 13.7428, 35.3056, 7.01858, 13.7608, 18.8254, 15.9844, 10.0151, 19.6731,16.9947, 4.14621 Norm. 
Avg., 0.339181, 0.336389, 0.247396, 0.183378, 0.245733, 0.0902095, 0.290537, 0.247988, 0.190708, 0.69087, 0.67782, 0.191778, 0.389588, 0.410918, 0.834529, 0.703787, 0.607384, 0.532128, 0.291322, 0.238157, 0.231136, 0.477583, 0.336207, 0.397193, 0.400074, 0.242932, 0.17752, 0.193747, 0.277007, 0.236815, 0.847453, 0.731189, 0.363166, 0.385935, 0.119388, 0.212065, 0.333119, 0.403713, 0.180858, 0.355804, 0.27677, 0.0614078
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, Nielsen, Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg
6, , 21.6266, 13.0314, 35.05, 29.6773, 92.4044, 93.4665, 19.5471, 24.2012, 6.3528, 10.9886, 12.8664, 12.9484, 12.3957, 5.64694
9, 5.56458, 34.624, 23.9705, 42.736, 38.5505, 106.082, 107.609, 15.9803, 22.663, 9.44292, 17.8066, 22.5265, 18.5119, 18.5119, 5.7003
12, , 42.7171, 36.1452, 53.7015, 51.2605, 131.132, 126.711, 27.9142, 36.1452, 12.3655, 20.1381, 24.5159, 27.9142, 24.0968, 5.82506
15, 7.68126, 49.2388, 48.9252, 50.5346, 50.5346, 90.9025, 91.4436, 17.4574, 21.6985, 14.4385, 17.9469, 23.8548, 32.0052, 28.8769, 5.13453
18, 7.06759, 48.7034, 41.6868, 42.7743, 37.2655, 59.9883, 65.1529, 18.2187, 40.32, 11.6015, 23.6492, 31.3315, 25.3559, 22.5644, 5.856
24, , 67.3974, 61.6369, 50.08, 44.5156, 73.587, 78.3861, 26.7093, 41.4455, 17.0083, 24.1997, 30.5573, 36.7935, 30.5573, 5.89177
36, 8.56558, 73.0382, 72.6035, 53.0321, 49.183, 85.8971, 65.5773, 22.4217, 61.6029, 17.6263, 27.9756, 39.3464, 38.3566, 32.0984, 5.95575
80, 13.6287, 117.536, 89.5814, 62.3028, 65.2463, 88.1519, 69.0523, 36.0273, 28.5734, 26.2224, 31.8703, 46.0349, 68.4816, 45.529, 5.66003
108, 9.33796, 90.5499, 111.706, 62.579, 58.5911, 91.2411, 80.7607, 20.3275, 58.5911, 20.6079, 35.1547, 51.0794, 39.0607, 31.9588, 6.07347
210, 12.2879, 98.3035, 100.538, 50.2688, 37.4886, 76.2699, 73.7276, 18.2294, 26.124, 21.1321, 31.9014, 35.1084, , , 4.8223
504, 12.9417, 112.319, 112.319, 46.3314, 37.6678, 79.1989, 71.8316, 19.6319, 34.0672, 20.3208, 38.2904, 42.5059, , , 5.2175
1000, 12.7562, 92.7724, 115.31, 43.9869, 42.878, 63.781, 64.9998, 19.6249, 24.0683, 30.5538, 38.0782, 56.0712, 69.4215, 48.5951, 5.06199
1960, 13.4499, 88.5092, 87.1042, 29.3453, 25.6428, 64.9416, 53.7997, 18.4147, 20.476, 19.7395, 34.2973, 31.72, , , 4.48331
4725, 11.1178, 77.7079, 88.4102, 33.5557, 31.0179, 65.3296, 50.2194, 14.8836, 24.4445, 19.8448, 29.529, 31.5481, , , 4.78125
10368, 10.9551, 77.6464, 79.0329, 39.5164, 39.8725, 78.3335, 60.628, 22.8136, 47.8469, 19.4116, 26.0344, 29.1174, 28.5538, 27.8355, 5.5323
27000, 8.78361, 76.1596, 76.1596, 30.8705, 29.7165, 49.6823, 53.4397, 15.1413, 27.1766, 20.7821, 24.2723, 27.8918, 28.3899, 28.3899, 4.73164
75600, 8.22269, 59.4027, 59.4027, 21.4944, 20.4197, 52.1354, 51.5866, 14.4992, 22.6885, 16.3357, 18.4238, 19.9216, , , 4.2541
165375, 7.20311, 48.5905, 49.0058, 13.2724, 12.3571, 46.9974, 40.9548, 10.4629, 18.0304, 15.2491, 16.9635, 17.3748, , , 3.97069
362880, 7.94085, 26.8083, 26.8083, 20.6854, 19.5967, 55.8507, 47.5325, 14.1993, 25.9771, 15.0948, 14.8276, 15.5862, , , 4.06187
Norm. Avg., 0.115673, 0.778992, 0.772133, 0.455307, 0.419659, 0.861292, 0.798771, 0.222806, 0.372199, 0.20496, 0.289262, 0.347647, 0.37058, 0.311159, 0.0606006
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM, HARM (f2c), NR (C), NR (F), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
4x4x4, 94.6084, , , 55.6766, 34.9525, 19.6608, 16.6441, 54.7083, 51.5693, 36.7921, 39.5689
8x8x8, 94.3718, 74.3085, 44.939, 79.3041, 54.2367, 33.4652, 28.7719, 51.8527, 83.5149, 48.6453, 55.5128
16x16x16, 77.1958, 75.3468, 44.6203, 23.4756, 21.3995, 37.6734, 35.3453, 34.9525, 31.4573, 38.8361, 43.6907
32x32x32, 57.8259, 47.0917, 33.8979, 16.6617, 15.4809, 31.9688, 30.2474, 21.9674, 21.6053, 29.7891, 30.9619
64x64x64, 40.3298, 38.3625, 29.8645, 11.2347, 10.773, 28.9484, 28.5975, 17.3478, 16.8521, 26.5089, 27.5941
256x64x32, 43.6907, 36.3557, 28.6249, 11.1177, 10.7343, 31.7244, 31.5236, 16.4925, 15.9129, 25.8069, 27.3667
16x1024x64, 41.943, 37.7186, 29.1271, 10.7657, 10.4232, 24.4994, 24.8478, 17.3605, 17.2747, ,
128x128x128, 42.1034, 35.9805, 28.9738, 10.2134, 9.95934, 31.8209, 31.8209, 12.0197, 12.2063, 18.5198, 19.0981
512x128x64, 43.3214, 36.4434, 29.1271, 9.68865, 9.50697, 29.6132, 29.5185, 13.7806, 13.3461, ,
Norm. Avg., 1, 0.869531, 0.63638, 0.364112, 0.300711, 0.563407, 0.546882, 0.420611, 0.444787, 0.515776, 0.553152
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
5x5x5, 75.0838, 21.615, 17.8324, 49.8809, 63.1236, 73.5357, 49.5345
6x6x6, 103.173, 25.4112, 20.542, 41.8355, 58.1443, 70.7322, 39.2059
7x7x7, 83.3267, 15.2479, 12.8613, 56.3447, 32.6862, ,
9x9x9, 110.922, 35.495, 27.3039, 41.5147, 62.272, 71.7072, 48.9587
10x10x10, 105.206, 34.2448, 28.5055, 41.149, 51.0248, 99.0773, 48.1366
11x11x11, 56.5801, 15.9291, 13.4972, 47.1501, 26.1945, ,
12x12x12, 110.003, 41.7335, 33.039, 36.3177, 41.0139, 76.1219, 47.5762
13x13x13, 47.6621, 15.3033, 12.7945, 40.5437, 22.7874, ,
14x14x14, 35.5026, 24.0227, 20.2616, 37.1462, 29.4985, ,
15x15x15, 81.0133, 41.5027, 32.8788, 29.0996, 35.4079, 67.5111, 51.1447
24x25x28, 39.715, 37.7292, 31.441, 29.0225, 28.8009, ,
48x48x48, 48.4433, 38.6033, 32.7957, 15.1882, 17.9899, 39.4246, 30.3763
49x49x49, 37.041, 23.043, 18.8733, 20.4298, 17.0836, ,
60x60x60, 43.0075, 37.8977, 31.6336, 14.4987, 15.8168, 43.0075, 32.9971
72x60x56, 42.417, 35.1751, 29.6338, 13.6916, 14.3263, ,
75x75x75, 37.5398, 35.5106, 32.8473, 16.4924, 17.5968, 48.6626, 37.5398
80x80x80, 42.5898, 36.5056, 32.1539, 14.5366, 16.9764, 38.2302, 29.4257
84x84x84, 37.6366, 31.3985, 25.3711, 13.7606, 13.5636, ,
96x96x96, 43.0489, 34.1364, 30.8796, 11.1324, 12.4309, 31.7779, 25.1842
105x105x105, 42.0898, 29.8945, 25.737, 15.9274, 15.7128, ,
112x112x112, 42.8232, 30.6533, 25.8482, 14.82, 15.4588, ,
120x120x120, 42.8293, 39.5202, 34.4945, 11.6782, 12.5721, 36.6858, 30.7606
144x144x144, 43.1059, 38.0497, 33.3132, 12.3089, 13.8065, 32.3078, 26.8063
180x180x180, 46.78, 38.6203, 33.5924, 12.3193, 13.2375, 41.2453, 32.7858
Norm. Avg., 0.988633, 0.618928, 0.525493, 0.457958, 0.44549, 0.836849, 0.606074
@@@@ end