To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Bradley Lucier
@ submitter email = lucier@math.purdue.edu
@ submitter organization = Purdue University, Math Dept
@ computer manufacturer = Sun
@ computer model = Enterprise 3001
@ CPU manufacturer = Sun
@ CPU model = UltraSPARC
@ CPU speed = 250 MHz
@ RAM = 256 MB
@ L2 cache size = 4 MB
@ operating system = SunOS 5.5.1
@ C compiler = Sun WorkShop cc 4.2
@ C compiler flags = -fast -native -DSOLARIS -dalign -xO5 -I../fftw-1.2.1/src/src
@ Fortran compiler = Sun WorkShop f77 4.2
@ Fortran compiler flags = -fast -native -dalign -libmil -xO5
@ remarks =
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
2 (0.000228882 MB)
4 (0.000534058 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)
262144 (25.4987 MB)
Maximum array size = 360360
Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Bailey
5. Beauregard
6. Bergland
7. Brenner
8. Burrus
9. CWP (min N)
10. CWP (best N)
11. Edelblute
12. FFTPACK
13. FFTPACK (f2c)
14. FFTW
15. FFTW_ESTIMATE
16. Frigo-old
17. Green
18. GSL
19. GSL DIT
20. GSL DIF
21. Krukar
22. Mayer (Buneman)
23. Mayer (simple)
24. Mayer (lookup)
25. Monro
26. NAPACK (f2c)
27. Ooura (C)
28. Ooura (F)
29. Ransom
30. SCIPORT
31. Singleton
32. Singleton (f2c)
33. Sorensen
34. Sorensen DIT
35. Temperton
36. Temperton (f2c)
37. Valkenburg
Computing normalized averages (38 transforms).
Benchmarking for array size = 2 (power of 2):
0. Arndt DIF: elapsed time t=1.29137 s, 4194304 iters, t-(init.)=0.90839 s t(norm)=0.108288, mflops=46.173 (err=1.7e-17)
1. 
Arndt DIT: elapsed time t=1.2961 s, 4194304 iters, t-(init.)=0.832986 s t(norm)=0.0992996, mflops=50.3527 (err=1.7e-17) 2. Arndt Split-Radix: elapsed time t=1.50275 s, 2097152 iters, t-(init.)=1.33587 s t(norm)=0.318496, mflops=15.6988 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.03846 s, 131072 iters, t-(init.)=1.02841 s t(norm)=3.92308, mflops=1.27451 (err=1.7e-17) 4. Bailey: elapsed time t=1.84473 s, 1048576 iters, t-(init.)=1.71824 s t(norm)=0.819319, mflops=6.10263 (err=1.7e-17) 5. Beauregard: elapsed time t=1.67028 s, 1048576 iters, t-(init.)=1.5761 s t(norm)=0.751544, mflops=6.65297 (err=1.7e-17) 6. Bergland: elapsed time t=1.75801 s, 1048576 iters, t-(init.)=1.67331 s t(norm)=0.797897, mflops=6.26647 (err=1.7e-17) 7. Brenner: elapsed time t=1.47928 s, 1048576 iters, t-(init.)=1.35867 s t(norm)=0.647866, mflops=7.71764 (err=1.7e-17) 8. Burrus: elapsed time t=1.37865 s, 4194304 iters, t-(init.)=1.01056 s t(norm)=0.120468, mflops=41.5049 (err=1.7e-17) 9. CWP (min N): elapsed time t=1.77766 s, 524288 iters, t-(init.)=1.71706 s t(norm)=1.63752, mflops=3.0534 10. CWP (best N) (N=3): elapsed time t=1.86539 s, 524288 iters, t-(init.)=1.81883 s t(norm)=1.73457, mflops=2.88256 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.03785 s, 1048576 iters, t-(init.)=0.953269 s t(norm)=0.454554, mflops=10.9998 (err=1.7e-17) 13. FFTPACK (f2c): elapsed time t=1.15722 s, 1048576 iters, t-(init.)=1.07366 s t(norm)=0.51196, mflops=9.76639 (err=1.7e-17) FFTW_MEASURE plan: (cost = 2.778438e-07) FFTW_NOTW 2 14. FFTW: elapsed time t=1.34901 s, 4194304 iters, t-(init.)=0.865738 s t(norm)=0.103204, mflops=48.4477 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 15. FFTW_ESTIMATE: elapsed time t=1.35652 s, 4194304 iters, t-(init.)=1.00429 s t(norm)=0.119721, mflops=41.7639 (err=1.7e-17) 16. Frigo-old: elapsed time t=1.69343 s, 8388608 iters, t-(init.)=0.853104 s t(norm)=0.050849, mflops=98.3304 (err=1.7e-17) 17. 
Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.42697 s, 1048576 iters, t-(init.)=1.27869 s t(norm)=0.609727, mflops=8.2004 (err=1.7e-17) 19. GSL DIT: elapsed time t=1.04827 s, 524288 iters, t-(init.)=1.0078 s t(norm)=0.961109, mflops=5.20232 (err=1.7e-17) 20. GSL DIF: elapsed time t=1.0422 s, 524288 iters, t-(init.)=0.989165 s t(norm)=0.943342, mflops=5.30031 (err=1.7e-17) 21. Krukar: elapsed time t=1.73365 s, 4194304 iters, t-(init.)=1.16662 s t(norm)=0.139072, mflops=35.9527 (err=1.7e-17) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.06223 s, 262144 iters, t-(init.)=1.04212 s t(norm)=1.98769, mflops=2.51549 (err=1.7e-17) 27. Ooura (C): elapsed time t=1.06031 s, 2097152 iters, t-(init.)=0.899549 s t(norm)=0.214469, mflops=23.3134 (err=1.7e-17) 28. Ooura (F): elapsed time t=1.03409 s, 2097152 iters, t-(init.)=0.873222 s t(norm)=0.208192, mflops=24.0163 (err=1.7e-17) 29. Skipping fft (Ransom doesn't work for N=2). 30. Skipping fft (SCIPORT can't handle N < 4). 31. Singleton: elapsed time t=1.34137 s, 524288 iters, t-(init.)=1.28392 s t(norm)=1.22444, mflops=4.0835 (err=1.7e-17) 32. Singleton (f2c): elapsed time t=1.32197 s, 524288 iters, t-(init.)=1.27468 s t(norm)=1.21563, mflops=4.11309 (err=1.7e-17) 33. Sorensen: elapsed time t=1.3045 s, 2097152 iters, t-(init.)=1.12675 s t(norm)=0.268637, mflops=18.6124 (err=1.7e-17) 34. Sorensen DIT: elapsed time t=1.37718 s, 4194304 iters, t-(init.)=0.843374 s t(norm)=0.100538, mflops=49.7324 (err=1.7e-17) 35. Temperton: elapsed time t=1.67651 s, 524288 iters, t-(init.)=1.63634 s t(norm)=1.56053, mflops=3.20403 (err=1.7e-17) 36. Temperton (f2c): elapsed time t=1.09624 s, 262144 iters, t-(init.)=1.07507 s t(norm)=2.05054, mflops=2.43838 (err=1.7e-17) 37. 
Valkenburg: elapsed time t=1.91598 s, 1048576 iters, t-(init.)=1.77786 s t(norm)=0.847751, mflops=5.89796 (err=1.7e-17) Top mflops for N=2 = 98.3304 Normalized results and averages for N=2: fft 0: mflops = 46.173 (norm. = 0.46957), norm. avg. (of 1) = 0.46957 fft 1: mflops = 50.3527 (norm. = 0.512076), norm. avg. (of 1) = 0.512076 fft 2: mflops = 15.6988 (norm. = 0.159653), norm. avg. (of 1) = 0.159653 fft 3: mflops = 1.27451 (norm. = 0.0129615), norm. avg. (of 1) = 0.0129615 fft 4: mflops = 6.10263 (norm. = 0.0620625), norm. avg. (of 1) = 0.0620625 fft 5: mflops = 6.65297 (norm. = 0.0676594), norm. avg. (of 1) = 0.0676594 fft 6: mflops = 6.26647 (norm. = 0.0637287), norm. avg. (of 1) = 0.0637287 fft 7: mflops = 7.71764 (norm. = 0.0784868), norm. avg. (of 1) = 0.0784868 fft 8: mflops = 41.5049 (norm. = 0.422096), norm. avg. (of 1) = 0.422096 fft 9: mflops = 3.0534 (norm. = 0.0310525), norm. avg. (of 1) = 0.0310525 fft 10: mflops = 2.88256 (norm. = 0.029315), norm. avg. (of 1) = 0.029315 fft 11: mflops = -1 (norm. = -0.0101698), norm. avg. (of 0) = -1 fft 12: mflops = 10.9998 (norm. = 0.111866), norm. avg. (of 1) = 0.111866 fft 13: mflops = 9.76639 (norm. = 0.0993221), norm. avg. (of 1) = 0.0993221 fft 14: mflops = 48.4477 (norm. = 0.492703), norm. avg. (of 1) = 0.492703 fft 15: mflops = 41.7639 (norm. = 0.42473), norm. avg. (of 1) = 0.42473 fft 16: mflops = 98.3304 (norm. = 1), norm. avg. (of 1) = 1 fft 17: mflops = -1 (norm. = -0.0101698), norm. avg. (of 0) = -1 fft 18: mflops = 8.2004 (norm. = 0.0833963), norm. avg. (of 1) = 0.0833963 fft 19: mflops = 5.20232 (norm. = 0.0529066), norm. avg. (of 1) = 0.0529066 fft 20: mflops = 5.30031 (norm. = 0.053903), norm. avg. (of 1) = 0.053903 fft 21: mflops = 35.9527 (norm. = 0.365632), norm. avg. (of 1) = 0.365632 fft 22: mflops = -1 (norm. = -0.0101698), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.0101698), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.0101698), norm. avg. 
(of 0) = -1 fft 25: mflops = -1 (norm. = -0.0101698), norm. avg. (of 0) = -1 fft 26: mflops = 2.51549 (norm. = 0.025582), norm. avg. (of 1) = 0.025582 fft 27: mflops = 23.3134 (norm. = 0.237092), norm. avg. (of 1) = 0.237092 fft 28: mflops = 24.0163 (norm. = 0.24424), norm. avg. (of 1) = 0.24424 fft 29: mflops = -1 (norm. = -0.0101698), norm. avg. (of 0) = -1 fft 30: mflops = -1 (norm. = -0.0101698), norm. avg. (of 0) = -1 fft 31: mflops = 4.0835 (norm. = 0.0415283), norm. avg. (of 1) = 0.0415283 fft 32: mflops = 4.11309 (norm. = 0.0418293), norm. avg. (of 1) = 0.0418293 fft 33: mflops = 18.6124 (norm. = 0.189285), norm. avg. (of 1) = 0.189285 fft 34: mflops = 49.7324 (norm. = 0.505768), norm. avg. (of 1) = 0.505768 fft 35: mflops = 3.20403 (norm. = 0.0325844), norm. avg. (of 1) = 0.0325844 fft 36: mflops = 2.43838 (norm. = 0.0247978), norm. avg. (of 1) = 0.0247978 fft 37: mflops = 5.89796 (norm. = 0.059981), norm. avg. (of 1) = 0.059981 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.70057 s, 2097152 iters, t-(init.)=1.39336 s t(norm)=0.0830506, mflops=60.2042 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.80034 s, 2097152 iters, t-(init.)=1.48915 s t(norm)=0.0887605, mflops=56.3313 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.82768 s, 1048576 iters, t-(init.)=1.69131 s t(norm)=0.201619, mflops=24.7992 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.44273 s, 262144 iters, t-(init.)=1.40833 s t(norm)=0.671542, mflops=7.44555 (err=1.3e-16) 4. Bailey: elapsed time t=1.08862 s, 262144 iters, t-(init.)=1.05045 s t(norm)=0.500895, mflops=9.98212 (err=1.3e-16) 5. Beauregard: elapsed time t=1.03008 s, 262144 iters, t-(init.)=0.998348 s t(norm)=0.47605, mflops=10.5031 (err=6.5e-17) 6. Bergland: elapsed time t=1.09339 s, 524288 iters, t-(init.)=1.00672 s t(norm)=0.24002, mflops=20.8316 (err=5.3e-17) 7. Brenner: elapsed time t=1.28321 s, 524288 iters, t-(init.)=1.20101 s t(norm)=0.286343, mflops=17.4616 (err=5.3e-17) 8. 
Burrus: elapsed time t=1.63451 s, 1048576 iters, t-(init.)=1.49763 s t(norm)=0.178531, mflops=28.0063 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.9056 s, 524288 iters, t-(init.)=1.83574 s t(norm)=0.437675, mflops=11.424 10. CWP (best N) (N=15): elapsed time t=1.54001 s, 262144 iters, t-(init.)=1.46795 s t(norm)=0.699973, mflops=7.14313 11. Edelblute: elapsed time t=1.94126 s, 1048576 iters, t-(init.)=1.80343 s t(norm)=0.214986, mflops=23.2573 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.23573 s, 1048576 iters, t-(init.)=1.08962 s t(norm)=0.129893, mflops=38.4934 (err=5.3e-17) 13. FFTPACK (f2c): elapsed time t=1.53418 s, 1048576 iters, t-(init.)=1.39363 s t(norm)=0.166134, mflops=30.0962 (err=5.3e-17) FFTW_MEASURE plan: (cost = 4.133008e-07) FFTW_NOTW 4 14. FFTW: elapsed time t=1.81808 s, 4194304 iters, t-(init.)=1.13633 s t(norm)=0.0338652, mflops=147.644 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 15. FFTW_ESTIMATE: elapsed time t=1.79217 s, 4194304 iters, t-(init.)=1.2447 s t(norm)=0.037095, mflops=134.789 (err=5.3e-17) 16. Frigo-old: elapsed time t=1.39334 s, 4194304 iters, t-(init.)=0.770456 s t(norm)=0.0229614, mflops=217.757 (err=5.3e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.83961 s, 1048576 iters, t-(init.)=1.67953 s t(norm)=0.200216, mflops=24.973 (err=5.3e-17) 19. GSL DIT: elapsed time t=1.0111 s, 262144 iters, t-(init.)=0.977893 s t(norm)=0.466296, mflops=10.7228 (err=6.5e-17) 20. GSL DIF: elapsed time t=1.00465 s, 262144 iters, t-(init.)=0.957293 s t(norm)=0.456473, mflops=10.9535 (err=6.5e-17) 21. Krukar: elapsed time t=1.2 s, 2097152 iters, t-(init.)=0.870991 s t(norm)=0.0519151, mflops=96.311 (err=5.3e-17) 22. Mayer (Buneman): elapsed time t=1.45571 s, 1048576 iters, t-(init.)=1.30983 s t(norm)=0.156144, mflops=32.0218 (err=1.3e-16) 23. Mayer (simple): elapsed time t=1.55927 s, 1048576 iters, t-(init.)=1.36476 s t(norm)=0.162692, mflops=30.7329 24. 
Mayer (lookup): elapsed time t=1.58095 s, 1048576 iters, t-(init.)=1.44321 s t(norm)=0.172044, mflops=29.0624 (err=1.3e-16) 25. Monro: elapsed time t=1.32828 s, 262144 iters, t-(init.)=1.28628 s t(norm)=0.613347, mflops=8.15199 (err=1.3e-16) 26. NAPACK (f2c): elapsed time t=1.0062 s, 131072 iters, t-(init.)=0.989006 s t(norm)=0.943189, mflops=5.30116 (err=1.6e-16) 27. Ooura (C): elapsed time t=1.70458 s, 2097152 iters, t-(init.)=1.40857 s t(norm)=0.0839572, mflops=59.5542 (err=5.3e-17) 28. Ooura (F): elapsed time t=1.6671 s, 2097152 iters, t-(init.)=1.39188 s t(norm)=0.0829627, mflops=60.268 (err=5.3e-17) 29. Ransom: elapsed time t=1.31949 s, 131072 iters, t-(init.)=1.29871 s t(norm)=1.23855, mflops=4.03698 (err=1.6e-16) 30. SCIPORT: elapsed time t=1.07733 s, 1048576 iters, t-(init.)=0.909919 s t(norm)=0.108471, mflops=46.0953 (err=6.5e-17) 31. Singleton: elapsed time t=1.30645 s, 524288 iters, t-(init.)=1.23669 s t(norm)=0.294849, mflops=16.9578 (err=5.3e-17) 32. Singleton (f2c): elapsed time t=1.29462 s, 524288 iters, t-(init.)=1.22582 s t(norm)=0.292258, mflops=17.1081 (err=5.3e-17) 33. Sorensen: elapsed time t=1.60758 s, 1048576 iters, t-(init.)=1.44476 s t(norm)=0.172229, mflops=29.0311 (err=1.3e-16) 34. Sorensen DIT: elapsed time t=1.68911 s, 1048576 iters, t-(init.)=1.54746 s t(norm)=0.184471, mflops=27.1045 (err=1.3e-16) 35. Temperton: elapsed time t=1.97334 s, 524288 iters, t-(init.)=1.90464 s t(norm)=0.454101, mflops=11.0108 (err=5.3e-17) 36. Temperton (f2c): elapsed time t=1.28395 s, 262144 iters, t-(init.)=1.23473 s t(norm)=0.588767, mflops=8.49232 (err=5.3e-17) 37. Valkenburg: elapsed time t=1.90284 s, 262144 iters, t-(init.)=1.86632 s t(norm)=0.889931, mflops=5.61841 (err=1.6e-16) Top mflops for N=4 = 217.757 Normalized results and averages for N=4: fft 0: mflops = 60.2042 (norm. = 0.276475), norm. avg. (of 2) = 0.373022 fft 1: mflops = 56.3313 (norm. = 0.258689), norm. avg. (of 2) = 0.385383 fft 2: mflops = 24.7992 (norm. = 0.113885), norm. avg. 
(of 2) = 0.136769 fft 3: mflops = 7.44555 (norm. = 0.034192), norm. avg. (of 2) = 0.0235768 fft 4: mflops = 9.98212 (norm. = 0.0458407), norm. avg. (of 2) = 0.0539516 fft 5: mflops = 10.5031 (norm. = 0.0482332), norm. avg. (of 2) = 0.0579463 fft 6: mflops = 20.8316 (norm. = 0.0956644), norm. avg. (of 2) = 0.0796966 fft 7: mflops = 17.4616 (norm. = 0.0801883), norm. avg. (of 2) = 0.0793376 fft 8: mflops = 28.0063 (norm. = 0.128613), norm. avg. (of 2) = 0.275354 fft 9: mflops = 11.424 (norm. = 0.0524622), norm. avg. (of 2) = 0.0417573 fft 10: mflops = 7.14313 (norm. = 0.0328032), norm. avg. (of 2) = 0.0310591 fft 11: mflops = 23.2573 (norm. = 0.106804), norm. avg. (of 1) = 0.106804 fft 12: mflops = 38.4934 (norm. = 0.176772), norm. avg. (of 2) = 0.144319 fft 13: mflops = 30.0962 (norm. = 0.13821), norm. avg. (of 2) = 0.118766 fft 14: mflops = 147.644 (norm. = 0.678023), norm. avg. (of 2) = 0.585363 fft 15: mflops = 134.789 (norm. = 0.618989), norm. avg. (of 2) = 0.521859 fft 16: mflops = 217.757 (norm. = 1), norm. avg. (of 2) = 1 fft 17: mflops = -1 (norm. = -0.00459228), norm. avg. (of 0) = -1 fft 18: mflops = 24.973 (norm. = 0.114683), norm. avg. (of 2) = 0.0990397 fft 19: mflops = 10.7228 (norm. = 0.0492421), norm. avg. (of 2) = 0.0510743 fft 20: mflops = 10.9535 (norm. = 0.0503017), norm. avg. (of 2) = 0.0521024 fft 21: mflops = 96.311 (norm. = 0.442287), norm. avg. (of 2) = 0.403959 fft 22: mflops = 32.0218 (norm. = 0.147053), norm. avg. (of 1) = 0.147053 fft 23: mflops = 30.7329 (norm. = 0.141134), norm. avg. (of 1) = 0.141134 fft 24: mflops = 29.0624 (norm. = 0.133462), norm. avg. (of 1) = 0.133462 fft 25: mflops = 8.15199 (norm. = 0.0374362), norm. avg. (of 1) = 0.0374362 fft 26: mflops = 5.30116 (norm. = 0.0243444), norm. avg. (of 2) = 0.0249632 fft 27: mflops = 59.5542 (norm. = 0.273489), norm. avg. (of 2) = 0.255291 fft 28: mflops = 60.268 (norm. = 0.276767), norm. avg. (of 2) = 0.260504 fft 29: mflops = 4.03698 (norm. = 0.0185389), norm. avg. 
(of 1) = 0.0185389 fft 30: mflops = 46.0953 (norm. = 0.211683), norm. avg. (of 1) = 0.211683 fft 31: mflops = 16.9578 (norm. = 0.0778751), norm. avg. (of 2) = 0.0597017 fft 32: mflops = 17.1081 (norm. = 0.0785654), norm. avg. (of 2) = 0.0601973 fft 33: mflops = 29.0311 (norm. = 0.133319), norm. avg. (of 2) = 0.161302 fft 34: mflops = 27.1045 (norm. = 0.124471), norm. avg. (of 2) = 0.31512 fft 35: mflops = 11.0108 (norm. = 0.0505645), norm. avg. (of 2) = 0.0415744 fft 36: mflops = 8.49232 (norm. = 0.0389991), norm. avg. (of 2) = 0.0318985 fft 37: mflops = 5.61841 (norm. = 0.0258013), norm. avg. (of 2) = 0.0428912 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.16635 s, 1048576 iters, t-(init.)=0.963344 s t(norm)=0.0382799, mflops=130.617 (err=1.2e-16) 1. Arndt DIT: elapsed time t=1.34619 s, 1048576 iters, t-(init.)=1.15771 s t(norm)=0.0460033, mflops=108.688 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.07205 s, 262144 iters, t-(init.)=1.02695 s t(norm)=0.163229, mflops=30.6319 (err=1.1e-16) 3. Arndt 4-step: elapsed time t=1.11623 s, 65536 iters, t-(init.)=1.10617 s t(norm)=0.703287, mflops=7.10948 (err=1.3e-16) 4. Bailey: elapsed time t=1.69469 s, 262144 iters, t-(init.)=1.6497 s t(norm)=0.262213, mflops=19.0684 (err=9.8e-17) 5. Beauregard: elapsed time t=1.69038 s, 131072 iters, t-(init.)=1.67027 s t(norm)=0.530964, mflops=9.41684 (err=1.2e-16) 6. Bergland: elapsed time t=1.88491 s, 524288 iters, t-(init.)=1.78976 s t(norm)=0.142238, mflops=35.1524 (err=1.3e-16) 7. Brenner: elapsed time t=1.40048 s, 262144 iters, t-(init.)=1.34551 s t(norm)=0.213863, mflops=23.3794 (err=1.2e-16) 8. Burrus: elapsed time t=1.55553 s, 262144 iters, t-(init.)=1.51322 s t(norm)=0.24052, mflops=20.7883 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.12066 s, 262144 iters, t-(init.)=1.07677 s t(norm)=0.171149, mflops=29.2144 10. CWP (best N) (N=15): elapsed time t=1.54618 s, 262144 iters, t-(init.)=1.47219 s t(norm)=0.233998, mflops=21.3677 11. 
Edelblute: elapsed time t=1.70376 s, 262144 iters, t-(init.)=1.65936 s t(norm)=0.263748, mflops=18.9575 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.17622 s, 524288 iters, t-(init.)=1.07942 s t(norm)=0.0857848, mflops=58.2854 (err=1.2e-16) 13. FFTPACK (f2c): elapsed time t=1.49643 s, 524288 iters, t-(init.)=1.416 s t(norm)=0.112533, mflops=44.4313 (err=1.2e-16) FFTW_MEASURE plan: (cost = 5.909766e-07) FFTW_NOTW 8 14. FFTW: elapsed time t=1.2517 s, 2097152 iters, t-(init.)=0.890319 s t(norm)=0.0176891, mflops=282.661 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.26971 s, 2097152 iters, t-(init.)=0.914326 s t(norm)=0.018166, mflops=275.239 (err=1.2e-16) 16. Frigo-old: elapsed time t=1.17727 s, 2097152 iters, t-(init.)=0.759442 s t(norm)=0.0150888, mflops=331.372 (err=1.4e-16) 17. Green: elapsed time t=1.67535 s, 1048576 iters, t-(init.)=1.49643 s t(norm)=0.0594627, mflops=84.0863 (err=1.4e-16) 18. GSL: elapsed time t=1.72643 s, 524288 iters, t-(init.)=1.6214 s t(norm)=0.128857, mflops=38.8027 (err=1.4e-16) 19. GSL DIT: elapsed time t=1.74203 s, 262144 iters, t-(init.)=1.69855 s t(norm)=0.269978, mflops=18.52 (err=1.2e-16) 20. GSL DIF: elapsed time t=1.72182 s, 262144 iters, t-(init.)=1.66683 s t(norm)=0.264936, mflops=18.8725 (err=1.4e-16) 21. Krukar: elapsed time t=1.05526 s, 1048576 iters, t-(init.)=0.861619 s t(norm)=0.0342377, mflops=146.038 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.24047 s, 524288 iters, t-(init.)=1.12806 s t(norm)=0.0896505, mflops=55.7722 (err=1.2e-16) 23. Mayer (simple): elapsed time t=1.2622 s, 524288 iters, t-(init.)=1.15775 s t(norm)=0.0920095, mflops=54.3422 24. Mayer (lookup): elapsed time t=1.31956 s, 524288 iters, t-(init.)=1.23921 s t(norm)=0.0984838, mflops=50.7698 (err=1.2e-16) 25. Monro: elapsed time t=1.65037 s, 262144 iters, t-(init.)=1.60518 s t(norm)=0.255136, mflops=19.5974 (err=1.1e-08) 26. 
NAPACK (f2c): elapsed time t=1.56047 s, 131072 iters, t-(init.)=1.53563 s t(norm)=0.488164, mflops=10.2425 (err=1.7e-16) 27. Ooura (C): elapsed time t=1.69497 s, 1048576 iters, t-(init.)=1.5261 s t(norm)=0.0606419, mflops=82.4513 (err=1.3e-16) 28. Ooura (F): elapsed time t=1.63567 s, 1048576 iters, t-(init.)=1.45784 s t(norm)=0.0579295, mflops=86.3119 (err=1.3e-16) 29. Ransom: elapsed time t=1.96875 s, 65536 iters, t-(init.)=1.95533 s t(norm)=1.24316, mflops=4.022 (err=3.4e-16) 30. SCIPORT: elapsed time t=1.14285 s, 524288 iters, t-(init.)=1.04107 s t(norm)=0.0827365, mflops=60.4329 (err=1.4e-16) 31. Singleton: elapsed time t=1.91337 s, 262144 iters, t-(init.)=1.85735 s t(norm)=0.295217, mflops=16.9367 (err=1.4e-16) 32. Singleton (f2c): elapsed time t=1.93643 s, 262144 iters, t-(init.)=1.89621 s t(norm)=0.301394, mflops=16.5896 (err=1.4e-16) 33. Sorensen: elapsed time t=1.25783 s, 524288 iters, t-(init.)=1.1506 s t(norm)=0.0914413, mflops=54.6799 (err=1.5e-16) 34. Sorensen DIT: elapsed time t=1.55617 s, 262144 iters, t-(init.)=1.49477 s t(norm)=0.237587, mflops=21.045 (err=1.1e-16) 35. Temperton: elapsed time t=1.44074 s, 262144 iters, t-(init.)=1.40058 s t(norm)=0.222615, mflops=22.4603 (err=4.6e-09) 36. Temperton (f2c): elapsed time t=1.07642 s, 131072 iters, t-(init.)=1.04841 s t(norm)=0.33328, mflops=15.0024 (err=1.4e-16) 37. Valkenburg: elapsed time t=1.38097 s, 65536 iters, t-(init.)=1.36905 s t(norm)=0.870419, mflops=5.74436 (err=1.5e-16) Top mflops for N=8 = 331.372 Normalized results and averages for N=8: fft 0: mflops = 130.617 (norm. = 0.39417), norm. avg. (of 3) = 0.380071 fft 1: mflops = 108.688 (norm. = 0.327993), norm. avg. (of 3) = 0.366253 fft 2: mflops = 30.6319 (norm. = 0.0924395), norm. avg. (of 3) = 0.121993 fft 3: mflops = 7.10948 (norm. = 0.0214546), norm. avg. (of 3) = 0.0228694 fft 4: mflops = 19.0684 (norm. = 0.0575439), norm. avg. (of 3) = 0.055149 fft 5: mflops = 9.41684 (norm. = 0.0284177), norm. avg. 
(of 3) = 0.0481034 fft 6: mflops = 35.1524 (norm. = 0.106081), norm. avg. (of 3) = 0.0884915 fft 7: mflops = 23.3794 (norm. = 0.0705533), norm. avg. (of 3) = 0.0764095 fft 8: mflops = 20.7883 (norm. = 0.062734), norm. avg. (of 3) = 0.204481 fft 9: mflops = 29.2144 (norm. = 0.0881618), norm. avg. (of 3) = 0.0572255 fft 10: mflops = 21.3677 (norm. = 0.0644825), norm. avg. (of 3) = 0.0422003 fft 11: mflops = 18.9575 (norm. = 0.057209), norm. avg. (of 2) = 0.0820065 fft 12: mflops = 58.2854 (norm. = 0.175891), norm. avg. (of 3) = 0.154843 fft 13: mflops = 44.4313 (norm. = 0.134083), norm. avg. (of 3) = 0.123872 fft 14: mflops = 282.661 (norm. = 0.853), norm. avg. (of 3) = 0.674576 fft 15: mflops = 275.239 (norm. = 0.830604), norm. avg. (of 3) = 0.624774 fft 16: mflops = 331.372 (norm. = 1), norm. avg. (of 3) = 1 fft 17: mflops = 84.0863 (norm. = 0.253752), norm. avg. (of 1) = 0.253752 fft 18: mflops = 38.8027 (norm. = 0.117097), norm. avg. (of 3) = 0.105059 fft 19: mflops = 18.52 (norm. = 0.0558889), norm. avg. (of 3) = 0.0526792 fft 20: mflops = 18.8725 (norm. = 0.0569525), norm. avg. (of 3) = 0.0537191 fft 21: mflops = 146.038 (norm. = 0.440707), norm. avg. (of 3) = 0.416208 fft 22: mflops = 55.7722 (norm. = 0.168307), norm. avg. (of 2) = 0.15768 fft 23: mflops = 54.3422 (norm. = 0.163991), norm. avg. (of 2) = 0.152563 fft 24: mflops = 50.7698 (norm. = 0.153211), norm. avg. (of 2) = 0.143337 fft 25: mflops = 19.5974 (norm. = 0.0591401), norm. avg. (of 2) = 0.0482881 fft 26: mflops = 10.2425 (norm. = 0.0309092), norm. avg. (of 3) = 0.0269452 fft 27: mflops = 82.4513 (norm. = 0.248818), norm. avg. (of 3) = 0.253133 fft 28: mflops = 86.3119 (norm. = 0.260468), norm. avg. (of 3) = 0.260492 fft 29: mflops = 4.022 (norm. = 0.0121374), norm. avg. (of 2) = 0.0153382 fft 30: mflops = 60.4329 (norm. = 0.182371), norm. avg. (of 2) = 0.197027 fft 31: mflops = 16.9367 (norm. = 0.0511107), norm. avg. (of 3) = 0.0568381 fft 32: mflops = 16.5896 (norm. = 0.0500633), norm. avg. 
(of 3) = 0.0568193 fft 33: mflops = 54.6799 (norm. = 0.16501), norm. avg. (of 3) = 0.162538 fft 34: mflops = 21.045 (norm. = 0.0635085), norm. avg. (of 3) = 0.231249 fft 35: mflops = 22.4603 (norm. = 0.0677795), norm. avg. (of 3) = 0.0503095 fft 36: mflops = 15.0024 (norm. = 0.0452736), norm. avg. (of 3) = 0.0363568 fft 37: mflops = 5.74436 (norm. = 0.0173351), norm. avg. (of 3) = 0.0343725 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.63579 s, 262144 iters, t-(init.)=1.5618 s t(norm)=0.0930905, mflops=53.7112 (err=1.5e-16) 1. Arndt DIT: elapsed time t=1.56544 s, 262144 iters, t-(init.)=1.49558 s t(norm)=0.0891436, mflops=56.0893 (err=2.2e-16) 2. Arndt Split-Radix: elapsed time t=1.23569 s, 131072 iters, t-(init.)=1.19984 s t(norm)=0.143032, mflops=34.9571 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.32985 s, 65536 iters, t-(init.)=1.31344 s t(norm)=0.313147, mflops=15.9669 (err=2.0e-16) 4. Bailey: elapsed time t=1.45411 s, 131072 iters, t-(init.)=1.4176 s t(norm)=0.168991, mflops=29.5874 (err=2.0e-16) 5. Beauregard: elapsed time t=1.06496 s, 32768 iters, t-(init.)=1.05677 s t(norm)=0.503905, mflops=9.9225 (err=2.7e-16) 6. Bergland: elapsed time t=1.76463 s, 262144 iters, t-(init.)=1.6916 s t(norm)=0.100827, mflops=49.5897 (err=2.6e-16) 7. Brenner: elapsed time t=1.20324 s, 131072 iters, t-(init.)=1.16458 s t(norm)=0.138828, mflops=36.0157 (err=2.1e-16) 8. Burrus: elapsed time t=1.07999 s, 65536 iters, t-(init.)=1.06119 s t(norm)=0.253008, mflops=19.7622 (err=1.4e-16) 9. CWP (min N): elapsed time t=1.58858 s, 262144 iters, t-(init.)=1.51741 s t(norm)=0.0904449, mflops=55.2823 10. CWP (best N) (N=28): elapsed time t=1.06654 s, 131072 iters, t-(init.)=1.01467 s t(norm)=0.120958, mflops=41.3365 11. Edelblute: elapsed time t=1.18529 s, 65536 iters, t-(init.)=1.16653 s t(norm)=0.278122, mflops=17.9777 (err=1.4e-16) 12. 
FFTPACK: elapsed time t=1.79658 s, 524288 iters, t-(init.)=1.66323 s t(norm)=0.049568, mflops=100.872 (err=1.8e-16) 13. FFTPACK (f2c): elapsed time t=1.1516 s, 262144 iters, t-(init.)=1.07443 s t(norm)=0.0640413, mflops=78.0747 (err=1.8e-16) FFTW_MEASURE plan: (cost = 1.092727e-06) FFTW_NOTW 16 14. FFTW: elapsed time t=1.23344 s, 1048576 iters, t-(init.)=0.937444 s t(norm)=0.013969, mflops=357.935 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.22678 s, 1048576 iters, t-(init.)=0.958991 s t(norm)=0.0142901, mflops=349.893 (err=1.8e-16) 16. Frigo-old: elapsed time t=1.20575 s, 1048576 iters, t-(init.)=0.925923 s t(norm)=0.0137973, mflops=362.389 (err=1.8e-16) 17. Green: elapsed time t=1.8158 s, 524288 iters, t-(init.)=1.64449 s t(norm)=0.0490097, mflops=102.021 (err=1.9e-16) 18. GSL: elapsed time t=1.50752 s, 262144 iters, t-(init.)=1.42212 s t(norm)=0.0847648, mflops=58.9868 (err=1.8e-16) 19. GSL DIT: elapsed time t=1.48535 s, 131072 iters, t-(init.)=1.45193 s t(norm)=0.173083, mflops=28.8878 (err=2.1e-16) 20. GSL DIF: elapsed time t=1.36273 s, 131072 iters, t-(init.)=1.31934 s t(norm)=0.157277, mflops=31.791 (err=2.8e-16) 21. Krukar: elapsed time t=1.2439 s, 524288 iters, t-(init.)=1.10182 s t(norm)=0.0328368, mflops=152.268 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.71115 s, 262144 iters, t-(init.)=1.6382 s t(norm)=0.0976443, mflops=51.2063 (err=1.7e-16) 23. Mayer (simple): elapsed time t=1.48674 s, 262144 iters, t-(init.)=1.40841 s t(norm)=0.0839479, mflops=59.5607 24. Mayer (lookup): elapsed time t=1.53938 s, 262144 iters, t-(init.)=1.47382 s t(norm)=0.0878465, mflops=56.9175 (err=1.8e-16) 25. Monro: elapsed time t=1.33273 s, 131072 iters, t-(init.)=1.29503 s t(norm)=0.15438, mflops=32.3876 (err=2.1e-08) 26. NAPACK (f2c): elapsed time t=1.30185 s, 65536 iters, t-(init.)=1.28441 s t(norm)=0.306228, mflops=16.3277 (err=3.3e-16) 27. 
Ooura (C): elapsed time t=1.80009 s, 524288 iters, t-(init.)=1.66888 s t(norm)=0.0497364, mflops=100.53 (err=2.0e-16) 28. Ooura (F): elapsed time t=1.77741 s, 524288 iters, t-(init.)=1.64603 s t(norm)=0.0490554, mflops=101.926 (err=2.0e-16) 29. Ransom: elapsed time t=1.28832 s, 65536 iters, t-(init.)=1.26956 s t(norm)=0.302686, mflops=16.5187 (err=3.4e-16) 30. SCIPORT: elapsed time t=1.09373 s, 262144 iters, t-(init.)=1.02078 s t(norm)=0.0608434, mflops=82.1782 (err=2.8e-16) 31. Singleton: elapsed time t=1.81772 s, 262144 iters, t-(init.)=1.73676 s t(norm)=0.103519, mflops=48.3003 (err=1.7e-16) 32. Singleton (f2c): elapsed time t=1.81822 s, 262144 iters, t-(init.)=1.75266 s t(norm)=0.104467, mflops=47.862 (err=1.7e-16) 33. Sorensen: elapsed time t=1.19379 s, 262144 iters, t-(init.)=1.11776 s t(norm)=0.0666234, mflops=75.0487 (err=1.5e-16) 34. Sorensen DIT: elapsed time t=1.09486 s, 65536 iters, t-(init.)=1.07449 s t(norm)=0.256178, mflops=19.5177 (err=1.6e-16) 35. Temperton: elapsed time t=1.13567 s, 131072 iters, t-(init.)=1.09702 s t(norm)=0.130775, mflops=38.2336 (err=1.7e-08) 36. Temperton (f2c): elapsed time t=1.52902 s, 131072 iters, t-(init.)=1.49255 s t(norm)=0.177926, mflops=28.1016 (err=1.8e-16) 37. Valkenburg: elapsed time t=1.78695 s, 32768 iters, t-(init.)=1.77612 s t(norm)=0.84692, mflops=5.90375 (err=2.9e-16) Top mflops for N=16 = 362.389 Normalized results and averages for N=16: fft 0: mflops = 53.7112 (norm. = 0.148214), norm. avg. (of 4) = 0.322107 fft 1: mflops = 56.0893 (norm. = 0.154776), norm. avg. (of 4) = 0.313384 fft 2: mflops = 34.9571 (norm. = 0.096463), norm. avg. (of 4) = 0.11561 fft 3: mflops = 15.9669 (norm. = 0.0440602), norm. avg. (of 4) = 0.0281671 fft 4: mflops = 29.5874 (norm. = 0.0816454), norm. avg. (of 4) = 0.0617731 fft 5: mflops = 9.9225 (norm. = 0.0273808), norm. avg. (of 4) = 0.0429228 fft 6: mflops = 49.5897 (norm. = 0.136841), norm. avg. (of 4) = 0.100579 fft 7: mflops = 36.0157 (norm. = 0.0993842), norm. avg. 
(of 4) = 0.0821532 fft 8: mflops = 19.7622 (norm. = 0.0545332), norm. avg. (of 4) = 0.166994 fft 9: mflops = 55.2823 (norm. = 0.15255), norm. avg. (of 4) = 0.0810565 fft 10: mflops = 41.3365 (norm. = 0.114067), norm. avg. (of 4) = 0.0601669 fft 11: mflops = 17.9777 (norm. = 0.049609), norm. avg. (of 3) = 0.0712073 fft 12: mflops = 100.872 (norm. = 0.278352), norm. avg. (of 4) = 0.18572 fft 13: mflops = 78.0747 (norm. = 0.215444), norm. avg. (of 4) = 0.146765 fft 14: mflops = 357.935 (norm. = 0.987711), norm. avg. (of 4) = 0.752859 fft 15: mflops = 349.893 (norm. = 0.965519), norm. avg. (of 4) = 0.70996 fft 16: mflops = 362.389 (norm. = 1), norm. avg. (of 4) = 1 fft 17: mflops = 102.021 (norm. = 0.281522), norm. avg. (of 2) = 0.267637 fft 18: mflops = 58.9868 (norm. = 0.162772), norm. avg. (of 4) = 0.119487 fft 19: mflops = 28.8878 (norm. = 0.079715), norm. avg. (of 4) = 0.0594381 fft 20: mflops = 31.791 (norm. = 0.0877263), norm. avg. (of 4) = 0.0622209 fft 21: mflops = 152.268 (norm. = 0.420178), norm. avg. (of 4) = 0.417201 fft 22: mflops = 51.2063 (norm. = 0.141302), norm. avg. (of 3) = 0.15222 fft 23: mflops = 59.5607 (norm. = 0.164356), norm. avg. (of 3) = 0.156494 fft 24: mflops = 56.9175 (norm. = 0.157062), norm. avg. (of 3) = 0.147912 fft 25: mflops = 32.3876 (norm. = 0.0893725), norm. avg. (of 3) = 0.0619829 fft 26: mflops = 16.3277 (norm. = 0.0450558), norm. avg. (of 4) = 0.0314729 fft 27: mflops = 100.53 (norm. = 0.277409), norm. avg. (of 4) = 0.259202 fft 28: mflops = 101.926 (norm. = 0.28126), norm. avg. (of 4) = 0.265684 fft 29: mflops = 16.5187 (norm. = 0.0455829), norm. avg. (of 3) = 0.0254197 fft 30: mflops = 82.1782 (norm. = 0.226768), norm. avg. (of 3) = 0.206941 fft 31: mflops = 48.3003 (norm. = 0.133283), norm. avg. (of 4) = 0.0759493 fft 32: mflops = 47.862 (norm. = 0.132074), norm. avg. (of 4) = 0.0756329 fft 33: mflops = 75.0487 (norm. = 0.207094), norm. avg. (of 4) = 0.173677 fft 34: mflops = 19.5177 (norm. = 0.0538584), norm. avg. 
(of 4) = 0.186902 fft 35: mflops = 38.2336 (norm. = 0.105504), norm. avg. (of 4) = 0.0641082 fft 36: mflops = 28.1016 (norm. = 0.0775455), norm. avg. (of 4) = 0.046654 fft 37: mflops = 5.90375 (norm. = 0.0162912), norm. avg. (of 4) = 0.0298522 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.65768 s, 131072 iters, t-(init.)=1.59525 s t(norm)=0.0760673, mflops=65.7313 (err=3.1e-16) 1. Arndt DIT: elapsed time t=1.6719 s, 131072 iters, t-(init.)=1.60501 s t(norm)=0.076533, mflops=65.3313 (err=2.5e-16) 2. Arndt Split-Radix: elapsed time t=1.36714 s, 65536 iters, t-(init.)=1.3377 s t(norm)=0.127573, mflops=39.1933 (err=2.7e-16) 3. Arndt 4-step: elapsed time t=1.61251 s, 32768 iters, t-(init.)=1.59794 s t(norm)=0.304783, mflops=16.4051 (err=2.8e-16) 4. Bailey: elapsed time t=1.10384 s, 65536 iters, t-(init.)=1.07228 s t(norm)=0.102261, mflops=48.8946 (err=2.7e-16) 5. Beauregard: elapsed time t=1.35732 s, 16384 iters, t-(init.)=1.35006 s t(norm)=0.515005, mflops=9.70864 (err=1.8e-16) 6. Bergland: elapsed time t=1.64429 s, 131072 iters, t-(init.)=1.58236 s t(norm)=0.0754529, mflops=66.2665 (err=2.6e-16) 7. Brenner: elapsed time t=1.18887 s, 65536 iters, t-(init.)=1.15609 s t(norm)=0.110253, mflops=45.3503 (err=2.2e-16) 8. Burrus: elapsed time t=1.2851 s, 32768 iters, t-(init.)=1.2704 s t(norm)=0.242309, mflops=20.6348 (err=2.9e-16) 9. CWP (min N) (N=33): elapsed time t=1.61703 s, 131072 iters, t-(init.)=1.55354 s t(norm)=0.0740786, mflops=67.4959 10. CWP (best N) (N=35): elapsed time t=1.30477 s, 131072 iters, t-(init.)=1.24173 s t(norm)=0.0592102, mflops=84.4449 11. Edelblute: elapsed time t=1.38113 s, 32768 iters, t-(init.)=1.3666 s t(norm)=0.260657, mflops=19.1823 (err=2.9e-16) 12. FFTPACK: elapsed time t=1.90088 s, 262144 iters, t-(init.)=1.77083 s t(norm)=0.0422198, mflops=118.428 (err=1.9e-16) 13. 
FFTPACK (f2c): elapsed time t=1.37715 s, 131072 iters, t-(init.)=1.31897 s t(norm)=0.0628934, mflops=79.4997 (err=1.9e-16) FFTW_MEASURE plan: (cost = 3.546344e-06) FFTW_TWIDDLE 4 FFTW_NOTW 8 14. FFTW: elapsed time t=1.84981 s, 524288 iters, t-(init.)=1.5876 s t(norm)=0.0189256, mflops=264.192 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.62022 s, 524288 iters, t-(init.)=1.37495 s t(norm)=0.0163907, mflops=305.052 (err=2.1e-16) 16. Frigo-old: elapsed time t=1.59247 s, 524288 iters, t-(init.)=1.33335 s t(norm)=0.0158948, mflops=314.569 (err=2.2e-16) 17. Green: elapsed time t=1.73832 s, 262144 iters, t-(init.)=1.61439 s t(norm)=0.03849, mflops=129.904 (err=2.0e-16) 18. GSL: elapsed time t=1.01855 s, 65536 iters, t-(init.)=0.987331 s t(norm)=0.0941592, mflops=53.1016 (err=2.0e-16) 19. GSL DIT: elapsed time t=1.28868 s, 65536 iters, t-(init.)=1.2592 s t(norm)=0.120087, mflops=41.6365 (err=2.2e-16) 20. GSL DIF: elapsed time t=1.20016 s, 65536 iters, t-(init.)=1.16819 s t(norm)=0.111407, mflops=44.8803 (err=2.5e-16) 21. Krukar: elapsed time t=1.41901 s, 262144 iters, t-(init.)=1.28794 s t(norm)=0.0307069, mflops=162.83 (err=2.2e-16) 22. Mayer (Buneman): elapsed time t=1.74495 s, 131072 iters, t-(init.)=1.68311 s t(norm)=0.0802571, mflops=62.2998 (err=2.7e-16) 23. Mayer (simple): elapsed time t=1.48781 s, 131072 iters, t-(init.)=1.41901 s t(norm)=0.0676636, mflops=73.895 24. Mayer (lookup): elapsed time t=1.43 s, 131072 iters, t-(init.)=1.36967 s t(norm)=0.0653111, mflops=76.5567 (err=2.5e-16) 25. Monro: elapsed time t=1.03641 s, 65536 iters, t-(init.)=1.00444 s t(norm)=0.0957908, mflops=52.1971 (err=3.7e-08) 26. NAPACK (f2c): elapsed time t=1.17538 s, 32768 iters, t-(init.)=1.16083 s t(norm)=0.221411, mflops=22.5825 (err=5.4e-16) 27. Ooura (C): elapsed time t=1.89159 s, 262144 iters, t-(init.)=1.77052 s t(norm)=0.0422125, mflops=118.448 (err=2.7e-16) 28. 
Ooura (F): elapsed time t=1.94019 s, 262144 iters, t-(init.)=1.81188 s t(norm)=0.0431985, mflops=115.745 (err=2.7e-16) 29. Ransom: elapsed time t=1.87767 s, 32768 iters, t-(init.)=1.8626 s t(norm)=0.355264, mflops=14.0741 (err=7.0e-16) 30. SCIPORT: elapsed time t=1.14624 s, 131072 iters, t-(init.)=1.08431 s t(norm)=0.0517038, mflops=96.7046 (err=1.8e-16) 31. Singleton: elapsed time t=1.63166 s, 131072 iters, t-(init.)=1.56144 s t(norm)=0.0744554, mflops=67.1543 (err=2.2e-16) 32. Singleton (f2c): elapsed time t=1.61711 s, 131072 iters, t-(init.)=1.55895 s t(norm)=0.0743364, mflops=67.2618 (err=2.2e-16) 33. Sorensen: elapsed time t=1.00729 s, 131072 iters, t-(init.)=0.941604 s t(norm)=0.0448992, mflops=111.361 (err=2.7e-16) 34. Sorensen DIT: elapsed time t=1.26044 s, 32768 iters, t-(init.)=1.2439 s t(norm)=0.237255, mflops=21.0744 (err=2.6e-16) 35. Temperton: elapsed time t=1.11455 s, 65536 iters, t-(init.)=1.08545 s t(norm)=0.103516, mflops=48.3015 (err=3.1e-08) 36. Temperton (f2c): elapsed time t=1.72069 s, 65536 iters, t-(init.)=1.68977 s t(norm)=0.161149, mflops=31.0272 (err=2.0e-16) 37. Valkenburg: elapsed time t=1.09017 s, 8192 iters, t-(init.)=1.08631 s t(norm)=0.828786, mflops=6.03292 (err=4.3e-16) Top mflops for N=32 = 314.569 Normalized results and averages for N=32: fft 0: mflops = 65.7313 (norm. = 0.208957), norm. avg. (of 5) = 0.299477 fft 1: mflops = 65.3313 (norm. = 0.207685), norm. avg. (of 5) = 0.292244 fft 2: mflops = 39.1933 (norm. = 0.124594), norm. avg. (of 5) = 0.117407 fft 3: mflops = 16.4051 (norm. = 0.052151), norm. avg. (of 5) = 0.0329639 fft 4: mflops = 48.8946 (norm. = 0.155434), norm. avg. (of 5) = 0.0805052 fft 5: mflops = 9.70864 (norm. = 0.0308633), norm. avg. (of 5) = 0.0405109 fft 6: mflops = 66.2665 (norm. = 0.210658), norm. avg. (of 5) = 0.122595 fft 7: mflops = 45.3503 (norm. = 0.144166), norm. avg. (of 5) = 0.0945558 fft 8: mflops = 20.6348 (norm. = 0.0655971), norm. avg. (of 5) = 0.146715 fft 9: mflops = 67.4959 (norm. 
= 0.214566), norm. avg. (of 5) = 0.107758 fft 10: mflops = 84.4449 (norm. = 0.268446), norm. avg. (of 5) = 0.101823 fft 11: mflops = 19.1823 (norm. = 0.0609796), norm. avg. (of 4) = 0.0686504 fft 12: mflops = 118.428 (norm. = 0.376476), norm. avg. (of 5) = 0.223871 fft 13: mflops = 79.4997 (norm. = 0.252726), norm. avg. (of 5) = 0.167957 fft 14: mflops = 264.192 (norm. = 0.839854), norm. avg. (of 5) = 0.770258 fft 15: mflops = 305.052 (norm. = 0.969745), norm. avg. (of 5) = 0.761917 fft 16: mflops = 314.569 (norm. = 1), norm. avg. (of 5) = 1 fft 17: mflops = 129.904 (norm. = 0.412958), norm. avg. (of 3) = 0.316077 fft 18: mflops = 53.1016 (norm. = 0.168807), norm. avg. (of 5) = 0.129351 fft 19: mflops = 41.6365 (norm. = 0.132361), norm. avg. (of 5) = 0.0740226 fft 20: mflops = 44.8803 (norm. = 0.142673), norm. avg. (of 5) = 0.0783112 fft 21: mflops = 162.83 (norm. = 0.517628), norm. avg. (of 5) = 0.437286 fft 22: mflops = 62.2998 (norm. = 0.198048), norm. avg. (of 4) = 0.163677 fft 23: mflops = 73.895 (norm. = 0.234909), norm. avg. (of 4) = 0.176098 fft 24: mflops = 76.5567 (norm. = 0.24337), norm. avg. (of 4) = 0.171776 fft 25: mflops = 52.1971 (norm. = 0.165932), norm. avg. (of 4) = 0.0879702 fft 26: mflops = 22.5825 (norm. = 0.0717886), norm. avg. (of 5) = 0.039536 fft 27: mflops = 118.448 (norm. = 0.376542), norm. avg. (of 5) = 0.28267 fft 28: mflops = 115.745 (norm. = 0.367947), norm. avg. (of 5) = 0.286137 fft 29: mflops = 14.0741 (norm. = 0.0447408), norm. avg. (of 4) = 0.03025 fft 30: mflops = 96.7046 (norm. = 0.307419), norm. avg. (of 4) = 0.23206 fft 31: mflops = 67.1543 (norm. = 0.213481), norm. avg. (of 5) = 0.103456 fft 32: mflops = 67.2618 (norm. = 0.213822), norm. avg. (of 5) = 0.103271 fft 33: mflops = 111.361 (norm. = 0.35401), norm. avg. (of 5) = 0.209744 fft 34: mflops = 21.0744 (norm. = 0.0669946), norm. avg. (of 5) = 0.16292 fft 35: mflops = 48.3015 (norm. = 0.153548), norm. avg. (of 5) = 0.0819962 fft 36: mflops = 31.0272 (norm. 
= 0.0986341), norm. avg. (of 5) = 0.05705 fft 37: mflops = 6.03292 (norm. = 0.0191784), norm. avg. (of 5) = 0.0277174 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.04044 s, 32768 iters, t-(init.)=1.01187 s t(norm)=0.0804163, mflops=62.1764 (err=5.7e-16) 1. Arndt DIT: elapsed time t=1.92123 s, 65536 iters, t-(init.)=1.86573 s t(norm)=0.0741375, mflops=67.4422 (err=5.6e-16) 2. Arndt Split-Radix: elapsed time t=1.46833 s, 32768 iters, t-(init.)=1.44079 s t(norm)=0.114504, mflops=43.6667 (err=5.7e-16) 3. Arndt 4-step: elapsed time t=1.29037 s, 16384 iters, t-(init.)=1.27601 s t(norm)=0.202816, mflops=24.6529 (err=5.6e-16) 4. Bailey: elapsed time t=1.83963 s, 65536 iters, t-(init.)=1.78116 s t(norm)=0.0707771, mflops=70.6444 (err=5.7e-16) 5. Beauregard: elapsed time t=1.67482 s, 8192 iters, t-(init.)=1.66801 s t(norm)=0.530246, mflops=9.42958 (err=5.9e-16) 6. Bergland: elapsed time t=1.76484 s, 65536 iters, t-(init.)=1.70503 s t(norm)=0.0677519, mflops=73.7987 (err=5.9e-16) 7. Brenner: elapsed time t=1.12667 s, 32768 iters, t-(init.)=1.09721 s t(norm)=0.0871988, mflops=57.3402 (err=5.8e-16) 8. Burrus: elapsed time t=1.41001 s, 16384 iters, t-(init.)=1.39637 s t(norm)=0.221947, mflops=22.5279 (err=5.7e-16) 9. CWP (min N) (N=65): elapsed time t=1.51851 s, 65536 iters, t-(init.)=1.45925 s t(norm)=0.0579853, mflops=86.2287 10. CWP (best N) (N=84): elapsed time t=1.05464 s, 65536 iters, t-(init.)=0.984249 s t(norm)=0.0391105, mflops=127.843 11. Edelblute: elapsed time t=1.53608 s, 16384 iters, t-(init.)=1.52244 s t(norm)=0.241985, mflops=20.6625 (err=5.7e-16) 12. FFTPACK: elapsed time t=1.73598 s, 131072 iters, t-(init.)=1.62658 s t(norm)=0.0323172, mflops=154.716 (err=5.6e-16) 13. FFTPACK (f2c): elapsed time t=1.30674 s, 65536 iters, t-(init.)=1.25224 s t(norm)=0.0497594, mflops=100.484 (err=5.6e-16) FFTW_MEASURE plan: (cost = 7.729750e-06) FFTW_TWIDDLE 4 FFTW_NOTW 16 14. 
FFTW: elapsed time t=1.86288 s, 262144 iters, t-(init.)=1.63199 s t(norm)=0.0162124, mflops=308.406 (err=5.5e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.30195 s, 131072 iters, t-(init.)=1.1919 s t(norm)=0.0236809, mflops=211.14 (err=5.6e-16) 16. Frigo-old: elapsed time t=1.73989 s, 131072 iters, t-(init.)=1.62713 s t(norm)=0.0323281, mflops=154.664 (err=5.6e-16) 17. Green: elapsed time t=1.64553 s, 131072 iters, t-(init.)=1.53112 s t(norm)=0.0304205, mflops=164.363 (err=5.5e-16) 18. GSL: elapsed time t=1.95195 s, 65536 iters, t-(init.)=1.89504 s t(norm)=0.075302, mflops=66.3993 (err=5.6e-16) 19. GSL DIT: elapsed time t=1.2362 s, 32768 iters, t-(init.)=1.20898 s t(norm)=0.0960814, mflops=52.0392 (err=5.6e-16) 20. GSL DIF: elapsed time t=1.13724 s, 32768 iters, t-(init.)=1.1085 s t(norm)=0.0880956, mflops=56.7565 (err=5.4e-16) 21. Krukar: elapsed time t=1.02318 s, 65536 iters, t-(init.)=0.967685 s t(norm)=0.0384523, mflops=130.031 (err=6.0e-16) 22. Mayer (Buneman): elapsed time t=1.04172 s, 32768 iters, t-(init.)=1.01408 s t(norm)=0.0805918, mflops=62.041 (err=5.4e-16) 23. Mayer (simple): elapsed time t=1.61035 s, 65536 iters, t-(init.)=1.55143 s t(norm)=0.0616483, mflops=81.1052 24. Mayer (lookup): elapsed time t=1.60143 s, 65536 iters, t-(init.)=1.5469 s t(norm)=0.0614681, mflops=81.343 (err=5.4e-16) 25. Monro: elapsed time t=1.86707 s, 65536 iters, t-(init.)=1.81184 s t(norm)=0.0719961, mflops=69.4483 (err=3.4e-08) 26. NAPACK (f2c): elapsed time t=1.12122 s, 16384 iters, t-(init.)=1.10735 s t(norm)=0.176008, mflops=28.4078 (err=1.1e-15) 27. Ooura (C): elapsed time t=1.00891 s, 65536 iters, t-(init.)=0.954451 s t(norm)=0.0379265, mflops=131.834 (err=5.9e-16) 28. Ooura (F): elapsed time t=1.02803 s, 65536 iters, t-(init.)=0.971136 s t(norm)=0.0385895, mflops=129.569 (err=5.9e-16) 29. Ransom: elapsed time t=1.95194 s, 32768 iters, t-(init.)=1.92367 s t(norm)=0.152879, mflops=32.7055 (err=8.6e-16) 30. 
SCIPORT: elapsed time t=1.13054 s, 65536 iters, t-(init.)=1.07377 s t(norm)=0.0426679, mflops=117.184 (err=5.9e-16) 31. Singleton: elapsed time t=1.32374 s, 65536 iters, t-(init.)=1.26745 s t(norm)=0.050364, mflops=99.2772 (err=9.2e-16) 32. Singleton (f2c): elapsed time t=1.30416 s, 65536 iters, t-(init.)=1.24965 s t(norm)=0.0496567, mflops=100.691 (err=9.2e-16) 33. Sorensen: elapsed time t=1.01185 s, 65536 iters, t-(init.)=0.953012 s t(norm)=0.0378693, mflops=132.033 (err=5.4e-16) 34. Sorensen DIT: elapsed time t=1.39306 s, 16384 iters, t-(init.)=1.37926 s t(norm)=0.219227, mflops=22.8074 (err=5.5e-16) 35. Temperton: elapsed time t=1.74114 s, 65536 iters, t-(init.)=1.68662 s t(norm)=0.0670203, mflops=74.6043 (err=3.8e-08) 36. Temperton (f2c): elapsed time t=1.34664 s, 32768 iters, t-(init.)=1.31678 s t(norm)=0.104648, mflops=47.7791 (err=5.6e-16) 37. Valkenburg: elapsed time t=1.30997 s, 4096 iters, t-(init.)=1.30652 s t(norm)=0.830661, mflops=6.0193 (err=8.1e-16) Top mflops for N=64 = 308.406 Normalized results and averages for N=64: fft 0: mflops = 62.1764 (norm. = 0.201606), norm. avg. (of 6) = 0.283165 fft 1: mflops = 67.4422 (norm. = 0.21868), norm. avg. (of 6) = 0.279983 fft 2: mflops = 43.6667 (norm. = 0.141588), norm. avg. (of 6) = 0.121437 fft 3: mflops = 24.6529 (norm. = 0.0799365), norm. avg. (of 6) = 0.0407927 fft 4: mflops = 70.6444 (norm. = 0.229063), norm. avg. (of 6) = 0.105265 fft 5: mflops = 9.42958 (norm. = 0.0305752), norm. avg. (of 6) = 0.0388549 fft 6: mflops = 73.7987 (norm. = 0.23929), norm. avg. (of 6) = 0.142044 fft 7: mflops = 57.3402 (norm. = 0.185924), norm. avg. (of 6) = 0.109784 fft 8: mflops = 22.5279 (norm. = 0.0730461), norm. avg. (of 6) = 0.134437 fft 9: mflops = 86.2287 (norm. = 0.279595), norm. avg. (of 6) = 0.136398 fft 10: mflops = 127.843 (norm. = 0.414527), norm. avg. (of 6) = 0.15394 fft 11: mflops = 20.6625 (norm. = 0.0669976), norm. avg. (of 5) = 0.0683198 fft 12: mflops = 154.716 (norm. = 0.501664), norm. avg. 
(of 6) = 0.27017 fft 13: mflops = 100.484 (norm. = 0.325816), norm. avg. (of 6) = 0.194267 fft 14: mflops = 308.406 (norm. = 1), norm. avg. (of 6) = 0.808549 fft 15: mflops = 211.14 (norm. = 0.684617), norm. avg. (of 6) = 0.749034 fft 16: mflops = 154.664 (norm. = 0.501495), norm. avg. (of 6) = 0.916916 fft 17: mflops = 164.363 (norm. = 0.532942), norm. avg. (of 4) = 0.370294 fft 18: mflops = 66.3993 (norm. = 0.215298), norm. avg. (of 6) = 0.143676 fft 19: mflops = 52.0392 (norm. = 0.168736), norm. avg. (of 6) = 0.0898082 fft 20: mflops = 56.7565 (norm. = 0.184032), norm. avg. (of 6) = 0.0959313 fft 21: mflops = 130.031 (norm. = 0.421623), norm. avg. (of 6) = 0.434676 fft 22: mflops = 62.041 (norm. = 0.201167), norm. avg. (of 5) = 0.171175 fft 23: mflops = 81.1052 (norm. = 0.262982), norm. avg. (of 5) = 0.193474 fft 24: mflops = 81.343 (norm. = 0.263753), norm. avg. (of 5) = 0.190172 fft 25: mflops = 69.4483 (norm. = 0.225184), norm. avg. (of 5) = 0.115413 fft 26: mflops = 28.4078 (norm. = 0.0921117), norm. avg. (of 6) = 0.0482986 fft 27: mflops = 131.834 (norm. = 0.427469), norm. avg. (of 6) = 0.306803 fft 28: mflops = 129.569 (norm. = 0.420124), norm. avg. (of 6) = 0.308468 fft 29: mflops = 32.7055 (norm. = 0.106047), norm. avg. (of 5) = 0.0454094 fft 30: mflops = 117.184 (norm. = 0.379966), norm. avg. (of 5) = 0.261642 fft 31: mflops = 99.2772 (norm. = 0.321904), norm. avg. (of 6) = 0.139864 fft 32: mflops = 100.691 (norm. = 0.326489), norm. avg. (of 6) = 0.140474 fft 33: mflops = 132.033 (norm. = 0.428114), norm. avg. (of 6) = 0.246139 fft 34: mflops = 22.8074 (norm. = 0.0739526), norm. avg. (of 6) = 0.148092 fft 35: mflops = 74.6043 (norm. = 0.241903), norm. avg. (of 6) = 0.108647 fft 36: mflops = 47.7791 (norm. = 0.154922), norm. avg. (of 6) = 0.0733621 fft 37: mflops = 6.0193 (norm. = 0.0195175), norm. avg. (of 6) = 0.0263507 Benchmarking for array size = 128 (power of 2): 0. 
Arndt DIF: elapsed time t=1.10883 s, 16384 iters, t-(init.)=1.08198 s t(norm)=0.0737037, mflops=67.8392 (err=3.5e-16) 1. Arndt DIT: elapsed time t=1.02716 s, 16384 iters, t-(init.)=1.00033 s t(norm)=0.0681422, mflops=73.3759 (err=3.2e-16) 2. Arndt Split-Radix: elapsed time t=1.59485 s, 16384 iters, t-(init.)=1.56855 s t(norm)=0.106849, mflops=46.7949 (err=3.6e-16) 3. Arndt 4-step: elapsed time t=1.55643 s, 8192 iters, t-(init.)=1.54328 s t(norm)=0.210255, mflops=23.7807 (err=3.3e-16) 4. Bailey: elapsed time t=1.5474 s, 32768 iters, t-(init.)=1.49267 s t(norm)=0.05084, mflops=98.3478 (err=3.2e-16) 5. Beauregard: elapsed time t=1.0024 s, 2048 iters, t-(init.)=0.999028 s t(norm)=0.544427, mflops=9.18397 (err=3.7e-16) 6. Bergland: elapsed time t=1.87026 s, 32768 iters, t-(init.)=1.81726 s t(norm)=0.0618953, mflops=80.7815 (err=3.7e-16) 7. Brenner: elapsed time t=1.18742 s, 16384 iters, t-(init.)=1.16017 s t(norm)=0.07903, mflops=63.2671 (err=4.1e-16) 8. Burrus: elapsed time t=1.50849 s, 8192 iters, t-(init.)=1.49498 s t(norm)=0.203675, mflops=24.5489 (err=3.4e-16) 9. CWP (min N) (N=130): elapsed time t=1.35165 s, 32768 iters, t-(init.)=1.29785 s t(norm)=0.0442045, mflops=113.111 10. CWP (best N) (N=140): elapsed time t=1.73491 s, 65536 iters, t-(init.)=1.61926 s t(norm)=0.0275758, mflops=181.318 11. Edelblute: elapsed time t=1.6221 s, 8192 iters, t-(init.)=1.60893 s t(norm)=0.219199, mflops=22.8103 (err=3.4e-16) 12. FFTPACK: elapsed time t=1.62607 s, 65536 iters, t-(init.)=1.5205 s t(norm)=0.025894, mflops=193.095 (err=3.5e-16) 13. FFTPACK (f2c): elapsed time t=1.31162 s, 32768 iters, t-(init.)=1.25903 s t(norm)=0.0428822, mflops=116.598 (err=3.5e-16) FFTW_MEASURE plan: (cost = 1.558863e-05) FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.22327 s, 65536 iters, t-(init.)=1.11487 s t(norm)=0.0189862, mflops=263.349 (err=3.8e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.42821 s, 65536 iters, t-(init.)=1.32239 s t(norm)=0.0225202, mflops=222.023 (err=3.5e-16) 16. Frigo-old: elapsed time t=1.68196 s, 65536 iters, t-(init.)=1.57478 s t(norm)=0.0268183, mflops=186.44 (err=3.4e-16) 17. Green: elapsed time t=1.89501 s, 65536 iters, t-(init.)=1.78582 s t(norm)=0.0304122, mflops=164.407 (err=4.2e-16) 18. GSL: elapsed time t=1.00389 s, 16384 iters, t-(init.)=0.97702 s t(norm)=0.0665542, mflops=75.1267 (err=3.4e-16) 19. GSL DIT: elapsed time t=1.28114 s, 16384 iters, t-(init.)=1.25484 s t(norm)=0.0854795, mflops=58.4936 (err=3.5e-16) 20. GSL DIF: elapsed time t=1.0818 s, 16384 iters, t-(init.)=1.05471 s t(norm)=0.0718465, mflops=69.5929 (err=3.7e-16) 21. Krukar: elapsed time t=1.93071 s, 32768 iters, t-(init.)=1.87649 s t(norm)=0.0639129, mflops=78.2315 (err=3.6e-16) 22. Mayer (Buneman): elapsed time t=1.07188 s, 16384 iters, t-(init.)=1.04539 s t(norm)=0.0712114, mflops=70.2135 (err=3.2e-16) 23. Mayer (simple): elapsed time t=1.66141 s, 32768 iters, t-(init.)=1.60768 s t(norm)=0.0547574, mflops=91.3119 24. Mayer (lookup): elapsed time t=1.58703 s, 32768 iters, t-(init.)=1.53437 s t(norm)=0.0522603, mflops=95.6749 (err=3.4e-16) 25. Monro: elapsed time t=1.82664 s, 32768 iters, t-(init.)=1.77313 s t(norm)=0.0603924, mflops=82.7919 (err=5.2e-08) 26. NAPACK (f2c): elapsed time t=1.17182 s, 8192 iters, t-(init.)=1.15854 s t(norm)=0.157838, mflops=31.678 (err=1.2e-15) 27. Ooura (C): elapsed time t=1.05982 s, 32768 iters, t-(init.)=1.00687 s t(norm)=0.0342936, mflops=145.8 (err=3.3e-16) 28. Ooura (F): elapsed time t=1.08844 s, 32768 iters, t-(init.)=1.03458 s t(norm)=0.0352377, mflops=141.893 (err=3.3e-16) 29. Ransom: elapsed time t=1.32906 s, 8192 iters, t-(init.)=1.31565 s t(norm)=0.179243, mflops=27.8951 (err=1.0e-15) 30. SCIPORT: elapsed time t=1.26876 s, 32768 iters, t-(init.)=1.21571 s t(norm)=0.0414067, mflops=120.753 (err=3.6e-16) 31. 
Singleton: elapsed time t=1.58482 s, 32768 iters, t-(init.)=1.53178 s t(norm)=0.052172, mflops=95.8369 (err=4.2e-16) 32. Singleton (f2c): elapsed time t=1.58633 s, 32768 iters, t-(init.)=1.53326 s t(norm)=0.0522224, mflops=95.7443 (err=4.2e-16) 33. Sorensen: elapsed time t=1.85238 s, 65536 iters, t-(init.)=1.74298 s t(norm)=0.0296827, mflops=168.448 (err=3.1e-16) 34. Sorensen DIT: elapsed time t=1.48383 s, 8192 iters, t-(init.)=1.47 s t(norm)=0.200271, mflops=24.9661 (err=3.1e-16) 35. Temperton: elapsed time t=1.01328 s, 16384 iters, t-(init.)=0.986386 s t(norm)=0.0671922, mflops=74.4133 (err=4.7e-08) 36. Temperton (f2c): elapsed time t=1.58518 s, 16384 iters, t-(init.)=1.55789 s t(norm)=0.106123, mflops=47.1151 (err=3.6e-16) 37. Valkenburg: elapsed time t=1.51295 s, 2048 iters, t-(init.)=1.50952 s t(norm)=0.822626, mflops=6.0781 (err=5.7e-16) Top mflops for N=128 = 263.349 Normalized results and averages for N=128: fft 0: mflops = 67.8392 (norm. = 0.257601), norm. avg. (of 7) = 0.279513 fft 1: mflops = 73.3759 (norm. = 0.278626), norm. avg. (of 7) = 0.279789 fft 2: mflops = 46.7949 (norm. = 0.177691), norm. avg. (of 7) = 0.129473 fft 3: mflops = 23.7807 (norm. = 0.0903008), norm. avg. (of 7) = 0.0478652 fft 4: mflops = 98.3478 (norm. = 0.37345), norm. avg. (of 7) = 0.143577 fft 5: mflops = 9.18397 (norm. = 0.0348737), norm. avg. (of 7) = 0.0382862 fft 6: mflops = 80.7815 (norm. = 0.306747), norm. avg. (of 7) = 0.165573 fft 7: mflops = 63.2671 (norm. = 0.24024), norm. avg. (of 7) = 0.12842 fft 8: mflops = 24.5489 (norm. = 0.0932179), norm. avg. (of 7) = 0.128548 fft 9: mflops = 113.111 (norm. = 0.429508), norm. avg. (of 7) = 0.178271 fft 10: mflops = 181.318 (norm. = 0.688508), norm. avg. (of 7) = 0.230307 fft 11: mflops = 22.8103 (norm. = 0.0866162), norm. avg. (of 6) = 0.0713692 fft 12: mflops = 193.095 (norm. = 0.733226), norm. avg. (of 7) = 0.336321 fft 13: mflops = 116.598 (norm. = 0.442752), norm. avg. (of 7) = 0.229765 fft 14: mflops = 263.349 (norm. 
= 1), norm. avg. (of 7) = 0.835899 fft 15: mflops = 222.023 (norm. = 0.843074), norm. avg. (of 7) = 0.762468 fft 16: mflops = 186.44 (norm. = 0.707956), norm. avg. (of 7) = 0.887064 fft 17: mflops = 164.407 (norm. = 0.624294), norm. avg. (of 5) = 0.421094 fft 18: mflops = 75.1267 (norm. = 0.285274), norm. avg. (of 7) = 0.163904 fft 19: mflops = 58.4936 (norm. = 0.222114), norm. avg. (of 7) = 0.108709 fft 20: mflops = 69.5929 (norm. = 0.264261), norm. avg. (of 7) = 0.119978 fft 21: mflops = 78.2315 (norm. = 0.297063), norm. avg. (of 7) = 0.415017 fft 22: mflops = 70.2135 (norm. = 0.266617), norm. avg. (of 6) = 0.187082 fft 23: mflops = 91.3119 (norm. = 0.346733), norm. avg. (of 6) = 0.219017 fft 24: mflops = 95.6749 (norm. = 0.3633), norm. avg. (of 6) = 0.219026 fft 25: mflops = 82.7919 (norm. = 0.31438), norm. avg. (of 6) = 0.148574 fft 26: mflops = 31.678 (norm. = 0.120289), norm. avg. (of 7) = 0.058583 fft 27: mflops = 145.8 (norm. = 0.553635), norm. avg. (of 7) = 0.342065 fft 28: mflops = 141.893 (norm. = 0.538803), norm. avg. (of 7) = 0.341373 fft 29: mflops = 27.8951 (norm. = 0.105924), norm. avg. (of 6) = 0.0554952 fft 30: mflops = 120.753 (norm. = 0.458529), norm. avg. (of 6) = 0.294456 fft 31: mflops = 95.8369 (norm. = 0.363915), norm. avg. (of 7) = 0.171871 fft 32: mflops = 95.7443 (norm. = 0.363564), norm. avg. (of 7) = 0.172344 fft 33: mflops = 168.448 (norm. = 0.639638), norm. avg. (of 7) = 0.302353 fft 34: mflops = 24.9661 (norm. = 0.0948023), norm. avg. (of 7) = 0.140479 fft 35: mflops = 74.4133 (norm. = 0.282565), norm. avg. (of 7) = 0.133493 fft 36: mflops = 47.1151 (norm. = 0.178907), norm. avg. (of 7) = 0.08844 fft 37: mflops = 6.0781 (norm. = 0.02308), norm. avg. (of 7) = 0.0258835 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.25751 s, 8192 iters, t-(init.)=1.23142 s t(norm)=0.0733983, mflops=68.1215 (err=9.6e-16) 1. 
Arndt DIT: elapsed time t=1.16297 s, 8192 iters, t-(init.)=1.13671 s t(norm)=0.0677534, mflops=73.797 (err=9.9e-16) 2. Arndt Split-Radix: elapsed time t=1.72722 s, 8192 iters, t-(init.)=1.70137 s t(norm)=0.101409, mflops=49.3051 (err=9.8e-16) 3. Arndt 4-step: elapsed time t=1.58526 s, 4096 iters, t-(init.)=1.57234 s t(norm)=0.187438, mflops=26.6755 (err=1.0e-15) 4. Bailey: elapsed time t=1.74321 s, 16384 iters, t-(init.)=1.69109 s t(norm)=0.0503983, mflops=99.2097 (err=9.8e-16) 5. Beauregard: elapsed time t=1.16271 s, 1024 iters, t-(init.)=1.15948 s t(norm)=0.552883, mflops=9.0435 (err=1.1e-15) 6. Bergland: elapsed time t=1.88753 s, 16384 iters, t-(init.)=1.83535 s t(norm)=0.0546978, mflops=91.4113 (err=1.0e-15) 7. Brenner: elapsed time t=1.16872 s, 8192 iters, t-(init.)=1.14269 s t(norm)=0.0681095, mflops=73.4112 (err=1.1e-15) 8. Burrus: elapsed time t=1.55325 s, 4096 iters, t-(init.)=1.54033 s t(norm)=0.183622, mflops=27.2299 (err=1.0e-15) 9. CWP (min N) (N=260): elapsed time t=1.23237 s, 16384 iters, t-(init.)=1.17971 s t(norm)=0.0351582, mflops=142.214 10. CWP (best N) (N=280): elapsed time t=1.83744 s, 32768 iters, t-(init.)=1.72455 s t(norm)=0.0256979, mflops=194.569 11. Edelblute: elapsed time t=1.66016 s, 4096 iters, t-(init.)=1.64717 s t(norm)=0.196358, mflops=25.4636 (err=1.0e-15) 12. FFTPACK: elapsed time t=1.81155 s, 32768 iters, t-(init.)=1.70797 s t(norm)=0.0254508, mflops=196.457 (err=1.1e-15) 13. FFTPACK (f2c): elapsed time t=1.44307 s, 16384 iters, t-(init.)=1.39081 s t(norm)=0.0414495, mflops=120.629 (err=1.1e-15) FFTW_MEASURE plan: (cost = 3.840275e-05) FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.24387 s, 32768 iters, t-(init.)=1.13841 s t(norm)=0.0169636, mflops=294.749 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.54157 s, 32768 iters, t-(init.)=1.43748 s t(norm)=0.0214202, mflops=233.425 (err=1.1e-15) 16. 
Frigo-old: elapsed time t=1.02651 s, 16384 iters, t-(init.)=0.974248 s t(norm)=0.0290349, mflops=172.207 (err=1.1e-15) 17. Green: elapsed time t=1.11013 s, 16384 iters, t-(init.)=1.05712 s t(norm)=0.0315047, mflops=158.706 (err=1.1e-15) 18. GSL: elapsed time t=1.05115 s, 8192 iters, t-(init.)=1.02509 s t(norm)=0.0611003, mflops=81.8327 (err=1.1e-15) 19. GSL DIT: elapsed time t=1.30761 s, 8192 iters, t-(init.)=1.28148 s t(norm)=0.0763824, mflops=65.4601 (err=1.0e-15) 20. GSL DIF: elapsed time t=1.12805 s, 8192 iters, t-(init.)=1.10169 s t(norm)=0.0656659, mflops=76.1431 (err=1.1e-15) 21. Krukar: elapsed time t=1.72595 s, 16384 iters, t-(init.)=1.67351 s t(norm)=0.0498746, mflops=100.251 (err=1.1e-15) 22. Mayer (Buneman): elapsed time t=1.16352 s, 8192 iters, t-(init.)=1.13719 s t(norm)=0.0677819, mflops=73.7661 (err=9.7e-16) 23. Mayer (simple): elapsed time t=1.80062 s, 16384 iters, t-(init.)=1.74764 s t(norm)=0.0520837, mflops=95.9992 24. Mayer (lookup): elapsed time t=1.83009 s, 16384 iters, t-(init.)=1.77776 s t(norm)=0.0529812, mflops=94.373 (err=9.6e-16) 25. Monro: elapsed time t=1.79583 s, 16384 iters, t-(init.)=1.7431 s t(norm)=0.0519484, mflops=96.2493 (err=8.5e-08) 26. NAPACK (f2c): elapsed time t=1.27054 s, 4096 iters, t-(init.)=1.25754 s t(norm)=0.14991, mflops=33.3533 (err=3.8e-15) 27. Ooura (C): elapsed time t=1.14055 s, 16384 iters, t-(init.)=1.0887 s t(norm)=0.0324457, mflops=154.104 (err=9.8e-16) 28. Ooura (F): elapsed time t=1.16743 s, 16384 iters, t-(init.)=1.11487 s t(norm)=0.0332257, mflops=150.486 (err=9.8e-16) 29. Ransom: elapsed time t=1.76492 s, 8192 iters, t-(init.)=1.73871 s t(norm)=0.103635, mflops=48.2461 (err=1.9e-15) 30. SCIPORT: elapsed time t=1.39041 s, 16384 iters, t-(init.)=1.33775 s t(norm)=0.0398682, mflops=125.413 (err=1.1e-15) 31. Singleton: elapsed time t=1.32374 s, 16384 iters, t-(init.)=1.27154 s t(norm)=0.0378948, mflops=131.944 (err=1.7e-15) 32. 
Singleton (f2c): elapsed time t=1.27879 s, 16384 iters, t-(init.)=1.22686 s t(norm)=0.0365634, mflops=136.749 (err=1.7e-15) 33. Sorensen: elapsed time t=1.01384 s, 16384 iters, t-(init.)=0.96125 s t(norm)=0.0286475, mflops=174.535 (err=9.8e-16) 34. Sorensen DIT: elapsed time t=1.52986 s, 4096 iters, t-(init.)=1.51689 s t(norm)=0.180828, mflops=27.6506 (err=9.8e-16) 35. Temperton: elapsed time t=1.92941 s, 16384 iters, t-(init.)=1.87715 s t(norm)=0.0559435, mflops=89.376 (err=9.5e-08) 36. Temperton (f2c): elapsed time t=1.50112 s, 8192 iters, t-(init.)=1.47458 s t(norm)=0.0878918, mflops=56.8881 (err=1.1e-15) 37. Valkenburg: elapsed time t=1.73599 s, 1024 iters, t-(init.)=1.73268 s t(norm)=0.826207, mflops=6.05176 (err=1.2e-15) Top mflops for N=256 = 294.749 Normalized results and averages for N=256: fft 0: mflops = 68.1215 (norm. = 0.231117), norm. avg. (of 8) = 0.273464 fft 1: mflops = 73.797 (norm. = 0.250373), norm. avg. (of 8) = 0.276112 fft 2: mflops = 49.3051 (norm. = 0.167278), norm. avg. (of 8) = 0.134199 fft 3: mflops = 26.6755 (norm. = 0.0905026), norm. avg. (of 8) = 0.0531949 fft 4: mflops = 99.2097 (norm. = 0.336591), norm. avg. (of 8) = 0.167704 fft 5: mflops = 9.0435 (norm. = 0.0306821), norm. avg. (of 8) = 0.0373357 fft 6: mflops = 91.4113 (norm. = 0.310133), norm. avg. (of 8) = 0.183643 fft 7: mflops = 73.4112 (norm. = 0.249064), norm. avg. (of 8) = 0.143501 fft 8: mflops = 27.2299 (norm. = 0.0923836), norm. avg. (of 8) = 0.124028 fft 9: mflops = 142.214 (norm. = 0.482494), norm. avg. (of 8) = 0.216299 fft 10: mflops = 194.569 (norm. = 0.660118), norm. avg. (of 8) = 0.284033 fft 11: mflops = 25.4636 (norm. = 0.0863911), norm. avg. (of 7) = 0.0735152 fft 12: mflops = 196.457 (norm. = 0.666526), norm. avg. (of 8) = 0.377596 fft 13: mflops = 120.629 (norm. = 0.40926), norm. avg. (of 8) = 0.252202 fft 14: mflops = 294.749 (norm. = 1), norm. avg. (of 8) = 0.856411 fft 15: mflops = 233.425 (norm. = 0.791947), norm. avg. 
(of 8) = 0.766153 fft 16: mflops = 172.207 (norm. = 0.58425), norm. avg. (of 8) = 0.849213 fft 17: mflops = 158.706 (norm. = 0.538447), norm. avg. (of 6) = 0.440652 fft 18: mflops = 81.8327 (norm. = 0.277636), norm. avg. (of 8) = 0.17812 fft 19: mflops = 65.4601 (norm. = 0.222088), norm. avg. (of 8) = 0.122881 fft 20: mflops = 76.1431 (norm. = 0.258332), norm. avg. (of 8) = 0.137273 fft 21: mflops = 100.251 (norm. = 0.340125), norm. avg. (of 8) = 0.405655 fft 22: mflops = 73.7661 (norm. = 0.250268), norm. avg. (of 7) = 0.196109 fft 23: mflops = 95.9992 (norm. = 0.325699), norm. avg. (of 7) = 0.234258 fft 24: mflops = 94.373 (norm. = 0.320182), norm. avg. (of 7) = 0.233477 fft 25: mflops = 96.2493 (norm. = 0.326547), norm. avg. (of 7) = 0.173999 fft 26: mflops = 33.3533 (norm. = 0.113159), norm. avg. (of 8) = 0.0654049 fft 27: mflops = 154.104 (norm. = 0.522831), norm. avg. (of 8) = 0.364661 fft 28: mflops = 150.486 (norm. = 0.510556), norm. avg. (of 8) = 0.362521 fft 29: mflops = 48.2461 (norm. = 0.163686), norm. avg. (of 7) = 0.070951 fft 30: mflops = 125.413 (norm. = 0.425493), norm. avg. (of 7) = 0.313176 fft 31: mflops = 131.944 (norm. = 0.44765), norm. avg. (of 8) = 0.206343 fft 32: mflops = 136.749 (norm. = 0.463951), norm. avg. (of 8) = 0.208795 fft 33: mflops = 174.535 (norm. = 0.592151), norm. avg. (of 8) = 0.338578 fft 34: mflops = 27.6506 (norm. = 0.0938109), norm. avg. (of 8) = 0.134646 fft 35: mflops = 89.376 (norm. = 0.303228), norm. avg. (of 8) = 0.15471 fft 36: mflops = 56.8881 (norm. = 0.193006), norm. avg. (of 8) = 0.101511 fft 37: mflops = 6.05176 (norm. = 0.0205319), norm. avg. (of 8) = 0.0252145 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.29609 s, 4096 iters, t-(init.)=1.27038 s t(norm)=0.067307, mflops=74.2865 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.1785 s, 4096 iters, t-(init.)=1.15278 s t(norm)=0.0610763, mflops=81.8648 (err=1.1e-15) 2. 
Arndt Split-Radix: elapsed time t=1.74464 s, 4096 iters, t-(init.)=1.71897 s t(norm)=0.0910745, mflops=54.9001 (err=1.1e-15) 3. Arndt 4-step: elapsed time t=1.66431 s, 2048 iters, t-(init.)=1.65143 s t(norm)=0.174992, mflops=28.5728 (err=1.0e-15) 4. Bailey: elapsed time t=1.97605 s, 8192 iters, t-(init.)=1.92455 s t(norm)=0.0509831, mflops=98.0717 (err=1.1e-15) 5. Beauregard: elapsed time t=1.31374 s, 512 iters, t-(init.)=1.31054 s t(norm)=0.555479, mflops=9.00124 (err=1.1e-15) 6. Bergland: elapsed time t=1.97569 s, 8192 iters, t-(init.)=1.92426 s t(norm)=0.0509755, mflops=98.0863 (err=1.0e-15) 7. Brenner: elapsed time t=1.22506 s, 4096 iters, t-(init.)=1.19926 s t(norm)=0.063539, mflops=78.6918 (err=1.0e-15) 8. Burrus: elapsed time t=1.57457 s, 2048 iters, t-(init.)=1.56175 s t(norm)=0.165489, mflops=30.2134 (err=1.1e-15) 9. CWP (min N) (N=520): elapsed time t=1.35085 s, 8192 iters, t-(init.)=1.29817 s t(norm)=0.0343898, mflops=145.392 10. CWP (best N) (N=560): elapsed time t=1.12283 s, 8192 iters, t-(init.)=1.06678 s t(norm)=0.0282601, mflops=176.928 11. Edelblute: elapsed time t=1.68822 s, 2048 iters, t-(init.)=1.67542 s t(norm)=0.177534, mflops=28.1637 (err=1.1e-15) 12. FFTPACK: elapsed time t=1.32095 s, 8192 iters, t-(init.)=1.26928 s t(norm)=0.0336244, mflops=148.702 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.12343 s, 4096 iters, t-(init.)=1.09777 s t(norm)=0.0581621, mflops=85.9667 (err=1.0e-15) FFTW_MEASURE plan: (cost = 1.140640e-04) FFTW_TWIDDLE 16 FFTW_NOTW 32 14. FFTW: elapsed time t=1.75423 s, 16384 iters, t-(init.)=1.65083 s t(norm)=0.0218661, mflops=228.665 (err=9.7e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.74313 s, 16384 iters, t-(init.)=1.63944 s t(norm)=0.0217152, mflops=230.254 (err=9.7e-16) 16. Frigo-old: elapsed time t=1.33429 s, 8192 iters, t-(init.)=1.28259 s t(norm)=0.0339771, mflops=147.158 (err=9.5e-16) 17. 
Green: elapsed time t=1.1152 s, 8192 iters, t-(init.)=1.06358 s t(norm)=0.0281751, mflops=177.461 (err=9.6e-16) 18. GSL: elapsed time t=1.51631 s, 4096 iters, t-(init.)=1.49058 s t(norm)=0.0789738, mflops=63.3121 (err=1.0e-15) 19. GSL DIT: elapsed time t=1.41929 s, 4096 iters, t-(init.)=1.39324 s t(norm)=0.0738164, mflops=67.7356 (err=1.2e-15) 20. GSL DIF: elapsed time t=1.15115 s, 4096 iters, t-(init.)=1.12536 s t(norm)=0.0596235, mflops=83.8595 (err=1.1e-15) 21. Krukar: elapsed time t=1.10903 s, 4096 iters, t-(init.)=1.08307 s t(norm)=0.0573833, mflops=87.1334 (err=1.0e-15) 22. Mayer (Buneman): elapsed time t=1.18917 s, 4096 iters, t-(init.)=1.16332 s t(norm)=0.0616351, mflops=81.1225 (err=1.0e-15) 23. Mayer (simple): elapsed time t=1.86252 s, 8192 iters, t-(init.)=1.81104 s t(norm)=0.0479761, mflops=104.219 24. Mayer (lookup): elapsed time t=1.87717 s, 8192 iters, t-(init.)=1.8259 s t(norm)=0.0483698, mflops=103.37 (err=1.0e-15) 25. Monro: elapsed time t=1.80416 s, 8192 iters, t-(init.)=1.75286 s t(norm)=0.0464349, mflops=107.678 (err=8.4e-08) 26. NAPACK (f2c): elapsed time t=1.51975 s, 2048 iters, t-(init.)=1.50692 s t(norm)=0.159679, mflops=31.3129 (err=7.1e-15) 27. Ooura (C): elapsed time t=1.25822 s, 8192 iters, t-(init.)=1.20702 s t(norm)=0.0319752, mflops=156.371 (err=9.6e-16) 28. Ooura (F): elapsed time t=1.2839 s, 8192 iters, t-(init.)=1.23251 s t(norm)=0.0326503, mflops=153.138 (err=9.6e-16) 29. Ransom: elapsed time t=1.10975 s, 2048 iters, t-(init.)=1.09689 s t(norm)=0.11623, mflops=43.0181 (err=1.4e-15) 30. SCIPORT: elapsed time t=1.78417 s, 8192 iters, t-(init.)=1.73248 s t(norm)=0.0458949, mflops=108.944 (err=1.0e-15) 31. Singleton: elapsed time t=1.41332 s, 8192 iters, t-(init.)=1.36155 s t(norm)=0.0360687, mflops=138.625 (err=1.2e-15) 32. Singleton (f2c): elapsed time t=1.37499 s, 8192 iters, t-(init.)=1.32371 s t(norm)=0.0350664, mflops=142.587 (err=1.2e-15) 33. 
Sorensen: elapsed time t=1.05312 s, 8192 iters, t-(init.)=1.00179 s t(norm)=0.0265384, mflops=188.406 (err=1.0e-15) 34. Sorensen DIT: elapsed time t=1.54689 s, 2048 iters, t-(init.)=1.53392 s t(norm)=0.16254, mflops=30.7616 (err=1.1e-15) 35. Temperton: elapsed time t=1.18159 s, 4096 iters, t-(init.)=1.15597 s t(norm)=0.0612454, mflops=81.6388 (err=1.0e-07) 36. Temperton (f2c): elapsed time t=1.93473 s, 4096 iters, t-(init.)=1.90902 s t(norm)=0.101143, mflops=49.4348 (err=1.0e-15) 37. Valkenburg: elapsed time t=1.94134 s, 512 iters, t-(init.)=1.93808 s t(norm)=0.821467, mflops=6.08667 (err=1.3e-15) Top mflops for N=512 = 230.254 Normalized results and averages for N=512: fft 0: mflops = 74.2865 (norm. = 0.322629), norm. avg. (of 9) = 0.278926 fft 1: mflops = 81.8648 (norm. = 0.355542), norm. avg. (of 9) = 0.284938 fft 2: mflops = 54.9001 (norm. = 0.238433), norm. avg. (of 9) = 0.145781 fft 3: mflops = 28.5728 (norm. = 0.124093), norm. avg. (of 9) = 0.0610724 fft 4: mflops = 98.0717 (norm. = 0.425929), norm. avg. (of 9) = 0.196395 fft 5: mflops = 9.00124 (norm. = 0.0390927), norm. avg. (of 9) = 0.0375309 fft 6: mflops = 98.0863 (norm. = 0.425992), norm. avg. (of 9) = 0.210571 fft 7: mflops = 78.6918 (norm. = 0.341761), norm. avg. (of 9) = 0.16553 fft 8: mflops = 30.2134 (norm. = 0.131218), norm. avg. (of 9) = 0.124827 fft 9: mflops = 145.392 (norm. = 0.631442), norm. avg. (of 9) = 0.262426 fft 10: mflops = 176.928 (norm. = 0.768404), norm. avg. (of 9) = 0.337852 fft 11: mflops = 28.1637 (norm. = 0.122316), norm. avg. (of 8) = 0.0796153 fft 12: mflops = 148.702 (norm. = 0.645817), norm. avg. (of 9) = 0.407399 fft 13: mflops = 85.9667 (norm. = 0.373356), norm. avg. (of 9) = 0.265663 fft 14: mflops = 228.665 (norm. = 0.993098), norm. avg. (of 9) = 0.871599 fft 15: mflops = 230.254 (norm. = 1), norm. avg. (of 9) = 0.792136 fft 16: mflops = 147.158 (norm. = 0.639111), norm. avg. (of 9) = 0.825868 fft 17: mflops = 177.461 (norm. = 0.770721), norm. avg. 
(of 7) = 0.487805 fft 18: mflops = 63.3121 (norm. = 0.274967), norm. avg. (of 9) = 0.188881 fft 19: mflops = 67.7356 (norm. = 0.294178), norm. avg. (of 9) = 0.141914 fft 20: mflops = 83.8595 (norm. = 0.364205), norm. avg. (of 9) = 0.162487 fft 21: mflops = 87.1334 (norm. = 0.378423), norm. avg. (of 9) = 0.40263 fft 22: mflops = 81.1225 (norm. = 0.352318), norm. avg. (of 8) = 0.215635 fft 23: mflops = 104.219 (norm. = 0.452625), norm. avg. (of 8) = 0.261554 fft 24: mflops = 103.37 (norm. = 0.448941), norm. avg. (of 8) = 0.26041 fft 25: mflops = 107.678 (norm. = 0.467647), norm. avg. (of 8) = 0.210705 fft 26: mflops = 31.3129 (norm. = 0.135993), norm. avg. (of 9) = 0.0732481 fft 27: mflops = 156.371 (norm. = 0.679126), norm. avg. (of 9) = 0.399601 fft 28: mflops = 153.138 (norm. = 0.665084), norm. avg. (of 9) = 0.396139 fft 29: mflops = 43.0181 (norm. = 0.186829), norm. avg. (of 8) = 0.0854357 fft 30: mflops = 108.944 (norm. = 0.47315), norm. avg. (of 8) = 0.333172 fft 31: mflops = 138.625 (norm. = 0.602051), norm. avg. (of 9) = 0.250311 fft 32: mflops = 142.587 (norm. = 0.619259), norm. avg. (of 9) = 0.254402 fft 33: mflops = 188.406 (norm. = 0.818255), norm. avg. (of 9) = 0.391875 fft 34: mflops = 30.7616 (norm. = 0.133599), norm. avg. (of 9) = 0.13453 fft 35: mflops = 81.6388 (norm. = 0.35456), norm. avg. (of 9) = 0.176915 fft 36: mflops = 49.4348 (norm. = 0.214697), norm. avg. (of 9) = 0.114087 fft 37: mflops = 6.08667 (norm. = 0.0264346), norm. avg. (of 9) = 0.0253501 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.35384 s, 2048 iters, t-(init.)=1.32827 s t(norm)=0.0633371, mflops=78.9427 (err=1.8e-15) 1. Arndt DIT: elapsed time t=1.23153 s, 2048 iters, t-(init.)=1.20585 s t(norm)=0.0574994, mflops=86.9574 (err=1.8e-15) 2. Arndt Split-Radix: elapsed time t=1.81206 s, 2048 iters, t-(init.)=1.78651 s t(norm)=0.0851874, mflops=58.6941 (err=1.8e-15) 3. 
Arndt 4-step: elapsed time t=1.58554 s, 1024 iters, t-(init.)=1.57273 s t(norm)=0.149987, mflops=33.3362 (err=1.8e-15) 4. Bailey: elapsed time t=1.19358 s, 2048 iters, t-(init.)=1.16792 s t(norm)=0.0556907, mflops=89.7816 (err=1.9e-15) 5. Beauregard: elapsed time t=1.47395 s, 256 iters, t-(init.)=1.47076 s t(norm)=0.561052, mflops=8.91183 (err=2.0e-15) 6. Bergland: elapsed time t=1.0657 s, 2048 iters, t-(init.)=1.04 s t(norm)=0.0495908, mflops=100.825 (err=2.2e-15) 7. Brenner: elapsed time t=1.25035 s, 2048 iters, t-(init.)=1.22473 s t(norm)=0.0583997, mflops=85.6169 (err=1.9e-15) 8. Burrus: elapsed time t=1.61149 s, 1024 iters, t-(init.)=1.59868 s t(norm)=0.152462, mflops=32.795 (err=1.8e-15) 9. CWP (min N) (N=1040): elapsed time t=1.6537 s, 4096 iters, t-(init.)=1.60172 s t(norm)=0.0381879, mflops=130.932 10. CWP (best N) (N=1040): elapsed time t=1.6563 s, 4096 iters, t-(init.)=1.60434 s t(norm)=0.0382503, mflops=130.718 11. Edelblute: elapsed time t=1.72038 s, 1024 iters, t-(init.)=1.70762 s t(norm)=0.162851, mflops=30.7028 (err=1.8e-15) 12. FFTPACK: elapsed time t=1.59086 s, 4096 iters, t-(init.)=1.53968 s t(norm)=0.0367088, mflops=136.207 (err=2.0e-15) 13. FFTPACK (f2c): elapsed time t=1.36712 s, 2048 iters, t-(init.)=1.34163 s t(norm)=0.0639741, mflops=78.1567 (err=2.0e-15) FFTW_MEASURE plan: (cost = 2.703210e-04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 4 FFTW_NOTW 16 14. FFTW: elapsed time t=1.10466 s, 4096 iters, t-(init.)=1.05351 s t(norm)=0.0251177, mflops=199.063 (err=1.9e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.12936 s, 4096 iters, t-(init.)=1.07824 s t(norm)=0.0257072, mflops=194.498 (err=2.0e-15) 16. Frigo-old: elapsed time t=1.80387 s, 4096 iters, t-(init.)=1.75278 s t(norm)=0.0417896, mflops=119.647 (err=1.9e-15) 17. Green: elapsed time t=1.37075 s, 4096 iters, t-(init.)=1.31945 s t(norm)=0.0314582, mflops=158.941 (err=2.0e-15) 18. 
GSL: elapsed time t=1.66139 s, 2048 iters, t-(init.)=1.63581 s t(norm)=0.0780014, mflops=64.1015 (err=2.0e-15) 19. GSL DIT: elapsed time t=1.51278 s, 2048 iters, t-(init.)=1.4873 s t(norm)=0.0709198, mflops=70.5022 (err=2.1e-15) 20. GSL DIF: elapsed time t=1.20535 s, 2048 iters, t-(init.)=1.17979 s t(norm)=0.056257, mflops=88.8779 (err=2.2e-15) 21. Krukar: elapsed time t=1.75894 s, 2048 iters, t-(init.)=1.73337 s t(norm)=0.0826535, mflops=60.4935 (err=1.9e-15) 22. Mayer (Buneman): elapsed time t=1.26539 s, 2048 iters, t-(init.)=1.23984 s t(norm)=0.0591203, mflops=84.5733 (err=1.8e-15) 23. Mayer (simple): elapsed time t=1.90429 s, 4096 iters, t-(init.)=1.85319 s t(norm)=0.0441834, mflops=113.165 24. Mayer (lookup): elapsed time t=1.01304 s, 2048 iters, t-(init.)=0.987464 s t(norm)=0.047086, mflops=106.189 (err=1.8e-15) 25. Monro: elapsed time t=1.90994 s, 4096 iters, t-(init.)=1.85872 s t(norm)=0.0443153, mflops=112.828 (err=1.0e-07) 26. NAPACK (f2c): elapsed time t=1.75683 s, 1024 iters, t-(init.)=1.74409 s t(norm)=0.166329, mflops=30.0609 (err=1.7e-14) 27. Ooura (C): elapsed time t=1.29713 s, 4096 iters, t-(init.)=1.24616 s t(norm)=0.0297109, mflops=168.289 (err=2.2e-15) 28. Ooura (F): elapsed time t=1.31475 s, 4096 iters, t-(init.)=1.26375 s t(norm)=0.0301301, mflops=165.947 (err=2.2e-15) 29. Ransom: elapsed time t=1.74133 s, 2048 iters, t-(init.)=1.71573 s t(norm)=0.0818125, mflops=61.1153 (err=2.3e-15) 30. SCIPORT: elapsed time t=1.13372 s, 2048 iters, t-(init.)=1.10808 s t(norm)=0.0528375, mflops=94.6298 (err=2.0e-15) 31. Singleton: elapsed time t=1.43847 s, 4096 iters, t-(init.)=1.38699 s t(norm)=0.0330685, mflops=151.201 (err=2.8e-15) 32. Singleton (f2c): elapsed time t=1.39787 s, 4096 iters, t-(init.)=1.3469 s t(norm)=0.0321126, mflops=155.702 (err=2.8e-15) 33. Sorensen: elapsed time t=1.12417 s, 4096 iters, t-(init.)=1.07287 s t(norm)=0.0255791, mflops=195.472 (err=1.8e-15) 34. 
Sorensen DIT: elapsed time t=1.56713 s, 1024 iters, t-(init.)=1.5543 s t(norm)=0.14823, mflops=33.7314 (err=1.8e-15) 35. Temperton: elapsed time t=1.16925 s, 2048 iters, t-(init.)=1.14373 s t(norm)=0.0545374, mflops=91.6803 (err=1.1e-07) 36. Temperton (f2c): elapsed time t=1.90824 s, 2048 iters, t-(init.)=1.88269 s t(norm)=0.0897737, mflops=55.6956 (err=2.0e-15) 37. Valkenburg: elapsed time t=1.07288 s, 128 iters, t-(init.)=1.07128 s t(norm)=0.817319, mflops=6.11756 (err=2.4e-15) Top mflops for N=1024 = 199.063 Normalized results and averages for N=1024: fft 0: mflops = 78.9427 (norm. = 0.396572), norm. avg. (of 10) = 0.290691 fft 1: mflops = 86.9574 (norm. = 0.436835), norm. avg. (of 10) = 0.300128 fft 2: mflops = 58.6941 (norm. = 0.294853), norm. avg. (of 10) = 0.160688 fft 3: mflops = 33.3362 (norm. = 0.167466), norm. avg. (of 10) = 0.0717118 fft 4: mflops = 89.7816 (norm. = 0.451022), norm. avg. (of 10) = 0.221858 fft 5: mflops = 8.91183 (norm. = 0.044769), norm. avg. (of 10) = 0.0382547 fft 6: mflops = 100.825 (norm. = 0.506499), norm. avg. (of 10) = 0.240164 fft 7: mflops = 85.6169 (norm. = 0.4301), norm. avg. (of 10) = 0.191987 fft 8: mflops = 32.795 (norm. = 0.164747), norm. avg. (of 10) = 0.128819 fft 9: mflops = 130.932 (norm. = 0.657741), norm. avg. (of 10) = 0.301957 fft 10: mflops = 130.718 (norm. = 0.656667), norm. avg. (of 10) = 0.369734 fft 11: mflops = 30.7028 (norm. = 0.154237), norm. avg. (of 9) = 0.0879066 fft 12: mflops = 136.207 (norm. = 0.684242), norm. avg. (of 10) = 0.435083 fft 13: mflops = 78.1567 (norm. = 0.392624), norm. avg. (of 10) = 0.278359 fft 14: mflops = 199.063 (norm. = 1), norm. avg. (of 10) = 0.884439 fft 15: mflops = 194.498 (norm. = 0.97707), norm. avg. (of 10) = 0.810629 fft 16: mflops = 119.647 (norm. = 0.601052), norm. avg. (of 10) = 0.803386 fft 17: mflops = 158.941 (norm. = 0.798449), norm. avg. (of 8) = 0.526636 fft 18: mflops = 64.1015 (norm. = 0.322017), norm. avg. (of 10) = 0.202195 fft 19: mflops = 70.5022 (norm. 
= 0.354171), norm. avg. (of 10) = 0.16314 fft 20: mflops = 88.8779 (norm. = 0.446482), norm. avg. (of 10) = 0.190887 fft 21: mflops = 60.4935 (norm. = 0.303892), norm. avg. (of 10) = 0.392756 fft 22: mflops = 84.5733 (norm. = 0.424858), norm. avg. (of 9) = 0.238882 fft 23: mflops = 113.165 (norm. = 0.568488), norm. avg. (of 9) = 0.295657 fft 24: mflops = 106.189 (norm. = 0.533444), norm. avg. (of 9) = 0.290747 fft 25: mflops = 112.828 (norm. = 0.566796), norm. avg. (of 9) = 0.250271 fft 26: mflops = 30.0609 (norm. = 0.151012), norm. avg. (of 10) = 0.0810245 fft 27: mflops = 168.289 (norm. = 0.845406), norm. avg. (of 10) = 0.444182 fft 28: mflops = 165.947 (norm. = 0.833643), norm. avg. (of 10) = 0.439889 fft 29: mflops = 61.1153 (norm. = 0.307016), norm. avg. (of 9) = 0.110056 fft 30: mflops = 94.6298 (norm. = 0.475377), norm. avg. (of 9) = 0.348973 fft 31: mflops = 151.201 (norm. = 0.759566), norm. avg. (of 10) = 0.301236 fft 32: mflops = 155.702 (norm. = 0.782176), norm. avg. (of 10) = 0.307179 fft 33: mflops = 195.472 (norm. = 0.981962), norm. avg. (of 10) = 0.450884 fft 34: mflops = 33.7314 (norm. = 0.169451), norm. avg. (of 10) = 0.138022 fft 35: mflops = 91.6803 (norm. = 0.46056), norm. avg. (of 10) = 0.20528 fft 36: mflops = 55.6956 (norm. = 0.279789), norm. avg. (of 10) = 0.130657 fft 37: mflops = 6.11756 (norm. = 0.0307319), norm. avg. (of 10) = 0.0258883 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.69891 s, 1024 iters, t-(init.)=1.67333 s t(norm)=0.072537, mflops=68.9303 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.61954 s, 1024 iters, t-(init.)=1.594 s t(norm)=0.0690981, mflops=72.3609 (err=1.4e-15) 2. Arndt Split-Radix: elapsed time t=1.21139 s, 512 iters, t-(init.)=1.19862 s t(norm)=0.103918, mflops=48.115 (err=1.4e-15) 3. Arndt 4-step: elapsed time t=1.89509 s, 512 iters, t-(init.)=1.88236 s t(norm)=0.163196, mflops=30.638 (err=1.4e-15) 4. 
Bailey: elapsed time t=1.42288 s, 1024 iters, t-(init.)=1.39736 s t(norm)=0.060574, mflops=82.5437 (err=1.4e-15) 5. Beauregard: elapsed time t=1.68859 s, 128 iters, t-(init.)=1.68541 s t(norm)=0.584485, mflops=8.55454 (err=1.4e-15) 6. Bergland: elapsed time t=1.24341 s, 1024 iters, t-(init.)=1.21794 s t(norm)=0.0527963, mflops=94.7036 (err=1.5e-15) 7. Brenner: elapsed time t=1.53569 s, 1024 iters, t-(init.)=1.51008 s t(norm)=0.0654601, mflops=76.3825 (err=1.4e-15) 8. Burrus: elapsed time t=1.89514 s, 512 iters, t-(init.)=1.88241 s t(norm)=0.163201, mflops=30.6371 (err=1.4e-15) 9. CWP (min N) (N=2145): elapsed time t=1.13737 s, 1024 iters, t-(init.)=1.11066 s t(norm)=0.0481457, mflops=103.851 10. CWP (best N) (N=2184): elapsed time t=1.94549 s, 2048 iters, t-(init.)=1.89109 s t(norm)=0.0409883, mflops=121.986 11. Edelblute: elapsed time t=1.02798 s, 256 iters, t-(init.)=1.02163 s t(norm)=0.177145, mflops=28.2254 (err=1.4e-15) 12. FFTPACK: elapsed time t=1.5907 s, 2048 iters, t-(init.)=1.53978 s t(norm)=0.0333738, mflops=149.818 (err=1.4e-15) 13. FFTPACK (f2c): elapsed time t=1.50374 s, 1024 iters, t-(init.)=1.47825 s t(norm)=0.0640806, mflops=78.0267 (err=1.4e-15) FFTW_MEASURE plan: (cost = 5.472220e-04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.13738 s, 2048 iters, t-(init.)=1.08641 s t(norm)=0.0235474, mflops=212.338 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.15026 s, 2048 iters, t-(init.)=1.09941 s t(norm)=0.023829, mflops=209.828 (err=1.4e-15) 16. Frigo-old: elapsed time t=1.04863 s, 1024 iters, t-(init.)=1.02307 s t(norm)=0.044349, mflops=112.742 (err=1.3e-15) 17. Green: elapsed time t=1.86412 s, 2048 iters, t-(init.)=1.8131 s t(norm)=0.039298, mflops=127.233 (err=1.4e-15) 18. GSL: elapsed time t=1.84884 s, 1024 iters, t-(init.)=1.82338 s t(norm)=0.0790415, mflops=63.2579 (err=1.4e-15) 19. 
GSL DIT: elapsed time t=1.10041 s, 512 iters, t-(init.)=1.0877 s t(norm)=0.094301, mflops=53.0217 (err=2.0e-15) 20. GSL DIF: elapsed time t=1.87347 s, 1024 iters, t-(init.)=1.84801 s t(norm)=0.0801092, mflops=62.4148 (err=2.3e-15) 21. Krukar: elapsed time t=1.02253 s, 512 iters, t-(init.)=1.00979 s t(norm)=0.0875469, mflops=57.1123 (err=1.4e-15) 22. Mayer (Buneman): elapsed time t=1.43337 s, 1024 iters, t-(init.)=1.40794 s t(norm)=0.0610324, mflops=81.9237 (err=1.4e-15) 23. Mayer (simple): elapsed time t=1.14283 s, 1024 iters, t-(init.)=1.11732 s t(norm)=0.0484347, mflops=103.232 24. Mayer (lookup): elapsed time t=1.18526 s, 1024 iters, t-(init.)=1.15983 s t(norm)=0.0502772, mflops=99.4487 (err=1.4e-15) 25. Monro: elapsed time t=1.39783 s, 1024 iters, t-(init.)=1.37238 s t(norm)=0.0594909, mflops=84.0465 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.16453 s, 256 iters, t-(init.)=1.15816 s t(norm)=0.20082, mflops=24.8979 (err=1.5e-14) 27. Ooura (C): elapsed time t=1.00447 s, 1024 iters, t-(init.)=0.979023 s t(norm)=0.0424395, mflops=117.815 (err=1.4e-15) 28. Ooura (F): elapsed time t=1.02606 s, 1024 iters, t-(init.)=1.00064 s t(norm)=0.0433764, mflops=115.27 (err=1.4e-15) 29. Ransom: elapsed time t=1.28728 s, 512 iters, t-(init.)=1.27453 s t(norm)=0.110499, mflops=45.2492 (err=2.1e-15) 30. SCIPORT: elapsed time t=1.30247 s, 1024 iters, t-(init.)=1.27703 s t(norm)=0.0553576, mflops=90.3218 (err=1.4e-15) 31. Singleton: elapsed time t=1.14328 s, 1024 iters, t-(init.)=1.11778 s t(norm)=0.0484542, mflops=103.19 (err=1.9e-15) 32. Singleton (f2c): elapsed time t=1.19347 s, 1024 iters, t-(init.)=1.16801 s t(norm)=0.0506318, mflops=98.7522 (err=1.9e-15) 33. Sorensen: elapsed time t=1.00534 s, 1024 iters, t-(init.)=0.979877 s t(norm)=0.0424765, mflops=117.712 (err=1.4e-15) 34. Sorensen DIT: elapsed time t=1.98536 s, 512 iters, t-(init.)=1.97263 s t(norm)=0.171022, mflops=29.2359 (err=1.4e-15) 35. 
Temperton: elapsed time t=1.5631 s, 1024 iters, t-(init.)=1.53767 s t(norm)=0.0666562, mflops=75.0117 (err=1.1e-07) 36. Temperton (f2c): elapsed time t=1.16953 s, 512 iters, t-(init.)=1.15675 s t(norm)=0.100287, mflops=49.8567 (err=1.4e-15) 37. Valkenburg: elapsed time t=1.21321 s, 64 iters, t-(init.)=1.21162 s t(norm)=0.840359, mflops=5.94984 (err=1.7e-15) Top mflops for N=2048 = 212.338 Normalized results and averages for N=2048: fft 0: mflops = 68.9303 (norm. = 0.324625), norm. avg. (of 11) = 0.293776 fft 1: mflops = 72.3609 (norm. = 0.340781), norm. avg. (of 11) = 0.303823 fft 2: mflops = 48.115 (norm. = 0.226596), norm. avg. (of 11) = 0.16668 fft 3: mflops = 30.638 (norm. = 0.144289), norm. avg. (of 11) = 0.0783097 fft 4: mflops = 82.5437 (norm. = 0.388737), norm. avg. (of 11) = 0.237029 fft 5: mflops = 8.55454 (norm. = 0.0402874), norm. avg. (of 11) = 0.0384395 fft 6: mflops = 94.7036 (norm. = 0.446004), norm. avg. (of 11) = 0.258876 fft 7: mflops = 76.3825 (norm. = 0.359721), norm. avg. (of 11) = 0.207235 fft 8: mflops = 30.6371 (norm. = 0.144284), norm. avg. (of 11) = 0.130225 fft 9: mflops = 103.851 (norm. = 0.489085), norm. avg. (of 11) = 0.318969 fft 10: mflops = 121.986 (norm. = 0.574489), norm. avg. (of 11) = 0.388348 fft 11: mflops = 28.2254 (norm. = 0.132927), norm. avg. (of 10) = 0.0924086 fft 12: mflops = 149.818 (norm. = 0.705564), norm. avg. (of 11) = 0.459672 fft 13: mflops = 78.0267 (norm. = 0.367465), norm. avg. (of 11) = 0.28646 fft 14: mflops = 212.338 (norm. = 1), norm. avg. (of 11) = 0.894945 fft 15: mflops = 209.828 (norm. = 0.98818), norm. avg. (of 11) = 0.82677 fft 16: mflops = 112.742 (norm. = 0.530956), norm. avg. (of 11) = 0.77862 fft 17: mflops = 127.233 (norm. = 0.5992), norm. avg. (of 9) = 0.534698 fft 18: mflops = 63.2579 (norm. = 0.297911), norm. avg. (of 11) = 0.210896 fft 19: mflops = 53.0217 (norm. = 0.249704), norm. avg. (of 11) = 0.17101 fft 20: mflops = 62.4148 (norm. = 0.293941), norm. avg. 
(of 11) = 0.200255 fft 21: mflops = 57.1123 (norm. = 0.268969), norm. avg. (of 11) = 0.381502 fft 22: mflops = 81.9237 (norm. = 0.385818), norm. avg. (of 10) = 0.253575 fft 23: mflops = 103.232 (norm. = 0.486167), norm. avg. (of 10) = 0.314708 fft 24: mflops = 99.4487 (norm. = 0.468351), norm. avg. (of 10) = 0.308508 fft 25: mflops = 84.0465 (norm. = 0.395815), norm. avg. (of 10) = 0.264825 fft 26: mflops = 24.8979 (norm. = 0.117256), norm. avg. (of 11) = 0.0843182 fft 27: mflops = 117.815 (norm. = 0.554846), norm. avg. (of 11) = 0.454242 fft 28: mflops = 115.27 (norm. = 0.542861), norm. avg. (of 11) = 0.44925 fft 29: mflops = 45.2492 (norm. = 0.2131), norm. avg. (of 10) = 0.12036 fft 30: mflops = 90.3218 (norm. = 0.425368), norm. avg. (of 10) = 0.356612 fft 31: mflops = 103.19 (norm. = 0.485971), norm. avg. (of 11) = 0.31803 fft 32: mflops = 98.7522 (norm. = 0.465071), norm. avg. (of 11) = 0.321533 fft 33: mflops = 117.712 (norm. = 0.554362), norm. avg. (of 11) = 0.460291 fft 34: mflops = 29.2359 (norm. = 0.137686), norm. avg. (of 11) = 0.137991 fft 35: mflops = 75.0117 (norm. = 0.353266), norm. avg. (of 11) = 0.218733 fft 36: mflops = 49.8567 (norm. = 0.234799), norm. avg. (of 11) = 0.140125 fft 37: mflops = 5.94984 (norm. = 0.0280206), norm. avg. (of 11) = 0.0260821 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.81844 s, 512 iters, t-(init.)=1.79301 s t(norm)=0.0712479, mflops=70.1775 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.73536 s, 512 iters, t-(init.)=1.70993 s t(norm)=0.0679464, mflops=73.5874 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.26576 s, 256 iters, t-(init.)=1.25306 s t(norm)=0.0995839, mflops=50.2089 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.87403 s, 256 iters, t-(init.)=1.86133 s t(norm)=0.147925, mflops=33.8008 (err=3.7e-15) 4. Bailey: elapsed time t=1.56163 s, 512 iters, t-(init.)=1.53622 s t(norm)=0.0610439, mflops=81.9083 (err=3.7e-15) 5. 
Beauregard: elapsed time t=1.85805 s, 64 iters, t-(init.)=1.85488 s t(norm)=0.589649, mflops=8.47962 (err=3.8e-15) 6. Bergland: elapsed time t=1.28266 s, 512 iters, t-(init.)=1.25723 s t(norm)=0.049958, mflops=100.084 (err=3.9e-15) 7. Brenner: elapsed time t=1.57391 s, 512 iters, t-(init.)=1.54849 s t(norm)=0.0615316, mflops=81.2591 (err=3.8e-15) 8. Burrus: elapsed time t=1.93846 s, 256 iters, t-(init.)=1.92576 s t(norm)=0.153045, mflops=32.6701 (err=3.7e-15) 9. CWP (min N) (N=4290): elapsed time t=1.243 s, 512 iters, t-(init.)=1.21637 s t(norm)=0.048334, mflops=103.447 10. CWP (best N) (N=4368): elapsed time t=1.10232 s, 512 iters, t-(init.)=1.07523 s t(norm)=0.0427259, mflops=117.025 11. Edelblute: elapsed time t=1.05292 s, 128 iters, t-(init.)=1.04656 s t(norm)=0.166347, mflops=30.0577 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.88163 s, 1024 iters, t-(init.)=1.83081 s t(norm)=0.036375, mflops=137.457 (err=3.8e-15) 13. FFTPACK (f2c): elapsed time t=1.56896 s, 512 iters, t-(init.)=1.54354 s t(norm)=0.0613349, mflops=81.5197 (err=3.8e-15) FFTW_MEASURE plan: (cost = 1.220700e-03) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.19453 s, 1024 iters, t-(init.)=1.14356 s t(norm)=0.0227205, mflops=220.066 (err=3.8e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.2524 s, 1024 iters, t-(init.)=1.20154 s t(norm)=0.0238724, mflops=209.447 (err=3.8e-15) 16. Frigo-old: elapsed time t=1.15881 s, 512 iters, t-(init.)=1.1334 s t(norm)=0.0450372, mflops=111.019 (err=3.8e-15) 17. Green: elapsed time t=1.89953 s, 1024 iters, t-(init.)=1.84863 s t(norm)=0.036729, mflops=136.132 (err=3.8e-15) 18. GSL: elapsed time t=1.87371 s, 512 iters, t-(init.)=1.84824 s t(norm)=0.0734425, mflops=68.0804 (err=3.8e-15) 19. GSL DIT: elapsed time t=1.2019 s, 256 iters, t-(init.)=1.18917 s t(norm)=0.0945069, mflops=52.9062 (err=4.1e-15) 20. 
GSL DIF: elapsed time t=1.01731 s, 256 iters, t-(init.)=1.00458 s t(norm)=0.0798372, mflops=62.6275 (err=4.3e-15) 21. Krukar: elapsed time t=1.17921 s, 256 iters, t-(init.)=1.16647 s t(norm)=0.0927025, mflops=53.936 (err=3.8e-15) 22. Mayer (Buneman): elapsed time t=1.95876 s, 512 iters, t-(init.)=1.9333 s t(norm)=0.0768224, mflops=65.0851 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.64622 s, 512 iters, t-(init.)=1.62078 s t(norm)=0.064404, mflops=77.6349 24. Mayer (lookup): elapsed time t=1.68303 s, 512 iters, t-(init.)=1.65761 s t(norm)=0.0658674, mflops=75.91 (err=3.7e-15) 25. Monro: elapsed time t=1.49259 s, 512 iters, t-(init.)=1.46715 s t(norm)=0.0582993, mflops=85.7643 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.2301 s, 128 iters, t-(init.)=1.22375 s t(norm)=0.19451, mflops=25.7057 (err=4.9e-14) 27. Ooura (C): elapsed time t=1.96003 s, 1024 iters, t-(init.)=1.90912 s t(norm)=0.0379309, mflops=131.819 (err=3.9e-15) 28. Ooura (F): elapsed time t=1.00908 s, 512 iters, t-(init.)=0.983593 s t(norm)=0.0390845, mflops=127.928 (err=3.9e-15) 29. Ransom: elapsed time t=1.02317 s, 256 iters, t-(init.)=1.01041 s t(norm)=0.0802998, mflops=62.2666 (err=4.4e-15) 30. SCIPORT: elapsed time t=1.36046 s, 512 iters, t-(init.)=1.33499 s t(norm)=0.0530476, mflops=94.2551 (err=3.8e-15) 31. Singleton: elapsed time t=1.05885 s, 512 iters, t-(init.)=1.03337 s t(norm)=0.0410626, mflops=121.765 (err=5.8e-15) 32. Singleton (f2c): elapsed time t=1.08501 s, 512 iters, t-(init.)=1.05956 s t(norm)=0.0421032, mflops=118.756 (err=5.8e-15) 33. Sorensen: elapsed time t=1.09103 s, 512 iters, t-(init.)=1.06556 s t(norm)=0.0423415, mflops=118.087 (err=3.7e-15) 34. Sorensen DIT: elapsed time t=1.03326 s, 128 iters, t-(init.)=1.02689 s t(norm)=0.163219, mflops=30.6337 (err=3.7e-15) 35. Temperton: elapsed time t=1.53013 s, 512 iters, t-(init.)=1.50469 s t(norm)=0.0597909, mflops=83.6248 (err=1.2e-07) 36. 
Temperton (f2c): elapsed time t=1.17992 s, 256 iters, t-(init.)=1.16675 s t(norm)=0.0927248, mflops=53.923 (err=3.8e-15) 37. Valkenburg: elapsed time t=1.3345 s, 32 iters, t-(init.)=1.33291 s t(norm)=0.847441, mflops=5.90012 (err=4.0e-15) Top mflops for N=4096 = 220.066 Normalized results and averages for N=4096: fft 0: mflops = 70.1775 (norm. = 0.318893), norm. avg. (of 12) = 0.295869 fft 1: mflops = 73.5874 (norm. = 0.334388), norm. avg. (of 12) = 0.30637 fft 2: mflops = 50.2089 (norm. = 0.228154), norm. avg. (of 12) = 0.171802 fft 3: mflops = 33.8008 (norm. = 0.153594), norm. avg. (of 12) = 0.0845834 fft 4: mflops = 81.9083 (norm. = 0.372199), norm. avg. (of 12) = 0.248293 fft 5: mflops = 8.47962 (norm. = 0.0385322), norm. avg. (of 12) = 0.0384472 fft 6: mflops = 100.084 (norm. = 0.454792), norm. avg. (of 12) = 0.275203 fft 7: mflops = 81.2591 (norm. = 0.369249), norm. avg. (of 12) = 0.220737 fft 8: mflops = 32.6701 (norm. = 0.148456), norm. avg. (of 12) = 0.131744 fft 9: mflops = 103.447 (norm. = 0.470072), norm. avg. (of 12) = 0.331561 fft 10: mflops = 117.025 (norm. = 0.531773), norm. avg. (of 12) = 0.4003 fft 11: mflops = 30.0577 (norm. = 0.136585), norm. avg. (of 11) = 0.0964246 fft 12: mflops = 137.457 (norm. = 0.624618), norm. avg. (of 12) = 0.473418 fft 13: mflops = 81.5197 (norm. = 0.370433), norm. avg. (of 12) = 0.293458 fft 14: mflops = 220.066 (norm. = 1), norm. avg. (of 12) = 0.903699 fft 15: mflops = 209.447 (norm. = 0.951745), norm. avg. (of 12) = 0.837185 fft 16: mflops = 111.019 (norm. = 0.504482), norm. avg. (of 12) = 0.755775 fft 17: mflops = 136.132 (norm. = 0.618599), norm. avg. (of 10) = 0.543088 fft 18: mflops = 68.0804 (norm. = 0.309364), norm. avg. (of 12) = 0.219102 fft 19: mflops = 52.9062 (norm. = 0.240411), norm. avg. (of 12) = 0.176793 fft 20: mflops = 62.6275 (norm. = 0.284585), norm. avg. (of 12) = 0.207283 fft 21: mflops = 53.936 (norm. = 0.24509), norm. avg. (of 12) = 0.370135 fft 22: mflops = 65.0851 (norm. = 0.295753), norm. 
avg. (of 11) = 0.25741 fft 23: mflops = 77.6349 (norm. = 0.352781), norm. avg. (of 11) = 0.318169 fft 24: mflops = 75.91 (norm. = 0.344942), norm. avg. (of 11) = 0.31182 fft 25: mflops = 85.7643 (norm. = 0.389721), norm. avg. (of 11) = 0.276179 fft 26: mflops = 25.7057 (norm. = 0.116809), norm. avg. (of 12) = 0.0870258 fft 27: mflops = 131.819 (norm. = 0.598997), norm. avg. (of 12) = 0.466305 fft 28: mflops = 127.928 (norm. = 0.581317), norm. avg. (of 12) = 0.460256 fft 29: mflops = 62.2666 (norm. = 0.282946), norm. avg. (of 11) = 0.135141 fft 30: mflops = 94.2551 (norm. = 0.428304), norm. avg. (of 11) = 0.36313 fft 31: mflops = 121.765 (norm. = 0.553313), norm. avg. (of 12) = 0.337637 fft 32: mflops = 118.756 (norm. = 0.539638), norm. avg. (of 12) = 0.339708 fft 33: mflops = 118.087 (norm. = 0.5366), norm. avg. (of 12) = 0.46665 fft 34: mflops = 30.6337 (norm. = 0.139202), norm. avg. (of 12) = 0.138092 fft 35: mflops = 83.6248 (norm. = 0.379999), norm. avg. (of 12) = 0.232172 fft 36: mflops = 53.923 (norm. = 0.245031), norm. avg. (of 12) = 0.148867 fft 37: mflops = 5.90012 (norm. = 0.0268107), norm. avg. (of 12) = 0.0261428 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.88562 s, 256 iters, t-(init.)=1.86014 s t(norm)=0.0682296, mflops=73.2819 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.75228 s, 256 iters, t-(init.)=1.72686 s t(norm)=0.0633407, mflops=78.9382 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.34762 s, 128 iters, t-(init.)=1.33491 s t(norm)=0.0979281, mflops=51.0579 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.03126 s, 64 iters, t-(init.)=1.02489 s t(norm)=0.150371, mflops=33.2512 (err=3.7e-15) 4. Bailey: elapsed time t=1.64692 s, 256 iters, t-(init.)=1.62147 s t(norm)=0.0594752, mflops=84.0687 (err=3.7e-15) 5. Beauregard: elapsed time t=1.01737 s, 16 iters, t-(init.)=1.01578 s t(norm)=0.596136, mflops=8.38735 (err=3.7e-15) 6. 
Bergland: elapsed time t=1.40441 s, 256 iters, t-(init.)=1.37898 s t(norm)=0.0505806, mflops=98.8522 (err=3.7e-15) 7. Brenner: elapsed time t=1.69927 s, 256 iters, t-(init.)=1.67385 s t(norm)=0.0613964, mflops=81.4379 (err=3.7e-15) 8. Burrus: elapsed time t=1.98902 s, 128 iters, t-(init.)=1.97631 s t(norm)=0.144981, mflops=34.4873 (err=3.7e-15) 9. CWP (min N) (N=8580): elapsed time t=1.25732 s, 256 iters, t-(init.)=1.23069 s t(norm)=0.0451414, mflops=110.763 10. CWP (best N) (N=9240): elapsed time t=1.11166 s, 256 iters, t-(init.)=1.08298 s t(norm)=0.0397235, mflops=125.87 11. Edelblute: elapsed time t=1.09648 s, 64 iters, t-(init.)=1.09013 s t(norm)=0.159943, mflops=31.2611 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.00284 s, 256 iters, t-(init.)=0.977406 s t(norm)=0.035851, mflops=139.466 (err=3.7e-15) 13. FFTPACK (f2c): elapsed time t=1.94616 s, 256 iters, t-(init.)=1.92072 s t(norm)=0.0704516, mflops=70.9707 (err=3.7e-15) FFTW_MEASURE plan: (cost = 3.005138e-03) FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.38271 s, 512 iters, t-(init.)=1.33183 s t(norm)=0.0244257, mflops=204.703 (err=3.7e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.31211 s, 512 iters, t-(init.)=1.26124 s t(norm)=0.0231311, mflops=216.159 (err=3.7e-15) 16. Frigo-old: elapsed time t=1.25751 s, 256 iters, t-(init.)=1.23208 s t(norm)=0.0451923, mflops=110.638 (err=3.7e-15) 17. Green: elapsed time t=1.09281 s, 256 iters, t-(init.)=1.06738 s t(norm)=0.0391513, mflops=127.71 (err=3.7e-15) 18. GSL: elapsed time t=1.08805 s, 128 iters, t-(init.)=1.07529 s t(norm)=0.0788825, mflops=63.3854 (err=3.7e-15) 19. GSL DIT: elapsed time t=1.3099 s, 128 iters, t-(init.)=1.29719 s t(norm)=0.095161, mflops=52.5426 (err=4.3e-15) 20. GSL DIF: elapsed time t=1.09965 s, 128 iters, t-(init.)=1.08694 s t(norm)=0.0797376, mflops=62.7057 (err=4.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. 
Mayer (Buneman): elapsed time t=1.03135 s, 128 iters, t-(init.)=1.01863 s t(norm)=0.0747264, mflops=66.9108 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.75464 s, 256 iters, t-(init.)=1.72921 s t(norm)=0.063427, mflops=78.8308 24. Mayer (lookup): elapsed time t=1.78736 s, 256 iters, t-(init.)=1.76194 s t(norm)=0.0646274, mflops=77.3665 (err=3.7e-15) 25. Monro: elapsed time t=1.54098 s, 256 iters, t-(init.)=1.51556 s t(norm)=0.0555904, mflops=89.9436 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.35622 s, 64 iters, t-(init.)=1.34985 s t(norm)=0.198049, mflops=25.2463 (err=4.5e-14) 27. Ooura (C): elapsed time t=1.14288 s, 256 iters, t-(init.)=1.11743 s t(norm)=0.0409872, mflops=121.989 (err=3.7e-15) 28. Ooura (F): elapsed time t=1.16178 s, 256 iters, t-(init.)=1.13637 s t(norm)=0.0416817, mflops=119.957 (err=3.7e-15) 29. Ransom: elapsed time t=1.32059 s, 128 iters, t-(init.)=1.30785 s t(norm)=0.0959433, mflops=52.1141 (err=4.9e-15) 30. SCIPORT: elapsed time t=1.52782 s, 256 iters, t-(init.)=1.50239 s t(norm)=0.0551074, mflops=90.732 (err=3.7e-15) 31. Singleton: elapsed time t=1.19545 s, 256 iters, t-(init.)=1.17002 s t(norm)=0.0429161, mflops=116.506 (err=5.6e-15) 32. Singleton (f2c): elapsed time t=1.26173 s, 256 iters, t-(init.)=1.23628 s t(norm)=0.0453466, mflops=110.262 (err=5.6e-15) 33. Sorensen: elapsed time t=1.19742 s, 256 iters, t-(init.)=1.17197 s t(norm)=0.0429877, mflops=116.312 (err=3.7e-15) 34. Sorensen DIT: elapsed time t=1.06829 s, 64 iters, t-(init.)=1.06194 s t(norm)=0.155807, mflops=32.091 (err=3.7e-15) 35. Temperton: elapsed time t=1.79428 s, 256 iters, t-(init.)=1.76882 s t(norm)=0.0648799, mflops=77.0654 (err=1.4e-07) 36. Temperton (f2c): elapsed time t=1.40954 s, 128 iters, t-(init.)=1.39679 s t(norm)=0.102468, mflops=48.7956 (err=3.7e-15) 37. 
Valkenburg: elapsed time t=1.44081 s, 16 iters, t-(init.)=1.43921 s t(norm)=0.84464, mflops=5.91968 (err=3.8e-15) Top mflops for N=8192 = 216.159 Normalized results and averages for N=8192: fft 0: mflops = 73.2819 (norm. = 0.339018), norm. avg. (of 13) = 0.299188 fft 1: mflops = 78.9382 (norm. = 0.365185), norm. avg. (of 13) = 0.310895 fft 2: mflops = 51.0579 (norm. = 0.236205), norm. avg. (of 13) = 0.176756 fft 3: mflops = 33.2512 (norm. = 0.153827), norm. avg. (of 13) = 0.0899098 fft 4: mflops = 84.0687 (norm. = 0.38892), norm. avg. (of 13) = 0.259111 fft 5: mflops = 8.38735 (norm. = 0.0388017), norm. avg. (of 13) = 0.0384745 fft 6: mflops = 98.8522 (norm. = 0.457311), norm. avg. (of 13) = 0.289211 fft 7: mflops = 81.4379 (norm. = 0.376749), norm. avg. (of 13) = 0.232738 fft 8: mflops = 34.4873 (norm. = 0.159545), norm. avg. (of 13) = 0.133882 fft 9: mflops = 110.763 (norm. = 0.512414), norm. avg. (of 13) = 0.345473 fft 10: mflops = 125.87 (norm. = 0.582303), norm. avg. (of 13) = 0.4143 fft 11: mflops = 31.2611 (norm. = 0.14462), norm. avg. (of 12) = 0.100441 fft 12: mflops = 139.466 (norm. = 0.645199), norm. avg. (of 13) = 0.486632 fft 13: mflops = 70.9707 (norm. = 0.328326), norm. avg. (of 13) = 0.29614 fft 14: mflops = 204.703 (norm. = 0.946998), norm. avg. (of 13) = 0.90703 fft 15: mflops = 216.159 (norm. = 1), norm. avg. (of 13) = 0.849709 fft 16: mflops = 110.638 (norm. = 0.511837), norm. avg. (of 13) = 0.737011 fft 17: mflops = 127.71 (norm. = 0.590813), norm. avg. (of 11) = 0.547427 fft 18: mflops = 63.3854 (norm. = 0.293235), norm. avg. (of 13) = 0.224804 fft 19: mflops = 52.5426 (norm. = 0.243073), norm. avg. (of 13) = 0.181891 fft 20: mflops = 62.7057 (norm. = 0.29009), norm. avg. (of 13) = 0.213653 fft 21: mflops = -1 (norm. = -0.00462621), norm. avg. (of 12) = 0.370135 fft 22: mflops = 66.9108 (norm. = 0.309544), norm. avg. (of 12) = 0.261754 fft 23: mflops = 78.8308 (norm. = 0.364688), norm. avg. (of 12) = 0.322046 fft 24: mflops = 77.3665 (norm. 
= 0.357914), norm. avg. (of 12) = 0.315661 fft 25: mflops = 89.9436 (norm. = 0.416098), norm. avg. (of 12) = 0.287839 fft 26: mflops = 25.2463 (norm. = 0.116795), norm. avg. (of 13) = 0.0893157 fft 27: mflops = 121.989 (norm. = 0.564349), norm. avg. (of 13) = 0.473847 fft 28: mflops = 119.957 (norm. = 0.554945), norm. avg. (of 13) = 0.46754 fft 29: mflops = 52.1141 (norm. = 0.241091), norm. avg. (of 12) = 0.14397 fft 30: mflops = 90.732 (norm. = 0.419746), norm. avg. (of 12) = 0.367848 fft 31: mflops = 116.506 (norm. = 0.538983), norm. avg. (of 13) = 0.353126 fft 32: mflops = 110.262 (norm. = 0.510095), norm. avg. (of 13) = 0.352815 fft 33: mflops = 116.312 (norm. = 0.538086), norm. avg. (of 13) = 0.472145 fft 34: mflops = 32.091 (norm. = 0.14846), norm. avg. (of 13) = 0.13889 fft 35: mflops = 77.0654 (norm. = 0.356521), norm. avg. (of 13) = 0.241737 fft 36: mflops = 48.7956 (norm. = 0.225739), norm. avg. (of 13) = 0.15478 fft 37: mflops = 5.91968 (norm. = 0.0273857), norm. avg. (of 13) = 0.0262384 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=2.00491 s, 128 iters, t-(init.)=1.97949 s t(norm)=0.0674209, mflops=74.161 (err=6.8e-15) 1. Arndt DIT: elapsed time t=1.90567 s, 128 iters, t-(init.)=1.8802 s t(norm)=0.0640391, mflops=78.0773 (err=6.8e-15) 2. Arndt Split-Radix: elapsed time t=1.41359 s, 64 iters, t-(init.)=1.40086 s t(norm)=0.0954264, mflops=52.3964 (err=6.8e-15) 3. Arndt 4-step: elapsed time t=1.9053 s, 64 iters, t-(init.)=1.89259 s t(norm)=0.128922, mflops=38.783 (err=6.8e-15) 4. Bailey: elapsed time t=1.82484 s, 128 iters, t-(init.)=1.79939 s t(norm)=0.0612869, mflops=81.5835 (err=6.8e-15) 5. Beauregard: elapsed time t=1.10044 s, 8 iters, t-(init.)=1.09885 s t(norm)=0.598827, mflops=8.34965 (err=6.8e-15) 6. Bergland: elapsed time t=1.45 s, 128 iters, t-(init.)=1.42454 s t(norm)=0.0485196, mflops=103.051 (err=6.8e-15) 7. 
Brenner: elapsed time t=1.70864 s, 128 iters, t-(init.)=1.6832 s t(norm)=0.0573296, mflops=87.215 (err=6.8e-15) 8. Burrus: elapsed time t=1.01697 s, 32 iters, t-(init.)=1.01062 s t(norm)=0.137687, mflops=36.3143 (err=6.8e-15) 9. CWP (min N) (N=17160): elapsed time t=1.31792 s, 128 iters, t-(init.)=1.29124 s t(norm)=0.0439793, mflops=113.69 10. CWP (best N) (N=17160): elapsed time t=1.3182 s, 128 iters, t-(init.)=1.29159 s t(norm)=0.0439913, mflops=113.659 11. Edelblute: elapsed time t=1.12918 s, 32 iters, t-(init.)=1.12282 s t(norm)=0.152972, mflops=32.6858 (err=6.8e-15) 12. FFTPACK: elapsed time t=1.16675 s, 128 iters, t-(init.)=1.14131 s t(norm)=0.038873, mflops=128.624 (err=6.8e-15) 13. FFTPACK (f2c): elapsed time t=1.01663 s, 64 iters, t-(init.)=1.00391 s t(norm)=0.0683857, mflops=73.1147 (err=6.8e-15) FFTW_MEASURE plan: (cost = 6.091695e-03) FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_NOTW 32 14. FFTW: elapsed time t=1.50255 s, 256 iters, t-(init.)=1.45163 s t(norm)=0.0247211, mflops=202.256 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.47922 s, 256 iters, t-(init.)=1.42838 s t(norm)=0.0243251, mflops=205.549 (err=6.8e-15) 16. Frigo-old: elapsed time t=1.47703 s, 128 iters, t-(init.)=1.45163 s t(norm)=0.0494424, mflops=101.128 (err=6.8e-15) 17. Green: elapsed time t=1.21088 s, 128 iters, t-(init.)=1.18548 s t(norm)=0.0403771, mflops=123.833 (err=6.8e-15) 18. GSL: elapsed time t=1.10417 s, 64 iters, t-(init.)=1.09147 s t(norm)=0.0743505, mflops=67.2491 (err=6.8e-15) 19. GSL DIT: elapsed time t=1.43363 s, 64 iters, t-(init.)=1.42094 s t(norm)=0.0967939, mflops=51.6562 (err=7.2e-15) 20. GSL DIF: elapsed time t=1.1734 s, 64 iters, t-(init.)=1.16067 s t(norm)=0.0790645, mflops=63.2395 (err=7.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. 
Mayer (Buneman): elapsed time t=1.09241 s, 64 iters, t-(init.)=1.07949 s t(norm)=0.0735347, mflops=67.9951 (err=6.8e-15) 23. Mayer (simple): elapsed time t=1.89375 s, 128 iters, t-(init.)=1.86811 s t(norm)=0.0636275, mflops=78.5824 24. Mayer (lookup): elapsed time t=1.91803 s, 128 iters, t-(init.)=1.89249 s t(norm)=0.064458, mflops=77.5699 (err=6.8e-15) 25. Monro: elapsed time t=1.62223 s, 128 iters, t-(init.)=1.59677 s t(norm)=0.0543857, mflops=91.9359 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.42824 s, 32 iters, t-(init.)=1.42184 s t(norm)=0.19371, mflops=25.8118 (err=2.3e-13) 27. Ooura (C): elapsed time t=1.11724 s, 128 iters, t-(init.)=1.09179 s t(norm)=0.037186, mflops=134.459 (err=6.8e-15) 28. Ooura (F): elapsed time t=1.15374 s, 128 iters, t-(init.)=1.12828 s t(norm)=0.0384291, mflops=130.11 (err=6.8e-15) 29. Ransom: elapsed time t=1.06785 s, 64 iters, t-(init.)=1.05512 s t(norm)=0.071874, mflops=69.5662 (err=7.4e-15) 30. SCIPORT: elapsed time t=1.91058 s, 128 iters, t-(init.)=1.88507 s t(norm)=0.0642052, mflops=77.8753 (err=6.8e-15) 31. Singleton: elapsed time t=1.21935 s, 128 iters, t-(init.)=1.19389 s t(norm)=0.0406635, mflops=122.96 (err=1.0e-14) 32. Singleton (f2c): elapsed time t=1.26942 s, 128 iters, t-(init.)=1.24396 s t(norm)=0.042369, mflops=118.011 (err=1.0e-14) 33. Sorensen: elapsed time t=1.27714 s, 128 iters, t-(init.)=1.25172 s t(norm)=0.0426333, mflops=117.279 (err=6.8e-15) 34. Sorensen DIT: elapsed time t=1.10107 s, 32 iters, t-(init.)=1.09469 s t(norm)=0.149139, mflops=33.5257 (err=6.8e-15) 35. Temperton: elapsed time t=1.75882 s, 128 iters, t-(init.)=1.73344 s t(norm)=0.0590406, mflops=84.6875 (err=1.5e-07) 36. Temperton (f2c): elapsed time t=1.3889 s, 64 iters, t-(init.)=1.3762 s t(norm)=0.0937464, mflops=53.3354 (err=6.8e-15) 37. 
Valkenburg: elapsed time t=1.56741 s, 8 iters, t-(init.)=1.5658 s t(norm)=0.853294, mflops=5.85964 (err=6.9e-15) Top mflops for N=16384 = 205.549 Normalized results and averages for N=16384: fft 0: mflops = 74.161 (norm. = 0.360795), norm. avg. (of 14) = 0.303589 fft 1: mflops = 78.0773 (norm. = 0.379847), norm. avg. (of 14) = 0.31582 fft 2: mflops = 52.3964 (norm. = 0.25491), norm. avg. (of 14) = 0.182339 fft 3: mflops = 38.783 (norm. = 0.18868), norm. avg. (of 14) = 0.0969648 fft 4: mflops = 81.5835 (norm. = 0.396905), norm. avg. (of 14) = 0.268953 fft 5: mflops = 8.34965 (norm. = 0.0406212), norm. avg. (of 14) = 0.0386278 fft 6: mflops = 103.051 (norm. = 0.501345), norm. avg. (of 14) = 0.304363 fft 7: mflops = 87.215 (norm. = 0.424303), norm. avg. (of 14) = 0.246421 fft 8: mflops = 36.3143 (norm. = 0.17667), norm. avg. (of 14) = 0.136939 fft 9: mflops = 113.69 (norm. = 0.553103), norm. avg. (of 14) = 0.360303 fft 10: mflops = 113.659 (norm. = 0.552952), norm. avg. (of 14) = 0.424204 fft 11: mflops = 32.6858 (norm. = 0.159017), norm. avg. (of 13) = 0.104947 fft 12: mflops = 128.624 (norm. = 0.625759), norm. avg. (of 14) = 0.496569 fft 13: mflops = 73.1147 (norm. = 0.355704), norm. avg. (of 14) = 0.300394 fft 14: mflops = 202.256 (norm. = 0.983979), norm. avg. (of 14) = 0.912526 fft 15: mflops = 205.549 (norm. = 1), norm. avg. (of 14) = 0.860444 fft 16: mflops = 101.128 (norm. = 0.491989), norm. avg. (of 14) = 0.719509 fft 17: mflops = 123.833 (norm. = 0.602448), norm. avg. (of 12) = 0.552012 fft 18: mflops = 67.2491 (norm. = 0.327168), norm. avg. (of 14) = 0.232116 fft 19: mflops = 51.6562 (norm. = 0.251308), norm. avg. (of 14) = 0.18685 fft 20: mflops = 63.2395 (norm. = 0.307662), norm. avg. (of 14) = 0.220368 fft 21: mflops = -1 (norm. = -0.00486502), norm. avg. (of 12) = 0.370135 fft 22: mflops = 67.9951 (norm. = 0.330797), norm. avg. (of 13) = 0.267065 fft 23: mflops = 78.5824 (norm. = 0.382305), norm. avg. (of 13) = 0.326681 fft 24: mflops = 77.5699 (norm. 
= 0.377379), norm. avg. (of 13) = 0.320409 fft 25: mflops = 91.9359 (norm. = 0.44727), norm. avg. (of 13) = 0.300103 fft 26: mflops = 25.8118 (norm. = 0.125575), norm. avg. (of 14) = 0.0919056 fft 27: mflops = 134.459 (norm. = 0.654146), norm. avg. (of 14) = 0.486725 fft 28: mflops = 130.11 (norm. = 0.632986), norm. avg. (of 14) = 0.479357 fft 29: mflops = 69.5662 (norm. = 0.338441), norm. avg. (of 13) = 0.158929 fft 30: mflops = 77.8753 (norm. = 0.378865), norm. avg. (of 13) = 0.368695 fft 31: mflops = 122.96 (norm. = 0.598205), norm. avg. (of 14) = 0.370631 fft 32: mflops = 118.011 (norm. = 0.574124), norm. avg. (of 14) = 0.368623 fft 33: mflops = 117.279 (norm. = 0.570566), norm. avg. (of 14) = 0.479175 fft 34: mflops = 33.5257 (norm. = 0.163103), norm. avg. (of 14) = 0.140619 fft 35: mflops = 84.6875 (norm. = 0.412006), norm. avg. (of 14) = 0.253899 fft 36: mflops = 53.3354 (norm. = 0.259478), norm. avg. (of 14) = 0.162258 fft 37: mflops = 5.85964 (norm. = 0.0285073), norm. avg. (of 14) = 0.0264005 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.23489 s, 32 iters, t-(init.)=1.22175 s t(norm)=0.0776767, mflops=64.3694 (err=1.4e-14) 1. Arndt DIT: elapsed time t=1.16053 s, 32 iters, t-(init.)=1.14735 s t(norm)=0.0729464, mflops=68.5435 (err=1.4e-14) 2. Arndt Split-Radix: elapsed time t=1.6516 s, 32 iters, t-(init.)=1.63845 s t(norm)=0.10417, mflops=47.9984 (err=1.4e-14) 3. Arndt 4-step: elapsed time t=1.12298 s, 16 iters, t-(init.)=1.11643 s t(norm)=0.141961, mflops=35.221 (err=1.4e-14) 4. Bailey: elapsed time t=1.00956 s, 32 iters, t-(init.)=0.99609 s t(norm)=0.0633297, mflops=78.9519 (err=1.4e-14) 5. Beauregard: elapsed time t=1.18501 s, 4 iters, t-(init.)=1.18332 s t(norm)=0.601869, mflops=8.30745 (err=1.4e-14) 6. Bergland: elapsed time t=1.94074 s, 64 iters, t-(init.)=1.91386 s t(norm)=0.0608398, mflops=82.183 (err=1.4e-14) 7. 
Brenner: elapsed time t=1.12358 s, 32 iters, t-(init.)=1.11013 s t(norm)=0.07058, mflops=70.8416 (err=1.4e-14) 8. Burrus: elapsed time t=1.12256 s, 16 iters, t-(init.)=1.11582 s t(norm)=0.141884, mflops=35.2399 (err=1.4e-14) 9. CWP (min N) (N=34320): elapsed time t=1.4457 s, 64 iters, t-(init.)=1.41812 s t(norm)=0.0450808, mflops=110.912 10. CWP (best N) (N=34320): elapsed time t=1.44573 s, 64 iters, t-(init.)=1.41812 s t(norm)=0.0450809, mflops=110.912 11. Edelblute: elapsed time t=1.24542 s, 16 iters, t-(init.)=1.2387 s t(norm)=0.157508, mflops=31.7444 (err=1.4e-14) 12. FFTPACK: elapsed time t=1.18524 s, 64 iters, t-(init.)=1.15834 s t(norm)=0.0368225, mflops=135.786 (err=1.4e-14) 13. FFTPACK (f2c): elapsed time t=1.10415 s, 32 iters, t-(init.)=1.09068 s t(norm)=0.0693437, mflops=72.1046 (err=1.4e-14) FFTW_MEASURE plan: (cost = 1.401808e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 32 FFTW_NOTW 32 14. FFTW: elapsed time t=1.77565 s, 128 iters, t-(init.)=1.72178 s t(norm)=0.027367, mflops=182.702 (err=1.4e-14) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.68119 s, 128 iters, t-(init.)=1.62725 s t(norm)=0.0258645, mflops=193.316 (err=1.4e-14) 16. Frigo-old: elapsed time t=1.61233 s, 64 iters, t-(init.)=1.58529 s t(norm)=0.050395, mflops=99.2163 (err=1.4e-14) 17. Green: elapsed time t=1.65284 s, 64 iters, t-(init.)=1.62584 s t(norm)=0.0516841, mflops=96.7416 (err=1.4e-14) 18. GSL: elapsed time t=1.20634 s, 32 iters, t-(init.)=1.19291 s t(norm)=0.0758431, mflops=65.9256 (err=1.4e-14) 19. GSL DIT: elapsed time t=1.91382 s, 32 iters, t-(init.)=1.90033 s t(norm)=0.12082, mflops=41.3839 (err=1.4e-14) 20. GSL DIF: elapsed time t=1.64816 s, 32 iters, t-(init.)=1.63464 s t(norm)=0.103928, mflops=48.1103 (err=1.4e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.15044 s, 32 iters, t-(init.)=1.13695 s t(norm)=0.0722854, mflops=69.1703 (err=1.4e-14) 23. 
Mayer (simple): elapsed time t=1.00103 s, 32 iters, t-(init.)=0.987503 s t(norm)=0.0627838, mflops=79.6384 24. Mayer (lookup): elapsed time t=1.01296 s, 32 iters, t-(init.)=0.999437 s t(norm)=0.0635425, mflops=78.6875 (err=1.4e-14) 25. Monro: elapsed time t=1.04992 s, 32 iters, t-(init.)=1.03643 s t(norm)=0.0658945, mflops=75.8789 (err=1.5e-07) 26. NAPACK (f2c): elapsed time t=1.56542 s, 16 iters, t-(init.)=1.55868 s t(norm)=0.198196, mflops=25.2275 (err=5.6e-13) 27. Ooura (C): elapsed time t=1.3145 s, 64 iters, t-(init.)=1.28761 s t(norm)=0.0409322, mflops=122.153 (err=1.4e-14) 28. Ooura (F): elapsed time t=1.33869 s, 64 iters, t-(init.)=1.31176 s t(norm)=0.0416998, mflops=119.905 (err=1.4e-14) 29. Ransom: elapsed time t=1.3838 s, 32 iters, t-(init.)=1.37051 s t(norm)=0.0871346, mflops=57.3825 (err=1.5e-14) 30. SCIPORT: elapsed time t=1.07654 s, 32 iters, t-(init.)=1.06309 s t(norm)=0.0675897, mflops=73.9757 (err=1.4e-14) 31. Singleton: elapsed time t=1.04398 s, 32 iters, t-(init.)=1.03044 s t(norm)=0.0655139, mflops=76.3197 (err=2.1e-14) 32. Singleton (f2c): elapsed time t=1.07283 s, 32 iters, t-(init.)=1.05931 s t(norm)=0.0673489, mflops=74.2402 (err=2.1e-14) 33. Sorensen: elapsed time t=1.4149 s, 64 iters, t-(init.)=1.38863 s t(norm)=0.0441433, mflops=113.268 (err=1.4e-14) 34. Sorensen DIT: elapsed time t=1.21767 s, 16 iters, t-(init.)=1.21108 s t(norm)=0.153997, mflops=32.4682 (err=1.4e-14) 35. Temperton: elapsed time t=1.30768 s, 32 iters, t-(init.)=1.29453 s t(norm)=0.0823039, mflops=60.7504 (err=1.5e-07) 36. Temperton (f2c): elapsed time t=1.88442 s, 32 iters, t-(init.)=1.87126 s t(norm)=0.118972, mflops=42.0268 (err=1.4e-14) 37. Valkenburg: elapsed time t=1.72616 s, 4 iters, t-(init.)=1.72451 s t(norm)=0.877131, mflops=5.7004 (err=1.4e-14) Top mflops for N=32768 = 193.316 Normalized results and averages for N=32768: fft 0: mflops = 64.3694 (norm. = 0.332976), norm. avg. (of 15) = 0.305548 fft 1: mflops = 68.5435 (norm. = 0.354568), norm. avg. 
(of 15) = 0.318403 fft 2: mflops = 47.9984 (norm. = 0.24829), norm. avg. (of 15) = 0.186736 fft 3: mflops = 35.221 (norm. = 0.182194), norm. avg. (of 15) = 0.102647 fft 4: mflops = 78.9519 (norm. = 0.408409), norm. avg. (of 15) = 0.27825 fft 5: mflops = 8.30745 (norm. = 0.0429735), norm. avg. (of 15) = 0.0389175 fft 6: mflops = 82.183 (norm. = 0.425124), norm. avg. (of 15) = 0.312414 fft 7: mflops = 70.8416 (norm. = 0.366456), norm. avg. (of 15) = 0.254423 fft 8: mflops = 35.2399 (norm. = 0.182292), norm. avg. (of 15) = 0.139962 fft 9: mflops = 110.912 (norm. = 0.573736), norm. avg. (of 15) = 0.374532 fft 10: mflops = 110.912 (norm. = 0.573734), norm. avg. (of 15) = 0.434173 fft 11: mflops = 31.7444 (norm. = 0.16421), norm. avg. (of 14) = 0.10918 fft 12: mflops = 135.786 (norm. = 0.702408), norm. avg. (of 15) = 0.510292 fft 13: mflops = 72.1046 (norm. = 0.372989), norm. avg. (of 15) = 0.305234 fft 14: mflops = 182.702 (norm. = 0.945098), norm. avg. (of 15) = 0.914698 fft 15: mflops = 193.316 (norm. = 1), norm. avg. (of 15) = 0.869748 fft 16: mflops = 99.2163 (norm. = 0.513235), norm. avg. (of 15) = 0.705758 fft 17: mflops = 96.7416 (norm. = 0.500434), norm. avg. (of 13) = 0.548044 fft 18: mflops = 65.9256 (norm. = 0.341026), norm. avg. (of 15) = 0.239377 fft 19: mflops = 41.3839 (norm. = 0.214074), norm. avg. (of 15) = 0.188665 fft 20: mflops = 48.1103 (norm. = 0.248869), norm. avg. (of 15) = 0.222268 fft 21: mflops = -1 (norm. = -0.00517289), norm. avg. (of 12) = 0.370135 fft 22: mflops = 69.1703 (norm. = 0.35781), norm. avg. (of 14) = 0.273547 fft 23: mflops = 79.6384 (norm. = 0.411961), norm. avg. (of 14) = 0.332773 fft 24: mflops = 78.6875 (norm. = 0.407042), norm. avg. (of 14) = 0.326597 fft 25: mflops = 75.8789 (norm. = 0.392513), norm. avg. (of 14) = 0.306704 fft 26: mflops = 25.2275 (norm. = 0.130499), norm. avg. (of 15) = 0.0944786 fft 27: mflops = 122.153 (norm. = 0.631886), norm. avg. (of 15) = 0.496403 fft 28: mflops = 119.905 (norm. = 0.620254), norm. 
avg. (of 15) = 0.48875 fft 29: mflops = 57.3825 (norm. = 0.296833), norm. avg. (of 14) = 0.168779 fft 30: mflops = 73.9757 (norm. = 0.382668), norm. avg. (of 14) = 0.369693 fft 31: mflops = 76.3197 (norm. = 0.394793), norm. avg. (of 15) = 0.372242 fft 32: mflops = 74.2402 (norm. = 0.384037), norm. avg. (of 15) = 0.36965 fft 33: mflops = 113.268 (norm. = 0.58592), norm. avg. (of 15) = 0.486292 fft 34: mflops = 32.4682 (norm. = 0.167954), norm. avg. (of 15) = 0.142441 fft 35: mflops = 60.7504 (norm. = 0.314255), norm. avg. (of 15) = 0.257923 fft 36: mflops = 42.0268 (norm. = 0.2174), norm. avg. (of 15) = 0.165935 fft 37: mflops = 5.7004 (norm. = 0.0294875), norm. avg. (of 15) = 0.0266063 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.90217 s, 16 iters, t-(init.)=1.88656 s t(norm)=0.112448, mflops=44.4651 (err=1.7e-14) 1. Arndt DIT: elapsed time t=1.84601 s, 16 iters, t-(init.)=1.83041 s t(norm)=0.109101, mflops=45.8292 (err=1.7e-14) 2. Arndt Split-Radix: elapsed time t=1.32873 s, 8 iters, t-(init.)=1.3209 s t(norm)=0.157463, mflops=31.7535 (err=1.7e-14) 3. Arndt 4-step: elapsed time t=1.1763 s, 8 iters, t-(init.)=1.16848 s t(norm)=0.139294, mflops=35.8953 (err=1.7e-14) 4. Bailey: elapsed time t=1.12181 s, 16 iters, t-(init.)=1.10621 s t(norm)=0.0659352, mflops=75.8321 (err=1.7e-14) 5. Beauregard: elapsed time t=1.28229 s, 2 iters, t-(init.)=1.28031 s t(norm)=0.610502, mflops=8.18999 (err=1.7e-14) 6. Bergland: elapsed time t=1.15963 s, 16 iters, t-(init.)=1.14418 s t(norm)=0.0681986, mflops=73.3153 (err=1.7e-14) 7. Brenner: elapsed time t=1.47009 s, 16 iters, t-(init.)=1.45469 s t(norm)=0.0867062, mflops=57.666 (err=1.7e-14) 8. Burrus: elapsed time t=1.6136 s, 8 iters, t-(init.)=1.60586 s t(norm)=0.191433, mflops=26.1188 (err=1.7e-14) 9. CWP (min N) (N=72072): elapsed time t=1.59876 s, 32 iters, t-(init.)=1.56535 s t(norm)=0.046651, mflops=107.179 10. 
CWP (best N) (N=72072): elapsed time t=1.59889 s, 32 iters, t-(init.)=1.5655 s t(norm)=0.0466555, mflops=107.169 11. Edelblute: elapsed time t=1.75208 s, 8 iters, t-(init.)=1.74423 s t(norm)=0.207929, mflops=24.0467 (err=1.7e-14) 12. FFTPACK: elapsed time t=1.53941 s, 32 iters, t-(init.)=1.50863 s t(norm)=0.0449606, mflops=111.209 (err=1.7e-14) 13. FFTPACK (f2c): elapsed time t=1.22624 s, 16 iters, t-(init.)=1.21075 s t(norm)=0.0721663, mflops=69.2844 (err=1.7e-14) FFTW_MEASURE plan: (cost = 4.713866e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.50766 s, 32 iters, t-(init.)=1.47575 s t(norm)=0.0439807, mflops=113.686 (err=1.7e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.22443 s, 32 iters, t-(init.)=1.19261 s t(norm)=0.0355425, mflops=140.677 (err=1.7e-14) 16. Frigo-old: elapsed time t=1.98103 s, 32 iters, t-(init.)=1.94973 s t(norm)=0.0581064, mflops=86.049 (err=1.7e-14) 17. Green: elapsed time t=1.02141 s, 16 iters, t-(init.)=1.00595 s t(norm)=0.0599594, mflops=83.3898 (err=1.7e-14) 18. GSL: elapsed time t=1.28368 s, 16 iters, t-(init.)=1.2682 s t(norm)=0.0755906, mflops=66.1458 (err=1.7e-14) 19. GSL DIT: elapsed time t=1.23302 s, 8 iters, t-(init.)=1.22513 s t(norm)=0.146046, mflops=34.2357 (err=1.7e-14) 20. GSL DIF: elapsed time t=1.06358 s, 8 iters, t-(init.)=1.05575 s t(norm)=0.125855, mflops=39.7283 (err=1.8e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.46719 s, 16 iters, t-(init.)=1.45152 s t(norm)=0.0865174, mflops=57.7918 (err=1.7e-14) 23. Mayer (simple): elapsed time t=1.32847 s, 16 iters, t-(init.)=1.31286 s t(norm)=0.0782525, mflops=63.8957 24. Mayer (lookup): elapsed time t=1.40405 s, 16 iters, t-(init.)=1.38842 s t(norm)=0.082756, mflops=60.4185 (err=1.7e-14) 25. 
Monro: elapsed time t=1.71185 s, 16 iters, t-(init.)=1.6962 s t(norm)=0.101102, mflops=49.4552 (err=1.6e-07) 26. NAPACK (f2c): elapsed time t=1.66664 s, 8 iters, t-(init.)=1.65879 s t(norm)=0.197743, mflops=25.2853 (err=8.6e-13) 27. Ooura (C): elapsed time t=1.53742 s, 32 iters, t-(init.)=1.50667 s t(norm)=0.0449023, mflops=111.353 (err=1.7e-14) 28. Ooura (F): elapsed time t=1.54663 s, 32 iters, t-(init.)=1.51584 s t(norm)=0.0451755, mflops=110.679 (err=1.7e-14) 29. Ransom: elapsed time t=1.56704 s, 16 iters, t-(init.)=1.55158 s t(norm)=0.0924816, mflops=54.0648 (err=1.7e-14) 30. SCIPORT: elapsed time t=1.28098 s, 8 iters, t-(init.)=1.27304 s t(norm)=0.151758, mflops=32.9473 (err=1.7e-14) 31. Singleton: elapsed time t=1.05877 s, 16 iters, t-(init.)=1.04314 s t(norm)=0.0621762, mflops=80.4167 (err=2.3e-14) 32. Singleton (f2c): elapsed time t=1.09123 s, 16 iters, t-(init.)=1.07554 s t(norm)=0.0641074, mflops=77.9942 (err=2.3e-14) 33. Sorensen: elapsed time t=1.76494 s, 32 iters, t-(init.)=1.73387 s t(norm)=0.0516733, mflops=96.7617 (err=1.7e-14) 34. Sorensen DIT: elapsed time t=1.72947 s, 8 iters, t-(init.)=1.72164 s t(norm)=0.205236, mflops=24.3622 (err=1.7e-14) 35. Temperton: elapsed time t=1.46889 s, 16 iters, t-(init.)=1.45342 s t(norm)=0.0866308, mflops=57.7162 (err=1.7e-07) 36. Temperton (f2c): elapsed time t=1.04442 s, 8 iters, t-(init.)=1.03663 s t(norm)=0.123576, mflops=40.4609 (err=1.7e-14) 37. Valkenburg: elapsed time t=1.93326 s, 2 iters, t-(init.)=1.92998 s t(norm)=0.920284, mflops=5.4331 (err=1.7e-14) Top mflops for N=65536 = 140.677 Normalized results and averages for N=65536: fft 0: mflops = 44.4651 (norm. = 0.31608), norm. avg. (of 16) = 0.306206 fft 1: mflops = 45.8292 (norm. = 0.325777), norm. avg. (of 16) = 0.318864 fft 2: mflops = 31.7535 (norm. = 0.22572), norm. avg. (of 16) = 0.189172 fft 3: mflops = 35.8953 (norm. = 0.255162), norm. avg. (of 16) = 0.112179 fft 4: mflops = 75.8321 (norm. = 0.539052), norm. avg. 
(of 16) = 0.29455 fft 5: mflops = 8.18999 (norm. = 0.0582185), norm. avg. (of 16) = 0.0401238 fft 6: mflops = 73.3153 (norm. = 0.521161), norm. avg. (of 16) = 0.325461 fft 7: mflops = 57.666 (norm. = 0.409919), norm. avg. (of 16) = 0.264142 fft 8: mflops = 26.1188 (norm. = 0.185666), norm. avg. (of 16) = 0.142819 fft 9: mflops = 107.179 (norm. = 0.761881), norm. avg. (of 16) = 0.398741 fft 10: mflops = 107.169 (norm. = 0.761808), norm. avg. (of 16) = 0.45465 fft 11: mflops = 24.0467 (norm. = 0.170936), norm. avg. (of 15) = 0.113297 fft 12: mflops = 111.209 (norm. = 0.790526), norm. avg. (of 16) = 0.527807 fft 13: mflops = 69.2844 (norm. = 0.492508), norm. avg. (of 16) = 0.316939 fft 14: mflops = 113.686 (norm. = 0.808139), norm. avg. (of 16) = 0.908038 fft 15: mflops = 140.677 (norm. = 1), norm. avg. (of 16) = 0.877889 fft 16: mflops = 86.049 (norm. = 0.611679), norm. avg. (of 16) = 0.699878 fft 17: mflops = 83.3898 (norm. = 0.592776), norm. avg. (of 14) = 0.55124 fft 18: mflops = 66.1458 (norm. = 0.470197), norm. avg. (of 16) = 0.253803 fft 19: mflops = 34.2357 (norm. = 0.243364), norm. avg. (of 16) = 0.192083 fft 20: mflops = 39.7283 (norm. = 0.282409), norm. avg. (of 16) = 0.226026 fft 21: mflops = -1 (norm. = -0.0071085), norm. avg. (of 12) = 0.370135 fft 22: mflops = 57.7918 (norm. = 0.410813), norm. avg. (of 15) = 0.282698 fft 23: mflops = 63.8957 (norm. = 0.454203), norm. avg. (of 15) = 0.340868 fft 24: mflops = 60.4185 (norm. = 0.429485), norm. avg. (of 15) = 0.333456 fft 25: mflops = 49.4552 (norm. = 0.351552), norm. avg. (of 15) = 0.309694 fft 26: mflops = 25.2853 (norm. = 0.179741), norm. avg. (of 16) = 0.0998074 fft 27: mflops = 111.353 (norm. = 0.791552), norm. avg. (of 16) = 0.51485 fft 28: mflops = 110.679 (norm. = 0.786764), norm. avg. (of 16) = 0.507376 fft 29: mflops = 54.0648 (norm. = 0.38432), norm. avg. (of 15) = 0.183149 fft 30: mflops = 32.9473 (norm. = 0.234206), norm. avg. (of 15) = 0.360661 fft 31: mflops = 80.4167 (norm. 
= 0.571642), norm. avg. (of 16) = 0.384704 fft 32: mflops = 77.9942 (norm. = 0.554421), norm. avg. (of 16) = 0.381199 fft 33: mflops = 96.7617 (norm. = 0.68783), norm. avg. (of 16) = 0.498888 fft 34: mflops = 24.3622 (norm. = 0.173179), norm. avg. (of 16) = 0.144363 fft 35: mflops = 57.7162 (norm. = 0.410276), norm. avg. (of 16) = 0.267445 fft 36: mflops = 40.4609 (norm. = 0.287616), norm. avg. (of 16) = 0.17354 fft 37: mflops = 5.4331 (norm. = 0.0386212), norm. avg. (of 16) = 0.0273572 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.27739 s, 4 iters, t-(init.)=1.26404 s t(norm)=0.141821, mflops=35.2557 (err=3.3e-14) 1. Arndt DIT: elapsed time t=1.27764 s, 4 iters, t-(init.)=1.26442 s t(norm)=0.141864, mflops=35.245 (err=3.3e-14) 2. Arndt Split-Radix: elapsed time t=1.86082 s, 4 iters, t-(init.)=1.84751 s t(norm)=0.207285, mflops=24.1213 (err=3.3e-14) 3. Arndt 4-step: elapsed time t=1.4674 s, 4 iters, t-(init.)=1.45383 s t(norm)=0.163115, mflops=30.6531 (err=3.3e-14) 4. Bailey: elapsed time t=1.0811 s, 4 iters, t-(init.)=1.06634 s t(norm)=0.11964, mflops=41.7919 (err=3.3e-14) 5. Beauregard: elapsed time t=1.39074 s, 1 iters, t-(init.)=1.38734 s t(norm)=0.622621, mflops=8.03057 (err=3.3e-14) 6. Bergland: elapsed time t=1.45509 s, 8 iters, t-(init.)=1.42847 s t(norm)=0.0801352, mflops=62.3946 (err=3.4e-14) 7. Brenner: elapsed time t=1.02335 s, 4 iters, t-(init.)=1.00998 s t(norm)=0.113317, mflops=44.124 (err=3.3e-14) 8. Burrus: elapsed time t=1.07853 s, 2 iters, t-(init.)=1.07189 s t(norm)=0.240526, mflops=20.7877 (err=3.3e-14) 9. CWP (min N) (N=144144): elapsed time t=1.81858 s, 16 iters, t-(init.)=1.76024 s t(norm)=0.0493734, mflops=101.269 10. CWP (best N) (N=144144): elapsed time t=1.81747 s, 16 iters, t-(init.)=1.75933 s t(norm)=0.0493478, mflops=101.322 11. Edelblute: elapsed time t=1.12761 s, 2 iters, t-(init.)=1.12095 s t(norm)=0.251536, mflops=19.8779 (err=3.3e-14) 12. 
FFTPACK: elapsed time t=1.30451 s, 8 iters, t-(init.)=1.2769 s t(norm)=0.071632, mflops=69.8012 (err=3.3e-14) 13. FFTPACK (f2c): elapsed time t=1.9061 s, 8 iters, t-(init.)=1.8784 s t(norm)=0.105376, mflops=47.4493 (err=3.3e-14) FFTW_MEASURE plan: (cost = 9.543003e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 14. FFTW: elapsed time t=1.66666 s, 16 iters, t-(init.)=1.60745 s t(norm)=0.0450878, mflops=110.895 (err=3.3e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.68449 s, 16 iters, t-(init.)=1.62529 s t(norm)=0.0455882, mflops=109.677 (err=3.3e-14) 16. Frigo-old: elapsed time t=1.39674 s, 8 iters, t-(init.)=1.36714 s t(norm)=0.0766945, mflops=65.1937 (err=3.3e-14) 17. Green: elapsed time t=1.3408 s, 8 iters, t-(init.)=1.31413 s t(norm)=0.0737208, mflops=67.8234 (err=3.3e-14) 18. GSL: elapsed time t=1.9404 s, 8 iters, t-(init.)=1.91192 s t(norm)=0.107256, mflops=46.6176 (err=3.3e-14) 19. GSL DIT: elapsed time t=1.64973 s, 4 iters, t-(init.)=1.63639 s t(norm)=0.183598, mflops=27.2334 (err=3.5e-14) 20. GSL DIF: elapsed time t=1.48587 s, 4 iters, t-(init.)=1.47241 s t(norm)=0.1652, mflops=30.2663 (err=3.5e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.80426 s, 8 iters, t-(init.)=1.77753 s t(norm)=0.0997165, mflops=50.1422 (err=3.3e-14) 23. Mayer (simple): elapsed time t=1.67329 s, 8 iters, t-(init.)=1.64658 s t(norm)=0.0923708, mflops=54.1297 24. Mayer (lookup): elapsed time t=1.79726 s, 8 iters, t-(init.)=1.77074 s t(norm)=0.0993358, mflops=50.3343 (err=3.3e-14) 25. Monro: elapsed time t=1.23127 s, 4 iters, t-(init.)=1.21776 s t(norm)=0.136629, mflops=36.5954 (err=1.7e-07) 26. NAPACK (f2c): elapsed time t=1.13695 s, 2 iters, t-(init.)=1.12938 s t(norm)=0.253427, mflops=19.7296 (err=2.0e-12) 27. Ooura (C): elapsed time t=1.95242 s, 16 iters, t-(init.)=1.89898 s t(norm)=0.053265, mflops=93.8702 (err=3.4e-14) 28. 
Ooura (F): elapsed time t=1.98048 s, 16 iters, t-(init.)=1.92709 s t(norm)=0.0540536, mflops=92.5008 (err=3.4e-14) 29. Ransom: elapsed time t=1.98579 s, 8 iters, t-(init.)=1.95898 s t(norm)=0.109896, mflops=45.4975 (err=3.3e-14) 30. SCIPORT: elapsed time t=1.88344 s, 4 iters, t-(init.)=1.86649 s t(norm)=0.209414, mflops=23.8761 (err=3.3e-14) 31. Singleton: elapsed time t=1.69586 s, 8 iters, t-(init.)=1.66896 s t(norm)=0.0936261, mflops=53.4039 (err=4.8e-14) 32. Singleton (f2c): elapsed time t=1.72049 s, 8 iters, t-(init.)=1.69356 s t(norm)=0.0950062, mflops=52.6281 (err=4.8e-14) 33. Sorensen: elapsed time t=1.27819 s, 8 iters, t-(init.)=1.25192 s t(norm)=0.0702309, mflops=71.1938 (err=3.3e-14) 34. Sorensen DIT: elapsed time t=1.14806 s, 2 iters, t-(init.)=1.14148 s t(norm)=0.256142, mflops=19.5204 (err=3.3e-14) 35. Temperton: elapsed time t=1.08838 s, 4 iters, t-(init.)=1.07488 s t(norm)=0.120598, mflops=41.46 (err=1.9e-07) 36. Temperton (f2c): elapsed time t=1.61528 s, 4 iters, t-(init.)=1.60178 s t(norm)=0.179715, mflops=27.8218 (err=3.3e-14) 37. Valkenburg: elapsed time t=2.21734 s, 1 iters, t-(init.)=2.20983 s t(norm)=0.991747, mflops=5.04161 (err=3.4e-14) Top mflops for N=131072 = 110.895 Normalized results and averages for N=131072: fft 0: mflops = 35.2557 (norm. = 0.31792), norm. avg. (of 17) = 0.306895 fft 1: mflops = 35.245 (norm. = 0.317824), norm. avg. (of 17) = 0.318803 fft 2: mflops = 24.1213 (norm. = 0.217515), norm. avg. (of 17) = 0.190839 fft 3: mflops = 30.6531 (norm. = 0.276416), norm. avg. (of 17) = 0.12184 fft 4: mflops = 41.7919 (norm. = 0.376861), norm. avg. (of 17) = 0.299392 fft 5: mflops = 8.03057 (norm. = 0.0724161), norm. avg. (of 17) = 0.0420234 fft 6: mflops = 62.3946 (norm. = 0.562646), norm. avg. (of 17) = 0.339413 fft 7: mflops = 44.124 (norm. = 0.39789), norm. avg. (of 17) = 0.272009 fft 8: mflops = 20.7877 (norm. = 0.187455), norm. avg. (of 17) = 0.145444 fft 9: mflops = 101.269 (norm. = 0.913199), norm. avg. 
(of 17) = 0.429004 fft 10: mflops = 101.322 (norm. = 0.913674), norm. avg. (of 17) = 0.481651 fft 11: mflops = 19.8779 (norm. = 0.17925), norm. avg. (of 16) = 0.117419 fft 12: mflops = 69.8012 (norm. = 0.629437), norm. avg. (of 17) = 0.533785 fft 13: mflops = 47.4493 (norm. = 0.427877), norm. avg. (of 17) = 0.323464 fft 14: mflops = 110.895 (norm. = 1), norm. avg. (of 17) = 0.913447 fft 15: mflops = 109.677 (norm. = 0.989022), norm. avg. (of 17) = 0.884426 fft 16: mflops = 65.1937 (norm. = 0.587888), norm. avg. (of 17) = 0.69329 fft 17: mflops = 67.8234 (norm. = 0.611602), norm. avg. (of 15) = 0.555264 fft 18: mflops = 46.6176 (norm. = 0.420377), norm. avg. (of 17) = 0.263601 fft 19: mflops = 27.2334 (norm. = 0.245579), norm. avg. (of 17) = 0.19523 fft 20: mflops = 30.2663 (norm. = 0.272928), norm. avg. (of 17) = 0.228785 fft 21: mflops = -1 (norm. = -0.00901756), norm. avg. (of 12) = 0.370135 fft 22: mflops = 50.1422 (norm. = 0.45216), norm. avg. (of 16) = 0.293289 fft 23: mflops = 54.1297 (norm. = 0.488117), norm. avg. (of 16) = 0.350071 fft 24: mflops = 50.3343 (norm. = 0.453893), norm. avg. (of 16) = 0.340983 fft 25: mflops = 36.5954 (norm. = 0.330001), norm. avg. (of 16) = 0.310963 fft 26: mflops = 19.7296 (norm. = 0.177913), norm. avg. (of 17) = 0.104402 fft 27: mflops = 93.8702 (norm. = 0.84648), norm. avg. (of 17) = 0.534357 fft 28: mflops = 92.5008 (norm. = 0.834132), norm. avg. (of 17) = 0.526597 fft 29: mflops = 45.4975 (norm. = 0.410277), norm. avg. (of 16) = 0.197344 fft 30: mflops = 23.8761 (norm. = 0.215304), norm. avg. (of 16) = 0.351576 fft 31: mflops = 53.4039 (norm. = 0.481573), norm. avg. (of 17) = 0.390403 fft 32: mflops = 52.6281 (norm. = 0.474577), norm. avg. (of 17) = 0.386691 fft 33: mflops = 71.1938 (norm. = 0.641994), norm. avg. (of 17) = 0.507306 fft 34: mflops = 19.5204 (norm. = 0.176027), norm. avg. (of 17) = 0.146225 fft 35: mflops = 41.46 (norm. = 0.373868), norm. avg. (of 17) = 0.273705 fft 36: mflops = 27.8218 (norm. 
= 0.250885), norm. avg. (of 17) = 0.178089 fft 37: mflops = 5.04161 (norm. = 0.045463), norm. avg. (of 17) = 0.0284223 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=1.05401 s, 1 iters, t-(init.)=1.04093 s t(norm)=0.220602, mflops=22.6653 (err=4.3e-14) 1. Arndt DIT: elapsed time t=1.0684 s, 1 iters, t-(init.)=1.05553 s t(norm)=0.223695, mflops=22.3518 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=1.50334 s, 1 iters, t-(init.)=1.49023 s t(norm)=0.315821, mflops=15.8318 (err=4.3e-14) 3. Arndt 4-step: elapsed time t=1.69699 s, 2 iters, t-(init.)=1.66539 s t(norm)=0.176471, mflops=28.3333 (err=4.3e-14) 4. Bailey: elapsed time t=1.72878 s, 2 iters, t-(init.)=1.69321 s t(norm)=0.179419, mflops=27.8677 (err=4.3e-14) 5. Beauregard: elapsed time t=3.1032 s, 1 iters, t-(init.)=3.08742 s t(norm)=0.654309, mflops=7.64165 (err=4.4e-14) 6. Bergland: elapsed time t=1.0731 s, 2 iters, t-(init.)=1.0435 s t(norm)=0.110573, mflops=45.2188 (err=4.4e-14) 7. Brenner: elapsed time t=1.56544 s, 2 iters, t-(init.)=1.53681 s t(norm)=0.162846, mflops=30.7038 (err=4.4e-14) 8. Burrus: elapsed time t=1.62379 s, 1 iters, t-(init.)=1.61071 s t(norm)=0.341354, mflops=14.6475 (err=4.3e-14) 9. CWP (min N) (N=360360): elapsed time t=1.39192 s, 4 iters, t-(init.)=1.30654 s t(norm)=0.0692232, mflops=72.2301 10. CWP (best N) (N=360360): elapsed time t=1.3923 s, 4 iters, t-(init.)=1.30688 s t(norm)=0.0692413, mflops=72.2113 11. Edelblute: elapsed time t=1.68838 s, 1 iters, t-(init.)=1.67531 s t(norm)=0.355044, mflops=14.0828 (err=4.3e-14) 12. FFTPACK: elapsed time t=1.84322 s, 4 iters, t-(init.)=1.77602 s t(norm)=0.0940969, mflops=53.1367 (err=4.4e-14) 13. FFTPACK (f2c): elapsed time t=1.22607 s, 2 iters, t-(init.)=1.19041 s t(norm)=0.126141, mflops=39.6383 (err=4.4e-14) FFTW_MEASURE plan: (cost = 3.052274e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_NOTW 32 14. 
FFTW: elapsed time t=1.23968 s, 4 iters, t-(init.)=1.16878 s t(norm)=0.061924, mflops=80.7442 (err=4.4e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.19884 s, 4 iters, t-(init.)=1.12811 s t(norm)=0.0597693, mflops=83.655 (err=4.4e-14) 16. Frigo-old: elapsed time t=1.98048 s, 4 iters, t-(init.)=1.91312 s t(norm)=0.10136, mflops=49.3289 (err=4.4e-14) 17. Green: elapsed time t=1.03375 s, 2 iters, t-(init.)=1.00437 s t(norm)=0.106427, mflops=46.9805 (err=4.4e-14) 18. GSL: elapsed time t=1.19223 s, 2 iters, t-(init.)=1.15833 s t(norm)=0.122741, mflops=40.736 (err=4.4e-14) 19. GSL DIT: elapsed time t=1.29677 s, 1 iters, t-(init.)=1.28365 s t(norm)=0.27204, mflops=18.3796 (err=4.6e-14) 20. GSL DIF: elapsed time t=1.22319 s, 1 iters, t-(init.)=1.2074 s t(norm)=0.255882, mflops=19.5402 (err=4.6e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.26583 s, 2 iters, t-(init.)=1.23716 s t(norm)=0.131094, mflops=38.1406 (err=4.3e-14) 23. Mayer (simple): elapsed time t=1.19736 s, 2 iters, t-(init.)=1.16883 s t(norm)=0.123853, mflops=40.3704 24. Mayer (lookup): elapsed time t=1.30383 s, 2 iters, t-(init.)=1.27528 s t(norm)=0.135133, mflops=37.0005 (err=4.3e-14) 25. Monro: elapsed time t=1.03971 s, 1 iters, t-(init.)=1.0266 s t(norm)=0.217565, mflops=22.9816 (err=1.8e-07) 26. NAPACK (f2c): elapsed time t=1.45642 s, 1 iters, t-(init.)=1.4361 s t(norm)=0.30435, mflops=16.4285 (err=3.7e-12) 27. Ooura (C): elapsed time t=1.29107 s, 4 iters, t-(init.)=1.23246 s t(norm)=0.0652981, mflops=76.5719 (err=4.4e-14) 28. Ooura (F): elapsed time t=1.30454 s, 4 iters, t-(init.)=1.24593 s t(norm)=0.0660115, mflops=75.7443 (err=4.4e-14) 29. Ransom: elapsed time t=1.97352 s, 4 iters, t-(init.)=1.91042 s t(norm)=0.101217, mflops=49.3986 (err=4.3e-14) 30. 
SCIPORT: elapsed time t=1.4754 s, 1 iters, t-(init.)=1.45763 s t(norm)=0.308911, mflops=16.1859 (err=4.4e-14) 31. Singleton: elapsed time t=1.2201 s, 2 iters, t-(init.)=1.18835 s t(norm)=0.125922, mflops=39.707 (err=6.0e-14) 32. Singleton (f2c): elapsed time t=1.23607 s, 2 iters, t-(init.)=1.2045 s t(norm)=0.127634, mflops=39.1746 (err=6.0e-14) 33. Sorensen: elapsed time t=1.12551 s, 2 iters, t-(init.)=1.0966 s t(norm)=0.1162, mflops=43.0292 (err=4.3e-14) 34. Sorensen DIT: elapsed time t=1.69093 s, 1 iters, t-(init.)=1.66271 s t(norm)=0.352375, mflops=14.1894 (err=4.3e-14) 35. Temperton: elapsed time t=1.48623 s, 2 iters, t-(init.)=1.45904 s t(norm)=0.154606, mflops=32.3403 (err=2.0e-07) 36. Temperton (f2c): elapsed time t=1.12549 s, 1 iters, t-(init.)=1.11413 s t(norm)=0.236115, mflops=21.1762 (err=4.4e-14) 37. Valkenburg: elapsed time t=5.19783 s, 1 iters, t-(init.)=5.17879 s t(norm)=1.09753, mflops=4.55569 (err=4.4e-14) Top mflops for N=262144 = 83.655 Normalized results and averages for N=262144: fft 0: mflops = 22.6653 (norm. = 0.270937), norm. avg. (of 18) = 0.304897 fft 1: mflops = 22.3518 (norm. = 0.267191), norm. avg. (of 18) = 0.315935 fft 2: mflops = 15.8318 (norm. = 0.189251), norm. avg. (of 18) = 0.190751 fft 3: mflops = 28.3333 (norm. = 0.338692), norm. avg. (of 18) = 0.133887 fft 4: mflops = 27.8677 (norm. = 0.333126), norm. avg. (of 18) = 0.301266 fft 5: mflops = 7.64165 (norm. = 0.0913471), norm. avg. (of 18) = 0.0447636 fft 6: mflops = 45.2188 (norm. = 0.540539), norm. avg. (of 18) = 0.350587 fft 7: mflops = 30.7038 (norm. = 0.367028), norm. avg. (of 18) = 0.277288 fft 8: mflops = 14.6475 (norm. = 0.175095), norm. avg. (of 18) = 0.147092 fft 9: mflops = 72.2301 (norm. = 0.863428), norm. avg. (of 18) = 0.453138 fft 10: mflops = 72.2113 (norm. = 0.863203), norm. avg. (of 18) = 0.502849 fft 11: mflops = 14.0828 (norm. = 0.168343), norm. avg. (of 17) = 0.120415 fft 12: mflops = 53.1367 (norm. = 0.635188), norm. avg. 
(of 18) = 0.539418 fft 13: mflops = 39.6383 (norm. = 0.47383), norm. avg. (of 18) = 0.331818 fft 14: mflops = 80.7442 (norm. = 0.965204), norm. avg. (of 18) = 0.916323 fft 15: mflops = 83.655 (norm. = 1), norm. avg. (of 18) = 0.890847 fft 16: mflops = 49.3289 (norm. = 0.58967), norm. avg. (of 18) = 0.687533 fft 17: mflops = 46.9805 (norm. = 0.561598), norm. avg. (of 16) = 0.55566 fft 18: mflops = 40.736 (norm. = 0.486953), norm. avg. (of 18) = 0.27601 fft 19: mflops = 18.3796 (norm. = 0.219707), norm. avg. (of 18) = 0.19659 fft 20: mflops = 19.5402 (norm. = 0.233581), norm. avg. (of 18) = 0.229052 fft 21: mflops = -1 (norm. = -0.0119539), norm. avg. (of 12) = 0.370135 fft 22: mflops = 38.1406 (norm. = 0.455927), norm. avg. (of 17) = 0.302856 fft 23: mflops = 40.3704 (norm. = 0.482581), norm. avg. (of 17) = 0.357866 fft 24: mflops = 37.0005 (norm. = 0.442298), norm. avg. (of 17) = 0.346943 fft 25: mflops = 22.9816 (norm. = 0.274719), norm. avg. (of 17) = 0.308831 fft 26: mflops = 16.4285 (norm. = 0.196384), norm. avg. (of 18) = 0.109512 fft 27: mflops = 76.5719 (norm. = 0.915329), norm. avg. (of 18) = 0.555522 fft 28: mflops = 75.7443 (norm. = 0.905437), norm. avg. (of 18) = 0.547644 fft 29: mflops = 49.3986 (norm. = 0.590503), norm. avg. (of 17) = 0.220471 fft 30: mflops = 16.1859 (norm. = 0.193484), norm. avg. (of 17) = 0.342276 fft 31: mflops = 39.707 (norm. = 0.474652), norm. avg. (of 18) = 0.395083 fft 32: mflops = 39.1746 (norm. = 0.468287), norm. avg. (of 18) = 0.391225 fft 33: mflops = 43.0292 (norm. = 0.514365), norm. avg. (of 18) = 0.507698 fft 34: mflops = 14.1894 (norm. = 0.169618), norm. avg. (of 18) = 0.147525 fft 35: mflops = 32.3403 (norm. = 0.386591), norm. avg. (of 18) = 0.279977 fft 36: mflops = 21.1762 (norm. = 0.253137), norm. avg. (of 18) = 0.182259 fft 37: mflops = 4.55569 (norm. = 0.0544581), norm. avg. 
(of 18) = 0.0298687 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. Brenner 1. CWP (min N) 2. CWP (best N) 3. FFTPACK 4. FFTPACK (f2c) 5. FFTW 6. FFTW_ESTIMATE 7. Frigo-old 8. GSL 9. NAPACK (f2c) 10. Singleton 11. Singleton (f2c) 12. Temperton 13. Temperton (f2c) 14. Valkenburg Computing normalized averages (15 transforms). Benchmarking for array size = 6: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.22113 s, 262144 iters, t-(init.)=1.1799 s t(norm)=0.290202, mflops=17.2294 2. CWP (best N) (N=15): elapsed time t=1.55055 s, 262144 iters, t-(init.)=1.48816 s t(norm)=0.36602, mflops=13.6605 3. FFTPACK: elapsed time t=1.12278 s, 524288 iters, t-(init.)=1.02743 s t(norm)=0.126351, mflops=39.5724 (err=1.4e-16) 4. FFTPACK (f2c): elapsed time t=1.41545 s, 524288 iters, t-(init.)=1.33925 s t(norm)=0.164697, mflops=30.3587 (err=1.8e-16) FFTW_MEASURE plan: (cost = 6.937969e-07) FFTW_NOTW 6 5. FFTW: elapsed time t=1.13467 s, 2097152 iters, t-(init.)=0.786803 s t(norm)=0.0241897, mflops=206.699 (err=1.1e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 6. FFTW_ESTIMATE: elapsed time t=1.19781 s, 2097152 iters, t-(init.)=0.885391 s t(norm)=0.0272207, mflops=183.684 (err=1.1e-16) 7. Frigo-old: elapsed time t=1.0595 s, 262144 iters, t-(init.)=1.00982 s t(norm)=0.248369, mflops=20.1313 (err=3.3e-16) 8. GSL: elapsed time t=1.37543 s, 524288 iters, t-(init.)=1.28668 s t(norm)=0.158232, mflops=31.5991 (err=1.2e-16) 9. 
NAPACK (f2c): elapsed time t=1.0991 s, 131072 iters, t-(init.)=1.08216 s t(norm)=0.532325, mflops=9.39276 (err=4.7e-16) 10. Singleton: elapsed time t=1.77425 s, 262144 iters, t-(init.)=1.73513 s t(norm)=0.426763, mflops=11.7161 (err=1.0e-16) 11. Singleton (f2c): elapsed time t=1.79061 s, 262144 iters, t-(init.)=1.7472 s t(norm)=0.429731, mflops=11.6352 (err=1.0e-16) 12. Temperton: elapsed time t=1.58096 s, 262144 iters, t-(init.)=1.54285 s t(norm)=0.37947, mflops=13.1763 (err=3.7e-16) 13. Temperton (f2c): elapsed time t=1.20028 s, 131072 iters, t-(init.)=1.17962 s t(norm)=0.580268, mflops=8.61672 (err=1.0e-16) 14. Valkenburg: elapsed time t=1.85155 s, 131072 iters, t-(init.)=1.82802 s t(norm)=0.89922, mflops=5.56038 (err=3.4e-16) Top mflops for N=6 = 206.699 Normalized results and averages for N=6: fft 0: mflops = -1 (norm. = -0.00483794), norm. avg. (of 0) = -1 fft 1: mflops = 17.2294 (norm. = 0.0833546), norm. avg. (of 1) = 0.0833546 fft 2: mflops = 13.6605 (norm. = 0.0660886), norm. avg. (of 1) = 0.0660886 fft 3: mflops = 39.5724 (norm. = 0.191449), norm. avg. (of 1) = 0.191449 fft 4: mflops = 30.3587 (norm. = 0.146874), norm. avg. (of 1) = 0.146874 fft 5: mflops = 206.699 (norm. = 1), norm. avg. (of 1) = 1 fft 6: mflops = 183.684 (norm. = 0.888651), norm. avg. (of 1) = 0.888651 fft 7: mflops = 20.1313 (norm. = 0.0973941), norm. avg. (of 1) = 0.0973941 fft 8: mflops = 31.5991 (norm. = 0.152875), norm. avg. (of 1) = 0.152875 fft 9: mflops = 9.39276 (norm. = 0.0454416), norm. avg. (of 1) = 0.0454416 fft 10: mflops = 11.7161 (norm. = 0.0566819), norm. avg. (of 1) = 0.0566819 fft 11: mflops = 11.6352 (norm. = 0.0562903), norm. avg. (of 1) = 0.0562903 fft 12: mflops = 13.1763 (norm. = 0.063746), norm. avg. (of 1) = 0.063746 fft 13: mflops = 8.61672 (norm. = 0.0416872), norm. avg. (of 1) = 0.0416872 fft 14: mflops = 5.56038 (norm. = 0.0269008), norm. avg. (of 1) = 0.0269008 Benchmarking for array size = 9: 0. 
Brenner: elapsed time t=1.36294 s, 65536 iters, t-(init.)=1.35105 s t(norm)=0.722604, mflops=6.91942 (err=3.6e-16) 1. CWP (min N): elapsed time t=1.24924 s, 262144 iters, t-(init.)=1.18995 s t(norm)=0.15911, mflops=31.4248 2. CWP (best N) (N=15): elapsed time t=1.54284 s, 262144 iters, t-(init.)=1.47086 s t(norm)=0.196671, mflops=25.4232 3. FFTPACK: elapsed time t=1.33212 s, 524288 iters, t-(init.)=1.21795 s t(norm)=0.0814267, mflops=61.4049 (err=1.4e-16) 4. FFTPACK (f2c): elapsed time t=1.87039 s, 524288 iters, t-(init.)=1.77894 s t(norm)=0.118932, mflops=42.0408 (err=2.4e-16) FFTW_MEASURE plan: (cost = 8.513047e-07) FFTW_NOTW 9 5. FFTW: elapsed time t=1.78405 s, 2097152 iters, t-(init.)=1.33141 s t(norm)=0.0222531, mflops=224.688 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.82852 s, 2097152 iters, t-(init.)=1.46529 s t(norm)=0.0244908, mflops=204.159 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.04781 s, 131072 iters, t-(init.)=1.02242 s t(norm)=0.273418, mflops=18.287 (err=3.3e-16) 8. GSL: elapsed time t=1.22108 s, 262144 iters, t-(init.)=1.17319 s t(norm)=0.156869, mflops=31.8737 (err=1.4e-16) 9. NAPACK (f2c): elapsed time t=1.19781 s, 131072 iters, t-(init.)=1.17138 s t(norm)=0.313255, mflops=15.9615 (err=4.6e-16) 10. Singleton: elapsed time t=1.81286 s, 262144 iters, t-(init.)=1.76634 s t(norm)=0.236179, mflops=21.1703 (err=1.5e-16) 11. Singleton (f2c): elapsed time t=1.78873 s, 262144 iters, t-(init.)=1.74115 s t(norm)=0.232812, mflops=21.4766 (err=1.5e-16) 12. Temperton: elapsed time t=1.55124 s, 262144 iters, t-(init.)=1.5047 s t(norm)=0.201195, mflops=24.8515 (err=1.1e-08) 13. Temperton (f2c): elapsed time t=1.10564 s, 131072 iters, t-(init.)=1.08022 s t(norm)=0.288877, mflops=17.3084 (err=1.4e-16) 14. 
Valkenburg: elapsed time t=1.7374 s, 65536 iters, t-(init.)=1.72577 s t(norm)=0.923023, mflops=5.41698 (err=3.7e-16) Top mflops for N=9 = 224.688 Normalized results and averages for N=9: fft 0: mflops = 6.91942 (norm. = 0.0307957), norm. avg. (of 1) = 0.0307957 fft 1: mflops = 31.4248 (norm. = 0.13986), norm. avg. (of 2) = 0.111607 fft 2: mflops = 25.4232 (norm. = 0.113149), norm. avg. (of 2) = 0.0896187 fft 3: mflops = 61.4049 (norm. = 0.273289), norm. avg. (of 2) = 0.232369 fft 4: mflops = 42.0408 (norm. = 0.187107), norm. avg. (of 2) = 0.166991 fft 5: mflops = 224.688 (norm. = 1), norm. avg. (of 2) = 1 fft 6: mflops = 204.159 (norm. = 0.90863), norm. avg. (of 2) = 0.898641 fft 7: mflops = 18.287 (norm. = 0.0813885), norm. avg. (of 2) = 0.0893913 fft 8: mflops = 31.8737 (norm. = 0.141858), norm. avg. (of 2) = 0.147366 fft 9: mflops = 15.9615 (norm. = 0.0710382), norm. avg. (of 2) = 0.0582399 fft 10: mflops = 21.1703 (norm. = 0.0942209), norm. avg. (of 2) = 0.0754514 fft 11: mflops = 21.4766 (norm. = 0.0955839), norm. avg. (of 2) = 0.0759371 fft 12: mflops = 24.8515 (norm. = 0.110604), norm. avg. (of 2) = 0.0871752 fft 13: mflops = 17.3084 (norm. = 0.0770331), norm. avg. (of 2) = 0.0593601 fft 14: mflops = 5.41698 (norm. = 0.0241089), norm. avg. (of 2) = 0.0255048 Benchmarking for array size = 12: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.43379 s, 262144 iters, t-(init.)=1.37347 s t(norm)=0.121791, mflops=41.054 2. CWP (best N) (N=15): elapsed time t=1.54715 s, 262144 iters, t-(init.)=1.48046 s t(norm)=0.131278, mflops=38.0871 3. FFTPACK: elapsed time t=1.40677 s, 524288 iters, t-(init.)=1.27341 s t(norm)=0.0564588, mflops=88.5602 (err=1.7e-16) 4. FFTPACK (f2c): elapsed time t=1.03052 s, 262144 iters, t-(init.)=0.966023 s t(norm)=0.0856607, mflops=58.3698 (err=2.2e-16) FFTW_MEASURE plan: (cost = 8.679375e-07) FFTW_NOTW 12 5. 
FFTW: elapsed time t=1.01164 s, 1048576 iters, t-(init.)=0.739993 s t(norm)=0.0164045, mflops=304.795 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.93652 s, 2097152 iters, t-(init.)=1.47952 s t(norm)=0.0163993, mflops=304.89 (err=1.2e-16) 7. Frigo-old: elapsed time t=2.00759 s, 262144 iters, t-(init.)=1.94734 s t(norm)=0.172678, mflops=28.9557 (err=2.9e-16) 8. GSL: elapsed time t=1.27391 s, 262144 iters, t-(init.)=1.20835 s t(norm)=0.107149, mflops=46.6642 (err=1.6e-16) 9. NAPACK (f2c): elapsed time t=1.84085 s, 131072 iters, t-(init.)=1.81435 s t(norm)=0.32177, mflops=15.5391 (err=5.5e-16) 10. Singleton: elapsed time t=1.2355 s, 131072 iters, t-(init.)=1.20534 s t(norm)=0.213764, mflops=23.3903 (err=1.5e-16) 11. Singleton (f2c): elapsed time t=1.22948 s, 131072 iters, t-(init.)=1.19826 s t(norm)=0.212509, mflops=23.5285 (err=1.5e-16) 12. Temperton: elapsed time t=1.88499 s, 262144 iters, t-(init.)=1.83214 s t(norm)=0.162462, mflops=30.7764 (err=5.4e-16) 13. Temperton (f2c): elapsed time t=1.41705 s, 131072 iters, t-(init.)=1.38904 s t(norm)=0.246342, mflops=20.297 (err=1.4e-16) 14. Valkenburg: elapsed time t=1.23709 s, 32768 iters, t-(init.)=1.23009 s t(norm)=0.872611, mflops=5.72993 (err=3.4e-16) Top mflops for N=12 = 304.89 Normalized results and averages for N=12: fft 0: mflops = -1 (norm. = -0.00327987), norm. avg. (of 1) = 0.0307957 fft 1: mflops = 41.054 (norm. = 0.134652), norm. avg. (of 3) = 0.119289 fft 2: mflops = 38.0871 (norm. = 0.124921), norm. avg. (of 3) = 0.101386 fft 3: mflops = 88.5602 (norm. = 0.290466), norm. avg. (of 3) = 0.251735 fft 4: mflops = 58.3698 (norm. = 0.191445), norm. avg. (of 3) = 0.175142 fft 5: mflops = 304.795 (norm. = 0.999687), norm. avg. (of 3) = 0.999896 fft 6: mflops = 304.89 (norm. = 1), norm. avg. (of 3) = 0.932427 fft 7: mflops = 28.9557 (norm. = 0.0949707), norm. avg. (of 3) = 0.0912511 fft 8: mflops = 46.6642 (norm. = 0.153052), norm. avg. 
(of 3) = 0.149262 fft 9: mflops = 15.5391 (norm. = 0.0509661), norm. avg. (of 3) = 0.0558153 fft 10: mflops = 23.3903 (norm. = 0.076717), norm. avg. (of 3) = 0.0758733 fft 11: mflops = 23.5285 (norm. = 0.0771702), norm. avg. (of 3) = 0.0763481 fft 12: mflops = 30.7764 (norm. = 0.100943), norm. avg. (of 3) = 0.0917643 fft 13: mflops = 20.297 (norm. = 0.0665713), norm. avg. (of 3) = 0.0617639 fft 14: mflops = 5.72993 (norm. = 0.0187934), norm. avg. (of 3) = 0.0232677 Benchmarking for array size = 15: 0. Brenner: elapsed time t=1.96318 s, 65536 iters, t-(init.)=1.94423 s t(norm)=0.506228, mflops=9.87698 (err=3.5e-16) 1. CWP (min N): elapsed time t=1.54863 s, 262144 iters, t-(init.)=1.47886 s t(norm)=0.0962642, mflops=51.9404 2. CWP (best N): elapsed time t=1.53868 s, 262144 iters, t-(init.)=1.47199 s t(norm)=0.0958172, mflops=52.1827 3. FFTPACK: elapsed time t=1.79382 s, 524288 iters, t-(init.)=1.66565 s t(norm)=0.0542113, mflops=92.2316 (err=2.1e-16) 4. FFTPACK (f2c): elapsed time t=1.36094 s, 262144 iters, t-(init.)=1.2985 s t(norm)=0.0845237, mflops=59.155 (err=4.4e-16) FFTW_MEASURE plan: (cost = 1.614906e-06) FFTW_NOTW 15 5. FFTW: elapsed time t=1.52615 s, 1048576 iters, t-(init.)=1.25827 s t(norm)=0.0204764, mflops=244.184 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.53584 s, 1048576 iters, t-(init.)=1.24649 s t(norm)=0.0202845, mflops=246.493 (err=1.9e-16) 7. Frigo-old: elapsed time t=1.84405 s, 131072 iters, t-(init.)=1.80768 s t(norm)=0.235337, mflops=21.2462 (err=4.2e-16) 8. GSL: elapsed time t=1.00264 s, 131072 iters, t-(init.)=0.964564 s t(norm)=0.125574, mflops=39.8173 (err=2.0e-16) 9. NAPACK (f2c): elapsed time t=1.53635 s, 65536 iters, t-(init.)=1.52073 s t(norm)=0.395958, mflops=12.6276 (err=6.3e-16) 10. Singleton: elapsed time t=1.35263 s, 131072 iters, t-(init.)=1.31088 s t(norm)=0.17066, mflops=29.2981 (err=2.8e-16) 11. 
Singleton (f2c): elapsed time t=1.35361 s, 131072 iters, t-(init.)=1.31763 s t(norm)=0.171539, mflops=29.1479 (err=2.8e-16) 12. Temperton: elapsed time t=1.87398 s, 262144 iters, t-(init.)=1.79991 s t(norm)=0.117162, mflops=42.6758 (err=7.9e-16) 13. Temperton (f2c): elapsed time t=1.60497 s, 131072 iters, t-(init.)=1.5722 s t(norm)=0.20468, mflops=24.4284 (err=2.0e-16) 14. Valkenburg: elapsed time t=1.02219 s, 16384 iters, t-(init.)=1.0173 s t(norm)=1.05952, mflops=4.71914 (err=4.5e-16) Top mflops for N=15 = 246.493 Normalized results and averages for N=15: fft 0: mflops = 9.87698 (norm. = 0.04007), norm. avg. (of 2) = 0.0354328 fft 1: mflops = 51.9404 (norm. = 0.210717), norm. avg. (of 4) = 0.142146 fft 2: mflops = 52.1827 (norm. = 0.2117), norm. avg. (of 4) = 0.128965 fft 3: mflops = 92.2316 (norm. = 0.374175), norm. avg. (of 4) = 0.282345 fft 4: mflops = 59.155 (norm. = 0.239986), norm. avg. (of 4) = 0.191353 fft 5: mflops = 244.184 (norm. = 0.990631), norm. avg. (of 4) = 0.99758 fft 6: mflops = 246.493 (norm. = 1), norm. avg. (of 4) = 0.94932 fft 7: mflops = 21.2462 (norm. = 0.0861937), norm. avg. (of 4) = 0.0899867 fft 8: mflops = 39.8173 (norm. = 0.161535), norm. avg. (of 4) = 0.15233 fft 9: mflops = 12.6276 (norm. = 0.051229), norm. avg. (of 4) = 0.0546687 fft 10: mflops = 29.2981 (norm. = 0.11886), norm. avg. (of 4) = 0.0866198 fft 11: mflops = 29.1479 (norm. = 0.11825), norm. avg. (of 4) = 0.0868237 fft 12: mflops = 42.6758 (norm. = 0.173132), norm. avg. (of 4) = 0.112106 fft 13: mflops = 24.4284 (norm. = 0.0991036), norm. avg. (of 4) = 0.0710988 fft 14: mflops = 4.71914 (norm. = 0.0191451), norm. avg. (of 4) = 0.022237 Benchmarking for array size = 18: 0. Brenner: elapsed time t=1.35525 s, 32768 iters, t-(init.)=1.34573 s t(norm)=0.547153, mflops=9.13822 (err=4.3e-16) 1. CWP (min N): elapsed time t=1.98574 s, 262144 iters, t-(init.)=1.90635 s t(norm)=0.0968861, mflops=51.607 2. 
CWP (best N) (N=28): elapsed time t=1.06319 s, 131072 iters, t-(init.)=1.00662 s t(norm)=0.102318, mflops=48.8671 3. FFTPACK: elapsed time t=1.33278 s, 262144 iters, t-(init.)=1.25975 s t(norm)=0.0640242, mflops=78.0954 (err=2.8e-16) 4. FFTPACK (f2c): elapsed time t=1.04085 s, 131072 iters, t-(init.)=0.999102 s t(norm)=0.101554, mflops=49.2347 (err=2.7e-16) FFTW_MEASURE plan: (cost = 2.844734e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6 5. FFTW: elapsed time t=1.72915 s, 524288 iters, t-(init.)=1.56844 s t(norm)=0.0398564, mflops=125.45 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.77737 s, 524288 iters, t-(init.)=1.6241 s t(norm)=0.0412707, mflops=121.151 (err=2.0e-16) 7. Frigo-old: elapsed time t=1.21964 s, 65536 iters, t-(init.)=1.19767 s t(norm)=0.243476, mflops=20.5359 (err=5.0e-16) 8. GSL: elapsed time t=1.89762 s, 262144 iters, t-(init.)=1.81728 s t(norm)=0.0923595, mflops=54.1363 (err=2.3e-16) 9. NAPACK (f2c): elapsed time t=1.18168 s, 65536 iters, t-(init.)=1.16368 s t(norm)=0.236567, mflops=21.1357 (err=8.7e-16) 10. Singleton: elapsed time t=1.50113 s, 131072 iters, t-(init.)=1.46143 s t(norm)=0.148548, mflops=33.6592 (err=2.1e-16) 11. Singleton (f2c): elapsed time t=1.47649 s, 131072 iters, t-(init.)=1.43844 s t(norm)=0.146211, mflops=34.1972 (err=2.1e-16) 12. Temperton: elapsed time t=1.4265 s, 131072 iters, t-(init.)=1.38469 s t(norm)=0.140748, mflops=35.5244 (err=2.7e-08) 13. Temperton (f2c): elapsed time t=1.05574 s, 65536 iters, t-(init.)=1.03378 s t(norm)=0.210158, mflops=23.7916 (err=2.9e-16) 14. Valkenburg: elapsed time t=1.09291 s, 16384 iters, t-(init.)=1.08794 s t(norm)=0.884677, mflops=5.65178 (err=5.1e-16) Top mflops for N=18 = 125.45 Normalized results and averages for N=18: fft 0: mflops = 9.13822 (norm. = 0.0728433), norm. avg. (of 3) = 0.047903 fft 1: mflops = 51.607 (norm. = 0.411374), norm. avg. (of 5) = 0.195991 fft 2: mflops = 48.8671 (norm. = 0.389534), norm. avg. 
(of 5) = 0.181078 fft 3: mflops = 78.0954 (norm. = 0.622521), norm. avg. (of 5) = 0.35038 fft 4: mflops = 49.2347 (norm. = 0.392464), norm. avg. (of 5) = 0.231575 fft 5: mflops = 125.45 (norm. = 1), norm. avg. (of 5) = 0.998064 fft 6: mflops = 121.151 (norm. = 0.965733), norm. avg. (of 5) = 0.952603 fft 7: mflops = 20.5359 (norm. = 0.163697), norm. avg. (of 5) = 0.104729 fft 8: mflops = 54.1363 (norm. = 0.431536), norm. avg. (of 5) = 0.208171 fft 9: mflops = 21.1357 (norm. = 0.168479), norm. avg. (of 5) = 0.0774307 fft 10: mflops = 33.6592 (norm. = 0.268307), norm. avg. (of 5) = 0.122957 fft 11: mflops = 34.1972 (norm. = 0.272595), norm. avg. (of 5) = 0.123978 fft 12: mflops = 35.5244 (norm. = 0.283175), norm. avg. (of 5) = 0.14632 fft 13: mflops = 23.7916 (norm. = 0.189649), norm. avg. (of 5) = 0.0948089 fft 14: mflops = 5.65178 (norm. = 0.045052), norm. avg. (of 5) = 0.0268 Benchmarking for array size = 24: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.96255 s, 262144 iters, t-(init.)=1.86649 s t(norm)=0.0647051, mflops=77.2736 2. CWP (best N) (N=28): elapsed time t=1.06801 s, 131072 iters, t-(init.)=1.0162 s t(norm)=0.0704567, mflops=70.9656 3. FFTPACK: elapsed time t=1.46194 s, 262144 iters, t-(init.)=1.36574 s t(norm)=0.0473456, mflops=105.606 (err=2.2e-16) 4. FFTPACK (f2c): elapsed time t=1.20525 s, 131072 iters, t-(init.)=1.15504 s t(norm)=0.0800831, mflops=62.4352 (err=2.7e-16) FFTW_MEASURE plan: (cost = 2.962672e-06) FFTW_TWIDDLE 3 FFTW_NOTW 8 5. FFTW: elapsed time t=1.91277 s, 524288 iters, t-(init.)=1.70491 s t(norm)=0.0295518, mflops=169.194 (err=2.3e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.0757 s, 262144 iters, t-(init.)=0.981024 s t(norm)=0.0340089, mflops=147.02 (err=2.1e-16) 7. Frigo-old: elapsed time t=1.93773 s, 131072 iters, t-(init.)=1.89012 s t(norm)=0.131048, mflops=38.1538 (err=3.9e-16) 8. 
GSL: elapsed time t=1.09683 s, 131072 iters, t-(init.)=1.04292 s t(norm)=0.0723094, mflops=69.1473 (err=2.1e-16) 9. NAPACK (f2c): elapsed time t=1.65269 s, 65536 iters, t-(init.)=1.62996 s t(norm)=0.226022, mflops=22.1217 (err=8.0e-16) 10. Singleton: elapsed time t=1.09386 s, 65536 iters, t-(init.)=1.06717 s t(norm)=0.147981, mflops=33.788 (err=2.3e-16) 11. Singleton (f2c): elapsed time t=1.10098 s, 65536 iters, t-(init.)=1.07596 s t(norm)=0.149201, mflops=33.5119 (err=2.3e-16) 12. Temperton: elapsed time t=1.58449 s, 131072 iters, t-(init.)=1.53898 s t(norm)=0.106703, mflops=46.8592 (err=4.5e-09) 13. Temperton (f2c): elapsed time t=1.18342 s, 65536 iters, t-(init.)=1.1599 s t(norm)=0.16084, mflops=31.0868 (err=2.8e-16) 14. Valkenburg: elapsed time t=1.54118 s, 16384 iters, t-(init.)=1.53501 s t(norm)=0.85142, mflops=5.87254 (err=4.8e-16) Top mflops for N=24 = 169.194 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.00591036), norm. avg. (of 3) = 0.047903 fft 1: mflops = 77.2736 (norm. = 0.456715), norm. avg. (of 6) = 0.239445 fft 2: mflops = 70.9656 (norm. = 0.419432), norm. avg. (of 6) = 0.220804 fft 3: mflops = 105.606 (norm. = 0.624172), norm. avg. (of 6) = 0.396012 fft 4: mflops = 62.4352 (norm. = 0.369014), norm. avg. (of 6) = 0.254482 fft 5: mflops = 169.194 (norm. = 1), norm. avg. (of 6) = 0.998386 fft 6: mflops = 147.02 (norm. = 0.868943), norm. avg. (of 6) = 0.93866 fft 7: mflops = 38.1538 (norm. = 0.225503), norm. avg. (of 6) = 0.124858 fft 8: mflops = 69.1473 (norm. = 0.408686), norm. avg. (of 6) = 0.24159 fft 9: mflops = 22.1217 (norm. = 0.130747), norm. avg. (of 6) = 0.0863168 fft 10: mflops = 33.788 (norm. = 0.199699), norm. avg. (of 6) = 0.135748 fft 11: mflops = 33.5119 (norm. = 0.198068), norm. avg. (of 6) = 0.136326 fft 12: mflops = 46.8592 (norm. = 0.276955), norm. avg. (of 6) = 0.168092 fft 13: mflops = 31.0868 (norm. = 0.183734), norm. avg. (of 6) = 0.10963 fft 14: mflops = 5.87254 (norm. = 0.0347088), norm. avg. 
(of 6) = 0.0281182 Benchmarking for array size = 36: 0. Brenner: elapsed time t=1.30915 s, 16384 iters, t-(init.)=1.30041 s t(norm)=0.426455, mflops=11.7246 (err=5.3e-16) 1. CWP (min N): elapsed time t=1.34461 s, 131072 iters, t-(init.)=1.27632 s t(norm)=0.0523194, mflops=95.5669 2. CWP (best N): elapsed time t=1.33944 s, 131072 iters, t-(init.)=1.27021 s t(norm)=0.0520688, mflops=96.0268 3. FFTPACK: elapsed time t=1.02042 s, 131072 iters, t-(init.)=0.955337 s t(norm)=0.0391616, mflops=127.676 (err=3.8e-16) 4. FFTPACK (f2c): elapsed time t=1.79854 s, 131072 iters, t-(init.)=1.73188 s t(norm)=0.0709938, mflops=70.4287 (err=4.7e-16) FFTW_MEASURE plan: (cost = 4.467719e-06) FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.24568 s, 262144 iters, t-(init.)=1.10826 s t(norm)=0.0227152, mflops=220.117 (err=4.4e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.28441 s, 262144 iters, t-(init.)=1.15 s t(norm)=0.0235707, mflops=212.128 (err=4.4e-16) 7. Frigo-old: elapsed time t=1.24137 s, 32768 iters, t-(init.)=1.22322 s t(norm)=0.20057, mflops=24.9289 (err=5.9e-16) 8. GSL: elapsed time t=1.65652 s, 131072 iters, t-(init.)=1.58676 s t(norm)=0.065045, mflops=76.8699 (err=4.3e-16) 9. NAPACK (f2c): elapsed time t=1.07596 s, 32768 iters, t-(init.)=1.05978 s t(norm)=0.173771, mflops=28.7735 (err=1.4e-15) 10. Singleton: elapsed time t=1.11128 s, 65536 iters, t-(init.)=1.07712 s t(norm)=0.0883072, mflops=56.6205 (err=4.7e-16) 11. Singleton (f2c): elapsed time t=1.0822 s, 65536 iters, t-(init.)=1.04752 s t(norm)=0.085881, mflops=58.2201 (err=4.7e-16) 12. Temperton: elapsed time t=1.04373 s, 65536 iters, t-(init.)=1.00859 s t(norm)=0.0826889, mflops=60.4676 (err=5.1e-08) 13. Temperton (f2c): elapsed time t=1.42931 s, 65536 iters, t-(init.)=1.39625 s t(norm)=0.114472, mflops=43.679 (err=3.7e-16) 14. 
Valkenburg: elapsed time t=1.3443 s, 8192 iters, t-(init.)=1.34016 s t(norm)=0.87898, mflops=5.68841 (err=6.2e-16) Top mflops for N=36 = 220.117 Normalized results and averages for N=36: fft 0: mflops = 11.7246 (norm. = 0.0532652), norm. avg. (of 4) = 0.0492435 fft 1: mflops = 95.5669 (norm. = 0.434165), norm. avg. (of 7) = 0.267262 fft 2: mflops = 96.0268 (norm. = 0.436254), norm. avg. (of 7) = 0.251583 fft 3: mflops = 127.676 (norm. = 0.580038), norm. avg. (of 7) = 0.422301 fft 4: mflops = 70.4287 (norm. = 0.319961), norm. avg. (of 7) = 0.263836 fft 5: mflops = 220.117 (norm. = 1), norm. avg. (of 7) = 0.998617 fft 6: mflops = 212.128 (norm. = 0.963706), norm. avg. (of 7) = 0.942238 fft 7: mflops = 24.9289 (norm. = 0.113253), norm. avg. (of 7) = 0.1232 fft 8: mflops = 76.8699 (norm. = 0.349223), norm. avg. (of 7) = 0.256966 fft 9: mflops = 28.7735 (norm. = 0.130719), norm. avg. (of 7) = 0.09266 fft 10: mflops = 56.6205 (norm. = 0.257229), norm. avg. (of 7) = 0.153102 fft 11: mflops = 58.2201 (norm. = 0.264496), norm. avg. (of 7) = 0.154636 fft 12: mflops = 60.4676 (norm. = 0.274707), norm. avg. (of 7) = 0.183323 fft 13: mflops = 43.679 (norm. = 0.198435), norm. avg. (of 7) = 0.122316 fft 14: mflops = 5.68841 (norm. = 0.0258427), norm. avg. (of 7) = 0.0277931 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.00528 s, 8192 iters, t-(init.)=0.996341 s t(norm)=0.24048, mflops=20.7918 (err=3.9e-16) 1. CWP (min N): elapsed time t=1.37121 s, 65536 iters, t-(init.)=1.3011 s t(norm)=0.0392547, mflops=127.373 2. CWP (best N) (N=84): elapsed time t=1.04627 s, 65536 iters, t-(init.)=0.973522 s t(norm)=0.0293715, mflops=170.233 3. FFTPACK: elapsed time t=1.13603 s, 65536 iters, t-(init.)=1.0675 s t(norm)=0.0322068, mflops=155.247 (err=3.2e-16) 4. FFTPACK (f2c): elapsed time t=1.73088 s, 65536 iters, t-(init.)=1.66149 s t(norm)=0.0501276, mflops=99.7454 (err=4.0e-16) FFTW_MEASURE plan: (cost = 1.044750e-05) FFTW_TWIDDLE 5 FFTW_NOTW 16 5. 
FFTW: elapsed time t=1.42435 s, 131072 iters, t-(init.)=1.28605 s t(norm)=0.0194002, mflops=257.729 (err=3.6e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 6. FFTW_ESTIMATE: elapsed time t=1.42937 s, 131072 iters, t-(init.)=1.29349 s t(norm)=0.0195125, mflops=256.246 (err=3.6e-16) 7. Frigo-old: elapsed time t=1.90484 s, 32768 iters, t-(init.)=1.87024 s t(norm)=0.112852, mflops=44.3059 (err=3.5e-16) 8. GSL: elapsed time t=1.34065 s, 32768 iters, t-(init.)=1.30656 s t(norm)=0.0788389, mflops=63.4205 (err=3.2e-16) 9. NAPACK (f2c): elapsed time t=1.12116 s, 8192 iters, t-(init.)=1.11277 s t(norm)=0.268582, mflops=18.6163 (err=5.0e-16) 10. Singleton: elapsed time t=1.85371 s, 65536 iters, t-(init.)=1.78571 s t(norm)=0.0538754, mflops=92.8068 (err=4.4e-16) 11. Singleton (f2c): elapsed time t=1.78991 s, 65536 iters, t-(init.)=1.7203 s t(norm)=0.0519022, mflops=96.3351 (err=4.4e-16) 12. Temperton: elapsed time t=1.91214 s, 65536 iters, t-(init.)=1.84202 s t(norm)=0.0555745, mflops=89.9694 (err=5.3e-08) 13. Temperton (f2c): elapsed time t=1.47372 s, 32768 iters, t-(init.)=1.43923 s t(norm)=0.0868444, mflops=57.5743 (err=3.4e-16) 14. Valkenburg: elapsed time t=1.94888 s, 4096 iters, t-(init.)=1.94443 s t(norm)=0.938629, mflops=5.32692 (err=4.6e-16) Top mflops for N=80 = 257.729 Normalized results and averages for N=80: fft 0: mflops = 20.7918 (norm. = 0.0806731), norm. avg. (of 5) = 0.0555295 fft 1: mflops = 127.373 (norm. = 0.494215), norm. avg. (of 8) = 0.295631 fft 2: mflops = 170.233 (norm. = 0.660512), norm. avg. (of 8) = 0.302699 fft 3: mflops = 155.247 (norm. = 0.602366), norm. avg. (of 8) = 0.444809 fft 4: mflops = 99.7454 (norm. = 0.387017), norm. avg. (of 8) = 0.279234 fft 5: mflops = 257.729 (norm. = 1), norm. avg. (of 8) = 0.99879 fft 6: mflops = 256.246 (norm. = 0.994247), norm. avg. (of 8) = 0.948739 fft 7: mflops = 44.3059 (norm. = 0.171909), norm. avg. (of 8) = 0.129289 fft 8: mflops = 63.4205 (norm. = 0.246075), norm. avg. 
(of 8) = 0.255605 fft 9: mflops = 18.6163 (norm. = 0.0722321), norm. avg. (of 8) = 0.0901066 fft 10: mflops = 92.8068 (norm. = 0.360095), norm. avg. (of 8) = 0.178976 fft 11: mflops = 96.3351 (norm. = 0.373785), norm. avg. (of 8) = 0.18203 fft 12: mflops = 89.9694 (norm. = 0.349086), norm. avg. (of 8) = 0.204043 fft 13: mflops = 57.5743 (norm. = 0.223391), norm. avg. (of 8) = 0.134951 fft 14: mflops = 5.32692 (norm. = 0.0206687), norm. avg. (of 8) = 0.0269025 Benchmarking for array size = 108: 0. Brenner: elapsed time t=1.16096 s, 4096 iters, t-(init.)=1.15508 s t(norm)=0.386555, mflops=12.9348 (err=6.4e-16) 1. CWP (min N) (N=110): elapsed time t=1.11431 s, 32768 iters, t-(init.)=1.06846 s t(norm)=0.0446957, mflops=111.868 2. CWP (best N) (N=112): elapsed time t=1.86786 s, 65536 iters, t-(init.)=1.77512 s t(norm)=0.0371285, mflops=134.668 3. FFTPACK: elapsed time t=1.48857 s, 65536 iters, t-(init.)=1.39566 s t(norm)=0.0291916, mflops=171.282 (err=3.8e-16) 4. FFTPACK (f2c): elapsed time t=1.50163 s, 32768 iters, t-(init.)=1.45648 s t(norm)=0.0609276, mflops=82.0646 (err=4.0e-16) FFTW_MEASURE plan: (cost = 1.690312e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 5. FFTW: elapsed time t=1.0308 s, 65536 iters, t-(init.)=0.940401 s t(norm)=0.0196694, mflops=254.202 (err=3.7e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.02488 s, 65536 iters, t-(init.)=0.934966 s t(norm)=0.0195557, mflops=255.68 (err=3.7e-16) 7. Frigo-old: elapsed time t=1.29646 s, 8192 iters, t-(init.)=1.28499 s t(norm)=0.215014, mflops=23.2543 (err=5.7e-16) 8. GSL: elapsed time t=1.74883 s, 32768 iters, t-(init.)=1.70175 s t(norm)=0.0711875, mflops=70.2371 (err=4.0e-16) 9. NAPACK (f2c): elapsed time t=1.59939 s, 16384 iters, t-(init.)=1.5768 s t(norm)=0.131921, mflops=37.9015 (err=3.1e-15) 10. Singleton: elapsed time t=1.81879 s, 32768 iters, t-(init.)=1.77315 s t(norm)=0.0741745, mflops=67.4086 (err=4.5e-16) 11. 
Singleton (f2c): elapsed time t=1.77925 s, 32768 iters, t-(init.)=1.7334 s t(norm)=0.0725113, mflops=68.9548 (err=4.5e-16) 12. Temperton: elapsed time t=1.38057 s, 32768 iters, t-(init.)=1.33591 s t(norm)=0.0558836, mflops=89.4717 (err=7.4e-08) 13. Temperton (f2c): elapsed time t=1.99379 s, 32768 iters, t-(init.)=1.94821 s t(norm)=0.0814973, mflops=61.3517 (err=3.5e-16) 14. Valkenburg: elapsed time t=1.32322 s, 2048 iters, t-(init.)=1.32037 s t(norm)=0.88374, mflops=5.65778 (err=6.6e-16) Top mflops for N=108 = 255.68 Normalized results and averages for N=108: fft 0: mflops = 12.9348 (norm. = 0.0505898), norm. avg. (of 6) = 0.0547062 fft 1: mflops = 111.868 (norm. = 0.43753), norm. avg. (of 9) = 0.311398 fft 2: mflops = 134.668 (norm. = 0.526704), norm. avg. (of 9) = 0.327588 fft 3: mflops = 171.282 (norm. = 0.669909), norm. avg. (of 9) = 0.469821 fft 4: mflops = 82.0646 (norm. = 0.320967), norm. avg. (of 9) = 0.283871 fft 5: mflops = 254.202 (norm. = 0.994221), norm. avg. (of 9) = 0.998282 fft 6: mflops = 255.68 (norm. = 1), norm. avg. (of 9) = 0.954434 fft 7: mflops = 23.2543 (norm. = 0.090951), norm. avg. (of 9) = 0.125029 fft 8: mflops = 70.2371 (norm. = 0.274707), norm. avg. (of 9) = 0.257727 fft 9: mflops = 37.9015 (norm. = 0.148238), norm. avg. (of 9) = 0.0965656 fft 10: mflops = 67.4086 (norm. = 0.263645), norm. avg. (of 9) = 0.188384 fft 11: mflops = 68.9548 (norm. = 0.269692), norm. avg. (of 9) = 0.19177 fft 12: mflops = 89.4717 (norm. = 0.349937), norm. avg. (of 9) = 0.220254 fft 13: mflops = 61.3517 (norm. = 0.239955), norm. avg. (of 9) = 0.146618 fft 14: mflops = 5.65778 (norm. = 0.0221284), norm. avg. (of 9) = 0.0263721 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1.94895 s, 4096 iters, t-(init.)=1.93816 s t(norm)=0.292091, mflops=17.118 (err=6.7e-16) 1. CWP (min N): elapsed time t=1.37026 s, 32768 iters, t-(init.)=1.2825 s t(norm)=0.02416, mflops=206.954 2. 
CWP (best N): elapsed time t=1.36633 s, 32768 iters, t-(init.)=1.2806 s t(norm)=0.0241241, mflops=207.262 3. FFTPACK: elapsed time t=1.21407 s, 16384 iters, t-(init.)=1.17119 s t(norm)=0.044126, mflops=113.312 (err=5.0e-16) 4. FFTPACK (f2c): elapsed time t=1.145 s, 8192 iters, t-(init.)=1.12372 s t(norm)=0.0846752, mflops=59.0491 (err=6.4e-16) FFTW_MEASURE plan: (cost = 4.817700e-05) FFTW_TWIDDLE 2 FFTW_TWIDDLE 7 FFTW_NOTW 15 5. FFTW: elapsed time t=1.93935 s, 32768 iters, t-(init.)=1.85215 s t(norm)=0.034891, mflops=143.303 (err=4.6e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.873 s, 32768 iters, t-(init.)=1.78719 s t(norm)=0.0336672, mflops=148.513 (err=4.7e-16) 7. Frigo-old: elapsed time t=1.54717 s, 4096 iters, t-(init.)=1.53647 s t(norm)=0.231553, mflops=21.5933 (err=5.9e-16) 8. GSL: elapsed time t=1.23668 s, 8192 iters, t-(init.)=1.21501 s t(norm)=0.0915539, mflops=54.6127 (err=6.3e-16) 9. NAPACK (f2c): elapsed time t=1.00825 s, 2048 iters, t-(init.)=1.00291 s t(norm)=0.302287, mflops=16.5405 (err=1.5e-14) 10. Singleton: elapsed time t=1.12303 s, 8192 iters, t-(init.)=1.10127 s t(norm)=0.082983, mflops=60.2533 (err=6.4e-16) 11. Singleton (f2c): elapsed time t=1.21217 s, 8192 iters, t-(init.)=1.1909 s t(norm)=0.0897371, mflops=55.7183 (err=6.4e-16) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.89354 s, 1024 iters, t-(init.)=1.89082 s t(norm)=1.13982, mflops=4.38666 (err=7.1e-16) Top mflops for N=210 = 207.262 Normalized results and averages for N=210: fft 0: mflops = 17.118 (norm. = 0.0825911), norm. avg. (of 7) = 0.0586897 fft 1: mflops = 206.954 (norm. = 0.998515), norm. avg. (of 10) = 0.38011 fft 2: mflops = 207.262 (norm. = 1), norm. avg. (of 10) = 0.394829 fft 3: mflops = 113.312 (norm. = 0.546709), norm. avg. (of 10) = 0.477509 fft 4: mflops = 59.0491 (norm. 
= 0.284901), norm. avg. (of 10) = 0.283974 fft 5: mflops = 143.303 (norm. = 0.691413), norm. avg. (of 10) = 0.967595 fft 6: mflops = 148.513 (norm. = 0.716546), norm. avg. (of 10) = 0.930646 fft 7: mflops = 21.5933 (norm. = 0.104184), norm. avg. (of 10) = 0.122944 fft 8: mflops = 54.6127 (norm. = 0.263496), norm. avg. (of 10) = 0.258304 fft 9: mflops = 16.5405 (norm. = 0.0798051), norm. avg. (of 10) = 0.0948896 fft 10: mflops = 60.2533 (norm. = 0.290711), norm. avg. (of 10) = 0.198617 fft 11: mflops = 55.7183 (norm. = 0.268831), norm. avg. (of 10) = 0.199476 fft 12: mflops = -1 (norm. = -0.00482482), norm. avg. (of 9) = 0.220254 fft 13: mflops = -1 (norm. = -0.00482482), norm. avg. (of 9) = 0.146618 fft 14: mflops = 4.38666 (norm. = 0.0211648), norm. avg. (of 10) = 0.0258514 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.22949 s, 1024 iters, t-(init.)=1.22314 s t(norm)=0.263998, mflops=18.9395 (err=1.5e-15) 1. CWP (min N): elapsed time t=1.98873 s, 16384 iters, t-(init.)=1.8877 s t(norm)=0.0254646, mflops=196.351 2. CWP (best N): elapsed time t=1.98946 s, 16384 iters, t-(init.)=1.88823 s t(norm)=0.0254718, mflops=196.296 3. FFTPACK: elapsed time t=1.64886 s, 8192 iters, t-(init.)=1.59791 s t(norm)=0.0431109, mflops=115.98 (err=1.2e-15) 4. FFTPACK (f2c): elapsed time t=1.66645 s, 4096 iters, t-(init.)=1.64123 s t(norm)=0.0885595, mflops=56.4592 (err=1.3e-15) FFTW_MEASURE plan: (cost = 1.557410e-04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 7 FFTW_NOTW 12 5. FFTW: elapsed time t=1.15933 s, 8192 iters, t-(init.)=1.10844 s t(norm)=0.0299052, mflops=167.195 (err=1.3e-15) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.23221 s, 8192 iters, t-(init.)=1.18144 s t(norm)=0.0318746, mflops=156.865 (err=1.2e-15) 7. Frigo-old: elapsed time t=1.69957 s, 2048 iters, t-(init.)=1.68679 s t(norm)=0.182035, mflops=27.4672 (err=1.3e-15) 8. 
GSL: elapsed time t=1.58391 s, 4096 iters, t-(init.)=1.55864 s t(norm)=0.084103, mflops=59.4509 (err=1.3e-15) 9. NAPACK (f2c): elapsed time t=1.09759 s, 1024 iters, t-(init.)=1.09123 s t(norm)=0.235527, mflops=21.229 (err=4.1e-14) 10. Singleton: elapsed time t=1.35557 s, 4096 iters, t-(init.)=1.32999 s t(norm)=0.0717653, mflops=69.6716 (err=1.9e-15) 11. Singleton (f2c): elapsed time t=1.36168 s, 4096 iters, t-(init.)=1.33644 s t(norm)=0.0721133, mflops=69.3354 (err=1.9e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.19456 s, 256 iters, t-(init.)=1.19297 s t(norm)=1.02994, mflops=4.85464 (err=1.4e-15) Top mflops for N=504 = 196.351 Normalized results and averages for N=504: fft 0: mflops = 18.9395 (norm. = 0.0964576), norm. avg. (of 8) = 0.0634107 fft 1: mflops = 196.351 (norm. = 1), norm. avg. (of 11) = 0.436463 fft 2: mflops = 196.296 (norm. = 0.999719), norm. avg. (of 11) = 0.449819 fft 3: mflops = 115.98 (norm. = 0.590678), norm. avg. (of 11) = 0.487797 fft 4: mflops = 56.4592 (norm. = 0.287543), norm. avg. (of 11) = 0.284298 fft 5: mflops = 167.195 (norm. = 0.851511), norm. avg. (of 11) = 0.957042 fft 6: mflops = 156.865 (norm. = 0.7989), norm. avg. (of 11) = 0.918669 fft 7: mflops = 27.4672 (norm. = 0.139889), norm. avg. (of 11) = 0.124485 fft 8: mflops = 59.4509 (norm. = 0.302779), norm. avg. (of 11) = 0.262347 fft 9: mflops = 21.229 (norm. = 0.108118), norm. avg. (of 11) = 0.0960922 fft 10: mflops = 69.6716 (norm. = 0.354832), norm. avg. (of 11) = 0.212818 fft 11: mflops = 69.3354 (norm. = 0.35312), norm. avg. (of 11) = 0.213444 fft 12: mflops = -1 (norm. = -0.00509293), norm. avg. (of 9) = 0.220254 fft 13: mflops = -1 (norm. = -0.00509293), norm. avg. (of 9) = 0.146618 fft 14: mflops = 4.85464 (norm. = 0.0247243), norm. avg. (of 11) = 0.0257489 Benchmarking for array size = 1000: 0. 
Brenner: elapsed time t=1.42648 s, 512 iters, t-(init.)=1.42023 s t(norm)=0.27834, mflops=17.9636 (err=1.1e-15) 1. CWP (min N) (N=1001): elapsed time t=1.89063 s, 4096 iters, t-(init.)=1.84058 s t(norm)=0.0450904, mflops=110.888 2. CWP (best N) (N=1008): elapsed time t=1.25067 s, 4096 iters, t-(init.)=1.20024 s t(norm)=0.0294032, mflops=170.049 3. FFTPACK: elapsed time t=1.71729 s, 4096 iters, t-(init.)=1.6671 s t(norm)=0.0408403, mflops=122.428 (err=1.0e-15) 4. FFTPACK (f2c): elapsed time t=1.42577 s, 2048 iters, t-(init.)=1.40081 s t(norm)=0.0686336, mflops=72.8506 (err=1.1e-15) FFTW_MEASURE plan: (cost = 3.241270e-04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 5. FFTW: elapsed time t=1.34601 s, 4096 iters, t-(init.)=1.29603 s t(norm)=0.0317499, mflops=157.481 (err=9.7e-16) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.34987 s, 4096 iters, t-(init.)=1.29967 s t(norm)=0.0318392, mflops=157.039 (err=9.7e-16) 7. Frigo-old: elapsed time t=1.884 s, 1024 iters, t-(init.)=1.87153 s t(norm)=0.183394, mflops=27.2637 (err=1.0e-15) 8. GSL: elapsed time t=1.87027 s, 2048 iters, t-(init.)=1.84508 s t(norm)=0.0904009, mflops=55.3092 (err=1.0e-15) 9. NAPACK (f2c): elapsed time t=1.57311 s, 512 iters, t-(init.)=1.56686 s t(norm)=0.307078, mflops=16.2825 (err=1.7e-14) 10. Singleton: elapsed time t=1.08533 s, 2048 iters, t-(init.)=1.06037 s t(norm)=0.0519537, mflops=96.2396 (err=1.5e-15) 11. Singleton (f2c): elapsed time t=1.05196 s, 2048 iters, t-(init.)=1.02697 s t(norm)=0.0503172, mflops=99.3697 (err=1.5e-15) 12. Temperton: elapsed time t=2.00123 s, 4096 iters, t-(init.)=1.95131 s t(norm)=0.0478029, mflops=104.596 (err=1.3e-07) 13. Temperton (f2c): elapsed time t=1.59313 s, 2048 iters, t-(init.)=1.56821 s t(norm)=0.0768358, mflops=65.0739 (err=9.9e-16) 14. 
Valkenburg: elapsed time t=1.34628 s, 128 iters, t-(init.)=1.34471 s t(norm)=1.05417, mflops=4.74309 (err=1.1e-15) Top mflops for N=1000 = 170.049 Normalized results and averages for N=1000: fft 0: mflops = 17.9636 (norm. = 0.105638), norm. avg. (of 9) = 0.0681026 fft 1: mflops = 110.888 (norm. = 0.652095), norm. avg. (of 12) = 0.454433 fft 2: mflops = 170.049 (norm. = 1), norm. avg. (of 12) = 0.495668 fft 3: mflops = 122.428 (norm. = 0.719956), norm. avg. (of 12) = 0.507144 fft 4: mflops = 72.8506 (norm. = 0.428409), norm. avg. (of 12) = 0.296307 fft 5: mflops = 157.481 (norm. = 0.926088), norm. avg. (of 12) = 0.954463 fft 6: mflops = 157.039 (norm. = 0.923492), norm. avg. (of 12) = 0.919071 fft 7: mflops = 27.2637 (norm. = 0.160328), norm. avg. (of 12) = 0.127472 fft 8: mflops = 55.3092 (norm. = 0.325254), norm. avg. (of 12) = 0.26759 fft 9: mflops = 16.2825 (norm. = 0.0957516), norm. avg. (of 12) = 0.0960638 fft 10: mflops = 96.2396 (norm. = 0.565951), norm. avg. (of 12) = 0.242246 fft 11: mflops = 99.3697 (norm. = 0.584358), norm. avg. (of 12) = 0.244353 fft 12: mflops = 104.596 (norm. = 0.615094), norm. avg. (of 10) = 0.259738 fft 13: mflops = 65.0739 (norm. = 0.382677), norm. avg. (of 10) = 0.170224 fft 14: mflops = 4.74309 (norm. = 0.0278924), norm. avg. (of 12) = 0.0259275 Benchmarking for array size = 1960: 0. Brenner: elapsed time t=1.38461 s, 256 iters, t-(init.)=1.37852 s t(norm)=0.251208, mflops=19.9039 (err=2.9e-15) 1. CWP (min N) (N=1980): elapsed time t=1.65468 s, 2048 iters, t-(init.)=1.60548 s t(norm)=0.0365709, mflops=136.721 2. CWP (best N) (N=1980): elapsed time t=1.65981 s, 2048 iters, t-(init.)=1.61057 s t(norm)=0.0366868, mflops=136.289 3. FFTPACK: elapsed time t=1.20219 s, 1024 iters, t-(init.)=1.17785 s t(norm)=0.0536601, mflops=93.1792 (err=2.8e-15) 4. 
FFTPACK (f2c): elapsed time t=1.35473 s, 512 iters, t-(init.)=1.34256 s t(norm)=0.122328, mflops=40.8738 (err=2.8e-15) FFTW_MEASURE plan: (cost = 8.775420e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 7 5. FFTW: elapsed time t=1.75036 s, 2048 iters, t-(init.)=1.70144 s t(norm)=0.0387566, mflops=129.01 (err=2.8e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.8325 s, 2048 iters, t-(init.)=1.78381 s t(norm)=0.0406329, mflops=123.053 (err=2.8e-15) 7. Frigo-old: elapsed time t=1.08327 s, 256 iters, t-(init.)=1.07715 s t(norm)=0.196288, mflops=25.4727 (err=2.8e-15) 8. GSL: elapsed time t=1.05344 s, 512 iters, t-(init.)=1.04121 s t(norm)=0.0948699, mflops=52.7038 (err=2.8e-15) 9. NAPACK (f2c): elapsed time t=1.02626 s, 128 iters, t-(init.)=1.02322 s t(norm)=0.372923, mflops=13.4076 (err=1.3e-13) 10. Singleton: elapsed time t=1.82708 s, 1024 iters, t-(init.)=1.8027 s t(norm)=0.0821268, mflops=60.8815 (err=4.3e-15) 11. Singleton (f2c): elapsed time t=1.76775 s, 1024 iters, t-(init.)=1.74339 s t(norm)=0.0794245, mflops=62.9529 (err=4.3e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.64599 s, 64 iters, t-(init.)=1.64447 s t(norm)=1.19869, mflops=4.17124 (err=2.7e-15) Top mflops for N=1960 = 136.721 Normalized results and averages for N=1960: fft 0: mflops = 19.9039 (norm. = 0.145581), norm. avg. (of 10) = 0.0758504 fft 1: mflops = 136.721 (norm. = 1), norm. avg. (of 13) = 0.496399 fft 2: mflops = 136.289 (norm. = 0.996843), norm. avg. (of 13) = 0.53422 fft 3: mflops = 93.1792 (norm. = 0.68153), norm. avg. (of 13) = 0.520558 fft 4: mflops = 40.8738 (norm. = 0.298959), norm. avg. (of 13) = 0.296511 fft 5: mflops = 129.01 (norm. = 0.943605), norm. avg. (of 13) = 0.953627 fft 6: mflops = 123.053 (norm. = 0.900032), norm. avg. 
(of 13) = 0.917606 fft 7: mflops = 25.4727 (norm. = 0.186312), norm. avg. (of 13) = 0.131998 fft 8: mflops = 52.7038 (norm. = 0.385485), norm. avg. (of 13) = 0.276659 fft 9: mflops = 13.4076 (norm. = 0.0980658), norm. avg. (of 13) = 0.0962178 fft 10: mflops = 60.8815 (norm. = 0.445299), norm. avg. (of 13) = 0.257865 fft 11: mflops = 62.9529 (norm. = 0.460449), norm. avg. (of 13) = 0.260976 fft 12: mflops = -1 (norm. = -0.00731419), norm. avg. (of 10) = 0.259738 fft 13: mflops = -1 (norm. = -0.00731419), norm. avg. (of 10) = 0.170224 fft 14: mflops = 4.17124 (norm. = 0.0305092), norm. avg. (of 13) = 0.02628 Benchmarking for array size = 4725: 0. Brenner: elapsed time t=1.19539 s, 64 iters, t-(init.)=1.19105 s t(norm)=0.32268, mflops=15.4952 (err=1.9e-15) 1. CWP (min N) (N=5005): elapsed time t=1.43355 s, 512 iters, t-(init.)=1.40244 s t(norm)=0.0474937, mflops=105.277 2. CWP (best N) (N=5040): elapsed time t=1.03828 s, 512 iters, t-(init.)=1.00702 s t(norm)=0.0341029, mflops=146.615 3. FFTPACK: elapsed time t=1.28996 s, 512 iters, t-(init.)=1.26066 s t(norm)=0.0426923, mflops=117.117 (err=1.8e-15) 4. FFTPACK (f2c): elapsed time t=1.42774 s, 256 iters, t-(init.)=1.41304 s t(norm)=0.0957054, mflops=52.2437 (err=1.9e-15) FFTW_MEASURE plan: (cost = 2.144370e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.08238 s, 512 iters, t-(init.)=1.05302 s t(norm)=0.0356606, mflops=140.211 (err=1.8e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.0876 s, 512 iters, t-(init.)=1.0583 s t(norm)=0.0358394, mflops=139.511 (err=1.8e-15) 7. Frigo-old: elapsed time t=1.01572 s, 64 iters, t-(init.)=1.01205 s t(norm)=0.274185, mflops=18.2359 (err=1.9e-15) 8. GSL: elapsed time t=1.36749 s, 256 iters, t-(init.)=1.35282 s t(norm)=0.0916265, mflops=54.5694 (err=1.9e-15) 9. 
NAPACK (f2c): elapsed time t=1.19215 s, 64 iters, t-(init.)=1.18849 s t(norm)=0.321985, mflops=15.5287 (err=3.5e-13) 10. Singleton: elapsed time t=1.19473 s, 256 iters, t-(init.)=1.18006 s t(norm)=0.0799254, mflops=62.5583 (err=2.4e-15) 11. Singleton (f2c): elapsed time t=1.25298 s, 256 iters, t-(init.)=1.2383 s t(norm)=0.0838703, mflops=59.6159 (err=2.4e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.0471 s, 16 iters, t-(init.)=1.04618 s t(norm)=1.13373, mflops=4.41023 (err=1.8e-15) Top mflops for N=4725 = 146.615 Normalized results and averages for N=4725: fft 0: mflops = 15.4952 (norm. = 0.105686), norm. avg. (of 11) = 0.0785628 fft 1: mflops = 105.277 (norm. = 0.718052), norm. avg. (of 14) = 0.512232 fft 2: mflops = 146.615 (norm. = 1), norm. avg. (of 14) = 0.56749 fft 3: mflops = 117.117 (norm. = 0.798807), norm. avg. (of 14) = 0.540433 fft 4: mflops = 52.2437 (norm. = 0.356332), norm. avg. (of 14) = 0.300784 fft 5: mflops = 140.211 (norm. = 0.956319), norm. avg. (of 14) = 0.95382 fft 6: mflops = 139.511 (norm. = 0.951548), norm. avg. (of 14) = 0.920031 fft 7: mflops = 18.2359 (norm. = 0.124379), norm. avg. (of 14) = 0.131454 fft 8: mflops = 54.5694 (norm. = 0.372195), norm. avg. (of 14) = 0.283483 fft 9: mflops = 15.5287 (norm. = 0.105915), norm. avg. (of 14) = 0.0969104 fft 10: mflops = 62.5583 (norm. = 0.426684), norm. avg. (of 14) = 0.269924 fft 11: mflops = 59.6159 (norm. = 0.406615), norm. avg. (of 14) = 0.271379 fft 12: mflops = -1 (norm. = -0.00682058), norm. avg. (of 10) = 0.259738 fft 13: mflops = -1 (norm. = -0.00682058), norm. avg. (of 10) = 0.170224 fft 14: mflops = 4.41023 (norm. = 0.0300803), norm. avg. (of 14) = 0.0265514 Benchmarking for array size = 10368: 0. Brenner: elapsed time t=1.18421 s, 32 iters, t-(init.)=1.18019 s t(norm)=0.266658, mflops=18.7506 (err=3.1e-15) 1. 
CWP (min N) (N=10920): elapsed time t=1.40812 s, 256 iters, t-(init.)=1.37424 s t(norm)=0.0388129, mflops=128.823 2. CWP (best N) (N=11088): elapsed time t=1.47207 s, 256 iters, t-(init.)=1.43771 s t(norm)=0.0406057, mflops=123.136 3. FFTPACK: elapsed time t=1.24029 s, 256 iters, t-(init.)=1.20815 s t(norm)=0.0341221, mflops=146.533 (err=3.0e-15) 4. FFTPACK (f2c): elapsed time t=1.26399 s, 128 iters, t-(init.)=1.24789 s t(norm)=0.0704891, mflops=70.933 (err=3.0e-15) FFTW_MEASURE plan: (cost = 4.007904e-03) FFTW_TWIDDLE 16 FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_NOTW 12 5. FFTW: elapsed time t=1.00664 s, 256 iters, t-(init.)=0.974499 s t(norm)=0.027523, mflops=181.666 (err=3.0e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.13621 s, 256 iters, t-(init.)=1.10401 s t(norm)=0.0311809, mflops=160.354 (err=3.0e-15) 7. Frigo-old: elapsed time t=1.64695 s, 64 iters, t-(init.)=1.63889 s t(norm)=0.18515, mflops=27.0052 (err=3.1e-15) 8. GSL: elapsed time t=1.21885 s, 128 iters, t-(init.)=1.20273 s t(norm)=0.067938, mflops=73.5965 (err=3.0e-15) 9. NAPACK (f2c): elapsed time t=1.57278 s, 64 iters, t-(init.)=1.56475 s t(norm)=0.176774, mflops=28.2847 (err=8.1e-14) 10. Singleton: elapsed time t=1.09345 s, 128 iters, t-(init.)=1.07738 s t(norm)=0.0608575, mflops=82.1591 (err=4.4e-15) 11. Singleton (f2c): elapsed time t=1.17577 s, 128 iters, t-(init.)=1.15968 s t(norm)=0.0655061, mflops=76.3287 (err=4.4e-15) 12. Temperton: elapsed time t=1.90898 s, 256 iters, t-(init.)=1.87683 s t(norm)=0.0530078, mflops=94.3258 (err=2.1e-07) 13. Temperton (f2c): elapsed time t=1.32964 s, 128 iters, t-(init.)=1.31358 s t(norm)=0.0741993, mflops=67.3861 (err=3.0e-15) 14. Valkenburg: elapsed time t=1.98286 s, 16 iters, t-(init.)=1.98085 s t(norm)=0.895129, mflops=5.58579 (err=3.0e-15) Top mflops for N=10368 = 181.666 Normalized results and averages for N=10368: fft 0: mflops = 18.7506 (norm. = 0.103214), norm. avg. 
(of 12) = 0.0806171 fft 1: mflops = 128.823 (norm. = 0.709119), norm. avg. (of 15) = 0.525358 fft 2: mflops = 123.136 (norm. = 0.677811), norm. avg. (of 15) = 0.574845 fft 3: mflops = 146.533 (norm. = 0.806603), norm. avg. (of 15) = 0.558178 fft 4: mflops = 70.933 (norm. = 0.390457), norm. avg. (of 15) = 0.306762 fft 5: mflops = 181.666 (norm. = 1), norm. avg. (of 15) = 0.956898 fft 6: mflops = 160.354 (norm. = 0.882686), norm. avg. (of 15) = 0.917541 fft 7: mflops = 27.0052 (norm. = 0.148653), norm. avg. (of 15) = 0.1326 fft 8: mflops = 73.5965 (norm. = 0.405119), norm. avg. (of 15) = 0.291592 fft 9: mflops = 28.2847 (norm. = 0.155696), norm. avg. (of 15) = 0.100829 fft 10: mflops = 82.1591 (norm. = 0.452253), norm. avg. (of 15) = 0.282079 fft 11: mflops = 76.3287 (norm. = 0.420159), norm. avg. (of 15) = 0.281297 fft 12: mflops = 94.3258 (norm. = 0.519225), norm. avg. (of 11) = 0.283328 fft 13: mflops = 67.3861 (norm. = 0.370933), norm. avg. (of 11) = 0.18847 fft 14: mflops = 5.58579 (norm. = 0.0307475), norm. avg. (of 15) = 0.0268312 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.11023 s, 8 iters, t-(init.)=1.10759 s t(norm)=0.348337, mflops=14.3539 (err=5.6e-15) 1. CWP (min N) (N=27720): elapsed time t=1.87006 s, 128 iters, t-(init.)=1.82711 s t(norm)=0.035914, mflops=139.222 2. CWP (best N) (N=27720): elapsed time t=1.8731 s, 128 iters, t-(init.)=1.83015 s t(norm)=0.0359737, mflops=138.99 3. FFTPACK: elapsed time t=1.056 s, 64 iters, t-(init.)=1.03507 s t(norm)=0.0406912, mflops=122.877 (err=5.5e-15) 4. FFTPACK (f2c): elapsed time t=1.90294 s, 64 iters, t-(init.)=1.88198 s t(norm)=0.0739848, mflops=67.5814 (err=5.5e-15) FFTW_MEASURE plan: (cost = 1.620794e-02) FFTW_TWIDDLE 3 FFTW_TWIDDLE 6 FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 15 5. 
FFTW: elapsed time t=1.0679 s, 64 iters, t-(init.)=1.04647 s t(norm)=0.0411391, mflops=121.539 (err=5.5e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.96389 s, 128 iters, t-(init.)=1.92183 s t(norm)=0.0377757, mflops=132.36 (err=5.6e-15) 7. Frigo-old: elapsed time t=1.55694 s, 16 iters, t-(init.)=1.55165 s t(norm)=0.243996, mflops=20.4922 (err=5.7e-15) 8. GSL: elapsed time t=1.94704 s, 64 iters, t-(init.)=1.92612 s t(norm)=0.0757201, mflops=66.0326 (err=5.5e-15) 9. NAPACK (f2c): elapsed time t=1.90052 s, 16 iters, t-(init.)=1.89526 s t(norm)=0.298028, mflops=16.777 (err=1.1e-12) 10. Singleton: elapsed time t=1.80392 s, 64 iters, t-(init.)=1.78297 s t(norm)=0.0700927, mflops=71.3341 (err=7.7e-15) 11. Singleton (f2c): elapsed time t=1.836 s, 64 iters, t-(init.)=1.81496 s t(norm)=0.0713503, mflops=70.0768 (err=7.7e-15) 12. Temperton: elapsed time t=1.36785 s, 64 iters, t-(init.)=1.34686 s t(norm)=0.0529483, mflops=94.4318 (err=1.4e-07) 13. Temperton (f2c): elapsed time t=1.97769 s, 64 iters, t-(init.)=1.95676 s t(norm)=0.0769249, mflops=64.9985 (err=5.6e-15) 14. Valkenburg: elapsed time t=1.64536 s, 4 iters, t-(init.)=1.64403 s t(norm)=1.03409, mflops=4.83518 (err=5.5e-15) Top mflops for N=27000 = 139.222 Normalized results and averages for N=27000: fft 0: mflops = 14.3539 (norm. = 0.103101), norm. avg. (of 13) = 0.0823466 fft 1: mflops = 139.222 (norm. = 1), norm. avg. (of 16) = 0.555023 fft 2: mflops = 138.99 (norm. = 0.998341), norm. avg. (of 16) = 0.601313 fft 3: mflops = 122.877 (norm. = 0.882599), norm. avg. (of 16) = 0.578454 fft 4: mflops = 67.5814 (norm. = 0.485424), norm. avg. (of 16) = 0.317929 fft 5: mflops = 121.539 (norm. = 0.87299), norm. avg. (of 16) = 0.951654 fft 6: mflops = 132.36 (norm. = 0.950716), norm. avg. (of 16) = 0.919614 fft 7: mflops = 20.4922 (norm. = 0.147191), norm. avg. (of 16) = 0.133512 fft 8: mflops = 66.0326 (norm. 
= 0.474299), norm. avg. (of 16) = 0.303011 fft 9: mflops = 16.777 (norm. = 0.120506), norm. avg. (of 16) = 0.102059 fft 10: mflops = 71.3341 (norm. = 0.512378), norm. avg. (of 16) = 0.296473 fft 11: mflops = 70.0768 (norm. = 0.503347), norm. avg. (of 16) = 0.295176 fft 12: mflops = 94.4318 (norm. = 0.678285), norm. avg. (of 12) = 0.316241 fft 13: mflops = 64.9985 (norm. = 0.466871), norm. avg. (of 12) = 0.21167 fft 14: mflops = 4.83518 (norm. = 0.0347301), norm. avg. (of 16) = 0.0273248 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.93416 s, 4 iters, t-(init.)=1.93027 s t(norm)=0.393875, mflops=12.6944 (err=1.1e-14) 1. CWP (min N) (N=80080): elapsed time t=1.77825 s, 32 iters, t-(init.)=1.74519 s t(norm)=0.0445136, mflops=112.325 2. CWP (best N) (N=80080): elapsed time t=1.78608 s, 32 iters, t-(init.)=1.75306 s t(norm)=0.0447142, mflops=111.821 3. FFTPACK: elapsed time t=1.34944 s, 16 iters, t-(init.)=1.33315 s t(norm)=0.0680076, mflops=73.5212 (err=1.0e-14) 4. FFTPACK (f2c): elapsed time t=1.11805 s, 8 iters, t-(init.)=1.10964 s t(norm)=0.113212, mflops=44.1651 (err=1.1e-14) FFTW_MEASURE plan: (cost = 6.459627e-02) FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 16 FFTW_NOTW 15 5. FFTW: elapsed time t=1.06063 s, 16 iters, t-(init.)=1.04171 s t(norm)=0.0531408, mflops=94.0896 (err=1.1e-14) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.16579 s, 16 iters, t-(init.)=1.14701 s t(norm)=0.0585125, mflops=85.4518 (err=1.1e-14) 7. Frigo-old: elapsed time t=1.46552 s, 4 iters, t-(init.)=1.46003 s t(norm)=0.297922, mflops=16.7829 (err=1.1e-14) 8. GSL: elapsed time t=1.90555 s, 16 iters, t-(init.)=1.88912 s t(norm)=0.0963694, mflops=51.8837 (err=1.1e-14) 9. NAPACK (f2c): elapsed time t=1.58231 s, 4 iters, t-(init.)=1.57707 s t(norm)=0.321804, mflops=15.5374 (err=5.1e-12) 10. 
Singleton: elapsed time t=1.05117 s, 8 iters, t-(init.)=1.04339 s t(norm)=0.106452, mflops=46.9693 (err=1.5e-14) 11. Singleton (f2c): elapsed time t=1.04621 s, 8 iters, t-(init.)=1.03842 s t(norm)=0.105945, mflops=47.1943 (err=1.5e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.47842 s, 1 iters, t-(init.)=1.47598 s t(norm)=1.2047, mflops=4.1504 (err=1.1e-14) Top mflops for N=75600 = 112.325 Normalized results and averages for N=75600: fft 0: mflops = 12.6944 (norm. = 0.113015), norm. avg. (of 14) = 0.0845372 fft 1: mflops = 112.325 (norm. = 1), norm. avg. (of 17) = 0.581198 fft 2: mflops = 111.821 (norm. = 0.995512), norm. avg. (of 17) = 0.624501 fft 3: mflops = 73.5212 (norm. = 0.654538), norm. avg. (of 17) = 0.58293 fft 4: mflops = 44.1651 (norm. = 0.393189), norm. avg. (of 17) = 0.322356 fft 5: mflops = 94.0896 (norm. = 0.837653), norm. avg. (of 17) = 0.944948 fft 6: mflops = 85.4518 (norm. = 0.760753), norm. avg. (of 17) = 0.91027 fft 7: mflops = 16.7829 (norm. = 0.149414), norm. avg. (of 17) = 0.134448 fft 8: mflops = 51.8837 (norm. = 0.461906), norm. avg. (of 17) = 0.312358 fft 9: mflops = 15.5374 (norm. = 0.138325), norm. avg. (of 17) = 0.104192 fft 10: mflops = 46.9693 (norm. = 0.418155), norm. avg. (of 17) = 0.30363 fft 11: mflops = 47.1943 (norm. = 0.420157), norm. avg. (of 17) = 0.302527 fft 12: mflops = -1 (norm. = -0.00890271), norm. avg. (of 12) = 0.316241 fft 13: mflops = -1 (norm. = -0.00890271), norm. avg. (of 12) = 0.21167 fft 14: mflops = 4.1504 (norm. = 0.0369498), norm. avg. (of 17) = 0.027891 Benchmarking for array size = 165375: 0. Brenner: elapsed time t=1.67328 s, 1 iters, t-(init.)=1.66288 s t(norm)=0.580041, mflops=8.62008 (err=2.7e-14) 1. CWP (min N) (N=180180): elapsed time t=1.28797 s, 8 iters, t-(init.)=1.21989 s t(norm)=0.0531898, mflops=94.0029 2. 
CWP (best N) (N=180180): elapsed time t=1.24887 s, 8 iters, t-(init.)=1.18071 s t(norm)=0.0514813, mflops=97.1226 3. FFTPACK: elapsed time t=1.63861 s, 4 iters, t-(init.)=1.60653 s t(norm)=0.140096, mflops=35.6898 (err=2.7e-14) 4. FFTPACK (f2c): elapsed time t=1.19491 s, 2 iters, t-(init.)=1.17724 s t(norm)=0.205319, mflops=24.3523 (err=2.7e-14) FFTW_MEASURE plan: (cost = 2.209995e-01) FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_NOTW 15 5. FFTW: elapsed time t=1.74664 s, 8 iters, t-(init.)=1.68389 s t(norm)=0.0734209, mflops=68.1005 (err=2.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.69363 s, 8 iters, t-(init.)=1.62871 s t(norm)=0.0710151, mflops=70.4075 (err=2.7e-14) 7. Frigo-old: elapsed time t=1.16988 s, 1 iters, t-(init.)=1.15659 s t(norm)=0.403438, mflops=12.3935 (err=2.7e-14) 8. GSL: elapsed time t=1.41453 s, 4 iters, t-(init.)=1.38164 s t(norm)=0.120485, mflops=41.4991 (err=2.7e-14) 9. NAPACK (f2c): elapsed time t=1.26984 s, 1 iters, t-(init.)=1.25747 s t(norm)=0.438624, mflops=11.3993 (err=1.6e-11) 10. Singleton: elapsed time t=1.99504 s, 4 iters, t-(init.)=1.96822 s t(norm)=0.171637, mflops=29.1313 (err=4.0e-14) 11. Singleton (f2c): elapsed time t=1.00151 s, 2 iters, t-(init.)=0.989134 s t(norm)=0.172513, mflops=28.9833 (err=4.0e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=3.91552 s, 1 iters, t-(init.)=3.90315 s t(norm)=1.36148, mflops=3.67247 (err=2.7e-14) Top mflops for N=165375 = 97.1226 Normalized results and averages for N=165375: fft 0: mflops = 8.62008 (norm. = 0.0887546), norm. avg. (of 15) = 0.0848184 fft 1: mflops = 94.0029 (norm. = 0.967878), norm. avg. (of 18) = 0.60268 fft 2: mflops = 97.1226 (norm. = 1), norm. avg. (of 18) = 0.645362 fft 3: mflops = 35.6898 (norm. 
= 0.367471), norm. avg. (of 18) = 0.57096 fft 4: mflops = 24.3523 (norm. = 0.250738), norm. avg. (of 18) = 0.318377 fft 5: mflops = 68.1005 (norm. = 0.701181), norm. avg. (of 18) = 0.931406 fft 6: mflops = 70.4075 (norm. = 0.724934), norm. avg. (of 18) = 0.899973 fft 7: mflops = 12.3935 (norm. = 0.127606), norm. avg. (of 18) = 0.134068 fft 8: mflops = 41.4991 (norm. = 0.427286), norm. avg. (of 18) = 0.318743 fft 9: mflops = 11.3993 (norm. = 0.11737), norm. avg. (of 18) = 0.104925 fft 10: mflops = 29.1313 (norm. = 0.299944), norm. avg. (of 18) = 0.303426 fft 11: mflops = 28.9833 (norm. = 0.29842), norm. avg. (of 18) = 0.302299 fft 12: mflops = -1 (norm. = -0.0102963), norm. avg. (of 12) = 0.316241 fft 13: mflops = -1 (norm. = -0.0102963), norm. avg. (of 12) = 0.21167 fft 14: mflops = 3.67247 (norm. = 0.0378127), norm. avg. (of 18) = 0.0284422 Benchmarking for array size = 362880: 0. Brenner: elapsed time t=3.65652 s, 1 iters, t-(init.)=3.62459 s t(norm)=0.540816, mflops=9.24529 (err=1.1e-13) 1. CWP (min N) (N=720720): elapsed time t=1.74291 s, 2 iters, t-(init.)=1.62124 s t(norm)=0.120951, mflops=41.3391 2. CWP (best N) (N=720720): elapsed time t=1.73805 s, 2 iters, t-(init.)=1.61637 s t(norm)=0.120587, mflops=41.4638 3. FFTPACK: elapsed time t=1.80306 s, 2 iters, t-(init.)=1.74953 s t(norm)=0.130521, mflops=38.3079 (err=1.1e-13) 4. FFTPACK (f2c): elapsed time t=1.20124 s, 1 iters, t-(init.)=1.17259 s t(norm)=0.17496, mflops=28.578 (err=1.1e-13) FFTW_MEASURE plan: (cost = 5.103019e-01) FFTW_TWIDDLE 10 FFTW_TWIDDLE 4 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_NOTW 16 5. FFTW: elapsed time t=1.01122 s, 2 iters, t-(init.)=0.952953 s t(norm)=0.0710939, mflops=70.3296 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.99786 s, 4 iters, t-(init.)=1.88992 s t(norm)=0.0704975, mflops=70.9245 (err=1.1e-13) 7. 
Frigo-old: elapsed time t=2.33247 s, 1 iters, t-(init.)=2.30148 s t(norm)=0.343398, mflops=14.5604 (err=1.1e-13) 8. GSL: elapsed time t=1.73596 s, 2 iters, t-(init.)=1.6811 s t(norm)=0.125416, mflops=39.8672 (err=1.1e-13) 9. NAPACK (f2c): elapsed time t=2.60788 s, 1 iters, t-(init.)=2.57595 s t(norm)=0.384351, mflops=13.0089 (err=3.4e-11) 10. Singleton: elapsed time t=1.25845 s, 1 iters, t-(init.)=1.24187 s t(norm)=0.185296, mflops=26.9839 (err=1.6e-13) 11. Singleton (f2c): elapsed time t=1.3321 s, 1 iters, t-(init.)=1.31557 s t(norm)=0.196292, mflops=25.4722 (err=1.6e-13) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=8.19199 s, 1 iters, t-(init.)=8.16011 s t(norm)=1.21755, mflops=4.10661 (err=1.1e-13) Top mflops for N=362880 = 70.9245 Normalized results and averages for N=362880: fft 0: mflops = 9.24529 (norm. = 0.130354), norm. avg. (of 16) = 0.0876643 fft 1: mflops = 41.3391 (norm. = 0.582861), norm. avg. (of 19) = 0.601637 fft 2: mflops = 41.4638 (norm. = 0.584618), norm. avg. (of 19) = 0.642165 fft 3: mflops = 38.3079 (norm. = 0.540122), norm. avg. (of 19) = 0.569337 fft 4: mflops = 28.578 (norm. = 0.402936), norm. avg. (of 19) = 0.322827 fft 5: mflops = 70.3296 (norm. = 0.991611), norm. avg. (of 19) = 0.934574 fft 6: mflops = 70.9245 (norm. = 1), norm. avg. (of 19) = 0.905238 fft 7: mflops = 14.5604 (norm. = 0.205294), norm. avg. (of 19) = 0.137816 fft 8: mflops = 39.8672 (norm. = 0.562108), norm. avg. (of 19) = 0.331551 fft 9: mflops = 13.0089 (norm. = 0.183419), norm. avg. (of 19) = 0.109056 fft 10: mflops = 26.9839 (norm. = 0.380459), norm. avg. (of 19) = 0.30748 fft 11: mflops = 25.4722 (norm. = 0.359145), norm. avg. (of 19) = 0.305291 fft 12: mflops = -1 (norm. = -0.0140995), norm. avg. (of 12) = 0.316241 fft 13: mflops = -1 (norm. = -0.0140995), norm. avg. (of 12) = 0.21167 fft 14: mflops = 4.10661 (norm. = 0.0579012), norm. avg. 
(of 19) = 0.0299927
------------------------------------------------------
@@@@ bench.3d.p2.log
Benchmarking for sizes:
4x4x4 (0.00128174 MB)
8x8x8 (0.00830078 MB)
16x16x16 (0.0633545 MB)
32x32x32 (0.501587 MB)
64x64x64 (4.00305 MB)
256x64x32 (8.01184 MB)
16x1024x64 (16.047 MB)
128x128x128 (32.006 MB)
Maximum array size N = 2097152
Benchmarking FFTs:
0. FFTW
1. HARM
2. HARM (f2c)
3. PDA
4. PDA (f2c)
5. Singleton
6. Singleton (f2c)
7. Temperton
8. Temperton (f2c)
Computing normalized averages (9 transforms).
Benchmarking for array size = 4x4x4 (power of 2):
0. FFTW: elapsed time t=1.27332 s, 131072 iters, t-(init.)=1.15952 s t(norm)=0.0230376, mflops=217.036 (err=1.9e-16)
1. Skipping fft (all dimensions must be > 4 for HARM).
2. Skipping fft (all dimensions must be > 4 for HARM).
3. PDA: elapsed time t=1.27364 s, 16384 iters, t-(init.)=1.25898 s t(norm)=0.200109, mflops=24.9864 (err=2.8e-16)
4. PDA (f2c): elapsed time t=1.71679 s, 16384 iters, t-(init.)=1.70309 s t(norm)=0.270699, mflops=18.4707 (err=2.8e-16)
5. Singleton: elapsed time t=1.1677 s, 65536 iters, t-(init.)=1.11188 s t(norm)=0.0441821, mflops=113.168 (err=1.9e-16)
6. Singleton (f2c): elapsed time t=1.13968 s, 65536 iters, t-(init.)=1.08137 s t(norm)=0.0429698, mflops=116.361 (err=1.9e-16)
7. Temperton: elapsed time t=1.86964 s, 65536 iters, t-(init.)=1.81267 s t(norm)=0.0720292, mflops=69.4163 (err=1.9e-16)
8. Temperton (f2c): elapsed time t=1.63424 s, 32768 iters, t-(init.)=1.6056 s t(norm)=0.127602, mflops=39.1845 (err=1.9e-16)
Top mflops for N=64 = 217.036
Normalized results and averages for N=64:
fft 0: mflops = 217.036 (norm. = 1), norm. avg. (of 1) = 1
fft 1: mflops = -1 (norm. = -0.00460753), norm. avg. (of 0) = -1
fft 2: mflops = -1 (norm. = -0.00460753), norm. avg. (of 0) = -1
fft 3: mflops = 24.9864 (norm. = 0.115125), norm. avg. (of 1) = 0.115125
fft 4: mflops = 18.4707 (norm. = 0.0851043), norm. avg. (of 1) = 0.0851043
fft 5: mflops = 113.168 (norm. = 0.521425), norm. avg.
(of 1) = 0.521425 fft 6: mflops = 116.361 (norm. = 0.536136), norm. avg. (of 1) = 0.536136 fft 7: mflops = 69.4163 (norm. = 0.319837), norm. avg. (of 1) = 0.319837 fft 8: mflops = 39.1845 (norm. = 0.180544), norm. avg. (of 1) = 0.180544 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.12803 s, 16384 iters, t-(init.)=1.02558 s t(norm)=0.0135844, mflops=368.07 (err=3.6e-16) 1. HARM: elapsed time t=1.69269 s, 8192 iters, t-(init.)=1.64096 s t(norm)=0.0434706, mflops=115.02 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.00029 s, 4096 iters, t-(init.)=0.974395 s t(norm)=0.0516253, mflops=96.8518 (err=4.0e-16) 3. PDA: elapsed time t=1.01528 s, 2048 iters, t-(init.)=1.0023 s t(norm)=0.106208, mflops=47.0775 (err=3.0e-16) 4. PDA (f2c): elapsed time t=1.42132 s, 2048 iters, t-(init.)=1.40849 s t(norm)=0.149249, mflops=33.501 (err=3.0e-16) 5. Singleton: elapsed time t=1.06656 s, 4096 iters, t-(init.)=1.0407 s t(norm)=0.0551384, mflops=90.6808 (err=3.5e-16) 6. Singleton (f2c): elapsed time t=1.06299 s, 4096 iters, t-(init.)=1.03724 s t(norm)=0.0549552, mflops=90.9833 (err=3.5e-16) 7. Temperton: elapsed time t=1.0463 s, 8192 iters, t-(init.)=0.994806 s t(norm)=0.0263534, mflops=189.729 (err=1.3e-08) 8. Temperton (f2c): elapsed time t=1.6189 s, 8192 iters, t-(init.)=1.56707 s t(norm)=0.0415132, mflops=120.444 (err=3.3e-16) Top mflops for N=512 = 368.07 Normalized results and averages for N=512: fft 0: mflops = 368.07 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 115.02 (norm. = 0.312495), norm. avg. (of 1) = 0.312495 fft 2: mflops = 96.8518 (norm. = 0.263134), norm. avg. (of 1) = 0.263134 fft 3: mflops = 47.0775 (norm. = 0.127904), norm. avg. (of 2) = 0.121514 fft 4: mflops = 33.501 (norm. = 0.0910178), norm. avg. (of 2) = 0.0880611 fft 5: mflops = 90.6808 (norm. = 0.246368), norm. avg. (of 2) = 0.383897 fft 6: mflops = 90.9833 (norm. = 0.24719), norm. avg. (of 2) = 0.391663 fft 7: mflops = 189.729 (norm. = 0.51547), norm. avg. 
(of 2) = 0.417654 fft 8: mflops = 120.444 (norm. = 0.32723), norm. avg. (of 2) = 0.253887 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.81185 s, 2048 iters, t-(init.)=1.71026 s t(norm)=0.0169899, mflops=294.293 (err=4.2e-16) 1. HARM: elapsed time t=1.08021 s, 512 iters, t-(init.)=1.05419 s t(norm)=0.0418898, mflops=119.361 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.17985 s, 512 iters, t-(init.)=1.15439 s t(norm)=0.0458715, mflops=109 (err=4.0e-16) 3. PDA: elapsed time t=1.6778 s, 512 iters, t-(init.)=1.65235 s t(norm)=0.0656584, mflops=76.1518 (err=4.0e-16) 4. PDA (f2c): elapsed time t=1.24885 s, 256 iters, t-(init.)=1.23615 s t(norm)=0.0982402, mflops=50.8957 (err=4.0e-16) 5. Singleton: elapsed time t=1.21965 s, 512 iters, t-(init.)=1.19425 s t(norm)=0.0474554, mflops=105.362 (err=4.1e-16) 6. Singleton (f2c): elapsed time t=1.25818 s, 512 iters, t-(init.)=1.23274 s t(norm)=0.0489848, mflops=102.072 (err=4.1e-16) 7. Temperton: elapsed time t=1.58591 s, 1024 iters, t-(init.)=1.535 s t(norm)=0.0304976, mflops=163.947 (err=6.3e-08) 8. Temperton (f2c): elapsed time t=1.07133 s, 512 iters, t-(init.)=1.04593 s t(norm)=0.0415614, mflops=120.304 (err=4.6e-16) Top mflops for N=4096 = 294.293 Normalized results and averages for N=4096: fft 0: mflops = 294.293 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 119.361 (norm. = 0.405585), norm. avg. (of 2) = 0.35904 fft 2: mflops = 109 (norm. = 0.37038), norm. avg. (of 2) = 0.316757 fft 3: mflops = 76.1518 (norm. = 0.258762), norm. avg. (of 3) = 0.167264 fft 4: mflops = 50.8957 (norm. = 0.172942), norm. avg. (of 3) = 0.116355 fft 5: mflops = 105.362 (norm. = 0.358018), norm. avg. (of 3) = 0.37527 fft 6: mflops = 102.072 (norm. = 0.34684), norm. avg. (of 3) = 0.376722 fft 7: mflops = 163.947 (norm. = 0.557089), norm. avg. (of 3) = 0.464132 fft 8: mflops = 120.304 (norm. = 0.40879), norm. avg. (of 3) = 0.305521 Benchmarking for array size = 32x32x32 (power of 2): 0. 
FFTW: elapsed time t=1.27725 s, 128 iters, t-(init.)=1.2236 s t(norm)=0.0194486, mflops=257.088 (err=5.2e-16) 1. HARM: elapsed time t=1.44319 s, 64 iters, t-(init.)=1.4158 s t(norm)=0.0450071, mflops=111.094 (err=5.2e-16) 2. HARM (f2c): elapsed time t=1.64713 s, 64 iters, t-(init.)=1.62027 s t(norm)=0.0515069, mflops=97.0744 (err=5.2e-16) 3. PDA: elapsed time t=1.86981 s, 32 iters, t-(init.)=1.85544 s t(norm)=0.117965, mflops=42.3853 (err=4.2e-16) 4. PDA (f2c): elapsed time t=1.21635 s, 16 iters, t-(init.)=1.20961 s t(norm)=0.15381, mflops=32.5076 (err=4.2e-16) 5. Singleton: elapsed time t=1.94763 s, 64 iters, t-(init.)=1.92068 s t(norm)=0.0610567, mflops=81.8911 (err=5.3e-16) 6. Singleton (f2c): elapsed time t=1.06174 s, 32 iters, t-(init.)=1.04831 s t(norm)=0.0666498, mflops=75.0189 (err=5.3e-16) 7. Temperton: elapsed time t=1.06379 s, 64 iters, t-(init.)=1.03678 s t(norm)=0.0329584, mflops=151.707 (err=9.6e-08) 8. Temperton (f2c): elapsed time t=1.74838 s, 64 iters, t-(init.)=1.72142 s t(norm)=0.0547225, mflops=91.3701 (err=4.7e-16) Top mflops for N=32768 = 257.088 Normalized results and averages for N=32768: fft 0: mflops = 257.088 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 111.094 (norm. = 0.432123), norm. avg. (of 3) = 0.383401 fft 2: mflops = 97.0744 (norm. = 0.377593), norm. avg. (of 3) = 0.337035 fft 3: mflops = 42.3853 (norm. = 0.164867), norm. avg. (of 4) = 0.166665 fft 4: mflops = 32.5076 (norm. = 0.126446), norm. avg. (of 4) = 0.118878 fft 5: mflops = 81.8911 (norm. = 0.318534), norm. avg. (of 4) = 0.361086 fft 6: mflops = 75.0189 (norm. = 0.291803), norm. avg. (of 4) = 0.355492 fft 7: mflops = 151.707 (norm. = 0.590097), norm. avg. (of 4) = 0.495623 fft 8: mflops = 91.3701 (norm. = 0.355405), norm. avg. (of 4) = 0.317992 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.01925 s, 4 iters, t-(init.)=0.966721 s t(norm)=0.0512187, mflops=97.6205 (err=1.2e-15) 1. 
HARM: elapsed time t=1.14897 s, 4 iters, t-(init.)=1.09451 s t(norm)=0.057989, mflops=86.2233 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.19701 s, 4 iters, t-(init.)=1.14252 s t(norm)=0.060533, mflops=82.5996 (err=1.2e-15) 3. PDA: elapsed time t=1.0718 s, 2 iters, t-(init.)=1.04568 s t(norm)=0.110804, mflops=45.1246 (err=1.3e-15) 4. PDA (f2c): elapsed time t=1.35531 s, 2 iters, t-(init.)=1.32934 s t(norm)=0.140861, mflops=35.4959 (err=1.3e-15) 5. Singleton: elapsed time t=1.32461 s, 2 iters, t-(init.)=1.29914 s t(norm)=0.137662, mflops=36.3208 (err=1.7e-15) 6. Singleton (f2c): elapsed time t=1.30184 s, 2 iters, t-(init.)=1.27637 s t(norm)=0.135249, mflops=36.969 (err=1.7e-15) 7. Temperton: elapsed time t=1.53073 s, 8 iters, t-(init.)=1.42368 s t(norm)=0.0377146, mflops=132.575 (err=1.3e-07) 8. Temperton (f2c): elapsed time t=1.95305 s, 8 iters, t-(init.)=1.83967 s t(norm)=0.0487347, mflops=102.596 (err=1.3e-15) Top mflops for N=262144 = 132.575 Normalized results and averages for N=262144: fft 0: mflops = 97.6205 (norm. = 0.736344), norm. avg. (of 5) = 0.947269 fft 1: mflops = 86.2233 (norm. = 0.650376), norm. avg. (of 4) = 0.450145 fft 2: mflops = 82.5996 (norm. = 0.623042), norm. avg. (of 4) = 0.408537 fft 3: mflops = 45.1246 (norm. = 0.340371), norm. avg. (of 5) = 0.201406 fft 4: mflops = 35.4959 (norm. = 0.267743), norm. avg. (of 5) = 0.148651 fft 5: mflops = 36.3208 (norm. = 0.273965), norm. avg. (of 5) = 0.343662 fft 6: mflops = 36.969 (norm. = 0.278854), norm. avg. (of 5) = 0.340164 fft 7: mflops = 132.575 (norm. = 1), norm. avg. (of 5) = 0.596499 fft 8: mflops = 102.596 (norm. = 0.773876), norm. avg. (of 5) = 0.409169 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.3008 s, 2 iters, t-(init.)=1.21638 s t(norm)=0.0610545, mflops=81.8941 (err=1.2e-15) 1. HARM: elapsed time t=1.77488 s, 2 iters, t-(init.)=1.68951 s t(norm)=0.0848024, mflops=58.9606 (err=1.2e-15) 2. 
HARM (f2c): elapsed time t=1.83098 s, 2 iters, t-(init.)=1.74533 s t(norm)=0.0876038, mflops=57.0751 (err=1.2e-15) 3. PDA: elapsed time t=1.30404 s, 1 iters, t-(init.)=1.26202 s t(norm)=0.12669, mflops=39.4664 (err=1.2e-15) 4. PDA (f2c): elapsed time t=1.6168 s, 1 iters, t-(init.)=1.57463 s t(norm)=0.158072, mflops=31.6312 (err=1.2e-15) 5. Singleton: elapsed time t=2.20846 s, 1 iters, t-(init.)=2.16706 s t(norm)=0.217544, mflops=22.9839 (err=1.7e-15) 6. Singleton (f2c): elapsed time t=2.25423 s, 1 iters, t-(init.)=2.21279 s t(norm)=0.222135, mflops=22.5088 (err=1.7e-15) 7. Temperton: elapsed time t=1.04151 s, 2 iters, t-(init.)=0.960018 s t(norm)=0.0481866, mflops=103.763 (err=1.5e-07) 8. Temperton (f2c): elapsed time t=1.30209 s, 2 iters, t-(init.)=1.22038 s t(norm)=0.0612549, mflops=81.6261 (err=1.3e-15) Top mflops for N=524288 = 103.763 Normalized results and averages for N=524288: fft 0: mflops = 81.8941 (norm. = 0.789239), norm. avg. (of 6) = 0.920931 fft 1: mflops = 58.9606 (norm. = 0.568222), norm. avg. (of 5) = 0.47376 fft 2: mflops = 57.0751 (norm. = 0.550051), norm. avg. (of 5) = 0.43684 fft 3: mflops = 39.4664 (norm. = 0.38035), norm. avg. (of 6) = 0.23123 fft 4: mflops = 31.6312 (norm. = 0.30484), norm. avg. (of 6) = 0.174682 fft 5: mflops = 22.9839 (norm. = 0.221503), norm. avg. (of 6) = 0.323302 fft 6: mflops = 22.5088 (norm. = 0.216925), norm. avg. (of 6) = 0.319625 fft 7: mflops = 103.763 (norm. = 1), norm. avg. (of 6) = 0.663749 fft 8: mflops = 81.6261 (norm. = 0.786657), norm. avg. (of 6) = 0.472083 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=1.05866 s, 1 iters, t-(init.)=0.966627 s t(norm)=0.0460924, mflops=108.478 (err=2.0e-15) 1. HARM: elapsed time t=1.88475 s, 1 iters, t-(init.)=1.78628 s t(norm)=0.0851763, mflops=58.7018 (err=1.9e-15) 2. HARM (f2c): elapsed time t=1.92142 s, 1 iters, t-(init.)=1.81275 s t(norm)=0.0864386, mflops=57.8445 (err=1.9e-15) 3. 
PDA: elapsed time t=1.47375 s, 1 iters, t-(init.)=1.38107 s t(norm)=0.0658544, mflops=75.9251 (err=2.0e-15) 4. PDA (f2c): elapsed time t=2.17134 s, 1 iters, t-(init.)=2.07698 s t(norm)=0.0990383, mflops=50.4855 (err=2.0e-15) 5. Singleton: elapsed time t=4.04778 s, 1 iters, t-(init.)=3.95322 s t(norm)=0.188504, mflops=26.5246 (err=2.8e-15) 6. Singleton (f2c): elapsed time t=4.04305 s, 1 iters, t-(init.)=3.94849 s t(norm)=0.188279, mflops=26.5564 (err=2.8e-15) 7. Skipping fft (Temperton can't handle dimensions > 256). 8. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 108.478 Normalized results and averages for N=1048576: fft 0: mflops = 108.478 (norm. = 1), norm. avg. (of 7) = 0.932226 fft 1: mflops = 58.7018 (norm. = 0.541141), norm. avg. (of 6) = 0.48499 fft 2: mflops = 57.8445 (norm. = 0.533238), norm. avg. (of 6) = 0.452906 fft 3: mflops = 75.9251 (norm. = 0.699914), norm. avg. (of 7) = 0.298185 fft 4: mflops = 50.4855 (norm. = 0.465399), norm. avg. (of 7) = 0.216213 fft 5: mflops = 26.5246 (norm. = 0.244516), norm. avg. (of 7) = 0.312047 fft 6: mflops = 26.5564 (norm. = 0.244809), norm. avg. (of 7) = 0.308937 fft 7: mflops = -1 (norm. = -0.00921847), norm. avg. (of 6) = 0.663749 fft 8: mflops = -1 (norm. = -0.00921847), norm. avg. (of 6) = 0.472083 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=2.59564 s, 1 iters, t-(init.)=2.39254 s t(norm)=0.0543264, mflops=92.0364 (err=7.3e-16) 1. HARM: elapsed time t=3.9909 s, 1 iters, t-(init.)=3.78608 s t(norm)=0.0859687, mflops=58.1607 (err=6.9e-16) 2. HARM (f2c): elapsed time t=4.13002 s, 1 iters, t-(init.)=3.93722 s t(norm)=0.0894005, mflops=55.9281 (err=6.9e-16) 3. PDA: elapsed time t=4.80999 s, 1 iters, t-(init.)=4.6183 s t(norm)=0.104866, mflops=47.6801 (err=7.1e-16) 4. PDA (f2c): elapsed time t=6.1158 s, 1 iters, t-(init.)=5.91235 s t(norm)=0.134249, mflops=37.2442 (err=7.1e-16) 5. 
Singleton: elapsed time t=13.8681 s, 1 iters, t-(init.)=13.6759 s t(norm)=0.310532, mflops=16.1014 (err=8.4e-16)
6. Singleton (f2c): elapsed time t=14.1692 s, 1 iters, t-(init.)=13.9748 s t(norm)=0.317319, mflops=15.757 (err=8.4e-16)
7. Temperton: elapsed time t=2.5106 s, 1 iters, t-(init.)=2.31934 s t(norm)=0.0526641, mflops=94.9413 (err=1.5e-07)
8. Temperton (f2c): elapsed time t=2.92396 s, 1 iters, t-(init.)=2.72559 s t(norm)=0.0618887, mflops=80.7902 (err=7.4e-16)
Top mflops for N=2097152 = 94.9413
Normalized results and averages for N=2097152:
fft 0: mflops = 92.0364 (norm. = 0.969403), norm. avg. (of 8) = 0.936873
fft 1: mflops = 58.1607 (norm. = 0.612596), norm. avg. (of 7) = 0.50322
fft 2: mflops = 55.9281 (norm. = 0.589081), norm. avg. (of 7) = 0.47236
fft 3: mflops = 47.6801 (norm. = 0.502206), norm. avg. (of 8) = 0.323687
fft 4: mflops = 37.2442 (norm. = 0.392287), norm. avg. (of 8) = 0.238222
fft 5: mflops = 16.1014 (norm. = 0.169593), norm. avg. (of 8) = 0.29424
fft 6: mflops = 15.757 (norm. = 0.165966), norm. avg. (of 8) = 0.291065
fft 7: mflops = 94.9413 (norm. = 1), norm. avg. (of 7) = 0.711785
fft 8: mflops = 80.7902 (norm. = 0.850948), norm. avg. (of 7) = 0.526207
------------------------------------------------------
@@@@ bench.3d.np2.log
Benchmarking for sizes:
5x5x5 (0.0022583 MB)
6x6x6 (0.00369263 MB)
7x7x7 (0.00567627 MB)
9x9x9 (0.0116577 MB)
10x10x10 (0.0158386 MB)
11x11x11 (0.0209351 MB)
12x12x12 (0.0270386 MB)
13x13x13 (0.0342407 MB)
14x14x14 (0.0426331 MB)
15x15x15 (0.0523071 MB)
24x25x28 (0.257751 MB)
48x48x48 (1.68982 MB)
49x49x49 (1.79755 MB)
60x60x60 (3.29877 MB)
72x60x56 (3.69482 MB)
75x75x75 (6.44086 MB)
80x80x80 (7.81628 MB)
84x84x84 (9.04791 MB)
96x96x96 (13.5045 MB)
105x105x105 (17.6689 MB)
112x112x112 (21.4427 MB)
120x120x120 (26.3728 MB)
144x144x144 (45.5692 MB)
Maximum array size N = 2985984
Benchmarking FFTs:
0. FFTW
1. PDA
2. PDA (f2c)
3. Singleton
4. Singleton (f2c)
5. Temperton
6. Temperton (f2c)
Computing normalized averages (7 transforms).
Benchmarking for array size = 5x5x5:
0. FFTW: elapsed time t=1.89239 s, 65536 iters, t-(init.)=1.78958 s t(norm)=0.0313611, mflops=159.433 (err=3.0e-16)
1. PDA: elapsed time t=1.19144 s, 8192 iters, t-(init.)=1.17824 s t(norm)=0.165182, mflops=30.2697 (err=2.3e-16)
2. PDA (f2c): elapsed time t=1.88636 s, 8192 iters, t-(init.)=1.87344 s t(norm)=0.262646, mflops=19.037 (err=2.3e-16)
3. Singleton: elapsed time t=1.03036 s, 32768 iters, t-(init.)=0.977248 s t(norm)=0.0342511, mflops=145.981 (err=3.1e-16)
4. Singleton (f2c): elapsed time t=1.87101 s, 65536 iters, t-(init.)=1.76387 s t(norm)=0.0309106, mflops=161.757 (err=3.1e-16)
5. Temperton: elapsed time t=1.44054 s, 32768 iters, t-(init.)=1.38857 s t(norm)=0.0486672, mflops=102.739 (err=5.3e-16)
6. Temperton (f2c): elapsed time t=1.13609 s, 16384 iters, t-(init.)=1.10981 s t(norm)=0.0777948, mflops=64.2717 (err=2.4e-16)
Top mflops for N=125 = 161.757
Normalized results and averages for N=125:
fft 0: mflops = 159.433 (norm. = 0.985636), norm. avg. (of 1) = 0.985636
fft 1: mflops = 30.2697 (norm. = 0.187131), norm. avg. (of 1) = 0.187131
fft 2: mflops = 19.037 (norm. = 0.117689), norm. avg. (of 1) = 0.117689
fft 3: mflops = 145.981 (norm. = 0.902469), norm. avg. (of 1) = 0.902469
fft 4: mflops = 161.757 (norm. = 1), norm. avg. (of 1) = 1
fft 5: mflops = 102.739 (norm. = 0.635142), norm. avg. (of 1) = 0.635142
fft 6: mflops = 64.2717 (norm. = 0.397335), norm. avg. (of 1) = 0.397335
Benchmarking for array size = 6x6x6:
0. FFTW: elapsed time t=1.17982 s, 32768 iters, t-(init.)=1.09225 s t(norm)=0.0198995, mflops=251.262 (err=2.9e-16)
1. PDA: elapsed time t=1.10118 s, 4096 iters, t-(init.)=1.08977 s t(norm)=0.158834, mflops=31.4794 (err=3.6e-16)
2. PDA (f2c): elapsed time t=1.47341 s, 4096 iters, t-(init.)=1.46238 s t(norm)=0.213143, mflops=23.4585 (err=3.6e-16)
3.
Singleton: elapsed time t=1.65276 s, 16384 iters, t-(init.)=1.60896 s t(norm)=0.0586267, mflops=85.2854 (err=2.9e-16) 4. Singleton (f2c): elapsed time t=1.72245 s, 16384 iters, t-(init.)=1.67765 s t(norm)=0.0611298, mflops=81.7932 (err=2.9e-16) 5. Temperton: elapsed time t=1.28923 s, 16384 iters, t-(init.)=1.2447 s t(norm)=0.0453541, mflops=110.244 (err=2.1e-15) 6. Temperton (f2c): elapsed time t=1.0284 s, 8192 iters, t-(init.)=1.00601 s t(norm)=0.0733131, mflops=68.2006 (err=3.1e-16) Top mflops for N=216 = 251.262 Normalized results and averages for N=216: fft 0: mflops = 251.262 (norm. = 1), norm. avg. (of 2) = 0.992818 fft 1: mflops = 31.4794 (norm. = 0.125285), norm. avg. (of 2) = 0.156208 fft 2: mflops = 23.4585 (norm. = 0.0933625), norm. avg. (of 2) = 0.105526 fft 3: mflops = 85.2854 (norm. = 0.339428), norm. avg. (of 2) = 0.620948 fft 4: mflops = 81.7932 (norm. = 0.325529), norm. avg. (of 2) = 0.662764 fft 5: mflops = 110.244 (norm. = 0.438759), norm. avg. (of 2) = 0.53695 fft 6: mflops = 68.2006 (norm. = 0.271432), norm. avg. (of 2) = 0.334384 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.42425 s, 16384 iters, t-(init.)=1.35525 s t(norm)=0.0286344, mflops=174.615 (err=3.8e-16) 1. PDA: elapsed time t=1.89396 s, 2048 iters, t-(init.)=1.88526 s t(norm)=0.318661, mflops=15.6906 (err=4.5e-16) 2. PDA (f2c): elapsed time t=1.26612 s, 1024 iters, t-(init.)=1.26179 s t(norm)=0.426556, mflops=11.7218 (err=4.5e-16) 3. Singleton: elapsed time t=1.45342 s, 8192 iters, t-(init.)=1.41894 s t(norm)=0.0599598, mflops=83.3892 (err=5.8e-16) 4. Singleton (f2c): elapsed time t=1.48216 s, 8192 iters, t-(init.)=1.44704 s t(norm)=0.0611476, mflops=81.7694 (err=5.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 174.615 Normalized results and averages for N=343: fft 0: mflops = 174.615 (norm. = 1), norm. avg. (of 3) = 0.995212 fft 1: mflops = 15.6906 (norm. 
= 0.0898583), norm. avg. (of 3) = 0.134091 fft 2: mflops = 11.7218 (norm. = 0.0671292), norm. avg. (of 3) = 0.092727 fft 3: mflops = 83.3892 (norm. = 0.477559), norm. avg. (of 3) = 0.573152 fft 4: mflops = 81.7694 (norm. = 0.468283), norm. avg. (of 3) = 0.597937 fft 5: mflops = -1 (norm. = -0.00572687), norm. avg. (of 2) = 0.53695 fft 6: mflops = -1 (norm. = -0.00572687), norm. avg. (of 2) = 0.334384 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.23537 s, 8192 iters, t-(init.)=1.16257 s t(norm)=0.0204706, mflops=244.252 (err=5.3e-16) 1. PDA: elapsed time t=1.46427 s, 2048 iters, t-(init.)=1.44603 s t(norm)=0.101847, mflops=49.0931 (err=4.9e-16) 2. PDA (f2c): elapsed time t=1.08812 s, 1024 iters, t-(init.)=1.07901 s t(norm)=0.151995, mflops=32.8958 (err=4.9e-16) 3. Singleton: elapsed time t=1.57254 s, 4096 iters, t-(init.)=1.53598 s t(norm)=0.0540912, mflops=92.4364 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.51467 s, 4096 iters, t-(init.)=1.47804 s t(norm)=0.0520511, mflops=96.0595 (err=4.5e-16) 5. Temperton: elapsed time t=1.62774 s, 8192 iters, t-(init.)=1.55463 s t(norm)=0.0273741, mflops=182.654 (err=6.0e-08) 6. Temperton (f2c): elapsed time t=1.09041 s, 4096 iters, t-(init.)=1.05374 s t(norm)=0.0371085, mflops=134.74 (err=5.1e-16) Top mflops for N=729 = 244.252 Normalized results and averages for N=729: fft 0: mflops = 244.252 (norm. = 1), norm. avg. (of 4) = 0.996409 fft 1: mflops = 49.0931 (norm. = 0.200993), norm. avg. (of 4) = 0.150817 fft 2: mflops = 32.8958 (norm. = 0.13468), norm. avg. (of 4) = 0.103215 fft 3: mflops = 92.4364 (norm. = 0.378447), norm. avg. (of 4) = 0.524476 fft 4: mflops = 96.0595 (norm. = 0.39328), norm. avg. (of 4) = 0.546773 fft 5: mflops = 182.654 (norm. = 0.74781), norm. avg. (of 3) = 0.607237 fft 6: mflops = 134.74 (norm. = 0.551642), norm. avg. (of 3) = 0.406803 Benchmarking for array size = 10x10x10: 0. 
FFTW: elapsed time t=1.71997 s, 8192 iters, t-(init.)=1.62012 s t(norm)=0.0198448, mflops=251.955 (err=4.0e-16) 1. PDA: elapsed time t=1.03349 s, 1024 iters, t-(init.)=1.02098 s t(norm)=0.100048, mflops=49.9761 (err=4.3e-16) 2. PDA (f2c): elapsed time t=1.72886 s, 1024 iters, t-(init.)=1.71639 s t(norm)=0.168192, mflops=29.7279 (err=4.3e-16) 3. Singleton: elapsed time t=1.99987 s, 4096 iters, t-(init.)=1.94994 s t(norm)=0.0477693, mflops=104.67 (err=4.6e-16) 4. Singleton (f2c): elapsed time t=1.01816 s, 2048 iters, t-(init.)=0.993103 s t(norm)=0.0486579, mflops=102.758 (err=4.6e-16) 5. Temperton: elapsed time t=1.13667 s, 4096 iters, t-(init.)=1.08666 s t(norm)=0.0266208, mflops=187.823 (err=6.3e-16) 6. Temperton (f2c): elapsed time t=1.16534 s, 2048 iters, t-(init.)=1.14037 s t(norm)=0.0558734, mflops=89.488 (err=3.4e-16) Top mflops for N=1000 = 251.955 Normalized results and averages for N=1000: fft 0: mflops = 251.955 (norm. = 1), norm. avg. (of 5) = 0.997127 fft 1: mflops = 49.9761 (norm. = 0.198353), norm. avg. (of 5) = 0.160324 fft 2: mflops = 29.7279 (norm. = 0.117989), norm. avg. (of 5) = 0.10617 fft 3: mflops = 104.67 (norm. = 0.415429), norm. avg. (of 5) = 0.502666 fft 4: mflops = 102.758 (norm. = 0.407844), norm. avg. (of 5) = 0.518987 fft 5: mflops = 187.823 (norm. = 0.745462), norm. avg. (of 4) = 0.641793 fft 6: mflops = 89.488 (norm. = 0.355174), norm. avg. (of 4) = 0.393896 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.50856 s, 2048 iters, t-(init.)=1.47547 s t(norm)=0.0521551, mflops=95.868 (err=4.3e-16) 1. PDA: elapsed time t=1.04648 s, 256 iters, t-(init.)=1.04234 s t(norm)=0.294757, mflops=16.9631 (err=5.6e-16) 2. PDA (f2c): elapsed time t=1.47369 s, 256 iters, t-(init.)=1.46955 s t(norm)=0.415566, mflops=12.0318 (err=5.6e-16) 3. Singleton: elapsed time t=1.78537 s, 2048 iters, t-(init.)=1.7522 s t(norm)=0.061937, mflops=80.7272 (err=6.6e-16) 4. 
Singleton (f2c): elapsed time t=1.80176 s, 2048 iters, t-(init.)=1.7686 s t(norm)=0.0625168, mflops=79.9786 (err=6.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 95.868 Normalized results and averages for N=1331: fft 0: mflops = 95.868 (norm. = 1), norm. avg. (of 6) = 0.997606 fft 1: mflops = 16.9631 (norm. = 0.176943), norm. avg. (of 6) = 0.163094 fft 2: mflops = 12.0318 (norm. = 0.125504), norm. avg. (of 6) = 0.109392 fft 3: mflops = 80.7272 (norm. = 0.842066), norm. avg. (of 6) = 0.559233 fft 4: mflops = 79.9786 (norm. = 0.834257), norm. avg. (of 6) = 0.571532 fft 5: mflops = -1 (norm. = -0.010431), norm. avg. (of 4) = 0.641793 fft 6: mflops = -1 (norm. = -0.010431), norm. avg. (of 4) = 0.393896 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.41124 s, 4096 iters, t-(init.)=1.32524 s t(norm)=0.0174094, mflops=287.201 (err=3.9e-16) 1. PDA: elapsed time t=1.423 s, 1024 iters, t-(init.)=1.40145 s t(norm)=0.0736427, mflops=67.8954 (err=3.8e-16) 2. PDA (f2c): elapsed time t=1.13305 s, 512 iters, t-(init.)=1.12231 s t(norm)=0.117949, mflops=42.3913 (err=3.8e-16) 3. Singleton: elapsed time t=1.2482 s, 1024 iters, t-(init.)=1.22674 s t(norm)=0.0644617, mflops=77.5655 (err=4.0e-16) 4. Singleton (f2c): elapsed time t=1.27127 s, 1024 iters, t-(init.)=1.24975 s t(norm)=0.0656708, mflops=76.1374 (err=4.0e-16) 5. Temperton: elapsed time t=1.19548 s, 2048 iters, t-(init.)=1.15244 s t(norm)=0.0302788, mflops=165.132 (err=1.8e-15) 6. Temperton (f2c): elapsed time t=1.14037 s, 1024 iters, t-(init.)=1.11884 s t(norm)=0.0587918, mflops=85.0458 (err=3.9e-16) Top mflops for N=1728 = 287.201 Normalized results and averages for N=1728: fft 0: mflops = 287.201 (norm. = 1), norm. avg. (of 7) = 0.997948 fft 1: mflops = 67.8954 (norm. = 0.236404), norm. avg. (of 7) = 0.173567 fft 2: mflops = 42.3913 (norm. = 0.147602), norm. avg. 
(of 7) = 0.114851 fft 3: mflops = 77.5655 (norm. = 0.270074), norm. avg. (of 7) = 0.517925 fft 4: mflops = 76.1374 (norm. = 0.265101), norm. avg. (of 7) = 0.527756 fft 5: mflops = 165.132 (norm. = 0.574969), norm. avg. (of 5) = 0.628428 fft 6: mflops = 85.0458 (norm. = 0.296119), norm. avg. (of 5) = 0.374341 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.43796 s, 1024 iters, t-(init.)=1.41067 s t(norm)=0.0564834, mflops=88.5216 (err=4.5e-16) 1. PDA: elapsed time t=1.89133 s, 256 iters, t-(init.)=1.88448 s t(norm)=0.301819, mflops=16.5662 (err=9.0e-16) 2. PDA (f2c): elapsed time t=1.32617 s, 128 iters, t-(init.)=1.32274 s t(norm)=0.423701, mflops=11.8008 (err=9.0e-16) 3. Singleton: elapsed time t=1.75715 s, 1024 iters, t-(init.)=1.72988 s t(norm)=0.0692645, mflops=72.1871 (err=7.7e-16) 4. Singleton (f2c): elapsed time t=1.70817 s, 1024 iters, t-(init.)=1.68083 s t(norm)=0.0673008, mflops=74.2934 (err=7.7e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 88.5216 Normalized results and averages for N=2197: fft 0: mflops = 88.5216 (norm. = 1), norm. avg. (of 8) = 0.998205 fft 1: mflops = 16.5662 (norm. = 0.187143), norm. avg. (of 8) = 0.175264 fft 2: mflops = 11.8008 (norm. = 0.13331), norm. avg. (of 8) = 0.117158 fft 3: mflops = 72.1871 (norm. = 0.815474), norm. avg. (of 8) = 0.555118 fft 4: mflops = 74.2934 (norm. = 0.839268), norm. avg. (of 8) = 0.566695 fft 5: mflops = -1 (norm. = -0.0112967), norm. avg. (of 5) = 0.628428 fft 6: mflops = -1 (norm. = -0.0112967), norm. avg. (of 5) = 0.374341 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.0655 s, 1024 iters, t-(init.)=1.03136 s t(norm)=0.0321352, mflops=155.593 (err=4.1e-16) 1. PDA: elapsed time t=1.36564 s, 256 iters, t-(init.)=1.35709 s t(norm)=0.169137, mflops=29.5618 (err=4.7e-16) 2. 
PDA (f2c): elapsed time t=1.89863 s, 256 iters, t-(init.)=1.89009 s t(norm)=0.235567, mflops=21.2254 (err=4.7e-16) 3. Singleton: elapsed time t=1.31239 s, 512 iters, t-(init.)=1.29531 s t(norm)=0.0807189, mflops=61.9433 (err=5.6e-16) 4. Singleton (f2c): elapsed time t=1.33691 s, 512 iters, t-(init.)=1.31985 s t(norm)=0.0822479, mflops=60.7918 (err=5.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 155.593 Normalized results and averages for N=2744: fft 0: mflops = 155.593 (norm. = 1), norm. avg. (of 9) = 0.998404 fft 1: mflops = 29.5618 (norm. = 0.189995), norm. avg. (of 9) = 0.176901 fft 2: mflops = 21.2254 (norm. = 0.136417), norm. avg. (of 9) = 0.119298 fft 3: mflops = 61.9433 (norm. = 0.398112), norm. avg. (of 9) = 0.537673 fft 4: mflops = 60.7918 (norm. = 0.390711), norm. avg. (of 9) = 0.547141 fft 5: mflops = -1 (norm. = -0.00642704), norm. avg. (of 5) = 0.628428 fft 6: mflops = -1 (norm. = -0.00642704), norm. avg. (of 5) = 0.374341 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.89822 s, 2048 iters, t-(init.)=1.81419 s t(norm)=0.0223937, mflops=223.277 (err=3.9e-16) 1. PDA: elapsed time t=1.44708 s, 512 iters, t-(init.)=1.42613 s t(norm)=0.0704146, mflops=71.008 (err=4.4e-16) 2. PDA (f2c): elapsed time t=1.26428 s, 256 iters, t-(init.)=1.25381 s t(norm)=0.123813, mflops=40.3836 (err=4.4e-16) 3. Singleton: elapsed time t=1.30253 s, 512 iters, t-(init.)=1.28155 s t(norm)=0.0632761, mflops=79.0188 (err=5.0e-16) 4. Singleton (f2c): elapsed time t=1.32276 s, 512 iters, t-(init.)=1.30178 s t(norm)=0.0642748, mflops=77.791 (err=5.0e-16) 5. Temperton: elapsed time t=1.31344 s, 1024 iters, t-(init.)=1.27146 s t(norm)=0.0313889, mflops=159.292 (err=1.9e-15) 6. 
Temperton (f2c): elapsed time t=1.04063 s, 512 iters, t-(init.)=1.01965 s t(norm)=0.0503447, mflops=99.3153 (err=4.1e-16)

Top mflops for N=3375 = 223.277
Normalized results and averages for N=3375:
fft 0: mflops = 223.277 (norm. = 1), norm. avg. (of 10) = 0.998564
fft 1: mflops = 71.008 (norm. = 0.318026), norm. avg. (of 10) = 0.191013
fft 2: mflops = 40.3836 (norm. = 0.180868), norm. avg. (of 10) = 0.125455
fft 3: mflops = 79.0188 (norm. = 0.353905), norm. avg. (of 10) = 0.519296
fft 4: mflops = 77.791 (norm. = 0.348406), norm. avg. (of 10) = 0.527268
fft 5: mflops = 159.292 (norm. = 0.713427), norm. avg. (of 6) = 0.642595
fft 6: mflops = 99.3153 (norm. = 0.444807), norm. avg. (of 6) = 0.386085

Benchmarking for array size = 24x25x28:
0. FFTW: elapsed time t=1.25245 s, 128 iters, t-(init.)=1.22643 s t(norm)=0.0406325, mflops=123.054 (err=4.7e-16)
1. PDA: elapsed time t=1.09108 s, 64 iters, t-(init.)=1.07806 s t(norm)=0.0714343, mflops=69.9944 (err=4.7e-16)
2. PDA (f2c): elapsed time t=1.71807 s, 64 iters, t-(init.)=1.70286 s t(norm)=0.112834, mflops=44.3128 (err=4.7e-16)
3. Singleton: elapsed time t=1.05643 s, 64 iters, t-(init.)=1.04337 s t(norm)=0.0691351, mflops=72.3222 (err=5.4e-16)
4. Singleton (f2c): elapsed time t=1.07752 s, 64 iters, t-(init.)=1.06443 s t(norm)=0.070531, mflops=70.8908 (err=5.4e-16)
5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).

Top mflops for N=16800 = 123.054
Normalized results and averages for N=16800:
fft 0: mflops = 123.054 (norm. = 1), norm. avg. (of 11) = 0.998694
fft 1: mflops = 69.9944 (norm. = 0.56881), norm. avg. (of 11) = 0.225358
fft 2: mflops = 44.3128 (norm. = 0.360108), norm. avg. (of 11) = 0.146787
fft 3: mflops = 72.3222 (norm. = 0.587727), norm. avg. (of 11) = 0.525517
fft 4: mflops = 70.8908 (norm. = 0.576094), norm. avg. (of 11) = 0.531707
fft 5: mflops = -1 (norm. = -0.00812651), norm. avg. (of 6) = 0.642595
fft 6: mflops = -1 (norm. = -0.00812651), norm. avg. (of 6) = 0.386085

Benchmarking for array size = 48x48x48:
0. FFTW: elapsed time t=1.75119 s, 32 iters, t-(init.)=1.67251 s t(norm)=0.0282068, mflops=177.262 (err=6.8e-16)
1. PDA: elapsed time t=1.56659 s, 8 iters, t-(init.)=1.54707 s t(norm)=0.104365, mflops=47.9088 (err=6.2e-16)
2. PDA (f2c): elapsed time t=1.04618 s, 4 iters, t-(init.)=1.03648 s t(norm)=0.139841, mflops=35.7549 (err=6.2e-16)
3. Singleton: elapsed time t=1.36639 s, 8 iters, t-(init.)=1.34691 s t(norm)=0.0908624, mflops=55.0283 (err=6.5e-16)
4. Singleton (f2c): elapsed time t=1.4052 s, 8 iters, t-(init.)=1.38568 s t(norm)=0.0934776, mflops=53.4888 (err=6.5e-16)
5. Temperton: elapsed time t=1.06026 s, 16 iters, t-(init.)=1.02096 s t(norm)=0.034437, mflops=145.193 (err=1.0e-07)
6. Temperton (f2c): elapsed time t=1.35147 s, 16 iters, t-(init.)=1.31229 s t(norm)=0.0442634, mflops=112.96 (err=7.0e-16)

Top mflops for N=110592 = 177.262
Normalized results and averages for N=110592:
fft 0: mflops = 177.262 (norm. = 1), norm. avg. (of 12) = 0.998803
fft 1: mflops = 47.9088 (norm. = 0.270271), norm. avg. (of 12) = 0.229101
fft 2: mflops = 35.7549 (norm. = 0.201706), norm. avg. (of 12) = 0.151364
fft 3: mflops = 55.0283 (norm. = 0.310434), norm. avg. (of 12) = 0.507594
fft 4: mflops = 53.4888 (norm. = 0.301749), norm. avg. (of 12) = 0.512544
fft 5: mflops = 145.193 (norm. = 0.819084), norm. avg. (of 7) = 0.667807
fft 6: mflops = 112.96 (norm. = 0.637249), norm. avg. (of 7) = 0.421966

Benchmarking for array size = 49x49x49:
0. FFTW: elapsed time t=1.53467 s, 16 iters, t-(init.)=1.48887 s t(norm)=0.0469568, mflops=106.481 (err=6.5e-16)
1. PDA: elapsed time t=1.31478 s, 4 iters, t-(init.)=1.30356 s t(norm)=0.16445, mflops=30.4044 (err=7.4e-16)
2. PDA (f2c): elapsed time t=1.91048 s, 4 iters, t-(init.)=1.89929 s t(norm)=0.239604, mflops=20.8678 (err=7.4e-16)
3. Singleton: elapsed time t=1.82722 s, 8 iters, t-(init.)=1.80405 s t(norm)=0.113795, mflops=43.9387 (err=1.0e-15)
4. Singleton (f2c): elapsed time t=1.91388 s, 8 iters, t-(init.)=1.89072 s t(norm)=0.119261, mflops=41.9247 (err=1.0e-15)
5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).

Top mflops for N=117649 = 106.481
Normalized results and averages for N=117649:
fft 0: mflops = 106.481 (norm. = 1), norm. avg. (of 13) = 0.998895
fft 1: mflops = 30.4044 (norm. = 0.285539), norm. avg. (of 13) = 0.233442
fft 2: mflops = 20.8678 (norm. = 0.195977), norm. avg. (of 13) = 0.154795
fft 3: mflops = 43.9387 (norm. = 0.412644), norm. avg. (of 13) = 0.50029
fft 4: mflops = 41.9247 (norm. = 0.39373), norm. avg. (of 13) = 0.503404
fft 5: mflops = -1 (norm. = -0.00939136), norm. avg. (of 7) = 0.667807
fft 6: mflops = -1 (norm. = -0.00939136), norm. avg. (of 7) = 0.421966

Benchmarking for array size = 60x60x60:
0. FFTW: elapsed time t=1.15673 s, 8 iters, t-(init.)=1.07459 s t(norm)=0.0350928, mflops=142.48 (err=7.1e-16)
1. PDA: elapsed time t=1.64923 s, 4 iters, t-(init.)=1.60948 s t(norm)=0.105122, mflops=47.5639 (err=7.4e-16)
2. PDA (f2c): elapsed time t=1.1075 s, 2 iters, t-(init.)=1.0885 s t(norm)=0.142189, mflops=35.1645 (err=7.4e-16)
3. Singleton: elapsed time t=1.19699 s, 2 iters, t-(init.)=1.17927 s t(norm)=0.154045, mflops=32.458 (err=1.0e-15)
4. Singleton (f2c): elapsed time t=1.20478 s, 2 iters, t-(init.)=1.18707 s t(norm)=0.155065, mflops=32.2446 (err=1.0e-15)
5. Temperton: elapsed time t=1.27682 s, 8 iters, t-(init.)=1.19503 s t(norm)=0.0390262, mflops=128.119 (err=2.0e-15)
6. Temperton (f2c): elapsed time t=1.04248 s, 4 iters, t-(init.)=1.00237 s t(norm)=0.065469, mflops=76.372 (err=7.1e-16)

Top mflops for N=216000 = 142.48
Normalized results and averages for N=216000:
fft 0: mflops = 142.48 (norm. = 1), norm. avg. (of 14) = 0.998974
fft 1: mflops = 47.5639 (norm. = 0.333829), norm. avg. (of 14) = 0.240613
fft 2: mflops = 35.1645 (norm. = 0.246804), norm. avg. (of 14) = 0.161367
fft 3: mflops = 32.458 (norm. = 0.227808), norm. avg. (of 14) = 0.480827
fft 4: mflops = 32.2446 (norm. = 0.22631), norm. avg. (of 14) = 0.483612
fft 5: mflops = 128.119 (norm. = 0.89921), norm. avg. (of 8) = 0.696733
fft 6: mflops = 76.372 (norm. = 0.536021), norm. avg. (of 8) = 0.436223

Benchmarking for array size = 72x60x56:
0. FFTW: elapsed time t=1.77613 s, 8 iters, t-(init.)=1.69152 s t(norm)=0.0488704, mflops=102.311 (err=7.3e-16)
1. PDA: elapsed time t=1.94132 s, 4 iters, t-(init.)=1.90029 s t(norm)=0.109804, mflops=45.5356 (err=7.8e-16)
2. PDA (f2c): elapsed time t=1.32171 s, 2 iters, t-(init.)=1.3022 s t(norm)=0.15049, mflops=33.2248 (err=7.8e-16)
3. Singleton: elapsed time t=1.50551 s, 2 iters, t-(init.)=1.48477 s t(norm)=0.171588, mflops=29.1395 (err=9.4e-16)
4. Singleton (f2c): elapsed time t=1.53465 s, 2 iters, t-(init.)=1.51394 s t(norm)=0.174959, mflops=28.5781 (err=9.4e-16)
5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).

Top mflops for N=241920 = 102.311
Normalized results and averages for N=241920:
fft 0: mflops = 102.311 (norm. = 1), norm. avg. (of 15) = 0.999042
fft 1: mflops = 45.5356 (norm. = 0.445069), norm. avg. (of 15) = 0.254243
fft 2: mflops = 33.2248 (norm. = 0.324742), norm. avg. (of 15) = 0.172259
fft 3: mflops = 29.1395 (norm. = 0.284812), norm. avg. (of 15) = 0.467759
fft 4: mflops = 28.5781 (norm. = 0.279324), norm. avg. (of 15) = 0.469992
fft 5: mflops = -1 (norm. = -0.00977409), norm. avg. (of 8) = 0.696733
fft 6: mflops = -1 (norm. = -0.00977409), norm. avg. (of 8) = 0.436223

Benchmarking for array size = 75x75x75:
0. FFTW: elapsed time t=1.78291 s, 4 iters, t-(init.)=1.6581 s t(norm)=0.0525823, mflops=95.089 (err=7.0e-16)
1. PDA: elapsed time t=1.70474 s, 2 iters, t-(init.)=1.64487 s t(norm)=0.104326, mflops=47.9269 (err=7.5e-16)
2. PDA (f2c): elapsed time t=1.14914 s, 1 iters, t-(init.)=1.12182 s t(norm)=0.142303, mflops=35.1363 (err=7.5e-16)
3.
Singleton: elapsed time t=1.35064 s, 1 iters, t-(init.)=1.31952 s t(norm)=0.167381, mflops=29.8721 (err=9.8e-16)
4. Singleton (f2c): elapsed time t=1.34233 s, 1 iters, t-(init.)=1.31191 s t(norm)=0.166416, mflops=30.0453 (err=9.8e-16)
5. Temperton: elapsed time t=1.67001 s, 4 iters, t-(init.)=1.54527 s t(norm)=0.0490042, mflops=102.032 (err=1.4e-07)
6. Temperton (f2c): elapsed time t=1.0363 s, 2 iters, t-(init.)=0.976563 s t(norm)=0.0619384, mflops=80.7254 (err=8.9e-16)

Top mflops for N=421875 = 102.032
Normalized results and averages for N=421875:
fft 0: mflops = 95.089 (norm. = 0.931953), norm. avg. (of 16) = 0.994849
fft 1: mflops = 47.9269 (norm. = 0.469725), norm. avg. (of 16) = 0.267711
fft 2: mflops = 35.1363 (norm. = 0.344366), norm. avg. (of 16) = 0.183016
fft 3: mflops = 29.8721 (norm. = 0.292772), norm. avg. (of 16) = 0.456823
fft 4: mflops = 30.0453 (norm. = 0.294469), norm. avg. (of 16) = 0.459022
fft 5: mflops = 102.032 (norm. = 1), norm. avg. (of 9) = 0.730429
fft 6: mflops = 80.7254 (norm. = 0.791178), norm. avg. (of 9) = 0.475662

Benchmarking for array size = 80x80x80:
0. FFTW: elapsed time t=1.1094 s, 2 iters, t-(init.)=1.02984 s t(norm)=0.0530271, mflops=94.2915 (err=6.3e-16)
1. PDA: elapsed time t=1.01107 s, 1 iters, t-(init.)=0.972659 s t(norm)=0.100166, mflops=49.9172 (err=6.2e-16)
2. PDA (f2c): elapsed time t=1.32121 s, 1 iters, t-(init.)=1.27698 s t(norm)=0.131505, mflops=38.0213 (err=6.2e-16)
3. Singleton: elapsed time t=1.60288 s, 1 iters, t-(init.)=1.56375 s t(norm)=0.161038, mflops=31.0486 (err=8.2e-16)
4. Singleton (f2c): elapsed time t=1.52138 s, 1 iters, t-(init.)=1.4824 s t(norm)=0.152659, mflops=32.7527 (err=8.2e-16)
5. Temperton: elapsed time t=1.84021 s, 4 iters, t-(init.)=1.67511 s t(norm)=0.0431262, mflops=115.939 (err=1.7e-07)
6. Temperton (f2c): elapsed time t=1.26433 s, 2 iters, t-(init.)=1.18489 s t(norm)=0.061011, mflops=81.9524 (err=6.6e-16)

Top mflops for N=512000 = 115.939
Normalized results and averages for N=512000:
fft 0: mflops = 94.2915 (norm. = 0.813287), norm. avg. (of 17) = 0.984169
fft 1: mflops = 49.9172 (norm. = 0.430548), norm. avg. (of 17) = 0.27729
fft 2: mflops = 38.0213 (norm. = 0.327943), norm. avg. (of 17) = 0.191541
fft 3: mflops = 31.0486 (norm. = 0.267802), norm. avg. (of 17) = 0.445704
fft 4: mflops = 32.7527 (norm. = 0.2825), norm. avg. (of 17) = 0.448639
fft 5: mflops = 115.939 (norm. = 1), norm. avg. (of 10) = 0.757386
fft 6: mflops = 81.9524 (norm. = 0.706859), norm. avg. (of 10) = 0.498782

Benchmarking for array size = 84x84x84:
0. FFTW: elapsed time t=1.42887 s, 2 iters, t-(init.)=1.33224 s t(norm)=0.0586051, mflops=85.3168 (err=7.0e-16)
1. PDA: elapsed time t=1.41227 s, 1 iters, t-(init.)=1.36716 s t(norm)=0.120283, mflops=41.5688 (err=6.8e-16)
2. PDA (f2c): elapsed time t=1.9639 s, 1 iters, t-(init.)=1.9178 s t(norm)=0.168727, mflops=29.6336 (err=6.8e-16)
3. Singleton: elapsed time t=2.52132 s, 1 iters, t-(init.)=2.48317 s t(norm)=0.218468, mflops=22.8866 (err=8.9e-16)
4. Singleton (f2c): elapsed time t=2.62007 s, 1 iters, t-(init.)=2.58187 s t(norm)=0.227152, mflops=22.0117 (err=8.9e-16)
5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).

Top mflops for N=592704 = 85.3168
Normalized results and averages for N=592704:
fft 0: mflops = 85.3168 (norm. = 1), norm. avg. (of 18) = 0.985049
fft 1: mflops = 41.5688 (norm. = 0.487228), norm. avg. (of 18) = 0.288953
fft 2: mflops = 29.6336 (norm. = 0.347336), norm. avg. (of 18) = 0.200196
fft 3: mflops = 22.8866 (norm. = 0.268254), norm. avg. (of 18) = 0.435845
fft 4: mflops = 22.0117 (norm. = 0.258), norm. avg. (of 18) = 0.438048
fft 5: mflops = -1 (norm. = -0.011721), norm. avg. (of 10) = 0.757386
fft 6: mflops = -1 (norm. = -0.011721), norm. avg. (of 10) = 0.498782

Benchmarking for array size = 96x96x96:
0. FFTW: elapsed time t=1.05552 s, 1 iters, t-(init.)=0.980297 s t(norm)=0.0560879, mflops=89.1457 (err=7.7e-16)
1. PDA: elapsed time t=1.86469 s, 1 iters, t-(init.)=1.77561 s t(norm)=0.101592, mflops=49.2166 (err=6.4e-16)
2. PDA (f2c): elapsed time t=2.4613 s, 1 iters, t-(init.)=2.38753 s t(norm)=0.136603, mflops=36.6023 (err=6.4e-16)
3. Singleton: elapsed time t=4.66507 s, 1 iters, t-(init.)=4.59585 s t(norm)=0.262953, mflops=19.0148 (err=7.0e-16)
4. Singleton (f2c): elapsed time t=4.75044 s, 1 iters, t-(init.)=4.68125 s t(norm)=0.267839, mflops=18.6679 (err=7.0e-16)
5. Temperton: elapsed time t=1.10784 s, 1 iters, t-(init.)=1.03378 s t(norm)=0.0591481, mflops=84.5335 (err=1.6e-07)
6. Temperton (f2c): elapsed time t=1.43203 s, 1 iters, t-(init.)=1.35809 s t(norm)=0.0777032, mflops=64.3474 (err=7.5e-16)

Top mflops for N=884736 = 89.1457
Normalized results and averages for N=884736:
fft 0: mflops = 89.1457 (norm. = 1), norm. avg. (of 19) = 0.985836
fft 1: mflops = 49.2166 (norm. = 0.552091), norm. avg. (of 19) = 0.302802
fft 2: mflops = 36.6023 (norm. = 0.41059), norm. avg. (of 19) = 0.211269
fft 3: mflops = 19.0148 (norm. = 0.2133), norm. avg. (of 19) = 0.424132
fft 4: mflops = 18.6679 (norm. = 0.209409), norm. avg. (of 19) = 0.426014
fft 5: mflops = 84.5335 (norm. = 0.948262), norm. avg. (of 11) = 0.774739
fft 6: mflops = 64.3474 (norm. = 0.721823), norm. avg. (of 11) = 0.519058

Benchmarking for array size = 105x105x105:
0. FFTW: elapsed time t=1.55503 s, 1 iters, t-(init.)=1.45498 s t(norm)=0.0623981, mflops=80.1306 (err=7.4e-16)
1. PDA: elapsed time t=2.75524 s, 1 iters, t-(init.)=2.65598 s t(norm)=0.113904, mflops=43.8966 (err=7.1e-16)
2. PDA (f2c): elapsed time t=4.01104 s, 1 iters, t-(init.)=3.91129 s t(norm)=0.167739, mflops=29.8083 (err=7.1e-16)
3. Singleton: elapsed time t=5.13401 s, 1 iters, t-(init.)=5.03494 s t(norm)=0.215928, mflops=23.1559 (err=8.0e-16)
4.
Singleton (f2c): elapsed time t=5.14223 s, 1 iters, t-(init.)=5.04316 s t(norm)=0.21628, mflops=23.1182 (err=8.0e-16)
5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).

Top mflops for N=1157625 = 80.1306
Normalized results and averages for N=1157625:
fft 0: mflops = 80.1306 (norm. = 1), norm. avg. (of 20) = 0.986544
fft 1: mflops = 43.8966 (norm. = 0.547814), norm. avg. (of 20) = 0.315053
fft 2: mflops = 29.8083 (norm. = 0.371996), norm. avg. (of 20) = 0.219306
fft 3: mflops = 23.1559 (norm. = 0.288977), norm. avg. (of 20) = 0.417375
fft 4: mflops = 23.1182 (norm. = 0.288506), norm. avg. (of 20) = 0.419139
fft 5: mflops = -1 (norm. = -0.0124796), norm. avg. (of 11) = 0.774739
fft 6: mflops = -1 (norm. = -0.0124796), norm. avg. (of 11) = 0.519058

Benchmarking for array size = 112x112x112:
0. FFTW: elapsed time t=1.98357 s, 1 iters, t-(init.)=1.85818 s t(norm)=0.064764, mflops=77.2034 (err=5.3e-16)
1. PDA: elapsed time t=3.28637 s, 1 iters, t-(init.)=3.16118 s t(norm)=0.110178, mflops=45.3811 (err=5.4e-16)
2. PDA (f2c): elapsed time t=4.56814 s, 1 iters, t-(init.)=4.44235 s t(norm)=0.154832, mflops=32.2932 (err=5.4e-16)
3. Singleton: elapsed time t=6.15291 s, 1 iters, t-(init.)=6.02478 s t(norm)=0.209984, mflops=23.8113 (err=6.4e-16)
4. Singleton (f2c): elapsed time t=6.20829 s, 1 iters, t-(init.)=6.07976 s t(norm)=0.211901, mflops=23.596 (err=6.4e-16)
5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).

Top mflops for N=1404928 = 77.2034
Normalized results and averages for N=1404928:
fft 0: mflops = 77.2034 (norm. = 1), norm. avg. (of 21) = 0.987185
fft 1: mflops = 45.3811 (norm. = 0.587812), norm. avg. (of 21) = 0.328041
fft 2: mflops = 32.2932 (norm. = 0.418287), norm. avg. (of 21) = 0.228781
fft 3: mflops = 23.8113 (norm. = 0.308423), norm. avg. (of 21) = 0.412187
fft 4: mflops = 23.596 (norm. = 0.305634), norm. avg. (of 21) = 0.413734
fft 5: mflops = -1 (norm. = -0.0129528), norm. avg. (of 11) = 0.774739
fft 6: mflops = -1 (norm. = -0.0129528), norm. avg. (of 11) = 0.519058

Benchmarking for array size = 120x120x120:
0. FFTW: elapsed time t=2.2903 s, 1 iters, t-(init.)=2.13427 s t(norm)=0.0596077, mflops=83.8817 (err=7.5e-16)
1. PDA: elapsed time t=3.43086 s, 1 iters, t-(init.)=3.2608 s t(norm)=0.0910702, mflops=54.9027 (err=7.8e-16)
2. PDA (f2c): elapsed time t=4.70752 s, 1 iters, t-(init.)=4.55115 s t(norm)=0.127108, mflops=39.3366 (err=7.8e-16)
3. Singleton: elapsed time t=10.1352 s, 1 iters, t-(init.)=9.98531 s t(norm)=0.278878, mflops=17.929 (err=9.4e-16)
4. Singleton (f2c): elapsed time t=10.2198 s, 1 iters, t-(init.)=10.0695 s t(norm)=0.281229, mflops=17.7791 (err=9.4e-16)
5. Temperton: elapsed time t=2.20928 s, 1 iters, t-(init.)=2.05378 s t(norm)=0.0573598, mflops=87.1691 (err=1.1e-08)
6. Temperton (f2c): elapsed time t=2.65618 s, 1 iters, t-(init.)=2.50004 s t(norm)=0.0698233, mflops=71.6094 (err=6.9e-16)

Top mflops for N=1728000 = 87.1691
Normalized results and averages for N=1728000:
fft 0: mflops = 83.8817 (norm. = 0.962287), norm. avg. (of 22) = 0.986053
fft 1: mflops = 54.9027 (norm. = 0.629841), norm. avg. (of 22) = 0.34176
fft 2: mflops = 39.3366 (norm. = 0.451267), norm. avg. (of 22) = 0.238894
fft 3: mflops = 17.929 (norm. = 0.205681), norm. avg. (of 22) = 0.4028
fft 4: mflops = 17.7791 (norm. = 0.203961), norm. avg. (of 22) = 0.404198
fft 5: mflops = 87.1691 (norm. = 1), norm. avg. (of 12) = 0.79351
fft 6: mflops = 71.6094 (norm. = 0.821499), norm. avg. (of 12) = 0.544262

Benchmarking for array size = 144x144x144:
0. FFTW: elapsed time t=3.86665 s, 1 iters, t-(init.)=3.59447 s t(norm)=0.0559644, mflops=89.3425 (err=1.2e-15)
1. PDA: elapsed time t=6.08959 s, 1 iters, t-(init.)=5.81629 s t(norm)=0.0905572, mflops=55.2137 (err=1.2e-15)
2. PDA (f2c): elapsed time t=8.3541 s, 1 iters, t-(init.)=8.06942 s t(norm)=0.125637, mflops=39.7971 (err=1.2e-15)
3. Singleton: elapsed time t=17.288 s, 1 iters, t-(init.)=17.0141 s t(norm)=0.264902, mflops=18.8749 (err=1.6e-15)
4. Singleton (f2c): elapsed time t=17.6479 s, 1 iters, t-(init.)=17.3741 s t(norm)=0.270507, mflops=18.4838 (err=1.6e-15)
5. Temperton: elapsed time t=3.54944 s, 1 iters, t-(init.)=3.27624 s t(norm)=0.0510097, mflops=98.0205 (err=1.8e-07)
6. Temperton (f2c): elapsed time t=4.18408 s, 1 iters, t-(init.)=3.90875 s t(norm)=0.0608575, mflops=82.1592 (err=1.2e-15)

Top mflops for N=2985984 = 98.0205
Normalized results and averages for N=2985984:
fft 0: mflops = 89.3425 (norm. = 0.911467), norm. avg. (of 23) = 0.98281
fft 1: mflops = 55.2137 (norm. = 0.563287), norm. avg. (of 23) = 0.351391
fft 2: mflops = 39.7971 (norm. = 0.406007), norm. avg. (of 23) = 0.24616
fft 3: mflops = 18.8749 (norm. = 0.192561), norm. avg. (of 23) = 0.393659
fft 4: mflops = 18.4838 (norm. = 0.188571), norm. avg. (of 23) = 0.394823
fft 5: mflops = 98.0205 (norm. = 1), norm. avg. (of 13) = 0.809394
fft 6: mflops = 82.1592 (norm. = 0.838183), norm. avg.
(of 13) = 0.566871
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Ooura (C), Ooura (F), Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg
2, 46.173, 50.3527, 15.6988, 1.27451, 6.10263, 6.65297, 6.26647, 7.71764, 41.5049, 3.0534, 2.88256, , 10.9998, 9.76639, 48.4477, 41.7639, 98.3304, , 8.2004, 5.20232, 5.30031, 35.9527, , , , , 2.51549, 23.3134, 24.0163, , , 4.0835, 4.11309, 18.6124, 49.7324, 3.20403, 2.43838, 5.89796
4, 60.2042, 56.3313, 24.7992, 7.44555, 9.98212, 10.5031, 20.8316, 17.4616, 28.0063, 11.424, 7.14313, 23.2573, 38.4934, 30.0962, 147.644, 134.789, 217.757, , 24.973, 10.7228, 10.9535, 96.311, 32.0218, 30.7329, 29.0624, 8.15199, 5.30116, 59.5542, 60.268, 4.03698, 46.0953, 16.9578, 17.1081, 29.0311, 27.1045, 11.0108, 8.49232, 5.61841
8, 130.617, 108.688, 30.6319, 7.10948, 19.0684, 9.41684, 35.1524, 23.3794, 20.7883, 29.2144, 21.3677, 18.9575, 58.2854, 44.4313, 282.661, 275.239, 331.372, 84.0863, 38.8027, 18.52, 18.8725, 146.038, 55.7722, 54.3422, 50.7698, 19.5974, 10.2425, 82.4513, 86.3119, 4.022, 60.4329, 16.9367, 16.5896, 54.6799, 21.045, 22.4603, 15.0024, 5.74436
16, 53.7112, 56.0893, 34.9571, 15.9669, 29.5874, 9.9225, 49.5897, 36.0157, 19.7622, 55.2823, 41.3365, 17.9777, 100.872, 78.0747, 357.935, 349.893, 362.389, 102.021, 58.9868, 28.8878, 31.791, 152.268, 51.2063, 59.5607, 56.9175, 32.3876, 16.3277, 100.53, 101.926, 16.5187, 82.1782, 48.3003, 47.862, 75.0487, 19.5177, 38.2336, 28.1016, 5.90375
32, 65.7313, 65.3313, 39.1933, 16.4051, 48.8946, 9.70864, 66.2665, 45.3503, 20.6348, 67.4959, 84.4449, 19.1823, 118.428, 79.4997, 264.192, 305.052, 314.569, 129.904, 53.1016, 41.6365, 44.8803, 162.83, 62.2998, 73.895, 76.5567, 52.1971, 22.5825, 118.448, 115.745, 14.0741, 96.7046, 67.1543, 67.2618, 111.361, 21.0744, 48.3015, 31.0272, 6.03292
64, 62.1764, 67.4422, 43.6667, 24.6529, 70.6444, 9.42958, 73.7987, 57.3402, 22.5279, 86.2287, 127.843, 20.6625, 154.716, 100.484, 308.406, 211.14, 154.664, 164.363, 66.3993, 52.0392, 56.7565, 130.031, 62.041, 81.1052, 81.343, 69.4483, 28.4078, 131.834, 129.569, 32.7055, 117.184, 99.2772, 100.691, 132.033, 22.8074, 74.6043, 47.7791, 6.0193
128, 67.8392, 73.3759, 46.7949, 23.7807, 98.3478, 9.18397, 80.7815, 63.2671, 24.5489, 113.111, 181.318, 22.8103, 193.095, 116.598, 263.349, 222.023, 186.44, 164.407, 75.1267, 58.4936, 69.5929, 78.2315, 70.2135, 91.3119, 95.6749, 82.7919, 31.678, 145.8, 141.893, 27.8951, 120.753, 95.8369, 95.7443, 168.448, 24.9661, 74.4133, 47.1151, 6.0781
256, 68.1215, 73.797, 49.3051, 26.6755, 99.2097, 9.0435, 91.4113, 73.4112, 27.2299, 142.214, 194.569, 25.4636, 196.457, 120.629, 294.749, 233.425, 172.207, 158.706, 81.8327, 65.4601, 76.1431, 100.251, 73.7661, 95.9992, 94.373, 96.2493, 33.3533, 154.104, 150.486, 48.2461, 125.413, 131.944, 136.749, 174.535, 27.6506, 89.376, 56.8881, 6.05176
512, 74.2865, 81.8648, 54.9001, 28.5728, 98.0717, 9.00124, 98.0863, 78.6918, 30.2134, 145.392, 176.928, 28.1637, 148.702, 85.9667, 228.665, 230.254, 147.158, 177.461, 63.3121, 67.7356, 83.8595, 87.1334, 81.1225, 104.219, 103.37, 107.678, 31.3129, 156.371, 153.138, 43.0181, 108.944, 138.625, 142.587, 188.406, 30.7616, 81.6388, 49.4348, 6.08667
1024, 78.9427, 86.9574, 58.6941, 33.3362, 89.7816, 8.91183, 100.825, 85.6169, 32.795, 130.932, 130.718, 30.7028, 136.207, 78.1567, 199.063, 194.498, 119.647, 158.941, 64.1015, 70.5022, 88.8779, 60.4935, 84.5733, 113.165, 106.189, 112.828, 30.0609, 168.289, 165.947, 61.1153, 94.6298, 151.201, 155.702, 195.472, 33.7314, 91.6803, 55.6956, 6.11756
2048, 68.9303, 72.3609, 48.115, 30.638, 82.5437, 8.55454, 94.7036, 76.3825, 30.6371, 103.851, 121.986, 28.2254, 149.818, 78.0267, 212.338, 209.828, 112.742, 127.233, 63.2579, 53.0217, 62.4148, 57.1123, 81.9237, 103.232, 99.4487, 84.0465, 24.8979, 117.815, 115.27, 45.2492, 90.3218, 103.19, 98.7522, 117.712, 29.2359, 75.0117, 49.8567, 5.94984
4096, 70.1775, 73.5874, 50.2089, 33.8008, 81.9083, 8.47962, 100.084, 81.2591, 32.6701, 103.447, 117.025, 30.0577, 137.457, 81.5197, 220.066, 209.447, 111.019, 136.132, 68.0804, 52.9062, 62.6275, 53.936, 65.0851, 77.6349, 75.91, 85.7643, 25.7057, 131.819, 127.928, 62.2666, 94.2551, 121.765, 118.756, 118.087, 30.6337, 83.6248, 53.923, 5.90012
8192, 73.2819, 78.9382, 51.0579, 33.2512, 84.0687, 8.38735, 98.8522, 81.4379, 34.4873, 110.763, 125.87, 31.2611, 139.466, 70.9707, 204.703, 216.159, 110.638, 127.71, 63.3854, 52.5426, 62.7057, , 66.9108, 78.8308, 77.3665, 89.9436, 25.2463, 121.989, 119.957, 52.1141, 90.732, 116.506, 110.262, 116.312, 32.091, 77.0654, 48.7956, 5.91968
16384, 74.161, 78.0773, 52.3964, 38.783, 81.5835, 8.34965, 103.051, 87.215, 36.3143, 113.69, 113.659, 32.6858, 128.624, 73.1147, 202.256, 205.549, 101.128, 123.833, 67.2491, 51.6562, 63.2395, , 67.9951, 78.5824, 77.5699, 91.9359, 25.8118, 134.459, 130.11, 69.5662, 77.8753, 122.96, 118.011, 117.279, 33.5257, 84.6875, 53.3354, 5.85964
32768, 64.3694, 68.5435, 47.9984, 35.221, 78.9519, 8.30745, 82.183, 70.8416, 35.2399, 110.912, 110.912, 31.7444, 135.786, 72.1046, 182.702, 193.316, 99.2163, 96.7416, 65.9256, 41.3839, 48.1103, , 69.1703, 79.6384, 78.6875, 75.8789, 25.2275, 122.153, 119.905, 57.3825, 73.9757, 76.3197, 74.2402, 113.268, 32.4682, 60.7504, 42.0268, 5.7004
65536, 44.4651, 45.8292, 31.7535, 35.8953, 75.8321, 8.18999, 73.3153, 57.666, 26.1188, 107.179, 107.169, 24.0467, 111.209, 69.2844, 113.686, 140.677, 86.049, 83.3898, 66.1458, 34.2357, 39.7283, , 57.7918, 63.8957, 60.4185, 49.4552, 25.2853, 111.353, 110.679, 54.0648, 32.9473, 80.4167, 77.9942, 96.7617, 24.3622, 57.7162, 40.4609, 5.4331
131072, 35.2557, 35.245, 24.1213, 30.6531, 41.7919, 8.03057, 62.3946, 44.124, 20.7877, 101.269, 101.322, 19.8779, 69.8012, 47.4493, 110.895, 109.677, 65.1937, 67.8234, 46.6176, 27.2334, 30.2663, , 50.1422, 54.1297, 50.3343, 36.5954, 19.7296, 93.8702, 92.5008, 45.4975, 23.8761, 53.4039, 52.6281, 71.1938, 19.5204, 41.46, 27.8218, 5.04161
262144, 22.6653, 22.3518, 15.8318, 28.3333, 27.8677, 7.64165, 45.2188, 30.7038, 14.6475, 72.2301, 72.2113, 14.0828, 53.1367, 39.6383, 80.7442, 83.655, 49.3289, 46.9805, 40.736, 18.3796, 19.5402, , 38.1406, 40.3704, 37.0005, 22.9816, 16.4285, 76.5719, 75.7443, 49.3986, 16.1859, 39.707, 39.1746, 43.0292, 14.1894, 32.3403, 21.1762, 4.55569
Norm. Avg., 0.304897, 0.315935, 0.190751, 0.133887, 0.301266, 0.0447636, 0.350587, 0.277288, 0.147092, 0.453138, 0.502849, 0.120415, 0.539418, 0.331818, 0.916323, 0.890847, 0.687533, 0.55566, 0.27601, 0.19659, 0.229052, 0.370135, 0.302856, 0.357866, 0.346943, 0.308831, 0.109512, 0.555522, 0.547644, 0.220471, 0.342276, 0.395083, 0.391225, 0.507698, 0.147525, 0.279977, 0.182259, 0.0298687
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg
6, , 17.2294, 13.6605, 39.5724, 30.3587, 206.699, 183.684, 20.1313, 31.5991, 9.39276, 11.7161, 11.6352, 13.1763, 8.61672, 5.56038
9, 6.91942, 31.4248, 25.4232, 61.4049, 42.0408, 224.688, 204.159, 18.287, 31.8737, 15.9615, 21.1703, 21.4766, 24.8515, 17.3084, 5.41698
12, , 41.054, 38.0871, 88.5602, 58.3698, 304.795, 304.89, 28.9557, 46.6642, 15.5391, 23.3903, 23.5285, 30.7764, 20.297, 5.72993
15, 9.87698, 51.9404, 52.1827, 92.2316, 59.155, 244.184, 246.493, 21.2462, 39.8173, 12.6276, 29.2981, 29.1479, 42.6758, 24.4284, 4.71914
18, 9.13822, 51.607, 48.8671, 78.0954, 49.2347, 125.45, 121.151, 20.5359, 54.1363, 21.1357, 33.6592, 34.1972, 35.5244, 23.7916, 5.65178
24, , 77.2736, 70.9656, 105.606, 62.4352, 169.194, 147.02, 38.1538, 69.1473, 22.1217, 33.788, 33.5119, 46.8592, 31.0868, 5.87254
36, 11.7246, 95.5669, 96.0268, 127.676, 70.4287, 220.117, 212.128, 24.9289, 76.8699, 28.7735, 56.6205, 58.2201, 60.4676, 43.679, 5.68841
80, 20.7918, 127.373, 170.233, 155.247, 99.7454, 257.729, 256.246, 44.3059, 63.4205, 18.6163, 92.8068, 96.3351, 89.9694, 57.5743, 5.32692
108, 12.9348, 111.868, 134.668, 171.282, 82.0646, 254.202, 255.68, 23.2543, 70.2371, 37.9015, 67.4086, 68.9548, 89.4717, 61.3517, 5.65778
210, 17.118, 206.954, 207.262, 113.312, 59.0491, 143.303, 148.513, 21.5933, 54.6127, 16.5405, 60.2533, 55.7183, , , 4.38666
504, 18.9395, 196.351, 196.296, 115.98, 56.4592, 167.195, 156.865, 27.4672, 59.4509, 21.229, 69.6716, 69.3354, , , 4.85464
1000, 17.9636, 110.888, 170.049, 122.428, 72.8506, 157.481, 157.039, 27.2637, 55.3092, 16.2825, 96.2396, 99.3697, 104.596, 65.0739, 4.74309
1960, 19.9039, 136.721, 136.289, 93.1792, 40.8738, 129.01, 123.053, 25.4727, 52.7038, 13.4076, 60.8815, 62.9529, , , 4.17124
4725, 15.4952, 105.277, 146.615, 117.117, 52.2437, 140.211, 139.511, 18.2359, 54.5694, 15.5287, 62.5583, 59.6159, , , 4.41023
10368, 18.7506, 128.823, 123.136, 146.533, 70.933, 181.666, 160.354, 27.0052, 73.5965, 28.2847, 82.1591, 76.3287, 94.3258, 67.3861, 5.58579
27000, 14.3539, 139.222, 138.99, 122.877, 67.5814, 121.539, 132.36, 20.4922, 66.0326, 16.777, 71.3341, 70.0768, 94.4318, 64.9985, 4.83518
75600, 12.6944, 112.325, 111.821, 73.5212, 44.1651, 94.0896, 85.4518, 16.7829, 51.8837, 15.5374, 46.9693, 47.1943, , , 4.1504
165375, 8.62008, 94.0029, 97.1226, 35.6898, 24.3523, 68.1005, 70.4075, 12.3935, 41.4991, 11.3993, 29.1313, 28.9833, , , 3.67247
362880, 9.24529, 41.3391, 41.4638, 38.3079, 28.578, 70.3296, 70.9245, 14.5604, 39.8672, 13.0089, 26.9839, 25.4722, , , 4.10661
Norm. Avg., 0.0876643, 0.601637, 0.642165, 0.569337, 0.322827, 0.934574, 0.905238, 0.137816, 0.331551, 0.109056, 0.30748, 0.305291, 0.316241, 0.21167, 0.0299927
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM, HARM (f2c), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
4x4x4, 217.036, , , 24.9864, 18.4707, 113.168, 116.361, 69.4163, 39.1845
8x8x8, 368.07, 115.02, 96.8518, 47.0775, 33.501, 90.6808, 90.9833, 189.729, 120.444
16x16x16, 294.293, 119.361, 109, 76.1518, 50.8957, 105.362, 102.072, 163.947, 120.304
32x32x32, 257.088, 111.094, 97.0744, 42.3853, 32.5076, 81.8911, 75.0189, 151.707, 91.3701
64x64x64, 97.6205, 86.2233, 82.5996, 45.1246, 35.4959, 36.3208, 36.969, 132.575, 102.596
256x64x32, 81.8941, 58.9606, 57.0751, 39.4664, 31.6312, 22.9839, 22.5088, 103.763, 81.6261
16x1024x64, 108.478, 58.7018, 57.8445, 75.9251, 50.4855, 26.5246, 26.5564, ,
128x128x128, 92.0364, 58.1607, 55.9281, 47.6801, 37.2442, 16.1014, 15.757, 94.9413, 80.7902
Norm. Avg., 0.936873, 0.50322, 0.47236, 0.323687, 0.238222, 0.29424, 0.291065, 0.711785, 0.526207
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
5x5x5, 159.433, 30.2697, 19.037, 145.981, 161.757, 102.739, 64.2717
6x6x6, 251.262, 31.4794, 23.4585, 85.2854, 81.7932, 110.244, 68.2006
7x7x7, 174.615, 15.6906, 11.7218, 83.3892, 81.7694, ,
9x9x9, 244.252, 49.0931, 32.8958, 92.4364, 96.0595, 182.654, 134.74
10x10x10, 251.955, 49.9761, 29.7279, 104.67, 102.758, 187.823, 89.488
11x11x11, 95.868, 16.9631, 12.0318, 80.7272, 79.9786, ,
12x12x12, 287.201, 67.8954, 42.3913, 77.5655, 76.1374, 165.132, 85.0458
13x13x13, 88.5216, 16.5662, 11.8008, 72.1871, 74.2934, ,
14x14x14, 155.593, 29.5618, 21.2254, 61.9433, 60.7918, ,
15x15x15, 223.277, 71.008, 40.3836, 79.0188, 77.791, 159.292, 99.3153
24x25x28, 123.054, 69.9944, 44.3128, 72.3222, 70.8908, ,
48x48x48, 177.262, 47.9088, 35.7549, 55.0283, 53.4888, 145.193, 112.96
49x49x49, 106.481, 30.4044, 20.8678, 43.9387, 41.9247, ,
60x60x60, 142.48, 47.5639, 35.1645, 32.458, 32.2446, 128.119, 76.372
72x60x56, 102.311, 45.5356, 33.2248, 29.1395, 28.5781, ,
75x75x75, 95.089, 47.9269, 35.1363, 29.8721, 30.0453, 102.032, 80.7254
80x80x80, 94.2915, 49.9172, 38.0213, 31.0486, 32.7527, 115.939, 81.9524
84x84x84, 85.3168, 41.5688, 29.6336, 22.8866, 22.0117, ,
96x96x96, 89.1457, 49.2166, 36.6023, 19.0148, 18.6679, 84.5335, 64.3474
105x105x105, 80.1306, 43.8966, 29.8083, 23.1559, 23.1182, ,
112x112x112, 77.2034, 45.3811, 32.2932, 23.8113, 23.596, ,
120x120x120, 83.8817, 54.9027, 39.3366, 17.929, 17.7791, 87.1691, 71.6094
144x144x144, 89.3425, 55.2137, 39.7971, 18.8749, 18.4838, 98.0205, 82.1592
Norm. Avg., 0.98281, 0.351391, 0.24616, 0.393659, 0.394823, 0.809394, 0.566871
@@@@ end