To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Scott Ransom
@ submitter email = ransom@cfa.harvard.edu
@ submitter organization = Harvard-Smithsonian Center for Astrophysics
@ computer manufacturer = SGI
@ computer model = Origin200
@ CPU manufacturer = SGI
@ CPU model = MIPS R10000
@ CPU speed = 195 MHz
@ RAM = 512 MB
@ L2 cache size = 1 MB
@ operating system = IRIX 6.4 (IP27)
@ C compiler = Mongoose Compilers: Version 7.10
@ C compiler flags = -DUSE_SGIMATH -mips4 -64 -Ofast=ip27 -WOPT:rsv_bits=2656
@ Fortran compiler = Mongoose Compilers: Version 7.10
@ Fortran compiler flags = -mips4 -64 -Ofast=ip27
@ remarks = This benchmark used only 1 of 4 processors.
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
2 (0.000335693 MB)
4 (0.000579834 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)
262144 (25.4987 MB)
524288 (50.9973 MB)
1048576 (96 MB)
Maximum array size = 1048576
Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Bailey
5. Beauregard
6. Bergland
7. Brenner
8. Burrus
9. CWP (min N)
10. CWP (best N)
11. Edelblute
12. FFTPACK
13. FFTPACK (f2c)
14. FFTW
15. FFTW_ESTIMATE
16. Frigo-old
17. Green
18. GSL
19. GSL DIT
20. GSL DIF
21. Krukar
22. Mayer (Buneman)
23. Mayer (simple)
24. Mayer (lookup)
25. Monro
26. NAPACK (f2c)
27. Ooura (C)
28. Ooura (F)
29. Ransom
30. SCIPORT
31. Singleton
32. Singleton (f2c)
33. Sorensen
34. Sorensen DIT
35. Temperton
36. Temperton (f2c)
37. Valkenburg
38. SGIMATH
Computing normalized averages (39 transforms).
Benchmarking for array size = 2 (power of 2):
0.
Arndt DIF: elapsed time t=1.25 s, 4194304 iters, t-(init.)=0.88 s t(norm)=0.104904, mflops=47.6625 (err=1.7e-17) 1. Arndt DIT: elapsed time t=1.52 s, 4194304 iters, t-(init.)=1.19 s t(norm)=0.141859, mflops=35.2463 (err=1.7e-17) 2. Arndt Split-Radix: elapsed time t=1.77 s, 4194304 iters, t-(init.)=1.4 s t(norm)=0.166893, mflops=29.9593 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.21 s, 131072 iters, t-(init.)=1.2 s t(norm)=4.57764, mflops=1.09227 (err=1.7e-17) 4. Bailey: elapsed time t=1.78 s, 2097152 iters, t-(init.)=1.62 s t(norm)=0.386238, mflops=12.9454 (err=1.7e-17) 5. Beauregard: elapsed time t=1.05 s, 1048576 iters, t-(init.)=0.96 s t(norm)=0.457764, mflops=10.9227 (err=1.7e-17) 6. Bergland: elapsed time t=1.29 s, 524288 iters, t-(init.)=1.23 s t(norm)=1.17302, mflops=4.2625 (err=1.7e-17) 7. Brenner: elapsed time t=1.85 s, 1048576 iters, t-(init.)=1.76 s t(norm)=0.839233, mflops=5.95782 (err=1.7e-17) 8. Burrus: elapsed time t=1.28 s, 2097152 iters, t-(init.)=1.09 s t(norm)=0.259876, mflops=19.2399 (err=1.7e-17) 9. CWP (min N): elapsed time t=1.75 s, 524288 iters, t-(init.)=1.7 s t(norm)=1.62125, mflops=3.08405 10. CWP (best N) (N=3): elapsed time t=1.77 s, 524288 iters, t-(init.)=1.7 s t(norm)=1.62125, mflops=3.08405 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.2 s, 1048576 iters, t-(init.)=1.1 s t(norm)=0.524521, mflops=9.53251 (err=1.7e-17) 13. FFTPACK (f2c): elapsed time t=1.81 s, 1048576 iters, t-(init.)=1.7 s t(norm)=0.810623, mflops=6.16809 (err=1.7e-17) FFTW_MEASURE plan: (cost = 5.149841e-07) FFTW_NOTW 2 14. FFTW: elapsed time t=1.07 s, 2097152 iters, t-(init.)=0.91 s t(norm)=0.216961, mflops=23.0456 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 15. FFTW_ESTIMATE: elapsed time t=1.09 s, 2097152 iters, t-(init.)=0.9 s t(norm)=0.214577, mflops=23.3017 (err=1.7e-17) 16. Frigo-old: elapsed time t=1.11 s, 4194304 iters, t-(init.)=0.7 s t(norm)=0.0834465, mflops=59.9186 (err=1.7e-17) 17. 
Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.01 s, 524288 iters, t-(init.)=0.97 s t(norm)=0.925064, mflops=5.40503 (err=1.7e-17) 19. GSL DIT: elapsed time t=1.33 s, 1048576 iters, t-(init.)=1.22 s t(norm)=0.581741, mflops=8.59489 (err=1.7e-17) 20. GSL DIF: elapsed time t=1.14 s, 1048576 iters, t-(init.)=1.04 s t(norm)=0.495911, mflops=10.0825 (err=1.7e-17) 21. Krukar: elapsed time t=1.77 s, 4194304 iters, t-(init.)=1.39 s t(norm)=0.165701, mflops=30.1748 (err=1.7e-17) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.08 s, 524288 iters, t-(init.)=1.04 s t(norm)=0.991821, mflops=5.04123 (err=1.7e-17) 27. Ooura (C): elapsed time t=1.28 s, 4194304 iters, t-(init.)=0.86 s t(norm)=0.10252, mflops=48.771 (err=1.7e-17) 28. Ooura (F): elapsed time t=1.09 s, 2097152 iters, t-(init.)=0.86 s t(norm)=0.20504, mflops=24.3855 (err=1.7e-17) 29. Skipping fft (Ransom doesn't work for N=2). 30. Skipping fft (SCIPORT can't handle N < 4). 31. Singleton: elapsed time t=1.97 s, 1048576 iters, t-(init.)=1.87 s t(norm)=0.891685, mflops=5.60736 (err=1.7e-17) 32. Singleton (f2c): elapsed time t=1.85 s, 1048576 iters, t-(init.)=1.75 s t(norm)=0.834465, mflops=5.99186 (err=1.7e-17) 33. Sorensen: elapsed time t=1.29 s, 2097152 iters, t-(init.)=1.1 s t(norm)=0.26226, mflops=19.065 (err=1.7e-17) 34. Sorensen DIT: elapsed time t=1.37 s, 2097152 iters, t-(init.)=1.21 s t(norm)=0.288486, mflops=17.3318 (err=1.7e-17) 35. Temperton: elapsed time t=1.85 s, 524288 iters, t-(init.)=1.8 s t(norm)=1.71661, mflops=2.91271 (err=1.7e-17) 36. Temperton (f2c): elapsed time t=1.07 s, 262144 iters, t-(init.)=1.05 s t(norm)=2.00272, mflops=2.49661 (err=1.7e-17) 37. Valkenburg: elapsed time t=1.62 s, 1048576 iters, t-(init.)=1.53 s t(norm)=0.729561, mflops=6.85344 (err=1.7e-17) 38. 
SGIMATH: elapsed time t=1.72 s, 2097152 iters, t-(init.)=1.54 s t(norm)=0.367165, mflops=13.6179 (err=1.7e-17) Top mflops for N=2 = 59.9186 Normalized results and averages for N=2: fft 0: mflops = 47.6625 (norm. = 0.795455), norm. avg. (of 1) = 0.795455 fft 1: mflops = 35.2463 (norm. = 0.588235), norm. avg. (of 1) = 0.588235 fft 2: mflops = 29.9593 (norm. = 0.5), norm. avg. (of 1) = 0.5 fft 3: mflops = 1.09227 (norm. = 0.0182292), norm. avg. (of 1) = 0.0182292 fft 4: mflops = 12.9454 (norm. = 0.216049), norm. avg. (of 1) = 0.216049 fft 5: mflops = 10.9227 (norm. = 0.182292), norm. avg. (of 1) = 0.182292 fft 6: mflops = 4.2625 (norm. = 0.0711382), norm. avg. (of 1) = 0.0711382 fft 7: mflops = 5.95782 (norm. = 0.0994318), norm. avg. (of 1) = 0.0994318 fft 8: mflops = 19.2399 (norm. = 0.321101), norm. avg. (of 1) = 0.321101 fft 9: mflops = 3.08405 (norm. = 0.0514706), norm. avg. (of 1) = 0.0514706 fft 10: mflops = 3.08405 (norm. = 0.0514706), norm. avg. (of 1) = 0.0514706 fft 11: mflops = -1 (norm. = -0.0166893), norm. avg. (of 0) = -1 fft 12: mflops = 9.53251 (norm. = 0.159091), norm. avg. (of 1) = 0.159091 fft 13: mflops = 6.16809 (norm. = 0.102941), norm. avg. (of 1) = 0.102941 fft 14: mflops = 23.0456 (norm. = 0.384615), norm. avg. (of 1) = 0.384615 fft 15: mflops = 23.3017 (norm. = 0.388889), norm. avg. (of 1) = 0.388889 fft 16: mflops = 59.9186 (norm. = 1), norm. avg. (of 1) = 1 fft 17: mflops = -1 (norm. = -0.0166893), norm. avg. (of 0) = -1 fft 18: mflops = 5.40503 (norm. = 0.0902062), norm. avg. (of 1) = 0.0902062 fft 19: mflops = 8.59489 (norm. = 0.143443), norm. avg. (of 1) = 0.143443 fft 20: mflops = 10.0825 (norm. = 0.168269), norm. avg. (of 1) = 0.168269 fft 21: mflops = 30.1748 (norm. = 0.503597), norm. avg. (of 1) = 0.503597 fft 22: mflops = -1 (norm. = -0.0166893), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.0166893), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.0166893), norm. avg. (of 0) = -1 fft 25: mflops = -1 (norm. 
= -0.0166893), norm. avg. (of 0) = -1 fft 26: mflops = 5.04123 (norm. = 0.0841346), norm. avg. (of 1) = 0.0841346 fft 27: mflops = 48.771 (norm. = 0.813953), norm. avg. (of 1) = 0.813953 fft 28: mflops = 24.3855 (norm. = 0.406977), norm. avg. (of 1) = 0.406977 fft 29: mflops = -1 (norm. = -0.0166893), norm. avg. (of 0) = -1 fft 30: mflops = -1 (norm. = -0.0166893), norm. avg. (of 0) = -1 fft 31: mflops = 5.60736 (norm. = 0.0935829), norm. avg. (of 1) = 0.0935829 fft 32: mflops = 5.99186 (norm. = 0.1), norm. avg. (of 1) = 0.1 fft 33: mflops = 19.065 (norm. = 0.318182), norm. avg. (of 1) = 0.318182 fft 34: mflops = 17.3318 (norm. = 0.289256), norm. avg. (of 1) = 0.289256 fft 35: mflops = 2.91271 (norm. = 0.0486111), norm. avg. (of 1) = 0.0486111 fft 36: mflops = 2.49661 (norm. = 0.0416667), norm. avg. (of 1) = 0.0416667 fft 37: mflops = 6.85344 (norm. = 0.114379), norm. avg. (of 1) = 0.114379 fft 38: mflops = 13.6179 (norm. = 0.227273), norm. avg. (of 1) = 0.227273 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.46 s, 2097152 iters, t-(init.)=1.32 s t(norm)=0.0786781, mflops=63.5501 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.39 s, 2097152 iters, t-(init.)=1.26 s t(norm)=0.0751019, mflops=66.5763 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.06 s, 1048576 iters, t-(init.)=0.99 s t(norm)=0.118017, mflops=42.3667 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.02 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.963211, mflops=5.19097 (err=1.3e-16) 4. Bailey: elapsed time t=1.85 s, 1048576 iters, t-(init.)=1.79 s t(norm)=0.213385, mflops=23.4319 (err=1.3e-16) 5. Beauregard: elapsed time t=1.34 s, 524288 iters, t-(init.)=1.29 s t(norm)=0.30756, mflops=16.257 (err=6.5e-17) 6. Bergland: elapsed time t=1.59 s, 524288 iters, t-(init.)=1.55 s t(norm)=0.369549, mflops=13.53 (err=5.3e-17) 7. Brenner: elapsed time t=1.62 s, 524288 iters, t-(init.)=1.58 s t(norm)=0.376701, mflops=13.2731 (err=5.3e-17) 8. 
Burrus: elapsed time t=1.49 s, 1048576 iters, t-(init.)=1.42 s t(norm)=0.169277, mflops=29.5374 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.98 s, 524288 iters, t-(init.)=1.95 s t(norm)=0.464916, mflops=10.7546 10. CWP (best N) (N=15): elapsed time t=1.63 s, 262144 iters, t-(init.)=1.57 s t(norm)=0.748634, mflops=6.67883 11. Edelblute: elapsed time t=1.06 s, 1048576 iters, t-(init.)=0.95 s t(norm)=0.113249, mflops=44.1506 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.45 s, 1048576 iters, t-(init.)=1.39 s t(norm)=0.165701, mflops=30.1748 (err=5.3e-17) 13. FFTPACK (f2c): elapsed time t=1.19 s, 524288 iters, t-(init.)=1.14 s t(norm)=0.271797, mflops=18.3961 (err=5.3e-17) FFTW_MEASURE plan: (cost = 5.340576e-07) FFTW_NOTW 4 14. FFTW: elapsed time t=1.21 s, 2097152 iters, t-(init.)=1.08 s t(norm)=0.064373, mflops=77.6723 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 15. FFTW_ESTIMATE: elapsed time t=1.21 s, 2097152 iters, t-(init.)=1.07 s t(norm)=0.063777, mflops=78.3982 (err=5.3e-17) 16. Frigo-old: elapsed time t=1.13 s, 4194304 iters, t-(init.)=0.88 s t(norm)=0.026226, mflops=190.65 (err=5.3e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.14 s, 524288 iters, t-(init.)=1.11 s t(norm)=0.264645, mflops=18.8933 (err=5.3e-17) 19. GSL DIT: elapsed time t=1.35 s, 524288 iters, t-(init.)=1.31 s t(norm)=0.312328, mflops=16.0088 (err=6.5e-17) 20. GSL DIF: elapsed time t=1.29 s, 524288 iters, t-(init.)=1.25 s t(norm)=0.298023, mflops=16.7772 (err=6.5e-17) 21. Krukar: elapsed time t=1.81 s, 4194304 iters, t-(init.)=1.53 s t(norm)=0.0455976, mflops=109.655 (err=5.3e-17) 22. Mayer (Buneman): elapsed time t=1.12 s, 1048576 iters, t-(init.)=1.05 s t(norm)=0.12517, mflops=39.9458 (err=1.3e-16) 23. Mayer (simple): elapsed time t=1.11 s, 1048576 iters, t-(init.)=1.04 s t(norm)=0.123978, mflops=40.3298 24. Mayer (lookup): elapsed time t=1.38 s, 1048576 iters, t-(init.)=1.31 s t(norm)=0.156164, mflops=32.0176 (err=1.3e-16) 25. 
Monro: elapsed time t=1.05 s, 131072 iters, t-(init.)=1.04 s t(norm)=0.991821, mflops=5.04123 (err=1.3e-16) 26. NAPACK (f2c): elapsed time t=1.04 s, 262144 iters, t-(init.)=1.02 s t(norm)=0.486374, mflops=10.2802 (err=1.6e-16) 27. Ooura (C): elapsed time t=1.16 s, 2097152 iters, t-(init.)=1 s t(norm)=0.0596046, mflops=83.8861 (err=5.3e-17) 28. Ooura (F): elapsed time t=1.5 s, 2097152 iters, t-(init.)=1.36 s t(norm)=0.0810623, mflops=61.6809 (err=5.3e-17) 29. Ransom: elapsed time t=1.71 s, 262144 iters, t-(init.)=1.69 s t(norm)=0.805855, mflops=6.20459 (err=2.4e-16) 30. SCIPORT: elapsed time t=1.96 s, 2097152 iters, t-(init.)=1.83 s t(norm)=0.109076, mflops=45.8394 (err=6.5e-17) 31. Singleton: elapsed time t=1.1 s, 524288 iters, t-(init.)=1.06 s t(norm)=0.252724, mflops=19.7845 (err=5.3e-17) 32. Singleton (f2c): elapsed time t=1 s, 524288 iters, t-(init.)=0.96 s t(norm)=0.228882, mflops=21.8453 (err=5.3e-17) 33. Sorensen: elapsed time t=1.24 s, 1048576 iters, t-(init.)=1.17 s t(norm)=0.139475, mflops=35.8488 (err=1.3e-16) 34. Sorensen DIT: elapsed time t=1.45 s, 1048576 iters, t-(init.)=1.39 s t(norm)=0.165701, mflops=30.1748 (err=1.3e-16) 35. Temperton: elapsed time t=1.09 s, 262144 iters, t-(init.)=1.07 s t(norm)=0.510216, mflops=9.79978 (err=5.3e-17) 36. Temperton (f2c): elapsed time t=1.24 s, 262144 iters, t-(init.)=1.22 s t(norm)=0.581741, mflops=8.59489 (err=5.3e-17) 37. Valkenburg: elapsed time t=1.68 s, 262144 iters, t-(init.)=1.66 s t(norm)=0.79155, mflops=6.31672 (err=1.4e-16) 38. SGIMATH: elapsed time t=1.85 s, 2097152 iters, t-(init.)=1.71 s t(norm)=0.101924, mflops=49.0562 (err=5.3e-17) Top mflops for N=4 = 190.65 Normalized results and averages for N=4: fft 0: mflops = 63.5501 (norm. = 0.333333), norm. avg. (of 2) = 0.564394 fft 1: mflops = 66.5763 (norm. = 0.349206), norm. avg. (of 2) = 0.468721 fft 2: mflops = 42.3667 (norm. = 0.222222), norm. avg. (of 2) = 0.361111 fft 3: mflops = 5.19097 (norm. = 0.0272277), norm. avg. 
(of 2) = 0.0227284 fft 4: mflops = 23.4319 (norm. = 0.122905), norm. avg. (of 2) = 0.169477 fft 5: mflops = 16.257 (norm. = 0.0852713), norm. avg. (of 2) = 0.133781 fft 6: mflops = 13.53 (norm. = 0.0709677), norm. avg. (of 2) = 0.071053 fft 7: mflops = 13.2731 (norm. = 0.0696203), norm. avg. (of 2) = 0.084526 fft 8: mflops = 29.5374 (norm. = 0.15493), norm. avg. (of 2) = 0.238015 fft 9: mflops = 10.7546 (norm. = 0.0564103), norm. avg. (of 2) = 0.0539404 fft 10: mflops = 6.67883 (norm. = 0.0350318), norm. avg. (of 2) = 0.0432512 fft 11: mflops = 44.1506 (norm. = 0.231579), norm. avg. (of 1) = 0.231579 fft 12: mflops = 30.1748 (norm. = 0.158273), norm. avg. (of 2) = 0.158682 fft 13: mflops = 18.3961 (norm. = 0.0964912), norm. avg. (of 2) = 0.0997162 fft 14: mflops = 77.6723 (norm. = 0.407407), norm. avg. (of 2) = 0.396011 fft 15: mflops = 78.3982 (norm. = 0.411215), norm. avg. (of 2) = 0.400052 fft 16: mflops = 190.65 (norm. = 1), norm. avg. (of 2) = 1 fft 17: mflops = -1 (norm. = -0.00524521), norm. avg. (of 0) = -1 fft 18: mflops = 18.8933 (norm. = 0.0990991), norm. avg. (of 2) = 0.0946526 fft 19: mflops = 16.0088 (norm. = 0.0839695), norm. avg. (of 2) = 0.113706 fft 20: mflops = 16.7772 (norm. = 0.088), norm. avg. (of 2) = 0.128135 fft 21: mflops = 109.655 (norm. = 0.575163), norm. avg. (of 2) = 0.53938 fft 22: mflops = 39.9458 (norm. = 0.209524), norm. avg. (of 1) = 0.209524 fft 23: mflops = 40.3298 (norm. = 0.211538), norm. avg. (of 1) = 0.211538 fft 24: mflops = 32.0176 (norm. = 0.167939), norm. avg. (of 1) = 0.167939 fft 25: mflops = 5.04123 (norm. = 0.0264423), norm. avg. (of 1) = 0.0264423 fft 26: mflops = 10.2802 (norm. = 0.0539216), norm. avg. (of 2) = 0.0690281 fft 27: mflops = 83.8861 (norm. = 0.44), norm. avg. (of 2) = 0.626977 fft 28: mflops = 61.6809 (norm. = 0.323529), norm. avg. (of 2) = 0.365253 fft 29: mflops = 6.20459 (norm. = 0.0325444), norm. avg. (of 1) = 0.0325444 fft 30: mflops = 45.8394 (norm. = 0.240437), norm. avg. 
(of 1) = 0.240437 fft 31: mflops = 19.7845 (norm. = 0.103774), norm. avg. (of 2) = 0.0986782 fft 32: mflops = 21.8453 (norm. = 0.114583), norm. avg. (of 2) = 0.107292 fft 33: mflops = 35.8488 (norm. = 0.188034), norm. avg. (of 2) = 0.253108 fft 34: mflops = 30.1748 (norm. = 0.158273), norm. avg. (of 2) = 0.223765 fft 35: mflops = 9.79978 (norm. = 0.0514019), norm. avg. (of 2) = 0.0500065 fft 36: mflops = 8.59489 (norm. = 0.045082), norm. avg. (of 2) = 0.0433743 fft 37: mflops = 6.31672 (norm. = 0.0331325), norm. avg. (of 2) = 0.0737558 fft 38: mflops = 49.0562 (norm. = 0.25731), norm. avg. (of 2) = 0.242291 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.21 s, 1048576 iters, t-(init.)=1.1 s t(norm)=0.0437101, mflops=114.39 (err=1.2e-16) 1. Arndt DIT: elapsed time t=1.26 s, 1048576 iters, t-(init.)=1.15 s t(norm)=0.0456969, mflops=109.417 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.34 s, 524288 iters, t-(init.)=1.28 s t(norm)=0.101725, mflops=49.152 (err=1.1e-16) 3. Arndt 4-step: elapsed time t=1.3 s, 65536 iters, t-(init.)=1.29 s t(norm)=0.82016, mflops=6.09637 (err=1.5e-16) 4. Bailey: elapsed time t=1.38 s, 524288 iters, t-(init.)=1.33 s t(norm)=0.105699, mflops=47.3042 (err=1.3e-16) 5. Beauregard: elapsed time t=1.88 s, 262144 iters, t-(init.)=1.85 s t(norm)=0.29405, mflops=17.0039 (err=1.2e-16) 6. Bergland: elapsed time t=1.34 s, 262144 iters, t-(init.)=1.31 s t(norm)=0.208219, mflops=24.0132 (err=1.3e-16) 7. Brenner: elapsed time t=1.55 s, 262144 iters, t-(init.)=1.52 s t(norm)=0.241597, mflops=20.6956 (err=1.2e-16) 8. Burrus: elapsed time t=1.94 s, 524288 iters, t-(init.)=1.88 s t(norm)=0.149409, mflops=33.4652 (err=1.5e-16) 9. CWP (min N): elapsed time t=1.01 s, 262144 iters, t-(init.)=0.98 s t(norm)=0.155767, mflops=32.0993 10. CWP (best N) (N=15): elapsed time t=1.71 s, 262144 iters, t-(init.)=1.65 s t(norm)=0.26226, mflops=19.065 11. 
Edelblute: elapsed time t=1.58 s, 524288 iters, t-(init.)=1.51 s t(norm)=0.120004, mflops=41.6653 (err=1.5e-16) 12. FFTPACK: elapsed time t=1.41 s, 524288 iters, t-(init.)=1.35 s t(norm)=0.107288, mflops=46.6034 (err=1.2e-16) 13. FFTPACK (f2c): elapsed time t=1.31 s, 262144 iters, t-(init.)=1.28 s t(norm)=0.203451, mflops=24.576 (err=1.2e-16) FFTW_MEASURE plan: (cost = 8.773804e-07) FFTW_NOTW 8 14. FFTW: elapsed time t=1.78 s, 2097152 iters, t-(init.)=1.56 s t(norm)=0.0309944, mflops=161.319 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.78 s, 2097152 iters, t-(init.)=1.56 s t(norm)=0.0309944, mflops=161.319 (err=1.2e-16) 16. Frigo-old: elapsed time t=1.19 s, 2097152 iters, t-(init.)=0.96 s t(norm)=0.0190735, mflops=262.144 (err=1.4e-16) 17. Green: elapsed time t=1.34 s, 1048576 iters, t-(init.)=1.19 s t(norm)=0.0472864, mflops=105.739 (err=1.4e-16) 18. GSL: elapsed time t=1.66 s, 524288 iters, t-(init.)=1.6 s t(norm)=0.127157, mflops=39.3216 (err=1.2e-16) 19. GSL DIT: elapsed time t=1.3 s, 262144 iters, t-(init.)=1.27 s t(norm)=0.201861, mflops=24.7695 (err=1.2e-16) 20. GSL DIF: elapsed time t=1.23 s, 262144 iters, t-(init.)=1.2 s t(norm)=0.190735, mflops=26.2144 (err=1.4e-16) 21. Krukar: elapsed time t=1.8 s, 2097152 iters, t-(init.)=1.58 s t(norm)=0.0313918, mflops=159.277 (err=1.2e-16) 22. Mayer (Buneman): elapsed time t=1.15 s, 524288 iters, t-(init.)=1.09 s t(norm)=0.0866254, mflops=57.7198 (err=1.5e-16) 23. Mayer (simple): elapsed time t=1.13 s, 524288 iters, t-(init.)=1.07 s t(norm)=0.085036, mflops=58.7987 24. Mayer (lookup): elapsed time t=1.3 s, 524288 iters, t-(init.)=1.25 s t(norm)=0.0993411, mflops=50.3316 (err=1.5e-16) 25. Monro: elapsed time t=1.41 s, 131072 iters, t-(init.)=1.4 s t(norm)=0.445048, mflops=11.2347 (err=1.1e-08) 26. NAPACK (f2c): elapsed time t=1.02 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.32107, mflops=15.5729 (err=1.7e-16) 27. 
Ooura (C): elapsed time t=1.09 s, 1048576 iters, t-(init.)=0.95 s t(norm)=0.0377496, mflops=132.452 (err=1.3e-16) 28. Ooura (F): elapsed time t=1.53 s, 1048576 iters, t-(init.)=1.42 s t(norm)=0.0564257, mflops=88.6121 (err=1.3e-16) 29. Ransom: elapsed time t=1.36 s, 65536 iters, t-(init.)=1.35 s t(norm)=0.858307, mflops=5.82542 (err=3.9e-16) 30. SCIPORT: elapsed time t=1.05 s, 524288 iters, t-(init.)=0.99 s t(norm)=0.0786781, mflops=63.5501 (err=1.2e-16) 31. Singleton: elapsed time t=1.52 s, 262144 iters, t-(init.)=1.49 s t(norm)=0.236829, mflops=21.1123 (err=1.4e-16) 32. Singleton (f2c): elapsed time t=1.5 s, 262144 iters, t-(init.)=1.47 s t(norm)=0.23365, mflops=21.3995 (err=1.4e-16) 33. Sorensen: elapsed time t=1.35 s, 524288 iters, t-(init.)=1.3 s t(norm)=0.103315, mflops=48.3958 (err=1.2e-16) 34. Sorensen DIT: elapsed time t=1.89 s, 524288 iters, t-(init.)=1.83 s t(norm)=0.145435, mflops=34.3795 (err=1.2e-16) 35. Temperton: elapsed time t=1.61 s, 262144 iters, t-(init.)=1.59 s t(norm)=0.252724, mflops=19.7845 (err=4.6e-09) 36. Temperton (f2c): elapsed time t=1.05 s, 131072 iters, t-(init.)=1.04 s t(norm)=0.330607, mflops=15.1237 (err=1.4e-16) 37. Valkenburg: elapsed time t=1.24 s, 65536 iters, t-(init.)=1.23 s t(norm)=0.782013, mflops=6.39376 (err=1.4e-16) 38. SGIMATH: elapsed time t=1.22 s, 1048576 iters, t-(init.)=1.1 s t(norm)=0.0437101, mflops=114.39 (err=1.4e-16) Top mflops for N=8 = 262.144 Normalized results and averages for N=8: fft 0: mflops = 114.39 (norm. = 0.436364), norm. avg. (of 3) = 0.521717 fft 1: mflops = 109.417 (norm. = 0.417391), norm. avg. (of 3) = 0.451611 fft 2: mflops = 49.152 (norm. = 0.1875), norm. avg. (of 3) = 0.303241 fft 3: mflops = 6.09637 (norm. = 0.0232558), norm. avg. (of 3) = 0.0229042 fft 4: mflops = 47.3042 (norm. = 0.180451), norm. avg. (of 3) = 0.173135 fft 5: mflops = 17.0039 (norm. = 0.0648649), norm. avg. (of 3) = 0.110809 fft 6: mflops = 24.0132 (norm. = 0.0916031), norm. avg. 
(of 3) = 0.077903 fft 7: mflops = 20.6956 (norm. = 0.0789474), norm. avg. (of 3) = 0.0826665 fft 8: mflops = 33.4652 (norm. = 0.12766), norm. avg. (of 3) = 0.20123 fft 9: mflops = 32.0993 (norm. = 0.122449), norm. avg. (of 3) = 0.0767766 fft 10: mflops = 19.065 (norm. = 0.0727273), norm. avg. (of 3) = 0.0530766 fft 11: mflops = 41.6653 (norm. = 0.15894), norm. avg. (of 2) = 0.19526 fft 12: mflops = 46.6034 (norm. = 0.177778), norm. avg. (of 3) = 0.165047 fft 13: mflops = 24.576 (norm. = 0.09375), norm. avg. (of 3) = 0.0977275 fft 14: mflops = 161.319 (norm. = 0.615385), norm. avg. (of 3) = 0.469136 fft 15: mflops = 161.319 (norm. = 0.615385), norm. avg. (of 3) = 0.471829 fft 16: mflops = 262.144 (norm. = 1), norm. avg. (of 3) = 1 fft 17: mflops = 105.739 (norm. = 0.403361), norm. avg. (of 1) = 0.403361 fft 18: mflops = 39.3216 (norm. = 0.15), norm. avg. (of 3) = 0.113102 fft 19: mflops = 24.7695 (norm. = 0.0944882), norm. avg. (of 3) = 0.1073 fft 20: mflops = 26.2144 (norm. = 0.1), norm. avg. (of 3) = 0.118756 fft 21: mflops = 159.277 (norm. = 0.607595), norm. avg. (of 3) = 0.562118 fft 22: mflops = 57.7198 (norm. = 0.220183), norm. avg. (of 2) = 0.214854 fft 23: mflops = 58.7987 (norm. = 0.224299), norm. avg. (of 2) = 0.217919 fft 24: mflops = 50.3316 (norm. = 0.192), norm. avg. (of 2) = 0.179969 fft 25: mflops = 11.2347 (norm. = 0.0428571), norm. avg. (of 2) = 0.0346497 fft 26: mflops = 15.5729 (norm. = 0.0594059), norm. avg. (of 3) = 0.0658207 fft 27: mflops = 132.452 (norm. = 0.505263), norm. avg. (of 3) = 0.586406 fft 28: mflops = 88.6121 (norm. = 0.338028), norm. avg. (of 3) = 0.356178 fft 29: mflops = 5.82542 (norm. = 0.0222222), norm. avg. (of 2) = 0.0273833 fft 30: mflops = 63.5501 (norm. = 0.242424), norm. avg. (of 2) = 0.241431 fft 31: mflops = 21.1123 (norm. = 0.0805369), norm. avg. (of 3) = 0.0926311 fft 32: mflops = 21.3995 (norm. = 0.0816327), norm. avg. (of 3) = 0.0987387 fft 33: mflops = 48.3958 (norm. = 0.184615), norm. avg. 
(of 3) = 0.230277 fft 34: mflops = 34.3795 (norm. = 0.131148), norm. avg. (of 3) = 0.192892 fft 35: mflops = 19.7845 (norm. = 0.0754717), norm. avg. (of 3) = 0.0584949 fft 36: mflops = 15.1237 (norm. = 0.0576923), norm. avg. (of 3) = 0.048147 fft 37: mflops = 6.39376 (norm. = 0.0243902), norm. avg. (of 3) = 0.0573006 fft 38: mflops = 114.39 (norm. = 0.436364), norm. avg. (of 3) = 0.306982 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.23 s, 262144 iters, t-(init.)=1.18 s t(norm)=0.0703335, mflops=71.0899 (err=1.5e-16) 1. Arndt DIT: elapsed time t=1.2 s, 262144 iters, t-(init.)=1.14 s t(norm)=0.0679493, mflops=73.5843 (err=2.2e-16) 2. Arndt Split-Radix: elapsed time t=1.65 s, 262144 iters, t-(init.)=1.59 s t(norm)=0.0947714, mflops=52.7585 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.54 s, 65536 iters, t-(init.)=1.53 s t(norm)=0.36478, mflops=13.7069 (err=1.6e-16) 4. Bailey: elapsed time t=1.4 s, 262144 iters, t-(init.)=1.35 s t(norm)=0.0804663, mflops=62.1378 (err=1.6e-16) 5. Beauregard: elapsed time t=1.94 s, 131072 iters, t-(init.)=1.91 s t(norm)=0.22769, mflops=21.9597 (err=2.3e-16) 6. Bergland: elapsed time t=1.19 s, 131072 iters, t-(init.)=1.16 s t(norm)=0.138283, mflops=36.1578 (err=2.6e-16) 7. Brenner: elapsed time t=1.5 s, 131072 iters, t-(init.)=1.48 s t(norm)=0.17643, mflops=28.3399 (err=2.1e-16) 8. Burrus: elapsed time t=1.14 s, 131072 iters, t-(init.)=1.11 s t(norm)=0.132322, mflops=37.7865 (err=1.4e-16) 9. CWP (min N): elapsed time t=1.4 s, 262144 iters, t-(init.)=1.35 s t(norm)=0.0804663, mflops=62.1378 10. CWP (best N) (N=28): elapsed time t=1.09 s, 131072 iters, t-(init.)=1.04 s t(norm)=0.123978, mflops=40.3298 11. Edelblute: elapsed time t=1.04 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.120401, mflops=41.5278 (err=1.4e-16) 12. FFTPACK: elapsed time t=1.04 s, 262144 iters, t-(init.)=0.99 s t(norm)=0.0590086, mflops=84.7334 (err=1.8e-16) 13. 
FFTPACK (f2c): elapsed time t=1.11 s, 131072 iters, t-(init.)=1.08 s t(norm)=0.128746, mflops=38.8361 (err=1.8e-16) FFTW_MEASURE plan: (cost = 1.983643e-06) FFTW_NOTW 16 14. FFTW: elapsed time t=1.76 s, 1048576 iters, t-(init.)=1.55 s t(norm)=0.0230968, mflops=216.48 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.77 s, 1048576 iters, t-(init.)=1.56 s t(norm)=0.0232458, mflops=215.093 (err=1.7e-16) 16. Frigo-old: elapsed time t=1.29 s, 1048576 iters, t-(init.)=1.08 s t(norm)=0.0160933, mflops=310.689 (err=1.8e-16) 17. Green: elapsed time t=1.07 s, 262144 iters, t-(init.)=1.01 s t(norm)=0.0602007, mflops=83.0555 (err=1.9e-16) 18. GSL: elapsed time t=1.25 s, 262144 iters, t-(init.)=1.2 s t(norm)=0.0715256, mflops=69.9051 (err=1.8e-16) 19. GSL DIT: elapsed time t=1.17 s, 131072 iters, t-(init.)=1.14 s t(norm)=0.135899, mflops=36.7921 (err=2.1e-16) 20. GSL DIF: elapsed time t=1.16 s, 131072 iters, t-(init.)=1.13 s t(norm)=0.134706, mflops=37.1177 (err=2.8e-16) 21. Krukar: elapsed time t=1.03 s, 524288 iters, t-(init.)=0.93 s t(norm)=0.0277162, mflops=180.4 (err=2.0e-16) 22. Mayer (Buneman): elapsed time t=1.62 s, 262144 iters, t-(init.)=1.56 s t(norm)=0.0929832, mflops=53.7731 (err=1.7e-16) 23. Mayer (simple): elapsed time t=1.36 s, 262144 iters, t-(init.)=1.3 s t(norm)=0.077486, mflops=64.5278 24. Mayer (lookup): elapsed time t=1.59 s, 262144 iters, t-(init.)=1.54 s t(norm)=0.0917912, mflops=54.4715 (err=1.9e-16) 25. Monro: elapsed time t=1.12 s, 65536 iters, t-(init.)=1.11 s t(norm)=0.264645, mflops=18.8933 (err=2.1e-08) 26. NAPACK (f2c): elapsed time t=1.71 s, 131072 iters, t-(init.)=1.68 s t(norm)=0.200272, mflops=24.9661 (err=3.3e-16) 27. Ooura (C): elapsed time t=1.37 s, 524288 iters, t-(init.)=1.25 s t(norm)=0.0372529, mflops=134.218 (err=2.0e-16) 28. Ooura (F): elapsed time t=1.5 s, 524288 iters, t-(init.)=1.39 s t(norm)=0.0414252, mflops=120.699 (err=2.0e-16) 29. 
Ransom: elapsed time t=1.99 s, 131072 iters, t-(init.)=1.96 s t(norm)=0.23365, mflops=21.3995 (err=4.2e-16) 30. SCIPORT: elapsed time t=1.07 s, 262144 iters, t-(init.)=1.02 s t(norm)=0.0607967, mflops=82.2413 (err=2.6e-16) 31. Singleton: elapsed time t=1.61 s, 262144 iters, t-(init.)=1.56 s t(norm)=0.0929832, mflops=53.7731 (err=1.7e-16) 32. Singleton (f2c): elapsed time t=1.57 s, 262144 iters, t-(init.)=1.52 s t(norm)=0.0905991, mflops=55.1882 (err=1.7e-16) 33. Sorensen: elapsed time t=1.38 s, 262144 iters, t-(init.)=1.33 s t(norm)=0.0792742, mflops=63.0722 (err=1.7e-16) 34. Sorensen DIT: elapsed time t=1.09 s, 131072 iters, t-(init.)=1.06 s t(norm)=0.126362, mflops=39.5689 (err=1.9e-16) 35. Temperton: elapsed time t=1.53 s, 131072 iters, t-(init.)=1.5 s t(norm)=0.178814, mflops=27.962 (err=1.7e-08) 36. Temperton (f2c): elapsed time t=1.55 s, 131072 iters, t-(init.)=1.53 s t(norm)=0.18239, mflops=27.4138 (err=1.8e-16) 37. Valkenburg: elapsed time t=1.59 s, 32768 iters, t-(init.)=1.58 s t(norm)=0.753403, mflops=6.63656 (err=2.9e-16) 38. SGIMATH: elapsed time t=1.23 s, 524288 iters, t-(init.)=1.12 s t(norm)=0.0333786, mflops=149.797 (err=1.8e-16) Top mflops for N=16 = 310.689 Normalized results and averages for N=16: fft 0: mflops = 71.0899 (norm. = 0.228814), norm. avg. (of 4) = 0.448491 fft 1: mflops = 73.5843 (norm. = 0.236842), norm. avg. (of 4) = 0.397919 fft 2: mflops = 52.7585 (norm. = 0.169811), norm. avg. (of 4) = 0.269883 fft 3: mflops = 13.7069 (norm. = 0.0441176), norm. avg. (of 4) = 0.0282076 fft 4: mflops = 62.1378 (norm. = 0.2), norm. avg. (of 4) = 0.179851 fft 5: mflops = 21.9597 (norm. = 0.0706806), norm. avg. (of 4) = 0.100777 fft 6: mflops = 36.1578 (norm. = 0.116379), norm. avg. (of 4) = 0.0875221 fft 7: mflops = 28.3399 (norm. = 0.0912162), norm. avg. (of 4) = 0.0848039 fft 8: mflops = 37.7865 (norm. = 0.121622), norm. avg. (of 4) = 0.181328 fft 9: mflops = 62.1378 (norm. = 0.2), norm. avg. (of 4) = 0.107582 fft 10: mflops = 40.3298 (norm. 
= 0.129808), norm. avg. (of 4) = 0.0722594 fft 11: mflops = 41.5278 (norm. = 0.133663), norm. avg. (of 3) = 0.174728 fft 12: mflops = 84.7334 (norm. = 0.272727), norm. avg. (of 4) = 0.191967 fft 13: mflops = 38.8361 (norm. = 0.125), norm. avg. (of 4) = 0.104546 fft 14: mflops = 216.48 (norm. = 0.696774), norm. avg. (of 4) = 0.526045 fft 15: mflops = 215.093 (norm. = 0.692308), norm. avg. (of 4) = 0.526949 fft 16: mflops = 310.689 (norm. = 1), norm. avg. (of 4) = 1 fft 17: mflops = 83.0555 (norm. = 0.267327), norm. avg. (of 2) = 0.335344 fft 18: mflops = 69.9051 (norm. = 0.225), norm. avg. (of 4) = 0.141076 fft 19: mflops = 36.7921 (norm. = 0.118421), norm. avg. (of 4) = 0.11008 fft 20: mflops = 37.1177 (norm. = 0.119469), norm. avg. (of 4) = 0.118935 fft 21: mflops = 180.4 (norm. = 0.580645), norm. avg. (of 4) = 0.56675 fft 22: mflops = 53.7731 (norm. = 0.173077), norm. avg. (of 3) = 0.200928 fft 23: mflops = 64.5278 (norm. = 0.207692), norm. avg. (of 3) = 0.21451 fft 24: mflops = 54.4715 (norm. = 0.175325), norm. avg. (of 3) = 0.178421 fft 25: mflops = 18.8933 (norm. = 0.0608108), norm. avg. (of 3) = 0.0433701 fft 26: mflops = 24.9661 (norm. = 0.0803571), norm. avg. (of 4) = 0.0694548 fft 27: mflops = 134.218 (norm. = 0.432), norm. avg. (of 4) = 0.547804 fft 28: mflops = 120.699 (norm. = 0.388489), norm. avg. (of 4) = 0.364256 fft 29: mflops = 21.3995 (norm. = 0.0688776), norm. avg. (of 3) = 0.0412147 fft 30: mflops = 82.2413 (norm. = 0.264706), norm. avg. (of 3) = 0.249189 fft 31: mflops = 53.7731 (norm. = 0.173077), norm. avg. (of 4) = 0.112743 fft 32: mflops = 55.1882 (norm. = 0.177632), norm. avg. (of 4) = 0.118462 fft 33: mflops = 63.0722 (norm. = 0.203008), norm. avg. (of 4) = 0.22346 fft 34: mflops = 39.5689 (norm. = 0.127358), norm. avg. (of 4) = 0.176509 fft 35: mflops = 27.962 (norm. = 0.09), norm. avg. (of 4) = 0.0663712 fft 36: mflops = 27.4138 (norm. = 0.0882353), norm. avg. (of 4) = 0.0581691 fft 37: mflops = 6.63656 (norm. = 0.0213608), norm. avg. 
(of 4) = 0.0483157 fft 38: mflops = 149.797 (norm. = 0.482143), norm. avg. (of 4) = 0.350772 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.22 s, 131072 iters, t-(init.)=1.17 s t(norm)=0.0557899, mflops=89.6219 (err=2.9e-16) 1. Arndt DIT: elapsed time t=1.21 s, 131072 iters, t-(init.)=1.16 s t(norm)=0.0553131, mflops=90.3945 (err=2.7e-16) 2. Arndt Split-Radix: elapsed time t=1.81 s, 131072 iters, t-(init.)=1.76 s t(norm)=0.0839233, mflops=59.5782 (err=3.4e-16) 3. Arndt 4-step: elapsed time t=1.78 s, 32768 iters, t-(init.)=1.77 s t(norm)=0.337601, mflops=14.8104 (err=2.7e-16) 4. Bailey: elapsed time t=1.17 s, 131072 iters, t-(init.)=1.12 s t(norm)=0.0534058, mflops=93.6229 (err=2.6e-16) 5. Beauregard: elapsed time t=1.07 s, 32768 iters, t-(init.)=1.05 s t(norm)=0.200272, mflops=24.9661 (err=2.3e-16) 6. Bergland: elapsed time t=1.1 s, 65536 iters, t-(init.)=1.08 s t(norm)=0.102997, mflops=48.5452 (err=3.0e-16) 7. Brenner: elapsed time t=1.45 s, 65536 iters, t-(init.)=1.43 s t(norm)=0.136375, mflops=36.6635 (err=2.1e-16) 8. Burrus: elapsed time t=1.25 s, 65536 iters, t-(init.)=1.22 s t(norm)=0.116348, mflops=42.9744 (err=2.7e-16) 9. CWP (min N) (N=33): elapsed time t=1.33 s, 131072 iters, t-(init.)=1.28 s t(norm)=0.0610352, mflops=81.92 10. CWP (best N) (N=35): elapsed time t=1.31 s, 131072 iters, t-(init.)=1.26 s t(norm)=0.0600815, mflops=83.2203 11. Edelblute: elapsed time t=1.16 s, 65536 iters, t-(init.)=1.14 s t(norm)=0.108719, mflops=45.9902 (err=2.7e-16) 12. FFTPACK: elapsed time t=1.28 s, 131072 iters, t-(init.)=1.23 s t(norm)=0.058651, mflops=85.2501 (err=2.1e-16) 13. FFTPACK (f2c): elapsed time t=1.84 s, 65536 iters, t-(init.)=1.82 s t(norm)=0.173569, mflops=28.807 (err=2.1e-16) FFTW_MEASURE plan: (cost = 4.577637e-06) FFTW_NOTW 32 14. FFTW: elapsed time t=1.2 s, 262144 iters, t-(init.)=1.11 s t(norm)=0.0264645, mflops=188.933 (err=2.2e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.19 s, 262144 iters, t-(init.)=1.09 s t(norm)=0.0259876, mflops=192.399 (err=2.2e-16)
16. Frigo-old: elapsed time t=1.73 s, 524288 iters, t-(init.)=1.53 s t(norm)=0.018239, mflops=274.138 (err=2.2e-16)
17. Green: elapsed time t=1.88 s, 262144 iters, t-(init.)=1.77 s t(norm)=0.0422001, mflops=118.483 (err=2.1e-16)
18. GSL: elapsed time t=1.52 s, 131072 iters, t-(init.)=1.47 s t(norm)=0.0700951, mflops=71.3317 (err=2.0e-16)
19. GSL DIT: elapsed time t=1.16 s, 65536 iters, t-(init.)=1.14 s t(norm)=0.108719, mflops=45.9902 (err=2.2e-16)
20. GSL DIF: elapsed time t=1.15 s, 65536 iters, t-(init.)=1.12 s t(norm)=0.106812, mflops=46.8114 (err=2.5e-16)
21. Krukar: elapsed time t=1.21 s, 262144 iters, t-(init.)=1.11 s t(norm)=0.0264645, mflops=188.933 (err=2.2e-16)
22. Mayer (Buneman): elapsed time t=1.73 s, 131072 iters, t-(init.)=1.68 s t(norm)=0.0801086, mflops=62.4152 (err=2.8e-16)
23. Mayer (simple): elapsed time t=1.41 s, 131072 iters, t-(init.)=1.36 s t(norm)=0.0648499, mflops=77.1012
24. Mayer (lookup): elapsed time t=1.53 s, 131072 iters, t-(init.)=1.48 s t(norm)=0.0705719, mflops=70.8497 (err=2.9e-16)
25. Monro: elapsed time t=1.89 s, 65536 iters, t-(init.)=1.87 s t(norm)=0.178337, mflops=28.0368 (err=3.7e-08)
26. NAPACK (f2c): elapsed time t=1.7 s, 65536 iters, t-(init.)=1.68 s t(norm)=0.160217, mflops=31.2076 (err=1.3e-15)
27. Ooura (C): elapsed time t=1.5 s, 262144 iters, t-(init.)=1.39 s t(norm)=0.0331402, mflops=150.874 (err=2.6e-16)
28. Ooura (F): elapsed time t=1.65 s, 262144 iters, t-(init.)=1.55 s t(norm)=0.0369549, mflops=135.3 (err=2.6e-16)
29. Ransom: elapsed time t=1.44 s, 32768 iters, t-(init.)=1.42 s t(norm)=0.270844, mflops=18.4608 (err=7.5e-16)
30. SCIPORT: elapsed time t=1.1 s, 131072 iters, t-(init.)=1.05 s t(norm)=0.0500679, mflops=99.8644 (err=1.7e-16)
31. Singleton: elapsed time t=1.53 s, 131072 iters, t-(init.)=1.48 s t(norm)=0.0705719, mflops=70.8497 (err=2.2e-16)
32.
Singleton (f2c): elapsed time t=1.49 s, 131072 iters, t-(init.)=1.44 s t(norm)=0.0686646, mflops=72.8178 (err=2.2e-16)
33. Sorensen: elapsed time t=1.29 s, 131072 iters, t-(init.)=1.24 s t(norm)=0.0591278, mflops=84.5626 (err=2.7e-16)
34. Sorensen DIT: elapsed time t=1.21 s, 65536 iters, t-(init.)=1.18 s t(norm)=0.112534, mflops=44.4312 (err=2.4e-16)
35. Temperton: elapsed time t=1.61 s, 65536 iters, t-(init.)=1.59 s t(norm)=0.151634, mflops=32.9741 (err=3.1e-08)
36. Temperton (f2c): elapsed time t=1.61 s, 65536 iters, t-(init.)=1.59 s t(norm)=0.151634, mflops=32.9741 (err=1.7e-16)
37. Valkenburg: elapsed time t=1.93 s, 16384 iters, t-(init.)=1.92 s t(norm)=0.732422, mflops=6.82667 (err=4.2e-16)
38. SGIMATH: elapsed time t=1.06 s, 262144 iters, t-(init.)=0.97 s t(norm)=0.0231266, mflops=216.201 (err=1.8e-16)
Top mflops for N=32 = 274.138

Normalized results and averages for N=32:
fft 0: mflops = 89.6219 (norm. = 0.326923), norm. avg. (of 5) = 0.424178
fft 1: mflops = 90.3945 (norm. = 0.329741), norm. avg. (of 5) = 0.384283
fft 2: mflops = 59.5782 (norm. = 0.21733), norm. avg. (of 5) = 0.259373
fft 3: mflops = 14.8104 (norm. = 0.0540254), norm. avg. (of 5) = 0.0333712
fft 4: mflops = 93.6229 (norm. = 0.341518), norm. avg. (of 5) = 0.212185
fft 5: mflops = 24.9661 (norm. = 0.0910714), norm. avg. (of 5) = 0.098836
fft 6: mflops = 48.5452 (norm. = 0.177083), norm. avg. (of 5) = 0.105434
fft 7: mflops = 36.6635 (norm. = 0.133741), norm. avg. (of 5) = 0.0945914
fft 8: mflops = 42.9744 (norm. = 0.156762), norm. avg. (of 5) = 0.176415
fft 9: mflops = 81.92 (norm. = 0.298828), norm. avg. (of 5) = 0.145832
fft 10: mflops = 83.2203 (norm. = 0.303571), norm. avg. (of 5) = 0.118522
fft 11: mflops = 45.9902 (norm. = 0.167763), norm. avg. (of 4) = 0.172986
fft 12: mflops = 85.2501 (norm. = 0.310976), norm. avg. (of 5) = 0.215769
fft 13: mflops = 28.807 (norm. = 0.105082), norm. avg. (of 5) = 0.104653
fft 14: mflops = 188.933 (norm. = 0.689189), norm. avg.
(of 5) = 0.558674
fft 15: mflops = 192.399 (norm. = 0.701835), norm. avg. (of 5) = 0.561926
fft 16: mflops = 274.138 (norm. = 1), norm. avg. (of 5) = 1
fft 17: mflops = 118.483 (norm. = 0.432203), norm. avg. (of 3) = 0.36763
fft 18: mflops = 71.3317 (norm. = 0.260204), norm. avg. (of 5) = 0.164902
fft 19: mflops = 45.9902 (norm. = 0.167763), norm. avg. (of 5) = 0.121617
fft 20: mflops = 46.8114 (norm. = 0.170759), norm. avg. (of 5) = 0.129299
fft 21: mflops = 188.933 (norm. = 0.689189), norm. avg. (of 5) = 0.591238
fft 22: mflops = 62.4152 (norm. = 0.227679), norm. avg. (of 4) = 0.207616
fft 23: mflops = 77.1012 (norm. = 0.28125), norm. avg. (of 4) = 0.231195
fft 24: mflops = 70.8497 (norm. = 0.258446), norm. avg. (of 4) = 0.198427
fft 25: mflops = 28.0368 (norm. = 0.102273), norm. avg. (of 4) = 0.0580957
fft 26: mflops = 31.2076 (norm. = 0.113839), norm. avg. (of 5) = 0.0783317
fft 27: mflops = 150.874 (norm. = 0.55036), norm. avg. (of 5) = 0.548315
fft 28: mflops = 135.3 (norm. = 0.493548), norm. avg. (of 5) = 0.390114
fft 29: mflops = 18.4608 (norm. = 0.0673415), norm. avg. (of 4) = 0.0477464
fft 30: mflops = 99.8644 (norm. = 0.364286), norm. avg. (of 4) = 0.277963
fft 31: mflops = 70.8497 (norm. = 0.258446), norm. avg. (of 5) = 0.141883
fft 32: mflops = 72.8178 (norm. = 0.265625), norm. avg. (of 5) = 0.147895
fft 33: mflops = 84.5626 (norm. = 0.308468), norm. avg. (of 5) = 0.240461
fft 34: mflops = 44.4312 (norm. = 0.162076), norm. avg. (of 5) = 0.173622
fft 35: mflops = 32.9741 (norm. = 0.120283), norm. avg. (of 5) = 0.0771535
fft 36: mflops = 32.9741 (norm. = 0.120283), norm. avg. (of 5) = 0.0705919
fft 37: mflops = 6.82667 (norm. = 0.0249023), norm. avg. (of 5) = 0.043633
fft 38: mflops = 216.201 (norm. = 0.78866), norm. avg. (of 5) = 0.43835

Benchmarking for array size = 64 (power of 2):
0. Arndt DIF: elapsed time t=1.5 s, 65536 iters, t-(init.)=1.45 s t(norm)=0.0576178, mflops=86.7787 (err=5.8e-16)
1.
Arndt DIT: elapsed time t=1.5 s, 65536 iters, t-(init.)=1.46 s t(norm)=0.0580152, mflops=86.1843 (err=5.7e-16)
2. Arndt Split-Radix: elapsed time t=1.95 s, 65536 iters, t-(init.)=1.91 s t(norm)=0.0758966, mflops=65.8791 (err=5.8e-16)
3. Arndt 4-step: elapsed time t=1.11 s, 16384 iters, t-(init.)=1.09 s t(norm)=0.173251, mflops=28.8599 (err=5.3e-16)
4. Bailey: elapsed time t=1.39 s, 65536 iters, t-(init.)=1.34 s t(norm)=0.0532468, mflops=93.9023 (err=5.6e-16)
5. Beauregard: elapsed time t=1.23 s, 16384 iters, t-(init.)=1.22 s t(norm)=0.193914, mflops=25.7847 (err=6.0e-16)
6. Bergland: elapsed time t=1.11 s, 32768 iters, t-(init.)=1.08 s t(norm)=0.0858307, mflops=58.2542 (err=6.3e-16)
7. Brenner: elapsed time t=1.51 s, 32768 iters, t-(init.)=1.48 s t(norm)=0.11762, mflops=42.5098 (err=5.8e-16)
8. Burrus: elapsed time t=1.33 s, 32768 iters, t-(init.)=1.31 s t(norm)=0.104109, mflops=48.0264 (err=5.7e-16)
9. CWP (min N) (N=65): elapsed time t=1.2 s, 65536 iters, t-(init.)=1.15 s t(norm)=0.0456969, mflops=109.417
10. CWP (best N) (N=84): elapsed time t=1.17 s, 65536 iters, t-(init.)=1.1 s t(norm)=0.0437101, mflops=114.39
11. Edelblute: elapsed time t=1.26 s, 32768 iters, t-(init.)=1.24 s t(norm)=0.0985463, mflops=50.7375 (err=5.7e-16)
12. FFTPACK: elapsed time t=1.02 s, 65536 iters, t-(init.)=0.97 s t(norm)=0.0385443, mflops=129.721 (err=5.7e-16)
13. FFTPACK (f2c): elapsed time t=1.78 s, 32768 iters, t-(init.)=1.76 s t(norm)=0.139872, mflops=35.7469 (err=5.7e-16)
FFTW_MEASURE plan: (cost = 1.037598e-05) FFTW_TWIDDLE 8 FFTW_NOTW 8
14. FFTW: elapsed time t=1.39 s, 131072 iters, t-(init.)=1.29 s t(norm)=0.02563, mflops=195.084 (err=5.6e-16)
FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=1.48 s, 131072 iters, t-(init.)=1.38 s t(norm)=0.0274181, mflops=182.361 (err=5.3e-16)
16. Frigo-old: elapsed time t=1.54 s, 131072 iters, t-(init.)=1.44 s t(norm)=0.0286102, mflops=174.763 (err=5.5e-16)
17.
Green: elapsed time t=1.61 s, 131072 iters, t-(init.)=1.51 s t(norm)=0.030001, mflops=166.661 (err=5.7e-16)
18. GSL: elapsed time t=1.33 s, 65536 iters, t-(init.)=1.28 s t(norm)=0.0508626, mflops=98.304 (err=5.7e-16)
19. GSL DIT: elapsed time t=1.26 s, 32768 iters, t-(init.)=1.24 s t(norm)=0.0985463, mflops=50.7375 (err=5.6e-16)
20. GSL DIF: elapsed time t=1.2 s, 32768 iters, t-(init.)=1.17 s t(norm)=0.0929832, mflops=53.7731 (err=5.4e-16)
21. Krukar: elapsed time t=1.45 s, 131072 iters, t-(init.)=1.35 s t(norm)=0.0268221, mflops=186.414 (err=5.9e-16)
22. Mayer (Buneman): elapsed time t=1.02 s, 32768 iters, t-(init.)=1 s t(norm)=0.0794729, mflops=62.9146 (err=5.3e-16)
23. Mayer (simple): elapsed time t=1.6 s, 65536 iters, t-(init.)=1.56 s t(norm)=0.0619888, mflops=80.6597
24. Mayer (lookup): elapsed time t=1.72 s, 65536 iters, t-(init.)=1.67 s t(norm)=0.0663598, mflops=75.3468 (err=5.2e-16)
25. Monro: elapsed time t=1.78 s, 32768 iters, t-(init.)=1.76 s t(norm)=0.139872, mflops=35.7469 (err=3.4e-08)
26. NAPACK (f2c): elapsed time t=1.61 s, 32768 iters, t-(init.)=1.59 s t(norm)=0.126362, mflops=39.5689 (err=1.8e-15)
27. Ooura (C): elapsed time t=1.68 s, 131072 iters, t-(init.)=1.59 s t(norm)=0.0315905, mflops=158.276 (err=5.7e-16)
28. Ooura (F): elapsed time t=1.65 s, 131072 iters, t-(init.)=1.55 s t(norm)=0.0307957, mflops=162.36 (err=5.7e-16)
29. Ransom: elapsed time t=1.72 s, 32768 iters, t-(init.)=1.7 s t(norm)=0.135104, mflops=37.0086 (err=9.0e-16)
30. SCIPORT: elapsed time t=1.15 s, 65536 iters, t-(init.)=1.1 s t(norm)=0.0437101, mflops=114.39 (err=5.6e-16)
31. Singleton: elapsed time t=1.34 s, 65536 iters, t-(init.)=1.29 s t(norm)=0.05126, mflops=97.542 (err=9.2e-16)
32. Singleton (f2c): elapsed time t=1.31 s, 65536 iters, t-(init.)=1.26 s t(norm)=0.0500679, mflops=99.8644 (err=9.2e-16)
33. Sorensen: elapsed time t=1.26 s, 65536 iters, t-(init.)=1.21 s t(norm)=0.0480811, mflops=103.991 (err=5.7e-16)
34.
Sorensen DIT: elapsed time t=1.31 s, 32768 iters, t-(init.)=1.28 s t(norm)=0.101725, mflops=49.152 (err=5.7e-16)
35. Temperton: elapsed time t=1.34 s, 32768 iters, t-(init.)=1.32 s t(norm)=0.104904, mflops=47.6625 (err=3.8e-08)
36. Temperton (f2c): elapsed time t=1.34 s, 32768 iters, t-(init.)=1.32 s t(norm)=0.104904, mflops=47.6625 (err=5.7e-16)
37. Valkenburg: elapsed time t=1.13 s, 4096 iters, t-(init.)=1.13 s t(norm)=0.718435, mflops=6.95958 (err=7.8e-16)
38. SGIMATH: elapsed time t=1.14 s, 131072 iters, t-(init.)=1.04 s t(norm)=0.0206629, mflops=241.979 (err=5.7e-16)
Top mflops for N=64 = 241.979

Normalized results and averages for N=64:
fft 0: mflops = 86.7787 (norm. = 0.358621), norm. avg. (of 6) = 0.413251
fft 1: mflops = 86.1843 (norm. = 0.356164), norm. avg. (of 6) = 0.379597
fft 2: mflops = 65.8791 (norm. = 0.272251), norm. avg. (of 6) = 0.261519
fft 3: mflops = 28.8599 (norm. = 0.119266), norm. avg. (of 6) = 0.047687
fft 4: mflops = 93.9023 (norm. = 0.38806), norm. avg. (of 6) = 0.241497
fft 5: mflops = 25.7847 (norm. = 0.106557), norm. avg. (of 6) = 0.100123
fft 6: mflops = 58.2542 (norm. = 0.240741), norm. avg. (of 6) = 0.127985
fft 7: mflops = 42.5098 (norm. = 0.175676), norm. avg. (of 6) = 0.108105
fft 8: mflops = 48.0264 (norm. = 0.198473), norm. avg. (of 6) = 0.180091
fft 9: mflops = 109.417 (norm. = 0.452174), norm. avg. (of 6) = 0.196889
fft 10: mflops = 114.39 (norm. = 0.472727), norm. avg. (of 6) = 0.177556
fft 11: mflops = 50.7375 (norm. = 0.209677), norm. avg. (of 5) = 0.180325
fft 12: mflops = 129.721 (norm. = 0.536082), norm. avg. (of 6) = 0.269155
fft 13: mflops = 35.7469 (norm. = 0.147727), norm. avg. (of 6) = 0.111832
fft 14: mflops = 195.084 (norm. = 0.806202), norm. avg. (of 6) = 0.599929
fft 15: mflops = 182.361 (norm. = 0.753623), norm. avg. (of 6) = 0.593876
fft 16: mflops = 174.763 (norm. = 0.722222), norm. avg. (of 6) = 0.953704
fft 17: mflops = 166.661 (norm. = 0.688742), norm. avg.
(of 4) = 0.447908
fft 18: mflops = 98.304 (norm. = 0.40625), norm. avg. (of 6) = 0.205127
fft 19: mflops = 50.7375 (norm. = 0.209677), norm. avg. (of 6) = 0.136294
fft 20: mflops = 53.7731 (norm. = 0.222222), norm. avg. (of 6) = 0.144787
fft 21: mflops = 186.414 (norm. = 0.77037), norm. avg. (of 6) = 0.621093
fft 22: mflops = 62.9146 (norm. = 0.26), norm. avg. (of 5) = 0.218093
fft 23: mflops = 80.6597 (norm. = 0.333333), norm. avg. (of 5) = 0.251623
fft 24: mflops = 75.3468 (norm. = 0.311377), norm. avg. (of 5) = 0.221017
fft 25: mflops = 35.7469 (norm. = 0.147727), norm. avg. (of 5) = 0.0760221
fft 26: mflops = 39.5689 (norm. = 0.163522), norm. avg. (of 6) = 0.0925301
fft 27: mflops = 158.276 (norm. = 0.654088), norm. avg. (of 6) = 0.565944
fft 28: mflops = 162.36 (norm. = 0.670968), norm. avg. (of 6) = 0.436923
fft 29: mflops = 37.0086 (norm. = 0.152941), norm. avg. (of 5) = 0.0687854
fft 30: mflops = 114.39 (norm. = 0.472727), norm. avg. (of 5) = 0.316916
fft 31: mflops = 97.542 (norm. = 0.403101), norm. avg. (of 6) = 0.18542
fft 32: mflops = 99.8644 (norm. = 0.412698), norm. avg. (of 6) = 0.192028
fft 33: mflops = 103.991 (norm. = 0.429752), norm. avg. (of 6) = 0.27201
fft 34: mflops = 49.152 (norm. = 0.203125), norm. avg. (of 6) = 0.178539
fft 35: mflops = 47.6625 (norm. = 0.19697), norm. avg. (of 6) = 0.0971229
fft 36: mflops = 47.6625 (norm. = 0.19697), norm. avg. (of 6) = 0.0916548
fft 37: mflops = 6.95958 (norm. = 0.0287611), norm. avg. (of 6) = 0.0411543
fft 38: mflops = 241.979 (norm. = 1), norm. avg. (of 6) = 0.531958

Benchmarking for array size = 128 (power of 2):
0. Arndt DIF: elapsed time t=1.56 s, 32768 iters, t-(init.)=1.52 s t(norm)=0.0517709, mflops=96.5794 (err=3.6e-16)
1. Arndt DIT: elapsed time t=1.55 s, 32768 iters, t-(init.)=1.51 s t(norm)=0.0514303, mflops=97.219 (err=3.4e-16)
2. Arndt Split-Radix: elapsed time t=1.04 s, 16384 iters, t-(init.)=1.02 s t(norm)=0.069482, mflops=71.9611 (err=4.2e-16)
3.
Arndt 4-step: elapsed time t=1.35 s, 8192 iters, t-(init.)=1.34 s t(norm)=0.182561, mflops=27.3882 (err=3.1e-16)
4. Bailey: elapsed time t=1.23 s, 32768 iters, t-(init.)=1.18 s t(norm)=0.0401906, mflops=124.407 (err=3.4e-16)
5. Beauregard: elapsed time t=1.4 s, 8192 iters, t-(init.)=1.39 s t(norm)=0.189372, mflops=26.403 (err=3.5e-16)
6. Bergland: elapsed time t=1.15 s, 16384 iters, t-(init.)=1.13 s t(norm)=0.0769751, mflops=64.956 (err=3.9e-16)
7. Brenner: elapsed time t=1.54 s, 16384 iters, t-(init.)=1.52 s t(norm)=0.103542, mflops=48.2897 (err=4.2e-16)
8. Burrus: elapsed time t=1.39 s, 16384 iters, t-(init.)=1.37 s t(norm)=0.0933238, mflops=53.5769 (err=3.3e-16)
9. CWP (min N) (N=130): elapsed time t=1.1 s, 32768 iters, t-(init.)=1.05 s t(norm)=0.0357628, mflops=139.81
10. CWP (best N) (N=140): elapsed time t=1 s, 32768 iters, t-(init.)=0.95 s t(norm)=0.0323568, mflops=154.527
11. Edelblute: elapsed time t=1.32 s, 16384 iters, t-(init.)=1.3 s t(norm)=0.0885555, mflops=56.4618 (err=3.3e-16)
12. FFTPACK: elapsed time t=1.96 s, 65536 iters, t-(init.)=1.87 s t(norm)=0.0318459, mflops=157.006 (err=3.4e-16)
13. FFTPACK (f2c): elapsed time t=1.01 s, 8192 iters, t-(init.)=1 s t(norm)=0.136239, mflops=36.7002 (err=3.4e-16)
FFTW_MEASURE plan: (cost = 2.197266e-05) FFTW_TWIDDLE 8 FFTW_NOTW 16
14. FFTW: elapsed time t=1.42 s, 65536 iters, t-(init.)=1.33 s t(norm)=0.0226498, mflops=220.753 (err=3.5e-16)
FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=1.65 s, 65536 iters, t-(init.)=1.56 s t(norm)=0.0265666, mflops=188.206 (err=3.4e-16)
16. Frigo-old: elapsed time t=1.52 s, 65536 iters, t-(init.)=1.42 s t(norm)=0.0241825, mflops=206.761 (err=3.4e-16)
17. Green: elapsed time t=1.86 s, 65536 iters, t-(init.)=1.77 s t(norm)=0.0301429, mflops=165.876 (err=4.2e-16)
18. GSL: elapsed time t=1.32 s, 32768 iters, t-(init.)=1.27 s t(norm)=0.0432559, mflops=115.591 (err=3.3e-16)
19.
GSL DIT: elapsed time t=1.28 s, 16384 iters, t-(init.)=1.25 s t(norm)=0.0851495, mflops=58.7203 (err=3.5e-16)
20. GSL DIF: elapsed time t=1.3 s, 16384 iters, t-(init.)=1.28 s t(norm)=0.0871931, mflops=57.344 (err=3.7e-16)
21. Krukar: elapsed time t=1.79 s, 65536 iters, t-(init.)=1.7 s t(norm)=0.0289508, mflops=172.707 (err=3.6e-16)
22. Mayer (Buneman): elapsed time t=1.07 s, 16384 iters, t-(init.)=1.05 s t(norm)=0.0715256, mflops=69.9051 (err=3.3e-16)
23. Mayer (simple): elapsed time t=1.64 s, 32768 iters, t-(init.)=1.59 s t(norm)=0.0541551, mflops=92.3274
24. Mayer (lookup): elapsed time t=1.75 s, 32768 iters, t-(init.)=1.7 s t(norm)=0.0579017, mflops=86.3533 (err=3.4e-16)
25. Monro: elapsed time t=1.75 s, 16384 iters, t-(init.)=1.72 s t(norm)=0.117166, mflops=42.6746 (err=5.2e-08)
26. NAPACK (f2c): elapsed time t=1.82 s, 16384 iters, t-(init.)=1.8 s t(norm)=0.122615, mflops=40.778 (err=2.1e-15)
27. Ooura (C): elapsed time t=1.83 s, 65536 iters, t-(init.)=1.73 s t(norm)=0.0294617, mflops=169.712 (err=3.5e-16)
28. Ooura (F): elapsed time t=1.82 s, 65536 iters, t-(init.)=1.73 s t(norm)=0.0294617, mflops=169.712 (err=3.5e-16)
29. Ransom: elapsed time t=1.1 s, 8192 iters, t-(init.)=1.09 s t(norm)=0.148501, mflops=33.6699 (err=9.6e-16)
30. SCIPORT: elapsed time t=1.22 s, 32768 iters, t-(init.)=1.18 s t(norm)=0.0401906, mflops=124.407 (err=3.7e-16)
31. Singleton: elapsed time t=1.56 s, 32768 iters, t-(init.)=1.51 s t(norm)=0.0514303, mflops=97.219 (err=4.2e-16)
32. Singleton (f2c): elapsed time t=1.57 s, 32768 iters, t-(init.)=1.52 s t(norm)=0.0517709, mflops=96.5794 (err=4.2e-16)
33. Sorensen: elapsed time t=1.21 s, 32768 iters, t-(init.)=1.16 s t(norm)=0.0395094, mflops=126.552 (err=3.2e-16)
34. Sorensen DIT: elapsed time t=1.37 s, 16384 iters, t-(init.)=1.35 s t(norm)=0.0919615, mflops=54.3706 (err=3.3e-16)
35. Temperton: elapsed time t=1.54 s, 16384 iters, t-(init.)=1.51 s t(norm)=0.102861, mflops=48.6095 (err=4.7e-08)
36.
Temperton (f2c): elapsed time t=1.55 s, 16384 iters, t-(init.)=1.52 s t(norm)=0.103542, mflops=48.2897 (err=3.6e-16)
37. Valkenburg: elapsed time t=1.3 s, 2048 iters, t-(init.)=1.3 s t(norm)=0.708444, mflops=7.05772 (err=5.3e-16)
38. SGIMATH: elapsed time t=1.13 s, 65536 iters, t-(init.)=1.03 s t(norm)=0.0175408, mflops=285.05 (err=3.4e-16)
Top mflops for N=128 = 285.05

Normalized results and averages for N=128:
fft 0: mflops = 96.5794 (norm. = 0.338816), norm. avg. (of 7) = 0.402618
fft 1: mflops = 97.219 (norm. = 0.34106), norm. avg. (of 7) = 0.374091
fft 2: mflops = 71.9611 (norm. = 0.252451), norm. avg. (of 7) = 0.260224
fft 3: mflops = 27.3882 (norm. = 0.0960821), norm. avg. (of 7) = 0.0546006
fft 4: mflops = 124.407 (norm. = 0.436441), norm. avg. (of 7) = 0.269346
fft 5: mflops = 26.403 (norm. = 0.0926259), norm. avg. (of 7) = 0.0990519
fft 6: mflops = 64.956 (norm. = 0.227876), norm. avg. (of 7) = 0.142255
fft 7: mflops = 48.2897 (norm. = 0.169408), norm. avg. (of 7) = 0.116863
fft 8: mflops = 53.5769 (norm. = 0.187956), norm. avg. (of 7) = 0.181215
fft 9: mflops = 139.81 (norm. = 0.490476), norm. avg. (of 7) = 0.23883
fft 10: mflops = 154.527 (norm. = 0.542105), norm. avg. (of 7) = 0.229634
fft 11: mflops = 56.4618 (norm. = 0.198077), norm. avg. (of 6) = 0.183283
fft 12: mflops = 157.006 (norm. = 0.550802), norm. avg. (of 7) = 0.30939
fft 13: mflops = 36.7002 (norm. = 0.12875), norm. avg. (of 7) = 0.114249
fft 14: mflops = 220.753 (norm. = 0.774436), norm. avg. (of 7) = 0.624858
fft 15: mflops = 188.206 (norm. = 0.660256), norm. avg. (of 7) = 0.603359
fft 16: mflops = 206.761 (norm. = 0.725352), norm. avg. (of 7) = 0.921082
fft 17: mflops = 165.876 (norm. = 0.581921), norm. avg. (of 5) = 0.474711
fft 18: mflops = 115.591 (norm. = 0.405512), norm. avg. (of 7) = 0.233753
fft 19: mflops = 58.7203 (norm. = 0.206), norm. avg. (of 7) = 0.146252
fft 20: mflops = 57.344 (norm. = 0.201172), norm. avg. (of 7) = 0.152842
fft 21: mflops = 172.707 (norm.
= 0.605882), norm. avg. (of 7) = 0.61892
fft 22: mflops = 69.9051 (norm. = 0.245238), norm. avg. (of 6) = 0.222617
fft 23: mflops = 92.3274 (norm. = 0.323899), norm. avg. (of 6) = 0.263669
fft 24: mflops = 86.3533 (norm. = 0.302941), norm. avg. (of 6) = 0.234671
fft 25: mflops = 42.6746 (norm. = 0.149709), norm. avg. (of 6) = 0.0883033
fft 26: mflops = 40.778 (norm. = 0.143056), norm. avg. (of 7) = 0.099748
fft 27: mflops = 169.712 (norm. = 0.595376), norm. avg. (of 7) = 0.570149
fft 28: mflops = 169.712 (norm. = 0.595376), norm. avg. (of 7) = 0.459559
fft 29: mflops = 33.6699 (norm. = 0.118119), norm. avg. (of 6) = 0.0770077
fft 30: mflops = 124.407 (norm. = 0.436441), norm. avg. (of 6) = 0.336837
fft 31: mflops = 97.219 (norm. = 0.34106), norm. avg. (of 7) = 0.207654
fft 32: mflops = 96.5794 (norm. = 0.338816), norm. avg. (of 7) = 0.212998
fft 33: mflops = 126.552 (norm. = 0.443966), norm. avg. (of 7) = 0.296575
fft 34: mflops = 54.3706 (norm. = 0.190741), norm. avg. (of 7) = 0.180283
fft 35: mflops = 48.6095 (norm. = 0.17053), norm. avg. (of 7) = 0.10761
fft 36: mflops = 48.2897 (norm. = 0.169408), norm. avg. (of 7) = 0.102762
fft 37: mflops = 7.05772 (norm. = 0.0247596), norm. avg. (of 7) = 0.0388122
fft 38: mflops = 285.05 (norm. = 1), norm. avg. (of 7) = 0.598821

Benchmarking for array size = 256 (power of 2):
0. Arndt DIF: elapsed time t=1.75 s, 16384 iters, t-(init.)=1.7 s t(norm)=0.0506639, mflops=98.6895 (err=1.0e-15)
1. Arndt DIT: elapsed time t=1.74 s, 16384 iters, t-(init.)=1.69 s t(norm)=0.0503659, mflops=99.2735 (err=1.0e-15)
2. Arndt Split-Radix: elapsed time t=1.1 s, 8192 iters, t-(init.)=1.08 s t(norm)=0.064373, mflops=77.6723 (err=1.0e-15)
3. Arndt 4-step: elapsed time t=1.21 s, 4096 iters, t-(init.)=1.2 s t(norm)=0.143051, mflops=34.9525 (err=1.0e-15)
4. Bailey: elapsed time t=1.54 s, 16384 iters, t-(init.)=1.49 s t(norm)=0.0444055, mflops=112.599 (err=1.0e-15)
5.
Beauregard: elapsed time t=1.53 s, 4096 iters, t-(init.)=1.52 s t(norm)=0.181198, mflops=27.5941 (err=1.1e-15)
6. Bergland: elapsed time t=1.17 s, 8192 iters, t-(init.)=1.15 s t(norm)=0.0685453, mflops=72.9444 (err=1.0e-15)
7. Brenner: elapsed time t=1.65 s, 8192 iters, t-(init.)=1.62 s t(norm)=0.0965595, mflops=51.7815 (err=1.1e-15)
8. Burrus: elapsed time t=1.45 s, 8192 iters, t-(init.)=1.43 s t(norm)=0.0852346, mflops=58.6616 (err=1.0e-15)
9. CWP (min N) (N=260): elapsed time t=1.02 s, 16384 iters, t-(init.)=0.97 s t(norm)=0.0289083, mflops=172.961
10. CWP (best N) (N=280): elapsed time t=1.81 s, 32768 iters, t-(init.)=1.71 s t(norm)=0.025481, mflops=196.225
11. Edelblute: elapsed time t=1.4 s, 8192 iters, t-(init.)=1.37 s t(norm)=0.0816584, mflops=61.2307 (err=1.0e-15)
12. FFTPACK: elapsed time t=1.71 s, 32768 iters, t-(init.)=1.61 s t(norm)=0.0239909, mflops=208.413 (err=1.1e-15)
13. FFTPACK (f2c): elapsed time t=1.01 s, 4096 iters, t-(init.)=1 s t(norm)=0.119209, mflops=41.943 (err=1.1e-15)
FFTW_MEASURE plan: (cost = 4.760742e-05) FFTW_TWIDDLE 16 FFTW_NOTW 16
14. FFTW: elapsed time t=1.52 s, 32768 iters, t-(init.)=1.43 s t(norm)=0.0213087, mflops=234.646 (err=1.1e-15)
FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=1.78 s, 32768 iters, t-(init.)=1.69 s t(norm)=0.025183, mflops=198.547 (err=1.1e-15)
16. Frigo-old: elapsed time t=1.69 s, 32768 iters, t-(init.)=1.59 s t(norm)=0.0236928, mflops=211.034 (err=1.0e-15)
17. Green: elapsed time t=1.9 s, 32768 iters, t-(init.)=1.8 s t(norm)=0.0268221, mflops=186.414 (err=1.1e-15)
18. GSL: elapsed time t=1.38 s, 16384 iters, t-(init.)=1.33 s t(norm)=0.0396371, mflops=126.144 (err=1.1e-15)
19. GSL DIT: elapsed time t=1.32 s, 8192 iters, t-(init.)=1.29 s t(norm)=0.07689, mflops=65.028 (err=1.0e-15)
20. GSL DIF: elapsed time t=1.34 s, 8192 iters, t-(init.)=1.32 s t(norm)=0.0786781, mflops=63.5501 (err=1.1e-15)
21.
Krukar: elapsed time t=1.13 s, 16384 iters, t-(init.)=1.09 s t(norm)=0.0324845, mflops=153.919 (err=1.1e-15)
22. Mayer (Buneman): elapsed time t=1.18 s, 8192 iters, t-(init.)=1.16 s t(norm)=0.0691414, mflops=72.3156 (err=9.8e-16)
23. Mayer (simple): elapsed time t=1.81 s, 16384 iters, t-(init.)=1.76 s t(norm)=0.0524521, mflops=95.3251
24. Mayer (lookup): elapsed time t=1.91 s, 16384 iters, t-(init.)=1.87 s t(norm)=0.0557303, mflops=89.7177 (err=9.6e-16)
25. Monro: elapsed time t=1.75 s, 8192 iters, t-(init.)=1.72 s t(norm)=0.10252, mflops=48.771 (err=8.5e-08)
26. NAPACK (f2c): elapsed time t=1.88 s, 8192 iters, t-(init.)=1.85 s t(norm)=0.110269, mflops=45.3438 (err=4.9e-15)
27. Ooura (C): elapsed time t=1 s, 16384 iters, t-(init.)=0.95 s t(norm)=0.0283122, mflops=176.602 (err=1.0e-15)
28. Ooura (F): elapsed time t=1.9 s, 32768 iters, t-(init.)=1.81 s t(norm)=0.0269711, mflops=185.384 (err=1.0e-15)
29. Ransom: elapsed time t=1.73 s, 8192 iters, t-(init.)=1.71 s t(norm)=0.101924, mflops=49.0562 (err=1.8e-15)
30. SCIPORT: elapsed time t=1.29 s, 16384 iters, t-(init.)=1.24 s t(norm)=0.0369549, mflops=135.3 (err=1.1e-15)
31. Singleton: elapsed time t=1.41 s, 16384 iters, t-(init.)=1.36 s t(norm)=0.0405312, mflops=123.362 (err=1.7e-15)
32. Singleton (f2c): elapsed time t=1.36 s, 16384 iters, t-(init.)=1.31 s t(norm)=0.039041, mflops=128.07 (err=1.7e-15)
33. Sorensen: elapsed time t=1.22 s, 16384 iters, t-(init.)=1.17 s t(norm)=0.0348687, mflops=143.395 (err=9.7e-16)
34. Sorensen DIT: elapsed time t=1.43 s, 8192 iters, t-(init.)=1.4 s t(norm)=0.0834465, mflops=59.9186 (err=9.8e-16)
35. Temperton: elapsed time t=1.67 s, 8192 iters, t-(init.)=1.65 s t(norm)=0.0983477, mflops=50.84 (err=9.5e-08)
36. Temperton (f2c): elapsed time t=1.48 s, 8192 iters, t-(init.)=1.45 s t(norm)=0.0864267, mflops=57.8525 (err=1.1e-15)
37. Valkenburg: elapsed time t=1.48 s, 1024 iters, t-(init.)=1.48 s t(norm)=0.705719, mflops=7.08497 (err=1.1e-15)
38.
SGIMATH: elapsed time t=1.24 s, 32768 iters, t-(init.)=1.15 s t(norm)=0.0171363, mflops=291.778 (err=1.1e-15)
Top mflops for N=256 = 291.778

Normalized results and averages for N=256:
fft 0: mflops = 98.6895 (norm. = 0.338235), norm. avg. (of 8) = 0.39457
fft 1: mflops = 99.2735 (norm. = 0.340237), norm. avg. (of 8) = 0.36986
fft 2: mflops = 77.6723 (norm. = 0.266204), norm. avg. (of 8) = 0.260971
fft 3: mflops = 34.9525 (norm. = 0.119792), norm. avg. (of 8) = 0.0627494
fft 4: mflops = 112.599 (norm. = 0.385906), norm. avg. (of 8) = 0.283916
fft 5: mflops = 27.5941 (norm. = 0.0945724), norm. avg. (of 8) = 0.0984919
fft 6: mflops = 72.9444 (norm. = 0.25), norm. avg. (of 8) = 0.155724
fft 7: mflops = 51.7815 (norm. = 0.177469), norm. avg. (of 8) = 0.124439
fft 8: mflops = 58.6616 (norm. = 0.201049), norm. avg. (of 8) = 0.183694
fft 9: mflops = 172.961 (norm. = 0.592784), norm. avg. (of 8) = 0.283074
fft 10: mflops = 196.225 (norm. = 0.672515), norm. avg. (of 8) = 0.284994
fft 11: mflops = 61.2307 (norm. = 0.209854), norm. avg. (of 7) = 0.187079
fft 12: mflops = 208.413 (norm. = 0.714286), norm. avg. (of 8) = 0.360002
fft 13: mflops = 41.943 (norm. = 0.14375), norm. avg. (of 8) = 0.117937
fft 14: mflops = 234.646 (norm. = 0.804196), norm. avg. (of 8) = 0.647276
fft 15: mflops = 198.547 (norm. = 0.680473), norm. avg. (of 8) = 0.612998
fft 16: mflops = 211.034 (norm. = 0.72327), norm. avg. (of 8) = 0.896356
fft 17: mflops = 186.414 (norm. = 0.638889), norm. avg. (of 6) = 0.502074
fft 18: mflops = 126.144 (norm. = 0.432331), norm. avg. (of 8) = 0.258575
fft 19: mflops = 65.028 (norm. = 0.222868), norm. avg. (of 8) = 0.155829
fft 20: mflops = 63.5501 (norm. = 0.217803), norm. avg. (of 8) = 0.160962
fft 21: mflops = 153.919 (norm. = 0.527523), norm. avg. (of 8) = 0.607496
fft 22: mflops = 72.3156 (norm. = 0.247845), norm. avg. (of 7) = 0.226221
fft 23: mflops = 95.3251 (norm. = 0.326705), norm. avg. (of 7) = 0.272674
fft 24: mflops = 89.7177 (norm. = 0.307487), norm. avg.
(of 7) = 0.245074
fft 25: mflops = 48.771 (norm. = 0.167151), norm. avg. (of 7) = 0.0995672
fft 26: mflops = 45.3438 (norm. = 0.155405), norm. avg. (of 8) = 0.106705
fft 27: mflops = 176.602 (norm. = 0.605263), norm. avg. (of 8) = 0.574538
fft 28: mflops = 185.384 (norm. = 0.635359), norm. avg. (of 8) = 0.481534
fft 29: mflops = 49.0562 (norm. = 0.168129), norm. avg. (of 7) = 0.090025
fft 30: mflops = 135.3 (norm. = 0.46371), norm. avg. (of 7) = 0.354962
fft 31: mflops = 123.362 (norm. = 0.422794), norm. avg. (of 8) = 0.234546
fft 32: mflops = 128.07 (norm. = 0.438931), norm. avg. (of 8) = 0.24124
fft 33: mflops = 143.395 (norm. = 0.491453), norm. avg. (of 8) = 0.320935
fft 34: mflops = 59.9186 (norm. = 0.205357), norm. avg. (of 8) = 0.183417
fft 35: mflops = 50.84 (norm. = 0.174242), norm. avg. (of 8) = 0.115939
fft 36: mflops = 57.8525 (norm. = 0.198276), norm. avg. (of 8) = 0.114702
fft 37: mflops = 7.08497 (norm. = 0.0242821), norm. avg. (of 8) = 0.036996
fft 38: mflops = 291.778 (norm. = 1), norm. avg. (of 8) = 0.648969

Benchmarking for array size = 512 (power of 2):
0. Arndt DIF: elapsed time t=1.83 s, 8192 iters, t-(init.)=1.78 s t(norm)=0.0471539, mflops=106.036 (err=1.1e-15)
1. Arndt DIT: elapsed time t=1.82 s, 8192 iters, t-(init.)=1.78 s t(norm)=0.0471539, mflops=106.036 (err=1.1e-15)
2. Arndt Split-Radix: elapsed time t=1.18 s, 4096 iters, t-(init.)=1.16 s t(norm)=0.061459, mflops=81.355 (err=1.1e-15)
3. Arndt 4-step: elapsed time t=1.33 s, 2048 iters, t-(init.)=1.32 s t(norm)=0.139872, mflops=35.7469 (err=1.0e-15)
4. Bailey: elapsed time t=1.45 s, 8192 iters, t-(init.)=1.4 s t(norm)=0.0370873, mflops=134.817 (err=1.1e-15)
5. Beauregard: elapsed time t=1.71 s, 2048 iters, t-(init.)=1.7 s t(norm)=0.180138, mflops=27.7564 (err=1.0e-15)
6. Bergland: elapsed time t=1.2 s, 4096 iters, t-(init.)=1.17 s t(norm)=0.0619888, mflops=80.6597 (err=1.0e-15)
7.
Brenner: elapsed time t=1.71 s, 4096 iters, t-(init.)=1.68 s t(norm)=0.0890096, mflops=56.1737 (err=9.9e-16)
8. Burrus: elapsed time t=1.51 s, 4096 iters, t-(init.)=1.48 s t(norm)=0.0784132, mflops=63.7648 (err=1.1e-15)
9. CWP (min N) (N=520): elapsed time t=1.04 s, 8192 iters, t-(init.)=1 s t(norm)=0.026491, mflops=188.744
10. CWP (best N) (N=560): elapsed time t=1.01 s, 8192 iters, t-(init.)=0.96 s t(norm)=0.0254313, mflops=196.608
11. Edelblute: elapsed time t=1.47 s, 4096 iters, t-(init.)=1.44 s t(norm)=0.0762939, mflops=65.536 (err=1.1e-15)
12. FFTPACK: elapsed time t=1.26 s, 8192 iters, t-(init.)=1.21 s t(norm)=0.0320541, mflops=155.987 (err=1.0e-15)
13. FFTPACK (f2c): elapsed time t=1.38 s, 2048 iters, t-(init.)=1.36 s t(norm)=0.144111, mflops=34.6955 (err=1.0e-15)
FFTW_MEASURE plan: (cost = 1.123047e-04) FFTW_TWIDDLE 2 FFTW_TWIDDLE 16 FFTW_NOTW 16
14. FFTW: elapsed time t=1.79 s, 16384 iters, t-(init.)=1.69 s t(norm)=0.0223849, mflops=223.365 (err=9.7e-16)
FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=1 s, 8192 iters, t-(init.)=0.96 s t(norm)=0.0254313, mflops=196.608 (err=9.7e-16)
16. Frigo-old: elapsed time t=1.02 s, 8192 iters, t-(init.)=0.98 s t(norm)=0.0259611, mflops=192.596 (err=9.5e-16)
17. Green: elapsed time t=1.95 s, 16384 iters, t-(init.)=1.86 s t(norm)=0.0246366, mflops=202.95 (err=9.8e-16)
18. GSL: elapsed time t=1.79 s, 8192 iters, t-(init.)=1.75 s t(norm)=0.0463592, mflops=107.854 (err=1.0e-15)
19. GSL DIT: elapsed time t=1.43 s, 4096 iters, t-(init.)=1.41 s t(norm)=0.0747045, mflops=66.9304 (err=1.2e-15)
20. GSL DIF: elapsed time t=1.47 s, 4096 iters, t-(init.)=1.45 s t(norm)=0.0768238, mflops=65.084 (err=1.1e-15)
21. Krukar: elapsed time t=1.33 s, 8192 iters, t-(init.)=1.29 s t(norm)=0.0341733, mflops=146.313 (err=1.0e-15)
22. Mayer (Buneman): elapsed time t=1.23 s, 4096 iters, t-(init.)=1.21 s t(norm)=0.0641081, mflops=77.9933 (err=1.0e-15)
23.
Mayer (simple): elapsed time t=1.88 s, 8192 iters, t-(init.)=1.83 s t(norm)=0.0484784, mflops=103.139
24. Mayer (lookup): elapsed time t=1.97 s, 8192 iters, t-(init.)=1.92 s t(norm)=0.0508626, mflops=98.304 (err=9.8e-16)
25. Monro: elapsed time t=1.82 s, 4096 iters, t-(init.)=1.8 s t(norm)=0.0953674, mflops=52.4288 (err=8.4e-08)
26. NAPACK (f2c): elapsed time t=1.02 s, 2048 iters, t-(init.)=1.01 s t(norm)=0.107023, mflops=46.7187 (err=6.9e-15)
27. Ooura (C): elapsed time t=1.1 s, 8192 iters, t-(init.)=1.05 s t(norm)=0.0278155, mflops=179.756 (err=9.8e-16)
28. Ooura (F): elapsed time t=1.06 s, 8192 iters, t-(init.)=1.02 s t(norm)=0.0270208, mflops=185.043 (err=9.8e-16)
29. Ransom: elapsed time t=1.01 s, 2048 iters, t-(init.)=1 s t(norm)=0.105964, mflops=47.1859 (err=1.4e-15)
30. SCIPORT: elapsed time t=1.54 s, 8192 iters, t-(init.)=1.49 s t(norm)=0.0394715, mflops=126.674 (err=1.0e-15)
31. Singleton: elapsed time t=1.53 s, 8192 iters, t-(init.)=1.48 s t(norm)=0.0392066, mflops=127.53 (err=1.2e-15)
32. Singleton (f2c): elapsed time t=1.5 s, 8192 iters, t-(init.)=1.45 s t(norm)=0.0384119, mflops=130.168 (err=1.2e-15)
33. Sorensen: elapsed time t=1.25 s, 8192 iters, t-(init.)=1.21 s t(norm)=0.0320541, mflops=155.987 (err=1.0e-15)
34. Sorensen DIT: elapsed time t=1.5 s, 4096 iters, t-(init.)=1.48 s t(norm)=0.0784132, mflops=63.7648 (err=1.1e-15)
35. Temperton: elapsed time t=1.93 s, 4096 iters, t-(init.)=1.91 s t(norm)=0.101195, mflops=49.4093 (err=1.0e-07)
36. Temperton (f2c): elapsed time t=1.85 s, 4096 iters, t-(init.)=1.83 s t(norm)=0.0969569, mflops=51.5693 (err=9.9e-16)
37. Valkenburg: elapsed time t=1.65 s, 512 iters, t-(init.)=1.64 s t(norm)=0.695123, mflops=7.19298 (err=1.3e-15)
38. SGIMATH: elapsed time t=1.37 s, 16384 iters, t-(init.)=1.28 s t(norm)=0.0169542, mflops=294.912 (err=9.8e-16)
Top mflops for N=512 = 294.912

Normalized results and averages for N=512:
fft 0: mflops = 106.036 (norm. = 0.359551), norm. avg.
(of 9) = 0.390679 fft 1: mflops = 106.036 (norm. = 0.359551), norm. avg. (of 9) = 0.368714 fft 2: mflops = 81.355 (norm. = 0.275862), norm. avg. (of 9) = 0.262626 fft 3: mflops = 35.7469 (norm. = 0.121212), norm. avg. (of 9) = 0.0692453 fft 4: mflops = 134.817 (norm. = 0.457143), norm. avg. (of 9) = 0.303164 fft 5: mflops = 27.7564 (norm. = 0.0941176), norm. avg. (of 9) = 0.0980059 fft 6: mflops = 80.6597 (norm. = 0.273504), norm. avg. (of 9) = 0.16881 fft 7: mflops = 56.1737 (norm. = 0.190476), norm. avg. (of 9) = 0.131776 fft 8: mflops = 63.7648 (norm. = 0.216216), norm. avg. (of 9) = 0.187308 fft 9: mflops = 188.744 (norm. = 0.64), norm. avg. (of 9) = 0.322732 fft 10: mflops = 196.608 (norm. = 0.666667), norm. avg. (of 9) = 0.327403 fft 11: mflops = 65.536 (norm. = 0.222222), norm. avg. (of 8) = 0.191472 fft 12: mflops = 155.987 (norm. = 0.528926), norm. avg. (of 9) = 0.378771 fft 13: mflops = 34.6955 (norm. = 0.117647), norm. avg. (of 9) = 0.117904 fft 14: mflops = 223.365 (norm. = 0.757396), norm. avg. (of 9) = 0.659511 fft 15: mflops = 196.608 (norm. = 0.666667), norm. avg. (of 9) = 0.618961 fft 16: mflops = 192.596 (norm. = 0.653061), norm. avg. (of 9) = 0.869323 fft 17: mflops = 202.95 (norm. = 0.688172), norm. avg. (of 7) = 0.528659 fft 18: mflops = 107.854 (norm. = 0.365714), norm. avg. (of 9) = 0.27048 fft 19: mflops = 66.9304 (norm. = 0.22695), norm. avg. (of 9) = 0.163731 fft 20: mflops = 65.084 (norm. = 0.22069), norm. avg. (of 9) = 0.167598 fft 21: mflops = 146.313 (norm. = 0.496124), norm. avg. (of 9) = 0.595121 fft 22: mflops = 77.9933 (norm. = 0.264463), norm. avg. (of 8) = 0.231001 fft 23: mflops = 103.139 (norm. = 0.349727), norm. avg. (of 8) = 0.282305 fft 24: mflops = 98.304 (norm. = 0.333333), norm. avg. (of 8) = 0.256106 fft 25: mflops = 52.4288 (norm. = 0.177778), norm. avg. (of 8) = 0.109344 fft 26: mflops = 46.7187 (norm. = 0.158416), norm. avg. (of 9) = 0.112451 fft 27: mflops = 179.756 (norm. = 0.609524), norm. avg. 
(of 9) = 0.578425 fft 28: mflops = 185.043 (norm. = 0.627451), norm. avg. (of 9) = 0.497747 fft 29: mflops = 47.1859 (norm. = 0.16), norm. avg. (of 8) = 0.0987718 fft 30: mflops = 126.674 (norm. = 0.42953), norm. avg. (of 8) = 0.364283 fft 31: mflops = 127.53 (norm. = 0.432432), norm. avg. (of 9) = 0.256534 fft 32: mflops = 130.168 (norm. = 0.441379), norm. avg. (of 9) = 0.263477 fft 33: mflops = 155.987 (norm. = 0.528926), norm. avg. (of 9) = 0.344045 fft 34: mflops = 63.7648 (norm. = 0.216216), norm. avg. (of 9) = 0.187061 fft 35: mflops = 49.4093 (norm. = 0.167539), norm. avg. (of 9) = 0.121672 fft 36: mflops = 51.5693 (norm. = 0.174863), norm. avg. (of 9) = 0.121386 fft 37: mflops = 7.19298 (norm. = 0.0243902), norm. avg. (of 9) = 0.0355953 fft 38: mflops = 294.912 (norm. = 1), norm. avg. (of 9) = 0.687972 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1 s, 2048 iters, t-(init.)=0.97 s t(norm)=0.0462532, mflops=108.101 (err=1.8e-15) 1. Arndt DIT: elapsed time t=1 s, 2048 iters, t-(init.)=0.98 s t(norm)=0.04673, mflops=106.998 (err=1.9e-15) 2. Arndt Split-Radix: elapsed time t=1.24 s, 2048 iters, t-(init.)=1.21 s t(norm)=0.0576973, mflops=86.6592 (err=1.8e-15) 3. Arndt 4-step: elapsed time t=1.21 s, 1024 iters, t-(init.)=1.2 s t(norm)=0.114441, mflops=43.6907 (err=1.8e-15) 4. Bailey: elapsed time t=1.22 s, 2048 iters, t-(init.)=1.2 s t(norm)=0.0572205, mflops=87.3813 (err=1.8e-15) 5. Beauregard: elapsed time t=1.92 s, 1024 iters, t-(init.)=1.9 s t(norm)=0.181198, mflops=27.5941 (err=2.0e-15) 6. Bergland: elapsed time t=1.31 s, 2048 iters, t-(init.)=1.28 s t(norm)=0.0610352, mflops=81.92 (err=2.1e-15) 7. Brenner: elapsed time t=1.84 s, 2048 iters, t-(init.)=1.82 s t(norm)=0.0867844, mflops=57.6141 (err=1.9e-15) 8. Burrus: elapsed time t=1.57 s, 2048 iters, t-(init.)=1.55 s t(norm)=0.0739098, mflops=67.6501 (err=1.8e-15) 9. CWP (min N) (N=1040): elapsed time t=1.15 s, 4096 iters, t-(init.)=1.1 s t(norm)=0.026226, mflops=190.65 10. 
CWP (best N) (N=1040): elapsed time t=1.14 s, 4096 iters, t-(init.)=1.09 s t(norm)=0.0259876, mflops=192.399 11. Edelblute: elapsed time t=1.54 s, 2048 iters, t-(init.)=1.52 s t(norm)=0.0724792, mflops=68.9853 (err=1.8e-15) 12. FFTPACK: elapsed time t=1.69 s, 4096 iters, t-(init.)=1.65 s t(norm)=0.0393391, mflops=127.1 (err=1.9e-15) 13. FFTPACK (f2c): elapsed time t=1.46 s, 1024 iters, t-(init.)=1.44 s t(norm)=0.137329, mflops=36.4089 (err=1.9e-15) FFTW_MEASURE plan: (cost = 2.734375e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.15 s, 4096 iters, t-(init.)=1.1 s t(norm)=0.026226, mflops=190.65 (err=2.0e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.31 s, 4096 iters, t-(init.)=1.27 s t(norm)=0.0302792, mflops=165.13 (err=1.9e-15) 16. Frigo-old: elapsed time t=1.36 s, 4096 iters, t-(init.)=1.31 s t(norm)=0.0312328, mflops=160.088 (err=1.9e-15) 17. Green: elapsed time t=1.09 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.0247955, mflops=201.649 (err=2.0e-15) 18. GSL: elapsed time t=1.88 s, 4096 iters, t-(init.)=1.83 s t(norm)=0.0436306, mflops=114.598 (err=1.9e-15) 19. GSL DIT: elapsed time t=1.58 s, 2048 iters, t-(init.)=1.56 s t(norm)=0.0743866, mflops=67.2164 (err=2.1e-15) 20. GSL DIF: elapsed time t=1.66 s, 2048 iters, t-(init.)=1.64 s t(norm)=0.0782013, mflops=63.9376 (err=2.2e-15) 21. Krukar: elapsed time t=1.31 s, 2048 iters, t-(init.)=1.29 s t(norm)=0.061512, mflops=81.285 (err=1.9e-15) 22. Mayer (Buneman): elapsed time t=1.31 s, 2048 iters, t-(init.)=1.29 s t(norm)=0.061512, mflops=81.285 (err=1.8e-15) 23. Mayer (simple): elapsed time t=1.02 s, 2048 iters, t-(init.)=1 s t(norm)=0.0476837, mflops=104.858 24. Mayer (lookup): elapsed time t=1.07 s, 2048 iters, t-(init.)=1.05 s t(norm)=0.0500679, mflops=99.8644 (err=1.8e-15) 25. Monro: elapsed time t=1.85 s, 2048 iters, t-(init.)=1.82 s t(norm)=0.0867844, mflops=57.6141 (err=1.0e-07) 26. 
NAPACK (f2c): elapsed time t=1.04 s, 1024 iters, t-(init.)=1.03 s t(norm)=0.0982285, mflops=50.9017 (err=1.6e-14) 27. Ooura (C): elapsed time t=1.18 s, 4096 iters, t-(init.)=1.13 s t(norm)=0.0269413, mflops=185.589 (err=2.2e-15) 28. Ooura (F): elapsed time t=1.1 s, 4096 iters, t-(init.)=1.05 s t(norm)=0.025034, mflops=199.729 (err=2.2e-15) 29. Ransom: elapsed time t=1.81 s, 2048 iters, t-(init.)=1.79 s t(norm)=0.0853539, mflops=58.5797 (err=2.3e-15) 30. SCIPORT: elapsed time t=1 s, 2048 iters, t-(init.)=0.98 s t(norm)=0.04673, mflops=106.998 (err=2.0e-15) 31. Singleton: elapsed time t=1.59 s, 4096 iters, t-(init.)=1.55 s t(norm)=0.0369549, mflops=135.3 (err=2.8e-15) 32. Singleton (f2c): elapsed time t=1.52 s, 4096 iters, t-(init.)=1.47 s t(norm)=0.0350475, mflops=142.663 (err=2.8e-15) 33. Sorensen: elapsed time t=1.31 s, 4096 iters, t-(init.)=1.27 s t(norm)=0.0302792, mflops=165.13 (err=1.8e-15) 34. Sorensen DIT: elapsed time t=1.56 s, 2048 iters, t-(init.)=1.54 s t(norm)=0.0734329, mflops=68.0894 (err=1.9e-15) 35. Temperton: elapsed time t=1.86 s, 2048 iters, t-(init.)=1.84 s t(norm)=0.087738, mflops=56.9878 (err=1.1e-07) 36. Temperton (f2c): elapsed time t=1.79 s, 2048 iters, t-(init.)=1.77 s t(norm)=0.0844002, mflops=59.2416 (err=1.9e-15) 37. Valkenburg: elapsed time t=1.19 s, 128 iters, t-(init.)=1.19 s t(norm)=0.907898, mflops=5.50723 (err=2.4e-15) 38. SGIMATH: elapsed time t=1 s, 4096 iters, t-(init.)=0.96 s t(norm)=0.0228882, mflops=218.453 (err=1.9e-15) Top mflops for N=1024 = 218.453 Normalized results and averages for N=1024: fft 0: mflops = 108.101 (norm. = 0.494845), norm. avg. (of 10) = 0.401096 fft 1: mflops = 106.998 (norm. = 0.489796), norm. avg. (of 10) = 0.380822 fft 2: mflops = 86.6592 (norm. = 0.396694), norm. avg. (of 10) = 0.276033 fft 3: mflops = 43.6907 (norm. = 0.2), norm. avg. (of 10) = 0.0823208 fft 4: mflops = 87.3813 (norm. = 0.4), norm. avg. (of 10) = 0.312847 fft 5: mflops = 27.5941 (norm. = 0.126316), norm. avg. 
(of 10) = 0.100837 fft 6: mflops = 81.92 (norm. = 0.375), norm. avg. (of 10) = 0.189429 fft 7: mflops = 57.6141 (norm. = 0.263736), norm. avg. (of 10) = 0.144972 fft 8: mflops = 67.6501 (norm. = 0.309677), norm. avg. (of 10) = 0.199545 fft 9: mflops = 190.65 (norm. = 0.872727), norm. avg. (of 10) = 0.377732 fft 10: mflops = 192.399 (norm. = 0.880734), norm. avg. (of 10) = 0.382736 fft 11: mflops = 68.9853 (norm. = 0.315789), norm. avg. (of 9) = 0.205285 fft 12: mflops = 127.1 (norm. = 0.581818), norm. avg. (of 10) = 0.399076 fft 13: mflops = 36.4089 (norm. = 0.166667), norm. avg. (of 10) = 0.122781 fft 14: mflops = 190.65 (norm. = 0.872727), norm. avg. (of 10) = 0.680833 fft 15: mflops = 165.13 (norm. = 0.755906), norm. avg. (of 10) = 0.632656 fft 16: mflops = 160.088 (norm. = 0.732824), norm. avg. (of 10) = 0.855673 fft 17: mflops = 201.649 (norm. = 0.923077), norm. avg. (of 8) = 0.577961 fft 18: mflops = 114.598 (norm. = 0.52459), norm. avg. (of 10) = 0.295891 fft 19: mflops = 67.2164 (norm. = 0.307692), norm. avg. (of 10) = 0.178127 fft 20: mflops = 63.9376 (norm. = 0.292683), norm. avg. (of 10) = 0.180107 fft 21: mflops = 81.285 (norm. = 0.372093), norm. avg. (of 10) = 0.572818 fft 22: mflops = 81.285 (norm. = 0.372093), norm. avg. (of 9) = 0.246678 fft 23: mflops = 104.858 (norm. = 0.48), norm. avg. (of 9) = 0.304272 fft 24: mflops = 99.8644 (norm. = 0.457143), norm. avg. (of 9) = 0.278443 fft 25: mflops = 57.6141 (norm. = 0.263736), norm. avg. (of 9) = 0.126498 fft 26: mflops = 50.9017 (norm. = 0.23301), norm. avg. (of 10) = 0.124507 fft 27: mflops = 185.589 (norm. = 0.849558), norm. avg. (of 10) = 0.605538 fft 28: mflops = 199.729 (norm. = 0.914286), norm. avg. (of 10) = 0.539401 fft 29: mflops = 58.5797 (norm. = 0.268156), norm. avg. (of 9) = 0.117592 fft 30: mflops = 106.998 (norm. = 0.489796), norm. avg. (of 9) = 0.378229 fft 31: mflops = 135.3 (norm. = 0.619355), norm. avg. (of 10) = 0.292816 fft 32: mflops = 142.663 (norm. = 0.653061), norm. avg. 
(of 10) = 0.302436 fft 33: mflops = 165.13 (norm. = 0.755906), norm. avg. (of 10) = 0.385231 fft 34: mflops = 68.0894 (norm. = 0.311688), norm. avg. (of 10) = 0.199524 fft 35: mflops = 56.9878 (norm. = 0.26087), norm. avg. (of 10) = 0.135592 fft 36: mflops = 59.2416 (norm. = 0.271186), norm. avg. (of 10) = 0.136366 fft 37: mflops = 5.50723 (norm. = 0.0252101), norm. avg. (of 10) = 0.0345568 fft 38: mflops = 218.453 (norm. = 1), norm. avg. (of 10) = 0.719175 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.28 s, 1024 iters, t-(init.)=1.25 s t(norm)=0.054186, mflops=92.2747 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.28 s, 1024 iters, t-(init.)=1.26 s t(norm)=0.0546195, mflops=91.5423 (err=1.4e-15) 2. Arndt Split-Radix: elapsed time t=1.6 s, 1024 iters, t-(init.)=1.58 s t(norm)=0.0684912, mflops=73.0021 (err=1.5e-15) 3. Arndt 4-step: elapsed time t=1.46 s, 512 iters, t-(init.)=1.45 s t(norm)=0.125712, mflops=39.7736 (err=1.4e-15) 4. Bailey: elapsed time t=1.62 s, 512 iters, t-(init.)=1.61 s t(norm)=0.139583, mflops=35.8209 (err=1.4e-15) 5. Beauregard: elapsed time t=1.08 s, 256 iters, t-(init.)=1.08 s t(norm)=0.187267, mflops=26.6999 (err=1.4e-15) 6. Bergland: elapsed time t=1.38 s, 1024 iters, t-(init.)=1.36 s t(norm)=0.0589544, mflops=84.8113 (err=1.5e-15) 7. Brenner: elapsed time t=1.02 s, 512 iters, t-(init.)=1.01 s t(norm)=0.0875646, mflops=57.1007 (err=1.4e-15) 8. Burrus: elapsed time t=1.01 s, 512 iters, t-(init.)=1 s t(norm)=0.0866977, mflops=57.6717 (err=1.4e-15) 9. CWP (min N) (N=2145): elapsed time t=1.46 s, 2048 iters, t-(init.)=1.4 s t(norm)=0.0303442, mflops=164.776 10. CWP (best N) (N=2184): elapsed time t=1.34 s, 2048 iters, t-(init.)=1.27 s t(norm)=0.0275265, mflops=181.643 11. Edelblute: elapsed time t=1.91 s, 1024 iters, t-(init.)=1.89 s t(norm)=0.0819293, mflops=61.0282 (err=1.4e-15) 12. FFTPACK: elapsed time t=1.69 s, 1024 iters, t-(init.)=1.67 s t(norm)=0.0723926, mflops=69.0679 (err=1.4e-15) 13. 
FFTPACK (f2c): elapsed time t=1.01 s, 256 iters, t-(init.)=1.01 s t(norm)=0.175129, mflops=28.5503 (err=1.4e-15) FFTW_MEASURE plan: (cost = 7.421875e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.68 s, 2048 iters, t-(init.)=1.63 s t(norm)=0.0353293, mflops=141.526 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.75 s, 2048 iters, t-(init.)=1.7 s t(norm)=0.0368465, mflops=135.698 (err=1.4e-15) 16. Frigo-old: elapsed time t=1.26 s, 1024 iters, t-(init.)=1.24 s t(norm)=0.0537526, mflops=93.0188 (err=1.4e-15) 17. Green: elapsed time t=1.53 s, 2048 iters, t-(init.)=1.48 s t(norm)=0.0320781, mflops=155.869 (err=1.4e-15) 18. GSL: elapsed time t=1.06 s, 512 iters, t-(init.)=1.05 s t(norm)=0.0910325, mflops=54.9254 (err=1.4e-15) 19. GSL DIT: elapsed time t=1.87 s, 1024 iters, t-(init.)=1.84 s t(norm)=0.0797619, mflops=62.6866 (err=2.0e-15) 20. GSL DIF: elapsed time t=1.89 s, 1024 iters, t-(init.)=1.87 s t(norm)=0.0810623, mflops=61.6809 (err=2.3e-15) 21. Krukar: elapsed time t=1.63 s, 1024 iters, t-(init.)=1.61 s t(norm)=0.0697916, mflops=71.6418 (err=1.4e-15) 22. Mayer (Buneman): elapsed time t=1.4 s, 1024 iters, t-(init.)=1.37 s t(norm)=0.0593879, mflops=84.1922 (err=1.4e-15) 23. Mayer (simple): elapsed time t=1.12 s, 1024 iters, t-(init.)=1.1 s t(norm)=0.0476837, mflops=104.858 24. Mayer (lookup): elapsed time t=1.19 s, 1024 iters, t-(init.)=1.16 s t(norm)=0.0502846, mflops=99.4339 (err=1.4e-15) 25. Monro: elapsed time t=1.12 s, 512 iters, t-(init.)=1.11 s t(norm)=0.0962344, mflops=51.9565 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.96 s, 512 iters, t-(init.)=1.95 s t(norm)=0.16906, mflops=29.5752 (err=1.5e-14) 27. Ooura (C): elapsed time t=1.56 s, 2048 iters, t-(init.)=1.51 s t(norm)=0.0327284, mflops=152.773 (err=1.4e-15) 28. Ooura (F): elapsed time t=1.43 s, 2048 iters, t-(init.)=1.38 s t(norm)=0.0299107, mflops=167.164 (err=1.4e-15) 29. 
Ransom: elapsed time t=1.04 s, 512 iters, t-(init.)=1.03 s t(norm)=0.0892986, mflops=55.9919 (err=2.0e-15) 30. SCIPORT: elapsed time t=1.28 s, 512 iters, t-(init.)=1.27 s t(norm)=0.110106, mflops=45.4108 (err=1.4e-15) 31. Singleton: elapsed time t=1.95 s, 2048 iters, t-(init.)=1.9 s t(norm)=0.0411814, mflops=121.414 (err=1.9e-15) 32. Singleton (f2c): elapsed time t=1.94 s, 2048 iters, t-(init.)=1.9 s t(norm)=0.0411814, mflops=121.414 (err=1.9e-15) 33. Sorensen: elapsed time t=1.28 s, 1024 iters, t-(init.)=1.25 s t(norm)=0.054186, mflops=92.2747 (err=1.4e-15) 34. Sorensen DIT: elapsed time t=1.14 s, 512 iters, t-(init.)=1.13 s t(norm)=0.0979684, mflops=51.0369 (err=1.4e-15) 35. Temperton: elapsed time t=1.22 s, 512 iters, t-(init.)=1.21 s t(norm)=0.104904, mflops=47.6625 (err=1.1e-07) 36. Temperton (f2c): elapsed time t=1.13 s, 512 iters, t-(init.)=1.12 s t(norm)=0.0971014, mflops=51.4926 (err=1.4e-15) 37. Valkenburg: elapsed time t=1.11 s, 64 iters, t-(init.)=1.11 s t(norm)=0.769875, mflops=6.49456 (err=1.7e-15) 38. SGIMATH: elapsed time t=1.88 s, 2048 iters, t-(init.)=1.83 s t(norm)=0.0396642, mflops=126.058 (err=1.4e-15) Top mflops for N=2048 = 181.643 Normalized results and averages for N=2048: fft 0: mflops = 92.2747 (norm. = 0.508), norm. avg. (of 11) = 0.410814 fft 1: mflops = 91.5423 (norm. = 0.503968), norm. avg. (of 11) = 0.392017 fft 2: mflops = 73.0021 (norm. = 0.401899), norm. avg. (of 11) = 0.287475 fft 3: mflops = 39.7736 (norm. = 0.218966), norm. avg. (of 11) = 0.094743 fft 4: mflops = 35.8209 (norm. = 0.197205), norm. avg. (of 11) = 0.302334 fft 5: mflops = 26.6999 (norm. = 0.146991), norm. avg. (of 11) = 0.105033 fft 6: mflops = 84.8113 (norm. = 0.466912), norm. avg. (of 11) = 0.214655 fft 7: mflops = 57.1007 (norm. = 0.314356), norm. avg. (of 11) = 0.160371 fft 8: mflops = 57.6717 (norm. = 0.3175), norm. avg. (of 11) = 0.210268 fft 9: mflops = 164.776 (norm. = 0.907143), norm. avg. (of 11) = 0.42586 fft 10: mflops = 181.643 (norm. = 1), norm. avg. 
(of 11) = 0.438851 fft 11: mflops = 61.0282 (norm. = 0.335979), norm. avg. (of 10) = 0.218354 fft 12: mflops = 69.0679 (norm. = 0.38024), norm. avg. (of 11) = 0.397364 fft 13: mflops = 28.5503 (norm. = 0.157178), norm. avg. (of 11) = 0.125908 fft 14: mflops = 141.526 (norm. = 0.779141), norm. avg. (of 11) = 0.68977 fft 15: mflops = 135.698 (norm. = 0.747059), norm. avg. (of 11) = 0.643056 fft 16: mflops = 93.0188 (norm. = 0.512097), norm. avg. (of 11) = 0.824439 fft 17: mflops = 155.869 (norm. = 0.858108), norm. avg. (of 9) = 0.609089 fft 18: mflops = 54.9254 (norm. = 0.302381), norm. avg. (of 11) = 0.296481 fft 19: mflops = 62.6866 (norm. = 0.345109), norm. avg. (of 11) = 0.193307 fft 20: mflops = 61.6809 (norm. = 0.339572), norm. avg. (of 11) = 0.194604 fft 21: mflops = 71.6418 (norm. = 0.39441), norm. avg. (of 11) = 0.556599 fft 22: mflops = 84.1922 (norm. = 0.463504), norm. avg. (of 10) = 0.268361 fft 23: mflops = 104.858 (norm. = 0.577273), norm. avg. (of 10) = 0.331572 fft 24: mflops = 99.4339 (norm. = 0.547414), norm. avg. (of 10) = 0.30534 fft 25: mflops = 51.9565 (norm. = 0.286036), norm. avg. (of 10) = 0.142452 fft 26: mflops = 29.5752 (norm. = 0.162821), norm. avg. (of 11) = 0.12799 fft 27: mflops = 152.773 (norm. = 0.84106), norm. avg. (of 11) = 0.626949 fft 28: mflops = 167.164 (norm. = 0.92029), norm. avg. (of 11) = 0.574027 fft 29: mflops = 55.9919 (norm. = 0.308252), norm. avg. (of 10) = 0.136658 fft 30: mflops = 45.4108 (norm. = 0.25), norm. avg. (of 10) = 0.365406 fft 31: mflops = 121.414 (norm. = 0.668421), norm. avg. (of 11) = 0.326962 fft 32: mflops = 121.414 (norm. = 0.668421), norm. avg. (of 11) = 0.335707 fft 33: mflops = 92.2747 (norm. = 0.508), norm. avg. (of 11) = 0.396392 fft 34: mflops = 51.0369 (norm. = 0.280973), norm. avg. (of 11) = 0.206928 fft 35: mflops = 47.6625 (norm. = 0.262397), norm. avg. (of 11) = 0.14712 fft 36: mflops = 51.4926 (norm. = 0.283482), norm. avg. (of 11) = 0.14974 fft 37: mflops = 6.49456 (norm. 
= 0.0357545), norm. avg. (of 11) = 0.0346657 fft 38: mflops = 126.058 (norm. = 0.693989), norm. avg. (of 11) = 0.716885 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.46 s, 256 iters, t-(init.)=1.42 s t(norm)=0.112851, mflops=44.306 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.38 s, 256 iters, t-(init.)=1.34 s t(norm)=0.106494, mflops=46.9512 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.6 s, 256 iters, t-(init.)=1.56 s t(norm)=0.123978, mflops=40.3298 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.62 s, 256 iters, t-(init.)=1.58 s t(norm)=0.125567, mflops=39.8193 (err=3.7e-15) 4. Bailey: elapsed time t=1.62 s, 128 iters, t-(init.)=1.6 s t(norm)=0.254313, mflops=19.6608 (err=3.7e-15) 5. Beauregard: elapsed time t=1.33 s, 128 iters, t-(init.)=1.31 s t(norm)=0.208219, mflops=24.0132 (err=3.8e-15) 6. Bergland: elapsed time t=1.08 s, 256 iters, t-(init.)=1.04 s t(norm)=0.0826518, mflops=60.4948 (err=3.9e-15) 7. Brenner: elapsed time t=1.51 s, 256 iters, t-(init.)=1.47 s t(norm)=0.116825, mflops=42.799 (err=3.8e-15) 8. Burrus: elapsed time t=1.88 s, 256 iters, t-(init.)=1.84 s t(norm)=0.14623, mflops=34.1927 (err=3.7e-15) 9. CWP (min N) (N=4290): elapsed time t=1.98 s, 1024 iters, t-(init.)=1.83 s t(norm)=0.0363588, mflops=137.518 10. CWP (best N) (N=4368): elapsed time t=1.78 s, 1024 iters, t-(init.)=1.63 s t(norm)=0.0323852, mflops=154.392 11. Edelblute: elapsed time t=1.76 s, 256 iters, t-(init.)=1.72 s t(norm)=0.136693, mflops=36.5782 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.33 s, 256 iters, t-(init.)=1.29 s t(norm)=0.10252, mflops=48.771 (err=3.8e-15) 13. FFTPACK (f2c): elapsed time t=1.03 s, 128 iters, t-(init.)=1.01 s t(norm)=0.160535, mflops=31.1458 (err=3.8e-15) FFTW_MEASURE plan: (cost = 2.265625e-03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. 
FFTW: elapsed time t=1.22 s, 512 iters, t-(init.)=1.15 s t(norm)=0.0456969, mflops=109.417 (err=3.8e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.22 s, 512 iters, t-(init.)=1.15 s t(norm)=0.0456969, mflops=109.417 (err=3.8e-15) 16. Frigo-old: elapsed time t=1.69 s, 512 iters, t-(init.)=1.62 s t(norm)=0.064373, mflops=77.6723 (err=3.8e-15) 17. Green: elapsed time t=1.31 s, 512 iters, t-(init.)=1.23 s t(norm)=0.0488758, mflops=102.3 (err=3.8e-15) 18. GSL: elapsed time t=1.51 s, 256 iters, t-(init.)=1.47 s t(norm)=0.116825, mflops=42.799 (err=3.8e-15) 19. GSL DIT: elapsed time t=1.82 s, 256 iters, t-(init.)=1.78 s t(norm)=0.141462, mflops=35.3453 (err=4.1e-15) 20. GSL DIF: elapsed time t=1.84 s, 256 iters, t-(init.)=1.81 s t(norm)=0.143846, mflops=34.7594 (err=4.3e-15) 21. Krukar: elapsed time t=1.19 s, 256 iters, t-(init.)=1.15 s t(norm)=0.0913938, mflops=54.7083 (err=3.8e-15) 22. Mayer (Buneman): elapsed time t=1.63 s, 512 iters, t-(init.)=1.55 s t(norm)=0.0615915, mflops=81.1801 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.35 s, 512 iters, t-(init.)=1.28 s t(norm)=0.0508626, mflops=98.304 24. Mayer (lookup): elapsed time t=1.52 s, 512 iters, t-(init.)=1.45 s t(norm)=0.0576178, mflops=86.7787 (err=3.7e-15) 25. Monro: elapsed time t=1.79 s, 256 iters, t-(init.)=1.75 s t(norm)=0.139078, mflops=35.9512 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.04 s, 128 iters, t-(init.)=1.02 s t(norm)=0.162125, mflops=30.8405 (err=4.9e-14) 27. Ooura (C): elapsed time t=1.44 s, 512 iters, t-(init.)=1.37 s t(norm)=0.0544389, mflops=91.8461 (err=3.9e-15) 28. Ooura (F): elapsed time t=1.46 s, 512 iters, t-(init.)=1.39 s t(norm)=0.0552336, mflops=90.5245 (err=3.9e-15) 29. Ransom: elapsed time t=1.14 s, 256 iters, t-(init.)=1.1 s t(norm)=0.0874201, mflops=57.1951 (err=4.3e-15) 30. SCIPORT: elapsed time t=1.48 s, 256 iters, t-(init.)=1.45 s t(norm)=0.115236, mflops=43.3894 (err=3.8e-15) 31. 
Singleton: elapsed time t=1.01 s, 256 iters, t-(init.)=0.97 s t(norm)=0.0770887, mflops=64.8604 (err=5.8e-15) 32. Singleton (f2c): elapsed time t=1.97 s, 512 iters, t-(init.)=1.9 s t(norm)=0.0754992, mflops=66.2259 (err=5.8e-15) 33. Sorensen: elapsed time t=1.2 s, 256 iters, t-(init.)=1.16 s t(norm)=0.0921885, mflops=54.2367 (err=3.7e-15) 34. Sorensen DIT: elapsed time t=1.02 s, 128 iters, t-(init.)=1 s t(norm)=0.158946, mflops=31.4573 (err=3.7e-15) 35. Temperton: elapsed time t=1.82 s, 256 iters, t-(init.)=1.79 s t(norm)=0.142256, mflops=35.1478 (err=1.2e-07) 36. Temperton (f2c): elapsed time t=1.66 s, 256 iters, t-(init.)=1.62 s t(norm)=0.128746, mflops=38.8361 (err=3.8e-15) 37. Valkenburg: elapsed time t=1.25 s, 32 iters, t-(init.)=1.25 s t(norm)=0.794729, mflops=6.29146 (err=4.0e-15) 38. SGIMATH: elapsed time t=1.06 s, 512 iters, t-(init.)=0.99 s t(norm)=0.0393391, mflops=127.1 (err=3.8e-15) Top mflops for N=4096 = 154.392 Normalized results and averages for N=4096: fft 0: mflops = 44.306 (norm. = 0.286972), norm. avg. (of 12) = 0.400494 fft 1: mflops = 46.9512 (norm. = 0.304104), norm. avg. (of 12) = 0.384691 fft 2: mflops = 40.3298 (norm. = 0.261218), norm. avg. (of 12) = 0.285287 fft 3: mflops = 39.8193 (norm. = 0.257911), norm. avg. (of 12) = 0.10834 fft 4: mflops = 19.6608 (norm. = 0.127344), norm. avg. (of 12) = 0.287752 fft 5: mflops = 24.0132 (norm. = 0.155534), norm. avg. (of 12) = 0.109241 fft 6: mflops = 60.4948 (norm. = 0.391827), norm. avg. (of 12) = 0.229419 fft 7: mflops = 42.799 (norm. = 0.277211), norm. avg. (of 12) = 0.170107 fft 8: mflops = 34.1927 (norm. = 0.221467), norm. avg. (of 12) = 0.211201 fft 9: mflops = 137.518 (norm. = 0.89071), norm. avg. (of 12) = 0.464598 fft 10: mflops = 154.392 (norm. = 1), norm. avg. (of 12) = 0.485613 fft 11: mflops = 36.5782 (norm. = 0.236919), norm. avg. (of 11) = 0.220042 fft 12: mflops = 48.771 (norm. = 0.315891), norm. avg. (of 12) = 0.390574 fft 13: mflops = 31.1458 (norm. = 0.201733), norm. avg. 
(of 12) = 0.132226 fft 14: mflops = 109.417 (norm. = 0.708696), norm. avg. (of 12) = 0.691347 fft 15: mflops = 109.417 (norm. = 0.708696), norm. avg. (of 12) = 0.648526 fft 16: mflops = 77.6723 (norm. = 0.503086), norm. avg. (of 12) = 0.797659 fft 17: mflops = 102.3 (norm. = 0.662602), norm. avg. (of 10) = 0.61444 fft 18: mflops = 42.799 (norm. = 0.277211), norm. avg. (of 12) = 0.294875 fft 19: mflops = 35.3453 (norm. = 0.228933), norm. avg. (of 12) = 0.196276 fft 20: mflops = 34.7594 (norm. = 0.225138), norm. avg. (of 12) = 0.197148 fft 21: mflops = 54.7083 (norm. = 0.354348), norm. avg. (of 12) = 0.539745 fft 22: mflops = 81.1801 (norm. = 0.525806), norm. avg. (of 11) = 0.291765 fft 23: mflops = 98.304 (norm. = 0.636719), norm. avg. (of 11) = 0.359312 fft 24: mflops = 86.7787 (norm. = 0.562069), norm. avg. (of 11) = 0.328679 fft 25: mflops = 35.9512 (norm. = 0.232857), norm. avg. (of 11) = 0.150671 fft 26: mflops = 30.8405 (norm. = 0.199755), norm. avg. (of 12) = 0.13397 fft 27: mflops = 91.8461 (norm. = 0.594891), norm. avg. (of 12) = 0.624278 fft 28: mflops = 90.5245 (norm. = 0.586331), norm. avg. (of 12) = 0.575053 fft 29: mflops = 57.1951 (norm. = 0.370455), norm. avg. (of 11) = 0.157913 fft 30: mflops = 43.3894 (norm. = 0.281034), norm. avg. (of 11) = 0.357736 fft 31: mflops = 64.8604 (norm. = 0.420103), norm. avg. (of 12) = 0.334724 fft 32: mflops = 66.2259 (norm. = 0.428947), norm. avg. (of 12) = 0.343477 fft 33: mflops = 54.2367 (norm. = 0.351293), norm. avg. (of 12) = 0.392633 fft 34: mflops = 31.4573 (norm. = 0.20375), norm. avg. (of 12) = 0.206664 fft 35: mflops = 35.1478 (norm. = 0.227654), norm. avg. (of 12) = 0.153831 fft 36: mflops = 38.8361 (norm. = 0.251543), norm. avg. (of 12) = 0.158224 fft 37: mflops = 6.29146 (norm. = 0.04075), norm. avg. (of 12) = 0.0351727 fft 38: mflops = 127.1 (norm. = 0.823232), norm. avg. (of 12) = 0.725748 Benchmarking for array size = 8192 (power of 2): 0. 
Arndt DIF: elapsed time t=1.54 s, 128 iters, t-(init.)=1.51 s t(norm)=0.110773, mflops=45.1374 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.48 s, 128 iters, t-(init.)=1.44 s t(norm)=0.105638, mflops=47.3316 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.95 s, 128 iters, t-(init.)=1.91 s t(norm)=0.140117, mflops=35.6845 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.85 s, 128 iters, t-(init.)=1.81 s t(norm)=0.132781, mflops=37.656 (err=3.7e-15) 4. Bailey: elapsed time t=1.74 s, 64 iters, t-(init.)=1.72 s t(norm)=0.252357, mflops=19.8132 (err=3.7e-15) 5. Beauregard: elapsed time t=1.46 s, 64 iters, t-(init.)=1.44 s t(norm)=0.211276, mflops=23.6658 (err=3.7e-15) 6. Bergland: elapsed time t=1.25 s, 128 iters, t-(init.)=1.21 s t(norm)=0.0887651, mflops=56.3285 (err=3.7e-15) 7. Brenner: elapsed time t=1.69 s, 128 iters, t-(init.)=1.66 s t(norm)=0.121777, mflops=41.0587 (err=3.7e-15) 8. Burrus: elapsed time t=1.1 s, 64 iters, t-(init.)=1.08 s t(norm)=0.158457, mflops=31.5544 (err=3.7e-15) 9. CWP (min N) (N=8580): elapsed time t=1.03 s, 256 iters, t-(init.)=0.96 s t(norm)=0.0352126, mflops=141.995 10. CWP (best N) (N=9240): elapsed time t=1.97 s, 512 iters, t-(init.)=1.81 s t(norm)=0.0331952, mflops=150.624 11. Edelblute: elapsed time t=1.04 s, 64 iters, t-(init.)=1.02 s t(norm)=0.149654, mflops=33.4105 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.72 s, 128 iters, t-(init.)=1.68 s t(norm)=0.123244, mflops=40.5699 (err=3.7e-15) 13. FFTPACK (f2c): elapsed time t=1.39 s, 64 iters, t-(init.)=1.37 s t(norm)=0.201005, mflops=24.875 (err=3.7e-15) FFTW_MEASURE plan: (cost = 5.156250e-03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.37 s, 256 iters, t-(init.)=1.29 s t(norm)=0.0473169, mflops=105.67 (err=3.7e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.37 s, 256 iters, t-(init.)=1.3 s t(norm)=0.0476837, mflops=104.858 (err=3.7e-15) 16. 
Frigo-old: elapsed time t=1.76 s, 256 iters, t-(init.)=1.69 s t(norm)=0.0619888, mflops=80.6597 (err=3.7e-15) 17. Green: elapsed time t=1.61 s, 256 iters, t-(init.)=1.54 s t(norm)=0.0564869, mflops=88.5162 (err=3.7e-15) 18. GSL: elapsed time t=1.83 s, 128 iters, t-(init.)=1.79 s t(norm)=0.131314, mflops=38.0768 (err=3.7e-15) 19. GSL DIT: elapsed time t=1.04 s, 64 iters, t-(init.)=1.03 s t(norm)=0.151121, mflops=33.0861 (err=4.3e-15) 20. GSL DIF: elapsed time t=1.03 s, 64 iters, t-(init.)=1.01 s t(norm)=0.148186, mflops=33.7413 (err=4.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.36 s, 128 iters, t-(init.)=1.32 s t(norm)=0.0968346, mflops=51.6344 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.23 s, 128 iters, t-(init.)=1.19 s t(norm)=0.0872979, mflops=57.2752 24. Mayer (lookup): elapsed time t=1.28 s, 128 iters, t-(init.)=1.24 s t(norm)=0.0909659, mflops=54.9657 (err=3.7e-15) 25. Monro: elapsed time t=1.98 s, 128 iters, t-(init.)=1.94 s t(norm)=0.142318, mflops=35.1327 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.29 s, 64 iters, t-(init.)=1.27 s t(norm)=0.186333, mflops=26.8336 (err=4.5e-14) 27. Ooura (C): elapsed time t=1.6 s, 256 iters, t-(init.)=1.52 s t(norm)=0.0557533, mflops=89.6808 (err=3.7e-15) 28. Ooura (F): elapsed time t=1.61 s, 256 iters, t-(init.)=1.54 s t(norm)=0.0564869, mflops=88.5162 (err=3.7e-15) 29. Ransom: elapsed time t=1.33 s, 128 iters, t-(init.)=1.3 s t(norm)=0.0953674, mflops=52.4288 (err=4.8e-15) 30. SCIPORT: elapsed time t=1.63 s, 128 iters, t-(init.)=1.59 s t(norm)=0.116642, mflops=42.8663 (err=3.7e-15) 31. Singleton: elapsed time t=1.17 s, 128 iters, t-(init.)=1.13 s t(norm)=0.0828963, mflops=60.3163 (err=5.6e-15) 32. Singleton (f2c): elapsed time t=1.16 s, 128 iters, t-(init.)=1.13 s t(norm)=0.0828963, mflops=60.3163 (err=5.6e-15) 33. Sorensen: elapsed time t=1.51 s, 128 iters, t-(init.)=1.48 s t(norm)=0.108572, mflops=46.0523 (err=3.7e-15) 34. 
Sorensen DIT: elapsed time t=1.21 s, 64 iters, t-(init.)=1.19 s t(norm)=0.174596, mflops=28.6376 (err=3.7e-15) 35. Temperton: elapsed time t=1.05 s, 64 iters, t-(init.)=1.03 s t(norm)=0.151121, mflops=33.0861 (err=1.4e-07) 36. Temperton (f2c): elapsed time t=1 s, 64 iters, t-(init.)=0.98 s t(norm)=0.143785, mflops=34.7742 (err=3.7e-15) 37. Valkenburg: elapsed time t=1.34 s, 16 iters, t-(init.)=1.33 s t(norm)=0.780546, mflops=6.40577 (err=3.8e-15) 38. SGIMATH: elapsed time t=1.08 s, 256 iters, t-(init.)=1.01 s t(norm)=0.0370466, mflops=134.965 (err=3.7e-15) Top mflops for N=8192 = 150.624 Normalized results and averages for N=8192: fft 0: mflops = 45.1374 (norm. = 0.299669), norm. avg. (of 13) = 0.392738 fft 1: mflops = 47.3316 (norm. = 0.314236), norm. avg. (of 13) = 0.379272 fft 2: mflops = 35.6845 (norm. = 0.236911), norm. avg. (of 13) = 0.281566 fft 3: mflops = 37.656 (norm. = 0.25), norm. avg. (of 13) = 0.119237 fft 4: mflops = 19.8132 (norm. = 0.131541), norm. avg. (of 13) = 0.275736 fft 5: mflops = 23.6658 (norm. = 0.157118), norm. avg. (of 13) = 0.112924 fft 6: mflops = 56.3285 (norm. = 0.373967), norm. avg. (of 13) = 0.240538 fft 7: mflops = 41.0587 (norm. = 0.27259), norm. avg. (of 13) = 0.177991 fft 8: mflops = 31.5544 (norm. = 0.209491), norm. avg. (of 13) = 0.21107 fft 9: mflops = 141.995 (norm. = 0.942708), norm. avg. (of 13) = 0.501375 fft 10: mflops = 150.624 (norm. = 1), norm. avg. (of 13) = 0.525181 fft 11: mflops = 33.4105 (norm. = 0.221814), norm. avg. (of 12) = 0.22019 fft 12: mflops = 40.5699 (norm. = 0.269345), norm. avg. (of 13) = 0.381249 fft 13: mflops = 24.875 (norm. = 0.165146), norm. avg. (of 13) = 0.134759 fft 14: mflops = 105.67 (norm. = 0.70155), norm. avg. (of 13) = 0.692132 fft 15: mflops = 104.858 (norm. = 0.696154), norm. avg. (of 13) = 0.65219 fft 16: mflops = 80.6597 (norm. = 0.535503), norm. avg. (of 13) = 0.777494 fft 17: mflops = 88.5162 (norm. = 0.587662), norm. avg. (of 11) = 0.612006 fft 18: mflops = 38.0768 (norm. 
= 0.252793), norm. avg. (of 13) = 0.291638 fft 19: mflops = 33.0861 (norm. = 0.21966), norm. avg. (of 13) = 0.198075 fft 20: mflops = 33.7413 (norm. = 0.22401), norm. avg. (of 13) = 0.199214 fft 21: mflops = -1 (norm. = -0.00663904), norm. avg. (of 12) = 0.539745 fft 22: mflops = 51.6344 (norm. = 0.342803), norm. avg. (of 12) = 0.296018 fft 23: mflops = 57.2752 (norm. = 0.380252), norm. avg. (of 12) = 0.361057 fft 24: mflops = 54.9657 (norm. = 0.364919), norm. avg. (of 12) = 0.331699 fft 25: mflops = 35.1327 (norm. = 0.233247), norm. avg. (of 12) = 0.157552 fft 26: mflops = 26.8336 (norm. = 0.17815), norm. avg. (of 13) = 0.137369 fft 27: mflops = 89.6808 (norm. = 0.595395), norm. avg. (of 13) = 0.622056 fft 28: mflops = 88.5162 (norm. = 0.587662), norm. avg. (of 13) = 0.576023 fft 29: mflops = 52.4288 (norm. = 0.348077), norm. avg. (of 12) = 0.17376 fft 30: mflops = 42.8663 (norm. = 0.284591), norm. avg. (of 12) = 0.35164 fft 31: mflops = 60.3163 (norm. = 0.400442), norm. avg. (of 13) = 0.339779 fft 32: mflops = 60.3163 (norm. = 0.400442), norm. avg. (of 13) = 0.347859 fft 33: mflops = 46.0523 (norm. = 0.305743), norm. avg. (of 13) = 0.38595 fft 34: mflops = 28.6376 (norm. = 0.190126), norm. avg. (of 13) = 0.205391 fft 35: mflops = 33.0861 (norm. = 0.21966), norm. avg. (of 13) = 0.158895 fft 36: mflops = 34.7742 (norm. = 0.230867), norm. avg. (of 13) = 0.163812 fft 37: mflops = 6.40577 (norm. = 0.0425282), norm. avg. (of 13) = 0.0357385 fft 38: mflops = 134.965 (norm. = 0.89604), norm. avg. (of 13) = 0.738847 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.85 s, 64 iters, t-(init.)=1.82 s t(norm)=0.123978, mflops=40.3298 (err=6.8e-15) 1. Arndt DIT: elapsed time t=1.73 s, 64 iters, t-(init.)=1.69 s t(norm)=0.115122, mflops=43.4321 (err=6.8e-15) 2. Arndt Split-Radix: elapsed time t=1.09 s, 32 iters, t-(init.)=1.07 s t(norm)=0.145776, mflops=34.2992 (err=6.8e-15) 3. 
Arndt 4-step: elapsed time t=1.67 s, 64 iters, t-(init.)=1.64 s t(norm)=0.111716, mflops=44.7563 (err=6.8e-15) 4. Bailey: elapsed time t=1 s, 16 iters, t-(init.)=0.99 s t(norm)=0.269754, mflops=18.5354 (err=6.8e-15) 5. Beauregard: elapsed time t=1.58 s, 32 iters, t-(init.)=1.56 s t(norm)=0.212533, mflops=23.5257 (err=6.8e-15) 6. Bergland: elapsed time t=1.3 s, 64 iters, t-(init.)=1.26 s t(norm)=0.0858307, mflops=58.2542 (err=6.8e-15) 7. Brenner: elapsed time t=1.83 s, 64 iters, t-(init.)=1.79 s t(norm)=0.121934, mflops=41.0058 (err=6.8e-15) 8. Burrus: elapsed time t=1.24 s, 32 iters, t-(init.)=1.22 s t(norm)=0.166212, mflops=30.0821 (err=6.8e-15) 9. CWP (min N) (N=17160): elapsed time t=1.06 s, 128 iters, t-(init.)=0.98 s t(norm)=0.0333786, mflops=149.797 10. CWP (best N) (N=17160): elapsed time t=1.06 s, 128 iters, t-(init.)=0.98 s t(norm)=0.0333786, mflops=149.797 11. Edelblute: elapsed time t=1.15 s, 32 iters, t-(init.)=1.13 s t(norm)=0.15395, mflops=32.478 (err=6.8e-15) 12. FFTPACK: elapsed time t=1.8 s, 64 iters, t-(init.)=1.77 s t(norm)=0.120572, mflops=41.4691 (err=6.8e-15) 13. FFTPACK (f2c): elapsed time t=1.38 s, 32 iters, t-(init.)=1.36 s t(norm)=0.185285, mflops=26.9854 (err=6.8e-15) FFTW_MEASURE plan: (cost = 1.125000e-02) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.63 s, 128 iters, t-(init.)=1.55 s t(norm)=0.0527927, mflops=94.7101 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.68 s, 128 iters, t-(init.)=1.61 s t(norm)=0.0548363, mflops=91.1805 (err=6.8e-15) 16. Frigo-old: elapsed time t=1.09 s, 64 iters, t-(init.)=1.05 s t(norm)=0.0715256, mflops=69.9051 (err=6.8e-15) 17. Green: elapsed time t=1.78 s, 128 iters, t-(init.)=1.71 s t(norm)=0.0582423, mflops=85.8483 (err=6.8e-15) 18. GSL: elapsed time t=1.92 s, 64 iters, t-(init.)=1.88 s t(norm)=0.128065, mflops=39.0427 (err=6.8e-15) 19. 
GSL DIT: elapsed time t=1.17 s, 32 iters, t-(init.)=1.15 s t(norm)=0.156675, mflops=31.9132 (err=7.2e-15) 20. GSL DIF: elapsed time t=1.15 s, 32 iters, t-(init.)=1.13 s t(norm)=0.15395, mflops=32.478 (err=7.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.52 s, 64 iters, t-(init.)=1.49 s t(norm)=0.101498, mflops=49.262 (err=6.8e-15) 23. Mayer (simple): elapsed time t=1.41 s, 64 iters, t-(init.)=1.38 s t(norm)=0.094005, mflops=53.1886 24. Mayer (lookup): elapsed time t=1.45 s, 64 iters, t-(init.)=1.41 s t(norm)=0.0960486, mflops=52.057 (err=6.8e-15) 25. Monro: elapsed time t=1.08 s, 32 iters, t-(init.)=1.06 s t(norm)=0.144414, mflops=34.6228 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.36 s, 32 iters, t-(init.)=1.34 s t(norm)=0.182561, mflops=27.3882 (err=2.3e-13) 27. Ooura (C): elapsed time t=1.92 s, 128 iters, t-(init.)=1.84 s t(norm)=0.06267, mflops=79.783 (err=6.8e-15) 28. Ooura (F): elapsed time t=1.01 s, 64 iters, t-(init.)=0.97 s t(norm)=0.066076, mflops=75.6704 (err=6.8e-15) 29. Ransom: elapsed time t=1.15 s, 64 iters, t-(init.)=1.11 s t(norm)=0.0756127, mflops=66.1264 (err=7.3e-15) 30. SCIPORT: elapsed time t=1.99 s, 64 iters, t-(init.)=1.96 s t(norm)=0.133514, mflops=37.4491 (err=6.8e-15) 31. Singleton: elapsed time t=1.33 s, 64 iters, t-(init.)=1.3 s t(norm)=0.0885555, mflops=56.4618 (err=1.0e-14) 32. Singleton (f2c): elapsed time t=1.29 s, 64 iters, t-(init.)=1.25 s t(norm)=0.0851495, mflops=58.7203 (err=1.0e-14) 33. Sorensen: elapsed time t=1.89 s, 64 iters, t-(init.)=1.85 s t(norm)=0.126021, mflops=39.6758 (err=6.8e-15) 34. Sorensen DIT: elapsed time t=1.36 s, 32 iters, t-(init.)=1.35 s t(norm)=0.183923, mflops=27.1853 (err=6.8e-15) 35. Temperton: elapsed time t=1.12 s, 32 iters, t-(init.)=1.1 s t(norm)=0.149863, mflops=33.3638 (err=1.5e-07) 36. Temperton (f2c): elapsed time t=1.05 s, 32 iters, t-(init.)=1.03 s t(norm)=0.140326, mflops=35.6312 (err=6.8e-15) 37. 
Valkenburg: elapsed time t=1.48 s, 8 iters, t-(init.)=1.48 s t(norm)=0.806536, mflops=6.19935 (err=6.9e-15) 38. SGIMATH: elapsed time t=1.13 s, 128 iters, t-(init.)=1.06 s t(norm)=0.0361034, mflops=138.491 (err=6.8e-15) Top mflops for N=16384 = 149.797 Normalized results and averages for N=16384: fft 0: mflops = 40.3298 (norm. = 0.269231), norm. avg. (of 14) = 0.383916 fft 1: mflops = 43.4321 (norm. = 0.289941), norm. avg. (of 14) = 0.372891 fft 2: mflops = 34.2992 (norm. = 0.228972), norm. avg. (of 14) = 0.277809 fft 3: mflops = 44.7563 (norm. = 0.29878), norm. avg. (of 14) = 0.132062 fft 4: mflops = 18.5354 (norm. = 0.123737), norm. avg. (of 14) = 0.264879 fft 5: mflops = 23.5257 (norm. = 0.157051), norm. avg. (of 14) = 0.116076 fft 6: mflops = 58.2542 (norm. = 0.388889), norm. avg. (of 14) = 0.251135 fft 7: mflops = 41.0058 (norm. = 0.273743), norm. avg. (of 14) = 0.18483 fft 8: mflops = 30.0821 (norm. = 0.20082), norm. avg. (of 14) = 0.210337 fft 9: mflops = 149.797 (norm. = 1), norm. avg. (of 14) = 0.536991 fft 10: mflops = 149.797 (norm. = 1), norm. avg. (of 14) = 0.559097 fft 11: mflops = 32.478 (norm. = 0.216814), norm. avg. (of 13) = 0.21993 fft 12: mflops = 41.4691 (norm. = 0.276836), norm. avg. (of 14) = 0.373791 fft 13: mflops = 26.9854 (norm. = 0.180147), norm. avg. (of 14) = 0.138001 fft 14: mflops = 94.7101 (norm. = 0.632258), norm. avg. (of 14) = 0.687855 fft 15: mflops = 91.1805 (norm. = 0.608696), norm. avg. (of 14) = 0.649083 fft 16: mflops = 69.9051 (norm. = 0.466667), norm. avg. (of 14) = 0.755292 fft 17: mflops = 85.8483 (norm. = 0.573099), norm. avg. (of 12) = 0.608764 fft 18: mflops = 39.0427 (norm. = 0.260638), norm. avg. (of 14) = 0.289424 fft 19: mflops = 31.9132 (norm. = 0.213043), norm. avg. (of 14) = 0.199144 fft 20: mflops = 32.478 (norm. = 0.216814), norm. avg. (of 14) = 0.200472 fft 21: mflops = -1 (norm. = -0.00667572), norm. avg. (of 12) = 0.539745 fft 22: mflops = 49.262 (norm. = 0.328859), norm. avg. 
(of 13) = 0.298544 fft 23: mflops = 53.1886 (norm. = 0.355072), norm. avg. (of 13) = 0.360597 fft 24: mflops = 52.057 (norm. = 0.347518), norm. avg. (of 13) = 0.332916 fft 25: mflops = 34.6228 (norm. = 0.231132), norm. avg. (of 13) = 0.163212 fft 26: mflops = 27.3882 (norm. = 0.182836), norm. avg. (of 14) = 0.140616 fft 27: mflops = 79.783 (norm. = 0.532609), norm. avg. (of 14) = 0.615667 fft 28: mflops = 75.6704 (norm. = 0.505155), norm. avg. (of 14) = 0.570961 fft 29: mflops = 66.1264 (norm. = 0.441441), norm. avg. (of 13) = 0.194351 fft 30: mflops = 37.4491 (norm. = 0.25), norm. avg. (of 13) = 0.343822 fft 31: mflops = 56.4618 (norm. = 0.376923), norm. avg. (of 14) = 0.342432 fft 32: mflops = 58.7203 (norm. = 0.392), norm. avg. (of 14) = 0.351012 fft 33: mflops = 39.6758 (norm. = 0.264865), norm. avg. (of 14) = 0.377301 fft 34: mflops = 27.1853 (norm. = 0.181481), norm. avg. (of 14) = 0.203684 fft 35: mflops = 33.3638 (norm. = 0.222727), norm. avg. (of 14) = 0.163454 fft 36: mflops = 35.6312 (norm. = 0.237864), norm. avg. (of 14) = 0.169101 fft 37: mflops = 6.19935 (norm. = 0.0413851), norm. avg. (of 14) = 0.0361418 fft 38: mflops = 138.491 (norm. = 0.924528), norm. avg. (of 14) = 0.75211 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.9 s, 32 iters, t-(init.)=1.87 s t(norm)=0.118891, mflops=42.0552 (err=1.4e-14) 1. Arndt DIT: elapsed time t=1.84 s, 32 iters, t-(init.)=1.8 s t(norm)=0.114441, mflops=43.6907 (err=1.4e-14) 2. Arndt Split-Radix: elapsed time t=1.21 s, 16 iters, t-(init.)=1.19 s t(norm)=0.151316, mflops=33.0434 (err=1.4e-14) 3. Arndt 4-step: elapsed time t=1.95 s, 32 iters, t-(init.)=1.91 s t(norm)=0.121435, mflops=41.1745 (err=1.4e-14) 4. Bailey: elapsed time t=1.42 s, 8 iters, t-(init.)=1.41 s t(norm)=0.358582, mflops=13.9438 (err=1.4e-14) 5. Beauregard: elapsed time t=1.7 s, 16 iters, t-(init.)=1.68 s t(norm)=0.213623, mflops=23.4057 (err=1.4e-14) 6. 
Bergland: elapsed time t=1.39 s, 32 iters, t-(init.)=1.35 s t(norm)=0.0858307, mflops=58.2542 (err=1.4e-14) 7. Brenner: elapsed time t=1 s, 16 iters, t-(init.)=0.98 s t(norm)=0.124613, mflops=40.1241 (err=1.4e-14) 8. Burrus: elapsed time t=1.35 s, 16 iters, t-(init.)=1.33 s t(norm)=0.169118, mflops=29.5651 (err=1.4e-14) 9. CWP (min N) (N=34320): elapsed time t=1.13 s, 64 iters, t-(init.)=1.05 s t(norm)=0.0333786, mflops=149.797 10. CWP (best N) (N=34320): elapsed time t=1.13 s, 64 iters, t-(init.)=1.05 s t(norm)=0.0333786, mflops=149.797 11. Edelblute: elapsed time t=1.27 s, 16 iters, t-(init.)=1.25 s t(norm)=0.158946, mflops=31.4573 (err=1.4e-14) 12. FFTPACK: elapsed time t=1.42 s, 16 iters, t-(init.)=1.4 s t(norm)=0.178019, mflops=28.0869 (err=1.4e-14) 13. FFTPACK (f2c): elapsed time t=1.02 s, 8 iters, t-(init.)=1.01 s t(norm)=0.256856, mflops=19.4661 (err=1.4e-14) FFTW_MEASURE plan: (cost = 2.875000e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.84 s, 64 iters, t-(init.)=1.77 s t(norm)=0.0562668, mflops=88.8624 (err=1.4e-14) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.02 s, 32 iters, t-(init.)=0.99 s t(norm)=0.0629425, mflops=79.4376 (err=1.4e-14) 16. Frigo-old: elapsed time t=1.55 s, 32 iters, t-(init.)=1.51 s t(norm)=0.0960032, mflops=52.0816 (err=1.4e-14) 17. Green: elapsed time t=1.94 s, 64 iters, t-(init.)=1.87 s t(norm)=0.0594457, mflops=84.1104 (err=1.4e-14) 18. GSL: elapsed time t=1.16 s, 16 iters, t-(init.)=1.14 s t(norm)=0.144958, mflops=34.4926 (err=1.4e-14) 19. GSL DIT: elapsed time t=1.29 s, 16 iters, t-(init.)=1.27 s t(norm)=0.161489, mflops=30.9619 (err=1.4e-14) 20. GSL DIF: elapsed time t=1.26 s, 16 iters, t-(init.)=1.24 s t(norm)=0.157674, mflops=31.711 (err=1.4e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. 
Mayer (Buneman): elapsed time t=1.69 s, 32 iters, t-(init.)=1.65 s t(norm)=0.104904, mflops=47.6625 (err=1.4e-14) 23. Mayer (simple): elapsed time t=1.57 s, 32 iters, t-(init.)=1.54 s t(norm)=0.0979106, mflops=51.067 24. Mayer (lookup): elapsed time t=1.61 s, 32 iters, t-(init.)=1.57 s t(norm)=0.0998179, mflops=50.0912 (err=1.4e-14) 25. Monro: elapsed time t=1.16 s, 16 iters, t-(init.)=1.14 s t(norm)=0.144958, mflops=34.4926 (err=1.5e-07) 26. NAPACK (f2c): elapsed time t=1.62 s, 16 iters, t-(init.)=1.6 s t(norm)=0.203451, mflops=24.576 (err=5.6e-13) 27. Ooura (C): elapsed time t=1.03 s, 32 iters, t-(init.)=1 s t(norm)=0.0635783, mflops=78.6432 (err=1.4e-14) 28. Ooura (F): elapsed time t=1.08 s, 32 iters, t-(init.)=1.05 s t(norm)=0.0667572, mflops=74.8983 (err=1.4e-14) 29. Ransom: elapsed time t=1.34 s, 32 iters, t-(init.)=1.31 s t(norm)=0.0832876, mflops=60.033 (err=1.5e-14) 30. SCIPORT: elapsed time t=1.3 s, 16 iters, t-(init.)=1.28 s t(norm)=0.16276, mflops=30.72 (err=1.4e-14) 31. Singleton: elapsed time t=1.58 s, 32 iters, t-(init.)=1.55 s t(norm)=0.0985463, mflops=50.7375 (err=2.1e-14) 32. Singleton (f2c): elapsed time t=1.55 s, 32 iters, t-(init.)=1.51 s t(norm)=0.0960032, mflops=52.0816 (err=2.1e-14) 33. Sorensen: elapsed time t=1.07 s, 16 iters, t-(init.)=1.05 s t(norm)=0.133514, mflops=37.4491 (err=1.4e-14) 34. Sorensen DIT: elapsed time t=1.49 s, 16 iters, t-(init.)=1.46 s t(norm)=0.185649, mflops=26.9326 (err=1.4e-14) 35. Temperton: elapsed time t=1.26 s, 16 iters, t-(init.)=1.24 s t(norm)=0.157674, mflops=31.711 (err=1.5e-07) 36. Temperton (f2c): elapsed time t=1.17 s, 16 iters, t-(init.)=1.15 s t(norm)=0.14623, mflops=34.1927 (err=1.4e-14) 37. Valkenburg: elapsed time t=1.55 s, 4 iters, t-(init.)=1.54 s t(norm)=0.783285, mflops=6.38338 (err=1.4e-14) 38. 
SGIMATH: elapsed time t=1.16 s, 64 iters, t-(init.)=1.09 s t(norm)=0.0346502, mflops=144.299 (err=1.4e-14) Top mflops for N=32768 = 149.797 Normalized results and averages for N=32768: fft 0: mflops = 42.0552 (norm. = 0.280749), norm. avg. (of 15) = 0.377038 fft 1: mflops = 43.6907 (norm. = 0.291667), norm. avg. (of 15) = 0.367476 fft 2: mflops = 33.0434 (norm. = 0.220588), norm. avg. (of 15) = 0.273994 fft 3: mflops = 41.1745 (norm. = 0.274869), norm. avg. (of 15) = 0.141582 fft 4: mflops = 13.9438 (norm. = 0.0930851), norm. avg. (of 15) = 0.253426 fft 5: mflops = 23.4057 (norm. = 0.15625), norm. avg. (of 15) = 0.118754 fft 6: mflops = 58.2542 (norm. = 0.388889), norm. avg. (of 15) = 0.260318 fft 7: mflops = 40.1241 (norm. = 0.267857), norm. avg. (of 15) = 0.190365 fft 8: mflops = 29.5651 (norm. = 0.197368), norm. avg. (of 15) = 0.209473 fft 9: mflops = 149.797 (norm. = 1), norm. avg. (of 15) = 0.567859 fft 10: mflops = 149.797 (norm. = 1), norm. avg. (of 15) = 0.58849 fft 11: mflops = 31.4573 (norm. = 0.21), norm. avg. (of 14) = 0.219221 fft 12: mflops = 28.0869 (norm. = 0.1875), norm. avg. (of 15) = 0.361371 fft 13: mflops = 19.4661 (norm. = 0.12995), norm. avg. (of 15) = 0.137464 fft 14: mflops = 88.8624 (norm. = 0.59322), norm. avg. (of 15) = 0.681546 fft 15: mflops = 79.4376 (norm. = 0.530303), norm. avg. (of 15) = 0.641164 fft 16: mflops = 52.0816 (norm. = 0.347682), norm. avg. (of 15) = 0.728118 fft 17: mflops = 84.1104 (norm. = 0.561497), norm. avg. (of 13) = 0.605128 fft 18: mflops = 34.4926 (norm. = 0.230263), norm. avg. (of 15) = 0.28548 fft 19: mflops = 30.9619 (norm. = 0.206693), norm. avg. (of 15) = 0.199647 fft 20: mflops = 31.711 (norm. = 0.211694), norm. avg. (of 15) = 0.20122 fft 21: mflops = -1 (norm. = -0.00667572), norm. avg. (of 12) = 0.539745 fft 22: mflops = 47.6625 (norm. = 0.318182), norm. avg. (of 14) = 0.299947 fft 23: mflops = 51.067 (norm. = 0.340909), norm. avg. (of 14) = 0.359191 fft 24: mflops = 50.0912 (norm. = 0.334395), norm. 
avg. (of 14) = 0.333022 fft 25: mflops = 34.4926 (norm. = 0.230263), norm. avg. (of 14) = 0.168001 fft 26: mflops = 24.576 (norm. = 0.164062), norm. avg. (of 15) = 0.142179 fft 27: mflops = 78.6432 (norm. = 0.525), norm. avg. (of 15) = 0.609623 fft 28: mflops = 74.8983 (norm. = 0.5), norm. avg. (of 15) = 0.56623 fft 29: mflops = 60.033 (norm. = 0.400763), norm. avg. (of 14) = 0.209094 fft 30: mflops = 30.72 (norm. = 0.205078), norm. avg. (of 14) = 0.333911 fft 31: mflops = 50.7375 (norm. = 0.33871), norm. avg. (of 15) = 0.342184 fft 32: mflops = 52.0816 (norm. = 0.347682), norm. avg. (of 15) = 0.35079 fft 33: mflops = 37.4491 (norm. = 0.25), norm. avg. (of 15) = 0.368814 fft 34: mflops = 26.9326 (norm. = 0.179795), norm. avg. (of 15) = 0.202091 fft 35: mflops = 31.711 (norm. = 0.211694), norm. avg. (of 15) = 0.16667 fft 36: mflops = 34.1927 (norm. = 0.228261), norm. avg. (of 15) = 0.173045 fft 37: mflops = 6.38338 (norm. = 0.0426136), norm. avg. (of 15) = 0.0365733 fft 38: mflops = 144.299 (norm. = 0.963303), norm. avg. (of 15) = 0.766189 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.54 s, 8 iters, t-(init.)=1.52 s t(norm)=0.181198, mflops=27.5941 (err=1.7e-14) 1. Arndt DIT: elapsed time t=1.44 s, 8 iters, t-(init.)=1.41 s t(norm)=0.168085, mflops=29.7468 (err=1.7e-14) 2. Arndt Split-Radix: elapsed time t=1.82 s, 8 iters, t-(init.)=1.8 s t(norm)=0.214577, mflops=23.3017 (err=1.7e-14) 3. Arndt 4-step: elapsed time t=1.92 s, 16 iters, t-(init.)=1.87 s t(norm)=0.111461, mflops=44.8589 (err=1.7e-14) 4. Bailey: elapsed time t=1.8 s, 2 iters, t-(init.)=1.79 s t(norm)=0.853539, mflops=5.85797 (err=1.7e-14) 5. Beauregard: elapsed time t=1.86 s, 8 iters, t-(init.)=1.84 s t(norm)=0.219345, mflops=22.7951 (err=1.7e-14) 6. Bergland: elapsed time t=1.85 s, 16 iters, t-(init.)=1.81 s t(norm)=0.107884, mflops=46.3459 (err=1.7e-14) 7. Brenner: elapsed time t=1.25 s, 8 iters, t-(init.)=1.22 s t(norm)=0.145435, mflops=34.3795 (err=1.7e-14) 8. 
Burrus: elapsed time t=1.01 s, 4 iters, t-(init.)=1 s t(norm)=0.238419, mflops=20.9715 (err=1.7e-14) 9. CWP (min N) (N=72072): elapsed time t=1.4 s, 32 iters, t-(init.)=1.27 s t(norm)=0.0378489, mflops=132.104 10. CWP (best N) (N=72072): elapsed time t=1.4 s, 32 iters, t-(init.)=1.27 s t(norm)=0.0378489, mflops=132.104 11. Edelblute: elapsed time t=1.88 s, 8 iters, t-(init.)=1.86 s t(norm)=0.221729, mflops=22.55 (err=1.7e-14) 12. FFTPACK: elapsed time t=1.47 s, 4 iters, t-(init.)=1.46 s t(norm)=0.348091, mflops=14.3641 (err=1.7e-14) 13. FFTPACK (f2c): elapsed time t=1.48 s, 4 iters, t-(init.)=1.47 s t(norm)=0.350475, mflops=14.2663 (err=1.7e-14) FFTW_MEASURE plan: (cost = 8.250000e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.34 s, 16 iters, t-(init.)=1.29 s t(norm)=0.07689, mflops=65.028 (err=1.7e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.58 s, 16 iters, t-(init.)=1.53 s t(norm)=0.0911951, mflops=54.8275 (err=1.7e-14) 16. Frigo-old: elapsed time t=1.46 s, 8 iters, t-(init.)=1.43 s t(norm)=0.170469, mflops=29.3308 (err=1.7e-14) 17. Green: elapsed time t=1.54 s, 16 iters, t-(init.)=1.49 s t(norm)=0.0888109, mflops=56.2994 (err=1.7e-14) 18. GSL: elapsed time t=1.41 s, 4 iters, t-(init.)=1.4 s t(norm)=0.333786, mflops=14.9797 (err=1.7e-14) 19. GSL DIT: elapsed time t=1.71 s, 8 iters, t-(init.)=1.69 s t(norm)=0.201464, mflops=24.8184 (err=1.7e-14) 20. GSL DIF: elapsed time t=1.64 s, 8 iters, t-(init.)=1.62 s t(norm)=0.193119, mflops=25.8908 (err=1.8e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.88 s, 16 iters, t-(init.)=1.84 s t(norm)=0.109673, mflops=45.5903 (err=1.7e-14) 23. Mayer (simple): elapsed time t=1.77 s, 16 iters, t-(init.)=1.72 s t(norm)=0.10252, mflops=48.771 24. 
Mayer (lookup): elapsed time t=1.87 s, 16 iters, t-(init.)=1.83 s t(norm)=0.109076, mflops=45.8394 (err=1.7e-14) 25. Monro: elapsed time t=1.77 s, 8 iters, t-(init.)=1.74 s t(norm)=0.207424, mflops=24.1052 (err=1.6e-07) 26. NAPACK (f2c): elapsed time t=1.7 s, 4 iters, t-(init.)=1.69 s t(norm)=0.402927, mflops=12.4092 (err=8.6e-13) 27. Ooura (C): elapsed time t=1.43 s, 16 iters, t-(init.)=1.38 s t(norm)=0.0822544, mflops=60.787 (err=1.7e-14) 28. Ooura (F): elapsed time t=1.54 s, 16 iters, t-(init.)=1.49 s t(norm)=0.0888109, mflops=56.2994 (err=1.7e-14) 29. Ransom: elapsed time t=1.26 s, 16 iters, t-(init.)=1.22 s t(norm)=0.0727177, mflops=68.7591 (err=1.7e-14) 30. SCIPORT: elapsed time t=1 s, 2 iters, t-(init.)=0.99 s t(norm)=0.472069, mflops=10.5917 (err=1.7e-14) 31. Singleton: elapsed time t=1.03 s, 8 iters, t-(init.)=1 s t(norm)=0.119209, mflops=41.943 (err=2.3e-14) 32. Singleton (f2c): elapsed time t=1 s, 8 iters, t-(init.)=0.98 s t(norm)=0.116825, mflops=42.799 (err=2.3e-14) 33. Sorensen: elapsed time t=1.89 s, 8 iters, t-(init.)=1.86 s t(norm)=0.221729, mflops=22.55 (err=1.7e-14) 34. Sorensen DIT: elapsed time t=1.12 s, 4 iters, t-(init.)=1.11 s t(norm)=0.264645, mflops=18.8933 (err=1.7e-14) 35. Temperton: elapsed time t=1.77 s, 8 iters, t-(init.)=1.75 s t(norm)=0.208616, mflops=23.9675 (err=1.7e-07) 36. Temperton (f2c): elapsed time t=1.65 s, 8 iters, t-(init.)=1.63 s t(norm)=0.194311, mflops=25.7319 (err=1.7e-14) 37. Valkenburg: elapsed time t=1.97 s, 2 iters, t-(init.)=1.97 s t(norm)=0.939369, mflops=5.32272 (err=1.7e-14) 38. SGIMATH: elapsed time t=1.42 s, 32 iters, t-(init.)=1.33 s t(norm)=0.0396371, mflops=126.144 (err=1.7e-14) Top mflops for N=65536 = 132.104 Normalized results and averages for N=65536: fft 0: mflops = 27.5941 (norm. = 0.208882), norm. avg. (of 16) = 0.366529 fft 1: mflops = 29.7468 (norm. = 0.225177), norm. avg. (of 16) = 0.358582 fft 2: mflops = 23.3017 (norm. = 0.176389), norm. avg. (of 16) = 0.267894 fft 3: mflops = 44.8589 (norm. 
= 0.339572), norm. avg. (of 16) = 0.153957 fft 4: mflops = 5.85797 (norm. = 0.0443436), norm. avg. (of 16) = 0.240358 fft 5: mflops = 22.7951 (norm. = 0.172554), norm. avg. (of 16) = 0.122117 fft 6: mflops = 46.3459 (norm. = 0.350829), norm. avg. (of 16) = 0.265975 fft 7: mflops = 34.3795 (norm. = 0.260246), norm. avg. (of 16) = 0.194733 fft 8: mflops = 20.9715 (norm. = 0.15875), norm. avg. (of 16) = 0.206303 fft 9: mflops = 132.104 (norm. = 1), norm. avg. (of 16) = 0.594868 fft 10: mflops = 132.104 (norm. = 1), norm. avg. (of 16) = 0.61421 fft 11: mflops = 22.55 (norm. = 0.170699), norm. avg. (of 15) = 0.215986 fft 12: mflops = 14.3641 (norm. = 0.108733), norm. avg. (of 16) = 0.345582 fft 13: mflops = 14.2663 (norm. = 0.107993), norm. avg. (of 16) = 0.135622 fft 14: mflops = 65.028 (norm. = 0.492248), norm. avg. (of 16) = 0.669715 fft 15: mflops = 54.8275 (norm. = 0.415033), norm. avg. (of 16) = 0.627031 fft 16: mflops = 29.3308 (norm. = 0.222028), norm. avg. (of 16) = 0.696487 fft 17: mflops = 56.2994 (norm. = 0.426174), norm. avg. (of 14) = 0.592345 fft 18: mflops = 14.9797 (norm. = 0.113393), norm. avg. (of 16) = 0.274724 fft 19: mflops = 24.8184 (norm. = 0.18787), norm. avg. (of 16) = 0.198911 fft 20: mflops = 25.8908 (norm. = 0.195988), norm. avg. (of 16) = 0.200893 fft 21: mflops = -1 (norm. = -0.00756979), norm. avg. (of 12) = 0.539745 fft 22: mflops = 45.5903 (norm. = 0.345109), norm. avg. (of 15) = 0.302958 fft 23: mflops = 48.771 (norm. = 0.369186), norm. avg. (of 15) = 0.359857 fft 24: mflops = 45.8394 (norm. = 0.346995), norm. avg. (of 15) = 0.333953 fft 25: mflops = 24.1052 (norm. = 0.182471), norm. avg. (of 15) = 0.168966 fft 26: mflops = 12.4092 (norm. = 0.0939349), norm. avg. (of 16) = 0.139164 fft 27: mflops = 60.787 (norm. = 0.460145), norm. avg. (of 16) = 0.60028 fft 28: mflops = 56.2994 (norm. = 0.426174), norm. avg. (of 16) = 0.557476 fft 29: mflops = 68.7591 (norm. = 0.520492), norm. avg. (of 15) = 0.229854 fft 30: mflops = 10.5917 (norm. 
= 0.0801768), norm. avg. (of 15) = 0.316996 fft 31: mflops = 41.943 (norm. = 0.3175), norm. avg. (of 16) = 0.340641 fft 32: mflops = 42.799 (norm. = 0.32398), norm. avg. (of 16) = 0.349114 fft 33: mflops = 22.55 (norm. = 0.170699), norm. avg. (of 16) = 0.356432 fft 34: mflops = 18.8933 (norm. = 0.143018), norm. avg. (of 16) = 0.198399 fft 35: mflops = 23.9675 (norm. = 0.181429), norm. avg. (of 16) = 0.167592 fft 36: mflops = 25.7319 (norm. = 0.194785), norm. avg. (of 16) = 0.174404 fft 37: mflops = 5.32272 (norm. = 0.0402919), norm. avg. (of 16) = 0.0368057 fft 38: mflops = 126.144 (norm. = 0.954887), norm. avg. (of 16) = 0.777983 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.39 s, 1 iters, t-(init.)=1.38 s t(norm)=0.619327, mflops=8.07328 (err=3.3e-14) 1. Arndt DIT: elapsed time t=1.36 s, 1 iters, t-(init.)=1.35 s t(norm)=0.605864, mflops=8.25268 (err=3.3e-14) 2. Arndt Split-Radix: elapsed time t=1.69 s, 1 iters, t-(init.)=1.68 s t(norm)=0.753964, mflops=6.63162 (err=3.3e-14) 3. Arndt 4-step: elapsed time t=1.92 s, 4 iters, t-(init.)=1.87 s t(norm)=0.209808, mflops=23.8313 (err=3.3e-14) 4. Bailey: elapsed time t=3.38 s, 1 iters, t-(init.)=3.37 s t(norm)=1.51242, mflops=3.30597 (err=3.3e-14) 5. Beauregard: elapsed time t=1.33 s, 2 iters, t-(init.)=1.3 s t(norm)=0.291712, mflops=17.1402 (err=3.3e-14) 6. Bergland: elapsed time t=1.31 s, 2 iters, t-(init.)=1.28 s t(norm)=0.287224, mflops=17.408 (err=3.4e-14) 7. Brenner: elapsed time t=1.88 s, 2 iters, t-(init.)=1.86 s t(norm)=0.417373, mflops=11.9797 (err=3.3e-14) 8. Burrus: elapsed time t=1.78 s, 1 iters, t-(init.)=1.76 s t(norm)=0.789867, mflops=6.33018 (err=3.3e-14) 9. CWP (min N) (N=144144): elapsed time t=1.97 s, 16 iters, t-(init.)=1.76 s t(norm)=0.0493667, mflops=101.283 10. CWP (best N) (N=144144): elapsed time t=1.95 s, 16 iters, t-(init.)=1.74 s t(norm)=0.0488057, mflops=102.447 11. 
Edelblute: elapsed time t=1.68 s, 1 iters, t-(init.)=1.67 s t(norm)=0.749476, mflops=6.67133 (err=3.3e-14) 12. FFTPACK: elapsed time t=1.53 s, 1 iters, t-(init.)=1.52 s t(norm)=0.682158, mflops=7.32968 (err=3.3e-14) 13. FFTPACK (f2c): elapsed time t=1.48 s, 1 iters, t-(init.)=1.47 s t(norm)=0.659718, mflops=7.57899 (err=3.3e-14) FFTW_MEASURE plan: (cost = 3.500000e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.4 s, 4 iters, t-(init.)=1.35 s t(norm)=0.151466, mflops=33.0107 (err=3.3e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.67 s, 4 iters, t-(init.)=1.62 s t(norm)=0.181759, mflops=27.5089 (err=3.3e-14) 16. Frigo-old: elapsed time t=1.07 s, 2 iters, t-(init.)=1.05 s t(norm)=0.235614, mflops=21.2212 (err=3.3e-14) 17. Green: elapsed time t=1.17 s, 2 iters, t-(init.)=1.15 s t(norm)=0.258053, mflops=19.3759 (err=3.3e-14) 18. GSL: elapsed time t=1.29 s, 1 iters, t-(init.)=1.28 s t(norm)=0.574449, mflops=8.704 (err=3.3e-14) 19. GSL DIT: elapsed time t=1.47 s, 1 iters, t-(init.)=1.46 s t(norm)=0.65523, mflops=7.6309 (err=3.5e-14) 20. GSL DIF: elapsed time t=1.45 s, 1 iters, t-(init.)=1.44 s t(norm)=0.646255, mflops=7.73689 (err=3.5e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.35 s, 4 iters, t-(init.)=1.31 s t(norm)=0.146978, mflops=34.0187 (err=3.3e-14) 23. Mayer (simple): elapsed time t=1.28 s, 4 iters, t-(init.)=1.23 s t(norm)=0.138002, mflops=36.2313 24. Mayer (lookup): elapsed time t=1.74 s, 4 iters, t-(init.)=1.69 s t(norm)=0.189613, mflops=26.3695 (err=3.3e-14) 25. Monro: elapsed time t=1.54 s, 1 iters, t-(init.)=1.53 s t(norm)=0.686646, mflops=7.28178 (err=1.7e-07) 26. NAPACK (f2c): elapsed time t=1.13 s, 1 iters, t-(init.)=1.12 s t(norm)=0.502642, mflops=9.94743 (err=2.0e-12) 27. 
Ooura (C): elapsed time t=1.16 s, 4 iters, t-(init.)=1.11 s t(norm)=0.124539, mflops=40.1482 (err=3.4e-14) 28. Ooura (F): elapsed time t=1.19 s, 4 iters, t-(init.)=1.14 s t(norm)=0.127905, mflops=39.0916 (err=3.4e-14) 29. Ransom: elapsed time t=1.11 s, 4 iters, t-(init.)=1.05 s t(norm)=0.117807, mflops=42.4424 (err=3.3e-14) 30. SCIPORT: elapsed time t=1.23 s, 1 iters, t-(init.)=1.22 s t(norm)=0.547521, mflops=9.13207 (err=3.3e-14) 31. Singleton: elapsed time t=1.93 s, 2 iters, t-(init.)=1.91 s t(norm)=0.428592, mflops=11.6661 (err=4.8e-14) 32. Singleton (f2c): elapsed time t=1.89 s, 2 iters, t-(init.)=1.87 s t(norm)=0.419617, mflops=11.9156 (err=4.8e-14) 33. Sorensen: elapsed time t=1.84 s, 2 iters, t-(init.)=1.82 s t(norm)=0.408397, mflops=12.243 (err=3.3e-14) 34. Sorensen DIT: elapsed time t=1.88 s, 1 iters, t-(init.)=1.87 s t(norm)=0.839233, mflops=5.95782 (err=3.3e-14) 35. Temperton: elapsed time t=1.21 s, 1 iters, t-(init.)=1.2 s t(norm)=0.538545, mflops=9.28427 (err=1.9e-07) 36. Temperton (f2c): elapsed time t=1.15 s, 1 iters, t-(init.)=1.14 s t(norm)=0.511618, mflops=9.77291 (err=3.3e-14) 37. Valkenburg: elapsed time t=2.71 s, 1 iters, t-(init.)=2.7 s t(norm)=1.21173, mflops=4.12634 (err=3.4e-14) 38. SGIMATH: elapsed time t=1.13 s, 8 iters, t-(init.)=1.03 s t(norm)=0.0577814, mflops=86.533 (err=3.3e-14) Top mflops for N=131072 = 102.447 Normalized results and averages for N=131072: fft 0: mflops = 8.07328 (norm. = 0.0788043), norm. avg. (of 17) = 0.349604 fft 1: mflops = 8.25268 (norm. = 0.0805556), norm. avg. (of 17) = 0.342228 fft 2: mflops = 6.63162 (norm. = 0.0647321), norm. avg. (of 17) = 0.255943 fft 3: mflops = 23.8313 (norm. = 0.23262), norm. avg. (of 17) = 0.158584 fft 4: mflops = 3.30597 (norm. = 0.03227), norm. avg. (of 17) = 0.228118 fft 5: mflops = 17.1402 (norm. = 0.167308), norm. avg. (of 17) = 0.124775 fft 6: mflops = 17.408 (norm. = 0.169922), norm. avg. (of 17) = 0.260325 fft 7: mflops = 11.9797 (norm. = 0.116935), norm. avg. 
(of 17) = 0.190157 fft 8: mflops = 6.33018 (norm. = 0.0617898), norm. avg. (of 17) = 0.197802 fft 9: mflops = 101.283 (norm. = 0.988636), norm. avg. (of 17) = 0.61803 fft 10: mflops = 102.447 (norm. = 1), norm. avg. (of 17) = 0.636903 fft 11: mflops = 6.67133 (norm. = 0.0651198), norm. avg. (of 16) = 0.206557 fft 12: mflops = 7.32968 (norm. = 0.0715461), norm. avg. (of 17) = 0.329462 fft 13: mflops = 7.57899 (norm. = 0.0739796), norm. avg. (of 17) = 0.131996 fft 14: mflops = 33.0107 (norm. = 0.322222), norm. avg. (of 17) = 0.649274 fft 15: mflops = 27.5089 (norm. = 0.268519), norm. avg. (of 17) = 0.605942 fft 16: mflops = 21.2212 (norm. = 0.207143), norm. avg. (of 17) = 0.667702 fft 17: mflops = 19.3759 (norm. = 0.18913), norm. avg. (of 15) = 0.565464 fft 18: mflops = 8.704 (norm. = 0.0849609), norm. avg. (of 17) = 0.263562 fft 19: mflops = 7.6309 (norm. = 0.0744863), norm. avg. (of 17) = 0.191592 fft 20: mflops = 7.73689 (norm. = 0.0755208), norm. avg. (of 17) = 0.193518 fft 21: mflops = -1 (norm. = -0.00976114), norm. avg. (of 12) = 0.539745 fft 22: mflops = 34.0187 (norm. = 0.332061), norm. avg. (of 16) = 0.304777 fft 23: mflops = 36.2313 (norm. = 0.353659), norm. avg. (of 16) = 0.35947 fft 24: mflops = 26.3695 (norm. = 0.257396), norm. avg. (of 16) = 0.329169 fft 25: mflops = 7.28178 (norm. = 0.0710784), norm. avg. (of 16) = 0.162848 fft 26: mflops = 9.94743 (norm. = 0.0970982), norm. avg. (of 17) = 0.13669 fft 27: mflops = 40.1482 (norm. = 0.391892), norm. avg. (of 17) = 0.588022 fft 28: mflops = 39.0916 (norm. = 0.381579), norm. avg. (of 17) = 0.54713 fft 29: mflops = 42.4424 (norm. = 0.414286), norm. avg. (of 16) = 0.241381 fft 30: mflops = 9.13207 (norm. = 0.0891393), norm. avg. (of 16) = 0.302755 fft 31: mflops = 11.6661 (norm. = 0.113874), norm. avg. (of 17) = 0.327302 fft 32: mflops = 11.9156 (norm. = 0.11631), norm. avg. (of 17) = 0.33542 fft 33: mflops = 12.243 (norm. = 0.119505), norm. avg. (of 17) = 0.342495 fft 34: mflops = 5.95782 (norm. 
= 0.0581551), norm. avg. (of 17) = 0.190149
fft 35: mflops = 9.28427 (norm. = 0.090625), norm. avg. (of 17) = 0.163065
fft 36: mflops = 9.77291 (norm. = 0.0953947), norm. avg. (of 17) = 0.169756
fft 37: mflops = 4.12634 (norm. = 0.0402778), norm. avg. (of 17) = 0.03701
fft 38: mflops = 86.533 (norm. = 0.84466), norm. avg. (of 17) = 0.781905

Benchmarking for array size = 262144 (power of 2):
0. Arndt DIF: elapsed time t=3.35 s, 1 iters, t-(init.)=3.33 s t(norm)=0.705719, mflops=7.08497 (err=4.3e-14)
1. Arndt DIT: elapsed time t=3.41 s, 1 iters, t-(init.)=3.39 s t(norm)=0.718435, mflops=6.95958 (err=4.3e-14)
2. Arndt Split-Radix: elapsed time t=4.22 s, 1 iters, t-(init.)=4.2 s t(norm)=0.890096, mflops=5.61737 (err=4.3e-14)
3. Arndt 4-step: elapsed time t=1.93 s, 2 iters, t-(init.)=1.88 s t(norm)=0.199212, mflops=25.0989 (err=4.3e-14)
4. Bailey: elapsed time t=7.53 s, 1 iters, t-(init.)=7.51 s t(norm)=1.59158, mflops=3.14154 (err=4.3e-14)
5. Beauregard: elapsed time t=1.46 s, 1 iters, t-(init.)=1.43 s t(norm)=0.303057, mflops=16.4986 (err=4.4e-14)
6. Bergland: elapsed time t=1.63 s, 1 iters, t-(init.)=1.6 s t(norm)=0.339084, mflops=14.7456 (err=4.4e-14)
7. Brenner: elapsed time t=2.16 s, 1 iters, t-(init.)=2.14 s t(norm)=0.453525, mflops=11.0247 (err=4.4e-14)
8. Burrus: elapsed time t=4.46 s, 1 iters, t-(init.)=4.44 s t(norm)=0.940959, mflops=5.31373 (err=4.3e-14)
9. CWP (min N) (N=360360): elapsed time t=1.4 s, 4 iters, t-(init.)=1.26 s t(norm)=0.0667572, mflops=74.8983
10. CWP (best N) (N=360360): elapsed time t=1.4 s, 4 iters, t-(init.)=1.26 s t(norm)=0.0667572, mflops=74.8983
11. Edelblute: elapsed time t=4.18 s, 1 iters, t-(init.)=4.15 s t(norm)=0.8795, mflops=5.68505 (err=4.3e-14)
12. FFTPACK: elapsed time t=3.33 s, 1 iters, t-(init.)=3.31 s t(norm)=0.70148, mflops=7.12778 (err=4.4e-14)
13. FFTPACK (f2c): elapsed time t=2.8 s, 1 iters, t-(init.)=2.78 s t(norm)=0.589159, mflops=8.48668 (err=4.4e-14)
FFTW_MEASURE plan: (cost = 8.600000e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16
14. FFTW: elapsed time t=1.66 s, 2 iters, t-(init.)=1.61 s t(norm)=0.170602, mflops=29.308 (err=4.4e-14)
FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=1.93 s, 2 iters, t-(init.)=1.89 s t(norm)=0.200272, mflops=24.9661 (err=4.4e-14)
16. Frigo-old: elapsed time t=1.41 s, 1 iters, t-(init.)=1.39 s t(norm)=0.294579, mflops=16.9734 (err=4.4e-14)
17. Green: elapsed time t=1.34 s, 1 iters, t-(init.)=1.32 s t(norm)=0.279744, mflops=17.8735 (err=4.4e-14)
18. GSL: elapsed time t=2.83 s, 1 iters, t-(init.)=2.8 s t(norm)=0.593397, mflops=8.42606 (err=4.4e-14)
19. GSL DIT: elapsed time t=3.31 s, 1 iters, t-(init.)=3.29 s t(norm)=0.697242, mflops=7.17111 (err=4.6e-14)
20. GSL DIF: elapsed time t=3.33 s, 1 iters, t-(init.)=3.3 s t(norm)=0.699361, mflops=7.14938 (err=4.6e-14)
21. Skipping fft (Krukar can't handle N > 4096).
22. Mayer (Buneman): elapsed time t=2.52 s, 1 iters, t-(init.)=2.49 s t(norm)=0.5277, mflops=9.47508 (err=4.3e-14)
23. Mayer (simple): elapsed time t=2.5 s, 1 iters, t-(init.)=2.48 s t(norm)=0.525581, mflops=9.51329
24. Mayer (lookup): elapsed time t=2.63 s, 1 iters, t-(init.)=2.6 s t(norm)=0.551012, mflops=9.07422 (err=4.3e-14)
25. Monro: elapsed time t=3.52 s, 1 iters, t-(init.)=3.5 s t(norm)=0.741747, mflops=6.74085 (err=1.8e-07)
26. NAPACK (f2c): elapsed time t=2.66 s, 1 iters, t-(init.)=2.64 s t(norm)=0.559489, mflops=8.93673 (err=3.7e-12)
27. Ooura (C): elapsed time t=1.07 s, 1 iters, t-(init.)=1.04 s t(norm)=0.220405, mflops=22.6855 (err=4.4e-14)
28. Ooura (F): elapsed time t=1.12 s, 1 iters, t-(init.)=1.1 s t(norm)=0.23312, mflops=21.4481 (err=4.4e-14)
29. Ransom: elapsed time t=1.06 s, 2 iters, t-(init.)=1.01 s t(norm)=0.107023, mflops=46.7187 (err=4.3e-14)
30. SCIPORT: elapsed time t=2.87 s, 1 iters, t-(init.)=2.85 s t(norm)=0.603994, mflops=8.27823 (err=4.4e-14)
31. Singleton: elapsed time t=2.09 s, 1 iters, t-(init.)=2.06 s t(norm)=0.436571, mflops=11.4529 (err=6.0e-14)
32. Singleton (f2c): elapsed time t=2.05 s, 1 iters, t-(init.)=2.02 s t(norm)=0.428094, mflops=11.6797 (err=6.0e-14)
33. Sorensen: elapsed time t=2.28 s, 1 iters, t-(init.)=2.25 s t(norm)=0.476837, mflops=10.4858 (err=4.3e-14)
34. Sorensen DIT: elapsed time t=4.69 s, 1 iters, t-(init.)=4.66 s t(norm)=0.987583, mflops=5.06287 (err=4.3e-14)
35. Temperton: elapsed time t=2.55 s, 1 iters, t-(init.)=2.53 s t(norm)=0.536177, mflops=9.32528 (err=2.0e-07)
36. Temperton (f2c): elapsed time t=2.39 s, 1 iters, t-(init.)=2.37 s t(norm)=0.502268, mflops=9.95484 (err=4.4e-14)
37. Valkenburg: elapsed time t=6.32 s, 1 iters, t-(init.)=6.29 s t(norm)=1.33302, mflops=3.75087 (err=4.4e-14)
38. SGIMATH: elapsed time t=1.2 s, 4 iters, t-(init.)=1.11 s t(norm)=0.0588099, mflops=85.0197 (err=4.4e-14)

Top mflops for N=262144 = 85.0197
Normalized results and averages for N=262144:
fft 0: mflops = 7.08497 (norm. = 0.0833333), norm. avg. (of 18) = 0.334811
fft 1: mflops = 6.95958 (norm. = 0.0818584), norm. avg. (of 18) = 0.327763
fft 2: mflops = 5.61737 (norm. = 0.0660714), norm. avg. (of 18) = 0.245395
fft 3: mflops = 25.0989 (norm. = 0.295213), norm. avg. (of 18) = 0.166174
fft 4: mflops = 3.14154 (norm. = 0.0369507), norm. avg. (of 18) = 0.217497
fft 5: mflops = 16.4986 (norm. = 0.194056), norm. avg. (of 18) = 0.128624
fft 6: mflops = 14.7456 (norm. = 0.173438), norm. avg. (of 18) = 0.255498
fft 7: mflops = 11.0247 (norm. = 0.129673), norm. avg. (of 18) = 0.186796
fft 8: mflops = 5.31373 (norm. = 0.0625), norm. avg. (of 18) = 0.190285
fft 9: mflops = 74.8983 (norm. = 0.880952), norm. avg. (of 18) = 0.632637
fft 10: mflops = 74.8983 (norm. = 0.880952), norm. avg. (of 18) = 0.650462
fft 11: mflops = 5.68505 (norm. = 0.0668675), norm. avg. (of 17) = 0.19834
fft 12: mflops = 7.12778 (norm. = 0.0838369), norm. avg. (of 18) = 0.315816
fft 13: mflops = 8.48668 (norm. = 0.0998201), norm. avg. (of 18) = 0.130209
fft 14: mflops = 29.308 (norm. = 0.34472), norm. avg. (of 18) = 0.632355
fft 15: mflops = 24.9661 (norm. = 0.293651), norm. avg. (of 18) = 0.588593
fft 16: mflops = 16.9734 (norm. = 0.19964), norm. avg. (of 18) = 0.641699
fft 17: mflops = 17.8735 (norm. = 0.210227), norm. avg. (of 16) = 0.543262
fft 18: mflops = 8.42606 (norm. = 0.0991071), norm. avg. (of 18) = 0.254425
fft 19: mflops = 7.17111 (norm. = 0.0843465), norm. avg. (of 18) = 0.185634
fft 20: mflops = 7.14938 (norm. = 0.0840909), norm. avg. (of 18) = 0.187439
fft 21: mflops = -1 (norm. = -0.011762), norm. avg. (of 12) = 0.539745
fft 22: mflops = 9.47508 (norm. = 0.111446), norm. avg. (of 17) = 0.293404
fft 23: mflops = 9.51329 (norm. = 0.111895), norm. avg. (of 17) = 0.344906
fft 24: mflops = 9.07422 (norm. = 0.106731), norm. avg. (of 17) = 0.316084
fft 25: mflops = 6.74085 (norm. = 0.0792857), norm. avg. (of 17) = 0.157933
fft 26: mflops = 8.93673 (norm. = 0.105114), norm. avg. (of 18) = 0.134935
fft 27: mflops = 22.6855 (norm. = 0.266827), norm. avg. (of 18) = 0.570178
fft 28: mflops = 21.4481 (norm. = 0.252273), norm. avg. (of 18) = 0.530749
fft 29: mflops = 46.7187 (norm. = 0.549505), norm. avg. (of 17) = 0.259506
fft 30: mflops = 8.27823 (norm. = 0.0973684), norm. avg. (of 17) = 0.290673
fft 31: mflops = 11.4529 (norm. = 0.134709), norm. avg. (of 18) = 0.316602
fft 32: mflops = 11.6797 (norm. = 0.137376), norm. avg. (of 18) = 0.324418
fft 33: mflops = 10.4858 (norm. = 0.123333), norm. avg. (of 18) = 0.330319
fft 34: mflops = 5.06287 (norm. = 0.0595494), norm. avg. (of 18) = 0.182894
fft 35: mflops = 9.32528 (norm. = 0.109684), norm. avg. (of 18) = 0.160099
fft 36: mflops = 9.95484 (norm. = 0.117089), norm. avg. (of 18) = 0.16683
fft 37: mflops = 3.75087 (norm. = 0.0441176), norm. avg. (of 18) = 0.0374048
fft 38: mflops = 85.0197 (norm. = 1), norm. avg. (of 18) = 0.794022

Benchmarking for array size = 524288 (power of 2):
0. Arndt DIF: elapsed time t=7.38 s, 1 iters, t-(init.)=7.33 s t(norm)=0.735835, mflops=6.795 (err=1.1e-13)
1. Arndt DIT: elapsed time t=7.41 s, 1 iters, t-(init.)=7.36 s t(norm)=0.738847, mflops=6.7673 (err=1.1e-13)
2. Arndt Split-Radix: elapsed time t=9.57 s, 1 iters, t-(init.)=9.53 s t(norm)=0.956686, mflops=5.22638 (err=1.1e-13)
3. Arndt 4-step: elapsed time t=2.73 s, 1 iters, t-(init.)=2.68 s t(norm)=0.269037, mflops=18.5848 (err=1.1e-13)
4. Bailey: elapsed time t=16.22 s, 1 iters, t-(init.)=16.17 s t(norm)=1.62325, mflops=3.08023 (err=1.1e-13)
5. Beauregard: elapsed time t=3.14 s, 1 iters, t-(init.)=3.09 s t(norm)=0.310195, mflops=16.1189 (err=1.1e-13)
6. Bergland: elapsed time t=3.6 s, 1 iters, t-(init.)=3.55 s t(norm)=0.356373, mflops=14.0302 (err=1.1e-13)
7. Brenner: elapsed time t=4.9 s, 1 iters, t-(init.)=4.85 s t(norm)=0.486876, mflops=10.2696 (err=1.1e-13)
8. Burrus: elapsed time t=10.18 s, 1 iters, t-(init.)=10.13 s t(norm)=1.01692, mflops=4.91682 (err=1.1e-13)
9. CWP (min N) (N=720720): elapsed time t=1.46 s, 2 iters, t-(init.)=1.33 s t(norm)=0.0667572, mflops=74.8983
10. CWP (best N) (N=720720): elapsed time t=1.45 s, 2 iters, t-(init.)=1.32 s t(norm)=0.0662553, mflops=75.4657
11. Edelblute: elapsed time t=9.45 s, 1 iters, t-(init.)=9.41 s t(norm)=0.94464, mflops=5.29302 (err=1.1e-13)
12. FFTPACK: elapsed time t=7.65 s, 1 iters, t-(init.)=7.6 s t(norm)=0.762939, mflops=6.5536 (err=1.1e-13)
13. FFTPACK (f2c): elapsed time t=7.13 s, 1 iters, t-(init.)=7.08 s t(norm)=0.710738, mflops=7.03494 (err=1.1e-13)
FFTW_MEASURE plan: (cost = 1.820000e+00) FFTW_TWIDDLE 8 FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_NOTW 16
14. FFTW: elapsed time t=1.78 s, 1 iters, t-(init.)=1.73 s t(norm)=0.173669, mflops=28.7904 (err=1.1e-13)
FFTW_ESTIMATE plan: (cost = 5.976883e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=2.03 s, 1 iters, t-(init.)=1.98 s t(norm)=0.198766, mflops=25.1552 (err=1.1e-13)
16. Frigo-old: elapsed time t=3.27 s, 1 iters, t-(init.)=3.22 s t(norm)=0.323245, mflops=15.4681 (err=1.1e-13)
17. Green: elapsed time t=2.98 s, 1 iters, t-(init.)=2.93 s t(norm)=0.294133, mflops=16.9991 (err=1.1e-13)
18. GSL: elapsed time t=6.24 s, 1 iters, t-(init.)=6.19 s t(norm)=0.621394, mflops=8.04642 (err=1.1e-13)
19. GSL DIT: elapsed time t=7.28 s, 1 iters, t-(init.)=7.23 s t(norm)=0.725796, mflops=6.88898 (err=1.1e-13)
20. GSL DIF: elapsed time t=7.38 s, 1 iters, t-(init.)=7.33 s t(norm)=0.735835, mflops=6.795 (err=1.1e-13)
21. Skipping fft (Krukar can't handle N > 4096).
22. Mayer (Buneman): elapsed time t=5.79 s, 1 iters, t-(init.)=5.74 s t(norm)=0.57622, mflops=8.67724 (err=1.1e-13)
23. Mayer (simple): elapsed time t=5.73 s, 1 iters, t-(init.)=5.68 s t(norm)=0.570197, mflops=8.7689
24. Mayer (lookup): elapsed time t=6.01 s, 1 iters, t-(init.)=5.97 s t(norm)=0.599309, mflops=8.34294 (err=1.1e-13)
25. Monro: elapsed time t=7.86 s, 1 iters, t-(init.)=7.81 s t(norm)=0.784021, mflops=6.37738 (err=1.9e-07)
26. NAPACK (f2c): elapsed time t=6.36 s, 1 iters, t-(init.)=6.32 s t(norm)=0.634444, mflops=7.88091 (err=7.9e-12)
27. Ooura (C): elapsed time t=2.15 s, 1 iters, t-(init.)=2.1 s t(norm)=0.210812, mflops=23.7178 (err=1.1e-13)
28. Ooura (F): elapsed time t=2.23 s, 1 iters, t-(init.)=2.18 s t(norm)=0.218843, mflops=22.8474 (err=1.1e-13)
29. Ransom: elapsed time t=1.3 s, 1 iters, t-(init.)=1.25 s t(norm)=0.125483, mflops=39.8459 (err=1.1e-13)
30. SCIPORT: elapsed time t=6.59 s, 1 iters, t-(init.)=6.54 s t(norm)=0.656529, mflops=7.6158 (err=1.1e-13)
31. Singleton: elapsed time t=5.62 s, 1 iters, t-(init.)=5.57 s t(norm)=0.559154, mflops=8.94208 (err=1.6e-13)
32. Singleton (f2c): elapsed time t=5.43 s, 1 iters, t-(init.)=5.38 s t(norm)=0.540081, mflops=9.25787 (err=1.6e-13)
33. Sorensen: elapsed time t=6.19 s, 1 iters, t-(init.)=6.14 s t(norm)=0.616375, mflops=8.11195 (err=1.1e-13)
34. Sorensen DIT: elapsed time t=10.75 s, 1 iters, t-(init.)=10.7 s t(norm)=1.07414, mflops=4.65489 (err=1.1e-13)
35. Temperton: elapsed time t=6.17 s, 1 iters, t-(init.)=6.12 s t(norm)=0.614367, mflops=8.13846 (err=2.1e-07)
36. Temperton (f2c): elapsed time t=5.69 s, 1 iters, t-(init.)=5.64 s t(norm)=0.566181, mflops=8.83109 (err=1.1e-13)
37. Valkenburg: elapsed time t=13.73 s, 1 iters, t-(init.)=13.69 s t(norm)=1.37429, mflops=3.63823 (err=1.1e-13)
38. SGIMATH: elapsed time t=1.32 s, 2 iters, t-(init.)=1.23 s t(norm)=0.0617379, mflops=80.9876 (err=1.1e-13)

Top mflops for N=524288 = 80.9876
Normalized results and averages for N=524288:
fft 0: mflops = 6.795 (norm. = 0.0839018), norm. avg. (of 19) = 0.321605
fft 1: mflops = 6.7673 (norm. = 0.0835598), norm. avg. (of 19) = 0.31491
fft 2: mflops = 5.22638 (norm. = 0.0645331), norm. avg. (of 19) = 0.235876
fft 3: mflops = 18.5848 (norm. = 0.229478), norm. avg. (of 19) = 0.169506
fft 4: mflops = 3.08023 (norm. = 0.0380334), norm. avg. (of 19) = 0.208052
fft 5: mflops = 16.1189 (norm. = 0.199029), norm. avg. (of 19) = 0.13233
fft 6: mflops = 14.0302 (norm. = 0.173239), norm. avg. (of 19) = 0.251169
fft 7: mflops = 10.2696 (norm. = 0.126804), norm. avg. (of 19) = 0.183639
fft 8: mflops = 4.91682 (norm. = 0.0607108), norm. avg. (of 19) = 0.183465
fft 9: mflops = 74.8983 (norm. = 0.924812), norm. avg. (of 19) = 0.648015
fft 10: mflops = 75.4657 (norm. = 0.931818), norm. avg. (of 19) = 0.66527
fft 11: mflops = 5.29302 (norm. = 0.065356), norm. avg. (of 18) = 0.190952
fft 12: mflops = 6.5536 (norm. = 0.0809211), norm. avg. (of 19) = 0.303453
fft 13: mflops = 7.03494 (norm. = 0.0868644), norm. avg. (of 19) = 0.127927
fft 14: mflops = 28.7904 (norm. = 0.355491), norm. avg. (of 19) = 0.617783
fft 15: mflops = 25.1552 (norm. = 0.310606), norm. avg. (of 19) = 0.573962
fft 16: mflops = 15.4681 (norm. = 0.190994), norm. avg. (of 19) = 0.617977
fft 17: mflops = 16.9991 (norm. = 0.209898), norm. avg. (of 17) = 0.523652
fft 18: mflops = 8.04642 (norm. = 0.0993538), norm. avg. (of 19) = 0.246264
fft 19: mflops = 6.88898 (norm. = 0.0850622), norm. avg. (of 19) = 0.180341
fft 20: mflops = 6.795 (norm. = 0.0839018), norm. avg. (of 19) = 0.181989
fft 21: mflops = -1 (norm. = -0.0123476), norm. avg. (of 12) = 0.539745
fft 22: mflops = 8.67724 (norm. = 0.107143), norm. avg. (of 18) = 0.283056
fft 23: mflops = 8.7689 (norm. = 0.108275), norm. avg. (of 18) = 0.33176
fft 24: mflops = 8.34294 (norm. = 0.103015), norm. avg. (of 18) = 0.304247
fft 25: mflops = 6.37738 (norm. = 0.0787452), norm. avg. (of 18) = 0.153533
fft 26: mflops = 7.88091 (norm. = 0.0973101), norm. avg. (of 19) = 0.132955
fft 27: mflops = 23.7178 (norm. = 0.292857), norm. avg. (of 19) = 0.555582
fft 28: mflops = 22.8474 (norm. = 0.28211), norm. avg. (of 19) = 0.517662
fft 29: mflops = 39.8459 (norm. = 0.492), norm. avg. (of 18) = 0.272422
fft 30: mflops = 7.6158 (norm. = 0.0940367), norm. avg. (of 18) = 0.279749
fft 31: mflops = 8.94208 (norm. = 0.110413), norm. avg. (of 19) = 0.30575
fft 32: mflops = 9.25787 (norm. = 0.114312), norm. avg. (of 19) = 0.313359
fft 33: mflops = 8.11195 (norm. = 0.100163), norm. avg. (of 19) = 0.318206
fft 34: mflops = 4.65489 (norm. = 0.0574766), norm. avg. (of 19) = 0.176293
fft 35: mflops = 8.13846 (norm. = 0.10049), norm. avg. (of 19) = 0.156962
fft 36: mflops = 8.83109 (norm. = 0.109043), norm. avg. (of 19) = 0.163789
fft 37: mflops = 3.63823 (norm. = 0.0449233), norm. avg. (of 19) = 0.0378005
fft 38: mflops = 80.9876 (norm. = 1), norm. avg. (of 19) = 0.804863

Benchmarking for array size = 1048576 (power of 2):
0. Arndt DIF: elapsed time t=16.68 s, 1 iters, t-(init.)=16.58 s t(norm)=0.790596, mflops=6.32434 (err=1.9e-13)
1. Arndt DIT: elapsed time t=16.69 s, 1 iters, t-(init.)=16.59 s t(norm)=0.791073, mflops=6.32053 (err=1.9e-13)
2. Arndt Split-Radix: elapsed time t=20.96 s, 1 iters, t-(init.)=20.86 s t(norm)=0.994682, mflops=5.02673 (err=1.9e-13)
3. Arndt 4-step: elapsed time t=4.6 s, 1 iters, t-(init.)=4.51 s t(norm)=0.215054, mflops=23.25 (err=1.9e-13)
4. Bailey: elapsed time t=35.15 s, 1 iters, t-(init.)=35.05 s t(norm)=1.67131, mflops=2.99166 (err=1.9e-13)
5. Beauregard: elapsed time t=6.64 s, 1 iters, t-(init.)=6.54 s t(norm)=0.311852, mflops=16.0333 (err=1.9e-13)
6. Bergland: elapsed time t=8.12 s, 1 iters, t-(init.)=8.02 s t(norm)=0.382423, mflops=13.0745 (err=1.9e-13)
7. Brenner: elapsed time t=10.35 s, 1 iters, t-(init.)=10.26 s t(norm)=0.489235, mflops=10.22 (err=1.9e-13)
8. Burrus: elapsed time t=22.49 s, 1 iters, t-(init.)=22.4 s t(norm)=1.06812, mflops=4.68114 (err=1.9e-13)
9. Skipping fft (this transform size is too big for CWP).
10. Skipping fft (this transform size is too big for CWP).
11. Edelblute: elapsed time t=20.58 s, 1 iters, t-(init.)=20.48 s t(norm)=0.976563, mflops=5.12 (err=1.9e-13)
12. FFTPACK: elapsed time t=16.44 s, 1 iters, t-(init.)=16.34 s t(norm)=0.779152, mflops=6.41723 (err=1.9e-13)
13. FFTPACK (f2c): elapsed time t=13.63 s, 1 iters, t-(init.)=13.54 s t(norm)=0.645638, mflops=7.74428 (err=1.9e-13)
FFTW_MEASURE plan: (cost = 4.190000e+00) FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_NOTW 16
14. FFTW: elapsed time t=4.18 s, 1 iters, t-(init.)=4.08 s t(norm)=0.19455, mflops=25.7004 (err=1.9e-13)
FFTW_ESTIMATE plan: (cost = 1.195377e+07) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=4.47 s, 1 iters, t-(init.)=4.38 s t(norm)=0.208855, mflops=23.9401 (err=1.9e-13)
16. Frigo-old: elapsed time t=7.87 s, 1 iters, t-(init.)=7.78 s t(norm)=0.370979, mflops=13.4778 (err=1.9e-13)
17. Skipping fft (Green can't handle this size.).
18. GSL: elapsed time t=13.53 s, 1 iters, t-(init.)=13.44 s t(norm)=0.640869, mflops=7.8019 (err=1.9e-13)
19. GSL DIT: elapsed time t=15.78 s, 1 iters, t-(init.)=15.69 s t(norm)=0.748158, mflops=6.68308 (err=1.9e-13)
20. GSL DIF: elapsed time t=16.07 s, 1 iters, t-(init.)=15.98 s t(norm)=0.761986, mflops=6.5618 (err=1.9e-13)
21. Skipping fft (Krukar can't handle N > 4096).
22. Skipping fft (Mayer (Buneman) can't handle N > 2^19).
23. Mayer (simple): elapsed time t=13.3 s, 1 iters, t-(init.)=13.2 s t(norm)=0.629425, mflops=7.94376
24. Mayer (lookup): elapsed time t=14.1 s, 1 iters, t-(init.)=14 s t(norm)=0.667572, mflops=7.48983 (err=1.9e-13)
25. Monro: elapsed time t=17.16 s, 1 iters, t-(init.)=17.06 s t(norm)=0.813484, mflops=6.1464 (err=2.0e-07)
26. NAPACK (f2c): elapsed time t=14.29 s, 1 iters, t-(init.)=14.19 s t(norm)=0.676632, mflops=7.38954 (err=1.5e-11)
27. Ooura (C): elapsed time t=5.9 s, 1 iters, t-(init.)=5.81 s t(norm)=0.277042, mflops=18.0478 (err=1.9e-13)
28. Ooura (F): elapsed time t=6.44 s, 1 iters, t-(init.)=6.35 s t(norm)=0.302792, mflops=16.513 (err=1.9e-13)
29. Ransom: elapsed time t=2.25 s, 1 iters, t-(init.)=2.16 s t(norm)=0.102997, mflops=48.5452 (err=1.9e-13)
30. SCIPORT: elapsed time t=14.86 s, 1 iters, t-(init.)=14.77 s t(norm)=0.704288, mflops=7.09936 (err=1.9e-13)
31. Singleton: elapsed time t=10.7 s, 1 iters, t-(init.)=10.6 s t(norm)=0.505447, mflops=9.89223 (err=2.6e-13)
32. Singleton (f2c): elapsed time t=10.4 s, 1 iters, t-(init.)=10.3 s t(norm)=0.491142, mflops=10.1803 (err=2.6e-13)
33. Sorensen: elapsed time t=14.16 s, 1 iters, t-(init.)=14.06 s t(norm)=0.670433, mflops=7.45787 (err=1.9e-13)
34. Sorensen DIT: elapsed time t=23.81 s, 1 iters, t-(init.)=23.71 s t(norm)=1.13058, mflops=4.42251 (err=1.9e-13)
35. Temperton: elapsed time t=12.86 s, 1 iters, t-(init.)=12.76 s t(norm)=0.608444, mflops=8.21768 (err=2.3e-07)
36. Temperton (f2c): elapsed time t=11.77 s, 1 iters, t-(init.)=11.68 s t(norm)=0.556946, mflops=8.97753 (err=1.9e-13)
37. Valkenburg: elapsed time t=30.55 s, 1 iters, t-(init.)=30.46 s t(norm)=1.45245, mflops=3.44247 (err=1.9e-13)
38. SGIMATH: elapsed time t=1.29 s, 1 iters, t-(init.)=1.19 s t(norm)=0.0567436, mflops=88.1156 (err=1.9e-13)

Top mflops for N=1048576 = 88.1156
Normalized results and averages for N=1048576:
fft 0: mflops = 6.32434 (norm. = 0.0717732), norm. avg. (of 20) = 0.309114
fft 1: mflops = 6.32053 (norm. = 0.07173), norm. avg. (of 20) = 0.302751
fft 2: mflops = 5.02673 (norm. = 0.057047), norm. avg. (of 20) = 0.226934
fft 3: mflops = 23.25 (norm. = 0.263858), norm. avg. (of 20) = 0.174224
fft 4: mflops = 2.99166 (norm. = 0.0339515), norm. avg. (of 20) = 0.199347
fft 5: mflops = 16.0333 (norm. = 0.181957), norm. avg. (of 20) = 0.134811
fft 6: mflops = 13.0745 (norm. = 0.148379), norm. avg. (of 20) = 0.246029
fft 7: mflops = 10.22 (norm. = 0.115984), norm. avg. (of 20) = 0.180256
fft 8: mflops = 4.68114 (norm. = 0.053125), norm. avg. (of 20) = 0.176948
fft 9: mflops = -1 (norm. = -0.0113487), norm. avg. (of 19) = 0.648015
fft 10: mflops = -1 (norm. = -0.0113487), norm. avg. (of 19) = 0.66527
fft 11: mflops = 5.12 (norm. = 0.0581055), norm. avg. (of 19) = 0.18396
fft 12: mflops = 6.41723 (norm. = 0.0728274), norm. avg. (of 20) = 0.291922
fft 13: mflops = 7.74428 (norm. = 0.0878877), norm. avg. (of 20) = 0.125925
fft 14: mflops = 25.7004 (norm. = 0.291667), norm. avg. (of 20) = 0.601477
fft 15: mflops = 23.9401 (norm. = 0.271689), norm. avg. (of 20) = 0.558848
fft 16: mflops = 13.4778 (norm. = 0.152956), norm. avg. (of 20) = 0.594726
fft 17: mflops = -1 (norm. = -0.0113487), norm. avg. (of 17) = 0.523652
fft 18: mflops = 7.8019 (norm. = 0.0885417), norm. avg. (of 20) = 0.238377
fft 19: mflops = 6.68308 (norm. = 0.0758445), norm. avg. (of 20) = 0.175116
fft 20: mflops = 6.5618 (norm. = 0.0744681), norm. avg. (of 20) = 0.176613
fft 21: mflops = -1 (norm. = -0.0113487), norm. avg. (of 12) = 0.539745
fft 22: mflops = -1 (norm. = -0.0113487), norm. avg. (of 18) = 0.283056
fft 23: mflops = 7.94376 (norm. = 0.0901515), norm. avg. (of 19) = 0.319044
fft 24: mflops = 7.48983 (norm. = 0.085), norm. avg. (of 19) = 0.292707
fft 25: mflops = 6.1464 (norm. = 0.0697538), norm. avg. (of 19) = 0.149124
fft 26: mflops = 7.38954 (norm. = 0.0838619), norm. avg. (of 20) = 0.1305
fft 27: mflops = 18.0478 (norm. = 0.204819), norm. avg. (of 20) = 0.538044
fft 28: mflops = 16.513 (norm. = 0.187402), norm. avg. (of 20) = 0.501149
fft 29: mflops = 48.5452 (norm. = 0.550926), norm. avg. (of 19) = 0.28708
fft 30: mflops = 7.09936 (norm. = 0.0805687), norm. avg. (of 19) = 0.269266
fft 31: mflops = 9.89223 (norm. = 0.112264), norm. avg. (of 20) = 0.296076
fft 32: mflops = 10.1803 (norm. = 0.115534), norm. avg. (of 20) = 0.303468
fft 33: mflops = 7.45787 (norm. = 0.0846373), norm. avg. (of 20) = 0.306527
fft 34: mflops = 4.42251 (norm. = 0.0501898), norm. avg. (of 20) = 0.169988
fft 35: mflops = 8.21768 (norm. = 0.0932602), norm. avg. (of 20) = 0.153777
fft 36: mflops = 8.97753 (norm. = 0.101884), norm. avg. (of 20) = 0.160694
fft 37: mflops = 3.44247 (norm. = 0.0390676), norm. avg. (of 20) = 0.0378639
fft 38: mflops = 88.1156 (norm. = 1), norm. avg. (of 20) = 0.814619
------------------------------------------------------
@@@@ bench.1d.np2.log
Benchmarking for sizes:
6 (0.000686646 MB)
9 (0.000915527 MB)
12 (0.00114441 MB)
15 (0.00137329 MB)
18 (0.00180054 MB)
24 (0.0022583 MB)
36 (0.0032959 MB)
80 (0.00738525 MB)
108 (0.00994873 MB)
210 (0.0192261 MB)
504 (0.0461426 MB)
1000 (0.0916748 MB)
1960 (0.179749 MB)
4725 (0.437393 MB)
10368 (0.960205 MB)
27000 (2.48291 MB)
75600 (6.98975 MB)
165375 (15.3664 MB)
362880 (38.6829 MB)
Maximum array size = 720720
Benchmarking FFTs:
0. Brenner
1. CWP (min N)
2. CWP (best N)
3. FFTPACK
4. FFTPACK (f2c)
5. FFTW
6. FFTW_ESTIMATE
7. Frigo-old
8. GSL
9. NAPACK (f2c)
10. Singleton
11. Singleton (f2c)
12. Temperton
13. Temperton (f2c)
14. Valkenburg
15. SGIMATH
Computing normalized averages (16 transforms).

Benchmarking for array size = 6:
0. Skipping fft (Brenner has a bug for N=3*2^m).
1. CWP (min N): elapsed time t=1.35 s, 262144 iters, t-(init.)=1.31 s t(norm)=0.3222, mflops=15.5183
2. CWP (best N) (N=15): elapsed time t=1.72 s, 262144 iters, t-(init.)=1.66 s t(norm)=0.408284, mflops=12.2464
3. FFTPACK: elapsed time t=1.3 s, 524288 iters, t-(init.)=1.23 s t(norm)=0.151262, mflops=33.0552 (err=1.4e-16)
4. FFTPACK (f2c): elapsed time t=1.13 s, 262144 iters, t-(init.)=1.09 s t(norm)=0.26809, mflops=18.6504 (err=1.8e-16)
FFTW_MEASURE plan: (cost = 7.820129e-07) FFTW_NOTW 6
5. FFTW: elapsed time t=1.63 s, 2097152 iters, t-(init.)=1.42 s t(norm)=0.0436569, mflops=114.529 (err=1.1e-16)
FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6
6. FFTW_ESTIMATE: elapsed time t=1.6 s, 2097152 iters, t-(init.)=1.36 s t(norm)=0.0418122, mflops=119.582 (err=1.1e-16)
7. Frigo-old: elapsed time t=1.2 s, 262144 iters, t-(init.)=1.17 s t(norm)=0.287767, mflops=17.3752 (err=3.3e-16)
8. GSL: elapsed time t=1.37 s, 524288 iters, t-(init.)=1.32 s t(norm)=0.16233, mflops=30.8015 (err=1.2e-16)
9. NAPACK (f2c): elapsed time t=1.33 s, 262144 iters, t-(init.)=1.31 s t(norm)=0.3222, mflops=15.5183 (err=4.7e-16)
10. Singleton: elapsed time t=1.47 s, 262144 iters, t-(init.)=1.44 s t(norm)=0.354174, mflops=14.1173 (err=1.0e-16)
11. Singleton (f2c): elapsed time t=1.44 s, 262144 iters, t-(init.)=1.41 s t(norm)=0.346796, mflops=14.4177 (err=1.0e-16)
12. Temperton: elapsed time t=1.65 s, 262144 iters, t-(init.)=1.62 s t(norm)=0.398446, mflops=12.5487 (err=3.7e-16)
13. Temperton (f2c): elapsed time t=1.29 s, 131072 iters, t-(init.)=1.27 s t(norm)=0.624724, mflops=8.00353 (err=1.0e-16)
14. Valkenburg: elapsed time t=1.51 s, 131072 iters, t-(init.)=1.5 s t(norm)=0.737863, mflops=6.77632 (err=3.0e-16)
15. SGIMATH: elapsed time t=1.29 s, 524288 iters, t-(init.)=1.23 s t(norm)=0.151262, mflops=33.0552 (err=1.8e-16)

Top mflops for N=6 = 119.582
Normalized results and averages for N=6:
fft 0: mflops = -1 (norm. = -0.00836245), norm. avg. (of 0) = -1
fft 1: mflops = 15.5183 (norm. = 0.129771), norm. avg. (of 1) = 0.129771
fft 2: mflops = 12.2464 (norm. = 0.10241), norm. avg. (of 1) = 0.10241
fft 3: mflops = 33.0552 (norm. = 0.276423), norm. avg. (of 1) = 0.276423
fft 4: mflops = 18.6504 (norm. = 0.155963), norm. avg. (of 1) = 0.155963
fft 5: mflops = 114.529 (norm. = 0.957746), norm. avg. (of 1) = 0.957746
fft 6: mflops = 119.582 (norm. = 1), norm. avg. (of 1) = 1
fft 7: mflops = 17.3752 (norm. = 0.145299), norm. avg. (of 1) = 0.145299
fft 8: mflops = 30.8015 (norm. = 0.257576), norm. avg. (of 1) = 0.257576
fft 9: mflops = 15.5183 (norm. = 0.129771), norm. avg. (of 1) = 0.129771
fft 10: mflops = 14.1173 (norm. = 0.118056), norm. avg. (of 1) = 0.118056
fft 11: mflops = 14.4177 (norm. = 0.120567), norm. avg. (of 1) = 0.120567
fft 12: mflops = 12.5487 (norm. = 0.104938), norm. avg. (of 1) = 0.104938
fft 13: mflops = 8.00353 (norm. = 0.0669291), norm. avg. (of 1) = 0.0669291
fft 14: mflops = 6.77632 (norm. = 0.0566667), norm. avg. (of 1) = 0.0566667
fft 15: mflops = 33.0552 (norm. = 0.276423), norm. avg. (of 1) = 0.276423

Benchmarking for array size = 9:
0. Brenner: elapsed time t=1.34 s, 65536 iters, t-(init.)=1.33 s t(norm)=0.711345, mflops=7.02894 (err=3.6e-16)
1. CWP (min N): elapsed time t=1.18 s, 262144 iters, t-(init.)=1.15 s t(norm)=0.153768, mflops=32.5165
2. CWP (best N) (N=15): elapsed time t=1.73 s, 262144 iters, t-(init.)=1.67 s t(norm)=0.223298, mflops=22.3916
3. FFTPACK: elapsed time t=1.41 s, 524288 iters, t-(init.)=1.35 s t(norm)=0.0902552, mflops=55.3985 (err=1.5e-16)
4. FFTPACK (f2c): elapsed time t=1.54 s, 262144 iters, t-(init.)=1.5 s t(norm)=0.200567, mflops=24.9293 (err=2.5e-16)
FFTW_MEASURE plan: (cost = 1.068115e-06) FFTW_NOTW 9
5. FFTW: elapsed time t=1.18 s, 1048576 iters, t-(init.)=1.06 s t(norm)=0.0354335, mflops=141.109 (err=1.4e-16)
FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9
6. FFTW_ESTIMATE: elapsed time t=1.17 s, 1048576 iters, t-(init.)=1.06 s t(norm)=0.0354335, mflops=141.109 (err=1.4e-16)
7. Frigo-old: elapsed time t=1.22 s, 131072 iters, t-(init.)=1.2 s t(norm)=0.320907, mflops=15.5808 (err=3.5e-16)
8. GSL: elapsed time t=1.06 s, 262144 iters, t-(init.)=1.04 s t(norm)=0.13906, mflops=35.9557 (err=1.5e-16)
9. NAPACK (f2c): elapsed time t=1.65 s, 262144 iters, t-(init.)=1.62 s t(norm)=0.216613, mflops=23.0827 (err=6.2e-16)
10. Singleton: elapsed time t=1.53 s, 262144 iters, t-(init.)=1.5 s t(norm)=0.200567, mflops=24.9293 (err=1.5e-16)
11. Singleton (f2c): elapsed time t=1.5 s, 262144 iters, t-(init.)=1.47 s t(norm)=0.196556, mflops=25.4381 (err=1.5e-16)
12. Temperton: elapsed time t=1.77 s, 262144 iters, t-(init.)=1.74 s t(norm)=0.232658, mflops=21.4908 (err=1.1e-08)
13. Temperton (f2c): elapsed time t=1.72 s, 131072 iters, t-(init.)=1.71 s t(norm)=0.457293, mflops=10.9339 (err=1.5e-16)
14. Valkenburg: elapsed time t=1.38 s, 65536 iters, t-(init.)=1.37 s t(norm)=0.732739, mflops=6.82371 (err=3.1e-16)
15. SGIMATH: elapsed time t=1.4 s, 524288 iters, t-(init.)=1.34 s t(norm)=0.0895867, mflops=55.8119 (err=2.5e-16)

Top mflops for N=9 = 141.109
Normalized results and averages for N=9:
fft 0: mflops = 7.02894 (norm. = 0.049812), norm. avg. (of 1) = 0.049812
fft 1: mflops = 32.5165 (norm. = 0.230435), norm. avg. (of 2) = 0.180103
fft 2: mflops = 22.3916 (norm. = 0.158683), norm. avg. (of 2) = 0.130546
fft 3: mflops = 55.3985 (norm. = 0.392593), norm. avg. (of 2) = 0.334508
fft 4: mflops = 24.9293 (norm. = 0.176667), norm. avg. (of 2) = 0.166315
fft 5: mflops = 141.109 (norm. = 1), norm. avg. (of 2) = 0.978873
fft 6: mflops = 141.109 (norm. = 1), norm. avg. (of 2) = 1
fft 7: mflops = 15.5808 (norm. = 0.110417), norm. avg. (of 2) = 0.127858
fft 8: mflops = 35.9557 (norm. = 0.254808), norm. avg. (of 2) = 0.256192
fft 9: mflops = 23.0827 (norm. = 0.16358), norm. avg. (of 2) = 0.146676
fft 10: mflops = 24.9293 (norm. = 0.176667), norm. avg. (of 2) = 0.147361
fft 11: mflops = 25.4381 (norm. = 0.180272), norm. avg. (of 2) = 0.15042
fft 12: mflops = 21.4908 (norm. = 0.152299), norm. avg. (of 2) = 0.128619
fft 13: mflops = 10.9339 (norm. = 0.0774854), norm. avg. (of 2) = 0.0722073
fft 14: mflops = 6.82371 (norm. = 0.0483577), norm. avg. (of 2) = 0.0525122
fft 15: mflops = 55.8119 (norm. = 0.395522), norm. avg. (of 2) = 0.335973

Benchmarking for array size = 12:
0. Skipping fft (Brenner has a bug for N=3*2^m).
1. CWP (min N): elapsed time t=1.44 s, 262144 iters, t-(init.)=1.4 s t(norm)=0.124143, mflops=40.2761
2. CWP (best N) (N=15): elapsed time t=1.72 s, 262144 iters, t-(init.)=1.66 s t(norm)=0.147198, mflops=33.9678
3. FFTPACK: elapsed time t=1.83 s, 524288 iters, t-(init.)=1.74 s t(norm)=0.077146, mflops=64.8122 (err=1.7e-16)
4. FFTPACK (f2c): elapsed time t=1.85 s, 262144 iters, t-(init.)=1.8 s t(norm)=0.159612, mflops=31.3259 (err=2.2e-16)
FFTW_MEASURE plan: (cost = 1.258850e-06) FFTW_NOTW 12
5. FFTW: elapsed time t=1.31 s, 1048576 iters, t-(init.)=1.14 s t(norm)=0.025272, mflops=197.848 (err=1.2e-16)
FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12
6. FFTW_ESTIMATE: elapsed time t=1.33 s, 1048576 iters, t-(init.)=1.16 s t(norm)=0.0257153, mflops=194.436 (err=1.2e-16)
7. Frigo-old: elapsed time t=1.09 s, 131072 iters, t-(init.)=1.07 s t(norm)=0.189761, mflops=26.3489 (err=2.8e-16)
8. GSL: elapsed time t=1.05 s, 262144 iters, t-(init.)=1.01 s t(norm)=0.0895603, mflops=55.8283 (err=1.6e-16)
9. NAPACK (f2c): elapsed time t=1.2 s, 131072 iters, t-(init.)=1.18 s t(norm)=0.20927, mflops=23.8926 (err=5.5e-16)
10. Singleton: elapsed time t=1.97 s, 262144 iters, t-(init.)=1.93 s t(norm)=0.17114, mflops=29.2158 (err=1.5e-16)
11. Singleton (f2c): elapsed time t=1.91 s, 262144 iters, t-(init.)=1.87 s t(norm)=0.16582, mflops=30.1533 (err=1.5e-16)
12. Temperton: elapsed time t=1.02 s, 131072 iters, t-(init.)=1 s t(norm)=0.177347, mflops=28.1933 (err=5.4e-16)
13. Temperton (f2c): elapsed time t=1.84 s, 131072 iters, t-(init.)=1.82 s t(norm)=0.322772, mflops=15.4908 (err=1.4e-16)
14. Valkenburg: elapsed time t=1.04 s, 32768 iters, t-(init.)=1.03 s t(norm)=0.73067, mflops=6.84303 (err=3.8e-16)
15. SGIMATH: elapsed time t=1.45 s, 524288 iters, t-(init.)=1.37 s t(norm)=0.0607414, mflops=82.3162 (err=1.5e-16)

Top mflops for N=12 = 197.848
Normalized results and averages for N=12:
fft 0: mflops = -1 (norm. = -0.00505439), norm. avg. (of 1) = 0.049812
fft 1: mflops = 40.2761 (norm. = 0.203571), norm. avg. (of 3) = 0.187926
fft 2: mflops = 33.9678 (norm. = 0.171687), norm. avg. (of 3) = 0.14426
fft 3: mflops = 64.8122 (norm. = 0.327586), norm. avg. (of 3) = 0.332201
fft 4: mflops = 31.3259 (norm. = 0.158333), norm. avg. (of 3) = 0.163654
fft 5: mflops = 197.848 (norm. = 1), norm. avg. (of 3) = 0.985915
fft 6: mflops = 194.436 (norm. = 0.982759), norm. avg. (of 3) = 0.994253
fft 7: mflops = 26.3489 (norm. = 0.133178), norm. avg. (of 3) = 0.129631
fft 8: mflops = 55.8283 (norm. = 0.282178), norm. avg. (of 3) = 0.264854
fft 9: mflops = 23.8926 (norm. = 0.120763), norm. avg. (of 3) = 0.138038
fft 10: mflops = 29.2158 (norm. = 0.147668), norm. avg. (of 3) = 0.147464
fft 11: mflops = 30.1533 (norm. = 0.152406), norm. avg. (of 3) = 0.151082
fft 12: mflops = 28.1933 (norm. = 0.1425), norm. avg. (of 3) = 0.133246
fft 13: mflops = 15.4908 (norm. = 0.0782967), norm. avg. (of 3) = 0.0742371
fft 14: mflops = 6.84303 (norm. = 0.0345874), norm. avg. (of 3) = 0.0465372
fft 15: mflops = 82.3162 (norm. = 0.416058), norm. avg. (of 3) = 0.362668

Benchmarking for array size = 15:
0. Brenner: elapsed time t=1.87 s, 65536 iters, t-(init.)=1.85 s t(norm)=0.481692, mflops=10.3801 (err=3.6e-16)
1. CWP (min N): elapsed time t=1.75 s, 262144 iters, t-(init.)=1.69 s t(norm)=0.110008, mflops=45.4512
2. CWP (best N): elapsed time t=1.72 s, 262144 iters, t-(init.)=1.66 s t(norm)=0.108055, mflops=46.2726
3. FFTPACK: elapsed time t=1.9 s, 524288 iters, t-(init.)=1.79 s t(norm)=0.0582587, mflops=85.8241 (err=3.0e-16)
4. FFTPACK (f2c): elapsed time t=1.19 s, 131072 iters, t-(init.)=1.15 s t(norm)=0.149715, mflops=33.3968 (err=4.5e-16)
FFTW_MEASURE plan: (cost = 2.212524e-06) FFTW_NOTW 15
5. FFTW: elapsed time t=1.13 s, 524288 iters, t-(init.)=1 s t(norm)=0.0325467, mflops=153.625 (err=1.9e-16)
FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15
6. FFTW_ESTIMATE: elapsed time t=1.13 s, 524288 iters, t-(init.)=1 s t(norm)=0.0325467, mflops=153.625 (err=1.9e-16)
7. Frigo-old: elapsed time t=1.87 s, 131072 iters, t-(init.)=1.84 s t(norm)=0.239544, mflops=20.873 (err=3.9e-16)
8. GSL: elapsed time t=1.31 s, 262144 iters, t-(init.)=1.25 s t(norm)=0.0813669, mflops=61.4501 (err=2.3e-16)
9. NAPACK (f2c): elapsed time t=1.01 s, 65536 iters, t-(init.)=1 s t(norm)=0.260374, mflops=19.2031 (err=1.1e-15)
10. Singleton: elapsed time t=1.28 s, 131072 iters, t-(init.)=1.25 s t(norm)=0.162734, mflops=30.725 (err=2.9e-16)
11. Singleton (f2c): elapsed time t=1.29 s, 131072 iters, t-(init.)=1.26 s t(norm)=0.164036, mflops=30.4812 (err=2.9e-16)
12. Temperton: elapsed time t=1.02 s, 131072 iters, t-(init.)=0.99 s t(norm)=0.128885, mflops=38.7942 (err=7.9e-16)
13. Temperton (f2c): elapsed time t=1.34 s, 65536 iters, t-(init.)=1.32 s t(norm)=0.343694, mflops=14.5478 (err=2.0e-16)
14. Valkenburg: elapsed time t=1.53 s, 32768 iters, t-(init.)=1.52 s t(norm)=0.791537, mflops=6.31683 (err=4.9e-16)
15. SGIMATH: elapsed time t=1.94 s, 524288 iters, t-(init.)=1.81 s t(norm)=0.0589096, mflops=84.8758 (err=4.5e-16)

Top mflops for N=15 = 153.625
Normalized results and averages for N=15:
fft 0: mflops = 10.3801 (norm. = 0.0675676), norm. avg. (of 2) = 0.0586898
fft 1: mflops = 45.4512 (norm. = 0.295858), norm. avg. (of 4) = 0.214909
fft 2: mflops = 46.2726 (norm. = 0.301205), norm. avg. (of 4) = 0.183496
fft 3: mflops = 85.8241 (norm. = 0.558659), norm. avg. (of 4) = 0.388815
fft 4: mflops = 33.3968 (norm. = 0.217391), norm. avg. (of 4) = 0.177089
fft 5: mflops = 153.625 (norm. = 1), norm. avg. (of 4) = 0.989437
fft 6: mflops = 153.625 (norm. = 1), norm. avg. (of 4) = 0.99569
fft 7: mflops = 20.873 (norm. = 0.13587), norm. avg. (of 4) = 0.131191
fft 8: mflops = 61.4501 (norm. = 0.4), norm. avg. (of 4) = 0.29864
fft 9: mflops = 19.2031 (norm. = 0.125), norm. avg. (of 4) = 0.134778
fft 10: mflops = 30.725 (norm. = 0.2), norm. avg. (of 4) = 0.160598
fft 11: mflops = 30.4812 (norm. = 0.198413), norm. avg. (of 4) = 0.162915
fft 12: mflops = 38.7942 (norm. = 0.252525), norm. avg. (of 4) = 0.163066
fft 13: mflops = 14.5478 (norm. = 0.094697), norm. avg. (of 4) = 0.079352
fft 14: mflops = 6.31683 (norm. = 0.0411184), norm. avg. (of 4) = 0.0451825
fft 15: mflops = 84.8758 (norm. = 0.552486), norm. avg. (of 4) = 0.410122

Benchmarking for array size = 18:
0. Brenner: elapsed time t=1.32 s, 32768 iters, t-(init.)=1.31 s t(norm)=0.532624, mflops=9.38749 (err=4.1e-16)
1. CWP (min N): elapsed time t=1 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.0985964, mflops=50.7118
2. CWP (best N) (N=28): elapsed time t=1.11 s, 131072 iters, t-(init.)=1.07 s t(norm)=0.108761, mflops=45.9724
3. FFTPACK: elapsed time t=1.55 s, 262144 iters, t-(init.)=1.48 s t(norm)=0.0752179, mflops=66.4736 (err=2.6e-16)
4. FFTPACK (f2c): elapsed time t=1.17 s, 65536 iters, t-(init.)=1.15 s t(norm)=0.233785, mflops=21.3871 (err=2.9e-16)
FFTW_MEASURE plan: (cost = 3.204346e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6
5. FFTW: elapsed time t=1.72 s, 524288 iters, t-(init.)=1.6 s t(norm)=0.0406583, mflops=122.976 (err=1.7e-16)
FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9
6. FFTW_ESTIMATE: elapsed time t=1.67 s, 524288 iters, t-(init.)=1.55 s t(norm)=0.0393877, mflops=126.943 (err=2.2e-16)
7. Frigo-old: elapsed time t=1.34 s, 65536 iters, t-(init.)=1.32 s t(norm)=0.268345, mflops=18.6327 (err=4.7e-16)
8. GSL: elapsed time t=1.47 s, 262144 iters, t-(init.)=1.41 s t(norm)=0.0716603, mflops=69.7737 (err=2.3e-16)
9. NAPACK (f2c): elapsed time t=1.62 s, 131072 iters, t-(init.)=1.59 s t(norm)=0.161617, mflops=30.9374 (err=7.8e-16)
10. Singleton: elapsed time t=1.27 s, 131072 iters, t-(init.)=1.24 s t(norm)=0.126041, mflops=39.6697 (err=2.1e-16)
11. Singleton (f2c): elapsed time t=1.26 s, 131072 iters, t-(init.)=1.22 s t(norm)=0.124008, mflops=40.32 (err=2.1e-16)
12. Temperton: elapsed time t=1.63 s, 131072 iters, t-(init.)=1.6 s t(norm)=0.162633, mflops=30.744 (err=2.7e-08)
13. Temperton (f2c): elapsed time t=1.71 s, 65536 iters, t-(init.)=1.7 s t(norm)=0.345596, mflops=14.4678 (err=2.9e-16)
14. Valkenburg: elapsed time t=1.77 s, 32768 iters, t-(init.)=1.77 s t(norm)=0.719652, mflops=6.9478 (err=4.0e-16)
15. SGIMATH: elapsed time t=1.29 s, 262144 iters, t-(init.)=1.23 s t(norm)=0.0625122, mflops=79.9845 (err=2.9e-16)

Top mflops for N=18 = 126.943
Normalized results and averages for N=18:
fft 0: mflops = 9.38749 (norm. = 0.0739504), norm. avg. (of 3) = 0.0637767
fft 1: mflops = 50.7118 (norm. = 0.399485), norm. avg. (of 5) = 0.251824
fft 2: mflops = 45.9724 (norm. = 0.36215), norm. avg. (of 5) = 0.219227
fft 3: mflops = 66.4736 (norm. = 0.523649), norm. avg. (of 5) = 0.415782
fft 4: mflops = 21.3871 (norm. = 0.168478), norm. avg. (of 5) = 0.175367
fft 5: mflops = 122.976 (norm. = 0.96875), norm. avg. (of 5) = 0.985299
fft 6: mflops = 126.943 (norm. = 1), norm. avg. (of 5) = 0.996552
fft 7: mflops = 18.6327 (norm. = 0.14678), norm. avg. (of 5) = 0.134309
fft 8: mflops = 69.7737 (norm. = 0.549645), norm. avg. (of 5) = 0.348841
fft 9: mflops = 30.9374 (norm. = 0.243711), norm. avg. (of 5) = 0.156565
fft 10: mflops = 39.6697 (norm. = 0.3125), norm. avg. (of 5) = 0.190978
fft 11: mflops = 40.32 (norm. = 0.317623), norm. avg. (of 5) = 0.193856
fft 12: mflops = 30.744 (norm. = 0.242188), norm. avg. (of 5) = 0.17889
fft 13: mflops = 14.4678 (norm. = 0.113971), norm. avg. (of 5) = 0.0862758
fft 14: mflops = 6.9478 (norm. = 0.0547316), norm. avg. (of 5) = 0.0470924
fft 15: mflops = 79.9845 (norm. = 0.630081), norm. avg. (of 5) = 0.454114

Benchmarking for array size = 24:
0. Skipping fft (Brenner has a bug for N=3*2^m).
1. CWP (min N): elapsed time t=1.86 s, 262144 iters, t-(init.)=1.78 s t(norm)=0.0617068, mflops=81.0283
2. CWP (best N) (N=28): elapsed time t=1.11 s, 131072 iters, t-(init.)=1.06 s t(norm)=0.0734935, mflops=68.0332
3. FFTPACK: elapsed time t=1.88 s, 262144 iters, t-(init.)=1.81 s t(norm)=0.0627468, mflops=79.6853 (err=2.3e-16)
4. FFTPACK (f2c): elapsed time t=1.46 s, 65536 iters, t-(init.)=1.44 s t(norm)=0.19968, mflops=25.04 (err=2.7e-16)
FFTW_MEASURE plan: (cost = 3.662109e-06) FFTW_TWIDDLE 2 FFTW_NOTW 12
5. FFTW: elapsed time t=1.9 s, 524288 iters, t-(init.)=1.75 s t(norm)=0.0303334, mflops=164.835 (err=2.1e-16)
FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12
6. FFTW_ESTIMATE: elapsed time t=1.92 s, 524288 iters, t-(init.)=1.77 s t(norm)=0.0306801, mflops=162.972 (err=2.1e-16)
7. Frigo-old: elapsed time t=1.05 s, 65536 iters, t-(init.)=1.03 s t(norm)=0.142827, mflops=35.0074 (err=3.4e-16)
8. GSL: elapsed time t=1.67 s, 262144 iters, t-(init.)=1.6 s t(norm)=0.0554668, mflops=90.144 (err=2.0e-16)
9. NAPACK (f2c): elapsed time t=1.1 s, 65536 iters, t-(init.)=1.08 s t(norm)=0.14976, mflops=33.3867 (err=8.0e-16)
10. Singleton: elapsed time t=1.93 s, 131072 iters, t-(init.)=1.89 s t(norm)=0.13104, mflops=38.1562 (err=2.3e-16)
11.
Singleton (f2c): elapsed time t=1.9 s, 131072 iters, t-(init.)=1.86 s t(norm)=0.12896, mflops=38.7716 (err=2.3e-16) 12. Temperton: elapsed time t=1.61 s, 131072 iters, t-(init.)=1.57 s t(norm)=0.108854, mflops=45.9333 (err=4.5e-09) 13. Temperton (f2c): elapsed time t=1.72 s, 65536 iters, t-(init.)=1.7 s t(norm)=0.235734, mflops=21.2104 (err=2.8e-16) 14. Valkenburg: elapsed time t=1.28 s, 16384 iters, t-(init.)=1.27 s t(norm)=0.704428, mflops=7.09796 (err=6.3e-16) 15. SGIMATH: elapsed time t=1.04 s, 262144 iters, t-(init.)=0.97 s t(norm)=0.0336267, mflops=148.691 (err=2.6e-16) Top mflops for N=24 = 164.835 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.00606668), norm. avg. (of 3) = 0.0637767 fft 1: mflops = 81.0283 (norm. = 0.491573), norm. avg. (of 6) = 0.291782 fft 2: mflops = 68.0332 (norm. = 0.412736), norm. avg. (of 6) = 0.251478 fft 3: mflops = 79.6853 (norm. = 0.483425), norm. avg. (of 6) = 0.427056 fft 4: mflops = 25.04 (norm. = 0.15191), norm. avg. (of 6) = 0.171457 fft 5: mflops = 164.835 (norm. = 1), norm. avg. (of 6) = 0.987749 fft 6: mflops = 162.972 (norm. = 0.988701), norm. avg. (of 6) = 0.995243 fft 7: mflops = 35.0074 (norm. = 0.212379), norm. avg. (of 6) = 0.14732 fft 8: mflops = 90.144 (norm. = 0.546875), norm. avg. (of 6) = 0.381847 fft 9: mflops = 33.3867 (norm. = 0.202546), norm. avg. (of 6) = 0.164228 fft 10: mflops = 38.1562 (norm. = 0.231481), norm. avg. (of 6) = 0.197729 fft 11: mflops = 38.7716 (norm. = 0.235215), norm. avg. (of 6) = 0.200749 fft 12: mflops = 45.9333 (norm. = 0.278662), norm. avg. (of 6) = 0.195519 fft 13: mflops = 21.2104 (norm. = 0.128676), norm. avg. (of 6) = 0.0933425 fft 14: mflops = 7.09796 (norm. = 0.043061), norm. avg. (of 6) = 0.0464205 fft 15: mflops = 148.691 (norm. = 0.902062), norm. avg. (of 6) = 0.528772 Benchmarking for array size = 36: 0. Brenner: elapsed time t=1.3 s, 16384 iters, t-(init.)=1.29 s t(norm)=0.423042, mflops=11.8192 (err=5.3e-16) 1. 
CWP (min N): elapsed time t=1.26 s, 131072 iters, t-(init.)=1.2 s t(norm)=0.0491909, mflops=101.645 2. CWP (best N): elapsed time t=1.27 s, 131072 iters, t-(init.)=1.22 s t(norm)=0.0500107, mflops=99.9786 3. FFTPACK: elapsed time t=1.2 s, 131072 iters, t-(init.)=1.14 s t(norm)=0.0467313, mflops=106.995 (err=4.0e-16) 4. FFTPACK (f2c): elapsed time t=1.09 s, 32768 iters, t-(init.)=1.07 s t(norm)=0.175447, mflops=28.4986 (err=4.8e-16) FFTW_MEASURE plan: (cost = 5.493164e-06) FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.51 s, 262144 iters, t-(init.)=1.4 s t(norm)=0.0286947, mflops=174.248 (err=4.0e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.5 s, 262144 iters, t-(init.)=1.38 s t(norm)=0.0282848, mflops=176.774 (err=4.0e-16) 7. Frigo-old: elapsed time t=1.35 s, 32768 iters, t-(init.)=1.34 s t(norm)=0.219719, mflops=22.7563 (err=5.6e-16) 8. GSL: elapsed time t=1.21 s, 131072 iters, t-(init.)=1.16 s t(norm)=0.0475512, mflops=105.15 (err=4.2e-16) 9. NAPACK (f2c): elapsed time t=1.5 s, 65536 iters, t-(init.)=1.48 s t(norm)=0.121337, mflops=41.2074 (err=1.8e-15) 10. Singleton: elapsed time t=1.02 s, 65536 iters, t-(init.)=1 s t(norm)=0.0819848, mflops=60.9869 (err=4.7e-16) 11. Singleton (f2c): elapsed time t=1.01 s, 65536 iters, t-(init.)=0.98 s t(norm)=0.0803451, mflops=62.2315 (err=4.7e-16) 12. Temperton: elapsed time t=1.14 s, 65536 iters, t-(init.)=1.11 s t(norm)=0.0910031, mflops=54.9432 (err=5.1e-08) 13. Temperton (f2c): elapsed time t=1.41 s, 32768 iters, t-(init.)=1.39 s t(norm)=0.227918, mflops=21.9377 (err=3.8e-16) 14. Valkenburg: elapsed time t=1.06 s, 8192 iters, t-(init.)=1.06 s t(norm)=0.695231, mflops=7.19185 (err=6.1e-16) 15. SGIMATH: elapsed time t=1.77 s, 262144 iters, t-(init.)=1.66 s t(norm)=0.0340237, mflops=146.956 (err=4.3e-16) Top mflops for N=36 = 176.774 Normalized results and averages for N=36: fft 0: mflops = 11.8192 (norm. = 0.0668605), norm. avg. 
(of 4) = 0.0645476 fft 1: mflops = 101.645 (norm. = 0.575), norm. avg. (of 7) = 0.332242 fft 2: mflops = 99.9786 (norm. = 0.565574), norm. avg. (of 7) = 0.296349 fft 3: mflops = 106.995 (norm. = 0.605263), norm. avg. (of 7) = 0.452514 fft 4: mflops = 28.4986 (norm. = 0.161215), norm. avg. (of 7) = 0.169994 fft 5: mflops = 174.248 (norm. = 0.985714), norm. avg. (of 7) = 0.987459 fft 6: mflops = 176.774 (norm. = 1), norm. avg. (of 7) = 0.995923 fft 7: mflops = 22.7563 (norm. = 0.128731), norm. avg. (of 7) = 0.144665 fft 8: mflops = 105.15 (norm. = 0.594828), norm. avg. (of 7) = 0.412273 fft 9: mflops = 41.2074 (norm. = 0.233108), norm. avg. (of 7) = 0.174068 fft 10: mflops = 60.9869 (norm. = 0.345), norm. avg. (of 7) = 0.218767 fft 11: mflops = 62.2315 (norm. = 0.352041), norm. avg. (of 7) = 0.222362 fft 12: mflops = 54.9432 (norm. = 0.310811), norm. avg. (of 7) = 0.211989 fft 13: mflops = 21.9377 (norm. = 0.124101), norm. avg. (of 7) = 0.0977366 fft 14: mflops = 7.19185 (norm. = 0.040684), norm. avg. (of 7) = 0.045601 fft 15: mflops = 146.956 (norm. = 0.831325), norm. avg. (of 7) = 0.571994 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.08 s, 8192 iters, t-(init.)=1.07 s t(norm)=0.258258, mflops=19.3605 (err=3.8e-16) 1. CWP (min N): elapsed time t=1.22 s, 65536 iters, t-(init.)=1.16 s t(norm)=0.0349976, mflops=142.867 2. CWP (best N) (N=84): elapsed time t=1.2 s, 65536 iters, t-(init.)=1.14 s t(norm)=0.0343942, mflops=145.373 3. FFTPACK: elapsed time t=1.13 s, 65536 iters, t-(init.)=1.07 s t(norm)=0.0322823, mflops=154.884 (err=3.7e-16) 4. FFTPACK (f2c): elapsed time t=1.13 s, 16384 iters, t-(init.)=1.12 s t(norm)=0.135163, mflops=36.9923 (err=4.2e-16) FFTW_MEASURE plan: (cost = 1.464844e-05) FFTW_TWIDDLE 10 FFTW_NOTW 8 5. FFTW: elapsed time t=1.86 s, 131072 iters, t-(init.)=1.74 s t(norm)=0.0262482, mflops=190.489 (err=3.6e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 6. 
FFTW_ESTIMATE: elapsed time t=1.9 s, 131072 iters, t-(init.)=1.79 s t(norm)=0.0270025, mflops=185.168 (err=3.8e-16) 7. Frigo-old: elapsed time t=1.68 s, 32768 iters, t-(init.)=1.65 s t(norm)=0.0995622, mflops=50.2199 (err=3.5e-16) 8. GSL: elapsed time t=1.62 s, 65536 iters, t-(init.)=1.56 s t(norm)=0.0470658, mflops=106.234 (err=3.0e-16) 9. NAPACK (f2c): elapsed time t=1.4 s, 16384 iters, t-(init.)=1.38 s t(norm)=0.16654, mflops=30.0227 (err=5.2e-16) 10. Singleton: elapsed time t=1.87 s, 65536 iters, t-(init.)=1.81 s t(norm)=0.0546084, mflops=91.5611 (err=4.3e-16) 11. Singleton (f2c): elapsed time t=1.81 s, 65536 iters, t-(init.)=1.75 s t(norm)=0.0527981, mflops=94.7003 (err=4.3e-16) 12. Temperton: elapsed time t=1.25 s, 32768 iters, t-(init.)=1.22 s t(norm)=0.0736157, mflops=67.9203 (err=5.3e-08) 13. Temperton (f2c): elapsed time t=1.14 s, 16384 iters, t-(init.)=1.12 s t(norm)=0.135163, mflops=36.9923 (err=3.6e-16) 14. Valkenburg: elapsed time t=1.5 s, 4096 iters, t-(init.)=1.49 s t(norm)=0.719261, mflops=6.95158 (err=3.9e-16) 15. SGIMATH: elapsed time t=1.77 s, 131072 iters, t-(init.)=1.65 s t(norm)=0.0248905, mflops=200.879 (err=4.2e-16) Top mflops for N=80 = 200.879 Normalized results and averages for N=80: fft 0: mflops = 19.3605 (norm. = 0.0963785), norm. avg. (of 5) = 0.0709138 fft 1: mflops = 142.867 (norm. = 0.711207), norm. avg. (of 8) = 0.379612 fft 2: mflops = 145.373 (norm. = 0.723684), norm. avg. (of 8) = 0.349766 fft 3: mflops = 154.884 (norm. = 0.771028), norm. avg. (of 8) = 0.492328 fft 4: mflops = 36.9923 (norm. = 0.184152), norm. avg. (of 8) = 0.171764 fft 5: mflops = 190.489 (norm. = 0.948276), norm. avg. (of 8) = 0.982561 fft 6: mflops = 185.168 (norm. = 0.921788), norm. avg. (of 8) = 0.986656 fft 7: mflops = 50.2199 (norm. = 0.25), norm. avg. (of 8) = 0.157832 fft 8: mflops = 106.234 (norm. = 0.528846), norm. avg. (of 8) = 0.426844 fft 9: mflops = 30.0227 (norm. = 0.149457), norm. avg. (of 8) = 0.170992 fft 10: mflops = 91.5611 (norm. 
= 0.455801), norm. avg. (of 8) = 0.248397 fft 11: mflops = 94.7003 (norm. = 0.471429), norm. avg. (of 8) = 0.253496 fft 12: mflops = 67.9203 (norm. = 0.338115), norm. avg. (of 8) = 0.227755 fft 13: mflops = 36.9923 (norm. = 0.184152), norm. avg. (of 8) = 0.108538 fft 14: mflops = 6.95158 (norm. = 0.0346057), norm. avg. (of 8) = 0.0442266 fft 15: mflops = 200.879 (norm. = 1), norm. avg. (of 8) = 0.625495 Benchmarking for array size = 108: 0. Brenner: elapsed time t=1.12 s, 4096 iters, t-(init.)=1.12 s t(norm)=0.374814, mflops=13.3399 (err=7.3e-16) 1. CWP (min N) (N=110): elapsed time t=1.93 s, 65536 iters, t-(init.)=1.85 s t(norm)=0.0386946, mflops=129.217 2. CWP (best N) (N=112): elapsed time t=1.63 s, 65536 iters, t-(init.)=1.55 s t(norm)=0.0324198, mflops=154.227 3. FFTPACK: elapsed time t=1.57 s, 65536 iters, t-(init.)=1.49 s t(norm)=0.0311648, mflops=160.437 (err=5.0e-16) 4. FFTPACK (f2c): elapsed time t=1.84 s, 16384 iters, t-(init.)=1.82 s t(norm)=0.152268, mflops=32.8368 (err=5.9e-16) FFTW_MEASURE plan: (cost = 2.075195e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 5. FFTW: elapsed time t=1.36 s, 65536 iters, t-(init.)=1.28 s t(norm)=0.0267725, mflops=186.759 (err=5.1e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.35 s, 65536 iters, t-(init.)=1.27 s t(norm)=0.0265633, mflops=188.23 (err=5.1e-16) 7. Frigo-old: elapsed time t=1.42 s, 8192 iters, t-(init.)=1.41 s t(norm)=0.235932, mflops=21.1925 (err=7.5e-16) 8. GSL: elapsed time t=1.1 s, 32768 iters, t-(init.)=1.06 s t(norm)=0.0443419, mflops=112.76 (err=5.2e-16) 9. NAPACK (f2c): elapsed time t=1.19 s, 16384 iters, t-(init.)=1.17 s t(norm)=0.0978868, mflops=51.0794 (err=2.5e-15) 10. Singleton: elapsed time t=1.68 s, 32768 iters, t-(init.)=1.64 s t(norm)=0.0686044, mflops=72.8816 (err=6.7e-16) 11. Singleton (f2c): elapsed time t=1.73 s, 32768 iters, t-(init.)=1.69 s t(norm)=0.070696, mflops=70.7254 (err=6.7e-16) 12. 
Temperton: elapsed time t=1.73 s, 32768 iters, t-(init.)=1.69 s t(norm)=0.070696, mflops=70.7254 (err=7.4e-08) 13. Temperton (f2c): elapsed time t=1.36 s, 8192 iters, t-(init.)=1.35 s t(norm)=0.225893, mflops=22.1344 (err=6.8e-16) 14. Valkenburg: elapsed time t=1.02 s, 2048 iters, t-(init.)=1.02 s t(norm)=0.682698, mflops=7.32389 (err=9.6e-16) 15. SGIMATH: elapsed time t=1.29 s, 65536 iters, t-(init.)=1.21 s t(norm)=0.0253083, mflops=197.563 (err=6.1e-16) Top mflops for N=108 = 197.563 Normalized results and averages for N=108: fft 0: mflops = 13.3399 (norm. = 0.0675223), norm. avg. (of 6) = 0.0703485 fft 1: mflops = 129.217 (norm. = 0.654054), norm. avg. (of 9) = 0.410106 fft 2: mflops = 154.227 (norm. = 0.780645), norm. avg. (of 9) = 0.397641 fft 3: mflops = 160.437 (norm. = 0.812081), norm. avg. (of 9) = 0.527856 fft 4: mflops = 32.8368 (norm. = 0.166209), norm. avg. (of 9) = 0.171146 fft 5: mflops = 186.759 (norm. = 0.945312), norm. avg. (of 9) = 0.978422 fft 6: mflops = 188.23 (norm. = 0.952756), norm. avg. (of 9) = 0.982889 fft 7: mflops = 21.1925 (norm. = 0.10727), norm. avg. (of 9) = 0.152214 fft 8: mflops = 112.76 (norm. = 0.570755), norm. avg. (of 9) = 0.442835 fft 9: mflops = 51.0794 (norm. = 0.258547), norm. avg. (of 9) = 0.18072 fft 10: mflops = 72.8816 (norm. = 0.368902), norm. avg. (of 9) = 0.261786 fft 11: mflops = 70.7254 (norm. = 0.357988), norm. avg. (of 9) = 0.265106 fft 12: mflops = 70.7254 (norm. = 0.357988), norm. avg. (of 9) = 0.242225 fft 13: mflops = 22.1344 (norm. = 0.112037), norm. avg. (of 9) = 0.108927 fft 14: mflops = 7.32389 (norm. = 0.0370711), norm. avg. (of 9) = 0.0434315 fft 15: mflops = 197.563 (norm. = 1), norm. avg. (of 9) = 0.667106 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1.87 s, 4096 iters, t-(init.)=1.86 s t(norm)=0.280311, mflops=17.8373 (err=7.0e-16) 1. CWP (min N): elapsed time t=1.55 s, 32768 iters, t-(init.)=1.48 s t(norm)=0.0278804, mflops=179.337 2. 
CWP (best N): elapsed time t=1.56 s, 32768 iters, t-(init.)=1.48 s t(norm)=0.0278804, mflops=179.337 3. FFTPACK: elapsed time t=1.87 s, 16384 iters, t-(init.)=1.84 s t(norm)=0.0693242, mflops=72.1248 (err=5.4e-16) 4. FFTPACK (f2c): elapsed time t=1.75 s, 4096 iters, t-(init.)=1.74 s t(norm)=0.262226, mflops=19.0675 (err=6.0e-16) FFTW_MEASURE plan: (cost = 5.615234e-05) FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_NOTW 10 5. FFTW: elapsed time t=1.79 s, 32768 iters, t-(init.)=1.71 s t(norm)=0.0322132, mflops=155.216 (err=5.0e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.88 s, 32768 iters, t-(init.)=1.8 s t(norm)=0.0339086, mflops=147.455 (err=5.1e-16) 7. Frigo-old: elapsed time t=1.31 s, 4096 iters, t-(init.)=1.3 s t(norm)=0.195916, mflops=25.5211 (err=5.5e-16) 8. GSL: elapsed time t=1.29 s, 16384 iters, t-(init.)=1.25 s t(norm)=0.0470953, mflops=106.168 (err=7.0e-16) 9. NAPACK (f2c): elapsed time t=1.3 s, 4096 iters, t-(init.)=1.29 s t(norm)=0.194409, mflops=25.7189 (err=1.5e-14) 10. Singleton: elapsed time t=1.27 s, 8192 iters, t-(init.)=1.25 s t(norm)=0.0941905, mflops=53.0839 (err=6.3e-16) 11. Singleton (f2c): elapsed time t=1.23 s, 8192 iters, t-(init.)=1.21 s t(norm)=0.0911764, mflops=54.8387 (err=6.3e-16) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.38 s, 1024 iters, t-(init.)=1.37 s t(norm)=0.825863, mflops=6.05428 (err=5.8e-16) 15. SGIMATH: elapsed time t=1.45 s, 16384 iters, t-(init.)=1.41 s t(norm)=0.0531235, mflops=94.1204 (err=6.1e-16) Top mflops for N=210 = 179.337 Normalized results and averages for N=210: fft 0: mflops = 17.8373 (norm. = 0.0994624), norm. avg. (of 7) = 0.0745077 fft 1: mflops = 179.337 (norm. = 1), norm. avg. (of 10) = 0.469095 fft 2: mflops = 179.337 (norm. = 1), norm. avg. (of 10) = 0.457877 fft 3: mflops = 72.1248 (norm. = 0.402174), norm. avg. 
(of 10) = 0.515288 fft 4: mflops = 19.0675 (norm. = 0.106322), norm. avg. (of 10) = 0.164664 fft 5: mflops = 155.216 (norm. = 0.865497), norm. avg. (of 10) = 0.96713 fft 6: mflops = 147.455 (norm. = 0.822222), norm. avg. (of 10) = 0.966823 fft 7: mflops = 25.5211 (norm. = 0.142308), norm. avg. (of 10) = 0.151223 fft 8: mflops = 106.168 (norm. = 0.592), norm. avg. (of 10) = 0.457751 fft 9: mflops = 25.7189 (norm. = 0.143411), norm. avg. (of 10) = 0.176989 fft 10: mflops = 53.0839 (norm. = 0.296), norm. avg. (of 10) = 0.265208 fft 11: mflops = 54.8387 (norm. = 0.305785), norm. avg. (of 10) = 0.269174 fft 12: mflops = -1 (norm. = -0.00557608), norm. avg. (of 9) = 0.242225 fft 13: mflops = -1 (norm. = -0.00557608), norm. avg. (of 9) = 0.108927 fft 14: mflops = 6.05428 (norm. = 0.0337591), norm. avg. (of 10) = 0.0424643 fft 15: mflops = 94.1204 (norm. = 0.524823), norm. avg. (of 10) = 0.652878 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.22 s, 1024 iters, t-(init.)=1.22 s t(norm)=0.26332, mflops=18.9883 (err=9.4e-16) 1. CWP (min N): elapsed time t=1.77 s, 16384 iters, t-(init.)=1.68 s t(norm)=0.0226628, mflops=220.626 2. CWP (best N): elapsed time t=1.77 s, 16384 iters, t-(init.)=1.68 s t(norm)=0.0226628, mflops=220.626 3. FFTPACK: elapsed time t=1.15 s, 4096 iters, t-(init.)=1.12 s t(norm)=0.0604342, mflops=82.7346 (err=7.3e-16) 4. FFTPACK (f2c): elapsed time t=1.21 s, 1024 iters, t-(init.)=1.21 s t(norm)=0.261162, mflops=19.1452 (err=8.4e-16) FFTW_MEASURE plan: (cost = 1.318359e-04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 6 FFTW_NOTW 12 5. FFTW: elapsed time t=1.39 s, 8192 iters, t-(init.)=1.34 s t(norm)=0.0361526, mflops=138.303 (err=6.8e-16) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.53 s, 8192 iters, t-(init.)=1.49 s t(norm)=0.0401995, mflops=124.38 (err=6.8e-16) 7. Frigo-old: elapsed time t=1.63 s, 2048 iters, t-(init.)=1.62 s t(norm)=0.174828, mflops=28.5996 (err=8.6e-16) 8. 
GSL: elapsed time t=1.5 s, 8192 iters, t-(init.)=1.45 s t(norm)=0.0391204, mflops=127.811 (err=8.2e-16) 9. NAPACK (f2c): elapsed time t=1.35 s, 2048 iters, t-(init.)=1.33 s t(norm)=0.143531, mflops=34.8356 (err=4.1e-14) 10. Singleton: elapsed time t=1.38 s, 4096 iters, t-(init.)=1.36 s t(norm)=0.0733844, mflops=68.1344 (err=8.9e-16) 11. Singleton (f2c): elapsed time t=1.37 s, 4096 iters, t-(init.)=1.35 s t(norm)=0.0728448, mflops=68.6391 (err=8.9e-16) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.75 s, 512 iters, t-(init.)=1.74 s t(norm)=0.751111, mflops=6.65681 (err=1.2e-15) 15. SGIMATH: elapsed time t=1.38 s, 4096 iters, t-(init.)=1.35 s t(norm)=0.0728448, mflops=68.6391 (err=7.2e-16) Top mflops for N=504 = 220.626 Normalized results and averages for N=504: fft 0: mflops = 18.9883 (norm. = 0.0860656), norm. avg. (of 8) = 0.0759524 fft 1: mflops = 220.626 (norm. = 1), norm. avg. (of 11) = 0.517359 fft 2: mflops = 220.626 (norm. = 1), norm. avg. (of 11) = 0.507161 fft 3: mflops = 82.7346 (norm. = 0.375), norm. avg. (of 11) = 0.502535 fft 4: mflops = 19.1452 (norm. = 0.0867769), norm. avg. (of 11) = 0.157583 fft 5: mflops = 138.303 (norm. = 0.626866), norm. avg. (of 11) = 0.936197 fft 6: mflops = 124.38 (norm. = 0.563758), norm. avg. (of 11) = 0.93018 fft 7: mflops = 28.5996 (norm. = 0.12963), norm. avg. (of 11) = 0.14926 fft 8: mflops = 127.811 (norm. = 0.57931), norm. avg. (of 11) = 0.468802 fft 9: mflops = 34.8356 (norm. = 0.157895), norm. avg. (of 11) = 0.175253 fft 10: mflops = 68.1344 (norm. = 0.308824), norm. avg. (of 11) = 0.269173 fft 11: mflops = 68.6391 (norm. = 0.311111), norm. avg. (of 11) = 0.272986 fft 12: mflops = -1 (norm. = -0.00453256), norm. avg. (of 9) = 0.242225 fft 13: mflops = -1 (norm. = -0.00453256), norm. avg. (of 9) = 0.108927 fft 14: mflops = 6.65681 (norm. = 0.0301724), norm. avg. (of 11) = 0.0413468 fft 15: mflops = 68.6391 (norm. 
= 0.311111), norm. avg. (of 11) = 0.621808 Benchmarking for array size = 1000: 0. Brenner: elapsed time t=1.19 s, 512 iters, t-(init.)=1.19 s t(norm)=0.23322, mflops=21.439 (err=1.1e-15) 1. CWP (min N) (N=1001): elapsed time t=1.25 s, 4096 iters, t-(init.)=1.2 s t(norm)=0.0293975, mflops=170.083 2. CWP (best N) (N=1008): elapsed time t=1.99 s, 8192 iters, t-(init.)=1.9 s t(norm)=0.023273, mflops=214.841 3. FFTPACK: elapsed time t=1.41 s, 4096 iters, t-(init.)=1.36 s t(norm)=0.0333171, mflops=150.073 (err=1.0e-15) 4. FFTPACK (f2c): elapsed time t=1.47 s, 1024 iters, t-(init.)=1.45 s t(norm)=0.142088, mflops=35.1895 (err=1.1e-15) FFTW_MEASURE plan: (cost = 3.125000e-04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 5. FFTW: elapsed time t=1.33 s, 4096 iters, t-(init.)=1.28 s t(norm)=0.0313573, mflops=159.453 (err=1.0e-15) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.33 s, 4096 iters, t-(init.)=1.28 s t(norm)=0.0313573, mflops=159.453 (err=1.0e-15) 7. Frigo-old: elapsed time t=1.65 s, 1024 iters, t-(init.)=1.64 s t(norm)=0.160706, mflops=31.1127 (err=1.1e-15) 8. GSL: elapsed time t=1.88 s, 4096 iters, t-(init.)=1.84 s t(norm)=0.0450761, mflops=110.924 (err=1.0e-15) 9. NAPACK (f2c): elapsed time t=1.83 s, 1024 iters, t-(init.)=1.82 s t(norm)=0.178345, mflops=28.0356 (err=1.7e-14) 10. Singleton: elapsed time t=1.01 s, 2048 iters, t-(init.)=0.98 s t(norm)=0.0480159, mflops=104.132 (err=1.5e-15) 11. Singleton (f2c): elapsed time t=1.04 s, 2048 iters, t-(init.)=1.02 s t(norm)=0.0499757, mflops=100.049 (err=1.5e-15) 12. Temperton: elapsed time t=1.2 s, 2048 iters, t-(init.)=1.18 s t(norm)=0.057815, mflops=86.4827 (err=1.3e-07) 13. Temperton (f2c): elapsed time t=1.72 s, 1024 iters, t-(init.)=1.71 s t(norm)=0.167566, mflops=29.8391 (err=1.0e-15) 14. Valkenburg: elapsed time t=1.02 s, 128 iters, t-(init.)=1.02 s t(norm)=0.799611, mflops=6.25304 (err=1.1e-15) 15. 
SGIMATH: elapsed time t=1.08 s, 4096 iters, t-(init.)=1.03 s t(norm)=0.0252328, mflops=198.155 (err=1.2e-15) Top mflops for N=1000 = 214.841 Normalized results and averages for N=1000: fft 0: mflops = 21.439 (norm. = 0.0997899), norm. avg. (of 9) = 0.078601 fft 1: mflops = 170.083 (norm. = 0.791667), norm. avg. (of 12) = 0.540218 fft 2: mflops = 214.841 (norm. = 1), norm. avg. (of 12) = 0.548231 fft 3: mflops = 150.073 (norm. = 0.698529), norm. avg. (of 12) = 0.518867 fft 4: mflops = 35.1895 (norm. = 0.163793), norm. avg. (of 12) = 0.158101 fft 5: mflops = 159.453 (norm. = 0.742188), norm. avg. (of 12) = 0.920029 fft 6: mflops = 159.453 (norm. = 0.742188), norm. avg. (of 12) = 0.914514 fft 7: mflops = 31.1127 (norm. = 0.144817), norm. avg. (of 12) = 0.14889 fft 8: mflops = 110.924 (norm. = 0.516304), norm. avg. (of 12) = 0.47276 fft 9: mflops = 28.0356 (norm. = 0.130495), norm. avg. (of 12) = 0.171524 fft 10: mflops = 104.132 (norm. = 0.484694), norm. avg. (of 12) = 0.287133 fft 11: mflops = 100.049 (norm. = 0.465686), norm. avg. (of 12) = 0.289045 fft 12: mflops = 86.4827 (norm. = 0.402542), norm. avg. (of 10) = 0.258257 fft 13: mflops = 29.8391 (norm. = 0.138889), norm. avg. (of 10) = 0.111923 fft 14: mflops = 6.25304 (norm. = 0.0291054), norm. avg. (of 12) = 0.0403267 fft 15: mflops = 198.155 (norm. = 0.92233), norm. avg. (of 12) = 0.646852 Benchmarking for array size = 1960: 0. Brenner: elapsed time t=1.29 s, 256 iters, t-(init.)=1.29 s t(norm)=0.235077, mflops=21.2696 (err=3.0e-15) 1. CWP (min N) (N=1980): elapsed time t=1.1 s, 2048 iters, t-(init.)=1.06 s t(norm)=0.0241455, mflops=207.078 2. CWP (best N) (N=1980): elapsed time t=1.1 s, 2048 iters, t-(init.)=1.05 s t(norm)=0.0239177, mflops=209.05 3. FFTPACK: elapsed time t=1.01 s, 512 iters, t-(init.)=1 s t(norm)=0.0911151, mflops=54.8757 (err=2.9e-15) 4. 
FFTPACK (f2c): elapsed time t=1.82 s, 256 iters, t-(init.)=1.82 s t(norm)=0.331659, mflops=15.0757 (err=3.0e-15) FFTW_MEASURE plan: (cost = 8.203125e-04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 7 FFTW_NOTW 14 5. FFTW: elapsed time t=1.84 s, 2048 iters, t-(init.)=1.79 s t(norm)=0.040774, mflops=122.627 (err=2.8e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.78 s, 2048 iters, t-(init.)=1.74 s t(norm)=0.039635, mflops=126.151 (err=2.8e-15) 7. Frigo-old: elapsed time t=1.77 s, 512 iters, t-(init.)=1.76 s t(norm)=0.160362, mflops=31.1794 (err=2.9e-15) 8. GSL: elapsed time t=1.36 s, 1024 iters, t-(init.)=1.34 s t(norm)=0.0610471, mflops=81.904 (err=2.9e-15) 9. NAPACK (f2c): elapsed time t=1.21 s, 256 iters, t-(init.)=1.2 s t(norm)=0.218676, mflops=22.8649 (err=1.3e-13) 10. Singleton: elapsed time t=1.54 s, 1024 iters, t-(init.)=1.52 s t(norm)=0.0692474, mflops=72.2048 (err=4.4e-15) 11. Singleton (f2c): elapsed time t=1.55 s, 1024 iters, t-(init.)=1.52 s t(norm)=0.0692474, mflops=72.2048 (err=4.4e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.24 s, 64 iters, t-(init.)=1.24 s t(norm)=0.903861, mflops=5.53182 (err=3.2e-15) 15. SGIMATH: elapsed time t=1.35 s, 512 iters, t-(init.)=1.34 s t(norm)=0.122094, mflops=40.952 (err=2.8e-15) Top mflops for N=1960 = 209.05 Normalized results and averages for N=1960: fft 0: mflops = 21.2696 (norm. = 0.101744), norm. avg. (of 10) = 0.0809153 fft 1: mflops = 207.078 (norm. = 0.990566), norm. avg. (of 13) = 0.57486 fft 2: mflops = 209.05 (norm. = 1), norm. avg. (of 13) = 0.582982 fft 3: mflops = 54.8757 (norm. = 0.2625), norm. avg. (of 13) = 0.499147 fft 4: mflops = 15.0757 (norm. = 0.0721154), norm. avg. (of 13) = 0.151487 fft 5: mflops = 122.627 (norm. = 0.586592), norm. avg. (of 13) = 0.89438 fft 6: mflops = 126.151 (norm. 
= 0.603448), norm. avg. (of 13) = 0.890586 fft 7: mflops = 31.1794 (norm. = 0.149148), norm. avg. (of 13) = 0.14891 fft 8: mflops = 81.904 (norm. = 0.391791), norm. avg. (of 13) = 0.466532 fft 9: mflops = 22.8649 (norm. = 0.109375), norm. avg. (of 13) = 0.166743 fft 10: mflops = 72.2048 (norm. = 0.345395), norm. avg. (of 13) = 0.291614 fft 11: mflops = 72.2048 (norm. = 0.345395), norm. avg. (of 13) = 0.293379 fft 12: mflops = -1 (norm. = -0.00478354), norm. avg. (of 10) = 0.258257 fft 13: mflops = -1 (norm. = -0.00478354), norm. avg. (of 10) = 0.111923 fft 14: mflops = 5.53182 (norm. = 0.0264617), norm. avg. (of 13) = 0.0392602 fft 15: mflops = 40.952 (norm. = 0.195896), norm. avg. (of 13) = 0.612163 Benchmarking for array size = 4725: 0. Brenner: elapsed time t=1.07 s, 64 iters, t-(init.)=1.06 s t(norm)=0.287175, mflops=17.411 (err=1.9e-15) 1. CWP (min N) (N=5005): elapsed time t=1.12 s, 512 iters, t-(init.)=1.03 s t(norm)=0.034881, mflops=143.345 2. CWP (best N) (N=5040): elapsed time t=1.93 s, 1024 iters, t-(init.)=1.75 s t(norm)=0.0296319, mflops=168.737 3. FFTPACK: elapsed time t=1.12 s, 256 iters, t-(init.)=1.08 s t(norm)=0.0731484, mflops=68.3542 (err=1.9e-15) 4. FFTPACK (f2c): elapsed time t=1.68 s, 128 iters, t-(init.)=1.66 s t(norm)=0.224864, mflops=22.2357 (err=1.9e-15) FFTW_MEASURE plan: (cost = 2.265625e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.29 s, 512 iters, t-(init.)=1.21 s t(norm)=0.0409767, mflops=122.021 (err=1.9e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.42 s, 512 iters, t-(init.)=1.34 s t(norm)=0.0453791, mflops=110.183 (err=1.9e-15) 7. Frigo-old: elapsed time t=1.93 s, 128 iters, t-(init.)=1.91 s t(norm)=0.258729, mflops=19.3253 (err=1.9e-15) 8. GSL: elapsed time t=1.71 s, 512 iters, t-(init.)=1.63 s t(norm)=0.0552, mflops=90.5797 (err=2.0e-15) 9. 
NAPACK (f2c): elapsed time t=1.47 s, 128 iters, t-(init.)=1.45 s t(norm)=0.196417, mflops=25.456 (err=3.5e-13) 10. Singleton: elapsed time t=1.23 s, 256 iters, t-(init.)=1.19 s t(norm)=0.0805988, mflops=62.0357 (err=2.4e-15) 11. Singleton (f2c): elapsed time t=1.24 s, 256 iters, t-(init.)=1.2 s t(norm)=0.0812761, mflops=61.5187 (err=2.4e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.57 s, 32 iters, t-(init.)=1.57 s t(norm)=0.850689, mflops=5.87759 (err=1.8e-15) 15. SGIMATH: elapsed time t=1.82 s, 512 iters, t-(init.)=1.74 s t(norm)=0.0589251, mflops=84.8534 (err=1.8e-15) Top mflops for N=4725 = 168.737 Normalized results and averages for N=4725: fft 0: mflops = 17.411 (norm. = 0.103184), norm. avg. (of 11) = 0.0829398 fft 1: mflops = 143.345 (norm. = 0.849515), norm. avg. (of 14) = 0.594479 fft 2: mflops = 168.737 (norm. = 1), norm. avg. (of 14) = 0.612769 fft 3: mflops = 68.3542 (norm. = 0.405093), norm. avg. (of 14) = 0.492429 fft 4: mflops = 22.2357 (norm. = 0.131777), norm. avg. (of 14) = 0.150079 fft 5: mflops = 122.021 (norm. = 0.72314), norm. avg. (of 14) = 0.882149 fft 6: mflops = 110.183 (norm. = 0.652985), norm. avg. (of 14) = 0.873615 fft 7: mflops = 19.3253 (norm. = 0.114529), norm. avg. (of 14) = 0.146454 fft 8: mflops = 90.5797 (norm. = 0.53681), norm. avg. (of 14) = 0.471552 fft 9: mflops = 25.456 (norm. = 0.150862), norm. avg. (of 14) = 0.165609 fft 10: mflops = 62.0357 (norm. = 0.367647), norm. avg. (of 14) = 0.297045 fft 11: mflops = 61.5187 (norm. = 0.364583), norm. avg. (of 14) = 0.298465 fft 12: mflops = -1 (norm. = -0.00592638), norm. avg. (of 10) = 0.258257 fft 13: mflops = -1 (norm. = -0.00592638), norm. avg. (of 10) = 0.111923 fft 14: mflops = 5.87759 (norm. = 0.0348328), norm. avg. (of 14) = 0.0389439 fft 15: mflops = 84.8534 (norm. = 0.502874), norm. avg. (of 14) = 0.604357 Benchmarking for array size = 10368: 0. 
Brenner: elapsed time t=1.2 s, 32 iters, t-(init.)=1.19 s t(norm)=0.268875, mflops=18.596 (err=3.1e-15) 1. CWP (min N) (N=10920): elapsed time t=1.34 s, 256 iters, t-(init.)=1.25 s t(norm)=0.035304, mflops=141.627 2. CWP (best N) (N=11088): elapsed time t=1.21 s, 256 iters, t-(init.)=1.12 s t(norm)=0.0316324, mflops=158.066 3. FFTPACK: elapsed time t=1.6 s, 256 iters, t-(init.)=1.51 s t(norm)=0.0426473, mflops=117.241 (err=3.0e-15) 4. FFTPACK (f2c): elapsed time t=1.46 s, 64 iters, t-(init.)=1.44 s t(norm)=0.162681, mflops=30.735 (err=3.1e-15) FFTW_MEASURE plan: (cost = 4.687500e-03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_TWIDDLE 9 FFTW_NOTW 9 5. FFTW: elapsed time t=1.37 s, 256 iters, t-(init.)=1.28 s t(norm)=0.0361513, mflops=138.308 (err=3.1e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.55 s, 256 iters, t-(init.)=1.46 s t(norm)=0.0412351, mflops=121.256 (err=3.1e-15) 7. Frigo-old: elapsed time t=1.66 s, 64 iters, t-(init.)=1.64 s t(norm)=0.185275, mflops=26.9868 (err=3.2e-15) 8. GSL: elapsed time t=1.8 s, 256 iters, t-(init.)=1.71 s t(norm)=0.0482959, mflops=103.528 (err=3.0e-15) 9. NAPACK (f2c): elapsed time t=1.1 s, 64 iters, t-(init.)=1.08 s t(norm)=0.122011, mflops=40.98 (err=8.1e-14) 10. Singleton: elapsed time t=1.3 s, 128 iters, t-(init.)=1.26 s t(norm)=0.0711729, mflops=70.2515 (err=4.4e-15) 11. Singleton (f2c): elapsed time t=1.33 s, 128 iters, t-(init.)=1.29 s t(norm)=0.0728675, mflops=68.6177 (err=4.4e-15) 12. Temperton: elapsed time t=1.67 s, 128 iters, t-(init.)=1.62 s t(norm)=0.091508, mflops=54.64 (err=2.1e-07) 13. Temperton (f2c): elapsed time t=1.64 s, 64 iters, t-(init.)=1.62 s t(norm)=0.183016, mflops=27.32 (err=3.0e-15) 14. Valkenburg: elapsed time t=1.61 s, 16 iters, t-(init.)=1.6 s t(norm)=0.723026, mflops=6.91538 (err=3.1e-15) 15. 
SGIMATH: elapsed time t=1.46 s, 256 iters, t-(init.)=1.37 s t(norm)=0.0386932, mflops=129.222 (err=3.1e-15) Top mflops for N=10368 = 158.066 Normalized results and averages for N=10368: fft 0: mflops = 18.596 (norm. = 0.117647), norm. avg. (of 12) = 0.085832 fft 1: mflops = 141.627 (norm. = 0.896), norm. avg. (of 15) = 0.61458 fft 2: mflops = 158.066 (norm. = 1), norm. avg. (of 15) = 0.638585 fft 3: mflops = 117.241 (norm. = 0.741722), norm. avg. (of 15) = 0.509048 fft 4: mflops = 30.735 (norm. = 0.194444), norm. avg. (of 15) = 0.153036 fft 5: mflops = 138.308 (norm. = 0.875), norm. avg. (of 15) = 0.881672 fft 6: mflops = 121.256 (norm. = 0.767123), norm. avg. (of 15) = 0.866515 fft 7: mflops = 26.9868 (norm. = 0.170732), norm. avg. (of 15) = 0.148072 fft 8: mflops = 103.528 (norm. = 0.654971), norm. avg. (of 15) = 0.48378 fft 9: mflops = 40.98 (norm. = 0.259259), norm. avg. (of 15) = 0.171852 fft 10: mflops = 70.2515 (norm. = 0.444444), norm. avg. (of 15) = 0.306872 fft 11: mflops = 68.6177 (norm. = 0.434109), norm. avg. (of 15) = 0.307508 fft 12: mflops = 54.64 (norm. = 0.345679), norm. avg. (of 11) = 0.266204 fft 13: mflops = 27.32 (norm. = 0.17284), norm. avg. (of 11) = 0.117461 fft 14: mflops = 6.91538 (norm. = 0.04375), norm. avg. (of 15) = 0.0392643 fft 15: mflops = 129.222 (norm. = 0.817518), norm. avg. (of 15) = 0.618567 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.78 s, 16 iters, t-(init.)=1.77 s t(norm)=0.278331, mflops=17.9642 (err=5.8e-15) 1. CWP (min N) (N=27720): elapsed time t=1.65 s, 128 iters, t-(init.)=1.53 s t(norm)=0.0300739, mflops=166.257 2. CWP (best N) (N=27720): elapsed time t=1.66 s, 128 iters, t-(init.)=1.54 s t(norm)=0.0302705, mflops=165.177 3. FFTPACK: elapsed time t=1.26 s, 64 iters, t-(init.)=1.21 s t(norm)=0.0475679, mflops=105.113 (err=5.8e-15) 4. 
FFTPACK (f2c): elapsed time t=1.05 s, 16 iters, t-(init.)=1.04 s t(norm)=0.163539, mflops=30.5737 (err=5.8e-15) FFTW_MEASURE plan: (cost = 1.750000e-02) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 3 FFTW_TWIDDLE 10 FFTW_NOTW 9 5. FFTW: elapsed time t=1.18 s, 64 iters, t-(init.)=1.13 s t(norm)=0.0444229, mflops=112.555 (err=5.9e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.17 s, 64 iters, t-(init.)=1.11 s t(norm)=0.0436367, mflops=114.583 (err=5.9e-15) 7. Frigo-old: elapsed time t=1.62 s, 16 iters, t-(init.)=1.6 s t(norm)=0.251599, mflops=19.8729 (err=5.9e-15) 8. GSL: elapsed time t=1.4 s, 64 iters, t-(init.)=1.34 s t(norm)=0.0526785, mflops=94.9154 (err=5.8e-15) 9. NAPACK (f2c): elapsed time t=1.17 s, 16 iters, t-(init.)=1.16 s t(norm)=0.182409, mflops=27.4109 (err=1.1e-12) 10. Singleton: elapsed time t=1.93 s, 64 iters, t-(init.)=1.87 s t(norm)=0.073514, mflops=68.0142 (err=8.0e-15) 11. Singleton (f2c): elapsed time t=1.97 s, 64 iters, t-(init.)=1.91 s t(norm)=0.0750865, mflops=66.5898 (err=8.0e-15) 12. Temperton: elapsed time t=1.02 s, 32 iters, t-(init.)=0.99 s t(norm)=0.0778384, mflops=64.2357 (err=1.4e-07) 13. Temperton (f2c): elapsed time t=1.42 s, 16 iters, t-(init.)=1.41 s t(norm)=0.221721, mflops=22.5508 (err=5.8e-15) 14. Valkenburg: elapsed time t=1.26 s, 4 iters, t-(init.)=1.26 s t(norm)=0.792536, mflops=6.30886 (err=5.6e-15) 15. SGIMATH: elapsed time t=1.11 s, 64 iters, t-(init.)=1.05 s t(norm)=0.0412779, mflops=121.13 (err=5.8e-15) Top mflops for N=27000 = 166.257 Normalized results and averages for N=27000: fft 0: mflops = 17.9642 (norm. = 0.108051), norm. avg. (of 13) = 0.0875412 fft 1: mflops = 166.257 (norm. = 1), norm. avg. (of 16) = 0.638669 fft 2: mflops = 165.177 (norm. = 0.993506), norm. avg. (of 16) = 0.660767 fft 3: mflops = 105.113 (norm. = 0.632231), norm. avg. (of 16) = 0.516747 fft 4: mflops = 30.5737 (norm. = 0.183894), norm. 
avg. (of 16) = 0.154965 fft 5: mflops = 112.555 (norm. = 0.676991), norm. avg. (of 16) = 0.86888 fft 6: mflops = 114.583 (norm. = 0.689189), norm. avg. (of 16) = 0.855432 fft 7: mflops = 19.8729 (norm. = 0.119531), norm. avg. (of 16) = 0.146289 fft 8: mflops = 94.9154 (norm. = 0.570896), norm. avg. (of 16) = 0.489225 fft 9: mflops = 27.4109 (norm. = 0.164871), norm. avg. (of 16) = 0.171416 fft 10: mflops = 68.0142 (norm. = 0.409091), norm. avg. (of 16) = 0.313261 fft 11: mflops = 66.5898 (norm. = 0.400524), norm. avg. (of 16) = 0.313322 fft 12: mflops = 64.2357 (norm. = 0.386364), norm. avg. (of 12) = 0.276218 fft 13: mflops = 22.5508 (norm. = 0.135638), norm. avg. (of 12) = 0.118976 fft 14: mflops = 6.30886 (norm. = 0.0379464), norm. avg. (of 16) = 0.039182 fft 15: mflops = 121.13 (norm. = 0.728571), norm. avg. (of 16) = 0.625443 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.01 s, 2 iters, t-(init.)=1 s t(norm)=0.408103, mflops=12.2518 (err=1.1e-14) 1. CWP (min N) (N=80080): elapsed time t=1.75 s, 32 iters, t-(init.)=1.57 s t(norm)=0.0400451, mflops=124.859 2. CWP (best N) (N=80080): elapsed time t=1.74 s, 32 iters, t-(init.)=1.56 s t(norm)=0.03979, mflops=125.66 3. FFTPACK: elapsed time t=1.19 s, 8 iters, t-(init.)=1.15 s t(norm)=0.11733, mflops=42.615 (err=1.1e-14) 4. FFTPACK (f2c): elapsed time t=1.35 s, 4 iters, t-(init.)=1.33 s t(norm)=0.271388, mflops=18.4238 (err=1.1e-14) FFTW_MEASURE plan: (cost = 7.500000e-02) FFTW_TWIDDLE 4 FFTW_TWIDDLE 6 FFTW_TWIDDLE 10 FFTW_TWIDDLE 3 FFTW_TWIDDLE 7 FFTW_NOTW 15 5. FFTW: elapsed time t=1.22 s, 16 iters, t-(init.)=1.15 s t(norm)=0.0586648, mflops=85.23 (err=1.0e-14) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.22 s, 16 iters, t-(init.)=1.14 s t(norm)=0.0581547, mflops=85.9776 (err=1.0e-14) 7. 
Frigo-old: elapsed time t=1.27 s, 4 iters, t-(init.)=1.25 s t(norm)=0.255064, mflops=19.6029 (err=1.1e-14) 8. GSL: elapsed time t=1.88 s, 16 iters, t-(init.)=1.8 s t(norm)=0.0918232, mflops=54.4525 (err=1.1e-14) 9. NAPACK (f2c): elapsed time t=1.38 s, 4 iters, t-(init.)=1.36 s t(norm)=0.27751, mflops=18.0174 (err=5.1e-12) 10. Singleton: elapsed time t=1.1 s, 8 iters, t-(init.)=1.06 s t(norm)=0.108147, mflops=46.2332 (err=1.5e-14) 11. Singleton (f2c): elapsed time t=1.09 s, 8 iters, t-(init.)=1.05 s t(norm)=0.107127, mflops=46.6736 (err=1.5e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.05 s, 1 iters, t-(init.)=1.04 s t(norm)=0.848854, mflops=5.89029 (err=1.1e-14) 15. SGIMATH: elapsed time t=1.43 s, 16 iters, t-(init.)=1.35 s t(norm)=0.0688674, mflops=72.6033 (err=1.1e-14) Top mflops for N=75600 = 125.66 Normalized results and averages for N=75600: fft 0: mflops = 12.2518 (norm. = 0.0975), norm. avg. (of 14) = 0.0882525 fft 1: mflops = 124.859 (norm. = 0.993631), norm. avg. (of 17) = 0.659549 fft 2: mflops = 125.66 (norm. = 1), norm. avg. (of 17) = 0.680722 fft 3: mflops = 42.615 (norm. = 0.33913), norm. avg. (of 17) = 0.506299 fft 4: mflops = 18.4238 (norm. = 0.146617), norm. avg. (of 17) = 0.154474 fft 5: mflops = 85.23 (norm. = 0.678261), norm. avg. (of 17) = 0.857667 fft 6: mflops = 85.9776 (norm. = 0.684211), norm. avg. (of 17) = 0.84536 fft 7: mflops = 19.6029 (norm. = 0.156), norm. avg. (of 17) = 0.14686 fft 8: mflops = 54.4525 (norm. = 0.433333), norm. avg. (of 17) = 0.485937 fft 9: mflops = 18.0174 (norm. = 0.143382), norm. avg. (of 17) = 0.169767 fft 10: mflops = 46.2332 (norm. = 0.367925), norm. avg. (of 17) = 0.316476 fft 11: mflops = 46.6736 (norm. = 0.371429), norm. avg. (of 17) = 0.31674 fft 12: mflops = -1 (norm. = -0.00795801), norm. avg. (of 12) = 0.276218 fft 13: mflops = -1 (norm. = -0.00795801), norm. avg. 
(of 12) = 0.118976 fft 14: mflops = 5.89029 (norm. = 0.046875), norm. avg. (of 17) = 0.0396345 fft 15: mflops = 72.6033 (norm. = 0.577778), norm. avg. (of 17) = 0.622639 Benchmarking for array size = 165375: 0. Brenner: elapsed time t=1.66 s, 1 iters, t-(init.)=1.65 s t(norm)=0.575547, mflops=8.68739 (err=2.7e-14) 1. CWP (min N) (N=180180): elapsed time t=1.4 s, 8 iters, t-(init.)=1.27 s t(norm)=0.0553746, mflops=90.2941 2. CWP (best N) (N=180180): elapsed time t=1.4 s, 8 iters, t-(init.)=1.27 s t(norm)=0.0553746, mflops=90.2941 3. FFTPACK: elapsed time t=1.08 s, 2 iters, t-(init.)=1.05 s t(norm)=0.183129, mflops=27.3032 (err=2.7e-14) 4. FFTPACK (f2c): elapsed time t=1.24 s, 1 iters, t-(init.)=1.23 s t(norm)=0.429044, mflops=11.6538 (err=2.7e-14) FFTW_MEASURE plan: (cost = 2.400000e-01) FFTW_TWIDDLE 3 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 7 5. FFTW: elapsed time t=1.92 s, 8 iters, t-(init.)=1.8 s t(norm)=0.0784837, mflops=63.7075 (err=2.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.84 s, 8 iters, t-(init.)=1.72 s t(norm)=0.0749955, mflops=66.6707 (err=2.7e-14) 7. Frigo-old: elapsed time t=1.19 s, 1 iters, t-(init.)=1.17 s t(norm)=0.408115, mflops=12.2514 (err=2.7e-14) 8. GSL: elapsed time t=1.16 s, 4 iters, t-(init.)=1.1 s t(norm)=0.0959245, mflops=52.1243 (err=2.7e-14) 9. NAPACK (f2c): elapsed time t=1.84 s, 2 iters, t-(init.)=1.81 s t(norm)=0.315679, mflops=15.8389 (err=1.6e-11) 10. Singleton: elapsed time t=1.27 s, 2 iters, t-(init.)=1.24 s t(norm)=0.216266, mflops=23.1197 (err=4.0e-14) 11. Singleton (f2c): elapsed time t=1.28 s, 2 iters, t-(init.)=1.25 s t(norm)=0.21801, mflops=22.9347 (err=4.0e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. 
Valkenburg: elapsed time t=2.88 s, 1 iters, t-(init.)=2.87 s t(norm)=1.0011, mflops=4.99449 (err=2.7e-14) 15. SGIMATH: elapsed time t=1.04 s, 4 iters, t-(init.)=0.97 s t(norm)=0.0845879, mflops=59.1101 (err=2.7e-14) Top mflops for N=165375 = 90.2941 Normalized results and averages for N=165375: fft 0: mflops = 8.68739 (norm. = 0.0962121), norm. avg. (of 15) = 0.0887832 fft 1: mflops = 90.2941 (norm. = 1), norm. avg. (of 18) = 0.678463 fft 2: mflops = 90.2941 (norm. = 1), norm. avg. (of 18) = 0.69846 fft 3: mflops = 27.3032 (norm. = 0.302381), norm. avg. (of 18) = 0.49497 fft 4: mflops = 11.6538 (norm. = 0.129065), norm. avg. (of 18) = 0.153062 fft 5: mflops = 63.7075 (norm. = 0.705556), norm. avg. (of 18) = 0.849216 fft 6: mflops = 66.6707 (norm. = 0.738372), norm. avg. (of 18) = 0.839417 fft 7: mflops = 12.2514 (norm. = 0.135684), norm. avg. (of 18) = 0.146239 fft 8: mflops = 52.1243 (norm. = 0.577273), norm. avg. (of 18) = 0.491011 fft 9: mflops = 15.8389 (norm. = 0.175414), norm. avg. (of 18) = 0.17008 fft 10: mflops = 23.1197 (norm. = 0.256048), norm. avg. (of 18) = 0.313119 fft 11: mflops = 22.9347 (norm. = 0.254), norm. avg. (of 18) = 0.313254 fft 12: mflops = -1 (norm. = -0.0110749), norm. avg. (of 12) = 0.276218 fft 13: mflops = -1 (norm. = -0.0110749), norm. avg. (of 12) = 0.118976 fft 14: mflops = 4.99449 (norm. = 0.0553136), norm. avg. (of 18) = 0.0405056 fft 15: mflops = 59.1101 (norm. = 0.654639), norm. avg. (of 18) = 0.624417 Benchmarking for array size = 362880: 0. Brenner: elapsed time t=4.22 s, 1 iters, t-(init.)=4.19 s t(norm)=0.625179, mflops=7.99771 (err=1.1e-13) 1. CWP (min N) (N=720720): elapsed time t=1.47 s, 2 iters, t-(init.)=1.34 s t(norm)=0.099969, mflops=50.0155 2. CWP (best N) (N=720720): elapsed time t=1.47 s, 2 iters, t-(init.)=1.34 s t(norm)=0.099969, mflops=50.0155 3. FFTPACK: elapsed time t=1.76 s, 2 iters, t-(init.)=1.69 s t(norm)=0.12608, mflops=39.6573 (err=1.1e-13) 4. 
FFTPACK (f2c): elapsed time t=1.99 s, 1 iters, t-(init.)=1.96 s t(norm)=0.292447, mflops=17.0971 (err=1.1e-13) FFTW_MEASURE plan: (cost = 5.500000e-01) FFTW_TWIDDLE 5 FFTW_TWIDDLE 32 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.12 s, 2 iters, t-(init.)=1.05 s t(norm)=0.0783339, mflops=63.8293 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.15 s, 2 iters, t-(init.)=1.08 s t(norm)=0.080572, mflops=62.0563 (err=1.1e-13) 7. Frigo-old: elapsed time t=2.46 s, 1 iters, t-(init.)=2.43 s t(norm)=0.362574, mflops=13.7903 (err=1.1e-13) 8. GSL: elapsed time t=1.19 s, 2 iters, t-(init.)=1.13 s t(norm)=0.0843022, mflops=59.3104 (err=1.1e-13) 9. NAPACK (f2c): elapsed time t=1.86 s, 1 iters, t-(init.)=1.83 s t(norm)=0.27305, mflops=18.3117 (err=3.4e-11) 10. Singleton: elapsed time t=1.85 s, 1 iters, t-(init.)=1.82 s t(norm)=0.271558, mflops=18.4123 (err=1.6e-13) 11. Singleton (f2c): elapsed time t=1.87 s, 1 iters, t-(init.)=1.84 s t(norm)=0.274542, mflops=18.2122 (err=1.6e-13) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=6.72 s, 1 iters, t-(init.)=6.69 s t(norm)=0.998198, mflops=5.00903 (err=1.1e-13) 15. SGIMATH: elapsed time t=1.08 s, 2 iters, t-(init.)=1.01 s t(norm)=0.0753498, mflops=66.3572 (err=1.1e-13) Top mflops for N=362880 = 66.3572 Normalized results and averages for N=362880: fft 0: mflops = 7.99771 (norm. = 0.120525), norm. avg. (of 16) = 0.090767 fft 1: mflops = 50.0155 (norm. = 0.753731), norm. avg. (of 19) = 0.682424 fft 2: mflops = 50.0155 (norm. = 0.753731), norm. avg. (of 19) = 0.701369 fft 3: mflops = 39.6573 (norm. = 0.597633), norm. avg. (of 19) = 0.500374 fft 4: mflops = 17.0971 (norm. = 0.257653), norm. avg. (of 19) = 0.158567 fft 5: mflops = 63.8293 (norm. = 0.961905), norm. avg. 
(of 19) = 0.855147 fft 6: mflops = 62.0563 (norm. = 0.935185), norm. avg. (of 19) = 0.844457 fft 7: mflops = 13.7903 (norm. = 0.207819), norm. avg. (of 19) = 0.14948 fft 8: mflops = 59.3104 (norm. = 0.893805), norm. avg. (of 19) = 0.512211 fft 9: mflops = 18.3117 (norm. = 0.275956), norm. avg. (of 19) = 0.175653 fft 10: mflops = 18.4123 (norm. = 0.277473), norm. avg. (of 19) = 0.311243 fft 11: mflops = 18.2122 (norm. = 0.274457), norm. avg. (of 19) = 0.311212 fft 12: mflops = -1 (norm. = -0.01507), norm. avg. (of 12) = 0.276218 fft 13: mflops = -1 (norm. = -0.01507), norm. avg. (of 12) = 0.118976 fft 14: mflops = 5.00903 (norm. = 0.0754858), norm. avg. (of 19) = 0.0423466 fft 15: mflops = 66.3572 (norm. = 1), norm. avg. (of 19) = 0.644184 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) 512x128x64 (64.0236 MB) Maximum array size N = 4194304 Benchmarking FFTs: 0. FFTW 1. HARM 2. HARM (f2c) 3. PDA 4. PDA (f2c) 5. Singleton 6. Singleton (f2c) 7. Temperton 8. Temperton (f2c) Computing normalized averages (9 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.46 s, 131072 iters, t-(init.)=1.36 s t(norm)=0.0270208, mflops=185.043 (err=1.9e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. Skipping fft (all dimensions must be > 4 for HARM). 3. PDA: elapsed time t=1.52 s, 16384 iters, t-(init.)=1.5 s t(norm)=0.238419, mflops=20.9715 (err=2.8e-16) 4. PDA (f2c): elapsed time t=1.14 s, 8192 iters, t-(init.)=1.14 s t(norm)=0.362396, mflops=13.7971 (err=2.8e-16) 5. Singleton: elapsed time t=1.17 s, 65536 iters, t-(init.)=1.12 s t(norm)=0.0445048, mflops=112.347 (err=1.9e-16) 6. Singleton (f2c): elapsed time t=1.12 s, 65536 iters, t-(init.)=1.07 s t(norm)=0.042518, mflops=117.597 (err=1.9e-16) 7. 
Temperton: elapsed time t=1.04 s, 32768 iters, t-(init.)=1.02 s t(norm)=0.0810623, mflops=61.6809 (err=1.9e-16) 8. Temperton (f2c): elapsed time t=1.31 s, 32768 iters, t-(init.)=1.29 s t(norm)=0.10252, mflops=48.771 (err=1.9e-16) Top mflops for N=64 = 185.043 Normalized results and averages for N=64: fft 0: mflops = 185.043 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.00540415), norm. avg. (of 0) = -1 fft 2: mflops = -1 (norm. = -0.00540415), norm. avg. (of 0) = -1 fft 3: mflops = 20.9715 (norm. = 0.113333), norm. avg. (of 1) = 0.113333 fft 4: mflops = 13.7971 (norm. = 0.0745614), norm. avg. (of 1) = 0.0745614 fft 5: mflops = 112.347 (norm. = 0.607143), norm. avg. (of 1) = 0.607143 fft 6: mflops = 117.597 (norm. = 0.635514), norm. avg. (of 1) = 0.635514 fft 7: mflops = 61.6809 (norm. = 0.333333), norm. avg. (of 1) = 0.333333 fft 8: mflops = 48.771 (norm. = 0.263566), norm. avg. (of 1) = 0.263566 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.47 s, 16384 iters, t-(init.)=1.38 s t(norm)=0.0182788, mflops=273.542 (err=3.4e-16) 1. HARM: elapsed time t=1.43 s, 8192 iters, t-(init.)=1.38 s t(norm)=0.0365575, mflops=136.771 (err=3.8e-16) 2. HARM (f2c): elapsed time t=1.63 s, 8192 iters, t-(init.)=1.58 s t(norm)=0.0418557, mflops=119.458 (err=3.8e-16) 3. PDA: elapsed time t=1.22 s, 2048 iters, t-(init.)=1.21 s t(norm)=0.128216, mflops=38.9966 (err=3.1e-16) 4. PDA (f2c): elapsed time t=1.12 s, 1024 iters, t-(init.)=1.12 s t(norm)=0.237359, mflops=21.0651 (err=3.1e-16) 5. Singleton: elapsed time t=1.81 s, 8192 iters, t-(init.)=1.76 s t(norm)=0.0466241, mflops=107.241 (err=3.5e-16) 6. Singleton (f2c): elapsed time t=1.88 s, 8192 iters, t-(init.)=1.83 s t(norm)=0.0484784, mflops=103.139 (err=3.5e-16) 7. Temperton: elapsed time t=1.37 s, 8192 iters, t-(init.)=1.32 s t(norm)=0.0349681, mflops=142.988 (err=1.3e-08) 8. 
Temperton (f2c): elapsed time t=1.97 s, 8192 iters, t-(init.)=1.92 s t(norm)=0.0508626, mflops=98.304 (err=3.3e-16) Top mflops for N=512 = 273.542 Normalized results and averages for N=512: fft 0: mflops = 273.542 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 136.771 (norm. = 0.5), norm. avg. (of 1) = 0.5 fft 2: mflops = 119.458 (norm. = 0.436709), norm. avg. (of 1) = 0.436709 fft 3: mflops = 38.9966 (norm. = 0.142562), norm. avg. (of 2) = 0.127948 fft 4: mflops = 21.0651 (norm. = 0.0770089), norm. avg. (of 2) = 0.0757852 fft 5: mflops = 107.241 (norm. = 0.392045), norm. avg. (of 2) = 0.499594 fft 6: mflops = 103.139 (norm. = 0.377049), norm. avg. (of 2) = 0.506282 fft 7: mflops = 142.988 (norm. = 0.522727), norm. avg. (of 2) = 0.42803 fft 8: mflops = 98.304 (norm. = 0.359375), norm. avg. (of 2) = 0.31147 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.02 s, 512 iters, t-(init.)=0.94 s t(norm)=0.0373522, mflops=133.861 (err=4.2e-16) 1. HARM: elapsed time t=1.41 s, 512 iters, t-(init.)=1.34 s t(norm)=0.0532468, mflops=93.9023 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.54 s, 512 iters, t-(init.)=1.47 s t(norm)=0.0584126, mflops=85.598 (err=4.0e-16) 3. PDA: elapsed time t=1.22 s, 256 iters, t-(init.)=1.19 s t(norm)=0.0945727, mflops=52.8694 (err=3.9e-16) 4. PDA (f2c): elapsed time t=1.1 s, 128 iters, t-(init.)=1.08 s t(norm)=0.171661, mflops=29.1271 (err=3.9e-16) 5. Singleton: elapsed time t=1.04 s, 256 iters, t-(init.)=1 s t(norm)=0.0794729, mflops=62.9146 (err=4.1e-16) 6. Singleton (f2c): elapsed time t=1.03 s, 256 iters, t-(init.)=0.99 s t(norm)=0.0786781, mflops=63.5501 (err=4.1e-16) 7. Temperton: elapsed time t=1.97 s, 512 iters, t-(init.)=1.9 s t(norm)=0.0754992, mflops=66.2259 (err=6.3e-08) 8. Temperton (f2c): elapsed time t=1.6 s, 512 iters, t-(init.)=1.53 s t(norm)=0.0607967, mflops=82.2413 (err=4.5e-16) Top mflops for N=4096 = 133.861 Normalized results and averages for N=4096: fft 0: mflops = 133.861 (norm. = 1), norm. 
avg. (of 3) = 1 fft 1: mflops = 93.9023 (norm. = 0.701493), norm. avg. (of 2) = 0.600746 fft 2: mflops = 85.598 (norm. = 0.639456), norm. avg. (of 2) = 0.538082 fft 3: mflops = 52.8694 (norm. = 0.394958), norm. avg. (of 3) = 0.216951 fft 4: mflops = 29.1271 (norm. = 0.217593), norm. avg. (of 3) = 0.123054 fft 5: mflops = 62.9146 (norm. = 0.47), norm. avg. (of 3) = 0.489729 fft 6: mflops = 63.5501 (norm. = 0.474747), norm. avg. (of 3) = 0.49577 fft 7: mflops = 66.2259 (norm. = 0.494737), norm. avg. (of 3) = 0.450266 fft 8: mflops = 82.2413 (norm. = 0.614379), norm. avg. (of 3) = 0.41244 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.37 s, 64 iters, t-(init.)=1.3 s t(norm)=0.0413259, mflops=120.99 (err=5.2e-16) 1. HARM: elapsed time t=1.05 s, 32 iters, t-(init.)=1.02 s t(norm)=0.0648499, mflops=77.1012 (err=5.1e-16) 2. HARM (f2c): elapsed time t=1.11 s, 32 iters, t-(init.)=1.08 s t(norm)=0.0686646, mflops=72.8178 (err=5.1e-16) 3. PDA: elapsed time t=1.32 s, 32 iters, t-(init.)=1.28 s t(norm)=0.0813802, mflops=61.44 (err=4.3e-16) 4. PDA (f2c): elapsed time t=1.65 s, 16 iters, t-(init.)=1.64 s t(norm)=0.208537, mflops=23.9766 (err=4.3e-16) 5. Singleton: elapsed time t=1.69 s, 32 iters, t-(init.)=1.66 s t(norm)=0.10554, mflops=47.3754 (err=5.3e-16) 6. Singleton (f2c): elapsed time t=1.69 s, 32 iters, t-(init.)=1.66 s t(norm)=0.10554, mflops=47.3754 (err=5.3e-16) 7. Temperton: elapsed time t=1.29 s, 32 iters, t-(init.)=1.25 s t(norm)=0.0794729, mflops=62.9146 (err=9.6e-08) 8. Temperton (f2c): elapsed time t=1.17 s, 32 iters, t-(init.)=1.13 s t(norm)=0.0718435, mflops=69.5958 (err=4.9e-16) Top mflops for N=32768 = 120.99 Normalized results and averages for N=32768: fft 0: mflops = 120.99 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 77.1012 (norm. = 0.637255), norm. avg. (of 3) = 0.612916 fft 2: mflops = 72.8178 (norm. = 0.601852), norm. avg. (of 3) = 0.559339 fft 3: mflops = 61.44 (norm. = 0.507812), norm. avg. 
(of 4) = 0.289666 fft 4: mflops = 23.9766 (norm. = 0.198171), norm. avg. (of 4) = 0.141833 fft 5: mflops = 47.3754 (norm. = 0.391566), norm. avg. (of 4) = 0.465189 fft 6: mflops = 47.3754 (norm. = 0.391566), norm. avg. (of 4) = 0.469719 fft 7: mflops = 62.9146 (norm. = 0.52), norm. avg. (of 4) = 0.467699 fft 8: mflops = 69.5958 (norm. = 0.575221), norm. avg. (of 4) = 0.453135 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.32 s, 2 iters, t-(init.)=1.27 s t(norm)=0.134574, mflops=37.1543 (err=1.2e-15) 1. HARM: elapsed time t=1.91 s, 2 iters, t-(init.)=1.86 s t(norm)=0.197093, mflops=25.3688 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.63 s, 2 iters, t-(init.)=1.58 s t(norm)=0.167423, mflops=29.8645 (err=1.2e-15) 3. PDA: elapsed time t=1.11 s, 1 iters, t-(init.)=1.08 s t(norm)=0.228882, mflops=21.8453 (err=1.2e-15) 4. PDA (f2c): elapsed time t=1.61 s, 1 iters, t-(init.)=1.59 s t(norm)=0.336965, mflops=14.8383 (err=1.2e-15) 5. Singleton: elapsed time t=2.11 s, 1 iters, t-(init.)=2.09 s t(norm)=0.442929, mflops=11.2885 (err=1.7e-15) 6. Singleton (f2c): elapsed time t=2.14 s, 1 iters, t-(init.)=2.11 s t(norm)=0.447167, mflops=11.1815 (err=1.7e-15) 7. Temperton: elapsed time t=1.19 s, 1 iters, t-(init.)=1.17 s t(norm)=0.247955, mflops=20.1649 (err=1.3e-07) 8. Temperton (f2c): elapsed time t=1.05 s, 1 iters, t-(init.)=1.03 s t(norm)=0.218285, mflops=22.9058 (err=1.3e-15) Top mflops for N=262144 = 37.1543 Normalized results and averages for N=262144: fft 0: mflops = 37.1543 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 25.3688 (norm. = 0.682796), norm. avg. (of 4) = 0.630386 fft 2: mflops = 29.8645 (norm. = 0.803797), norm. avg. (of 4) = 0.620453 fft 3: mflops = 21.8453 (norm. = 0.587963), norm. avg. (of 5) = 0.349326 fft 4: mflops = 14.8383 (norm. = 0.399371), norm. avg. (of 5) = 0.193341 fft 5: mflops = 11.2885 (norm. = 0.303828), norm. avg. (of 5) = 0.432916 fft 6: mflops = 11.1815 (norm. = 0.300948), norm. avg. 
(of 5) = 0.435965 fft 7: mflops = 20.1649 (norm. = 0.542735), norm. avg. (of 5) = 0.482706 fft 8: mflops = 22.9058 (norm. = 0.616505), norm. avg. (of 5) = 0.485809 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.46 s, 1 iters, t-(init.)=1.41 s t(norm)=0.141545, mflops=35.3244 (err=1.2e-15) 1. HARM: elapsed time t=2.69 s, 1 iters, t-(init.)=2.65 s t(norm)=0.266025, mflops=18.7952 (err=1.2e-15) 2. HARM (f2c): elapsed time t=2.17 s, 1 iters, t-(init.)=2.12 s t(norm)=0.21282, mflops=23.494 (err=1.2e-15) 3. PDA: elapsed time t=2.02 s, 1 iters, t-(init.)=1.97 s t(norm)=0.197762, mflops=25.2829 (err=1.2e-15) 4. PDA (f2c): elapsed time t=3.05 s, 1 iters, t-(init.)=3 s t(norm)=0.30116, mflops=16.6025 (err=1.2e-15) 5. Singleton: elapsed time t=5.19 s, 1 iters, t-(init.)=5.14 s t(norm)=0.515988, mflops=9.69015 (err=1.7e-15) 6. Singleton (f2c): elapsed time t=5.15 s, 1 iters, t-(init.)=5.1 s t(norm)=0.511973, mflops=9.76615 (err=1.7e-15) 7. Temperton: elapsed time t=3.45 s, 1 iters, t-(init.)=3.4 s t(norm)=0.341315, mflops=14.6492 (err=1.5e-07) 8. Temperton (f2c): elapsed time t=2.99 s, 1 iters, t-(init.)=2.94 s t(norm)=0.295137, mflops=16.9413 (err=1.3e-15) Top mflops for N=524288 = 35.3244 Normalized results and averages for N=524288: fft 0: mflops = 35.3244 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 18.7952 (norm. = 0.532075), norm. avg. (of 5) = 0.610724 fft 2: mflops = 23.494 (norm. = 0.665094), norm. avg. (of 5) = 0.629382 fft 3: mflops = 25.2829 (norm. = 0.715736), norm. avg. (of 6) = 0.410394 fft 4: mflops = 16.6025 (norm. = 0.47), norm. avg. (of 6) = 0.239451 fft 5: mflops = 9.69015 (norm. = 0.274319), norm. avg. (of 6) = 0.406484 fft 6: mflops = 9.76615 (norm. = 0.276471), norm. avg. (of 6) = 0.409383 fft 7: mflops = 14.6492 (norm. = 0.414706), norm. avg. (of 6) = 0.471373 fft 8: mflops = 16.9413 (norm. = 0.479592), norm. avg. (of 6) = 0.484773 Benchmarking for array size = 16x1024x64 (power of 2): 0. 
FFTW: elapsed time t=2.83 s, 1 iters, t-(init.)=2.73 s t(norm)=0.130177, mflops=38.4094 (err=2.0e-15) 1. HARM: elapsed time t=5.6 s, 1 iters, t-(init.)=5.5 s t(norm)=0.26226, mflops=19.065 (err=1.9e-15) 2. HARM (f2c): elapsed time t=4.56 s, 1 iters, t-(init.)=4.46 s t(norm)=0.212669, mflops=23.5107 (err=1.9e-15) 3. PDA: elapsed time t=4.77 s, 1 iters, t-(init.)=4.68 s t(norm)=0.22316, mflops=22.4055 (err=2.0e-15) 4. PDA (f2c): elapsed time t=7.1 s, 1 iters, t-(init.)=7 s t(norm)=0.333786, mflops=14.9797 (err=2.0e-15) 5. Singleton: elapsed time t=10.14 s, 1 iters, t-(init.)=10.04 s t(norm)=0.478745, mflops=10.444 (err=2.8e-15) 6. Singleton (f2c): elapsed time t=10.59 s, 1 iters, t-(init.)=10.49 s t(norm)=0.500202, mflops=9.99596 (err=2.8e-15) 7. Skipping fft (Temperton can't handle dimensions > 256). 8. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 38.4094 Normalized results and averages for N=1048576: fft 0: mflops = 38.4094 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 19.065 (norm. = 0.496364), norm. avg. (of 6) = 0.591664 fft 2: mflops = 23.5107 (norm. = 0.612108), norm. avg. (of 6) = 0.626503 fft 3: mflops = 22.4055 (norm. = 0.583333), norm. avg. (of 7) = 0.4351 fft 4: mflops = 14.9797 (norm. = 0.39), norm. avg. (of 7) = 0.260958 fft 5: mflops = 10.444 (norm. = 0.271912), norm. avg. (of 7) = 0.387259 fft 6: mflops = 9.99596 (norm. = 0.260248), norm. avg. (of 7) = 0.388078 fft 7: mflops = -1 (norm. = -0.0260353), norm. avg. (of 6) = 0.471373 fft 8: mflops = -1 (norm. = -0.0260353), norm. avg. (of 6) = 0.484773 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=6.22 s, 1 iters, t-(init.)=6.03 s t(norm)=0.13692, mflops=36.5176 (err=7.3e-16) 1. HARM: elapsed time t=10.99 s, 1 iters, t-(init.)=10.79 s t(norm)=0.245003, mflops=20.4079 (err=7.0e-16) 2. HARM (f2c): elapsed time t=9.51 s, 1 iters, t-(init.)=9.32 s t(norm)=0.211625, mflops=23.6267 (err=7.0e-16) 3. 
PDA: elapsed time t=9.04 s, 1 iters, t-(init.)=8.85 s t(norm)=0.200953, mflops=24.8815 (err=7.0e-16) 4. PDA (f2c): elapsed time t=13.7 s, 1 iters, t-(init.)=13.51 s t(norm)=0.306765, mflops=16.2991 (err=7.0e-16) 5. Singleton: elapsed time t=28.97 s, 1 iters, t-(init.)=28.78 s t(norm)=0.653494, mflops=7.65118 (err=8.4e-16) 6. Singleton (f2c): elapsed time t=29.11 s, 1 iters, t-(init.)=28.91 s t(norm)=0.656446, mflops=7.61677 (err=8.4e-16) 7. Temperton: elapsed time t=23.15 s, 1 iters, t-(init.)=22.95 s t(norm)=0.521115, mflops=9.59481 (err=1.5e-07) 8. Temperton (f2c): elapsed time t=20.57 s, 1 iters, t-(init.)=20.37 s t(norm)=0.462532, mflops=10.8101 (err=7.3e-16) Top mflops for N=2097152 = 36.5176 Normalized results and averages for N=2097152: fft 0: mflops = 36.5176 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 20.4079 (norm. = 0.558851), norm. avg. (of 7) = 0.586976 fft 2: mflops = 23.6267 (norm. = 0.646996), norm. avg. (of 7) = 0.62943 fft 3: mflops = 24.8815 (norm. = 0.681356), norm. avg. (of 8) = 0.465882 fft 4: mflops = 16.2991 (norm. = 0.446336), norm. avg. (of 8) = 0.28413 fft 5: mflops = 7.65118 (norm. = 0.209521), norm. avg. (of 8) = 0.365042 fft 6: mflops = 7.61677 (norm. = 0.208578), norm. avg. (of 8) = 0.36564 fft 7: mflops = 9.59481 (norm. = 0.262745), norm. avg. (of 7) = 0.441569 fft 8: mflops = 10.8101 (norm. = 0.296024), norm. avg. (of 7) = 0.457809 Benchmarking for array size = 512x128x64 (power of 2): 0. FFTW: elapsed time t=12.98 s, 1 iters, t-(init.)=12.58 s t(norm)=0.136332, mflops=36.6752 (err=1.3e-15) 1. HARM: elapsed time t=28.2 s, 1 iters, t-(init.)=27.81 s t(norm)=0.301383, mflops=16.5902 (err=1.2e-15) 2. HARM (f2c): elapsed time t=22.41 s, 1 iters, t-(init.)=22.02 s t(norm)=0.238635, mflops=20.9525 (err=1.2e-15) 3. PDA: elapsed time t=19.06 s, 1 iters, t-(init.)=18.67 s t(norm)=0.202331, mflops=24.712 (err=1.3e-15) 4. PDA (f2c): elapsed time t=29.42 s, 1 iters, t-(init.)=29.03 s t(norm)=0.314604, mflops=15.893 (err=1.3e-15) 5. 
Singleton: elapsed time t=58.01 s, 1 iters, t-(init.)=57.61 s t(norm)=0.624332, mflops=8.00857 (err=1.6e-15) 6. Singleton (f2c): elapsed time t=57.66 s, 1 iters, t-(init.)=57.27 s t(norm)=0.620647, mflops=8.05611 (err=1.6e-15) 7. Skipping fft (Temperton can't handle dimensions > 256). 8. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=4194304 = 36.6752 Normalized results and averages for N=4194304: fft 0: mflops = 36.6752 (norm. = 1), norm. avg. (of 9) = 1 fft 1: mflops = 16.5902 (norm. = 0.452355), norm. avg. (of 8) = 0.570149 fft 2: mflops = 20.9525 (norm. = 0.571299), norm. avg. (of 8) = 0.622164 fft 3: mflops = 24.712 (norm. = 0.673808), norm. avg. (of 9) = 0.488985 fft 4: mflops = 15.893 (norm. = 0.433345), norm. avg. (of 9) = 0.30071 fft 5: mflops = 8.00857 (norm. = 0.218365), norm. avg. (of 9) = 0.348744 fft 6: mflops = 8.05611 (norm. = 0.219661), norm. avg. (of 9) = 0.34942 fft 7: mflops = -1 (norm. = -0.0272664), norm. avg. (of 7) = 0.441569 fft 8: mflops = -1 (norm. = -0.0272664), norm. avg. (of 7) = 0.457809 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) 120x120x120 (26.3728 MB) 144x144x144 (45.5692 MB) 180x180x180 (88.9976 MB) Maximum array size N = 5832000 Benchmarking FFTs: 0. FFTW 1. PDA 2. PDA (f2c) 3. Singleton 4. Singleton (f2c) 5. Temperton 6. Temperton (f2c) Computing normalized averages (7 transforms). Benchmarking for array size = 5x5x5: 0. 
FFTW: elapsed time t=1.93 s, 65536 iters, t-(init.)=1.84 s t(norm)=0.0322447, mflops=155.064 (err=3.9e-16) 1. PDA: elapsed time t=1.35 s, 8192 iters, t-(init.)=1.34 s t(norm)=0.18786, mflops=26.6155 (err=3.0e-16) 2. PDA (f2c): elapsed time t=1.03 s, 4096 iters, t-(init.)=1.03 s t(norm)=0.2888, mflops=17.313 (err=3.0e-16) 3. Singleton: elapsed time t=1.96 s, 65536 iters, t-(init.)=1.87 s t(norm)=0.0327704, mflops=152.577 (err=3.4e-16) 4. Singleton (f2c): elapsed time t=1.92 s, 65536 iters, t-(init.)=1.83 s t(norm)=0.0320694, mflops=155.912 (err=3.4e-16) 5. Temperton: elapsed time t=1.37 s, 32768 iters, t-(init.)=1.32 s t(norm)=0.0462641, mflops=108.075 (err=5.5e-16) 6. Temperton (f2c): elapsed time t=1.17 s, 8192 iters, t-(init.)=1.16 s t(norm)=0.162625, mflops=30.7455 (err=2.5e-16) Top mflops for N=125 = 155.912 Normalized results and averages for N=125: fft 0: mflops = 155.064 (norm. = 0.994565), norm. avg. (of 1) = 0.994565 fft 1: mflops = 26.6155 (norm. = 0.170709), norm. avg. (of 1) = 0.170709 fft 2: mflops = 17.313 (norm. = 0.111044), norm. avg. (of 1) = 0.111044 fft 3: mflops = 152.577 (norm. = 0.97861), norm. avg. (of 1) = 0.97861 fft 4: mflops = 155.912 (norm. = 1), norm. avg. (of 1) = 1 fft 5: mflops = 108.075 (norm. = 0.693182), norm. avg. (of 1) = 0.693182 fft 6: mflops = 30.7455 (norm. = 0.197198), norm. avg. (of 1) = 0.197198 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.32 s, 32768 iters, t-(init.)=1.24 s t(norm)=0.0225914, mflops=221.323 (err=2.9e-16) 1. PDA: elapsed time t=1.31 s, 4096 iters, t-(init.)=1.3 s t(norm)=0.189476, mflops=26.3886 (err=3.6e-16) 2. PDA (f2c): elapsed time t=1.09 s, 2048 iters, t-(init.)=1.08 s t(norm)=0.314822, mflops=15.882 (err=3.6e-16) 3. Singleton: elapsed time t=1.7 s, 16384 iters, t-(init.)=1.66 s t(norm)=0.0604866, mflops=82.663 (err=2.9e-16) 4. Singleton (f2c): elapsed time t=1.69 s, 16384 iters, t-(init.)=1.65 s t(norm)=0.0601222, mflops=83.164 (err=2.9e-16) 5. 
Temperton: elapsed time t=1.41 s, 16384 iters, t-(init.)=1.37 s t(norm)=0.0499196, mflops=100.161 (err=2.1e-15) 6. Temperton (f2c): elapsed time t=1.91 s, 8192 iters, t-(init.)=1.89 s t(norm)=0.137734, mflops=36.3017 (err=3.1e-16) Top mflops for N=216 = 221.323 Normalized results and averages for N=216: fft 0: mflops = 221.323 (norm. = 1), norm. avg. (of 2) = 0.997283 fft 1: mflops = 26.3886 (norm. = 0.119231), norm. avg. (of 2) = 0.14497 fft 2: mflops = 15.882 (norm. = 0.0717593), norm. avg. (of 2) = 0.0914015 fft 3: mflops = 82.663 (norm. = 0.373494), norm. avg. (of 2) = 0.676052 fft 4: mflops = 83.164 (norm. = 0.375758), norm. avg. (of 2) = 0.687879 fft 5: mflops = 100.161 (norm. = 0.452555), norm. avg. (of 2) = 0.572868 fft 6: mflops = 36.3017 (norm. = 0.164021), norm. avg. (of 2) = 0.18061 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.42 s, 16384 iters, t-(init.)=1.36 s t(norm)=0.0287347, mflops=174.006 (err=3.7e-16) 1. PDA: elapsed time t=1.38 s, 1024 iters, t-(init.)=1.38 s t(norm)=0.466516, mflops=10.7177 (err=4.9e-16) 2. PDA (f2c): elapsed time t=1.47 s, 512 iters, t-(init.)=1.47 s t(norm)=0.993882, mflops=5.03078 (err=4.9e-16) 3. Singleton: elapsed time t=1.79 s, 8192 iters, t-(init.)=1.75 s t(norm)=0.0739495, mflops=67.6137 (err=5.8e-16) 4. Singleton (f2c): elapsed time t=1.65 s, 8192 iters, t-(init.)=1.62 s t(norm)=0.0684561, mflops=73.0395 (err=5.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 174.006 Normalized results and averages for N=343: fft 0: mflops = 174.006 (norm. = 1), norm. avg. (of 3) = 0.998188 fft 1: mflops = 10.7177 (norm. = 0.0615942), norm. avg. (of 3) = 0.117178 fft 2: mflops = 5.03078 (norm. = 0.0289116), norm. avg. (of 3) = 0.0705715 fft 3: mflops = 67.6137 (norm. = 0.388571), norm. avg. (of 3) = 0.580225 fft 4: mflops = 73.0395 (norm. = 0.419753), norm. avg. (of 3) = 0.598504 fft 5: mflops = -1 (norm. 
= -0.00574693), norm. avg. (of 2) = 0.572868 fft 6: mflops = -1 (norm. = -0.00574693), norm. avg. (of 2) = 0.18061 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.48 s, 8192 iters, t-(init.)=1.41 s t(norm)=0.0248274, mflops=201.39 (err=5.6e-16) 1. PDA: elapsed time t=1.61 s, 2048 iters, t-(init.)=1.59 s t(norm)=0.111987, mflops=44.6479 (err=4.3e-16) 2. PDA (f2c): elapsed time t=1.63 s, 1024 iters, t-(init.)=1.62 s t(norm)=0.228201, mflops=21.9105 (err=4.3e-16) 3. Singleton: elapsed time t=1.3 s, 4096 iters, t-(init.)=1.27 s t(norm)=0.0447246, mflops=111.795 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.3 s, 4096 iters, t-(init.)=1.27 s t(norm)=0.0447246, mflops=111.795 (err=4.5e-16) 5. Temperton: elapsed time t=1.83 s, 8192 iters, t-(init.)=1.77 s t(norm)=0.0311663, mflops=160.43 (err=6.0e-08) 6. Temperton (f2c): elapsed time t=1.22 s, 1024 iters, t-(init.)=1.21 s t(norm)=0.170446, mflops=29.3347 (err=5.0e-16) Top mflops for N=729 = 201.39 Normalized results and averages for N=729: fft 0: mflops = 201.39 (norm. = 1), norm. avg. (of 4) = 0.998641 fft 1: mflops = 44.6479 (norm. = 0.221698), norm. avg. (of 4) = 0.143308 fft 2: mflops = 21.9105 (norm. = 0.108796), norm. avg. (of 4) = 0.0801277 fft 3: mflops = 111.795 (norm. = 0.555118), norm. avg. (of 4) = 0.573948 fft 4: mflops = 111.795 (norm. = 0.555118), norm. avg. (of 4) = 0.587657 fft 5: mflops = 160.43 (norm. = 0.79661), norm. avg. (of 3) = 0.647449 fft 6: mflops = 29.3347 (norm. = 0.145661), norm. avg. (of 3) = 0.16896 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.96 s, 8192 iters, t-(init.)=1.87 s t(norm)=0.0229055, mflops=218.288 (err=4.8e-16) 1. PDA: elapsed time t=1.1 s, 1024 iters, t-(init.)=1.09 s t(norm)=0.106811, mflops=46.8118 (err=4.7e-16) 2. PDA (f2c): elapsed time t=1.07 s, 512 iters, t-(init.)=1.07 s t(norm)=0.209702, mflops=23.8434 (err=4.7e-16) 3. 
Singleton: elapsed time t=1.01 s, 2048 iters, t-(init.)=0.99 s t(norm)=0.0485058, mflops=103.08 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.97 s, 4096 iters, t-(init.)=1.92 s t(norm)=0.0470359, mflops=106.302 (err=5.4e-16) 5. Temperton: elapsed time t=1.21 s, 4096 iters, t-(init.)=1.17 s t(norm)=0.0286625, mflops=174.444 (err=6.6e-16) 6. Temperton (f2c): elapsed time t=1.11 s, 1024 iters, t-(init.)=1.1 s t(norm)=0.107791, mflops=46.3862 (err=3.9e-16) Top mflops for N=1000 = 218.288 Normalized results and averages for N=1000: fft 0: mflops = 218.288 (norm. = 1), norm. avg. (of 5) = 0.998913 fft 1: mflops = 46.8118 (norm. = 0.21445), norm. avg. (of 5) = 0.157536 fft 2: mflops = 23.8434 (norm. = 0.109229), norm. avg. (of 5) = 0.085948 fft 3: mflops = 103.08 (norm. = 0.472222), norm. avg. (of 5) = 0.553603 fft 4: mflops = 106.302 (norm. = 0.486979), norm. avg. (of 5) = 0.567522 fft 5: mflops = 174.444 (norm. = 0.799145), norm. avg. (of 4) = 0.685373 fft 6: mflops = 46.3862 (norm. = 0.2125), norm. avg. (of 4) = 0.179845 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.91 s, 4096 iters, t-(init.)=1.85 s t(norm)=0.032697, mflops=152.919 (err=4.1e-16) 1. PDA: elapsed time t=1.63 s, 256 iters, t-(init.)=1.63 s t(norm)=0.460939, mflops=10.8474 (err=6.3e-16) 2. PDA (f2c): elapsed time t=1.91 s, 128 iters, t-(init.)=1.91 s t(norm)=1.08024, mflops=4.62861 (err=5.3e-16) 3. Singleton: elapsed time t=1.17 s, 1024 iters, t-(init.)=1.15 s t(norm)=0.0813006, mflops=61.5002 (err=6.6e-16) 4. Singleton (f2c): elapsed time t=1.1 s, 1024 iters, t-(init.)=1.08 s t(norm)=0.0763519, mflops=65.4863 (err=6.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 152.919 Normalized results and averages for N=1331: fft 0: mflops = 152.919 (norm. = 1), norm. avg. (of 6) = 0.999094 fft 1: mflops = 10.8474 (norm. = 0.0709356), norm. avg. (of 6) = 0.143103 fft 2: mflops = 4.62861 (norm. 
= 0.0302683), norm. avg. (of 6) = 0.076668 fft 3: mflops = 61.5002 (norm. = 0.402174), norm. avg. (of 6) = 0.528365 fft 4: mflops = 65.4863 (norm. = 0.428241), norm. avg. (of 6) = 0.544308 fft 5: mflops = -1 (norm. = -0.0065394), norm. avg. (of 4) = 0.685373 fft 6: mflops = -1 (norm. = -0.0065394), norm. avg. (of 4) = 0.179845 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.51 s, 4096 iters, t-(init.)=1.43 s t(norm)=0.0187857, mflops=266.16 (err=3.9e-16) 1. PDA: elapsed time t=1.6 s, 1024 iters, t-(init.)=1.58 s t(norm)=0.0830247, mflops=60.223 (err=3.8e-16) 2. PDA (f2c): elapsed time t=1.71 s, 512 iters, t-(init.)=1.7 s t(norm)=0.178661, mflops=27.986 (err=3.8e-16) 3. Singleton: elapsed time t=1.77 s, 2048 iters, t-(init.)=1.73 s t(norm)=0.0454534, mflops=110.003 (err=4.0e-16) 4. Singleton (f2c): elapsed time t=1.81 s, 2048 iters, t-(init.)=1.77 s t(norm)=0.0465044, mflops=107.517 (err=4.0e-16) 5. Temperton: elapsed time t=1.79 s, 4096 iters, t-(init.)=1.71 s t(norm)=0.022464, mflops=222.579 (err=1.8e-15) 6. Temperton (f2c): elapsed time t=1.64 s, 1024 iters, t-(init.)=1.62 s t(norm)=0.0851266, mflops=58.736 (err=3.9e-16) Top mflops for N=1728 = 266.16 Normalized results and averages for N=1728: fft 0: mflops = 266.16 (norm. = 1), norm. avg. (of 7) = 0.999224 fft 1: mflops = 60.223 (norm. = 0.226266), norm. avg. (of 7) = 0.154983 fft 2: mflops = 27.986 (norm. = 0.105147), norm. avg. (of 7) = 0.0807365 fft 3: mflops = 110.003 (norm. = 0.413295), norm. avg. (of 7) = 0.511926 fft 4: mflops = 107.517 (norm. = 0.403955), norm. avg. (of 7) = 0.524258 fft 5: mflops = 222.579 (norm. = 0.836257), norm. avg. (of 5) = 0.71555 fft 6: mflops = 58.736 (norm. = 0.220679), norm. avg. (of 5) = 0.188012 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.05 s, 1024 iters, t-(init.)=1.02 s t(norm)=0.0408409, mflops=122.426 (err=4.5e-16) 1. PDA: elapsed time t=1.48 s, 128 iters, t-(init.)=1.48 s t(norm)=0.474075, mflops=10.5469 (err=6.7e-16) 2. 
PDA (f2c): elapsed time t=1.78 s, 64 iters, t-(init.)=1.78 s t(norm)=1.14034, mflops=4.38465 (err=6.5e-16) 3. Singleton: elapsed time t=1.03 s, 512 iters, t-(init.)=1.01 s t(norm)=0.080881, mflops=61.8192 (err=8.7e-16) 4. Singleton (f2c): elapsed time t=1.86 s, 1024 iters, t-(init.)=1.82 s t(norm)=0.072873, mflops=68.6125 (err=8.7e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 122.426 Normalized results and averages for N=2197: fft 0: mflops = 122.426 (norm. = 1), norm. avg. (of 8) = 0.999321 fft 1: mflops = 10.5469 (norm. = 0.0861486), norm. avg. (of 8) = 0.146379 fft 2: mflops = 4.38465 (norm. = 0.0358146), norm. avg. (of 8) = 0.0751212 fft 3: mflops = 61.8192 (norm. = 0.50495), norm. avg. (of 8) = 0.511054 fft 4: mflops = 68.6125 (norm. = 0.56044), norm. avg. (of 8) = 0.52878 fft 5: mflops = -1 (norm. = -0.00816818), norm. avg. (of 5) = 0.71555 fft 6: mflops = -1 (norm. = -0.00816818), norm. avg. (of 5) = 0.188012 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.01 s, 1024 iters, t-(init.)=0.93 s t(norm)=0.0289771, mflops=172.55 (err=4.1e-16) 1. PDA: elapsed time t=1.78 s, 256 iters, t-(init.)=1.76 s t(norm)=0.219353, mflops=22.7943 (err=5.0e-16) 2. PDA (f2c): elapsed time t=1.24 s, 64 iters, t-(init.)=1.24 s t(norm)=0.618177, mflops=8.0883 (err=5.0e-16) 3. Singleton: elapsed time t=1.39 s, 512 iters, t-(init.)=1.35 s t(norm)=0.0841269, mflops=59.434 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.32 s, 512 iters, t-(init.)=1.28 s t(norm)=0.0797648, mflops=62.6843 (err=5.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 172.55 Normalized results and averages for N=2744: fft 0: mflops = 172.55 (norm. = 1), norm. avg. (of 9) = 0.999396 fft 1: mflops = 22.7943 (norm. = 0.132102), norm. avg. (of 9) = 0.144793 fft 2: mflops = 8.0883 (norm. 
= 0.046875), norm. avg. (of 9) = 0.0719828 fft 3: mflops = 59.434 (norm. = 0.344444), norm. avg. (of 9) = 0.492542 fft 4: mflops = 62.6843 (norm. = 0.363281), norm. avg. (of 9) = 0.510392 fft 5: mflops = -1 (norm. = -0.00579541), norm. avg. (of 5) = 0.71555 fft 6: mflops = -1 (norm. = -0.00579541), norm. avg. (of 5) = 0.188012 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.39 s, 1024 iters, t-(init.)=1.27 s t(norm)=0.0313529, mflops=159.475 (err=5.4e-16) 1. PDA: elapsed time t=1.57 s, 512 iters, t-(init.)=1.52 s t(norm)=0.0750494, mflops=66.6228 (err=5.3e-16) 2. PDA (f2c): elapsed time t=1.73 s, 256 iters, t-(init.)=1.7 s t(norm)=0.167874, mflops=29.7843 (err=5.3e-16) 3. Singleton: elapsed time t=1.29 s, 512 iters, t-(init.)=1.23 s t(norm)=0.0607308, mflops=82.3306 (err=6.7e-16) 4. Singleton (f2c): elapsed time t=1.28 s, 512 iters, t-(init.)=1.22 s t(norm)=0.060237, mflops=83.0054 (err=6.7e-16) 5. Temperton: elapsed time t=1.24 s, 1024 iters, t-(init.)=1.12 s t(norm)=0.0276498, mflops=180.833 (err=2.1e-15) 6. Temperton (f2c): elapsed time t=1.52 s, 256 iters, t-(init.)=1.49 s t(norm)=0.147136, mflops=33.9821 (err=5.2e-16) Top mflops for N=3375 = 180.833 Normalized results and averages for N=3375: fft 0: mflops = 159.475 (norm. = 0.88189), norm. avg. (of 10) = 0.987645 fft 1: mflops = 66.6228 (norm. = 0.368421), norm. avg. (of 10) = 0.167155 fft 2: mflops = 29.7843 (norm. = 0.164706), norm. avg. (of 10) = 0.0812551 fft 3: mflops = 82.3306 (norm. = 0.455285), norm. avg. (of 10) = 0.488816 fft 4: mflops = 83.0054 (norm. = 0.459016), norm. avg. (of 10) = 0.505254 fft 5: mflops = 180.833 (norm. = 1), norm. avg. (of 6) = 0.762958 fft 6: mflops = 33.9821 (norm. = 0.187919), norm. avg. (of 6) = 0.187997 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.45 s, 128 iters, t-(init.)=1.38 s t(norm)=0.0457205, mflops=109.36 (err=5.4e-16) 1. PDA: elapsed time t=1.44 s, 64 iters, t-(init.)=1.4 s t(norm)=0.0927663, mflops=53.8989 (err=5.0e-16) 2. 
PDA (f2c): elapsed time t=1.87 s, 32 iters, t-(init.)=1.85 s t(norm)=0.245168, mflops=20.3942 (err=4.9e-16) 3. Singleton: elapsed time t=1.18 s, 64 iters, t-(init.)=1.14 s t(norm)=0.0755382, mflops=66.1916 (err=5.3e-16) 4. Singleton (f2c): elapsed time t=1.15 s, 64 iters, t-(init.)=1.11 s t(norm)=0.0735504, mflops=67.9806 (err=5.3e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 109.36 Normalized results and averages for N=16800: fft 0: mflops = 109.36 (norm. = 1), norm. avg. (of 11) = 0.988769 fft 1: mflops = 53.8989 (norm. = 0.492857), norm. avg. (of 11) = 0.196765 fft 2: mflops = 20.3942 (norm. = 0.186486), norm. avg. (of 11) = 0.0908216 fft 3: mflops = 66.1916 (norm. = 0.605263), norm. avg. (of 11) = 0.499402 fft 4: mflops = 67.9806 (norm. = 0.621622), norm. avg. (of 11) = 0.515833 fft 5: mflops = -1 (norm. = -0.0091441), norm. avg. (of 6) = 0.762958 fft 6: mflops = -1 (norm. = -0.0091441), norm. avg. (of 6) = 0.187997 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.77 s, 16 iters, t-(init.)=1.61 s t(norm)=0.0543051, mflops=92.0724 (err=6.7e-16) 1. PDA: elapsed time t=1.28 s, 8 iters, t-(init.)=1.21 s t(norm)=0.0816263, mflops=61.2548 (err=6.4e-16) 2. PDA (f2c): elapsed time t=1.49 s, 4 iters, t-(init.)=1.46 s t(norm)=0.196982, mflops=25.383 (err=6.4e-16) 3. Singleton: elapsed time t=1.24 s, 4 iters, t-(init.)=1.21 s t(norm)=0.163253, mflops=30.6274 (err=6.5e-16) 4. Singleton (f2c): elapsed time t=1.25 s, 4 iters, t-(init.)=1.21 s t(norm)=0.163253, mflops=30.6274 (err=6.5e-16) 5. Temperton: elapsed time t=1.42 s, 8 iters, t-(init.)=1.34 s t(norm)=0.0903961, mflops=55.3121 (err=1.0e-07) 6. Temperton (f2c): elapsed time t=1.66 s, 8 iters, t-(init.)=1.58 s t(norm)=0.106586, mflops=46.9103 (err=7.3e-16) Top mflops for N=110592 = 92.0724 Normalized results and averages for N=110592: fft 0: mflops = 92.0724 (norm. = 1), norm. avg. 
(of 12) = 0.989705 fft 1: mflops = 61.2548 (norm. = 0.665289), norm. avg. (of 12) = 0.235808 fft 2: mflops = 25.383 (norm. = 0.275685), norm. avg. (of 12) = 0.106227 fft 3: mflops = 30.6274 (norm. = 0.332645), norm. avg. (of 12) = 0.485506 fft 4: mflops = 30.6274 (norm. = 0.332645), norm. avg. (of 12) = 0.500567 fft 5: mflops = 55.3121 (norm. = 0.600746), norm. avg. (of 7) = 0.739785 fft 6: mflops = 46.9103 (norm. = 0.509494), norm. avg. (of 7) = 0.233925 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.85 s, 16 iters, t-(init.)=1.67 s t(norm)=0.0526696, mflops=94.9315 (err=6.8e-16) 1. PDA: elapsed time t=1.56 s, 4 iters, t-(init.)=1.52 s t(norm)=0.191755, mflops=26.0749 (err=7.6e-16) 2. PDA (f2c): elapsed time t=1.21 s, 1 iters, t-(init.)=1.2 s t(norm)=0.605542, mflops=8.25706 (err=7.6e-16) 3. Singleton: elapsed time t=1.4 s, 4 iters, t-(init.)=1.36 s t(norm)=0.17157, mflops=29.1426 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.39 s, 4 iters, t-(init.)=1.35 s t(norm)=0.170309, mflops=29.3584 (err=1.0e-15) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 94.9315 Normalized results and averages for N=117649: fft 0: mflops = 94.9315 (norm. = 1), norm. avg. (of 13) = 0.990497 fft 1: mflops = 26.0749 (norm. = 0.274671), norm. avg. (of 13) = 0.238798 fft 2: mflops = 8.25706 (norm. = 0.0869792), norm. avg. (of 13) = 0.104746 fft 3: mflops = 29.1426 (norm. = 0.306985), norm. avg. (of 13) = 0.471774 fft 4: mflops = 29.3584 (norm. = 0.309259), norm. avg. (of 13) = 0.485851 fft 5: mflops = -1 (norm. = -0.0105339), norm. avg. (of 7) = 0.739785 fft 6: mflops = -1 (norm. = -0.0105339), norm. avg. (of 7) = 0.233925 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.84 s, 8 iters, t-(init.)=1.69 s t(norm)=0.0551903, mflops=90.5956 (err=7.6e-16) 1. PDA: elapsed time t=1.79 s, 4 iters, t-(init.)=1.71 s t(norm)=0.111687, mflops=44.768 (err=7.7e-16) 2. 
PDA (f2c): elapsed time t=1.78 s, 2 iters, t-(init.)=1.75 s t(norm)=0.228599, mflops=21.8724 (err=7.7e-16) 3. Singleton: elapsed time t=1.88 s, 2 iters, t-(init.)=1.85 s t(norm)=0.241662, mflops=20.6901 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.9 s, 2 iters, t-(init.)=1.86 s t(norm)=0.242968, mflops=20.5788 (err=1.0e-15) 5. Temperton: elapsed time t=1.81 s, 8 iters, t-(init.)=1.66 s t(norm)=0.0542106, mflops=92.2329 (err=2.0e-15) 6. Temperton (f2c): elapsed time t=1.07 s, 2 iters, t-(init.)=1.04 s t(norm)=0.135853, mflops=36.8045 (err=7.3e-16) Top mflops for N=216000 = 92.2329 Normalized results and averages for N=216000: fft 0: mflops = 90.5956 (norm. = 0.982249), norm. avg. (of 14) = 0.989907 fft 1: mflops = 44.768 (norm. = 0.48538), norm. avg. (of 14) = 0.256411 fft 2: mflops = 21.8724 (norm. = 0.237143), norm. avg. (of 14) = 0.114203 fft 3: mflops = 20.6901 (norm. = 0.224324), norm. avg. (of 14) = 0.454099 fft 4: mflops = 20.5788 (norm. = 0.223118), norm. avg. (of 14) = 0.467085 fft 5: mflops = 92.2329 (norm. = 1), norm. avg. (of 8) = 0.772312 fft 6: mflops = 36.8045 (norm. = 0.399038), norm. avg. (of 8) = 0.254564 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.16 s, 4 iters, t-(init.)=1.08 s t(norm)=0.0624055, mflops=80.1211 (err=5.9e-16) 1. PDA: elapsed time t=1.22 s, 2 iters, t-(init.)=1.18 s t(norm)=0.136368, mflops=36.6656 (err=6.8e-16) 2. PDA (f2c): elapsed time t=1.29 s, 1 iters, t-(init.)=1.27 s t(norm)=0.293537, mflops=17.0336 (err=6.8e-16) 3. Singleton: elapsed time t=1.09 s, 1 iters, t-(init.)=1.07 s t(norm)=0.247311, mflops=20.2175 (err=6.3e-16) 4. Singleton (f2c): elapsed time t=1.11 s, 1 iters, t-(init.)=1.09 s t(norm)=0.251934, mflops=19.8465 (err=6.3e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 80.1211 Normalized results and averages for N=241920: fft 0: mflops = 80.1211 (norm. = 1), norm. avg. 
(of 15) = 0.99058 fft 1: mflops = 36.6656 (norm. = 0.457627), norm. avg. (of 15) = 0.269825 fft 2: mflops = 17.0336 (norm. = 0.212598), norm. avg. (of 15) = 0.120763 fft 3: mflops = 20.2175 (norm. = 0.252336), norm. avg. (of 15) = 0.440648 fft 4: mflops = 19.8465 (norm. = 0.247706), norm. avg. (of 15) = 0.452459 fft 5: mflops = -1 (norm. = -0.0124811), norm. avg. (of 8) = 0.772312 fft 6: mflops = -1 (norm. = -0.0124811), norm. avg. (of 8) = 0.254564 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.14 s, 2 iters, t-(init.)=1.07 s t(norm)=0.0678646, mflops=73.6762 (err=7.1e-16) 1. PDA: elapsed time t=1.04 s, 1 iters, t-(init.)=1.01 s t(norm)=0.128118, mflops=39.0265 (err=7.7e-16) 2. PDA (f2c): elapsed time t=1.9 s, 1 iters, t-(init.)=1.86 s t(norm)=0.23594, mflops=21.1918 (err=7.7e-16) 3. Singleton: elapsed time t=1.88 s, 1 iters, t-(init.)=1.84 s t(norm)=0.233403, mflops=21.4221 (err=8.9e-16) 4. Singleton (f2c): elapsed time t=1.94 s, 1 iters, t-(init.)=1.91 s t(norm)=0.242283, mflops=20.637 (err=8.9e-16) 5. Temperton: elapsed time t=1.03 s, 2 iters, t-(init.)=0.95 s t(norm)=0.0602536, mflops=82.9826 (err=1.4e-07) 6. Temperton (f2c): elapsed time t=1.33 s, 1 iters, t-(init.)=1.29 s t(norm)=0.163636, mflops=30.5556 (err=7.6e-16) Top mflops for N=421875 = 82.9826 Normalized results and averages for N=421875: fft 0: mflops = 73.6762 (norm. = 0.88785), norm. avg. (of 16) = 0.98416 fft 1: mflops = 39.0265 (norm. = 0.470297), norm. avg. (of 16) = 0.282355 fft 2: mflops = 21.1918 (norm. = 0.255376), norm. avg. (of 16) = 0.129176 fft 3: mflops = 21.4221 (norm. = 0.258152), norm. avg. (of 16) = 0.429242 fft 4: mflops = 20.637 (norm. = 0.248691), norm. avg. (of 16) = 0.439724 fft 5: mflops = 82.9826 (norm. = 1), norm. avg. (of 9) = 0.797611 fft 6: mflops = 30.5556 (norm. = 0.368217), norm. avg. (of 9) = 0.267192 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.43 s, 2 iters, t-(init.)=1.34 s t(norm)=0.0689976, mflops=72.4663 (err=7.4e-16) 1. 
PDA: elapsed time t=1.34 s, 1 iters, t-(init.)=1.29 s t(norm)=0.132846, mflops=37.6375 (err=7.1e-16) 2. PDA (f2c): elapsed time t=2.26 s, 1 iters, t-(init.)=2.21 s t(norm)=0.227589, mflops=21.9694 (err=6.1e-16) 3. Singleton: elapsed time t=2.87 s, 1 iters, t-(init.)=2.82 s t(norm)=0.290408, mflops=17.2172 (err=7.9e-16) 4. Singleton (f2c): elapsed time t=2.95 s, 1 iters, t-(init.)=2.9 s t(norm)=0.298646, mflops=16.7422 (err=7.9e-16) 5. Temperton: elapsed time t=1.03 s, 1 iters, t-(init.)=0.99 s t(norm)=0.101952, mflops=49.0428 (err=1.7e-07) 6. Temperton (f2c): elapsed time t=1.23 s, 1 iters, t-(init.)=1.18 s t(norm)=0.121518, mflops=41.1461 (err=6.7e-16) Top mflops for N=512000 = 72.4663 Normalized results and averages for N=512000: fft 0: mflops = 72.4663 (norm. = 1), norm. avg. (of 17) = 0.985091 fft 1: mflops = 37.6375 (norm. = 0.51938), norm. avg. (of 17) = 0.296297 fft 2: mflops = 21.9694 (norm. = 0.303167), norm. avg. (of 17) = 0.139411 fft 3: mflops = 17.2172 (norm. = 0.237589), norm. avg. (of 17) = 0.417968 fft 4: mflops = 16.7422 (norm. = 0.231034), norm. avg. (of 17) = 0.427448 fft 5: mflops = 49.0428 (norm. = 0.676768), norm. avg. (of 10) = 0.785526 fft 6: mflops = 41.1461 (norm. = 0.567797), norm. avg. (of 10) = 0.297252 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.61 s, 2 iters, t-(init.)=1.51 s t(norm)=0.0664247, mflops=75.2732 (err=7.1e-16) 1. PDA: elapsed time t=2.05 s, 1 iters, t-(init.)=1.99 s t(norm)=0.17508, mflops=28.5584 (err=7.7e-16) 2. PDA (f2c): elapsed time t=4.48 s, 1 iters, t-(init.)=4.43 s t(norm)=0.38975, mflops=12.8287 (err=1.4e-15) 3. Singleton: elapsed time t=3.5 s, 1 iters, t-(init.)=3.45 s t(norm)=0.30353, mflops=16.4728 (err=9.1e-16) 4. Singleton (f2c): elapsed time t=3.5 s, 1 iters, t-(init.)=3.45 s t(norm)=0.30353, mflops=16.4728 (err=9.1e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 
Top mflops for N=592704 = 75.2732 Normalized results and averages for N=592704: fft 0: mflops = 75.2732 (norm. = 1), norm. avg. (of 18) = 0.98592 fft 1: mflops = 28.5584 (norm. = 0.379397), norm. avg. (of 18) = 0.300914 fft 2: mflops = 12.8287 (norm. = 0.170429), norm. avg. (of 18) = 0.141134 fft 3: mflops = 16.4728 (norm. = 0.218841), norm. avg. (of 18) = 0.406905 fft 4: mflops = 16.4728 (norm. = 0.218841), norm. avg. (of 18) = 0.415859 fft 5: mflops = -1 (norm. = -0.0132849), norm. avg. (of 10) = 0.785526 fft 6: mflops = -1 (norm. = -0.0132849), norm. avg. (of 10) = 0.297252 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=2.22 s, 1 iters, t-(init.)=2.14 s t(norm)=0.122441, mflops=40.8361 (err=7.6e-16) 1. PDA: elapsed time t=3.68 s, 1 iters, t-(init.)=3.6 s t(norm)=0.205975, mflops=24.2748 (err=6.5e-16) 2. PDA (f2c): elapsed time t=5.57 s, 1 iters, t-(init.)=5.5 s t(norm)=0.314684, mflops=15.889 (err=6.5e-16) 3. Singleton: elapsed time t=8.45 s, 1 iters, t-(init.)=8.38 s t(norm)=0.479464, mflops=10.4283 (err=7.0e-16) 4. Singleton (f2c): elapsed time t=8.57 s, 1 iters, t-(init.)=8.49 s t(norm)=0.485757, mflops=10.2932 (err=7.0e-16) 5. Temperton: elapsed time t=3.79 s, 1 iters, t-(init.)=3.71 s t(norm)=0.212269, mflops=23.5551 (err=1.6e-07) 6. Temperton (f2c): elapsed time t=3.73 s, 1 iters, t-(init.)=3.65 s t(norm)=0.208836, mflops=23.9423 (err=7.7e-16) Top mflops for N=884736 = 40.8361 Normalized results and averages for N=884736: fft 0: mflops = 40.8361 (norm. = 1), norm. avg. (of 19) = 0.986661 fft 1: mflops = 24.2748 (norm. = 0.594444), norm. avg. (of 19) = 0.316363 fft 2: mflops = 15.889 (norm. = 0.389091), norm. avg. (of 19) = 0.154185 fft 3: mflops = 10.4283 (norm. = 0.25537), norm. avg. (of 19) = 0.39893 fft 4: mflops = 10.2932 (norm. = 0.252061), norm. avg. (of 19) = 0.407238 fft 5: mflops = 23.5551 (norm. = 0.576819), norm. avg. (of 11) = 0.766553 fft 6: mflops = 23.9423 (norm. = 0.586301), norm. avg. 
(of 11) = 0.32353 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=1.83 s, 1 iters, t-(init.)=1.72 s t(norm)=0.0737636, mflops=67.7841 (err=7.9e-16) 1. PDA: elapsed time t=4.18 s, 1 iters, t-(init.)=4.08 s t(norm)=0.174974, mflops=28.5757 (err=8.6e-16) 2. PDA (f2c): elapsed time t=8.87 s, 1 iters, t-(init.)=8.76 s t(norm)=0.37568, mflops=13.3092 (err=6.8e-16) 3. Singleton: elapsed time t=6.94 s, 1 iters, t-(init.)=6.84 s t(norm)=0.293339, mflops=17.0451 (err=7.0e-16) 4. Singleton (f2c): elapsed time t=6.95 s, 1 iters, t-(init.)=6.84 s t(norm)=0.293339, mflops=17.0451 (err=7.0e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 67.7841 Normalized results and averages for N=1157625: fft 0: mflops = 67.7841 (norm. = 1), norm. avg. (of 20) = 0.987328 fft 1: mflops = 28.5757 (norm. = 0.421569), norm. avg. (of 20) = 0.321623 fft 2: mflops = 13.3092 (norm. = 0.196347), norm. avg. (of 20) = 0.156293 fft 3: mflops = 17.0451 (norm. = 0.251462), norm. avg. (of 20) = 0.391557 fft 4: mflops = 17.0451 (norm. = 0.251462), norm. avg. (of 20) = 0.399449 fft 5: mflops = -1 (norm. = -0.0147527), norm. avg. (of 11) = 0.766553 fft 6: mflops = -1 (norm. = -0.0147527), norm. avg. (of 11) = 0.32353 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=2.34 s, 1 iters, t-(init.)=2.22 s t(norm)=0.0773747, mflops=64.6206 (err=5.9e-16) 1. PDA: elapsed time t=5.13 s, 1 iters, t-(init.)=5.01 s t(norm)=0.174616, mflops=28.6343 (err=5.7e-16) 2. PDA (f2c): elapsed time t=10.51 s, 1 iters, t-(init.)=10.38 s t(norm)=0.361779, mflops=13.8206 (err=5.7e-16) 3. Singleton: elapsed time t=9.15 s, 1 iters, t-(init.)=9.03 s t(norm)=0.314727, mflops=15.8868 (err=6.6e-16) 4. Singleton (f2c): elapsed time t=9.27 s, 1 iters, t-(init.)=9.15 s t(norm)=0.318909, mflops=15.6784 (err=6.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 64.6206 Normalized results and averages for N=1404928: fft 0: mflops = 64.6206 (norm. = 1), norm. avg. (of 21) = 0.987931 fft 1: mflops = 28.6343 (norm. = 0.443114), norm. avg. (of 21) = 0.327409 fft 2: mflops = 13.8206 (norm. = 0.213873), norm. avg. (of 21) = 0.159035 fft 3: mflops = 15.8868 (norm. = 0.245847), norm. avg. (of 21) = 0.384618 fft 4: mflops = 15.6784 (norm. = 0.242623), norm. avg. (of 21) = 0.391981 fft 5: mflops = -1 (norm. = -0.0154749), norm. avg. (of 11) = 0.766553 fft 6: mflops = -1 (norm. = -0.0154749), norm. avg. (of 11) = 0.32353 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=2.6 s, 1 iters, t-(init.)=2.45 s t(norm)=0.0684256, mflops=73.0721 (err=7.4e-16) 1. PDA: elapsed time t=4.47 s, 1 iters, t-(init.)=4.32 s t(norm)=0.120652, mflops=41.4413 (err=8.1e-16) 2. PDA (f2c): elapsed time t=8.39 s, 1 iters, t-(init.)=8.24 s t(norm)=0.230133, mflops=21.7265 (err=8.1e-16) 3. Singleton: elapsed time t=15.55 s, 1 iters, t-(init.)=15.4 s t(norm)=0.430104, mflops=11.6251 (err=9.6e-16) 4. Singleton (f2c): elapsed time t=15.75 s, 1 iters, t-(init.)=15.6 s t(norm)=0.435689, mflops=11.4761 (err=9.6e-16) 5. Temperton: elapsed time t=3.53 s, 1 iters, t-(init.)=3.38 s t(norm)=0.0943994, mflops=52.9665 (err=1.1e-08) 6. Temperton (f2c): elapsed time t=6.04 s, 1 iters, t-(init.)=5.89 s t(norm)=0.164501, mflops=30.395 (err=7.1e-16) Top mflops for N=1728000 = 73.0721 Normalized results and averages for N=1728000: fft 0: mflops = 73.0721 (norm. = 1), norm. avg. (of 22) = 0.98848 fft 1: mflops = 41.4413 (norm. = 0.56713), norm. avg. (of 22) = 0.338305 fft 2: mflops = 21.7265 (norm. = 0.29733), norm. avg. (of 22) = 0.165321 fft 3: mflops = 11.6251 (norm. = 0.159091), norm. avg. (of 22) = 0.374367 fft 4: mflops = 11.4761 (norm. = 0.157051), norm. avg. (of 22) = 0.381302 fft 5: mflops = 52.9665 (norm. = 0.724852), norm. avg. 
(of 12) = 0.763078 fft 6: mflops = 30.395 (norm. = 0.415959), norm. avg. (of 12) = 0.331232 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=5.15 s, 1 iters, t-(init.)=4.88 s t(norm)=0.0759795, mflops=65.8072 (err=1.2e-15) 1. PDA: elapsed time t=8.67 s, 1 iters, t-(init.)=8.4 s t(norm)=0.130784, mflops=38.2309 (err=1.1e-15) 2. PDA (f2c): elapsed time t=15.42 s, 1 iters, t-(init.)=15.15 s t(norm)=0.235879, mflops=21.1973 (err=1.1e-15) 3. Singleton: elapsed time t=27.43 s, 1 iters, t-(init.)=27.17 s t(norm)=0.423025, mflops=11.8196 (err=1.6e-15) 4. Singleton (f2c): elapsed time t=27.7 s, 1 iters, t-(init.)=27.44 s t(norm)=0.427229, mflops=11.7033 (err=1.6e-15) 5. Temperton: elapsed time t=8.83 s, 1 iters, t-(init.)=8.57 s t(norm)=0.133431, mflops=37.4725 (err=1.8e-07) 6. Temperton (f2c): elapsed time t=10.78 s, 1 iters, t-(init.)=10.52 s t(norm)=0.163792, mflops=30.5265 (err=1.2e-15) Top mflops for N=2985984 = 65.8072 Normalized results and averages for N=2985984: fft 0: mflops = 65.8072 (norm. = 1), norm. avg. (of 23) = 0.988981 fft 1: mflops = 38.2309 (norm. = 0.580952), norm. avg. (of 23) = 0.348855 fft 2: mflops = 21.1973 (norm. = 0.322112), norm. avg. (of 23) = 0.172138 fft 3: mflops = 11.8196 (norm. = 0.17961), norm. avg. (of 23) = 0.365899 fft 4: mflops = 11.7033 (norm. = 0.177843), norm. avg. (of 23) = 0.372456 fft 5: mflops = 37.4725 (norm. = 0.569428), norm. avg. (of 13) = 0.748182 fft 6: mflops = 30.5265 (norm. = 0.463878), norm. avg. (of 13) = 0.341436 Benchmarking for array size = 180x180x180: 0. FFTW: elapsed time t=10.31 s, 1 iters, t-(init.)=9.73 s t(norm)=0.0742309, mflops=67.3574 (err=9.5e-16) 1. PDA: elapsed time t=15.67 s, 1 iters, t-(init.)=15.1 s t(norm)=0.115199, mflops=43.4031 (err=9.9e-16) 2. PDA (f2c): elapsed time t=29.75 s, 1 iters, t-(init.)=29.18 s t(norm)=0.222616, mflops=22.4602 (err=9.5e-16) 3. Singleton: elapsed time t=59.95 s, 1 iters, t-(init.)=59.38 s t(norm)=0.453015, mflops=11.0372 (err=1.2e-15) 4. 
Singleton (f2c): elapsed time t=61.03 s, 1 iters, t-(init.)=60.45 s t(norm)=0.461178, mflops=10.8418 (err=1.2e-15)
5. Temperton: elapsed time t=8.91 s, 1 iters, t-(init.)=8.34 s t(norm)=0.0636265, mflops=78.5836 (err=9.9e-08)
6. Temperton (f2c): elapsed time t=21.1 s, 1 iters, t-(init.)=20.53 s t(norm)=0.156625, mflops=31.9234 (err=9.4e-16)
Top mflops for N=5832000 = 78.5836
Normalized results and averages for N=5832000:
fft 0: mflops = 67.3574 (norm. = 0.857143), norm. avg. (of 24) = 0.983487
fft 1: mflops = 43.4031 (norm. = 0.552318), norm. avg. (of 24) = 0.357333
fft 2: mflops = 22.4602 (norm. = 0.285812), norm. avg. (of 24) = 0.176874
fft 3: mflops = 11.0372 (norm. = 0.140451), norm. avg. (of 24) = 0.356505
fft 4: mflops = 10.8418 (norm. = 0.137965), norm. avg. (of 24) = 0.362686
fft 5: mflops = 78.5836 (norm. = 1), norm. avg. (of 14) = 0.766169
fft 6: mflops = 31.9234 (norm. = 0.406235), norm. avg. (of 14) = 0.346064
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Ooura (C), Ooura (F), Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg, SGIMATH
2, 47.6625, 35.2463, 29.9593, 1.09227, 12.9454, 10.9227, 4.2625, 5.95782, 19.2399, 3.08405, 3.08405, , 9.53251, 6.16809, 23.0456, 23.3017, 59.9186, , 5.40503, 8.59489, 10.0825, 30.1748, , , , , 5.04123, 48.771, 24.3855, , , 5.60736, 5.99186, 19.065, 17.3318, 2.91271, 2.49661, 6.85344, 13.6179
4, 63.5501, 66.5763, 42.3667, 5.19097, 23.4319, 16.257, 13.53, 13.2731, 29.5374, 10.7546, 6.67883, 44.1506, 30.1748, 18.3961, 77.6723, 78.3982, 190.65, , 18.8933, 16.0088, 16.7772, 109.655, 39.9458, 40.3298, 32.0176, 5.04123, 10.2802, 83.8861, 61.6809, 6.20459, 45.8394, 19.7845, 21.8453, 35.8488, 30.1748, 9.79978, 8.59489, 6.31672, 49.0562
8, 114.39, 109.417, 49.152, 6.09637, 47.3042, 17.0039, 24.0132, 20.6956, 33.4652, 32.0993, 19.065, 41.6653, 46.6034, 24.576, 161.319, 161.319, 262.144, 105.739, 39.3216, 24.7695, 26.2144, 159.277, 57.7198, 58.7987, 50.3316, 11.2347, 15.5729, 132.452, 88.6121, 5.82542, 63.5501, 21.1123, 21.3995, 48.3958, 34.3795, 19.7845, 15.1237, 6.39376, 114.39
16, 71.0899, 73.5843, 52.7585, 13.7069, 62.1378, 21.9597, 36.1578, 28.3399, 37.7865, 62.1378, 40.3298, 41.5278, 84.7334, 38.8361, 216.48, 215.093, 310.689, 83.0555, 69.9051, 36.7921, 37.1177, 180.4, 53.7731, 64.5278, 54.4715, 18.8933, 24.9661, 134.218, 120.699, 21.3995, 82.2413, 53.7731, 55.1882, 63.0722, 39.5689, 27.962, 27.4138, 6.63656, 149.797
32, 89.6219, 90.3945, 59.5782, 14.8104, 93.6229, 24.9661, 48.5452, 36.6635, 42.9744, 81.92, 83.2203, 45.9902, 85.2501, 28.807, 188.933, 192.399, 274.138, 118.483, 71.3317, 45.9902, 46.8114, 188.933, 62.4152, 77.1012, 70.8497, 28.0368, 31.2076, 150.874, 135.3, 18.4608, 99.8644, 70.8497, 72.8178, 84.5626, 44.4312, 32.9741, 32.9741, 6.82667, 216.201
64, 86.7787, 86.1843, 65.8791, 28.8599, 93.9023, 25.7847, 58.2542, 42.5098, 48.0264, 109.417, 114.39, 50.7375, 129.721, 35.7469, 195.084, 182.361, 174.763, 166.661, 98.304, 50.7375, 53.7731, 186.414, 62.9146, 80.6597, 75.3468, 35.7469, 39.5689, 158.276, 162.36, 37.0086, 114.39, 97.542, 99.8644, 103.991, 49.152, 47.6625, 47.6625, 6.95958, 241.979
128, 96.5794, 97.219, 71.9611, 27.3882, 124.407, 26.403, 64.956, 48.2897, 53.5769, 139.81, 154.527, 56.4618, 157.006, 36.7002, 220.753, 188.206, 206.761, 165.876, 115.591, 58.7203, 57.344, 172.707, 69.9051, 92.3274, 86.3533, 42.6746, 40.778, 169.712, 169.712, 33.6699, 124.407, 97.219, 96.5794, 126.552, 54.3706, 48.6095, 48.2897, 7.05772, 285.05
256, 98.6895, 99.2735, 77.6723, 34.9525, 112.599, 27.5941, 72.9444, 51.7815, 58.6616, 172.961, 196.225, 61.2307, 208.413, 41.943, 234.646, 198.547, 211.034, 186.414, 126.144, 65.028, 63.5501, 153.919, 72.3156, 95.3251, 89.7177, 48.771, 45.3438, 176.602, 185.384, 49.0562, 135.3, 123.362, 128.07, 143.395, 59.9186, 50.84, 57.8525, 7.08497, 291.778
512, 106.036, 106.036, 81.355, 35.7469, 134.817, 27.7564, 80.6597, 56.1737, 63.7648, 188.744, 196.608, 65.536, 155.987, 34.6955, 223.365, 196.608, 192.596, 202.95, 107.854, 66.9304, 65.084, 146.313, 77.9933, 103.139, 98.304, 52.4288, 46.7187, 179.756, 185.043, 47.1859, 126.674, 127.53, 130.168, 155.987, 63.7648, 49.4093, 51.5693, 7.19298, 294.912
1024, 108.101, 106.998, 86.6592, 43.6907, 87.3813, 27.5941, 81.92, 57.6141, 67.6501, 190.65, 192.399, 68.9853, 127.1, 36.4089, 190.65, 165.13, 160.088, 201.649, 114.598, 67.2164, 63.9376, 81.285, 81.285, 104.858, 99.8644, 57.6141, 50.9017, 185.589, 199.729, 58.5797, 106.998, 135.3, 142.663, 165.13, 68.0894, 56.9878, 59.2416, 5.50723, 218.453
2048, 92.2747, 91.5423, 73.0021, 39.7736, 35.8209, 26.6999, 84.8113, 57.1007, 57.6717, 164.776, 181.643, 61.0282, 69.0679, 28.5503, 141.526, 135.698, 93.0188, 155.869, 54.9254, 62.6866, 61.6809, 71.6418, 84.1922, 104.858, 99.4339, 51.9565, 29.5752, 152.773, 167.164, 55.9919, 45.4108, 121.414, 121.414, 92.2747, 51.0369, 47.6625, 51.4926, 6.49456, 126.058
4096, 44.306, 46.9512, 40.3298, 39.8193, 19.6608, 24.0132, 60.4948, 42.799, 34.1927, 137.518, 154.392, 36.5782, 48.771, 31.1458, 109.417, 109.417, 77.6723, 102.3, 42.799, 35.3453, 34.7594, 54.7083, 81.1801, 98.304, 86.7787, 35.9512, 30.8405, 91.8461, 90.5245, 57.1951, 43.3894, 64.8604, 66.2259, 54.2367, 31.4573, 35.1478, 38.8361, 6.29146, 127.1
8192, 45.1374, 47.3316, 35.6845, 37.656, 19.8132, 23.6658, 56.3285, 41.0587, 31.5544, 141.995, 150.624, 33.4105, 40.5699, 24.875, 105.67, 104.858, 80.6597, 88.5162, 38.0768, 33.0861, 33.7413, , 51.6344, 57.2752, 54.9657, 35.1327, 26.8336, 89.6808, 88.5162, 52.4288, 42.8663, 60.3163, 60.3163, 46.0523, 28.6376, 33.0861, 34.7742, 6.40577, 134.965
16384, 40.3298, 43.4321, 34.2992, 44.7563, 18.5354, 23.5257, 58.2542, 41.0058, 30.0821, 149.797, 149.797, 32.478, 41.4691, 26.9854, 94.7101, 91.1805, 69.9051, 85.8483, 39.0427, 31.9132, 32.478, , 49.262, 53.1886, 52.057, 34.6228, 27.3882, 79.783, 75.6704, 66.1264, 37.4491, 56.4618, 58.7203, 39.6758, 27.1853, 33.3638, 35.6312, 6.19935, 138.491
32768, 42.0552, 43.6907, 33.0434, 41.1745, 13.9438, 23.4057, 58.2542, 40.1241, 29.5651, 149.797, 149.797, 31.4573, 28.0869, 19.4661, 88.8624, 79.4376, 52.0816, 84.1104, 34.4926, 30.9619, 31.711, , 47.6625, 51.067, 50.0912, 34.4926, 24.576, 78.6432, 74.8983, 60.033, 30.72, 50.7375, 52.0816, 37.4491, 26.9326, 31.711, 34.1927, 6.38338, 144.299
65536, 27.5941, 29.7468, 23.3017, 44.8589, 5.85797, 22.7951, 46.3459, 34.3795, 20.9715, 132.104, 132.104, 22.55, 14.3641, 14.2663, 65.028, 54.8275, 29.3308, 56.2994, 14.9797, 24.8184, 25.8908, , 45.5903, 48.771, 45.8394, 24.1052, 12.4092, 60.787, 56.2994, 68.7591, 10.5917, 41.943, 42.799, 22.55, 18.8933, 23.9675, 25.7319, 5.32272, 126.144
131072, 8.07328, 8.25268, 6.63162, 23.8313, 3.30597, 17.1402, 17.408, 11.9797, 6.33018, 101.283, 102.447, 6.67133, 7.32968, 7.57899, 33.0107, 27.5089, 21.2212, 19.3759, 8.704, 7.6309, 7.73689, , 34.0187, 36.2313, 26.3695, 7.28178, 9.94743, 40.1482, 39.0916, 42.4424, 9.13207, 11.6661, 11.9156, 12.243, 5.95782, 9.28427, 9.77291, 4.12634, 86.533
262144, 7.08497, 6.95958, 5.61737, 25.0989, 3.14154, 16.4986, 14.7456, 11.0247, 5.31373, 74.8983, 74.8983, 5.68505, 7.12778, 8.48668, 29.308, 24.9661, 16.9734, 17.8735, 8.42606, 7.17111, 7.14938, , 9.47508, 9.51329, 9.07422, 6.74085, 8.93673, 22.6855, 21.4481, 46.7187, 8.27823, 11.4529, 11.6797, 10.4858, 5.06287, 9.32528, 9.95484, 3.75087, 85.0197
524288, 6.795, 6.7673, 5.22638, 18.5848, 3.08023, 16.1189, 14.0302, 10.2696, 4.91682, 74.8983, 75.4657, 5.29302, 6.5536, 7.03494, 28.7904, 25.1552, 15.4681, 16.9991, 8.04642, 6.88898, 6.795, , 8.67724, 8.7689, 8.34294, 6.37738, 7.88091, 23.7178, 22.8474, 39.8459, 7.6158, 8.94208, 9.25787, 8.11195, 4.65489, 8.13846, 8.83109, 3.63823,
80.9876 1048576, 6.32434, 6.32053, 5.02673, 23.25, 2.99166, 16.0333, 13.0745, 10.22, 4.68114, , , 5.12, 6.41723, 7.74428, 25.7004, 23.9401, 13.4778, , 7.8019, 6.68308, 6.5618, , , 7.94376, 7.48983, 6.1464, 7.38954, 18.0478, 16.513, 48.5452, 7.09936, 9.89223, 10.1803, 7.45787, 4.42251, 8.21768, 8.97753, 3.44247, 88.1156 Norm. Avg., 0.309114, 0.302751, 0.226934, 0.174224, 0.199347, 0.134811, 0.246029, 0.180256, 0.176948, 0.648015, 0.66527, 0.18396, 0.291922, 0.125925, 0.601477, 0.558848, 0.594726, 0.523652, 0.238377, 0.175116, 0.176613, 0.539745, 0.283056, 0.319044, 0.292707, 0.149124, 0.1305, 0.538044, 0.501149, 0.28708, 0.269266, 0.296076, 0.303468, 0.306527, 0.169988, 0.153777, 0.160694, 0.0378639, 0.814619 ------------------------------------------------------ @@@@ bench.1d.np2.dat N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg, SGIMATH 6, , 15.5183, 12.2464, 33.0552, 18.6504, 114.529, 119.582, 17.3752, 30.8015, 15.5183, 14.1173, 14.4177, 12.5487, 8.00353, 6.77632, 33.0552 9, 7.02894, 32.5165, 22.3916, 55.3985, 24.9293, 141.109, 141.109, 15.5808, 35.9557, 23.0827, 24.9293, 25.4381, 21.4908, 10.9339, 6.82371, 55.8119 12, , 40.2761, 33.9678, 64.8122, 31.3259, 197.848, 194.436, 26.3489, 55.8283, 23.8926, 29.2158, 30.1533, 28.1933, 15.4908, 6.84303, 82.3162 15, 10.3801, 45.4512, 46.2726, 85.8241, 33.3968, 153.625, 153.625, 20.873, 61.4501, 19.2031, 30.725, 30.4812, 38.7942, 14.5478, 6.31683, 84.8758 18, 9.38749, 50.7118, 45.9724, 66.4736, 21.3871, 122.976, 126.943, 18.6327, 69.7737, 30.9374, 39.6697, 40.32, 30.744, 14.4678, 6.9478, 79.9845 24, , 81.0283, 68.0332, 79.6853, 25.04, 164.835, 162.972, 35.0074, 90.144, 33.3867, 38.1562, 38.7716, 45.9333, 21.2104, 7.09796, 148.691 36, 11.8192, 101.645, 99.9786, 106.995, 28.4986, 174.248, 176.774, 22.7563, 105.15, 41.2074, 60.9869, 62.2315, 54.9432, 21.9377, 7.19185, 146.956 80, 19.3605, 
142.867, 145.373, 154.884, 36.9923, 190.489, 185.168, 50.2199, 106.234, 30.0227, 91.5611, 94.7003, 67.9203, 36.9923, 6.95158, 200.879 108, 13.3399, 129.217, 154.227, 160.437, 32.8368, 186.759, 188.23, 21.1925, 112.76, 51.0794, 72.8816, 70.7254, 70.7254, 22.1344, 7.32389, 197.563 210, 17.8373, 179.337, 179.337, 72.1248, 19.0675, 155.216, 147.455, 25.5211, 106.168, 25.7189, 53.0839, 54.8387, , , 6.05428, 94.1204 504, 18.9883, 220.626, 220.626, 82.7346, 19.1452, 138.303, 124.38, 28.5996, 127.811, 34.8356, 68.1344, 68.6391, , , 6.65681, 68.6391 1000, 21.439, 170.083, 214.841, 150.073, 35.1895, 159.453, 159.453, 31.1127, 110.924, 28.0356, 104.132, 100.049, 86.4827, 29.8391, 6.25304, 198.155 1960, 21.2696, 207.078, 209.05, 54.8757, 15.0757, 122.627, 126.151, 31.1794, 81.904, 22.8649, 72.2048, 72.2048, , , 5.53182, 40.952 4725, 17.411, 143.345, 168.737, 68.3542, 22.2357, 122.021, 110.183, 19.3253, 90.5797, 25.456, 62.0357, 61.5187, , , 5.87759, 84.8534 10368, 18.596, 141.627, 158.066, 117.241, 30.735, 138.308, 121.256, 26.9868, 103.528, 40.98, 70.2515, 68.6177, 54.64, 27.32, 6.91538, 129.222 27000, 17.9642, 166.257, 165.177, 105.113, 30.5737, 112.555, 114.583, 19.8729, 94.9154, 27.4109, 68.0142, 66.5898, 64.2357, 22.5508, 6.30886, 121.13 75600, 12.2518, 124.859, 125.66, 42.615, 18.4238, 85.23, 85.9776, 19.6029, 54.4525, 18.0174, 46.2332, 46.6736, , , 5.89029, 72.6033 165375, 8.68739, 90.2941, 90.2941, 27.3032, 11.6538, 63.7075, 66.6707, 12.2514, 52.1243, 15.8389, 23.1197, 22.9347, , , 4.99449, 59.1101 362880, 7.99771, 50.0155, 50.0155, 39.6573, 17.0971, 63.8293, 62.0563, 13.7903, 59.3104, 18.3117, 18.4123, 18.2122, , , 5.00903, 66.3572 Norm. 
Avg., 0.090767, 0.682424, 0.701369, 0.500374, 0.158567, 0.855147, 0.844457, 0.14948, 0.512211, 0.175653, 0.311243, 0.311212, 0.276218, 0.118976, 0.0423466, 0.644184 ------------------------------------------------------ @@@@ bench.3d.p2.dat Array Dimensions, FFTW, HARM, HARM (f2c), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c) 4x4x4, 185.043, , , 20.9715, 13.7971, 112.347, 117.597, 61.6809, 48.771 8x8x8, 273.542, 136.771, 119.458, 38.9966, 21.0651, 107.241, 103.139, 142.988, 98.304 16x16x16, 133.861, 93.9023, 85.598, 52.8694, 29.1271, 62.9146, 63.5501, 66.2259, 82.2413 32x32x32, 120.99, 77.1012, 72.8178, 61.44, 23.9766, 47.3754, 47.3754, 62.9146, 69.5958 64x64x64, 37.1543, 25.3688, 29.8645, 21.8453, 14.8383, 11.2885, 11.1815, 20.1649, 22.9058 256x64x32, 35.3244, 18.7952, 23.494, 25.2829, 16.6025, 9.69015, 9.76615, 14.6492, 16.9413 16x1024x64, 38.4094, 19.065, 23.5107, 22.4055, 14.9797, 10.444, 9.99596, , 128x128x128, 36.5176, 20.4079, 23.6267, 24.8815, 16.2991, 7.65118, 7.61677, 9.59481, 10.8101 512x128x64, 36.6752, 16.5902, 20.9525, 24.712, 15.893, 8.00857, 8.05611, , Norm. 
Avg., 1, 0.570149, 0.622164, 0.488985, 0.30071, 0.348744, 0.34942, 0.441569, 0.457809 ------------------------------------------------------ @@@@ bench.3d.np2.dat Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c) 5x5x5, 155.064, 26.6155, 17.313, 152.577, 155.912, 108.075, 30.7455 6x6x6, 221.323, 26.3886, 15.882, 82.663, 83.164, 100.161, 36.3017 7x7x7, 174.006, 10.7177, 5.03078, 67.6137, 73.0395, , 9x9x9, 201.39, 44.6479, 21.9105, 111.795, 111.795, 160.43, 29.3347 10x10x10, 218.288, 46.8118, 23.8434, 103.08, 106.302, 174.444, 46.3862 11x11x11, 152.919, 10.8474, 4.62861, 61.5002, 65.4863, , 12x12x12, 266.16, 60.223, 27.986, 110.003, 107.517, 222.579, 58.736 13x13x13, 122.426, 10.5469, 4.38465, 61.8192, 68.6125, , 14x14x14, 172.55, 22.7943, 8.0883, 59.434, 62.6843, , 15x15x15, 159.475, 66.6228, 29.7843, 82.3306, 83.0054, 180.833, 33.9821 24x25x28, 109.36, 53.8989, 20.3942, 66.1916, 67.9806, , 48x48x48, 92.0724, 61.2548, 25.383, 30.6274, 30.6274, 55.3121, 46.9103 49x49x49, 94.9315, 26.0749, 8.25706, 29.1426, 29.3584, , 60x60x60, 90.5956, 44.768, 21.8724, 20.6901, 20.5788, 92.2329, 36.8045 72x60x56, 80.1211, 36.6656, 17.0336, 20.2175, 19.8465, , 75x75x75, 73.6762, 39.0265, 21.1918, 21.4221, 20.637, 82.9826, 30.5556 80x80x80, 72.4663, 37.6375, 21.9694, 17.2172, 16.7422, 49.0428, 41.1461 84x84x84, 75.2732, 28.5584, 12.8287, 16.4728, 16.4728, , 96x96x96, 40.8361, 24.2748, 15.889, 10.4283, 10.2932, 23.5551, 23.9423 105x105x105, 67.7841, 28.5757, 13.3092, 17.0451, 17.0451, , 112x112x112, 64.6206, 28.6343, 13.8206, 15.8868, 15.6784, , 120x120x120, 73.0721, 41.4413, 21.7265, 11.6251, 11.4761, 52.9665, 30.395 144x144x144, 65.8072, 38.2309, 21.1973, 11.8196, 11.7033, 37.4725, 30.5265 180x180x180, 67.3574, 43.4031, 22.4602, 11.0372, 10.8418, 78.5836, 31.9234 Norm. Avg., 0.983487, 0.357333, 0.176874, 0.356505, 0.362686, 0.766169, 0.346064 @@@@ end