To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Bradley Lucier
@ submitter email = lucier@math.purdue.edu
@ submitter organization = Purdue University
@ computer manufacturer = Apple
@ computer model = PowerMac 6500/225
@ CPU manufacturer = IBM
@ CPU model = PowerPC 603e
@ CPU speed = 225 MHz
@ RAM = 32 MB
@ L2 cache size = 256 Kbytes
@ operating system = MacOS 7.5.5
@ C compiler = Metrowerks Codewarrior Pro 2
@ C compiler flags = all
@ Fortran compiler = NONE
@ Fortran compiler flags = NONE
@ remarks = Virtual Memory off, Extensions off, no other processes
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log

Benchmarking for sizes:
2 (0.000228882 MB)
4 (0.000534058 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)

Maximum array size = 144144

Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Beauregard
5. Bergland
6. CWP (min N)
7. CWP (best N)
8. Edelblute
9. FFTPACK (f2c)
10. FFTW
11. FFTW_ESTIMATE
12. Frigo-old
13. Green
14. GSL
15. GSL DIT
16. GSL DIF
17. Krukar
18. Mayer (Buneman)
19. Mayer (simple)
20. Mayer (lookup)
21. NAPACK (f2c)
22. Nielsen
23. NR (C)
24. Ooura (C)
25. QFT
26. Ransom
27. Singleton (f2c)
28. Temperton (f2c)
29. Valkenburg

Computing normalized averages (30 transforms).

Benchmarking for array size = 2 (power of 2):
0. Arndt DIF: elapsed time t=1.25176 s, 2097152 iters, t-(init.)=1.04306 s t(norm)=0.248684, mflops=20.1058 (err=1.7e-17)
1. Arndt DIT: elapsed time t=1.31808 s, 2097152 iters, t-(init.)=1.10947 s t(norm)=0.264519, mflops=18.9023 (err=1.7e-17)
2.
Arndt Split-Radix: elapsed time t=1.75488 s, 2097152 iters, t-(init.)=1.54625 s t(norm)=0.368656, mflops=13.5628 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.5359 s, 131072 iters, t-(init.)=1.52285 s t(norm)=5.80922, mflops=0.860701 (err=1.7e-17) 4. Beauregard: elapsed time t=1.87362 s, 524288 iters, t-(init.)=1.82144 s t(norm)=1.73706, mflops=2.87843 (err=1.7e-17) 5. Bergland: elapsed time t=1.3401 s, 524288 iters, t-(init.)=1.2856 s t(norm)=1.22604, mflops=4.07816 (err=1.7e-17) 6. CWP (min N): elapsed time t=1.44697 s, 524288 iters, t-(init.)=1.39482 s t(norm)=1.3302, mflops=3.75882 7. CWP (best N) (N=3): elapsed time t=1.56104 s, 524288 iters, t-(init.)=1.49708 s t(norm)=1.42773, mflops=3.50207 8. Skipping fft (Edelblute can't handle N <= 2). 9. FFTPACK (f2c): elapsed time t=1.97947 s, 1048576 iters, t-(init.)=1.87504 s t(norm)=0.894086, mflops=5.5923 (err=1.7e-17) FFTW_MEASURE plan: (cost = 5.675125e-07) FFTW_NOTW 2 10. FFTW: elapsed time t=1.26205 s, 2097152 iters, t-(init.)=1.05363 s t(norm)=0.251205, mflops=19.904 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 11. FFTW_ESTIMATE: elapsed time t=1.26213 s, 2097152 iters, t-(init.)=1.0536 s t(norm)=0.251199, mflops=19.9046 (err=1.7e-17) 12. Frigo-old: elapsed time t=1.83955 s, 4194304 iters, t-(init.)=1.42238 s t(norm)=0.169561, mflops=29.488 (err=1.7e-17) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=1.41386 s, 1048576 iters, t-(init.)=1.3095 s t(norm)=0.624419, mflops=8.00744 (err=1.7e-17) 15. GSL DIT: elapsed time t=1.13658 s, 524288 iters, t-(init.)=1.08452 s t(norm)=1.03428, mflops=4.83427 (err=1.7e-17) 16. GSL DIF: elapsed time t=1.17891 s, 524288 iters, t-(init.)=1.12641 s t(norm)=1.07423, mflops=4.6545 (err=1.7e-17) 17. Krukar: elapsed time t=1.32767 s, 2097152 iters, t-(init.)=1.11909 s t(norm)=0.266811, mflops=18.7398 (err=1.7e-17) 18. Skipping fft (Mayer can't handle N <= 2). 19. Skipping fft (Mayer can't handle N <= 2). 20. 
Skipping fft (Mayer can't handle N <= 2). 21. NAPACK (f2c): elapsed time t=1.33645 s, 262144 iters, t-(init.)=1.31041 s t(norm)=2.49942, mflops=2.00047 (err=1.7e-17) 22. Nielsen: elapsed time t=1.78209 s, 262144 iters, t-(init.)=1.75588 s t(norm)=3.34908, mflops=1.49295 (err=1.7e-17) 23. NR (C): elapsed time t=1.05384 s, 524288 iters, t-(init.)=1.00158 s t(norm)=0.95518, mflops=5.23461 (err=1.7e-17) 24. Ooura (C): elapsed time t=1.52699 s, 2097152 iters, t-(init.)=1.31846 s t(norm)=0.314346, mflops=15.906 (err=1.7e-17) 25. Skipping fft (QFT requires N >= 16). 26. Skipping fft (Ransom doesn't work for N=2). 27. Singleton (f2c): elapsed time t=1.35352 s, 524288 iters, t-(init.)=1.30135 s t(norm)=1.24107, mflops=4.0288 (err=1.7e-17) 28. Temperton (f2c): elapsed time t=1.10446 s, 262144 iters, t-(init.)=1.0783 s t(norm)=2.05668, mflops=2.4311 (err=1.7e-17) 29. Valkenburg: elapsed time t=1.98766 s, 1048576 iters, t-(init.)=1.88319 s t(norm)=0.897974, mflops=5.56809 (err=1.7e-17) Top mflops for N=2 = 29.488 Normalized results and averages for N=2: fft 0: mflops = 20.1058 (norm. = 0.681831), norm. avg. (of 1) = 0.681831 fft 1: mflops = 18.9023 (norm. = 0.641016), norm. avg. (of 1) = 0.641016 fft 2: mflops = 13.5628 (norm. = 0.459943), norm. avg. (of 1) = 0.459943 fft 3: mflops = 0.860701 (norm. = 0.0291882), norm. avg. (of 1) = 0.0291882 fft 4: mflops = 2.87843 (norm. = 0.0976138), norm. avg. (of 1) = 0.0976138 fft 5: mflops = 4.07816 (norm. = 0.138299), norm. avg. (of 1) = 0.138299 fft 6: mflops = 3.75882 (norm. = 0.12747), norm. avg. (of 1) = 0.12747 fft 7: mflops = 3.50207 (norm. = 0.118763), norm. avg. (of 1) = 0.118763 fft 8: mflops = -1 (norm. = -0.0339121), norm. avg. (of 0) = -1 fft 9: mflops = 5.5923 (norm. = 0.189647), norm. avg. (of 1) = 0.189647 fft 10: mflops = 19.904 (norm. = 0.674989), norm. avg. (of 1) = 0.674989 fft 11: mflops = 19.9046 (norm. = 0.675006), norm. avg. (of 1) = 0.675006 fft 12: mflops = 29.488 (norm. = 1), norm. avg. 
(of 1) = 1 fft 13: mflops = -1 (norm. = -0.0339121), norm. avg. (of 0) = -1 fft 14: mflops = 8.00744 (norm. = 0.271549), norm. avg. (of 1) = 0.271549 fft 15: mflops = 4.83427 (norm. = 0.163941), norm. avg. (of 1) = 0.163941 fft 16: mflops = 4.6545 (norm. = 0.157844), norm. avg. (of 1) = 0.157844 fft 17: mflops = 18.7398 (norm. = 0.635508), norm. avg. (of 1) = 0.635508 fft 18: mflops = -1 (norm. = -0.0339121), norm. avg. (of 0) = -1 fft 19: mflops = -1 (norm. = -0.0339121), norm. avg. (of 0) = -1 fft 20: mflops = -1 (norm. = -0.0339121), norm. avg. (of 0) = -1 fft 21: mflops = 2.00047 (norm. = 0.0678401), norm. avg. (of 1) = 0.0678401 fft 22: mflops = 1.49295 (norm. = 0.0506291), norm. avg. (of 1) = 0.0506291 fft 23: mflops = 5.23461 (norm. = 0.177517), norm. avg. (of 1) = 0.177517 fft 24: mflops = 15.906 (norm. = 0.539408), norm. avg. (of 1) = 0.539408 fft 25: mflops = -1 (norm. = -0.0339121), norm. avg. (of 0) = -1 fft 26: mflops = -1 (norm. = -0.0339121), norm. avg. (of 0) = -1 fft 27: mflops = 4.0288 (norm. = 0.136625), norm. avg. (of 1) = 0.136625 fft 28: mflops = 2.4311 (norm. = 0.0824437), norm. avg. (of 1) = 0.0824437 fft 29: mflops = 5.56809 (norm. = 0.188826), norm. avg. (of 1) = 0.188826 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.10526 s, 1048576 iters, t-(init.)=0.953545 s t(norm)=0.113671, mflops=43.9864 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.14759 s, 1048576 iters, t-(init.)=0.995832 s t(norm)=0.118712, mflops=42.1186 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.12697 s, 524288 iters, t-(init.)=1.05102 s t(norm)=0.250582, mflops=19.9536 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.43808 s, 131072 iters, t-(init.)=1.41911 s t(norm)=1.35336, mflops=3.6945 (err=1.3e-16) 4. Beauregard: elapsed time t=1.68593 s, 262144 iters, t-(init.)=1.64829 s t(norm)=0.785964, mflops=6.36162 (err=5.3e-17) 5. 
Bergland: elapsed time t=1.51938 s, 524288 iters, t-(init.)=1.441 s t(norm)=0.343562, mflops=14.5534 (err=5.3e-17) 6. CWP (min N): elapsed time t=1.58755 s, 524288 iters, t-(init.)=1.51177 s t(norm)=0.360434, mflops=13.8722 7. CWP (best N) (N=15): elapsed time t=1.68533 s, 262144 iters, t-(init.)=1.58205 s t(norm)=0.75438, mflops=6.62796 8. Edelblute: elapsed time t=1.2505 s, 524288 iters, t-(init.)=1.17209 s t(norm)=0.279447, mflops=17.8925 (err=1.3e-16) 9. FFTPACK (f2c): elapsed time t=1.50712 s, 524288 iters, t-(init.)=1.43115 s t(norm)=0.341213, mflops=14.6536 (err=5.3e-17) FFTW_MEASURE plan: (cost = 7.317734e-07) FFTW_NOTW 4 10. FFTW: elapsed time t=1.6222 s, 2097152 iters, t-(init.)=1.31871 s t(norm)=0.0786012, mflops=63.6122 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 11. FFTW_ESTIMATE: elapsed time t=1.62237 s, 2097152 iters, t-(init.)=1.31893 s t(norm)=0.0786144, mflops=63.6016 (err=5.3e-17) 12. Frigo-old: elapsed time t=1.22355 s, 2097152 iters, t-(init.)=0.920115 s t(norm)=0.0548431, mflops=91.1691 (err=5.3e-17) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=1.06768 s, 524288 iters, t-(init.)=0.991805 s t(norm)=0.236465, mflops=21.1448 (err=5.3e-17) 15. GSL DIT: elapsed time t=1.11186 s, 262144 iters, t-(init.)=1.07381 s t(norm)=0.512032, mflops=9.76501 (err=6.4e-17) 16. GSL DIF: elapsed time t=1.15657 s, 262144 iters, t-(init.)=1.11858 s t(norm)=0.533379, mflops=9.3742 (err=6.4e-17) 17. Krukar: elapsed time t=1.92535 s, 2097152 iters, t-(init.)=1.6218 s t(norm)=0.0966668, mflops=51.7241 (err=5.3e-17) 18. Mayer (Buneman): elapsed time t=1.03645 s, 524288 iters, t-(init.)=0.95802 s t(norm)=0.22841, mflops=21.8905 (err=1.3e-16) 19. Mayer (simple): elapsed time t=1.94489 s, 1048576 iters, t-(init.)=1.7933 s t(norm)=0.213778, mflops=23.3887 20. Mayer (lookup): elapsed time t=1.07703 s, 524288 iters, t-(init.)=1.00114 s t(norm)=0.238691, mflops=20.9476 (err=1.3e-16) 21. 
NAPACK (f2c): elapsed time t=1.17384 s, 131072 iters, t-(init.)=1.15487 s t(norm)=1.10137, mflops=4.5398 (err=5.3e-17) 22. Nielsen: elapsed time t=1.91424 s, 262144 iters, t-(init.)=1.87613 s t(norm)=0.89461, mflops=5.58903 (err=1.3e-16) 23. NR (C): elapsed time t=1.08833 s, 262144 iters, t-(init.)=1.05041 s t(norm)=0.500873, mflops=9.98258 (err=6.4e-17) 24. Ooura (C): elapsed time t=1.33805 s, 1048576 iters, t-(init.)=1.18637 s t(norm)=0.141426, mflops=35.3541 (err=5.3e-17) 25. Skipping fft (QFT requires N >= 16). 26. Ransom: elapsed time t=1.70514 s, 131072 iters, t-(init.)=1.68559 s t(norm)=1.60751, mflops=3.1104 (err=2.4e-16) 27. Singleton (f2c): elapsed time t=1.5939 s, 524288 iters, t-(init.)=1.51812 s t(norm)=0.361949, mflops=13.8141 (err=5.3e-17) 28. Temperton (f2c): elapsed time t=1.35667 s, 262144 iters, t-(init.)=1.31873 s t(norm)=0.628819, mflops=7.95141 (err=5.3e-17) 29. Valkenburg: elapsed time t=1.77121 s, 262144 iters, t-(init.)=1.73319 s t(norm)=0.826448, mflops=6.04999 (err=5.3e-17) Top mflops for N=4 = 91.1691 Normalized results and averages for N=4: fft 0: mflops = 43.9864 (norm. = 0.482471), norm. avg. (of 2) = 0.582151 fft 1: mflops = 42.1186 (norm. = 0.461983), norm. avg. (of 2) = 0.551499 fft 2: mflops = 19.9536 (norm. = 0.218863), norm. avg. (of 2) = 0.339403 fft 3: mflops = 3.6945 (norm. = 0.0405236), norm. avg. (of 2) = 0.0348559 fft 4: mflops = 6.36162 (norm. = 0.0697782), norm. avg. (of 2) = 0.083696 fft 5: mflops = 14.5534 (norm. = 0.159631), norm. avg. (of 2) = 0.148965 fft 6: mflops = 13.8722 (norm. = 0.152159), norm. avg. (of 2) = 0.139814 fft 7: mflops = 6.62796 (norm. = 0.0726996), norm. avg. (of 2) = 0.0957312 fft 8: mflops = 17.8925 (norm. = 0.196256), norm. avg. (of 1) = 0.196256 fft 9: mflops = 14.6536 (norm. = 0.16073), norm. avg. (of 2) = 0.175188 fft 10: mflops = 63.6122 (norm. = 0.697739), norm. avg. (of 2) = 0.686364 fft 11: mflops = 63.6016 (norm. = 0.697622), norm. avg. (of 2) = 0.686314 fft 12: mflops = 91.1691 (norm. 
= 1), norm. avg. (of 2) = 1 fft 13: mflops = -1 (norm. = -0.0109686), norm. avg. (of 0) = -1 fft 14: mflops = 21.1448 (norm. = 0.231929), norm. avg. (of 2) = 0.251739 fft 15: mflops = 9.76501 (norm. = 0.107109), norm. avg. (of 2) = 0.135525 fft 16: mflops = 9.3742 (norm. = 0.102822), norm. avg. (of 2) = 0.130333 fft 17: mflops = 51.7241 (norm. = 0.567342), norm. avg. (of 2) = 0.601425 fft 18: mflops = 21.8905 (norm. = 0.240109), norm. avg. (of 1) = 0.240109 fft 19: mflops = 23.3887 (norm. = 0.256542), norm. avg. (of 1) = 0.256542 fft 20: mflops = 20.9476 (norm. = 0.229766), norm. avg. (of 1) = 0.229766 fft 21: mflops = 4.5398 (norm. = 0.0497954), norm. avg. (of 2) = 0.0588178 fft 22: mflops = 5.58903 (norm. = 0.061304), norm. avg. (of 2) = 0.0559665 fft 23: mflops = 9.98258 (norm. = 0.109495), norm. avg. (of 2) = 0.143506 fft 24: mflops = 35.3541 (norm. = 0.387786), norm. avg. (of 2) = 0.463597 fft 25: mflops = -1 (norm. = -0.0109686), norm. avg. (of 0) = -1 fft 26: mflops = 3.1104 (norm. = 0.0341169), norm. avg. (of 1) = 0.0341169 fft 27: mflops = 13.8141 (norm. = 0.151522), norm. avg. (of 2) = 0.144073 fft 28: mflops = 7.95141 (norm. = 0.0872161), norm. avg. (of 2) = 0.0848299 fft 29: mflops = 6.04999 (norm. = 0.06636), norm. avg. (of 2) = 0.127593 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.05085 s, 524288 iters, t-(init.)=0.927383 s t(norm)=0.0737018, mflops=67.841 (err=1.1e-16) 1. Arndt DIT: elapsed time t=1.07696 s, 524288 iters, t-(init.)=0.953583 s t(norm)=0.075784, mflops=65.977 (err=1.1e-16) 2. Arndt Split-Radix: elapsed time t=1.36801 s, 262144 iters, t-(init.)=1.30636 s t(norm)=0.207641, mflops=24.08 (err=7.7e-17) 3. Arndt 4-step: elapsed time t=1.55294 s, 65536 iters, t-(init.)=1.53747 s t(norm)=0.977498, mflops=5.1151 (err=9.0e-17) 4. Beauregard: elapsed time t=1.87773 s, 131072 iters, t-(init.)=1.84688 s t(norm)=0.587109, mflops=8.51631 (err=1.5e-16) 5. 
Bergland: elapsed time t=1.38364 s, 262144 iters, t-(init.)=1.32066 s t(norm)=0.209914, mflops=23.8193 (err=1.6e-16) 6. CWP (min N): elapsed time t=1.05053 s, 262144 iters, t-(init.)=0.988791 s t(norm)=0.157164, mflops=31.8139 7. CWP (best N) (N=15): elapsed time t=1.68549 s, 262144 iters, t-(init.)=1.58217 s t(norm)=0.25148, mflops=19.8823 8. Edelblute: elapsed time t=1.7656 s, 262144 iters, t-(init.)=1.7027 s t(norm)=0.270637, mflops=18.4749 (err=8.3e-17) 9. FFTPACK (f2c): elapsed time t=1.53687 s, 262144 iters, t-(init.)=1.47532 s t(norm)=0.234496, mflops=21.3223 (err=1.5e-16) FFTW_MEASURE plan: (cost = 1.334137e-06) FFTW_NOTW 8 10. FFTW: elapsed time t=1.45266 s, 1048576 iters, t-(init.)=1.20615 s t(norm)=0.0479281, mflops=104.323 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 11. FFTW_ESTIMATE: elapsed time t=1.45226 s, 1048576 iters, t-(init.)=1.20558 s t(norm)=0.0479056, mflops=104.372 (err=1.4e-16) 12. Frigo-old: elapsed time t=1.16202 s, 1048576 iters, t-(init.)=0.915587 s t(norm)=0.0363822, mflops=137.43 (err=1.4e-16) 13. Green: elapsed time t=1.28914 s, 524288 iters, t-(init.)=1.16576 s t(norm)=0.0926464, mflops=53.9686 (err=1.4e-16) 14. GSL: elapsed time t=1.02977 s, 262144 iters, t-(init.)=0.968075 s t(norm)=0.153871, mflops=32.4947 (err=1.4e-16) 15. GSL DIT: elapsed time t=1.91966 s, 262144 iters, t-(init.)=1.85802 s t(norm)=0.295324, mflops=16.9305 (err=1.5e-16) 16. GSL DIF: elapsed time t=1.97718 s, 262144 iters, t-(init.)=1.91555 s t(norm)=0.304469, mflops=16.4221 (err=1.6e-16) 17. Krukar: elapsed time t=1.98873 s, 1048576 iters, t-(init.)=1.74216 s t(norm)=0.0692271, mflops=72.2261 (err=1.5e-16) 18. Mayer (Buneman): elapsed time t=1.85775 s, 524288 iters, t-(init.)=1.73203 s t(norm)=0.137649, mflops=36.3242 (err=1.1e-16) 19. Mayer (simple): elapsed time t=1.77367 s, 524288 iters, t-(init.)=1.65023 s t(norm)=0.131148, mflops=38.1247 20. 
Mayer (lookup): elapsed time t=1.8697 s, 524288 iters, t-(init.)=1.74654 s t(norm)=0.138803, mflops=36.0223 (err=1.1e-16) 21. NAPACK (f2c): elapsed time t=1.09757 s, 65536 iters, t-(init.)=1.0821 s t(norm)=0.687978, mflops=7.26767 (err=1.7e-16) 22. Nielsen: elapsed time t=1.23823 s, 131072 iters, t-(init.)=1.20737 s t(norm)=0.383814, mflops=13.0272 (err=7.5e-16) 23. NR (C): elapsed time t=1.90457 s, 262144 iters, t-(init.)=1.84279 s t(norm)=0.292903, mflops=17.0705 (err=1.6e-16) 24. Ooura (C): elapsed time t=1.12243 s, 524288 iters, t-(init.)=0.999052 s t(norm)=0.0793975, mflops=62.9743 (err=1.5e-16) 25. Skipping fft (QFT requires N >= 16). 26. Ransom: elapsed time t=1.02495 s, 32768 iters, t-(init.)=1.01701 s t(norm)=1.29319, mflops=3.86641 (err=3.1e-16) 27. Singleton (f2c): elapsed time t=1.04363 s, 131072 iters, t-(init.)=1.01276 s t(norm)=0.321946, mflops=15.5305 (err=1.4e-16) 28. Temperton (f2c): elapsed time t=1.42255 s, 131072 iters, t-(init.)=1.3916 s t(norm)=0.442376, mflops=11.3026 (err=1.4e-16) 29. Valkenburg: elapsed time t=1.23874 s, 65536 iters, t-(init.)=1.22327 s t(norm)=0.777735, mflops=6.42893 (err=1.4e-16) Top mflops for N=8 = 137.43 Normalized results and averages for N=8: fft 0: mflops = 67.841 (norm. = 0.49364), norm. avg. (of 3) = 0.552647 fft 1: mflops = 65.977 (norm. = 0.480077), norm. avg. (of 3) = 0.527692 fft 2: mflops = 24.08 (norm. = 0.175217), norm. avg. (of 3) = 0.284674 fft 3: mflops = 5.1151 (norm. = 0.0372197), norm. avg. (of 3) = 0.0356438 fft 4: mflops = 8.51631 (norm. = 0.0619684), norm. avg. (of 3) = 0.0764534 fft 5: mflops = 23.8193 (norm. = 0.173319), norm. avg. (of 3) = 0.157083 fft 6: mflops = 31.8139 (norm. = 0.231492), norm. avg. (of 3) = 0.170373 fft 7: mflops = 19.8823 (norm. = 0.144672), norm. avg. (of 3) = 0.112045 fft 8: mflops = 18.4749 (norm. = 0.134431), norm. avg. (of 2) = 0.165344 fft 9: mflops = 21.3223 (norm. = 0.15515), norm. avg. (of 3) = 0.168509 fft 10: mflops = 104.323 (norm. = 0.759098), norm. avg. 
(of 3) = 0.710609 fft 11: mflops = 104.372 (norm. = 0.759456), norm. avg. (of 3) = 0.710695 fft 12: mflops = 137.43 (norm. = 1), norm. avg. (of 3) = 1 fft 13: mflops = 53.9686 (norm. = 0.392699), norm. avg. (of 1) = 0.392699 fft 14: mflops = 32.4947 (norm. = 0.236445), norm. avg. (of 3) = 0.246641 fft 15: mflops = 16.9305 (norm. = 0.123194), norm. avg. (of 3) = 0.131414 fft 16: mflops = 16.4221 (norm. = 0.119494), norm. avg. (of 3) = 0.12672 fft 17: mflops = 72.2261 (norm. = 0.525548), norm. avg. (of 3) = 0.576132 fft 18: mflops = 36.3242 (norm. = 0.264311), norm. avg. (of 2) = 0.25221 fft 19: mflops = 38.1247 (norm. = 0.277412), norm. avg. (of 2) = 0.266977 fft 20: mflops = 36.0223 (norm. = 0.262114), norm. avg. (of 2) = 0.24594 fft 21: mflops = 7.26767 (norm. = 0.0528827), norm. avg. (of 3) = 0.0568394 fft 22: mflops = 13.0272 (norm. = 0.0947912), norm. avg. (of 3) = 0.0689081 fft 23: mflops = 17.0705 (norm. = 0.124212), norm. avg. (of 3) = 0.137075 fft 24: mflops = 62.9743 (norm. = 0.458228), norm. avg. (of 3) = 0.461807 fft 25: mflops = -1 (norm. = -0.00727643), norm. avg. (of 0) = -1 fft 26: mflops = 3.86641 (norm. = 0.0281337), norm. avg. (of 2) = 0.0311253 fft 27: mflops = 15.5305 (norm. = 0.113007), norm. avg. (of 3) = 0.133718 fft 28: mflops = 11.3026 (norm. = 0.0822425), norm. avg. (of 3) = 0.0839674 fft 29: mflops = 6.42893 (norm. = 0.0467796), norm. avg. (of 3) = 0.100655 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.08082 s, 131072 iters, t-(init.)=1.02633 s t(norm)=0.122348, mflops=40.8671 (err=1.9e-16) 1. Arndt DIT: elapsed time t=1.10396 s, 131072 iters, t-(init.)=1.0494 s t(norm)=0.125098, mflops=39.9687 (err=1.9e-16) 2. Arndt Split-Radix: elapsed time t=1.5484 s, 131072 iters, t-(init.)=1.49391 s t(norm)=0.178088, mflops=28.076 (err=1.5e-16) 3. Arndt 4-step: elapsed time t=1.13579 s, 32768 iters, t-(init.)=1.12192 s t(norm)=0.534973, mflops=9.34626 (err=2.0e-16) 4. 
Beauregard: elapsed time t=1.1304 s, 32768 iters, t-(init.)=1.11668 s t(norm)=0.532473, mflops=9.39015 (err=2.3e-16) 5. Bergland: elapsed time t=1.16671 s, 131072 iters, t-(init.)=1.11164 s t(norm)=0.132517, mflops=37.7309 (err=2.6e-16) 6. CWP (min N): elapsed time t=1.68834 s, 262144 iters, t-(init.)=1.57907 s t(norm)=0.0941198, mflops=53.1238 7. CWP (best N) (N=28): elapsed time t=1.22663 s, 131072 iters, t-(init.)=1.13648 s t(norm)=0.135479, mflops=36.906 8. Edelblute: elapsed time t=1.06781 s, 65536 iters, t-(init.)=1.04015 s t(norm)=0.247992, mflops=20.162 (err=1.6e-16) 9. FFTPACK (f2c): elapsed time t=1.45138 s, 131072 iters, t-(init.)=1.39671 s t(norm)=0.166501, mflops=30.0298 (err=2.1e-16) FFTW_MEASURE plan: (cost = 2.658325e-06) FFTW_NOTW 16 10. FFTW: elapsed time t=1.4307 s, 524288 iters, t-(init.)=1.21242 s t(norm)=0.036133, mflops=138.378 (err=2.2e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 11. FFTW_ESTIMATE: elapsed time t=1.43089 s, 524288 iters, t-(init.)=1.21297 s t(norm)=0.0361492, mflops=138.316 (err=2.2e-16) 12. Frigo-old: elapsed time t=1.33641 s, 524288 iters, t-(init.)=1.11839 s t(norm)=0.0333305, mflops=150.013 (err=2.2e-16) 13. Green: elapsed time t=1.22362 s, 262144 iters, t-(init.)=1.11468 s t(norm)=0.0664402, mflops=75.2557 (err=2.6e-16) 14. GSL: elapsed time t=1.8667 s, 262144 iters, t-(init.)=1.75753 s t(norm)=0.104757, mflops=47.7296 (err=2.1e-16) 15. GSL DIT: elapsed time t=1.7291 s, 131072 iters, t-(init.)=1.67463 s t(norm)=0.199631, mflops=25.0462 (err=3.1e-16) 16. GSL DIF: elapsed time t=1.7449 s, 131072 iters, t-(init.)=1.69041 s t(norm)=0.201513, mflops=24.8123 (err=2.5e-16) 17. Krukar: elapsed time t=1.10969 s, 262144 iters, t-(init.)=1.0006 s t(norm)=0.0596406, mflops=83.8355 (err=1.7e-16) 18. Mayer (Buneman): elapsed time t=1.18116 s, 131072 iters, t-(init.)=1.12609 s t(norm)=0.13424, mflops=37.2467 (err=2.3e-16) 19. 
Mayer (simple): elapsed time t=1.93631 s, 262144 iters, t-(init.)=1.8274 s t(norm)=0.108922, mflops=45.9046 20. Mayer (lookup): elapsed time t=1.96287 s, 262144 iters, t-(init.)=1.85377 s t(norm)=0.110494, mflops=45.2515 (err=2.1e-16) 21. NAPACK (f2c): elapsed time t=1.00195 s, 32768 iters, t-(init.)=0.988221 s t(norm)=0.47122, mflops=10.6107 (err=2.7e-16) 22. Nielsen: elapsed time t=1.56112 s, 65536 iters, t-(init.)=1.53392 s t(norm)=0.365716, mflops=13.6718 (err=1.8e-16) 23. NR (C): elapsed time t=1.68285 s, 131072 iters, t-(init.)=1.62789 s t(norm)=0.194059, mflops=25.7653 (err=2.9e-16) 24. Ooura (C): elapsed time t=1.07412 s, 262144 iters, t-(init.)=0.965213 s t(norm)=0.0575312, mflops=86.9094 (err=2.5e-16) 25. QFT: elapsed time t=1.83142 s, 262144 iters, t-(init.)=1.72233 s t(norm)=0.102659, mflops=48.7049 (err=1.4e-16) 26. Ransom: elapsed time t=1.76844 s, 65536 iters, t-(init.)=1.74084 s t(norm)=0.415049, mflops=12.0468 (err=5.0e-16) 27. Singleton (f2c): elapsed time t=1.0756 s, 131072 iters, t-(init.)=1.02109 s t(norm)=0.121723, mflops=41.0768 (err=2.0e-16) 28. Temperton (f2c): elapsed time t=1.32157 s, 65536 iters, t-(init.)=1.2942 s t(norm)=0.308562, mflops=16.2042 (err=2.1e-16) 29. Valkenburg: elapsed time t=1.57272 s, 32768 iters, t-(init.)=1.55882 s t(norm)=0.743303, mflops=6.72673 (err=2.5e-16) Top mflops for N=16 = 150.013 Normalized results and averages for N=16: fft 0: mflops = 40.8671 (norm. = 0.272424), norm. avg. (of 4) = 0.482591 fft 1: mflops = 39.9687 (norm. = 0.266435), norm. avg. (of 4) = 0.462378 fft 2: mflops = 28.076 (norm. = 0.187157), norm. avg. (of 4) = 0.260295 fft 3: mflops = 9.34626 (norm. = 0.0623031), norm. avg. (of 4) = 0.0423086 fft 4: mflops = 9.39015 (norm. = 0.0625957), norm. avg. (of 4) = 0.072989 fft 5: mflops = 37.7309 (norm. = 0.251518), norm. avg. (of 4) = 0.180692 fft 6: mflops = 53.1238 (norm. = 0.354128), norm. avg. (of 4) = 0.216312 fft 7: mflops = 36.906 (norm. = 0.246019), norm. avg. 
(of 4) = 0.145538 fft 8: mflops = 20.162 (norm. = 0.134402), norm. avg. (of 3) = 0.15503 fft 9: mflops = 30.0298 (norm. = 0.200182), norm. avg. (of 4) = 0.176427 fft 10: mflops = 138.378 (norm. = 0.92244), norm. avg. (of 4) = 0.763566 fft 11: mflops = 138.316 (norm. = 0.922026), norm. avg. (of 4) = 0.763527 fft 12: mflops = 150.013 (norm. = 1), norm. avg. (of 4) = 1 fft 13: mflops = 75.2557 (norm. = 0.501662), norm. avg. (of 2) = 0.44718 fft 14: mflops = 47.7296 (norm. = 0.318171), norm. avg. (of 4) = 0.264524 fft 15: mflops = 25.0462 (norm. = 0.16696), norm. avg. (of 4) = 0.140301 fft 16: mflops = 24.8123 (norm. = 0.165401), norm. avg. (of 4) = 0.13639 fft 17: mflops = 83.8355 (norm. = 0.558856), norm. avg. (of 4) = 0.571813 fft 18: mflops = 37.2467 (norm. = 0.24829), norm. avg. (of 3) = 0.250903 fft 19: mflops = 45.9046 (norm. = 0.306005), norm. avg. (of 3) = 0.279986 fft 20: mflops = 45.2515 (norm. = 0.301651), norm. avg. (of 3) = 0.26451 fft 21: mflops = 10.6107 (norm. = 0.0707323), norm. avg. (of 4) = 0.0603126 fft 22: mflops = 13.6718 (norm. = 0.0911378), norm. avg. (of 4) = 0.0744655 fft 23: mflops = 25.7653 (norm. = 0.171754), norm. avg. (of 4) = 0.145745 fft 24: mflops = 86.9094 (norm. = 0.579347), norm. avg. (of 4) = 0.491192 fft 25: mflops = 48.7049 (norm. = 0.324672), norm. avg. (of 1) = 0.324672 fft 26: mflops = 12.0468 (norm. = 0.0803051), norm. avg. (of 3) = 0.0475185 fft 27: mflops = 41.0768 (norm. = 0.273822), norm. avg. (of 4) = 0.168744 fft 28: mflops = 16.2042 (norm. = 0.108019), norm. avg. (of 4) = 0.0899803 fft 29: mflops = 6.72673 (norm. = 0.044841), norm. avg. (of 4) = 0.0867016 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.14799 s, 65536 iters, t-(init.)=1.09699 s t(norm)=0.104618, mflops=47.7931 (err=2.4e-16) 1. Arndt DIT: elapsed time t=1.16384 s, 65536 iters, t-(init.)=1.11284 s t(norm)=0.106128, mflops=47.1127 (err=2.7e-16) 2. 
Arndt Split-Radix: elapsed time t=1.69631 s, 65536 iters, t-(init.)=1.64516 s t(norm)=0.156895, mflops=31.8685 (err=3.0e-16) 3. Arndt 4-step: elapsed time t=1.17613 s, 16384 iters, t-(init.)=1.16327 s t(norm)=0.443754, mflops=11.2675 (err=2.4e-16) 4. Beauregard: elapsed time t=1.37067 s, 16384 iters, t-(init.)=1.3579 s t(norm)=0.517997, mflops=9.65256 (err=2.5e-16) 5. Bergland: elapsed time t=1.02674 s, 65536 iters, t-(init.)=0.975452 s t(norm)=0.0930264, mflops=53.7482 (err=2.6e-16) 6. CWP (min N) (N=33): elapsed time t=1.95644 s, 131072 iters, t-(init.)=1.85181 s t(norm)=0.088301, mflops=56.6245 7. CWP (best N) (N=35): elapsed time t=1.64756 s, 131072 iters, t-(init.)=1.53671 s t(norm)=0.073276, mflops=68.2351 8. Edelblute: elapsed time t=1.20373 s, 32768 iters, t-(init.)=1.17796 s t(norm)=0.224678, mflops=22.2541 (err=2.9e-16) 9. FFTPACK (f2c): elapsed time t=1.03677 s, 32768 iters, t-(init.)=1.01122 s t(norm)=0.192876, mflops=25.9234 (err=2.3e-16) FFTW_MEASURE plan: (cost = 5.997070e-06) FFTW_NOTW 32 10. FFTW: elapsed time t=1.59805 s, 262144 iters, t-(init.)=1.3939 s t(norm)=0.0332332, mflops=150.452 (err=2.4e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.59812 s, 262144 iters, t-(init.)=1.39424 s t(norm)=0.0332413, mflops=150.415 (err=2.4e-16) 12. Frigo-old: elapsed time t=1.54704 s, 262144 iters, t-(init.)=1.34334 s t(norm)=0.0320278, mflops=156.115 (err=2.1e-16) 13. Green: elapsed time t=1.14625 s, 131072 iters, t-(init.)=1.04431 s t(norm)=0.0497966, mflops=100.408 (err=2.4e-16) 14. GSL: elapsed time t=1.20507 s, 65536 iters, t-(init.)=1.15408 s t(norm)=0.110062, mflops=45.4291 (err=2.3e-16) 15. GSL DIT: elapsed time t=1.65015 s, 65536 iters, t-(init.)=1.59916 s t(norm)=0.152508, mflops=32.7852 (err=3.1e-16) 16. GSL DIF: elapsed time t=1.62432 s, 65536 iters, t-(init.)=1.57316 s t(norm)=0.150029, mflops=33.327 (err=3.2e-16) 17. 
Krukar: elapsed time t=1.39429 s, 131072 iters, t-(init.)=1.29223 s t(norm)=0.0616183, mflops=81.1447 (err=2.7e-16) 18. Mayer (Buneman): elapsed time t=1.29559 s, 65536 iters, t-(init.)=1.24421 s t(norm)=0.118657, mflops=42.1382 (err=2.8e-16) 19. Mayer (simple): elapsed time t=1.00916 s, 65536 iters, t-(init.)=0.95811 s t(norm)=0.0913725, mflops=54.7211 20. Mayer (lookup): elapsed time t=1.00063 s, 65536 iters, t-(init.)=0.949526 s t(norm)=0.0905539, mflops=55.2158 (err=2.6e-16) 21. NAPACK (f2c): elapsed time t=1.04446 s, 16384 iters, t-(init.)=1.0316 s t(norm)=0.393525, mflops=12.7057 (err=6.4e-16) 22. Nielsen: elapsed time t=1.33387 s, 32768 iters, t-(init.)=1.30852 s t(norm)=0.24958, mflops=20.0336 (err=1.1e-15) 23. NR (C): elapsed time t=1.56801 s, 65536 iters, t-(init.)=1.517 s t(norm)=0.144673, mflops=34.5608 (err=2.9e-16) 24. Ooura (C): elapsed time t=1.2134 s, 131072 iters, t-(init.)=1.11153 s t(norm)=0.0530017, mflops=94.3365 (err=2.5e-16) 25. QFT: elapsed time t=1.24867 s, 65536 iters, t-(init.)=1.19766 s t(norm)=0.114218, mflops=43.776 (err=2.8e-16) 26. Ransom: elapsed time t=1.04015 s, 16384 iters, t-(init.)=1.02709 s t(norm)=0.391803, mflops=12.7615 (err=7.4e-16) 27. Singleton (f2c): elapsed time t=1.0449 s, 65536 iters, t-(init.)=0.994036 s t(norm)=0.0947987, mflops=52.7434 (err=2.3e-16) 28. Temperton (f2c): elapsed time t=1.72709 s, 32768 iters, t-(init.)=1.70144 s t(norm)=0.324524, mflops=15.4072 (err=2.6e-16) 29. Valkenburg: elapsed time t=1.89936 s, 16384 iters, t-(init.)=1.88633 s t(norm)=0.71958, mflops=6.9485 (err=2.8e-16) Top mflops for N=32 = 156.115 Normalized results and averages for N=32: fft 0: mflops = 47.7931 (norm. = 0.306141), norm. avg. (of 5) = 0.447301 fft 1: mflops = 47.1127 (norm. = 0.301783), norm. avg. (of 5) = 0.430259 fft 2: mflops = 31.8685 (norm. = 0.204136), norm. avg. (of 5) = 0.249063 fft 3: mflops = 11.2675 (norm. = 0.0721746), norm. avg. (of 5) = 0.0482818 fft 4: mflops = 9.65256 (norm. = 0.06183), norm. avg. 
(of 5) = 0.0707572 fft 5: mflops = 53.7482 (norm. = 0.344287), norm. avg. (of 5) = 0.213411 fft 6: mflops = 56.6245 (norm. = 0.362711), norm. avg. (of 5) = 0.245592 fft 7: mflops = 68.2351 (norm. = 0.437084), norm. avg. (of 5) = 0.203847 fft 8: mflops = 22.2541 (norm. = 0.14255), norm. avg. (of 4) = 0.15191 fft 9: mflops = 25.9234 (norm. = 0.166054), norm. avg. (of 5) = 0.174353 fft 10: mflops = 150.452 (norm. = 0.963727), norm. avg. (of 5) = 0.803598 fft 11: mflops = 150.415 (norm. = 0.963493), norm. avg. (of 5) = 0.803521 fft 12: mflops = 156.115 (norm. = 1), norm. avg. (of 5) = 1 fft 13: mflops = 100.408 (norm. = 0.643171), norm. avg. (of 3) = 0.512511 fft 14: mflops = 45.4291 (norm. = 0.290999), norm. avg. (of 5) = 0.269819 fft 15: mflops = 32.7852 (norm. = 0.210007), norm. avg. (of 5) = 0.154242 fft 16: mflops = 33.327 (norm. = 0.213478), norm. avg. (of 5) = 0.151808 fft 17: mflops = 81.1447 (norm. = 0.519777), norm. avg. (of 5) = 0.561406 fft 18: mflops = 42.1382 (norm. = 0.269919), norm. avg. (of 4) = 0.255657 fft 19: mflops = 54.7211 (norm. = 0.350519), norm. avg. (of 4) = 0.297619 fft 20: mflops = 55.2158 (norm. = 0.353688), norm. avg. (of 4) = 0.286805 fft 21: mflops = 12.7057 (norm. = 0.0813868), norm. avg. (of 5) = 0.0645275 fft 22: mflops = 20.0336 (norm. = 0.128327), norm. avg. (of 5) = 0.0852377 fft 23: mflops = 34.5608 (norm. = 0.221381), norm. avg. (of 5) = 0.160872 fft 24: mflops = 94.3365 (norm. = 0.604278), norm. avg. (of 5) = 0.513809 fft 25: mflops = 43.776 (norm. = 0.280409), norm. avg. (of 2) = 0.302541 fft 26: mflops = 12.7615 (norm. = 0.0817447), norm. avg. (of 4) = 0.0560751 fft 27: mflops = 52.7434 (norm. = 0.33785), norm. avg. (of 5) = 0.202565 fft 28: mflops = 15.4072 (norm. = 0.0986915), norm. avg. (of 5) = 0.0917225 fft 29: mflops = 6.9485 (norm. = 0.044509), norm. avg. (of 5) = 0.0782631 Benchmarking for array size = 64 (power of 2): 0. 
Arndt DIF: elapsed time t=1.34733 s, 32768 iters, t-(init.)=1.29807 s t(norm)=0.103162, mflops=48.4677 (err=5.0e-16) 1. Arndt DIT: elapsed time t=1.38123 s, 32768 iters, t-(init.)=1.33197 s t(norm)=0.105855, mflops=47.2343 (err=4.9e-16) 2. Arndt Split-Radix: elapsed time t=1.81986 s, 32768 iters, t-(init.)=1.77059 s t(norm)=0.140714, mflops=35.5331 (err=4.5e-16) 3. Arndt 4-step: elapsed time t=1.0039 s, 8192 iters, t-(init.)=0.991576 s t(norm)=0.315214, mflops=15.8623 (err=4.9e-16) 4. Beauregard: elapsed time t=1.63598 s, 8192 iters, t-(init.)=1.62356 s t(norm)=0.516115, mflops=9.68777 (err=4.5e-16) 5. Bergland: elapsed time t=1.04867 s, 32768 iters, t-(init.)=0.999319 s t(norm)=0.0794187, mflops=62.9574 (err=5.5e-16) 6. CWP (min N) (N=65): elapsed time t=1.0974 s, 32768 iters, t-(init.)=1.0474 s t(norm)=0.0832399, mflops=60.0674 7. CWP (best N) (N=84): elapsed time t=1.77817 s, 65536 iters, t-(init.)=1.65014 s t(norm)=0.0655706, mflops=76.2537 8. Edelblute: elapsed time t=1.29994 s, 16384 iters, t-(init.)=1.27519 s t(norm)=0.202686, mflops=24.6687 (err=4.6e-16) 9. FFTPACK (f2c): elapsed time t=1.07283 s, 16384 iters, t-(init.)=1.04823 s t(norm)=0.166612, mflops=30.0099 (err=4.4e-16) FFTW_MEASURE plan: (cost = 1.332715e-05) FFTW_NOTW 64 10. FFTW: elapsed time t=1.77312 s, 131072 iters, t-(init.)=1.57634 s t(norm)=0.031319, mflops=159.648 (err=4.4e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.07172 s, 65536 iters, t-(init.)=0.973141 s t(norm)=0.0386691, mflops=129.302 (err=4.7e-16) 12. Frigo-old: elapsed time t=1.25809 s, 65536 iters, t-(init.)=1.15968 s t(norm)=0.0460817, mflops=108.503 (err=4.5e-16) 13. Green: elapsed time t=1.06311 s, 65536 iters, t-(init.)=0.964507 s t(norm)=0.0383261, mflops=130.46 (err=4.6e-16) 14. GSL: elapsed time t=1.20003 s, 32768 iters, t-(init.)=1.15084 s t(norm)=0.0914609, mflops=54.6682 (err=4.4e-16) 15. 
GSL DIT: elapsed time t=1.68685 s, 32768 iters, t-(init.)=1.63741 s t(norm)=0.13013, mflops=38.4232 (err=4.6e-16)
16. GSL DIF: elapsed time t=1.62094 s, 32768 iters, t-(init.)=1.57184 s t(norm)=0.124919, mflops=40.0261 (err=4.9e-16)
17. Krukar: elapsed time t=1.37534 s, 32768 iters, t-(init.)=1.32606 s t(norm)=0.105386, mflops=47.4447 (err=5.2e-16)
18. Mayer (Buneman): elapsed time t=1.47537 s, 32768 iters, t-(init.)=1.42603 s t(norm)=0.113331, mflops=44.1187 (err=4.8e-16)
19. Mayer (simple): elapsed time t=1.10514 s, 32768 iters, t-(init.)=1.05588 s t(norm)=0.083914, mflops=59.5848
20. Mayer (lookup): elapsed time t=1.08392 s, 32768 iters, t-(init.)=1.03463 s t(norm)=0.0822249, mflops=60.8088 (err=4.5e-16)
21. NAPACK (f2c): elapsed time t=1.06552 s, 8192 iters, t-(init.)=1.05309 s t(norm)=0.334767, mflops=14.9358 (err=1.1e-15)
22. Nielsen: elapsed time t=1.24382 s, 16384 iters, t-(init.)=1.21904 s t(norm)=0.193762, mflops=25.8049 (err=1.9e-15)
23. NR (C): elapsed time t=1.55135 s, 32768 iters, t-(init.)=1.50215 s t(norm)=0.11938, mflops=41.8829 (err=4.4e-16)
24. Ooura (C): elapsed time t=1.26971 s, 65536 iters, t-(init.)=1.1713 s t(norm)=0.0465434, mflops=107.427 (err=5.4e-16)
25. QFT: elapsed time t=1.57364 s, 32768 iters, t-(init.)=1.52437 s t(norm)=0.121146, mflops=41.2725 (err=4.9e-16)
26. Ransom: elapsed time t=1.3354 s, 16384 iters, t-(init.)=1.31066 s t(norm)=0.208324, mflops=24.0011 (err=9.1e-16)
27. Singleton (f2c): elapsed time t=1.81591 s, 65536 iters, t-(init.)=1.71743 s t(norm)=0.0682444, mflops=73.2661 (err=6.5e-16)
28. Temperton (f2c): elapsed time t=1.6195 s, 16384 iters, t-(init.)=1.5949 s t(norm)=0.253503, mflops=19.7236 (err=4.7e-16)
29. Valkenburg: elapsed time t=1.11205 s, 4096 iters, t-(init.)=1.10577 s t(norm)=0.703027, mflops=7.1121 (err=6.0e-16)
Top mflops for N=64 = 159.648
Normalized results and averages for N=64:
fft 0: mflops = 48.4677 (norm. = 0.303592), norm. avg. (of 6) = 0.42335
fft 1: mflops = 47.2343 (norm. = 0.295866), norm.
avg. (of 6) = 0.40786
fft 2: mflops = 35.5331 (norm. = 0.222572), norm. avg. (of 6) = 0.244648
fft 3: mflops = 15.8623 (norm. = 0.099358), norm. avg. (of 6) = 0.0567945
fft 4: mflops = 9.68777 (norm. = 0.0606822), norm. avg. (of 6) = 0.069078
fft 5: mflops = 62.9574 (norm. = 0.394353), norm. avg. (of 6) = 0.243568
fft 6: mflops = 60.0674 (norm. = 0.37625), norm. avg. (of 6) = 0.267368
fft 7: mflops = 76.2537 (norm. = 0.477638), norm. avg. (of 6) = 0.249479
fft 8: mflops = 24.6687 (norm. = 0.154519), norm. avg. (of 5) = 0.152432
fft 9: mflops = 30.0099 (norm. = 0.187976), norm. avg. (of 6) = 0.176623
fft 10: mflops = 159.648 (norm. = 1), norm. avg. (of 6) = 0.836332
fft 11: mflops = 129.302 (norm. = 0.809922), norm. avg. (of 6) = 0.804588
fft 12: mflops = 108.503 (norm. = 0.67964), norm. avg. (of 6) = 0.946607
fft 13: mflops = 130.46 (norm. = 0.817172), norm. avg. (of 4) = 0.588676
fft 14: mflops = 54.6682 (norm. = 0.34243), norm. avg. (of 6) = 0.281921
fft 15: mflops = 38.4232 (norm. = 0.240675), norm. avg. (of 6) = 0.168648
fft 16: mflops = 40.0261 (norm. = 0.250715), norm. avg. (of 6) = 0.168292
fft 17: mflops = 47.4447 (norm. = 0.297184), norm. avg. (of 6) = 0.517369
fft 18: mflops = 44.1187 (norm. = 0.276351), norm. avg. (of 5) = 0.259796
fft 19: mflops = 59.5848 (norm. = 0.373227), norm. avg. (of 5) = 0.312741
fft 20: mflops = 60.8088 (norm. = 0.380894), norm. avg. (of 5) = 0.305623
fft 21: mflops = 14.9358 (norm. = 0.0935546), norm. avg. (of 6) = 0.0693653
fft 22: mflops = 25.8049 (norm. = 0.161637), norm. avg. (of 6) = 0.0979709
fft 23: mflops = 41.8829 (norm. = 0.262346), norm. avg. (of 6) = 0.177784
fft 24: mflops = 107.427 (norm. = 0.672899), norm. avg. (of 6) = 0.540324
fft 25: mflops = 41.2725 (norm. = 0.258523), norm. avg. (of 3) = 0.287868
fft 26: mflops = 24.0011 (norm. = 0.150338), norm. avg. (of 5) = 0.0749276
fft 27: mflops = 73.2661 (norm. = 0.458924), norm. avg. (of 6) = 0.245292
fft 28: mflops = 19.7236 (norm. = 0.123545), norm. avg.
(of 6) = 0.0970262
fft 29: mflops = 7.1121 (norm. = 0.0445488), norm. avg. (of 6) = 0.0726441
Benchmarking for array size = 128 (power of 2):
0. Arndt DIF: elapsed time t=1.42578 s, 16384 iters, t-(init.)=1.37727 s t(norm)=0.0938192, mflops=53.294 (err=4.0e-16)
1. Arndt DIT: elapsed time t=1.45211 s, 16384 iters, t-(init.)=1.40377 s t(norm)=0.0956244, mflops=52.2879 (err=4.1e-16)
2. Arndt Split-Radix: elapsed time t=1.92716 s, 16384 iters, t-(init.)=1.87874 s t(norm)=0.127979, mflops=39.069 (err=4.4e-16)
3. Arndt 4-step: elapsed time t=1.14286 s, 4096 iters, t-(init.)=1.13066 s t(norm)=0.308079, mflops=16.2296 (err=4.0e-16)
4. Beauregard: elapsed time t=1.9141 s, 4096 iters, t-(init.)=1.90189 s t(norm)=0.518225, mflops=9.64832 (err=4.1e-16)
5. Bergland: elapsed time t=1.12439 s, 16384 iters, t-(init.)=1.07592 s t(norm)=0.0732914, mflops=68.2209 (err=4.3e-16)
6. CWP (min N) (N=130): elapsed time t=1.11783 s, 16384 iters, t-(init.)=1.06876 s t(norm)=0.0728038, mflops=68.6777
7. CWP (best N) (N=140): elapsed time t=1.54432 s, 32768 iters, t-(init.)=1.43857 s t(norm)=0.0489973, mflops=102.046
8. Edelblute: elapsed time t=1.37363 s, 8192 iters, t-(init.)=1.34945 s t(norm)=0.183848, mflops=27.1964 (err=4.1e-16)
9. FFTPACK (f2c): elapsed time t=1.20734 s, 8192 iters, t-(init.)=1.18317 s t(norm)=0.161195, mflops=31.0184 (err=4.1e-16)
FFTW_MEASURE plan: (cost = 3.427539e-05) FFTW_TWIDDLE 4 FFTW_NOTW 32
10. FFTW: elapsed time t=1.13808 s, 32768 iters, t-(init.)=1.04166 s t(norm)=0.0354787, mflops=140.93 (err=4.2e-16)
FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32
11. FFTW_ESTIMATE: elapsed time t=1.13837 s, 32768 iters, t-(init.)=1.04175 s t(norm)=0.0354819, mflops=140.917 (err=4.2e-16)
12. Frigo-old: elapsed time t=1.30091 s, 32768 iters, t-(init.)=1.20423 s t(norm)=0.041016, mflops=121.904 (err=4.4e-16)
13. Green: elapsed time t=1.24905 s, 32768 iters, t-(init.)=1.15237 s t(norm)=0.0392496, mflops=127.39 (err=4.4e-16)
14.
GSL: elapsed time t=1.31946 s, 16384 iters, t-(init.)=1.27103 s t(norm)=0.0865823, mflops=57.7485 (err=4.2e-16)
15. GSL DIT: elapsed time t=1.77736 s, 16384 iters, t-(init.)=1.7291 s t(norm)=0.117786, mflops=42.45 (err=4.3e-16)
16. GSL DIF: elapsed time t=1.67426 s, 16384 iters, t-(init.)=1.62585 s t(norm)=0.110752, mflops=45.1457 (err=4.6e-16)
17. Krukar: elapsed time t=1.39116 s, 8192 iters, t-(init.)=1.36689 s t(norm)=0.186224, mflops=26.8493 (err=4.6e-16)
18. Mayer (Buneman): elapsed time t=1.58473 s, 16384 iters, t-(init.)=1.53632 s t(norm)=0.104654, mflops=47.7766 (err=4.0e-16)
19. Mayer (simple): elapsed time t=1.17431 s, 16384 iters, t-(init.)=1.12589 s t(norm)=0.0766952, mflops=65.1932
20. Mayer (lookup): elapsed time t=1.14608 s, 16384 iters, t-(init.)=1.09773 s t(norm)=0.0747768, mflops=66.8657 (err=4.3e-16)
21. NAPACK (f2c): elapsed time t=1.18351 s, 4096 iters, t-(init.)=1.1713 s t(norm)=0.319154, mflops=15.6664 (err=1.2e-15)
22. Nielsen: elapsed time t=1.51235 s, 8192 iters, t-(init.)=1.48802 s t(norm)=0.202727, mflops=24.6638 (err=1.3e-15)
23. NR (C): elapsed time t=1.59275 s, 16384 iters, t-(init.)=1.54434 s t(norm)=0.1052, mflops=47.5285 (err=4.4e-16)
24. Ooura (C): elapsed time t=1.48966 s, 32768 iters, t-(init.)=1.39296 s t(norm)=0.0474439, mflops=105.388 (err=4.1e-16)
25. QFT: elapsed time t=1.96368 s, 16384 iters, t-(init.)=1.9152 s t(norm)=0.130463, mflops=38.3251 (err=4.6e-16)
26. Ransom: elapsed time t=1.57516 s, 8192 iters, t-(init.)=1.55079 s t(norm)=0.211279, mflops=23.6654 (err=1.1e-15)
27. Singleton (f2c): elapsed time t=1.09965 s, 16384 iters, t-(init.)=1.05109 s t(norm)=0.0715996, mflops=69.8328 (err=5.3e-16)
28. Temperton (f2c): elapsed time t=1.0844 s, 4096 iters, t-(init.)=1.07227 s t(norm)=0.292171, mflops=17.1133 (err=4.4e-16)
29.
Valkenburg: elapsed time t=1.28169 s, 2048 iters, t-(init.)=1.27561 s t(norm)=0.695151, mflops=7.19268 (err=4.8e-16)
Top mflops for N=128 = 140.93
Normalized results and averages for N=128:
fft 0: mflops = 53.294 (norm. = 0.37816), norm. avg. (of 7) = 0.416894
fft 1: mflops = 52.2879 (norm. = 0.371021), norm. avg. (of 7) = 0.402597
fft 2: mflops = 39.069 (norm. = 0.277223), norm. avg. (of 7) = 0.249302
fft 3: mflops = 16.2296 (norm. = 0.115161), norm. avg. (of 7) = 0.0651326
fft 4: mflops = 9.64832 (norm. = 0.0684619), norm. avg. (of 7) = 0.06899
fft 5: mflops = 68.2209 (norm. = 0.484077), norm. avg. (of 7) = 0.277926
fft 6: mflops = 68.6777 (norm. = 0.487319), norm. avg. (of 7) = 0.29879
fft 7: mflops = 102.046 (norm. = 0.724094), norm. avg. (of 7) = 0.317281
fft 8: mflops = 27.1964 (norm. = 0.192978), norm. avg. (of 6) = 0.159189
fft 9: mflops = 31.0184 (norm. = 0.220098), norm. avg. (of 7) = 0.182834
fft 10: mflops = 140.93 (norm. = 1), norm. avg. (of 7) = 0.859713
fft 11: mflops = 140.917 (norm. = 0.999909), norm. avg. (of 7) = 0.832491
fft 12: mflops = 121.904 (norm. = 0.864996), norm. avg. (of 7) = 0.934948
fft 13: mflops = 127.39 (norm. = 0.903924), norm. avg. (of 5) = 0.651726
fft 14: mflops = 57.7485 (norm. = 0.409768), norm. avg. (of 7) = 0.300185
fft 15: mflops = 42.45 (norm. = 0.301214), norm. avg. (of 7) = 0.187586
fft 16: mflops = 45.1457 (norm. = 0.320342), norm. avg. (of 7) = 0.190014
fft 17: mflops = 26.8493 (norm. = 0.190516), norm. avg. (of 7) = 0.470676
fft 18: mflops = 47.7766 (norm. = 0.33901), norm. avg. (of 6) = 0.272998
fft 19: mflops = 65.1932 (norm. = 0.462593), norm. avg. (of 6) = 0.337716
fft 20: mflops = 66.8657 (norm. = 0.474461), norm. avg. (of 6) = 0.333762
fft 21: mflops = 15.6664 (norm. = 0.111165), norm. avg. (of 7) = 0.0753367
fft 22: mflops = 24.6638 (norm. = 0.175007), norm. avg. (of 7) = 0.108976
fft 23: mflops = 47.5285 (norm. = 0.337249), norm. avg. (of 7) = 0.200565
fft 24: mflops = 105.388 (norm. = 0.747802), norm. avg.
(of 7) = 0.569964
fft 25: mflops = 38.3251 (norm. = 0.271945), norm. avg. (of 4) = 0.283887
fft 26: mflops = 23.6654 (norm. = 0.167924), norm. avg. (of 6) = 0.0904269
fft 27: mflops = 69.8328 (norm. = 0.495515), norm. avg. (of 7) = 0.281038
fft 28: mflops = 17.1133 (norm. = 0.121431), norm. avg. (of 7) = 0.100513
fft 29: mflops = 7.19268 (norm. = 0.0510373), norm. avg. (of 7) = 0.0695574
Benchmarking for array size = 256 (power of 2):
0. Arndt DIF: elapsed time t=1.55571 s, 8192 iters, t-(init.)=1.50752 s t(norm)=0.0898554, mflops=55.645 (err=6.7e-16)
1. Arndt DIT: elapsed time t=1.60126 s, 8192 iters, t-(init.)=1.55332 s t(norm)=0.0925851, mflops=54.0043 (err=7.1e-16)
2. Arndt Split-Radix: elapsed time t=1.03108 s, 4096 iters, t-(init.)=1.00703 s t(norm)=0.120047, mflops=41.6503 (err=7.4e-16)
3. Arndt 4-step: elapsed time t=1.15744 s, 2048 iters, t-(init.)=1.14533 s t(norm)=0.273068, mflops=18.3104 (err=7.2e-16)
4. Beauregard: elapsed time t=1.09962 s, 1024 iters, t-(init.)=1.09371 s t(norm)=0.521522, mflops=9.58733 (err=7.8e-16)
5. Bergland: elapsed time t=1.13028 s, 8192 iters, t-(init.)=1.08223 s t(norm)=0.0645061, mflops=77.512 (err=8.3e-16)
6. CWP (min N) (N=260): elapsed time t=1.09787 s, 8192 iters, t-(init.)=1.04922 s t(norm)=0.0625384, mflops=79.9509
7. CWP (best N) (N=280): elapsed time t=1.70301 s, 16384 iters, t-(init.)=1.59824 s t(norm)=0.0476314, mflops=104.973
8. Edelblute: elapsed time t=1.44774 s, 4096 iters, t-(init.)=1.42385 s t(norm)=0.169736, mflops=29.4575 (err=7.0e-16)
9. FFTPACK (f2c): elapsed time t=1.29697 s, 4096 iters, t-(init.)=1.27281 s t(norm)=0.15173, mflops=32.9532 (err=7.8e-16)
FFTW_MEASURE plan: (cost = 7.674609e-05) FFTW_TWIDDLE 4 FFTW_NOTW 64
10. FFTW: elapsed time t=1.2797 s, 16384 iters, t-(init.)=1.18405 s t(norm)=0.0352874, mflops=141.694 (err=8.0e-16)
FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32
11.
FFTW_ESTIMATE: elapsed time t=1.24641 s, 16384 iters, t-(init.)=1.15058 s t(norm)=0.0342901, mflops=145.815 (err=8.1e-16)
12. Frigo-old: elapsed time t=1.41256 s, 16384 iters, t-(init.)=1.31665 s t(norm)=0.0392393, mflops=127.423 (err=8.0e-16)
13. Green: elapsed time t=1.32532 s, 16384 iters, t-(init.)=1.22924 s t(norm)=0.0366344, mflops=136.484 (err=7.6e-16)
14. GSL: elapsed time t=1.38511 s, 8192 iters, t-(init.)=1.33758 s t(norm)=0.0797262, mflops=62.7146 (err=7.8e-16)
15. GSL DIT: elapsed time t=1.9315 s, 8192 iters, t-(init.)=1.88352 s t(norm)=0.112266, mflops=44.537 (err=7.7e-16)
16. GSL DIF: elapsed time t=1.77805 s, 8192 iters, t-(init.)=1.73014 s t(norm)=0.103124, mflops=48.4851 (err=8.3e-16)
17. Krukar: elapsed time t=1.24682 s, 4096 iters, t-(init.)=1.22285 s t(norm)=0.145775, mflops=34.2995 (err=7.7e-16)
18. Mayer (Buneman): elapsed time t=1.71398 s, 8192 iters, t-(init.)=1.66593 s t(norm)=0.0992974, mflops=50.3538 (err=7.0e-16)
19. Mayer (simple): elapsed time t=1.27148 s, 8192 iters, t-(init.)=1.22347 s t(norm)=0.0729246, mflops=68.564
20. Mayer (lookup): elapsed time t=1.23946 s, 8192 iters, t-(init.)=1.19187 s t(norm)=0.0710409, mflops=70.382 (err=7.1e-16)
21. NAPACK (f2c): elapsed time t=1.25653 s, 2048 iters, t-(init.)=1.2444 s t(norm)=0.296687, mflops=16.8528 (err=3.6e-15)
22. Nielsen: elapsed time t=1.53981 s, 4096 iters, t-(init.)=1.51583 s t(norm)=0.180701, mflops=27.6701 (err=3.4e-15)
23. NR (C): elapsed time t=1.69663 s, 8192 iters, t-(init.)=1.64871 s t(norm)=0.0982709, mflops=50.8798 (err=8.6e-16)
24. Ooura (C): elapsed time t=1.56281 s, 16384 iters, t-(init.)=1.46713 s t(norm)=0.0437239, mflops=114.354 (err=7.9e-16)
25. QFT: elapsed time t=1.26417 s, 4096 iters, t-(init.)=1.24016 s t(norm)=0.147838, mflops=33.8207 (err=9.5e-16)
26. Ransom: elapsed time t=1.25624 s, 4096 iters, t-(init.)=1.2321 s t(norm)=0.146878, mflops=34.0419 (err=1.7e-15)
27.
Singleton (f2c): elapsed time t=1.91571 s, 16384 iters, t-(init.)=1.81986 s t(norm)=0.0542361, mflops=92.1895 (err=1.3e-15)
28. Temperton (f2c): elapsed time t=1.04034 s, 2048 iters, t-(init.)=1.02821 s t(norm)=0.245144, mflops=20.3962 (err=7.5e-16)
29. Valkenburg: elapsed time t=1.45568 s, 1024 iters, t-(init.)=1.44963 s t(norm)=0.691239, mflops=7.23338 (err=7.4e-16)
Top mflops for N=256 = 145.815
Normalized results and averages for N=256:
fft 0: mflops = 55.645 (norm. = 0.381614), norm. avg. (of 8) = 0.412484
fft 1: mflops = 54.0043 (norm. = 0.370363), norm. avg. (of 8) = 0.398568
fft 2: mflops = 41.6503 (norm. = 0.285639), norm. avg. (of 8) = 0.253844
fft 3: mflops = 18.3104 (norm. = 0.125573), norm. avg. (of 8) = 0.0726877
fft 4: mflops = 9.58733 (norm. = 0.0657501), norm. avg. (of 8) = 0.068585
fft 5: mflops = 77.512 (norm. = 0.531579), norm. avg. (of 8) = 0.309633
fft 6: mflops = 79.9509 (norm. = 0.548305), norm. avg. (of 8) = 0.329979
fft 7: mflops = 104.973 (norm. = 0.719905), norm. avg. (of 8) = 0.367609
fft 8: mflops = 29.4575 (norm. = 0.20202), norm. avg. (of 7) = 0.165308
fft 9: mflops = 32.9532 (norm. = 0.225994), norm. avg. (of 8) = 0.188229
fft 10: mflops = 141.694 (norm. = 0.971739), norm. avg. (of 8) = 0.873716
fft 11: mflops = 145.815 (norm. = 1), norm. avg. (of 8) = 0.853429
fft 12: mflops = 127.423 (norm. = 0.87387), norm. avg. (of 8) = 0.927313
fft 13: mflops = 136.484 (norm. = 0.93601), norm. avg. (of 6) = 0.699106
fft 14: mflops = 62.7146 (norm. = 0.430098), norm. avg. (of 8) = 0.316424
fft 15: mflops = 44.537 (norm. = 0.305435), norm. avg. (of 8) = 0.202317
fft 16: mflops = 48.4851 (norm. = 0.332512), norm. avg. (of 8) = 0.207826
fft 17: mflops = 34.2995 (norm. = 0.235227), norm. avg. (of 8) = 0.441245
fft 18: mflops = 50.3538 (norm. = 0.345327), norm. avg. (of 7) = 0.283331
fft 19: mflops = 68.564 (norm. = 0.470213), norm. avg. (of 7) = 0.356645
fft 20: mflops = 70.382 (norm. = 0.482681), norm. avg.
(of 7) = 0.355036
fft 21: mflops = 16.8528 (norm. = 0.115577), norm. avg. (of 8) = 0.0803667
fft 22: mflops = 27.6701 (norm. = 0.189762), norm. avg. (of 8) = 0.119074
fft 23: mflops = 50.8798 (norm. = 0.348935), norm. avg. (of 8) = 0.219111
fft 24: mflops = 114.354 (norm. = 0.784242), norm. avg. (of 8) = 0.596749
fft 25: mflops = 33.8207 (norm. = 0.231943), norm. avg. (of 5) = 0.273498
fft 26: mflops = 34.0419 (norm. = 0.23346), norm. avg. (of 7) = 0.11086
fft 27: mflops = 92.1895 (norm. = 0.632238), norm. avg. (of 8) = 0.324938
fft 28: mflops = 20.3962 (norm. = 0.139877), norm. avg. (of 8) = 0.105433
fft 29: mflops = 7.23338 (norm. = 0.0496067), norm. avg. (of 8) = 0.0670635
Benchmarking for array size = 512 (power of 2):
0. Arndt DIF: elapsed time t=1.64186 s, 4096 iters, t-(init.)=1.59396 s t(norm)=0.0844512, mflops=59.2058 (err=6.7e-16)
1. Arndt DIT: elapsed time t=1.68067 s, 4096 iters, t-(init.)=1.63277 s t(norm)=0.0865072, mflops=57.7987 (err=6.2e-16)
2. Arndt Split-Radix: elapsed time t=1.08262 s, 2048 iters, t-(init.)=1.0588 s t(norm)=0.112194, mflops=44.5656 (err=6.5e-16)
3. Arndt 4-step: elapsed time t=1.23585 s, 1024 iters, t-(init.)=1.22383 s t(norm)=0.259363, mflops=19.278 (err=6.3e-16)
4. Beauregard: elapsed time t=1.24856 s, 512 iters, t-(init.)=1.24228 s t(norm)=0.526546, mflops=9.49584 (err=6.8e-16)
5. Bergland: elapsed time t=1.18275 s, 4096 iters, t-(init.)=1.1347 s t(norm)=0.0601188, mflops=83.1687 (err=7.2e-16)
6. CWP (min N) (N=520): elapsed time t=1.20713 s, 4096 iters, t-(init.)=1.15835 s t(norm)=0.0613717, mflops=81.4708
7. CWP (best N) (N=560): elapsed time t=1.97536 s, 8192 iters, t-(init.)=1.86993 s t(norm)=0.0495361, mflops=100.936
8. Edelblute: elapsed time t=1.50794 s, 2048 iters, t-(init.)=1.48398 s t(norm)=0.157249, mflops=31.7968 (err=6.2e-16)
9.
FFTPACK (f2c): elapsed time t=1.17214 s, 1024 iters, t-(init.)=1.15995 s t(norm)=0.245825, mflops=20.3397 (err=6.4e-16)
FFTW_MEASURE plan: (cost = 3.873086e-04) FFTW_TWIDDLE 8 FFTW_NOTW 64
10. FFTW: elapsed time t=1.58359 s, 4096 iters, t-(init.)=1.53532 s t(norm)=0.0813441, mflops=61.4673 (err=6.4e-16)
FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32
11. FFTW_ESTIMATE: elapsed time t=1.56718 s, 4096 iters, t-(init.)=1.51916 s t(norm)=0.0804882, mflops=62.1209 (err=6.5e-16)
12. Frigo-old: elapsed time t=1.73669 s, 4096 iters, t-(init.)=1.68859 s t(norm)=0.0894648, mflops=55.8879 (err=6.3e-16)
13. Green: elapsed time t=1.38973 s, 8192 iters, t-(init.)=1.29385 s t(norm)=0.0342753, mflops=145.877 (err=6.2e-16)
14. GSL: elapsed time t=1.3569 s, 2048 iters, t-(init.)=1.33291 s t(norm)=0.14124, mflops=35.4008 (err=6.4e-16)
15. GSL DIT: elapsed time t=1.0558 s, 2048 iters, t-(init.)=1.03188 s t(norm)=0.109341, mflops=45.7283 (err=9.0e-16)
16. GSL DIF: elapsed time t=1.90788 s, 4096 iters, t-(init.)=1.85981 s t(norm)=0.0985362, mflops=50.7428 (err=7.8e-16)
17. Krukar: elapsed time t=1.69105 s, 2048 iters, t-(init.)=1.66686 s t(norm)=0.176627, mflops=28.3083 (err=6.9e-16)
18. Mayer (Buneman): elapsed time t=1.81206 s, 4096 iters, t-(init.)=1.76413 s t(norm)=0.0934671, mflops=53.4947 (err=6.5e-16)
19. Mayer (simple): elapsed time t=1.35096 s, 4096 iters, t-(init.)=1.30306 s t(norm)=0.0690384, mflops=72.4234
20. Mayer (lookup): elapsed time t=1.31964 s, 4096 iters, t-(init.)=1.27175 s t(norm)=0.0673798, mflops=74.2062 (err=6.5e-16)
21. NAPACK (f2c): elapsed time t=1.62198 s, 1024 iters, t-(init.)=1.60994 s t(norm)=0.34119, mflops=14.6546 (err=6.7e-15)
22. Nielsen: elapsed time t=1.77366 s, 2048 iters, t-(init.)=1.7493 s t(norm)=0.185362, mflops=26.9742 (err=3.2e-15)
23. NR (C): elapsed time t=1.82877 s, 4096 iters, t-(init.)=1.78076 s t(norm)=0.094348, mflops=52.9953 (err=7.1e-16)
24.
Ooura (C): elapsed time t=1.81636 s, 8192 iters, t-(init.)=1.72052 s t(norm)=0.0455781, mflops=109.702 (err=6.9e-16)
25. QFT: elapsed time t=1.14976 s, 1024 iters, t-(init.)=1.13767 s t(norm)=0.241104, mflops=20.7379 (err=9.5e-16)
26. Ransom: elapsed time t=1.48581 s, 2048 iters, t-(init.)=1.46185 s t(norm)=0.154903, mflops=32.2783 (err=1.5e-15)
27. Singleton (f2c): elapsed time t=1.06498 s, 4096 iters, t-(init.)=1.01689 s t(norm)=0.0538768, mflops=92.8044 (err=8.4e-16)
28. Temperton (f2c): elapsed time t=1.31544 s, 1024 iters, t-(init.)=1.30342 s t(norm)=0.276232, mflops=18.1007 (err=6.4e-16)
29. Valkenburg: elapsed time t=1.76352 s, 512 iters, t-(init.)=1.75762 s t(norm)=0.744977, mflops=6.71161 (err=7.4e-16)
Top mflops for N=512 = 145.877
Normalized results and averages for N=512:
fft 0: mflops = 59.2058 (norm. = 0.40586), norm. avg. (of 9) = 0.411748
fft 1: mflops = 57.7987 (norm. = 0.396214), norm. avg. (of 9) = 0.398306
fft 2: mflops = 44.5656 (norm. = 0.305501), norm. avg. (of 9) = 0.259583
fft 3: mflops = 19.278 (norm. = 0.132152), norm. avg. (of 9) = 0.0792948
fft 4: mflops = 9.49584 (norm. = 0.0650946), norm. avg. (of 9) = 0.0681972
fft 5: mflops = 83.1687 (norm. = 0.570127), norm. avg. (of 9) = 0.338577
fft 6: mflops = 81.4708 (norm. = 0.558488), norm. avg. (of 9) = 0.355369
fft 7: mflops = 100.936 (norm. = 0.691927), norm. avg. (of 9) = 0.403645
fft 8: mflops = 31.7968 (norm. = 0.217969), norm. avg. (of 8) = 0.171891
fft 9: mflops = 20.3397 (norm. = 0.13943), norm. avg. (of 9) = 0.182807
fft 10: mflops = 61.4673 (norm. = 0.421362), norm. avg. (of 9) = 0.823455
fft 11: mflops = 62.1209 (norm. = 0.425843), norm. avg. (of 9) = 0.80592
fft 12: mflops = 55.8879 (norm. = 0.383116), norm. avg. (of 9) = 0.866847
fft 13: mflops = 145.877 (norm. = 1), norm. avg. (of 7) = 0.742091
fft 14: mflops = 35.4008 (norm. = 0.242675), norm. avg. (of 9) = 0.308229
fft 15: mflops = 45.7283 (norm. = 0.313471), norm. avg. (of 9) = 0.214667
fft 16: mflops = 50.7428 (norm.
= 0.347845), norm. avg. (of 9) = 0.223384
fft 17: mflops = 28.3083 (norm. = 0.194055), norm. avg. (of 9) = 0.413779
fft 18: mflops = 53.4947 (norm. = 0.36671), norm. avg. (of 8) = 0.293753
fft 19: mflops = 72.4234 (norm. = 0.496468), norm. avg. (of 8) = 0.374122
fft 20: mflops = 74.2062 (norm. = 0.508689), norm. avg. (of 8) = 0.374243
fft 21: mflops = 14.6546 (norm. = 0.100458), norm. avg. (of 9) = 0.0825991
fft 22: mflops = 26.9742 (norm. = 0.18491), norm. avg. (of 9) = 0.126389
fft 23: mflops = 52.9953 (norm. = 0.363286), norm. avg. (of 9) = 0.235131
fft 24: mflops = 109.702 (norm. = 0.752013), norm. avg. (of 9) = 0.614
fft 25: mflops = 20.7379 (norm. = 0.14216), norm. avg. (of 6) = 0.251609
fft 26: mflops = 32.2783 (norm. = 0.22127), norm. avg. (of 8) = 0.124661
fft 27: mflops = 92.8044 (norm. = 0.63618), norm. avg. (of 9) = 0.35952
fft 28: mflops = 18.1007 (norm. = 0.124082), norm. avg. (of 9) = 0.107505
fft 29: mflops = 6.71161 (norm. = 0.0460086), norm. avg. (of 9) = 0.0647241
Benchmarking for array size = 1024 (power of 2):
0. Arndt DIF: elapsed time t=1.03932 s, 1024 iters, t-(init.)=1.01223 s t(norm)=0.0965336, mflops=51.7954 (err=1.0e-15)
1. Arndt DIT: elapsed time t=1.1419 s, 1024 iters, t-(init.)=1.11463 s t(norm)=0.106299, mflops=47.037 (err=1.0e-15)
2. Arndt Split-Radix: elapsed time t=1.46503 s, 1024 iters, t-(init.)=1.43772 s t(norm)=0.137111, mflops=36.4667 (err=1.0e-15)
3. Arndt 4-step: elapsed time t=1.40332 s, 512 iters, t-(init.)=1.38953 s t(norm)=0.265031, mflops=18.8657 (err=1.0e-15)
4. Beauregard: elapsed time t=1.45905 s, 256 iters, t-(init.)=1.45196 s t(norm)=0.553879, mflops=9.02724 (err=1.1e-15)
5. Bergland: elapsed time t=1.65843 s, 2048 iters, t-(init.)=1.60452 s t(norm)=0.0765093, mflops=65.3515 (err=1.1e-15)
6. CWP (min N) (N=1040): elapsed time t=1.59049 s, 2048 iters, t-(init.)=1.5076 s t(norm)=0.0718881, mflops=69.5525
7.
CWP (best N) (N=1040): elapsed time t=1.59204 s, 2048 iters, t-(init.)=1.50946 s t(norm)=0.0719768, mflops=69.4669
8. Edelblute: elapsed time t=1.88191 s, 1024 iters, t-(init.)=1.85487 s t(norm)=0.176894, mflops=28.2655 (err=1.0e-15)
9. FFTPACK (f2c): elapsed time t=1.74717 s, 512 iters, t-(init.)=1.73347 s t(norm)=0.330634, mflops=15.1225 (err=1.1e-15)
FFTW_MEASURE plan: (cost = 1.088281e-03) FFTW_TWIDDLE 16 FFTW_NOTW 64
10. FFTW: elapsed time t=1.25162 s, 1024 iters, t-(init.)=1.22438 s t(norm)=0.116766, mflops=42.8208 (err=1.1e-15)
FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32
11. FFTW_ESTIMATE: elapsed time t=1.28646 s, 1024 iters, t-(init.)=1.25955 s t(norm)=0.12012, mflops=41.6252 (err=1.1e-15)
12. Frigo-old: elapsed time t=1.68527 s, 1024 iters, t-(init.)=1.65789 s t(norm)=0.158109, mflops=31.6238 (err=1.1e-15)
13. Green: elapsed time t=1.28942 s, 2048 iters, t-(init.)=1.23542 s t(norm)=0.0589094, mflops=84.8761 (err=1.1e-15)
14. GSL: elapsed time t=1.23919 s, 512 iters, t-(init.)=1.22548 s t(norm)=0.233743, mflops=21.3911 (err=1.1e-15)
15. GSL DIT: elapsed time t=1.30893 s, 1024 iters, t-(init.)=1.28179 s t(norm)=0.122241, mflops=40.9028 (err=1.3e-15)
16. GSL DIF: elapsed time t=1.21051 s, 1024 iters, t-(init.)=1.18359 s t(norm)=0.112876, mflops=44.2963 (err=1.4e-15)
17. Krukar: elapsed time t=1.18441 s, 512 iters, t-(init.)=1.1706 s t(norm)=0.223274, mflops=22.394 (err=1.1e-15)
18. Mayer (Buneman): elapsed time t=1.01707 s, 1024 iters, t-(init.)=0.989764 s t(norm)=0.0943913, mflops=52.971 (err=1.0e-15)
19. Mayer (simple): elapsed time t=1.53251 s, 2048 iters, t-(init.)=1.47817 s t(norm)=0.0704845, mflops=70.9376
20. Mayer (lookup): elapsed time t=1.8165 s, 2048 iters, t-(init.)=1.76251 s t(norm)=0.0840431, mflops=59.4933 (err=1.0e-15)
21. NAPACK (f2c): elapsed time t=1.44906 s, 256 iters, t-(init.)=1.44194 s t(norm)=0.550056, mflops=9.08999 (err=1.6e-14)
22.
Nielsen: elapsed time t=1.38655 s, 512 iters, t-(init.)=1.37268 s t(norm)=0.261818, mflops=19.0972 (err=7.2e-15)
23. NR (C): elapsed time t=1.13535 s, 1024 iters, t-(init.)=1.10853 s t(norm)=0.105718, mflops=47.2957 (err=1.2e-15)
24. Ooura (C): elapsed time t=1.45613 s, 2048 iters, t-(init.)=1.40192 s t(norm)=0.0668488, mflops=74.7957 (err=1.1e-15)
25. QFT: elapsed time t=1.56909 s, 512 iters, t-(init.)=1.55573 s t(norm)=0.296732, mflops=16.8502 (err=1.4e-15)
26. Ransom: elapsed time t=1.51441 s, 1024 iters, t-(init.)=1.48728 s t(norm)=0.141838, mflops=35.2515 (err=2.1e-15)
27. Singleton (f2c): elapsed time t=1.34627 s, 2048 iters, t-(init.)=1.29229 s t(norm)=0.061621, mflops=81.1412 (err=1.6e-15)
28. Temperton (f2c): elapsed time t=1.62522 s, 512 iters, t-(init.)=1.6114 s t(norm)=0.30735, mflops=16.2681 (err=1.1e-15)
29. Valkenburg: elapsed time t=1.14797 s, 128 iters, t-(init.)=1.14461 s t(norm)=0.873271, mflops=5.7256 (err=1.1e-15)
Top mflops for N=1024 = 84.8761
Normalized results and averages for N=1024:
fft 0: mflops = 51.7954 (norm. = 0.610247), norm. avg. (of 10) = 0.431598
fft 1: mflops = 47.037 (norm. = 0.554184), norm. avg. (of 10) = 0.413894
fft 2: mflops = 36.4667 (norm. = 0.429646), norm. avg. (of 10) = 0.27659
fft 3: mflops = 18.8657 (norm. = 0.222273), norm. avg. (of 10) = 0.0935927
fft 4: mflops = 9.02724 (norm. = 0.106358), norm. avg. (of 10) = 0.0720133
fft 5: mflops = 65.3515 (norm. = 0.769964), norm. avg. (of 10) = 0.381715
fft 6: mflops = 69.5525 (norm. = 0.819459), norm. avg. (of 10) = 0.401778
fft 7: mflops = 69.4669 (norm. = 0.81845), norm. avg. (of 10) = 0.445125
fft 8: mflops = 28.2655 (norm. = 0.333021), norm. avg. (of 9) = 0.189794
fft 9: mflops = 15.1225 (norm. = 0.178171), norm. avg. (of 10) = 0.182343
fft 10: mflops = 42.8208 (norm. = 0.504509), norm. avg. (of 10) = 0.79156
fft 11: mflops = 41.6252 (norm. = 0.490422), norm. avg. (of 10) = 0.77437
fft 12: mflops = 31.6238 (norm. = 0.372588), norm. avg.
(of 10) = 0.817421
fft 13: mflops = 84.8761 (norm. = 1), norm. avg. (of 8) = 0.77433
fft 14: mflops = 21.3911 (norm. = 0.252027), norm. avg. (of 10) = 0.302609
fft 15: mflops = 40.9028 (norm. = 0.481912), norm. avg. (of 10) = 0.241392
fft 16: mflops = 44.2963 (norm. = 0.521894), norm. avg. (of 10) = 0.253235
fft 17: mflops = 22.394 (norm. = 0.263843), norm. avg. (of 10) = 0.398786
fft 18: mflops = 52.971 (norm. = 0.624098), norm. avg. (of 9) = 0.330458
fft 19: mflops = 70.9376 (norm. = 0.835778), norm. avg. (of 9) = 0.425417
fft 20: mflops = 59.4933 (norm. = 0.700942), norm. avg. (of 9) = 0.410543
fft 21: mflops = 9.08999 (norm. = 0.107097), norm. avg. (of 10) = 0.0850489
fft 22: mflops = 19.0972 (norm. = 0.225001), norm. avg. (of 10) = 0.136251
fft 23: mflops = 47.2957 (norm. = 0.557232), norm. avg. (of 10) = 0.267341
fft 24: mflops = 74.7957 (norm. = 0.881233), norm. avg. (of 10) = 0.640723
fft 25: mflops = 16.8502 (norm. = 0.198527), norm. avg. (of 7) = 0.244026
fft 26: mflops = 35.2515 (norm. = 0.415329), norm. avg. (of 9) = 0.156958
fft 27: mflops = 81.1412 (norm. = 0.955995), norm. avg. (of 10) = 0.419168
fft 28: mflops = 16.2681 (norm. = 0.191669), norm. avg. (of 10) = 0.115922
fft 29: mflops = 5.7256 (norm. = 0.0674583), norm. avg. (of 10) = 0.0649975
Benchmarking for array size = 2048 (power of 2):
0. Arndt DIF: elapsed time t=1.18163 s, 128 iters, t-(init.)=1.12973 s t(norm)=0.391779, mflops=12.7623 (err=1.4e-15)
1. Arndt DIT: elapsed time t=1.20069 s, 128 iters, t-(init.)=1.14878 s t(norm)=0.398386, mflops=12.5506 (err=1.4e-15)
2. Arndt Split-Radix: elapsed time t=1.52924 s, 128 iters, t-(init.)=1.47748 s t(norm)=0.512377, mflops=9.75843 (err=1.4e-15)
3. Arndt 4-step: elapsed time t=1.1305 s, 128 iters, t-(init.)=1.07854 s t(norm)=0.374028, mflops=13.368 (err=1.4e-15)
4. Beauregard: elapsed time t=1.00281 s, 64 iters, t-(init.)=0.976763 s t(norm)=0.677465, mflops=7.38046 (err=1.5e-15)
5.
Bergland: elapsed time t=1.22293 s, 256 iters, t-(init.)=1.11947 s t(norm)=0.194111, mflops=25.7584 (err=1.5e-15)
6. CWP (min N) (N=2145): elapsed time t=1.54747 s, 512 iters, t-(init.)=1.3303 s t(norm)=0.115334, mflops=43.3523
7. CWP (best N) (N=2184): elapsed time t=1.54821 s, 512 iters, t-(init.)=1.32754 s t(norm)=0.115094, mflops=43.4426
8. Edelblute: elapsed time t=1.64048 s, 128 iters, t-(init.)=1.58858 s t(norm)=0.550905, mflops=9.07597 (err=1.4e-15)
9. FFTPACK (f2c): elapsed time t=1.91416 s, 256 iters, t-(init.)=1.81074 s t(norm)=0.313973, mflops=15.9249 (err=1.5e-15)
FFTW_MEASURE plan: (cost = 2.622406e-03) FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_NOTW 64
10. FFTW: elapsed time t=1.3399 s, 512 iters, t-(init.)=1.13268 s t(norm)=0.0982011, mflops=50.9159 (err=1.5e-15)
FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32
11. FFTW_ESTIMATE: elapsed time t=1.69641 s, 512 iters, t-(init.)=1.48942 s t(norm)=0.12913, mflops=38.7208 (err=1.5e-15)
12. Frigo-old: elapsed time t=1.1282 s, 256 iters, t-(init.)=1.02447 s t(norm)=0.177639, mflops=28.147 (err=1.5e-15)
13. Green: elapsed time t=1.03836 s, 256 iters, t-(init.)=0.934813 s t(norm)=0.162092, mflops=30.8466 (err=1.5e-15)
14. GSL: elapsed time t=1.3497 s, 256 iters, t-(init.)=1.24633 s t(norm)=0.216108, mflops=23.1366 (err=1.5e-15)
15. GSL DIT: elapsed time t=1.17334 s, 128 iters, t-(init.)=1.12157 s t(norm)=0.388949, mflops=12.8551 (err=2.1e-15)
16. GSL DIF: elapsed time t=1.18631 s, 128 iters, t-(init.)=1.13435 s t(norm)=0.393381, mflops=12.7103 (err=2.2e-15)
17. Krukar: elapsed time t=1.52774 s, 256 iters, t-(init.)=1.42394 s t(norm)=0.246905, mflops=20.2507 (err=1.5e-15)
18. Mayer (Buneman): elapsed time t=1.44819 s, 512 iters, t-(init.)=1.2414 s t(norm)=0.107626, mflops=46.4571 (err=1.4e-15)
19. Mayer (simple): elapsed time t=1.15802 s, 512 iters, t-(init.)=0.951082 s t(norm)=0.0824566, mflops=60.638
20.
Mayer (lookup): elapsed time t=1.54781 s, 512 iters, t-(init.)=1.34104 s t(norm)=0.116265, mflops=43.0053 (err=1.4e-15)
21. NAPACK (f2c): elapsed time t=1.67939 s, 128 iters, t-(init.)=1.62759 s t(norm)=0.564433, mflops=8.85845 (err=1.5e-14)
22. Nielsen: elapsed time t=1.27192 s, 128 iters, t-(init.)=1.22026 s t(norm)=0.423173, mflops=11.8155 (err=1.2e-14)
23. NR (C): elapsed time t=1.1316 s, 128 iters, t-(init.)=1.07995 s t(norm)=0.374517, mflops=13.3505 (err=1.6e-15)
24. Ooura (C): elapsed time t=1.97524 s, 512 iters, t-(init.)=1.76808 s t(norm)=0.153289, mflops=32.6182 (err=1.4e-15)
25. QFT: elapsed time t=1.1536 s, 128 iters, t-(init.)=1.10158 s t(norm)=0.382019, mflops=13.0883 (err=1.9e-15)
26. Ransom: elapsed time t=1.61746 s, 256 iters, t-(init.)=1.51395 s t(norm)=0.262512, mflops=19.0467 (err=2.6e-15)
27. Singleton (f2c): elapsed time t=1.57061 s, 256 iters, t-(init.)=1.46703 s t(norm)=0.254377, mflops=19.6559 (err=2.0e-15)
28. Temperton (f2c): elapsed time t=1.17239 s, 128 iters, t-(init.)=1.12058 s t(norm)=0.388607, mflops=12.8665 (err=1.5e-15)
29. Valkenburg: elapsed time t=1.46906 s, 64 iters, t-(init.)=1.44327 s t(norm)=1.00103, mflops=4.99487 (err=1.5e-15)
Top mflops for N=2048 = 60.638
Normalized results and averages for N=2048:
fft 0: mflops = 12.7623 (norm. = 0.210467), norm. avg. (of 11) = 0.411495
fft 1: mflops = 12.5506 (norm. = 0.206977), norm. avg. (of 11) = 0.395084
fft 2: mflops = 9.75843 (norm. = 0.160929), norm. avg. (of 11) = 0.266075
fft 3: mflops = 13.368 (norm. = 0.220456), norm. avg. (of 11) = 0.105126
fft 4: mflops = 7.38046 (norm. = 0.121714), norm. avg. (of 11) = 0.0765315
fft 5: mflops = 25.7584 (norm. = 0.42479), norm. avg. (of 11) = 0.385631
fft 6: mflops = 43.3523 (norm. = 0.714936), norm. avg. (of 11) = 0.430247
fft 7: mflops = 43.4426 (norm. = 0.716425), norm. avg. (of 11) = 0.469789
fft 8: mflops = 9.07597 (norm. = 0.149675), norm. avg. (of 10) = 0.185782
fft 9: mflops = 15.9249 (norm. = 0.262623), norm. avg.
(of 11) = 0.189641 fft 10: mflops = 50.9159 (norm. = 0.839671), norm. avg. (of 11) = 0.795934 fft 11: mflops = 38.7208 (norm. = 0.638557), norm. avg. (of 11) = 0.762023 fft 12: mflops = 28.147 (norm. = 0.464181), norm. avg. (of 11) = 0.785308 fft 13: mflops = 30.8466 (norm. = 0.508702), norm. avg. (of 9) = 0.744816 fft 14: mflops = 23.1366 (norm. = 0.381553), norm. avg. (of 11) = 0.309786 fft 15: mflops = 12.8551 (norm. = 0.211998), norm. avg. (of 11) = 0.23872 fft 16: mflops = 12.7103 (norm. = 0.20961), norm. avg. (of 11) = 0.249269 fft 17: mflops = 20.2507 (norm. = 0.333961), norm. avg. (of 11) = 0.392892 fft 18: mflops = 46.4571 (norm. = 0.766138), norm. avg. (of 10) = 0.374026 fft 19: mflops = 60.638 (norm. = 1), norm. avg. (of 10) = 0.482876 fft 20: mflops = 43.0053 (norm. = 0.709214), norm. avg. (of 10) = 0.44041 fft 21: mflops = 8.85845 (norm. = 0.146087), norm. avg. (of 11) = 0.0905978 fft 22: mflops = 11.8155 (norm. = 0.194853), norm. avg. (of 11) = 0.141578 fft 23: mflops = 13.3505 (norm. = 0.220168), norm. avg. (of 11) = 0.263052 fft 24: mflops = 32.6182 (norm. = 0.537916), norm. avg. (of 11) = 0.631377 fft 25: mflops = 13.0883 (norm. = 0.215844), norm. avg. (of 8) = 0.240503 fft 26: mflops = 19.0467 (norm. = 0.314106), norm. avg. (of 10) = 0.172673 fft 27: mflops = 19.6559 (norm. = 0.324151), norm. avg. (of 11) = 0.41053 fft 28: mflops = 12.8665 (norm. = 0.212185), norm. avg. (of 11) = 0.124673 fft 29: mflops = 4.99487 (norm. = 0.082372), norm. avg. (of 11) = 0.066577 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.40832 s, 64 iters, t-(init.)=1.35636 s t(norm)=0.431175, mflops=11.5962 (err=2.5e-15) 1. Arndt DIT: elapsed time t=1.42519 s, 64 iters, t-(init.)=1.37322 s t(norm)=0.436536, mflops=11.4538 (err=2.5e-15) 2. Arndt Split-Radix: elapsed time t=1.80737 s, 64 iters, t-(init.)=1.75543 s t(norm)=0.558036, mflops=8.96 (err=2.5e-15) 3. 
Arndt 4-step: elapsed time t=1.15499 s, 64 iters, t-(init.)=1.10304 s t(norm)=0.350646, mflops=14.2594 (err=2.5e-15) 4. Beauregard: elapsed time t=1.10035 s, 32 iters, t-(init.)=1.07448 s t(norm)=0.683133, mflops=7.31921 (err=2.6e-15) 5. Bergland: elapsed time t=1.26341 s, 128 iters, t-(init.)=1.15976 s t(norm)=0.18434, mflops=27.1239 (err=2.5e-15) 6. CWP (min N) (N=4290): elapsed time t=1.95463 s, 256 iters, t-(init.)=1.73711 s t(norm)=0.138053, mflops=36.218 7. CWP (best N) (N=4368): elapsed time t=1.71683 s, 256 iters, t-(init.)=1.49507 s t(norm)=0.118818, mflops=42.0813 8. Edelblute: elapsed time t=1.9193 s, 64 iters, t-(init.)=1.86732 s t(norm)=0.593604, mflops=8.42313 (err=2.5e-15) 9. FFTPACK (f2c): elapsed time t=1.10561 s, 64 iters, t-(init.)=1.05344 s t(norm)=0.33488, mflops=14.9307 (err=2.6e-15) FFTW_MEASURE plan: (cost = 6.423500e-03) FFTW_TWIDDLE 2 FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_NOTW 32 10. FFTW: elapsed time t=1.65309 s, 256 iters, t-(init.)=1.4455 s t(norm)=0.114878, mflops=43.5245 (err=2.6e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.69604 s, 256 iters, t-(init.)=1.48787 s t(norm)=0.118246, mflops=42.2849 (err=2.6e-15) 12. Frigo-old: elapsed time t=1.35365 s, 128 iters, t-(init.)=1.24962 s t(norm)=0.198622, mflops=25.1734 (err=2.6e-15) 13. Green: elapsed time t=1.07315 s, 128 iters, t-(init.)=0.969065 s t(norm)=0.154029, mflops=32.4615 (err=2.6e-15) 14. GSL: elapsed time t=1.52656 s, 128 iters, t-(init.)=1.42239 s t(norm)=0.226083, mflops=22.1158 (err=2.6e-15) 15. GSL DIT: elapsed time t=1.31074 s, 64 iters, t-(init.)=1.25877 s t(norm)=0.400152, mflops=12.4953 (err=3.0e-15) 16. GSL DIF: elapsed time t=1.30591 s, 64 iters, t-(init.)=1.25394 s t(norm)=0.398616, mflops=12.5434 (err=3.1e-15) 17. Krukar: elapsed time t=1.7142 s, 128 iters, t-(init.)=1.61032 s t(norm)=0.255954, mflops=19.5348 (err=2.6e-15) 18. 
Mayer (Buneman): elapsed time t=1.13981 s, 64 iters, t-(init.)=1.08765 s t(norm)=0.345754, mflops=14.4611 (err=2.5e-15) 19. Mayer (simple): elapsed time t=1.07645 s, 64 iters, t-(init.)=1.0245 s t(norm)=0.325679, mflops=15.3526 20. Mayer (lookup): elapsed time t=1.13132 s, 64 iters, t-(init.)=1.07936 s t(norm)=0.34312, mflops=14.5722 (err=2.5e-15) 21. NAPACK (f2c): elapsed time t=1.8643 s, 64 iters, t-(init.)=1.81219 s t(norm)=0.576079, mflops=8.67936 (err=4.7e-14) 22. Nielsen: elapsed time t=1.3043 s, 64 iters, t-(init.)=1.25224 s t(norm)=0.398075, mflops=12.5604 (err=2.2e-14) 23. NR (C): elapsed time t=1.26546 s, 64 iters, t-(init.)=1.21309 s t(norm)=0.385631, mflops=12.9658 (err=2.6e-15) 24. Ooura (C): elapsed time t=1.01815 s, 128 iters, t-(init.)=0.914518 s t(norm)=0.145359, mflops=34.3977 (err=2.5e-15) 25. QFT: elapsed time t=1.46165 s, 64 iters, t-(init.)=1.40917 s t(norm)=0.447962, mflops=11.1616 (err=3.1e-15) 26. Ransom: elapsed time t=1.21603 s, 128 iters, t-(init.)=1.11239 s t(norm)=0.176809, mflops=28.2791 (err=3.1e-15) 27. Singleton (f2c): elapsed time t=1.37893 s, 128 iters, t-(init.)=1.27526 s t(norm)=0.202698, mflops=24.6673 (err=3.8e-15) 28. Temperton (f2c): elapsed time t=1.19927 s, 64 iters, t-(init.)=1.14708 s t(norm)=0.364647, mflops=13.7119 (err=2.6e-15) 29. Valkenburg: elapsed time t=1.63102 s, 32 iters, t-(init.)=1.60475 s t(norm)=1.02027, mflops=4.90065 (err=2.5e-15) Top mflops for N=4096 = 43.5245 Normalized results and averages for N=4096: fft 0: mflops = 11.5962 (norm. = 0.26643), norm. avg. (of 12) = 0.399406 fft 1: mflops = 11.4538 (norm. = 0.263158), norm. avg. (of 12) = 0.38409 fft 2: mflops = 8.96 (norm. = 0.205861), norm. avg. (of 12) = 0.261057 fft 3: mflops = 14.2594 (norm. = 0.327618), norm. avg. (of 12) = 0.123667 fft 4: mflops = 7.31921 (norm. = 0.168163), norm. avg. (of 12) = 0.0841674 fft 5: mflops = 27.1239 (norm. = 0.623186), norm. avg. (of 12) = 0.405427 fft 6: mflops = 36.218 (norm. = 0.832129), norm. avg. 
(of 12) = 0.463737 fft 7: mflops = 42.0813 (norm. = 0.966841), norm. avg. (of 12) = 0.51121 fft 8: mflops = 8.42313 (norm. = 0.193526), norm. avg. (of 11) = 0.186486 fft 9: mflops = 14.9307 (norm. = 0.343042), norm. avg. (of 12) = 0.202425 fft 10: mflops = 43.5245 (norm. = 1), norm. avg. (of 12) = 0.812939 fft 11: mflops = 42.2849 (norm. = 0.971519), norm. avg. (of 12) = 0.779481 fft 12: mflops = 25.1734 (norm. = 0.578373), norm. avg. (of 12) = 0.768064 fft 13: mflops = 32.4615 (norm. = 0.74582), norm. avg. (of 10) = 0.744916 fft 14: mflops = 22.1158 (norm. = 0.508122), norm. avg. (of 12) = 0.326314 fft 15: mflops = 12.4953 (norm. = 0.287086), norm. avg. (of 12) = 0.24275 fft 16: mflops = 12.5434 (norm. = 0.288191), norm. avg. (of 12) = 0.252512 fft 17: mflops = 19.5348 (norm. = 0.448822), norm. avg. (of 12) = 0.397553 fft 18: mflops = 14.4611 (norm. = 0.332253), norm. avg. (of 11) = 0.370229 fft 19: mflops = 15.3526 (norm. = 0.352734), norm. avg. (of 11) = 0.471045 fft 20: mflops = 14.5722 (norm. = 0.334804), norm. avg. (of 11) = 0.430809 fft 21: mflops = 8.67936 (norm. = 0.199413), norm. avg. (of 12) = 0.0996658 fft 22: mflops = 12.5604 (norm. = 0.288583), norm. avg. (of 12) = 0.153828 fft 23: mflops = 12.9658 (norm. = 0.297896), norm. avg. (of 12) = 0.265956 fft 24: mflops = 34.3977 (norm. = 0.790305), norm. avg. (of 12) = 0.644621 fft 25: mflops = 11.1616 (norm. = 0.256445), norm. avg. (of 9) = 0.242274 fft 26: mflops = 28.2791 (norm. = 0.649727), norm. avg. (of 11) = 0.216041 fft 27: mflops = 24.6673 (norm. = 0.566744), norm. avg. (of 12) = 0.423548 fft 28: mflops = 13.7119 (norm. = 0.315039), norm. avg. (of 12) = 0.140537 fft 29: mflops = 4.90065 (norm. = 0.112595), norm. avg. (of 12) = 0.0704119 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.49915 s, 32 iters, t-(init.)=1.44637 s t(norm)=0.424419, mflops=11.7808 (err=3.0e-15) 1. 
Arndt DIT: elapsed time t=1.52645 s, 32 iters, t-(init.)=1.47393 s t(norm)=0.432509, mflops=11.5605 (err=3.0e-15) 2. Arndt Split-Radix: elapsed time t=1.98293 s, 32 iters, t-(init.)=1.93057 s t(norm)=0.566502, mflops=8.82609 (err=3.0e-15) 3. Arndt 4-step: elapsed time t=1.47411 s, 32 iters, t-(init.)=1.42171 s t(norm)=0.417184, mflops=11.9851 (err=2.9e-15) 4. Beauregard: elapsed time t=1.19832 s, 16 iters, t-(init.)=1.17222 s t(norm)=0.687947, mflops=7.268 (err=2.9e-15) 5. Bergland: elapsed time t=1.48872 s, 64 iters, t-(init.)=1.38354 s t(norm)=0.202992, mflops=24.6315 (err=2.9e-15) 6. CWP (min N) (N=8580): elapsed time t=1.98604 s, 128 iters, t-(init.)=1.76659 s t(norm)=0.129596, mflops=38.5814 7. CWP (best N) (N=9240): elapsed time t=1.9833 s, 128 iters, t-(init.)=1.7467 s t(norm)=0.128137, mflops=39.0207 8. Edelblute: elapsed time t=1.04749 s, 16 iters, t-(init.)=1.02105 s t(norm)=0.599231, mflops=8.34403 (err=3.0e-15) 9. FFTPACK (f2c): elapsed time t=1.46652 s, 32 iters, t-(init.)=1.41337 s t(norm)=0.414738, mflops=12.0558 (err=2.9e-15) FFTW_MEASURE plan: (cost = 1.513775e-02) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_NOTW 64 10. FFTW: elapsed time t=1.00955 s, 64 iters, t-(init.)=0.903913 s t(norm)=0.132621, mflops=37.7013 (err=2.9e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.06473 s, 64 iters, t-(init.)=0.959899 s t(norm)=0.140836, mflops=35.5024 (err=2.9e-15) 12. Frigo-old: elapsed time t=1.47538 s, 64 iters, t-(init.)=1.37042 s t(norm)=0.201067, mflops=24.8673 (err=2.9e-15) 13. Green: elapsed time t=1.2394 s, 64 iters, t-(init.)=1.13427 s t(norm)=0.16642, mflops=30.0445 (err=2.9e-15) 14. GSL: elapsed time t=1.06432 s, 32 iters, t-(init.)=1.01141 s t(norm)=0.296788, mflops=16.8471 (err=2.9e-15) 15. GSL DIT: elapsed time t=1.42817 s, 32 iters, t-(init.)=1.37574 s t(norm)=0.403694, mflops=12.3856 (err=3.6e-15) 16. 
GSL DIF: elapsed time t=1.42266 s, 32 iters, t-(init.)=1.37036 s t(norm)=0.402117, mflops=12.4342 (err=3.6e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.28056 s, 32 iters, t-(init.)=1.22825 s t(norm)=0.360416, mflops=13.8729 (err=2.9e-15) 19. Mayer (simple): elapsed time t=1.21895 s, 32 iters, t-(init.)=1.16656 s t(norm)=0.342312, mflops=14.6065 20. Mayer (lookup): elapsed time t=1.26975 s, 32 iters, t-(init.)=1.21709 s t(norm)=0.357141, mflops=14.0001 (err=3.0e-15) 21. NAPACK (f2c): elapsed time t=1.21505 s, 16 iters, t-(init.)=1.18788 s t(norm)=0.697138, mflops=7.17218 (err=4.3e-14) 22. Nielsen: elapsed time t=1.58997 s, 32 iters, t-(init.)=1.53674 s t(norm)=0.450937, mflops=11.088 (err=1.1e-14) 23. NR (C): elapsed time t=1.37701 s, 32 iters, t-(init.)=1.32446 s t(norm)=0.388647, mflops=12.8652 (err=3.0e-15) 24. Ooura (C): elapsed time t=1.14018 s, 64 iters, t-(init.)=1.03514 s t(norm)=0.151874, mflops=32.922 (err=2.9e-15) 25. QFT: elapsed time t=1.75602 s, 32 iters, t-(init.)=1.70254 s t(norm)=0.499591, mflops=10.0082 (err=4.0e-15) 26. Ransom: elapsed time t=1.57836 s, 64 iters, t-(init.)=1.47334 s t(norm)=0.216167, mflops=23.1302 (err=4.1e-15) 27. Singleton (f2c): elapsed time t=1.59821 s, 64 iters, t-(init.)=1.49366 s t(norm)=0.219149, mflops=22.8156 (err=4.4e-15) 28. Temperton (f2c): elapsed time t=1.42789 s, 32 iters, t-(init.)=1.37539 s t(norm)=0.403594, mflops=12.3887 (err=2.9e-15) 29. Valkenburg: elapsed time t=1.83979 s, 16 iters, t-(init.)=1.8132 s t(norm)=1.06412, mflops=4.69871 (err=2.9e-15) Top mflops for N=8192 = 39.0207 Normalized results and averages for N=8192: fft 0: mflops = 11.7808 (norm. = 0.301912), norm. avg. (of 13) = 0.391907 fft 1: mflops = 11.5605 (norm. = 0.296265), norm. avg. (of 13) = 0.377334 fft 2: mflops = 8.82609 (norm. = 0.22619), norm. avg. (of 13) = 0.258375 fft 3: mflops = 11.9851 (norm. = 0.307148), norm. avg. (of 13) = 0.137781 fft 4: mflops = 7.268 (norm. = 0.18626), norm. avg. 
(of 13) = 0.0920207 fft 5: mflops = 24.6315 (norm. = 0.631242), norm. avg. (of 13) = 0.422798 fft 6: mflops = 38.5814 (norm. = 0.988742), norm. avg. (of 13) = 0.504122 fft 7: mflops = 39.0207 (norm. = 1), norm. avg. (of 13) = 0.548809 fft 8: mflops = 8.34403 (norm. = 0.213836), norm. avg. (of 12) = 0.188765 fft 9: mflops = 12.0558 (norm. = 0.308959), norm. avg. (of 13) = 0.21062 fft 10: mflops = 37.7013 (norm. = 0.966189), norm. avg. (of 13) = 0.824728 fft 11: mflops = 35.5024 (norm. = 0.909836), norm. avg. (of 13) = 0.789509 fft 12: mflops = 24.8673 (norm. = 0.637286), norm. avg. (of 13) = 0.758004 fft 13: mflops = 30.0445 (norm. = 0.769964), norm. avg. (of 11) = 0.747193 fft 14: mflops = 16.8471 (norm. = 0.431747), norm. avg. (of 13) = 0.334424 fft 15: mflops = 12.3856 (norm. = 0.317411), norm. avg. (of 13) = 0.248493 fft 16: mflops = 12.4342 (norm. = 0.318656), norm. avg. (of 13) = 0.2576 fft 17: mflops = -1 (norm. = -0.0256274), norm. avg. (of 12) = 0.397553 fft 18: mflops = 13.8729 (norm. = 0.355526), norm. avg. (of 12) = 0.369003 fft 19: mflops = 14.6065 (norm. = 0.374328), norm. avg. (of 12) = 0.462985 fft 20: mflops = 14.0001 (norm. = 0.358786), norm. avg. (of 12) = 0.424807 fft 21: mflops = 7.17218 (norm. = 0.183805), norm. avg. (of 13) = 0.106138 fft 22: mflops = 11.088 (norm. = 0.284157), norm. avg. (of 13) = 0.163854 fft 23: mflops = 12.8652 (norm. = 0.329701), norm. avg. (of 13) = 0.270859 fft 24: mflops = 32.922 (norm. = 0.843706), norm. avg. (of 13) = 0.659936 fft 25: mflops = 10.0082 (norm. = 0.256484), norm. avg. (of 10) = 0.243695 fft 26: mflops = 23.1302 (norm. = 0.592769), norm. avg. (of 12) = 0.247435 fft 27: mflops = 22.8156 (norm. = 0.584705), norm. avg. (of 13) = 0.435944 fft 28: mflops = 12.3887 (norm. = 0.317491), norm. avg. (of 13) = 0.154149 fft 29: mflops = 4.69871 (norm. = 0.120416), norm. avg. (of 13) = 0.0742583 Benchmarking for array size = 16384 (power of 2): 0. 
Arndt DIF: elapsed time t=1.74989 s, 16 iters, t-(init.)=1.69573 s t(norm)=0.462049, mflops=10.8214 (err=5.6e-15) 1. Arndt DIT: elapsed time t=1.78378 s, 16 iters, t-(init.)=1.72993 s t(norm)=0.471368, mflops=10.6074 (err=5.6e-15) 2. Arndt Split-Radix: elapsed time t=1.10726 s, 8 iters, t-(init.)=1.07996 s t(norm)=0.588533, mflops=8.49571 (err=5.6e-15) 3. Arndt 4-step: elapsed time t=1.24212 s, 16 iters, t-(init.)=1.18839 s t(norm)=0.323811, mflops=15.4411 (err=5.6e-15) 4. Beauregard: elapsed time t=1.31716 s, 8 iters, t-(init.)=1.29022 s t(norm)=0.703113, mflops=7.11123 (err=5.7e-15) 5. Bergland: elapsed time t=1.61646 s, 32 iters, t-(init.)=1.50852 s t(norm)=0.20552, mflops=24.3285 (err=5.7e-15) 6. CWP (min N) (N=17160): elapsed time t=1.08732 s, 32 iters, t-(init.)=0.968587 s t(norm)=0.13196, mflops=37.8904 7. CWP (best N) (N=17160): elapsed time t=1.08742 s, 32 iters, t-(init.)=0.96864 s t(norm)=0.131967, mflops=37.8883 8. Edelblute: elapsed time t=1.16294 s, 8 iters, t-(init.)=1.13606 s t(norm)=0.619106, mflops=8.07616 (err=5.6e-15) 9. FFTPACK (f2c): elapsed time t=1.86689 s, 16 iters, t-(init.)=1.8119 s t(norm)=0.493704, mflops=10.1275 (err=5.7e-15) FFTW_MEASURE plan: (cost = 3.725100e-02) FFTW_TWIDDLE 64 FFTW_TWIDDLE 8 FFTW_NOTW 32 10. FFTW: elapsed time t=1.1925 s, 32 iters, t-(init.)=1.08365 s t(norm)=0.147636, mflops=33.8671 (err=5.7e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.47396 s, 32 iters, t-(init.)=1.36472 s t(norm)=0.185929, mflops=26.892 (err=5.7e-15) 12. Frigo-old: elapsed time t=1.0509 s, 16 iters, t-(init.)=0.995538 s t(norm)=0.271263, mflops=18.4323 (err=5.7e-15) 13. Green: elapsed time t=1.55703 s, 32 iters, t-(init.)=1.44891 s t(norm)=0.197398, mflops=25.3295 (err=5.7e-15) 14. GSL: elapsed time t=1.25179 s, 16 iters, t-(init.)=1.19628 s t(norm)=0.325962, mflops=15.3392 (err=5.7e-15) 15. 
GSL DIT: elapsed time t=1.57817 s, 16 iters, t-(init.)=1.52425 s t(norm)=0.415325, mflops=12.0388 (err=6.3e-15) 16. GSL DIF: elapsed time t=1.57389 s, 16 iters, t-(init.)=1.51993 s t(norm)=0.414149, mflops=12.0729 (err=6.4e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.41081 s, 16 iters, t-(init.)=1.35677 s t(norm)=0.36969, mflops=13.5248 (err=5.6e-15) 19. Mayer (simple): elapsed time t=1.34766 s, 16 iters, t-(init.)=1.29396 s t(norm)=0.352576, mflops=14.1813 20. Mayer (lookup): elapsed time t=1.44717 s, 16 iters, t-(init.)=1.39289 s t(norm)=0.379533, mflops=13.1741 (err=5.6e-15) 21. NAPACK (f2c): elapsed time t=1.46439 s, 8 iters, t-(init.)=1.43585 s t(norm)=0.782478, mflops=6.38996 (err=2.3e-13) 22. Nielsen: elapsed time t=1.85893 s, 16 iters, t-(init.)=1.80356 s t(norm)=0.491431, mflops=10.1744 (err=1.3e-13) 23. NR (C): elapsed time t=1.52039 s, 16 iters, t-(init.)=1.46625 s t(norm)=0.399522, mflops=12.515 (err=5.6e-15) 24. Ooura (C): elapsed time t=1.2209 s, 32 iters, t-(init.)=1.11337 s t(norm)=0.151684, mflops=32.9632 (err=5.7e-15) 25. QFT: elapsed time t=1.0817 s, 8 iters, t-(init.)=1.05285 s t(norm)=0.573757, mflops=8.7145 (err=7.0e-15) 26. Ransom: elapsed time t=1.30118 s, 32 iters, t-(init.)=1.19321 s t(norm)=0.162562, mflops=30.7576 (err=6.0e-15) 27. Singleton (f2c): elapsed time t=1.6468 s, 32 iters, t-(init.)=1.53891 s t(norm)=0.20966, mflops=23.8481 (err=8.5e-15) 28. Temperton (f2c): elapsed time t=1.53887 s, 16 iters, t-(init.)=1.48512 s t(norm)=0.404662, mflops=12.356 (err=5.7e-15) 29. Valkenburg: elapsed time t=1.1209 s, 4 iters, t-(init.)=1.10673 s t(norm)=1.20624, mflops=4.14512 (err=5.7e-15) Top mflops for N=16384 = 37.8904 Normalized results and averages for N=16384: fft 0: mflops = 10.8214 (norm. = 0.285596), norm. avg. (of 14) = 0.384313 fft 1: mflops = 10.6074 (norm. = 0.27995), norm. avg. (of 14) = 0.370378 fft 2: mflops = 8.49571 (norm. = 0.224218), norm. avg. 
(of 14) = 0.255935 fft 3: mflops = 15.4411 (norm. = 0.40752), norm. avg. (of 14) = 0.157048 fft 4: mflops = 7.11123 (norm. = 0.187679), norm. avg. (of 14) = 0.0988535 fft 5: mflops = 24.3285 (norm. = 0.642076), norm. avg. (of 14) = 0.438461 fft 6: mflops = 37.8904 (norm. = 1), norm. avg. (of 14) = 0.539542 fft 7: mflops = 37.8883 (norm. = 0.999945), norm. avg. (of 14) = 0.581033 fft 8: mflops = 8.07616 (norm. = 0.213145), norm. avg. (of 13) = 0.190641 fft 9: mflops = 10.1275 (norm. = 0.267284), norm. avg. (of 14) = 0.214667 fft 10: mflops = 33.8671 (norm. = 0.893818), norm. avg. (of 14) = 0.829663 fft 11: mflops = 26.892 (norm. = 0.709731), norm. avg. (of 14) = 0.78381 fft 12: mflops = 18.4323 (norm. = 0.486464), norm. avg. (of 14) = 0.738608 fft 13: mflops = 25.3295 (norm. = 0.668493), norm. avg. (of 12) = 0.740635 fft 14: mflops = 15.3392 (norm. = 0.404832), norm. avg. (of 14) = 0.339453 fft 15: mflops = 12.0388 (norm. = 0.317726), norm. avg. (of 14) = 0.253439 fft 16: mflops = 12.0729 (norm. = 0.318628), norm. avg. (of 14) = 0.261959 fft 17: mflops = -1 (norm. = -0.0263919), norm. avg. (of 12) = 0.397553 fft 18: mflops = 13.5248 (norm. = 0.356946), norm. avg. (of 13) = 0.368076 fft 19: mflops = 14.1813 (norm. = 0.374272), norm. avg. (of 13) = 0.456161 fft 20: mflops = 13.1741 (norm. = 0.347689), norm. avg. (of 13) = 0.418875 fft 21: mflops = 6.38996 (norm. = 0.168643), norm. avg. (of 14) = 0.110603 fft 22: mflops = 10.1744 (norm. = 0.268521), norm. avg. (of 14) = 0.17133 fft 23: mflops = 12.515 (norm. = 0.330293), norm. avg. (of 14) = 0.275105 fft 24: mflops = 32.9632 (norm. = 0.869961), norm. avg. (of 14) = 0.674937 fft 25: mflops = 8.7145 (norm. = 0.229992), norm. avg. (of 11) = 0.242449 fft 26: mflops = 30.7576 (norm. = 0.811751), norm. avg. (of 13) = 0.290844 fft 27: mflops = 23.8481 (norm. = 0.629398), norm. avg. (of 14) = 0.449763 fft 28: mflops = 12.356 (norm. = 0.326098), norm. avg. (of 14) = 0.166431 fft 29: mflops = 4.14512 (norm. = 0.109398), norm. 
avg. (of 14) = 0.0767683 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.43689 s, 4 iters, t-(init.)=1.39494 s t(norm)=0.709502, mflops=7.0472 (err=5.2e-15) 1. Arndt DIT: elapsed time t=1.56753 s, 4 iters, t-(init.)=1.52574 s t(norm)=0.776033, mflops=6.44303 (err=5.2e-15) 2. Arndt Split-Radix: elapsed time t=1.89786 s, 4 iters, t-(init.)=1.85622 s t(norm)=0.944122, mflops=5.29592 (err=5.2e-15) 3. Arndt 4-step: elapsed time t=1.8873 s, 8 iters, t-(init.)=1.80366 s t(norm)=0.458695, mflops=10.9005 (err=5.2e-15) 4. Beauregard: elapsed time t=1.63195 s, 4 iters, t-(init.)=1.59037 s t(norm)=0.808904, mflops=6.1812 (err=5.2e-15) 5. Bergland: elapsed time t=1.25765 s, 8 iters, t-(init.)=1.17388 s t(norm)=0.298534, mflops=16.7485 (err=5.2e-15) 6. CWP (min N) (N=34320): elapsed time t=1.35743 s, 16 iters, t-(init.)=1.18234 s t(norm)=0.150342, mflops=33.2576 7. CWP (best N) (N=34320): elapsed time t=1.35795 s, 16 iters, t-(init.)=1.18291 s t(norm)=0.150415, mflops=33.2413 8. Edelblute: elapsed time t=1.95456 s, 4 iters, t-(init.)=1.91307 s t(norm)=0.973038, mflops=5.13854 (err=5.2e-15) 9. FFTPACK (f2c): elapsed time t=1.2137 s, 4 iters, t-(init.)=1.1718 s t(norm)=0.596011, mflops=8.38911 (err=5.2e-15) FFTW_MEASURE plan: (cost = 1.137680e-01) FFTW_TWIDDLE 2 FFTW_TWIDDLE 64 FFTW_TWIDDLE 8 FFTW_NOTW 32 10. FFTW: elapsed time t=1.69991 s, 16 iters, t-(init.)=1.53275 s t(norm)=0.194899, mflops=25.6543 (err=5.2e-15) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.03512 s, 8 iters, t-(init.)=0.951376 s t(norm)=0.241947, mflops=20.6656 (err=5.2e-15) 12. Frigo-old: elapsed time t=1.39841 s, 8 iters, t-(init.)=1.3147 s t(norm)=0.334346, mflops=14.9546 (err=5.2e-15) 13. Green: elapsed time t=1.16364 s, 8 iters, t-(init.)=1.08091 s t(norm)=0.274889, mflops=18.1891 (err=5.2e-15) 14. 
GSL: elapsed time t=1.63432 s, 8 iters, t-(init.)=1.55058 s t(norm)=0.394334, mflops=12.6796 (err=5.2e-15) 15. GSL DIT: elapsed time t=1.27647 s, 4 iters, t-(init.)=1.23621 s t(norm)=0.628767, mflops=7.95207 (err=5.9e-15) 16. GSL DIF: elapsed time t=1.29079 s, 4 iters, t-(init.)=1.24881 s t(norm)=0.63518, mflops=7.87179 (err=6.0e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.62439 s, 8 iters, t-(init.)=1.54215 s t(norm)=0.392188, mflops=12.749 (err=5.2e-15) 19. Mayer (simple): elapsed time t=1.54461 s, 8 iters, t-(init.)=1.4626 s t(norm)=0.37196, mflops=13.4423 20. Mayer (lookup): elapsed time t=1.68717 s, 8 iters, t-(init.)=1.60516 s t(norm)=0.408213, mflops=12.2485 (err=5.2e-15) 21. NAPACK (f2c): elapsed time t=1.74826 s, 4 iters, t-(init.)=1.70627 s t(norm)=0.867856, mflops=5.76132 (err=5.6e-13) 22. Nielsen: elapsed time t=1.29703 s, 4 iters, t-(init.)=1.25507 s t(norm)=0.638364, mflops=7.83253 (err=2.3e-13) 23. NR (C): elapsed time t=1.248 s, 4 iters, t-(init.)=1.20757 s t(norm)=0.614202, mflops=8.14064 (err=5.3e-15) 24. Ooura (C): elapsed time t=1.91229 s, 16 iters, t-(init.)=1.74474 s t(norm)=0.221855, mflops=22.5373 (err=5.2e-15) 25. QFT: elapsed time t=1.24891 s, 4 iters, t-(init.)=1.20682 s t(norm)=0.613822, mflops=8.14568 (err=7.5e-15) 26. Ransom: elapsed time t=1.13938 s, 8 iters, t-(init.)=1.05561 s t(norm)=0.268456, mflops=18.625 (err=6.4e-15) 27. Singleton (f2c): elapsed time t=1.76091 s, 8 iters, t-(init.)=1.67701 s t(norm)=0.426487, mflops=11.7237 (err=7.2e-15) 28. Temperton (f2c): elapsed time t=1.16973 s, 4 iters, t-(init.)=1.12811 s t(norm)=0.573784, mflops=8.71408 (err=5.2e-15) 29. Valkenburg: elapsed time t=1.4683 s, 2 iters, t-(init.)=1.44757 s t(norm)=1.47255, mflops=3.39548 (err=5.2e-15) Top mflops for N=32768 = 33.2576 Normalized results and averages for N=32768: fft 0: mflops = 7.0472 (norm. = 0.211898), norm. avg. (of 15) = 0.372819 fft 1: mflops = 6.44303 (norm. = 0.193731), norm. avg. 
(of 15) = 0.358602 fft 2: mflops = 5.29592 (norm. = 0.15924), norm. avg. (of 15) = 0.249489 fft 3: mflops = 10.9005 (norm. = 0.32776), norm. avg. (of 15) = 0.168429 fft 4: mflops = 6.1812 (norm. = 0.185859), norm. avg. (of 15) = 0.104654 fft 5: mflops = 16.7485 (norm. = 0.5036), norm. avg. (of 15) = 0.442803 fft 6: mflops = 33.2576 (norm. = 1), norm. avg. (of 15) = 0.570239 fft 7: mflops = 33.2413 (norm. = 0.999512), norm. avg. (of 15) = 0.608932 fft 8: mflops = 5.13854 (norm. = 0.154508), norm. avg. (of 14) = 0.18806 fft 9: mflops = 8.38911 (norm. = 0.252247), norm. avg. (of 15) = 0.217172 fft 10: mflops = 25.6543 (norm. = 0.771384), norm. avg. (of 15) = 0.825778 fft 11: mflops = 20.6656 (norm. = 0.621382), norm. avg. (of 15) = 0.772982 fft 12: mflops = 14.9546 (norm. = 0.44966), norm. avg. (of 15) = 0.719345 fft 13: mflops = 18.1891 (norm. = 0.546917), norm. avg. (of 13) = 0.725733 fft 14: mflops = 12.6796 (norm. = 0.381255), norm. avg. (of 15) = 0.34224 fft 15: mflops = 7.95207 (norm. = 0.239106), norm. avg. (of 15) = 0.252483 fft 16: mflops = 7.87179 (norm. = 0.236692), norm. avg. (of 15) = 0.260275 fft 17: mflops = -1 (norm. = -0.0300684), norm. avg. (of 12) = 0.397553 fft 18: mflops = 12.749 (norm. = 0.383341), norm. avg. (of 14) = 0.369166 fft 19: mflops = 13.4423 (norm. = 0.404188), norm. avg. (of 14) = 0.452449 fft 20: mflops = 12.2485 (norm. = 0.368293), norm. avg. (of 14) = 0.415262 fft 21: mflops = 5.76132 (norm. = 0.173234), norm. avg. (of 15) = 0.114778 fft 22: mflops = 7.83253 (norm. = 0.235511), norm. avg. (of 15) = 0.175609 fft 23: mflops = 8.14064 (norm. = 0.244776), norm. avg. (of 15) = 0.273083 fft 24: mflops = 22.5373 (norm. = 0.677659), norm. avg. (of 15) = 0.675119 fft 25: mflops = 8.14568 (norm. = 0.244927), norm. avg. (of 12) = 0.242656 fft 26: mflops = 18.625 (norm. = 0.560024), norm. avg. (of 14) = 0.310071 fft 27: mflops = 11.7237 (norm. = 0.352512), norm. avg. (of 15) = 0.443279 fft 28: mflops = 8.71408 (norm. = 0.262018), norm. avg. 
(of 15) = 0.172803 fft 29: mflops = 3.39548 (norm. = 0.102096), norm. avg. (of 15) = 0.0784568 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.74335 s, 2 iters, t-(init.)=1.70166 s t(norm)=0.811414, mflops=6.16208 (err=1.6e-14) 1. Arndt DIT: elapsed time t=1.78892 s, 2 iters, t-(init.)=1.74735 s t(norm)=0.8332, mflops=6.00096 (err=1.6e-14) 2. Arndt Split-Radix: elapsed time t=1.06668 s, 1 iters, t-(init.)=1.04593 s t(norm)=0.997477, mflops=5.01265 (err=1.6e-14) 3. Arndt 4-step: elapsed time t=1.85376 s, 4 iters, t-(init.)=1.77 s t(norm)=0.422, mflops=11.8483 (err=1.6e-14) 4. Beauregard: elapsed time t=1.7487 s, 2 iters, t-(init.)=1.70669 s t(norm)=0.813812, mflops=6.14393 (err=1.6e-14) 5. Bergland: elapsed time t=1.48568 s, 4 iters, t-(init.)=1.4019 s t(norm)=0.334239, mflops=14.9593 (err=1.6e-14) 6. CWP (min N) (N=72072): elapsed time t=1.57295 s, 8 iters, t-(init.)=1.38905 s t(norm)=0.165588, mflops=30.1955 7. CWP (best N) (N=72072): elapsed time t=1.57371 s, 8 iters, t-(init.)=1.38981 s t(norm)=0.165678, mflops=30.179 8. Edelblute: elapsed time t=1.09483 s, 1 iters, t-(init.)=1.0739 s t(norm)=1.02415, mflops=4.88208 (err=1.6e-14) 9. FFTPACK (f2c): elapsed time t=1.43275 s, 2 iters, t-(init.)=1.39073 s t(norm)=0.663154, mflops=7.53973 (err=1.6e-14) FFTW_MEASURE plan: (cost = 2.637200e-01) FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_TWIDDLE 8 FFTW_NOTW 32 10. FFTW: elapsed time t=1.0246 s, 4 iters, t-(init.)=0.941062 s t(norm)=0.224367, mflops=22.285 (err=1.6e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.16061 s, 4 iters, t-(init.)=1.07708 s t(norm)=0.256796, mflops=19.4707 (err=1.6e-14) 12. Frigo-old: elapsed time t=1.59571 s, 4 iters, t-(init.)=1.51189 s t(norm)=0.360464, mflops=13.871 (err=1.6e-14) 13. Green: elapsed time t=1.37974 s, 4 iters, t-(init.)=1.29609 s t(norm)=0.309012, mflops=16.1806 (err=1.6e-14) 14. 
GSL: elapsed time t=1.75107 s, 4 iters, t-(init.)=1.66751 s t(norm)=0.397566, mflops=12.5765 (err=1.6e-14) 15. GSL DIT: elapsed time t=1.39972 s, 2 iters, t-(init.)=1.35771 s t(norm)=0.647405, mflops=7.72314 (err=1.7e-14) 16. GSL DIF: elapsed time t=1.40363 s, 2 iters, t-(init.)=1.36163 s t(norm)=0.649275, mflops=7.7009 (err=1.8e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.32805 s, 2 iters, t-(init.)=1.28712 s t(norm)=0.613749, mflops=8.14666 (err=1.6e-14) 19. Mayer (simple): elapsed time t=1.29432 s, 2 iters, t-(init.)=1.25302 s t(norm)=0.597485, mflops=8.36841 20. Mayer (lookup): elapsed time t=1.3574 s, 2 iters, t-(init.)=1.31636 s t(norm)=0.627689, mflops=7.96573 (err=1.6e-14) 21. NAPACK (f2c): elapsed time t=1.79956 s, 2 iters, t-(init.)=1.75763 s t(norm)=0.838102, mflops=5.96586 (err=8.7e-13) 22. Nielsen: elapsed time t=1.6565 s, 2 iters, t-(init.)=1.61484 s t(norm)=0.770016, mflops=6.49337 (err=2.6e-13) 23. NR (C): elapsed time t=1.3713 s, 2 iters, t-(init.)=1.32954 s t(norm)=0.633976, mflops=7.88673 (err=1.6e-14) 24. Ooura (C): elapsed time t=1.14596 s, 4 iters, t-(init.)=1.06213 s t(norm)=0.253232, mflops=19.7447 (err=1.6e-14) 25. QFT: elapsed time t=1.57851 s, 2 iters, t-(init.)=1.53639 s t(norm)=0.73261, mflops=6.82492 (err=1.9e-14) 26. Ransom: elapsed time t=1.22301 s, 4 iters, t-(init.)=1.13944 s t(norm)=0.271664, mflops=18.4051 (err=1.7e-14) 27. Singleton (f2c): elapsed time t=1.62583 s, 4 iters, t-(init.)=1.54212 s t(norm)=0.36767, mflops=13.5992 (err=2.4e-14) 28. Temperton (f2c): elapsed time t=1.25978 s, 2 iters, t-(init.)=1.2181 s t(norm)=0.580833, mflops=8.60832 (err=1.6e-14) 29. Valkenburg: elapsed time t=1.67759 s, 1 iters, t-(init.)=1.65651 s t(norm)=1.57977, mflops=3.16502 (err=1.6e-14) Top mflops for N=65536 = 30.1955 Normalized results and averages for N=65536: fft 0: mflops = 6.16208 (norm. = 0.204073), norm. avg. (of 16) = 0.362272 fft 1: mflops = 6.00096 (norm. = 0.198737), norm. avg. 
(of 16) = 0.34861 fft 2: mflops = 5.01265 (norm. = 0.166006), norm. avg. (of 16) = 0.244271 fft 3: mflops = 11.8483 (norm. = 0.392387), norm. avg. (of 16) = 0.182426 fft 4: mflops = 6.14393 (norm. = 0.203472), norm. avg. (of 16) = 0.11083 fft 5: mflops = 14.9593 (norm. = 0.495416), norm. avg. (of 16) = 0.446092 fft 6: mflops = 30.1955 (norm. = 1), norm. avg. (of 16) = 0.597099 fft 7: mflops = 30.179 (norm. = 0.999452), norm. avg. (of 16) = 0.633339 fft 8: mflops = 4.88208 (norm. = 0.161682), norm. avg. (of 15) = 0.186301 fft 9: mflops = 7.53973 (norm. = 0.249697), norm. avg. (of 16) = 0.219205 fft 10: mflops = 22.285 (norm. = 0.738022), norm. avg. (of 16) = 0.820293 fft 11: mflops = 19.4707 (norm. = 0.64482), norm. avg. (of 16) = 0.764972 fft 12: mflops = 13.871 (norm. = 0.459374), norm. avg. (of 16) = 0.703097 fft 13: mflops = 16.1806 (norm. = 0.535861), norm. avg. (of 14) = 0.712171 fft 14: mflops = 12.5765 (norm. = 0.416503), norm. avg. (of 16) = 0.346881 fft 15: mflops = 7.72314 (norm. = 0.255771), norm. avg. (of 16) = 0.252688 fft 16: mflops = 7.7009 (norm. = 0.255035), norm. avg. (of 16) = 0.259947 fft 17: mflops = -1 (norm. = -0.0331175), norm. avg. (of 12) = 0.397553 fft 18: mflops = 8.14666 (norm. = 0.269797), norm. avg. (of 15) = 0.362542 fft 19: mflops = 8.36841 (norm. = 0.277141), norm. avg. (of 15) = 0.440761 fft 20: mflops = 7.96573 (norm. = 0.263805), norm. avg. (of 15) = 0.405165 fft 21: mflops = 5.96586 (norm. = 0.197574), norm. avg. (of 16) = 0.119953 fft 22: mflops = 6.49337 (norm. = 0.215044), norm. avg. (of 16) = 0.178073 fft 23: mflops = 7.88673 (norm. = 0.261189), norm. avg. (of 16) = 0.272339 fft 24: mflops = 19.7447 (norm. = 0.653896), norm. avg. (of 16) = 0.673792 fft 25: mflops = 6.82492 (norm. = 0.226024), norm. avg. (of 13) = 0.241377 fft 26: mflops = 18.4051 (norm. = 0.60953), norm. avg. (of 15) = 0.330035 fft 27: mflops = 13.5992 (norm. = 0.45037), norm. avg. (of 16) = 0.443722 fft 28: mflops = 8.60832 (norm. = 0.285086), norm. avg. 
(of 16) = 0.179821 fft 29: mflops = 3.16502 (norm. = 0.104818), norm. avg. (of 16) = 0.0801044 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=2.2617 s, 1 iters, t-(init.)=2.21978 s t(norm)=0.99621, mflops=5.01902 (err=3.9e-14) 1. Arndt DIT: elapsed time t=2.34321 s, 1 iters, t-(init.)=2.3012 s t(norm)=1.03275, mflops=4.84143 (err=3.9e-14) 2. Arndt Split-Radix: elapsed time t=2.79306 s, 1 iters, t-(init.)=2.75104 s t(norm)=1.23463, mflops=4.04979 (err=3.9e-14) 3. Arndt 4-step: elapsed time t=1.31479 s, 1 iters, t-(init.)=1.27287 s t(norm)=0.571251, mflops=8.75272 (err=3.9e-14) 4. Beauregard: elapsed time t=1.87005 s, 1 iters, t-(init.)=1.82808 s t(norm)=0.820422, mflops=6.09442 (err=3.8e-14) 5. Bergland: elapsed time t=1.86666 s, 2 iters, t-(init.)=1.78312 s t(norm)=0.400122, mflops=12.4962 (err=3.9e-14) 6. CWP (min N) (N=144144): elapsed time t=1.62585 s, 4 iters, t-(init.)=1.44219 s t(norm)=0.161809, mflops=30.9007 7. CWP (best N) (N=144144): elapsed time t=1.62586 s, 4 iters, t-(init.)=1.44209 s t(norm)=0.161798, mflops=30.9028 8. Edelblute: elapsed time t=2.84945 s, 1 iters, t-(init.)=2.80774 s t(norm)=1.26008, mflops=3.968 (err=3.9e-14) 9. FFTPACK (f2c): elapsed time t=1.62622 s, 1 iters, t-(init.)=1.58439 s t(norm)=0.711054, mflops=7.03181 (err=3.8e-14) FFTW_MEASURE plan: (cost = 5.943430e-01) FFTW_TWIDDLE 2 FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_TWIDDLE 8 FFTW_NOTW 32 10. FFTW: elapsed time t=1.12675 s, 2 iters, t-(init.)=1.04296 s t(norm)=0.234033, mflops=21.3645 (err=3.8e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.34128 s, 2 iters, t-(init.)=1.25756 s t(norm)=0.282188, mflops=17.7187 (err=3.8e-14) 12. Frigo-old: elapsed time t=1.89688 s, 2 iters, t-(init.)=1.81339 s t(norm)=0.406913, mflops=12.2876 (err=3.8e-14) 13. Green: elapsed time t=1.81646 s, 2 iters, t-(init.)=1.73297 s t(norm)=0.388868, mflops=12.8578 (err=3.8e-14) 14. 
GSL: elapsed time t=1.0231 s, 1 iters, t-(init.)=0.981158 s t(norm)=0.440332, mflops=11.3551 (err=3.8e-14) 15. GSL DIT: elapsed time t=1.88819 s, 1 iters, t-(init.)=1.84622 s t(norm)=0.828559, mflops=6.03457 (err=4.0e-14) 16. GSL DIF: elapsed time t=1.88647 s, 1 iters, t-(init.)=1.8445 s t(norm)=0.827789, mflops=6.04018 (err=4.2e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.464 s, 1 iters, t-(init.)=1.42227 s t(norm)=0.638297, mflops=7.83335 (err=3.9e-14) 19. Mayer (simple): elapsed time t=1.41048 s, 1 iters, t-(init.)=1.36842 s t(norm)=0.614131, mflops=8.14159 20. Mayer (lookup): elapsed time t=1.50756 s, 1 iters, t-(init.)=1.46586 s t(norm)=0.657859, mflops=7.60041 (err=3.9e-14) 21. NAPACK (f2c): elapsed time t=1.98233 s, 1 iters, t-(init.)=1.94065 s t(norm)=0.870939, mflops=5.74093 (err=2.0e-12) 22. Nielsen: elapsed time t=2.01229 s, 1 iters, t-(init.)=1.97092 s t(norm)=0.884524, mflops=5.65276 (err=9.2e-13) 23. NR (C): elapsed time t=1.8568 s, 1 iters, t-(init.)=1.81509 s t(norm)=0.814589, mflops=6.13807 (err=3.9e-14) 24. Ooura (C): elapsed time t=1.25064 s, 2 iters, t-(init.)=1.16684 s t(norm)=0.261831, mflops=19.0963 (err=3.9e-14) 25. QFT: elapsed time t=2.20244 s, 1 iters, t-(init.)=2.16085 s t(norm)=0.969765, mflops=5.15589 (err=4.1e-14) 26. Ransom: elapsed time t=1.53071 s, 2 iters, t-(init.)=1.44698 s t(norm)=0.324694, mflops=15.3991 (err=3.9e-14) 27. Singleton (f2c): elapsed time t=1.16168 s, 1 iters, t-(init.)=1.11963 s t(norm)=0.502477, mflops=9.95071 (err=5.7e-14) 28. Temperton (f2c): elapsed time t=1.64457 s, 1 iters, t-(init.)=1.60265 s t(norm)=0.71925, mflops=6.95169 (err=3.8e-14) 29. Valkenburg: elapsed time t=3.78058 s, 1 iters, t-(init.)=3.73891 s t(norm)=1.67798, mflops=2.97978 (err=3.9e-14) Top mflops for N=131072 = 30.9028 Normalized results and averages for N=131072: fft 0: mflops = 5.01902 (norm. = 0.162413), norm. avg. (of 17) = 0.350516 fft 1: mflops = 4.84143 (norm. = 0.156666), norm. avg. 
(of 17) = 0.337319 fft 2: mflops = 4.04979 (norm. = 0.131049), norm. avg. (of 17) = 0.237611 fft 3: mflops = 8.75272 (norm. = 0.283234), norm. avg. (of 17) = 0.188356 fft 4: mflops = 6.09442 (norm. = 0.197213), norm. avg. (of 17) = 0.115911 fft 5: mflops = 12.4962 (norm. = 0.404371), norm. avg. (of 17) = 0.443637 fft 6: mflops = 30.9007 (norm. = 0.999931), norm. avg. (of 17) = 0.620795 fft 7: mflops = 30.9028 (norm. = 1), norm. avg. (of 17) = 0.654907 fft 8: mflops = 3.968 (norm. = 0.128403), norm. avg. (of 16) = 0.182683 fft 9: mflops = 7.03181 (norm. = 0.227546), norm. avg. (of 17) = 0.219696 fft 10: mflops = 21.3645 (norm. = 0.691344), norm. avg. (of 17) = 0.812708 fft 11: mflops = 17.7187 (norm. = 0.573367), norm. avg. (of 17) = 0.753701 fft 12: mflops = 12.2876 (norm. = 0.397622), norm. avg. (of 17) = 0.685128 fft 13: mflops = 12.8578 (norm. = 0.416073), norm. avg. (of 15) = 0.692431 fft 14: mflops = 11.3551 (norm. = 0.367445), norm. avg. (of 17) = 0.348091 fft 15: mflops = 6.03457 (norm. = 0.195276), norm. avg. (of 17) = 0.249311 fft 16: mflops = 6.04018 (norm. = 0.195457), norm. avg. (of 17) = 0.256154 fft 17: mflops = -1 (norm. = -0.0323595), norm. avg. (of 12) = 0.397553 fft 18: mflops = 7.83335 (norm. = 0.253483), norm. avg. (of 16) = 0.355726 fft 19: mflops = 8.14159 (norm. = 0.263458), norm. avg. (of 16) = 0.42968 fft 20: mflops = 7.60041 (norm. = 0.245946), norm. avg. (of 16) = 0.395214 fft 21: mflops = 5.74093 (norm. = 0.185774), norm. avg. (of 17) = 0.123825 fft 22: mflops = 5.65276 (norm. = 0.18292), norm. avg. (of 17) = 0.178359 fft 23: mflops = 6.13807 (norm. = 0.198625), norm. avg. (of 17) = 0.268003 fft 24: mflops = 19.0963 (norm. = 0.617946), norm. avg. (of 17) = 0.670507 fft 25: mflops = 5.15589 (norm. = 0.166842), norm. avg. (of 14) = 0.236053 fft 26: mflops = 15.3991 (norm. = 0.498307), norm. avg. (of 16) = 0.340552 fft 27: mflops = 9.95071 (norm. = 0.322), norm. avg. (of 17) = 0.436562 fft 28: mflops = 6.95169 (norm. = 0.224953), norm. avg. 
(of 17) = 0.182476 fft 29: mflops = 2.97978 (norm. = 0.0964241), norm. avg. (of 17) = 0.0810644 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) Maximum array size = 180180 Benchmarking FFTs: 0. CWP (min N) 1. CWP (best N) 2. FFTPACK (f2c) 3. FFTW 4. FFTW_ESTIMATE 5. Frigo-old 6. GSL 7. NAPACK (f2c) 8. Nielsen 9. Singleton (f2c) 10. Temperton (f2c) 11. Valkenburg Computing normalized averages (12 transforms). Benchmarking for array size = 6: 0. CWP (min N): elapsed time t=1.04366 s, 262144 iters, t-(init.)=0.993708 s t(norm)=0.244407, mflops=20.4577 1. CWP (best N) (N=15): elapsed time t=1.6852 s, 262144 iters, t-(init.)=1.5819 s t(norm)=0.389076, mflops=12.8509 2. FFTPACK (f2c): elapsed time t=1.3324 s, 262144 iters, t-(init.)=1.28257 s t(norm)=0.315453, mflops=15.8502 (err=1.7e-16) FFTW_MEASURE plan: (cost = 1.157761e-06) FFTW_NOTW 6 3. FFTW: elapsed time t=1.26205 s, 1048576 iters, t-(init.)=1.06292 s t(norm)=0.0653578, mflops=76.502 (err=1.3e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 4. FFTW_ESTIMATE: elapsed time t=1.26212 s, 1048576 iters, t-(init.)=1.06302 s t(norm)=0.0653635, mflops=76.4953 (err=1.3e-16) 5. Frigo-old: elapsed time t=1.47769 s, 262144 iters, t-(init.)=1.42781 s t(norm)=0.351177, mflops=14.2378 (err=3.2e-16) 6. GSL: elapsed time t=1.90585 s, 524288 iters, t-(init.)=1.80629 s t(norm)=0.222132, mflops=22.5091 (err=1.3e-16) 7. NAPACK (f2c): elapsed time t=1.62525 s, 131072 iters, t-(init.)=1.60021 s t(norm)=0.787159, mflops=6.35195 (err=2.3e-16) 8. 
Nielsen: elapsed time t=1.00413 s, 65536 iters, t-(init.)=0.991649 s t(norm)=0.975602, mflops=5.12504 (err=2.7e-16) 9. Singleton (f2c): elapsed time t=1.77503 s, 262144 iters, t-(init.)=1.72527 s t(norm)=0.424337, mflops=11.7831 (err=1.3e-16) 10. Temperton (f2c): elapsed time t=1.32075 s, 131072 iters, t-(init.)=1.29571 s t(norm)=0.637371, mflops=7.84473 (err=1.2e-16) 11. Valkenburg: elapsed time t=1.58159 s, 131072 iters, t-(init.)=1.55663 s t(norm)=0.765718, mflops=6.52982 (err=2.1e-16) Top mflops for N=6 = 76.502 Normalized results and averages for N=6: fft 0: mflops = 20.4577 (norm. = 0.267414), norm. avg. (of 1) = 0.267414 fft 1: mflops = 12.8509 (norm. = 0.167982), norm. avg. (of 1) = 0.167982 fft 2: mflops = 15.8502 (norm. = 0.207187), norm. avg. (of 1) = 0.207187 fft 3: mflops = 76.502 (norm. = 1), norm. avg. (of 1) = 1 fft 4: mflops = 76.4953 (norm. = 0.999913), norm. avg. (of 1) = 0.999913 fft 5: mflops = 14.2378 (norm. = 0.186111), norm. avg. (of 1) = 0.186111 fft 6: mflops = 22.5091 (norm. = 0.294229), norm. avg. (of 1) = 0.294229 fft 7: mflops = 6.35195 (norm. = 0.0830299), norm. avg. (of 1) = 0.0830299 fft 8: mflops = 5.12504 (norm. = 0.0669923), norm. avg. (of 1) = 0.0669923 fft 9: mflops = 11.7831 (norm. = 0.154023), norm. avg. (of 1) = 0.154023 fft 10: mflops = 7.84473 (norm. = 0.102543), norm. avg. (of 1) = 0.102543 fft 11: mflops = 6.52982 (norm. = 0.0853549), norm. avg. (of 1) = 0.0853549 Benchmarking for array size = 9: 0. CWP (min N): elapsed time t=1.26759 s, 262144 iters, t-(init.)=1.20004 s t(norm)=0.160459, mflops=31.1607 1. CWP (best N) (N=15): elapsed time t=1.68513 s, 262144 iters, t-(init.)=1.58178 s t(norm)=0.211502, mflops=23.6404 2. FFTPACK (f2c): elapsed time t=1.00933 s, 131072 iters, t-(init.)=0.975545 s t(norm)=0.260883, mflops=19.1657 (err=2.8e-16) FFTW_MEASURE plan: (cost = 1.996521e-06) FFTW_NOTW 9 3. 
FFTW: elapsed time t=1.08476 s, 524288 iters, t-(init.)=0.94973 s t(norm)=0.0634949, mflops=78.7465 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.085 s, 524288 iters, t-(init.)=0.949951 s t(norm)=0.0635097, mflops=78.7282 (err=1.4e-16) 5. Frigo-old: elapsed time t=1.56928 s, 131072 iters, t-(init.)=1.53549 s t(norm)=0.410624, mflops=12.1766 (err=3.1e-16) 6. GSL: elapsed time t=1.69881 s, 262144 iters, t-(init.)=1.6311 s t(norm)=0.218097, mflops=22.9256 (err=1.4e-16) 7. NAPACK (f2c): elapsed time t=1.15972 s, 65536 iters, t-(init.)=1.14272 s t(norm)=0.611179, mflops=8.18091 (err=5.8e-16) 8. Nielsen: elapsed time t=1.18442 s, 65536 iters, t-(init.)=1.1674 s t(norm)=0.624381, mflops=8.00793 (err=4.5e-16) 9. Singleton (f2c): elapsed time t=1.85194 s, 262144 iters, t-(init.)=1.78428 s t(norm)=0.238578, mflops=20.9575 (err=1.7e-16) 10. Temperton (f2c): elapsed time t=1.69399 s, 131072 iters, t-(init.)=1.66021 s t(norm)=0.443979, mflops=11.2618 (err=1.7e-16) 11. Valkenburg: elapsed time t=1.41955 s, 65536 iters, t-(init.)=1.40262 s t(norm)=0.750184, mflops=6.66503 (err=2.6e-16) Top mflops for N=9 = 78.7465 Normalized results and averages for N=9: fft 0: mflops = 31.1607 (norm. = 0.395709), norm. avg. (of 2) = 0.331561 fft 1: mflops = 23.6404 (norm. = 0.300209), norm. avg. (of 2) = 0.234095 fft 2: mflops = 19.1657 (norm. = 0.243384), norm. avg. (of 2) = 0.225286 fft 3: mflops = 78.7465 (norm. = 1), norm. avg. (of 2) = 1 fft 4: mflops = 78.7282 (norm. = 0.999767), norm. avg. (of 2) = 0.99984 fft 5: mflops = 12.1766 (norm. = 0.15463), norm. avg. (of 2) = 0.17037 fft 6: mflops = 22.9256 (norm. = 0.291131), norm. avg. (of 2) = 0.29268 fft 7: mflops = 8.18091 (norm. = 0.103889), norm. avg. (of 2) = 0.0934595 fft 8: mflops = 8.00793 (norm. = 0.101693), norm. avg. (of 2) = 0.0843424 fft 9: mflops = 20.9575 (norm. = 0.266138), norm. avg. (of 2) = 0.210081 fft 10: mflops = 11.2618 (norm. = 0.143013), norm. avg. 
(of 2) = 0.122778 fft 11: mflops = 6.66503 (norm. = 0.084639), norm. avg. (of 2) = 0.084997 Benchmarking for array size = 12: 0. CWP (min N): elapsed time t=1.35366 s, 262144 iters, t-(init.)=1.26839 s t(norm)=0.112472, mflops=44.4553 1. CWP (best N) (N=15): elapsed time t=1.68514 s, 262144 iters, t-(init.)=1.58181 s t(norm)=0.140265, mflops=35.6468 2. FFTPACK (f2c): elapsed time t=1.20136 s, 131072 iters, t-(init.)=1.15859 s t(norm)=0.205472, mflops=24.3342 (err=1.9e-16) FFTW_MEASURE plan: (cost = 2.023315e-06) FFTW_NOTW 12 3. FFTW: elapsed time t=1.09257 s, 524288 iters, t-(init.)=0.921871 s t(norm)=0.0408728, mflops=122.331 (err=1.3e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.09241 s, 524288 iters, t-(init.)=0.92171 s t(norm)=0.0408657, mflops=122.352 (err=1.3e-16) 5. Frigo-old: elapsed time t=1.36922 s, 131072 iters, t-(init.)=1.32655 s t(norm)=0.235261, mflops=21.253 (err=2.3e-16) 6. GSL: elapsed time t=1.69473 s, 262144 iters, t-(init.)=1.60947 s t(norm)=0.142717, mflops=35.0343 (err=1.5e-16) 7. NAPACK (f2c): elapsed time t=1.61572 s, 65536 iters, t-(init.)=1.59435 s t(norm)=0.565506, mflops=8.84164 (err=4.2e-16) 8. Nielsen: elapsed time t=1.35276 s, 65536 iters, t-(init.)=1.33138 s t(norm)=0.472234, mflops=10.588 (err=4.8e-16) 9. Singleton (f2c): elapsed time t=1.3059 s, 131072 iters, t-(init.)=1.26329 s t(norm)=0.224042, mflops=22.3173 (err=1.9e-16) 10. Temperton (f2c): elapsed time t=1.90277 s, 131072 iters, t-(init.)=1.86001 s t(norm)=0.329868, mflops=15.1576 (err=1.2e-16) 11. Valkenburg: elapsed time t=1.0472 s, 32768 iters, t-(init.)=1.03658 s t(norm)=0.735339, mflops=6.79958 (err=1.9e-16) Top mflops for N=12 = 122.352 Normalized results and averages for N=12: fft 0: mflops = 44.4553 (norm. = 0.363339), norm. avg. (of 3) = 0.342154 fft 1: mflops = 35.6468 (norm. = 0.291346), norm. avg. (of 3) = 0.253179 fft 2: mflops = 24.3342 (norm. = 0.198887), norm. avg. 
(of 3) = 0.216486 fft 3: mflops = 122.331 (norm. = 0.999825), norm. avg. (of 3) = 0.999942 fft 4: mflops = 122.352 (norm. = 1), norm. avg. (of 3) = 0.999893 fft 5: mflops = 21.253 (norm. = 0.173704), norm. avg. (of 3) = 0.171481 fft 6: mflops = 35.0343 (norm. = 0.28634), norm. avg. (of 3) = 0.290567 fft 7: mflops = 8.84164 (norm. = 0.0722639), norm. avg. (of 3) = 0.0863943 fft 8: mflops = 10.588 (norm. = 0.0865368), norm. avg. (of 3) = 0.0850739 fft 9: mflops = 22.3173 (norm. = 0.182402), norm. avg. (of 3) = 0.200855 fft 10: mflops = 15.1576 (norm. = 0.123885), norm. avg. (of 3) = 0.123147 fft 11: mflops = 6.79958 (norm. = 0.0555739), norm. avg. (of 3) = 0.0751893 Benchmarking for array size = 15: 0. CWP (min N): elapsed time t=1.68545 s, 262144 iters, t-(init.)=1.58236 s t(norm)=0.103002, mflops=48.5429 1. CWP (best N): elapsed time t=1.68512 s, 262144 iters, t-(init.)=1.58211 s t(norm)=0.102985, mflops=48.5508 2. FFTPACK (f2c): elapsed time t=1.65803 s, 131072 iters, t-(init.)=1.60645 s t(norm)=0.209139, mflops=23.9076 (err=3.6e-16) FFTW_MEASURE plan: (cost = 3.448608e-06) FFTW_NOTW 15 3. FFTW: elapsed time t=1.8556 s, 524288 iters, t-(init.)=1.64909 s t(norm)=0.0536725, mflops=93.1575 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.85587 s, 524288 iters, t-(init.)=1.64958 s t(norm)=0.0536884, mflops=93.1299 (err=1.7e-16) 5. Frigo-old: elapsed time t=1.3453 s, 65536 iters, t-(init.)=1.3196 s t(norm)=0.34359, mflops=14.5522 (err=2.7e-16) 6. GSL: elapsed time t=1.73903 s, 131072 iters, t-(init.)=1.68746 s t(norm)=0.219685, mflops=22.7598 (err=1.9e-16) 7. NAPACK (f2c): elapsed time t=1.55957 s, 32768 iters, t-(init.)=1.54657 s t(norm)=0.805375, mflops=6.20829 (err=9.4e-16) 8. Nielsen: elapsed time t=1.57837 s, 65536 iters, t-(init.)=1.55263 s t(norm)=0.404265, mflops=12.3681 (err=4.5e-15) 9. 
Singleton (f2c): elapsed time t=1.56861 s, 131072 iters, t-(init.)=1.51693 s t(norm)=0.197485, mflops=25.3184 (err=2.0e-16) 10. Temperton (f2c): elapsed time t=1.26944 s, 65536 iters, t-(init.)=1.24352 s t(norm)=0.32378, mflops=15.4426 (err=2.5e-16) 11. Valkenburg: elapsed time t=1.58761 s, 32768 iters, t-(init.)=1.57468 s t(norm)=0.820012, mflops=6.09747 (err=2.5e-16) Top mflops for N=15 = 93.1575 Normalized results and averages for N=15: fft 0: mflops = 48.5429 (norm. = 0.521085), norm. avg. (of 4) = 0.386887 fft 1: mflops = 48.5508 (norm. = 0.521169), norm. avg. (of 4) = 0.320177 fft 2: mflops = 23.9076 (norm. = 0.256636), norm. avg. (of 4) = 0.226524 fft 3: mflops = 93.1575 (norm. = 1), norm. avg. (of 4) = 0.999956 fft 4: mflops = 93.1299 (norm. = 0.999704), norm. avg. (of 4) = 0.999846 fft 5: mflops = 14.5522 (norm. = 0.156211), norm. avg. (of 4) = 0.167664 fft 6: mflops = 22.7598 (norm. = 0.244316), norm. avg. (of 4) = 0.279004 fft 7: mflops = 6.20829 (norm. = 0.0666429), norm. avg. (of 4) = 0.0814565 fft 8: mflops = 12.3681 (norm. = 0.132766), norm. avg. (of 4) = 0.0969968 fft 9: mflops = 25.3184 (norm. = 0.271781), norm. avg. (of 4) = 0.218586 fft 10: mflops = 15.4426 (norm. = 0.165768), norm. avg. (of 4) = 0.133802 fft 11: mflops = 6.09747 (norm. = 0.0654533), norm. avg. (of 4) = 0.0727553 Benchmarking for array size = 18: 0. CWP (min N): elapsed time t=1.07752 s, 131072 iters, t-(init.)=1.0172 s t(norm)=0.103394, mflops=48.3589 1. CWP (best N) (N=28): elapsed time t=1.22678 s, 131072 iters, t-(init.)=1.13664 s t(norm)=0.115535, mflops=43.2771 2. FFTPACK (f2c): elapsed time t=1.32624 s, 65536 iters, t-(init.)=1.29597 s t(norm)=0.26346, mflops=18.9782 (err=2.6e-16) FFTW_MEASURE plan: (cost = 5.302429e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6 3. FFTW: elapsed time t=1.41144 s, 262144 iters, t-(init.)=1.29063 s t(norm)=0.0655934, mflops=76.2271 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 4. 
FFTW_ESTIMATE: elapsed time t=1.44928 s, 262144 iters, t-(init.)=1.3282 s t(norm)=0.0675032, mflops=74.0705 (err=2.3e-16) 5. Frigo-old: elapsed time t=1.75751 s, 65536 iters, t-(init.)=1.72722 s t(norm)=0.351129, mflops=14.2398 (err=3.8e-16) 6. GSL: elapsed time t=1.33801 s, 131072 iters, t-(init.)=1.27744 s t(norm)=0.129846, mflops=38.5071 (err=2.4e-16) 7. NAPACK (f2c): elapsed time t=1.2065 s, 32768 iters, t-(init.)=1.19132 s t(norm)=0.484371, mflops=10.3227 (err=6.0e-16) 8. Nielsen: elapsed time t=1.2533 s, 32768 iters, t-(init.)=1.23812 s t(norm)=0.503397, mflops=9.93252 (err=7.7e-16) 9. Singleton (f2c): elapsed time t=1.68119 s, 131072 iters, t-(init.)=1.62061 s t(norm)=0.164728, mflops=30.3531 (err=1.7e-16) 10. Temperton (f2c): elapsed time t=1.83944 s, 65536 iters, t-(init.)=1.80914 s t(norm)=0.367784, mflops=13.5949 (err=2.8e-16) 11. Valkenburg: elapsed time t=1.80481 s, 32768 iters, t-(init.)=1.78964 s t(norm)=0.727636, mflops=6.87157 (err=2.8e-16) Top mflops for N=18 = 76.2271 Normalized results and averages for N=18: fft 0: mflops = 48.3589 (norm. = 0.634405), norm. avg. (of 5) = 0.43639 fft 1: mflops = 43.2771 (norm. = 0.567739), norm. avg. (of 5) = 0.369689 fft 2: mflops = 18.9782 (norm. = 0.248969), norm. avg. (of 5) = 0.231013 fft 3: mflops = 76.2271 (norm. = 1), norm. avg. (of 5) = 0.999965 fft 4: mflops = 74.0705 (norm. = 0.971708), norm. avg. (of 5) = 0.994218 fft 5: mflops = 14.2398 (norm. = 0.186807), norm. avg. (of 5) = 0.171493 fft 6: mflops = 38.5071 (norm. = 0.505163), norm. avg. (of 5) = 0.324236 fft 7: mflops = 10.3227 (norm. = 0.13542), norm. avg. (of 5) = 0.0922491 fft 8: mflops = 9.93252 (norm. = 0.130302), norm. avg. (of 5) = 0.103658 fft 9: mflops = 30.3531 (norm. = 0.398193), norm. avg. (of 5) = 0.254508 fft 10: mflops = 13.5949 (norm. = 0.178348), norm. avg. (of 5) = 0.142712 fft 11: mflops = 6.87157 (norm. = 0.090146), norm. avg. (of 5) = 0.0762334 Benchmarking for array size = 24: 0. 
CWP (min N): elapsed time t=1.13373 s, 131072 iters, t-(init.)=1.05524 s t(norm)=0.0731633, mflops=68.3402 1. CWP (best N) (N=28): elapsed time t=1.22681 s, 131072 iters, t-(init.)=1.13641 s t(norm)=0.0787915, mflops=63.4586 2. FFTPACK (f2c): elapsed time t=1.65054 s, 65536 iters, t-(init.)=1.61121 s t(norm)=0.223422, mflops=22.3792 (err=2.4e-16) FFTW_MEASURE plan: (cost = 5.835876e-06) FFTW_TWIDDLE 2 FFTW_NOTW 12 3. FFTW: elapsed time t=1.55552 s, 262144 iters, t-(init.)=1.39902 s t(norm)=0.0484993, mflops=103.094 (err=2.0e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.55551 s, 262144 iters, t-(init.)=1.399 s t(norm)=0.0484988, mflops=103.095 (err=2.0e-16) 5. Frigo-old: elapsed time t=1.35732 s, 65536 iters, t-(init.)=1.31821 s t(norm)=0.182792, mflops=27.3534 (err=2.7e-16) 6. GSL: elapsed time t=1.50717 s, 131072 iters, t-(init.)=1.42927 s t(norm)=0.0990963, mflops=50.456 (err=2.2e-16) 7. NAPACK (f2c): elapsed time t=1.56748 s, 32768 iters, t-(init.)=1.54782 s t(norm)=0.429263, mflops=11.6479 (err=8.2e-16) 8. Nielsen: elapsed time t=1.11401 s, 32768 iters, t-(init.)=1.09413 s t(norm)=0.30344, mflops=16.4777 (err=1.4e-15) 9. Singleton (f2c): elapsed time t=1.25651 s, 65536 iters, t-(init.)=1.2174 s t(norm)=0.168813, mflops=29.6186 (err=2.2e-16) 10. Temperton (f2c): elapsed time t=1.05526 s, 32768 iters, t-(init.)=1.03562 s t(norm)=0.287212, mflops=17.4087 (err=2.7e-16) 11. Valkenburg: elapsed time t=1.29502 s, 16384 iters, t-(init.)=1.28503 s t(norm)=0.712763, mflops=7.01496 (err=2.9e-16) Top mflops for N=24 = 103.095 Normalized results and averages for N=24: fft 0: mflops = 68.3402 (norm. = 0.662885), norm. avg. (of 6) = 0.474139 fft 1: mflops = 63.4586 (norm. = 0.615534), norm. avg. (of 6) = 0.410663 fft 2: mflops = 22.3792 (norm. = 0.217073), norm. avg. (of 6) = 0.228689 fft 3: mflops = 103.094 (norm. = 0.99999), norm. avg. (of 6) = 0.999969 fft 4: mflops = 103.095 (norm. = 1), norm. avg. 
(of 6) = 0.995182 fft 5: mflops = 27.3534 (norm. = 0.265322), norm. avg. (of 6) = 0.187131 fft 6: mflops = 50.456 (norm. = 0.489411), norm. avg. (of 6) = 0.351765 fft 7: mflops = 11.6479 (norm. = 0.112982), norm. avg. (of 6) = 0.0957046 fft 8: mflops = 16.4777 (norm. = 0.15983), norm. avg. (of 6) = 0.11302 fft 9: mflops = 29.6186 (norm. = 0.287294), norm. avg. (of 6) = 0.259972 fft 10: mflops = 17.4087 (norm. = 0.168861), norm. avg. (of 6) = 0.14707 fft 11: mflops = 7.01496 (norm. = 0.0680435), norm. avg. (of 6) = 0.0748684 Benchmarking for array size = 36: 0. CWP (min N): elapsed time t=1.73283 s, 131072 iters, t-(init.)=1.619 s t(norm)=0.0663666, mflops=75.3391 1. CWP (best N): elapsed time t=1.73267 s, 131072 iters, t-(init.)=1.6189 s t(norm)=0.0663627, mflops=75.3435 2. FFTPACK (f2c): elapsed time t=1.31861 s, 32768 iters, t-(init.)=1.29033 s t(norm)=0.211575, mflops=23.6323 (err=3.7e-16) FFTW_MEASURE plan: (cost = 9.430420e-06) FFTW_TWIDDLE 3 FFTW_NOTW 12 3. FFTW: elapsed time t=1.25706 s, 131072 iters, t-(init.)=1.14302 s t(norm)=0.046855, mflops=106.712 (err=3.5e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.25721 s, 131072 iters, t-(init.)=1.14335 s t(norm)=0.0468685, mflops=106.681 (err=3.5e-16) 5. Frigo-old: elapsed time t=1.74564 s, 32768 iters, t-(init.)=1.71701 s t(norm)=0.281538, mflops=17.7596 (err=4.8e-16) 6. GSL: elapsed time t=1.17874 s, 65536 iters, t-(init.)=1.1217 s t(norm)=0.0919623, mflops=54.3701 (err=2.8e-16) 7. NAPACK (f2c): elapsed time t=1.21636 s, 16384 iters, t-(init.)=1.20205 s t(norm)=0.394198, mflops=12.684 (err=1.0e-15) 8. Nielsen: elapsed time t=1.0092 s, 16384 iters, t-(init.)=0.994881 s t(norm)=0.32626, mflops=15.3252 (err=9.7e-16) 9. Singleton (f2c): elapsed time t=1.39479 s, 65536 iters, t-(init.)=1.33787 s t(norm)=0.109685, mflops=45.585 (err=2.7e-16) 10. 
Temperton (f2c): elapsed time t=1.56502 s, 32768 iters, t-(init.)=1.53627 s t(norm)=0.251902, mflops=19.849 (err=3.9e-16) 11. Valkenburg: elapsed time t=1.08266 s, 8192 iters, t-(init.)=1.07564 s t(norm)=0.705488, mflops=7.08729 (err=4.0e-16) Top mflops for N=36 = 106.712 Normalized results and averages for N=36: fft 0: mflops = 75.3391 (norm. = 0.706002), norm. avg. (of 7) = 0.507263 fft 1: mflops = 75.3435 (norm. = 0.706044), norm. avg. (of 7) = 0.45286 fft 2: mflops = 23.6323 (norm. = 0.221458), norm. avg. (of 7) = 0.227656 fft 3: mflops = 106.712 (norm. = 1), norm. avg. (of 7) = 0.999974 fft 4: mflops = 106.681 (norm. = 0.99971), norm. avg. (of 7) = 0.995829 fft 5: mflops = 17.7596 (norm. = 0.166425), norm. avg. (of 7) = 0.184173 fft 6: mflops = 54.3701 (norm. = 0.509502), norm. avg. (of 7) = 0.374299 fft 7: mflops = 12.684 (norm. = 0.118861), norm. avg. (of 7) = 0.0990127 fft 8: mflops = 15.3252 (norm. = 0.143612), norm. avg. (of 7) = 0.11739 fft 9: mflops = 45.585 (norm. = 0.427177), norm. avg. (of 7) = 0.283858 fft 10: mflops = 19.849 (norm. = 0.186005), norm. avg. (of 7) = 0.152632 fft 11: mflops = 7.08729 (norm. = 0.0664149), norm. avg. (of 7) = 0.0736608 Benchmarking for array size = 80: 0. CWP (min N): elapsed time t=1.89603 s, 65536 iters, t-(init.)=1.77388 s t(norm)=0.0535186, mflops=93.4254 1. CWP (best N) (N=84): elapsed time t=1.77811 s, 65536 iters, t-(init.)=1.64988 s t(norm)=0.0497776, mflops=100.447 2. FFTPACK (f2c): elapsed time t=1.46829 s, 16384 iters, t-(init.)=1.43722 s t(norm)=0.173446, mflops=28.8275 (err=7.7e-16) FFTW_MEASURE plan: (cost = 2.318481e-05) FFTW_TWIDDLE 5 FFTW_NOTW 16 3. FFTW: elapsed time t=1.54016 s, 65536 iters, t-(init.)=1.41793 s t(norm)=0.0427794, mflops=116.879 (err=7.3e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 4. FFTW_ESTIMATE: elapsed time t=1.54071 s, 65536 iters, t-(init.)=1.41852 s t(norm)=0.0427973, mflops=116.83 (err=7.3e-16) 5. 
Frigo-old: elapsed time t=1.23176 s, 16384 iters, t-(init.)=1.20115 s t(norm)=0.144956, mflops=34.4931 (err=7.1e-16) 6. GSL: elapsed time t=1.14825 s, 16384 iters, t-(init.)=1.11765 s t(norm)=0.13488, mflops=37.0699 (err=6.9e-16) 7. NAPACK (f2c): elapsed time t=1.15442 s, 4096 iters, t-(init.)=1.14689 s t(norm)=0.553631, mflops=9.03128 (err=1.1e-15) 8. Nielsen: elapsed time t=1.56494 s, 16384 iters, t-(init.)=1.53399 s t(norm)=0.185124, mflops=27.0089 (err=5.4e-15) 9. Singleton (f2c): elapsed time t=1.28136 s, 32768 iters, t-(init.)=1.22025 s t(norm)=0.0736306, mflops=67.9066 (err=1.3e-15) 10. Temperton (f2c): elapsed time t=1.90605 s, 16384 iters, t-(init.)=1.87544 s t(norm)=0.226331, mflops=22.0916 (err=7.0e-16) 11. Valkenburg: elapsed time t=1.53877 s, 4096 iters, t-(init.)=1.53113 s t(norm)=0.739115, mflops=6.76485 (err=8.4e-16) Top mflops for N=80 = 116.879 Normalized results and averages for N=80: fft 0: mflops = 93.4254 (norm. = 0.799337), norm. avg. (of 8) = 0.543772 fft 1: mflops = 100.447 (norm. = 0.859411), norm. avg. (of 8) = 0.503679 fft 2: mflops = 28.8275 (norm. = 0.246644), norm. avg. (of 8) = 0.23003 fft 3: mflops = 116.879 (norm. = 1), norm. avg. (of 8) = 0.999977 fft 4: mflops = 116.83 (norm. = 0.999581), norm. avg. (of 8) = 0.996298 fft 5: mflops = 34.4931 (norm. = 0.295119), norm. avg. (of 8) = 0.198041 fft 6: mflops = 37.0699 (norm. = 0.317166), norm. avg. (of 8) = 0.367157 fft 7: mflops = 9.03128 (norm. = 0.0772705), norm. avg. (of 8) = 0.0962949 fft 8: mflops = 27.0089 (norm. = 0.231085), norm. avg. (of 8) = 0.131602 fft 9: mflops = 67.9066 (norm. = 0.581), norm. avg. (of 8) = 0.321001 fft 10: mflops = 22.0916 (norm. = 0.189013), norm. avg. (of 8) = 0.157179 fft 11: mflops = 6.76485 (norm. = 0.0578793), norm. avg. (of 8) = 0.0716881 Benchmarking for array size = 108: 0. CWP (min N) (N=110): elapsed time t=1.62902 s, 32768 iters, t-(init.)=1.54574 s t(norm)=0.0646612, mflops=77.3261 1. 
CWP (best N) (N=112): elapsed time t=1.32324 s, 32768 iters, t-(init.)=1.23815 s t(norm)=0.0517943, mflops=96.5358 2. FFTPACK (f2c): elapsed time t=1.15296 s, 8192 iters, t-(init.)=1.13243 s t(norm)=0.189488, mflops=26.3869 (err=4.7e-16) FFTW_MEASURE plan: (cost = 3.511523e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 3. FFTW: elapsed time t=1.15748 s, 32768 iters, t-(init.)=1.07536 s t(norm)=0.0449844, mflops=111.15 (err=3.7e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.15785 s, 32768 iters, t-(init.)=1.07592 s t(norm)=0.0450077, mflops=111.092 (err=3.7e-16) 5. Frigo-old: elapsed time t=1.84743 s, 8192 iters, t-(init.)=1.82698 s t(norm)=0.305705, mflops=16.3556 (err=5.5e-16) 6. GSL: elapsed time t=1.1378 s, 16384 iters, t-(init.)=1.09664 s t(norm)=0.0917492, mflops=54.4964 (err=4.7e-16) 7. NAPACK (f2c): elapsed time t=1.06487 s, 4096 iters, t-(init.)=1.05442 s t(norm)=0.352869, mflops=14.1696 (err=2.7e-15) 8. Nielsen: elapsed time t=1.68237 s, 8192 iters, t-(init.)=1.66192 s t(norm)=0.278086, mflops=17.9801 (err=1.1e-15) 9. Singleton (f2c): elapsed time t=1.23865 s, 16384 iters, t-(init.)=1.19778 s t(norm)=0.100211, mflops=49.8949 (err=5.1e-16) 10. Temperton (f2c): elapsed time t=1.44824 s, 8192 iters, t-(init.)=1.42754 s t(norm)=0.238867, mflops=20.9321 (err=3.8e-16) 11. Valkenburg: elapsed time t=1.04534 s, 2048 iters, t-(init.)=1.04006 s t(norm)=0.696126, mflops=7.18261 (err=5.2e-16) Top mflops for N=108 = 111.15 Normalized results and averages for N=108: fft 0: mflops = 77.3261 (norm. = 0.695694), norm. avg. (of 9) = 0.560652 fft 1: mflops = 96.5358 (norm. = 0.868521), norm. avg. (of 9) = 0.544217 fft 2: mflops = 26.3869 (norm. = 0.2374), norm. avg. (of 9) = 0.230849 fft 3: mflops = 111.15 (norm. = 1), norm. avg. (of 9) = 0.999979 fft 4: mflops = 111.092 (norm. = 0.999482), norm. avg. (of 9) = 0.996652 fft 5: mflops = 16.3556 (norm. = 0.14715), norm. avg. (of 9) = 0.192386 fft 6: mflops = 54.4964 (norm. 
= 0.490298), norm. avg. (of 9) = 0.38084 fft 7: mflops = 14.1696 (norm. = 0.127482), norm. avg. (of 9) = 0.0997601 fft 8: mflops = 17.9801 (norm. = 0.161764), norm. avg. (of 9) = 0.134953 fft 9: mflops = 49.8949 (norm. = 0.448898), norm. avg. (of 9) = 0.335212 fft 10: mflops = 20.9321 (norm. = 0.188324), norm. avg. (of 9) = 0.16064 fft 11: mflops = 7.18261 (norm. = 0.0646211), norm. avg. (of 9) = 0.0709029 Benchmarking for array size = 210: 0. CWP (min N): elapsed time t=1.35038 s, 16384 iters, t-(init.)=1.27156 s t(norm)=0.0479075, mflops=104.368 1. CWP (best N): elapsed time t=1.35041 s, 16384 iters, t-(init.)=1.2716 s t(norm)=0.047909, mflops=104.365 2. FFTPACK (f2c): elapsed time t=1.72479 s, 4096 iters, t-(init.)=1.70494 s t(norm)=0.256943, mflops=19.4596 (err=5.7e-16) FFTW_MEASURE plan: (cost = 9.418164e-05) FFTW_TWIDDLE 2 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.57299 s, 16384 iters, t-(init.)=1.49415 s t(norm)=0.0562939, mflops=88.8195 (err=4.5e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.58148 s, 16384 iters, t-(init.)=1.50246 s t(norm)=0.0566069, mflops=88.3285 (err=4.6e-16) 5. Frigo-old: elapsed time t=1.99474 s, 4096 iters, t-(init.)=1.97495 s t(norm)=0.297634, mflops=16.7991 (err=5.8e-16) 6. GSL: elapsed time t=1.89336 s, 8192 iters, t-(init.)=1.854 s t(norm)=0.139703, mflops=35.7902 (err=5.3e-16) 7. NAPACK (f2c): elapsed time t=1.1322 s, 1024 iters, t-(init.)=1.12734 s t(norm)=0.679581, mflops=7.35748 (err=1.4e-14) 8. Nielsen: elapsed time t=1.51338 s, 4096 iters, t-(init.)=1.49357 s t(norm)=0.225088, mflops=22.2135 (err=7.6e-15) 9. Singleton (f2c): elapsed time t=1.58715 s, 8192 iters, t-(init.)=1.54779 s t(norm)=0.11663, mflops=42.8707 (err=6.7e-16) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. 
Valkenburg: elapsed time t=1.45819 s, 1024 iters, t-(init.)=1.45318 s t(norm)=0.876006, mflops=5.70772 (err=6.5e-16) Top mflops for N=210 = 104.368 Normalized results and averages for N=210: fft 0: mflops = 104.368 (norm. = 1), norm. avg. (of 10) = 0.604587 fft 1: mflops = 104.365 (norm. = 0.999969), norm. avg. (of 10) = 0.589792 fft 2: mflops = 19.4596 (norm. = 0.186452), norm. avg. (of 10) = 0.226409 fft 3: mflops = 88.8195 (norm. = 0.851025), norm. avg. (of 10) = 0.985084 fft 4: mflops = 88.3285 (norm. = 0.84632), norm. avg. (of 10) = 0.981619 fft 5: mflops = 16.7991 (norm. = 0.160961), norm. avg. (of 10) = 0.189244 fft 6: mflops = 35.7902 (norm. = 0.342924), norm. avg. (of 10) = 0.377048 fft 7: mflops = 7.35748 (norm. = 0.0704957), norm. avg. (of 10) = 0.0968337 fft 8: mflops = 22.2135 (norm. = 0.212839), norm. avg. (of 10) = 0.142742 fft 9: mflops = 42.8707 (norm. = 0.410766), norm. avg. (of 10) = 0.342767 fft 10: mflops = -1 (norm. = -0.00958151), norm. avg. (of 9) = 0.16064 fft 11: mflops = 5.70772 (norm. = 0.0546886), norm. avg. (of 10) = 0.0692814 Benchmarking for array size = 504: 0. CWP (min N): elapsed time t=1.81661 s, 8192 iters, t-(init.)=1.7219 s t(norm)=0.046456, mflops=107.629 1. CWP (best N): elapsed time t=1.81685 s, 8192 iters, t-(init.)=1.72243 s t(norm)=0.0464703, mflops=107.596 2. FFTPACK (f2c): elapsed time t=1.48664 s, 1024 iters, t-(init.)=1.47504 s t(norm)=0.318368, mflops=15.7051 (err=9.8e-16) FFTW_MEASURE plan: (cost = 4.033906e-04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 6 FFTW_NOTW 12 3. FFTW: elapsed time t=1.73585 s, 4096 iters, t-(init.)=1.68848 s t(norm)=0.0911088, mflops=54.8795 (err=9.2e-16) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.87996 s, 4096 iters, t-(init.)=1.83269 s t(norm)=0.0988902, mflops=50.5611 (err=8.8e-16) 5. Frigo-old: elapsed time t=1.46329 s, 1024 iters, t-(init.)=1.4514 s t(norm)=0.313266, mflops=15.9609 (err=1.0e-15) 6. 
GSL: elapsed time t=1.51166 s, 2048 iters, t-(init.)=1.48802 s t(norm)=0.160585, mflops=31.1362 (err=8.9e-16) 7. NAPACK (f2c): elapsed time t=1.31922 s, 512 iters, t-(init.)=1.31336 s t(norm)=0.566944, mflops=8.81922 (err=4.2e-14) 8. Nielsen: elapsed time t=1.17753 s, 1024 iters, t-(init.)=1.16572 s t(norm)=0.251605, mflops=19.8724 (err=5.8e-15) 9. Singleton (f2c): elapsed time t=1.89603 s, 4096 iters, t-(init.)=1.84929 s t(norm)=0.0997859, mflops=50.1073 (err=1.3e-15) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.03253 s, 256 iters, t-(init.)=1.02954 s t(norm)=0.888848, mflops=5.62526 (err=1.0e-15) Top mflops for N=504 = 107.629 Normalized results and averages for N=504: fft 0: mflops = 107.629 (norm. = 1), norm. avg. (of 11) = 0.640534 fft 1: mflops = 107.596 (norm. = 0.999692), norm. avg. (of 11) = 0.627056 fft 2: mflops = 15.7051 (norm. = 0.145919), norm. avg. (of 11) = 0.219092 fft 3: mflops = 54.8795 (norm. = 0.509896), norm. avg. (of 11) = 0.941885 fft 4: mflops = 50.5611 (norm. = 0.469774), norm. avg. (of 11) = 0.935087 fft 5: mflops = 15.9609 (norm. = 0.148296), norm. avg. (of 11) = 0.185521 fft 6: mflops = 31.1362 (norm. = 0.289293), norm. avg. (of 11) = 0.36907 fft 7: mflops = 8.81922 (norm. = 0.0819412), norm. avg. (of 11) = 0.0954798 fft 8: mflops = 19.8724 (norm. = 0.184638), norm. avg. (of 11) = 0.146551 fft 9: mflops = 50.1073 (norm. = 0.465557), norm. avg. (of 11) = 0.35393 fft 10: mflops = -1 (norm. = -0.0092912), norm. avg. (of 9) = 0.16064 fft 11: mflops = 5.62526 (norm. = 0.0522654), norm. avg. (of 11) = 0.0677345 Benchmarking for array size = 1000: 0. CWP (min N) (N=1001): elapsed time t=1.65056 s, 2048 iters, t-(init.)=1.59753 s t(norm)=0.078272, mflops=63.8798 1. CWP (best N) (N=1008): elapsed time t=1.19648 s, 2048 iters, t-(init.)=1.14313 s t(norm)=0.0560087, mflops=89.2719 2. 
FFTPACK (f2c): elapsed time t=1.04865 s, 256 iters, t-(init.)=1.0419 s t(norm)=0.40839, mflops=12.2432 (err=3.1e-15) FFTW_MEASURE plan: (cost = 1.100406e-03) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 3. FFTW: elapsed time t=1.1207 s, 1024 iters, t-(init.)=1.09409 s t(norm)=0.107211, mflops=46.6368 (err=3.1e-15) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 4. FFTW_ESTIMATE: elapsed time t=1.10875 s, 1024 iters, t-(init.)=1.08242 s t(norm)=0.106068, mflops=47.1394 (err=3.1e-15) 5. Frigo-old: elapsed time t=1.01495 s, 256 iters, t-(init.)=1.00787 s t(norm)=0.395052, mflops=12.6566 (err=3.1e-15) 6. GSL: elapsed time t=1.00081 s, 256 iters, t-(init.)=0.993534 s t(norm)=0.389432, mflops=12.8392 (err=3.1e-15) 7. NAPACK (f2c): elapsed time t=1.18535 s, 128 iters, t-(init.)=1.18199 s t(norm)=0.9266, mflops=5.39607 (err=1.8e-14) 8. Nielsen: elapsed time t=1.24486 s, 512 iters, t-(init.)=1.23123 s t(norm)=0.241299, mflops=20.7212 (err=1.5e-14) 9. Singleton (f2c): elapsed time t=1.79124 s, 2048 iters, t-(init.)=1.73827 s t(norm)=0.085168, mflops=58.7075 (err=4.7e-15) 10. Temperton (f2c): elapsed time t=1.36763 s, 512 iters, t-(init.)=1.35418 s t(norm)=0.265396, mflops=18.8398 (err=3.0e-15) 11. Valkenburg: elapsed time t=1.41124 s, 128 iters, t-(init.)=1.40783 s t(norm)=1.10364, mflops=4.53044 (err=3.0e-15) Top mflops for N=1000 = 89.2719 Normalized results and averages for N=1000: fft 0: mflops = 63.8798 (norm. = 0.715565), norm. avg. (of 12) = 0.646786 fft 1: mflops = 89.2719 (norm. = 1), norm. avg. (of 12) = 0.658135 fft 2: mflops = 12.2432 (norm. = 0.137145), norm. avg. (of 12) = 0.212263 fft 3: mflops = 46.6368 (norm. = 0.522414), norm. avg. (of 12) = 0.906929 fft 4: mflops = 47.1394 (norm. = 0.528044), norm. avg. (of 12) = 0.901167 fft 5: mflops = 12.6566 (norm. = 0.141775), norm. avg. (of 12) = 0.181876 fft 6: mflops = 12.8392 (norm. = 0.143822), norm. avg. (of 12) = 0.350299 fft 7: mflops = 5.39607 (norm. = 0.0604454), norm. 
avg. (of 12) = 0.0925603 fft 8: mflops = 20.7212 (norm. = 0.232113), norm. avg. (of 12) = 0.153681 fft 9: mflops = 58.7075 (norm. = 0.657626), norm. avg. (of 12) = 0.379238 fft 10: mflops = 18.8398 (norm. = 0.211038), norm. avg. (of 10) = 0.16568 fft 11: mflops = 4.53044 (norm. = 0.0507488), norm. avg. (of 12) = 0.0663191 Benchmarking for array size = 1960: 0. CWP (min N) (N=1980): elapsed time t=1.30995 s, 512 iters, t-(init.)=1.11004 s t(norm)=0.101141, mflops=49.4359 1. CWP (best N) (N=1980): elapsed time t=1.31016 s, 512 iters, t-(init.)=1.11 s t(norm)=0.101137, mflops=49.4377 2. FFTPACK (f2c): elapsed time t=1.97887 s, 128 iters, t-(init.)=1.92932 s t(norm)=0.70316, mflops=7.11076 (err=1.5e-15) FFTW_MEASURE plan: (cost = 2.634656e-03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 8 3. FFTW: elapsed time t=1.63914 s, 512 iters, t-(init.)=1.44072 s t(norm)=0.131271, mflops=38.0892 (err=1.5e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.57535 s, 512 iters, t-(init.)=1.37712 s t(norm)=0.125477, mflops=39.848 (err=1.5e-15) 5. Frigo-old: elapsed time t=1.16598 s, 128 iters, t-(init.)=1.11622 s t(norm)=0.406819, mflops=12.2905 (err=1.5e-15) 6. GSL: elapsed time t=1.9823 s, 256 iters, t-(init.)=1.88316 s t(norm)=0.343168, mflops=14.5701 (err=1.6e-15) 7. NAPACK (f2c): elapsed time t=1.40152 s, 64 iters, t-(init.)=1.3764 s t(norm)=1.00328, mflops=4.98363 (err=1.3e-13) 8. Nielsen: elapsed time t=1.99055 s, 256 iters, t-(init.)=1.89101 s t(norm)=0.344599, mflops=14.5096 (err=1.7e-14) 9. Singleton (f2c): elapsed time t=1.18911 s, 256 iters, t-(init.)=1.09019 s t(norm)=0.198666, mflops=25.1679 (err=2.3e-15) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. 
Valkenburg: elapsed time t=1.79722 s, 64 iters, t-(init.)=1.77224 s t(norm)=1.29182, mflops=3.8705 (err=1.4e-15) Top mflops for N=1960 = 49.4377 Normalized results and averages for N=1960: fft 0: mflops = 49.4359 (norm. = 0.999964), norm. avg. (of 13) = 0.673954 fft 1: mflops = 49.4377 (norm. = 1), norm. avg. (of 13) = 0.684432 fft 2: mflops = 7.11076 (norm. = 0.143833), norm. avg. (of 13) = 0.206999 fft 3: mflops = 38.0892 (norm. = 0.770448), norm. avg. (of 13) = 0.896431 fft 4: mflops = 39.848 (norm. = 0.806025), norm. avg. (of 13) = 0.893848 fft 5: mflops = 12.2905 (norm. = 0.248605), norm. avg. (of 13) = 0.187009 fft 6: mflops = 14.5701 (norm. = 0.294717), norm. avg. (of 13) = 0.346024 fft 7: mflops = 4.98363 (norm. = 0.100806), norm. avg. (of 13) = 0.0931946 fft 8: mflops = 14.5096 (norm. = 0.293493), norm. avg. (of 13) = 0.164436 fft 9: mflops = 25.1679 (norm. = 0.509083), norm. avg. (of 13) = 0.389226 fft 10: mflops = -1 (norm. = -0.0202275), norm. avg. (of 10) = 0.16568 fft 11: mflops = 3.8705 (norm. = 0.0782904), norm. avg. (of 13) = 0.0672399 Benchmarking for array size = 4725: 0. CWP (min N) (N=5005): elapsed time t=1.00382 s, 128 iters, t-(init.)=0.876312 s t(norm)=0.118705, mflops=42.1211 1. CWP (best N) (N=5040): elapsed time t=1.8413 s, 256 iters, t-(init.)=1.58495 s t(norm)=0.107349, mflops=46.5772 2. FFTPACK (f2c): elapsed time t=1.00985 s, 32 iters, t-(init.)=0.980121 s t(norm)=0.531069, mflops=9.41497 (err=2.4e-15) FFTW_MEASURE plan: (cost = 7.442125e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 3. FFTW: elapsed time t=1.17881 s, 128 iters, t-(init.)=1.05832 s t(norm)=0.14336, mflops=34.8772 (err=2.3e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.15802 s, 128 iters, t-(init.)=1.03733 s t(norm)=0.140517, mflops=35.583 (err=2.3e-15) 5. 
Frigo-old: elapsed time t=1.19272 s, 32 iters, t-(init.)=1.16216 s t(norm)=0.629707, mflops=7.9402 (err=2.3e-15) 6. GSL: elapsed time t=1.19347 s, 64 iters, t-(init.)=1.13342 s t(norm)=0.307067, mflops=16.2831 (err=2.4e-15) 7. NAPACK (f2c): elapsed time t=1.72748 s, 32 iters, t-(init.)=1.69683 s t(norm)=0.91941, mflops=5.43827 (err=3.5e-13) 8. Nielsen: elapsed time t=1.50725 s, 64 iters, t-(init.)=1.44648 s t(norm)=0.39188, mflops=12.759 (err=4.4e-14) 9. Singleton (f2c): elapsed time t=1.87399 s, 128 iters, t-(init.)=1.75406 s t(norm)=0.237605, mflops=21.0434 (err=3.3e-15) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.1542 s, 16 iters, t-(init.)=1.13863 s t(norm)=1.23392, mflops=4.05214 (err=2.3e-15) Top mflops for N=4725 = 46.5772 Normalized results and averages for N=4725: fft 0: mflops = 42.1211 (norm. = 0.90433), norm. avg. (of 14) = 0.690409 fft 1: mflops = 46.5772 (norm. = 1), norm. avg. (of 14) = 0.706973 fft 2: mflops = 9.41497 (norm. = 0.202137), norm. avg. (of 14) = 0.206652 fft 3: mflops = 34.8772 (norm. = 0.748805), norm. avg. (of 14) = 0.885886 fft 4: mflops = 35.583 (norm. = 0.763958), norm. avg. (of 14) = 0.88457 fft 5: mflops = 7.9402 (norm. = 0.170474), norm. avg. (of 14) = 0.185828 fft 6: mflops = 16.2831 (norm. = 0.349594), norm. avg. (of 14) = 0.346279 fft 7: mflops = 5.43827 (norm. = 0.116758), norm. avg. (of 14) = 0.0948777 fft 8: mflops = 12.759 (norm. = 0.273932), norm. avg. (of 14) = 0.172257 fft 9: mflops = 21.0434 (norm. = 0.451795), norm. avg. (of 14) = 0.393695 fft 10: mflops = -1 (norm. = -0.0214697), norm. avg. (of 10) = 0.16568 fft 11: mflops = 4.05214 (norm. = 0.0869984), norm. avg. (of 14) = 0.0686513 Benchmarking for array size = 10368: 0. CWP (min N) (N=10920): elapsed time t=1.25541 s, 64 iters, t-(init.)=1.11448 s t(norm)=0.125906, mflops=39.712 1. CWP (best N) (N=11088): elapsed time t=1.16081 s, 64 iters, t-(init.)=1.01723 s t(norm)=0.114919, mflops=43.5088 2. 
FFTPACK (f2c): elapsed time t=1.85745 s, 32 iters, t-(init.)=1.78988 s t(norm)=0.404416, mflops=12.3635 (err=4.7e-15) FFTW_MEASURE plan: (cost = 1.792350e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 3 FFTW_TWIDDLE 9 FFTW_NOTW 12 3. FFTW: elapsed time t=1.17475 s, 64 iters, t-(init.)=1.03916 s t(norm)=0.117397, mflops=42.5907 (err=4.7e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 4. FFTW_ESTIMATE: elapsed time t=1.41443 s, 64 iters, t-(init.)=1.27943 s t(norm)=0.144541, mflops=34.5922 (err=4.7e-15) 5. Frigo-old: elapsed time t=1.05087 s, 16 iters, t-(init.)=1.01614 s t(norm)=0.459184, mflops=10.8889 (err=4.8e-15) 6. GSL: elapsed time t=1.19097 s, 32 iters, t-(init.)=1.12414 s t(norm)=0.253995, mflops=19.6854 (err=4.7e-15) 7. NAPACK (f2c): elapsed time t=1.72191 s, 16 iters, t-(init.)=1.6883 s t(norm)=0.762929, mflops=6.55369 (err=7.8e-14) 8. Nielsen: elapsed time t=1.02797 s, 16 iters, t-(init.)=0.993592 s t(norm)=0.448996, mflops=11.136 (err=1.1e-14) 9. Singleton (f2c): elapsed time t=1.41402 s, 32 iters, t-(init.)=1.34708 s t(norm)=0.304367, mflops=16.4275 (err=6.7e-15) 10. Temperton (f2c): elapsed time t=1.84997 s, 32 iters, t-(init.)=1.78323 s t(norm)=0.402913, mflops=12.4096 (err=4.7e-15) 11. Valkenburg: elapsed time t=1.38106 s, 8 iters, t-(init.)=1.36321 s t(norm)=1.23204, mflops=4.0583 (err=4.7e-15) Top mflops for N=10368 = 43.5088 Normalized results and averages for N=10368: fft 0: mflops = 39.712 (norm. = 0.912735), norm. avg. (of 15) = 0.705231 fft 1: mflops = 43.5088 (norm. = 1), norm. avg. (of 15) = 0.726508 fft 2: mflops = 12.3635 (norm. = 0.284161), norm. avg. (of 15) = 0.211819 fft 3: mflops = 42.5907 (norm. = 0.978897), norm. avg. (of 15) = 0.892087 fft 4: mflops = 34.5922 (norm. = 0.795062), norm. avg. (of 15) = 0.878603 fft 5: mflops = 10.8889 (norm. = 0.250268), norm. avg. (of 15) = 0.190124 fft 6: mflops = 19.6854 (norm. = 0.452447), norm. avg. (of 15) = 0.353357 fft 7: mflops = 6.55369 (norm. 
= 0.150629), norm. avg. (of 15) = 0.0985945 fft 8: mflops = 11.136 (norm. = 0.255947), norm. avg. (of 15) = 0.177836 fft 9: mflops = 16.4275 (norm. = 0.377568), norm. avg. (of 15) = 0.39262 fft 10: mflops = 12.4096 (norm. = 0.285221), norm. avg. (of 11) = 0.176547 fft 11: mflops = 4.0583 (norm. = 0.0932753), norm. avg. (of 15) = 0.0702929 Benchmarking for array size = 27000: 0. CWP (min N) (N=27720): elapsed time t=1.95881 s, 32 iters, t-(init.)=1.69398 s t(norm)=0.133189, mflops=37.5407 1. CWP (best N) (N=27720): elapsed time t=1.95767 s, 32 iters, t-(init.)=1.69192 s t(norm)=0.133027, mflops=37.5864 2. FFTPACK (f2c): elapsed time t=1.43428 s, 8 iters, t-(init.)=1.36998 s t(norm)=0.430856, mflops=11.6048 (err=7.3e-15) FFTW_MEASURE plan: (cost = 7.406000e-02) FFTW_TWIDDLE 8 FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 15 3. FFTW: elapsed time t=1.15952 s, 16 iters, t-(init.)=1.03066 s t(norm)=0.162071, mflops=30.8506 (err=7.3e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.18654 s, 16 iters, t-(init.)=1.05733 s t(norm)=0.166264, mflops=30.0727 (err=7.3e-15) 5. Frigo-old: elapsed time t=1.14573 s, 4 iters, t-(init.)=1.11286 s t(norm)=0.699986, mflops=7.143 (err=7.3e-15) 6. GSL: elapsed time t=1.08517 s, 8 iters, t-(init.)=1.02115 s t(norm)=0.321151, mflops=15.569 (err=7.3e-15) 7. NAPACK (f2c): elapsed time t=1.56367 s, 4 iters, t-(init.)=1.53154 s t(norm)=0.963332, mflops=5.19032 (err=1.0e-12) 8. Nielsen: elapsed time t=1.67388 s, 8 iters, t-(init.)=1.609 s t(norm)=0.506029, mflops=9.88086 (err=2.0e-13) 9. Singleton (f2c): elapsed time t=1.29928 s, 8 iters, t-(init.)=1.23622 s t(norm)=0.388788, mflops=12.8605 (err=1.1e-14) 10. Temperton (f2c): elapsed time t=1.5268 s, 8 iters, t-(init.)=1.46337 s t(norm)=0.460229, mflops=10.8642 (err=7.3e-15) 11. 
Valkenburg: elapsed time t=1.15101 s, 2 iters, t-(init.)=1.13415 s t(norm)=1.42675, mflops=3.50446 (err=7.3e-15) Top mflops for N=27000 = 37.5864 Normalized results and averages for N=27000: fft 0: mflops = 37.5407 (norm. = 0.998786), norm. avg. (of 16) = 0.723578 fft 1: mflops = 37.5864 (norm. = 1), norm. avg. (of 16) = 0.743601 fft 2: mflops = 11.6048 (norm. = 0.30875), norm. avg. (of 16) = 0.217877 fft 3: mflops = 30.8506 (norm. = 0.820792), norm. avg. (of 16) = 0.887631 fft 4: mflops = 30.0727 (norm. = 0.800096), norm. avg. (of 16) = 0.873697 fft 5: mflops = 7.143 (norm. = 0.190042), norm. avg. (of 16) = 0.190119 fft 6: mflops = 15.569 (norm. = 0.414219), norm. avg. (of 16) = 0.357161 fft 7: mflops = 5.19032 (norm. = 0.13809), norm. avg. (of 16) = 0.101063 fft 8: mflops = 9.88086 (norm. = 0.262884), norm. avg. (of 16) = 0.183152 fft 9: mflops = 12.8605 (norm. = 0.342158), norm. avg. (of 16) = 0.389466 fft 10: mflops = 10.8642 (norm. = 0.289045), norm. avg. (of 12) = 0.185922 fft 11: mflops = 3.50446 (norm. = 0.0932375), norm. avg. (of 16) = 0.0717269 Benchmarking for array size = 75600: 0. CWP (min N) (N=80080): elapsed time t=1.81302 s, 8 iters, t-(init.)=1.60883 s t(norm)=0.164142, mflops=30.4614 1. CWP (best N) (N=80080): elapsed time t=1.81297 s, 8 iters, t-(init.)=1.60873 s t(norm)=0.164132, mflops=30.4633 2. FFTPACK (f2c): elapsed time t=1.49622 s, 2 iters, t-(init.)=1.44783 s t(norm)=0.590864, mflops=8.46218 (err=9.4e-15) FFTW_MEASURE plan: (cost = 2.248450e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 3. FFTW: elapsed time t=1.78704 s, 8 iters, t-(init.)=1.59422 s t(norm)=0.162651, mflops=30.7406 (err=9.4e-15) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.96027 s, 8 iters, t-(init.)=1.76747 s t(norm)=0.180328, mflops=27.7273 (err=9.4e-15) 5. 
Frigo-old: elapsed time t=1.97765 s, 2 iters, t-(init.)=1.92931 s t(norm)=0.787359, mflops=6.35034 (err=9.4e-15) 6. GSL: elapsed time t=1.79427 s, 4 iters, t-(init.)=1.69755 s t(norm)=0.346388, mflops=14.4347 (err=9.4e-15) 7. NAPACK (f2c): elapsed time t=1.20626 s, 1 iters, t-(init.)=1.18227 s t(norm)=0.96498, mflops=5.18146 (err=5.1e-12) 8. Nielsen: elapsed time t=1.61218 s, 2 iters, t-(init.)=1.56382 s t(norm)=0.638199, mflops=7.83455 (err=4.7e-13) 9. Singleton (f2c): elapsed time t=1.27274 s, 2 iters, t-(init.)=1.22563 s t(norm)=0.500184, mflops=9.99631 (err=1.3e-14) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.99331 s, 1 iters, t-(init.)=1.96911 s t(norm)=1.6072, mflops=3.11101 (err=9.5e-15) Top mflops for N=75600 = 30.7406 Normalized results and averages for N=75600: fft 0: mflops = 30.4614 (norm. = 0.990917), norm. avg. (of 17) = 0.739304 fft 1: mflops = 30.4633 (norm. = 0.990981), norm. avg. (of 17) = 0.758153 fft 2: mflops = 8.46218 (norm. = 0.275277), norm. avg. (of 17) = 0.221254 fft 3: mflops = 30.7406 (norm. = 1), norm. avg. (of 17) = 0.894241 fft 4: mflops = 27.7273 (norm. = 0.901976), norm. avg. (of 17) = 0.87536 fft 5: mflops = 6.35034 (norm. = 0.206578), norm. avg. (of 17) = 0.191087 fft 6: mflops = 14.4347 (norm. = 0.469564), norm. avg. (of 17) = 0.363773 fft 7: mflops = 5.18146 (norm. = 0.168554), norm. avg. (of 17) = 0.105033 fft 8: mflops = 7.83455 (norm. = 0.25486), norm. avg. (of 17) = 0.18737 fft 9: mflops = 9.99631 (norm. = 0.325183), norm. avg. (of 17) = 0.385685 fft 10: mflops = -1 (norm. = -0.0325303), norm. avg. (of 12) = 0.185922 fft 11: mflops = 3.11101 (norm. = 0.101202), norm. avg. (of 17) = 0.0734607 Benchmarking for array size = 165375: 0. CWP (min N) (N=180180): elapsed time t=1.3536 s, 2 iters, t-(init.)=1.2383 s t(norm)=0.21597, mflops=23.1514 1. CWP (best N) (N=180180): elapsed time t=1.35389 s, 2 iters, t-(init.)=1.23884 s t(norm)=0.216064, mflops=23.1412 2. 
FFTPACK (f2c): elapsed time t=2.50573 s, 1 iters, t-(init.)=2.45306 s t(norm)=0.855667, mflops=5.8434 (err=3.7e-14) FFTW_MEASURE plan: (cost = 6.144140e-01) FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 15 3. FFTW: elapsed time t=1.21815 s, 2 iters, t-(init.)=1.11244 s t(norm)=0.194019, mflops=25.7706 (err=3.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.2307 s, 2 iters, t-(init.)=1.12518 s t(norm)=0.196241, mflops=25.4788 (err=3.7e-14) 5. Frigo-old: elapsed time t=3.24014 s, 1 iters, t-(init.)=3.18744 s t(norm)=1.11183, mflops=4.49709 (err=3.7e-14) 6. GSL: elapsed time t=1.03121 s, 1 iters, t-(init.)=0.978263 s t(norm)=0.341234, mflops=14.6527 (err=3.7e-14) 7. NAPACK (f2c): elapsed time t=3.1472 s, 1 iters, t-(init.)=3.09429 s t(norm)=1.07934, mflops=4.63246 (err=1.6e-11) 8. Nielsen: elapsed time t=2.06356 s, 1 iters, t-(init.)=2.01076 s t(norm)=0.701385, mflops=7.12875 (err=1.6e-12) 9. Singleton (f2c): elapsed time t=1.4416 s, 1 iters, t-(init.)=1.39023 s t(norm)=0.484934, mflops=10.3107 (err=5.6e-14) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=4.89788 s, 1 iters, t-(init.)=4.84539 s t(norm)=1.69015, mflops=2.95832 (err=3.6e-14) Top mflops for N=165375 = 25.7706 Normalized results and averages for N=165375: fft 0: mflops = 23.1514 (norm. = 0.898365), norm. avg. (of 18) = 0.748141 fft 1: mflops = 23.1412 (norm. = 0.89797), norm. avg. (of 18) = 0.76592 fft 2: mflops = 5.8434 (norm. = 0.226747), norm. avg. (of 18) = 0.221559 fft 3: mflops = 25.7706 (norm. = 1), norm. avg. (of 18) = 0.900116 fft 4: mflops = 25.4788 (norm. = 0.988677), norm. avg. (of 18) = 0.881655 fft 5: mflops = 4.49709 (norm. = 0.174505), norm. avg. (of 18) = 0.190166 fft 6: mflops = 14.6527 (norm. = 0.568582), norm. avg. (of 18) = 0.375151 fft 7: mflops = 4.63246 (norm. = 0.179758), norm. 
avg. (of 18) = 0.109184 fft 8: mflops = 7.12875 (norm. = 0.276623), norm. avg. (of 18) = 0.192328 fft 9: mflops = 10.3107 (norm. = 0.400094), norm. avg. (of 18) = 0.386485 fft 10: mflops = -1 (norm. = -0.0388039), norm. avg. (of 12) = 0.185922 fft 11: mflops = 2.95832 (norm. = 0.114794), norm. avg. (of 18) = 0.075757 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) Maximum array size N = 1048576 Benchmarking FFTs: 0. FFTW 1. HARM (f2c) 2. NR (C) 3. PDA (f2c) 4. Singleton (f2c) 5. Temperton (f2c) Computing normalized averages (6 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.13177 s, 65536 iters, t-(init.)=1.03316 s t(norm)=0.041054, mflops=121.791 (err=1.9e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. NR (C): elapsed time t=1.46806 s, 32768 iters, t-(init.)=1.41865 s t(norm)=0.112744, mflops=44.3483 (err=2.3e-16) 3. PDA (f2c): elapsed time t=1.70554 s, 8192 iters, t-(init.)=1.69311 s t(norm)=0.538225, mflops=9.28979 (err=2.8e-16) 4. Singleton (f2c): elapsed time t=1.60105 s, 65536 iters, t-(init.)=1.50249 s t(norm)=0.0597035, mflops=83.7472 (err=1.9e-16) 5. Temperton (f2c): elapsed time t=1.21645 s, 16384 iters, t-(init.)=1.19153 s t(norm)=0.189389, mflops=26.4007 (err=1.9e-16) Top mflops for N=64 = 121.791 Normalized results and averages for N=64: fft 0: mflops = 121.791 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.0082108), norm. avg. (of 0) = -1 fft 2: mflops = 44.3483 (norm. = 0.364135), norm. avg. (of 1) = 0.364135 fft 3: mflops = 9.28979 (norm. = 0.0762766), norm. avg. (of 1) = 0.0762766 fft 4: mflops = 83.7472 (norm. = 0.687632), norm. avg. (of 1) = 0.687632 fft 5: mflops = 26.4007 (norm. = 0.216771), norm. avg. 
(of 1) = 0.216771 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.34054 s, 8192 iters, t-(init.)=1.24498 s t(norm)=0.0329806, mflops=151.604 (err=3.8e-16) 1. HARM (f2c): elapsed time t=1.60739 s, 2048 iters, t-(init.)=1.58324 s t(norm)=0.167766, mflops=29.8034 (err=3.6e-16) 2. NR (C): elapsed time t=1.62634 s, 4096 iters, t-(init.)=1.57827 s t(norm)=0.0836197, mflops=59.7945 (err=2.9e-16) 3. PDA (f2c): elapsed time t=1.62354 s, 1024 iters, t-(init.)=1.6114 s t(norm)=0.341501, mflops=14.6412 (err=3.1e-16) 4. Singleton (f2c): elapsed time t=1.34938 s, 4096 iters, t-(init.)=1.3014 s t(norm)=0.0689505, mflops=72.5158 (err=3.1e-16) 5. Temperton (f2c): elapsed time t=1.69798 s, 2048 iters, t-(init.)=1.67392 s t(norm)=0.177375, mflops=28.1889 (err=3.7e-16) Top mflops for N=512 = 151.604 Normalized results and averages for N=512: fft 0: mflops = 151.604 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 29.8034 (norm. = 0.196587), norm. avg. (of 1) = 0.196587 fft 2: mflops = 59.7945 (norm. = 0.394412), norm. avg. (of 2) = 0.379274 fft 3: mflops = 14.6412 (norm. = 0.0965753), norm. avg. (of 2) = 0.086426 fft 4: mflops = 72.5158 (norm. = 0.478323), norm. avg. (of 2) = 0.582977 fft 5: mflops = 28.1889 (norm. = 0.185937), norm. avg. (of 2) = 0.201354 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.51395 s, 256 iters, t-(init.)=1.30611 s t(norm)=0.103801, mflops=48.1693 (err=4.1e-16) 1. HARM (f2c): elapsed time t=1.5589 s, 128 iters, t-(init.)=1.45508 s t(norm)=0.231279, mflops=21.6189 (err=4.0e-16) 2. NR (C): elapsed time t=1.35327 s, 64 iters, t-(init.)=1.30108 s t(norm)=0.413604, mflops=12.0889 (err=4.7e-16) 3. PDA (f2c): elapsed time t=1.12435 s, 64 iters, t-(init.)=1.07237 s t(norm)=0.340899, mflops=14.6671 (err=3.8e-16) 4. Singleton (f2c): elapsed time t=1.60552 s, 128 iters, t-(init.)=1.50175 s t(norm)=0.238697, mflops=20.947 (err=4.7e-16) 5. 
Temperton (f2c): elapsed time t=1.705 s, 128 iters, t-(init.)=1.601 s t(norm)=0.254473, mflops=19.6485 (err=4.1e-16) Top mflops for N=4096 = 48.1693 Normalized results and averages for N=4096: fft 0: mflops = 48.1693 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 21.6189 (norm. = 0.448812), norm. avg. (of 2) = 0.322699 fft 2: mflops = 12.0889 (norm. = 0.250966), norm. avg. (of 3) = 0.336504 fft 3: mflops = 14.6671 (norm. = 0.304491), norm. avg. (of 3) = 0.159114 fft 4: mflops = 20.947 (norm. = 0.434863), norm. avg. (of 3) = 0.533606 fft 5: mflops = 19.6485 (norm. = 0.407905), norm. avg. (of 3) = 0.270204 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.42968 s, 16 iters, t-(init.)=1.26222 s t(norm)=0.1605, mflops=31.1526 (err=4.8e-16) 1. HARM (f2c): elapsed time t=1.3262 s, 8 iters, t-(init.)=1.2424 s t(norm)=0.315959, mflops=15.8248 (err=4.8e-16) 2. NR (C): elapsed time t=1.31933 s, 4 iters, t-(init.)=1.2793 s t(norm)=0.650687, mflops=7.68419 (err=6.0e-16) 3. PDA (f2c): elapsed time t=1.93061 s, 8 iters, t-(init.)=1.84713 s t(norm)=0.469751, mflops=10.6439 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.91127 s, 8 iters, t-(init.)=1.82806 s t(norm)=0.464901, mflops=10.755 (err=4.9e-16) 5. Temperton (f2c): elapsed time t=1.78383 s, 8 iters, t-(init.)=1.69983 s t(norm)=0.43229, mflops=11.5663 (err=5.1e-16) Top mflops for N=32768 = 31.1526 Normalized results and averages for N=32768: fft 0: mflops = 31.1526 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 15.8248 (norm. = 0.507977), norm. avg. (of 3) = 0.384459 fft 2: mflops = 7.68419 (norm. = 0.246663), norm. avg. (of 4) = 0.314044 fft 3: mflops = 10.6439 (norm. = 0.341671), norm. avg. (of 4) = 0.204754 fft 4: mflops = 10.755 (norm. = 0.345235), norm. avg. (of 4) = 0.486513 fft 5: mflops = 11.5663 (norm. = 0.371279), norm. avg. (of 4) = 0.295473 Benchmarking for array size = 64x64x64 (power of 2): 0. 
FFTW: elapsed time t=1.00655 s, 1 iters, t-(init.)=0.92308 s t(norm)=0.195626, mflops=25.559 (err=1.0e-15) 1. HARM (f2c): elapsed time t=1.64039 s, 1 iters, t-(init.)=1.55652 s t(norm)=0.32987, mflops=15.1575 (err=1.0e-15) 2. NR (C): elapsed time t=4.22722 s, 1 iters, t-(init.)=4.14347 s t(norm)=0.878115, mflops=5.69401 (err=1.0e-15) 3. PDA (f2c): elapsed time t=2.53858 s, 1 iters, t-(init.)=2.45504 s t(norm)=0.520292, mflops=9.60999 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=2.54103 s, 1 iters, t-(init.)=2.45725 s t(norm)=0.520759, mflops=9.60138 (err=1.4e-15) 5. Temperton (f2c): elapsed time t=2.16932 s, 1 iters, t-(init.)=2.08581 s t(norm)=0.442041, mflops=11.3112 (err=9.9e-16) Top mflops for N=262144 = 25.559 Normalized results and averages for N=262144: fft 0: mflops = 25.559 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 15.1575 (norm. = 0.59304), norm. avg. (of 4) = 0.436604 fft 2: mflops = 5.69401 (norm. = 0.22278), norm. avg. (of 5) = 0.295791 fft 3: mflops = 9.60999 (norm. = 0.375993), norm. avg. (of 5) = 0.239002 fft 4: mflops = 9.60138 (norm. = 0.375656), norm. avg. (of 5) = 0.464342 fft 5: mflops = 11.3112 (norm. = 0.442552), norm. avg. (of 5) = 0.324889 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=2.11877 s, 1 iters, t-(init.)=1.95184 s t(norm)=0.195939, mflops=25.5182 (err=9.2e-16) 1. HARM (f2c): elapsed time t=3.73104 s, 1 iters, t-(init.)=3.56361 s t(norm)=0.357739, mflops=13.9767 (err=9.4e-16) 2. NR (C): elapsed time t=8.93856 s, 1 iters, t-(init.)=8.77105 s t(norm)=0.880498, mflops=5.67861 (err=9.6e-16) 3. PDA (f2c): elapsed time t=5.28973 s, 1 iters, t-(init.)=5.12224 s t(norm)=0.514205, mflops=9.72375 (err=8.8e-16) 4. Singleton (f2c): elapsed time t=5.77527 s, 1 iters, t-(init.)=5.60798 s t(norm)=0.562967, mflops=8.88152 (err=1.3e-15) 5. 
Temperton (f2c): elapsed time t=5.45322 s, 1 iters, t-(init.)=5.28602 s t(norm)=0.530647, mflops=9.42246 (err=9.2e-16) Top mflops for N=524288 = 25.5182 Normalized results and averages for N=524288: fft 0: mflops = 25.5182 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 13.9767 (norm. = 0.547714), norm. avg. (of 5) = 0.458826 fft 2: mflops = 5.67861 (norm. = 0.222532), norm. avg. (of 6) = 0.283581 fft 3: mflops = 9.72375 (norm. = 0.381052), norm. avg. (of 6) = 0.262677 fft 4: mflops = 8.88152 (norm. = 0.348047), norm. avg. (of 6) = 0.444959 fft 5: mflops = 9.42246 (norm. = 0.369245), norm. avg. (of 6) = 0.332282 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=5.54856 s, 1 iters, t-(init.)=5.21419 s t(norm)=0.248632, mflops=20.11 (err=1.2e-15) 1. HARM (f2c): elapsed time t=7.70838 s, 1 iters, t-(init.)=7.37376 s t(norm)=0.351608, mflops=14.2204 (err=1.2e-15) 2. NR (C): elapsed time t=18.6959 s, 1 iters, t-(init.)=18.3612 s t(norm)=0.87553, mflops=5.71083 (err=1.3e-15) 3. PDA (f2c): elapsed time t=11.7869 s, 1 iters, t-(init.)=11.4523 s t(norm)=0.546086, mflops=9.15606 (err=1.2e-15) 4. Singleton (f2c): elapsed time t=11.6986 s, 1 iters, t-(init.)=11.364 s t(norm)=0.541877, mflops=9.22719 (err=1.7e-15) 5. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 20.11 Normalized results and averages for N=1048576: fft 0: mflops = 20.11 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 14.2204 (norm. = 0.707127), norm. avg. (of 6) = 0.50021 fft 2: mflops = 5.71083 (norm. = 0.283979), norm. avg. (of 7) = 0.283638 fft 3: mflops = 9.15606 (norm. = 0.455298), norm. avg. (of 7) = 0.290194 fft 4: mflops = 9.22719 (norm. = 0.458835), norm. avg. (of 7) = 0.446941 fft 5: mflops = -1 (norm. = -0.0497264), norm. avg. 
(of 6) = 0.332282 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) Maximum array size N = 1404928 Benchmarking FFTs: 0. FFTW 1. PDA (f2c) 2. Singleton (f2c) 3. Temperton (f2c) Computing normalized averages (4 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1.94716 s, 32768 iters, t-(init.)=1.85256 s t(norm)=0.0649297, mflops=77.0063 (err=2.4e-16) 1. PDA (f2c): elapsed time t=1.69203 s, 4096 iters, t-(init.)=1.6801 s t(norm)=0.47108, mflops=10.6139 (err=2.1e-16) 2. Singleton (f2c): elapsed time t=1.4301 s, 32768 iters, t-(init.)=1.33558 s t(norm)=0.0468102, mflops=106.814 (err=3.1e-16) 3. Temperton (f2c): elapsed time t=1.43806 s, 8192 iters, t-(init.)=1.41426 s t(norm)=0.198271, mflops=25.218 (err=2.4e-16) Top mflops for N=125 = 106.814 Normalized results and averages for N=125: fft 0: mflops = 77.0063 (norm. = 0.720937), norm. avg. (of 1) = 0.720937 fft 1: mflops = 10.6139 (norm. = 0.0993679), norm. avg. (of 1) = 0.0993679 fft 2: mflops = 106.814 (norm. = 1), norm. avg. (of 1) = 1 fft 3: mflops = 25.218 (norm. = 0.236093), norm. avg. (of 1) = 0.236093 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.27995 s, 16384 iters, t-(init.)=1.19896 s t(norm)=0.0436873, mflops=114.45 (err=3.0e-16) 1. PDA (f2c): elapsed time t=1.50431 s, 2048 iters, t-(init.)=1.4941 s t(norm)=0.435533, mflops=11.4802 (err=3.7e-16) 2. 
Singleton (f2c): elapsed time t=1.11384 s, 8192 iters, t-(init.)=1.07333 s t(norm)=0.0782195, mflops=63.9227 (err=3.1e-16)
3. Temperton (f2c): elapsed time t=1.46946 s, 4096 iters, t-(init.)=1.44923 s t(norm)=0.211227, mflops=23.6712 (err=3.2e-16)
Top mflops for N=216 = 114.45
Normalized results and averages for N=216:
fft 0: mflops = 114.45 (norm. = 1), norm. avg. (of 2) = 0.860469
fft 1: mflops = 11.4802 (norm. = 0.100308), norm. avg. (of 2) = 0.0998377
fft 2: mflops = 63.9227 (norm. = 0.558522), norm. avg. (of 2) = 0.779261
fft 3: mflops = 23.6712 (norm. = 0.206826), norm. avg. (of 2) = 0.221459

Benchmarking for array size = 7x7x7:
0. FFTW: elapsed time t=1.63102 s, 8192 iters, t-(init.)=1.56678 s t(norm)=0.0662071, mflops=75.5206 (err=4.0e-16)
1. PDA (f2c): elapsed time t=1.09332 s, 512 iters, t-(init.)=1.08916 s t(norm)=0.736391, mflops=6.78987 (err=4.0e-16)
2. Singleton (f2c): elapsed time t=1.05849 s, 4096 iters, t-(init.)=1.02637 s t(norm)=0.0867422, mflops=57.6421 (err=4.9e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=343 = 75.5206
Normalized results and averages for N=343:
fft 0: mflops = 75.5206 (norm. = 1), norm. avg. (of 3) = 0.906979
fft 1: mflops = 6.78987 (norm. = 0.0899076), norm. avg. (of 3) = 0.0965277
fft 2: mflops = 57.6421 (norm. = 0.763263), norm. avg. (of 3) = 0.773928
fft 3: mflops = -1 (norm. = -0.0132414), norm. avg. (of 2) = 0.221459

Benchmarking for array size = 9x9x9:
0. FFTW: elapsed time t=1.51999 s, 4096 iters, t-(init.)=1.45125 s t(norm)=0.0511074, mflops=97.8332 (err=5.4e-16)
1. PDA (f2c): elapsed time t=1.26094 s, 512 iters, t-(init.)=1.25233 s t(norm)=0.352818, mflops=14.1716 (err=5.2e-16)
2. Singleton (f2c): elapsed time t=1.0075 s, 2048 iters, t-(init.)=0.972803 s t(norm)=0.0685168, mflops=72.9748 (err=4.9e-16)
3. Temperton (f2c): elapsed time t=1.33658 s, 1024 iters, t-(init.)=1.31922 s t(norm)=0.185832, mflops=26.906 (err=5.8e-16)
Top mflops for N=729 = 97.8332
Normalized results and averages for N=729:
fft 0: mflops = 97.8332 (norm. = 1), norm. avg. (of 4) = 0.930234
fft 1: mflops = 14.1716 (norm. = 0.144855), norm. avg. (of 4) = 0.108609
fft 2: mflops = 72.9748 (norm. = 0.745911), norm. avg. (of 4) = 0.766924
fft 3: mflops = 26.906 (norm. = 0.27502), norm. avg. (of 3) = 0.239313

Benchmarking for array size = 10x10x10:
0. FFTW: elapsed time t=1.14121 s, 2048 iters, t-(init.)=1.0885 s t(norm)=0.0533319, mflops=93.7524 (err=3.8e-16)
1. PDA (f2c): elapsed time t=1.73904 s, 512 iters, t-(init.)=1.72555 s t(norm)=0.338178, mflops=14.7851 (err=4.2e-16)
2. Singleton (f2c): elapsed time t=1.58857 s, 2048 iters, t-(init.)=1.53574 s t(norm)=0.0752447, mflops=66.4499 (err=4.4e-16)
3. Temperton (f2c): elapsed time t=1.14619 s, 512 iters, t-(init.)=1.13273 s t(norm)=0.221996, mflops=22.523 (err=3.6e-16)
Top mflops for N=1000 = 93.7524
Normalized results and averages for N=1000:
fft 0: mflops = 93.7524 (norm. = 1), norm. avg. (of 5) = 0.944187
fft 1: mflops = 14.7851 (norm. = 0.157704), norm. avg. (of 5) = 0.118428
fft 2: mflops = 66.4499 (norm. = 0.70878), norm. avg. (of 5) = 0.755295
fft 3: mflops = 22.523 (norm. = 0.240239), norm. avg. (of 4) = 0.239544

Benchmarking for array size = 11x11x11:
0. FFTW: elapsed time t=1.00842 s, 512 iters, t-(init.)=0.873778 s t(norm)=0.123546, mflops=40.4709 (err=4.0e-16)
1. PDA (f2c): elapsed time t=1.39074 s, 128 iters, t-(init.)=1.35716 s t(norm)=0.767571, mflops=6.51406 (err=4.8e-16)
2. Singleton (f2c): elapsed time t=1.96431 s, 1024 iters, t-(init.)=1.69523 s t(norm)=0.119846, mflops=41.7202 (err=6.4e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=1331 = 41.7202
Normalized results and averages for N=1331:
fft 0: mflops = 40.4709 (norm. = 0.970056), norm. avg. (of 6) = 0.948499
fft 1: mflops = 6.51406 (norm. = 0.156137), norm. avg. (of 6) = 0.124713
fft 2: mflops = 41.7202 (norm. = 1), norm. avg. (of 6) = 0.796079
fft 3: mflops = -1 (norm. = -0.0239692), norm. avg. (of 4) = 0.239544

Benchmarking for array size = 12x12x12:
0. FFTW: elapsed time t=1.70274 s, 1024 iters, t-(init.)=1.35349 s t(norm)=0.0711225, mflops=70.3013 (err=3.8e-16)
1. PDA (f2c): elapsed time t=1.58988 s, 256 iters, t-(init.)=1.50241 s t(norm)=0.315791, mflops=15.8333 (err=3.8e-16)
2. Singleton (f2c): elapsed time t=1.15812 s, 256 iters, t-(init.)=1.07099 s t(norm)=0.225111, mflops=22.2113 (err=4.0e-16)
3. Temperton (f2c): elapsed time t=1.08639 s, 256 iters, t-(init.)=0.998961 s t(norm)=0.209971, mflops=23.8128 (err=3.8e-16)
Top mflops for N=1728 = 70.3013
Normalized results and averages for N=1728:
fft 0: mflops = 70.3013 (norm. = 1), norm. avg. (of 7) = 0.955856
fft 1: mflops = 15.8333 (norm. = 0.22522), norm. avg. (of 7) = 0.139071
fft 2: mflops = 22.2113 (norm. = 0.315945), norm. avg. (of 7) = 0.727489
fft 3: mflops = 23.8128 (norm. = 0.338725), norm. avg. (of 5) = 0.259381

Benchmarking for array size = 13x13x13:
0. FFTW: elapsed time t=1.92203 s, 512 iters, t-(init.)=1.69991 s t(norm)=0.136129, mflops=36.7298 (err=4.1e-16)
1. PDA (f2c): elapsed time t=1.28965 s, 64 iters, t-(init.)=1.26173 s t(norm)=0.808317, mflops=6.1857 (err=7.2e-16)
2. Singleton (f2c): elapsed time t=1.86939 s, 512 iters, t-(init.)=1.64709 s t(norm)=0.131899, mflops=37.9078 (err=4.3e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=2197 = 37.9078
Normalized results and averages for N=2197:
fft 0: mflops = 36.7298 (norm. = 0.968925), norm. avg. (of 8) = 0.95749
fft 1: mflops = 6.1857 (norm. = 0.163177), norm. avg. (of 8) = 0.142085
fft 2: mflops = 37.9078 (norm. = 1), norm. avg. (of 8) = 0.761553
fft 3: mflops = -1 (norm. = -0.0263798), norm. avg. (of 5) = 0.259381

Benchmarking for array size = 14x14x14:
0. FFTW: elapsed time t=1.86284 s, 512 iters, t-(init.)=1.58509 s t(norm)=0.098777, mflops=50.6191 (err=3.9e-16)
1. PDA (f2c): elapsed time t=1.08048 s, 64 iters, t-(init.)=1.04561 s t(norm)=0.521269, mflops=9.59197 (err=3.8e-16)
2. Singleton (f2c): elapsed time t=1.77794 s, 256 iters, t-(init.)=1.63891 s t(norm)=0.204261, mflops=24.4785 (err=4.6e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=2744 = 50.6191
Normalized results and averages for N=2744:
fft 0: mflops = 50.6191 (norm. = 1), norm. avg. (of 9) = 0.962213
fft 1: mflops = 9.59197 (norm. = 0.189493), norm. avg. (of 9) = 0.147352
fft 2: mflops = 24.4785 (norm. = 0.483582), norm. avg. (of 9) = 0.730667
fft 3: mflops = -1 (norm. = -0.0197554), norm. avg. (of 5) = 0.259381

Benchmarking for array size = 15x15x15:
0. FFTW: elapsed time t=1.03018 s, 256 iters, t-(init.)=0.859123 s t(norm)=0.0848377, mflops=58.936 (err=4.6e-16)
1. PDA (f2c): elapsed time t=1.6405 s, 128 iters, t-(init.)=1.55502 s t(norm)=0.307113, mflops=16.2806 (err=4.5e-16)
2. Singleton (f2c): elapsed time t=1.05352 s, 128 iters, t-(init.)=0.967907 s t(norm)=0.19116, mflops=26.1561 (err=4.8e-16)
3. Temperton (f2c): elapsed time t=1.27161 s, 128 iters, t-(init.)=1.18619 s t(norm)=0.23427, mflops=21.3429 (err=4.6e-16)
Top mflops for N=3375 = 58.936
Normalized results and averages for N=3375:
fft 0: mflops = 58.936 (norm. = 1), norm. avg. (of 10) = 0.965992
fft 1: mflops = 16.2806 (norm. = 0.276243), norm. avg. (of 10) = 0.160241
fft 2: mflops = 26.1561 (norm. = 0.443805), norm. avg. (of 10) = 0.701981
fft 3: mflops = 21.3429 (norm. = 0.362136), norm. avg. (of 6) = 0.276506

Benchmarking for array size = 24x25x28:
0. FFTW: elapsed time t=1.91533 s, 64 iters, t-(init.)=1.68822 s t(norm)=0.111865, mflops=44.6969 (err=5.0e-16)
1. PDA (f2c): elapsed time t=1.34089 s, 16 iters, t-(init.)=1.2838 s t(norm)=0.340265, mflops=14.6944 (err=4.4e-16)
2. Singleton (f2c): elapsed time t=1.14765 s, 16 iters, t-(init.)=1.09076 s t(norm)=0.289102, mflops=17.2949 (err=5.6e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=16800 = 44.6969
Normalized results and averages for N=16800:
fft 0: mflops = 44.6969 (norm. = 1), norm. avg. (of 11) = 0.969084
fft 1: mflops = 14.6944 (norm. = 0.328757), norm. avg. (of 11) = 0.175561
fft 2: mflops = 17.2949 (norm. = 0.386937), norm. avg. (of 11) = 0.67334
fft 3: mflops = -1 (norm. = -0.0223729), norm. avg. (of 6) = 0.276506

Benchmarking for array size = 48x48x48:
0. FFTW: elapsed time t=1.05076 s, 4 iters, t-(init.)=0.909537 s t(norm)=0.122714, mflops=40.745 (err=7.1e-16)
1. PDA (f2c): elapsed time t=1.46056 s, 2 iters, t-(init.)=1.38996 s t(norm)=0.375066, mflops=13.331 (err=7.1e-16)
2. Singleton (f2c): elapsed time t=1.76432 s, 2 iters, t-(init.)=1.69371 s t(norm)=0.457029, mflops=10.9402 (err=8.2e-16)
3. Temperton (f2c): elapsed time t=1.28826 s, 2 iters, t-(init.)=1.21754 s t(norm)=0.32854, mflops=15.2188 (err=7.6e-16)
Top mflops for N=110592 = 40.745
Normalized results and averages for N=110592:
fft 0: mflops = 40.745 (norm. = 1), norm. avg. (of 12) = 0.97166
fft 1: mflops = 13.331 (norm. = 0.32718), norm. avg. (of 12) = 0.188196
fft 2: mflops = 10.9402 (norm. = 0.268504), norm. avg. (of 12) = 0.639604
fft 3: mflops = 15.2188 (norm. = 0.373514), norm. avg. (of 7) = 0.290365

Benchmarking for array size = 49x49x49:
0. FFTW: elapsed time t=1.15164 s, 4 iters, t-(init.)=1.00182 s t(norm)=0.126384, mflops=39.5621 (err=8.7e-16)
1. PDA (f2c): elapsed time t=1.13236 s, 1 iters, t-(init.)=1.09511 s t(norm)=0.552613, mflops=9.04793 (err=8.8e-16)
2. Singleton (f2c): elapsed time t=1.59798 s, 2 iters, t-(init.)=1.52309 s t(norm)=0.384288, mflops=13.0111 (err=1.1e-15)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=117649 = 39.5621
Normalized results and averages for N=117649:
fft 0: mflops = 39.5621 (norm. = 1), norm. avg. (of 13) = 0.97384
fft 1: mflops = 9.04793 (norm. = 0.228702), norm. avg. (of 13) = 0.191312
fft 2: mflops = 13.0111 (norm. = 0.328877), norm. avg. (of 13) = 0.615702
fft 3: mflops = -1 (norm. = -0.0252767), norm. avg. (of 7) = 0.290365

Benchmarking for array size = 60x60x60:
0. FFTW: elapsed time t=1.97823 s, 4 iters, t-(init.)=1.70224 s t(norm)=0.11118, mflops=44.972 (err=4.9e-16)
1. PDA (f2c): elapsed time t=1.52335 s, 1 iters, t-(init.)=1.45415 s t(norm)=0.379905, mflops=13.1612 (err=5.0e-16)
2. Singleton (f2c): elapsed time t=2.28531 s, 1 iters, t-(init.)=2.21686 s t(norm)=0.579168, mflops=8.63307 (err=6.0e-16)
3. Temperton (f2c): elapsed time t=1.19908 s, 1 iters, t-(init.)=1.13031 s t(norm)=0.2953, mflops=16.9319 (err=4.7e-16)
Top mflops for N=216000 = 44.972
Normalized results and averages for N=216000:
fft 0: mflops = 44.972 (norm. = 1), norm. avg. (of 14) = 0.975709
fft 1: mflops = 13.1612 (norm. = 0.292653), norm. avg. (of 14) = 0.19855
fft 2: mflops = 8.63307 (norm. = 0.191966), norm. avg. (of 14) = 0.585435
fft 3: mflops = 16.9319 (norm. = 0.3765), norm. avg. (of 8) = 0.301132

Benchmarking for array size = 72x60x56:
0. FFTW: elapsed time t=1.22573 s, 2 iters, t-(init.)=1.07124 s t(norm)=0.123799, mflops=40.3881 (err=5.7e-16)
1. PDA (f2c): elapsed time t=1.92522 s, 1 iters, t-(init.)=1.84815 s t(norm)=0.427165, mflops=11.7051 (err=6.1e-16)
2. Singleton (f2c): elapsed time t=2.77863 s, 1 iters, t-(init.)=2.70126 s t(norm)=0.624346, mflops=8.00837 (err=7.0e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=241920 = 40.3881
Normalized results and averages for N=241920:
fft 0: mflops = 40.3881 (norm. = 1), norm. avg. (of 15) = 0.977328
fft 1: mflops = 11.7051 (norm. = 0.289815), norm. avg. (of 15) = 0.204635
fft 2: mflops = 8.00837 (norm. = 0.198285), norm. avg. (of 15) = 0.559625
fft 3: mflops = -1 (norm. = -0.0247597), norm. avg. (of 8) = 0.301132

Benchmarking for array size = 75x75x75:
0. FFTW: elapsed time t=1.02501 s, 1 iters, t-(init.)=0.890344 s t(norm)=0.11294, mflops=44.2714 (err=9.0e-16)
1. PDA (f2c): elapsed time t=3.16834 s, 1 iters, t-(init.)=3.03393 s t(norm)=0.384853, mflops=12.992 (err=9.5e-16)
2. Singleton (f2c): elapsed time t=3.99505 s, 1 iters, t-(init.)=3.86063 s t(norm)=0.48972, mflops=10.2099 (err=1.3e-15)
3. Temperton (f2c): elapsed time t=2.36918 s, 1 iters, t-(init.)=2.23466 s t(norm)=0.283466, mflops=17.6388 (err=1.1e-15)
Top mflops for N=421875 = 44.2714
Normalized results and averages for N=421875:
fft 0: mflops = 44.2714 (norm. = 1), norm. avg. (of 16) = 0.978745
fft 1: mflops = 12.992 (norm. = 0.293463), norm. avg. (of 16) = 0.210186
fft 2: mflops = 10.2099 (norm. = 0.230621), norm. avg. (of 16) = 0.539062
fft 3: mflops = 17.6388 (norm. = 0.398425), norm. avg. (of 9) = 0.311942

Benchmarking for array size = 80x80x80:
0. FFTW: elapsed time t=1.51866 s, 1 iters, t-(init.)=1.35486 s t(norm)=0.139525, mflops=35.8358 (err=1.5e-15)
1. PDA (f2c): elapsed time t=4.06201 s, 1 iters, t-(init.)=3.89881 s t(norm)=0.401505, mflops=12.4531 (err=1.5e-15)
2. Singleton (f2c): elapsed time t=4.67353 s, 1 iters, t-(init.)=4.51011 s t(norm)=0.464458, mflops=10.7652 (err=2.3e-15)
3. Temperton (f2c): elapsed time t=3.81715 s, 1 iters, t-(init.)=3.6541 s t(norm)=0.376305, mflops=13.2871 (err=1.5e-15)
Top mflops for N=512000 = 35.8358
Normalized results and averages for N=512000:
fft 0: mflops = 35.8358 (norm. = 1), norm. avg. (of 17) = 0.979995
fft 1: mflops = 12.4531 (norm. = 0.347506), norm. avg. (of 17) = 0.218264
fft 2: mflops = 10.7652 (norm. = 0.300405), norm. avg. (of 17) = 0.525024
fft 3: mflops = 13.2871 (norm. = 0.370777), norm. avg. (of 10) = 0.317825

Benchmarking for array size = 84x84x84:
0. FFTW: elapsed time t=1.59895 s, 1 iters, t-(init.)=1.41024 s t(norm)=0.124073, mflops=40.2989 (err=7.6e-16)
1. PDA (f2c): elapsed time t=5.4509 s, 1 iters, t-(init.)=5.26183 s t(norm)=0.462934, mflops=10.8007 (err=6.9e-16)
2. Singleton (f2c): elapsed time t=7.43375 s, 1 iters, t-(init.)=7.24455 s t(norm)=0.637373, mflops=7.8447 (err=8.6e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=592704 = 40.2989
Normalized results and averages for N=592704:
fft 0: mflops = 40.2989 (norm. = 1), norm. avg. (of 18) = 0.981107
fft 1: mflops = 10.8007 (norm. = 0.268014), norm. avg. (of 18) = 0.221028
fft 2: mflops = 7.8447 (norm. = 0.194663), norm. avg. (of 18) = 0.50667
fft 3: mflops = -1 (norm. = -0.0248146), norm. avg. (of 10) = 0.317825

Benchmarking for array size = 96x96x96:
0. FFTW: elapsed time t=3.55232 s, 1 iters, t-(init.)=3.26987 s t(norm)=0.187087, mflops=26.7256 (err=8.1e-16)
1. PDA (f2c): elapsed time t=9.16428 s, 1 iters, t-(init.)=8.88217 s t(norm)=0.508196, mflops=9.83873 (err=7.7e-16)
2. Singleton (f2c): elapsed time t=12.2293 s, 1 iters, t-(init.)=11.9472 s t(norm)=0.68356, mflops=7.31465 (err=8.2e-16)
3. Temperton (f2c): elapsed time t=8.32931 s, 1 iters, t-(init.)=8.04744 s t(norm)=0.460436, mflops=10.8593 (err=8.9e-16)
Top mflops for N=884736 = 26.7256
Normalized results and averages for N=884736:
fft 0: mflops = 26.7256 (norm. = 1), norm. avg. (of 19) = 0.982101
fft 1: mflops = 9.83873 (norm. = 0.368139), norm. avg. (of 19) = 0.228771
fft 2: mflops = 7.31465 (norm. = 0.273694), norm. avg. (of 19) = 0.494408
fft 3: mflops = 10.8593 (norm. = 0.406324), norm. avg. (of 11) = 0.325871

Benchmarking for array size = 105x105x105:
0. FFTW: elapsed time t=3.15421 s, 1 iters, t-(init.)=2.78489 s t(norm)=0.119432, mflops=41.8648 (err=7.9e-16)
1. PDA (f2c): elapsed time t=11.8314 s, 1 iters, t-(init.)=11.4619 s t(norm)=0.491552, mflops=10.1719 (err=8.1e-16)
2. Singleton (f2c): elapsed time t=11.7819 s, 1 iters, t-(init.)=11.4129 s t(norm)=0.489451, mflops=10.2155 (err=9.7e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=1157625 = 41.8648
Normalized results and averages for N=1157625:
fft 0: mflops = 41.8648 (norm. = 1), norm. avg. (of 20) = 0.982996
fft 1: mflops = 10.1719 (norm. = 0.242969), norm. avg. (of 20) = 0.22948
fft 2: mflops = 10.2155 (norm. = 0.244012), norm. avg. (of 20) = 0.481889
fft 3: mflops = -1 (norm. = -0.0238864), norm. avg. (of 11) = 0.325871

Benchmarking for array size = 112x112x112:
0. FFTW: elapsed time t=4.7542 s, 1 iters, t-(init.)=4.30602 s t(norm)=0.15008, mflops=33.3156 (err=7.2e-16)
1. PDA (f2c): elapsed time t=15.1414 s, 1 iters, t-(init.)=14.6934 s t(norm)=0.512116, mflops=9.76342 (err=7.0e-16)
2. Singleton (f2c): elapsed time t=14.0831 s, 1 iters, t-(init.)=13.6348 s t(norm)=0.475222, mflops=10.5214 (err=6.8e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=1404928 = 33.3156
Normalized results and averages for N=1404928:
fft 0: mflops = 33.3156 (norm. = 1), norm. avg. (of 21) = 0.983806
fft 1: mflops = 9.76342 (norm. = 0.293059), norm. avg. (of 21) = 0.232508
fft 2: mflops = 10.5214 (norm. = 0.31581), norm. avg. (of 21) = 0.47398
fft 3: mflops = -1 (norm. = -0.030016), norm. avg. (of 11) = 0.325871
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Beauregard, Bergland, CWP (min N), CWP (best N), Edelblute, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), NAPACK (f2c), Nielsen, NR (C), Ooura (C), QFT, Ransom, Singleton (f2c), Temperton (f2c), Valkenburg
2, 20.1058, 18.9023, 13.5628, 0.860701, 2.87843, 4.07816, 3.75882, 3.50207, , 5.5923, 19.904, 19.9046, 29.488, , 8.00744, 4.83427, 4.6545, 18.7398, , , , 2.00047, 1.49295, 5.23461, 15.906, , , 4.0288, 2.4311, 5.56809
4, 43.9864, 42.1186, 19.9536, 3.6945, 6.36162, 14.5534, 13.8722, 6.62796, 17.8925, 14.6536, 63.6122, 63.6016, 91.1691, , 21.1448, 9.76501, 9.3742, 51.7241, 21.8905, 23.3887, 20.9476, 4.5398, 5.58903, 9.98258, 35.3541, , 3.1104, 13.8141, 7.95141, 6.04999
8, 67.841, 65.977, 24.08, 5.1151, 8.51631, 23.8193, 31.8139, 19.8823, 18.4749, 21.3223, 104.323, 104.372, 137.43, 53.9686, 32.4947, 16.9305, 16.4221, 72.2261, 36.3242, 38.1247, 36.0223, 7.26767, 13.0272, 17.0705, 62.9743, , 3.86641, 15.5305, 11.3026, 6.42893
16, 40.8671, 39.9687, 28.076, 9.34626, 9.39015, 37.7309, 53.1238, 36.906, 20.162, 30.0298, 138.378, 138.316, 150.013, 75.2557, 47.7296, 25.0462, 24.8123, 83.8355, 37.2467, 45.9046, 45.2515, 10.6107, 13.6718, 25.7653, 86.9094, 48.7049, 12.0468, 41.0768, 16.2042, 6.72673
32, 47.7931, 47.1127, 31.8685, 11.2675, 9.65256, 53.7482, 56.6245, 68.2351, 22.2541, 25.9234, 150.452, 150.415, 156.115, 100.408, 45.4291, 32.7852, 33.327, 81.1447, 42.1382, 54.7211, 55.2158, 12.7057, 20.0336, 34.5608, 94.3365, 43.776, 12.7615, 52.7434, 15.4072, 6.9485
64, 48.4677, 47.2343, 35.5331, 15.8623, 9.68777, 62.9574, 60.0674, 76.2537, 24.6687, 30.0099, 159.648, 129.302, 108.503, 130.46, 54.6682, 38.4232, 40.0261, 47.4447, 44.1187, 59.5848, 60.8088, 14.9358, 25.8049, 41.8829, 107.427, 41.2725, 24.0011, 73.2661, 19.7236, 7.1121
128, 53.294, 52.2879, 39.069, 16.2296, 9.64832, 68.2209, 68.6777, 102.046, 27.1964, 31.0184, 140.93, 140.917, 121.904, 127.39, 57.7485, 42.45, 45.1457, 26.8493, 47.7766, 65.1932, 66.8657, 15.6664, 24.6638, 47.5285, 105.388, 38.3251, 23.6654, 69.8328, 17.1133, 7.19268
256, 55.645, 54.0043, 41.6503, 18.3104, 9.58733, 77.512, 79.9509, 104.973, 29.4575, 32.9532, 141.694, 145.815, 127.423, 136.484, 62.7146, 44.537, 48.4851, 34.2995, 50.3538, 68.564, 70.382, 16.8528, 27.6701, 50.8798, 114.354, 33.8207, 34.0419, 92.1895, 20.3962, 7.23338
512, 59.2058, 57.7987, 44.5656, 19.278, 9.49584, 83.1687, 81.4708, 100.936, 31.7968, 20.3397, 61.4673, 62.1209, 55.8879, 145.877, 35.4008, 45.7283, 50.7428, 28.3083, 53.4947, 72.4234, 74.2062, 14.6546, 26.9742, 52.9953, 109.702, 20.7379, 32.2783, 92.8044, 18.1007, 6.71161
1024, 51.7954, 47.037, 36.4667, 18.8657, 9.02724, 65.3515, 69.5525, 69.4669, 28.2655, 15.1225, 42.8208, 41.6252, 31.6238, 84.8761, 21.3911, 40.9028, 44.2963, 22.394, 52.971, 70.9376, 59.4933, 9.08999, 19.0972, 47.2957, 74.7957, 16.8502, 35.2515, 81.1412, 16.2681, 5.7256
2048, 12.7623, 12.5506, 9.75843, 13.368, 7.38046, 25.7584, 43.3523, 43.4426, 9.07597, 15.9249, 50.9159, 38.7208, 28.147, 30.8466, 23.1366, 12.8551, 12.7103, 20.2507, 46.4571, 60.638, 43.0053, 8.85845, 11.8155, 13.3505, 32.6182, 13.0883, 19.0467, 19.6559, 12.8665, 4.99487
4096, 11.5962, 11.4538, 8.96, 14.2594, 7.31921, 27.1239, 36.218, 42.0813, 8.42313, 14.9307, 43.5245, 42.2849, 25.1734, 32.4615, 22.1158, 12.4953, 12.5434, 19.5348, 14.4611, 15.3526, 14.5722, 8.67936, 12.5604, 12.9658, 34.3977, 11.1616, 28.2791, 24.6673, 13.7119, 4.90065
8192, 11.7808, 11.5605, 8.82609, 11.9851, 7.268, 24.6315, 38.5814, 39.0207, 8.34403, 12.0558, 37.7013, 35.5024, 24.8673, 30.0445, 16.8471, 12.3856, 12.4342, , 13.8729, 14.6065, 14.0001, 7.17218, 11.088, 12.8652, 32.922, 10.0082, 23.1302, 22.8156, 12.3887, 4.69871
16384, 10.8214, 10.6074, 8.49571, 15.4411, 7.11123, 24.3285, 37.8904, 37.8883, 8.07616, 10.1275, 33.8671, 26.892, 18.4323, 25.3295, 15.3392, 12.0388, 12.0729, , 13.5248, 14.1813, 13.1741, 6.38996, 10.1744, 12.515, 32.9632, 8.7145, 30.7576, 23.8481, 12.356, 4.14512
32768, 7.0472, 6.44303, 5.29592, 10.9005, 6.1812, 16.7485, 33.2576, 33.2413, 5.13854, 8.38911, 25.6543, 20.6656, 14.9546, 18.1891, 12.6796, 7.95207, 7.87179, , 12.749, 13.4423, 12.2485, 5.76132, 7.83253, 8.14064, 22.5373, 8.14568, 18.625, 11.7237, 8.71408, 3.39548
65536, 6.16208, 6.00096, 5.01265, 11.8483, 6.14393, 14.9593, 30.1955, 30.179, 4.88208, 7.53973, 22.285, 19.4707, 13.871, 16.1806, 12.5765, 7.72314, 7.7009, , 8.14666, 8.36841, 7.96573, 5.96586, 6.49337, 7.88673, 19.7447, 6.82492, 18.4051, 13.5992, 8.60832, 3.16502
131072, 5.01902, 4.84143, 4.04979, 8.75272, 6.09442, 12.4962, 30.9007, 30.9028, 3.968, 7.03181, 21.3645, 17.7187, 12.2876, 12.8578, 11.3551, 6.03457, 6.04018, , 7.83335, 8.14159, 7.60041, 5.74093, 5.65276, 6.13807, 19.0963, 5.15589, 15.3991, 9.95071, 6.95169, 2.97978
Norm. Avg., 0.350516, 0.337319, 0.237611, 0.188356, 0.115911, 0.443637, 0.620795, 0.654907, 0.182683, 0.219696, 0.812708, 0.753701, 0.685128, 0.692431, 0.348091, 0.249311, 0.256154, 0.397553, 0.355726, 0.42968, 0.395214, 0.123825, 0.178359, 0.268003, 0.670507, 0.236053, 0.340552, 0.436562, 0.182476, 0.0810644
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, CWP (min N), CWP (best N), FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Nielsen, Singleton (f2c), Temperton (f2c), Valkenburg
6, 20.4577, 12.8509, 15.8502, 76.502, 76.4953, 14.2378, 22.5091, 6.35195, 5.12504, 11.7831, 7.84473, 6.52982
9, 31.1607, 23.6404, 19.1657, 78.7465, 78.7282, 12.1766, 22.9256, 8.18091, 8.00793, 20.9575, 11.2618, 6.66503
12, 44.4553, 35.6468, 24.3342, 122.331, 122.352, 21.253, 35.0343, 8.84164, 10.588, 22.3173, 15.1576, 6.79958
15, 48.5429, 48.5508, 23.9076, 93.1575, 93.1299, 14.5522, 22.7598, 6.20829, 12.3681, 25.3184, 15.4426, 6.09747
18, 48.3589, 43.2771, 18.9782, 76.2271, 74.0705, 14.2398, 38.5071, 10.3227, 9.93252, 30.3531, 13.5949, 6.87157
24, 68.3402, 63.4586, 22.3792, 103.094, 103.095, 27.3534, 50.456, 11.6479, 16.4777, 29.6186, 17.4087, 7.01496
36, 75.3391, 75.3435, 23.6323, 106.712, 106.681, 17.7596, 54.3701, 12.684, 15.3252, 45.585, 19.849, 7.08729
80, 93.4254, 100.447, 28.8275, 116.879, 116.83, 34.4931, 37.0699, 9.03128, 27.0089, 67.9066, 22.0916, 6.76485
108, 77.3261, 96.5358, 26.3869, 111.15, 111.092, 16.3556, 54.4964, 14.1696, 17.9801, 49.8949, 20.9321, 7.18261
210, 104.368, 104.365, 19.4596, 88.8195, 88.3285, 16.7991, 35.7902, 7.35748, 22.2135, 42.8707, , 5.70772
504, 107.629, 107.596, 15.7051, 54.8795, 50.5611, 15.9609, 31.1362, 8.81922, 19.8724, 50.1073, , 5.62526
1000, 63.8798, 89.2719, 12.2432, 46.6368, 47.1394, 12.6566, 12.8392, 5.39607, 20.7212, 58.7075, 18.8398, 4.53044
1960, 49.4359, 49.4377, 7.11076, 38.0892, 39.848, 12.2905, 14.5701, 4.98363, 14.5096, 25.1679, , 3.8705
4725, 42.1211, 46.5772, 9.41497, 34.8772, 35.583, 7.9402, 16.2831, 5.43827, 12.759, 21.0434, , 4.05214
10368, 39.712, 43.5088, 12.3635, 42.5907, 34.5922, 10.8889, 19.6854, 6.55369, 11.136, 16.4275, 12.4096, 4.0583
27000, 37.5407, 37.5864, 11.6048, 30.8506, 30.0727, 7.143, 15.569, 5.19032, 9.88086, 12.8605, 10.8642, 3.50446
75600, 30.4614, 30.4633, 8.46218, 30.7406, 27.7273, 6.35034, 14.4347, 5.18146, 7.83455, 9.99631, , 3.11101
165375, 23.1514, 23.1412, 5.8434, 25.7706, 25.4788, 4.49709, 14.6527, 4.63246, 7.12875, 10.3107, , 2.95832
Norm. Avg., 0.748141, 0.76592, 0.221559, 0.900116, 0.881655, 0.190166, 0.375151, 0.109184, 0.192328, 0.386485, 0.185922, 0.075757
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM (f2c), NR (C), PDA (f2c), Singleton (f2c), Temperton (f2c)
4x4x4, 121.791, , 44.3483, 9.28979, 83.7472, 26.4007
8x8x8, 151.604, 29.8034, 59.7945, 14.6412, 72.5158, 28.1889
16x16x16, 48.1693, 21.6189, 12.0889, 14.6671, 20.947, 19.6485
32x32x32, 31.1526, 15.8248, 7.68419, 10.6439, 10.755, 11.5663
64x64x64, 25.559, 15.1575, 5.69401, 9.60999, 9.60138, 11.3112
256x64x32, 25.5182, 13.9767, 5.67861, 9.72375, 8.88152, 9.42246
16x1024x64, 20.11, 14.2204, 5.71083, 9.15606, 9.22719,
Norm. Avg., 1, 0.50021, 0.283638, 0.290194, 0.446941, 0.332282
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA (f2c), Singleton (f2c), Temperton (f2c)
5x5x5, 77.0063, 10.6139, 106.814, 25.218
6x6x6, 114.45, 11.4802, 63.9227, 23.6712
7x7x7, 75.5206, 6.78987, 57.6421,
9x9x9, 97.8332, 14.1716, 72.9748, 26.906
10x10x10, 93.7524, 14.7851, 66.4499, 22.523
11x11x11, 40.4709, 6.51406, 41.7202,
12x12x12, 70.3013, 15.8333, 22.2113, 23.8128
13x13x13, 36.7298, 6.1857, 37.9078,
14x14x14, 50.6191, 9.59197, 24.4785,
15x15x15, 58.936, 16.2806, 26.1561, 21.3429
24x25x28, 44.6969, 14.6944, 17.2949,
48x48x48, 40.745, 13.331, 10.9402, 15.2188
49x49x49, 39.5621, 9.04793, 13.0111,
60x60x60, 44.972, 13.1612, 8.63307, 16.9319
72x60x56, 40.3881, 11.7051, 8.00837,
75x75x75, 44.2714, 12.992, 10.2099, 17.6388
80x80x80, 35.8358, 12.4531, 10.7652, 13.2871
84x84x84, 40.2989, 10.8007, 7.8447,
96x96x96, 26.7256, 9.83873, 7.31465, 10.8593
105x105x105, 41.8648, 10.1719, 10.2155,
112x112x112, 33.3156, 9.76342, 10.5214,
Norm. Avg., 0.983806, 0.232508, 0.47398, 0.325871
@@@@ end
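
Note on reading the figures above: the t(norm), mflops, and "norm. avg." values in the log are consistent with the following arithmetic, inferred from the reported numbers rather than taken from the benchmark source. t(norm) looks like the time per transform in microseconds divided by N*log2(N), mflops equals 5 / t(norm), and each routine's normalized average is the mean of (mflops / top mflops) over the sizes it could handle, with skipped sizes left out of the count. The Python sketch below reproduces that arithmetic; the function names are hypothetical and are not part of the benchmark.

```python
import math


def normalized_time(t_minus_init_s, iters, n):
    """Microseconds per transform divided by N*log2(N) (assumed meaning of t(norm))."""
    per_transform_us = t_minus_init_s / iters * 1e6
    return per_transform_us / (n * math.log2(n))


def mflops(t_minus_init_s, iters, n):
    """Assumed convention: mflops = 5*N*log2(N) / (time per transform in microseconds)."""
    return 5.0 / normalized_time(t_minus_init_s, iters, n)


def normalized_averages(rows):
    """Average of mflops/top-mflops per routine; None marks a skipped size."""
    n_routines = len(rows[0])
    totals = [0.0] * n_routines
    counts = [0] * n_routines
    for row in rows:
        top = max(v for v in row if v is not None)
        for i, v in enumerate(row):
            if v is None:
                continue  # skipped sizes do not count toward the average
            totals[i] += v / top
            counts[i] += 1
    return [t / c if c else None for t, c in zip(totals, counts)]


# 7x7x7 FFTW entry from the log: t-(init.)=1.56678 s, 8192 iters, N=343
print(normalized_time(1.56678, 8192, 343), mflops(1.56678, 8192, 343))

# First three rows of bench.3d.np2.dat (FFTW, PDA, Singleton, Temperton)
rows = [
    [77.0063, 10.6139, 106.814, 25.218],
    [114.45, 11.4802, 63.9227, 23.6712],
    [75.5206, 6.78987, 57.6421, None],
]
print(normalized_averages(rows))
```

Run on the 7x7x7 FFTW entry, this should land close to the logged t(norm)=0.0662071 and mflops=75.5206, and the three-row example should land close to the "(of 3)" averages in the N=343 block; small differences are expected because the logged inputs are rounded.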