To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT

-------------------
@@SUBMIT@@
@ submitter = Simon P. Turner
@ submitter email = simont@sar.dra.hmg.gb
@ submitter organization = Angledata Consultants
@ computer manufacturer = Cycle Computers
@ computer model = Cycle UP-5
@ CPU manufacturer = Sun
@ CPU model = UltraSPARC II
@ CPU speed = 300 MHz
@ RAM = 128 MB
@ L2 cache size = 2 MB
@ operating system = SunOS 5.6
@ C compiler = SunSoft WorkShop cc 4.2
@ C compiler flags = -fast -native -DSOLARIS -dalign -xO5 -I../fftw-1.2.1/src/src -DUSE_SUNPERF
@ Fortran compiler = SunSoft WorkShop f77 4.2
@ Fortran compiler flags = -fast -native -dalign -libmil -xO5
@ remarks =
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log

Benchmarking for sizes:
    2 (0.000228882 MB)
    4 (0.000534058 MB)
    8 (0.000839233 MB)
    16 (0.00164795 MB)
    32 (0.00297546 MB)
    64 (0.00616455 MB)
    128 (0.0119019 MB)
    256 (0.0238037 MB)
    512 (0.0476074 MB)
    1024 (0.0939941 MB)
    2048 (0.189575 MB)
    4096 (0.37915 MB)
    8192 (0.765991 MB)
    16384 (1.51184 MB)
    32768 (3.02368 MB)
    65536 (6.09973 MB)
    131072 (12.1995 MB)
    262144 (25.4987 MB)

Maximum array size = 360360

Benchmarking FFTs:
    0. Arndt DIF
    1. Arndt DIT
    2. Arndt Split-Radix
    3. Arndt 4-step
    4. Bailey
    5. Beauregard
    6. Bergland
    7. Brenner
    8. Burrus
    9. CWP (min N)
    10. CWP (best N)
    11. Edelblute
    12. FFTPACK
    13. FFTPACK (f2c)
    14. FFTW
    15. FFTW_ESTIMATE
    16. Frigo-old
    17. Green
    18. GSL
    19. GSL DIT
    20. GSL DIF
    21. Krukar
    22. Mayer (Buneman)
    23. Mayer (simple)
    24. Mayer (lookup)
    25. Monro
    26. NAPACK (f2c)
    27. Ooura (C)
    28. Ooura (F)
    29. Ransom
    30. SCIPORT
    31. Singleton
    32. Singleton (f2c)
    33. Sorensen
    34. Sorensen DIT
    35. Temperton
    36. Temperton (f2c)
    37. Valkenburg
    38. SUNPERF

Computing normalized averages (39 transforms).

Benchmarking for array size = 2 (power of 2):
0.
Arndt DIF: elapsed time t=1.04671 s, 4194304 iters, t-(init.)=0.705349 s t(norm)=0.0840841, mflops=59.4643 (err=1.7e-17) 1. Arndt DIT: elapsed time t=1.11349 s, 4194304 iters, t-(init.)=0.791276 s t(norm)=0.0943274, mflops=53.0069 (err=1.7e-17) 2. Arndt Split-Radix: elapsed time t=1.27951 s, 2097152 iters, t-(init.)=1.14352 s t(norm)=0.272637, mflops=18.3394 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.75913 s, 262144 iters, t-(init.)=1.7425 s t(norm)=3.32355, mflops=1.50441 (err=1.7e-17) 4. Bailey: elapsed time t=1.60257 s, 1048576 iters, t-(init.)=1.5127 s t(norm)=0.721311, mflops=6.93182 (err=1.7e-17) 5. Beauregard: elapsed time t=1.6299 s, 1048576 iters, t-(init.)=1.56336 s t(norm)=0.745468, mflops=6.70719 (err=1.7e-17) 6. Bergland: elapsed time t=1.34651 s, 1048576 iters, t-(init.)=1.23576 s t(norm)=0.589256, mflops=8.48527 (err=1.7e-17) 7. Brenner: elapsed time t=1.22922 s, 1048576 iters, t-(init.)=1.15515 s t(norm)=0.550819, mflops=9.07739 (err=1.7e-17) 8. Burrus: elapsed time t=1.19381 s, 4194304 iters, t-(init.)=0.924039 s t(norm)=0.110154, mflops=45.391 (err=1.7e-17) 9. CWP (min N): elapsed time t=1.54965 s, 524288 iters, t-(init.)=1.5129 s t(norm)=1.44281, mflops=3.46545 10. CWP (best N) (N=3): elapsed time t=1.58949 s, 524288 iters, t-(init.)=1.53256 s t(norm)=1.46157, mflops=3.42099 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.54767 s, 2097152 iters, t-(init.)=1.36973 s t(norm)=0.326569, mflops=15.3107 (err=1.7e-17) 13. FFTPACK (f2c): elapsed time t=1.90959 s, 2097152 iters, t-(init.)=1.76328 s t(norm)=0.420399, mflops=11.8935 (err=1.7e-17) FFTW_MEASURE plan: (cost = 2.606719e-07) FFTW_NOTW 2 14. FFTW: elapsed time t=1.37587 s, 4194304 iters, t-(init.)=0.964269 s t(norm)=0.11495, mflops=43.4972 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 15. FFTW_ESTIMATE: elapsed time t=1.40435 s, 4194304 iters, t-(init.)=1.10794 s t(norm)=0.132076, mflops=37.8569 (err=1.7e-17) 16. 
Frigo-old: elapsed time t=1.52769 s, 8388608 iters, t-(init.)=0.723326 s t(norm)=0.0431136, mflops=115.973 (err=1.7e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.21571 s, 1048576 iters, t-(init.)=1.11068 s t(norm)=0.529615, mflops=9.44082 (err=1.7e-17) 19. GSL DIT: elapsed time t=1.04493 s, 524288 iters, t-(init.)=1.01165 s t(norm)=0.964788, mflops=5.18249 (err=1.7e-17) 20. GSL DIF: elapsed time t=1.03649 s, 524288 iters, t-(init.)=0.975981 s t(norm)=0.930768, mflops=5.37191 (err=1.7e-17) 21. Krukar: elapsed time t=1.37853 s, 4194304 iters, t-(init.)=0.963774 s t(norm)=0.114891, mflops=43.5196 (err=1.7e-17) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.97847 s, 524288 iters, t-(init.)=1.94522 s t(norm)=1.85511, mflops=2.69526 (err=1.7e-17) 27. Ooura (C): elapsed time t=1.82707 s, 4194304 iters, t-(init.)=1.54301 s t(norm)=0.183941, mflops=27.1826 (err=1.7e-17) 28. Ooura (F): elapsed time t=1.57777 s, 4194304 iters, t-(init.)=1.30556 s t(norm)=0.155635, mflops=32.1264 (err=1.7e-17) 29. Skipping fft (Ransom doesn't work for N=2). 30. Skipping fft (SCIPORT can't handle N < 4). 31. Singleton: elapsed time t=1.29719 s, 524288 iters, t-(init.)=1.24589 s t(norm)=1.18817, mflops=4.20814 (err=1.7e-17) 32. Singleton (f2c): elapsed time t=1.2704 s, 524288 iters, t-(init.)=1.22964 s t(norm)=1.17268, mflops=4.26375 (err=1.7e-17) 33. Sorensen: elapsed time t=1.17023 s, 2097152 iters, t-(init.)=0.991537 s t(norm)=0.236401, mflops=21.1505 (err=1.7e-17) 34. Sorensen DIT: elapsed time t=1.2138 s, 4194304 iters, t-(init.)=0.767011 s t(norm)=0.0914349, mflops=54.6837 (err=1.7e-17) 35. Temperton: elapsed time t=1.31309 s, 524288 iters, t-(init.)=1.27984 s t(norm)=1.22055, mflops=4.09651 (err=1.7e-17) 36. 
Temperton (f2c): elapsed time t=1.53651 s, 524288 iters, t-(init.)=1.48806 s t(norm)=1.41912, mflops=3.52331 (err=1.7e-17) 37. Valkenburg: elapsed time t=1.62635 s, 1048576 iters, t-(init.)=1.54068 s t(norm)=0.734653, mflops=6.80593 (err=1.7e-17) 38. SUNPERF: elapsed time t=1.09556 s, 1048576 iters, t-(init.)=1.00903 s t(norm)=0.481142, mflops=10.3919 (err=1.7e-17) Top mflops for N=2 = 115.973 Normalized results and averages for N=2: fft 0: mflops = 59.4643 (norm. = 0.512744), norm. avg. (of 1) = 0.512744 fft 1: mflops = 53.0069 (norm. = 0.457063), norm. avg. (of 1) = 0.457063 fft 2: mflops = 18.3394 (norm. = 0.158135), norm. avg. (of 1) = 0.158135 fft 3: mflops = 1.50441 (norm. = 0.0129721), norm. avg. (of 1) = 0.0129721 fft 4: mflops = 6.93182 (norm. = 0.0597711), norm. avg. (of 1) = 0.0597711 fft 5: mflops = 6.70719 (norm. = 0.0578342), norm. avg. (of 1) = 0.0578342 fft 6: mflops = 8.48527 (norm. = 0.0731661), norm. avg. (of 1) = 0.0731661 fft 7: mflops = 9.07739 (norm. = 0.0782717), norm. avg. (of 1) = 0.0782717 fft 8: mflops = 45.391 (norm. = 0.391394), norm. avg. (of 1) = 0.391394 fft 9: mflops = 3.46545 (norm. = 0.0298816), norm. avg. (of 1) = 0.0298816 fft 10: mflops = 3.42099 (norm. = 0.0294982), norm. avg. (of 1) = 0.0294982 fft 11: mflops = -1 (norm. = -0.00862272), norm. avg. (of 0) = -1 fft 12: mflops = 15.3107 (norm. = 0.13202), norm. avg. (of 1) = 0.13202 fft 13: mflops = 11.8935 (norm. = 0.102554), norm. avg. (of 1) = 0.102554 fft 14: mflops = 43.4972 (norm. = 0.375065), norm. avg. (of 1) = 0.375065 fft 15: mflops = 37.8569 (norm. = 0.32643), norm. avg. (of 1) = 0.32643 fft 16: mflops = 115.973 (norm. = 1), norm. avg. (of 1) = 1 fft 17: mflops = -1 (norm. = -0.00862272), norm. avg. (of 0) = -1 fft 18: mflops = 9.44082 (norm. = 0.0814056), norm. avg. (of 1) = 0.0814056 fft 19: mflops = 5.18249 (norm. = 0.0446871), norm. avg. (of 1) = 0.0446871 fft 20: mflops = 5.37191 (norm. = 0.0463205), norm. avg. (of 1) = 0.0463205 fft 21: mflops = 43.5196 (norm. 
= 0.375257), norm. avg. (of 1) = 0.375257 fft 22: mflops = -1 (norm. = -0.00862272), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.00862272), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.00862272), norm. avg. (of 0) = -1 fft 25: mflops = -1 (norm. = -0.00862272), norm. avg. (of 0) = -1 fft 26: mflops = 2.69526 (norm. = 0.0232405), norm. avg. (of 1) = 0.0232405 fft 27: mflops = 27.1826 (norm. = 0.234388), norm. avg. (of 1) = 0.234388 fft 28: mflops = 32.1264 (norm. = 0.277017), norm. avg. (of 1) = 0.277017 fft 29: mflops = -1 (norm. = -0.00862272), norm. avg. (of 0) = -1 fft 30: mflops = -1 (norm. = -0.00862272), norm. avg. (of 0) = -1 fft 31: mflops = 4.20814 (norm. = 0.0362856), norm. avg. (of 1) = 0.0362856 fft 32: mflops = 4.26375 (norm. = 0.0367651), norm. avg. (of 1) = 0.0367651 fft 33: mflops = 21.1505 (norm. = 0.182375), norm. avg. (of 1) = 0.182375 fft 34: mflops = 54.6837 (norm. = 0.471522), norm. avg. (of 1) = 0.471522 fft 35: mflops = 4.09651 (norm. = 0.0353231), norm. avg. (of 1) = 0.0353231 fft 36: mflops = 3.52331 (norm. = 0.0303805), norm. avg. (of 1) = 0.0303805 fft 37: mflops = 6.80593 (norm. = 0.0586856), norm. avg. (of 1) = 0.0586856 fft 38: mflops = 10.3919 (norm. = 0.0896067), norm. avg. (of 1) = 0.0896067 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.44226 s, 2097152 iters, t-(init.)=1.17304 s t(norm)=0.0699187, mflops=71.5117 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.53827 s, 2097152 iters, t-(init.)=1.23884 s t(norm)=0.0738405, mflops=67.7135 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.8438 s, 1048576 iters, t-(init.)=1.6955 s t(norm)=0.20212, mflops=24.7378 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.40412 s, 262144 iters, t-(init.)=1.37215 s t(norm)=0.654293, mflops=7.64183 (err=1.3e-16) 4. Bailey: elapsed time t=1.79617 s, 524288 iters, t-(init.)=1.71294 s t(norm)=0.408397, mflops=12.243 (err=1.3e-16) 5. 
Beauregard: elapsed time t=1.17589 s, 262144 iters, t-(init.)=1.13982 s t(norm)=0.543508, mflops=9.1995 (err=6.5e-17) 6. Bergland: elapsed time t=1.70016 s, 1048576 iters, t-(init.)=1.54028 s t(norm)=0.183616, mflops=27.2308 (err=5.3e-17) 7. Brenner: elapsed time t=1.0525 s, 524288 iters, t-(init.)=0.979211 s t(norm)=0.233462, mflops=21.4167 (err=5.3e-17) 8. Burrus: elapsed time t=1.90955 s, 1048576 iters, t-(init.)=1.76091 s t(norm)=0.209917, mflops=23.8189 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.64722 s, 524288 iters, t-(init.)=1.57974 s t(norm)=0.376639, mflops=13.2753 10. CWP (best N) (N=15): elapsed time t=1.2742 s, 262144 iters, t-(init.)=1.20788 s t(norm)=0.575964, mflops=8.6811 11. Edelblute: elapsed time t=1.07193 s, 524288 iters, t-(init.)=0.996903 s t(norm)=0.23768, mflops=21.0367 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.93154 s, 2097152 iters, t-(init.)=1.61057 s t(norm)=0.0959975, mflops=52.0847 (err=5.3e-17) 13. FFTPACK (f2c): elapsed time t=1.33795 s, 1048576 iters, t-(init.)=1.20138 s t(norm)=0.143216, mflops=34.9123 (err=5.3e-17) FFTW_MEASURE plan: (cost = 3.140234e-07) FFTW_NOTW 4 14. FFTW: elapsed time t=1.89806 s, 4194304 iters, t-(init.)=1.22171 s t(norm)=0.0364097, mflops=137.326 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 15. FFTW_ESTIMATE: elapsed time t=1.7383 s, 4194304 iters, t-(init.)=1.18679 s t(norm)=0.035369, mflops=141.367 (err=5.3e-17) 16. Frigo-old: elapsed time t=1.36323 s, 4194304 iters, t-(init.)=0.738573 s t(norm)=0.0220112, mflops=227.157 (err=5.3e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.61788 s, 1048576 iters, t-(init.)=1.43214 s t(norm)=0.170724, mflops=29.287 (err=5.3e-17) 19. GSL DIT: elapsed time t=1.01895 s, 262144 iters, t-(init.)=0.98268 s t(norm)=0.468578, mflops=10.6706 (err=6.5e-17) 20. GSL DIF: elapsed time t=1.00436 s, 262144 iters, t-(init.)=0.960428 s t(norm)=0.457968, mflops=10.9178 (err=6.5e-17) 21. 
Krukar: elapsed time t=1.06063 s, 2097152 iters, t-(init.)=0.715319 s t(norm)=0.0426363, mflops=117.271 (err=5.3e-17) 22. Mayer (Buneman): elapsed time t=1.20953 s, 1048576 iters, t-(init.)=1.05445 s t(norm)=0.125701, mflops=39.777 (err=1.3e-16) 23. Mayer (simple): elapsed time t=1.27322 s, 1048576 iters, t-(init.)=1.10905 s t(norm)=0.132209, mflops=37.819 24. Mayer (lookup): elapsed time t=1.3565 s, 1048576 iters, t-(init.)=1.20469 s t(norm)=0.14361, mflops=34.8165 (err=1.3e-16) 25. Monro: elapsed time t=1.22549 s, 262144 iters, t-(init.)=1.19247 s t(norm)=0.568614, mflops=8.79331 (err=1.3e-16) 26. NAPACK (f2c): elapsed time t=1.86083 s, 262144 iters, t-(init.)=1.83064 s t(norm)=0.872918, mflops=5.72792 (err=1.6e-16) 27. Ooura (C): elapsed time t=1.45107 s, 2097152 iters, t-(init.)=1.15227 s t(norm)=0.0686809, mflops=72.8005 (err=5.3e-17) 28. Ooura (F): elapsed time t=1.38678 s, 2097152 iters, t-(init.)=1.1158 s t(norm)=0.0665068, mflops=75.1802 (err=5.3e-17) 29. Ransom: elapsed time t=1.21 s, 131072 iters, t-(init.)=1.1925 s t(norm)=1.13726, mflops=4.39655 (err=1.6e-16) 30. SCIPORT: elapsed time t=1.87529 s, 2097152 iters, t-(init.)=1.53982 s t(norm)=0.0917803, mflops=54.4779 (err=6.5e-17) 31. Singleton: elapsed time t=1.27029 s, 524288 iters, t-(init.)=1.19009 s t(norm)=0.283739, mflops=17.6219 (err=5.3e-17) 32. Singleton (f2c): elapsed time t=1.23237 s, 524288 iters, t-(init.)=1.15406 s t(norm)=0.27515, mflops=18.1719 (err=5.3e-17) 33. Sorensen: elapsed time t=1.35537 s, 1048576 iters, t-(init.)=1.21526 s t(norm)=0.14487, mflops=34.5136 (err=1.3e-16) 34. Sorensen DIT: elapsed time t=1.91498 s, 1048576 iters, t-(init.)=1.74504 s t(norm)=0.208025, mflops=24.0356 (err=1.3e-16) 35. Temperton: elapsed time t=1.65272 s, 524288 iters, t-(init.)=1.58198 s t(norm)=0.377174, mflops=13.2565 (err=5.3e-17) 36. Temperton (f2c): elapsed time t=1.82292 s, 524288 iters, t-(init.)=1.74241 s t(norm)=0.415424, mflops=12.0359 (err=5.3e-17) 37. 
Valkenburg: elapsed time t=1.56989 s, 262144 iters, t-(init.)=1.52632 s t(norm)=0.727806, mflops=6.86996 (err=1.6e-16) 38. SUNPERF: elapsed time t=1.24152 s, 1048576 iters, t-(init.)=1.09356 s t(norm)=0.130362, mflops=38.3547 (err=5.3e-17) Top mflops for N=4 = 227.157 Normalized results and averages for N=4: fft 0: mflops = 71.5117 (norm. = 0.314811), norm. avg. (of 2) = 0.413777 fft 1: mflops = 67.7135 (norm. = 0.298091), norm. avg. (of 2) = 0.377577 fft 2: mflops = 24.7378 (norm. = 0.108902), norm. avg. (of 2) = 0.133519 fft 3: mflops = 7.64183 (norm. = 0.0336412), norm. avg. (of 2) = 0.0233067 fft 4: mflops = 12.243 (norm. = 0.0538965), norm. avg. (of 2) = 0.0568338 fft 5: mflops = 9.1995 (norm. = 0.0404984), norm. avg. (of 2) = 0.0491663 fft 6: mflops = 27.2308 (norm. = 0.119876), norm. avg. (of 2) = 0.0965212 fft 7: mflops = 21.4167 (norm. = 0.0942816), norm. avg. (of 2) = 0.0862767 fft 8: mflops = 23.8189 (norm. = 0.104857), norm. avg. (of 2) = 0.248125 fft 9: mflops = 13.2753 (norm. = 0.058441), norm. avg. (of 2) = 0.0441613 fft 10: mflops = 8.6811 (norm. = 0.0382163), norm. avg. (of 2) = 0.0338572 fft 11: mflops = 21.0367 (norm. = 0.0926085), norm. avg. (of 1) = 0.0926085 fft 12: mflops = 52.0847 (norm. = 0.229289), norm. avg. (of 2) = 0.180654 fft 13: mflops = 34.9123 (norm. = 0.153692), norm. avg. (of 2) = 0.128123 fft 14: mflops = 137.326 (norm. = 0.604543), norm. avg. (of 2) = 0.489804 fft 15: mflops = 141.367 (norm. = 0.62233), norm. avg. (of 2) = 0.47438 fft 16: mflops = 227.157 (norm. = 1), norm. avg. (of 2) = 1 fft 17: mflops = -1 (norm. = -0.00440224), norm. avg. (of 0) = -1 fft 18: mflops = 29.287 (norm. = 0.128928), norm. avg. (of 2) = 0.105167 fft 19: mflops = 10.6706 (norm. = 0.0469744), norm. avg. (of 2) = 0.0458308 fft 20: mflops = 10.9178 (norm. = 0.0480628), norm. avg. (of 2) = 0.0471916 fft 21: mflops = 117.271 (norm. = 0.516254), norm. avg. (of 2) = 0.445756 fft 22: mflops = 39.777 (norm. = 0.175108), norm. avg. 
(of 1) = 0.175108 fft 23: mflops = 37.819 (norm. = 0.166488), norm. avg. (of 1) = 0.166488 fft 24: mflops = 34.8165 (norm. = 0.153271), norm. avg. (of 1) = 0.153271 fft 25: mflops = 8.79331 (norm. = 0.0387102), norm. avg. (of 1) = 0.0387102 fft 26: mflops = 5.72792 (norm. = 0.0252157), norm. avg. (of 2) = 0.0242281 fft 27: mflops = 72.8005 (norm. = 0.320485), norm. avg. (of 2) = 0.277436 fft 28: mflops = 75.1802 (norm. = 0.330961), norm. avg. (of 2) = 0.303989 fft 29: mflops = 4.39655 (norm. = 0.0193547), norm. avg. (of 1) = 0.0193547 fft 30: mflops = 54.4779 (norm. = 0.239825), norm. avg. (of 1) = 0.239825 fft 31: mflops = 17.6219 (norm. = 0.0775756), norm. avg. (of 2) = 0.0569306 fft 32: mflops = 18.1719 (norm. = 0.0799972), norm. avg. (of 2) = 0.0583812 fft 33: mflops = 34.5136 (norm. = 0.151937), norm. avg. (of 2) = 0.167156 fft 34: mflops = 24.0356 (norm. = 0.10581), norm. avg. (of 2) = 0.288666 fft 35: mflops = 13.2565 (norm. = 0.0583581), norm. avg. (of 2) = 0.0468406 fft 36: mflops = 12.0359 (norm. = 0.0529849), norm. avg. (of 2) = 0.0416827 fft 37: mflops = 6.86996 (norm. = 0.0302432), norm. avg. (of 2) = 0.0444644 fft 38: mflops = 38.3547 (norm. = 0.168847), norm. avg. (of 2) = 0.129227 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.92645 s, 2097152 iters, t-(init.)=1.56645 s t(norm)=0.0311225, mflops=160.655 (err=1.2e-16) 1. Arndt DIT: elapsed time t=1.04422 s, 1048576 iters, t-(init.)=0.846355 s t(norm)=0.0336311, mflops=148.672 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.05726 s, 262144 iters, t-(init.)=1.00989 s t(norm)=0.160518, mflops=31.1492 (err=1.1e-16) 3. Arndt 4-step: elapsed time t=1.86422 s, 131072 iters, t-(init.)=1.84364 s t(norm)=0.586079, mflops=8.53128 (err=1.3e-16) 4. Bailey: elapsed time t=1.46512 s, 262144 iters, t-(init.)=1.40972 s t(norm)=0.224068, mflops=22.3146 (err=9.8e-17) 5. 
Beauregard: elapsed time t=1.85218 s, 131072 iters, t-(init.)=1.83302 s t(norm)=0.5827, mflops=8.58074 (err=1.2e-16) 6. Bergland: elapsed time t=1.92667 s, 524288 iters, t-(init.)=1.83165 s t(norm)=0.145567, mflops=34.3485 (err=1.3e-16) 7. Brenner: elapsed time t=1.17758 s, 262144 iters, t-(init.)=1.12395 s t(norm)=0.178647, mflops=27.9881 (err=1.2e-16) 8. Burrus: elapsed time t=1.62391 s, 262144 iters, t-(init.)=1.57468 s t(norm)=0.250289, mflops=19.9769 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.94589 s, 524288 iters, t-(init.)=1.8479 s t(norm)=0.146858, mflops=34.0465 10. CWP (best N) (N=15): elapsed time t=1.28444 s, 262144 iters, t-(init.)=1.21463 s t(norm)=0.19306, mflops=25.8987 11. Edelblute: elapsed time t=1.80815 s, 262144 iters, t-(init.)=1.75179 s t(norm)=0.278439, mflops=17.9573 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.89448 s, 1048576 iters, t-(init.)=1.65401 s t(norm)=0.0657246, mflops=76.075 (err=1.2e-16) 13. FFTPACK (f2c): elapsed time t=1.27488 s, 524288 iters, t-(init.)=1.17006 s t(norm)=0.0929881, mflops=53.7703 (err=1.2e-16) FFTW_MEASURE plan: (cost = 5.345234e-07) FFTW_NOTW 8 14. FFTW: elapsed time t=1.29462 s, 2097152 iters, t-(init.)=0.865985 s t(norm)=0.0172056, mflops=290.603 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.188 s, 2097152 iters, t-(init.)=0.819456 s t(norm)=0.0162811, mflops=307.104 (err=1.2e-16) 16. Frigo-old: elapsed time t=1.13359 s, 2097152 iters, t-(init.)=0.705183 s t(norm)=0.0140107, mflops=356.87 (err=1.4e-16) 17. Green: elapsed time t=1.43678 s, 1048576 iters, t-(init.)=1.24261 s t(norm)=0.0493769, mflops=101.262 (err=1.4e-16) 18. GSL: elapsed time t=1.54788 s, 524288 iters, t-(init.)=1.45885 s t(norm)=0.115939, mflops=43.1262 (err=1.4e-16) 19. GSL DIT: elapsed time t=1.70114 s, 262144 iters, t-(init.)=1.66224 s t(norm)=0.264205, mflops=18.9247 (err=1.2e-16) 20. 
GSL DIF: elapsed time t=1.67553 s, 262144 iters, t-(init.)=1.63266 s t(norm)=0.259504, mflops=19.2675 (err=1.4e-16) 21. Krukar: elapsed time t=1.90763 s, 2097152 iters, t-(init.)=1.48244 s t(norm)=0.0294534, mflops=169.76 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.05516 s, 524288 iters, t-(init.)=0.969607 s t(norm)=0.0770574, mflops=64.8867 (err=1.2e-16) 23. Mayer (simple): elapsed time t=1.04515 s, 524288 iters, t-(init.)=0.946879 s t(norm)=0.0752512, mflops=66.4441 24. Mayer (lookup): elapsed time t=1.09097 s, 524288 iters, t-(init.)=1.01071 s t(norm)=0.0803239, mflops=62.2479 (err=1.2e-16) 25. Monro: elapsed time t=1.50832 s, 262144 iters, t-(init.)=1.44665 s t(norm)=0.229939, mflops=21.7449 (err=1.1e-08) 26. NAPACK (f2c): elapsed time t=1.43086 s, 131072 iters, t-(init.)=1.40755 s t(norm)=0.44745, mflops=11.1744 (err=1.7e-16) 27. Ooura (C): elapsed time t=1.42404 s, 1048576 iters, t-(init.)=1.21574 s t(norm)=0.0483092, mflops=103.5 (err=1.3e-16) 28. Ooura (F): elapsed time t=1.38613 s, 1048576 iters, t-(init.)=1.21803 s t(norm)=0.0484004, mflops=103.305 (err=1.3e-16) 29. Ransom: elapsed time t=1.61042 s, 65536 iters, t-(init.)=1.59625 s t(norm)=1.01487, mflops=4.92673 (err=3.4e-16) 30. SCIPORT: elapsed time t=1.83645 s, 1048576 iters, t-(init.)=1.63944 s t(norm)=0.0651455, mflops=76.7513 (err=1.4e-16) 31. Singleton: elapsed time t=1.81213 s, 262144 iters, t-(init.)=1.74945 s t(norm)=0.278068, mflops=17.9812 (err=1.4e-16) 32. Singleton (f2c): elapsed time t=1.78165 s, 262144 iters, t-(init.)=1.74131 s t(norm)=0.276774, mflops=18.0653 (err=1.4e-16) 33. Sorensen: elapsed time t=1.10127 s, 524288 iters, t-(init.)=0.985478 s t(norm)=0.0783188, mflops=63.8416 (err=1.5e-16) 34. Sorensen DIT: elapsed time t=1.63408 s, 262144 iters, t-(init.)=1.58227 s t(norm)=0.251495, mflops=19.8811 (err=1.1e-16) 35. Temperton: elapsed time t=1.21139 s, 262144 iters, t-(init.)=1.16158 s t(norm)=0.184628, mflops=27.0815 (err=4.6e-09) 36. 
Temperton (f2c): elapsed time t=1.40579 s, 262144 iters, t-(init.)=1.36359 s t(norm)=0.216736, mflops=23.0695 (err=1.4e-16) 37. Valkenburg: elapsed time t=1.14311 s, 65536 iters, t-(init.)=1.13044 s t(norm)=0.718717, mflops=6.95685 (err=1.5e-16) 38. SUNPERF: elapsed time t=1.13294 s, 524288 iters, t-(init.)=1.05734 s t(norm)=0.0840301, mflops=59.5025 (err=1.2e-16) Top mflops for N=8 = 356.87 Normalized results and averages for N=8: fft 0: mflops = 160.655 (norm. = 0.450179), norm. avg. (of 3) = 0.425911 fft 1: mflops = 148.672 (norm. = 0.4166), norm. avg. (of 3) = 0.390585 fft 2: mflops = 31.1492 (norm. = 0.0872845), norm. avg. (of 3) = 0.118107 fft 3: mflops = 8.53128 (norm. = 0.0239059), norm. avg. (of 3) = 0.0235064 fft 4: mflops = 22.3146 (norm. = 0.0625288), norm. avg. (of 3) = 0.0587322 fft 5: mflops = 8.58074 (norm. = 0.0240445), norm. avg. (of 3) = 0.0407924 fft 6: mflops = 34.3485 (norm. = 0.0962494), norm. avg. (of 3) = 0.0964306 fft 7: mflops = 27.9881 (norm. = 0.0784267), norm. avg. (of 3) = 0.08366 fft 8: mflops = 19.9769 (norm. = 0.0559782), norm. avg. (of 3) = 0.184076 fft 9: mflops = 34.0465 (norm. = 0.0954033), norm. avg. (of 3) = 0.061242 fft 10: mflops = 25.8987 (norm. = 0.0725719), norm. avg. (of 3) = 0.0467621 fft 11: mflops = 17.9573 (norm. = 0.0503188), norm. avg. (of 2) = 0.0714637 fft 12: mflops = 76.075 (norm. = 0.213173), norm. avg. (of 3) = 0.191494 fft 13: mflops = 53.7703 (norm. = 0.150672), norm. avg. (of 3) = 0.135639 fft 14: mflops = 290.603 (norm. = 0.814313), norm. avg. (of 3) = 0.597973 fft 15: mflops = 307.104 (norm. = 0.860549), norm. avg. (of 3) = 0.603103 fft 16: mflops = 356.87 (norm. = 1), norm. avg. (of 3) = 1 fft 17: mflops = 101.262 (norm. = 0.28375), norm. avg. (of 1) = 0.28375 fft 18: mflops = 43.1262 (norm. = 0.120846), norm. avg. (of 3) = 0.110393 fft 19: mflops = 18.9247 (norm. = 0.0530296), norm. avg. (of 3) = 0.0482304 fft 20: mflops = 19.2675 (norm. = 0.0539904), norm. avg. 
(of 3) = 0.0494579 fft 21: mflops = 169.76 (norm. = 0.475691), norm. avg. (of 3) = 0.455734 fft 22: mflops = 64.8867 (norm. = 0.181822), norm. avg. (of 2) = 0.178465 fft 23: mflops = 66.4441 (norm. = 0.186186), norm. avg. (of 2) = 0.176337 fft 24: mflops = 62.2479 (norm. = 0.174428), norm. avg. (of 2) = 0.163849 fft 25: mflops = 21.7449 (norm. = 0.0609323), norm. avg. (of 2) = 0.0498213 fft 26: mflops = 11.1744 (norm. = 0.0313124), norm. avg. (of 3) = 0.0265895 fft 27: mflops = 103.5 (norm. = 0.290022), norm. avg. (of 3) = 0.281632 fft 28: mflops = 103.305 (norm. = 0.289476), norm. avg. (of 3) = 0.299151 fft 29: mflops = 4.92673 (norm. = 0.0138054), norm. avg. (of 2) = 0.01658 fft 30: mflops = 76.7513 (norm. = 0.215068), norm. avg. (of 2) = 0.227446 fft 31: mflops = 17.9812 (norm. = 0.0503859), norm. avg. (of 3) = 0.054749 fft 32: mflops = 18.0653 (norm. = 0.0506215), norm. avg. (of 3) = 0.0557946 fft 33: mflops = 63.8416 (norm. = 0.178893), norm. avg. (of 3) = 0.171069 fft 34: mflops = 19.8811 (norm. = 0.0557098), norm. avg. (of 3) = 0.211014 fft 35: mflops = 27.0815 (norm. = 0.0758863), norm. avg. (of 3) = 0.0565225 fft 36: mflops = 23.0695 (norm. = 0.0646441), norm. avg. (of 3) = 0.0493365 fft 37: mflops = 6.95685 (norm. = 0.0194941), norm. avg. (of 3) = 0.036141 fft 38: mflops = 59.5025 (norm. = 0.166734), norm. avg. (of 3) = 0.141729 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.47197 s, 262144 iters, t-(init.)=1.39975 s t(norm)=0.0834319, mflops=59.9291 (err=1.5e-16) 1. Arndt DIT: elapsed time t=1.42324 s, 262144 iters, t-(init.)=1.36462 s t(norm)=0.0813375, mflops=61.4723 (err=2.2e-16) 2. Arndt Split-Radix: elapsed time t=1.22656 s, 131072 iters, t-(init.)=1.19768 s t(norm)=0.142775, mflops=35.0201 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.24023 s, 65536 iters, t-(init.)=1.22504 s t(norm)=0.292073, mflops=17.119 (err=2.0e-16) 4. 
Bailey: elapsed time t=1.27824 s, 131072 iters, t-(init.)=1.23946 s t(norm)=0.147756, mflops=33.8396 (err=2.0e-16) 5. Beauregard: elapsed time t=1.16803 s, 32768 iters, t-(init.)=1.15895 s t(norm)=0.55263, mflops=9.04765 (err=2.7e-16) 6. Bergland: elapsed time t=1.72314 s, 262144 iters, t-(init.)=1.65094 s t(norm)=0.0984039, mflops=50.811 (err=2.6e-16) 7. Brenner: elapsed time t=1.03648 s, 131072 iters, t-(init.)=1.00314 s t(norm)=0.119583, mflops=41.8119 (err=2.1e-16) 8. Burrus: elapsed time t=1.12523 s, 65536 iters, t-(init.)=1.10547 s t(norm)=0.263564, mflops=18.9707 (err=1.4e-16) 9. CWP (min N): elapsed time t=1.35826 s, 262144 iters, t-(init.)=1.28805 s t(norm)=0.0767739, mflops=65.1263 10. CWP (best N) (N=28): elapsed time t=1.72631 s, 262144 iters, t-(init.)=1.62213 s t(norm)=0.0966868, mflops=51.7134 11. Edelblute: elapsed time t=1.21507 s, 65536 iters, t-(init.)=1.1979 s t(norm)=0.285601, mflops=17.5069 (err=1.4e-16) 12. FFTPACK: elapsed time t=1.45713 s, 524288 iters, t-(init.)=1.31734 s t(norm)=0.0392598, mflops=127.357 (err=1.8e-16) 13. FFTPACK (f2c): elapsed time t=1.03694 s, 262144 iters, t-(init.)=0.958566 s t(norm)=0.057135, mflops=87.512 (err=1.8e-16) FFTW_MEASURE plan: (cost = 9.423828e-07) FFTW_NOTW 16 14. FFTW: elapsed time t=1.06741 s, 1048576 iters, t-(init.)=0.769301 s t(norm)=0.0114635, mflops=436.168 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.05914 s, 1048576 iters, t-(init.)=0.802224 s t(norm)=0.0119541, mflops=418.268 (err=1.8e-16) 16. Frigo-old: elapsed time t=1.01437 s, 1048576 iters, t-(init.)=0.727631 s t(norm)=0.0108426, mflops=461.146 (err=1.8e-16) 17. Green: elapsed time t=1.54256 s, 524288 iters, t-(init.)=1.38858 s t(norm)=0.0413829, mflops=120.823 (err=1.9e-16) 18. GSL: elapsed time t=1.28694 s, 262144 iters, t-(init.)=1.21674 s t(norm)=0.0725232, mflops=68.9434 (err=1.8e-16) 19. 
GSL DIT: elapsed time t=1.39953 s, 131072 iters, t-(init.)=1.36954 s t(norm)=0.163262, mflops=30.6256 (err=2.1e-16) 20. GSL DIF: elapsed time t=1.35099 s, 131072 iters, t-(init.)=1.31542 s t(norm)=0.15681, mflops=31.8857 (err=2.8e-16) 21. Krukar: elapsed time t=1.04604 s, 524288 iters, t-(init.)=0.896428 s t(norm)=0.0267156, mflops=187.156 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.45141 s, 262144 iters, t-(init.)=1.38017 s t(norm)=0.0822645, mflops=60.7795 (err=1.7e-16) 23. Mayer (simple): elapsed time t=1.24125 s, 262144 iters, t-(init.)=1.17017 s t(norm)=0.0697478, mflops=71.6869 24. Mayer (lookup): elapsed time t=1.2675 s, 262144 iters, t-(init.)=1.20491 s t(norm)=0.0718184, mflops=69.6201 (err=1.8e-16) 25. Monro: elapsed time t=1.18996 s, 131072 iters, t-(init.)=1.15521 s t(norm)=0.137712, mflops=36.3077 (err=2.1e-08) 26. NAPACK (f2c): elapsed time t=1.1511 s, 65536 iters, t-(init.)=1.13665 s t(norm)=0.270999, mflops=18.4503 (err=3.3e-16) 27. Ooura (C): elapsed time t=1.48155 s, 524288 iters, t-(init.)=1.34044 s t(norm)=0.0399481, mflops=125.162 (err=2.0e-16) 28. Ooura (F): elapsed time t=1.48314 s, 524288 iters, t-(init.)=1.3672 s t(norm)=0.0407457, mflops=122.712 (err=2.0e-16) 29. Ransom: elapsed time t=1.14717 s, 65536 iters, t-(init.)=1.12726 s t(norm)=0.268759, mflops=18.604 (err=3.4e-16) 30. SCIPORT: elapsed time t=1.83871 s, 524288 iters, t-(init.)=1.69475 s t(norm)=0.0505075, mflops=98.9953 (err=2.8e-16) 31. Singleton: elapsed time t=1.58696 s, 262144 iters, t-(init.)=1.52004 s t(norm)=0.0906013, mflops=55.1868 (err=1.7e-16) 32. Singleton (f2c): elapsed time t=1.56687 s, 262144 iters, t-(init.)=1.49699 s t(norm)=0.0892277, mflops=56.0364 (err=1.7e-16) 33. Sorensen: elapsed time t=1.06217 s, 262144 iters, t-(init.)=0.993052 s t(norm)=0.0591905, mflops=84.473 (err=1.5e-16) 34. Sorensen DIT: elapsed time t=1.13087 s, 65536 iters, t-(init.)=1.11267 s t(norm)=0.265281, mflops=18.8479 (err=1.6e-16) 35. 
Temperton: elapsed time t=1.98839 s, 262144 iters, t-(init.)=1.92218 s t(norm)=0.114571, mflops=43.6411 (err=1.7e-08) 36. Temperton (f2c): elapsed time t=1.06598 s, 131072 iters, t-(init.)=1.02316 s t(norm)=0.12197, mflops=40.9936 (err=1.8e-16) 37. Valkenburg: elapsed time t=1.47377 s, 32768 iters, t-(init.)=1.46448 s t(norm)=0.698317, mflops=7.16007 (err=2.9e-16) 38. SUNPERF: elapsed time t=1.54728 s, 524288 iters, t-(init.)=1.41279 s t(norm)=0.0421045, mflops=118.752 (err=1.8e-16) Top mflops for N=16 = 461.146 Normalized results and averages for N=16: fft 0: mflops = 59.9291 (norm. = 0.129957), norm. avg. (of 4) = 0.351923 fft 1: mflops = 61.4723 (norm. = 0.133303), norm. avg. (of 4) = 0.326264 fft 2: mflops = 35.0201 (norm. = 0.0759415), norm. avg. (of 4) = 0.107566 fft 3: mflops = 17.119 (norm. = 0.0371228), norm. avg. (of 4) = 0.0269105 fft 4: mflops = 33.8396 (norm. = 0.0733816), norm. avg. (of 4) = 0.0623945 fft 5: mflops = 9.04765 (norm. = 0.0196199), norm. avg. (of 4) = 0.0354993 fft 6: mflops = 50.811 (norm. = 0.110184), norm. avg. (of 4) = 0.099869 fft 7: mflops = 41.8119 (norm. = 0.0906695), norm. avg. (of 4) = 0.0854124 fft 8: mflops = 18.9707 (norm. = 0.0411382), norm. avg. (of 4) = 0.148342 fft 9: mflops = 65.1263 (norm. = 0.141227), norm. avg. (of 4) = 0.0812382 fft 10: mflops = 51.7134 (norm. = 0.112141), norm. avg. (of 4) = 0.0631069 fft 11: mflops = 17.5069 (norm. = 0.037964), norm. avg. (of 3) = 0.0602971 fft 12: mflops = 127.357 (norm. = 0.276174), norm. avg. (of 4) = 0.212664 fft 13: mflops = 87.512 (norm. = 0.189771), norm. avg. (of 4) = 0.149172 fft 14: mflops = 436.168 (norm. = 0.945834), norm. avg. (of 4) = 0.684939 fft 15: mflops = 418.268 (norm. = 0.907018), norm. avg. (of 4) = 0.679082 fft 16: mflops = 461.146 (norm. = 1), norm. avg. (of 4) = 1 fft 17: mflops = 120.823 (norm. = 0.262006), norm. avg. (of 2) = 0.272878 fft 18: mflops = 68.9434 (norm. = 0.149505), norm. avg. (of 4) = 0.120171 fft 19: mflops = 30.6256 (norm. 
= 0.0664119), norm. avg. (of 4) = 0.0527758 fft 20: mflops = 31.8857 (norm. = 0.0691444), norm. avg. (of 4) = 0.0543795 fft 21: mflops = 187.156 (norm. = 0.40585), norm. avg. (of 4) = 0.443263 fft 22: mflops = 60.7795 (norm. = 0.131801), norm. avg. (of 3) = 0.16291 fft 23: mflops = 71.6869 (norm. = 0.155454), norm. avg. (of 3) = 0.169376 fft 24: mflops = 69.6201 (norm. = 0.150972), norm. avg. (of 3) = 0.159557 fft 25: mflops = 36.3077 (norm. = 0.0787336), norm. avg. (of 3) = 0.0594587 fft 26: mflops = 18.4503 (norm. = 0.0400096), norm. avg. (of 4) = 0.0299445 fft 27: mflops = 125.162 (norm. = 0.271416), norm. avg. (of 4) = 0.279078 fft 28: mflops = 122.712 (norm. = 0.266103), norm. avg. (of 4) = 0.290889 fft 29: mflops = 18.604 (norm. = 0.040343), norm. avg. (of 3) = 0.024501 fft 30: mflops = 98.9953 (norm. = 0.214672), norm. avg. (of 3) = 0.223188 fft 31: mflops = 55.1868 (norm. = 0.119673), norm. avg. (of 4) = 0.0709801 fft 32: mflops = 56.0364 (norm. = 0.121516), norm. avg. (of 4) = 0.0722249 fft 33: mflops = 84.473 (norm. = 0.183181), norm. avg. (of 4) = 0.174097 fft 34: mflops = 18.8479 (norm. = 0.040872), norm. avg. (of 4) = 0.168479 fft 35: mflops = 43.6411 (norm. = 0.0946362), norm. avg. (of 4) = 0.0660509 fft 36: mflops = 40.9936 (norm. = 0.0888951), norm. avg. (of 4) = 0.0592262 fft 37: mflops = 7.16007 (norm. = 0.0155267), norm. avg. (of 4) = 0.0309874 fft 38: mflops = 118.752 (norm. = 0.257515), norm. avg. (of 4) = 0.170676 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.50529 s, 131072 iters, t-(init.)=1.45059 s t(norm)=0.0691693, mflops=72.2864 (err=3.1e-16) 1. Arndt DIT: elapsed time t=1.44622 s, 131072 iters, t-(init.)=1.39169 s t(norm)=0.066361, mflops=75.3454 (err=2.5e-16) 2. Arndt Split-Radix: elapsed time t=1.32787 s, 65536 iters, t-(init.)=1.30053 s t(norm)=0.124028, mflops=40.3135 (err=2.7e-16) 3. 
Arndt 4-step: elapsed time t=1.42571 s, 32768 iters, t-(init.)=1.41132 s t(norm)=0.269189, mflops=18.5743 (err=2.8e-16) 4. Bailey: elapsed time t=1.93374 s, 131072 iters, t-(init.)=1.87774 s t(norm)=0.0895375, mflops=55.8425 (err=2.7e-16) 5. Beauregard: elapsed time t=1.48146 s, 16384 iters, t-(init.)=1.47404 s t(norm)=0.562302, mflops=8.89202 (err=1.8e-16) 6. Bergland: elapsed time t=1.5139 s, 131072 iters, t-(init.)=1.45265 s t(norm)=0.0692679, mflops=72.1835 (err=2.6e-16) 7. Brenner: elapsed time t=1.99242 s, 131072 iters, t-(init.)=1.94088 s t(norm)=0.0925482, mflops=54.0259 (err=2.2e-16) 8. Burrus: elapsed time t=1.30265 s, 32768 iters, t-(init.)=1.28753 s t(norm)=0.245576, mflops=20.3603 (err=2.9e-16) 9. CWP (min N) (N=33): elapsed time t=1.30636 s, 131072 iters, t-(init.)=1.24664 s t(norm)=0.0594445, mflops=84.112 10. CWP (best N) (N=35): elapsed time t=1.0413 s, 131072 iters, t-(init.)=0.987338 s t(norm)=0.0470799, mflops=106.202 11. Edelblute: elapsed time t=1.40629 s, 32768 iters, t-(init.)=1.39261 s t(norm)=0.26562, mflops=18.8239 (err=2.9e-16) 12. FFTPACK: elapsed time t=1.53498 s, 262144 iters, t-(init.)=1.4207 s t(norm)=0.0338721, mflops=147.614 (err=1.9e-16) 13. FFTPACK (f2c): elapsed time t=1.17862 s, 131072 iters, t-(init.)=1.12136 s t(norm)=0.0534707, mflops=93.5092 (err=1.9e-16) FFTW_MEASURE plan: (cost = 2.404641e-06) FFTW_NOTW 32 14. FFTW: elapsed time t=1.44537 s, 524288 iters, t-(init.)=1.21887 s t(norm)=0.01453, mflops=344.115 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.40216 s, 524288 iters, t-(init.)=1.1874 s t(norm)=0.014155, mflops=353.233 (err=2.1e-16) 16. Frigo-old: elapsed time t=1.32519 s, 524288 iters, t-(init.)=1.104 s t(norm)=0.0131607, mflops=379.92 (err=2.2e-16) 17. Green: elapsed time t=1.45688 s, 262144 iters, t-(init.)=1.33642 s t(norm)=0.0318627, mflops=156.923 (err=2.0e-16) 18. 
GSL: elapsed time t=1.86264 s, 131072 iters, t-(init.)=1.80719 s t(norm)=0.0861735, mflops=58.0225 (err=2.0e-16) 19. GSL DIT: elapsed time t=1.241 s, 65536 iters, t-(init.)=1.21372 s t(norm)=0.115749, mflops=43.1968 (err=2.2e-16) 20. GSL DIF: elapsed time t=1.10332 s, 65536 iters, t-(init.)=1.07553 s t(norm)=0.102571, mflops=48.7468 (err=2.5e-16) 21. Krukar: elapsed time t=1.2443 s, 262144 iters, t-(init.)=1.13226 s t(norm)=0.0269951, mflops=185.219 (err=2.2e-16) 22. Mayer (Buneman): elapsed time t=1.46203 s, 131072 iters, t-(init.)=1.40252 s t(norm)=0.0668773, mflops=74.7637 (err=2.7e-16) 23. Mayer (simple): elapsed time t=1.23888 s, 131072 iters, t-(init.)=1.18311 s t(norm)=0.0564152, mflops=88.6286 24. Mayer (lookup): elapsed time t=1.19317 s, 131072 iters, t-(init.)=1.13588 s t(norm)=0.0541628, mflops=92.3143 (err=2.5e-16) 25. Monro: elapsed time t=1.75855 s, 131072 iters, t-(init.)=1.70072 s t(norm)=0.0810966, mflops=61.6549 (err=3.7e-08) 26. NAPACK (f2c): elapsed time t=1.0105 s, 32768 iters, t-(init.)=0.997205 s t(norm)=0.190202, mflops=26.2879 (err=5.4e-16) 27. Ooura (C): elapsed time t=1.54849 s, 262144 iters, t-(init.)=1.44088 s t(norm)=0.0343532, mflops=145.547 (err=2.7e-16) 28. Ooura (F): elapsed time t=1.59612 s, 262144 iters, t-(init.)=1.48768 s t(norm)=0.0354691, mflops=140.968 (err=2.7e-16) 29. Ransom: elapsed time t=1.54671 s, 32768 iters, t-(init.)=1.53391 s t(norm)=0.292571, mflops=17.0899 (err=7.0e-16) 30. SCIPORT: elapsed time t=1.91581 s, 262144 iters, t-(init.)=1.79852 s t(norm)=0.04288, mflops=116.604 (err=1.8e-16) 31. Singleton: elapsed time t=1.38501 s, 131072 iters, t-(init.)=1.33335 s t(norm)=0.063579, mflops=78.6424 (err=2.2e-16) 32. Singleton (f2c): elapsed time t=1.38379 s, 131072 iters, t-(init.)=1.33039 s t(norm)=0.0634381, mflops=78.817 (err=2.2e-16) 33. Sorensen: elapsed time t=1.76217 s, 262144 iters, t-(init.)=1.66014 s t(norm)=0.0395809, mflops=126.324 (err=2.7e-16) 34. 
Sorensen DIT: elapsed time t=1.30491 s, 32768 iters, t-(init.)=1.28962 s t(norm)=0.245976, mflops=20.3272 (err=2.6e-16) 35. Temperton: elapsed time t=1.0135 s, 65536 iters, t-(init.)=0.983749 s t(norm)=0.0938176, mflops=53.2949 (err=3.1e-08) 36. Temperton (f2c): elapsed time t=1.24652 s, 65536 iters, t-(init.)=1.21783 s t(norm)=0.116142, mflops=43.0509 (err=2.0e-16) 37. Valkenburg: elapsed time t=1.79602 s, 16384 iters, t-(init.)=1.78861 s t(norm)=0.682301, mflops=7.32814 (err=4.3e-16) 38. SUNPERF: elapsed time t=1.54983 s, 262144 iters, t-(init.)=1.45009 s t(norm)=0.0345729, mflops=144.622 (err=1.9e-16) Top mflops for N=32 = 379.92 Normalized results and averages for N=32: fft 0: mflops = 72.2864 (norm. = 0.190268), norm. avg. (of 5) = 0.319592 fft 1: mflops = 75.3454 (norm. = 0.198319), norm. avg. (of 5) = 0.300675 fft 2: mflops = 40.3135 (norm. = 0.106111), norm. avg. (of 5) = 0.107275 fft 3: mflops = 18.5743 (norm. = 0.0488902), norm. avg. (of 5) = 0.0313064 fft 4: mflops = 55.8425 (norm. = 0.146985), norm. avg. (of 5) = 0.0793126 fft 5: mflops = 8.89202 (norm. = 0.023405), norm. avg. (of 5) = 0.0330804 fft 6: mflops = 72.1835 (norm. = 0.189997), norm. avg. (of 5) = 0.117895 fft 7: mflops = 54.0259 (norm. = 0.142204), norm. avg. (of 5) = 0.0967706 fft 8: mflops = 20.3603 (norm. = 0.053591), norm. avg. (of 5) = 0.129392 fft 9: mflops = 84.112 (norm. = 0.221394), norm. avg. (of 5) = 0.109269 fft 10: mflops = 106.202 (norm. = 0.279539), norm. avg. (of 5) = 0.106393 fft 11: mflops = 18.8239 (norm. = 0.049547), norm. avg. (of 4) = 0.0576096 fft 12: mflops = 147.614 (norm. = 0.38854), norm. avg. (of 5) = 0.247839 fft 13: mflops = 93.5092 (norm. = 0.246129), norm. avg. (of 5) = 0.168564 fft 14: mflops = 344.115 (norm. = 0.905758), norm. avg. (of 5) = 0.729102 fft 15: mflops = 353.233 (norm. = 0.929758), norm. avg. (of 5) = 0.729217 fft 16: mflops = 379.92 (norm. = 1), norm. avg. (of 5) = 1 fft 17: mflops = 156.923 (norm. = 0.413043), norm. avg. 
(of 3) = 0.3196 fft 18: mflops = 58.0225 (norm. = 0.152723), norm. avg. (of 5) = 0.126681 fft 19: mflops = 43.1968 (norm. = 0.1137), norm. avg. (of 5) = 0.0649606 fft 20: mflops = 48.7468 (norm. = 0.128308), norm. avg. (of 5) = 0.0691652 fft 21: mflops = 185.219 (norm. = 0.48752), norm. avg. (of 5) = 0.452115 fft 22: mflops = 74.7637 (norm. = 0.196788), norm. avg. (of 4) = 0.17138 fft 23: mflops = 88.6286 (norm. = 0.233283), norm. avg. (of 4) = 0.185353 fft 24: mflops = 92.3143 (norm. = 0.242984), norm. avg. (of 4) = 0.180413 fft 25: mflops = 61.6549 (norm. = 0.162284), norm. avg. (of 4) = 0.085165 fft 26: mflops = 26.2879 (norm. = 0.0691933), norm. avg. (of 5) = 0.0377943 fft 27: mflops = 145.547 (norm. = 0.383099), norm. avg. (of 5) = 0.299882 fft 28: mflops = 140.968 (norm. = 0.371047), norm. avg. (of 5) = 0.306921 fft 29: mflops = 17.0899 (norm. = 0.0449829), norm. avg. (of 4) = 0.0296215 fft 30: mflops = 116.604 (norm. = 0.306918), norm. avg. (of 4) = 0.244121 fft 31: mflops = 78.6424 (norm. = 0.206997), norm. avg. (of 5) = 0.0981835 fft 32: mflops = 78.817 (norm. = 0.207457), norm. avg. (of 5) = 0.0992713 fft 33: mflops = 126.324 (norm. = 0.332501), norm. avg. (of 5) = 0.205777 fft 34: mflops = 20.3272 (norm. = 0.053504), norm. avg. (of 5) = 0.145484 fft 35: mflops = 53.2949 (norm. = 0.140279), norm. avg. (of 5) = 0.0808966 fft 36: mflops = 43.0509 (norm. = 0.113316), norm. avg. (of 5) = 0.0700441 fft 37: mflops = 7.32814 (norm. = 0.0192887), norm. avg. (of 5) = 0.0286477 fft 38: mflops = 144.622 (norm. = 0.380665), norm. avg. (of 5) = 0.212674 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.88915 s, 65536 iters, t-(init.)=1.84229 s t(norm)=0.0732059, mflops=68.3005 (err=5.7e-16) 1. Arndt DIT: elapsed time t=1.76533 s, 65536 iters, t-(init.)=1.71677 s t(norm)=0.0682182, mflops=73.2942 (err=5.6e-16) 2. Arndt Split-Radix: elapsed time t=1.42007 s, 32768 iters, t-(init.)=1.39529 s t(norm)=0.110888, mflops=45.0906 (err=5.7e-16) 3. 
Arndt 4-step: elapsed time t=1.20046 s, 16384 iters, t-(init.)=1.18768 s t(norm)=0.188777, mflops=26.4863 (err=5.6e-16) 4. Bailey: elapsed time t=1.62076 s, 65536 iters, t-(init.)=1.5729 s t(norm)=0.0625013, mflops=79.9983 (err=5.7e-16) 5. Beauregard: elapsed time t=1.81022 s, 8192 iters, t-(init.)=1.80386 s t(norm)=0.573433, mflops=8.71942 (err=5.9e-16) 6. Bergland: elapsed time t=1.58972 s, 65536 iters, t-(init.)=1.54004 s t(norm)=0.0611957, mflops=81.7051 (err=5.9e-16) 7. Brenner: elapsed time t=1.89191 s, 65536 iters, t-(init.)=1.84393 s t(norm)=0.073271, mflops=68.2398 (err=5.8e-16) 8. Burrus: elapsed time t=1.42967 s, 16384 iters, t-(init.)=1.41796 s t(norm)=0.225379, mflops=22.1849 (err=5.7e-16) 9. CWP (min N) (N=65): elapsed time t=1.22706 s, 65536 iters, t-(init.)=1.17507 s t(norm)=0.0466929, mflops=107.083 10. CWP (best N) (N=84): elapsed time t=1.69521 s, 131072 iters, t-(init.)=1.57359 s t(norm)=0.0312644, mflops=159.926 11. Edelblute: elapsed time t=1.52463 s, 16384 iters, t-(init.)=1.51265 s t(norm)=0.240429, mflops=20.7962 (err=5.7e-16) 12. FFTPACK: elapsed time t=1.52513 s, 131072 iters, t-(init.)=1.43152 s t(norm)=0.0284418, mflops=175.798 (err=5.6e-16) 13. FFTPACK (f2c): elapsed time t=1.17291 s, 65536 iters, t-(init.)=1.1235 s t(norm)=0.0446441, mflops=111.997 (err=5.6e-16) FFTW_MEASURE plan: (cost = 6.249000e-06) FFTW_TWIDDLE 4 FFTW_NOTW 16 14. FFTW: elapsed time t=1.51985 s, 262144 iters, t-(init.)=1.32624 s t(norm)=0.013175, mflops=379.506 (err=5.5e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.18613 s, 131072 iters, t-(init.)=1.08395 s t(norm)=0.0215362, mflops=232.167 (err=5.6e-16) 16. Frigo-old: elapsed time t=1.82609 s, 131072 iters, t-(init.)=1.72516 s t(norm)=0.0342758, mflops=145.875 (err=5.6e-16) 17. Green: elapsed time t=1.29268 s, 131072 iters, t-(init.)=1.19254 s t(norm)=0.0236936, mflops=211.028 (err=5.5e-16) 18. 
GSL: elapsed time t=1.71937 s, 65536 iters, t-(init.)=1.66973 s t(norm)=0.0663493, mflops=75.3588 (err=5.6e-16) 19. GSL DIT: elapsed time t=1.16972 s, 32768 iters, t-(init.)=1.14604 s t(norm)=0.0910789, mflops=54.8975 (err=5.6e-16) 20. GSL DIF: elapsed time t=1.05578 s, 32768 iters, t-(init.)=1.03032 s t(norm)=0.0818829, mflops=61.0628 (err=5.4e-16) 21. Krukar: elapsed time t=1.7393 s, 131072 iters, t-(init.)=1.6373 s t(norm)=0.0325302, mflops=153.703 (err=6.0e-16) 22. Mayer (Buneman): elapsed time t=1.70291 s, 65536 iters, t-(init.)=1.6559 s t(norm)=0.0657996, mflops=75.9883 (err=5.4e-16) 23. Mayer (simple): elapsed time t=1.37151 s, 65536 iters, t-(init.)=1.32402 s t(norm)=0.0526117, mflops=95.0358 24. Mayer (lookup): elapsed time t=1.32163 s, 65536 iters, t-(init.)=1.27466 s t(norm)=0.0506504, mflops=98.7159 (err=5.4e-16) 25. Monro: elapsed time t=1.59026 s, 65536 iters, t-(init.)=1.53985 s t(norm)=0.0611881, mflops=81.7152 (err=3.4e-08) 26. NAPACK (f2c): elapsed time t=1.89467 s, 32768 iters, t-(init.)=1.86904 s t(norm)=0.148538, mflops=33.6615 (err=1.1e-15) 27. Ooura (C): elapsed time t=1.66489 s, 131072 iters, t-(init.)=1.56929 s t(norm)=0.0311789, mflops=160.365 (err=5.9e-16) 28. Ooura (F): elapsed time t=1.72417 s, 131072 iters, t-(init.)=1.63065 s t(norm)=0.032398, mflops=154.33 (err=5.9e-16) 29. Ransom: elapsed time t=1.662 s, 32768 iters, t-(init.)=1.6374 s t(norm)=0.130129, mflops=38.4235 (err=8.6e-16) 30. SCIPORT: elapsed time t=1.03201 s, 65536 iters, t-(init.)=0.981441 s t(norm)=0.038999, mflops=128.209 (err=5.9e-16) 31. Singleton: elapsed time t=1.12562 s, 65536 iters, t-(init.)=1.07695 s t(norm)=0.042794, mflops=116.839 (err=9.2e-16) 32. Singleton (f2c): elapsed time t=1.08835 s, 65536 iters, t-(init.)=1.03841 s t(norm)=0.0412628, mflops=121.175 (err=9.2e-16) 33. Sorensen: elapsed time t=1.73925 s, 131072 iters, t-(init.)=1.64164 s t(norm)=0.0326164, mflops=153.297 (err=5.4e-16) 34. 
Sorensen DIT: elapsed time t=1.42191 s, 16384 iters, t-(init.)=1.40982 s t(norm)=0.224085, mflops=22.313 (err=5.5e-16) 35. Temperton: elapsed time t=1.57994 s, 65536 iters, t-(init.)=1.53201 s t(norm)=0.0608766, mflops=82.1334 (err=3.8e-08) 36. Temperton (f2c): elapsed time t=1.06988 s, 32768 iters, t-(init.)=1.04431 s t(norm)=0.0829945, mflops=60.2449 (err=5.6e-16) 37. Valkenburg: elapsed time t=1.07801 s, 4096 iters, t-(init.)=1.07503 s t(norm)=0.683486, mflops=7.31543 (err=8.1e-16) 38. SUNPERF: elapsed time t=1.35494 s, 131072 iters, t-(init.)=1.25491 s t(norm)=0.0249327, mflops=200.54 (err=5.6e-16) Top mflops for N=64 = 379.506 Normalized results and averages for N=64: fft 0: mflops = 68.3005 (norm. = 0.179972), norm. avg. (of 6) = 0.296322 fft 1: mflops = 73.2942 (norm. = 0.193131), norm. avg. (of 6) = 0.282751 fft 2: mflops = 45.0906 (norm. = 0.118814), norm. avg. (of 6) = 0.109198 fft 3: mflops = 26.4863 (norm. = 0.0697916), norm. avg. (of 6) = 0.0377206 fft 4: mflops = 79.9983 (norm. = 0.210796), norm. avg. (of 6) = 0.101227 fft 5: mflops = 8.71942 (norm. = 0.0229757), norm. avg. (of 6) = 0.0313963 fft 6: mflops = 81.7051 (norm. = 0.215293), norm. avg. (of 6) = 0.134128 fft 7: mflops = 68.2398 (norm. = 0.179812), norm. avg. (of 6) = 0.110611 fft 8: mflops = 22.1849 (norm. = 0.0584572), norm. avg. (of 6) = 0.117569 fft 9: mflops = 107.083 (norm. = 0.282163), norm. avg. (of 6) = 0.138085 fft 10: mflops = 159.926 (norm. = 0.421407), norm. avg. (of 6) = 0.158896 fft 11: mflops = 20.7962 (norm. = 0.0547981), norm. avg. (of 5) = 0.0570473 fft 12: mflops = 175.798 (norm. = 0.463229), norm. avg. (of 6) = 0.283738 fft 13: mflops = 111.997 (norm. = 0.295113), norm. avg. (of 6) = 0.189655 fft 14: mflops = 379.506 (norm. = 1), norm. avg. (of 6) = 0.774252 fft 15: mflops = 232.167 (norm. = 0.611762), norm. avg. (of 6) = 0.709641 fft 16: mflops = 145.875 (norm. = 0.384382), norm. avg. (of 6) = 0.897397 fft 17: mflops = 211.028 (norm. = 0.55606), norm. avg. 
(of 4) = 0.378715 fft 18: mflops = 75.3588 (norm. = 0.198571), norm. avg. (of 6) = 0.138663 fft 19: mflops = 54.8975 (norm. = 0.144655), norm. avg. (of 6) = 0.078243 fft 20: mflops = 61.0628 (norm. = 0.160901), norm. avg. (of 6) = 0.0844545 fft 21: mflops = 153.703 (norm. = 0.405009), norm. avg. (of 6) = 0.444264 fft 22: mflops = 75.9883 (norm. = 0.20023), norm. avg. (of 5) = 0.17715 fft 23: mflops = 95.0358 (norm. = 0.25042), norm. avg. (of 5) = 0.198366 fft 24: mflops = 98.7159 (norm. = 0.260117), norm. avg. (of 5) = 0.196354 fft 25: mflops = 81.7152 (norm. = 0.21532), norm. avg. (of 5) = 0.111196 fft 26: mflops = 33.6615 (norm. = 0.0886982), norm. avg. (of 6) = 0.0462783 fft 27: mflops = 160.365 (norm. = 0.422562), norm. avg. (of 6) = 0.320329 fft 28: mflops = 154.33 (norm. = 0.406662), norm. avg. (of 6) = 0.323544 fft 29: mflops = 38.4235 (norm. = 0.101246), norm. avg. (of 5) = 0.0439464 fft 30: mflops = 128.209 (norm. = 0.33783), norm. avg. (of 5) = 0.262863 fft 31: mflops = 116.839 (norm. = 0.307871), norm. avg. (of 6) = 0.133131 fft 32: mflops = 121.175 (norm. = 0.319296), norm. avg. (of 6) = 0.135942 fft 33: mflops = 153.297 (norm. = 0.403938), norm. avg. (of 6) = 0.238804 fft 34: mflops = 22.313 (norm. = 0.0587949), norm. avg. (of 6) = 0.131036 fft 35: mflops = 82.1334 (norm. = 0.216422), norm. avg. (of 6) = 0.103484 fft 36: mflops = 60.2449 (norm. = 0.158746), norm. avg. (of 6) = 0.0848277 fft 37: mflops = 7.31543 (norm. = 0.0192762), norm. avg. (of 6) = 0.0270857 fft 38: mflops = 200.54 (norm. = 0.528423), norm. avg. (of 6) = 0.265298 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.93845 s, 32768 iters, t-(init.)=1.89396 s t(norm)=0.0645077, mflops=77.5101 (err=3.5e-16) 1. Arndt DIT: elapsed time t=1.80732 s, 32768 iters, t-(init.)=1.76191 s t(norm)=0.0600104, mflops=83.3189 (err=3.2e-16) 2. Arndt Split-Radix: elapsed time t=1.48613 s, 16384 iters, t-(init.)=1.46352 s t(norm)=0.0996947, mflops=50.1531 (err=3.6e-16) 3. 
Arndt 4-step: elapsed time t=1.43511 s, 8192 iters, t-(init.)=1.42375 s t(norm)=0.193971, mflops=25.7771 (err=3.3e-16) 4. Bailey: elapsed time t=1.39066 s, 32768 iters, t-(init.)=1.34319 s t(norm)=0.0457487, mflops=109.293 (err=3.2e-16) 5. Beauregard: elapsed time t=1.07528 s, 2048 iters, t-(init.)=1.07238 s t(norm)=0.5844, mflops=8.55579 (err=3.7e-16) 6. Bergland: elapsed time t=1.08797 s, 16384 iters, t-(init.)=1.06486 s t(norm)=0.0725379, mflops=68.9295 (err=3.7e-16) 7. Brenner: elapsed time t=1.98309 s, 32768 iters, t-(init.)=1.93677 s t(norm)=0.0659659, mflops=75.7967 (err=4.1e-16) 8. Burrus: elapsed time t=1.4867 s, 8192 iters, t-(init.)=1.47502 s t(norm)=0.200956, mflops=24.881 (err=3.4e-16) 9. CWP (min N) (N=130): elapsed time t=1.10526 s, 32768 iters, t-(init.)=1.05845 s t(norm)=0.0360504, mflops=138.695 10. CWP (best N) (N=140): elapsed time t=1.40838 s, 65536 iters, t-(init.)=1.3106 s t(norm)=0.0223194, mflops=224.02 11. Edelblute: elapsed time t=1.60365 s, 8192 iters, t-(init.)=1.59201 s t(norm)=0.216895, mflops=23.0527 (err=3.4e-16) 12. FFTPACK: elapsed time t=1.46694 s, 65536 iters, t-(init.)=1.3751 s t(norm)=0.0234178, mflops=213.513 (err=3.5e-16) 13. FFTPACK (f2c): elapsed time t=1.19428 s, 32768 iters, t-(init.)=1.14952 s t(norm)=0.0391524, mflops=127.706 (err=3.5e-16) FFTW_MEASURE plan: (cost = 1.301275e-05) FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.0907 s, 65536 iters, t-(init.)=0.997467 s t(norm)=0.0169868, mflops=294.347 (err=3.8e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.25043 s, 65536 iters, t-(init.)=1.16015 s t(norm)=0.0197573, mflops=253.071 (err=3.5e-16) 16. Frigo-old: elapsed time t=1.52132 s, 65536 iters, t-(init.)=1.42995 s t(norm)=0.024352, mflops=205.322 (err=3.4e-16) 17. Green: elapsed time t=1.54943 s, 65536 iters, t-(init.)=1.45625 s t(norm)=0.0247998, mflops=201.615 (err=4.2e-16) 18. 
GSL: elapsed time t=1.74913 s, 32768 iters, t-(init.)=1.7033 s t(norm)=0.058014, mflops=86.1862 (err=3.4e-16) 19. GSL DIT: elapsed time t=1.14097 s, 16384 iters, t-(init.)=1.11838 s t(norm)=0.0761837, mflops=65.6308 (err=3.5e-16) 20. GSL DIF: elapsed time t=1.95309 s, 32768 iters, t-(init.)=1.90762 s t(norm)=0.0649733, mflops=76.9547 (err=3.7e-16) 21. Krukar: elapsed time t=1.59745 s, 32768 iters, t-(init.)=1.55282 s t(norm)=0.0528889, mflops=94.5378 (err=3.6e-16) 22. Mayer (Buneman): elapsed time t=1.74214 s, 32768 iters, t-(init.)=1.69747 s t(norm)=0.0578156, mflops=86.4818 (err=3.2e-16) 23. Mayer (simple): elapsed time t=1.36846 s, 32768 iters, t-(init.)=1.32322 s t(norm)=0.0450687, mflops=110.942 24. Mayer (lookup): elapsed time t=1.31846 s, 32768 iters, t-(init.)=1.27222 s t(norm)=0.0433315, mflops=115.39 (err=3.4e-16) 25. Monro: elapsed time t=1.4755 s, 32768 iters, t-(init.)=1.42948 s t(norm)=0.048688, mflops=102.695 (err=5.2e-08) 26. NAPACK (f2c): elapsed time t=1.02076 s, 8192 iters, t-(init.)=1.00975 s t(norm)=0.137567, mflops=36.3458 (err=1.2e-15) 27. Ooura (C): elapsed time t=1.79078 s, 65536 iters, t-(init.)=1.69771 s t(norm)=0.0289119, mflops=172.939 (err=3.3e-16) 28. Ooura (F): elapsed time t=1.89258 s, 65536 iters, t-(init.)=1.80035 s t(norm)=0.0306598, mflops=163.08 (err=3.3e-16) 29. Ransom: elapsed time t=1.09445 s, 8192 iters, t-(init.)=1.0827 s t(norm)=0.147507, mflops=33.8968 (err=1.0e-15) 30. SCIPORT: elapsed time t=1.1429 s, 32768 iters, t-(init.)=1.09626 s t(norm)=0.0373386, mflops=133.91 (err=3.6e-16) 31. Singleton: elapsed time t=1.34014 s, 32768 iters, t-(init.)=1.29368 s t(norm)=0.0440623, mflops=113.476 (err=4.2e-16) 32. Singleton (f2c): elapsed time t=1.27229 s, 32768 iters, t-(init.)=1.22644 s t(norm)=0.0417722, mflops=119.697 (err=4.2e-16) 33. Sorensen: elapsed time t=1.64415 s, 65536 iters, t-(init.)=1.55336 s t(norm)=0.0264536, mflops=189.01 (err=3.1e-16) 34. 
Sorensen DIT: elapsed time t=1.50154 s, 8192 iters, t-(init.)=1.49043 s t(norm)=0.203055, mflops=24.6239 (err=3.1e-16) 35. Temperton: elapsed time t=1.85499 s, 32768 iters, t-(init.)=1.80836 s t(norm)=0.0615923, mflops=81.1789 (err=4.7e-08) 36. Temperton (f2c): elapsed time t=1.23844 s, 16384 iters, t-(init.)=1.21514 s t(norm)=0.0827745, mflops=60.405 (err=3.6e-16) 37. Valkenburg: elapsed time t=1.26254 s, 2048 iters, t-(init.)=1.25972 s t(norm)=0.686491, mflops=7.28342 (err=5.7e-16) 38. SUNPERF: elapsed time t=1.29897 s, 65536 iters, t-(init.)=1.21046 s t(norm)=0.0206141, mflops=242.553 (err=3.5e-16) Top mflops for N=128 = 294.347 Normalized results and averages for N=128: fft 0: mflops = 77.5101 (norm. = 0.263329), norm. avg. (of 7) = 0.291609 fft 1: mflops = 83.3189 (norm. = 0.283064), norm. avg. (of 7) = 0.282796 fft 2: mflops = 50.1531 (norm. = 0.170388), norm. avg. (of 7) = 0.117939 fft 3: mflops = 25.7771 (norm. = 0.0875739), norm. avg. (of 7) = 0.0448425 fft 4: mflops = 109.293 (norm. = 0.371306), norm. avg. (of 7) = 0.139809 fft 5: mflops = 8.55579 (norm. = 0.029067), norm. avg. (of 7) = 0.0310635 fft 6: mflops = 68.9295 (norm. = 0.234178), norm. avg. (of 7) = 0.148421 fft 7: mflops = 75.7967 (norm. = 0.257508), norm. avg. (of 7) = 0.131596 fft 8: mflops = 24.881 (norm. = 0.0845297), norm. avg. (of 7) = 0.112849 fft 9: mflops = 138.695 (norm. = 0.471194), norm. avg. (of 7) = 0.185672 fft 10: mflops = 224.02 (norm. = 0.761075), norm. avg. (of 7) = 0.244921 fft 11: mflops = 23.0527 (norm. = 0.078318), norm. avg. (of 6) = 0.0605924 fft 12: mflops = 213.513 (norm. = 0.725379), norm. avg. (of 7) = 0.346829 fft 13: mflops = 127.706 (norm. = 0.433863), norm. avg. (of 7) = 0.224542 fft 14: mflops = 294.347 (norm. = 1), norm. avg. (of 7) = 0.806502 fft 15: mflops = 253.071 (norm. = 0.859772), norm. avg. (of 7) = 0.731089 fft 16: mflops = 205.322 (norm. = 0.697552), norm. avg. (of 7) = 0.868848 fft 17: mflops = 201.615 (norm. = 0.684956), norm. avg. 
(of 5) = 0.439963 fft 18: mflops = 86.1862 (norm. = 0.292805), norm. avg. (of 7) = 0.160683 fft 19: mflops = 65.6308 (norm. = 0.222971), norm. avg. (of 7) = 0.0989185 fft 20: mflops = 76.9547 (norm. = 0.261442), norm. avg. (of 7) = 0.109738 fft 21: mflops = 94.5378 (norm. = 0.321178), norm. avg. (of 7) = 0.42668 fft 22: mflops = 86.4818 (norm. = 0.293809), norm. avg. (of 6) = 0.196593 fft 23: mflops = 110.942 (norm. = 0.376909), norm. avg. (of 6) = 0.228123 fft 24: mflops = 115.39 (norm. = 0.392019), norm. avg. (of 6) = 0.228965 fft 25: mflops = 102.695 (norm. = 0.348891), norm. avg. (of 6) = 0.150812 fft 26: mflops = 36.3458 (norm. = 0.12348), norm. avg. (of 7) = 0.057307 fft 27: mflops = 172.939 (norm. = 0.587536), norm. avg. (of 7) = 0.358501 fft 28: mflops = 163.08 (norm. = 0.55404), norm. avg. (of 7) = 0.356472 fft 29: mflops = 33.8968 (norm. = 0.115159), norm. avg. (of 6) = 0.0558153 fft 30: mflops = 133.91 (norm. = 0.454939), norm. avg. (of 6) = 0.294876 fft 31: mflops = 113.476 (norm. = 0.385517), norm. avg. (of 7) = 0.169186 fft 32: mflops = 119.697 (norm. = 0.406652), norm. avg. (of 7) = 0.174615 fft 33: mflops = 189.01 (norm. = 0.642135), norm. avg. (of 7) = 0.296423 fft 34: mflops = 24.6239 (norm. = 0.083656), norm. avg. (of 7) = 0.124267 fft 35: mflops = 81.1789 (norm. = 0.275794), norm. avg. (of 7) = 0.1281 fft 36: mflops = 60.405 (norm. = 0.205217), norm. avg. (of 7) = 0.102026 fft 37: mflops = 7.28342 (norm. = 0.0247444), norm. avg. (of 7) = 0.0267513 fft 38: mflops = 242.553 (norm. = 0.824038), norm. avg. (of 7) = 0.345118 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.06423 s, 8192 iters, t-(init.)=1.04258 s t(norm)=0.0621424, mflops=80.4604 (err=9.6e-16) 1. Arndt DIT: elapsed time t=1.94511 s, 16384 iters, t-(init.)=1.90119 s t(norm)=0.0566598, mflops=88.246 (err=9.9e-16) 2. Arndt Split-Radix: elapsed time t=1.53763 s, 8192 iters, t-(init.)=1.51555 s t(norm)=0.0903336, mflops=55.3504 (err=9.8e-16) 3. 
Arndt 4-step: elapsed time t=1.44055 s, 4096 iters, t-(init.)=1.42962 s t(norm)=0.170424, mflops=29.3386 (err=1.0e-15) 4. Bailey: elapsed time t=1.50503 s, 16384 iters, t-(init.)=1.46029 s t(norm)=0.04352, mflops=114.89 (err=9.8e-16) 5. Beauregard: elapsed time t=1.25047 s, 1024 iters, t-(init.)=1.24771 s t(norm)=0.594953, mflops=8.40403 (err=1.1e-15) 6. Bergland: elapsed time t=1.19813 s, 8192 iters, t-(init.)=1.1762 s t(norm)=0.070107, mflops=71.3196 (err=1.0e-15) 7. Brenner: elapsed time t=1.95167 s, 16384 iters, t-(init.)=1.90736 s t(norm)=0.0568438, mflops=87.9604 (err=1.1e-15) 8. Burrus: elapsed time t=1.54042 s, 4096 iters, t-(init.)=1.52967 s t(norm)=0.182351, mflops=27.4197 (err=1.0e-15) 9. CWP (min N) (N=260): elapsed time t=1.15722 s, 16384 iters, t-(init.)=1.11215 s t(norm)=0.0331448, mflops=150.853 10. CWP (best N) (N=280): elapsed time t=1.71231 s, 32768 iters, t-(init.)=1.61727 s t(norm)=0.0240992, mflops=207.476 11. Edelblute: elapsed time t=1.66157 s, 4096 iters, t-(init.)=1.65067 s t(norm)=0.196775, mflops=25.4097 (err=1.0e-15) 12. FFTPACK: elapsed time t=1.57497 s, 32768 iters, t-(init.)=1.4874 s t(norm)=0.022164, mflops=225.591 (err=1.1e-15) 13. FFTPACK (f2c): elapsed time t=1.25792 s, 16384 iters, t-(init.)=1.21397 s t(norm)=0.0361792, mflops=138.201 (err=1.1e-15) FFTW_MEASURE plan: (cost = 3.188700e-05) FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.1036 s, 32768 iters, t-(init.)=1.0145 s t(norm)=0.0151172, mflops=330.749 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.40996 s, 32768 iters, t-(init.)=1.32365 s t(norm)=0.0197239, mflops=253.499 (err=1.1e-15) 16. Frigo-old: elapsed time t=1.78998 s, 32768 iters, t-(init.)=1.70358 s t(norm)=0.0253853, mflops=196.964 (err=1.1e-15) 17. Green: elapsed time t=1.92673 s, 32768 iters, t-(init.)=1.83762 s t(norm)=0.0273826, mflops=182.598 (err=1.1e-15) 18. 
GSL: elapsed time t=1.78019 s, 16384 iters, t-(init.)=1.73611 s t(norm)=0.0517402, mflops=96.6367 (err=1.1e-15) 19. GSL DIT: elapsed time t=1.18695 s, 8192 iters, t-(init.)=1.16523 s t(norm)=0.0694529, mflops=71.9912 (err=1.0e-15) 20. GSL DIF: elapsed time t=1.03278 s, 8192 iters, t-(init.)=1.0106 s t(norm)=0.0602364, mflops=83.0062 (err=1.1e-15) 21. Krukar: elapsed time t=1.42866 s, 16384 iters, t-(init.)=1.38517 s t(norm)=0.0412812, mflops=121.121 (err=1.1e-15) 22. Mayer (Buneman): elapsed time t=1.89665 s, 16384 iters, t-(init.)=1.85279 s t(norm)=0.0552175, mflops=90.551 (err=9.7e-16) 23. Mayer (simple): elapsed time t=1.50995 s, 16384 iters, t-(init.)=1.46675 s t(norm)=0.0437125, mflops=114.384 24. Mayer (lookup): elapsed time t=1.50685 s, 16384 iters, t-(init.)=1.46308 s t(norm)=0.0436032, mflops=114.671 (err=9.6e-16) 25. Monro: elapsed time t=1.50156 s, 16384 iters, t-(init.)=1.45728 s t(norm)=0.0434304, mflops=115.127 (err=8.5e-08) 26. NAPACK (f2c): elapsed time t=1.06544 s, 4096 iters, t-(init.)=1.05436 s t(norm)=0.125689, mflops=39.7806 (err=3.8e-15) 27. Ooura (C): elapsed time t=1.96389 s, 32768 iters, t-(init.)=1.87793 s t(norm)=0.0279834, mflops=178.677 (err=9.8e-16) 28. Ooura (F): elapsed time t=1.00311 s, 16384 iters, t-(init.)=0.959211 s t(norm)=0.0285867, mflops=174.906 (err=9.8e-16) 29. Ransom: elapsed time t=1.50932 s, 8192 iters, t-(init.)=1.48723 s t(norm)=0.0886458, mflops=56.4042 (err=1.9e-15) 30. SCIPORT: elapsed time t=1.55971 s, 16384 iters, t-(init.)=1.51591 s t(norm)=0.0451776, mflops=110.674 (err=1.1e-15) 31. Singleton: elapsed time t=1.09718 s, 16384 iters, t-(init.)=1.0537 s t(norm)=0.0314026, mflops=159.223 (err=1.7e-15) 32. Singleton (f2c): elapsed time t=1.11137 s, 16384 iters, t-(init.)=1.06734 s t(norm)=0.0318092, mflops=157.187 (err=1.7e-15) 33. Sorensen: elapsed time t=1.75742 s, 32768 iters, t-(init.)=1.66913 s t(norm)=0.024872, mflops=201.029 (err=9.8e-16) 34. 
Sorensen DIT: elapsed time t=1.55123 s, 4096 iters, t-(init.)=1.54018 s t(norm)=0.183603, mflops=27.2326 (err=9.8e-16) 35. Temperton: elapsed time t=1.73609 s, 16384 iters, t-(init.)=1.69245 s t(norm)=0.0504389, mflops=99.1298 (err=9.5e-08) 36. Temperton (f2c): elapsed time t=1.23552 s, 8192 iters, t-(init.)=1.2132 s t(norm)=0.0723122, mflops=69.1446 (err=1.1e-15) 37. Valkenburg: elapsed time t=1.43362 s, 1024 iters, t-(init.)=1.43087 s t(norm)=0.68229, mflops=7.32826 (err=1.2e-15) 38. SUNPERF: elapsed time t=1.42053 s, 32768 iters, t-(init.)=1.33232 s t(norm)=0.0198531, mflops=251.85 (err=1.1e-15) Top mflops for N=256 = 330.749 Normalized results and averages for N=256: fft 0: mflops = 80.4604 (norm. = 0.243267), norm. avg. (of 8) = 0.285566 fft 1: mflops = 88.246 (norm. = 0.266807), norm. avg. (of 8) = 0.280797 fft 2: mflops = 55.3504 (norm. = 0.167349), norm. avg. (of 8) = 0.124116 fft 3: mflops = 29.3386 (norm. = 0.0887035), norm. avg. (of 8) = 0.0503251 fft 4: mflops = 114.89 (norm. = 0.347363), norm. avg. (of 8) = 0.165754 fft 5: mflops = 8.40403 (norm. = 0.0254091), norm. avg. (of 8) = 0.0303567 fft 6: mflops = 71.3196 (norm. = 0.215631), norm. avg. (of 8) = 0.156822 fft 7: mflops = 87.9604 (norm. = 0.265943), norm. avg. (of 8) = 0.14839 fft 8: mflops = 27.4197 (norm. = 0.0829018), norm. avg. (of 8) = 0.109106 fft 9: mflops = 150.853 (norm. = 0.456097), norm. avg. (of 8) = 0.219475 fft 10: mflops = 207.476 (norm. = 0.627292), norm. avg. (of 8) = 0.292718 fft 11: mflops = 25.4097 (norm. = 0.0768248), norm. avg. (of 7) = 0.0629113 fft 12: mflops = 225.591 (norm. = 0.682062), norm. avg. (of 8) = 0.388733 fft 13: mflops = 138.201 (norm. = 0.417843), norm. avg. (of 8) = 0.248705 fft 14: mflops = 330.749 (norm. = 1), norm. avg. (of 8) = 0.830689 fft 15: mflops = 253.499 (norm. = 0.766441), norm. avg. (of 8) = 0.735508 fft 16: mflops = 196.964 (norm. = 0.59551), norm. avg. (of 8) = 0.834681 fft 17: mflops = 182.598 (norm. = 0.552073), norm. avg. 
(of 6) = 0.458648 fft 18: mflops = 96.6367 (norm. = 0.292176), norm. avg. (of 8) = 0.17712 fft 19: mflops = 71.9912 (norm. = 0.217661), norm. avg. (of 8) = 0.113761 fft 20: mflops = 83.0062 (norm. = 0.250965), norm. avg. (of 8) = 0.127392 fft 21: mflops = 121.121 (norm. = 0.366201), norm. avg. (of 8) = 0.41912 fft 22: mflops = 90.551 (norm. = 0.273776), norm. avg. (of 7) = 0.207619 fft 23: mflops = 114.384 (norm. = 0.345833), norm. avg. (of 7) = 0.244939 fft 24: mflops = 114.671 (norm. = 0.3467), norm. avg. (of 7) = 0.245784 fft 25: mflops = 115.127 (norm. = 0.348079), norm. avg. (of 7) = 0.178993 fft 26: mflops = 39.7806 (norm. = 0.120274), norm. avg. (of 8) = 0.0651779 fft 27: mflops = 178.677 (norm. = 0.540221), norm. avg. (of 8) = 0.381216 fft 28: mflops = 174.906 (norm. = 0.52882), norm. avg. (of 8) = 0.378016 fft 29: mflops = 56.4042 (norm. = 0.170535), norm. avg. (of 7) = 0.0722038 fft 30: mflops = 110.674 (norm. = 0.334618), norm. avg. (of 7) = 0.300553 fft 31: mflops = 159.223 (norm. = 0.4814), norm. avg. (of 8) = 0.208213 fft 32: mflops = 157.187 (norm. = 0.475247), norm. avg. (of 8) = 0.212194 fft 33: mflops = 201.029 (norm. = 0.607801), norm. avg. (of 8) = 0.335345 fft 34: mflops = 27.2326 (norm. = 0.0823363), norm. avg. (of 8) = 0.119026 fft 35: mflops = 99.1298 (norm. = 0.299713), norm. avg. (of 8) = 0.149551 fft 36: mflops = 69.1446 (norm. = 0.209055), norm. avg. (of 8) = 0.115405 fft 37: mflops = 7.32826 (norm. = 0.0221566), norm. avg. (of 8) = 0.0261769 fft 38: mflops = 251.85 (norm. = 0.761455), norm. avg. (of 8) = 0.39716 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.10368 s, 4096 iters, t-(init.)=1.08218 s t(norm)=0.0573358, mflops=87.2055 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.04408 s, 4096 iters, t-(init.)=1.02257 s t(norm)=0.0541778, mflops=92.2887 (err=1.1e-15) 2. Arndt Split-Radix: elapsed time t=1.64181 s, 4096 iters, t-(init.)=1.62025 s t(norm)=0.0858441, mflops=58.2451 (err=1.1e-15) 3. 
Arndt 4-step: elapsed time t=1.53539 s, 2048 iters, t-(init.)=1.52472 s t(norm)=0.161565, mflops=30.9473 (err=1.0e-15) 4. Bailey: elapsed time t=1.66915 s, 8192 iters, t-(init.)=1.62396 s t(norm)=0.0430204, mflops=116.224 (err=1.1e-15) 5. Beauregard: elapsed time t=1.42987 s, 512 iters, t-(init.)=1.42722 s t(norm)=0.604933, mflops=8.26537 (err=1.1e-15) 6. Bergland: elapsed time t=1.30784 s, 4096 iters, t-(init.)=1.28615 s t(norm)=0.0681428, mflops=73.3754 (err=1.0e-15) 7. Brenner: elapsed time t=1.0667 s, 4096 iters, t-(init.)=1.04516 s t(norm)=0.0553747, mflops=90.294 (err=1.0e-15) 8. Burrus: elapsed time t=1.59308 s, 2048 iters, t-(init.)=1.58237 s t(norm)=0.167674, mflops=29.8197 (err=1.1e-15) 9. CWP (min N) (N=520): elapsed time t=1.22203 s, 8192 iters, t-(init.)=1.17822 s t(norm)=0.0312122, mflops=160.194 10. CWP (best N) (N=560): elapsed time t=1.04272 s, 8192 iters, t-(init.)=0.996258 s t(norm)=0.0263918, mflops=189.453 11. Edelblute: elapsed time t=1.69117 s, 2048 iters, t-(init.)=1.68053 s t(norm)=0.178075, mflops=28.078 (err=1.1e-15) 12. FFTPACK: elapsed time t=1.1279 s, 8192 iters, t-(init.)=1.0847 s t(norm)=0.0287347, mflops=174.006 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.8565 s, 8192 iters, t-(init.)=1.81399 s t(norm)=0.0480544, mflops=104.049 (err=1.0e-15) FFTW_MEASURE plan: (cost = 8.718450e-05) FFTW_TWIDDLE 16 FFTW_NOTW 32 14. FFTW: elapsed time t=1.4934 s, 16384 iters, t-(init.)=1.40805 s t(norm)=0.0186502, mflops=268.093 (err=9.7e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.49326 s, 16384 iters, t-(init.)=1.40721 s t(norm)=0.0186392, mflops=268.252 (err=9.7e-16) 16. Frigo-old: elapsed time t=1.13922 s, 8192 iters, t-(init.)=1.09637 s t(norm)=0.029044, mflops=172.153 (err=9.5e-16) 17. Green: elapsed time t=1.15618 s, 8192 iters, t-(init.)=1.11324 s t(norm)=0.0294907, mflops=169.545 (err=9.6e-16) 18. 
GSL: elapsed time t=1.26832 s, 4096 iters, t-(init.)=1.24703 s t(norm)=0.0660702, mflops=75.6771 (err=1.0e-15) 19. GSL DIT: elapsed time t=1.18657 s, 4096 iters, t-(init.)=1.1652 s t(norm)=0.0617345, mflops=80.992 (err=1.2e-15) 20. GSL DIF: elapsed time t=1.0698 s, 4096 iters, t-(init.)=1.04851 s t(norm)=0.0555522, mflops=90.0054 (err=1.1e-15) 21. Krukar: elapsed time t=1.82889 s, 8192 iters, t-(init.)=1.78595 s t(norm)=0.0473116, mflops=105.682 (err=1.0e-15) 22. Mayer (Buneman): elapsed time t=1.96312 s, 8192 iters, t-(init.)=1.91982 s t(norm)=0.0508578, mflops=98.3134 (err=1.0e-15) 23. Mayer (simple): elapsed time t=1.49071 s, 8192 iters, t-(init.)=1.44781 s t(norm)=0.0383538, mflops=130.365 24. Mayer (lookup): elapsed time t=1.57911 s, 8192 iters, t-(init.)=1.53648 s t(norm)=0.0407027, mflops=122.842 (err=1.0e-15) 25. Monro: elapsed time t=1.5648 s, 8192 iters, t-(init.)=1.52209 s t(norm)=0.0403217, mflops=124.003 (err=8.4e-08) 26. NAPACK (f2c): elapsed time t=1.27914 s, 2048 iters, t-(init.)=1.26851 s t(norm)=0.134416, mflops=37.1979 (err=7.1e-15) 27. Ooura (C): elapsed time t=1.04753 s, 8192 iters, t-(init.)=1.00465 s t(norm)=0.0266141, mflops=187.87 (err=9.6e-16) 28. Ooura (F): elapsed time t=1.09643 s, 8192 iters, t-(init.)=1.05381 s t(norm)=0.0279164, mflops=179.106 (err=9.6e-16) 29. Ransom: elapsed time t=1.88004 s, 4096 iters, t-(init.)=1.85852 s t(norm)=0.098468, mflops=50.7779 (err=1.4e-15) 30. SCIPORT: elapsed time t=1.71542 s, 8192 iters, t-(init.)=1.67249 s t(norm)=0.044306, mflops=112.852 (err=1.0e-15) 31. Singleton: elapsed time t=1.06674 s, 4096 iters, t-(init.)=1.04526 s t(norm)=0.05538, mflops=90.2852 (err=1.2e-15) 32. Singleton (f2c): elapsed time t=1.17375 s, 8192 iters, t-(init.)=1.13078 s t(norm)=0.0299556, mflops=166.914 (err=1.2e-15) 33. Sorensen: elapsed time t=1.80325 s, 16384 iters, t-(init.)=1.71731 s t(norm)=0.0227466, mflops=219.813 (err=1.0e-15) 34. 
Sorensen DIT: elapsed time t=1.59085 s, 2048 iters, t-(init.)=1.58001 s t(norm)=0.167424, mflops=29.8644 (err=1.1e-15) 35. Temperton: elapsed time t=1.04837 s, 4096 iters, t-(init.)=1.02695 s t(norm)=0.05441, mflops=91.8948 (err=1.0e-07) 36. Temperton (f2c): elapsed time t=1.59227 s, 4096 iters, t-(init.)=1.57066 s t(norm)=0.0832167, mflops=60.0841 (err=1.0e-15) 37. Valkenburg: elapsed time t=1.61091 s, 512 iters, t-(init.)=1.60822 s t(norm)=0.681654, mflops=7.3351 (err=1.3e-15) 38. SUNPERF: elapsed time t=1.16129 s, 8192 iters, t-(init.)=1.11798 s t(norm)=0.0296163, mflops=168.826 (err=1.0e-15) Top mflops for N=512 = 268.252 Normalized results and averages for N=512: fft 0: mflops = 87.2055 (norm. = 0.325088), norm. avg. (of 9) = 0.289957 fft 1: mflops = 92.2887 (norm. = 0.344038), norm. avg. (of 9) = 0.287824 fft 2: mflops = 58.2451 (norm. = 0.217129), norm. avg. (of 9) = 0.13445 fft 3: mflops = 30.9473 (norm. = 0.115367), norm. avg. (of 9) = 0.057552 fft 4: mflops = 116.224 (norm. = 0.433265), norm. avg. (of 9) = 0.195477 fft 5: mflops = 8.26537 (norm. = 0.030812), norm. avg. (of 9) = 0.0304073 fft 6: mflops = 73.3754 (norm. = 0.273532), norm. avg. (of 9) = 0.16979 fft 7: mflops = 90.294 (norm. = 0.336602), norm. avg. (of 9) = 0.169302 fft 8: mflops = 29.8197 (norm. = 0.111163), norm. avg. (of 9) = 0.109334 fft 9: mflops = 160.194 (norm. = 0.597176), norm. avg. (of 9) = 0.261442 fft 10: mflops = 189.453 (norm. = 0.70625), norm. avg. (of 9) = 0.338666 fft 11: mflops = 28.078 (norm. = 0.10467), norm. avg. (of 8) = 0.0681312 fft 12: mflops = 174.006 (norm. = 0.648666), norm. avg. (of 9) = 0.417615 fft 13: mflops = 104.049 (norm. = 0.387877), norm. avg. (of 9) = 0.264168 fft 14: mflops = 268.093 (norm. = 0.999409), norm. avg. (of 9) = 0.849436 fft 15: mflops = 268.252 (norm. = 1), norm. avg. (of 9) = 0.764896 fft 16: mflops = 172.153 (norm. = 0.641758), norm. avg. (of 9) = 0.813245 fft 17: mflops = 169.545 (norm. = 0.632037), norm. avg. 
(of 7) = 0.483418 fft 18: mflops = 75.6771 (norm. = 0.282112), norm. avg. (of 9) = 0.188786 fft 19: mflops = 80.992 (norm. = 0.301925), norm. avg. (of 9) = 0.134668 fft 20: mflops = 90.0054 (norm. = 0.335526), norm. avg. (of 9) = 0.150518 fft 21: mflops = 105.682 (norm. = 0.393967), norm. avg. (of 9) = 0.416325 fft 22: mflops = 98.3134 (norm. = 0.366497), norm. avg. (of 8) = 0.227479 fft 23: mflops = 130.365 (norm. = 0.485981), norm. avg. (of 8) = 0.275069 fft 24: mflops = 122.842 (norm. = 0.457935), norm. avg. (of 8) = 0.272303 fft 25: mflops = 124.003 (norm. = 0.462262), norm. avg. (of 8) = 0.214402 fft 26: mflops = 37.1979 (norm. = 0.138668), norm. avg. (of 9) = 0.0733435 fft 27: mflops = 187.87 (norm. = 0.70035), norm. avg. (of 9) = 0.416675 fft 28: mflops = 179.106 (norm. = 0.66768), norm. avg. (of 9) = 0.410201 fft 29: mflops = 50.7779 (norm. = 0.189292), norm. avg. (of 8) = 0.0868398 fft 30: mflops = 112.852 (norm. = 0.420693), norm. avg. (of 8) = 0.31557 fft 31: mflops = 90.2852 (norm. = 0.336569), norm. avg. (of 9) = 0.222475 fft 32: mflops = 166.914 (norm. = 0.622229), norm. avg. (of 9) = 0.257753 fft 33: mflops = 219.813 (norm. = 0.819429), norm. avg. (of 9) = 0.389132 fft 34: mflops = 29.8644 (norm. = 0.11133), norm. avg. (of 9) = 0.118171 fft 35: mflops = 91.8948 (norm. = 0.342569), norm. avg. (of 9) = 0.170998 fft 36: mflops = 60.0841 (norm. = 0.223984), norm. avg. (of 9) = 0.127469 fft 37: mflops = 7.3351 (norm. = 0.0273441), norm. avg. (of 9) = 0.0263066 fft 38: mflops = 168.826 (norm. = 0.629356), norm. avg. (of 9) = 0.42296 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.21365 s, 2048 iters, t-(init.)=1.1925 s t(norm)=0.0568627, mflops=87.9311 (err=1.8e-15) 1. Arndt DIT: elapsed time t=1.09211 s, 2048 iters, t-(init.)=1.07072 s t(norm)=0.0510559, mflops=97.932 (err=1.8e-15) 2. Arndt Split-Radix: elapsed time t=1.71356 s, 2048 iters, t-(init.)=1.6923 s t(norm)=0.0806954, mflops=61.9614 (err=1.8e-15) 3. 
Arndt 4-step: elapsed time t=1.49136 s, 1024 iters, t-(init.)=1.48077 s t(norm)=0.141217, mflops=35.4065 (err=1.8e-15) 4. Bailey: elapsed time t=1.00703 s, 2048 iters, t-(init.)=0.985727 s t(norm)=0.0470031, mflops=106.376 (err=1.9e-15) 5. Beauregard: elapsed time t=1.62971 s, 256 iters, t-(init.)=1.62707 s t(norm)=0.620677, mflops=8.05571 (err=2.0e-15) 6. Bergland: elapsed time t=1.61419 s, 2048 iters, t-(init.)=1.5929 s t(norm)=0.0759556, mflops=65.8279 (err=2.2e-15) 7. Brenner: elapsed time t=1.07607 s, 2048 iters, t-(init.)=1.05478 s t(norm)=0.0502958, mflops=99.4119 (err=1.9e-15) 8. Burrus: elapsed time t=1.65486 s, 1024 iters, t-(init.)=1.64422 s t(norm)=0.156805, mflops=31.8867 (err=1.8e-15) 9. CWP (min N) (N=1040): elapsed time t=1.36878 s, 4096 iters, t-(init.)=1.32582 s t(norm)=0.0316099, mflops=158.178 10. CWP (best N) (N=1040): elapsed time t=1.36864 s, 4096 iters, t-(init.)=1.32556 s t(norm)=0.0316038, mflops=158.209 11. Edelblute: elapsed time t=1.76334 s, 1024 iters, t-(init.)=1.75271 s t(norm)=0.167151, mflops=29.913 (err=1.8e-15) 12. FFTPACK: elapsed time t=1.35852 s, 4096 iters, t-(init.)=1.31584 s t(norm)=0.0313721, mflops=159.377 (err=2.0e-15) 13. FFTPACK (f2c): elapsed time t=1.1187 s, 2048 iters, t-(init.)=1.09744 s t(norm)=0.0523298, mflops=95.5478 (err=2.0e-15) FFTW_MEASURE plan: (cost = 2.263880e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.7982 s, 8192 iters, t-(init.)=1.71353 s t(norm)=0.0204268, mflops=244.776 (err=2.0e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.85923 s, 8192 iters, t-(init.)=1.77413 s t(norm)=0.0211492, mflops=236.415 (err=2.0e-15) 16. Frigo-old: elapsed time t=1.57885 s, 4096 iters, t-(init.)=1.53622 s t(norm)=0.0366264, mflops=136.514 (err=1.9e-15) 17. Green: elapsed time t=1.22269 s, 4096 iters, t-(init.)=1.18005 s t(norm)=0.0281346, mflops=177.717 (err=2.0e-15) 18. 
GSL: elapsed time t=1.38288 s, 2048 iters, t-(init.)=1.36166 s t(norm)=0.0649289, mflops=77.0073 (err=2.0e-15) 19. GSL DIT: elapsed time t=1.26242 s, 2048 iters, t-(init.)=1.24113 s t(norm)=0.0591819, mflops=84.4853 (err=2.1e-15) 20. GSL DIF: elapsed time t=1.09025 s, 2048 iters, t-(init.)=1.06901 s t(norm)=0.0509742, mflops=98.0889 (err=2.2e-15) 21. Krukar: elapsed time t=1.4541 s, 2048 iters, t-(init.)=1.43288 s t(norm)=0.0683248, mflops=73.1799 (err=1.9e-15) 22. Mayer (Buneman): elapsed time t=1.04226 s, 2048 iters, t-(init.)=1.021 s t(norm)=0.0486851, mflops=102.701 (err=1.8e-15) 23. Mayer (simple): elapsed time t=1.60819 s, 4096 iters, t-(init.)=1.56589 s t(norm)=0.0373337, mflops=133.927 24. Mayer (lookup): elapsed time t=1.67737 s, 4096 iters, t-(init.)=1.63494 s t(norm)=0.0389801, mflops=128.271 (err=1.8e-15) 25. Monro: elapsed time t=1.60769 s, 4096 iters, t-(init.)=1.56524 s t(norm)=0.0373181, mflops=133.983 (err=1.0e-07) 26. NAPACK (f2c): elapsed time t=1.44135 s, 1024 iters, t-(init.)=1.4307 s t(norm)=0.136442, mflops=36.6456 (err=1.7e-14) 27. Ooura (C): elapsed time t=1.07149 s, 4096 iters, t-(init.)=1.02895 s t(norm)=0.0245321, mflops=203.815 (err=2.2e-15) 28. Ooura (F): elapsed time t=1.09706 s, 4096 iters, t-(init.)=1.05481 s t(norm)=0.0251486, mflops=198.818 (err=2.2e-15) 29. Ransom: elapsed time t=1.42639 s, 2048 iters, t-(init.)=1.40518 s t(norm)=0.067004, mflops=74.6224 (err=2.3e-15) 30. SCIPORT: elapsed time t=1.04142 s, 2048 iters, t-(init.)=1.02012 s t(norm)=0.048643, mflops=102.79 (err=2.0e-15) 31. Singleton: elapsed time t=1.728 s, 4096 iters, t-(init.)=1.68523 s t(norm)=0.0401791, mflops=124.443 (err=2.8e-15) 32. Singleton (f2c): elapsed time t=1.16331 s, 4096 iters, t-(init.)=1.12092 s t(norm)=0.0267249, mflops=187.092 (err=2.8e-15) 33. Sorensen: elapsed time t=1.87998 s, 8192 iters, t-(init.)=1.79505 s t(norm)=0.0213987, mflops=233.659 (err=1.8e-15) 34. 
Sorensen DIT: elapsed time t=1.65718 s, 1024 iters, t-(init.)=1.64653 s t(norm)=0.157025, mflops=31.8421 (err=1.8e-15) 35. Temperton: elapsed time t=1.01442 s, 2048 iters, t-(init.)=0.993214 s t(norm)=0.0473601, mflops=105.574 (err=1.1e-07) 36. Temperton (f2c): elapsed time t=1.54005 s, 2048 iters, t-(init.)=1.51881 s t(norm)=0.0724227, mflops=69.0392 (err=2.0e-15) 37. Valkenburg: elapsed time t=1.79814 s, 256 iters, t-(init.)=1.79548 s t(norm)=0.684922, mflops=7.3001 (err=2.4e-15) 38. SUNPERF: elapsed time t=1.44108 s, 4096 iters, t-(init.)=1.39843 s t(norm)=0.0333412, mflops=149.965 (err=2.0e-15) Top mflops for N=1024 = 244.776 Normalized results and averages for N=1024: fft 0: mflops = 87.9311 (norm. = 0.359231), norm. avg. (of 10) = 0.296885 fft 1: mflops = 97.932 (norm. = 0.400088), norm. avg. (of 10) = 0.29905 fft 2: mflops = 61.9614 (norm. = 0.253135), norm. avg. (of 10) = 0.146319 fft 3: mflops = 35.4065 (norm. = 0.144648), norm. avg. (of 10) = 0.0662616 fft 4: mflops = 106.376 (norm. = 0.434585), norm. avg. (of 10) = 0.219388 fft 5: mflops = 8.05571 (norm. = 0.0329105), norm. avg. (of 10) = 0.0306576 fft 6: mflops = 65.8279 (norm. = 0.268931), norm. avg. (of 10) = 0.179704 fft 7: mflops = 99.4119 (norm. = 0.406134), norm. avg. (of 10) = 0.192985 fft 8: mflops = 31.8867 (norm. = 0.130269), norm. avg. (of 10) = 0.111428 fft 9: mflops = 158.178 (norm. = 0.646216), norm. avg. (of 10) = 0.299919 fft 10: mflops = 158.209 (norm. = 0.646341), norm. avg. (of 10) = 0.369433 fft 11: mflops = 29.913 (norm. = 0.122206), norm. avg. (of 9) = 0.0741395 fft 12: mflops = 159.377 (norm. = 0.651114), norm. avg. (of 10) = 0.440965 fft 13: mflops = 95.5478 (norm. = 0.390348), norm. avg. (of 10) = 0.276786 fft 14: mflops = 244.776 (norm. = 1), norm. avg. (of 10) = 0.864492 fft 15: mflops = 236.415 (norm. = 0.965842), norm. avg. (of 10) = 0.78499 fft 16: mflops = 136.514 (norm. = 0.557708), norm. avg. (of 10) = 0.787691 fft 17: mflops = 177.717 (norm. = 0.726038), norm. avg. 
(of 8) = 0.513745 fft 18: mflops = 77.0073 (norm. = 0.314603), norm. avg. (of 10) = 0.201367 fft 19: mflops = 84.4853 (norm. = 0.345153), norm. avg. (of 10) = 0.155717 fft 20: mflops = 98.0889 (norm. = 0.400729), norm. avg. (of 10) = 0.175539 fft 21: mflops = 73.1799 (norm. = 0.298966), norm. avg. (of 10) = 0.404589 fft 22: mflops = 102.701 (norm. = 0.419571), norm. avg. (of 9) = 0.248822 fft 23: mflops = 133.927 (norm. = 0.547142), norm. avg. (of 9) = 0.305299 fft 24: mflops = 128.271 (norm. = 0.524033), norm. avg. (of 9) = 0.300273 fft 25: mflops = 133.983 (norm. = 0.54737), norm. avg. (of 9) = 0.251398 fft 26: mflops = 36.6456 (norm. = 0.149711), norm. avg. (of 10) = 0.0809802 fft 27: mflops = 203.815 (norm. = 0.832657), norm. avg. (of 10) = 0.458274 fft 28: mflops = 198.818 (norm. = 0.812245), norm. avg. (of 10) = 0.450405 fft 29: mflops = 74.6224 (norm. = 0.30486), norm. avg. (of 9) = 0.111064 fft 30: mflops = 102.79 (norm. = 0.419933), norm. avg. (of 9) = 0.327166 fft 31: mflops = 124.443 (norm. = 0.508394), norm. avg. (of 10) = 0.251067 fft 32: mflops = 187.092 (norm. = 0.764338), norm. avg. (of 10) = 0.308412 fft 33: mflops = 233.659 (norm. = 0.954582), norm. avg. (of 10) = 0.445677 fft 34: mflops = 31.8421 (norm. = 0.130086), norm. avg. (of 10) = 0.119362 fft 35: mflops = 105.574 (norm. = 0.431308), norm. avg. (of 10) = 0.197029 fft 36: mflops = 69.0392 (norm. = 0.28205), norm. avg. (of 10) = 0.142927 fft 37: mflops = 7.3001 (norm. = 0.0298236), norm. avg. (of 10) = 0.0266583 fft 38: mflops = 149.965 (norm. = 0.61266), norm. avg. (of 10) = 0.44193 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.4973 s, 1024 iters, t-(init.)=1.47617 s t(norm)=0.0639904, mflops=78.1368 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.42711 s, 1024 iters, t-(init.)=1.40593 s t(norm)=0.0609453, mflops=82.0408 (err=1.4e-15) 2. 
Arndt Split-Radix: elapsed time t=1.11237 s, 512 iters, t-(init.)=1.10179 s t(norm)=0.0955228, mflops=52.3435 (err=1.4e-15) 3. Arndt 4-step: elapsed time t=1.74699 s, 512 iters, t-(init.)=1.73641 s t(norm)=0.150543, mflops=33.2132 (err=1.4e-15) 4. Bailey: elapsed time t=1.37756 s, 1024 iters, t-(init.)=1.35639 s t(norm)=0.0587978, mflops=85.0373 (err=1.4e-15) 5. Beauregard: elapsed time t=1.83048 s, 128 iters, t-(init.)=1.82783 s t(norm)=0.633876, mflops=7.88798 (err=1.4e-15) 6. Bergland: elapsed time t=1.84616 s, 1024 iters, t-(init.)=1.82501 s t(norm)=0.079112, mflops=63.2015 (err=1.5e-15) 7. Brenner: elapsed time t=1.34238 s, 1024 iters, t-(init.)=1.32122 s t(norm)=0.0572734, mflops=87.3005 (err=1.4e-15) 8. Burrus: elapsed time t=1.89104 s, 512 iters, t-(init.)=1.88044 s t(norm)=0.16303, mflops=30.6692 (err=1.4e-15) 9. CWP (min N) (N=2145): elapsed time t=1.88589 s, 2048 iters, t-(init.)=1.84166 s t(norm)=0.0399168, mflops=125.26 10. CWP (best N) (N=2184): elapsed time t=1.60058 s, 2048 iters, t-(init.)=1.55549 s t(norm)=0.0337143, mflops=148.305 11. Edelblute: elapsed time t=1.01598 s, 256 iters, t-(init.)=1.01071 s t(norm)=0.175252, mflops=28.5303 (err=1.4e-15) 12. FFTPACK: elapsed time t=1.41351 s, 2048 iters, t-(init.)=1.37128 s t(norm)=0.0297216, mflops=168.228 (err=1.4e-15) 13. FFTPACK (f2c): elapsed time t=1.26832 s, 1024 iters, t-(init.)=1.24711 s t(norm)=0.054061, mflops=92.4882 (err=1.4e-15) FFTW_MEASURE plan: (cost = 4.859260e-04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.98484 s, 4096 iters, t-(init.)=1.90038 s t(norm)=0.0205948, mflops=242.78 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.01333 s, 2048 iters, t-(init.)=0.971124 s t(norm)=0.0210485, mflops=237.546 (err=1.4e-15) 16. Frigo-old: elapsed time t=1.7636 s, 2048 iters, t-(init.)=1.72126 s t(norm)=0.0373073, mflops=134.022 (err=1.3e-15) 17. 
Green: elapsed time t=1.59634 s, 2048 iters, t-(init.)=1.55412 s t(norm)=0.0336847, mflops=148.435 (err=1.4e-15) 18. GSL: elapsed time t=1.72494 s, 1024 iters, t-(init.)=1.70381 s t(norm)=0.0738582, mflops=67.6973 (err=1.4e-15) 19. GSL DIT: elapsed time t=1.89002 s, 1024 iters, t-(init.)=1.86892 s t(norm)=0.0810154, mflops=61.7166 (err=2.0e-15) 20. GSL DIF: elapsed time t=1.5931 s, 1024 iters, t-(init.)=1.57196 s t(norm)=0.0681428, mflops=73.3754 (err=2.3e-15) 21. Krukar: elapsed time t=1.70854 s, 1024 iters, t-(init.)=1.68739 s t(norm)=0.0731465, mflops=68.3559 (err=1.4e-15) 22. Mayer (Buneman): elapsed time t=1.19657 s, 1024 iters, t-(init.)=1.17544 s t(norm)=0.0509539, mflops=98.128 (err=1.4e-15) 23. Mayer (simple): elapsed time t=1.90833 s, 2048 iters, t-(init.)=1.86608 s t(norm)=0.0404462, mflops=123.621 24. Mayer (lookup): elapsed time t=1.97155 s, 2048 iters, t-(init.)=1.92929 s t(norm)=0.0418161, mflops=119.571 (err=1.4e-15) 25. Monro: elapsed time t=1.16873 s, 1024 iters, t-(init.)=1.1476 s t(norm)=0.0497471, mflops=100.508 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.91623 s, 512 iters, t-(init.)=1.90565 s t(norm)=0.165216, mflops=30.2635 (err=1.5e-14) 27. Ooura (C): elapsed time t=1.65288 s, 2048 iters, t-(init.)=1.61066 s t(norm)=0.03491, mflops=143.225 (err=1.4e-15) 28. Ooura (F): elapsed time t=1.71819 s, 2048 iters, t-(init.)=1.67577 s t(norm)=0.0363213, mflops=137.66 (err=1.4e-15) 29. Ransom: elapsed time t=1.06103 s, 512 iters, t-(init.)=1.05044 s t(norm)=0.091071, mflops=54.9022 (err=2.1e-15) 30. SCIPORT: elapsed time t=1.19143 s, 1024 iters, t-(init.)=1.17034 s t(norm)=0.0507327, mflops=98.5557 (err=1.4e-15) 31. Singleton: elapsed time t=1.07327 s, 1024 iters, t-(init.)=1.05207 s t(norm)=0.0456059, mflops=109.635 (err=1.9e-15) 32. Singleton (f2c): elapsed time t=1.96745 s, 2048 iters, t-(init.)=1.92524 s t(norm)=0.0417283, mflops=119.823 (err=1.9e-15) 33. 
Sorensen: elapsed time t=1.6746 s, 2048 iters, t-(init.)=1.63229 s t(norm)=0.035379, mflops=141.327 (err=1.4e-15) 34. Sorensen DIT: elapsed time t=1.0698 s, 256 iters, t-(init.)=1.0645 s t(norm)=0.18458, mflops=27.0885 (err=1.4e-15) 35. Temperton: elapsed time t=1.32219 s, 1024 iters, t-(init.)=1.30103 s t(norm)=0.0563982, mflops=88.6554 (err=1.1e-07) 36. Temperton (f2c): elapsed time t=1.82917 s, 1024 iters, t-(init.)=1.80804 s t(norm)=0.0783766, mflops=63.7946 (err=1.4e-15) 37. Valkenburg: elapsed time t=1.01556 s, 64 iters, t-(init.)=1.01425 s t(norm)=0.703462, mflops=7.10771 (err=1.7e-15) 38. SUNPERF: elapsed time t=1.6447 s, 2048 iters, t-(init.)=1.60241 s t(norm)=0.0347312, mflops=143.963 (err=1.4e-15) Top mflops for N=2048 = 242.78 Normalized results and averages for N=2048: fft 0: mflops = 78.1368 (norm. = 0.321842), norm. avg. (of 11) = 0.299153 fft 1: mflops = 82.0408 (norm. = 0.337922), norm. avg. (of 11) = 0.302584 fft 2: mflops = 52.3435 (norm. = 0.215601), norm. avg. (of 11) = 0.152617 fft 3: mflops = 33.2132 (norm. = 0.136803), norm. avg. (of 11) = 0.0726745 fft 4: mflops = 85.0373 (norm. = 0.350264), norm. avg. (of 11) = 0.231286 fft 5: mflops = 7.88798 (norm. = 0.0324902), norm. avg. (of 11) = 0.0308242 fft 6: mflops = 63.2015 (norm. = 0.260324), norm. avg. (of 11) = 0.187033 fft 7: mflops = 87.3005 (norm. = 0.359587), norm. avg. (of 11) = 0.208131 fft 8: mflops = 30.6692 (norm. = 0.126325), norm. avg. (of 11) = 0.112782 fft 9: mflops = 125.26 (norm. = 0.515942), norm. avg. (of 11) = 0.319558 fft 10: mflops = 148.305 (norm. = 0.610861), norm. avg. (of 11) = 0.391381 fft 11: mflops = 28.5303 (norm. = 0.117515), norm. avg. (of 10) = 0.078477 fft 12: mflops = 168.228 (norm. = 0.692922), norm. avg. (of 11) = 0.46387 fft 13: mflops = 92.4882 (norm. = 0.380954), norm. avg. (of 11) = 0.286256 fft 14: mflops = 242.78 (norm. = 1), norm. avg. (of 11) = 0.876811 fft 15: mflops = 237.546 (norm. = 0.978442), norm. avg. 
(of 11) = 0.802577 fft 16: mflops = 134.022 (norm. = 0.55203), norm. avg. (of 11) = 0.766267 fft 17: mflops = 148.435 (norm. = 0.611398), norm. avg. (of 9) = 0.524596 fft 18: mflops = 67.6973 (norm. = 0.278842), norm. avg. (of 11) = 0.208411 fft 19: mflops = 61.7166 (norm. = 0.254208), norm. avg. (of 11) = 0.164671 fft 20: mflops = 73.3754 (norm. = 0.30223), norm. avg. (of 11) = 0.187056 fft 21: mflops = 68.3559 (norm. = 0.281555), norm. avg. (of 11) = 0.393404 fft 22: mflops = 98.128 (norm. = 0.404185), norm. avg. (of 10) = 0.264359 fft 23: mflops = 123.621 (norm. = 0.509189), norm. avg. (of 10) = 0.325688 fft 24: mflops = 119.571 (norm. = 0.492508), norm. avg. (of 10) = 0.319497 fft 25: mflops = 100.508 (norm. = 0.413989), norm. avg. (of 10) = 0.267657 fft 26: mflops = 30.2635 (norm. = 0.124654), norm. avg. (of 11) = 0.0849505 fft 27: mflops = 143.225 (norm. = 0.589938), norm. avg. (of 11) = 0.470243 fft 28: mflops = 137.66 (norm. = 0.567016), norm. avg. (of 11) = 0.461006 fft 29: mflops = 54.9022 (norm. = 0.22614), norm. avg. (of 10) = 0.122572 fft 30: mflops = 98.5557 (norm. = 0.405946), norm. avg. (of 10) = 0.335044 fft 31: mflops = 109.635 (norm. = 0.451582), norm. avg. (of 11) = 0.269296 fft 32: mflops = 119.823 (norm. = 0.493544), norm. avg. (of 11) = 0.325242 fft 33: mflops = 141.327 (norm. = 0.582119), norm. avg. (of 11) = 0.458081 fft 34: mflops = 27.0885 (norm. = 0.111576), norm. avg. (of 11) = 0.118654 fft 35: mflops = 88.6554 (norm. = 0.365167), norm. avg. (of 11) = 0.212314 fft 36: mflops = 63.7946 (norm. = 0.262767), norm. avg. (of 11) = 0.153822 fft 37: mflops = 7.10771 (norm. = 0.0292763), norm. avg. (of 11) = 0.0268963 fft 38: mflops = 143.963 (norm. = 0.592975), norm. avg. (of 11) = 0.455661 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.58992 s, 512 iters, t-(init.)=1.56886 s t(norm)=0.062341, mflops=80.2041 (err=3.7e-15) 1. 
Arndt DIT: elapsed time t=1.53903 s, 512 iters, t-(init.)=1.51794 s t(norm)=0.0603176, mflops=82.8946 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.15841 s, 256 iters, t-(init.)=1.14786 s t(norm)=0.091224, mflops=54.8102 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.75603 s, 256 iters, t-(init.)=1.74549 s t(norm)=0.138719, mflops=36.044 (err=3.7e-15) 4. Bailey: elapsed time t=1.42447 s, 512 iters, t-(init.)=1.40334 s t(norm)=0.0557636, mflops=89.6642 (err=3.7e-15) 5. Beauregard: elapsed time t=1.00524 s, 32 iters, t-(init.)=1.00391 s t(norm)=0.638266, mflops=7.83373 (err=3.8e-15) 6. Bergland: elapsed time t=1.95226 s, 512 iters, t-(init.)=1.93114 s t(norm)=0.0767366, mflops=65.1579 (err=3.9e-15) 7. Brenner: elapsed time t=1.3507 s, 512 iters, t-(init.)=1.32965 s t(norm)=0.0528357, mflops=94.6331 (err=3.8e-15) 8. Burrus: elapsed time t=1.92254 s, 256 iters, t-(init.)=1.91199 s t(norm)=0.151952, mflops=32.9052 (err=3.7e-15) 9. CWP (min N) (N=4290): elapsed time t=1.02976 s, 512 iters, t-(init.)=1.0076 s t(norm)=0.0400385, mflops=124.88 10. CWP (best N) (N=4368): elapsed time t=1.8266 s, 1024 iters, t-(init.)=1.7817 s t(norm)=0.0353992, mflops=141.246 11. Edelblute: elapsed time t=1.03757 s, 128 iters, t-(init.)=1.03228 s t(norm)=0.164077, mflops=30.4735 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.62691 s, 1024 iters, t-(init.)=1.58475 s t(norm)=0.0314861, mflops=158.8 (err=3.8e-15) 13. FFTPACK (f2c): elapsed time t=1.33127 s, 512 iters, t-(init.)=1.31014 s t(norm)=0.0520603, mflops=96.0424 (err=3.8e-15) FFTW_MEASURE plan: (cost = 1.106007e-03) FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.11953 s, 1024 iters, t-(init.)=1.07738 s t(norm)=0.0214057, mflops=233.583 (err=3.8e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.07098 s, 1024 iters, t-(init.)=1.02882 s t(norm)=0.0204409, mflops=244.608 (err=3.8e-15) 16. 
Frigo-old: elapsed time t=1.00208 s, 512 iters, t-(init.)=0.981008 s t(norm)=0.0389817, mflops=128.265 (err=3.8e-15) 17. Green: elapsed time t=1.03072 s, 512 iters, t-(init.)=1.00966 s t(norm)=0.0401205, mflops=124.625 (err=3.8e-15) 18. GSL: elapsed time t=1.7551 s, 512 iters, t-(init.)=1.73382 s t(norm)=0.0688956, mflops=72.5735 (err=3.8e-15) 19. GSL DIT: elapsed time t=1.03482 s, 256 iters, t-(init.)=1.0243 s t(norm)=0.0814037, mflops=61.4223 (err=4.1e-15) 20. GSL DIF: elapsed time t=1.73744 s, 512 iters, t-(init.)=1.71635 s t(norm)=0.0682018, mflops=73.3118 (err=4.3e-15) 21. Krukar: elapsed time t=1.93466 s, 512 iters, t-(init.)=1.91357 s t(norm)=0.0760385, mflops=65.7562 (err=3.8e-15) 22. Mayer (Buneman): elapsed time t=1.62425 s, 512 iters, t-(init.)=1.60319 s t(norm)=0.063705, mflops=78.4868 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.37949 s, 512 iters, t-(init.)=1.35839 s t(norm)=0.0539777, mflops=92.6309 24. Mayer (lookup): elapsed time t=1.39662 s, 512 iters, t-(init.)=1.37556 s t(norm)=0.0546598, mflops=91.4749 (err=3.7e-15) 25. Monro: elapsed time t=1.24422 s, 512 iters, t-(init.)=1.22312 s t(norm)=0.0486025, mflops=102.875 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.01247 s, 128 iters, t-(init.)=1.00719 s t(norm)=0.160089, mflops=31.2326 (err=4.9e-14) 27. Ooura (C): elapsed time t=1.60305 s, 1024 iters, t-(init.)=1.56093 s t(norm)=0.0310129, mflops=161.223 (err=3.9e-15) 28. Ooura (F): elapsed time t=1.67747 s, 1024 iters, t-(init.)=1.63509 s t(norm)=0.0324864, mflops=153.91 (err=3.9e-15) 29. Ransom: elapsed time t=1.67995 s, 512 iters, t-(init.)=1.65888 s t(norm)=0.0659181, mflops=75.8517 (err=4.4e-15) 30. SCIPORT: elapsed time t=1.29789 s, 512 iters, t-(init.)=1.27682 s t(norm)=0.0507363, mflops=98.5488 (err=3.8e-15) 31. Singleton: elapsed time t=1.86454 s, 1024 iters, t-(init.)=1.82236 s t(norm)=0.036207, mflops=138.095 (err=5.8e-15) 32. 
Singleton (f2c): elapsed time t=1.787 s, 1024 iters, t-(init.)=1.74488 s t(norm)=0.0346676, mflops=144.227 (err=5.8e-15) 33. Sorensen: elapsed time t=1.80582 s, 1024 iters, t-(init.)=1.76371 s t(norm)=0.0350417, mflops=142.687 (err=3.7e-15) 34. Sorensen DIT: elapsed time t=1.05982 s, 128 iters, t-(init.)=1.05453 s t(norm)=0.167613, mflops=29.8307 (err=3.7e-15) 35. Temperton: elapsed time t=1.29776 s, 512 iters, t-(init.)=1.27644 s t(norm)=0.050721, mflops=98.5785 (err=1.2e-07) 36. Temperton (f2c): elapsed time t=1.89819 s, 512 iters, t-(init.)=1.87709 s t(norm)=0.0745888, mflops=67.0342 (err=3.8e-15) 37. Valkenburg: elapsed time t=1.11792 s, 32 iters, t-(init.)=1.1166 s t(norm)=0.709917, mflops=7.04307 (err=4.0e-15) 38. SUNPERF: elapsed time t=1.80196 s, 1024 iters, t-(init.)=1.7598 s t(norm)=0.0349641, mflops=143.004 (err=3.8e-15) Top mflops for N=4096 = 244.608 Normalized results and averages for N=4096: fft 0: mflops = 80.2041 (norm. = 0.327889), norm. avg. (of 12) = 0.301548 fft 1: mflops = 82.8946 (norm. = 0.338888), norm. avg. (of 12) = 0.305609 fft 2: mflops = 54.8102 (norm. = 0.224074), norm. avg. (of 12) = 0.158572 fft 3: mflops = 36.044 (norm. = 0.147354), norm. avg. (of 12) = 0.0788978 fft 4: mflops = 89.6642 (norm. = 0.366564), norm. avg. (of 12) = 0.242559 fft 5: mflops = 7.83373 (norm. = 0.0320257), norm. avg. (of 12) = 0.0309244 fft 6: mflops = 65.1579 (norm. = 0.266378), norm. avg. (of 12) = 0.193645 fft 7: mflops = 94.6331 (norm. = 0.386877), norm. avg. (of 12) = 0.223026 fft 8: mflops = 32.9052 (norm. = 0.134522), norm. avg. (of 12) = 0.114594 fft 9: mflops = 124.88 (norm. = 0.510531), norm. avg. (of 12) = 0.335472 fft 10: mflops = 141.246 (norm. = 0.57744), norm. avg. (of 12) = 0.406886 fft 11: mflops = 30.4735 (norm. = 0.124581), norm. avg. (of 11) = 0.0826683 fft 12: mflops = 158.8 (norm. = 0.649205), norm. avg. (of 12) = 0.479314 fft 13: mflops = 96.0424 (norm. = 0.392639), norm. avg. (of 12) = 0.295121 fft 14: mflops = 233.583 (norm. 
= 0.954931), norm. avg. (of 12) = 0.883321 fft 15: mflops = 244.608 (norm. = 1), norm. avg. (of 12) = 0.819029 fft 16: mflops = 128.265 (norm. = 0.524371), norm. avg. (of 12) = 0.746109 fft 17: mflops = 124.625 (norm. = 0.509489), norm. avg. (of 10) = 0.523085 fft 18: mflops = 72.5735 (norm. = 0.296694), norm. avg. (of 12) = 0.215767 fft 19: mflops = 61.4223 (norm. = 0.251105), norm. avg. (of 12) = 0.171874 fft 20: mflops = 73.3118 (norm. = 0.299712), norm. avg. (of 12) = 0.196444 fft 21: mflops = 65.7562 (norm. = 0.268823), norm. avg. (of 12) = 0.383023 fft 22: mflops = 78.4868 (norm. = 0.320868), norm. avg. (of 11) = 0.269496 fft 23: mflops = 92.6309 (norm. = 0.378692), norm. avg. (of 11) = 0.330507 fft 24: mflops = 91.4749 (norm. = 0.373966), norm. avg. (of 11) = 0.324448 fft 25: mflops = 102.875 (norm. = 0.420574), norm. avg. (of 11) = 0.281559 fft 26: mflops = 31.2326 (norm. = 0.127685), norm. avg. (of 12) = 0.0885117 fft 27: mflops = 161.223 (norm. = 0.65911), norm. avg. (of 12) = 0.485982 fft 28: mflops = 153.91 (norm. = 0.629214), norm. avg. (of 12) = 0.475023 fft 29: mflops = 75.8517 (norm. = 0.310095), norm. avg. (of 11) = 0.139619 fft 30: mflops = 98.5488 (norm. = 0.402885), norm. avg. (of 11) = 0.341212 fft 31: mflops = 138.095 (norm. = 0.564558), norm. avg. (of 12) = 0.293901 fft 32: mflops = 144.227 (norm. = 0.589625), norm. avg. (of 12) = 0.347274 fft 33: mflops = 142.687 (norm. = 0.583331), norm. avg. (of 12) = 0.468519 fft 34: mflops = 29.8307 (norm. = 0.121953), norm. avg. (of 12) = 0.118929 fft 35: mflops = 98.5785 (norm. = 0.403007), norm. avg. (of 12) = 0.228205 fft 36: mflops = 67.0342 (norm. = 0.274048), norm. avg. (of 12) = 0.163841 fft 37: mflops = 7.04307 (norm. = 0.0287934), norm. avg. (of 12) = 0.0270544 fft 38: mflops = 143.004 (norm. = 0.584626), norm. avg. (of 12) = 0.466408 Benchmarking for array size = 8192 (power of 2): 0. 
Arndt DIF: elapsed time t=1.65544 s, 256 iters, t-(init.)=1.63429 s t(norm)=0.0599453, mflops=83.4094 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.5554 s, 256 iters, t-(init.)=1.53435 s t(norm)=0.0562797, mflops=88.842 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.22132 s, 128 iters, t-(init.)=1.21079 s t(norm)=0.088823, mflops=56.2917 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.87249 s, 128 iters, t-(init.)=1.86197 s t(norm)=0.136593, mflops=36.605 (err=3.7e-15) 4. Bailey: elapsed time t=1.42826 s, 256 iters, t-(init.)=1.40722 s t(norm)=0.0516164, mflops=96.8684 (err=3.7e-15) 5. Beauregard: elapsed time t=1.09591 s, 16 iters, t-(init.)=1.09457 s t(norm)=0.642379, mflops=7.78357 (err=3.7e-15) 6. Bergland: elapsed time t=1.1286 s, 128 iters, t-(init.)=1.11802 s t(norm)=0.0820176, mflops=60.9625 (err=3.7e-15) 7. Brenner: elapsed time t=1.45519 s, 256 iters, t-(init.)=1.43414 s t(norm)=0.0526038, mflops=95.0501 (err=3.7e-15) 8. Burrus: elapsed time t=1.96547 s, 128 iters, t-(init.)=1.95492 s t(norm)=0.143412, mflops=34.8646 (err=3.7e-15) 9. CWP (min N) (N=8580): elapsed time t=1.04227 s, 256 iters, t-(init.)=1.02025 s t(norm)=0.0374224, mflops=133.61 10. CWP (best N) (N=9240): elapsed time t=1.83036 s, 512 iters, t-(init.)=1.78285 s t(norm)=0.0326972, mflops=152.918 11. Edelblute: elapsed time t=1.07199 s, 64 iters, t-(init.)=1.06669 s t(norm)=0.156504, mflops=31.9481 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.78805 s, 512 iters, t-(init.)=1.74592 s t(norm)=0.03202, mflops=156.153 (err=3.7e-15) 13. FFTPACK (f2c): elapsed time t=1.62349 s, 256 iters, t-(init.)=1.60245 s t(norm)=0.0587776, mflops=85.0664 (err=3.7e-15) FFTW_MEASURE plan: (cost = 2.453232e-03) FFTW_TWIDDLE 64 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.33094 s, 512 iters, t-(init.)=1.28877 s t(norm)=0.0236359, mflops=211.543 (err=3.7e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.10698 s, 512 iters, t-(init.)=1.06483 s t(norm)=0.0195288, mflops=256.032 (err=3.7e-15) 16. Frigo-old: elapsed time t=1.09844 s, 256 iters, t-(init.)=1.07732 s t(norm)=0.0395158, mflops=126.532 (err=3.7e-15) 17. Green: elapsed time t=1.03174 s, 256 iters, t-(init.)=1.01069 s t(norm)=0.037072, mflops=134.873 (err=3.7e-15) 18. GSL: elapsed time t=1.01369 s, 128 iters, t-(init.)=1.00316 s t(norm)=0.0735912, mflops=67.9429 (err=3.7e-15) 19. GSL DIT: elapsed time t=1.11927 s, 128 iters, t-(init.)=1.10874 s t(norm)=0.0813369, mflops=61.4727 (err=4.3e-15) 20. GSL DIF: elapsed time t=1.8545 s, 256 iters, t-(init.)=1.83346 s t(norm)=0.0672509, mflops=74.3485 (err=4.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.70983 s, 256 iters, t-(init.)=1.68876 s t(norm)=0.0619432, mflops=80.7191 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.47623 s, 256 iters, t-(init.)=1.45518 s t(norm)=0.0533756, mflops=93.6758 24. Mayer (lookup): elapsed time t=1.47988 s, 256 iters, t-(init.)=1.45885 s t(norm)=0.0535102, mflops=93.4401 (err=3.7e-15) 25. Monro: elapsed time t=1.28115 s, 256 iters, t-(init.)=1.26009 s t(norm)=0.0462199, mflops=108.178 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.12678 s, 64 iters, t-(init.)=1.1215 s t(norm)=0.164546, mflops=30.3867 (err=4.5e-14) 27. Ooura (C): elapsed time t=1.92726 s, 512 iters, t-(init.)=1.88518 s t(norm)=0.034574, mflops=144.617 (err=3.7e-15) 28. Ooura (F): elapsed time t=1.93454 s, 512 iters, t-(init.)=1.89242 s t(norm)=0.0347068, mflops=144.064 (err=3.7e-15) 29. Ransom: elapsed time t=1.09665 s, 128 iters, t-(init.)=1.08612 s t(norm)=0.0796775, mflops=62.753 (err=4.9e-15) 30. SCIPORT: elapsed time t=1.52843 s, 256 iters, t-(init.)=1.50724 s t(norm)=0.0552853, mflops=90.4399 (err=3.7e-15) 31. Singleton: elapsed time t=1.00664 s, 256 iters, t-(init.)=0.985269 s t(norm)=0.0361394, mflops=138.353 (err=5.6e-15) 32. 
Singleton (f2c): elapsed time t=1.03938 s, 256 iters, t-(init.)=1.0183 s t(norm)=0.037351, mflops=133.865 (err=5.6e-15) 33. Sorensen: elapsed time t=1.98351 s, 512 iters, t-(init.)=1.94146 s t(norm)=0.0356062, mflops=140.425 (err=3.7e-15) 34. Sorensen DIT: elapsed time t=1.07503 s, 64 iters, t-(init.)=1.06975 s t(norm)=0.156953, mflops=31.8566 (err=3.7e-15) 35. Temperton: elapsed time t=1.52033 s, 256 iters, t-(init.)=1.4993 s t(norm)=0.0549939, mflops=90.9193 (err=1.4e-07) 36. Temperton (f2c): elapsed time t=1.12065 s, 128 iters, t-(init.)=1.11012 s t(norm)=0.081438, mflops=61.3964 (err=3.7e-15) 37. Valkenburg: elapsed time t=1.21644 s, 16 iters, t-(init.)=1.2151 s t(norm)=0.713113, mflops=7.01151 (err=3.8e-15) 38. SUNPERF: elapsed time t=1.05276 s, 256 iters, t-(init.)=1.03172 s t(norm)=0.0378432, mflops=132.124 (err=3.7e-15) Top mflops for N=8192 = 256.032 Normalized results and averages for N=8192: fft 0: mflops = 83.4094 (norm. = 0.325777), norm. avg. (of 13) = 0.303412 fft 1: mflops = 88.842 (norm. = 0.346995), norm. avg. (of 13) = 0.308793 fft 2: mflops = 56.2917 (norm. = 0.219862), norm. avg. (of 13) = 0.163286 fft 3: mflops = 36.605 (norm. = 0.14297), norm. avg. (of 13) = 0.0838265 fft 4: mflops = 96.8684 (norm. = 0.378345), norm. avg. (of 13) = 0.253004 fft 5: mflops = 7.78357 (norm. = 0.0304007), norm. avg. (of 13) = 0.0308841 fft 6: mflops = 60.9625 (norm. = 0.238105), norm. avg. (of 13) = 0.197065 fft 7: mflops = 95.0501 (norm. = 0.371243), norm. avg. (of 13) = 0.234428 fft 8: mflops = 34.8646 (norm. = 0.136173), norm. avg. (of 13) = 0.116254 fft 9: mflops = 133.61 (norm. = 0.521848), norm. avg. (of 13) = 0.349809 fft 10: mflops = 152.918 (norm. = 0.597262), norm. avg. (of 13) = 0.42153 fft 11: mflops = 31.9481 (norm. = 0.124782), norm. avg. (of 12) = 0.0861777 fft 12: mflops = 156.153 (norm. = 0.609894), norm. avg. (of 13) = 0.489359 fft 13: mflops = 85.0664 (norm. = 0.332249), norm. avg. (of 13) = 0.297977 fft 14: mflops = 211.543 (norm. 
= 0.826235), norm. avg. (of 13) = 0.87893 fft 15: mflops = 256.032 (norm. = 1), norm. avg. (of 13) = 0.83295 fft 16: mflops = 126.532 (norm. = 0.494202), norm. avg. (of 13) = 0.726732 fft 17: mflops = 134.873 (norm. = 0.526781), norm. avg. (of 11) = 0.523421 fft 18: mflops = 67.9429 (norm. = 0.265369), norm. avg. (of 13) = 0.219583 fft 19: mflops = 61.4727 (norm. = 0.240097), norm. avg. (of 13) = 0.177122 fft 20: mflops = 74.3485 (norm. = 0.290387), norm. avg. (of 13) = 0.203671 fft 21: mflops = -1 (norm. = -0.00390576), norm. avg. (of 12) = 0.383023 fft 22: mflops = 80.7191 (norm. = 0.315269), norm. avg. (of 12) = 0.27331 fft 23: mflops = 93.6758 (norm. = 0.365875), norm. avg. (of 12) = 0.333454 fft 24: mflops = 93.4401 (norm. = 0.364955), norm. avg. (of 12) = 0.327824 fft 25: mflops = 108.178 (norm. = 0.422519), norm. avg. (of 12) = 0.293305 fft 26: mflops = 30.3867 (norm. = 0.118683), norm. avg. (of 13) = 0.0908326 fft 27: mflops = 144.617 (norm. = 0.564841), norm. avg. (of 13) = 0.492048 fft 28: mflops = 144.064 (norm. = 0.562679), norm. avg. (of 13) = 0.481766 fft 29: mflops = 62.753 (norm. = 0.245098), norm. avg. (of 12) = 0.148409 fft 30: mflops = 90.4399 (norm. = 0.353236), norm. avg. (of 12) = 0.342214 fft 31: mflops = 138.353 (norm. = 0.540373), norm. avg. (of 13) = 0.31286 fft 32: mflops = 133.865 (norm. = 0.522845), norm. avg. (of 13) = 0.360779 fft 33: mflops = 140.425 (norm. = 0.548466), norm. avg. (of 13) = 0.474668 fft 34: mflops = 31.8566 (norm. = 0.124424), norm. avg. (of 13) = 0.119352 fft 35: mflops = 90.9193 (norm. = 0.355109), norm. avg. (of 13) = 0.237967 fft 36: mflops = 61.3964 (norm. = 0.2398), norm. avg. (of 13) = 0.169684 fft 37: mflops = 7.01151 (norm. = 0.0273853), norm. avg. (of 13) = 0.0270798 fft 38: mflops = 132.124 (norm. = 0.516045), norm. avg. (of 13) = 0.470227 Benchmarking for array size = 16384 (power of 2): 0. 
Arndt DIF: elapsed time t=1.82448 s, 128 iters, t-(init.)=1.80305 s t(norm)=0.0614115, mflops=81.418 (err=6.8e-15) 1. Arndt DIT: elapsed time t=1.71867 s, 128 iters, t-(init.)=1.69763 s t(norm)=0.057821, mflops=86.4737 (err=6.8e-15) 2. Arndt Split-Radix: elapsed time t=1.29599 s, 64 iters, t-(init.)=1.28546 s t(norm)=0.0875651, mflops=57.1004 (err=6.8e-15) 3. Arndt 4-step: elapsed time t=1.80694 s, 64 iters, t-(init.)=1.7964 s t(norm)=0.12237, mflops=40.8597 (err=6.8e-15) 4. Bailey: elapsed time t=1.73691 s, 128 iters, t-(init.)=1.71582 s t(norm)=0.0584406, mflops=85.5569 (err=6.8e-15) 5. Beauregard: elapsed time t=1.1862 s, 8 iters, t-(init.)=1.18488 s t(norm)=0.645709, mflops=7.74342 (err=6.8e-15) 6. Bergland: elapsed time t=1.19854 s, 64 iters, t-(init.)=1.18797 s t(norm)=0.0809239, mflops=61.7864 (err=6.8e-15) 7. Brenner: elapsed time t=1.48057 s, 128 iters, t-(init.)=1.45954 s t(norm)=0.0497116, mflops=100.58 (err=6.8e-15) 8. Burrus: elapsed time t=2.00327 s, 64 iters, t-(init.)=1.99275 s t(norm)=0.135745, mflops=36.8337 (err=6.8e-15) 9. CWP (min N) (N=17160): elapsed time t=1.09193 s, 128 iters, t-(init.)=1.06987 s t(norm)=0.0364397, mflops=137.213 10. CWP (best N) (N=17160): elapsed time t=1.09178 s, 128 iters, t-(init.)=1.06938 s t(norm)=0.0364229, mflops=137.276 11. Edelblute: elapsed time t=1.11486 s, 32 iters, t-(init.)=1.1096 s t(norm)=0.151171, mflops=33.0751 (err=6.8e-15) 12. FFTPACK: elapsed time t=1.11287 s, 128 iters, t-(init.)=1.09179 s t(norm)=0.0371863, mflops=134.458 (err=6.8e-15) 13. FFTPACK (f2c): elapsed time t=1.77236 s, 128 iters, t-(init.)=1.75125 s t(norm)=0.0596471, mflops=83.8263 (err=6.8e-15) FFTW_MEASURE plan: (cost = 6.107588e-03) FFTW_TWIDDLE 64 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_NOTW 16 14. FFTW: elapsed time t=1.61409 s, 256 iters, t-(init.)=1.57191 s t(norm)=0.0267694, mflops=186.781 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.38584 s, 256 iters, t-(init.)=1.34358 s t(norm)=0.022881, mflops=218.522 (err=6.8e-15) 16. Frigo-old: elapsed time t=1.32369 s, 128 iters, t-(init.)=1.30248 s t(norm)=0.0443621, mflops=112.709 (err=6.8e-15) 17. Green: elapsed time t=1.07947 s, 128 iters, t-(init.)=1.05842 s t(norm)=0.0360496, mflops=138.698 (err=6.8e-15) 18. GSL: elapsed time t=1.06695 s, 64 iters, t-(init.)=1.05605 s t(norm)=0.071938, mflops=69.5043 (err=6.8e-15) 19. GSL DIT: elapsed time t=1.22297 s, 64 iters, t-(init.)=1.21245 s t(norm)=0.0825913, mflops=60.5391 (err=7.2e-15) 20. GSL DIF: elapsed time t=2.03963 s, 128 iters, t-(init.)=2.01845 s t(norm)=0.0687481, mflops=72.7293 (err=7.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.88066 s, 128 iters, t-(init.)=1.85961 s t(norm)=0.0633381, mflops=78.9414 (err=6.8e-15) 23. Mayer (simple): elapsed time t=1.5844 s, 128 iters, t-(init.)=1.56335 s t(norm)=0.0532475, mflops=93.901 24. Mayer (lookup): elapsed time t=1.58011 s, 128 iters, t-(init.)=1.55908 s t(norm)=0.053102, mflops=94.1584 (err=6.8e-15) 25. Monro: elapsed time t=1.35132 s, 128 iters, t-(init.)=1.33025 s t(norm)=0.045308, mflops=110.356 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.21861 s, 32 iters, t-(init.)=1.21328 s t(norm)=0.165297, mflops=30.2486 (err=2.3e-13) 27. Ooura (C): elapsed time t=1.8631 s, 256 iters, t-(init.)=1.82102 s t(norm)=0.0310117, mflops=161.229 (err=6.8e-15) 28. Ooura (F): elapsed time t=1.91251 s, 256 iters, t-(init.)=1.87037 s t(norm)=0.0318521, mflops=156.975 (err=6.8e-15) 29. Ransom: elapsed time t=1.76594 s, 128 iters, t-(init.)=1.7449 s t(norm)=0.0594311, mflops=84.131 (err=7.4e-15) 30. SCIPORT: elapsed time t=1.21101 s, 64 iters, t-(init.)=1.20043 s t(norm)=0.0817726, mflops=61.1452 (err=6.8e-15) 31. Singleton: elapsed time t=1.00482 s, 128 iters, t-(init.)=0.983755 s t(norm)=0.0335065, mflops=149.225 (err=1.0e-14) 32. 
Singleton (f2c): elapsed time t=1.06208 s, 128 iters, t-(init.)=1.04105 s t(norm)=0.035458, mflops=141.012 (err=1.0e-14) 33. Sorensen: elapsed time t=1.0623 s, 128 iters, t-(init.)=1.04127 s t(norm)=0.0354654, mflops=140.983 (err=6.8e-15) 34. Sorensen DIT: elapsed time t=1.09584 s, 32 iters, t-(init.)=1.09051 s t(norm)=0.14857, mflops=33.6543 (err=6.8e-15) 35. Temperton: elapsed time t=1.50656 s, 128 iters, t-(init.)=1.48553 s t(norm)=0.0505967, mflops=98.8206 (err=1.5e-07) 36. Temperton (f2c): elapsed time t=1.14126 s, 64 iters, t-(init.)=1.13074 s t(norm)=0.0770253, mflops=64.9137 (err=6.8e-15) 37. Valkenburg: elapsed time t=1.33804 s, 8 iters, t-(init.)=1.33667 s t(norm)=0.728426, mflops=6.86412 (err=6.9e-15) 38. SUNPERF: elapsed time t=1.18329 s, 128 iters, t-(init.)=1.16214 s t(norm)=0.0395822, mflops=126.319 (err=6.8e-15) Top mflops for N=16384 = 218.522 Normalized results and averages for N=16384: fft 0: mflops = 81.418 (norm. = 0.372584), norm. avg. (of 14) = 0.308353 fft 1: mflops = 86.4737 (norm. = 0.39572), norm. avg. (of 14) = 0.315002 fft 2: mflops = 57.1004 (norm. = 0.261302), norm. avg. (of 14) = 0.170288 fft 3: mflops = 40.8597 (norm. = 0.186982), norm. avg. (of 14) = 0.0911948 fft 4: mflops = 85.5569 (norm. = 0.391525), norm. avg. (of 14) = 0.262898 fft 5: mflops = 7.74342 (norm. = 0.0354354), norm. avg. (of 14) = 0.0312092 fft 6: mflops = 61.7864 (norm. = 0.282746), norm. avg. (of 14) = 0.203185 fft 7: mflops = 100.58 (norm. = 0.460274), norm. avg. (of 14) = 0.250559 fft 8: mflops = 36.8337 (norm. = 0.168558), norm. avg. (of 14) = 0.11999 fft 9: mflops = 137.213 (norm. = 0.627913), norm. avg. (of 14) = 0.369673 fft 10: mflops = 137.276 (norm. = 0.628203), norm. avg. (of 14) = 0.436293 fft 11: mflops = 33.0751 (norm. = 0.151358), norm. avg. (of 13) = 0.0911916 fft 12: mflops = 134.458 (norm. = 0.615306), norm. avg. (of 14) = 0.498355 fft 13: mflops = 83.8263 (norm. = 0.383605), norm. avg. (of 14) = 0.304093 fft 14: mflops = 186.781 (norm. 
= 0.854743), norm. avg. (of 14) = 0.877202 fft 15: mflops = 218.522 (norm. = 1), norm. avg. (of 14) = 0.844882 fft 16: mflops = 112.709 (norm. = 0.515777), norm. avg. (of 14) = 0.711664 fft 17: mflops = 138.698 (norm. = 0.634708), norm. avg. (of 12) = 0.532695 fft 18: mflops = 69.5043 (norm. = 0.318065), norm. avg. (of 14) = 0.226617 fft 19: mflops = 60.5391 (norm. = 0.277038), norm. avg. (of 14) = 0.184258 fft 20: mflops = 72.7293 (norm. = 0.332823), norm. avg. (of 14) = 0.212896 fft 21: mflops = -1 (norm. = -0.00457619), norm. avg. (of 12) = 0.383023 fft 22: mflops = 78.9414 (norm. = 0.361251), norm. avg. (of 13) = 0.280075 fft 23: mflops = 93.901 (norm. = 0.429709), norm. avg. (of 13) = 0.340858 fft 24: mflops = 94.1584 (norm. = 0.430887), norm. avg. (of 13) = 0.335752 fft 25: mflops = 110.356 (norm. = 0.505009), norm. avg. (of 13) = 0.30959 fft 26: mflops = 30.2486 (norm. = 0.138423), norm. avg. (of 14) = 0.0942319 fft 27: mflops = 161.229 (norm. = 0.737816), norm. avg. (of 14) = 0.509603 fft 28: mflops = 156.975 (norm. = 0.718349), norm. avg. (of 14) = 0.498665 fft 29: mflops = 84.131 (norm. = 0.385), norm. avg. (of 13) = 0.166609 fft 30: mflops = 61.1452 (norm. = 0.279812), norm. avg. (of 13) = 0.337414 fft 31: mflops = 149.225 (norm. = 0.682881), norm. avg. (of 14) = 0.33929 fft 32: mflops = 141.012 (norm. = 0.645296), norm. avg. (of 14) = 0.381102 fft 33: mflops = 140.983 (norm. = 0.645163), norm. avg. (of 14) = 0.486847 fft 34: mflops = 33.6543 (norm. = 0.154008), norm. avg. (of 14) = 0.121827 fft 35: mflops = 98.8206 (norm. = 0.452222), norm. avg. (of 14) = 0.253271 fft 36: mflops = 64.9137 (norm. = 0.297058), norm. avg. (of 14) = 0.178782 fft 37: mflops = 6.86412 (norm. = 0.0314115), norm. avg. (of 14) = 0.0273893 fft 38: mflops = 126.319 (norm. = 0.578062), norm. avg. (of 14) = 0.477929 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.27957 s, 32 iters, t-(init.)=1.26426 s t(norm)=0.0803794, mflops=62.205 (err=1.4e-14) 1. 
Arndt DIT: elapsed time t=1.20413 s, 32 iters, t-(init.)=1.1887 s t(norm)=0.0755754, mflops=66.1591 (err=1.4e-14) 2. Arndt Split-Radix: elapsed time t=1.70133 s, 32 iters, t-(init.)=1.68604 s t(norm)=0.107196, mflops=46.6437 (err=1.4e-14) 3. Arndt 4-step: elapsed time t=1.06739 s, 16 iters, t-(init.)=1.05965 s t(norm)=0.134741, mflops=37.1081 (err=1.4e-14) 4. Bailey: elapsed time t=1.24758 s, 32 iters, t-(init.)=1.23218 s t(norm)=0.0783397, mflops=63.8246 (err=1.4e-14) 5. Beauregard: elapsed time t=1.29453 s, 4 iters, t-(init.)=1.29262 s t(norm)=0.657461, mflops=7.60502 (err=1.4e-14) 6. Bergland: elapsed time t=1.46041 s, 32 iters, t-(init.)=1.44512 s t(norm)=0.091878, mflops=54.42 (err=1.4e-14) 7. Brenner: elapsed time t=1.05135 s, 32 iters, t-(init.)=1.03602 s t(norm)=0.0658686, mflops=75.9087 (err=1.4e-14) 8. Burrus: elapsed time t=1.19733 s, 16 iters, t-(init.)=1.18971 s t(norm)=0.15128, mflops=33.0514 (err=1.4e-14) 9. CWP (min N) (N=34320): elapsed time t=1.26133 s, 64 iters, t-(init.)=1.22853 s t(norm)=0.0390541, mflops=128.028 10. CWP (best N) (N=34320): elapsed time t=1.25887 s, 64 iters, t-(init.)=1.22602 s t(norm)=0.0389743, mflops=128.29 11. Edelblute: elapsed time t=1.33089 s, 16 iters, t-(init.)=1.32329 s t(norm)=0.168265, mflops=29.7151 (err=1.4e-14) 12. FFTPACK: elapsed time t=1.59405 s, 64 iters, t-(init.)=1.56332 s t(norm)=0.0496965, mflops=100.611 (err=1.4e-14) 13. FFTPACK (f2c): elapsed time t=1.18123 s, 32 iters, t-(init.)=1.1658 s t(norm)=0.0741194, mflops=67.4587 (err=1.4e-14) FFTW_MEASURE plan: (cost = 1.780536e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.04535 s, 64 iters, t-(init.)=1.01412 s t(norm)=0.032238, mflops=155.097 (err=1.4e-14) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.9228 s, 128 iters, t-(init.)=1.86085 s t(norm)=0.0295774, mflops=169.048 (err=1.4e-14) 16. 
Frigo-old: elapsed time t=1.76521 s, 64 iters, t-(init.)=1.73423 s t(norm)=0.0551298, mflops=90.695 (err=1.4e-14) 17. Green: elapsed time t=1.54147 s, 64 iters, t-(init.)=1.51082 s t(norm)=0.0480277, mflops=104.106 (err=1.4e-14) 18. GSL: elapsed time t=1.35893 s, 32 iters, t-(init.)=1.34349 s t(norm)=0.0854166, mflops=58.5366 (err=1.4e-14) 19. GSL DIT: elapsed time t=1.73543 s, 32 iters, t-(init.)=1.72014 s t(norm)=0.109364, mflops=45.7191 (err=1.4e-14) 20. GSL DIF: elapsed time t=1.48698 s, 32 iters, t-(init.)=1.47166 s t(norm)=0.0935659, mflops=53.4383 (err=1.4e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.96397 s, 64 iters, t-(init.)=1.93356 s t(norm)=0.0614662, mflops=81.3455 (err=1.4e-14) 23. Mayer (simple): elapsed time t=1.70875 s, 64 iters, t-(init.)=1.67823 s t(norm)=0.0533496, mflops=93.7215 24. Mayer (lookup): elapsed time t=1.78894 s, 64 iters, t-(init.)=1.75846 s t(norm)=0.0558998, mflops=89.4457 (err=1.4e-14) 25. Monro: elapsed time t=1.05113 s, 32 iters, t-(init.)=1.03581 s t(norm)=0.0658552, mflops=75.9242 (err=1.5e-07) 26. NAPACK (f2c): elapsed time t=1.46617 s, 16 iters, t-(init.)=1.45816 s t(norm)=0.185415, mflops=26.9666 (err=5.6e-13) 27. Ooura (C): elapsed time t=1.2109 s, 64 iters, t-(init.)=1.18035 s t(norm)=0.0375224, mflops=133.254 (err=1.4e-14) 28. Ooura (F): elapsed time t=1.23233 s, 64 iters, t-(init.)=1.20173 s t(norm)=0.0382019, mflops=130.884 (err=1.4e-14) 29. Ransom: elapsed time t=1.18936 s, 32 iters, t-(init.)=1.17408 s t(norm)=0.0746463, mflops=66.9826 (err=1.5e-14) 30. SCIPORT: elapsed time t=1.30107 s, 16 iters, t-(init.)=1.29301 s t(norm)=0.164415, mflops=30.4108 (err=1.4e-14) 31. Singleton: elapsed time t=1.88864 s, 64 iters, t-(init.)=1.85803 s t(norm)=0.059065, mflops=84.6525 (err=2.1e-14) 32. Singleton (f2c): elapsed time t=1.9619 s, 64 iters, t-(init.)=1.93135 s t(norm)=0.0613961, mflops=81.4384 (err=2.1e-14) 33. 
Sorensen: elapsed time t=1.40759 s, 64 iters, t-(init.)=1.37708 s t(norm)=0.0437761, mflops=114.218 (err=1.4e-14) 34. Sorensen DIT: elapsed time t=1.29534 s, 16 iters, t-(init.)=1.2876 s t(norm)=0.163727, mflops=30.5386 (err=1.4e-14) 35. Temperton: elapsed time t=1.21823 s, 32 iters, t-(init.)=1.203 s t(norm)=0.0764848, mflops=65.3724 (err=1.5e-07) 36. Temperton (f2c): elapsed time t=1.57925 s, 32 iters, t-(init.)=1.56399 s t(norm)=0.0994357, mflops=50.2838 (err=1.4e-14) 37. Valkenburg: elapsed time t=1.47473 s, 4 iters, t-(init.)=1.47247 s t(norm)=0.748937, mflops=6.67613 (err=1.4e-14) 38. SUNPERF: elapsed time t=1.73578 s, 64 iters, t-(init.)=1.70512 s t(norm)=0.0542044, mflops=92.2434 (err=1.4e-14) Top mflops for N=32768 = 169.048 Normalized results and averages for N=32768: fft 0: mflops = 62.205 (norm. = 0.367972), norm. avg. (of 15) = 0.312327 fft 1: mflops = 66.1591 (norm. = 0.391362), norm. avg. (of 15) = 0.320093 fft 2: mflops = 46.6437 (norm. = 0.275919), norm. avg. (of 15) = 0.17733 fft 3: mflops = 37.1081 (norm. = 0.219512), norm. avg. (of 15) = 0.0997492 fft 4: mflops = 63.8246 (norm. = 0.377553), norm. avg. (of 15) = 0.270542 fft 5: mflops = 7.60502 (norm. = 0.0449873), norm. avg. (of 15) = 0.0321277 fft 6: mflops = 54.42 (norm. = 0.32192), norm. avg. (of 15) = 0.211101 fft 7: mflops = 75.9087 (norm. = 0.449036), norm. avg. (of 15) = 0.263791 fft 8: mflops = 33.0514 (norm. = 0.195515), norm. avg. (of 15) = 0.125025 fft 9: mflops = 128.028 (norm. = 0.757345), norm. avg. (of 15) = 0.395518 fft 10: mflops = 128.29 (norm. = 0.758895), norm. avg. (of 15) = 0.4578 fft 11: mflops = 29.7151 (norm. = 0.175779), norm. avg. (of 14) = 0.0972336 fft 12: mflops = 100.611 (norm. = 0.59516), norm. avg. (of 15) = 0.504809 fft 13: mflops = 67.4587 (norm. = 0.39905), norm. avg. (of 15) = 0.310424 fft 14: mflops = 155.097 (norm. = 0.91747), norm. avg. (of 15) = 0.879887 fft 15: mflops = 169.048 (norm. = 1), norm. avg. (of 15) = 0.855223 fft 16: mflops = 90.695 (norm. 
= 0.536504), norm. avg. (of 15) = 0.699986 fft 17: mflops = 104.106 (norm. = 0.615839), norm. avg. (of 13) = 0.539091 fft 18: mflops = 58.5366 (norm. = 0.346272), norm. avg. (of 15) = 0.234594 fft 19: mflops = 45.7191 (norm. = 0.27045), norm. avg. (of 15) = 0.190005 fft 20: mflops = 53.4383 (norm. = 0.316113), norm. avg. (of 15) = 0.219777 fft 21: mflops = -1 (norm. = -0.00591548), norm. avg. (of 12) = 0.383023 fft 22: mflops = 81.3455 (norm. = 0.481198), norm. avg. (of 14) = 0.294441 fft 23: mflops = 93.7215 (norm. = 0.554407), norm. avg. (of 14) = 0.356112 fft 24: mflops = 89.4457 (norm. = 0.529114), norm. avg. (of 14) = 0.349563 fft 25: mflops = 75.9242 (norm. = 0.449128), norm. avg. (of 14) = 0.319557 fft 26: mflops = 26.9666 (norm. = 0.15952), norm. avg. (of 15) = 0.0985845 fft 27: mflops = 133.254 (norm. = 0.788258), norm. avg. (of 15) = 0.52818 fft 28: mflops = 130.884 (norm. = 0.774239), norm. avg. (of 15) = 0.517036 fft 29: mflops = 66.9826 (norm. = 0.396234), norm. avg. (of 14) = 0.18301 fft 30: mflops = 30.4108 (norm. = 0.179894), norm. avg. (of 14) = 0.326162 fft 31: mflops = 84.6525 (norm. = 0.50076), norm. avg. (of 15) = 0.350055 fft 32: mflops = 81.4384 (norm. = 0.481747), norm. avg. (of 15) = 0.387812 fft 33: mflops = 114.218 (norm. = 0.675651), norm. avg. (of 15) = 0.499434 fft 34: mflops = 30.5386 (norm. = 0.180651), norm. avg. (of 15) = 0.125749 fft 35: mflops = 65.3724 (norm. = 0.386709), norm. avg. (of 15) = 0.262167 fft 36: mflops = 50.2838 (norm. = 0.297452), norm. avg. (of 15) = 0.186693 fft 37: mflops = 6.67613 (norm. = 0.0394925), norm. avg. (of 15) = 0.0281961 fft 38: mflops = 92.2434 (norm. = 0.545664), norm. avg. (of 15) = 0.482445 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.19082 s, 8 iters, t-(init.)=1.17379 s t(norm)=0.139927, mflops=35.733 (err=1.7e-14) 1. Arndt DIT: elapsed time t=1.17084 s, 8 iters, t-(init.)=1.15381 s t(norm)=0.137545, mflops=36.3518 (err=1.7e-14) 2. 
Arndt Split-Radix: elapsed time t=1.62779 s, 8 iters, t-(init.)=1.61088 s t(norm)=0.192032, mflops=26.0374 (err=1.7e-14) 3. Arndt 4-step: elapsed time t=1.22688 s, 8 iters, t-(init.)=1.20929 s t(norm)=0.144158, mflops=34.6841 (err=1.7e-14) 4. Bailey: elapsed time t=1.21595 s, 8 iters, t-(init.)=1.19745 s t(norm)=0.142747, mflops=35.0269 (err=1.7e-14) 5. Beauregard: elapsed time t=1.4106 s, 2 iters, t-(init.)=1.40617 s t(norm)=0.670515, mflops=7.45695 (err=1.7e-14) 6. Bergland: elapsed time t=1.0121 s, 8 iters, t-(init.)=0.994723 s t(norm)=0.11858, mflops=42.1656 (err=1.7e-14) 7. Brenner: elapsed time t=1.711 s, 16 iters, t-(init.)=1.67635 s t(norm)=0.099918, mflops=50.0411 (err=1.7e-14) 8. Burrus: elapsed time t=1.00077 s, 4 iters, t-(init.)=0.992573 s t(norm)=0.236648, mflops=21.1284 (err=1.7e-14) 9. CWP (min N) (N=72072): elapsed time t=1.60366 s, 32 iters, t-(init.)=1.51043 s t(norm)=0.0450144, mflops=111.076 10. CWP (best N) (N=72072): elapsed time t=1.60425 s, 32 iters, t-(init.)=1.51105 s t(norm)=0.0450327, mflops=111.03 11. Edelblute: elapsed time t=1.04406 s, 4 iters, t-(init.)=1.0359 s t(norm)=0.246977, mflops=20.2448 (err=1.7e-14) 12. FFTPACK: elapsed time t=1.14522 s, 16 iters, t-(init.)=1.10972 s t(norm)=0.0661445, mflops=75.592 (err=1.7e-14) 13. FFTPACK (f2c): elapsed time t=1.47507 s, 16 iters, t-(init.)=1.43972 s t(norm)=0.085814, mflops=58.2655 (err=1.7e-14) FFTW_MEASURE plan: (cost = 4.292254e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.41793 s, 32 iters, t-(init.)=1.34591 s t(norm)=0.0401114, mflops=124.653 (err=1.7e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.44461 s, 32 iters, t-(init.)=1.37241 s t(norm)=0.0409011, mflops=122.246 (err=1.7e-14) 16. Frigo-old: elapsed time t=1.17671 s, 16 iters, t-(init.)=1.14053 s t(norm)=0.067981, mflops=73.55 (err=1.7e-14) 17. 
Green: elapsed time t=1.17515 s, 16 iters, t-(init.)=1.14033 s t(norm)=0.0679689, mflops=73.563 (err=1.7e-14) 18. GSL: elapsed time t=1.64085 s, 16 iters, t-(init.)=1.60589 s t(norm)=0.0957187, mflops=52.2364 (err=1.7e-14) 19. GSL DIT: elapsed time t=1.3943 s, 8 iters, t-(init.)=1.37736 s t(norm)=0.164195, mflops=30.4517 (err=1.7e-14) 20. GSL DIF: elapsed time t=1.29696 s, 8 iters, t-(init.)=1.27935 s t(norm)=0.15251, mflops=32.7848 (err=1.8e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.30647 s, 16 iters, t-(init.)=1.2718 s t(norm)=0.0758051, mflops=65.9586 (err=1.7e-14) 23. Mayer (simple): elapsed time t=1.18814 s, 16 iters, t-(init.)=1.15343 s t(norm)=0.0687497, mflops=72.7276 24. Mayer (lookup): elapsed time t=1.29349 s, 16 iters, t-(init.)=1.25886 s t(norm)=0.0750337, mflops=66.6367 (err=1.7e-14) 25. Monro: elapsed time t=1.08738 s, 8 iters, t-(init.)=1.07042 s t(norm)=0.127604, mflops=39.1838 (err=1.6e-07) 26. NAPACK (f2c): elapsed time t=1.94487 s, 8 iters, t-(init.)=1.92636 s t(norm)=0.229639, mflops=21.7733 (err=8.6e-13) 27. Ooura (C): elapsed time t=1.63374 s, 32 iters, t-(init.)=1.56405 s t(norm)=0.0466123, mflops=107.268 (err=1.7e-14) 28. Ooura (F): elapsed time t=1.67101 s, 32 iters, t-(init.)=1.6012 s t(norm)=0.0477196, mflops=104.779 (err=1.7e-14) 29. Ransom: elapsed time t=1.43406 s, 16 iters, t-(init.)=1.39842 s t(norm)=0.0833524, mflops=59.9863 (err=1.7e-14) 30. SCIPORT: elapsed time t=1.08115 s, 4 iters, t-(init.)=1.07193 s t(norm)=0.255568, mflops=19.5643 (err=1.7e-14) 31. Singleton: elapsed time t=1.33011 s, 16 iters, t-(init.)=1.29501 s t(norm)=0.0771885, mflops=64.7765 (err=2.3e-14) 32. Singleton (f2c): elapsed time t=1.34752 s, 16 iters, t-(init.)=1.31238 s t(norm)=0.0782237, mflops=63.9193 (err=2.3e-14) 33. Sorensen: elapsed time t=1.38759 s, 16 iters, t-(init.)=1.35294 s t(norm)=0.0806417, mflops=62.0027 (err=1.7e-14) 34. 
Sorensen DIT: elapsed time t=1.04579 s, 4 iters, t-(init.)=1.03757 s t(norm)=0.247375, mflops=20.2122 (err=1.7e-14) 35. Temperton: elapsed time t=1.68537 s, 16 iters, t-(init.)=1.65081 s t(norm)=0.0983959, mflops=50.8151 (err=1.7e-07) 36. Temperton (f2c): elapsed time t=1.04392 s, 8 iters, t-(init.)=1.02689 s t(norm)=0.122414, mflops=40.8449 (err=1.7e-14) 37. Valkenburg: elapsed time t=1.68757 s, 2 iters, t-(init.)=1.68269 s t(norm)=0.802369, mflops=6.23155 (err=1.7e-14) 38. SUNPERF: elapsed time t=1.17965 s, 16 iters, t-(init.)=1.14419 s t(norm)=0.0681988, mflops=73.3151 (err=1.7e-14) Top mflops for N=65536 = 124.653 Normalized results and averages for N=65536: fft 0: mflops = 35.733 (norm. = 0.28666), norm. avg. (of 16) = 0.310723 fft 1: mflops = 36.3518 (norm. = 0.291624), norm. avg. (of 16) = 0.318314 fft 2: mflops = 26.0374 (norm. = 0.208879), norm. avg. (of 16) = 0.179302 fft 3: mflops = 34.6841 (norm. = 0.278245), norm. avg. (of 16) = 0.110905 fft 4: mflops = 35.0269 (norm. = 0.280996), norm. avg. (of 16) = 0.271195 fft 5: mflops = 7.45695 (norm. = 0.0598217), norm. avg. (of 16) = 0.0338586 fft 6: mflops = 42.1656 (norm. = 0.338264), norm. avg. (of 16) = 0.219048 fft 7: mflops = 50.0411 (norm. = 0.401443), norm. avg. (of 16) = 0.272394 fft 8: mflops = 21.1284 (norm. = 0.169498), norm. avg. (of 16) = 0.127804 fft 9: mflops = 111.076 (norm. = 0.891079), norm. avg. (of 16) = 0.426491 fft 10: mflops = 111.03 (norm. = 0.890716), norm. avg. (of 16) = 0.484857 fft 11: mflops = 20.2448 (norm. = 0.162409), norm. avg. (of 15) = 0.101579 fft 12: mflops = 75.592 (norm. = 0.60642), norm. avg. (of 16) = 0.51116 fft 13: mflops = 58.2655 (norm. = 0.467422), norm. avg. (of 16) = 0.320236 fft 14: mflops = 124.653 (norm. = 1), norm. avg. (of 16) = 0.887394 fft 15: mflops = 122.246 (norm. = 0.980693), norm. avg. (of 16) = 0.863065 fft 16: mflops = 73.55 (norm. = 0.590038), norm. avg. (of 16) = 0.693115 fft 17: mflops = 73.563 (norm. = 0.590143), norm. avg. 
(of 14) = 0.542737 fft 18: mflops = 52.2364 (norm. = 0.419055), norm. avg. (of 16) = 0.246123 fft 19: mflops = 30.4517 (norm. = 0.244292), norm. avg. (of 16) = 0.193397 fft 20: mflops = 32.7848 (norm. = 0.263008), norm. avg. (of 16) = 0.222479 fft 21: mflops = -1 (norm. = -0.00802228), norm. avg. (of 12) = 0.383023 fft 22: mflops = 65.9586 (norm. = 0.529138), norm. avg. (of 15) = 0.310087 fft 23: mflops = 72.7276 (norm. = 0.583441), norm. avg. (of 15) = 0.371267 fft 24: mflops = 66.6367 (norm. = 0.534578), norm. avg. (of 15) = 0.361898 fft 25: mflops = 39.1838 (norm. = 0.314343), norm. avg. (of 15) = 0.31921 fft 26: mflops = 21.7733 (norm. = 0.174671), norm. avg. (of 16) = 0.10334 fft 27: mflops = 107.268 (norm. = 0.860533), norm. avg. (of 16) = 0.548952 fft 28: mflops = 104.779 (norm. = 0.840564), norm. avg. (of 16) = 0.537257 fft 29: mflops = 59.9863 (norm. = 0.481227), norm. avg. (of 15) = 0.202891 fft 30: mflops = 19.5643 (norm. = 0.15695), norm. avg. (of 15) = 0.314881 fft 31: mflops = 64.7765 (norm. = 0.519655), norm. avg. (of 16) = 0.360655 fft 32: mflops = 63.9193 (norm. = 0.512778), norm. avg. (of 16) = 0.395622 fft 33: mflops = 62.0027 (norm. = 0.497403), norm. avg. (of 16) = 0.499307 fft 34: mflops = 20.2122 (norm. = 0.162148), norm. avg. (of 16) = 0.128024 fft 35: mflops = 50.8151 (norm. = 0.407653), norm. avg. (of 16) = 0.27126 fft 36: mflops = 40.8449 (norm. = 0.327669), norm. avg. (of 16) = 0.195504 fft 37: mflops = 6.23155 (norm. = 0.0499912), norm. avg. (of 16) = 0.0295583 fft 38: mflops = 73.3151 (norm. = 0.588154), norm. avg. (of 16) = 0.489052 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.77433 s, 4 iters, t-(init.)=1.74683 s t(norm)=0.195989, mflops=25.5117 (err=3.3e-14) 1. Arndt DIT: elapsed time t=1.74559 s, 4 iters, t-(init.)=1.71792 s t(norm)=0.192745, mflops=25.941 (err=3.3e-14) 2. Arndt Split-Radix: elapsed time t=1.19719 s, 2 iters, t-(init.)=1.18402 s t(norm)=0.265688, mflops=18.8191 (err=3.3e-14) 3. 
Arndt 4-step: elapsed time t=1.63272 s, 4 iters, t-(init.)=1.60401 s t(norm)=0.179965, mflops=27.7832 (err=3.3e-14) 4. Bailey: elapsed time t=1.53309 s, 4 iters, t-(init.)=1.50443 s t(norm)=0.168793, mflops=29.6221 (err=3.3e-14) 5. Beauregard: elapsed time t=1.53611 s, 1 iters, t-(init.)=1.5289 s t(norm)=0.686152, mflops=7.28702 (err=3.3e-14) 6. Bergland: elapsed time t=1.29193 s, 4 iters, t-(init.)=1.26402 s t(norm)=0.141819, mflops=35.2562 (err=3.4e-14) 7. Brenner: elapsed time t=1.25397 s, 4 iters, t-(init.)=1.22736 s t(norm)=0.137706, mflops=36.3093 (err=3.3e-14) 8. Burrus: elapsed time t=1.37413 s, 2 iters, t-(init.)=1.36097 s t(norm)=0.305394, mflops=16.3723 (err=3.3e-14) 9. CWP (min N) (N=144144): elapsed time t=1.89846 s, 16 iters, t-(init.)=1.76539 s t(norm)=0.0495178, mflops=100.974 10. CWP (best N) (N=144144): elapsed time t=1.89794 s, 16 iters, t-(init.)=1.76447 s t(norm)=0.0494922, mflops=101.026 11. Edelblute: elapsed time t=1.42733 s, 2 iters, t-(init.)=1.41418 s t(norm)=0.317333, mflops=15.7563 (err=3.3e-14) 12. FFTPACK: elapsed time t=1.66863 s, 8 iters, t-(init.)=1.60986 s t(norm)=0.0903105, mflops=55.3645 (err=3.3e-14) 13. FFTPACK (f2c): elapsed time t=1.04443 s, 4 iters, t-(init.)=1.01449 s t(norm)=0.113823, mflops=43.9279 (err=3.3e-14) FFTW_MEASURE plan: (cost = 1.164606e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_NOTW 32 14. FFTW: elapsed time t=1.81082 s, 16 iters, t-(init.)=1.69338 s t(norm)=0.0474979, mflops=105.268 (err=3.3e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.70848 s, 16 iters, t-(init.)=1.59076 s t(norm)=0.0446195, mflops=112.059 (err=3.3e-14) 16. Frigo-old: elapsed time t=1.53089 s, 8 iters, t-(init.)=1.47231 s t(norm)=0.0825944, mflops=60.5368 (err=3.3e-14) 17. Green: elapsed time t=1.6918 s, 8 iters, t-(init.)=1.63594 s t(norm)=0.0917737, mflops=54.4818 (err=3.3e-14) 18. 
GSL: elapsed time t=1.07645 s, 4 iters, t-(init.)=1.04682 s t(norm)=0.11745, mflops=42.5713 (err=3.3e-14) 19. GSL DIT: elapsed time t=1.97712 s, 4 iters, t-(init.)=1.94966 s t(norm)=0.218746, mflops=22.8576 (err=3.5e-14) 20. GSL DIF: elapsed time t=1.88917 s, 4 iters, t-(init.)=1.86046 s t(norm)=0.208738, mflops=23.9535 (err=3.5e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.14857 s, 4 iters, t-(init.)=1.1211 s t(norm)=0.125784, mflops=39.7508 (err=3.3e-14) 23. Mayer (simple): elapsed time t=1.10004 s, 4 iters, t-(init.)=1.07253 s t(norm)=0.120335, mflops=41.5508 24. Mayer (lookup): elapsed time t=1.1525 s, 4 iters, t-(init.)=1.125 s t(norm)=0.126221, mflops=39.6129 (err=3.3e-14) 25. Monro: elapsed time t=1.71827 s, 4 iters, t-(init.)=1.69075 s t(norm)=0.189697, mflops=26.3578 (err=1.7e-07) 26. NAPACK (f2c): elapsed time t=1.23063 s, 2 iters, t-(init.)=1.21498 s t(norm)=0.272635, mflops=18.3395 (err=2.0e-12) 27. Ooura (C): elapsed time t=1.11698 s, 8 iters, t-(init.)=1.06077 s t(norm)=0.0595078, mflops=84.0226 (err=3.4e-14) 28. Ooura (F): elapsed time t=1.13362 s, 8 iters, t-(init.)=1.07734 s t(norm)=0.0604369, mflops=82.731 (err=3.4e-14) 29. Ransom: elapsed time t=1.90229 s, 8 iters, t-(init.)=1.84526 s t(norm)=0.103516, mflops=48.3016 (err=3.3e-14) 30. SCIPORT: elapsed time t=1.58257 s, 2 iters, t-(init.)=1.56718 s t(norm)=0.351665, mflops=14.2181 (err=3.3e-14) 31. Singleton: elapsed time t=1.08417 s, 4 iters, t-(init.)=1.05542 s t(norm)=0.118415, mflops=42.2244 (err=4.8e-14) 32. Singleton (f2c): elapsed time t=1.09339 s, 4 iters, t-(init.)=1.06471 s t(norm)=0.119458, mflops=41.8559 (err=4.8e-14) 33. Sorensen: elapsed time t=1.08119 s, 4 iters, t-(init.)=1.05392 s t(norm)=0.118247, mflops=42.2845 (err=3.3e-14) 34. Sorensen DIT: elapsed time t=1.43595 s, 2 iters, t-(init.)=1.42363 s t(norm)=0.319453, mflops=15.6517 (err=3.3e-14) 35. 
Temperton: elapsed time t=1.259 s, 4 iters, t-(init.)=1.23253 s t(norm)=0.138286, mflops=36.157 (err=1.9e-07) 36. Temperton (f2c): elapsed time t=1.49234 s, 4 iters, t-(init.)=1.46592 s t(norm)=0.164472, mflops=30.4003 (err=3.3e-14) 37. Valkenburg: elapsed time t=1.91605 s, 1 iters, t-(init.)=1.90756 s t(norm)=0.85609, mflops=5.84051 (err=3.4e-14) 38. SUNPERF: elapsed time t=1.7189 s, 8 iters, t-(init.)=1.65969 s t(norm)=0.0931063, mflops=53.702 (err=3.3e-14) Top mflops for N=131072 = 112.059 Normalized results and averages for N=131072: fft 0: mflops = 25.5117 (norm. = 0.227664), norm. avg. (of 17) = 0.305837 fft 1: mflops = 25.941 (norm. = 0.231495), norm. avg. (of 17) = 0.313207 fft 2: mflops = 18.8191 (norm. = 0.16794), norm. avg. (of 17) = 0.178633 fft 3: mflops = 27.7832 (norm. = 0.247935), norm. avg. (of 17) = 0.118966 fft 4: mflops = 29.6221 (norm. = 0.264345), norm. avg. (of 17) = 0.270792 fft 5: mflops = 7.28702 (norm. = 0.0650287), norm. avg. (of 17) = 0.0356921 fft 6: mflops = 35.2562 (norm. = 0.314623), norm. avg. (of 17) = 0.22467 fft 7: mflops = 36.3093 (norm. = 0.324021), norm. avg. (of 17) = 0.275431 fft 8: mflops = 16.3723 (norm. = 0.146105), norm. avg. (of 17) = 0.128881 fft 9: mflops = 100.974 (norm. = 0.90108), norm. avg. (of 17) = 0.454408 fft 10: mflops = 101.026 (norm. = 0.901547), norm. avg. (of 17) = 0.509368 fft 11: mflops = 15.7563 (norm. = 0.140608), norm. avg. (of 16) = 0.104018 fft 12: mflops = 55.3645 (norm. = 0.494068), norm. avg. (of 17) = 0.510154 fft 13: mflops = 43.9279 (norm. = 0.392009), norm. avg. (of 17) = 0.324458 fft 14: mflops = 105.268 (norm. = 0.9394), norm. avg. (of 17) = 0.890453 fft 15: mflops = 112.059 (norm. = 1), norm. avg. (of 17) = 0.87112 fft 16: mflops = 60.5368 (norm. = 0.540224), norm. avg. (of 17) = 0.684121 fft 17: mflops = 54.4818 (norm. = 0.486191), norm. avg. (of 15) = 0.538967 fft 18: mflops = 42.5713 (norm. = 0.379902), norm. avg. (of 17) = 0.253992 fft 19: mflops = 22.8576 (norm. = 0.203979), norm. 
avg. (of 17) = 0.19402 fft 20: mflops = 23.9535 (norm. = 0.213758), norm. avg. (of 17) = 0.221966 fft 21: mflops = -1 (norm. = -0.00892391), norm. avg. (of 12) = 0.383023 fft 22: mflops = 39.7508 (norm. = 0.354732), norm. avg. (of 16) = 0.312878 fft 23: mflops = 41.5508 (norm. = 0.370796), norm. avg. (of 16) = 0.371238 fft 24: mflops = 39.6129 (norm. = 0.353502), norm. avg. (of 16) = 0.361373 fft 25: mflops = 26.3578 (norm. = 0.235215), norm. avg. (of 16) = 0.31396 fft 26: mflops = 18.3395 (norm. = 0.16366), norm. avg. (of 17) = 0.106888 fft 27: mflops = 84.0226 (norm. = 0.74981), norm. avg. (of 17) = 0.560767 fft 28: mflops = 82.731 (norm. = 0.738283), norm. avg. (of 17) = 0.549082 fft 29: mflops = 48.3016 (norm. = 0.431039), norm. avg. (of 16) = 0.217151 fft 30: mflops = 14.2181 (norm. = 0.126881), norm. avg. (of 16) = 0.303131 fft 31: mflops = 42.2244 (norm. = 0.376807), norm. avg. (of 17) = 0.361605 fft 32: mflops = 41.8559 (norm. = 0.373518), norm. avg. (of 17) = 0.394322 fft 33: mflops = 42.2845 (norm. = 0.377343), norm. avg. (of 17) = 0.492132 fft 34: mflops = 15.6517 (norm. = 0.139675), norm. avg. (of 17) = 0.128709 fft 35: mflops = 36.157 (norm. = 0.322661), norm. avg. (of 17) = 0.274283 fft 36: mflops = 30.4003 (norm. = 0.271289), norm. avg. (of 17) = 0.199962 fft 37: mflops = 5.84051 (norm. = 0.0521201), norm. avg. (of 17) = 0.0308855 fft 38: mflops = 53.702 (norm. = 0.479232), norm. avg. (of 17) = 0.488474 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=1.175 s, 1 iters, t-(init.)=1.15663 s t(norm)=0.245122, mflops=20.398 (err=4.3e-14) 1. Arndt DIT: elapsed time t=1.16202 s, 1 iters, t-(init.)=1.14784 s t(norm)=0.243259, mflops=20.5542 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=1.57945 s, 1 iters, t-(init.)=1.5616 s t(norm)=0.330946, mflops=15.1082 (err=4.3e-14) 3. Arndt 4-step: elapsed time t=1.69298 s, 2 iters, t-(init.)=1.65436 s t(norm)=0.175302, mflops=28.5222 (err=4.3e-14) 4. 
Bailey: elapsed time t=1.91658 s, 2 iters, t-(init.)=1.87677 s t(norm)=0.19887, mflops=25.1421 (err=4.3e-14) 5. Beauregard: elapsed time t=3.35714 s, 1 iters, t-(init.)=3.33762 s t(norm)=0.707333, mflops=7.0688 (err=4.4e-14) 6. Bergland: elapsed time t=1.5216 s, 2 iters, t-(init.)=1.48448 s t(norm)=0.157301, mflops=31.7862 (err=4.4e-14) 7. Brenner: elapsed time t=1.60743 s, 2 iters, t-(init.)=1.57393 s t(norm)=0.16678, mflops=29.9797 (err=4.4e-14) 8. Burrus: elapsed time t=1.75692 s, 1 iters, t-(init.)=1.73905 s t(norm)=0.368553, mflops=13.5666 (err=4.3e-14) 9. CWP (min N) (N=360360): elapsed time t=1.48682 s, 4 iters, t-(init.)=1.37366 s t(norm)=0.0727789, mflops=68.7012 10. CWP (best N) (N=360360): elapsed time t=1.48855 s, 4 iters, t-(init.)=1.3756 s t(norm)=0.0728819, mflops=68.6041 11. Edelblute: elapsed time t=1.79286 s, 1 iters, t-(init.)=1.775 s t(norm)=0.376171, mflops=13.2918 (err=4.3e-14) 12. FFTPACK: elapsed time t=1.9442 s, 4 iters, t-(init.)=1.86559 s t(norm)=0.0988427, mflops=50.5854 (err=4.4e-14) 13. FFTPACK (f2c): elapsed time t=1.16639 s, 2 iters, t-(init.)=1.12656 s t(norm)=0.119375, mflops=41.8848 (err=4.4e-14) FFTW_MEASURE plan: (cost = 2.534169e-01) FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 14. FFTW: elapsed time t=1.03036 s, 4 iters, t-(init.)=0.950795 s t(norm)=0.0503749, mflops=99.2557 (err=4.4e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.05271 s, 4 iters, t-(init.)=0.9737 s t(norm)=0.0515885, mflops=96.9208 (err=4.4e-14) 16. Frigo-old: elapsed time t=1.02247 s, 2 iters, t-(init.)=0.983532 s t(norm)=0.104219, mflops=47.976 (err=4.4e-14) 17. Green: elapsed time t=1.02469 s, 2 iters, t-(init.)=0.989851 s t(norm)=0.104888, mflops=47.6697 (err=4.4e-14) 18. GSL: elapsed time t=1.18965 s, 2 iters, t-(init.)=1.15067 s t(norm)=0.121929, mflops=41.0075 (err=4.4e-14) 19. 
GSL DIT: elapsed time t=1.28409 s, 1 iters, t-(init.)=1.26623 s t(norm)=0.268348, mflops=18.6325 (err=4.6e-14) 20. GSL DIF: elapsed time t=1.24229 s, 1 iters, t-(init.)=1.22297 s t(norm)=0.25918, mflops=19.2916 (err=4.6e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.67624 s, 2 iters, t-(init.)=1.64003 s t(norm)=0.173784, mflops=28.7713 (err=4.3e-14) 23. Mayer (simple): elapsed time t=1.62204 s, 2 iters, t-(init.)=1.58575 s t(norm)=0.168032, mflops=29.7562 24. Mayer (lookup): elapsed time t=1.69748 s, 2 iters, t-(init.)=1.66131 s t(norm)=0.176039, mflops=28.4028 (err=4.3e-14) 25. Monro: elapsed time t=1.13204 s, 1 iters, t-(init.)=1.11413 s t(norm)=0.236114, mflops=21.1762 (err=1.8e-07) 26. NAPACK (f2c): elapsed time t=1.38529 s, 1 iters, t-(init.)=1.36472 s t(norm)=0.289223, mflops=17.2877 (err=3.7e-12) 27. Ooura (C): elapsed time t=1.36417 s, 4 iters, t-(init.)=1.29027 s t(norm)=0.0683612, mflops=73.1409 (err=4.4e-14) 28. Ooura (F): elapsed time t=1.38633 s, 4 iters, t-(init.)=1.31239 s t(norm)=0.0695327, mflops=71.9086 (err=4.4e-14) 29. Ransom: elapsed time t=1.7288 s, 4 iters, t-(init.)=1.65096 s t(norm)=0.0874709, mflops=57.1619 (err=4.3e-14) 30. SCIPORT: elapsed time t=2.04155 s, 1 iters, t-(init.)=2.02177 s t(norm)=0.428468, mflops=11.6695 (err=4.4e-14) 31. Singleton: elapsed time t=1.3277 s, 2 iters, t-(init.)=1.28904 s t(norm)=0.136591, mflops=36.6055 (err=6.0e-14) 32. Singleton (f2c): elapsed time t=1.32408 s, 2 iters, t-(init.)=1.28541 s t(norm)=0.136207, mflops=36.7089 (err=6.0e-14) 33. Sorensen: elapsed time t=1.4885 s, 2 iters, t-(init.)=1.45248 s t(norm)=0.153911, mflops=32.4863 (err=4.3e-14) 34. Sorensen DIT: elapsed time t=1.82994 s, 1 iters, t-(init.)=1.81572 s t(norm)=0.3848, mflops=12.9938 (err=4.3e-14) 35. Temperton: elapsed time t=1.48221 s, 2 iters, t-(init.)=1.44687 s t(norm)=0.153316, mflops=32.6125 (err=2.0e-07) 36. 
Temperton (f2c): elapsed time t=1.71031 s, 2 iters, t-(init.)=1.67433 s t(norm)=0.177418, mflops=28.182 (err=4.4e-14) 37. Valkenburg: elapsed time t=4.26926 s, 1 iters, t-(init.)=4.24928 s t(norm)=0.90054, mflops=5.55222 (err=4.4e-14) 38. SUNPERF: elapsed time t=1.98549 s, 4 iters, t-(init.)=1.90648 s t(norm)=0.101009, mflops=49.5007 (err=4.4e-14) Top mflops for N=262144 = 99.2557 Normalized results and averages for N=262144: fft 0: mflops = 20.398 (norm. = 0.205509), norm. avg. (of 18) = 0.300264 fft 1: mflops = 20.5542 (norm. = 0.207084), norm. avg. (of 18) = 0.307311 fft 2: mflops = 15.1082 (norm. = 0.152215), norm. avg. (of 18) = 0.177166 fft 3: mflops = 28.5222 (norm. = 0.287361), norm. avg. (of 18) = 0.128321 fft 4: mflops = 25.1421 (norm. = 0.253306), norm. avg. (of 18) = 0.269821 fft 5: mflops = 7.0688 (norm. = 0.0712181), norm. avg. (of 18) = 0.0376658 fft 6: mflops = 31.7862 (norm. = 0.320245), norm. avg. (of 18) = 0.22998 fft 7: mflops = 29.9797 (norm. = 0.302045), norm. avg. (of 18) = 0.27691 fft 8: mflops = 13.5666 (norm. = 0.136683), norm. avg. (of 18) = 0.129314 fft 9: mflops = 68.7012 (norm. = 0.692164), norm. avg. (of 18) = 0.467616 fft 10: mflops = 68.6041 (norm. = 0.691185), norm. avg. (of 18) = 0.519469 fft 11: mflops = 13.2918 (norm. = 0.133915), norm. avg. (of 17) = 0.105777 fft 12: mflops = 50.5854 (norm. = 0.509647), norm. avg. (of 18) = 0.510126 fft 13: mflops = 41.8848 (norm. = 0.421989), norm. avg. (of 18) = 0.329877 fft 14: mflops = 99.2557 (norm. = 1), norm. avg. (of 18) = 0.896539 fft 15: mflops = 96.9208 (norm. = 0.976476), norm. avg. (of 18) = 0.876973 fft 16: mflops = 47.976 (norm. = 0.483357), norm. avg. (of 18) = 0.672968 fft 17: mflops = 47.6697 (norm. = 0.480272), norm. avg. (of 16) = 0.535299 fft 18: mflops = 41.0075 (norm. = 0.41315), norm. avg. (of 18) = 0.262835 fft 19: mflops = 18.6325 (norm. = 0.187722), norm. avg. (of 18) = 0.19367 fft 20: mflops = 19.2916 (norm. = 0.194363), norm. avg. 
(of 18) = 0.220432
fft 21: mflops = -1 (norm. = -0.010075), norm. avg. (of 12) = 0.383023
fft 22: mflops = 28.7713 (norm. = 0.289871), norm. avg. (of 17) = 0.311524
fft 23: mflops = 29.7562 (norm. = 0.299793), norm. avg. (of 17) = 0.367035
fft 24: mflops = 28.4028 (norm. = 0.286157), norm. avg. (of 17) = 0.356948
fft 25: mflops = 21.1762 (norm. = 0.21335), norm. avg. (of 17) = 0.308042
fft 26: mflops = 17.2877 (norm. = 0.174173), norm. avg. (of 18) = 0.110626
fft 27: mflops = 73.1409 (norm. = 0.736894), norm. avg. (of 18) = 0.570552
fft 28: mflops = 71.9086 (norm. = 0.724478), norm. avg. (of 18) = 0.558826
fft 29: mflops = 57.1619 (norm. = 0.575905), norm. avg. (of 17) = 0.238254
fft 30: mflops = 11.6695 (norm. = 0.11757), norm. avg. (of 17) = 0.292216
fft 31: mflops = 36.6055 (norm. = 0.3688), norm. avg. (of 18) = 0.362005
fft 32: mflops = 36.7089 (norm. = 0.369841), norm. avg. (of 18) = 0.392962
fft 33: mflops = 32.4863 (norm. = 0.327299), norm. avg. (of 18) = 0.482975
fft 34: mflops = 12.9938 (norm. = 0.130912), norm. avg. (of 18) = 0.128832
fft 35: mflops = 32.6125 (norm. = 0.32857), norm. avg. (of 18) = 0.277299
fft 36: mflops = 28.182 (norm. = 0.283933), norm. avg. (of 18) = 0.204627
fft 37: mflops = 5.55222 (norm. = 0.0559386), norm. avg. (of 18) = 0.0322773
fft 38: mflops = 49.5007 (norm. = 0.498718), norm. avg. (of 18) = 0.489043
------------------------------------------------------
@@@@ bench.1d.np2.log
Benchmarking for sizes:
6 (0.000686646 MB)
9 (0.000915527 MB)
12 (0.00114441 MB)
15 (0.00137329 MB)
18 (0.00180054 MB)
24 (0.0022583 MB)
36 (0.0032959 MB)
80 (0.00738525 MB)
108 (0.00994873 MB)
210 (0.0192261 MB)
504 (0.0461426 MB)
1000 (0.0916748 MB)
1960 (0.179749 MB)
4725 (0.437393 MB)
10368 (0.960205 MB)
27000 (2.48291 MB)
75600 (6.98975 MB)
165375 (15.3664 MB)
362880 (38.6829 MB)
Maximum array size = 720720
Benchmarking FFTs:
0. Brenner
1. CWP (min N)
2. CWP (best N)
3. FFTPACK
4. FFTPACK (f2c)
5. FFTW
6. FFTW_ESTIMATE
7. Frigo-old
8. GSL
9.
NAPACK (f2c) 10. Singleton 11. Singleton (f2c) 12. Temperton 13. Temperton (f2c) 14. Valkenburg 15. SUNPERF Computing normalized averages (16 transforms). Benchmarking for array size = 6: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.02805 s, 262144 iters, t-(init.)=0.975769 s t(norm)=0.239995, mflops=20.8338 2. CWP (best N) (N=15): elapsed time t=1.36379 s, 262144 iters, t-(init.)=1.29905 s t(norm)=0.319508, mflops=15.6491 3. FFTPACK: elapsed time t=1.78467 s, 1048576 iters, t-(init.)=1.60742 s t(norm)=0.0988382, mflops=50.5877 (err=1.4e-16) 4. FFTPACK (f2c): elapsed time t=1.16021 s, 524288 iters, t-(init.)=1.07635 s t(norm)=0.132367, mflops=37.7738 (err=1.8e-16) FFTW_MEASURE plan: (cost = 4.544141e-07) FFTW_NOTW 6 5. FFTW: elapsed time t=1.28158 s, 2097152 iters, t-(init.)=0.909094 s t(norm)=0.0279495, mflops=178.894 (err=1.1e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 6. FFTW_ESTIMATE: elapsed time t=1.18242 s, 2097152 iters, t-(init.)=0.84977 s t(norm)=0.0261256, mflops=191.383 (err=1.1e-16) 7. Frigo-old: elapsed time t=1.76702 s, 524288 iters, t-(init.)=1.68567 s t(norm)=0.207299, mflops=24.1197 (err=3.3e-16) 8. GSL: elapsed time t=1.16744 s, 524288 iters, t-(init.)=1.07462 s t(norm)=0.132154, mflops=37.8346 (err=1.2e-16) 9. NAPACK (f2c): elapsed time t=1.03033 s, 131072 iters, t-(init.)=1.01064 s t(norm)=0.497144, mflops=10.0574 (err=4.7e-16) 10. Singleton: elapsed time t=1.65727 s, 262144 iters, t-(init.)=1.59899 s t(norm)=0.393278, mflops=12.7136 (err=1.0e-16) 11. Singleton (f2c): elapsed time t=1.76691 s, 262144 iters, t-(init.)=1.71832 s t(norm)=0.422629, mflops=11.8307 (err=1.0e-16) 12. Temperton: elapsed time t=1.22616 s, 262144 iters, t-(init.)=1.19332 s t(norm)=0.293502, mflops=17.0357 (err=3.7e-16) 13. Temperton (f2c): elapsed time t=1.58354 s, 262144 iters, t-(init.)=1.53098 s t(norm)=0.376552, mflops=13.2784 (err=1.0e-16) 14. 
Valkenburg: elapsed time t=1.53607 s, 131072 iters, t-(init.)=1.51234 s t(norm)=0.743931, mflops=6.72105 (err=3.4e-16) 15. SUNPERF: elapsed time t=1.13369 s, 524288 iters, t-(init.)=1.02797 s t(norm)=0.126417, mflops=39.5518 (err=1.4e-16) Top mflops for N=6 = 191.383 Normalized results and averages for N=6: fft 0: mflops = -1 (norm. = -0.00522512), norm. avg. (of 0) = -1 fft 1: mflops = 20.8338 (norm. = 0.108859), norm. avg. (of 1) = 0.108859 fft 2: mflops = 15.6491 (norm. = 0.0817683), norm. avg. (of 1) = 0.0817683 fft 3: mflops = 50.5877 (norm. = 0.264327), norm. avg. (of 1) = 0.264327 fft 4: mflops = 37.7738 (norm. = 0.197372), norm. avg. (of 1) = 0.197372 fft 5: mflops = 178.894 (norm. = 0.934744), norm. avg. (of 1) = 0.934744 fft 6: mflops = 191.383 (norm. = 1), norm. avg. (of 1) = 1 fft 7: mflops = 24.1197 (norm. = 0.126028), norm. avg. (of 1) = 0.126028 fft 8: mflops = 37.8346 (norm. = 0.19769), norm. avg. (of 1) = 0.19769 fft 9: mflops = 10.0574 (norm. = 0.0525513), norm. avg. (of 1) = 0.0525513 fft 10: mflops = 12.7136 (norm. = 0.0664303), norm. avg. (of 1) = 0.0664303 fft 11: mflops = 11.8307 (norm. = 0.0618169), norm. avg. (of 1) = 0.0618169 fft 12: mflops = 17.0357 (norm. = 0.0890134), norm. avg. (of 1) = 0.0890134 fft 13: mflops = 13.2784 (norm. = 0.0693812), norm. avg. (of 1) = 0.0693812 fft 14: mflops = 6.72105 (norm. = 0.0351183), norm. avg. (of 1) = 0.0351183 fft 15: mflops = 39.5518 (norm. = 0.206663), norm. avg. (of 1) = 0.206663 Benchmarking for array size = 9: 0. Brenner: elapsed time t=1.21136 s, 65536 iters, t-(init.)=1.20129 s t(norm)=0.642505, mflops=7.78204 (err=3.6e-16) 1. CWP (min N): elapsed time t=1.10342 s, 262144 iters, t-(init.)=1.04483 s t(norm)=0.139706, mflops=35.7895 2. CWP (best N) (N=15): elapsed time t=1.35232 s, 262144 iters, t-(init.)=1.28541 s t(norm)=0.171874, mflops=29.0911 3. FFTPACK: elapsed time t=1.08017 s, 524288 iters, t-(init.)=0.960065 s t(norm)=0.0641859, mflops=77.8988 (err=1.4e-16) 4. 
FFTPACK (f2c): elapsed time t=1.57476 s, 524288 iters, t-(init.)=1.46909 s t(norm)=0.0982169, mflops=50.9077 (err=2.4e-16) FFTW_MEASURE plan: (cost = 7.013594e-07) FFTW_NOTW 9 5. FFTW: elapsed time t=1.73803 s, 2097152 iters, t-(init.)=1.3622 s t(norm)=0.0227677, mflops=219.61 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.68557 s, 2097152 iters, t-(init.)=1.27962 s t(norm)=0.0213875, mflops=233.781 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.78449 s, 262144 iters, t-(init.)=1.73812 s t(norm)=0.232407, mflops=21.514 (err=3.3e-16) 8. GSL: elapsed time t=1.01442 s, 262144 iters, t-(init.)=0.966291 s t(norm)=0.129204, mflops=38.6984 (err=1.4e-16) 9. NAPACK (f2c): elapsed time t=1.08155 s, 131072 iters, t-(init.)=1.05683 s t(norm)=0.28262, mflops=17.6916 (err=4.6e-16) 10. Singleton: elapsed time t=1.65392 s, 262144 iters, t-(init.)=1.60567 s t(norm)=0.214696, mflops=23.2887 (err=1.5e-16) 11. Singleton (f2c): elapsed time t=1.71022 s, 262144 iters, t-(init.)=1.66568 s t(norm)=0.222721, mflops=22.4496 (err=1.5e-16) 12. Temperton: elapsed time t=1.29361 s, 262144 iters, t-(init.)=1.24461 s t(norm)=0.166419, mflops=30.0447 (err=1.1e-08) 13. Temperton (f2c): elapsed time t=1.48782 s, 262144 iters, t-(init.)=1.42905 s t(norm)=0.191081, mflops=26.1669 (err=1.4e-16) 14. Valkenburg: elapsed time t=1.44183 s, 65536 iters, t-(init.)=1.42867 s t(norm)=0.764117, mflops=6.5435 (err=3.7e-16) 15. SUNPERF: elapsed time t=1.39376 s, 524288 iters, t-(init.)=1.29506 s t(norm)=0.0865819, mflops=57.7488 (err=1.4e-16) Top mflops for N=9 = 233.781 Normalized results and averages for N=9: fft 0: mflops = 7.78204 (norm. = 0.0332877), norm. avg. (of 1) = 0.0332877 fft 1: mflops = 35.7895 (norm. = 0.15309), norm. avg. (of 2) = 0.130974 fft 2: mflops = 29.0911 (norm. = 0.124437), norm. avg. (of 2) = 0.103103 fft 3: mflops = 77.8988 (norm. = 0.333212), norm. avg. (of 2) = 0.29877 fft 4: mflops = 50.9077 (norm. = 0.217758), norm. avg. 
(of 2) = 0.207565 fft 5: mflops = 219.61 (norm. = 0.939381), norm. avg. (of 2) = 0.937062 fft 6: mflops = 233.781 (norm. = 1), norm. avg. (of 2) = 1 fft 7: mflops = 21.514 (norm. = 0.0920262), norm. avg. (of 2) = 0.109027 fft 8: mflops = 38.6984 (norm. = 0.165533), norm. avg. (of 2) = 0.181612 fft 9: mflops = 17.6916 (norm. = 0.0756759), norm. avg. (of 2) = 0.0641136 fft 10: mflops = 23.2887 (norm. = 0.0996176), norm. avg. (of 2) = 0.0830239 fft 11: mflops = 22.4496 (norm. = 0.0960284), norm. avg. (of 2) = 0.0789226 fft 12: mflops = 30.0447 (norm. = 0.128516), norm. avg. (of 2) = 0.108765 fft 13: mflops = 26.1669 (norm. = 0.111929), norm. avg. (of 2) = 0.0906551 fft 14: mflops = 6.5435 (norm. = 0.0279899), norm. avg. (of 2) = 0.0315541 fft 15: mflops = 57.7488 (norm. = 0.247021), norm. avg. (of 2) = 0.226842 Benchmarking for array size = 12: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.16616 s, 262144 iters, t-(init.)=1.10085 s t(norm)=0.0976164, mflops=51.2209 2. CWP (best N) (N=15): elapsed time t=1.35754 s, 262144 iters, t-(init.)=1.28534 s t(norm)=0.113976, mflops=43.8689 3. FFTPACK: elapsed time t=1.13263 s, 524288 iters, t-(init.)=1.00028 s t(norm)=0.0443493, mflops=112.741 (err=1.7e-16) 4. FFTPACK (f2c): elapsed time t=1.74476 s, 524288 iters, t-(init.)=1.63921 s t(norm)=0.0726773, mflops=68.7973 (err=2.2e-16) FFTW_MEASURE plan: (cost = 1.430906e-06) FFTW_TWIDDLE 3 FFTW_NOTW 4 5. FFTW: elapsed time t=1.09196 s, 524288 iters, t-(init.)=0.987344 s t(norm)=0.0437756, mflops=114.219 (err=1.1e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.71336 s, 2097152 iters, t-(init.)=1.26817 s t(norm)=0.0140567, mflops=355.703 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.69957 s, 262144 iters, t-(init.)=1.63894 s t(norm)=0.14533, mflops=34.4044 (err=2.9e-16) 8. GSL: elapsed time t=1.10132 s, 262144 iters, t-(init.)=1.04446 s t(norm)=0.0926158, mflops=53.9865 (err=1.6e-16) 9. 
NAPACK (f2c): elapsed time t=1.65463 s, 131072 iters, t-(init.)=1.62576 s t(norm)=0.288324, mflops=17.3416 (err=5.5e-16) 10. Singleton: elapsed time t=1.1239 s, 131072 iters, t-(init.)=1.09298 s t(norm)=0.193836, mflops=25.7949 (err=1.5e-16) 11. Singleton (f2c): elapsed time t=1.19645 s, 131072 iters, t-(init.)=1.1654 s t(norm)=0.206681, mflops=24.1919 (err=1.5e-16) 12. Temperton: elapsed time t=1.54695 s, 262144 iters, t-(init.)=1.49861 s t(norm)=0.132887, mflops=37.626 (err=5.4e-16) 13. Temperton (f2c): elapsed time t=1.93064 s, 262144 iters, t-(init.)=1.85564 s t(norm)=0.164546, mflops=30.3866 (err=1.4e-16) 14. Valkenburg: elapsed time t=1.03021 s, 32768 iters, t-(init.)=1.02249 s t(norm)=0.72534, mflops=6.89332 (err=3.4e-16) 15. SUNPERF: elapsed time t=1.39135 s, 524288 iters, t-(init.)=1.27626 s t(norm)=0.0565853, mflops=88.3622 (err=1.7e-16) Top mflops for N=12 = 355.703 Normalized results and averages for N=12: fft 0: mflops = -1 (norm. = -0.00281134), norm. avg. (of 1) = 0.0332877 fft 1: mflops = 51.2209 (norm. = 0.143999), norm. avg. (of 3) = 0.135316 fft 2: mflops = 43.8689 (norm. = 0.12333), norm. avg. (of 3) = 0.109845 fft 3: mflops = 112.741 (norm. = 0.316954), norm. avg. (of 3) = 0.304831 fft 4: mflops = 68.7973 (norm. = 0.193412), norm. avg. (of 3) = 0.202848 fft 5: mflops = 114.219 (norm. = 0.321107), norm. avg. (of 3) = 0.731744 fft 6: mflops = 355.703 (norm. = 1), norm. avg. (of 3) = 1 fft 7: mflops = 34.4044 (norm. = 0.0967223), norm. avg. (of 3) = 0.104926 fft 8: mflops = 53.9865 (norm. = 0.151774), norm. avg. (of 3) = 0.171666 fft 9: mflops = 17.3416 (norm. = 0.0487531), norm. avg. (of 3) = 0.0589935 fft 10: mflops = 25.7949 (norm. = 0.0725183), norm. avg. (of 3) = 0.079522 fft 11: mflops = 24.1919 (norm. = 0.0680114), norm. avg. (of 3) = 0.0752856 fft 12: mflops = 37.626 (norm. = 0.105779), norm. avg. (of 3) = 0.10777 fft 13: mflops = 30.3866 (norm. = 0.0854271), norm. avg. (of 3) = 0.0889124 fft 14: mflops = 6.89332 (norm. = 0.0193795), norm. 
avg. (of 3) = 0.0274959 fft 15: mflops = 88.3622 (norm. = 0.248416), norm. avg. (of 3) = 0.234033 Benchmarking for array size = 15: 0. Brenner: elapsed time t=1.71304 s, 65536 iters, t-(init.)=1.69664 s t(norm)=0.441762, mflops=11.3183 (err=3.5e-16) 1. CWP (min N): elapsed time t=1.36501 s, 262144 iters, t-(init.)=1.2889 s t(norm)=0.0838987, mflops=59.5956 2. CWP (best N): elapsed time t=1.36324 s, 262144 iters, t-(init.)=1.29281 s t(norm)=0.0841534, mflops=59.4153 3. FFTPACK: elapsed time t=1.45058 s, 524288 iters, t-(init.)=1.30465 s t(norm)=0.0424621, mflops=117.752 (err=2.1e-16) 4. FFTPACK (f2c): elapsed time t=1.1384 s, 262144 iters, t-(init.)=1.06991 s t(norm)=0.0696444, mflops=71.7933 (err=4.4e-16) FFTW_MEASURE plan: (cost = 1.967094e-06) FFTW_TWIDDLE 5 FFTW_NOTW 3 5. FFTW: elapsed time t=1.34937 s, 524288 iters, t-(init.)=1.23163 s t(norm)=0.0400856, mflops=124.733 (err=2.3e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.33419 s, 1048576 iters, t-(init.)=1.05204 s t(norm)=0.0171203, mflops=292.052 (err=1.9e-16) 7. Frigo-old: elapsed time t=1.53073 s, 131072 iters, t-(init.)=1.49443 s t(norm)=0.194555, mflops=25.6997 (err=4.2e-16) 8. GSL: elapsed time t=1.77358 s, 262144 iters, t-(init.)=1.70421 s t(norm)=0.110933, mflops=45.0722 (err=2.0e-16) 9. NAPACK (f2c): elapsed time t=1.32121 s, 65536 iters, t-(init.)=1.30196 s t(norm)=0.338996, mflops=14.7494 (err=6.3e-16) 10. Singleton: elapsed time t=1.20143 s, 131072 iters, t-(init.)=1.16343 s t(norm)=0.151463, mflops=33.0114 (err=2.8e-16) 11. Singleton (f2c): elapsed time t=1.31185 s, 131072 iters, t-(init.)=1.28303 s t(norm)=0.167033, mflops=29.9341 (err=2.8e-16) 12. Temperton: elapsed time t=1.55332 s, 262144 iters, t-(init.)=1.48822 s t(norm)=0.0968733, mflops=51.6138 (err=7.9e-16) 13. Temperton (f2c): elapsed time t=1.24329 s, 131072 iters, t-(init.)=1.20851 s t(norm)=0.157332, mflops=31.78 (err=2.0e-16) 14. 
Valkenburg: elapsed time t=1.67121 s, 32768 iters, t-(init.)=1.66344 s t(norm)=0.866233, mflops=5.77212 (err=4.5e-16) 15. SUNPERF: elapsed time t=1.76436 s, 524288 iters, t-(init.)=1.63545 s t(norm)=0.0532286, mflops=93.9344 (err=2.4e-16) Top mflops for N=15 = 292.052 Normalized results and averages for N=15: fft 0: mflops = 11.3183 (norm. = 0.0387545), norm. avg. (of 2) = 0.0360211 fft 1: mflops = 59.5956 (norm. = 0.204059), norm. avg. (of 4) = 0.152502 fft 2: mflops = 59.4153 (norm. = 0.203441), norm. avg. (of 4) = 0.133244 fft 3: mflops = 117.752 (norm. = 0.403189), norm. avg. (of 4) = 0.32942 fft 4: mflops = 71.7933 (norm. = 0.245824), norm. avg. (of 4) = 0.213592 fft 5: mflops = 124.733 (norm. = 0.427093), norm. avg. (of 4) = 0.655581 fft 6: mflops = 292.052 (norm. = 1), norm. avg. (of 4) = 1 fft 7: mflops = 25.6997 (norm. = 0.087997), norm. avg. (of 4) = 0.100693 fft 8: mflops = 45.0722 (norm. = 0.15433), norm. avg. (of 4) = 0.167332 fft 9: mflops = 14.7494 (norm. = 0.0505029), norm. avg. (of 4) = 0.0568708 fft 10: mflops = 33.0114 (norm. = 0.113033), norm. avg. (of 4) = 0.0878997 fft 11: mflops = 29.9341 (norm. = 0.102496), norm. avg. (of 4) = 0.0820882 fft 12: mflops = 51.6138 (norm. = 0.176728), norm. avg. (of 4) = 0.125009 fft 13: mflops = 31.78 (norm. = 0.108816), norm. avg. (of 4) = 0.0938884 fft 14: mflops = 5.77212 (norm. = 0.019764), norm. avg. (of 4) = 0.0255629 fft 15: mflops = 93.9344 (norm. = 0.321636), norm. avg. (of 4) = 0.255934 Benchmarking for array size = 18: 0. Brenner: elapsed time t=1.20637 s, 32768 iters, t-(init.)=1.19642 s t(norm)=0.486442, mflops=10.2787 (err=4.3e-16) 1. CWP (min N): elapsed time t=1.94263 s, 262144 iters, t-(init.)=1.86595 s t(norm)=0.0948328, mflops=52.7244 2. CWP (best N) (N=28): elapsed time t=1.09126 s, 131072 iters, t-(init.)=1.03702 s t(norm)=0.105408, mflops=47.4346 3. FFTPACK: elapsed time t=1.07394 s, 262144 iters, t-(init.)=0.991692 s t(norm)=0.0504006, mflops=99.2051 (err=2.8e-16) 4. 
FFTPACK (f2c): elapsed time t=1.72199 s, 262144 iters, t-(init.)=1.65051 s t(norm)=0.0838839, mflops=59.6062 (err=2.7e-16) FFTW_MEASURE plan: (cost = 2.081828e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6 5. FFTW: elapsed time t=1.49776 s, 524288 iters, t-(init.)=1.34543 s t(norm)=0.0341894, mflops=146.244 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.51783 s, 524288 iters, t-(init.)=1.37758 s t(norm)=0.0350063, mflops=142.831 (err=2.0e-16) 7. Frigo-old: elapsed time t=1.03678 s, 65536 iters, t-(init.)=1.01793 s t(norm)=0.206937, mflops=24.162 (err=5.0e-16) 8. GSL: elapsed time t=1.61712 s, 262144 iters, t-(init.)=1.54171 s t(norm)=0.0783543, mflops=63.8127 (err=2.3e-16) 9. NAPACK (f2c): elapsed time t=1.04413 s, 65536 iters, t-(init.)=1.02585 s t(norm)=0.208546, mflops=23.9755 (err=8.7e-16) 10. Singleton: elapsed time t=1.36511 s, 131072 iters, t-(init.)=1.32446 s t(norm)=0.134626, mflops=37.1399 (err=2.1e-16) 11. Singleton (f2c): elapsed time t=1.38295 s, 131072 iters, t-(init.)=1.34821 s t(norm)=0.13704, mflops=36.4857 (err=2.1e-16) 12. Temperton: elapsed time t=1.14009 s, 131072 iters, t-(init.)=1.10859 s t(norm)=0.112684, mflops=44.372 (err=2.7e-08) 13. Temperton (f2c): elapsed time t=1.46386 s, 131072 iters, t-(init.)=1.42598 s t(norm)=0.144945, mflops=34.4959 (err=2.9e-16) 14. Valkenburg: elapsed time t=1.79597 s, 32768 iters, t-(init.)=1.78751 s t(norm)=0.726771, mflops=6.87974 (err=5.1e-16) 15. SUNPERF: elapsed time t=1.27195 s, 262144 iters, t-(init.)=1.19942 s t(norm)=0.0609578, mflops=82.0239 (err=2.8e-16) Top mflops for N=18 = 146.244 Normalized results and averages for N=18: fft 0: mflops = 10.2787 (norm. = 0.0702846), norm. avg. (of 3) = 0.0474423 fft 1: mflops = 52.7244 (norm. = 0.360523), norm. avg. (of 5) = 0.194106 fft 2: mflops = 47.4346 (norm. = 0.324352), norm. avg. (of 5) = 0.171466 fft 3: mflops = 99.2051 (norm. = 0.678352), norm. avg. (of 5) = 0.399207 fft 4: mflops = 59.6062 (norm. 
= 0.40758), norm. avg. (of 5) = 0.252389 fft 5: mflops = 146.244 (norm. = 1), norm. avg. (of 5) = 0.724465 fft 6: mflops = 142.831 (norm. = 0.976664), norm. avg. (of 5) = 0.995333 fft 7: mflops = 24.162 (norm. = 0.165217), norm. avg. (of 5) = 0.113598 fft 8: mflops = 63.8127 (norm. = 0.436344), norm. avg. (of 5) = 0.221134 fft 9: mflops = 23.9755 (norm. = 0.163942), norm. avg. (of 5) = 0.078285 fft 10: mflops = 37.1399 (norm. = 0.253958), norm. avg. (of 5) = 0.121111 fft 11: mflops = 36.4857 (norm. = 0.249485), norm. avg. (of 5) = 0.115568 fft 12: mflops = 44.372 (norm. = 0.30341), norm. avg. (of 5) = 0.16069 fft 13: mflops = 34.4959 (norm. = 0.235879), norm. avg. (of 5) = 0.122286 fft 14: mflops = 6.87974 (norm. = 0.0470428), norm. avg. (of 5) = 0.0298589 fft 15: mflops = 82.0239 (norm. = 0.560869), norm. avg. (of 5) = 0.316921 Benchmarking for array size = 24: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.00808 s, 131072 iters, t-(init.)=0.965113 s t(norm)=0.0669147, mflops=74.722 2. CWP (best N) (N=28): elapsed time t=1.09012 s, 131072 iters, t-(init.)=1.04068 s t(norm)=0.0721539, mflops=69.2963 3. FFTPACK: elapsed time t=1.18932 s, 262144 iters, t-(init.)=1.1001 s t(norm)=0.0381369, mflops=131.107 (err=2.2e-16) 4. FFTPACK (f2c): elapsed time t=1.96768 s, 262144 iters, t-(init.)=1.88636 s t(norm)=0.0653938, mflops=76.4598 (err=2.7e-16) FFTW_MEASURE plan: (cost = 2.409125e-06) FFTW_TWIDDLE 2 FFTW_NOTW 12 5. FFTW: elapsed time t=1.63951 s, 524288 iters, t-(init.)=1.47104 s t(norm)=0.0254981, mflops=196.093 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.55143 s, 524288 iters, t-(init.)=1.35472 s t(norm)=0.0234819, mflops=212.93 (err=2.1e-16) 7. Frigo-old: elapsed time t=1.67741 s, 131072 iters, t-(init.)=1.62682 s t(norm)=0.112793, mflops=44.329 (err=3.9e-16) 8. 
GSL: elapsed time t=1.84812 s, 262144 iters, t-(init.)=1.75535 s t(norm)=0.0608521, mflops=82.1664 (err=2.1e-16) 9. NAPACK (f2c): elapsed time t=1.41909 s, 65536 iters, t-(init.)=1.3994 s t(norm)=0.19405, mflops=25.7665 (err=8.0e-16) 10. Singleton: elapsed time t=1.94021 s, 131072 iters, t-(init.)=1.89383 s t(norm)=0.131306, mflops=38.079 (err=2.3e-16) 11. Singleton (f2c): elapsed time t=1.01947 s, 65536 iters, t-(init.)=0.999578 s t(norm)=0.138608, mflops=36.0728 (err=2.3e-16) 12. Temperton: elapsed time t=1.48804 s, 131072 iters, t-(init.)=1.44407 s t(norm)=0.100122, mflops=49.939 (err=4.5e-09) 13. Temperton (f2c): elapsed time t=1.71253 s, 131072 iters, t-(init.)=1.66377 s t(norm)=0.115355, mflops=43.3446 (err=2.8e-16) 14. Valkenburg: elapsed time t=1.27729 s, 16384 iters, t-(init.)=1.27106 s t(norm)=0.705018, mflops=7.09202 (err=4.8e-16) 15. SUNPERF: elapsed time t=1.37614 s, 262144 iters, t-(init.)=1.29239 s t(norm)=0.044803, mflops=111.6 (err=2.2e-16) Top mflops for N=24 = 212.93 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.00469638), norm. avg. (of 3) = 0.0474423 fft 1: mflops = 74.722 (norm. = 0.350923), norm. avg. (of 6) = 0.220242 fft 2: mflops = 69.2963 (norm. = 0.325442), norm. avg. (of 6) = 0.197128 fft 3: mflops = 131.107 (norm. = 0.615727), norm. avg. (of 6) = 0.435293 fft 4: mflops = 76.4598 (norm. = 0.359084), norm. avg. (of 6) = 0.270172 fft 5: mflops = 196.093 (norm. = 0.920927), norm. avg. (of 6) = 0.757209 fft 6: mflops = 212.93 (norm. = 1), norm. avg. (of 6) = 0.996111 fft 7: mflops = 44.329 (norm. = 0.208186), norm. avg. (of 6) = 0.129363 fft 8: mflops = 82.1664 (norm. = 0.385885), norm. avg. (of 6) = 0.248593 fft 9: mflops = 25.7665 (norm. = 0.121009), norm. avg. (of 6) = 0.0854058 fft 10: mflops = 38.079 (norm. = 0.178833), norm. avg. (of 6) = 0.130732 fft 11: mflops = 36.0728 (norm. = 0.169412), norm. avg. (of 6) = 0.124542 fft 12: mflops = 49.939 (norm. = 0.234533), norm. avg. 
(of 6) = 0.172997 fft 13: mflops = 43.3446 (norm. = 0.203562), norm. avg. (of 6) = 0.135832 fft 14: mflops = 7.09202 (norm. = 0.0333068), norm. avg. (of 6) = 0.0304335 fft 15: mflops = 111.6 (norm. = 0.524114), norm. avg. (of 6) = 0.351453 Benchmarking for array size = 36: 0. Brenner: elapsed time t=1.16542 s, 16384 iters, t-(init.)=1.15781 s t(norm)=0.37969, mflops=13.1686 (err=5.3e-16) 1. CWP (min N): elapsed time t=1.518 s, 131072 iters, t-(init.)=1.46012 s t(norm)=0.0598538, mflops=83.5369 2. CWP (best N): elapsed time t=1.51754 s, 131072 iters, t-(init.)=1.45922 s t(norm)=0.059817, mflops=83.5882 3. FFTPACK: elapsed time t=1.66598 s, 262144 iters, t-(init.)=1.53778 s t(norm)=0.0315186, mflops=158.636 (err=3.8e-16) 4. FFTPACK (f2c): elapsed time t=1.49962 s, 131072 iters, t-(init.)=1.43421 s t(norm)=0.0587917, mflops=85.046 (err=4.7e-16) FFTW_MEASURE plan: (cost = 3.545375e-06) FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.02871 s, 262144 iters, t-(init.)=0.897971 s t(norm)=0.018405, mflops=271.665 (err=4.4e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.05152 s, 262144 iters, t-(init.)=0.933516 s t(norm)=0.0191335, mflops=261.321 (err=4.4e-16) 7. Frigo-old: elapsed time t=1.00685 s, 32768 iters, t-(init.)=0.991526 s t(norm)=0.16258, mflops=30.7541 (err=5.9e-16) 8. GSL: elapsed time t=1.41687 s, 131072 iters, t-(init.)=1.35394 s t(norm)=0.0555012, mflops=90.0882 (err=4.3e-16) 9. NAPACK (f2c): elapsed time t=1.81982 s, 65536 iters, t-(init.)=1.78713 s t(norm)=0.146517, mflops=34.1256 (err=1.4e-15) 10. Singleton: elapsed time t=1.94592 s, 131072 iters, t-(init.)=1.88873 s t(norm)=0.0774238, mflops=64.5797 (err=4.7e-16) 11. Singleton (f2c): elapsed time t=1.95876 s, 131072 iters, t-(init.)=1.89802 s t(norm)=0.0778044, mflops=64.2637 (err=4.7e-16) 12. Temperton: elapsed time t=1.62582 s, 131072 iters, t-(init.)=1.55998 s t(norm)=0.0639475, mflops=78.1891 (err=5.1e-08) 13. 
Temperton (f2c): elapsed time t=1.10853 s, 65536 iters, t-(init.)=1.07571 s t(norm)=0.088192, mflops=56.6945 (err=3.7e-16) 14. Valkenburg: elapsed time t=1.10882 s, 8192 iters, t-(init.)=1.10461 s t(norm)=0.72449, mflops=6.90141 (err=6.2e-16) 15. SUNPERF: elapsed time t=1.91947 s, 262144 iters, t-(init.)=1.78882 s t(norm)=0.0366639, mflops=136.374 (err=3.8e-16) Top mflops for N=36 = 271.665 Normalized results and averages for N=36: fft 0: mflops = 13.1686 (norm. = 0.0484737), norm. avg. (of 4) = 0.0477001 fft 1: mflops = 83.5369 (norm. = 0.307499), norm. avg. (of 7) = 0.232707 fft 2: mflops = 83.5882 (norm. = 0.307688), norm. avg. (of 7) = 0.212923 fft 3: mflops = 158.636 (norm. = 0.58394), norm. avg. (of 7) = 0.456529 fft 4: mflops = 85.046 (norm. = 0.313054), norm. avg. (of 7) = 0.276298 fft 5: mflops = 271.665 (norm. = 1), norm. avg. (of 7) = 0.791893 fft 6: mflops = 261.321 (norm. = 0.961923), norm. avg. (of 7) = 0.991227 fft 7: mflops = 30.7541 (norm. = 0.113206), norm. avg. (of 7) = 0.127055 fft 8: mflops = 90.0882 (norm. = 0.331615), norm. avg. (of 7) = 0.260453 fft 9: mflops = 34.1256 (norm. = 0.125616), norm. avg. (of 7) = 0.0911501 fft 10: mflops = 64.5797 (norm. = 0.237718), norm. avg. (of 7) = 0.146015 fft 11: mflops = 64.2637 (norm. = 0.236555), norm. avg. (of 7) = 0.140543 fft 12: mflops = 78.1891 (norm. = 0.287814), norm. avg. (of 7) = 0.189399 fft 13: mflops = 56.6945 (norm. = 0.208692), norm. avg. (of 7) = 0.146241 fft 14: mflops = 6.90141 (norm. = 0.0254041), norm. avg. (of 7) = 0.0297151 fft 15: mflops = 136.374 (norm. = 0.501992), norm. avg. (of 7) = 0.372959 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.78372 s, 16384 iters, t-(init.)=1.76889 s t(norm)=0.213473, mflops=23.4222 (err=3.9e-16) 1. CWP (min N): elapsed time t=1.55802 s, 65536 iters, t-(init.)=1.49803 s t(norm)=0.045196, mflops=110.629 2. CWP (best N) (N=84): elapsed time t=1.03998 s, 65536 iters, t-(init.)=0.980884 s t(norm)=0.0295936, mflops=168.955 3. 
FFTPACK: elapsed time t=1.92398 s, 131072 iters, t-(init.)=1.80535 s t(norm)=0.027234, mflops=183.594 (err=3.2e-16) 4. FFTPACK (f2c): elapsed time t=1.57955 s, 65536 iters, t-(init.)=1.52071 s t(norm)=0.0458803, mflops=108.979 (err=4.0e-16) FFTW_MEASURE plan: (cost = 9.213625e-06) FFTW_TWIDDLE 10 FFTW_NOTW 8 5. FFTW: elapsed time t=1.43153 s, 131072 iters, t-(init.)=1.31429 s t(norm)=0.0198263, mflops=252.19 (err=3.5e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 6. FFTW_ESTIMATE: elapsed time t=1.55691 s, 131072 iters, t-(init.)=1.43732 s t(norm)=0.0216822, mflops=230.604 (err=3.6e-16) 7. Frigo-old: elapsed time t=1.74515 s, 32768 iters, t-(init.)=1.71584 s t(norm)=0.103535, mflops=48.2929 (err=3.5e-16) 8. GSL: elapsed time t=1.21379 s, 32768 iters, t-(init.)=1.18524 s t(norm)=0.0715181, mflops=69.9123 (err=3.2e-16) 9. NAPACK (f2c): elapsed time t=1.8647 s, 16384 iters, t-(init.)=1.8493 s t(norm)=0.223176, mflops=22.4038 (err=5.0e-16) 10. Singleton: elapsed time t=1.57482 s, 65536 iters, t-(init.)=1.51149 s t(norm)=0.0456022, mflops=109.644 (err=4.4e-16) 11. Singleton (f2c): elapsed time t=1.55837 s, 65536 iters, t-(init.)=1.49823 s t(norm)=0.0452021, mflops=110.614 (err=4.4e-16) 12. Temperton: elapsed time t=1.7952 s, 65536 iters, t-(init.)=1.73647 s t(norm)=0.05239, mflops=95.4381 (err=5.3e-08) 13. Temperton (f2c): elapsed time t=1.34617 s, 32768 iters, t-(init.)=1.3162 s t(norm)=0.0794204, mflops=62.9561 (err=3.4e-16) 14. Valkenburg: elapsed time t=1.60672 s, 4096 iters, t-(init.)=1.60286 s t(norm)=0.773744, mflops=6.46209 (err=4.6e-16) 15. SUNPERF: elapsed time t=1.76607 s, 131072 iters, t-(init.)=1.64836 s t(norm)=0.0248658, mflops=201.08 (err=3.1e-16) Top mflops for N=80 = 252.19 Normalized results and averages for N=80: fft 0: mflops = 23.4222 (norm. = 0.0928753), norm. avg. (of 5) = 0.0567352 fft 1: mflops = 110.629 (norm. = 0.438674), norm. avg. (of 8) = 0.258453 fft 2: mflops = 168.955 (norm. = 0.669953), norm. avg. 
(of 8) = 0.270051 fft 3: mflops = 183.594 (norm. = 0.728), norm. avg. (of 8) = 0.490463 fft 4: mflops = 108.979 (norm. = 0.432131), norm. avg. (of 8) = 0.295777 fft 5: mflops = 252.19 (norm. = 1), norm. avg. (of 8) = 0.817907 fft 6: mflops = 230.604 (norm. = 0.914407), norm. avg. (of 8) = 0.981624 fft 7: mflops = 48.2929 (norm. = 0.191494), norm. avg. (of 8) = 0.13511 fft 8: mflops = 69.9123 (norm. = 0.277221), norm. avg. (of 8) = 0.262549 fft 9: mflops = 22.4038 (norm. = 0.0888371), norm. avg. (of 8) = 0.090861 fft 10: mflops = 109.644 (norm. = 0.434768), norm. avg. (of 8) = 0.182109 fft 11: mflops = 110.614 (norm. = 0.438615), norm. avg. (of 8) = 0.177802 fft 12: mflops = 95.4381 (norm. = 0.378438), norm. avg. (of 8) = 0.213029 fft 13: mflops = 62.9561 (norm. = 0.249638), norm. avg. (of 8) = 0.159166 fft 14: mflops = 6.46209 (norm. = 0.0256239), norm. avg. (of 8) = 0.0292037 fft 15: mflops = 201.08 (norm. = 0.797335), norm. avg. (of 8) = 0.426006 Benchmarking for array size = 108: 0. Brenner: elapsed time t=1.06239 s, 4096 iters, t-(init.)=1.05754 s t(norm)=0.353912, mflops=14.1278 (err=6.4e-16) 1. CWP (min N) (N=110): elapsed time t=1.15785 s, 32768 iters, t-(init.)=1.1188 s t(norm)=0.0468015, mflops=106.834 2. CWP (best N) (N=112): elapsed time t=1.05675 s, 32768 iters, t-(init.)=1.01608 s t(norm)=0.0425044, mflops=117.635 3. FFTPACK: elapsed time t=1.25118 s, 65536 iters, t-(init.)=1.17223 s t(norm)=0.0245183, mflops=203.929 (err=3.8e-16) 4. FFTPACK (f2c): elapsed time t=1.33197 s, 32768 iters, t-(init.)=1.2933 s t(norm)=0.0541013, mflops=92.4193 (err=4.0e-16) FFTW_MEASURE plan: (cost = 1.272812e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 5. FFTW: elapsed time t=1.9973 s, 131072 iters, t-(init.)=1.83873 s t(norm)=0.0192294, mflops=260.018 (err=3.7e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.97153 s, 131072 iters, t-(init.)=1.81688 s t(norm)=0.0190009, mflops=263.146 (err=3.7e-16) 7. 
Frigo-old: elapsed time t=1.06876 s, 8192 iters, t-(init.)=1.0588 s t(norm)=0.177167, mflops=28.2219 (err=5.7e-16) 8. GSL: elapsed time t=1.56314 s, 32768 iters, t-(init.)=1.52372 s t(norm)=0.0637403, mflops=78.4433 (err=4.0e-16) 9. NAPACK (f2c): elapsed time t=1.40833 s, 16384 iters, t-(init.)=1.38897 s t(norm)=0.116206, mflops=43.0269 (err=3.1e-15) 10. Singleton: elapsed time t=1.6993 s, 32768 iters, t-(init.)=1.66065 s t(norm)=0.0694684, mflops=71.9752 (err=4.5e-16) 11. Singleton (f2c): elapsed time t=1.56402 s, 32768 iters, t-(init.)=1.52449 s t(norm)=0.0637724, mflops=78.4038 (err=4.5e-16) 12. Temperton: elapsed time t=1.30925 s, 32768 iters, t-(init.)=1.26925 s t(norm)=0.0530953, mflops=94.1703 (err=7.4e-08) 13. Temperton (f2c): elapsed time t=1.83224 s, 32768 iters, t-(init.)=1.7933 s t(norm)=0.0750174, mflops=66.6512 (err=3.5e-16) 14. Valkenburg: elapsed time t=1.09869 s, 2048 iters, t-(init.)=1.0962 s t(norm)=0.733702, mflops=6.81475 (err=6.6e-16) 15. SUNPERF: elapsed time t=1.48306 s, 65536 iters, t-(init.)=1.40695 s t(norm)=0.0294278, mflops=169.907 (err=3.8e-16) Top mflops for N=108 = 263.146 Normalized results and averages for N=108: fft 0: mflops = 14.1278 (norm. = 0.0536883), norm. avg. (of 6) = 0.0562274 fft 1: mflops = 106.834 (norm. = 0.405989), norm. avg. (of 9) = 0.274846 fft 2: mflops = 117.635 (norm. = 0.447033), norm. avg. (of 9) = 0.289716 fft 3: mflops = 203.929 (norm. = 0.774968), norm. avg. (of 9) = 0.522074 fft 4: mflops = 92.4193 (norm. = 0.35121), norm. avg. (of 9) = 0.301936 fft 5: mflops = 260.018 (norm. = 0.988116), norm. avg. (of 9) = 0.836819 fft 6: mflops = 263.146 (norm. = 1), norm. avg. (of 9) = 0.983666 fft 7: mflops = 28.2219 (norm. = 0.107248), norm. avg. (of 9) = 0.132014 fft 8: mflops = 78.4433 (norm. = 0.298098), norm. avg. (of 9) = 0.266499 fft 9: mflops = 43.0269 (norm. = 0.16351), norm. avg. (of 9) = 0.0989331 fft 10: mflops = 71.9752 (norm. = 0.273518), norm. avg. (of 9) = 0.192266 fft 11: mflops = 78.4038 (norm. 
= 0.297948), norm. avg. (of 9) = 0.191152 fft 12: mflops = 94.1703 (norm. = 0.357864), norm. avg. (of 9) = 0.229122 fft 13: mflops = 66.6512 (norm. = 0.253287), norm. avg. (of 9) = 0.169624 fft 14: mflops = 6.81475 (norm. = 0.0258973), norm. avg. (of 9) = 0.0288363 fft 15: mflops = 169.907 (norm. = 0.645678), norm. avg. (of 9) = 0.450414 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1.70298 s, 4096 iters, t-(init.)=1.69393 s t(norm)=0.255284, mflops=19.5861 (err=6.7e-16) 1. CWP (min N): elapsed time t=1.25552 s, 32768 iters, t-(init.)=1.18397 s t(norm)=0.0223038, mflops=224.177 2. CWP (best N): elapsed time t=1.25576 s, 32768 iters, t-(init.)=1.18361 s t(norm)=0.022297, mflops=224.245 3. FFTPACK: elapsed time t=1.02402 s, 16384 iters, t-(init.)=0.987801 s t(norm)=0.0372166, mflops=134.349 (err=5.0e-16) 4. FFTPACK (f2c): elapsed time t=1.02984 s, 8192 iters, t-(init.)=1.01183 s t(norm)=0.076244, mflops=65.5789 (err=6.4e-16) FFTW_MEASURE plan: (cost = 4.067275e-05) FFTW_TWIDDLE 3 FFTW_TWIDDLE 10 FFTW_NOTW 7 5. FFTW: elapsed time t=1.4895 s, 32768 iters, t-(init.)=1.4183 s t(norm)=0.0267182, mflops=187.139 (err=4.7e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.79564 s, 32768 iters, t-(init.)=1.7236 s t(norm)=0.0324694, mflops=153.991 (err=4.7e-16) 7. Frigo-old: elapsed time t=1.25934 s, 4096 iters, t-(init.)=1.25026 s t(norm)=0.18842, mflops=26.5365 (err=5.9e-16) 8. GSL: elapsed time t=1.08761 s, 8192 iters, t-(init.)=1.06953 s t(norm)=0.080592, mflops=62.0409 (err=6.3e-16) 9. NAPACK (f2c): elapsed time t=1.71251 s, 4096 iters, t-(init.)=1.70366 s t(norm)=0.25675, mflops=19.4742 (err=1.5e-14) 10. Singleton: elapsed time t=1.98426 s, 16384 iters, t-(init.)=1.94838 s t(norm)=0.0734074, mflops=68.113 (err=6.4e-16) 11. Singleton (f2c): elapsed time t=1.08447 s, 8192 iters, t-(init.)=1.06619 s t(norm)=0.0803404, mflops=62.2352 (err=6.4e-16) 12. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.55777 s, 1024 iters, t-(init.)=1.55547 s t(norm)=0.937667, mflops=5.33239 (err=7.1e-16) 15. SUNPERF: elapsed time t=1.36356 s, 16384 iters, t-(init.)=1.32812 s t(norm)=0.0500385, mflops=99.923 (err=5.0e-16) Top mflops for N=210 = 224.245 Normalized results and averages for N=210: fft 0: mflops = 19.5861 (norm. = 0.087342), norm. avg. (of 7) = 0.0606723 fft 1: mflops = 224.177 (norm. = 0.999695), norm. avg. (of 10) = 0.347331 fft 2: mflops = 224.245 (norm. = 1), norm. avg. (of 10) = 0.360745 fft 3: mflops = 134.349 (norm. = 0.599114), norm. avg. (of 10) = 0.529778 fft 4: mflops = 65.5789 (norm. = 0.292443), norm. avg. (of 10) = 0.300987 fft 5: mflops = 187.139 (norm. = 0.834525), norm. avg. (of 10) = 0.836589 fft 6: mflops = 153.991 (norm. = 0.686708), norm. avg. (of 10) = 0.95397 fft 7: mflops = 26.5365 (norm. = 0.118337), norm. avg. (of 10) = 0.130646 fft 8: mflops = 62.0409 (norm. = 0.276665), norm. avg. (of 10) = 0.267515 fft 9: mflops = 19.4742 (norm. = 0.0868431), norm. avg. (of 10) = 0.0977241 fft 10: mflops = 68.113 (norm. = 0.303743), norm. avg. (of 10) = 0.203414 fft 11: mflops = 62.2352 (norm. = 0.277532), norm. avg. (of 10) = 0.19979 fft 12: mflops = -1 (norm. = -0.0044594), norm. avg. (of 9) = 0.229122 fft 13: mflops = -1 (norm. = -0.0044594), norm. avg. (of 9) = 0.169624 fft 14: mflops = 5.33239 (norm. = 0.0237792), norm. avg. (of 10) = 0.0283306 fft 15: mflops = 99.923 (norm. = 0.445597), norm. avg. (of 10) = 0.449932 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.08377 s, 1024 iters, t-(init.)=1.07847 s t(norm)=0.232772, mflops=21.4802 (err=1.5e-15) 1. CWP (min N): elapsed time t=1.79757 s, 16384 iters, t-(init.)=1.71351 s t(norm)=0.0231149, mflops=216.311 2. CWP (best N): elapsed time t=1.79968 s, 16384 iters, t-(init.)=1.71577 s t(norm)=0.0231454, mflops=216.026 3. 
FFTPACK: elapsed time t=1.3673 s, 8192 iters, t-(init.)=1.32464 s t(norm)=0.0357381, mflops=139.907 (err=1.2e-15) 4. FFTPACK (f2c): elapsed time t=1.42938 s, 4096 iters, t-(init.)=1.40846 s t(norm)=0.0759991, mflops=65.7903 (err=1.3e-15) FFTW_MEASURE plan: (cost = 1.261470e-04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 6 FFTW_NOTW 12 5. FFTW: elapsed time t=1.01467 s, 8192 iters, t-(init.)=0.972646 s t(norm)=0.0262416, mflops=190.537 (err=1.2e-15) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.09745 s, 8192 iters, t-(init.)=1.05517 s t(norm)=0.0284679, mflops=175.637 (err=1.2e-15) 7. Frigo-old: elapsed time t=1.38551 s, 2048 iters, t-(init.)=1.37463 s t(norm)=0.148348, mflops=33.7045 (err=1.3e-15) 8. GSL: elapsed time t=1.32898 s, 4096 iters, t-(init.)=1.30802 s t(norm)=0.0705795, mflops=70.8421 (err=1.3e-15) 9. NAPACK (f2c): elapsed time t=1.86202 s, 2048 iters, t-(init.)=1.85139 s t(norm)=0.199798, mflops=25.0252 (err=4.1e-14) 10. Singleton: elapsed time t=1.10212 s, 4096 iters, t-(init.)=1.08099 s t(norm)=0.058329, mflops=85.7206 (err=1.9e-15) 11. Singleton (f2c): elapsed time t=1.13322 s, 4096 iters, t-(init.)=1.11206 s t(norm)=0.060006, mflops=83.325 (err=1.9e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.9707 s, 512 iters, t-(init.)=1.96805 s t(norm)=0.849554, mflops=5.88544 (err=1.4e-15) 15. SUNPERF: elapsed time t=1.87031 s, 8192 iters, t-(init.)=1.8282 s t(norm)=0.049324, mflops=101.371 (err=1.2e-15) Top mflops for N=504 = 216.311 Normalized results and averages for N=504: fft 0: mflops = 21.4802 (norm. = 0.0993027), norm. avg. (of 8) = 0.0655011 fft 1: mflops = 216.311 (norm. = 1), norm. avg. (of 11) = 0.406665 fft 2: mflops = 216.026 (norm. = 0.998684), norm. avg. (of 11) = 0.418739 fft 3: mflops = 139.907 (norm. = 0.646786), norm. avg. (of 11) = 0.540415 fft 4: mflops = 65.7903 (norm. 
= 0.304147), norm. avg. (of 11) = 0.301274 fft 5: mflops = 190.537 (norm. = 0.880851), norm. avg. (of 11) = 0.840613 fft 6: mflops = 175.637 (norm. = 0.811965), norm. avg. (of 11) = 0.941061 fft 7: mflops = 33.7045 (norm. = 0.155815), norm. avg. (of 11) = 0.132934 fft 8: mflops = 70.8421 (norm. = 0.327502), norm. avg. (of 11) = 0.272969 fft 9: mflops = 25.0252 (norm. = 0.115691), norm. avg. (of 11) = 0.0993575 fft 10: mflops = 85.7206 (norm. = 0.396285), norm. avg. (of 11) = 0.220947 fft 11: mflops = 83.325 (norm. = 0.38521), norm. avg. (of 11) = 0.216646 fft 12: mflops = -1 (norm. = -0.00462298), norm. avg. (of 9) = 0.229122 fft 13: mflops = -1 (norm. = -0.00462298), norm. avg. (of 9) = 0.169624 fft 14: mflops = 5.88544 (norm. = 0.0272083), norm. avg. (of 11) = 0.0282286 fft 15: mflops = 101.371 (norm. = 0.468634), norm. avg. (of 11) = 0.451632 Benchmarking for array size = 1000: 0. Brenner: elapsed time t=1.19904 s, 512 iters, t-(init.)=1.19384 s t(norm)=0.233973, mflops=21.37 (err=1.1e-15) 1. CWP (min N) (N=1001): elapsed time t=1.57292 s, 4096 iters, t-(init.)=1.53139 s t(norm)=0.0375157, mflops=133.277 2. CWP (best N) (N=1008): elapsed time t=1.04093 s, 4096 iters, t-(init.)=0.999121 s t(norm)=0.0244764, mflops=204.279 3. FFTPACK: elapsed time t=1.34295 s, 4096 iters, t-(init.)=1.30158 s t(norm)=0.0318859, mflops=156.809 (err=1.0e-15) 4. FFTPACK (f2c): elapsed time t=1.18477 s, 2048 iters, t-(init.)=1.16405 s t(norm)=0.0570337, mflops=87.6674 (err=1.1e-15) FFTW_MEASURE plan: (cost = 2.991020e-04) FFTW_TWIDDLE 5 FFTW_TWIDDLE 4 FFTW_TWIDDLE 5 FFTW_NOTW 10 5. FFTW: elapsed time t=1.26353 s, 4096 iters, t-(init.)=1.22222 s t(norm)=0.0299419, mflops=166.99 (err=1.0e-15) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.08615 s, 4096 iters, t-(init.)=1.04489 s t(norm)=0.0255976, mflops=195.331 (err=9.7e-16) 7. 
Frigo-old: elapsed time t=1.55603 s, 1024 iters, t-(init.)=1.54566 s t(norm)=0.151461, mflops=33.0118 (err=1.0e-15) 8. GSL: elapsed time t=1.54714 s, 2048 iters, t-(init.)=1.5264 s t(norm)=0.0747873, mflops=66.8563 (err=1.0e-15) 9. NAPACK (f2c): elapsed time t=1.31195 s, 512 iters, t-(init.)=1.30678 s t(norm)=0.256106, mflops=19.5232 (err=1.7e-14) 10. Singleton: elapsed time t=1.76221 s, 4096 iters, t-(init.)=1.72062 s t(norm)=0.0421515, mflops=118.62 (err=1.5e-15) 11. Singleton (f2c): elapsed time t=1.71051 s, 4096 iters, t-(init.)=1.66896 s t(norm)=0.0408859, mflops=122.292 (err=1.5e-15) 12. Temperton: elapsed time t=1.66901 s, 4096 iters, t-(init.)=1.62757 s t(norm)=0.039872, mflops=125.401 (err=1.3e-07) 13. Temperton (f2c): elapsed time t=1.33892 s, 2048 iters, t-(init.)=1.31819 s t(norm)=0.0645855, mflops=77.4168 (err=9.9e-16) 14. Valkenburg: elapsed time t=1.10873 s, 128 iters, t-(init.)=1.10744 s t(norm)=0.868154, mflops=5.75935 (err=1.1e-15) 15. SUNPERF: elapsed time t=1.61656 s, 4096 iters, t-(init.)=1.57504 s t(norm)=0.038585, mflops=129.584 (err=9.9e-16) Top mflops for N=1000 = 204.279 Normalized results and averages for N=1000: fft 0: mflops = 21.37 (norm. = 0.104612), norm. avg. (of 9) = 0.0698467 fft 1: mflops = 133.277 (norm. = 0.652429), norm. avg. (of 12) = 0.427145 fft 2: mflops = 204.279 (norm. = 1), norm. avg. (of 12) = 0.467177 fft 3: mflops = 156.809 (norm. = 0.767622), norm. avg. (of 12) = 0.559349 fft 4: mflops = 87.6674 (norm. = 0.429156), norm. avg. (of 12) = 0.311931 fft 5: mflops = 166.99 (norm. = 0.817461), norm. avg. (of 12) = 0.838684 fft 6: mflops = 195.331 (norm. = 0.956197), norm. avg. (of 12) = 0.942322 fft 7: mflops = 33.0118 (norm. = 0.161602), norm. avg. (of 12) = 0.135323 fft 8: mflops = 66.8563 (norm. = 0.32728), norm. avg. (of 12) = 0.277495 fft 9: mflops = 19.5232 (norm. = 0.0955712), norm. avg. (of 12) = 0.099042 fft 10: mflops = 118.62 (norm. = 0.580676), norm. avg. (of 12) = 0.250925 fft 11: mflops = 122.292 (norm. 
= 0.59865), norm. avg. (of 12) = 0.24848 fft 12: mflops = 125.401 (norm. = 0.613874), norm. avg. (of 10) = 0.267597 fft 13: mflops = 77.4168 (norm. = 0.378976), norm. avg. (of 10) = 0.190559 fft 14: mflops = 5.75935 (norm. = 0.0281936), norm. avg. (of 12) = 0.0282256 fft 15: mflops = 129.584 (norm. = 0.634348), norm. avg. (of 12) = 0.466859 Benchmarking for array size = 1960: 0. Brenner: elapsed time t=1.1916 s, 256 iters, t-(init.)=1.18654 s t(norm)=0.216223, mflops=23.1243 (err=2.9e-15) 1. CWP (min N) (N=1980): elapsed time t=1.37413 s, 2048 iters, t-(init.)=1.3333 s t(norm)=0.0303708, mflops=164.632 2. CWP (best N) (N=1980): elapsed time t=1.37456 s, 2048 iters, t-(init.)=1.33365 s t(norm)=0.0303789, mflops=164.588 3. FFTPACK: elapsed time t=1.92329 s, 2048 iters, t-(init.)=1.88287 s t(norm)=0.0428895, mflops=116.579 (err=2.8e-15) 4. FFTPACK (f2c): elapsed time t=1.11448 s, 512 iters, t-(init.)=1.10437 s t(norm)=0.100625, mflops=49.6894 (err=2.8e-15) FFTW_MEASURE plan: (cost = 7.372560e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 7 5. FFTW: elapsed time t=1.52722 s, 2048 iters, t-(init.)=1.48676 s t(norm)=0.0338664, mflops=147.639 (err=2.8e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.54777 s, 2048 iters, t-(init.)=1.50728 s t(norm)=0.0343339, mflops=145.629 (err=2.8e-15) 7. Frigo-old: elapsed time t=1.78302 s, 512 iters, t-(init.)=1.77287 s t(norm)=0.161535, mflops=30.9531 (err=2.8e-15) 8. GSL: elapsed time t=1.74195 s, 1024 iters, t-(init.)=1.72163 s t(norm)=0.0784331, mflops=63.7486 (err=2.8e-15) 9. NAPACK (f2c): elapsed time t=1.71168 s, 256 iters, t-(init.)=1.70663 s t(norm)=0.310999, mflops=16.0772 (err=1.3e-13) 10. Singleton: elapsed time t=1.43185 s, 1024 iters, t-(init.)=1.4116 s t(norm)=0.064309, mflops=77.7496 (err=4.3e-15) 11. 
Singleton (f2c): elapsed time t=1.49245 s, 1024 iters, t-(init.)=1.47222 s t(norm)=0.0670709, mflops=74.548 (err=4.3e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.36063 s, 64 iters, t-(init.)=1.35937 s t(norm)=0.990873, mflops=5.04606 (err=2.7e-15) 15. SUNPERF: elapsed time t=1.63614 s, 1024 iters, t-(init.)=1.61597 s t(norm)=0.0736196, mflops=67.9167 (err=2.8e-15) Top mflops for N=1960 = 164.632 Normalized results and averages for N=1960: fft 0: mflops = 23.1243 (norm. = 0.140461), norm. avg. (of 10) = 0.0769081 fft 1: mflops = 164.632 (norm. = 1), norm. avg. (of 13) = 0.471211 fft 2: mflops = 164.588 (norm. = 0.999734), norm. avg. (of 13) = 0.508143 fft 3: mflops = 116.579 (norm. = 0.708119), norm. avg. (of 13) = 0.570793 fft 4: mflops = 49.6894 (norm. = 0.301822), norm. avg. (of 13) = 0.311153 fft 5: mflops = 147.639 (norm. = 0.896782), norm. avg. (of 13) = 0.843153 fft 6: mflops = 145.629 (norm. = 0.884572), norm. avg. (of 13) = 0.93788 fft 7: mflops = 30.9531 (norm. = 0.188014), norm. avg. (of 13) = 0.139376 fft 8: mflops = 63.7486 (norm. = 0.387219), norm. avg. (of 13) = 0.285935 fft 9: mflops = 16.0772 (norm. = 0.0976556), norm. avg. (of 13) = 0.0989353 fft 10: mflops = 77.7496 (norm. = 0.472264), norm. avg. (of 13) = 0.267951 fft 11: mflops = 74.548 (norm. = 0.452817), norm. avg. (of 13) = 0.264198 fft 12: mflops = -1 (norm. = -0.00607417), norm. avg. (of 10) = 0.267597 fft 13: mflops = -1 (norm. = -0.00607417), norm. avg. (of 10) = 0.190559 fft 14: mflops = 5.04606 (norm. = 0.0306506), norm. avg. (of 13) = 0.0284122 fft 15: mflops = 67.9167 (norm. = 0.412537), norm. avg. (of 13) = 0.46268 Benchmarking for array size = 4725: 0. Brenner: elapsed time t=1.03237 s, 64 iters, t-(init.)=1.02932 s t(norm)=0.278864, mflops=17.9299 (err=1.9e-15) 1. 
CWP (min N) (N=5005): elapsed time t=1.17538 s, 512 iters, t-(init.)=1.14966 s t(norm)=0.0389331, mflops=128.425 2. CWP (best N) (N=5040): elapsed time t=1.71781 s, 1024 iters, t-(init.)=1.66607 s t(norm)=0.0282107, mflops=177.238 3. FFTPACK: elapsed time t=1.07286 s, 512 iters, t-(init.)=1.04852 s t(norm)=0.0355081, mflops=140.813 (err=1.8e-15) 4. FFTPACK (f2c): elapsed time t=1.16645 s, 256 iters, t-(init.)=1.15431 s t(norm)=0.0781814, mflops=63.9539 (err=1.9e-15) FFTW_MEASURE plan: (cost = 1.777300e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 15 5. FFTW: elapsed time t=1.81199 s, 1024 iters, t-(init.)=1.76319 s t(norm)=0.0298553, mflops=167.475 (err=1.9e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.82983 s, 1024 iters, t-(init.)=1.78131 s t(norm)=0.0301621, mflops=165.771 (err=1.8e-15) 7. Frigo-old: elapsed time t=1.64623 s, 128 iters, t-(init.)=1.64017 s t(norm)=0.222177, mflops=22.5045 (err=1.9e-15) 8. GSL: elapsed time t=1.13997 s, 256 iters, t-(init.)=1.12782 s t(norm)=0.0763871, mflops=65.4561 (err=1.9e-15) 9. NAPACK (f2c): elapsed time t=1.98623 s, 128 iters, t-(init.)=1.98016 s t(norm)=0.268233, mflops=18.6405 (err=3.5e-13) 10. Singleton: elapsed time t=1.01952 s, 256 iters, t-(init.)=1.00736 s t(norm)=0.0682285, mflops=73.2831 (err=2.4e-15) 11. Singleton (f2c): elapsed time t=1.05984 s, 256 iters, t-(init.)=1.04769 s t(norm)=0.0709599, mflops=70.4623 (err=2.4e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.73313 s, 32 iters, t-(init.)=1.73161 s t(norm)=0.938259, mflops=5.32902 (err=1.8e-15) 15. SUNPERF: elapsed time t=1.65427 s, 512 iters, t-(init.)=1.62998 s t(norm)=0.0551994, mflops=90.5807 (err=1.8e-15) Top mflops for N=4725 = 177.238 Normalized results and averages for N=4725: fft 0: mflops = 17.9299 (norm. = 0.101163), norm. avg. 
(of 11) = 0.0791131 fft 1: mflops = 128.425 (norm. = 0.724594), norm. avg. (of 14) = 0.489309 fft 2: mflops = 177.238 (norm. = 1), norm. avg. (of 14) = 0.543276 fft 3: mflops = 140.813 (norm. = 0.794485), norm. avg. (of 14) = 0.586771 fft 4: mflops = 63.9539 (norm. = 0.360836), norm. avg. (of 14) = 0.314702 fft 5: mflops = 167.475 (norm. = 0.944915), norm. avg. (of 14) = 0.850422 fft 6: mflops = 165.771 (norm. = 0.935303), norm. avg. (of 14) = 0.937696 fft 7: mflops = 22.5045 (norm. = 0.126974), norm. avg. (of 14) = 0.13849 fft 8: mflops = 65.4561 (norm. = 0.369312), norm. avg. (of 14) = 0.291891 fft 9: mflops = 18.6405 (norm. = 0.105172), norm. avg. (of 14) = 0.0993808 fft 10: mflops = 73.2831 (norm. = 0.413473), norm. avg. (of 14) = 0.278345 fft 11: mflops = 70.4623 (norm. = 0.397558), norm. avg. (of 14) = 0.273724 fft 12: mflops = -1 (norm. = -0.00564214), norm. avg. (of 10) = 0.267597 fft 13: mflops = -1 (norm. = -0.00564214), norm. avg. (of 10) = 0.190559 fft 14: mflops = 5.32902 (norm. = 0.0300671), norm. avg. (of 14) = 0.0285304 fft 15: mflops = 90.5807 (norm. = 0.511068), norm. avg. (of 14) = 0.466136 Benchmarking for array size = 10368: 0. Brenner: elapsed time t=1.07398 s, 32 iters, t-(init.)=1.07061 s t(norm)=0.241899, mflops=20.6698 (err=3.1e-15) 1. CWP (min N) (N=10920): elapsed time t=1.16922 s, 256 iters, t-(init.)=1.14115 s t(norm)=0.0322296, mflops=155.137 2. CWP (best N) (N=11088): elapsed time t=1.22085 s, 256 iters, t-(init.)=1.19214 s t(norm)=0.0336699, mflops=148.501 3. FFTPACK: elapsed time t=1.03892 s, 256 iters, t-(init.)=1.0122 s t(norm)=0.0285877, mflops=174.9 (err=3.0e-15) 4. FFTPACK (f2c): elapsed time t=1.07089 s, 128 iters, t-(init.)=1.05751 s t(norm)=0.0597349, mflops=83.7031 (err=3.0e-15) FFTW_MEASURE plan: (cost = 3.465565e-03) FFTW_TWIDDLE 16 FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_NOTW 12 5. 
FFTW: elapsed time t=1.89713 s, 512 iters, t-(init.)=1.84381 s t(norm)=0.0260375, mflops=192.03 (err=3.0e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.80459 s, 512 iters, t-(init.)=1.75123 s t(norm)=0.0247302, mflops=202.182 (err=3.0e-15) 7. Frigo-old: elapsed time t=1.37323 s, 64 iters, t-(init.)=1.36642 s t(norm)=0.154368, mflops=32.3901 (err=3.1e-15) 8. GSL: elapsed time t=1.03612 s, 128 iters, t-(init.)=1.02276 s t(norm)=0.0577723, mflops=86.5467 (err=3.0e-15) 9. NAPACK (f2c): elapsed time t=1.29084 s, 64 iters, t-(init.)=1.28417 s t(norm)=0.145077, mflops=34.4645 (err=8.1e-14) 10. Singleton: elapsed time t=1.94305 s, 256 iters, t-(init.)=1.91639 s t(norm)=0.0541249, mflops=92.379 (err=4.4e-15) 11. Singleton (f2c): elapsed time t=1.03518 s, 128 iters, t-(init.)=1.02186 s t(norm)=0.0577213, mflops=86.6232 (err=4.4e-15) 12. Temperton: elapsed time t=1.772 s, 256 iters, t-(init.)=1.74536 s t(norm)=0.0492945, mflops=101.431 (err=2.1e-07) 13. Temperton (f2c): elapsed time t=1.24511 s, 128 iters, t-(init.)=1.23177 s t(norm)=0.069578, mflops=71.8617 (err=3.0e-15) 14. Valkenburg: elapsed time t=1.62125 s, 16 iters, t-(init.)=1.61957 s t(norm)=0.73187, mflops=6.83181 (err=3.0e-15) 15. SUNPERF: elapsed time t=1.34756 s, 256 iters, t-(init.)=1.32089 s t(norm)=0.037306, mflops=134.027 (err=3.0e-15) Top mflops for N=10368 = 202.182 Normalized results and averages for N=10368: fft 0: mflops = 20.6698 (norm. = 0.102233), norm. avg. (of 12) = 0.0810398 fft 1: mflops = 155.137 (norm. = 0.767311), norm. avg. (of 15) = 0.507843 fft 2: mflops = 148.501 (norm. = 0.734489), norm. avg. (of 15) = 0.556023 fft 3: mflops = 174.9 (norm. = 0.865063), norm. avg. (of 15) = 0.605324 fft 4: mflops = 83.7031 (norm. = 0.413999), norm. avg. (of 15) = 0.321322 fft 5: mflops = 192.03 (norm. = 0.949789), norm. avg. (of 15) = 0.857046 fft 6: mflops = 202.182 (norm. = 1), norm. avg. 
(of 15) = 0.941849 fft 7: mflops = 32.3901 (norm. = 0.160203), norm. avg. (of 15) = 0.139938 fft 8: mflops = 86.5467 (norm. = 0.428063), norm. avg. (of 15) = 0.300969 fft 9: mflops = 34.4645 (norm. = 0.170462), norm. avg. (of 15) = 0.10412 fft 10: mflops = 92.379 (norm. = 0.456909), norm. avg. (of 15) = 0.29025 fft 11: mflops = 86.6232 (norm. = 0.428441), norm. avg. (of 15) = 0.284038 fft 12: mflops = 101.431 (norm. = 0.501682), norm. avg. (of 11) = 0.288877 fft 13: mflops = 71.8617 (norm. = 0.355431), norm. avg. (of 11) = 0.205547 fft 14: mflops = 6.83181 (norm. = 0.0337904), norm. avg. (of 15) = 0.028881 fft 15: mflops = 134.027 (norm. = 0.6629), norm. avg. (of 15) = 0.479254 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.94236 s, 16 iters, t-(init.)=1.93778 s t(norm)=0.304714, mflops=16.4088 (err=5.6e-15) 1. CWP (min N) (N=27720): elapsed time t=1.53957 s, 128 iters, t-(init.)=1.50382 s t(norm)=0.0295592, mflops=169.152 2. CWP (best N) (N=27720): elapsed time t=1.54006 s, 128 iters, t-(init.)=1.5044 s t(norm)=0.0295707, mflops=169.086 3. FFTPACK: elapsed time t=1.05001 s, 64 iters, t-(init.)=1.03226 s t(norm)=0.0405805, mflops=123.212 (err=5.5e-15) 4. FFTPACK (f2c): elapsed time t=1.75345 s, 64 iters, t-(init.)=1.73586 s t(norm)=0.0682405, mflops=73.2703 (err=5.5e-15) FFTW_MEASURE plan: (cost = 1.377532e-02) FFTW_TWIDDLE 8 FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.73436 s, 128 iters, t-(init.)=1.69895 s t(norm)=0.0333948, mflops=149.724 (err=5.5e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.90787 s, 128 iters, t-(init.)=1.87228 s t(norm)=0.0368018, mflops=135.863 (err=5.6e-15) 7. Frigo-old: elapsed time t=1.30678 s, 16 iters, t-(init.)=1.30194 s t(norm)=0.204729, mflops=24.4226 (err=5.7e-15) 8. 
GSL: elapsed time t=1.70957 s, 64 iters, t-(init.)=1.6922 s t(norm)=0.0665242, mflops=75.1607 (err=5.5e-15) 9. NAPACK (f2c): elapsed time t=1.58116 s, 16 iters, t-(init.)=1.57657 s t(norm)=0.247915, mflops=20.1682 (err=1.1e-12) 10. Singleton: elapsed time t=1.56211 s, 64 iters, t-(init.)=1.54462 s t(norm)=0.0607228, mflops=82.3415 (err=7.7e-15) 11. Singleton (f2c): elapsed time t=1.60835 s, 64 iters, t-(init.)=1.59094 s t(norm)=0.0625436, mflops=79.9443 (err=7.7e-15) 12. Temperton: elapsed time t=1.25993 s, 64 iters, t-(init.)=1.24258 s t(norm)=0.0488487, mflops=102.357 (err=1.4e-07) 13. Temperton (f2c): elapsed time t=1.83655 s, 64 iters, t-(init.)=1.81911 s t(norm)=0.0715134, mflops=69.917 (err=5.6e-15) 14. Valkenburg: elapsed time t=1.36966 s, 4 iters, t-(init.)=1.36829 s t(norm)=0.860648, mflops=5.80958 (err=5.5e-15) 15. SUNPERF: elapsed time t=1.23259 s, 64 iters, t-(init.)=1.215 s t(norm)=0.0477645, mflops=104.68 (err=5.5e-15) Top mflops for N=27000 = 169.152 Normalized results and averages for N=27000: fft 0: mflops = 16.4088 (norm. = 0.0970066), norm. avg. (of 13) = 0.082268 fft 1: mflops = 169.152 (norm. = 1), norm. avg. (of 16) = 0.538603 fft 2: mflops = 169.086 (norm. = 0.999612), norm. avg. (of 16) = 0.583748 fft 3: mflops = 123.212 (norm. = 0.728409), norm. avg. (of 16) = 0.613017 fft 4: mflops = 73.2703 (norm. = 0.433163), norm. avg. (of 16) = 0.328312 fft 5: mflops = 149.724 (norm. = 0.885145), norm. avg. (of 16) = 0.858802 fft 6: mflops = 135.863 (norm. = 0.8032), norm. avg. (of 16) = 0.933184 fft 7: mflops = 24.4226 (norm. = 0.144383), norm. avg. (of 16) = 0.140216 fft 8: mflops = 75.1607 (norm. = 0.444338), norm. avg. (of 16) = 0.309929 fft 9: mflops = 20.1682 (norm. = 0.119231), norm. avg. (of 16) = 0.105064 fft 10: mflops = 82.3415 (norm. = 0.48679), norm. avg. (of 16) = 0.302533 fft 11: mflops = 79.9443 (norm. = 0.472618), norm. avg. (of 16) = 0.295825 fft 12: mflops = 102.357 (norm. = 0.605118), norm. avg. 
(of 12) = 0.315231 fft 13: mflops = 69.917 (norm. = 0.413338), norm. avg. (of 12) = 0.222863 fft 14: mflops = 5.80958 (norm. = 0.0343453), norm. avg. (of 16) = 0.0292226 fft 15: mflops = 104.68 (norm. = 0.618854), norm. avg. (of 16) = 0.487979 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.906 s, 4 iters, t-(init.)=1.89381 s t(norm)=0.386434, mflops=12.9388 (err=1.1e-14) 1. CWP (min N) (N=80080): elapsed time t=1.7694 s, 32 iters, t-(init.)=1.67172 s t(norm)=0.0426395, mflops=117.262 2. CWP (best N) (N=80080): elapsed time t=1.77565 s, 32 iters, t-(init.)=1.67832 s t(norm)=0.0428081, mflops=116.8 3. FFTPACK: elapsed time t=1.8376 s, 16 iters, t-(init.)=1.79135 s t(norm)=0.0913818, mflops=54.7155 (err=1.0e-14) 4. FFTPACK (f2c): elapsed time t=1.2323 s, 8 iters, t-(init.)=1.20929 s t(norm)=0.123379, mflops=40.5257 (err=1.1e-14) FFTW_MEASURE plan: (cost = 6.601977e-02) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 5. FFTW: elapsed time t=1.08907 s, 16 iters, t-(init.)=1.04164 s t(norm)=0.0531372, mflops=94.096 (err=1.1e-14) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.08936 s, 16 iters, t-(init.)=1.04195 s t(norm)=0.0531531, mflops=94.0679 (err=1.1e-14) 7. Frigo-old: elapsed time t=1.21971 s, 4 iters, t-(init.)=1.20731 s t(norm)=0.246353, mflops=20.2961 (err=1.1e-14) 8. GSL: elapsed time t=1.93238 s, 16 iters, t-(init.)=1.88633 s t(norm)=0.0962271, mflops=51.9604 (err=1.1e-14) 9. NAPACK (f2c): elapsed time t=1.48263 s, 4 iters, t-(init.)=1.47016 s t(norm)=0.299988, mflops=16.6673 (err=5.1e-12) 10. Singleton: elapsed time t=1.10971 s, 8 iters, t-(init.)=1.08792 s t(norm)=0.110995, mflops=45.0469 (err=1.5e-14) 11. Singleton (f2c): elapsed time t=1.12335 s, 8 iters, t-(init.)=1.10148 s t(norm)=0.112379, mflops=44.4921 (err=1.5e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.21416 s, 1 iters, t-(init.)=1.21015 s t(norm)=0.987736, mflops=5.06208 (err=1.1e-14) 15. SUNPERF: elapsed time t=1.03667 s, 8 iters, t-(init.)=1.00978 s t(norm)=0.103024, mflops=48.5324 (err=1.0e-14) Top mflops for N=75600 = 117.262 Normalized results and averages for N=75600: fft 0: mflops = 12.9388 (norm. = 0.110341), norm. avg. (of 14) = 0.0842732 fft 1: mflops = 117.262 (norm. = 1), norm. avg. (of 17) = 0.565744 fft 2: mflops = 116.8 (norm. = 0.996063), norm. avg. (of 17) = 0.608002 fft 3: mflops = 54.7155 (norm. = 0.466609), norm. avg. (of 17) = 0.604404 fft 4: mflops = 40.5257 (norm. = 0.345599), norm. avg. (of 17) = 0.329329 fft 5: mflops = 94.096 (norm. = 0.802442), norm. avg. (of 17) = 0.855487 fft 6: mflops = 94.0679 (norm. = 0.802202), norm. avg. (of 17) = 0.925479 fft 7: mflops = 20.2961 (norm. = 0.173083), norm. avg. (of 17) = 0.142149 fft 8: mflops = 51.9604 (norm. = 0.443114), norm. avg. (of 17) = 0.317764 fft 9: mflops = 16.6673 (norm. = 0.142137), norm. avg. (of 17) = 0.107245 fft 10: mflops = 45.0469 (norm. = 0.384156), norm. avg. (of 17) = 0.307335 fft 11: mflops = 44.4921 (norm. = 0.379425), norm. avg. (of 17) = 0.300742 fft 12: mflops = -1 (norm. = -0.00852791), norm. avg. (of 12) = 0.315231 fft 13: mflops = -1 (norm. = -0.00852791), norm. avg. (of 12) = 0.222863 fft 14: mflops = 5.06208 (norm. = 0.043169), norm. avg. (of 17) = 0.0300429 fft 15: mflops = 48.5324 (norm. = 0.41388), norm. avg. (of 17) = 0.48362 Benchmarking for array size = 165375: 0. Brenner: elapsed time t=1.51169 s, 1 iters, t-(init.)=1.50018 s t(norm)=0.523286, mflops=9.555 (err=2.7e-14) 1. CWP (min N) (N=180180): elapsed time t=1.26217 s, 8 iters, t-(init.)=1.17157 s t(norm)=0.0510828, mflops=97.8803 2. CWP (best N) (N=180180): elapsed time t=1.26486 s, 8 iters, t-(init.)=1.17394 s t(norm)=0.051186, mflops=97.6829 3. 
FFTPACK: elapsed time t=1.02409 s, 2 iters, t-(init.)=1.00252 s t(norm)=0.174847, mflops=28.5965 (err=2.7e-14) 4. FFTPACK (f2c): elapsed time t=1.26318 s, 2 iters, t-(init.)=1.24166 s t(norm)=0.216556, mflops=23.0887 (err=2.7e-14) FFTW_MEASURE plan: (cost = 1.964999e-01) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.5743 s, 8 iters, t-(init.)=1.49065 s t(norm)=0.0649954, mflops=76.9285 (err=2.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.62948 s, 8 iters, t-(init.)=1.54584 s t(norm)=0.0674019, mflops=74.1819 (err=2.7e-14) 7. Frigo-old: elapsed time t=1.99843 s, 2 iters, t-(init.)=1.97644 s t(norm)=0.344706, mflops=14.5051 (err=2.7e-14) 8. GSL: elapsed time t=1.32469 s, 4 iters, t-(init.)=1.28368 s t(norm)=0.111942, mflops=44.6658 (err=2.7e-14) 9. NAPACK (f2c): elapsed time t=1.10349 s, 1 iters, t-(init.)=1.09143 s t(norm)=0.38071, mflops=13.1334 (err=1.6e-11) 10. Singleton: elapsed time t=1.85201 s, 4 iters, t-(init.)=1.81462 s t(norm)=0.158242, mflops=31.5972 (err=4.0e-14) 11. Singleton (f2c): elapsed time t=1.8676 s, 4 iters, t-(init.)=1.83017 s t(norm)=0.159598, mflops=31.3287 (err=4.0e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=3.26533 s, 1 iters, t-(init.)=3.25367 s t(norm)=1.13493, mflops=4.40555 (err=2.7e-14) 15. SUNPERF: elapsed time t=1.14413 s, 2 iters, t-(init.)=1.1226 s t(norm)=0.19579, mflops=25.5376 (err=2.7e-14) Top mflops for N=165375 = 97.8803 Normalized results and averages for N=165375: fft 0: mflops = 9.555 (norm. = 0.0976193), norm. avg. (of 15) = 0.085163 fft 1: mflops = 97.8803 (norm. = 1), norm. avg. (of 18) = 0.589869 fft 2: mflops = 97.6829 (norm. = 0.997984), norm. avg. (of 18) = 0.629667 fft 3: mflops = 28.5965 (norm. = 0.292158), norm. 
avg. (of 18) = 0.587057 fft 4: mflops = 23.0887 (norm. = 0.235888), norm. avg. (of 18) = 0.324138 fft 5: mflops = 76.9285 (norm. = 0.785945), norm. avg. (of 18) = 0.851624 fft 6: mflops = 74.1819 (norm. = 0.757884), norm. avg. (of 18) = 0.916168 fft 7: mflops = 14.5051 (norm. = 0.148192), norm. avg. (of 18) = 0.142485 fft 8: mflops = 44.6658 (norm. = 0.456331), norm. avg. (of 18) = 0.325462 fft 9: mflops = 13.1334 (norm. = 0.134178), norm. avg. (of 18) = 0.108741 fft 10: mflops = 31.5972 (norm. = 0.322814), norm. avg. (of 18) = 0.308195 fft 11: mflops = 31.3287 (norm. = 0.320072), norm. avg. (of 18) = 0.301816 fft 12: mflops = -1 (norm. = -0.0102166), norm. avg. (of 12) = 0.315231 fft 13: mflops = -1 (norm. = -0.0102166), norm. avg. (of 12) = 0.222863 fft 14: mflops = 4.40555 (norm. = 0.0450096), norm. avg. (of 18) = 0.0308744 fft 15: mflops = 25.5376 (norm. = 0.260906), norm. avg. (of 18) = 0.471247 Benchmarking for array size = 362880: 0. Brenner: elapsed time t=3.41243 s, 1 iters, t-(init.)=3.3831 s t(norm)=0.504783, mflops=9.90524 (err=1.1e-13) 1. CWP (min N) (N=720720): elapsed time t=1.65435 s, 2 iters, t-(init.)=1.53405 s t(norm)=0.114446, mflops=43.6888 2. CWP (best N) (N=720720): elapsed time t=1.65663 s, 2 iters, t-(init.)=1.53698 s t(norm)=0.114664, mflops=43.6055 3. FFTPACK: elapsed time t=1.92364 s, 2 iters, t-(init.)=1.86613 s t(norm)=0.13922, mflops=35.9144 (err=1.1e-13) 4. FFTPACK (f2c): elapsed time t=1.15362 s, 1 iters, t-(init.)=1.12505 s t(norm)=0.167866, mflops=29.7857 (err=1.1e-13) FFTW_MEASURE plan: (cost = 4.122288e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 6 FFTW_TWIDDLE 7 FFTW_NOTW 12 5. FFTW: elapsed time t=1.80074 s, 4 iters, t-(init.)=1.68361 s t(norm)=0.0628017, mflops=79.6156 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 6. 
FFTW_ESTIMATE: elapsed time t=1.84937 s, 4 iters, t-(init.)=1.73294 s t(norm)=0.0646418, mflops=77.3493 (err=1.1e-13) 7. Frigo-old: elapsed time t=1.90702 s, 1 iters, t-(init.)=1.87828 s t(norm)=0.280253, mflops=17.841 (err=1.1e-13) 8. GSL: elapsed time t=1.54231 s, 2 iters, t-(init.)=1.48461 s t(norm)=0.110757, mflops=45.1437 (err=1.1e-13) 9. NAPACK (f2c): elapsed time t=2.2264 s, 1 iters, t-(init.)=2.19735 s t(norm)=0.32786, mflops=15.2504 (err=3.4e-11) 10. Singleton: elapsed time t=1.33671 s, 1 iters, t-(init.)=1.31535 s t(norm)=0.196259, mflops=25.4765 (err=1.6e-13) 11. Singleton (f2c): elapsed time t=1.35883 s, 1 iters, t-(init.)=1.3373 s t(norm)=0.199535, mflops=25.0582 (err=1.6e-13) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=6.73752 s, 1 iters, t-(init.)=6.70773 s t(norm)=1.00084, mflops=4.99578 (err=1.1e-13) 15. SUNPERF: elapsed time t=1.03218 s, 1 iters, t-(init.)=1.00352 s t(norm)=0.149733, mflops=33.3929 (err=1.1e-13) Top mflops for N=362880 = 79.6156 Normalized results and averages for N=362880: fft 0: mflops = 9.90524 (norm. = 0.124413), norm. avg. (of 16) = 0.0876161 fft 1: mflops = 43.6888 (norm. = 0.548746), norm. avg. (of 19) = 0.587705 fft 2: mflops = 43.6055 (norm. = 0.547701), norm. avg. (of 19) = 0.625353 fft 3: mflops = 35.9144 (norm. = 0.451097), norm. avg. (of 19) = 0.579902 fft 4: mflops = 29.7857 (norm. = 0.374119), norm. avg. (of 19) = 0.326768 fft 5: mflops = 79.6156 (norm. = 1), norm. avg. (of 19) = 0.859433 fft 6: mflops = 77.3493 (norm. = 0.971535), norm. avg. (of 19) = 0.919082 fft 7: mflops = 17.841 (norm. = 0.224089), norm. avg. (of 19) = 0.14678 fft 8: mflops = 45.1437 (norm. = 0.567021), norm. avg. (of 19) = 0.338175 fft 9: mflops = 15.2504 (norm. = 0.19155), norm. avg. (of 19) = 0.1131 fft 10: mflops = 25.4765 (norm. = 0.319994), norm. avg. (of 19) = 0.308816 fft 11: mflops = 25.0582 (norm. = 0.31474), norm. avg. 
(of 19) = 0.302496 fft 12: mflops = -1 (norm. = -0.0125603), norm. avg. (of 12) = 0.315231 fft 13: mflops = -1 (norm. = -0.0125603), norm. avg. (of 12) = 0.222863 fft 14: mflops = 4.99578 (norm. = 0.0627488), norm. avg. (of 19) = 0.032552 fft 15: mflops = 33.3929 (norm. = 0.419426), norm. avg. (of 19) = 0.46852 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) Maximum array size N = 2097152 Benchmarking FFTs: 0. FFTW 1. HARM 2. HARM (f2c) 3. PDA 4. PDA (f2c) 5. Singleton 6. Singleton (f2c) 7. Temperton 8. Temperton (f2c) Computing normalized averages (9 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.09808 s, 131072 iters, t-(init.)=0.999611 s t(norm)=0.0198605, mflops=251.756 (err=1.9e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. Skipping fft (all dimensions must be > 4 for HARM). 3. PDA: elapsed time t=1.12377 s, 16384 iters, t-(init.)=1.11173 s t(norm)=0.176704, mflops=28.2959 (err=2.8e-16) 4. PDA (f2c): elapsed time t=1.47918 s, 16384 iters, t-(init.)=1.46637 s t(norm)=0.233073, mflops=21.4525 (err=2.8e-16) 5. Singleton: elapsed time t=1.05125 s, 65536 iters, t-(init.)=0.999769 s t(norm)=0.0397273, mflops=125.858 (err=1.9e-16) 6. Singleton (f2c): elapsed time t=1.09236 s, 65536 iters, t-(init.)=1.04204 s t(norm)=0.041407, mflops=120.753 (err=1.9e-16) 7. Temperton: elapsed time t=1.65104 s, 65536 iters, t-(init.)=1.60288 s t(norm)=0.0636927, mflops=78.5019 (err=1.9e-16) 8. Temperton (f2c): elapsed time t=1.30026 s, 32768 iters, t-(init.)=1.27486 s t(norm)=0.101317, mflops=49.3502 (err=1.9e-16) Top mflops for N=64 = 251.756 Normalized results and averages for N=64: fft 0: mflops = 251.756 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.0039721), norm. avg. 
(of 0) = -1 fft 2: mflops = -1 (norm. = -0.0039721), norm. avg. (of 0) = -1 fft 3: mflops = 28.2959 (norm. = 0.112394), norm. avg. (of 1) = 0.112394 fft 4: mflops = 21.4525 (norm. = 0.0852114), norm. avg. (of 1) = 0.0852114 fft 5: mflops = 125.858 (norm. = 0.499921), norm. avg. (of 1) = 0.499921 fft 6: mflops = 120.753 (norm. = 0.479641), norm. avg. (of 1) = 0.479641 fft 7: mflops = 78.5019 (norm. = 0.311817), norm. avg. (of 1) = 0.311817 fft 8: mflops = 49.3502 (norm. = 0.196024), norm. avg. (of 1) = 0.196024 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.00971 s, 16384 iters, t-(init.)=0.922129 s t(norm)=0.012214, mflops=409.365 (err=3.6e-16) 1. HARM: elapsed time t=1.43716 s, 8192 iters, t-(init.)=1.39411 s t(norm)=0.0369313, mflops=135.386 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.79003 s, 8192 iters, t-(init.)=1.74743 s t(norm)=0.0462911, mflops=108.012 (err=4.0e-16) 3. PDA: elapsed time t=1.77155 s, 4096 iters, t-(init.)=1.75021 s t(norm)=0.0927294, mflops=53.9203 (err=3.0e-16) 4. PDA (f2c): elapsed time t=1.22361 s, 2048 iters, t-(init.)=1.21281 s t(norm)=0.128514, mflops=38.9061 (err=3.0e-16) 5. Singleton: elapsed time t=1.84193 s, 8192 iters, t-(init.)=1.79919 s t(norm)=0.0476622, mflops=104.905 (err=3.5e-16) 6. Singleton (f2c): elapsed time t=1.84119 s, 8192 iters, t-(init.)=1.79599 s t(norm)=0.0475774, mflops=105.092 (err=3.5e-16) 7. Temperton: elapsed time t=1.92631 s, 16384 iters, t-(init.)=1.84093 s t(norm)=0.024384, mflops=205.052 (err=1.3e-08) 8. Temperton (f2c): elapsed time t=1.37691 s, 8192 iters, t-(init.)=1.33378 s t(norm)=0.0353331, mflops=141.51 (err=3.3e-16) Top mflops for N=512 = 409.365 Normalized results and averages for N=512: fft 0: mflops = 409.365 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 135.386 (norm. = 0.330723), norm. avg. (of 1) = 0.330723 fft 2: mflops = 108.012 (norm. = 0.263853), norm. avg. (of 1) = 0.263853 fft 3: mflops = 53.9203 (norm. = 0.131717), norm. avg. 
(of 2) = 0.122056 fft 4: mflops = 38.9061 (norm. = 0.0950402), norm. avg. (of 2) = 0.0901258 fft 5: mflops = 104.905 (norm. = 0.256263), norm. avg. (of 2) = 0.378092 fft 6: mflops = 105.092 (norm. = 0.256719), norm. avg. (of 2) = 0.36818 fft 7: mflops = 205.052 (norm. = 0.500904), norm. avg. (of 2) = 0.40636 fft 8: mflops = 141.51 (norm. = 0.345683), norm. avg. (of 2) = 0.270853 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.52413 s, 2048 iters, t-(init.)=1.43994 s t(norm)=0.0143045, mflops=349.54 (err=4.2e-16) 1. HARM: elapsed time t=1.88227 s, 1024 iters, t-(init.)=1.84012 s t(norm)=0.0365599, mflops=136.762 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.04664 s, 512 iters, t-(init.)=1.02555 s t(norm)=0.0407519, mflops=122.694 (err=4.0e-16) 3. PDA: elapsed time t=1.36406 s, 512 iters, t-(init.)=1.34301 s t(norm)=0.0533663, mflops=93.692 (err=4.0e-16) 4. PDA (f2c): elapsed time t=1.03105 s, 256 iters, t-(init.)=1.0205 s t(norm)=0.0811022, mflops=61.6506 (err=4.0e-16) 5. Singleton: elapsed time t=1.10491 s, 512 iters, t-(init.)=1.08387 s t(norm)=0.043069, mflops=116.093 (err=4.1e-16) 6. Singleton (f2c): elapsed time t=1.18549 s, 512 iters, t-(init.)=1.16443 s t(norm)=0.0462701, mflops=108.061 (err=4.1e-16) 7. Temperton: elapsed time t=1.38559 s, 1024 iters, t-(init.)=1.34347 s t(norm)=0.0266923, mflops=187.32 (err=6.3e-08) 8. Temperton (f2c): elapsed time t=1.86359 s, 1024 iters, t-(init.)=1.82146 s t(norm)=0.0361892, mflops=138.163 (err=4.6e-16) Top mflops for N=4096 = 349.54 Normalized results and averages for N=4096: fft 0: mflops = 349.54 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 136.762 (norm. = 0.391263), norm. avg. (of 2) = 0.360993 fft 2: mflops = 122.694 (norm. = 0.351015), norm. avg. (of 2) = 0.307434 fft 3: mflops = 93.692 (norm. = 0.268044), norm. avg. (of 3) = 0.170718 fft 4: mflops = 61.6506 (norm. = 0.176377), norm. avg. (of 3) = 0.118876 fft 5: mflops = 116.093 (norm. = 0.332131), norm. avg. 
(of 3) = 0.362771 fft 6: mflops = 108.061 (norm. = 0.309153), norm. avg. (of 3) = 0.348504 fft 7: mflops = 187.32 (norm. = 0.535906), norm. avg. (of 3) = 0.449542 fft 8: mflops = 138.163 (norm. = 0.395271), norm. avg. (of 3) = 0.312326 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.09999 s, 128 iters, t-(init.)=1.05311 s t(norm)=0.0167387, mflops=298.709 (err=5.2e-16) 1. HARM: elapsed time t=1.30669 s, 64 iters, t-(init.)=1.28324 s t(norm)=0.040793, mflops=122.57 (err=5.2e-16) 2. HARM (f2c): elapsed time t=1.51032 s, 64 iters, t-(init.)=1.48684 s t(norm)=0.0472655, mflops=105.785 (err=5.2e-16) 3. PDA: elapsed time t=1.40169 s, 32 iters, t-(init.)=1.3899 s t(norm)=0.0883675, mflops=56.5819 (err=4.2e-16) 4. PDA (f2c): elapsed time t=1.00815 s, 16 iters, t-(init.)=1.00222 s t(norm)=0.127439, mflops=39.2343 (err=4.2e-16) 5. Singleton: elapsed time t=1.78401 s, 64 iters, t-(init.)=1.76049 s t(norm)=0.0559644, mflops=89.3425 (err=5.3e-16) 6. Singleton (f2c): elapsed time t=1.9201 s, 64 iters, t-(init.)=1.89668 s t(norm)=0.0602939, mflops=82.9271 (err=5.3e-16) 7. Temperton: elapsed time t=1.98726 s, 128 iters, t-(init.)=1.94038 s t(norm)=0.0308415, mflops=162.119 (err=9.6e-08) 8. Temperton (f2c): elapsed time t=1.5627 s, 64 iters, t-(init.)=1.53928 s t(norm)=0.0489323, mflops=102.182 (err=4.7e-16) Top mflops for N=32768 = 298.709 Normalized results and averages for N=32768: fft 0: mflops = 298.709 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 122.57 (norm. = 0.410333), norm. avg. (of 3) = 0.37744 fft 2: mflops = 105.785 (norm. = 0.354141), norm. avg. (of 3) = 0.323003 fft 3: mflops = 56.5819 (norm. = 0.189421), norm. avg. (of 4) = 0.175394 fft 4: mflops = 39.2343 (norm. = 0.131346), norm. avg. (of 4) = 0.121994 fft 5: mflops = 89.3425 (norm. = 0.299095), norm. avg. (of 4) = 0.346852 fft 6: mflops = 82.9271 (norm. = 0.277618), norm. avg. (of 4) = 0.330783 fft 7: mflops = 162.119 (norm. = 0.542731), norm. avg. 
(of 4) = 0.47284 fft 8: mflops = 102.182 (norm. = 0.342078), norm. avg. (of 4) = 0.319764 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.60894 s, 8 iters, t-(init.)=1.45406 s t(norm)=0.0385195, mflops=129.804 (err=1.2e-15) 1. HARM: elapsed time t=1.38327 s, 4 iters, t-(init.)=1.30479 s t(norm)=0.0691303, mflops=72.3272 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.44289 s, 4 iters, t-(init.)=1.3655 s t(norm)=0.072347, mflops=69.1114 (err=1.2e-15) 3. PDA: elapsed time t=1.90377 s, 4 iters, t-(init.)=1.82804 s t(norm)=0.096853, mflops=51.6246 (err=1.3e-15) 4. PDA (f2c): elapsed time t=1.19432 s, 2 iters, t-(init.)=1.15791 s t(norm)=0.122697, mflops=40.7509 (err=1.3e-15) 5. Singleton: elapsed time t=1.82275 s, 2 iters, t-(init.)=1.78483 s t(norm)=0.189127, mflops=26.4373 (err=1.7e-15) 6. Singleton (f2c): elapsed time t=1.80896 s, 2 iters, t-(init.)=1.77102 s t(norm)=0.187664, mflops=26.6434 (err=1.7e-15) 7. Temperton: elapsed time t=1.09102 s, 4 iters, t-(init.)=1.01476 s t(norm)=0.0537638, mflops=92.9993 (err=1.3e-07) 8. Temperton (f2c): elapsed time t=1.2642 s, 4 iters, t-(init.)=1.18807 s t(norm)=0.0629465, mflops=79.4326 (err=1.3e-15) Top mflops for N=262144 = 129.804 Normalized results and averages for N=262144: fft 0: mflops = 129.804 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 72.3272 (norm. = 0.557201), norm. avg. (of 4) = 0.42238 fft 2: mflops = 69.1114 (norm. = 0.532427), norm. avg. (of 4) = 0.375359 fft 3: mflops = 51.6246 (norm. = 0.397711), norm. avg. (of 5) = 0.219857 fft 4: mflops = 40.7509 (norm. = 0.313941), norm. avg. (of 5) = 0.160383 fft 5: mflops = 26.4373 (norm. = 0.20367), norm. avg. (of 5) = 0.318216 fft 6: mflops = 26.6434 (norm. = 0.205258), norm. avg. (of 5) = 0.305678 fft 7: mflops = 92.9993 (norm. = 0.716457), norm. avg. (of 5) = 0.521563 fft 8: mflops = 79.4326 (norm. = 0.61194), norm. avg. (of 5) = 0.378199 Benchmarking for array size = 256x64x32 (power of 2): 0. 
FFTW: elapsed time t=1.31948 s, 2 iters, t-(init.)=1.23321 s t(norm)=0.0618991, mflops=80.7766 (err=1.2e-15) 1. HARM: elapsed time t=1.7818 s, 2 iters, t-(init.)=1.69538 s t(norm)=0.085097, mflops=58.7565 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.86125 s, 2 iters, t-(init.)=1.77432 s t(norm)=0.0890594, mflops=56.1423 (err=1.2e-15) 3. PDA: elapsed time t=1.13286 s, 1 iters, t-(init.)=1.09022 s t(norm)=0.109444, mflops=45.6855 (err=1.2e-15) 4. PDA (f2c): elapsed time t=1.3772 s, 1 iters, t-(init.)=1.33457 s t(norm)=0.133973, mflops=37.3209 (err=1.2e-15) 5. Singleton: elapsed time t=2.44218 s, 1 iters, t-(init.)=2.39951 s t(norm)=0.240879, mflops=20.7573 (err=1.7e-15) 6. Singleton (f2c): elapsed time t=2.42798 s, 1 iters, t-(init.)=2.38535 s t(norm)=0.239458, mflops=20.8805 (err=1.7e-15) 7. Temperton: elapsed time t=1.15274 s, 2 iters, t-(init.)=1.06859 s t(norm)=0.0536362, mflops=93.2206 (err=1.5e-07) 8. Temperton (f2c): elapsed time t=1.36199 s, 2 iters, t-(init.)=1.27792 s t(norm)=0.0641433, mflops=77.9504 (err=1.3e-15) Top mflops for N=524288 = 93.2206 Normalized results and averages for N=524288: fft 0: mflops = 80.7766 (norm. = 0.86651), norm. avg. (of 6) = 0.977752 fft 1: mflops = 58.7565 (norm. = 0.630295), norm. avg. (of 5) = 0.463963 fft 2: mflops = 56.1423 (norm. = 0.602253), norm. avg. (of 5) = 0.420738 fft 3: mflops = 45.6855 (norm. = 0.49008), norm. avg. (of 6) = 0.264894 fft 4: mflops = 37.3209 (norm. = 0.40035), norm. avg. (of 6) = 0.200378 fft 5: mflops = 20.7573 (norm. = 0.222669), norm. avg. (of 6) = 0.302291 fft 6: mflops = 20.8805 (norm. = 0.22399), norm. avg. (of 6) = 0.292063 fft 7: mflops = 93.2206 (norm. = 1), norm. avg. (of 6) = 0.601303 fft 8: mflops = 77.9504 (norm. = 0.836193), norm. avg. (of 6) = 0.454532 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=1.95125 s, 2 iters, t-(init.)=1.77648 s t(norm)=0.0423545, mflops=118.051 (err=2.0e-15) 1. 
HARM: elapsed time t=1.81973 s, 1 iters, t-(init.)=1.73189 s t(norm)=0.0825827, mflops=60.5453 (err=1.9e-15) 2. HARM (f2c): elapsed time t=1.88417 s, 1 iters, t-(init.)=1.79635 s t(norm)=0.0856569, mflops=58.3724 (err=1.9e-15) 3. PDA: elapsed time t=1.46906 s, 1 iters, t-(init.)=1.38239 s t(norm)=0.0659174, mflops=75.8525 (err=2.0e-15) 4. PDA (f2c): elapsed time t=1.98063 s, 1 iters, t-(init.)=1.89443 s t(norm)=0.0903333, mflops=55.3506 (err=2.0e-15) 5. Singleton: elapsed time t=4.35254 s, 1 iters, t-(init.)=4.26438 s t(norm)=0.203341, mflops=24.5892 (err=2.8e-15) 6. Singleton (f2c): elapsed time t=4.3042 s, 1 iters, t-(init.)=4.21625 s t(norm)=0.201046, mflops=24.8699 (err=2.8e-15) 7. Skipping fft (Temperton can't handle dimensions > 256). 8. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 118.051 Normalized results and averages for N=1048576: fft 0: mflops = 118.051 (norm. = 1), norm. avg. (of 7) = 0.98093 fft 1: mflops = 60.5453 (norm. = 0.512874), norm. avg. (of 6) = 0.472115 fft 2: mflops = 58.3724 (norm. = 0.494467), norm. avg. (of 6) = 0.433026 fft 3: mflops = 75.8525 (norm. = 0.642539), norm. avg. (of 7) = 0.318844 fft 4: mflops = 55.3506 (norm. = 0.46887), norm. avg. (of 7) = 0.238734 fft 5: mflops = 24.5892 (norm. = 0.208293), norm. avg. (of 7) = 0.288863 fft 6: mflops = 24.8699 (norm. = 0.210671), norm. avg. (of 7) = 0.280436 fft 7: mflops = -1 (norm. = -0.00847091), norm. avg. (of 6) = 0.601303 fft 8: mflops = -1 (norm. = -0.00847091), norm. avg. (of 6) = 0.454532 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=2.61018 s, 1 iters, t-(init.)=2.43497 s t(norm)=0.0552897, mflops=90.4328 (err=7.3e-16) 1. HARM: elapsed time t=4.08505 s, 1 iters, t-(init.)=3.90931 s t(norm)=0.0887668, mflops=56.3273 (err=6.9e-16) 2. HARM (f2c): elapsed time t=4.29275 s, 1 iters, t-(init.)=4.1167 s t(norm)=0.0934759, mflops=53.4897 (err=6.9e-16) 3. 
PDA: elapsed time t=5.29059 s, 1 iters, t-(init.)=5.11465 s t(norm)=0.116136, mflops=43.053 (err=7.1e-16) 4. PDA (f2c): elapsed time t=6.32458 s, 1 iters, t-(init.)=6.14907 s t(norm)=0.139624, mflops=35.8105 (err=7.1e-16) 5. Singleton: elapsed time t=14.1773 s, 1 iters, t-(init.)=13.9988 s t(norm)=0.317865, mflops=15.73 (err=8.4e-16) 6. Singleton (f2c): elapsed time t=14.3521 s, 1 iters, t-(init.)=14.1763 s t(norm)=0.321895, mflops=15.533 (err=8.4e-16) 7. Temperton: elapsed time t=4.35996 s, 1 iters, t-(init.)=4.18458 s t(norm)=0.0950174, mflops=52.622 (err=1.5e-07) 8. Temperton (f2c): elapsed time t=4.73411 s, 1 iters, t-(init.)=4.5585 s t(norm)=0.103508, mflops=48.3056 (err=7.4e-16) Top mflops for N=2097152 = 90.4328 Normalized results and averages for N=2097152: fft 0: mflops = 90.4328 (norm. = 1), norm. avg. (of 8) = 0.983314 fft 1: mflops = 56.3273 (norm. = 0.622864), norm. avg. (of 7) = 0.49365 fft 2: mflops = 53.4897 (norm. = 0.591486), norm. avg. (of 7) = 0.455663 fft 3: mflops = 43.053 (norm. = 0.476077), norm. avg. (of 8) = 0.338498 fft 4: mflops = 35.8105 (norm. = 0.39599), norm. avg. (of 8) = 0.258391 fft 5: mflops = 15.73 (norm. = 0.173941), norm. avg. (of 8) = 0.274498 fft 6: mflops = 15.533 (norm. = 0.171763), norm. avg. (of 8) = 0.266852 fft 7: mflops = 52.622 (norm. = 0.581891), norm. avg. (of 7) = 0.598529 fft 8: mflops = 48.3056 (norm. = 0.534161), norm. avg. 
(of 7) = 0.465907 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) 120x120x120 (26.3728 MB) 144x144x144 (45.5692 MB) Maximum array size N = 2985984 Benchmarking FFTs: 0. FFTW 1. PDA 2. PDA (f2c) 3. Singleton 4. Singleton (f2c) 5. Temperton 6. Temperton (f2c) Computing normalized averages (7 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1.76199 s, 65536 iters, t-(init.)=1.67185 s t(norm)=0.029298, mflops=170.66 (err=3.0e-16) 1. PDA: elapsed time t=1.02272 s, 8192 iters, t-(init.)=1.01143 s t(norm)=0.141796, mflops=35.2619 (err=2.3e-16) 2. PDA (f2c): elapsed time t=1.59776 s, 8192 iters, t-(init.)=1.58635 s t(norm)=0.222397, mflops=22.4823 (err=2.3e-16) 3. Singleton: elapsed time t=1.02813 s, 32768 iters, t-(init.)=0.982516 s t(norm)=0.0344358, mflops=145.198 (err=3.1e-16) 4. Singleton (f2c): elapsed time t=1.75964 s, 65536 iters, t-(init.)=1.67078 s t(norm)=0.0292792, mflops=170.77 (err=3.1e-16) 5. Temperton: elapsed time t=1.21191 s, 32768 iters, t-(init.)=1.16764 s t(norm)=0.0409239, mflops=122.178 (err=5.3e-16) 6. Temperton (f2c): elapsed time t=1.82353 s, 32768 iters, t-(init.)=1.77851 s t(norm)=0.0623344, mflops=80.2126 (err=2.4e-16) Top mflops for N=125 = 170.77 Normalized results and averages for N=125: fft 0: mflops = 170.66 (norm. = 0.999358), norm. avg. (of 1) = 0.999358 fft 1: mflops = 35.2619 (norm. = 0.206488), norm. avg. (of 1) = 0.206488 fft 2: mflops = 22.4823 (norm. = 0.131653), norm. avg. 
(of 1) = 0.131653 fft 3: mflops = 145.198 (norm. = 0.850255), norm. avg. (of 1) = 0.850255 fft 4: mflops = 170.77 (norm. = 1), norm. avg. (of 1) = 1 fft 5: mflops = 122.178 (norm. = 0.715454), norm. avg. (of 1) = 0.715454 fft 6: mflops = 80.2126 (norm. = 0.469712), norm. avg. (of 1) = 0.469712 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.07146 s, 32768 iters, t-(init.)=0.997169 s t(norm)=0.0181673, mflops=275.22 (err=2.9e-16) 1. PDA: elapsed time t=1.88009 s, 8192 iters, t-(init.)=1.86153 s t(norm)=0.13566, mflops=36.857 (err=3.6e-16) 2. PDA (f2c): elapsed time t=1.29098 s, 4096 iters, t-(init.)=1.28145 s t(norm)=0.186772, mflops=26.7706 (err=3.6e-16) 3. Singleton: elapsed time t=1.51547 s, 16384 iters, t-(init.)=1.47847 s t(norm)=0.053872, mflops=92.8125 (err=2.9e-16) 4. Singleton (f2c): elapsed time t=1.53413 s, 16384 iters, t-(init.)=1.49621 s t(norm)=0.0545185, mflops=91.712 (err=2.9e-16) 5. Temperton: elapsed time t=1.1688 s, 16384 iters, t-(init.)=1.13233 s t(norm)=0.0412595, mflops=121.184 (err=2.1e-15) 6. Temperton (f2c): elapsed time t=1.838 s, 16384 iters, t-(init.)=1.80064 s t(norm)=0.0656111, mflops=76.2066 (err=3.1e-16) Top mflops for N=216 = 275.22 Normalized results and averages for N=216: fft 0: mflops = 275.22 (norm. = 1), norm. avg. (of 2) = 0.999679 fft 1: mflops = 36.857 (norm. = 0.133918), norm. avg. (of 2) = 0.170203 fft 2: mflops = 26.7706 (norm. = 0.0972697), norm. avg. (of 2) = 0.114461 fft 3: mflops = 92.8125 (norm. = 0.33723), norm. avg. (of 2) = 0.593742 fft 4: mflops = 91.712 (norm. = 0.333231), norm. avg. (of 2) = 0.666616 fft 5: mflops = 121.184 (norm. = 0.440317), norm. avg. (of 2) = 0.577885 fft 6: mflops = 76.2066 (norm. = 0.276893), norm. avg. (of 2) = 0.373302 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.27187 s, 16384 iters, t-(init.)=1.21407 s t(norm)=0.0256515, mflops=194.92 (err=3.8e-16) 1. 
PDA: elapsed time t=1.13935 s, 1024 iters, t-(init.)=1.13572 s t(norm)=0.383937, mflops=13.023 (err=4.5e-16) 2. PDA (f2c): elapsed time t=1.06474 s, 1024 iters, t-(init.)=1.06106 s t(norm)=0.358697, mflops=13.9393 (err=4.5e-16) 3. Singleton: elapsed time t=1.23224 s, 8192 iters, t-(init.)=1.20326 s t(norm)=0.0508461, mflops=98.336 (err=5.8e-16) 4. Singleton (f2c): elapsed time t=1.27816 s, 8192 iters, t-(init.)=1.24912 s t(norm)=0.0527839, mflops=94.7258 (err=5.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 194.92 Normalized results and averages for N=343: fft 0: mflops = 194.92 (norm. = 1), norm. avg. (of 3) = 0.999786 fft 1: mflops = 13.023 (norm. = 0.0668118), norm. avg. (of 3) = 0.135739 fft 2: mflops = 13.9393 (norm. = 0.071513), norm. avg. (of 3) = 0.100145 fft 3: mflops = 98.336 (norm. = 0.504493), norm. avg. (of 3) = 0.563993 fft 4: mflops = 94.7258 (norm. = 0.485972), norm. avg. (of 3) = 0.606401 fft 5: mflops = -1 (norm. = -0.0051303), norm. avg. (of 2) = 0.577885 fft 6: mflops = -1 (norm. = -0.0051303), norm. avg. (of 2) = 0.373302 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.065 s, 8192 iters, t-(init.)=1.00471 s t(norm)=0.017691, mflops=282.63 (err=5.3e-16) 1. PDA: elapsed time t=1.20354 s, 2048 iters, t-(init.)=1.18831 s t(norm)=0.0836958, mflops=59.7402 (err=4.9e-16) 2. PDA (f2c): elapsed time t=1.81208 s, 2048 iters, t-(init.)=1.79685 s t(norm)=0.126556, mflops=39.5081 (err=4.9e-16) 3. Singleton: elapsed time t=1.33651 s, 4096 iters, t-(init.)=1.30611 s t(norm)=0.045996, mflops=108.705 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.29666 s, 4096 iters, t-(init.)=1.26629 s t(norm)=0.0445941, mflops=112.123 (err=4.5e-16) 5. Temperton: elapsed time t=1.40943 s, 8192 iters, t-(init.)=1.34844 s t(norm)=0.0237435, mflops=210.584 (err=6.0e-08) 6. 
Temperton (f2c): elapsed time t=1.85977 s, 8192 iters, t-(init.)=1.79888 s t(norm)=0.0316749, mflops=157.854 (err=5.1e-16) Top mflops for N=729 = 282.63 Normalized results and averages for N=729: fft 0: mflops = 282.63 (norm. = 1), norm. avg. (of 4) = 0.99984 fft 1: mflops = 59.7402 (norm. = 0.211372), norm. avg. (of 4) = 0.154648 fft 2: mflops = 39.5081 (norm. = 0.139787), norm. avg. (of 4) = 0.110056 fft 3: mflops = 108.705 (norm. = 0.384619), norm. avg. (of 4) = 0.519149 fft 4: mflops = 112.123 (norm. = 0.396711), norm. avg. (of 4) = 0.553979 fft 5: mflops = 210.584 (norm. = 0.745086), norm. avg. (of 3) = 0.633619 fft 6: mflops = 157.854 (norm. = 0.558517), norm. avg. (of 3) = 0.435041 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.45255 s, 8192 iters, t-(init.)=1.36936 s t(norm)=0.0167732, mflops=298.094 (err=4.0e-16) 1. PDA: elapsed time t=1.70175 s, 2048 iters, t-(init.)=1.68105 s t(norm)=0.0823644, mflops=60.7059 (err=4.3e-16) 2. PDA (f2c): elapsed time t=1.41008 s, 1024 iters, t-(init.)=1.39974 s t(norm)=0.137163, mflops=36.4531 (err=4.3e-16) 3. Singleton: elapsed time t=1.67768 s, 4096 iters, t-(init.)=1.63606 s t(norm)=0.0400801, mflops=124.75 (err=4.6e-16) 4. Singleton (f2c): elapsed time t=1.67833 s, 4096 iters, t-(init.)=1.63667 s t(norm)=0.0400949, mflops=124.704 (err=4.6e-16) 5. Temperton: elapsed time t=1.87229 s, 8192 iters, t-(init.)=1.78941 s t(norm)=0.0219184, mflops=228.119 (err=6.3e-16) 6. Temperton (f2c): elapsed time t=1.71092 s, 4096 iters, t-(init.)=1.66951 s t(norm)=0.0408995, mflops=122.251 (err=3.4e-16) Top mflops for N=1000 = 298.094 Normalized results and averages for N=1000: fft 0: mflops = 298.094 (norm. = 1), norm. avg. (of 5) = 0.999872 fft 1: mflops = 60.7059 (norm. = 0.203646), norm. avg. (of 5) = 0.164447 fft 2: mflops = 36.4531 (norm. = 0.122287), norm. avg. (of 5) = 0.112502 fft 3: mflops = 124.75 (norm. = 0.418492), norm. avg. (of 5) = 0.499018 fft 4: mflops = 124.704 (norm. = 0.418338), norm. avg. 
(of 5) = 0.52685 fft 5: mflops = 228.119 (norm. = 0.765258), norm. avg. (of 4) = 0.666529 fft 6: mflops = 122.251 (norm. = 0.410108), norm. avg. (of 4) = 0.428808 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.28894 s, 2048 iters, t-(init.)=1.26144 s t(norm)=0.0445894, mflops=112.134 (err=4.3e-16) 1. PDA: elapsed time t=1.09694 s, 256 iters, t-(init.)=1.09349 s t(norm)=0.309221, mflops=16.1696 (err=5.6e-16) 2. PDA (f2c): elapsed time t=1.21986 s, 256 iters, t-(init.)=1.21641 s t(norm)=0.343983, mflops=14.5356 (err=5.6e-16) 3. Singleton: elapsed time t=1.47347 s, 2048 iters, t-(init.)=1.44605 s t(norm)=0.0511149, mflops=97.8187 (err=6.6e-16) 4. Singleton (f2c): elapsed time t=1.54322 s, 2048 iters, t-(init.)=1.51577 s t(norm)=0.0535797, mflops=93.3189 (err=6.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 112.134 Normalized results and averages for N=1331: fft 0: mflops = 112.134 (norm. = 1), norm. avg. (of 6) = 0.999893 fft 1: mflops = 16.1696 (norm. = 0.144199), norm. avg. (of 6) = 0.161073 fft 2: mflops = 14.5356 (norm. = 0.129627), norm. avg. (of 6) = 0.115356 fft 3: mflops = 97.8187 (norm. = 0.872335), norm. avg. (of 6) = 0.561237 fft 4: mflops = 93.3189 (norm. = 0.832206), norm. avg. (of 6) = 0.577743 fft 5: mflops = -1 (norm. = -0.00891787), norm. avg. (of 4) = 0.666529 fft 6: mflops = -1 (norm. = -0.00891787), norm. avg. (of 4) = 0.428808 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.20733 s, 4096 iters, t-(init.)=1.13588 s t(norm)=0.0149218, mflops=335.08 (err=3.9e-16) 1. PDA: elapsed time t=1.18486 s, 1024 iters, t-(init.)=1.16701 s t(norm)=0.0613234, mflops=81.535 (err=3.8e-16) 2. PDA (f2c): elapsed time t=1.86869 s, 1024 iters, t-(init.)=1.85083 s t(norm)=0.0972562, mflops=51.4106 (err=3.8e-16) 3. Singleton: elapsed time t=1.04344 s, 1024 iters, t-(init.)=1.0256 s t(norm)=0.0538927, mflops=92.777 (err=4.0e-16) 4. 
Singleton (f2c): elapsed time t=1.06378 s, 1024 iters, t-(init.)=1.04593 s t(norm)=0.054961, mflops=90.9737 (err=4.0e-16) 5. Temperton: elapsed time t=1.97672 s, 4096 iters, t-(init.)=1.90525 s t(norm)=0.0250289, mflops=199.769 (err=1.8e-15) 6. Temperton (f2c): elapsed time t=1.93375 s, 2048 iters, t-(init.)=1.89808 s t(norm)=0.0498695, mflops=100.262 (err=3.9e-16) Top mflops for N=1728 = 335.08 Normalized results and averages for N=1728: fft 0: mflops = 335.08 (norm. = 1), norm. avg. (of 7) = 0.999908 fft 1: mflops = 81.535 (norm. = 0.24333), norm. avg. (of 7) = 0.172824 fft 2: mflops = 51.4106 (norm. = 0.153428), norm. avg. (of 7) = 0.120795 fft 3: mflops = 92.777 (norm. = 0.27688), norm. avg. (of 7) = 0.520615 fft 4: mflops = 90.9737 (norm. = 0.271499), norm. avg. (of 7) = 0.533994 fft 5: mflops = 199.769 (norm. = 0.596184), norm. avg. (of 5) = 0.65246 fft 6: mflops = 100.262 (norm. = 0.299217), norm. avg. (of 5) = 0.40289 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.21189 s, 1024 iters, t-(init.)=1.18908 s t(norm)=0.0476107, mflops=105.018 (err=4.5e-16) 1. PDA: elapsed time t=1.90672 s, 256 iters, t-(init.)=1.90106 s t(norm)=0.304475, mflops=16.4217 (err=9.0e-16) 2. PDA (f2c): elapsed time t=1.09824 s, 128 iters, t-(init.)=1.09537 s t(norm)=0.35087, mflops=14.2503 (err=9.0e-16) 3. Singleton: elapsed time t=1.38782 s, 1024 iters, t-(init.)=1.36515 s t(norm)=0.0546607, mflops=91.4733 (err=7.7e-16) 4. Singleton (f2c): elapsed time t=1.42718 s, 1024 iters, t-(init.)=1.40455 s t(norm)=0.0562384, mflops=88.9073 (err=7.7e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 105.018 Normalized results and averages for N=2197: fft 0: mflops = 105.018 (norm. = 1), norm. avg. (of 8) = 0.99992 fft 1: mflops = 16.4217 (norm. = 0.15637), norm. avg. (of 8) = 0.170767 fft 2: mflops = 14.2503 (norm. = 0.135694), norm. avg. 
(of 8) = 0.122657 fft 3: mflops = 91.4733 (norm. = 0.871022), norm. avg. (of 8) = 0.564416 fft 4: mflops = 88.9073 (norm. = 0.846588), norm. avg. (of 8) = 0.573068 fft 5: mflops = -1 (norm. = -0.00952214), norm. avg. (of 5) = 0.65246 fft 6: mflops = -1 (norm. = -0.00952214), norm. avg. (of 5) = 0.40289 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.84219 s, 2048 iters, t-(init.)=1.78571 s t(norm)=0.0278197, mflops=179.728 (err=4.1e-16) 1. PDA: elapsed time t=1.45171 s, 256 iters, t-(init.)=1.44464 s t(norm)=0.18005, mflops=27.7701 (err=4.7e-16) 2. PDA (f2c): elapsed time t=1.5756 s, 256 iters, t-(init.)=1.56852 s t(norm)=0.195488, mflops=25.577 (err=4.7e-16) 3. Singleton: elapsed time t=1.04254 s, 512 iters, t-(init.)=1.02841 s t(norm)=0.0640868, mflops=78.0192 (err=5.6e-16) 4. Singleton (f2c): elapsed time t=1.14044 s, 512 iters, t-(init.)=1.12631 s t(norm)=0.0701872, mflops=71.238 (err=5.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 179.728 Normalized results and averages for N=2744: fft 0: mflops = 179.728 (norm. = 1), norm. avg. (of 9) = 0.999929 fft 1: mflops = 27.7701 (norm. = 0.154512), norm. avg. (of 9) = 0.168961 fft 2: mflops = 25.577 (norm. = 0.142309), norm. avg. (of 9) = 0.124841 fft 3: mflops = 78.0192 (norm. = 0.434095), norm. avg. (of 9) = 0.549936 fft 4: mflops = 71.238 (norm. = 0.396365), norm. avg. (of 9) = 0.553434 fft 5: mflops = -1 (norm. = -0.00556395), norm. avg. (of 5) = 0.65246 fft 6: mflops = -1 (norm. = -0.00556395), norm. avg. (of 5) = 0.40289 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.62148 s, 2048 iters, t-(init.)=1.55209 s t(norm)=0.0191585, mflops=260.981 (err=3.9e-16) 1. PDA: elapsed time t=1.17682 s, 512 iters, t-(init.)=1.15945 s t(norm)=0.0572473, mflops=87.3404 (err=4.4e-16) 2. 
PDA (f2c): elapsed time t=1.00855 s, 256 iters, t-(init.)=0.999861 s t(norm)=0.0987355, mflops=50.6404 (err=4.4e-16) 3. Singleton: elapsed time t=1.11475 s, 512 iters, t-(init.)=1.09738 s t(norm)=0.0541827, mflops=92.2804 (err=5.0e-16) 4. Singleton (f2c): elapsed time t=1.14021 s, 512 iters, t-(init.)=1.12284 s t(norm)=0.0554399, mflops=90.1878 (err=5.0e-16) 5. Temperton: elapsed time t=1.14023 s, 1024 iters, t-(init.)=1.10547 s t(norm)=0.0272911, mflops=183.21 (err=1.9e-15) 6. Temperton (f2c): elapsed time t=1.70076 s, 1024 iters, t-(init.)=1.66601 s t(norm)=0.0411294, mflops=121.568 (err=4.1e-16) Top mflops for N=3375 = 260.981 Normalized results and averages for N=3375: fft 0: mflops = 260.981 (norm. = 1), norm. avg. (of 10) = 0.999936 fft 1: mflops = 87.3404 (norm. = 0.334662), norm. avg. (of 10) = 0.185531 fft 2: mflops = 50.6404 (norm. = 0.194038), norm. avg. (of 10) = 0.131761 fft 3: mflops = 92.2804 (norm. = 0.35359), norm. avg. (of 10) = 0.530301 fft 4: mflops = 90.1878 (norm. = 0.345572), norm. avg. (of 10) = 0.532648 fft 5: mflops = 183.21 (norm. = 0.702004), norm. avg. (of 6) = 0.660717 fft 6: mflops = 121.568 (norm. = 0.46581), norm. avg. (of 6) = 0.413376 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.0531 s, 128 iters, t-(init.)=1.0315 s t(norm)=0.0341743, mflops=146.309 (err=4.7e-16) 1. PDA: elapsed time t=1.03101 s, 64 iters, t-(init.)=1.02019 s t(norm)=0.0675996, mflops=73.9649 (err=4.7e-16) 2. PDA (f2c): elapsed time t=1.46947 s, 64 iters, t-(init.)=1.45864 s t(norm)=0.0966515, mflops=51.7322 (err=4.7e-16) 3. Singleton: elapsed time t=1.80769 s, 128 iters, t-(init.)=1.78608 s t(norm)=0.0591741, mflops=84.4964 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.86162 s, 128 iters, t-(init.)=1.84002 s t(norm)=0.0609614, mflops=82.0191 (err=5.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 
Top mflops for N=16800 = 146.309 Normalized results and averages for N=16800: fft 0: mflops = 146.309 (norm. = 1), norm. avg. (of 11) = 0.999942 fft 1: mflops = 73.9649 (norm. = 0.50554), norm. avg. (of 11) = 0.214623 fft 2: mflops = 51.7322 (norm. = 0.353583), norm. avg. (of 11) = 0.151926 fft 3: mflops = 84.4964 (norm. = 0.577522), norm. avg. (of 11) = 0.534594 fft 4: mflops = 82.0191 (norm. = 0.56059), norm. avg. (of 11) = 0.535188 fft 5: mflops = -1 (norm. = -0.00683487), norm. avg. (of 6) = 0.660717 fft 6: mflops = -1 (norm. = -0.00683487), norm. avg. (of 6) = 0.413376 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.79392 s, 32 iters, t-(init.)=1.62544 s t(norm)=0.0274129, mflops=182.396 (err=6.6e-16) 1. PDA: elapsed time t=1.24855 s, 8 iters, t-(init.)=1.20689 s t(norm)=0.0814166, mflops=61.4126 (err=6.2e-16) 2. PDA (f2c): elapsed time t=1.7395 s, 8 iters, t-(init.)=1.69776 s t(norm)=0.11453, mflops=43.6565 (err=6.2e-16) 3. Singleton: elapsed time t=1.66846 s, 8 iters, t-(init.)=1.62655 s t(norm)=0.109727, mflops=45.5678 (err=6.5e-16) 4. Singleton (f2c): elapsed time t=1.68841 s, 8 iters, t-(init.)=1.64643 s t(norm)=0.111068, mflops=45.0175 (err=6.5e-16) 5. Temperton: elapsed time t=1.31202 s, 16 iters, t-(init.)=1.22835 s t(norm)=0.0414321, mflops=120.679 (err=1.0e-07) 6. Temperton (f2c): elapsed time t=1.60862 s, 16 iters, t-(init.)=1.52501 s t(norm)=0.0514383, mflops=97.2039 (err=7.0e-16) Top mflops for N=110592 = 182.396 Normalized results and averages for N=110592: fft 0: mflops = 182.396 (norm. = 1), norm. avg. (of 12) = 0.999947 fft 1: mflops = 61.4126 (norm. = 0.336699), norm. avg. (of 12) = 0.224796 fft 2: mflops = 43.6565 (norm. = 0.23935), norm. avg. (of 12) = 0.159212 fft 3: mflops = 45.5678 (norm. = 0.249829), norm. avg. (of 12) = 0.510864 fft 4: mflops = 45.0175 (norm. = 0.246812), norm. avg. (of 12) = 0.511157 fft 5: mflops = 120.679 (norm. = 0.661634), norm. avg. (of 7) = 0.660848 fft 6: mflops = 97.2039 (norm. 
= 0.532928), norm. avg. (of 7) = 0.430455 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.44015 s, 16 iters, t-(init.)=1.34881 s t(norm)=0.0425396, mflops=117.538 (err=6.5e-16) 1. PDA: elapsed time t=1.22195 s, 4 iters, t-(init.)=1.1998 s t(norm)=0.151361, mflops=33.0337 (err=7.4e-16) 2. PDA (f2c): elapsed time t=1.55641 s, 4 iters, t-(init.)=1.53432 s t(norm)=0.193561, mflops=25.8317 (err=7.4e-16) 3. Singleton: elapsed time t=1.90207 s, 8 iters, t-(init.)=1.8564 s t(norm)=0.117097, mflops=42.6997 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.92109 s, 8 iters, t-(init.)=1.87554 s t(norm)=0.118304, mflops=42.264 (err=1.0e-15) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 117.538 Normalized results and averages for N=117649: fft 0: mflops = 117.538 (norm. = 1), norm. avg. (of 13) = 0.999951 fft 1: mflops = 33.0337 (norm. = 0.281048), norm. avg. (of 13) = 0.229123 fft 2: mflops = 25.8317 (norm. = 0.219774), norm. avg. (of 13) = 0.16387 fft 3: mflops = 42.6997 (norm. = 0.363286), norm. avg. (of 13) = 0.499511 fft 4: mflops = 42.264 (norm. = 0.359579), norm. avg. (of 13) = 0.499497 fft 5: mflops = -1 (norm. = -0.00850792), norm. avg. (of 7) = 0.660848 fft 6: mflops = -1 (norm. = -0.00850792), norm. avg. (of 7) = 0.430455 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.24812 s, 8 iters, t-(init.)=1.12302 s t(norm)=0.0366746, mflops=136.334 (err=7.3e-16) 1. PDA: elapsed time t=1.31777 s, 4 iters, t-(init.)=1.26033 s t(norm)=0.082317, mflops=60.7408 (err=7.4e-16) 2. PDA (f2c): elapsed time t=1.8711 s, 4 iters, t-(init.)=1.81359 s t(norm)=0.118453, mflops=42.2109 (err=7.4e-16) 3. Singleton: elapsed time t=1.28312 s, 2 iters, t-(init.)=1.25678 s t(norm)=0.164171, mflops=30.456 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.28952 s, 2 iters, t-(init.)=1.26326 s t(norm)=0.165016, mflops=30.3 (err=1.0e-15) 5. 
Temperton: elapsed time t=1.64627 s, 8 iters, t-(init.)=1.5284 s t(norm)=0.0499131, mflops=100.174 (err=2.0e-15) 6. Temperton (f2c): elapsed time t=1.13989 s, 4 iters, t-(init.)=1.08234 s t(norm)=0.0706916, mflops=70.7297 (err=7.1e-16) Top mflops for N=216000 = 136.334 Normalized results and averages for N=216000: fft 0: mflops = 136.334 (norm. = 1), norm. avg. (of 14) = 0.999954 fft 1: mflops = 60.7408 (norm. = 0.445528), norm. avg. (of 14) = 0.24458 fft 2: mflops = 42.2109 (norm. = 0.309613), norm. avg. (of 14) = 0.17428 fft 3: mflops = 30.456 (norm. = 0.223392), norm. avg. (of 14) = 0.479789 fft 4: mflops = 30.3 (norm. = 0.222248), norm. avg. (of 14) = 0.479694 fft 5: mflops = 100.174 (norm. = 0.734769), norm. avg. (of 8) = 0.670088 fft 6: mflops = 70.7297 (norm. = 0.518797), norm. avg. (of 8) = 0.441498 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.71407 s, 8 iters, t-(init.)=1.57691 s t(norm)=0.0455593, mflops=109.747 (err=7.3e-16) 1. PDA: elapsed time t=1.67492 s, 4 iters, t-(init.)=1.60794 s t(norm)=0.0929114, mflops=53.8147 (err=7.8e-16) 2. PDA (f2c): elapsed time t=1.12944 s, 2 iters, t-(init.)=1.09751 s t(norm)=0.126835, mflops=39.4214 (err=7.8e-16) 3. Singleton: elapsed time t=1.67306 s, 2 iters, t-(init.)=1.63915 s t(norm)=0.189429, mflops=26.3951 (err=9.4e-16) 4. Singleton (f2c): elapsed time t=1.65893 s, 2 iters, t-(init.)=1.62507 s t(norm)=0.187802, mflops=26.6237 (err=9.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 109.747 Normalized results and averages for N=241920: fft 0: mflops = 109.747 (norm. = 1), norm. avg. (of 15) = 0.999957 fft 1: mflops = 53.8147 (norm. = 0.490352), norm. avg. (of 15) = 0.260965 fft 2: mflops = 39.4214 (norm. = 0.359202), norm. avg. (of 15) = 0.186608 fft 3: mflops = 26.3951 (norm. = 0.240508), norm. avg. (of 15) = 0.463837 fft 4: mflops = 26.6237 (norm. = 0.242592), norm. avg. 
(of 15) = 0.463887 fft 5: mflops = -1 (norm. = -0.00911185), norm. avg. (of 8) = 0.670088 fft 6: mflops = -1 (norm. = -0.00911185), norm. avg. (of 8) = 0.441498 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.62388 s, 4 iters, t-(init.)=1.48894 s t(norm)=0.047218, mflops=105.892 (err=7.0e-16) 1. PDA: elapsed time t=1.43253 s, 2 iters, t-(init.)=1.36617 s t(norm)=0.0866491, mflops=57.704 (err=7.5e-16) 2. PDA (f2c): elapsed time t=1.94537 s, 2 iters, t-(init.)=1.87844 s t(norm)=0.119139, mflops=41.9676 (err=7.5e-16) 3. Singleton: elapsed time t=1.48232 s, 1 iters, t-(init.)=1.44872 s t(norm)=0.18377, mflops=27.208 (err=9.8e-16) 4. Singleton (f2c): elapsed time t=1.48551 s, 1 iters, t-(init.)=1.45186 s t(norm)=0.184168, mflops=27.1491 (err=9.8e-16) 5. Temperton: elapsed time t=1.58248 s, 4 iters, t-(init.)=1.44784 s t(norm)=0.0459146, mflops=108.898 (err=1.4e-07) 6. Temperton (f2c): elapsed time t=1.91666 s, 4 iters, t-(init.)=1.78217 s t(norm)=0.0565168, mflops=88.4693 (err=8.9e-16) Top mflops for N=421875 = 108.898 Normalized results and averages for N=421875: fft 0: mflops = 105.892 (norm. = 0.972397), norm. avg. (of 16) = 0.998235 fft 1: mflops = 57.704 (norm. = 0.529892), norm. avg. (of 16) = 0.277773 fft 2: mflops = 41.9676 (norm. = 0.385386), norm. avg. (of 16) = 0.199032 fft 3: mflops = 27.208 (norm. = 0.249849), norm. avg. (of 16) = 0.450462 fft 4: mflops = 27.1491 (norm. = 0.249308), norm. avg. (of 16) = 0.450476 fft 5: mflops = 108.898 (norm. = 1), norm. avg. (of 9) = 0.706745 fft 6: mflops = 88.4693 (norm. = 0.812407), norm. avg. (of 9) = 0.48271 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.09919 s, 2 iters, t-(init.)=1.01941 s t(norm)=0.0524903, mflops=95.2557 (err=6.3e-16) 1. PDA: elapsed time t=1.73904 s, 2 iters, t-(init.)=1.65704 s t(norm)=0.085322, mflops=58.6015 (err=6.2e-16) 2. PDA (f2c): elapsed time t=1.12605 s, 1 iters, t-(init.)=1.08649 s t(norm)=0.111888, mflops=44.6874 (err=6.2e-16) 3. 
Singleton: elapsed time t=1.79179 s, 1 iters, t-(init.)=1.75042 s t(norm)=0.180261, mflops=27.7375 (err=8.2e-16) 4. Singleton (f2c): elapsed time t=1.78876 s, 1 iters, t-(init.)=1.74735 s t(norm)=0.179944, mflops=27.7864 (err=8.2e-16) 5. Temperton: elapsed time t=1.15891 s, 2 iters, t-(init.)=1.07937 s t(norm)=0.0555777, mflops=89.9641 (err=1.7e-07) 6. Temperton (f2c): elapsed time t=1.40825 s, 2 iters, t-(init.)=1.32888 s t(norm)=0.068425, mflops=73.0727 (err=6.6e-16) Top mflops for N=512000 = 95.2557 Normalized results and averages for N=512000: fft 0: mflops = 95.2557 (norm. = 1), norm. avg. (of 17) = 0.998339 fft 1: mflops = 58.6015 (norm. = 0.615202), norm. avg. (of 17) = 0.297622 fft 2: mflops = 44.6874 (norm. = 0.469131), norm. avg. (of 17) = 0.21492 fft 3: mflops = 27.7375 (norm. = 0.29119), norm. avg. (of 17) = 0.441093 fft 4: mflops = 27.7864 (norm. = 0.291703), norm. avg. (of 17) = 0.441136 fft 5: mflops = 89.9641 (norm. = 0.944448), norm. avg. (of 10) = 0.730515 fft 6: mflops = 73.0727 (norm. = 0.767122), norm. avg. (of 10) = 0.511151 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.29096 s, 2 iters, t-(init.)=1.19655 s t(norm)=0.0526359, mflops=94.9922 (err=7.0e-16) 1. PDA: elapsed time t=1.23216 s, 1 iters, t-(init.)=1.18909 s t(norm)=0.104616, mflops=47.794 (err=6.8e-16) 2. PDA (f2c): elapsed time t=1.66142 s, 1 iters, t-(init.)=1.61809 s t(norm)=0.14236, mflops=35.1223 (err=6.8e-16) 3. Singleton: elapsed time t=2.53129 s, 1 iters, t-(init.)=2.48799 s t(norm)=0.218893, mflops=22.8422 (err=8.9e-16) 4. Singleton (f2c): elapsed time t=2.55042 s, 1 iters, t-(init.)=2.50708 s t(norm)=0.220572, mflops=22.6683 (err=8.9e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 94.9922 Normalized results and averages for N=592704: fft 0: mflops = 94.9922 (norm. = 1), norm. avg. (of 18) = 0.998431 fft 1: mflops = 47.794 (norm. = 0.503136), norm. avg. 
(of 18) = 0.309039 fft 2: mflops = 35.1223 (norm. = 0.369739), norm. avg. (of 18) = 0.223521 fft 3: mflops = 22.8422 (norm. = 0.240464), norm. avg. (of 18) = 0.429947 fft 4: mflops = 22.6683 (norm. = 0.238633), norm. avg. (of 18) = 0.429886 fft 5: mflops = -1 (norm. = -0.0105272), norm. avg. (of 10) = 0.730515 fft 6: mflops = -1 (norm. = -0.0105272), norm. avg. (of 10) = 0.511151 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=1.96119 s, 2 iters, t-(init.)=1.81352 s t(norm)=0.0518805, mflops=96.3754 (err=7.7e-16) 1. PDA: elapsed time t=1.59664 s, 1 iters, t-(init.)=1.52369 s t(norm)=0.0871782, mflops=57.3538 (err=6.4e-16) 2. PDA (f2c): elapsed time t=2.17208 s, 1 iters, t-(init.)=2.09872 s t(norm)=0.120079, mflops=41.6393 (err=6.4e-16) 3. Singleton: elapsed time t=4.95824 s, 1 iters, t-(init.)=4.88858 s t(norm)=0.279702, mflops=17.8762 (err=7.0e-16) 4. Singleton (f2c): elapsed time t=4.98641 s, 1 iters, t-(init.)=4.91672 s t(norm)=0.281311, mflops=17.7739 (err=7.0e-16) 5. Temperton: elapsed time t=1.28447 s, 1 iters, t-(init.)=1.2117 s t(norm)=0.0693275, mflops=72.1215 (err=1.6e-07) 6. Temperton (f2c): elapsed time t=1.57433 s, 1 iters, t-(init.)=1.50158 s t(norm)=0.0859134, mflops=58.1982 (err=7.5e-16) Top mflops for N=884736 = 96.3754 Normalized results and averages for N=884736: fft 0: mflops = 96.3754 (norm. = 1), norm. avg. (of 19) = 0.998513 fft 1: mflops = 57.3538 (norm. = 0.595108), norm. avg. (of 19) = 0.324095 fft 2: mflops = 41.6393 (norm. = 0.432053), norm. avg. (of 19) = 0.234497 fft 3: mflops = 17.8762 (norm. = 0.185485), norm. avg. (of 19) = 0.417081 fft 4: mflops = 17.7739 (norm. = 0.184424), norm. avg. (of 19) = 0.416967 fft 5: mflops = 72.1215 (norm. = 0.748339), norm. avg. (of 11) = 0.732136 fft 6: mflops = 58.1982 (norm. = 0.60387), norm. avg. (of 11) = 0.51958 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=1.32729 s, 1 iters, t-(init.)=1.23073 s t(norm)=0.052781, mflops=94.7311 (err=7.4e-16) 1. 
PDA: elapsed time t=2.42172 s, 1 iters, t-(init.)=2.32578 s t(norm)=0.099743, mflops=50.1288 (err=7.1e-16) 2. PDA (f2c): elapsed time t=3.42585 s, 1 iters, t-(init.)=3.32651 s t(norm)=0.14266, mflops=35.0483 (err=7.1e-16) 3. Singleton: elapsed time t=4.84778 s, 1 iters, t-(init.)=4.75304 s t(norm)=0.203838, mflops=24.5293 (err=8.0e-16) 4. Singleton (f2c): elapsed time t=4.92812 s, 1 iters, t-(init.)=4.83338 s t(norm)=0.207283, mflops=24.1216 (err=8.0e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 94.7311 Normalized results and averages for N=1157625: fft 0: mflops = 94.7311 (norm. = 1), norm. avg. (of 20) = 0.998588 fft 1: mflops = 50.1288 (norm. = 0.529169), norm. avg. (of 20) = 0.334349 fft 2: mflops = 35.0483 (norm. = 0.369977), norm. avg. (of 20) = 0.241271 fft 3: mflops = 24.5293 (norm. = 0.258936), norm. avg. (of 20) = 0.409174 fft 4: mflops = 24.1216 (norm. = 0.254632), norm. avg. (of 20) = 0.40885 fft 5: mflops = -1 (norm. = -0.0105562), norm. avg. (of 11) = 0.732136 fft 6: mflops = -1 (norm. = -0.0105562), norm. avg. (of 11) = 0.51958 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=1.70873 s, 1 iters, t-(init.)=1.59103 s t(norm)=0.0554528, mflops=90.1668 (err=5.4e-16) 1. PDA: elapsed time t=2.9312 s, 1 iters, t-(init.)=2.8142 s t(norm)=0.0980846, mflops=50.9764 (err=5.4e-16) 2. PDA (f2c): elapsed time t=3.92549 s, 1 iters, t-(init.)=3.80842 s t(norm)=0.132737, mflops=37.6685 (err=5.4e-16) 3. Singleton: elapsed time t=6.31401 s, 1 iters, t-(init.)=6.19622 s t(norm)=0.21596, mflops=23.1524 (err=6.4e-16) 4. Singleton (f2c): elapsed time t=6.30609 s, 1 iters, t-(init.)=6.18789 s t(norm)=0.21567, mflops=23.1836 (err=6.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 
Top mflops for N=1404928 = 90.1668 Normalized results and averages for N=1404928: fft 0: mflops = 90.1668 (norm. = 1), norm. avg. (of 21) = 0.998655 fft 1: mflops = 50.9764 (norm. = 0.565357), norm. avg. (of 21) = 0.34535 fft 2: mflops = 37.6685 (norm. = 0.417765), norm. avg. (of 21) = 0.249675 fft 3: mflops = 23.1524 (norm. = 0.256773), norm. avg. (of 21) = 0.401916 fft 4: mflops = 23.1836 (norm. = 0.257119), norm. avg. (of 21) = 0.401625 fft 5: mflops = -1 (norm. = -0.0110906), norm. avg. (of 11) = 0.732136 fft 6: mflops = -1 (norm. = -0.0110906), norm. avg. (of 11) = 0.51958 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=1.9489 s, 1 iters, t-(init.)=1.80458 s t(norm)=0.0503998, mflops=99.2068 (err=7.3e-16) 1. PDA: elapsed time t=3.07108 s, 1 iters, t-(init.)=2.92429 s t(norm)=0.081672, mflops=61.2205 (err=7.8e-16) 2. PDA (f2c): elapsed time t=4.34773 s, 1 iters, t-(init.)=4.20287 s t(norm)=0.117381, mflops=42.5963 (err=7.8e-16) 3. Singleton: elapsed time t=10.8355 s, 1 iters, t-(init.)=10.6919 s t(norm)=0.298611, mflops=16.7442 (err=9.4e-16) 4. Singleton (f2c): elapsed time t=10.8299 s, 1 iters, t-(init.)=10.6863 s t(norm)=0.298456, mflops=16.7529 (err=9.4e-16) 5. Temperton: elapsed time t=2.4858 s, 1 iters, t-(init.)=2.34154 s t(norm)=0.0653963, mflops=76.4569 (err=1.1e-08) 6. Temperton (f2c): elapsed time t=2.86513 s, 1 iters, t-(init.)=2.72057 s t(norm)=0.0759824, mflops=65.8047 (err=6.9e-16) Top mflops for N=1728000 = 99.2068 Normalized results and averages for N=1728000: fft 0: mflops = 99.2068 (norm. = 1), norm. avg. (of 22) = 0.998716 fft 1: mflops = 61.2205 (norm. = 0.6171), norm. avg. (of 22) = 0.357702 fft 2: mflops = 42.5963 (norm. = 0.429369), norm. avg. (of 22) = 0.257843 fft 3: mflops = 16.7442 (norm. = 0.168781), norm. avg. (of 22) = 0.391319 fft 4: mflops = 16.7529 (norm. = 0.168868), norm. avg. (of 22) = 0.391045 fft 5: mflops = 76.4569 (norm. = 0.770683), norm. avg. (of 12) = 0.735348 fft 6: mflops = 65.8047 (norm. 
= 0.663309), norm. avg. (of 12) = 0.531557 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=3.53208 s, 1 iters, t-(init.)=3.2789 s t(norm)=0.0510511, mflops=97.9412 (err=1.2e-15) 1. PDA: elapsed time t=5.99522 s, 1 iters, t-(init.)=5.74521 s t(norm)=0.0894504, mflops=55.8969 (err=1.2e-15) 2. PDA (f2c): elapsed time t=8.03869 s, 1 iters, t-(init.)=7.78818 s t(norm)=0.121259, mflops=41.2342 (err=1.2e-15) 3. Singleton: elapsed time t=18.4904 s, 1 iters, t-(init.)=18.2395 s t(norm)=0.283982, mflops=17.6068 (err=1.6e-15) 4. Singleton (f2c): elapsed time t=18.5354 s, 1 iters, t-(init.)=18.2849 s t(norm)=0.284688, mflops=17.5631 (err=1.6e-15) 5. Temperton: elapsed time t=4.94471 s, 1 iters, t-(init.)=4.69436 s t(norm)=0.0730891, mflops=68.4096 (err=1.8e-07) 6. Temperton (f2c): elapsed time t=5.48235 s, 1 iters, t-(init.)=5.23228 s t(norm)=0.0814643, mflops=61.3766 (err=1.2e-15) Top mflops for N=2985984 = 97.9412 Normalized results and averages for N=2985984: fft 0: mflops = 97.9412 (norm. = 1), norm. avg. (of 23) = 0.998772 fft 1: mflops = 55.8969 (norm. = 0.570719), norm. avg. (of 23) = 0.366963 fft 2: mflops = 41.2342 (norm. = 0.42101), norm. avg. (of 23) = 0.264937 fft 3: mflops = 17.6068 (norm. = 0.179769), norm. avg. (of 23) = 0.382122 fft 4: mflops = 17.5631 (norm. = 0.179323), norm. avg. (of 23) = 0.38184 fft 5: mflops = 68.4096 (norm. = 0.698477), norm. avg. (of 13) = 0.732512 fft 6: mflops = 61.3766 (norm. = 0.626668), norm. avg. 
(of 13) = 0.538874 ------------------------------------------------------ @@@@ bench.1d.p2.dat N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Ooura (C), Ooura (F), Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg, SUNPERF 2, 59.4643, 53.0069, 18.3394, 1.50441, 6.93182, 6.70719, 8.48527, 9.07739, 45.391, 3.46545, 3.42099, , 15.3107, 11.8935, 43.4972, 37.8569, 115.973, , 9.44082, 5.18249, 5.37191, 43.5196, , , , , 2.69526, 27.1826, 32.1264, , , 4.20814, 4.26375, 21.1505, 54.6837, 4.09651, 3.52331, 6.80593, 10.3919 4, 71.5117, 67.7135, 24.7378, 7.64183, 12.243, 9.1995, 27.2308, 21.4167, 23.8189, 13.2753, 8.6811, 21.0367, 52.0847, 34.9123, 137.326, 141.367, 227.157, , 29.287, 10.6706, 10.9178, 117.271, 39.777, 37.819, 34.8165, 8.79331, 5.72792, 72.8005, 75.1802, 4.39655, 54.4779, 17.6219, 18.1719, 34.5136, 24.0356, 13.2565, 12.0359, 6.86996, 38.3547 8, 160.655, 148.672, 31.1492, 8.53128, 22.3146, 8.58074, 34.3485, 27.9881, 19.9769, 34.0465, 25.8987, 17.9573, 76.075, 53.7703, 290.603, 307.104, 356.87, 101.262, 43.1262, 18.9247, 19.2675, 169.76, 64.8867, 66.4441, 62.2479, 21.7449, 11.1744, 103.5, 103.305, 4.92673, 76.7513, 17.9812, 18.0653, 63.8416, 19.8811, 27.0815, 23.0695, 6.95685, 59.5025 16, 59.9291, 61.4723, 35.0201, 17.119, 33.8396, 9.04765, 50.811, 41.8119, 18.9707, 65.1263, 51.7134, 17.5069, 127.357, 87.512, 436.168, 418.268, 461.146, 120.823, 68.9434, 30.6256, 31.8857, 187.156, 60.7795, 71.6869, 69.6201, 36.3077, 18.4503, 125.162, 122.712, 18.604, 98.9953, 55.1868, 56.0364, 84.473, 18.8479, 43.6411, 40.9936, 7.16007, 118.752 32, 72.2864, 75.3454, 40.3135, 18.5743, 55.8425, 8.89202, 72.1835, 54.0259, 20.3603, 84.112, 106.202, 18.8239, 147.614, 
93.5092, 344.115, 353.233, 379.92, 156.923, 58.0225, 43.1968, 48.7468, 185.219, 74.7637, 88.6286, 92.3143, 61.6549, 26.2879, 145.547, 140.968, 17.0899, 116.604, 78.6424, 78.817, 126.324, 20.3272, 53.2949, 43.0509, 7.32814, 144.622 64, 68.3005, 73.2942, 45.0906, 26.4863, 79.9983, 8.71942, 81.7051, 68.2398, 22.1849, 107.083, 159.926, 20.7962, 175.798, 111.997, 379.506, 232.167, 145.875, 211.028, 75.3588, 54.8975, 61.0628, 153.703, 75.9883, 95.0358, 98.7159, 81.7152, 33.6615, 160.365, 154.33, 38.4235, 128.209, 116.839, 121.175, 153.297, 22.313, 82.1334, 60.2449, 7.31543, 200.54 128, 77.5101, 83.3189, 50.1531, 25.7771, 109.293, 8.55579, 68.9295, 75.7967, 24.881, 138.695, 224.02, 23.0527, 213.513, 127.706, 294.347, 253.071, 205.322, 201.615, 86.1862, 65.6308, 76.9547, 94.5378, 86.4818, 110.942, 115.39, 102.695, 36.3458, 172.939, 163.08, 33.8968, 133.91, 113.476, 119.697, 189.01, 24.6239, 81.1789, 60.405, 7.28342, 242.553 256, 80.4604, 88.246, 55.3504, 29.3386, 114.89, 8.40403, 71.3196, 87.9604, 27.4197, 150.853, 207.476, 25.4097, 225.591, 138.201, 330.749, 253.499, 196.964, 182.598, 96.6367, 71.9912, 83.0062, 121.121, 90.551, 114.384, 114.671, 115.127, 39.7806, 178.677, 174.906, 56.4042, 110.674, 159.223, 157.187, 201.029, 27.2326, 99.1298, 69.1446, 7.32826, 251.85 512, 87.2055, 92.2887, 58.2451, 30.9473, 116.224, 8.26537, 73.3754, 90.294, 29.8197, 160.194, 189.453, 28.078, 174.006, 104.049, 268.093, 268.252, 172.153, 169.545, 75.6771, 80.992, 90.0054, 105.682, 98.3134, 130.365, 122.842, 124.003, 37.1979, 187.87, 179.106, 50.7779, 112.852, 90.2852, 166.914, 219.813, 29.8644, 91.8948, 60.0841, 7.3351, 168.826 1024, 87.9311, 97.932, 61.9614, 35.4065, 106.376, 8.05571, 65.8279, 99.4119, 31.8867, 158.178, 158.209, 29.913, 159.377, 95.5478, 244.776, 236.415, 136.514, 177.717, 77.0073, 84.4853, 98.0889, 73.1799, 102.701, 133.927, 128.271, 133.983, 36.6456, 203.815, 198.818, 74.6224, 102.79, 124.443, 187.092, 233.659, 31.8421, 105.574, 69.0392, 7.3001, 149.965 2048, 78.1368, 
82.0408, 52.3435, 33.2132, 85.0373, 7.88798, 63.2015, 87.3005, 30.6692, 125.26, 148.305, 28.5303, 168.228, 92.4882, 242.78, 237.546, 134.022, 148.435, 67.6973, 61.7166, 73.3754, 68.3559, 98.128, 123.621, 119.571, 100.508, 30.2635, 143.225, 137.66, 54.9022, 98.5557, 109.635, 119.823, 141.327, 27.0885, 88.6554, 63.7946, 7.10771, 143.963 4096, 80.2041, 82.8946, 54.8102, 36.044, 89.6642, 7.83373, 65.1579, 94.6331, 32.9052, 124.88, 141.246, 30.4735, 158.8, 96.0424, 233.583, 244.608, 128.265, 124.625, 72.5735, 61.4223, 73.3118, 65.7562, 78.4868, 92.6309, 91.4749, 102.875, 31.2326, 161.223, 153.91, 75.8517, 98.5488, 138.095, 144.227, 142.687, 29.8307, 98.5785, 67.0342, 7.04307, 143.004 8192, 83.4094, 88.842, 56.2917, 36.605, 96.8684, 7.78357, 60.9625, 95.0501, 34.8646, 133.61, 152.918, 31.9481, 156.153, 85.0664, 211.543, 256.032, 126.532, 134.873, 67.9429, 61.4727, 74.3485, , 80.7191, 93.6758, 93.4401, 108.178, 30.3867, 144.617, 144.064, 62.753, 90.4399, 138.353, 133.865, 140.425, 31.8566, 90.9193, 61.3964, 7.01151, 132.124 16384, 81.418, 86.4737, 57.1004, 40.8597, 85.5569, 7.74342, 61.7864, 100.58, 36.8337, 137.213, 137.276, 33.0751, 134.458, 83.8263, 186.781, 218.522, 112.709, 138.698, 69.5043, 60.5391, 72.7293, , 78.9414, 93.901, 94.1584, 110.356, 30.2486, 161.229, 156.975, 84.131, 61.1452, 149.225, 141.012, 140.983, 33.6543, 98.8206, 64.9137, 6.86412, 126.319 32768, 62.205, 66.1591, 46.6437, 37.1081, 63.8246, 7.60502, 54.42, 75.9087, 33.0514, 128.028, 128.29, 29.7151, 100.611, 67.4587, 155.097, 169.048, 90.695, 104.106, 58.5366, 45.7191, 53.4383, , 81.3455, 93.7215, 89.4457, 75.9242, 26.9666, 133.254, 130.884, 66.9826, 30.4108, 84.6525, 81.4384, 114.218, 30.5386, 65.3724, 50.2838, 6.67613, 92.2434 65536, 35.733, 36.3518, 26.0374, 34.6841, 35.0269, 7.45695, 42.1656, 50.0411, 21.1284, 111.076, 111.03, 20.2448, 75.592, 58.2655, 124.653, 122.246, 73.55, 73.563, 52.2364, 30.4517, 32.7848, , 65.9586, 72.7276, 66.6367, 39.1838, 21.7733, 107.268, 104.779, 59.9863, 19.5643, 
64.7765, 63.9193, 62.0027, 20.2122, 50.8151, 40.8449, 6.23155, 73.3151 131072, 25.5117, 25.941, 18.8191, 27.7832, 29.6221, 7.28702, 35.2562, 36.3093, 16.3723, 100.974, 101.026, 15.7563, 55.3645, 43.9279, 105.268, 112.059, 60.5368, 54.4818, 42.5713, 22.8576, 23.9535, , 39.7508, 41.5508, 39.6129, 26.3578, 18.3395, 84.0226, 82.731, 48.3016, 14.2181, 42.2244, 41.8559, 42.2845, 15.6517, 36.157, 30.4003, 5.84051, 53.702 262144, 20.398, 20.5542, 15.1082, 28.5222, 25.1421, 7.0688, 31.7862, 29.9797, 13.5666, 68.7012, 68.6041, 13.2918, 50.5854, 41.8848, 99.2557, 96.9208, 47.976, 47.6697, 41.0075, 18.6325, 19.2916, , 28.7713, 29.7562, 28.4028, 21.1762, 17.2877, 73.1409, 71.9086, 57.1619, 11.6695, 36.6055, 36.7089, 32.4863, 12.9938, 32.6125, 28.182, 5.55222, 49.5007 Norm. Avg., 0.300264, 0.307311, 0.177166, 0.128321, 0.269821, 0.0376658, 0.22998, 0.27691, 0.129314, 0.467616, 0.519469, 0.105777, 0.510126, 0.329877, 0.896539, 0.876973, 0.672968, 0.535299, 0.262835, 0.19367, 0.220432, 0.383023, 0.311524, 0.367035, 0.356948, 0.308042, 0.110626, 0.570552, 0.558826, 0.238254, 0.292216, 0.362005, 0.392962, 0.482975, 0.128832, 0.277299, 0.204627, 0.0322773, 0.489043 ------------------------------------------------------ @@@@ bench.1d.np2.dat N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg, SUNPERF 6, , 20.8338, 15.6491, 50.5877, 37.7738, 178.894, 191.383, 24.1197, 37.8346, 10.0574, 12.7136, 11.8307, 17.0357, 13.2784, 6.72105, 39.5518 9, 7.78204, 35.7895, 29.0911, 77.8988, 50.9077, 219.61, 233.781, 21.514, 38.6984, 17.6916, 23.2887, 22.4496, 30.0447, 26.1669, 6.5435, 57.7488 12, , 51.2209, 43.8689, 112.741, 68.7973, 114.219, 355.703, 34.4044, 53.9865, 17.3416, 25.7949, 24.1919, 37.626, 30.3866, 6.89332, 88.3622 15, 11.3183, 59.5956, 59.4153, 117.752, 71.7933, 124.733, 292.052, 25.6997, 45.0722, 14.7494, 33.0114, 29.9341, 51.6138, 31.78, 5.77212, 93.9344 
18, 10.2787, 52.7244, 47.4346, 99.2051, 59.6062, 146.244, 142.831, 24.162, 63.8127, 23.9755, 37.1399, 36.4857, 44.372, 34.4959, 6.87974, 82.0239 24, , 74.722, 69.2963, 131.107, 76.4598, 196.093, 212.93, 44.329, 82.1664, 25.7665, 38.079, 36.0728, 49.939, 43.3446, 7.09202, 111.6 36, 13.1686, 83.5369, 83.5882, 158.636, 85.046, 271.665, 261.321, 30.7541, 90.0882, 34.1256, 64.5797, 64.2637, 78.1891, 56.6945, 6.90141, 136.374 80, 23.4222, 110.629, 168.955, 183.594, 108.979, 252.19, 230.604, 48.2929, 69.9123, 22.4038, 109.644, 110.614, 95.4381, 62.9561, 6.46209, 201.08 108, 14.1278, 106.834, 117.635, 203.929, 92.4193, 260.018, 263.146, 28.2219, 78.4433, 43.0269, 71.9752, 78.4038, 94.1703, 66.6512, 6.81475, 169.907 210, 19.5861, 224.177, 224.245, 134.349, 65.5789, 187.139, 153.991, 26.5365, 62.0409, 19.4742, 68.113, 62.2352, , , 5.33239, 99.923 504, 21.4802, 216.311, 216.026, 139.907, 65.7903, 190.537, 175.637, 33.7045, 70.8421, 25.0252, 85.7206, 83.325, , , 5.88544, 101.371 1000, 21.37, 133.277, 204.279, 156.809, 87.6674, 166.99, 195.331, 33.0118, 66.8563, 19.5232, 118.62, 122.292, 125.401, 77.4168, 5.75935, 129.584 1960, 23.1243, 164.632, 164.588, 116.579, 49.6894, 147.639, 145.629, 30.9531, 63.7486, 16.0772, 77.7496, 74.548, , , 5.04606, 67.9167 4725, 17.9299, 128.425, 177.238, 140.813, 63.9539, 167.475, 165.771, 22.5045, 65.4561, 18.6405, 73.2831, 70.4623, , , 5.32902, 90.5807 10368, 20.6698, 155.137, 148.501, 174.9, 83.7031, 192.03, 202.182, 32.3901, 86.5467, 34.4645, 92.379, 86.6232, 101.431, 71.8617, 6.83181, 134.027 27000, 16.4088, 169.152, 169.086, 123.212, 73.2703, 149.724, 135.863, 24.4226, 75.1607, 20.1682, 82.3415, 79.9443, 102.357, 69.917, 5.80958, 104.68 75600, 12.9388, 117.262, 116.8, 54.7155, 40.5257, 94.096, 94.0679, 20.2961, 51.9604, 16.6673, 45.0469, 44.4921, , , 5.06208, 48.5324 165375, 9.555, 97.8803, 97.6829, 28.5965, 23.0887, 76.9285, 74.1819, 14.5051, 44.6658, 13.1334, 31.5972, 31.3287, , , 4.40555, 25.5376 362880, 9.90524, 43.6888, 43.6055, 
35.9144, 29.7857, 79.6156, 77.3493, 17.841, 45.1437, 15.2504, 25.4765, 25.0582, , , 4.99578, 33.3929
Norm. Avg., 0.0876161, 0.587705, 0.625353, 0.579902, 0.326768, 0.859433, 0.919082, 0.14678, 0.338175, 0.1131, 0.308816, 0.302496, 0.315231, 0.222863, 0.032552, 0.46852
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM, HARM (f2c), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
4x4x4, 251.756, , , 28.2959, 21.4525, 125.858, 120.753, 78.5019, 49.3502
8x8x8, 409.365, 135.386, 108.012, 53.9203, 38.9061, 104.905, 105.092, 205.052, 141.51
16x16x16, 349.54, 136.762, 122.694, 93.692, 61.6506, 116.093, 108.061, 187.32, 138.163
32x32x32, 298.709, 122.57, 105.785, 56.5819, 39.2343, 89.3425, 82.9271, 162.119, 102.182
64x64x64, 129.804, 72.3272, 69.1114, 51.6246, 40.7509, 26.4373, 26.6434, 92.9993, 79.4326
256x64x32, 80.7766, 58.7565, 56.1423, 45.6855, 37.3209, 20.7573, 20.8805, 93.2206, 77.9504
16x1024x64, 118.051, 60.5453, 58.3724, 75.8525, 55.3506, 24.5892, 24.8699, ,
128x128x128, 90.4328, 56.3273, 53.4897, 43.053, 35.8105, 15.73, 15.533, 52.622, 48.3056
Norm. Avg., 0.983314, 0.49365, 0.455663, 0.338498, 0.258391, 0.274498, 0.266852, 0.598529, 0.465907
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
5x5x5, 170.66, 35.2619, 22.4823, 145.198, 170.77, 122.178, 80.2126
6x6x6, 275.22, 36.857, 26.7706, 92.8125, 91.712, 121.184, 76.2066
7x7x7, 194.92, 13.023, 13.9393, 98.336, 94.7258, ,
9x9x9, 282.63, 59.7402, 39.5081, 108.705, 112.123, 210.584, 157.854
10x10x10, 298.094, 60.7059, 36.4531, 124.75, 124.704, 228.119, 122.251
11x11x11, 112.134, 16.1696, 14.5356, 97.8187, 93.3189, ,
12x12x12, 335.08, 81.535, 51.4106, 92.777, 90.9737, 199.769, 100.262
13x13x13, 105.018, 16.4217, 14.2503, 91.4733, 88.9073, ,
14x14x14, 179.728, 27.7701, 25.577, 78.0192, 71.238, ,
15x15x15, 260.981, 87.3404, 50.6404, 92.2804, 90.1878, 183.21, 121.568
24x25x28, 146.309, 73.9649, 51.7322, 84.4964, 82.0191, ,
48x48x48, 182.396, 61.4126, 43.6565, 45.5678, 45.0175, 120.679, 97.2039
49x49x49, 117.538, 33.0337, 25.8317, 42.6997, 42.264, ,
60x60x60, 136.334, 60.7408, 42.2109, 30.456, 30.3, 100.174, 70.7297
72x60x56, 109.747, 53.8147, 39.4214, 26.3951, 26.6237, ,
75x75x75, 105.892, 57.704, 41.9676, 27.208, 27.1491, 108.898, 88.4693
80x80x80, 95.2557, 58.6015, 44.6874, 27.7375, 27.7864, 89.9641, 73.0727
84x84x84, 94.9922, 47.794, 35.1223, 22.8422, 22.6683, ,
96x96x96, 96.3754, 57.3538, 41.6393, 17.8762, 17.7739, 72.1215, 58.1982
105x105x105, 94.7311, 50.1288, 35.0483, 24.5293, 24.1216, ,
112x112x112, 90.1668, 50.9764, 37.6685, 23.1524, 23.1836, ,
120x120x120, 99.2068, 61.2205, 42.5963, 16.7442, 16.7529, 76.4569, 65.8047
144x144x144, 97.9412, 55.8969, 41.2342, 17.6068, 17.5631, 68.4096, 61.3766
Norm. Avg., 0.998772, 0.366963, 0.264937, 0.382122, 0.38184, 0.732512, 0.538874
@@@@ end