To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Steven G. Johnson
@ submitter email = stevenj@alum.mit.edu
@ submitter organization = MIT
@ computer manufacturer =
@ computer model =
@ CPU manufacturer = Intel
@ CPU model = Pentium Pro
@ CPU speed = 200 MHz
@ RAM = 256 MB
@ L2 cache size =
@ operating system = Red Hat Linux 4.1 (kernel 2.0.27)
@ C compiler = gcc 2.7.2.1
@ C compiler flags = -pedantic -ansi -O6 -fomit-frame-pointer -Wall
@ Fortran compiler = g77 0.5.21
@ Fortran compiler flags = -O6 -fomit-frame-pointer
@ remarks = 4-processor machine (only one processor benchmarked)
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
2 (0.000228882 MB)
4 (0.000534058 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)
262144 (25.4987 MB)
524288 (50.9973 MB)
1048576 (96 MB)
Maximum array size = 1048576
Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Bailey
5. Beauregard
6. Bergland
7. Brenner
8. Burrus
9. CWP (min N)
10. CWP (best N)
11. Edelblute
12. FFTPACK
13. FFTPACK (f2c)
14. FFTW
15. FFTW_ESTIMATE
16. Frigo-old
17. Green
18. GSL
19. GSL DIT
20. GSL DIF
21. Krukar
22. Mayer (Buneman)
23. Mayer (simple)
24. Mayer (lookup)
25. Monro
26. NAPACK (f2c)
27. Nielsen
28. NR (C)
29. NR (F)
30. Ooura (C)
31. Ooura (F)
32. QFT
33. Ransom
34. SCIPORT
35. Singleton
36. Singleton (f2c)
37. Sorensen
38. Sorensen DIT
39. Temperton
40. Temperton (f2c)
41. Valkenburg
Computing normalized averages (42 transforms).
Benchmarking for array size = 2 (power of 2): 0.
Arndt DIF: elapsed time t=1.95 s, 4194304 iters, t-(init.)=1.49 s t(norm)=0.177622, mflops=28.1497 (err=5.6e-17) 1. Arndt DIT: elapsed time t=1.93 s, 4194304 iters, t-(init.)=1.44 s t(norm)=0.171661, mflops=29.1271 (err=5.6e-17) 2. Arndt Split-Radix: elapsed time t=1.37 s, 2097152 iters, t-(init.)=1.12 s t(norm)=0.267029, mflops=18.7246 (err=5.6e-17) 3. Arndt 4-step: elapsed time t=1 s, 131072 iters, t-(init.)=0.98 s t(norm)=3.7384, mflops=1.33747 (err=5.6e-17) 4. Bailey: elapsed time t=1.04 s, 524288 iters, t-(init.)=0.99 s t(norm)=0.944138, mflops=5.29584 (err=5.6e-17) 5. Beauregard: elapsed time t=1.42 s, 524288 iters, t-(init.)=1.36 s t(norm)=1.297, mflops=3.85506 (err=8.4e-17) 6. Bergland: elapsed time t=1.13 s, 524288 iters, t-(init.)=1.08 s t(norm)=1.02997, mflops=4.85452 (err=8.4e-17) 7. Brenner: elapsed time t=1.03 s, 524288 iters, t-(init.)=0.97 s t(norm)=0.925064, mflops=5.40503 (err=8.4e-17) 8. Burrus: elapsed time t=1.43 s, 2097152 iters, t-(init.)=1.2 s t(norm)=0.286102, mflops=17.4763 (err=5.6e-17) 9. CWP (min N): elapsed time t=1.45 s, 524288 iters, t-(init.)=1.39 s t(norm)=1.32561, mflops=3.77186 10. CWP (best N) (N=3): elapsed time t=1.63 s, 524288 iters, t-(init.)=1.56 s t(norm)=1.48773, mflops=3.36082 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.13 s, 1048576 iters, t-(init.)=1.01 s t(norm)=0.481606, mflops=10.3819 (err=8.4e-17) 13. FFTPACK (f2c): elapsed time t=1.36 s, 1048576 iters, t-(init.)=1.24 s t(norm)=0.591278, mflops=8.45626 (err=8.4e-17) FFTW_MEASURE plan: (cost = 3.528595e-07) FFTW_NOTW 2 14. FFTW: elapsed time t=1.65 s, 4194304 iters, t-(init.)=1.19 s t(norm)=0.141859, mflops=35.2463 (err=8.4e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 15. FFTW_ESTIMATE: elapsed time t=1.64 s, 4194304 iters, t-(init.)=1.15 s t(norm)=0.137091, mflops=36.4722 (err=8.4e-17) 16. Frigo-old: elapsed time t=1.32 s, 4194304 iters, t-(init.)=0.87 s t(norm)=0.103712, mflops=48.2104 (err=8.4e-17) 17. 
Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.11 s, 524288 iters, t-(init.)=0.91 s t(norm)=0.867844, mflops=5.76141 (err=8.4e-17) 19. GSL DIT: elapsed time t=1.12 s, 524288 iters, t-(init.)=1.06 s t(norm)=1.01089, mflops=4.94611 (err=8.4e-17) 20. GSL DIF: elapsed time t=1.32 s, 524288 iters, t-(init.)=1.26 s t(norm)=1.20163, mflops=4.16102 (err=8.4e-17) 21. Krukar: elapsed time t=1.14 s, 2097152 iters, t-(init.)=0.91 s t(norm)=0.216961, mflops=23.0456 (err=8.4e-17) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.18 s, 262144 iters, t-(init.)=1.15 s t(norm)=2.19345, mflops=2.27951 (err=8.3e-17) 27. Nielsen: elapsed time t=1.64 s, 262144 iters, t-(init.)=1.61 s t(norm)=3.07083, mflops=1.62822 (err=5.6e-17) 28. NR (C): elapsed time t=1.05 s, 524288 iters, t-(init.)=0.99 s t(norm)=0.944138, mflops=5.29584 (err=8.4e-17) 29. NR (F): elapsed time t=1.15 s, 524288 iters, t-(init.)=1.09 s t(norm)=1.03951, mflops=4.80998 (err=8.4e-17) 30. Ooura (C): elapsed time t=1.7 s, 4194304 iters, t-(init.)=1.26 s t(norm)=0.150204, mflops=33.2881 (err=8.4e-17) 31. Ooura (F): elapsed time t=1.88 s, 4194304 iters, t-(init.)=1.41 s t(norm)=0.168085, mflops=29.7468 (err=8.4e-17) 32. Skipping fft (QFT requires N >= 16). 33. Skipping fft (Ransom doesn't work for N=2). 34. Skipping fft (SCIPORT can't handle N < 4). 35. Singleton: elapsed time t=1.22 s, 524288 iters, t-(init.)=1.16 s t(norm)=1.10626, mflops=4.51972 (err=8.4e-17) 36. Singleton (f2c): elapsed time t=1.25 s, 524288 iters, t-(init.)=1.19 s t(norm)=1.13487, mflops=4.40578 (err=8.4e-17) 37. Sorensen: elapsed time t=1.59 s, 2097152 iters, t-(init.)=1.35 s t(norm)=0.321865, mflops=15.5345 (err=5.6e-17) 38. Sorensen DIT: elapsed time t=1.57 s, 2097152 iters, t-(init.)=1.34 s t(norm)=0.319481, mflops=15.6504 (err=5.6e-17) 39. 
Temperton: elapsed time t=1.66 s, 524288 iters, t-(init.)=1.6 s t(norm)=1.52588, mflops=3.2768 (err=8.4e-17) 40. Temperton (f2c): elapsed time t=1.7 s, 524288 iters, t-(init.)=1.64 s t(norm)=1.56403, mflops=3.19688 (err=8.4e-17) 41. Valkenburg: elapsed time t=1.18 s, 524288 iters, t-(init.)=1.13 s t(norm)=1.07765, mflops=4.63972 (err=8.3e-17) Top mflops for N=2 = 48.2104 Normalized results and averages for N=2: fft 0: mflops = 28.1497 (norm. = 0.583893), norm. avg. (of 1) = 0.583893 fft 1: mflops = 29.1271 (norm. = 0.604167), norm. avg. (of 1) = 0.604167 fft 2: mflops = 18.7246 (norm. = 0.388393), norm. avg. (of 1) = 0.388393 fft 3: mflops = 1.33747 (norm. = 0.0277423), norm. avg. (of 1) = 0.0277423 fft 4: mflops = 5.29584 (norm. = 0.109848), norm. avg. (of 1) = 0.109848 fft 5: mflops = 3.85506 (norm. = 0.0799632), norm. avg. (of 1) = 0.0799632 fft 6: mflops = 4.85452 (norm. = 0.100694), norm. avg. (of 1) = 0.100694 fft 7: mflops = 5.40503 (norm. = 0.112113), norm. avg. (of 1) = 0.112113 fft 8: mflops = 17.4763 (norm. = 0.3625), norm. avg. (of 1) = 0.3625 fft 9: mflops = 3.77186 (norm. = 0.0782374), norm. avg. (of 1) = 0.0782374 fft 10: mflops = 3.36082 (norm. = 0.0697115), norm. avg. (of 1) = 0.0697115 fft 11: mflops = -1 (norm. = -0.0207424), norm. avg. (of 0) = -1 fft 12: mflops = 10.3819 (norm. = 0.215347), norm. avg. (of 1) = 0.215347 fft 13: mflops = 8.45626 (norm. = 0.175403), norm. avg. (of 1) = 0.175403 fft 14: mflops = 35.2463 (norm. = 0.731092), norm. avg. (of 1) = 0.731092 fft 15: mflops = 36.4722 (norm. = 0.756522), norm. avg. (of 1) = 0.756522 fft 16: mflops = 48.2104 (norm. = 1), norm. avg. (of 1) = 1 fft 17: mflops = -1 (norm. = -0.0207424), norm. avg. (of 0) = -1 fft 18: mflops = 5.76141 (norm. = 0.119505), norm. avg. (of 1) = 0.119505 fft 19: mflops = 4.94611 (norm. = 0.102594), norm. avg. (of 1) = 0.102594 fft 20: mflops = 4.16102 (norm. = 0.0863095), norm. avg. (of 1) = 0.0863095 fft 21: mflops = 23.0456 (norm. = 0.478022), norm. avg. 
(of 1) = 0.478022 fft 22: mflops = -1 (norm. = -0.0207424), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.0207424), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.0207424), norm. avg. (of 0) = -1 fft 25: mflops = -1 (norm. = -0.0207424), norm. avg. (of 0) = -1 fft 26: mflops = 2.27951 (norm. = 0.0472826), norm. avg. (of 1) = 0.0472826 fft 27: mflops = 1.62822 (norm. = 0.0337733), norm. avg. (of 1) = 0.0337733 fft 28: mflops = 5.29584 (norm. = 0.109848), norm. avg. (of 1) = 0.109848 fft 29: mflops = 4.80998 (norm. = 0.0997706), norm. avg. (of 1) = 0.0997706 fft 30: mflops = 33.2881 (norm. = 0.690476), norm. avg. (of 1) = 0.690476 fft 31: mflops = 29.7468 (norm. = 0.617021), norm. avg. (of 1) = 0.617021 fft 32: mflops = -1 (norm. = -0.0207424), norm. avg. (of 0) = -1 fft 33: mflops = -1 (norm. = -0.0207424), norm. avg. (of 0) = -1 fft 34: mflops = -1 (norm. = -0.0207424), norm. avg. (of 0) = -1 fft 35: mflops = 4.51972 (norm. = 0.09375), norm. avg. (of 1) = 0.09375 fft 36: mflops = 4.40578 (norm. = 0.0913866), norm. avg. (of 1) = 0.0913866 fft 37: mflops = 15.5345 (norm. = 0.322222), norm. avg. (of 1) = 0.322222 fft 38: mflops = 15.6504 (norm. = 0.324627), norm. avg. (of 1) = 0.324627 fft 39: mflops = 3.2768 (norm. = 0.0679688), norm. avg. (of 1) = 0.0679688 fft 40: mflops = 3.19688 (norm. = 0.066311), norm. avg. (of 1) = 0.066311 fft 41: mflops = 4.63972 (norm. = 0.0962389), norm. avg. (of 1) = 0.0962389 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.04 s, 1048576 iters, t-(init.)=0.87 s t(norm)=0.103712, mflops=48.2104 (err=9.6e-17) 1. Arndt DIT: elapsed time t=1.04 s, 1048576 iters, t-(init.)=0.86 s t(norm)=0.10252, mflops=48.771 (err=9.6e-17) 2. Arndt Split-Radix: elapsed time t=1.04 s, 524288 iters, t-(init.)=0.95 s t(norm)=0.226498, mflops=22.0753 (err=1.5e-16) 3. Arndt 4-step: elapsed time t=1.11 s, 131072 iters, t-(init.)=1.08 s t(norm)=1.02997, mflops=4.85452 (err=1.5e-16) 4. 
Bailey: elapsed time t=1.13 s, 262144 iters, t-(init.)=1.08 s t(norm)=0.514984, mflops=9.70904 (err=1.5e-16) 5. Beauregard: elapsed time t=1.22 s, 131072 iters, t-(init.)=1.19 s t(norm)=1.13487, mflops=4.40578 (err=1.4e-16) 6. Bergland: elapsed time t=1.42 s, 524288 iters, t-(init.)=1.33 s t(norm)=0.317097, mflops=15.7681 (err=7.6e-17) 7. Brenner: elapsed time t=1.09 s, 262144 iters, t-(init.)=1.05 s t(norm)=0.500679, mflops=9.98644 (err=1.4e-16) 8. Burrus: elapsed time t=1.52 s, 524288 iters, t-(init.)=1.43 s t(norm)=0.340939, mflops=14.6654 (err=1.5e-16) 9. CWP (min N): elapsed time t=1.65 s, 524288 iters, t-(init.)=1.56 s t(norm)=0.371933, mflops=13.4433 10. CWP (best N) (N=15): elapsed time t=1.04 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.925064, mflops=5.40503 11. Edelblute: elapsed time t=1.31 s, 524288 iters, t-(init.)=1.22 s t(norm)=0.290871, mflops=17.1898 (err=1.5e-16) 12. FFTPACK: elapsed time t=1 s, 524288 iters, t-(init.)=0.91 s t(norm)=0.216961, mflops=23.0456 (err=1.2e-16) 13. FFTPACK (f2c): elapsed time t=1.99 s, 1048576 iters, t-(init.)=1.82 s t(norm)=0.216961, mflops=23.0456 (err=1.2e-16) FFTW_MEASURE plan: (cost = 5.722046e-07) FFTW_NOTW 4 14. FFTW: elapsed time t=1.28 s, 2097152 iters, t-(init.)=0.93 s t(norm)=0.0554323, mflops=90.2001 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 15. FFTW_ESTIMATE: elapsed time t=1.28 s, 2097152 iters, t-(init.)=0.92 s t(norm)=0.0548363, mflops=91.1805 (err=1.4e-16) 16. Frigo-old: elapsed time t=1.26 s, 2097152 iters, t-(init.)=0.9 s t(norm)=0.0536442, mflops=93.2068 (err=1.4e-16) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.11 s, 262144 iters, t-(init.)=1.07 s t(norm)=0.510216, mflops=9.79978 (err=1.4e-16) 19. GSL DIT: elapsed time t=1.16 s, 262144 iters, t-(init.)=1.12 s t(norm)=0.534058, mflops=9.36229 (err=1.4e-16) 20. GSL DIF: elapsed time t=1.38 s, 262144 iters, t-(init.)=1.33 s t(norm)=0.634193, mflops=7.88403 (err=1.8e-16) 21. 
Krukar: elapsed time t=1.57 s, 2097152 iters, t-(init.)=1.21 s t(norm)=0.0721216, mflops=69.3273 (err=1.4e-16) 22. Mayer (Buneman): elapsed time t=1.57 s, 1048576 iters, t-(init.)=1.39 s t(norm)=0.165701, mflops=30.1748 (err=8.1e-17) 23. Mayer (simple): elapsed time t=1.59 s, 1048576 iters, t-(init.)=1.41 s t(norm)=0.168085, mflops=29.7468 24. Mayer (lookup): elapsed time t=1.64 s, 1048576 iters, t-(init.)=1.46 s t(norm)=0.174046, mflops=28.7281 (err=8.1e-17) 25. Monro: elapsed time t=1.07 s, 131072 iters, t-(init.)=1.05 s t(norm)=1.00136, mflops=4.99322 (err=8.1e-17) 26. NAPACK (f2c): elapsed time t=1.34 s, 131072 iters, t-(init.)=1.32 s t(norm)=1.25885, mflops=3.97188 (err=1.8e-16) 27. Nielsen: elapsed time t=1.93 s, 262144 iters, t-(init.)=1.88 s t(norm)=0.896454, mflops=5.57753 (err=1.5e-16) 28. NR (C): elapsed time t=1.13 s, 262144 iters, t-(init.)=1.08 s t(norm)=0.514984, mflops=9.70904 (err=1.4e-16) 29. NR (F): elapsed time t=1.33 s, 262144 iters, t-(init.)=1.28 s t(norm)=0.610352, mflops=8.192 (err=1.4e-16) 30. Ooura (C): elapsed time t=1.06 s, 1048576 iters, t-(init.)=0.88 s t(norm)=0.104904, mflops=47.6625 (err=9.8e-17) 31. Ooura (F): elapsed time t=1.18 s, 1048576 iters, t-(init.)=1 s t(norm)=0.119209, mflops=41.943 (err=9.8e-17) 32. Skipping fft (QFT requires N >= 16). 33. Ransom: elapsed time t=1.36 s, 131072 iters, t-(init.)=1.33 s t(norm)=1.26839, mflops=3.94202 (err=2.1e-16) 34. SCIPORT: elapsed time t=1.53 s, 524288 iters, t-(init.)=1.44 s t(norm)=0.343323, mflops=14.5636 (err=8.0e-09) 35. Singleton: elapsed time t=1.14 s, 262144 iters, t-(init.)=1.09 s t(norm)=0.519753, mflops=9.61996 (err=1.5e-16) 36. Singleton (f2c): elapsed time t=1.93 s, 524288 iters, t-(init.)=1.85 s t(norm)=0.441074, mflops=11.336 (err=1.1e-16) 37. Sorensen: elapsed time t=1.06 s, 524288 iters, t-(init.)=0.97 s t(norm)=0.231266, mflops=21.6201 (err=1.5e-16) 38. 
Sorensen DIT: elapsed time t=1.71 s, 524288 iters, t-(init.)=1.62 s t(norm)=0.386238, mflops=12.9454 (err=8.1e-17) 39. Temperton: elapsed time t=1.22 s, 262144 iters, t-(init.)=1.18 s t(norm)=0.562668, mflops=8.88624 (err=1.7e-16) 40. Temperton (f2c): elapsed time t=1.06 s, 262144 iters, t-(init.)=1.01 s t(norm)=0.481606, mflops=10.3819 (err=1.7e-16) 41. Valkenburg: elapsed time t=1.12 s, 131072 iters, t-(init.)=1.09 s t(norm)=1.03951, mflops=4.80998 (err=1.8e-16) Top mflops for N=4 = 93.2068 Normalized results and averages for N=4: fft 0: mflops = 48.2104 (norm. = 0.517241), norm. avg. (of 2) = 0.550567 fft 1: mflops = 48.771 (norm. = 0.523256), norm. avg. (of 2) = 0.563711 fft 2: mflops = 22.0753 (norm. = 0.236842), norm. avg. (of 2) = 0.312617 fft 3: mflops = 4.85452 (norm. = 0.0520833), norm. avg. (of 2) = 0.0399128 fft 4: mflops = 9.70904 (norm. = 0.104167), norm. avg. (of 2) = 0.107008 fft 5: mflops = 4.40578 (norm. = 0.0472689), norm. avg. (of 2) = 0.0636161 fft 6: mflops = 15.7681 (norm. = 0.169173), norm. avg. (of 2) = 0.134934 fft 7: mflops = 9.98644 (norm. = 0.107143), norm. avg. (of 2) = 0.109628 fft 8: mflops = 14.6654 (norm. = 0.157343), norm. avg. (of 2) = 0.259921 fft 9: mflops = 13.4433 (norm. = 0.144231), norm. avg. (of 2) = 0.111234 fft 10: mflops = 5.40503 (norm. = 0.0579897), norm. avg. (of 2) = 0.0638506 fft 11: mflops = 17.1898 (norm. = 0.184426), norm. avg. (of 1) = 0.184426 fft 12: mflops = 23.0456 (norm. = 0.247253), norm. avg. (of 2) = 0.2313 fft 13: mflops = 23.0456 (norm. = 0.247253), norm. avg. (of 2) = 0.211328 fft 14: mflops = 90.2001 (norm. = 0.967742), norm. avg. (of 2) = 0.849417 fft 15: mflops = 91.1805 (norm. = 0.978261), norm. avg. (of 2) = 0.867391 fft 16: mflops = 93.2068 (norm. = 1), norm. avg. (of 2) = 1 fft 17: mflops = -1 (norm. = -0.0107288), norm. avg. (of 0) = -1 fft 18: mflops = 9.79978 (norm. = 0.10514), norm. avg. (of 2) = 0.112323 fft 19: mflops = 9.36229 (norm. = 0.100446), norm. avg. 
(of 2) = 0.10152 fft 20: mflops = 7.88403 (norm. = 0.0845865), norm. avg. (of 2) = 0.085448 fft 21: mflops = 69.3273 (norm. = 0.743802), norm. avg. (of 2) = 0.610912 fft 22: mflops = 30.1748 (norm. = 0.323741), norm. avg. (of 1) = 0.323741 fft 23: mflops = 29.7468 (norm. = 0.319149), norm. avg. (of 1) = 0.319149 fft 24: mflops = 28.7281 (norm. = 0.308219), norm. avg. (of 1) = 0.308219 fft 25: mflops = 4.99322 (norm. = 0.0535714), norm. avg. (of 1) = 0.0535714 fft 26: mflops = 3.97188 (norm. = 0.0426136), norm. avg. (of 2) = 0.0449481 fft 27: mflops = 5.57753 (norm. = 0.0598404), norm. avg. (of 2) = 0.0468069 fft 28: mflops = 9.70904 (norm. = 0.104167), norm. avg. (of 2) = 0.107008 fft 29: mflops = 8.192 (norm. = 0.0878906), norm. avg. (of 2) = 0.0938306 fft 30: mflops = 47.6625 (norm. = 0.511364), norm. avg. (of 2) = 0.60092 fft 31: mflops = 41.943 (norm. = 0.45), norm. avg. (of 2) = 0.533511 fft 32: mflops = -1 (norm. = -0.0107288), norm. avg. (of 0) = -1 fft 33: mflops = 3.94202 (norm. = 0.0422932), norm. avg. (of 1) = 0.0422932 fft 34: mflops = 14.5636 (norm. = 0.15625), norm. avg. (of 1) = 0.15625 fft 35: mflops = 9.61996 (norm. = 0.103211), norm. avg. (of 2) = 0.0984805 fft 36: mflops = 11.336 (norm. = 0.121622), norm. avg. (of 2) = 0.106504 fft 37: mflops = 21.6201 (norm. = 0.231959), norm. avg. (of 2) = 0.27709 fft 38: mflops = 12.9454 (norm. = 0.138889), norm. avg. (of 2) = 0.231758 fft 39: mflops = 8.88624 (norm. = 0.095339), norm. avg. (of 2) = 0.0816539 fft 40: mflops = 10.3819 (norm. = 0.111386), norm. avg. (of 2) = 0.0888486 fft 41: mflops = 4.80998 (norm. = 0.0516055), norm. avg. (of 2) = 0.0739222 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.1 s, 262144 iters, t-(init.)=1.01 s t(norm)=0.160535, mflops=31.1458 (err=1.6e-16) 1. Arndt DIT: elapsed time t=1.11 s, 262144 iters, t-(init.)=1.03 s t(norm)=0.163714, mflops=30.541 (err=1.6e-16) 2. 
Arndt Split-Radix: elapsed time t=1.46 s, 262144 iters, t-(init.)=1.37 s t(norm)=0.217756, mflops=22.9615 (err=2.0e-16) 3. Arndt 4-step: elapsed time t=1.33 s, 65536 iters, t-(init.)=1.31 s t(norm)=0.832876, mflops=6.0033 (err=2.1e-16) 4. Bailey: elapsed time t=1.97 s, 262144 iters, t-(init.)=1.88 s t(norm)=0.298818, mflops=16.7326 (err=1.6e-16) 5. Beauregard: elapsed time t=1.28 s, 65536 iters, t-(init.)=1.26 s t(norm)=0.801086, mflops=6.24152 (err=1.3e-16) 6. Bergland: elapsed time t=1.3 s, 131072 iters, t-(init.)=1.26 s t(norm)=0.400543, mflops=12.483 (err=2.0e-16) 7. Brenner: elapsed time t=1.39 s, 131072 iters, t-(init.)=1.35 s t(norm)=0.429153, mflops=11.6508 (err=1.3e-16) 8. Burrus: elapsed time t=1.2 s, 131072 iters, t-(init.)=1.16 s t(norm)=0.368754, mflops=13.5592 (err=1.9e-16) 9. CWP (min N): elapsed time t=1.04 s, 262144 iters, t-(init.)=0.95 s t(norm)=0.150998, mflops=33.1129 10. CWP (best N) (N=15): elapsed time t=1.04 s, 131072 iters, t-(init.)=0.98 s t(norm)=0.311534, mflops=16.0496 11. Edelblute: elapsed time t=1.05 s, 131072 iters, t-(init.)=1 s t(norm)=0.317891, mflops=15.7286 (err=1.9e-16) 12. FFTPACK: elapsed time t=1.8 s, 524288 iters, t-(init.)=1.63 s t(norm)=0.129541, mflops=38.5979 (err=1.8e-16) 13. FFTPACK (f2c): elapsed time t=1.93 s, 524288 iters, t-(init.)=1.76 s t(norm)=0.139872, mflops=35.7469 (err=1.8e-16) FFTW_MEASURE plan: (cost = 1.220703e-06) FFTW_NOTW 8 14. FFTW: elapsed time t=1.29 s, 1048576 iters, t-(init.)=0.94 s t(norm)=0.0373522, mflops=133.861 (err=1.0e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.29 s, 1048576 iters, t-(init.)=0.94 s t(norm)=0.0373522, mflops=133.861 (err=1.0e-16) 16. Frigo-old: elapsed time t=1.06 s, 1048576 iters, t-(init.)=0.69 s t(norm)=0.0274181, mflops=182.361 (err=1.0e-16) 17. Green: elapsed time t=1.44 s, 524288 iters, t-(init.)=1.26 s t(norm)=0.100136, mflops=49.9322 (err=1.2e-16) 18. 
GSL: elapsed time t=1.01 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.308355, mflops=16.2151 (err=1.2e-16) 19. GSL DIT: elapsed time t=1.05 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.32107, mflops=15.5729 (err=1.2e-16) 20. GSL DIF: elapsed time t=1.31 s, 131072 iters, t-(init.)=1.27 s t(norm)=0.403722, mflops=12.3848 (err=1.6e-16) 21. Krukar: elapsed time t=1.64 s, 1048576 iters, t-(init.)=1.3 s t(norm)=0.0516574, mflops=96.7916 (err=1.2e-16) 22. Mayer (Buneman): elapsed time t=1.78 s, 524288 iters, t-(init.)=1.6 s t(norm)=0.127157, mflops=39.3216 (err=1.5e-16) 23. Mayer (simple): elapsed time t=1.8 s, 524288 iters, t-(init.)=1.62 s t(norm)=0.128746, mflops=38.8361 24. Mayer (lookup): elapsed time t=1.82 s, 524288 iters, t-(init.)=1.65 s t(norm)=0.13113, mflops=38.13 (err=1.5e-16) 25. Monro: elapsed time t=1.57 s, 131072 iters, t-(init.)=1.53 s t(norm)=0.486374, mflops=10.2802 (err=2.3e-16) 26. NAPACK (f2c): elapsed time t=1.35 s, 65536 iters, t-(init.)=1.33 s t(norm)=0.845591, mflops=5.91302 (err=2.8e-16) 27. Nielsen: elapsed time t=1.69 s, 131072 iters, t-(init.)=1.65 s t(norm)=0.524521, mflops=9.53251 (err=1.3e-15) 28. NR (C): elapsed time t=1.07 s, 131072 iters, t-(init.)=1.03 s t(norm)=0.327428, mflops=15.2705 (err=1.2e-16) 29. NR (F): elapsed time t=1.28 s, 131072 iters, t-(init.)=1.24 s t(norm)=0.394185, mflops=12.6844 (err=1.5e-16) 30. Ooura (C): elapsed time t=1.89 s, 1048576 iters, t-(init.)=1.53 s t(norm)=0.0607967, mflops=82.2413 (err=1.6e-16) 31. Ooura (F): elapsed time t=1.19 s, 524288 iters, t-(init.)=1.01 s t(norm)=0.0802676, mflops=62.2916 (err=1.6e-16) 32. Skipping fft (QFT requires N >= 16). 33. Ransom: elapsed time t=1.66 s, 65536 iters, t-(init.)=1.64 s t(norm)=1.04268, mflops=4.79532 (err=7.5e-16) 34. SCIPORT: elapsed time t=1.19 s, 131072 iters, t-(init.)=1.15 s t(norm)=0.365575, mflops=13.6771 (err=4.5e-08) 35. Singleton: elapsed time t=1.21 s, 131072 iters, t-(init.)=1.17 s t(norm)=0.371933, mflops=13.4433 (err=1.2e-16) 36. 
Singleton (f2c): elapsed time t=1.13 s, 131072 iters, t-(init.)=1.09 s t(norm)=0.346502, mflops=14.4299 (err=1.2e-16) 37. Sorensen: elapsed time t=1.1 s, 262144 iters, t-(init.)=1.02 s t(norm)=0.162125, mflops=30.8405 (err=2.4e-16) 38. Sorensen DIT: elapsed time t=1.37 s, 131072 iters, t-(init.)=1.33 s t(norm)=0.422796, mflops=11.826 (err=1.8e-16) 39. Temperton: elapsed time t=1.28 s, 131072 iters, t-(init.)=1.23 s t(norm)=0.391006, mflops=12.7875 (err=1.5e-16) 40. Temperton (f2c): elapsed time t=1.05 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.32107, mflops=15.5729 (err=1.4e-16) 41. Valkenburg: elapsed time t=1.66 s, 65536 iters, t-(init.)=1.64 s t(norm)=1.04268, mflops=4.79532 (err=1.3e-16) Top mflops for N=8 = 182.361 Normalized results and averages for N=8: fft 0: mflops = 31.1458 (norm. = 0.170792), norm. avg. (of 3) = 0.423975 fft 1: mflops = 30.541 (norm. = 0.167476), norm. avg. (of 3) = 0.431633 fft 2: mflops = 22.9615 (norm. = 0.125912), norm. avg. (of 3) = 0.250382 fft 3: mflops = 6.0033 (norm. = 0.0329198), norm. avg. (of 3) = 0.0375818 fft 4: mflops = 16.7326 (norm. = 0.0917553), norm. avg. (of 3) = 0.101923 fft 5: mflops = 6.24152 (norm. = 0.0342262), norm. avg. (of 3) = 0.0538194 fft 6: mflops = 12.483 (norm. = 0.0684524), norm. avg. (of 3) = 0.112773 fft 7: mflops = 11.6508 (norm. = 0.0638889), norm. avg. (of 3) = 0.0943817 fft 8: mflops = 13.5592 (norm. = 0.0743534), norm. avg. (of 3) = 0.198065 fft 9: mflops = 33.1129 (norm. = 0.181579), norm. avg. (of 3) = 0.134682 fft 10: mflops = 16.0496 (norm. = 0.0880102), norm. avg. (of 3) = 0.0719038 fft 11: mflops = 15.7286 (norm. = 0.08625), norm. avg. (of 2) = 0.135338 fft 12: mflops = 38.5979 (norm. = 0.211656), norm. avg. (of 3) = 0.224752 fft 13: mflops = 35.7469 (norm. = 0.196023), norm. avg. (of 3) = 0.206226 fft 14: mflops = 133.861 (norm. = 0.734043), norm. avg. (of 3) = 0.810959 fft 15: mflops = 133.861 (norm. = 0.734043), norm. avg. (of 3) = 0.822942 fft 16: mflops = 182.361 (norm. = 1), norm. 
avg. (of 3) = 1 fft 17: mflops = 49.9322 (norm. = 0.27381), norm. avg. (of 1) = 0.27381 fft 18: mflops = 16.2151 (norm. = 0.0889175), norm. avg. (of 3) = 0.104521 fft 19: mflops = 15.5729 (norm. = 0.085396), norm. avg. (of 3) = 0.0961456 fft 20: mflops = 12.3848 (norm. = 0.0679134), norm. avg. (of 3) = 0.0796031 fft 21: mflops = 96.7916 (norm. = 0.530769), norm. avg. (of 3) = 0.584198 fft 22: mflops = 39.3216 (norm. = 0.215625), norm. avg. (of 2) = 0.269683 fft 23: mflops = 38.8361 (norm. = 0.212963), norm. avg. (of 2) = 0.266056 fft 24: mflops = 38.13 (norm. = 0.209091), norm. avg. (of 2) = 0.258655 fft 25: mflops = 10.2802 (norm. = 0.0563725), norm. avg. (of 2) = 0.054972 fft 26: mflops = 5.91302 (norm. = 0.0324248), norm. avg. (of 3) = 0.0407737 fft 27: mflops = 9.53251 (norm. = 0.0522727), norm. avg. (of 3) = 0.0486288 fft 28: mflops = 15.2705 (norm. = 0.0837379), norm. avg. (of 3) = 0.099251 fft 29: mflops = 12.6844 (norm. = 0.0695565), norm. avg. (of 3) = 0.0857392 fft 30: mflops = 82.2413 (norm. = 0.45098), norm. avg. (of 3) = 0.55094 fft 31: mflops = 62.2916 (norm. = 0.341584), norm. avg. (of 3) = 0.469535 fft 32: mflops = -1 (norm. = -0.00548363), norm. avg. (of 0) = -1 fft 33: mflops = 4.79532 (norm. = 0.0262957), norm. avg. (of 2) = 0.0342945 fft 34: mflops = 13.6771 (norm. = 0.075), norm. avg. (of 2) = 0.115625 fft 35: mflops = 13.4433 (norm. = 0.0737179), norm. avg. (of 3) = 0.0902263 fft 36: mflops = 14.4299 (norm. = 0.0791284), norm. avg. (of 3) = 0.0973789 fft 37: mflops = 30.8405 (norm. = 0.169118), norm. avg. (of 3) = 0.2411 fft 38: mflops = 11.826 (norm. = 0.0648496), norm. avg. (of 3) = 0.176122 fft 39: mflops = 12.7875 (norm. = 0.070122), norm. avg. (of 3) = 0.0778099 fft 40: mflops = 15.5729 (norm. = 0.085396), norm. avg. (of 3) = 0.0876977 fft 41: mflops = 4.79532 (norm. = 0.0262957), norm. avg. (of 3) = 0.0580467 Benchmarking for array size = 16 (power of 2): 0. 
Arndt DIF: elapsed time t=1.43 s, 131072 iters, t-(init.)=1.36 s t(norm)=0.162125, mflops=30.8405 (err=1.6e-16) 1. Arndt DIT: elapsed time t=1.42 s, 131072 iters, t-(init.)=1.35 s t(norm)=0.160933, mflops=31.0689 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.84 s, 131072 iters, t-(init.)=1.77 s t(norm)=0.211, mflops=23.6966 (err=1.4e-16) 3. Arndt 4-step: elapsed time t=1.12 s, 32768 iters, t-(init.)=1.1 s t(norm)=0.524521, mflops=9.53251 (err=1.3e-16) 4. Bailey: elapsed time t=2 s, 131072 iters, t-(init.)=1.93 s t(norm)=0.230074, mflops=21.7321 (err=1.4e-16) 5. Beauregard: elapsed time t=1.5 s, 32768 iters, t-(init.)=1.48 s t(norm)=0.705719, mflops=7.08497 (err=1.9e-16) 6. Bergland: elapsed time t=1.35 s, 65536 iters, t-(init.)=1.31 s t(norm)=0.312328, mflops=16.0088 (err=2.0e-16) 7. Brenner: elapsed time t=1.61 s, 65536 iters, t-(init.)=1.58 s t(norm)=0.376701, mflops=13.2731 (err=1.5e-16) 8. Burrus: elapsed time t=1.51 s, 65536 iters, t-(init.)=1.47 s t(norm)=0.350475, mflops=14.2663 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.52 s, 262144 iters, t-(init.)=1.38 s t(norm)=0.0822544, mflops=60.787 10. CWP (best N) (N=28): elapsed time t=1.5 s, 131072 iters, t-(init.)=1.39 s t(norm)=0.165701, mflops=30.1748 11. Edelblute: elapsed time t=1.4 s, 65536 iters, t-(init.)=1.36 s t(norm)=0.324249, mflops=15.4202 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.02 s, 131072 iters, t-(init.)=0.95 s t(norm)=0.113249, mflops=44.1506 (err=1.5e-16) 13. FFTPACK (f2c): elapsed time t=1.68 s, 262144 iters, t-(init.)=1.54 s t(norm)=0.0917912, mflops=54.4715 (err=1.3e-16) FFTW_MEASURE plan: (cost = 5.493164e-06) FFTW_NOTW 16 14. FFTW: elapsed time t=1.46 s, 262144 iters, t-(init.)=1.32 s t(norm)=0.0786781, mflops=63.5501 (err=1.6e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.46 s, 262144 iters, t-(init.)=1.33 s t(norm)=0.0792742, mflops=63.0722 (err=1.6e-16) 16. 
Frigo-old: elapsed time t=1.24 s, 524288 iters, t-(init.)=0.93 s t(norm)=0.0277162, mflops=180.4 (err=1.6e-16) 17. Green: elapsed time t=1.78 s, 262144 iters, t-(init.)=1.64 s t(norm)=0.0977516, mflops=51.15 (err=1.9e-16) 18. GSL: elapsed time t=1.2 s, 65536 iters, t-(init.)=1.17 s t(norm)=0.27895, mflops=17.9244 (err=1.5e-16) 19. GSL DIT: elapsed time t=1.98 s, 131072 iters, t-(init.)=1.91 s t(norm)=0.22769, mflops=21.9597 (err=1.6e-16) 20. GSL DIF: elapsed time t=1.24 s, 65536 iters, t-(init.)=1.2 s t(norm)=0.286102, mflops=17.4763 (err=1.9e-16) 21. Krukar: elapsed time t=1.72 s, 524288 iters, t-(init.)=1.44 s t(norm)=0.0429153, mflops=116.508 (err=1.7e-16) 22. Mayer (Buneman): elapsed time t=1.51 s, 131072 iters, t-(init.)=1.43 s t(norm)=0.170469, mflops=29.3308 (err=1.3e-16) 23. Mayer (simple): elapsed time t=1.32 s, 131072 iters, t-(init.)=1.25 s t(norm)=0.149012, mflops=33.5544 24. Mayer (lookup): elapsed time t=1.16 s, 131072 iters, t-(init.)=1.09 s t(norm)=0.129938, mflops=38.4799 (err=1.4e-16) 25. Monro: elapsed time t=1.44 s, 65536 iters, t-(init.)=1.41 s t(norm)=0.33617, mflops=14.8734 (err=1.6e-16) 26. NAPACK (f2c): elapsed time t=1.3 s, 32768 iters, t-(init.)=1.28 s t(norm)=0.610352, mflops=8.192 (err=3.5e-16) 27. Nielsen: elapsed time t=1.82 s, 65536 iters, t-(init.)=1.78 s t(norm)=0.424385, mflops=11.7818 (err=1.3e-16) 28. NR (C): elapsed time t=1.93 s, 131072 iters, t-(init.)=1.86 s t(norm)=0.221729, mflops=22.55 (err=1.6e-16) 29. NR (F): elapsed time t=1.19 s, 65536 iters, t-(init.)=1.16 s t(norm)=0.276566, mflops=18.0789 (err=1.7e-16) 30. Ooura (C): elapsed time t=1.08 s, 262144 iters, t-(init.)=0.93 s t(norm)=0.0554323, mflops=90.2001 (err=1.4e-16) 31. Ooura (F): elapsed time t=1.33 s, 262144 iters, t-(init.)=1.19 s t(norm)=0.0709295, mflops=70.4925 (err=1.4e-16) 32. QFT: elapsed time t=1.21 s, 262144 iters, t-(init.)=1.08 s t(norm)=0.064373, mflops=77.6723 (err=1.3e-16) 33. 
Ransom: elapsed time t=1.73 s, 65536 iters, t-(init.)=1.69 s t(norm)=0.402927, mflops=12.4092 (err=6.0e-16) 34. SCIPORT: elapsed time t=1.62 s, 65536 iters, t-(init.)=1.58 s t(norm)=0.376701, mflops=13.2731 (err=5.2e-08) 35. Singleton: elapsed time t=1.53 s, 65536 iters, t-(init.)=1.49 s t(norm)=0.355244, mflops=14.0748 (err=1.7e-16) 36. Singleton (f2c): elapsed time t=1 s, 65536 iters, t-(init.)=0.96 s t(norm)=0.228882, mflops=21.8453 (err=1.9e-16) 37. Sorensen: elapsed time t=1.11 s, 131072 iters, t-(init.)=1.04 s t(norm)=0.123978, mflops=40.3298 (err=1.3e-16) 38. Sorensen DIT: elapsed time t=1.77 s, 65536 iters, t-(init.)=1.73 s t(norm)=0.412464, mflops=12.1223 (err=1.4e-16) 39. Temperton: elapsed time t=1.4 s, 65536 iters, t-(init.)=1.36 s t(norm)=0.324249, mflops=15.4202 (err=1.7e-16) 40. Temperton (f2c): elapsed time t=1.97 s, 131072 iters, t-(init.)=1.9 s t(norm)=0.226498, mflops=22.0753 (err=1.5e-16) 41. Valkenburg: elapsed time t=1.07 s, 16384 iters, t-(init.)=1.06 s t(norm)=1.01089, mflops=4.94611 (err=3.0e-16) Top mflops for N=16 = 180.4 Normalized results and averages for N=16: fft 0: mflops = 30.8405 (norm. = 0.170956), norm. avg. (of 4) = 0.36072 fft 1: mflops = 31.0689 (norm. = 0.172222), norm. avg. (of 4) = 0.36678 fft 2: mflops = 23.6966 (norm. = 0.131356), norm. avg. (of 4) = 0.220626 fft 3: mflops = 9.53251 (norm. = 0.0528409), norm. avg. (of 4) = 0.0413966 fft 4: mflops = 21.7321 (norm. = 0.120466), norm. avg. (of 4) = 0.106559 fft 5: mflops = 7.08497 (norm. = 0.0392736), norm. avg. (of 4) = 0.050183 fft 6: mflops = 16.0088 (norm. = 0.0887405), norm. avg. (of 4) = 0.106765 fft 7: mflops = 13.2731 (norm. = 0.0735759), norm. avg. (of 4) = 0.0891803 fft 8: mflops = 14.2663 (norm. = 0.0790816), norm. avg. (of 4) = 0.168319 fft 9: mflops = 60.787 (norm. = 0.336957), norm. avg. (of 4) = 0.185251 fft 10: mflops = 30.1748 (norm. = 0.167266), norm. avg. (of 4) = 0.0957444 fft 11: mflops = 15.4202 (norm. = 0.0854779), norm. avg. 
(of 3) = 0.118718 fft 12: mflops = 44.1506 (norm. = 0.244737), norm. avg. (of 4) = 0.229748 fft 13: mflops = 54.4715 (norm. = 0.301948), norm. avg. (of 4) = 0.230157 fft 14: mflops = 63.5501 (norm. = 0.352273), norm. avg. (of 4) = 0.696287 fft 15: mflops = 63.0722 (norm. = 0.349624), norm. avg. (of 4) = 0.704612 fft 16: mflops = 180.4 (norm. = 1), norm. avg. (of 4) = 1 fft 17: mflops = 51.15 (norm. = 0.283537), norm. avg. (of 2) = 0.278673 fft 18: mflops = 17.9244 (norm. = 0.099359), norm. avg. (of 4) = 0.103231 fft 19: mflops = 21.9597 (norm. = 0.121728), norm. avg. (of 4) = 0.102541 fft 20: mflops = 17.4763 (norm. = 0.096875), norm. avg. (of 4) = 0.0839211 fft 21: mflops = 116.508 (norm. = 0.645833), norm. avg. (of 4) = 0.599607 fft 22: mflops = 29.3308 (norm. = 0.162587), norm. avg. (of 3) = 0.233984 fft 23: mflops = 33.5544 (norm. = 0.186), norm. avg. (of 3) = 0.239371 fft 24: mflops = 38.4799 (norm. = 0.213303), norm. avg. (of 3) = 0.243538 fft 25: mflops = 14.8734 (norm. = 0.0824468), norm. avg. (of 3) = 0.0641303 fft 26: mflops = 8.192 (norm. = 0.0454102), norm. avg. (of 4) = 0.0419328 fft 27: mflops = 11.7818 (norm. = 0.065309), norm. avg. (of 4) = 0.0527989 fft 28: mflops = 22.55 (norm. = 0.125), norm. avg. (of 4) = 0.105688 fft 29: mflops = 18.0789 (norm. = 0.100216), norm. avg. (of 4) = 0.0893583 fft 30: mflops = 90.2001 (norm. = 0.5), norm. avg. (of 4) = 0.538205 fft 31: mflops = 70.4925 (norm. = 0.390756), norm. avg. (of 4) = 0.44984 fft 32: mflops = 77.6723 (norm. = 0.430556), norm. avg. (of 1) = 0.430556 fft 33: mflops = 12.4092 (norm. = 0.068787), norm. avg. (of 3) = 0.045792 fft 34: mflops = 13.2731 (norm. = 0.0735759), norm. avg. (of 3) = 0.101609 fft 35: mflops = 14.0748 (norm. = 0.0780201), norm. avg. (of 4) = 0.0871748 fft 36: mflops = 21.8453 (norm. = 0.121094), norm. avg. (of 4) = 0.103308 fft 37: mflops = 40.3298 (norm. = 0.223558), norm. avg. (of 4) = 0.236714 fft 38: mflops = 12.1223 (norm. = 0.0671965), norm. avg. 
(of 4) = 0.14889 fft 39: mflops = 15.4202 (norm. = 0.0854779), norm. avg. (of 4) = 0.0797269 fft 40: mflops = 22.0753 (norm. = 0.122368), norm. avg. (of 4) = 0.0963654 fft 41: mflops = 4.94611 (norm. = 0.0274175), norm. avg. (of 4) = 0.0503894 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.04 s, 32768 iters, t-(init.)=1.01 s t(norm)=0.192642, mflops=25.9549 (err=6.0e-16) 1. Arndt DIT: elapsed time t=1.04 s, 32768 iters, t-(init.)=1.01 s t(norm)=0.192642, mflops=25.9549 (err=5.5e-16) 2. Arndt Split-Radix: elapsed time t=1.04 s, 32768 iters, t-(init.)=1.01 s t(norm)=0.192642, mflops=25.9549 (err=3.6e-16) 3. Arndt 4-step: elapsed time t=1.31 s, 16384 iters, t-(init.)=1.29 s t(norm)=0.492096, mflops=10.1606 (err=3.2e-16) 4. Bailey: elapsed time t=1.81 s, 65536 iters, t-(init.)=1.75 s t(norm)=0.166893, mflops=29.9593 (err=6.7e-16) 5. Beauregard: elapsed time t=1.78 s, 16384 iters, t-(init.)=1.76 s t(norm)=0.671387, mflops=7.44727 (err=6.6e-16) 6. Bergland: elapsed time t=1.37 s, 32768 iters, t-(init.)=1.34 s t(norm)=0.255585, mflops=19.563 (err=6.4e-16) 7. Brenner: elapsed time t=1.76 s, 32768 iters, t-(init.)=1.73 s t(norm)=0.329971, mflops=15.1528 (err=6.0e-16) 8. Burrus: elapsed time t=1.73 s, 32768 iters, t-(init.)=1.7 s t(norm)=0.324249, mflops=15.4202 (err=3.5e-16) 9. CWP (min N) (N=33): elapsed time t=1.12 s, 65536 iters, t-(init.)=1.06 s t(norm)=0.101089, mflops=49.4611 10. CWP (best N) (N=35): elapsed time t=1.06 s, 65536 iters, t-(init.)=0.99 s t(norm)=0.0944138, mflops=52.9584 11. Edelblute: elapsed time t=1.62 s, 32768 iters, t-(init.)=1.59 s t(norm)=0.303268, mflops=16.487 (err=3.5e-16) 12. FFTPACK: elapsed time t=1.31 s, 65536 iters, t-(init.)=1.25 s t(norm)=0.119209, mflops=41.943 (err=4.4e-16) 13. FFTPACK (f2c): elapsed time t=1.21 s, 65536 iters, t-(init.)=1.15 s t(norm)=0.109673, mflops=45.5903 (err=5.3e-16) FFTW_MEASURE plan: (cost = 1.037598e-05) FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_NOTW 8 14. 
FFTW: elapsed time t=1.35 s, 131072 iters, t-(init.)=1.23 s t(norm)=0.058651, mflops=85.2501 (err=3.8e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.24 s, 65536 iters, t-(init.)=1.18 s t(norm)=0.112534, mflops=44.4312 (err=6.1e-16) 16. Frigo-old: elapsed time t=1.53 s, 262144 iters, t-(init.)=1.25 s t(norm)=0.0298023, mflops=167.772 (err=5.5e-16) 17. Green: elapsed time t=1.97 s, 131072 iters, t-(init.)=1.85 s t(norm)=0.0882149, mflops=56.6798 (err=6.9e-16) 18. GSL: elapsed time t=1.34 s, 32768 iters, t-(init.)=1.31 s t(norm)=0.249863, mflops=20.011 (err=5.0e-16) 19. GSL DIT: elapsed time t=1.83 s, 65536 iters, t-(init.)=1.76 s t(norm)=0.167847, mflops=29.7891 (err=6.1e-16) 20. GSL DIF: elapsed time t=1.13 s, 32768 iters, t-(init.)=1.09 s t(norm)=0.207901, mflops=24.0499 (err=4.3e-16) 21. Krukar: elapsed time t=1.05 s, 131072 iters, t-(init.)=0.93 s t(norm)=0.0443459, mflops=112.75 (err=8.6e-16) 22. Mayer (Buneman): elapsed time t=1.8 s, 65536 iters, t-(init.)=1.73 s t(norm)=0.164986, mflops=30.3057 (err=3.5e-16) 23. Mayer (simple): elapsed time t=1.41 s, 65536 iters, t-(init.)=1.34 s t(norm)=0.127792, mflops=39.126 24. Mayer (lookup): elapsed time t=1.26 s, 65536 iters, t-(init.)=1.2 s t(norm)=0.114441, mflops=43.6907 (err=5.9e-16) 25. Monro: elapsed time t=1.35 s, 32768 iters, t-(init.)=1.32 s t(norm)=0.25177, mflops=19.8594 (err=6.5e-16) 26. NAPACK (f2c): elapsed time t=1.35 s, 16384 iters, t-(init.)=1.34 s t(norm)=0.511169, mflops=9.78149 (err=9.3e-16) 27. Nielsen: elapsed time t=2 s, 32768 iters, t-(init.)=1.97 s t(norm)=0.375748, mflops=13.3068 (err=3.1e-15) 28. NR (C): elapsed time t=1.71 s, 65536 iters, t-(init.)=1.65 s t(norm)=0.157356, mflops=31.775 (err=6.1e-16) 29. NR (F): elapsed time t=1.09 s, 32768 iters, t-(init.)=1.06 s t(norm)=0.202179, mflops=24.7306 (err=7.0e-16) 30. Ooura (C): elapsed time t=1.17 s, 131072 iters, t-(init.)=1.03 s t(norm)=0.0491142, mflops=101.803 (err=4.3e-16) 31. 
Ooura (F): elapsed time t=1.46 s, 131072 iters, t-(init.)=1.33 s t(norm)=0.0634193, mflops=78.8403 (err=4.3e-16) 32. QFT: elapsed time t=1.79 s, 131072 iters, t-(init.)=1.66 s t(norm)=0.079155, mflops=63.1672 (err=4.6e-16) 33. Ransom: elapsed time t=1.02 s, 16384 iters, t-(init.)=1 s t(norm)=0.38147, mflops=13.1072 (err=3.8e-15) 34. SCIPORT: elapsed time t=1.03 s, 16384 iters, t-(init.)=1.01 s t(norm)=0.385284, mflops=12.9774 (err=3.1e-07) 35. Singleton: elapsed time t=1.56 s, 32768 iters, t-(init.)=1.53 s t(norm)=0.291824, mflops=17.1336 (err=7.1e-16) 36. Singleton (f2c): elapsed time t=1 s, 32768 iters, t-(init.)=0.97 s t(norm)=0.185013, mflops=27.0252 (err=5.8e-16) 37. Sorensen: elapsed time t=1.06 s, 65536 iters, t-(init.)=1 s t(norm)=0.0953674, mflops=52.4288 (err=3.4e-16) 38. Sorensen DIT: elapsed time t=1.03 s, 16384 iters, t-(init.)=1.01 s t(norm)=0.385284, mflops=12.9774 (err=5.1e-16) 39. Temperton: elapsed time t=1.65 s, 32768 iters, t-(init.)=1.62 s t(norm)=0.30899, mflops=16.1817 (err=5.5e-16) 40. Temperton (f2c): elapsed time t=1.21 s, 32768 iters, t-(init.)=1.18 s t(norm)=0.225067, mflops=22.2156 (err=5.1e-16) 41. Valkenburg: elapsed time t=1.31 s, 8192 iters, t-(init.)=1.3 s t(norm)=0.991821, mflops=5.04123 (err=8.4e-16) Top mflops for N=32 = 167.772 Normalized results and averages for N=32: fft 0: mflops = 25.9549 (norm. = 0.154703), norm. avg. (of 5) = 0.319517 fft 1: mflops = 25.9549 (norm. = 0.154703), norm. avg. (of 5) = 0.324365 fft 2: mflops = 25.9549 (norm. = 0.154703), norm. avg. (of 5) = 0.207441 fft 3: mflops = 10.1606 (norm. = 0.060562), norm. avg. (of 5) = 0.0452297 fft 4: mflops = 29.9593 (norm. = 0.178571), norm. avg. (of 5) = 0.120962 fft 5: mflops = 7.44727 (norm. = 0.0443892), norm. avg. (of 5) = 0.0490242 fft 6: mflops = 19.563 (norm. = 0.116604), norm. avg. (of 5) = 0.108733 fft 7: mflops = 15.1528 (norm. = 0.0903179), norm. avg. (of 5) = 0.0894078 fft 8: mflops = 15.4202 (norm. = 0.0919118), norm. avg. 
(of 5) = 0.153038 fft 9: mflops = 49.4611 (norm. = 0.294811), norm. avg. (of 5) = 0.207163 fft 10: mflops = 52.9584 (norm. = 0.315657), norm. avg. (of 5) = 0.139727 fft 11: mflops = 16.487 (norm. = 0.0982704), norm. avg. (of 4) = 0.113606 fft 12: mflops = 41.943 (norm. = 0.25), norm. avg. (of 5) = 0.233799 fft 13: mflops = 45.5903 (norm. = 0.271739), norm. avg. (of 5) = 0.238473 fft 14: mflops = 85.2501 (norm. = 0.50813), norm. avg. (of 5) = 0.658656 fft 15: mflops = 44.4312 (norm. = 0.264831), norm. avg. (of 5) = 0.616656 fft 16: mflops = 167.772 (norm. = 1), norm. avg. (of 5) = 1 fft 17: mflops = 56.6798 (norm. = 0.337838), norm. avg. (of 3) = 0.298395 fft 18: mflops = 20.011 (norm. = 0.119275), norm. avg. (of 5) = 0.106439 fft 19: mflops = 29.7891 (norm. = 0.177557), norm. avg. (of 5) = 0.117544 fft 20: mflops = 24.0499 (norm. = 0.143349), norm. avg. (of 5) = 0.0958066 fft 21: mflops = 112.75 (norm. = 0.672043), norm. avg. (of 5) = 0.614094 fft 22: mflops = 30.3057 (norm. = 0.180636), norm. avg. (of 4) = 0.220647 fft 23: mflops = 39.126 (norm. = 0.233209), norm. avg. (of 4) = 0.23783 fft 24: mflops = 43.6907 (norm. = 0.260417), norm. avg. (of 4) = 0.247757 fft 25: mflops = 19.8594 (norm. = 0.118371), norm. avg. (of 4) = 0.0776905 fft 26: mflops = 9.78149 (norm. = 0.0583022), norm. avg. (of 5) = 0.0452067 fft 27: mflops = 13.3068 (norm. = 0.0793147), norm. avg. (of 5) = 0.058102 fft 28: mflops = 31.775 (norm. = 0.189394), norm. avg. (of 5) = 0.122429 fft 29: mflops = 24.7306 (norm. = 0.147406), norm. avg. (of 5) = 0.100968 fft 30: mflops = 101.803 (norm. = 0.606796), norm. avg. (of 5) = 0.551923 fft 31: mflops = 78.8403 (norm. = 0.469925), norm. avg. (of 5) = 0.453857 fft 32: mflops = 63.1672 (norm. = 0.376506), norm. avg. (of 2) = 0.403531 fft 33: mflops = 13.1072 (norm. = 0.078125), norm. avg. (of 4) = 0.0538752 fft 34: mflops = 12.9774 (norm. = 0.0773515), norm. avg. (of 4) = 0.0955444 fft 35: mflops = 17.1336 (norm. = 0.102124), norm. avg. 
(of 5) = 0.0901647 fft 36: mflops = 27.0252 (norm. = 0.161082), norm. avg. (of 5) = 0.114863 fft 37: mflops = 52.4288 (norm. = 0.3125), norm. avg. (of 5) = 0.251871 fft 38: mflops = 12.9774 (norm. = 0.0773515), norm. avg. (of 5) = 0.134583 fft 39: mflops = 16.1817 (norm. = 0.0964506), norm. avg. (of 5) = 0.0830716 fft 40: mflops = 22.2156 (norm. = 0.132415), norm. avg. (of 5) = 0.103575 fft 41: mflops = 5.04123 (norm. = 0.0300481), norm. avg. (of 5) = 0.0463211 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.91 s, 32768 iters, t-(init.)=1.86 s t(norm)=0.14782, mflops=33.825 (err=2.3e-16) 1. Arndt DIT: elapsed time t=1.87 s, 32768 iters, t-(init.)=1.81 s t(norm)=0.143846, mflops=34.7594 (err=3.4e-16) 2. Arndt Split-Radix: elapsed time t=1.14 s, 16384 iters, t-(init.)=1.12 s t(norm)=0.178019, mflops=28.0869 (err=4.4e-16) 3. Arndt 4-step: elapsed time t=1.21 s, 8192 iters, t-(init.)=1.19 s t(norm)=0.378291, mflops=13.2173 (err=3.6e-16) 4. Bailey: elapsed time t=1.89 s, 32768 iters, t-(init.)=1.84 s t(norm)=0.14623, mflops=34.1927 (err=4.8e-16) 5. Beauregard: elapsed time t=1.07 s, 4096 iters, t-(init.)=1.06 s t(norm)=0.67393, mflops=7.41917 (err=3.8e-16) 6. Bergland: elapsed time t=1.82 s, 16384 iters, t-(init.)=1.79 s t(norm)=0.284513, mflops=17.5739 (err=2.5e-16) 7. Brenner: elapsed time t=1 s, 8192 iters, t-(init.)=0.99 s t(norm)=0.314713, mflops=15.8875 (err=3.8e-16) 8. Burrus: elapsed time t=1.88 s, 16384 iters, t-(init.)=1.85 s t(norm)=0.29405, mflops=17.0039 (err=4.2e-16) 9. CWP (min N) (N=65): elapsed time t=1.06 s, 32768 iters, t-(init.)=1 s t(norm)=0.0794729, mflops=62.9146 10. CWP (best N) (N=84): elapsed time t=1.16 s, 32768 iters, t-(init.)=1.09 s t(norm)=0.0866254, mflops=57.7198 11. Edelblute: elapsed time t=1.78 s, 16384 iters, t-(init.)=1.75 s t(norm)=0.278155, mflops=17.9756 (err=3.6e-16) 12. FFTPACK: elapsed time t=1.51 s, 32768 iters, t-(init.)=1.45 s t(norm)=0.115236, mflops=43.3894 (err=4.2e-16) 13. 
FFTPACK (f2c): elapsed time t=1.23 s, 32768 iters, t-(init.)=1.17 s t(norm)=0.0929832, mflops=53.7731 (err=4.5e-16) FFTW_MEASURE plan: (cost = 2.807617e-05) FFTW_TWIDDLE 16 FFTW_NOTW 4 14. FFTW: elapsed time t=1.87 s, 65536 iters, t-(init.)=1.75 s t(norm)=0.0695388, mflops=71.9024 (err=3.1e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.28 s, 32768 iters, t-(init.)=1.23 s t(norm)=0.0977516, mflops=51.15 (err=2.6e-16) 16. Frigo-old: elapsed time t=1.99 s, 32768 iters, t-(init.)=1.92 s t(norm)=0.152588, mflops=32.768 (err=4.5e-16) 17. Green: elapsed time t=1.06 s, 32768 iters, t-(init.)=1 s t(norm)=0.0794729, mflops=62.9146 (err=3.7e-16) 18. GSL: elapsed time t=1.61 s, 16384 iters, t-(init.)=1.58 s t(norm)=0.251134, mflops=19.9097 (err=3.8e-16) 19. GSL DIT: elapsed time t=1.95 s, 32768 iters, t-(init.)=1.9 s t(norm)=0.150998, mflops=33.1129 (err=3.2e-16) 20. GSL DIF: elapsed time t=1.18 s, 16384 iters, t-(init.)=1.15 s t(norm)=0.182788, mflops=27.3542 (err=3.1e-16) 21. Krukar: elapsed time t=1.94 s, 65536 iters, t-(init.)=1.83 s t(norm)=0.0727177, mflops=68.7591 (err=5.3e-16) 22. Mayer (Buneman): elapsed time t=1.11 s, 16384 iters, t-(init.)=1.08 s t(norm)=0.171661, mflops=29.1271 (err=2.0e-16) 23. Mayer (simple): elapsed time t=1.77 s, 32768 iters, t-(init.)=1.71 s t(norm)=0.135899, mflops=36.7921 24. Mayer (lookup): elapsed time t=1.51 s, 32768 iters, t-(init.)=1.45 s t(norm)=0.115236, mflops=43.3894 (err=3.4e-16) 25. Monro: elapsed time t=1.37 s, 16384 iters, t-(init.)=1.34 s t(norm)=0.212987, mflops=23.4756 (err=5.7e-16) 26. NAPACK (f2c): elapsed time t=1.38 s, 8192 iters, t-(init.)=1.36 s t(norm)=0.432332, mflops=11.5652 (err=1.0e-15) 27. Nielsen: elapsed time t=1.1 s, 8192 iters, t-(init.)=1.09 s t(norm)=0.346502, mflops=14.4299 (err=6.5e-15) 28. NR (C): elapsed time t=1.61 s, 32768 iters, t-(init.)=1.56 s t(norm)=0.123978, mflops=40.3298 (err=3.2e-16) 29. 
NR (F): elapsed time t=1.06 s, 16384 iters, t-(init.)=1.03 s t(norm)=0.163714, mflops=30.541 (err=3.4e-16) 30. Ooura (C): elapsed time t=1.4 s, 65536 iters, t-(init.)=1.27 s t(norm)=0.0504653, mflops=99.078 (err=2.9e-16) 31. Ooura (F): elapsed time t=1.71 s, 65536 iters, t-(init.)=1.59 s t(norm)=0.0631809, mflops=79.1378 (err=2.9e-16) 32. QFT: elapsed time t=1.21 s, 32768 iters, t-(init.)=1.15 s t(norm)=0.0913938, mflops=54.7083 (err=5.9e-16) 33. Ransom: elapsed time t=1.47 s, 16384 iters, t-(init.)=1.45 s t(norm)=0.230471, mflops=21.6947 (err=2.5e-15) 34. SCIPORT: elapsed time t=1.26 s, 8192 iters, t-(init.)=1.24 s t(norm)=0.394185, mflops=12.6844 (err=2.0e-07) 35. Singleton: elapsed time t=1.97 s, 16384 iters, t-(init.)=1.95 s t(norm)=0.309944, mflops=16.1319 (err=3.5e-16) 36. Singleton (f2c): elapsed time t=1.16 s, 16384 iters, t-(init.)=1.13 s t(norm)=0.179609, mflops=27.8383 (err=4.6e-16) 37. Sorensen: elapsed time t=1.08 s, 32768 iters, t-(init.)=1.02 s t(norm)=0.0810623, mflops=61.6809 (err=3.4e-16) 38. Sorensen DIT: elapsed time t=1.15 s, 8192 iters, t-(init.)=1.14 s t(norm)=0.362396, mflops=13.7971 (err=5.4e-16) 39. Temperton: elapsed time t=1.78 s, 16384 iters, t-(init.)=1.75 s t(norm)=0.278155, mflops=17.9756 (err=3.9e-16) 40. Temperton (f2c): elapsed time t=1.1 s, 16384 iters, t-(init.)=1.07 s t(norm)=0.170072, mflops=29.3993 (err=3.7e-16) 41. Valkenburg: elapsed time t=1.55 s, 4096 iters, t-(init.)=1.55 s t(norm)=0.985463, mflops=5.07375 (err=8.5e-16) Top mflops for N=64 = 99.078 Normalized results and averages for N=64: fft 0: mflops = 33.825 (norm. = 0.341398), norm. avg. (of 6) = 0.323164 fft 1: mflops = 34.7594 (norm. = 0.350829), norm. avg. (of 6) = 0.328775 fft 2: mflops = 28.0869 (norm. = 0.283482), norm. avg. (of 6) = 0.220115 fft 3: mflops = 13.2173 (norm. = 0.133403), norm. avg. (of 6) = 0.0599253 fft 4: mflops = 34.1927 (norm. = 0.345109), norm. avg. (of 6) = 0.158319 fft 5: mflops = 7.41917 (norm. = 0.0748821), norm. avg. 
(of 6) = 0.0533339 fft 6: mflops = 17.5739 (norm. = 0.177374), norm. avg. (of 6) = 0.120173 fft 7: mflops = 15.8875 (norm. = 0.160354), norm. avg. (of 6) = 0.101232 fft 8: mflops = 17.0039 (norm. = 0.171622), norm. avg. (of 6) = 0.156135 fft 9: mflops = 62.9146 (norm. = 0.635), norm. avg. (of 6) = 0.278469 fft 10: mflops = 57.7198 (norm. = 0.582569), norm. avg. (of 6) = 0.213534 fft 11: mflops = 17.9756 (norm. = 0.181429), norm. avg. (of 5) = 0.127171 fft 12: mflops = 43.3894 (norm. = 0.437931), norm. avg. (of 6) = 0.267821 fft 13: mflops = 53.7731 (norm. = 0.542735), norm. avg. (of 6) = 0.289183 fft 14: mflops = 71.9024 (norm. = 0.725714), norm. avg. (of 6) = 0.669832 fft 15: mflops = 51.15 (norm. = 0.51626), norm. avg. (of 6) = 0.599923 fft 16: mflops = 32.768 (norm. = 0.330729), norm. avg. (of 6) = 0.888455 fft 17: mflops = 62.9146 (norm. = 0.635), norm. avg. (of 4) = 0.382546 fft 18: mflops = 19.9097 (norm. = 0.200949), norm. avg. (of 6) = 0.122191 fft 19: mflops = 33.1129 (norm. = 0.334211), norm. avg. (of 6) = 0.153655 fft 20: mflops = 27.3542 (norm. = 0.276087), norm. avg. (of 6) = 0.125853 fft 21: mflops = 68.7591 (norm. = 0.693989), norm. avg. (of 6) = 0.62741 fft 22: mflops = 29.1271 (norm. = 0.293981), norm. avg. (of 5) = 0.235314 fft 23: mflops = 36.7921 (norm. = 0.371345), norm. avg. (of 5) = 0.264533 fft 24: mflops = 43.3894 (norm. = 0.437931), norm. avg. (of 5) = 0.285792 fft 25: mflops = 23.4756 (norm. = 0.23694), norm. avg. (of 5) = 0.10954 fft 26: mflops = 11.5652 (norm. = 0.116728), norm. avg. (of 6) = 0.0571269 fft 27: mflops = 14.4299 (norm. = 0.145642), norm. avg. (of 6) = 0.0726921 fft 28: mflops = 40.3298 (norm. = 0.407051), norm. avg. (of 6) = 0.169866 fft 29: mflops = 30.541 (norm. = 0.308252), norm. avg. (of 6) = 0.135515 fft 30: mflops = 99.078 (norm. = 1), norm. avg. (of 6) = 0.626603 fft 31: mflops = 79.1378 (norm. = 0.798742), norm. avg. (of 6) = 0.511338 fft 32: mflops = 54.7083 (norm. = 0.552174), norm. avg. 
(of 3) = 0.453078 fft 33: mflops = 21.6947 (norm. = 0.218966), norm. avg. (of 5) = 0.0868933 fft 34: mflops = 12.6844 (norm. = 0.128024), norm. avg. (of 5) = 0.10204 fft 35: mflops = 16.1319 (norm. = 0.162821), norm. avg. (of 6) = 0.102274 fft 36: mflops = 27.8383 (norm. = 0.280973), norm. avg. (of 6) = 0.142548 fft 37: mflops = 61.6809 (norm. = 0.622549), norm. avg. (of 6) = 0.313651 fft 38: mflops = 13.7971 (norm. = 0.139254), norm. avg. (of 6) = 0.135361 fft 39: mflops = 17.9756 (norm. = 0.181429), norm. avg. (of 6) = 0.0994645 fft 40: mflops = 29.3993 (norm. = 0.296729), norm. avg. (of 6) = 0.135768 fft 41: mflops = 5.07375 (norm. = 0.0512097), norm. avg. (of 6) = 0.0471359 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.26 s, 8192 iters, t-(init.)=1.23 s t(norm)=0.167574, mflops=29.8375 (err=3.8e-16) 1. Arndt DIT: elapsed time t=1.26 s, 8192 iters, t-(init.)=1.23 s t(norm)=0.167574, mflops=29.8375 (err=5.1e-16) 2. Arndt Split-Radix: elapsed time t=1.22 s, 8192 iters, t-(init.)=1.2 s t(norm)=0.163487, mflops=30.5835 (err=6.1e-16) 3. Arndt 4-step: elapsed time t=1.58 s, 4096 iters, t-(init.)=1.57 s t(norm)=0.427791, mflops=11.6879 (err=3.3e-16) 4. Bailey: elapsed time t=1.86 s, 16384 iters, t-(init.)=1.81 s t(norm)=0.123296, mflops=40.5527 (err=6.1e-16) 5. Beauregard: elapsed time t=1.25 s, 2048 iters, t-(init.)=1.24 s t(norm)=0.675746, mflops=7.39923 (err=9.3e-16) 6. Bergland: elapsed time t=1.96 s, 8192 iters, t-(init.)=1.93 s t(norm)=0.262942, mflops=19.0156 (err=6.2e-16) 7. Brenner: elapsed time t=1.08 s, 4096 iters, t-(init.)=1.07 s t(norm)=0.291552, mflops=17.1496 (err=6.6e-16) 8. Burrus: elapsed time t=1 s, 4096 iters, t-(init.)=0.99 s t(norm)=0.269754, mflops=18.5354 (err=5.3e-16) 9. CWP (min N) (N=130): elapsed time t=1.06 s, 16384 iters, t-(init.)=1.01 s t(norm)=0.0688008, mflops=72.6736 10. CWP (best N) (N=140): elapsed time t=1.97 s, 32768 iters, t-(init.)=1.85 s t(norm)=0.0630106, mflops=79.3517 11. 
Edelblute: elapsed time t=1.9 s, 8192 iters, t-(init.)=1.87 s t(norm)=0.254767, mflops=19.6258 (err=6.7e-16) 12. FFTPACK: elapsed time t=1.55 s, 16384 iters, t-(init.)=1.49 s t(norm)=0.101498, mflops=49.262 (err=5.3e-16) 13. FFTPACK (f2c): elapsed time t=1.3 s, 16384 iters, t-(init.)=1.24 s t(norm)=0.0844683, mflops=59.1938 (err=6.2e-16) FFTW_MEASURE plan: (cost = 5.126953e-05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 8 14. FFTW: elapsed time t=1.71 s, 32768 iters, t-(init.)=1.6 s t(norm)=0.0544957, mflops=91.7504 (err=5.2e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.61 s, 16384 iters, t-(init.)=1.56 s t(norm)=0.106267, mflops=47.0515 (err=3.6e-16) 16. Frigo-old: elapsed time t=1.29 s, 8192 iters, t-(init.)=1.26 s t(norm)=0.171661, mflops=29.1271 (err=4.4e-16) 17. Green: elapsed time t=1.26 s, 16384 iters, t-(init.)=1.2 s t(norm)=0.0817435, mflops=61.1669 (err=6.9e-16) 18. GSL: elapsed time t=1.69 s, 8192 iters, t-(init.)=1.66 s t(norm)=0.226157, mflops=22.1085 (err=8.2e-16) 19. GSL DIT: elapsed time t=1.02 s, 8192 iters, t-(init.)=1 s t(norm)=0.136239, mflops=36.7002 (err=7.5e-16) 20. GSL DIF: elapsed time t=1.22 s, 8192 iters, t-(init.)=1.19 s t(norm)=0.162125, mflops=30.8405 (err=7.6e-16) 21. Krukar: elapsed time t=1.11 s, 16384 iters, t-(init.)=1.05 s t(norm)=0.0715256, mflops=69.9051 (err=6.5e-16) 22. Mayer (Buneman): elapsed time t=1.22 s, 8192 iters, t-(init.)=1.19 s t(norm)=0.162125, mflops=30.8405 (err=3.1e-16) 23. Mayer (simple): elapsed time t=1.88 s, 16384 iters, t-(init.)=1.81 s t(norm)=0.123296, mflops=40.5527 24. Mayer (lookup): elapsed time t=1.64 s, 16384 iters, t-(init.)=1.59 s t(norm)=0.10831, mflops=46.1637 (err=3.5e-16) 25. Monro: elapsed time t=1.41 s, 8192 iters, t-(init.)=1.38 s t(norm)=0.18801, mflops=26.5943 (err=4.7e-16) 26. NAPACK (f2c): elapsed time t=1.5 s, 4096 iters, t-(init.)=1.49 s t(norm)=0.405993, mflops=12.3155 (err=1.6e-15) 27. 
Nielsen: elapsed time t=1.15 s, 4096 iters, t-(init.)=1.13 s t(norm)=0.307901, mflops=16.239 (err=1.7e-15) 28. NR (C): elapsed time t=1.57 s, 16384 iters, t-(init.)=1.51 s t(norm)=0.102861, mflops=48.6095 (err=7.5e-16) 29. NR (F): elapsed time t=1.05 s, 8192 iters, t-(init.)=1.02 s t(norm)=0.138964, mflops=35.9805 (err=6.9e-16) 30. Ooura (C): elapsed time t=1.51 s, 32768 iters, t-(init.)=1.38 s t(norm)=0.0470025, mflops=106.377 (err=6.7e-16) 31. Ooura (F): elapsed time t=1.85 s, 32768 iters, t-(init.)=1.74 s t(norm)=0.059264, mflops=84.3682 (err=6.7e-16) 32. QFT: elapsed time t=1.47 s, 16384 iters, t-(init.)=1.41 s t(norm)=0.0960486, mflops=52.057 (err=4.9e-16) 33. Ransom: elapsed time t=1.74 s, 8192 iters, t-(init.)=1.71 s t(norm)=0.232969, mflops=21.4621 (err=1.7e-15) 34. SCIPORT: elapsed time t=1.49 s, 4096 iters, t-(init.)=1.47 s t(norm)=0.400543, mflops=12.483 (err=1.6e-07) 35. Singleton: elapsed time t=1.66 s, 8192 iters, t-(init.)=1.63 s t(norm)=0.22207, mflops=22.5154 (err=6.2e-16) 36. Singleton (f2c): elapsed time t=1.06 s, 8192 iters, t-(init.)=1.03 s t(norm)=0.140326, mflops=35.6312 (err=5.7e-16) 37. Sorensen: elapsed time t=1.09 s, 16384 iters, t-(init.)=1.03 s t(norm)=0.0701632, mflops=71.2624 (err=4.3e-16) 38. Sorensen DIT: elapsed time t=1.24 s, 4096 iters, t-(init.)=1.23 s t(norm)=0.335148, mflops=14.9188 (err=4.0e-16) 39. Temperton: elapsed time t=1.05 s, 4096 iters, t-(init.)=1.03 s t(norm)=0.280653, mflops=17.8156 (err=7.7e-16) 40. Temperton (f2c): elapsed time t=1.44 s, 8192 iters, t-(init.)=1.41 s t(norm)=0.192097, mflops=26.0285 (err=7.7e-16) 41. Valkenburg: elapsed time t=1.77 s, 2048 iters, t-(init.)=1.77 s t(norm)=0.964573, mflops=5.18364 (err=8.6e-16) Top mflops for N=128 = 106.377 Normalized results and averages for N=128: fft 0: mflops = 29.8375 (norm. = 0.280488), norm. avg. (of 7) = 0.317067 fft 1: mflops = 29.8375 (norm. = 0.280488), norm. avg. (of 7) = 0.321877 fft 2: mflops = 30.5835 (norm. = 0.2875), norm. avg. 
(of 7) = 0.229741 fft 3: mflops = 11.6879 (norm. = 0.109873), norm. avg. (of 7) = 0.0670606 fft 4: mflops = 40.5527 (norm. = 0.381215), norm. avg. (of 7) = 0.190162 fft 5: mflops = 7.39923 (norm. = 0.0695565), norm. avg. (of 7) = 0.0556514 fft 6: mflops = 19.0156 (norm. = 0.178756), norm. avg. (of 7) = 0.128542 fft 7: mflops = 17.1496 (norm. = 0.161215), norm. avg. (of 7) = 0.109801 fft 8: mflops = 18.5354 (norm. = 0.174242), norm. avg. (of 7) = 0.158722 fft 9: mflops = 72.6736 (norm. = 0.683168), norm. avg. (of 7) = 0.336283 fft 10: mflops = 79.3517 (norm. = 0.745946), norm. avg. (of 7) = 0.289593 fft 11: mflops = 19.6258 (norm. = 0.184492), norm. avg. (of 6) = 0.136724 fft 12: mflops = 49.262 (norm. = 0.463087), norm. avg. (of 7) = 0.295716 fft 13: mflops = 59.1938 (norm. = 0.556452), norm. avg. (of 7) = 0.327365 fft 14: mflops = 91.7504 (norm. = 0.8625), norm. avg. (of 7) = 0.697356 fft 15: mflops = 47.0515 (norm. = 0.442308), norm. avg. (of 7) = 0.577407 fft 16: mflops = 29.1271 (norm. = 0.27381), norm. avg. (of 7) = 0.800648 fft 17: mflops = 61.1669 (norm. = 0.575), norm. avg. (of 5) = 0.421037 fft 18: mflops = 22.1085 (norm. = 0.207831), norm. avg. (of 7) = 0.134425 fft 19: mflops = 36.7002 (norm. = 0.345), norm. avg. (of 7) = 0.18099 fft 20: mflops = 30.8405 (norm. = 0.289916), norm. avg. (of 7) = 0.149291 fft 21: mflops = 69.9051 (norm. = 0.657143), norm. avg. (of 7) = 0.631657 fft 22: mflops = 30.8405 (norm. = 0.289916), norm. avg. (of 6) = 0.244414 fft 23: mflops = 40.5527 (norm. = 0.381215), norm. avg. (of 6) = 0.28398 fft 24: mflops = 46.1637 (norm. = 0.433962), norm. avg. (of 6) = 0.310487 fft 25: mflops = 26.5943 (norm. = 0.25), norm. avg. (of 6) = 0.13295 fft 26: mflops = 12.3155 (norm. = 0.115772), norm. avg. (of 7) = 0.0655047 fft 27: mflops = 16.239 (norm. = 0.152655), norm. avg. (of 7) = 0.0841153 fft 28: mflops = 48.6095 (norm. = 0.456954), norm. avg. (of 7) = 0.210879 fft 29: mflops = 35.9805 (norm. = 0.338235), norm. avg. 
(of 7) = 0.164475 fft 30: mflops = 106.377 (norm. = 1), norm. avg. (of 7) = 0.679945 fft 31: mflops = 84.3682 (norm. = 0.793103), norm. avg. (of 7) = 0.55159 fft 32: mflops = 52.057 (norm. = 0.489362), norm. avg. (of 4) = 0.462149 fft 33: mflops = 21.4621 (norm. = 0.201754), norm. avg. (of 6) = 0.106037 fft 34: mflops = 12.483 (norm. = 0.117347), norm. avg. (of 6) = 0.104591 fft 35: mflops = 22.5154 (norm. = 0.211656), norm. avg. (of 7) = 0.1179 fft 36: mflops = 35.6312 (norm. = 0.334951), norm. avg. (of 7) = 0.170034 fft 37: mflops = 71.2624 (norm. = 0.669903), norm. avg. (of 7) = 0.364544 fft 38: mflops = 14.9188 (norm. = 0.140244), norm. avg. (of 7) = 0.136059 fft 39: mflops = 17.8156 (norm. = 0.167476), norm. avg. (of 7) = 0.10918 fft 40: mflops = 26.0285 (norm. = 0.244681), norm. avg. (of 7) = 0.151327 fft 41: mflops = 5.18364 (norm. = 0.0487288), norm. avg. (of 7) = 0.0473635 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.1 s, 4096 iters, t-(init.)=1.07 s t(norm)=0.127554, mflops=39.1991 (err=4.8e-16) 1. Arndt DIT: elapsed time t=1.08 s, 4096 iters, t-(init.)=1.05 s t(norm)=0.12517, mflops=39.9458 (err=5.1e-16) 2. Arndt Split-Radix: elapsed time t=1.28 s, 4096 iters, t-(init.)=1.26 s t(norm)=0.150204, mflops=33.2881 (err=5.5e-16) 3. Arndt 4-step: elapsed time t=1.56 s, 2048 iters, t-(init.)=1.54 s t(norm)=0.367165, mflops=13.6179 (err=5.7e-16) 4. Bailey: elapsed time t=1.06 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.123978, mflops=40.3298 (err=5.5e-16) 5. Beauregard: elapsed time t=1.43 s, 1024 iters, t-(init.)=1.43 s t(norm)=0.681877, mflops=7.3327 (err=4.8e-16) 6. Bergland: elapsed time t=1.03 s, 2048 iters, t-(init.)=1.02 s t(norm)=0.243187, mflops=20.5603 (err=5.7e-16) 7. Brenner: elapsed time t=1.2 s, 2048 iters, t-(init.)=1.19 s t(norm)=0.283718, mflops=17.6231 (err=4.8e-16) 8. Burrus: elapsed time t=1.03 s, 2048 iters, t-(init.)=1.02 s t(norm)=0.243187, mflops=20.5603 (err=5.4e-16) 9. 
CWP (min N) (N=260): elapsed time t=1.05 s, 8192 iters, t-(init.)=1 s t(norm)=0.0596046, mflops=83.8861 10. CWP (best N) (N=280): elapsed time t=1 s, 8192 iters, t-(init.)=0.94 s t(norm)=0.0560284, mflops=89.2405 11. Edelblute: elapsed time t=1 s, 2048 iters, t-(init.)=0.99 s t(norm)=0.236034, mflops=21.1834 (err=5.9e-16) 12. FFTPACK: elapsed time t=1.96 s, 8192 iters, t-(init.)=1.91 s t(norm)=0.113845, mflops=43.9194 (err=4.4e-16) 13. FFTPACK (f2c): elapsed time t=1.5 s, 8192 iters, t-(init.)=1.45 s t(norm)=0.0864267, mflops=57.8525 (err=4.5e-16) FFTW_MEASURE plan: (cost = 1.513672e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 16 14. FFTW: elapsed time t=1.22 s, 8192 iters, t-(init.)=1.17 s t(norm)=0.0697374, mflops=71.6975 (err=4.3e-16) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.5 s, 8192 iters, t-(init.)=1.45 s t(norm)=0.0864267, mflops=57.8525 (err=4.6e-16) 16. Frigo-old: elapsed time t=1.57 s, 4096 iters, t-(init.)=1.54 s t(norm)=0.183582, mflops=27.2357 (err=4.5e-16) 17. Green: elapsed time t=1.42 s, 8192 iters, t-(init.)=1.37 s t(norm)=0.0816584, mflops=61.2307 (err=4.8e-16) 18. GSL: elapsed time t=1.02 s, 2048 iters, t-(init.)=1.01 s t(norm)=0.240803, mflops=20.7639 (err=4.7e-16) 19. GSL DIT: elapsed time t=1.07 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.123978, mflops=40.3298 (err=5.0e-16) 20. GSL DIF: elapsed time t=1.27 s, 4096 iters, t-(init.)=1.24 s t(norm)=0.14782, mflops=33.825 (err=4.9e-16) 21. Krukar: elapsed time t=1.45 s, 8192 iters, t-(init.)=1.4 s t(norm)=0.0834465, mflops=59.9186 (err=5.0e-16) 22. Mayer (Buneman): elapsed time t=1.39 s, 4096 iters, t-(init.)=1.36 s t(norm)=0.162125, mflops=30.8405 (err=4.7e-16) 23. Mayer (simple): elapsed time t=1.05 s, 4096 iters, t-(init.)=1.02 s t(norm)=0.121593, mflops=41.1206 24. Mayer (lookup): elapsed time t=1.85 s, 8192 iters, t-(init.)=1.79 s t(norm)=0.106692, mflops=46.8637 (err=5.7e-16) 25. 
Monro: elapsed time t=1.48 s, 4096 iters, t-(init.)=1.45 s t(norm)=0.172853, mflops=28.9262 (err=5.8e-16) 26. NAPACK (f2c): elapsed time t=1.58 s, 2048 iters, t-(init.)=1.56 s t(norm)=0.371933, mflops=13.4433 (err=3.9e-15) 27. Nielsen: elapsed time t=1.33 s, 2048 iters, t-(init.)=1.32 s t(norm)=0.314713, mflops=15.8875 (err=3.8e-15) 28. NR (C): elapsed time t=1.62 s, 8192 iters, t-(init.)=1.56 s t(norm)=0.0929832, mflops=53.7731 (err=4.9e-16) 29. NR (F): elapsed time t=1.1 s, 4096 iters, t-(init.)=1.08 s t(norm)=0.128746, mflops=38.8361 (err=4.5e-16) 30. Ooura (C): elapsed time t=1.66 s, 16384 iters, t-(init.)=1.53 s t(norm)=0.0455976, mflops=109.655 (err=5.0e-16) 31. Ooura (F): elapsed time t=1.03 s, 8192 iters, t-(init.)=0.98 s t(norm)=0.0584126, mflops=85.598 (err=5.0e-16) 32. QFT: elapsed time t=1.83 s, 8192 iters, t-(init.)=1.78 s t(norm)=0.106096, mflops=47.127 (err=7.0e-16) 33. Ransom: elapsed time t=1.47 s, 4096 iters, t-(init.)=1.44 s t(norm)=0.171661, mflops=29.1271 (err=2.0e-15) 34. SCIPORT: elapsed time t=1.75 s, 2048 iters, t-(init.)=1.73 s t(norm)=0.412464, mflops=12.1223 (err=1.4e-07) 35. Singleton: elapsed time t=1.24 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.293255, mflops=17.05 (err=5.0e-16) 36. Singleton (f2c): elapsed time t=1.39 s, 4096 iters, t-(init.)=1.36 s t(norm)=0.162125, mflops=30.8405 (err=5.4e-16) 37. Sorensen: elapsed time t=1.15 s, 8192 iters, t-(init.)=1.1 s t(norm)=0.0655651, mflops=76.2601 (err=6.0e-16) 38. Sorensen DIT: elapsed time t=1.31 s, 2048 iters, t-(init.)=1.29 s t(norm)=0.30756, mflops=16.257 (err=5.7e-16) 39. Temperton: elapsed time t=1.17 s, 2048 iters, t-(init.)=1.16 s t(norm)=0.276566, mflops=18.0789 (err=4.3e-16) 40. Temperton (f2c): elapsed time t=1.49 s, 4096 iters, t-(init.)=1.46 s t(norm)=0.174046, mflops=28.7281 (err=4.5e-16) 41. 
Valkenburg: elapsed time t=1.01 s, 512 iters, t-(init.)=1 s t(norm)=0.953674, mflops=5.24288 (err=6.4e-16) Top mflops for N=256 = 109.655 Normalized results and averages for N=256: fft 0: mflops = 39.1991 (norm. = 0.357477), norm. avg. (of 8) = 0.322118 fft 1: mflops = 39.9458 (norm. = 0.364286), norm. avg. (of 8) = 0.327178 fft 2: mflops = 33.2881 (norm. = 0.303571), norm. avg. (of 8) = 0.23897 fft 3: mflops = 13.6179 (norm. = 0.124188), norm. avg. (of 8) = 0.0742016 fft 4: mflops = 40.3298 (norm. = 0.367788), norm. avg. (of 8) = 0.212365 fft 5: mflops = 7.3327 (norm. = 0.0668706), norm. avg. (of 8) = 0.0570538 fft 6: mflops = 20.5603 (norm. = 0.1875), norm. avg. (of 8) = 0.135912 fft 7: mflops = 17.6231 (norm. = 0.160714), norm. avg. (of 8) = 0.116165 fft 8: mflops = 20.5603 (norm. = 0.1875), norm. avg. (of 8) = 0.162319 fft 9: mflops = 83.8861 (norm. = 0.765), norm. avg. (of 8) = 0.389873 fft 10: mflops = 89.2405 (norm. = 0.81383), norm. avg. (of 8) = 0.355122 fft 11: mflops = 21.1834 (norm. = 0.193182), norm. avg. (of 7) = 0.14479 fft 12: mflops = 43.9194 (norm. = 0.400524), norm. avg. (of 8) = 0.308817 fft 13: mflops = 57.8525 (norm. = 0.527586), norm. avg. (of 8) = 0.352392 fft 14: mflops = 71.6975 (norm. = 0.653846), norm. avg. (of 8) = 0.691918 fft 15: mflops = 57.8525 (norm. = 0.527586), norm. avg. (of 8) = 0.571179 fft 16: mflops = 27.2357 (norm. = 0.248377), norm. avg. (of 8) = 0.731614 fft 17: mflops = 61.2307 (norm. = 0.558394), norm. avg. (of 6) = 0.44393 fft 18: mflops = 20.7639 (norm. = 0.189356), norm. avg. (of 8) = 0.141292 fft 19: mflops = 40.3298 (norm. = 0.367788), norm. avg. (of 8) = 0.20434 fft 20: mflops = 33.825 (norm. = 0.308468), norm. avg. (of 8) = 0.169188 fft 21: mflops = 59.9186 (norm. = 0.546429), norm. avg. (of 8) = 0.621004 fft 22: mflops = 30.8405 (norm. = 0.28125), norm. avg. (of 7) = 0.249677 fft 23: mflops = 41.1206 (norm. = 0.375), norm. avg. (of 7) = 0.296983 fft 24: mflops = 46.8637 (norm. = 0.427374), norm. avg. 
(of 7) = 0.327185 fft 25: mflops = 28.9262 (norm. = 0.263793), norm. avg. (of 7) = 0.151642 fft 26: mflops = 13.4433 (norm. = 0.122596), norm. avg. (of 8) = 0.0726412 fft 27: mflops = 15.8875 (norm. = 0.144886), norm. avg. (of 8) = 0.0917117 fft 28: mflops = 53.7731 (norm. = 0.490385), norm. avg. (of 8) = 0.245817 fft 29: mflops = 38.8361 (norm. = 0.354167), norm. avg. (of 8) = 0.188187 fft 30: mflops = 109.655 (norm. = 1), norm. avg. (of 8) = 0.719952 fft 31: mflops = 85.598 (norm. = 0.780612), norm. avg. (of 8) = 0.580218 fft 32: mflops = 47.127 (norm. = 0.429775), norm. avg. (of 5) = 0.455674 fft 33: mflops = 29.1271 (norm. = 0.265625), norm. avg. (of 7) = 0.128835 fft 34: mflops = 12.1223 (norm. = 0.110549), norm. avg. (of 7) = 0.105443 fft 35: mflops = 17.05 (norm. = 0.155488), norm. avg. (of 8) = 0.122599 fft 36: mflops = 30.8405 (norm. = 0.28125), norm. avg. (of 8) = 0.183936 fft 37: mflops = 76.2601 (norm. = 0.695455), norm. avg. (of 8) = 0.405908 fft 38: mflops = 16.257 (norm. = 0.148256), norm. avg. (of 8) = 0.137583 fft 39: mflops = 18.0789 (norm. = 0.164871), norm. avg. (of 8) = 0.116142 fft 40: mflops = 28.7281 (norm. = 0.261986), norm. avg. (of 8) = 0.165159 fft 41: mflops = 5.24288 (norm. = 0.0478125), norm. avg. (of 8) = 0.0474196 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.45 s, 2048 iters, t-(init.)=1.42 s t(norm)=0.150469, mflops=33.2295 (err=5.4e-16) 1. Arndt DIT: elapsed time t=1.42 s, 2048 iters, t-(init.)=1.39 s t(norm)=0.14729, mflops=33.9467 (err=5.5e-16) 2. Arndt Split-Radix: elapsed time t=1.4 s, 2048 iters, t-(init.)=1.37 s t(norm)=0.14517, mflops=34.4423 (err=6.0e-16) 3. Arndt 4-step: elapsed time t=1.79 s, 1024 iters, t-(init.)=1.78 s t(norm)=0.377231, mflops=13.2545 (err=5.5e-16) 4. Bailey: elapsed time t=1.38 s, 2048 iters, t-(init.)=1.35 s t(norm)=0.143051, mflops=34.9525 (err=5.7e-16) 5. Beauregard: elapsed time t=1.61 s, 512 iters, t-(init.)=1.6 s t(norm)=0.678168, mflops=7.3728 (err=6.6e-16) 6. 
Bergland: elapsed time t=1.25 s, 1024 iters, t-(init.)=1.23 s t(norm)=0.260671, mflops=19.1813 (err=5.5e-16) 7. Brenner: elapsed time t=1.3 s, 1024 iters, t-(init.)=1.28 s t(norm)=0.271267, mflops=18.432 (err=5.7e-16) 8. Burrus: elapsed time t=1.12 s, 1024 iters, t-(init.)=1.1 s t(norm)=0.23312, mflops=21.4481 (err=5.8e-16) 9. CWP (min N) (N=520): elapsed time t=1.11 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.0551012, mflops=90.7422 10. CWP (best N) (N=560): elapsed time t=1.1 s, 4096 iters, t-(init.)=1.03 s t(norm)=0.0545714, mflops=91.6231 11. Edelblute: elapsed time t=1.1 s, 1024 iters, t-(init.)=1.09 s t(norm)=0.231001, mflops=21.6449 (err=6.1e-16) 12. FFTPACK: elapsed time t=1.3 s, 2048 iters, t-(init.)=1.27 s t(norm)=0.134574, mflops=37.1543 (err=5.5e-16) 13. FFTPACK (f2c): elapsed time t=1.15 s, 2048 iters, t-(init.)=1.11 s t(norm)=0.11762, mflops=42.5098 (err=5.4e-16) FFTW_MEASURE plan: (cost = 3.125000e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_NOTW 8 14. FFTW: elapsed time t=1.26 s, 4096 iters, t-(init.)=1.2 s t(norm)=0.0635783, mflops=78.6432 (err=5.9e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.74 s, 4096 iters, t-(init.)=1.68 s t(norm)=0.0890096, mflops=56.1737 (err=5.4e-16) 16. Frigo-old: elapsed time t=1.36 s, 2048 iters, t-(init.)=1.33 s t(norm)=0.140932, mflops=35.4781 (err=5.4e-16) 17. Green: elapsed time t=1.6 s, 4096 iters, t-(init.)=1.54 s t(norm)=0.0815921, mflops=61.2804 (err=5.9e-16) 18. GSL: elapsed time t=1.15 s, 1024 iters, t-(init.)=1.14 s t(norm)=0.241597, mflops=20.6956 (err=6.1e-16) 19. GSL DIT: elapsed time t=1.19 s, 2048 iters, t-(init.)=1.16 s t(norm)=0.122918, mflops=40.6775 (err=5.9e-16) 20. GSL DIF: elapsed time t=1.43 s, 2048 iters, t-(init.)=1.39 s t(norm)=0.14729, mflops=33.9467 (err=5.6e-16) 21. Krukar: elapsed time t=1.68 s, 4096 iters, t-(init.)=1.61 s t(norm)=0.0853009, mflops=58.616 (err=5.6e-16) 22. 
Mayer (Buneman): elapsed time t=1.45 s, 2048 iters, t-(init.)=1.42 s t(norm)=0.150469, mflops=33.2295 (err=5.7e-16) 23. Mayer (simple): elapsed time t=1.12 s, 2048 iters, t-(init.)=1.09 s t(norm)=0.115501, mflops=43.2898 24. Mayer (lookup): elapsed time t=1 s, 2048 iters, t-(init.)=0.97 s t(norm)=0.102785, mflops=48.6453 (err=5.1e-16) 25. Monro: elapsed time t=1.71 s, 2048 iters, t-(init.)=1.68 s t(norm)=0.178019, mflops=28.0869 (err=6.0e-16) 26. NAPACK (f2c): elapsed time t=1.81 s, 1024 iters, t-(init.)=1.8 s t(norm)=0.38147, mflops=13.1072 (err=6.0e-15) 27. Nielsen: elapsed time t=1.55 s, 1024 iters, t-(init.)=1.54 s t(norm)=0.326369, mflops=15.3201 (err=3.0e-15) 28. NR (C): elapsed time t=1.8 s, 4096 iters, t-(init.)=1.73 s t(norm)=0.0916587, mflops=54.5502 (err=5.9e-16) 29. NR (F): elapsed time t=1.17 s, 2048 iters, t-(init.)=1.14 s t(norm)=0.120799, mflops=41.3912 (err=5.8e-16) 30. Ooura (C): elapsed time t=1.91 s, 8192 iters, t-(init.)=1.78 s t(norm)=0.0471539, mflops=106.036 (err=5.3e-16) 31. Ooura (F): elapsed time t=1.17 s, 4096 iters, t-(init.)=1.1 s t(norm)=0.0582801, mflops=85.7926 (err=5.3e-16) 32. QFT: elapsed time t=1.15 s, 2048 iters, t-(init.)=1.12 s t(norm)=0.118679, mflops=42.1303 (err=7.4e-16) 33. Ransom: elapsed time t=1.74 s, 2048 iters, t-(init.)=1.71 s t(norm)=0.181198, mflops=27.5941 (err=1.6e-15) 34. SCIPORT: elapsed time t=1.04 s, 512 iters, t-(init.)=1.03 s t(norm)=0.436571, mflops=11.4529 (err=1.3e-07) 35. Singleton: elapsed time t=1.29 s, 1024 iters, t-(init.)=1.27 s t(norm)=0.269148, mflops=18.5771 (err=7.9e-16) 36. Singleton (f2c): elapsed time t=1.47 s, 2048 iters, t-(init.)=1.44 s t(norm)=0.152588, mflops=32.768 (err=8.0e-16) 37. Sorensen: elapsed time t=1.46 s, 4096 iters, t-(init.)=1.39 s t(norm)=0.0736448, mflops=67.8934 (err=5.7e-16) 38. Sorensen DIT: elapsed time t=1.43 s, 1024 iters, t-(init.)=1.41 s t(norm)=0.298818, mflops=16.7326 (err=5.5e-16) 39. 
Temperton: elapsed time t=1.37 s, 1024 iters, t-(init.)=1.35 s t(norm)=0.286102, mflops=17.4763 (err=6.3e-16) 40. Temperton (f2c): elapsed time t=1.93 s, 2048 iters, t-(init.)=1.89 s t(norm)=0.200272, mflops=24.9661 (err=6.1e-16) 41. Valkenburg: elapsed time t=1.14 s, 256 iters, t-(init.)=1.14 s t(norm)=0.96639, mflops=5.17389 (err=6.6e-16) Top mflops for N=512 = 106.036 Normalized results and averages for N=512: fft 0: mflops = 33.2295 (norm. = 0.31338), norm. avg. (of 9) = 0.321148 fft 1: mflops = 33.9467 (norm. = 0.320144), norm. avg. (of 9) = 0.326397 fft 2: mflops = 34.4423 (norm. = 0.324818), norm. avg. (of 9) = 0.248509 fft 3: mflops = 13.2545 (norm. = 0.125), norm. avg. (of 9) = 0.0798459 fft 4: mflops = 34.9525 (norm. = 0.32963), norm. avg. (of 9) = 0.225394 fft 5: mflops = 7.3728 (norm. = 0.0695312), norm. avg. (of 9) = 0.0584402 fft 6: mflops = 19.1813 (norm. = 0.180894), norm. avg. (of 9) = 0.14091 fft 7: mflops = 18.432 (norm. = 0.173828), norm. avg. (of 9) = 0.122572 fft 8: mflops = 21.4481 (norm. = 0.202273), norm. avg. (of 9) = 0.166758 fft 9: mflops = 90.7422 (norm. = 0.855769), norm. avg. (of 9) = 0.441639 fft 10: mflops = 91.6231 (norm. = 0.864078), norm. avg. (of 9) = 0.411673 fft 11: mflops = 21.6449 (norm. = 0.204128), norm. avg. (of 8) = 0.152207 fft 12: mflops = 37.1543 (norm. = 0.350394), norm. avg. (of 9) = 0.313436 fft 13: mflops = 42.5098 (norm. = 0.400901), norm. avg. (of 9) = 0.357782 fft 14: mflops = 78.6432 (norm. = 0.741667), norm. avg. (of 9) = 0.697445 fft 15: mflops = 56.1737 (norm. = 0.529762), norm. avg. (of 9) = 0.566577 fft 16: mflops = 35.4781 (norm. = 0.334586), norm. avg. (of 9) = 0.6875 fft 17: mflops = 61.2804 (norm. = 0.577922), norm. avg. (of 7) = 0.463071 fft 18: mflops = 20.6956 (norm. = 0.195175), norm. avg. (of 9) = 0.147279 fft 19: mflops = 40.6775 (norm. = 0.383621), norm. avg. (of 9) = 0.22426 fft 20: mflops = 33.9467 (norm. = 0.320144), norm. avg. (of 9) = 0.185961 fft 21: mflops = 58.616 (norm. 
= 0.552795), norm. avg. (of 9) = 0.613425 fft 22: mflops = 33.2295 (norm. = 0.31338), norm. avg. (of 8) = 0.25764 fft 23: mflops = 43.2898 (norm. = 0.408257), norm. avg. (of 8) = 0.310892 fft 24: mflops = 48.6453 (norm. = 0.458763), norm. avg. (of 8) = 0.343632 fft 25: mflops = 28.0869 (norm. = 0.264881), norm. avg. (of 8) = 0.165797 fft 26: mflops = 13.1072 (norm. = 0.123611), norm. avg. (of 9) = 0.0783045 fft 27: mflops = 15.3201 (norm. = 0.144481), norm. avg. (of 9) = 0.0975749 fft 28: mflops = 54.5502 (norm. = 0.514451), norm. avg. (of 9) = 0.275665 fft 29: mflops = 41.3912 (norm. = 0.390351), norm. avg. (of 9) = 0.210649 fft 30: mflops = 106.036 (norm. = 1), norm. avg. (of 9) = 0.751068 fft 31: mflops = 85.7926 (norm. = 0.809091), norm. avg. (of 9) = 0.605648 fft 32: mflops = 42.1303 (norm. = 0.397321), norm. avg. (of 6) = 0.445949 fft 33: mflops = 27.5941 (norm. = 0.260234), norm. avg. (of 8) = 0.14526 fft 34: mflops = 11.4529 (norm. = 0.10801), norm. avg. (of 8) = 0.105763 fft 35: mflops = 18.5771 (norm. = 0.175197), norm. avg. (of 9) = 0.128443 fft 36: mflops = 32.768 (norm. = 0.309028), norm. avg. (of 9) = 0.197835 fft 37: mflops = 67.8934 (norm. = 0.640288), norm. avg. (of 9) = 0.43195 fft 38: mflops = 16.7326 (norm. = 0.157801), norm. avg. (of 9) = 0.13983 fft 39: mflops = 17.4763 (norm. = 0.164815), norm. avg. (of 9) = 0.12155 fft 40: mflops = 24.9661 (norm. = 0.23545), norm. avg. (of 9) = 0.172969 fft 41: mflops = 5.17389 (norm. = 0.0487939), norm. avg. (of 9) = 0.0475723 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.89 s, 1024 iters, t-(init.)=1.84 s t(norm)=0.175476, mflops=28.4939 (err=5.2e-16) 1. Arndt DIT: elapsed time t=1.82 s, 1024 iters, t-(init.)=1.78 s t(norm)=0.169754, mflops=29.4544 (err=4.9e-16) 2. Arndt Split-Radix: elapsed time t=1.1 s, 512 iters, t-(init.)=1.08 s t(norm)=0.205994, mflops=24.2726 (err=5.1e-16) 3. 
Arndt 4-step: elapsed time t=1.89 s, 512 iters, t-(init.)=1.87 s t(norm)=0.356674, mflops=14.0184 (err=4.4e-16) 4. Bailey: elapsed time t=1.85 s, 1024 iters, t-(init.)=1.8 s t(norm)=0.171661, mflops=29.1271 (err=5.6e-16) 5. Beauregard: elapsed time t=1.84 s, 256 iters, t-(init.)=1.83 s t(norm)=0.69809, mflops=7.1624 (err=5.1e-16) 6. Bergland: elapsed time t=1.41 s, 512 iters, t-(init.)=1.39 s t(norm)=0.265121, mflops=18.8593 (err=5.0e-16) 7. Brenner: elapsed time t=1.5 s, 512 iters, t-(init.)=1.48 s t(norm)=0.282288, mflops=17.7124 (err=5.1e-16) 8. Burrus: elapsed time t=1.47 s, 512 iters, t-(init.)=1.45 s t(norm)=0.276566, mflops=18.0789 (err=5.2e-16) 9. CWP (min N) (N=1040): elapsed time t=1.26 s, 2048 iters, t-(init.)=1.17 s t(norm)=0.0557899, mflops=89.6219 10. CWP (best N) (N=1040): elapsed time t=1.26 s, 2048 iters, t-(init.)=1.17 s t(norm)=0.0557899, mflops=89.6219 11. Edelblute: elapsed time t=1.45 s, 512 iters, t-(init.)=1.43 s t(norm)=0.272751, mflops=18.3317 (err=5.2e-16) 12. FFTPACK: elapsed time t=1.5 s, 1024 iters, t-(init.)=1.46 s t(norm)=0.139236, mflops=35.9101 (err=4.9e-16) 13. FFTPACK (f2c): elapsed time t=1.28 s, 1024 iters, t-(init.)=1.24 s t(norm)=0.118256, mflops=42.2813 (err=4.7e-16) FFTW_MEASURE plan: (cost = 7.812500e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 4 14. FFTW: elapsed time t=1.59 s, 2048 iters, t-(init.)=1.51 s t(norm)=0.0720024, mflops=69.4421 (err=4.9e-16) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.83 s, 2048 iters, t-(init.)=1.75 s t(norm)=0.0834465, mflops=59.9186 (err=4.8e-16) 16. Frigo-old: elapsed time t=1.12 s, 1024 iters, t-(init.)=1.08 s t(norm)=0.102997, mflops=48.5452 (err=4.7e-16) 17. Green: elapsed time t=1.2 s, 1024 iters, t-(init.)=1.16 s t(norm)=0.110626, mflops=45.1972 (err=5.9e-16) 18. GSL: elapsed time t=1.36 s, 512 iters, t-(init.)=1.34 s t(norm)=0.255585, mflops=19.563 (err=4.9e-16) 19. 
GSL DIT: elapsed time t=1.04 s, 512 iters, t-(init.)=1.02 s t(norm)=0.19455, mflops=25.7004 (err=5.1e-16) 20. GSL DIF: elapsed time t=1.1 s, 512 iters, t-(init.)=1.08 s t(norm)=0.205994, mflops=24.2726 (err=4.9e-16) 21. Krukar: elapsed time t=1.58 s, 1024 iters, t-(init.)=1.54 s t(norm)=0.146866, mflops=34.0447 (err=5.4e-16) 22. Mayer (Buneman): elapsed time t=1.68 s, 1024 iters, t-(init.)=1.64 s t(norm)=0.156403, mflops=31.9688 (err=4.6e-16) 23. Mayer (simple): elapsed time t=1.34 s, 1024 iters, t-(init.)=1.29 s t(norm)=0.123024, mflops=40.6425 24. Mayer (lookup): elapsed time t=1.31 s, 1024 iters, t-(init.)=1.27 s t(norm)=0.121117, mflops=41.2825 (err=4.6e-16) 25. Monro: elapsed time t=1.18 s, 512 iters, t-(init.)=1.16 s t(norm)=0.221252, mflops=22.5986 (err=5.4e-16) 26. NAPACK (f2c): elapsed time t=1.92 s, 512 iters, t-(init.)=1.9 s t(norm)=0.362396, mflops=13.7971 (err=1.5e-14) 27. Nielsen: elapsed time t=1.73 s, 512 iters, t-(init.)=1.71 s t(norm)=0.326157, mflops=15.3301 (err=6.2e-15) 28. NR (C): elapsed time t=1.77 s, 1024 iters, t-(init.)=1.73 s t(norm)=0.164986, mflops=30.3057 (err=5.1e-16) 29. NR (F): elapsed time t=1.89 s, 1024 iters, t-(init.)=1.84 s t(norm)=0.175476, mflops=28.4939 (err=5.0e-16) 30. Ooura (C): elapsed time t=1.42 s, 2048 iters, t-(init.)=1.33 s t(norm)=0.0634193, mflops=78.8403 (err=4.6e-16) 31. Ooura (F): elapsed time t=1.57 s, 2048 iters, t-(init.)=1.49 s t(norm)=0.0710487, mflops=70.3742 (err=4.6e-16) 32. QFT: elapsed time t=1.37 s, 1024 iters, t-(init.)=1.33 s t(norm)=0.126839, mflops=39.4202 (err=9.5e-16) 33. Ransom: elapsed time t=1.75 s, 1024 iters, t-(init.)=1.71 s t(norm)=0.163078, mflops=30.6601 (err=1.8e-15) 34. SCIPORT: elapsed time t=1.18 s, 256 iters, t-(init.)=1.17 s t(norm)=0.44632, mflops=11.2027 (err=1.4e-07) 35. Singleton: elapsed time t=1.6 s, 512 iters, t-(init.)=1.58 s t(norm)=0.301361, mflops=16.5914 (err=6.0e-16) 36. 
Singleton (f2c): elapsed time t=1.85 s, 1024 iters, t-(init.)=1.81 s t(norm)=0.172615, mflops=28.9662 (err=6.0e-16) 37. Sorensen: elapsed time t=1.33 s, 1024 iters, t-(init.)=1.29 s t(norm)=0.123024, mflops=40.6425 (err=4.9e-16) 38. Sorensen DIT: elapsed time t=1.8 s, 512 iters, t-(init.)=1.78 s t(norm)=0.339508, mflops=14.7272 (err=4.9e-16) 39. Temperton: elapsed time t=1.49 s, 512 iters, t-(init.)=1.47 s t(norm)=0.28038, mflops=17.8329 (err=4.9e-16) 40. Temperton (f2c): elapsed time t=1.02 s, 512 iters, t-(init.)=1 s t(norm)=0.190735, mflops=26.2144 (err=4.8e-16) 41. Valkenburg: elapsed time t=1.27 s, 128 iters, t-(init.)=1.27 s t(norm)=0.968933, mflops=5.16031 (err=8.2e-16) Top mflops for N=1024 = 89.6219 Normalized results and averages for N=1024: fft 0: mflops = 28.4939 (norm. = 0.317935), norm. avg. (of 10) = 0.320826 fft 1: mflops = 29.4544 (norm. = 0.328652), norm. avg. (of 10) = 0.326622 fft 2: mflops = 24.2726 (norm. = 0.270833), norm. avg. (of 10) = 0.250741 fft 3: mflops = 14.0184 (norm. = 0.156417), norm. avg. (of 10) = 0.087503 fft 4: mflops = 29.1271 (norm. = 0.325), norm. avg. (of 10) = 0.235355 fft 5: mflops = 7.1624 (norm. = 0.079918), norm. avg. (of 10) = 0.060588 fft 6: mflops = 18.8593 (norm. = 0.210432), norm. avg. (of 10) = 0.147862 fft 7: mflops = 17.7124 (norm. = 0.197635), norm. avg. (of 10) = 0.130079 fft 8: mflops = 18.0789 (norm. = 0.201724), norm. avg. (of 10) = 0.170255 fft 9: mflops = 89.6219 (norm. = 1), norm. avg. (of 10) = 0.497475 fft 10: mflops = 89.6219 (norm. = 1), norm. avg. (of 10) = 0.470506 fft 11: mflops = 18.3317 (norm. = 0.204545), norm. avg. (of 9) = 0.158022 fft 12: mflops = 35.9101 (norm. = 0.400685), norm. avg. (of 10) = 0.322161 fft 13: mflops = 42.2813 (norm. = 0.471774), norm. avg. (of 10) = 0.369181 fft 14: mflops = 69.4421 (norm. = 0.774834), norm. avg. (of 10) = 0.705184 fft 15: mflops = 59.9186 (norm. = 0.668571), norm. avg. (of 10) = 0.576777 fft 16: mflops = 48.5452 (norm. = 0.541667), norm. avg. 
(of 10) = 0.672917 fft 17: mflops = 45.1972 (norm. = 0.50431), norm. avg. (of 8) = 0.468226 fft 18: mflops = 19.563 (norm. = 0.218284), norm. avg. (of 10) = 0.154379 fft 19: mflops = 25.7004 (norm. = 0.286765), norm. avg. (of 10) = 0.230511 fft 20: mflops = 24.2726 (norm. = 0.270833), norm. avg. (of 10) = 0.194448 fft 21: mflops = 34.0447 (norm. = 0.37987), norm. avg. (of 10) = 0.590069 fft 22: mflops = 31.9688 (norm. = 0.356707), norm. avg. (of 9) = 0.268647 fft 23: mflops = 40.6425 (norm. = 0.453488), norm. avg. (of 9) = 0.326736 fft 24: mflops = 41.2825 (norm. = 0.46063), norm. avg. (of 9) = 0.356632 fft 25: mflops = 22.5986 (norm. = 0.252155), norm. avg. (of 9) = 0.175392 fft 26: mflops = 13.7971 (norm. = 0.153947), norm. avg. (of 10) = 0.0858688 fft 27: mflops = 15.3301 (norm. = 0.171053), norm. avg. (of 10) = 0.104923 fft 28: mflops = 30.3057 (norm. = 0.33815), norm. avg. (of 10) = 0.281914 fft 29: mflops = 28.4939 (norm. = 0.317935), norm. avg. (of 10) = 0.221378 fft 30: mflops = 78.8403 (norm. = 0.879699), norm. avg. (of 10) = 0.763932 fft 31: mflops = 70.3742 (norm. = 0.785235), norm. avg. (of 10) = 0.623607 fft 32: mflops = 39.4202 (norm. = 0.43985), norm. avg. (of 7) = 0.445078 fft 33: mflops = 30.6601 (norm. = 0.342105), norm. avg. (of 9) = 0.167132 fft 34: mflops = 11.2027 (norm. = 0.125), norm. avg. (of 9) = 0.107901 fft 35: mflops = 16.5914 (norm. = 0.185127), norm. avg. (of 10) = 0.134111 fft 36: mflops = 28.9662 (norm. = 0.323204), norm. avg. (of 10) = 0.210372 fft 37: mflops = 40.6425 (norm. = 0.453488), norm. avg. (of 10) = 0.434104 fft 38: mflops = 14.7272 (norm. = 0.164326), norm. avg. (of 10) = 0.142279 fft 39: mflops = 17.8329 (norm. = 0.19898), norm. avg. (of 10) = 0.129293 fft 40: mflops = 26.2144 (norm. = 0.2925), norm. avg. (of 10) = 0.184922 fft 41: mflops = 5.16031 (norm. = 0.0575787), norm. avg. (of 10) = 0.0485729 Benchmarking for array size = 2048 (power of 2): 0. 
Arndt DIF: elapsed time t=1.11 s, 256 iters, t-(init.)=1.09 s t(norm)=0.189001, mflops=26.4549 (err=4.6e-16) 1. Arndt DIT: elapsed time t=1.08 s, 256 iters, t-(init.)=1.06 s t(norm)=0.183799, mflops=27.2036 (err=4.6e-16) 2. Arndt Split-Radix: elapsed time t=1.25 s, 256 iters, t-(init.)=1.23 s t(norm)=0.213276, mflops=23.4438 (err=4.8e-16) 3. Arndt 4-step: elapsed time t=1.12 s, 128 iters, t-(init.)=1.11 s t(norm)=0.384938, mflops=12.9891 (err=4.7e-16) 4. Bailey: elapsed time t=1.02 s, 256 iters, t-(init.)=1 s t(norm)=0.173395, mflops=28.8358 (err=5.0e-16) 5. Beauregard: elapsed time t=1.02 s, 64 iters, t-(init.)=1.02 s t(norm)=0.707453, mflops=7.06761 (err=4.8e-16) 6. Bergland: elapsed time t=1.46 s, 256 iters, t-(init.)=1.44 s t(norm)=0.249689, mflops=20.0249 (err=4.9e-16) 7. Brenner: elapsed time t=1.62 s, 256 iters, t-(init.)=1.6 s t(norm)=0.277433, mflops=18.0224 (err=5.0e-16) 8. Burrus: elapsed time t=1.61 s, 256 iters, t-(init.)=1.59 s t(norm)=0.275699, mflops=18.1357 (err=4.6e-16) 9. CWP (min N) (N=2145): elapsed time t=1.77 s, 1024 iters, t-(init.)=1.68 s t(norm)=0.072826, mflops=68.6568 10. CWP (best N) (N=2184): elapsed time t=1.64 s, 1024 iters, t-(init.)=1.55 s t(norm)=0.0671907, mflops=74.4151 11. Edelblute: elapsed time t=1.6 s, 256 iters, t-(init.)=1.58 s t(norm)=0.273965, mflops=18.2505 (err=4.7e-16) 12. FFTPACK: elapsed time t=1.61 s, 512 iters, t-(init.)=1.56 s t(norm)=0.135248, mflops=36.969 (err=4.5e-16) 13. FFTPACK (f2c): elapsed time t=1.41 s, 512 iters, t-(init.)=1.36 s t(norm)=0.117909, mflops=42.4056 (err=4.6e-16) FFTW_MEASURE plan: (cost = 1.523438e-03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 8 14. FFTW: elapsed time t=1.51 s, 1024 iters, t-(init.)=1.43 s t(norm)=0.0619888, mflops=80.6597 (err=4.6e-16) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.2 s, 512 iters, t-(init.)=1.16 s t(norm)=0.100569, mflops=49.717 (err=4.4e-16) 16. 
Frigo-old: elapsed time t=1.2 s, 512 iters, t-(init.)=1.16 s t(norm)=0.100569, mflops=49.717 (err=4.6e-16) 17. Green: elapsed time t=1.41 s, 512 iters, t-(init.)=1.37 s t(norm)=0.118776, mflops=42.0961 (err=5.8e-16) 18. GSL: elapsed time t=1.41 s, 256 iters, t-(init.)=1.39 s t(norm)=0.24102, mflops=20.7452 (err=4.7e-16) 19. GSL DIT: elapsed time t=1.16 s, 256 iters, t-(init.)=1.14 s t(norm)=0.197671, mflops=25.2946 (err=4.5e-16) 20. GSL DIF: elapsed time t=1.2 s, 256 iters, t-(init.)=1.18 s t(norm)=0.204606, mflops=24.4372 (err=4.4e-16) 21. Krukar: elapsed time t=1.79 s, 512 iters, t-(init.)=1.75 s t(norm)=0.151721, mflops=32.9552 (err=5.0e-16) 22. Mayer (Buneman): elapsed time t=1.07 s, 256 iters, t-(init.)=1.04 s t(norm)=0.180331, mflops=27.7268 (err=4.5e-16) 23. Mayer (simple): elapsed time t=1.82 s, 512 iters, t-(init.)=1.77 s t(norm)=0.153455, mflops=32.5829 24. Mayer (lookup): elapsed time t=1.72 s, 512 iters, t-(init.)=1.68 s t(norm)=0.145652, mflops=34.3284 (err=4.5e-16) 25. Monro: elapsed time t=1.28 s, 256 iters, t-(init.)=1.26 s t(norm)=0.218478, mflops=22.8856 (err=6.5e-16) 26. NAPACK (f2c): elapsed time t=1.05 s, 128 iters, t-(init.)=1.04 s t(norm)=0.360662, mflops=13.8634 (err=1.5e-14) 27. Nielsen: elapsed time t=1.92 s, 256 iters, t-(init.)=1.9 s t(norm)=0.329451, mflops=15.1768 (err=1.1e-14) 28. NR (C): elapsed time t=1.96 s, 512 iters, t-(init.)=1.91 s t(norm)=0.165593, mflops=30.1946 (err=4.5e-16) 29. NR (F): elapsed time t=1.04 s, 256 iters, t-(init.)=1.02 s t(norm)=0.176863, mflops=28.2704 (err=4.5e-16) 30. Ooura (C): elapsed time t=1.51 s, 1024 iters, t-(init.)=1.42 s t(norm)=0.0615553, mflops=81.2277 (err=4.6e-16) 31. Ooura (F): elapsed time t=1.73 s, 1024 iters, t-(init.)=1.64 s t(norm)=0.0710921, mflops=70.3313 (err=4.6e-16) 32. QFT: elapsed time t=1.66 s, 512 iters, t-(init.)=1.62 s t(norm)=0.14045, mflops=35.5998 (err=1.2e-15) 33. Ransom: elapsed time t=1.03 s, 256 iters, t-(init.)=1.01 s t(norm)=0.175129, mflops=28.5503 (err=2.1e-15) 34. 
SCIPORT: elapsed time t=1.33 s, 128 iters, t-(init.)=1.32 s t(norm)=0.457764, mflops=10.9227 (err=1.6e-07) 35. Singleton: elapsed time t=1.58 s, 256 iters, t-(init.)=1.56 s t(norm)=0.270497, mflops=18.4845 (err=5.9e-16) 36. Singleton (f2c): elapsed time t=1.04 s, 256 iters, t-(init.)=1.02 s t(norm)=0.176863, mflops=28.2704 (err=5.9e-16) 37. Sorensen: elapsed time t=1.54 s, 512 iters, t-(init.)=1.5 s t(norm)=0.130046, mflops=38.4478 (err=4.5e-16) 38. Sorensen DIT: elapsed time t=1.95 s, 256 iters, t-(init.)=1.93 s t(norm)=0.334653, mflops=14.9408 (err=4.4e-16) 39. Temperton: elapsed time t=1.66 s, 256 iters, t-(init.)=1.64 s t(norm)=0.284368, mflops=17.5828 (err=4.5e-16) 40. Temperton (f2c): elapsed time t=1.22 s, 256 iters, t-(init.)=1.2 s t(norm)=0.208074, mflops=24.0299 (err=4.7e-16) 41. Valkenburg: elapsed time t=1.39 s, 64 iters, t-(init.)=1.38 s t(norm)=0.957142, mflops=5.22388 (err=7.4e-16) Top mflops for N=2048 = 81.2277 Normalized results and averages for N=2048: fft 0: mflops = 26.4549 (norm. = 0.325688), norm. avg. (of 11) = 0.321268 fft 1: mflops = 27.2036 (norm. = 0.334906), norm. avg. (of 11) = 0.327375 fft 2: mflops = 23.4438 (norm. = 0.288618), norm. avg. (of 11) = 0.254184 fft 3: mflops = 12.9891 (norm. = 0.15991), norm. avg. (of 11) = 0.0940854 fft 4: mflops = 28.8358 (norm. = 0.355), norm. avg. (of 11) = 0.246232 fft 5: mflops = 7.06761 (norm. = 0.0870098), norm. avg. (of 11) = 0.0629899 fft 6: mflops = 20.0249 (norm. = 0.246528), norm. avg. (of 11) = 0.156832 fft 7: mflops = 18.0224 (norm. = 0.221875), norm. avg. (of 11) = 0.138424 fft 8: mflops = 18.1357 (norm. = 0.22327), norm. avg. (of 11) = 0.175075 fft 9: mflops = 68.6568 (norm. = 0.845238), norm. avg. (of 11) = 0.52909 fft 10: mflops = 74.4151 (norm. = 0.916129), norm. avg. (of 11) = 0.511017 fft 11: mflops = 18.2505 (norm. = 0.224684), norm. avg. (of 10) = 0.164688 fft 12: mflops = 36.969 (norm. = 0.455128), norm. avg. (of 11) = 0.334249 fft 13: mflops = 42.4056 (norm. = 0.522059), norm. 
avg. (of 11) = 0.383079 fft 14: mflops = 80.6597 (norm. = 0.993007), norm. avg. (of 11) = 0.73135 fft 15: mflops = 49.717 (norm. = 0.612069), norm. avg. (of 11) = 0.579985 fft 16: mflops = 49.717 (norm. = 0.612069), norm. avg. (of 11) = 0.667385 fft 17: mflops = 42.0961 (norm. = 0.518248), norm. avg. (of 9) = 0.473784 fft 18: mflops = 20.7452 (norm. = 0.255396), norm. avg. (of 11) = 0.163563 fft 19: mflops = 25.2946 (norm. = 0.311404), norm. avg. (of 11) = 0.237864 fft 20: mflops = 24.4372 (norm. = 0.300847), norm. avg. (of 11) = 0.204121 fft 21: mflops = 32.9552 (norm. = 0.405714), norm. avg. (of 11) = 0.57331 fft 22: mflops = 27.7268 (norm. = 0.341346), norm. avg. (of 10) = 0.275917 fft 23: mflops = 32.5829 (norm. = 0.40113), norm. avg. (of 10) = 0.334176 fft 24: mflops = 34.3284 (norm. = 0.422619), norm. avg. (of 10) = 0.363231 fft 25: mflops = 22.8856 (norm. = 0.281746), norm. avg. (of 10) = 0.186028 fft 26: mflops = 13.8634 (norm. = 0.170673), norm. avg. (of 11) = 0.0935783 fft 27: mflops = 15.1768 (norm. = 0.186842), norm. avg. (of 11) = 0.11237 fft 28: mflops = 30.1946 (norm. = 0.371728), norm. avg. (of 11) = 0.290079 fft 29: mflops = 28.2704 (norm. = 0.348039), norm. avg. (of 11) = 0.232893 fft 30: mflops = 81.2277 (norm. = 1), norm. avg. (of 11) = 0.785392 fft 31: mflops = 70.3313 (norm. = 0.865854), norm. avg. (of 11) = 0.645629 fft 32: mflops = 35.5998 (norm. = 0.438272), norm. avg. (of 8) = 0.444227 fft 33: mflops = 28.5503 (norm. = 0.351485), norm. avg. (of 10) = 0.185567 fft 34: mflops = 10.9227 (norm. = 0.13447), norm. avg. (of 10) = 0.110558 fft 35: mflops = 18.4845 (norm. = 0.227564), norm. avg. (of 11) = 0.142607 fft 36: mflops = 28.2704 (norm. = 0.348039), norm. avg. (of 11) = 0.222887 fft 37: mflops = 38.4478 (norm. = 0.473333), norm. avg. (of 11) = 0.43767 fft 38: mflops = 14.9408 (norm. = 0.183938), norm. avg. (of 11) = 0.146067 fft 39: mflops = 17.5828 (norm. = 0.216463), norm. avg. (of 11) = 0.137217 fft 40: mflops = 24.0299 (norm. 
= 0.295833), norm. avg. (of 11) = 0.195005 fft 41: mflops = 5.22388 (norm. = 0.0643116), norm. avg. (of 11) = 0.0500037 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.12 s, 128 iters, t-(init.)=1.1 s t(norm)=0.17484, mflops=28.5975 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.08 s, 128 iters, t-(init.)=1.06 s t(norm)=0.168482, mflops=29.6767 (err=1.1e-15) 2. Arndt Split-Radix: elapsed time t=1.35 s, 128 iters, t-(init.)=1.33 s t(norm)=0.211398, mflops=23.6521 (err=1.0e-15) 3. Arndt 4-step: elapsed time t=1.07 s, 64 iters, t-(init.)=1.06 s t(norm)=0.336965, mflops=14.8383 (err=1.0e-15) 4. Bailey: elapsed time t=1.25 s, 128 iters, t-(init.)=1.23 s t(norm)=0.195503, mflops=25.575 (err=1.0e-15) 5. Beauregard: elapsed time t=1.12 s, 32 iters, t-(init.)=1.12 s t(norm)=0.712077, mflops=7.02171 (err=1.0e-15) 6. Bergland: elapsed time t=1.69 s, 128 iters, t-(init.)=1.67 s t(norm)=0.265439, mflops=18.8367 (err=1.1e-15) 7. Brenner: elapsed time t=1.75 s, 128 iters, t-(init.)=1.73 s t(norm)=0.274976, mflops=18.1834 (err=1.1e-15) 8. Burrus: elapsed time t=1.71 s, 128 iters, t-(init.)=1.69 s t(norm)=0.268618, mflops=18.6138 (err=1.0e-15) 9. CWP (min N) (N=4290): elapsed time t=2 s, 512 iters, t-(init.)=1.91 s t(norm)=0.0758966, mflops=65.8791 10. CWP (best N) (N=4368): elapsed time t=1.71 s, 512 iters, t-(init.)=1.62 s t(norm)=0.064373, mflops=77.6723 11. Edelblute: elapsed time t=1.7 s, 128 iters, t-(init.)=1.68 s t(norm)=0.267029, mflops=18.7246 (err=1.1e-15) 12. FFTPACK: elapsed time t=1.85 s, 256 iters, t-(init.)=1.81 s t(norm)=0.143846, mflops=34.7594 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.58 s, 256 iters, t-(init.)=1.53 s t(norm)=0.121593, mflops=41.1206 (err=1.0e-15) FFTW_MEASURE plan: (cost = 4.062500e-03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 16 14. 
FFTW: elapsed time t=1.92 s, 512 iters, t-(init.)=1.83 s t(norm)=0.0727177, mflops=68.7591 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.34 s, 256 iters, t-(init.)=1.3 s t(norm)=0.103315, mflops=48.3958 (err=1.1e-15) 16. Frigo-old: elapsed time t=1.01 s, 128 iters, t-(init.)=0.99 s t(norm)=0.157356, mflops=31.775 (err=1.1e-15) 17. Green: elapsed time t=1.47 s, 256 iters, t-(init.)=1.43 s t(norm)=0.113646, mflops=43.9962 (err=1.1e-15) 18. GSL: elapsed time t=1.63 s, 128 iters, t-(init.)=1.61 s t(norm)=0.255903, mflops=19.5387 (err=1.0e-15) 19. GSL DIT: elapsed time t=1.28 s, 128 iters, t-(init.)=1.25 s t(norm)=0.198682, mflops=25.1658 (err=1.0e-15) 20. GSL DIF: elapsed time t=1.31 s, 128 iters, t-(init.)=1.29 s t(norm)=0.20504, mflops=24.3855 (err=1.0e-15) 21. Krukar: elapsed time t=1.01 s, 128 iters, t-(init.)=0.99 s t(norm)=0.157356, mflops=31.775 (err=1.1e-15) 22. Mayer (Buneman): elapsed time t=1.18 s, 128 iters, t-(init.)=1.15 s t(norm)=0.182788, mflops=27.3542 (err=1.1e-15) 23. Mayer (simple): elapsed time t=1.01 s, 128 iters, t-(init.)=0.99 s t(norm)=0.157356, mflops=31.775 24. Mayer (lookup): elapsed time t=1.94 s, 256 iters, t-(init.)=1.9 s t(norm)=0.150998, mflops=33.1129 (err=1.1e-15) 25. Monro: elapsed time t=1.37 s, 128 iters, t-(init.)=1.35 s t(norm)=0.214577, mflops=23.3017 (err=1.3e-15) 26. NAPACK (f2c): elapsed time t=1.16 s, 64 iters, t-(init.)=1.15 s t(norm)=0.365575, mflops=13.6771 (err=4.5e-14) 27. Nielsen: elapsed time t=1.07 s, 64 iters, t-(init.)=1.06 s t(norm)=0.336965, mflops=14.8383 (err=2.2e-14) 28. NR (C): elapsed time t=1.09 s, 128 iters, t-(init.)=1.07 s t(norm)=0.170072, mflops=29.3993 (err=1.0e-15) 29. NR (F): elapsed time t=1.15 s, 128 iters, t-(init.)=1.13 s t(norm)=0.179609, mflops=27.8383 (err=1.0e-15) 30. Ooura (C): elapsed time t=1.68 s, 512 iters, t-(init.)=1.6 s t(norm)=0.0635783, mflops=78.6432 (err=1.1e-15) 31. 
Ooura (F): elapsed time t=1.89 s, 512 iters, t-(init.)=1.81 s t(norm)=0.0719229, mflops=69.5189 (err=1.1e-15) 32. QFT: elapsed time t=1.9 s, 256 iters, t-(init.)=1.86 s t(norm)=0.14782, mflops=33.825 (err=1.9e-15) 33. Ransom: elapsed time t=1.78 s, 256 iters, t-(init.)=1.74 s t(norm)=0.138283, mflops=36.1578 (err=2.6e-15) 34. SCIPORT: elapsed time t=1.53 s, 64 iters, t-(init.)=1.52 s t(norm)=0.483195, mflops=10.3478 (err=1.7e-07) 35. Singleton: elapsed time t=1.9 s, 128 iters, t-(init.)=1.88 s t(norm)=0.298818, mflops=16.7326 (err=1.6e-15) 36. Singleton (f2c): elapsed time t=1.11 s, 128 iters, t-(init.)=1.09 s t(norm)=0.173251, mflops=28.8599 (err=1.6e-15) 37. Sorensen: elapsed time t=1.76 s, 256 iters, t-(init.)=1.72 s t(norm)=0.136693, mflops=36.5782 (err=1.1e-15) 38. Sorensen DIT: elapsed time t=1.05 s, 64 iters, t-(init.)=1.04 s t(norm)=0.330607, mflops=15.1237 (err=1.0e-15) 39. Temperton: elapsed time t=1.79 s, 128 iters, t-(init.)=1.77 s t(norm)=0.281334, mflops=17.7725 (err=1.1e-15) 40. Temperton (f2c): elapsed time t=1.27 s, 128 iters, t-(init.)=1.25 s t(norm)=0.198682, mflops=25.1658 (err=1.0e-15) 41. Valkenburg: elapsed time t=1.53 s, 32 iters, t-(init.)=1.52 s t(norm)=0.96639, mflops=5.17389 (err=1.1e-15) Top mflops for N=4096 = 78.6432 Normalized results and averages for N=4096: fft 0: mflops = 28.5975 (norm. = 0.363636), norm. avg. (of 12) = 0.324799 fft 1: mflops = 29.6767 (norm. = 0.377358), norm. avg. (of 12) = 0.33154 fft 2: mflops = 23.6521 (norm. = 0.300752), norm. avg. (of 12) = 0.258065 fft 3: mflops = 14.8383 (norm. = 0.188679), norm. avg. (of 12) = 0.101968 fft 4: mflops = 25.575 (norm. = 0.325203), norm. avg. (of 12) = 0.252813 fft 5: mflops = 7.02171 (norm. = 0.0892857), norm. avg. (of 12) = 0.0651813 fft 6: mflops = 18.8367 (norm. = 0.239521), norm. avg. (of 12) = 0.163723 fft 7: mflops = 18.1834 (norm. = 0.231214), norm. avg. (of 12) = 0.146156 fft 8: mflops = 18.6138 (norm. = 0.236686), norm. avg. 
(of 12) = 0.180209
fft 9: mflops = 65.8791 (norm. = 0.837696), norm. avg. (of 12) = 0.554807
fft 10: mflops = 77.6723 (norm. = 0.987654), norm. avg. (of 12) = 0.550737
fft 11: mflops = 18.7246 (norm. = 0.238095), norm. avg. (of 11) = 0.171362
fft 12: mflops = 34.7594 (norm. = 0.441989), norm. avg. (of 12) = 0.343228
fft 13: mflops = 41.1206 (norm. = 0.522876), norm. avg. (of 12) = 0.394729
fft 14: mflops = 68.7591 (norm. = 0.874317), norm. avg. (of 12) = 0.743264
fft 15: mflops = 48.3958 (norm. = 0.615385), norm. avg. (of 12) = 0.582935
fft 16: mflops = 31.775 (norm. = 0.40404), norm. avg. (of 12) = 0.64544
fft 17: mflops = 43.9962 (norm. = 0.559441), norm. avg. (of 10) = 0.48235
fft 18: mflops = 19.5387 (norm. = 0.248447), norm. avg. (of 12) = 0.170636
fft 19: mflops = 25.1658 (norm. = 0.32), norm. avg. (of 12) = 0.244709
fft 20: mflops = 24.3855 (norm. = 0.310078), norm. avg. (of 12) = 0.21295
fft 21: mflops = 31.775 (norm. = 0.40404), norm. avg. (of 12) = 0.559204
fft 22: mflops = 27.3542 (norm. = 0.347826), norm. avg. (of 11) = 0.282454
fft 23: mflops = 31.775 (norm. = 0.40404), norm. avg. (of 11) = 0.340527
fft 24: mflops = 33.1129 (norm. = 0.421053), norm. avg. (of 11) = 0.368487
fft 25: mflops = 23.3017 (norm. = 0.296296), norm. avg. (of 11) = 0.196052
fft 26: mflops = 13.6771 (norm. = 0.173913), norm. avg. (of 12) = 0.100273
fft 27: mflops = 14.8383 (norm. = 0.188679), norm. avg. (of 12) = 0.118729
fft 28: mflops = 29.3993 (norm. = 0.373832), norm. avg. (of 12) = 0.297058
fft 29: mflops = 27.8383 (norm. = 0.353982), norm. avg. (of 12) = 0.242983
fft 30: mflops = 78.6432 (norm. = 1), norm. avg. (of 12) = 0.803276
fft 31: mflops = 69.5189 (norm. = 0.883978), norm. avg. (of 12) = 0.665492
fft 32: mflops = 33.825 (norm. = 0.430108), norm. avg. (of 9) = 0.442658
fft 33: mflops = 36.1578 (norm. = 0.45977), norm. avg. (of 11) = 0.210495
fft 34: mflops = 10.3478 (norm. = 0.131579), norm. avg. (of 11) = 0.112469
fft 35: mflops = 16.7326 (norm. = 0.212766), norm. avg. (of 12) = 0.148453
fft 36: mflops = 28.8599 (norm. = 0.366972), norm. avg. (of 12) = 0.234894
fft 37: mflops = 36.5782 (norm. = 0.465116), norm. avg. (of 12) = 0.439957
fft 38: mflops = 15.1237 (norm. = 0.192308), norm. avg. (of 12) = 0.14992
fft 39: mflops = 17.7725 (norm. = 0.225989), norm. avg. (of 12) = 0.144615
fft 40: mflops = 25.1658 (norm. = 0.32), norm. avg. (of 12) = 0.205421
fft 41: mflops = 5.17389 (norm. = 0.0657895), norm. avg. (of 12) = 0.0513192

Benchmarking for array size = 8192 (power of 2):
0. Arndt DIF: elapsed time t=1.25 s, 64 iters, t-(init.)=1.23 s t(norm)=0.180465, mflops=27.7063 (err=1.3e-15)
1. Arndt DIT: elapsed time t=1.22 s, 64 iters, t-(init.)=1.19 s t(norm)=0.174596, mflops=28.6376 (err=1.3e-15)
2. Arndt Split-Radix: elapsed time t=1.44 s, 64 iters, t-(init.)=1.42 s t(norm)=0.208341, mflops=23.9991 (err=1.3e-15)
3. Arndt 4-step: elapsed time t=1.2 s, 32 iters, t-(init.)=1.19 s t(norm)=0.349192, mflops=14.3188 (err=1.4e-15)
4. Bailey: elapsed time t=1.47 s, 64 iters, t-(init.)=1.45 s t(norm)=0.212743, mflops=23.5026 (err=1.3e-15)
5. Beauregard: elapsed time t=1.22 s, 16 iters, t-(init.)=1.21 s t(norm)=0.710121, mflops=7.04106 (err=1.3e-15)
6. Bergland: elapsed time t=1.78 s, 64 iters, t-(init.)=1.76 s t(norm)=0.258226, mflops=19.3629 (err=1.4e-15)
7. Brenner: elapsed time t=1.86 s, 64 iters, t-(init.)=1.84 s t(norm)=0.269963, mflops=18.521 (err=1.4e-15)
8. Burrus: elapsed time t=1.79 s, 64 iters, t-(init.)=1.77 s t(norm)=0.259693, mflops=19.2535 (err=1.3e-15)
9. CWP (min N) (N=8580): elapsed time t=1.01 s, 128 iters, t-(init.)=0.97 s t(norm)=0.0711588, mflops=70.2654
10. CWP (best N) (N=9240): elapsed time t=1.01 s, 128 iters, t-(init.)=0.97 s t(norm)=0.0711588, mflops=70.2654
11. Edelblute: elapsed time t=1.78 s, 64 iters, t-(init.)=1.76 s t(norm)=0.258226, mflops=19.3629 (err=1.3e-15)
12. FFTPACK: elapsed time t=1.14 s, 64 iters, t-(init.)=1.12 s t(norm)=0.164325, mflops=30.4274 (err=1.3e-15)
13. FFTPACK (f2c): elapsed time t=1.06 s, 64 iters, t-(init.)=1.04 s t(norm)=0.152588, mflops=32.768 (err=1.3e-15)
FFTW_MEASURE plan: (cost = 8.125000e-03) FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 8
14. FFTW: elapsed time t=1.09 s, 128 iters, t-(init.)=1.04 s t(norm)=0.0762939, mflops=65.536 (err=1.3e-15)
FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=1.7 s, 128 iters, t-(init.)=1.65 s t(norm)=0.121043, mflops=41.3075 (err=1.4e-15)
16. Frigo-old: elapsed time t=1.87 s, 128 iters, t-(init.)=1.82 s t(norm)=0.133514, mflops=37.4491 (err=1.4e-15)
17. Green: elapsed time t=1.64 s, 128 iters, t-(init.)=1.6 s t(norm)=0.117375, mflops=42.5984 (err=1.4e-15)
18. GSL: elapsed time t=1.81 s, 64 iters, t-(init.)=1.79 s t(norm)=0.262627, mflops=19.0384 (err=1.3e-15)
19. GSL DIT: elapsed time t=1.42 s, 64 iters, t-(init.)=1.4 s t(norm)=0.205407, mflops=24.3419 (err=1.3e-15)
20. GSL DIF: elapsed time t=1.4 s, 64 iters, t-(init.)=1.38 s t(norm)=0.202472, mflops=24.6947 (err=1.3e-15)
21. Skipping fft (Krukar can't handle N > 4096).
22. Mayer (Buneman): elapsed time t=1.25 s, 64 iters, t-(init.)=1.23 s t(norm)=0.180465, mflops=27.7063 (err=1.3e-15)
23. Mayer (simple): elapsed time t=1.09 s, 64 iters, t-(init.)=1.07 s t(norm)=0.156989, mflops=31.8493
24. Mayer (lookup): elapsed time t=1.05 s, 64 iters, t-(init.)=1.03 s t(norm)=0.151121, mflops=33.0861 (err=1.4e-15)
25. Monro: elapsed time t=1.45 s, 64 iters, t-(init.)=1.43 s t(norm)=0.209808, mflops=23.8313 (err=1.5e-15)
26. NAPACK (f2c): elapsed time t=1.28 s, 32 iters, t-(init.)=1.27 s t(norm)=0.372667, mflops=13.4168 (err=4.1e-14)
27. Nielsen: elapsed time t=1.14 s, 32 iters, t-(init.)=1.13 s t(norm)=0.331585, mflops=15.0791 (err=1.1e-14)
28. NR (C): elapsed time t=1.2 s, 64 iters, t-(init.)=1.18 s t(norm)=0.173129, mflops=28.8803 (err=1.3e-15)
29. NR (F): elapsed time t=1.25 s, 64 iters, t-(init.)=1.23 s t(norm)=0.180465, mflops=27.7063 (err=1.3e-15)
30. Ooura (C): elapsed time t=1.83 s, 256 iters, t-(init.)=1.74 s t(norm)=0.0638228, mflops=78.3419 (err=1.4e-15)
31. Ooura (F): elapsed time t=1.04 s, 128 iters, t-(init.)=1 s t(norm)=0.0733596, mflops=68.1574 (err=1.4e-15)
32. QFT: elapsed time t=1.1 s, 64 iters, t-(init.)=1.08 s t(norm)=0.158457, mflops=31.5544 (err=2.8e-15)
33. Ransom: elapsed time t=1.07 s, 64 iters, t-(init.)=1.05 s t(norm)=0.154055, mflops=32.4559 (err=3.2e-15)
34. SCIPORT: elapsed time t=1.71 s, 32 iters, t-(init.)=1.7 s t(norm)=0.498845, mflops=10.0232 (err=1.9e-07)
35. Singleton: elapsed time t=1 s, 32 iters, t-(init.)=0.99 s t(norm)=0.290504, mflops=17.2115 (err=2.0e-15)
36. Singleton (f2c): elapsed time t=1.2 s, 64 iters, t-(init.)=1.18 s t(norm)=0.173129, mflops=28.8803 (err=2.0e-15)
37. Sorensen: elapsed time t=1.96 s, 128 iters, t-(init.)=1.92 s t(norm)=0.14085, mflops=35.4987 (err=1.4e-15)
38. Sorensen DIT: elapsed time t=1.11 s, 32 iters, t-(init.)=1.1 s t(norm)=0.322782, mflops=15.4903 (err=1.3e-15)
39. Temperton: elapsed time t=1.01 s, 32 iters, t-(init.)=1 s t(norm)=0.293438, mflops=17.0394 (err=1.3e-15)
40. Temperton (f2c): elapsed time t=1.44 s, 64 iters, t-(init.)=1.41 s t(norm)=0.206874, mflops=24.1693 (err=1.3e-15)
41. Valkenburg: elapsed time t=1.71 s, 16 iters, t-(init.)=1.7 s t(norm)=0.99769, mflops=5.01158 (err=1.4e-15)

Top mflops for N=8192 = 78.3419
Normalized results and averages for N=8192:
fft 0: mflops = 27.7063 (norm. = 0.353659), norm. avg. (of 13) = 0.327019
fft 1: mflops = 28.6376 (norm. = 0.365546), norm. avg. (of 13) = 0.334156
fft 2: mflops = 23.9991 (norm. = 0.306338), norm. avg. (of 13) = 0.261778
fft 3: mflops = 14.3188 (norm. = 0.182773), norm. avg. (of 13) = 0.108184
fft 4: mflops = 23.5026 (norm. = 0.3), norm. avg. (of 13) = 0.256443
fft 5: mflops = 7.04106 (norm. = 0.089876), norm. avg. (of 13) = 0.0670809
fft 6: mflops = 19.3629 (norm. = 0.247159), norm. avg. (of 13) = 0.170141
fft 7: mflops = 18.521 (norm. = 0.236413), norm. avg. (of 13) = 0.153099
fft 8: mflops = 19.2535 (norm. = 0.245763), norm. avg. (of 13) = 0.185252
fft 9: mflops = 70.2654 (norm. = 0.896907), norm. avg. (of 13) = 0.581123
fft 10: mflops = 70.2654 (norm. = 0.896907), norm. avg. (of 13) = 0.577365
fft 11: mflops = 19.3629 (norm. = 0.247159), norm. avg. (of 12) = 0.177678
fft 12: mflops = 30.4274 (norm. = 0.388393), norm. avg. (of 13) = 0.346702
fft 13: mflops = 32.768 (norm. = 0.418269), norm. avg. (of 13) = 0.39654
fft 14: mflops = 65.536 (norm. = 0.836538), norm. avg. (of 13) = 0.750439
fft 15: mflops = 41.3075 (norm. = 0.527273), norm. avg. (of 13) = 0.578653
fft 16: mflops = 37.4491 (norm. = 0.478022), norm. avg. (of 13) = 0.632562
fft 17: mflops = 42.5984 (norm. = 0.54375), norm. avg. (of 11) = 0.487932
fft 18: mflops = 19.0384 (norm. = 0.243017), norm. avg. (of 13) = 0.176204
fft 19: mflops = 24.3419 (norm. = 0.310714), norm. avg. (of 13) = 0.249786
fft 20: mflops = 24.6947 (norm. = 0.315217), norm. avg. (of 13) = 0.220817
fft 21: mflops = -1 (norm. = -0.0127646), norm. avg. (of 12) = 0.559204
fft 22: mflops = 27.7063 (norm. = 0.353659), norm. avg. (of 12) = 0.288388
fft 23: mflops = 31.8493 (norm. = 0.406542), norm. avg. (of 12) = 0.346028
fft 24: mflops = 33.0861 (norm. = 0.42233), norm. avg. (of 12) = 0.372974
fft 25: mflops = 23.8313 (norm. = 0.304196), norm. avg. (of 12) = 0.205064
fft 26: mflops = 13.4168 (norm. = 0.17126), norm. avg. (of 13) = 0.105733
fft 27: mflops = 15.0791 (norm. = 0.192478), norm. avg. (of 13) = 0.124402
fft 28: mflops = 28.8803 (norm. = 0.368644), norm. avg. (of 13) = 0.302565
fft 29: mflops = 27.7063 (norm. = 0.353659), norm. avg. (of 13) = 0.251497
fft 30: mflops = 78.3419 (norm. = 1), norm. avg. (of 13) = 0.818409
fft 31: mflops = 68.1574 (norm. = 0.87), norm. avg. (of 13) = 0.681223
fft 32: mflops = 31.5544 (norm. = 0.402778), norm. avg. (of 10) = 0.43867
fft 33: mflops = 32.4559 (norm. = 0.414286), norm. avg. (of 12) = 0.227477
fft 34: mflops = 10.0232 (norm. = 0.127941), norm. avg. (of 12) = 0.113758
fft 35: mflops = 17.2115 (norm. = 0.219697), norm. avg. (of 13) = 0.153934
fft 36: mflops = 28.8803 (norm. = 0.368644), norm. avg. (of 13) = 0.245183
fft 37: mflops = 35.4987 (norm. = 0.453125), norm. avg. (of 13) = 0.44097
fft 38: mflops = 15.4903 (norm. = 0.197727), norm. avg. (of 13) = 0.153598
fft 39: mflops = 17.0394 (norm. = 0.2175), norm. avg. (of 13) = 0.150222
fft 40: mflops = 24.1693 (norm. = 0.308511), norm. avg. (of 13) = 0.213351
fft 41: mflops = 5.01158 (norm. = 0.0639706), norm. avg. (of 13) = 0.0522924

Benchmarking for array size = 16384 (power of 2):
0. Arndt DIF: elapsed time t=1.49 s, 32 iters, t-(init.)=1.45 s t(norm)=0.197547, mflops=25.3105 (err=1.7e-15)
1. Arndt DIT: elapsed time t=1.45 s, 32 iters, t-(init.)=1.41 s t(norm)=0.192097, mflops=26.0285 (err=1.8e-15)
2. Arndt Split-Radix: elapsed time t=1.83 s, 32 iters, t-(init.)=1.79 s t(norm)=0.243868, mflops=20.5029 (err=1.8e-15)
3. Arndt 4-step: elapsed time t=1.24 s, 16 iters, t-(init.)=1.23 s t(norm)=0.335148, mflops=14.9188 (err=1.8e-15)
4. Bailey: elapsed time t=1.89 s, 32 iters, t-(init.)=1.85 s t(norm)=0.252042, mflops=19.8379 (err=1.7e-15)
5. Beauregard: elapsed time t=1.37 s, 8 iters, t-(init.)=1.36 s t(norm)=0.741141, mflops=6.74635 (err=1.8e-15)
6. Bergland: elapsed time t=1.01 s, 16 iters, t-(init.)=0.99 s t(norm)=0.269754, mflops=18.5354 (err=1.8e-15)
7. Brenner: elapsed time t=1.07 s, 16 iters, t-(init.)=1.05 s t(norm)=0.286102, mflops=17.4763 (err=1.8e-15)
8. Burrus: elapsed time t=1.09 s, 16 iters, t-(init.)=1.07 s t(norm)=0.291552, mflops=17.1496 (err=1.8e-15)
9. CWP (min N) (N=17160): elapsed time t=1.11 s, 64 iters, t-(init.)=1.03 s t(norm)=0.0701632, mflops=71.2624
10. CWP (best N) (N=17160): elapsed time t=1.11 s, 64 iters, t-(init.)=1.03 s t(norm)=0.0701632, mflops=71.2624
11. Edelblute: elapsed time t=1.12 s, 16 iters, t-(init.)=1.1 s t(norm)=0.299726, mflops=16.6819 (err=1.8e-15)
12. FFTPACK: elapsed time t=1.35 s, 32 iters, t-(init.)=1.31 s t(norm)=0.178473, mflops=28.0154 (err=1.8e-15)
13. FFTPACK (f2c): elapsed time t=1.2 s, 32 iters, t-(init.)=1.16 s t(norm)=0.158037, mflops=31.6381 (err=1.8e-15)
FFTW_MEASURE plan: (cost = 2.375000e-02) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 16 FFTW_NOTW 4
14. FFTW: elapsed time t=1.47 s, 64 iters, t-(init.)=1.39 s t(norm)=0.0946862, mflops=52.806 (err=1.8e-15)
FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=1.74 s, 64 iters, t-(init.)=1.67 s t(norm)=0.11376, mflops=43.9523 (err=1.8e-15)
16. Frigo-old: elapsed time t=1.31 s, 32 iters, t-(init.)=1.28 s t(norm)=0.174386, mflops=28.672 (err=1.9e-15)
17. Green: elapsed time t=1.04 s, 32 iters, t-(init.)=1 s t(norm)=0.136239, mflops=36.7002 (err=1.8e-15)
18. GSL: elapsed time t=1.08 s, 16 iters, t-(init.)=1.06 s t(norm)=0.288827, mflops=17.3114 (err=1.8e-15)
19. GSL DIT: elapsed time t=1.96 s, 32 iters, t-(init.)=1.92 s t(norm)=0.261579, mflops=19.1147 (err=1.8e-15)
20. GSL DIF: elapsed time t=1.93 s, 32 iters, t-(init.)=1.89 s t(norm)=0.257492, mflops=19.4181 (err=1.8e-15)
21. Skipping fft (Krukar can't handle N > 4096).
22. Mayer (Buneman): elapsed time t=1.38 s, 32 iters, t-(init.)=1.35 s t(norm)=0.183923, mflops=27.1853 (err=1.8e-15)
23. Mayer (simple): elapsed time t=1.22 s, 32 iters, t-(init.)=1.18 s t(norm)=0.160762, mflops=31.1018
24. Mayer (lookup): elapsed time t=1.19 s, 32 iters, t-(init.)=1.16 s t(norm)=0.158037, mflops=31.6381 (err=1.9e-15)
25. Monro: elapsed time t=1.77 s, 32 iters, t-(init.)=1.73 s t(norm)=0.235694, mflops=21.214 (err=2.2e-15)
26. NAPACK (f2c): elapsed time t=1.57 s, 16 iters, t-(init.)=1.55 s t(norm)=0.422341, mflops=11.8388 (err=2.3e-13)
27. Nielsen: elapsed time t=1.3 s, 16 iters, t-(init.)=1.28 s t(norm)=0.348772, mflops=14.336 (err=1.3e-13)
28. NR (C): elapsed time t=1.74 s, 32 iters, t-(init.)=1.7 s t(norm)=0.231607, mflops=21.5883 (err=1.8e-15)
29. NR (F): elapsed time t=1.75 s, 32 iters, t-(init.)=1.71 s t(norm)=0.232969, mflops=21.4621 (err=1.8e-15)
30. Ooura (C): elapsed time t=1.2 s, 64 iters, t-(init.)=1.13 s t(norm)=0.0769751, mflops=64.956 (err=1.9e-15)
31. Ooura (F): elapsed time t=1.33 s, 64 iters, t-(init.)=1.26 s t(norm)=0.0858307, mflops=58.2542 (err=1.9e-15)
32. QFT: elapsed time t=1.4 s, 32 iters, t-(init.)=1.36 s t(norm)=0.185285, mflops=26.9854 (err=3.8e-15)
33. Ransom: elapsed time t=1.07 s, 32 iters, t-(init.)=1.03 s t(norm)=0.140326, mflops=35.6312 (err=4.0e-15)
34. SCIPORT: elapsed time t=1.1 s, 8 iters, t-(init.)=1.09 s t(norm)=0.594003, mflops=8.41747 (err=2.1e-07)
35. Singleton: elapsed time t=1.22 s, 16 iters, t-(init.)=1.2 s t(norm)=0.326974, mflops=15.2917 (err=2.5e-15)
36. Singleton (f2c): elapsed time t=1.51 s, 32 iters, t-(init.)=1.48 s t(norm)=0.201634, mflops=24.7974 (err=2.5e-15)
37. Sorensen: elapsed time t=1.25 s, 32 iters, t-(init.)=1.21 s t(norm)=0.164849, mflops=30.3307 (err=1.9e-15)
38. Sorensen DIT: elapsed time t=1.29 s, 16 iters, t-(init.)=1.27 s t(norm)=0.346048, mflops=14.4489 (err=1.8e-15)
39. Temperton: elapsed time t=1.13 s, 16 iters, t-(init.)=1.11 s t(norm)=0.302451, mflops=16.5316 (err=1.8e-15)
40. Temperton (f2c): elapsed time t=1.65 s, 32 iters, t-(init.)=1.61 s t(norm)=0.219345, mflops=22.7951 (err=1.8e-15)
41. Valkenburg: elapsed time t=1.93 s, 8 iters, t-(init.)=1.92 s t(norm)=1.04632, mflops=4.77867 (err=1.7e-15)

Top mflops for N=16384 = 71.2624
Normalized results and averages for N=16384:
fft 0: mflops = 25.3105 (norm. = 0.355172), norm. avg. (of 14) = 0.32903
fft 1: mflops = 26.0285 (norm. = 0.365248), norm. avg. (of 14) = 0.336377
fft 2: mflops = 20.5029 (norm. = 0.287709), norm. avg. (of 14) = 0.263631
fft 3: mflops = 14.9188 (norm. = 0.20935), norm. avg. (of 14) = 0.11541
fft 4: mflops = 19.8379 (norm. = 0.278378), norm. avg. (of 14) = 0.258009
fft 5: mflops = 6.74635 (norm. = 0.0946691), norm. avg. (of 14) = 0.0690514
fft 6: mflops = 18.5354 (norm. = 0.260101), norm. avg. (of 14) = 0.176566
fft 7: mflops = 17.4763 (norm. = 0.245238), norm. avg. (of 14) = 0.15968
fft 8: mflops = 17.1496 (norm. = 0.240654), norm. avg. (of 14) = 0.189209
fft 9: mflops = 71.2624 (norm. = 1), norm. avg. (of 14) = 0.611042
fft 10: mflops = 71.2624 (norm. = 1), norm. avg. (of 14) = 0.607553
fft 11: mflops = 16.6819 (norm. = 0.234091), norm. avg. (of 13) = 0.182018
fft 12: mflops = 28.0154 (norm. = 0.39313), norm. avg. (of 14) = 0.350018
fft 13: mflops = 31.6381 (norm. = 0.443966), norm. avg. (of 14) = 0.399927
fft 14: mflops = 52.806 (norm. = 0.741007), norm. avg. (of 14) = 0.749765
fft 15: mflops = 43.9523 (norm. = 0.616766), norm. avg. (of 14) = 0.581376
fft 16: mflops = 28.672 (norm. = 0.402344), norm. avg. (of 14) = 0.616117
fft 17: mflops = 36.7002 (norm. = 0.515), norm. avg. (of 12) = 0.490187
fft 18: mflops = 17.3114 (norm. = 0.242925), norm. avg. (of 14) = 0.18097
fft 19: mflops = 19.1147 (norm. = 0.268229), norm. avg. (of 14) = 0.251104
fft 20: mflops = 19.4181 (norm. = 0.272487), norm. avg. (of 14) = 0.224508
fft 21: mflops = -1 (norm. = -0.0140326), norm. avg. (of 12) = 0.559204
fft 22: mflops = 27.1853 (norm. = 0.381481), norm. avg. (of 13) = 0.295549
fft 23: mflops = 31.1018 (norm. = 0.436441), norm. avg. (of 13) = 0.352983
fft 24: mflops = 31.6381 (norm. = 0.443966), norm. avg. (of 13) = 0.378435
fft 25: mflops = 21.214 (norm. = 0.297688), norm. avg. (of 13) = 0.212189
fft 26: mflops = 11.8388 (norm. = 0.166129), norm. avg. (of 14) = 0.110047
fft 27: mflops = 14.336 (norm. = 0.201172), norm. avg. (of 14) = 0.129886
fft 28: mflops = 21.5883 (norm. = 0.302941), norm. avg. (of 14) = 0.302592
fft 29: mflops = 21.4621 (norm. = 0.30117), norm. avg. (of 14) = 0.255045
fft 30: mflops = 64.956 (norm. = 0.911504), norm. avg. (of 14) = 0.825059
fft 31: mflops = 58.2542 (norm. = 0.81746), norm. avg. (of 14) = 0.690954
fft 32: mflops = 26.9854 (norm. = 0.378676), norm. avg. (of 11) = 0.433216
fft 33: mflops = 35.6312 (norm. = 0.5), norm. avg. (of 13) = 0.24844
fft 34: mflops = 8.41747 (norm. = 0.118119), norm. avg. (of 13) = 0.114094
fft 35: mflops = 15.2917 (norm. = 0.214583), norm. avg. (of 14) = 0.158266
fft 36: mflops = 24.7974 (norm. = 0.347973), norm. avg. (of 14) = 0.252525
fft 37: mflops = 30.3307 (norm. = 0.42562), norm. avg. (of 14) = 0.439874
fft 38: mflops = 14.4489 (norm. = 0.202756), norm. avg. (of 14) = 0.157109
fft 39: mflops = 16.5316 (norm. = 0.231982), norm. avg. (of 14) = 0.156062
fft 40: mflops = 22.7951 (norm. = 0.319876), norm. avg. (of 14) = 0.22096
fft 41: mflops = 4.77867 (norm. = 0.0670573), norm. avg. (of 14) = 0.053347

Benchmarking for array size = 32768 (power of 2):
0. Arndt DIF: elapsed time t=1.92 s, 16 iters, t-(init.)=1.86 s t(norm)=0.236511, mflops=21.1406 (err=2.1e-15)
1. Arndt DIT: elapsed time t=1.88 s, 16 iters, t-(init.)=1.83 s t(norm)=0.232697, mflops=21.4872 (err=2.1e-15)
2. Arndt Split-Radix: elapsed time t=1.15 s, 8 iters, t-(init.)=1.12 s t(norm)=0.284831, mflops=17.5543 (err=2.1e-15)
3. Arndt 4-step: elapsed time t=1.49 s, 8 iters, t-(init.)=1.47 s t(norm)=0.37384, mflops=13.3747 (err=2.1e-15)
4. Bailey: elapsed time t=1.29 s, 8 iters, t-(init.)=1.26 s t(norm)=0.320435, mflops=15.6038 (err=2.1e-15)
5. Beauregard: elapsed time t=1.5 s, 4 iters, t-(init.)=1.48 s t(norm)=0.752767, mflops=6.64216 (err=2.2e-15)
6. Bergland: elapsed time t=1.2 s, 8 iters, t-(init.)=1.18 s t(norm)=0.30009, mflops=16.6617 (err=2.2e-15)
7. Brenner: elapsed time t=1.23 s, 8 iters, t-(init.)=1.21 s t(norm)=0.307719, mflops=16.2486 (err=2.2e-15)
8. Burrus: elapsed time t=1.32 s, 8 iters, t-(init.)=1.29 s t(norm)=0.328064, mflops=15.2409 (err=2.1e-15)
9. CWP (min N) (N=34320): elapsed time t=1.3 s, 32 iters, t-(init.)=1.19 s t(norm)=0.0756582, mflops=66.0867
10. CWP (best N) (N=34320): elapsed time t=1.29 s, 32 iters, t-(init.)=1.18 s t(norm)=0.0750224, mflops=66.6468
11. Edelblute: elapsed time t=1.33 s, 8 iters, t-(init.)=1.31 s t(norm)=0.33315, mflops=15.0082 (err=2.1e-15)
12. FFTPACK: elapsed time t=1.63 s, 16 iters, t-(init.)=1.58 s t(norm)=0.200907, mflops=24.8871 (err=2.1e-15)
13. FFTPACK (f2c): elapsed time t=1.53 s, 16 iters, t-(init.)=1.48 s t(norm)=0.188192, mflops=26.5686 (err=2.1e-15)
FFTW_MEASURE plan: (cost = 5.250000e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 8
14. FFTW: elapsed time t=1.66 s, 32 iters, t-(init.)=1.55 s t(norm)=0.0985463, mflops=50.7375 (err=2.1e-15)
FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=1 s, 16 iters, t-(init.)=0.94 s t(norm)=0.119527, mflops=41.8315 (err=2.1e-15)
16. Frigo-old: elapsed time t=1.61 s, 16 iters, t-(init.)=1.56 s t(norm)=0.198364, mflops=25.2062 (err=2.2e-15)
17. Green: elapsed time t=1.21 s, 16 iters, t-(init.)=1.15 s t(norm)=0.14623, mflops=34.1927 (err=2.2e-15)
18. GSL: elapsed time t=1.2 s, 8 iters, t-(init.)=1.17 s t(norm)=0.297546, mflops=16.8041 (err=2.2e-15)
19. GSL DIT: elapsed time t=1.26 s, 8 iters, t-(init.)=1.24 s t(norm)=0.315348, mflops=15.8555 (err=2.2e-15)
20. GSL DIF: elapsed time t=1.23 s, 8 iters, t-(init.)=1.2 s t(norm)=0.305176, mflops=16.384 (err=2.2e-15)
21. Skipping fft (Krukar can't handle N > 4096).
22. Mayer (Buneman): elapsed time t=1.66 s, 16 iters, t-(init.)=1.57 s t(norm)=0.199636, mflops=25.0456 (err=2.1e-15)
23. Mayer (simple): elapsed time t=1.5 s, 16 iters, t-(init.)=1.44 s t(norm)=0.183105, mflops=27.3067
24. Mayer (lookup): elapsed time t=1.49 s, 16 iters, t-(init.)=1.43 s t(norm)=0.181834, mflops=27.4976 (err=2.1e-15)
25. Monro: elapsed time t=1.1 s, 8 iters, t-(init.)=1.07 s t(norm)=0.272115, mflops=18.3746 (err=2.5e-15)
26. NAPACK (f2c): elapsed time t=2.02 s, 8 iters, t-(init.)=1.99 s t(norm)=0.506083, mflops=9.8798 (err=5.7e-13)
27. Nielsen: elapsed time t=1.57 s, 8 iters, t-(init.)=1.55 s t(norm)=0.394185, mflops=12.6844 (err=2.3e-13)
28. NR (C): elapsed time t=1.12 s, 8 iters, t-(init.)=1.09 s t(norm)=0.277201, mflops=18.0374 (err=2.2e-15)
29. NR (F): elapsed time t=1.15 s, 8 iters, t-(init.)=1.12 s t(norm)=0.284831, mflops=17.5543 (err=2.2e-15)
30. Ooura (C): elapsed time t=1.54 s, 32 iters, t-(init.)=1.43 s t(norm)=0.090917, mflops=54.9952 (err=2.2e-15)
31. Ooura (F): elapsed time t=1.67 s, 32 iters, t-(init.)=1.56 s t(norm)=0.0991821, mflops=50.4123 (err=2.2e-15)
32. QFT: elapsed time t=1.77 s, 16 iters, t-(init.)=1.72 s t(norm)=0.218709, mflops=22.8614 (err=4.9e-15)
33. Ransom: elapsed time t=1.4 s, 16 iters, t-(init.)=1.34 s t(norm)=0.17039, mflops=29.3445 (err=3.6e-15)
34. SCIPORT: elapsed time t=1.37 s, 4 iters, t-(init.)=1.35 s t(norm)=0.686646, mflops=7.28178 (err=2.3e-07)
35. Singleton: elapsed time t=1.37 s, 8 iters, t-(init.)=1.35 s t(norm)=0.343323, mflops=14.5636 (err=3.2e-15)
36. Singleton (f2c): elapsed time t=1.94 s, 16 iters, t-(init.)=1.88 s t(norm)=0.239054, mflops=20.9157 (err=3.2e-15)
37. Sorensen: elapsed time t=1.56 s, 16 iters, t-(init.)=1.51 s t(norm)=0.192006, mflops=26.0408 (err=2.1e-15)
38. Sorensen DIT: elapsed time t=1.49 s, 8 iters, t-(init.)=1.46 s t(norm)=0.371297, mflops=13.4663 (err=2.1e-15)
39. Temperton: elapsed time t=1.35 s, 8 iters, t-(init.)=1.32 s t(norm)=0.335693, mflops=14.8945 (err=2.2e-15)
40. Temperton (f2c): elapsed time t=1.09 s, 8 iters, t-(init.)=1.07 s t(norm)=0.272115, mflops=18.3746 (err=2.2e-15)
41. Valkenburg: elapsed time t=1.11 s, 2 iters, t-(init.)=1.11 s t(norm)=1.12915, mflops=4.42811 (err=2.3e-15)

Top mflops for N=32768 = 66.6468
Normalized results and averages for N=32768:
fft 0: mflops = 21.1406 (norm. = 0.317204), norm. avg. (of 15) = 0.328241
fft 1: mflops = 21.4872 (norm. = 0.322404), norm. avg. (of 15) = 0.335446
fft 2: mflops = 17.5543 (norm. = 0.263393), norm. avg. (of 15) = 0.263615
fft 3: mflops = 13.3747 (norm. = 0.20068), norm. avg. (of 15) = 0.121095
fft 4: mflops = 15.6038 (norm. = 0.234127), norm. avg. (of 15) = 0.256417
fft 5: mflops = 6.64216 (norm. = 0.0996622), norm. avg. (of 15) = 0.0710922
fft 6: mflops = 16.6617 (norm. = 0.25), norm. avg. (of 15) = 0.181462
fft 7: mflops = 16.2486 (norm. = 0.243802), norm. avg. (of 15) = 0.165288
fft 8: mflops = 15.2409 (norm. = 0.228682), norm. avg. (of 15) = 0.19184
fft 9: mflops = 66.0867 (norm. = 0.991597), norm. avg. (of 15) = 0.636413
fft 10: mflops = 66.6468 (norm. = 1), norm. avg. (of 15) = 0.633716
fft 11: mflops = 15.0082 (norm. = 0.225191), norm. avg. (of 14) = 0.185101
fft 12: mflops = 24.8871 (norm. = 0.373418), norm. avg. (of 15) = 0.351578
fft 13: mflops = 26.5686 (norm. = 0.398649), norm. avg. (of 15) = 0.399842
fft 14: mflops = 50.7375 (norm. = 0.76129), norm. avg. (of 15) = 0.750533
fft 15: mflops = 41.8315 (norm. = 0.62766), norm. avg. (of 15) = 0.584461
fft 16: mflops = 25.2062 (norm. = 0.378205), norm. avg. (of 15) = 0.600257
fft 17: mflops = 34.1927 (norm. = 0.513043), norm. avg. (of 13) = 0.491946
fft 18: mflops = 16.8041 (norm. = 0.252137), norm. avg. (of 15) = 0.185714
fft 19: mflops = 15.8555 (norm. = 0.237903), norm. avg. (of 15) = 0.250224
fft 20: mflops = 16.384 (norm. = 0.245833), norm. avg. (of 15) = 0.22593
fft 21: mflops = -1 (norm. = -0.0150045), norm. avg. (of 12) = 0.559204
fft 22: mflops = 25.0456 (norm. = 0.375796), norm. avg. (of 14) = 0.301281
fft 23: mflops = 27.3067 (norm. = 0.409722), norm. avg. (of 14) = 0.357036
fft 24: mflops = 27.4976 (norm. = 0.412587), norm. avg. (of 14) = 0.380875
fft 25: mflops = 18.3746 (norm. = 0.275701), norm. avg. (of 14) = 0.216726
fft 26: mflops = 9.8798 (norm. = 0.148241), norm. avg. (of 15) = 0.112594
fft 27: mflops = 12.6844 (norm. = 0.190323), norm. avg. (of 15) = 0.133915
fft 28: mflops = 18.0374 (norm. = 0.270642), norm. avg. (of 15) = 0.300462
fft 29: mflops = 17.5543 (norm. = 0.263393), norm. avg. (of 15) = 0.255601
fft 30: mflops = 54.9952 (norm. = 0.825175), norm. avg. (of 15) = 0.825066
fft 31: mflops = 50.4123 (norm. = 0.75641), norm. avg. (of 15) = 0.695318
fft 32: mflops = 22.8614 (norm. = 0.343023), norm. avg. (of 12) = 0.4257
fft 33: mflops = 29.3445 (norm. = 0.440299), norm. avg. (of 14) = 0.262145
fft 34: mflops = 7.28178 (norm. = 0.109259), norm. avg. (of 14) = 0.113748
fft 35: mflops = 14.5636 (norm. = 0.218519), norm. avg. (of 15) = 0.162283
fft 36: mflops = 20.9157 (norm. = 0.31383), norm. avg. (of 15) = 0.256612
fft 37: mflops = 26.0408 (norm. = 0.390728), norm. avg. (of 15) = 0.436597
fft 38: mflops = 13.4663 (norm. = 0.202055), norm. avg. (of 15) = 0.160105
fft 39: mflops = 14.8945 (norm. = 0.223485), norm. avg. (of 15) = 0.160556
fft 40: mflops = 18.3746 (norm. = 0.275701), norm. avg. (of 15) = 0.22461
fft 41: mflops = 4.42811 (norm. = 0.0664414), norm. avg. (of 15) = 0.05422

Benchmarking for array size = 65536 (power of 2):
0. Arndt DIF: elapsed time t=1.3 s, 4 iters, t-(init.)=1.26 s t(norm)=0.300407, mflops=16.6441 (err=4.0e-15)
1. Arndt DIT: elapsed time t=1.29 s, 4 iters, t-(init.)=1.25 s t(norm)=0.298023, mflops=16.7772 (err=4.1e-15)
2. Arndt Split-Radix: elapsed time t=1.62 s, 4 iters, t-(init.)=1.57 s t(norm)=0.374317, mflops=13.3577 (err=4.1e-15)
3. Arndt 4-step: elapsed time t=1.54 s, 4 iters, t-(init.)=1.49 s t(norm)=0.355244, mflops=14.0748 (err=4.2e-15)
4. Bailey: elapsed time t=1.57 s, 4 iters, t-(init.)=1.53 s t(norm)=0.36478, mflops=13.7069 (err=4.0e-15)
5. Beauregard: elapsed time t=1.69 s, 2 iters, t-(init.)=1.66 s t(norm)=0.79155, mflops=6.31672 (err=4.2e-15)
6. Bergland: elapsed time t=1.47 s, 4 iters, t-(init.)=1.43 s t(norm)=0.340939, mflops=14.6654 (err=4.3e-15)
7. Brenner: elapsed time t=1.49 s, 4 iters, t-(init.)=1.45 s t(norm)=0.345707, mflops=14.4631 (err=4.3e-15)
8. Burrus: elapsed time t=1.81 s, 4 iters, t-(init.)=1.77 s t(norm)=0.422001, mflops=11.8483 (err=4.1e-15)
9. CWP (min N) (N=72072): elapsed time t=1.81 s, 16 iters, t-(init.)=1.62 s t(norm)=0.0965595, mflops=51.7815
10. CWP (best N) (N=72072): elapsed time t=1.8 s, 16 iters, t-(init.)=1.61 s t(norm)=0.0959635, mflops=52.1032
11. Edelblute: elapsed time t=1.85 s, 4 iters, t-(init.)=1.81 s t(norm)=0.431538, mflops=11.5865 (err=4.1e-15)
12. FFTPACK: elapsed time t=1.77 s, 8 iters, t-(init.)=1.68 s t(norm)=0.200272, mflops=24.9661 (err=4.2e-15)
13. FFTPACK (f2c): elapsed time t=1.61 s, 8 iters, t-(init.)=1.52 s t(norm)=0.181198, mflops=27.5941 (err=4.2e-15)
FFTW_MEASURE plan: (cost = 1.200000e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 16
14. FFTW: elapsed time t=1.87 s, 16 iters, t-(init.)=1.7 s t(norm)=0.101328, mflops=49.3448 (err=4.3e-15)
FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=1.17 s, 8 iters, t-(init.)=1.09 s t(norm)=0.129938, mflops=38.4799 (err=4.3e-15)
16. Frigo-old: elapsed time t=1.58 s, 8 iters, t-(init.)=1.49 s t(norm)=0.177622, mflops=28.1497 (err=4.4e-15)
17. Green: elapsed time t=1.66 s, 8 iters, t-(init.)=1.57 s t(norm)=0.187159, mflops=26.7153 (err=4.3e-15)
18. GSL: elapsed time t=1.33 s, 4 iters, t-(init.)=1.29 s t(norm)=0.30756, mflops=16.257 (err=4.2e-15)
19. GSL DIT: elapsed time t=1.73 s, 4 iters, t-(init.)=1.68 s t(norm)=0.400543, mflops=12.483 (err=4.2e-15)
20. GSL DIF: elapsed time t=1.68 s, 4 iters, t-(init.)=1.64 s t(norm)=0.391006, mflops=12.7875 (err=4.2e-15)
21. Skipping fft (Krukar can't handle N > 4096).
22. Mayer (Buneman): elapsed time t=1.01 s, 4 iters, t-(init.)=0.96 s t(norm)=0.228882, mflops=21.8453 (err=4.2e-15)
23. Mayer (simple): elapsed time t=1.9 s, 8 iters, t-(init.)=1.81 s t(norm)=0.215769, mflops=23.173
24. Mayer (lookup): elapsed time t=1.9 s, 8 iters, t-(init.)=1.82 s t(norm)=0.216961, mflops=23.0456 (err=4.2e-15)
25. Monro: elapsed time t=1.47 s, 4 iters, t-(init.)=1.42 s t(norm)=0.338554, mflops=14.7687 (err=4.6e-15)
26. NAPACK (f2c): elapsed time t=1.05 s, 2 iters, t-(init.)=1.03 s t(norm)=0.491142, mflops=10.1803 (err=8.9e-13)
27. Nielsen: elapsed time t=1.85 s, 4 iters, t-(init.)=1.81 s t(norm)=0.431538, mflops=11.5865 (err=2.7e-13)
28. NR (C): elapsed time t=1.59 s, 4 iters, t-(init.)=1.54 s t(norm)=0.367165, mflops=13.6179 (err=4.2e-15)
29. NR (F): elapsed time t=1.7 s, 4 iters, t-(init.)=1.66 s t(norm)=0.395775, mflops=12.6334 (err=4.2e-15)
30. Ooura (C): elapsed time t=1.05 s, 8 iters, t-(init.)=0.97 s t(norm)=0.115633, mflops=43.2402 (err=4.4e-15)
31. Ooura (F): elapsed time t=1.12 s, 8 iters, t-(init.)=1.04 s t(norm)=0.123978, mflops=40.3298 (err=4.4e-15)
32. QFT: elapsed time t=1.1 s, 4 iters, t-(init.)=1.05 s t(norm)=0.25034, mflops=19.9729 (err=7.9e-15)
33. Ransom: elapsed time t=1.64 s, 8 iters, t-(init.)=1.55 s t(norm)=0.184774, mflops=27.06 (err=6.9e-15)
34. SCIPORT: elapsed time t=1.53 s, 2 iters, t-(init.)=1.51 s t(norm)=0.720024, mflops=6.94421 (err=2.5e-07)
35. Singleton: elapsed time t=1.73 s, 4 iters, t-(init.)=1.68 s t(norm)=0.400543, mflops=12.483 (err=5.6e-15)
36. Singleton (f2c): elapsed time t=1.22 s, 4 iters, t-(init.)=1.18 s t(norm)=0.281334, mflops=17.7725 (err=5.6e-15)
37. Sorensen: elapsed time t=1.09 s, 4 iters, t-(init.)=1.04 s t(norm)=0.247955, mflops=20.1649 (err=4.2e-15)
38. Sorensen DIT: elapsed time t=1 s, 2 iters, t-(init.)=0.97 s t(norm)=0.462532, mflops=10.8101 (err=4.1e-15)
39. Temperton: elapsed time t=1.58 s, 4 iters, t-(init.)=1.54 s t(norm)=0.367165, mflops=13.6179 (err=4.2e-15)
40. Temperton (f2c): elapsed time t=1.26 s, 4 iters, t-(init.)=1.19 s t(norm)=0.283718, mflops=17.6231 (err=4.2e-15)
41. Valkenburg: elapsed time t=1.25 s, 1 iters, t-(init.)=1.24 s t(norm)=1.18256, mflops=4.22813 (err=4.0e-15)

Top mflops for N=65536 = 52.1032
Normalized results and averages for N=65536:
fft 0: mflops = 16.6441 (norm. = 0.319444), norm. avg. (of 16) = 0.327692
fft 1: mflops = 16.7772 (norm. = 0.322), norm. avg. (of 16) = 0.334605
fft 2: mflops = 13.3577 (norm. = 0.256369), norm. avg. (of 16) = 0.263162
fft 3: mflops = 14.0748 (norm. = 0.270134), norm. avg. (of 16) = 0.13041
fft 4: mflops = 13.7069 (norm. = 0.263072), norm. avg. (of 16) = 0.256833
fft 5: mflops = 6.31672 (norm. = 0.121235), norm. avg. (of 16) = 0.0742261
fft 6: mflops = 14.6654 (norm. = 0.281469), norm. avg. (of 16) = 0.187712
fft 7: mflops = 14.4631 (norm. = 0.277586), norm. avg. (of 16) = 0.172307
fft 8: mflops = 11.8483 (norm. = 0.227401), norm. avg. (of 16) = 0.194063
fft 9: mflops = 51.7815 (norm. = 0.993827), norm. avg. (of 16) = 0.658751
fft 10: mflops = 52.1032 (norm. = 1), norm. avg. (of 16) = 0.656609
fft 11: mflops = 11.5865 (norm. = 0.222376), norm. avg. (of 15) = 0.187586
fft 12: mflops = 24.9661 (norm. = 0.479167), norm. avg. (of 16) = 0.359552
fft 13: mflops = 27.5941 (norm. = 0.529605), norm. avg. (of 16) = 0.407952
fft 14: mflops = 49.3448 (norm. = 0.947059), norm. avg. (of 16) = 0.762816
fft 15: mflops = 38.4799 (norm. = 0.738532), norm. avg. (of 16) = 0.594091
fft 16: mflops = 28.1497 (norm. = 0.540268), norm. avg. (of 16) = 0.596507
fft 17: mflops = 26.7153 (norm. = 0.512739), norm. avg. (of 14) = 0.493431
fft 18: mflops = 16.257 (norm. = 0.312016), norm. avg. (of 16) = 0.193608
fft 19: mflops = 12.483 (norm. = 0.239583), norm. avg. (of 16) = 0.249559
fft 20: mflops = 12.7875 (norm. = 0.245427), norm. avg. (of 16) = 0.227148
fft 21: mflops = -1 (norm. = -0.0191927), norm. avg. (of 12) = 0.559204
fft 22: mflops = 21.8453 (norm. = 0.419271), norm. avg. (of 15) = 0.309147
fft 23: mflops = 23.173 (norm. = 0.444751), norm. avg. (of 15) = 0.362884
fft 24: mflops = 23.0456 (norm. = 0.442308), norm. avg. (of 15) = 0.38497
fft 25: mflops = 14.7687 (norm. = 0.283451), norm. avg. (of 15) = 0.221174
fft 26: mflops = 10.1803 (norm. = 0.195388), norm. avg. (of 16) = 0.117768
fft 27: mflops = 11.5865 (norm. = 0.222376), norm. avg. (of 16) = 0.139444
fft 28: mflops = 13.6179 (norm. = 0.261364), norm. avg. (of 16) = 0.298018
fft 29: mflops = 12.6334 (norm. = 0.24247), norm. avg. (of 16) = 0.254781
fft 30: mflops = 43.2402 (norm. = 0.829897), norm. avg. (of 16) = 0.825368
fft 31: mflops = 40.3298 (norm. = 0.774038), norm. avg. (of 16) = 0.700238
fft 32: mflops = 19.9729 (norm. = 0.383333), norm. avg. (of 13) = 0.422441
fft 33: mflops = 27.06 (norm. = 0.519355), norm. avg. (of 15) = 0.279292
fft 34: mflops = 6.94421 (norm. = 0.133278), norm. avg. (of 15) = 0.11505
fft 35: mflops = 12.483 (norm. = 0.239583), norm. avg. (of 16) = 0.167114
fft 36: mflops = 17.7725 (norm. = 0.341102), norm. avg. (of 16) = 0.261893
fft 37: mflops = 20.1649 (norm. = 0.387019), norm. avg. (of 16) = 0.433499
fft 38: mflops = 10.8101 (norm. = 0.207474), norm. avg. (of 16) = 0.163066
fft 39: mflops = 13.6179 (norm. = 0.261364), norm. avg. (of 16) = 0.166857
fft 40: mflops = 17.6231 (norm. = 0.338235), norm. avg. (of 16) = 0.231711
fft 41: mflops = 4.22813 (norm. = 0.0811492), norm. avg. (of 16) = 0.0559031

Benchmarking for array size = 131072 (power of 2):
0. Arndt DIF: elapsed time t=1.46 s, 2 iters, t-(init.)=1.42 s t(norm)=0.318639, mflops=15.6917 (err=2.8e-15)
1. Arndt DIT: elapsed time t=1.42 s, 2 iters, t-(init.)=1.37 s t(norm)=0.30742, mflops=16.2644 (err=2.8e-15)
2. Arndt Split-Radix: elapsed time t=1.87 s, 2 iters, t-(init.)=1.83 s t(norm)=0.410641, mflops=12.1761 (err=2.8e-15)
3. Arndt 4-step: elapsed time t=1.81 s, 2 iters, t-(init.)=1.76 s t(norm)=0.394933, mflops=12.6604 (err=2.8e-15)
4. Bailey: elapsed time t=1.6 s, 2 iters, t-(init.)=1.56 s t(norm)=0.350055, mflops=14.2835 (err=2.8e-15)
5. Beauregard: elapsed time t=1.81 s, 1 iters, t-(init.)=1.79 s t(norm)=0.80333, mflops=6.22409 (err=2.9e-15)
6. Bergland: elapsed time t=1.57 s, 2 iters, t-(init.)=1.52 s t(norm)=0.341079, mflops=14.6594 (err=2.9e-15)
7. Brenner: elapsed time t=1.64 s, 2 iters, t-(init.)=1.59 s t(norm)=0.356786, mflops=14.014 (err=2.9e-15)
8. Burrus: elapsed time t=1.02 s, 1 iters, t-(init.)=1 s t(norm)=0.448788, mflops=11.1411 (err=2.8e-15)
9. CWP (min N) (N=144144): elapsed time t=1 s, 4 iters, t-(init.)=0.9 s t(norm)=0.100977, mflops=49.5161
10. CWP (best N) (N=144144): elapsed time t=1.97 s, 8 iters, t-(init.)=1.77 s t(norm)=0.0992943, mflops=50.3553
11. Edelblute: elapsed time t=1.05 s, 1 iters, t-(init.)=1.03 s t(norm)=0.462252, mflops=10.8166 (err=2.8e-15)
12. FFTPACK: elapsed time t=1.07 s, 2 iters, t-(init.)=1.03 s t(norm)=0.231126, mflops=21.6332 (err=2.9e-15)
13. FFTPACK (f2c): elapsed time t=1.96 s, 4 iters, t-(init.)=1.87 s t(norm)=0.209808, mflops=23.8313 (err=2.9e-15)
FFTW_MEASURE plan: (cost = 2.600000e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_NOTW 8
14. FFTW: elapsed time t=1.96 s, 8 iters, t-(init.)=1.78 s t(norm)=0.0998553, mflops=50.0724 (err=2.9e-15)
FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32
15. FFTW_ESTIMATE: elapsed time t=1.29 s, 4 iters, t-(init.)=1.2 s t(norm)=0.134636, mflops=37.1371 (err=2.9e-15)
16. Frigo-old: elapsed time t=1.01 s, 2 iters, t-(init.)=0.96 s t(norm)=0.215418, mflops=23.2107 (err=2.8e-15)
17. Green: elapsed time t=1.91 s, 4 iters, t-(init.)=1.82 s t(norm)=0.204199, mflops=24.486 (err=2.9e-15)
18. GSL: elapsed time t=1.47 s, 2 iters, t-(init.)=1.43 s t(norm)=0.320883, mflops=15.582 (err=2.9e-15)
19. GSL DIT: elapsed time t=1.95 s, 2 iters, t-(init.)=1.91 s t(norm)=0.428592, mflops=11.6661 (err=2.9e-15)
20. GSL DIF: elapsed time t=1.9 s, 2 iters, t-(init.)=1.86 s t(norm)=0.417373, mflops=11.9797 (err=2.9e-15)
21. Skipping fft (Krukar can't handle N > 4096).
22. Mayer (Buneman): elapsed time t=1.26 s, 2 iters, t-(init.)=1.21 s t(norm)=0.271517, mflops=18.4151 (err=2.8e-15)
23. Mayer (simple): elapsed time t=1.16 s, 2 iters, t-(init.)=1.11 s t(norm)=0.249077, mflops=20.0741
24. Mayer (lookup): elapsed time t=1.2 s, 2 iters, t-(init.)=1.16 s t(norm)=0.260297, mflops=19.2088 (err=2.8e-15)
25. Monro: elapsed time t=1.66 s, 2 iters, t-(init.)=1.62 s t(norm)=0.363518, mflops=13.7545 (err=4.2e-15)
26. NAPACK (f2c): elapsed time t=1.14 s, 1 iters, t-(init.)=1.12 s t(norm)=0.502642, mflops=9.94743 (err=2.1e-12)
27. Nielsen: elapsed time t=1.96 s, 2 iters, t-(init.)=1.91 s t(norm)=0.428592, mflops=11.6661 (err=9.6e-13)
28. NR (C): elapsed time t=1.79 s, 2 iters, t-(init.)=1.74 s t(norm)=0.390445, mflops=12.8059 (err=2.9e-15)
29. NR (F): elapsed time t=1.84 s, 2 iters, t-(init.)=1.79 s t(norm)=0.401665, mflops=12.4482 (err=2.9e-15)
30. Ooura (C): elapsed time t=1.19 s, 4 iters, t-(init.)=1.1 s t(norm)=0.123417, mflops=40.5132 (err=2.8e-15)
31. Ooura (F): elapsed time t=1.26 s, 4 iters, t-(init.)=1.17 s t(norm)=0.13127, mflops=38.0893 (err=2.8e-15)
32. QFT: elapsed time t=1.33 s, 2 iters, t-(init.)=1.29 s t(norm)=0.289468, mflops=17.2731 (err=8.6e-15)
33. Ransom: elapsed time t=1.02 s, 2 iters, t-(init.)=0.98 s t(norm)=0.219906, mflops=22.737 (err=4.0e-15)
34. SCIPORT: elapsed time t=1.62 s, 1 iters, t-(init.)=1.6 s t(norm)=0.718061, mflops=6.9632 (err=2.7e-07)
35. Singleton: elapsed time t=1.88 s, 2 iters, t-(init.)=1.83 s t(norm)=0.410641, mflops=12.1761 (err=4.3e-15)
36. Singleton (f2c): elapsed time t=1.41 s, 2 iters, t-(init.)=1.37 s t(norm)=0.30742, mflops=16.2644 (err=4.2e-15)
37. Sorensen: elapsed time t=1.25 s, 2 iters, t-(init.)=1.2 s t(norm)=0.269273, mflops=18.5685 (err=2.8e-15)
38. Sorensen DIT: elapsed time t=1.13 s, 1 iters, t-(init.)=1.11 s t(norm)=0.498155, mflops=10.037 (err=2.8e-15)
39. Temperton: elapsed time t=1.79 s, 2 iters, t-(init.)=1.74 s t(norm)=0.390445, mflops=12.8059 (err=2.9e-15)
40. Temperton (f2c): elapsed time t=1.44 s, 2 iters, t-(init.)=1.39 s t(norm)=0.311908, mflops=16.0304 (err=2.9e-15)
41. Valkenburg: elapsed time t=2.75 s, 1 iters, t-(init.)=2.73 s t(norm)=1.22519, mflops=4.081 (err=3.1e-15)

Top mflops for N=131072 = 50.3553
Normalized results and averages for N=131072:
fft 0: mflops = 15.6917 (norm. = 0.31162), norm. avg. (of 17) = 0.326746
fft 1: mflops = 16.2644 (norm. = 0.322993), norm. avg. (of 17) = 0.333922
fft 2: mflops = 12.1761 (norm. = 0.241803), norm. avg. (of 17) = 0.261906
fft 3: mflops = 12.6604 (norm. = 0.25142), norm. avg. (of 17) = 0.137528
fft 4: mflops = 14.2835 (norm. = 0.283654), norm. avg. (of 17) = 0.258411
fft 5: mflops = 6.22409 (norm. = 0.123603), norm. avg. (of 17) = 0.0771306
fft 6: mflops = 14.6594 (norm. = 0.291118), norm. avg. (of 17) = 0.193795
fft 7: mflops = 14.014 (norm. = 0.278302), norm. avg. (of 17) = 0.178542
fft 8: mflops = 11.1411 (norm. = 0.22125), norm. avg. (of 17) = 0.195662
fft 9: mflops = 49.5161 (norm. = 0.983333), norm. avg. (of 17) = 0.677844
fft 10: mflops = 50.3553 (norm. = 1), norm. avg. (of 17) = 0.676809
fft 11: mflops = 10.8166 (norm. = 0.214806), norm. avg. (of 16) = 0.189288
fft 12: mflops = 21.6332 (norm. = 0.429612), norm. avg. (of 17) = 0.363673
fft 13: mflops = 23.8313 (norm. = 0.473262), norm. avg. (of 17) = 0.411794
fft 14: mflops = 50.0724 (norm. = 0.994382), norm. avg. (of 17) = 0.776438
fft 15: mflops = 37.1371 (norm. = 0.7375), norm. avg. (of 17) = 0.602527
fft 16: mflops = 23.2107 (norm. = 0.460938), norm. avg. (of 17) = 0.588533
fft 17: mflops = 24.486 (norm. = 0.486264), norm. avg. (of 15) = 0.492953
fft 18: mflops = 15.582 (norm. = 0.309441), norm. avg. (of 17) = 0.200422
fft 19: mflops = 11.6661 (norm. = 0.231675), norm. avg.
(of 17) = 0.248507 fft 20: mflops = 11.9797 (norm. = 0.237903), norm. avg. (of 17) = 0.227781 fft 21: mflops = -1 (norm. = -0.0198589), norm. avg. (of 12) = 0.559204 fft 22: mflops = 18.4151 (norm. = 0.365702), norm. avg. (of 16) = 0.312682 fft 23: mflops = 20.0741 (norm. = 0.398649), norm. avg. (of 16) = 0.365119 fft 24: mflops = 19.2088 (norm. = 0.381466), norm. avg. (of 16) = 0.384751 fft 25: mflops = 13.7545 (norm. = 0.273148), norm. avg. (of 16) = 0.224422 fft 26: mflops = 9.94743 (norm. = 0.197545), norm. avg. (of 17) = 0.122461 fft 27: mflops = 11.6661 (norm. = 0.231675), norm. avg. (of 17) = 0.144869 fft 28: mflops = 12.8059 (norm. = 0.25431), norm. avg. (of 17) = 0.295447 fft 29: mflops = 12.4482 (norm. = 0.247207), norm. avg. (of 17) = 0.254335 fft 30: mflops = 40.5132 (norm. = 0.804545), norm. avg. (of 17) = 0.824143 fft 31: mflops = 38.0893 (norm. = 0.75641), norm. avg. (of 17) = 0.703542 fft 32: mflops = 17.2731 (norm. = 0.343023), norm. avg. (of 14) = 0.416768 fft 33: mflops = 22.737 (norm. = 0.451531), norm. avg. (of 16) = 0.290057 fft 34: mflops = 6.9632 (norm. = 0.138281), norm. avg. (of 16) = 0.116502 fft 35: mflops = 12.1761 (norm. = 0.241803), norm. avg. (of 17) = 0.171507 fft 36: mflops = 16.2644 (norm. = 0.322993), norm. avg. (of 17) = 0.265487 fft 37: mflops = 18.5685 (norm. = 0.36875), norm. avg. (of 17) = 0.42969 fft 38: mflops = 10.037 (norm. = 0.199324), norm. avg. (of 17) = 0.165199 fft 39: mflops = 12.8059 (norm. = 0.25431), norm. avg. (of 17) = 0.172001 fft 40: mflops = 16.0304 (norm. = 0.318345), norm. avg. (of 17) = 0.236807 fft 41: mflops = 4.081 (norm. = 0.081044), norm. avg. (of 17) = 0.0573819 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=1.58 s, 1 iters, t-(init.)=1.54 s t(norm)=0.326369, mflops=15.3201 (err=6.7e-15) 1. Arndt DIT: elapsed time t=1.61 s, 1 iters, t-(init.)=1.56 s t(norm)=0.330607, mflops=15.1237 (err=6.7e-15) 2. 
Arndt Split-Radix: elapsed time t=2.02 s, 1 iters, t-(init.)=1.98 s t(norm)=0.419617, mflops=11.9156 (err=6.7e-15) 3. Arndt 4-step: elapsed time t=1.68 s, 1 iters, t-(init.)=1.64 s t(norm)=0.347561, mflops=14.386 (err=6.8e-15) 4. Bailey: elapsed time t=1.75 s, 1 iters, t-(init.)=1.7 s t(norm)=0.360277, mflops=13.8782 (err=6.7e-15) 5. Beauregard: elapsed time t=3.84 s, 1 iters, t-(init.)=3.79 s t(norm)=0.803206, mflops=6.22506 (err=6.8e-15) 6. Bergland: elapsed time t=1.7 s, 1 iters, t-(init.)=1.66 s t(norm)=0.3518, mflops=14.2126 (err=6.8e-15) 7. Brenner: elapsed time t=1.73 s, 1 iters, t-(init.)=1.68 s t(norm)=0.356038, mflops=14.0434 (err=6.9e-15) 8. Burrus: elapsed time t=2.2 s, 1 iters, t-(init.)=2.16 s t(norm)=0.457764, mflops=10.9227 (err=6.7e-15) 9. CWP (min N) (N=360360): elapsed time t=1.46 s, 2 iters, t-(init.)=1.34 s t(norm)=0.141992, mflops=35.2134 10. CWP (best N) (N=360360): elapsed time t=1.45 s, 2 iters, t-(init.)=1.32 s t(norm)=0.139872, mflops=35.7469 11. Edelblute: elapsed time t=2.25 s, 1 iters, t-(init.)=2.21 s t(norm)=0.46836, mflops=10.6755 (err=6.7e-15) 12. FFTPACK: elapsed time t=1.1 s, 1 iters, t-(init.)=1.06 s t(norm)=0.224643, mflops=22.2575 (err=6.8e-15) 13. FFTPACK (f2c): elapsed time t=1 s, 1 iters, t-(init.)=0.95 s t(norm)=0.201331, mflops=24.8347 (err=6.8e-15) FFTW_MEASURE plan: (cost = 5.900000e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 16 14. FFTW: elapsed time t=1.09 s, 2 iters, t-(init.)=1 s t(norm)=0.105964, mflops=47.1859 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.33 s, 2 iters, t-(init.)=1.24 s t(norm)=0.131395, mflops=38.0532 (err=6.8e-15) 16. Frigo-old: elapsed time t=1.84 s, 2 iters, t-(init.)=1.75 s t(norm)=0.185437, mflops=26.9634 (err=6.9e-15) 17. 
Green: elapsed time t=1.98 s, 2 iters, t-(init.)=1.89 s t(norm)=0.200272, mflops=24.9661 (err=6.9e-15) 18. GSL: elapsed time t=1.54 s, 1 iters, t-(init.)=1.49 s t(norm)=0.315772, mflops=15.8342 (err=6.8e-15) 19. GSL DIT: elapsed time t=2.12 s, 1 iters, t-(init.)=2.08 s t(norm)=0.440809, mflops=11.3428 (err=6.8e-15) 20. GSL DIF: elapsed time t=2.05 s, 1 iters, t-(init.)=2.01 s t(norm)=0.425975, mflops=11.7378 (err=6.8e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.44 s, 1 iters, t-(init.)=1.39 s t(norm)=0.294579, mflops=16.9734 (err=6.8e-15) 23. Mayer (simple): elapsed time t=1.35 s, 1 iters, t-(init.)=1.3 s t(norm)=0.275506, mflops=18.1484 24. Mayer (lookup): elapsed time t=1.36 s, 1 iters, t-(init.)=1.31 s t(norm)=0.277625, mflops=18.0099 (err=6.8e-15) 25. Monro: elapsed time t=1.75 s, 1 iters, t-(init.)=1.7 s t(norm)=0.360277, mflops=13.8782 (err=9.9e-15) 26. NAPACK (f2c): elapsed time t=2.34 s, 1 iters, t-(init.)=2.3 s t(norm)=0.487434, mflops=10.2578 (err=3.7e-12) 27. Nielsen: elapsed time t=2.15 s, 1 iters, t-(init.)=2.1 s t(norm)=0.445048, mflops=11.2347 (err=2.2e-12) 28. NR (C): elapsed time t=1.94 s, 1 iters, t-(init.)=1.89 s t(norm)=0.400543, mflops=12.483 (err=6.8e-15) 29. NR (F): elapsed time t=1.99 s, 1 iters, t-(init.)=1.94 s t(norm)=0.41114, mflops=12.1613 (err=6.8e-15) 30. Ooura (C): elapsed time t=1.22 s, 2 iters, t-(init.)=1.13 s t(norm)=0.119739, mflops=41.7575 (err=6.9e-15) 31. Ooura (F): elapsed time t=1.29 s, 2 iters, t-(init.)=1.2 s t(norm)=0.127157, mflops=39.3216 (err=6.9e-15) 32. QFT: elapsed time t=1.55 s, 1 iters, t-(init.)=1.51 s t(norm)=0.320011, mflops=15.6245 (err=1.4e-14) 33. Ransom: elapsed time t=1.87 s, 2 iters, t-(init.)=1.78 s t(norm)=0.188616, mflops=26.5089 (err=8.2e-15) 34. SCIPORT: elapsed time t=3.48 s, 1 iters, t-(init.)=3.43 s t(norm)=0.726912, mflops=6.87841 (err=2.8e-07) 35. 
Singleton: elapsed time t=2.03 s, 1 iters, t-(init.)=1.98 s t(norm)=0.419617, mflops=11.9156 (err=1.0e-14) 36. Singleton (f2c): elapsed time t=1.46 s, 1 iters, t-(init.)=1.42 s t(norm)=0.300937, mflops=16.6148 (err=1.0e-14) 37. Sorensen: elapsed time t=1.35 s, 1 iters, t-(init.)=1.3 s t(norm)=0.275506, mflops=18.1484 (err=6.8e-15) 38. Sorensen DIT: elapsed time t=2.46 s, 1 iters, t-(init.)=2.42 s t(norm)=0.512865, mflops=9.74916 (err=6.7e-15) 39. Temperton: elapsed time t=1.86 s, 1 iters, t-(init.)=1.81 s t(norm)=0.383589, mflops=13.0348 (err=6.8e-15) 40. Temperton (f2c): elapsed time t=1.49 s, 1 iters, t-(init.)=1.45 s t(norm)=0.307295, mflops=16.271 (err=6.8e-15) 41. Valkenburg: elapsed time t=5.83 s, 1 iters, t-(init.)=5.78 s t(norm)=1.22494, mflops=4.08183 (err=6.8e-15) Top mflops for N=262144 = 47.1859 Normalized results and averages for N=262144: fft 0: mflops = 15.3201 (norm. = 0.324675), norm. avg. (of 18) = 0.326631 fft 1: mflops = 15.1237 (norm. = 0.320513), norm. avg. (of 18) = 0.333177 fft 2: mflops = 11.9156 (norm. = 0.252525), norm. avg. (of 18) = 0.261384 fft 3: mflops = 14.386 (norm. = 0.304878), norm. avg. (of 18) = 0.146825 fft 4: mflops = 13.8782 (norm. = 0.294118), norm. avg. (of 18) = 0.260395 fft 5: mflops = 6.22506 (norm. = 0.131926), norm. avg. (of 18) = 0.0801748 fft 6: mflops = 14.2126 (norm. = 0.301205), norm. avg. (of 18) = 0.199762 fft 7: mflops = 14.0434 (norm. = 0.297619), norm. avg. (of 18) = 0.185157 fft 8: mflops = 10.9227 (norm. = 0.231481), norm. avg. (of 18) = 0.197652 fft 9: mflops = 35.2134 (norm. = 0.746269), norm. avg. (of 18) = 0.681646 fft 10: mflops = 35.7469 (norm. = 0.757576), norm. avg. (of 18) = 0.681296 fft 11: mflops = 10.6755 (norm. = 0.226244), norm. avg. (of 17) = 0.191462 fft 12: mflops = 22.2575 (norm. = 0.471698), norm. avg. (of 18) = 0.369675 fft 13: mflops = 24.8347 (norm. = 0.526316), norm. avg. (of 18) = 0.418156 fft 14: mflops = 47.1859 (norm. = 1), norm. avg. 
(of 18) = 0.788858 fft 15: mflops = 38.0532 (norm. = 0.806452), norm. avg. (of 18) = 0.613856 fft 16: mflops = 26.9634 (norm. = 0.571429), norm. avg. (of 18) = 0.587582 fft 17: mflops = 24.9661 (norm. = 0.529101), norm. avg. (of 16) = 0.495212 fft 18: mflops = 15.8342 (norm. = 0.33557), norm. avg. (of 18) = 0.20793 fft 19: mflops = 11.3428 (norm. = 0.240385), norm. avg. (of 18) = 0.248056 fft 20: mflops = 11.7378 (norm. = 0.248756), norm. avg. (of 18) = 0.228946 fft 21: mflops = -1 (norm. = -0.0211928), norm. avg. (of 12) = 0.559204 fft 22: mflops = 16.9734 (norm. = 0.359712), norm. avg. (of 17) = 0.315448 fft 23: mflops = 18.1484 (norm. = 0.384615), norm. avg. (of 17) = 0.366266 fft 24: mflops = 18.0099 (norm. = 0.381679), norm. avg. (of 17) = 0.38457 fft 25: mflops = 13.8782 (norm. = 0.294118), norm. avg. (of 17) = 0.228522 fft 26: mflops = 10.2578 (norm. = 0.217391), norm. avg. (of 18) = 0.127735 fft 27: mflops = 11.2347 (norm. = 0.238095), norm. avg. (of 18) = 0.150048 fft 28: mflops = 12.483 (norm. = 0.26455), norm. avg. (of 18) = 0.29373 fft 29: mflops = 12.1613 (norm. = 0.257732), norm. avg. (of 18) = 0.254524 fft 30: mflops = 41.7575 (norm. = 0.884956), norm. avg. (of 18) = 0.827522 fft 31: mflops = 39.3216 (norm. = 0.833333), norm. avg. (of 18) = 0.710753 fft 32: mflops = 15.6245 (norm. = 0.331126), norm. avg. (of 15) = 0.411059 fft 33: mflops = 26.5089 (norm. = 0.561798), norm. avg. (of 17) = 0.306042 fft 34: mflops = 6.87841 (norm. = 0.145773), norm. avg. (of 17) = 0.118224 fft 35: mflops = 11.9156 (norm. = 0.252525), norm. avg. (of 18) = 0.176008 fft 36: mflops = 16.6148 (norm. = 0.352113), norm. avg. (of 18) = 0.270299 fft 37: mflops = 18.1484 (norm. = 0.384615), norm. avg. (of 18) = 0.427186 fft 38: mflops = 9.74916 (norm. = 0.206612), norm. avg. (of 18) = 0.167499 fft 39: mflops = 13.0348 (norm. = 0.276243), norm. avg. (of 18) = 0.177792 fft 40: mflops = 16.271 (norm. = 0.344828), norm. avg. (of 18) = 0.242808 fft 41: mflops = 4.08183 (norm. 
= 0.0865052), norm. avg. (of 18) = 0.0589999 Benchmarking for array size = 524288 (power of 2): 0. Arndt DIF: elapsed time t=3.41 s, 1 iters, t-(init.)=3.32 s t(norm)=0.333284, mflops=15.0022 (err=1.0e-14) 1. Arndt DIT: elapsed time t=3.38 s, 1 iters, t-(init.)=3.29 s t(norm)=0.330272, mflops=15.139 (err=1.0e-14) 2. Arndt Split-Radix: elapsed time t=4.38 s, 1 iters, t-(init.)=4.29 s t(norm)=0.430659, mflops=11.6101 (err=1.0e-14) 3. Arndt 4-step: elapsed time t=4.04 s, 1 iters, t-(init.)=3.95 s t(norm)=0.396528, mflops=12.6095 (err=1.0e-14) 4. Bailey: elapsed time t=3.59 s, 1 iters, t-(init.)=3.5 s t(norm)=0.351354, mflops=14.2307 (err=1.0e-14) 5. Beauregard: elapsed time t=8.14 s, 1 iters, t-(init.)=8.05 s t(norm)=0.808113, mflops=6.18725 (err=1.0e-14) 6. Bergland: elapsed time t=3.62 s, 1 iters, t-(init.)=3.53 s t(norm)=0.354365, mflops=14.1097 (err=1.0e-14) 7. Brenner: elapsed time t=3.72 s, 1 iters, t-(init.)=3.63 s t(norm)=0.364404, mflops=13.721 (err=1.0e-14) 8. Burrus: elapsed time t=4.68 s, 1 iters, t-(init.)=4.59 s t(norm)=0.460775, mflops=10.8513 (err=1.0e-14) 9. CWP (min N) (N=720720): elapsed time t=1.5 s, 1 iters, t-(init.)=1.37 s t(norm)=0.13753, mflops=36.3557 10. CWP (best N) (N=720720): elapsed time t=1.51 s, 1 iters, t-(init.)=1.39 s t(norm)=0.139538, mflops=35.8326 11. Edelblute: elapsed time t=4.86 s, 1 iters, t-(init.)=4.77 s t(norm)=0.478845, mflops=10.4418 (err=1.0e-14) 12. FFTPACK: elapsed time t=2.28 s, 1 iters, t-(init.)=2.2 s t(norm)=0.220851, mflops=22.6397 (err=1.0e-14) 13. FFTPACK (f2c): elapsed time t=2.09 s, 1 iters, t-(init.)=2.01 s t(norm)=0.201777, mflops=24.7798 (err=1.0e-14) FFTW_MEASURE plan: (cost = 1.250000e+00) FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 8 14. 
FFTW: elapsed time t=1.2 s, 1 iters, t-(init.)=1.11 s t(norm)=0.111429, mflops=44.8715 (err=1.0e-14) FFTW_ESTIMATE plan: (cost = 5.976883e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.57 s, 1 iters, t-(init.)=1.48 s t(norm)=0.148572, mflops=33.6536 (err=1.0e-14) 16. Frigo-old: elapsed time t=2.13 s, 1 iters, t-(init.)=2.05 s t(norm)=0.205793, mflops=24.2963 (err=1.0e-14) 17. Green: elapsed time t=2.17 s, 1 iters, t-(init.)=2.09 s t(norm)=0.209808, mflops=23.8313 (err=1.0e-14) 18. GSL: elapsed time t=3.13 s, 1 iters, t-(init.)=3.04 s t(norm)=0.305176, mflops=16.384 (err=1.0e-14) 19. GSL DIT: elapsed time t=4.52 s, 1 iters, t-(init.)=4.43 s t(norm)=0.444713, mflops=11.2432 (err=1.0e-14) 20. GSL DIF: elapsed time t=4.36 s, 1 iters, t-(init.)=4.27 s t(norm)=0.428652, mflops=11.6645 (err=1.0e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=3.02 s, 1 iters, t-(init.)=2.93 s t(norm)=0.294133, mflops=16.9991 (err=1.0e-14) 23. Mayer (simple): elapsed time t=2.82 s, 1 iters, t-(init.)=2.73 s t(norm)=0.274056, mflops=18.2445 24. Mayer (lookup): elapsed time t=2.87 s, 1 iters, t-(init.)=2.79 s t(norm)=0.280079, mflops=17.8521 (err=1.0e-14) 25. Monro: elapsed time t=3.82 s, 1 iters, t-(init.)=3.73 s t(norm)=0.374443, mflops=13.3532 (err=1.9e-14) 26. NAPACK (f2c): elapsed time t=5.05 s, 1 iters, t-(init.)=4.96 s t(norm)=0.497918, mflops=10.0418 (err=8.0e-12) 27. Nielsen: elapsed time t=4.64 s, 1 iters, t-(init.)=4.55 s t(norm)=0.45676, mflops=10.9467 (err=4.5e-12) 28. NR (C): elapsed time t=4.15 s, 1 iters, t-(init.)=4.06 s t(norm)=0.40757, mflops=12.2678 (err=1.0e-14) 29. NR (F): elapsed time t=4.27 s, 1 iters, t-(init.)=4.18 s t(norm)=0.419617, mflops=11.9156 (err=1.0e-14) 30. Ooura (C): elapsed time t=1.32 s, 1 iters, t-(init.)=1.23 s t(norm)=0.123476, mflops=40.4938 (err=1.0e-14) 31. 
Ooura (F): elapsed time t=1.39 s, 1 iters, t-(init.)=1.3 s t(norm)=0.130503, mflops=38.3134 (err=1.0e-14) 32. QFT: elapsed time t=3.45 s, 1 iters, t-(init.)=3.36 s t(norm)=0.3373, mflops=14.8236 (err=1.9e-14) 33. Ransom: elapsed time t=2.34 s, 1 iters, t-(init.)=2.25 s t(norm)=0.22587, mflops=22.1366 (err=1.0e-14) 34. SCIPORT: elapsed time t=7.33 s, 1 iters, t-(init.)=7.24 s t(norm)=0.7268, mflops=6.87947 (err=3.0e-07) 35. Singleton: elapsed time t=4.31 s, 1 iters, t-(init.)=4.22 s t(norm)=0.423632, mflops=11.8027 (err=1.6e-14) 36. Singleton (f2c): elapsed time t=3.26 s, 1 iters, t-(init.)=3.17 s t(norm)=0.318226, mflops=15.7121 (err=1.6e-14) 37. Sorensen: elapsed time t=2.94 s, 1 iters, t-(init.)=2.85 s t(norm)=0.286102, mflops=17.4763 (err=1.0e-14) 38. Sorensen DIT: elapsed time t=5.11 s, 1 iters, t-(init.)=5.02 s t(norm)=0.503942, mflops=9.92178 (err=1.0e-14) 39. Temperton: elapsed time t=4.12 s, 1 iters, t-(init.)=4.03 s t(norm)=0.404559, mflops=12.3591 (err=1.0e-14) 40. Temperton (f2c): elapsed time t=3.36 s, 1 iters, t-(init.)=3.27 s t(norm)=0.328265, mflops=15.2316 (err=1.0e-14) 41. Valkenburg: elapsed time t=12.55 s, 1 iters, t-(init.)=12.47 s t(norm)=1.25182, mflops=3.99417 (err=1.0e-14) Top mflops for N=524288 = 44.8715 Normalized results and averages for N=524288: fft 0: mflops = 15.0022 (norm. = 0.334337), norm. avg. (of 19) = 0.327037 fft 1: mflops = 15.139 (norm. = 0.337386), norm. avg. (of 19) = 0.333399 fft 2: mflops = 11.6101 (norm. = 0.258741), norm. avg. (of 19) = 0.261245 fft 3: mflops = 12.6095 (norm. = 0.281013), norm. avg. (of 19) = 0.153888 fft 4: mflops = 14.2307 (norm. = 0.317143), norm. avg. (of 19) = 0.263381 fft 5: mflops = 6.18725 (norm. = 0.137888), norm. avg. (of 19) = 0.0832124 fft 6: mflops = 14.1097 (norm. = 0.314448), norm. avg. (of 19) = 0.205798 fft 7: mflops = 13.721 (norm. = 0.305785), norm. avg. (of 19) = 0.191506 fft 8: mflops = 10.8513 (norm. = 0.24183), norm. avg. (of 19) = 0.199977 fft 9: mflops = 36.3557 (norm. 
= 0.810219), norm. avg. (of 19) = 0.688413 fft 10: mflops = 35.8326 (norm. = 0.798561), norm. avg. (of 19) = 0.687468 fft 11: mflops = 10.4418 (norm. = 0.232704), norm. avg. (of 18) = 0.193753 fft 12: mflops = 22.6397 (norm. = 0.504545), norm. avg. (of 19) = 0.376773 fft 13: mflops = 24.7798 (norm. = 0.552239), norm. avg. (of 19) = 0.425213 fft 14: mflops = 44.8715 (norm. = 1), norm. avg. (of 19) = 0.799971 fft 15: mflops = 33.6536 (norm. = 0.75), norm. avg. (of 19) = 0.621021 fft 16: mflops = 24.2963 (norm. = 0.541463), norm. avg. (of 19) = 0.585155 fft 17: mflops = 23.8313 (norm. = 0.5311), norm. avg. (of 17) = 0.497323 fft 18: mflops = 16.384 (norm. = 0.365132), norm. avg. (of 19) = 0.216204 fft 19: mflops = 11.2432 (norm. = 0.250564), norm. avg. (of 19) = 0.248188 fft 20: mflops = 11.6645 (norm. = 0.259953), norm. avg. (of 19) = 0.230578 fft 21: mflops = -1 (norm. = -0.0222859), norm. avg. (of 12) = 0.559204 fft 22: mflops = 16.9991 (norm. = 0.37884), norm. avg. (of 18) = 0.31897 fft 23: mflops = 18.2445 (norm. = 0.406593), norm. avg. (of 18) = 0.368506 fft 24: mflops = 17.8521 (norm. = 0.397849), norm. avg. (of 18) = 0.385308 fft 25: mflops = 13.3532 (norm. = 0.297587), norm. avg. (of 18) = 0.232359 fft 26: mflops = 10.0418 (norm. = 0.22379), norm. avg. (of 19) = 0.13279 fft 27: mflops = 10.9467 (norm. = 0.243956), norm. avg. (of 19) = 0.154991 fft 28: mflops = 12.2678 (norm. = 0.273399), norm. avg. (of 19) = 0.29266 fft 29: mflops = 11.9156 (norm. = 0.26555), norm. avg. (of 19) = 0.255104 fft 30: mflops = 40.4938 (norm. = 0.902439), norm. avg. (of 19) = 0.831465 fft 31: mflops = 38.3134 (norm. = 0.853846), norm. avg. (of 19) = 0.718284 fft 32: mflops = 14.8236 (norm. = 0.330357), norm. avg. (of 16) = 0.406015 fft 33: mflops = 22.1366 (norm. = 0.493333), norm. avg. (of 18) = 0.316447 fft 34: mflops = 6.87947 (norm. = 0.153315), norm. avg. (of 18) = 0.120173 fft 35: mflops = 11.8027 (norm. = 0.263033), norm. avg. 
(of 19) = 0.180589 fft 36: mflops = 15.7121 (norm. = 0.350158), norm. avg. (of 19) = 0.274502 fft 37: mflops = 17.4763 (norm. = 0.389474), norm. avg. (of 19) = 0.425201 fft 38: mflops = 9.92178 (norm. = 0.221116), norm. avg. (of 19) = 0.170321 fft 39: mflops = 12.3591 (norm. = 0.275434), norm. avg. (of 19) = 0.182931 fft 40: mflops = 15.2316 (norm. = 0.33945), norm. avg. (of 19) = 0.247895 fft 41: mflops = 3.99417 (norm. = 0.0890136), norm. avg. (of 19) = 0.0605796 Benchmarking for array size = 1048576 (power of 2): 0. Arndt DIF: elapsed time t=7.36 s, 1 iters, t-(init.)=7.18 s t(norm)=0.342369, mflops=14.6041 (err=4.5e-14) 1. Arndt DIT: elapsed time t=7.17 s, 1 iters, t-(init.)=6.99 s t(norm)=0.333309, mflops=15.0011 (err=4.5e-14) 2. Arndt Split-Radix: elapsed time t=9.35 s, 1 iters, t-(init.)=9.17 s t(norm)=0.43726, mflops=11.4349 (err=4.5e-14) 3. Arndt 4-step: elapsed time t=7.14 s, 1 iters, t-(init.)=6.96 s t(norm)=0.331879, mflops=15.0657 (err=4.5e-14) 4. Bailey: elapsed time t=7.75 s, 1 iters, t-(init.)=7.57 s t(norm)=0.360966, mflops=13.8517 (err=4.5e-14) 5. Beauregard: elapsed time t=17.14 s, 1 iters, t-(init.)=16.96 s t(norm)=0.808716, mflops=6.18264 (err=4.7e-14) 6. Bergland: elapsed time t=7.52 s, 1 iters, t-(init.)=7.34 s t(norm)=0.349998, mflops=14.2858 (err=4.7e-14) 7. Brenner: elapsed time t=7.66 s, 1 iters, t-(init.)=7.48 s t(norm)=0.356674, mflops=14.0184 (err=4.8e-14) 8. Burrus: elapsed time t=10.05 s, 1 iters, t-(init.)=9.87 s t(norm)=0.470638, mflops=10.6239 (err=4.5e-14) 9. Skipping fft (this transform size is too big for CWP). 10. Skipping fft (this transform size is too big for CWP). 11. Edelblute: elapsed time t=10.17 s, 1 iters, t-(init.)=9.99 s t(norm)=0.47636, mflops=10.4963 (err=4.5e-14) 12. FFTPACK: elapsed time t=4.71 s, 1 iters, t-(init.)=4.53 s t(norm)=0.216007, mflops=23.1474 (err=4.7e-14) 13. 
FFTPACK (f2c): elapsed time t=4.33 s, 1 iters, t-(init.)=4.15 s t(norm)=0.197887, mflops=25.2669 (err=4.7e-14) FFTW_MEASURE plan: (cost = 2.730000e+00) FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_NOTW 16 14. FFTW: elapsed time t=2.57 s, 1 iters, t-(init.)=2.39 s t(norm)=0.113964, mflops=43.8735 (err=4.7e-14) FFTW_ESTIMATE plan: (cost = 1.195377e+07) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=3.36 s, 1 iters, t-(init.)=3.18 s t(norm)=0.151634, mflops=32.9741 (err=4.8e-14) 16. Frigo-old: elapsed time t=5.01 s, 1 iters, t-(init.)=4.83 s t(norm)=0.230312, mflops=21.7096 (err=4.8e-14) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=6.55 s, 1 iters, t-(init.)=6.36 s t(norm)=0.303268, mflops=16.487 (err=4.7e-14) 19. GSL DIT: elapsed time t=9.66 s, 1 iters, t-(init.)=9.48 s t(norm)=0.452042, mflops=11.0609 (err=4.7e-14) 20. GSL DIF: elapsed time t=9.4 s, 1 iters, t-(init.)=9.22 s t(norm)=0.439644, mflops=11.3728 (err=4.7e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 23. Mayer (simple): elapsed time t=6.16 s, 1 iters, t-(init.)=5.98 s t(norm)=0.285149, mflops=17.5347 24. Mayer (lookup): elapsed time t=6.27 s, 1 iters, t-(init.)=6.09 s t(norm)=0.290394, mflops=17.218 (err=4.5e-14) 25. Monro: elapsed time t=8.13 s, 1 iters, t-(init.)=7.95 s t(norm)=0.379086, mflops=13.1896 (err=4.9e-14) 26. NAPACK (f2c): elapsed time t=10.38 s, 1 iters, t-(init.)=10.2 s t(norm)=0.486374, mflops=10.2802 (err=1.5e-11) 27. Nielsen: elapsed time t=9.51 s, 1 iters, t-(init.)=9.33 s t(norm)=0.444889, mflops=11.2388 (err=8.2e-12) 28. NR (C): elapsed time t=8.83 s, 1 iters, t-(init.)=8.65 s t(norm)=0.412464, mflops=12.1223 (err=4.7e-14) 29. NR (F): elapsed time t=9.2 s, 1 iters, t-(init.)=9.02 s t(norm)=0.430107, mflops=11.625 (err=4.7e-14) 30. 
Ooura (C): elapsed time t=2.71 s, 1 iters, t-(init.)=2.53 s t(norm)=0.12064, mflops=41.4457 (err=4.7e-14) 31. Ooura (F): elapsed time t=2.88 s, 1 iters, t-(init.)=2.7 s t(norm)=0.128746, mflops=38.8361 (err=4.7e-14) 32. QFT: elapsed time t=7.39 s, 1 iters, t-(init.)=7.21 s t(norm)=0.3438, mflops=14.5434 (err=5.4e-14) 33. Ransom: elapsed time t=4.35 s, 1 iters, t-(init.)=4.17 s t(norm)=0.198841, mflops=25.1457 (err=5.0e-14) 34. SCIPORT: elapsed time t=15.64 s, 1 iters, t-(init.)=15.46 s t(norm)=0.73719, mflops=6.78251 (err=3.2e-07) 35. Singleton: elapsed time t=9.12 s, 1 iters, t-(init.)=8.94 s t(norm)=0.426292, mflops=11.729 (err=6.3e-14) 36. Singleton (f2c): elapsed time t=6.62 s, 1 iters, t-(init.)=6.44 s t(norm)=0.307083, mflops=16.2822 (err=6.3e-14) 37. Sorensen: elapsed time t=6.17 s, 1 iters, t-(init.)=5.99 s t(norm)=0.285625, mflops=17.5054 (err=4.5e-14) 38. Sorensen DIT: elapsed time t=10.75 s, 1 iters, t-(init.)=10.57 s t(norm)=0.504017, mflops=9.9203 (err=4.5e-14) 39. Temperton: elapsed time t=8.39 s, 1 iters, t-(init.)=8.21 s t(norm)=0.391483, mflops=12.7719 (err=4.7e-14) 40. Temperton (f2c): elapsed time t=6.89 s, 1 iters, t-(init.)=6.71 s t(norm)=0.319958, mflops=15.6271 (err=4.7e-14) 41. Valkenburg: elapsed time t=26.65 s, 1 iters, t-(init.)=26.47 s t(norm)=1.26219, mflops=3.96138 (err=4.7e-14) Top mflops for N=1048576 = 43.8735 Normalized results and averages for N=1048576: fft 0: mflops = 14.6041 (norm. = 0.332869), norm. avg. (of 20) = 0.327328 fft 1: mflops = 15.0011 (norm. = 0.341917), norm. avg. (of 20) = 0.333825 fft 2: mflops = 11.4349 (norm. = 0.260632), norm. avg. (of 20) = 0.261215 fft 3: mflops = 15.0657 (norm. = 0.343391), norm. avg. (of 20) = 0.163363 fft 4: mflops = 13.8517 (norm. = 0.31572), norm. avg. (of 20) = 0.265998 fft 5: mflops = 6.18264 (norm. = 0.14092), norm. avg. (of 20) = 0.0860977 fft 6: mflops = 14.2858 (norm. = 0.325613), norm. avg. (of 20) = 0.211789 fft 7: mflops = 14.0184 (norm. = 0.319519), norm. avg. 
(of 20) = 0.197907 fft 8: mflops = 10.6239 (norm. = 0.242148), norm. avg. (of 20) = 0.202086 fft 9: mflops = -1 (norm. = -0.0227928), norm. avg. (of 19) = 0.688413 fft 10: mflops = -1 (norm. = -0.0227928), norm. avg. (of 19) = 0.687468 fft 11: mflops = 10.4963 (norm. = 0.239239), norm. avg. (of 19) = 0.196147 fft 12: mflops = 23.1474 (norm. = 0.527594), norm. avg. (of 20) = 0.384314 fft 13: mflops = 25.2669 (norm. = 0.575904), norm. avg. (of 20) = 0.432748 fft 14: mflops = 43.8735 (norm. = 1), norm. avg. (of 20) = 0.809972 fft 15: mflops = 32.9741 (norm. = 0.751572), norm. avg. (of 20) = 0.627549 fft 16: mflops = 21.7096 (norm. = 0.494824), norm. avg. (of 20) = 0.580639 fft 17: mflops = -1 (norm. = -0.0227928), norm. avg. (of 17) = 0.497323 fft 18: mflops = 16.487 (norm. = 0.375786), norm. avg. (of 20) = 0.224183 fft 19: mflops = 11.0609 (norm. = 0.25211), norm. avg. (of 20) = 0.248384 fft 20: mflops = 11.3728 (norm. = 0.259219), norm. avg. (of 20) = 0.23201 fft 21: mflops = -1 (norm. = -0.0227928), norm. avg. (of 12) = 0.559204 fft 22: mflops = -1 (norm. = -0.0227928), norm. avg. (of 18) = 0.31897 fft 23: mflops = 17.5347 (norm. = 0.399666), norm. avg. (of 19) = 0.370146 fft 24: mflops = 17.218 (norm. = 0.392447), norm. avg. (of 19) = 0.385684 fft 25: mflops = 13.1896 (norm. = 0.300629), norm. avg. (of 19) = 0.235952 fft 26: mflops = 10.2802 (norm. = 0.234314), norm. avg. (of 20) = 0.137867 fft 27: mflops = 11.2388 (norm. = 0.256163), norm. avg. (of 20) = 0.160049 fft 28: mflops = 12.1223 (norm. = 0.276301), norm. avg. (of 20) = 0.291842 fft 29: mflops = 11.625 (norm. = 0.264967), norm. avg. (of 20) = 0.255597 fft 30: mflops = 41.4457 (norm. = 0.944664), norm. avg. (of 20) = 0.837125 fft 31: mflops = 38.8361 (norm. = 0.885185), norm. avg. (of 20) = 0.726629 fft 32: mflops = 14.5434 (norm. = 0.331484), norm. avg. (of 17) = 0.401631 fft 33: mflops = 25.1457 (norm. = 0.573141), norm. avg. (of 19) = 0.329957 fft 34: mflops = 6.78251 (norm. = 0.154592), norm. avg. 
(of 19) = 0.121985 fft 35: mflops = 11.729 (norm. = 0.267338), norm. avg. (of 20) = 0.184926 fft 36: mflops = 16.2822 (norm. = 0.371118), norm. avg. (of 20) = 0.279333 fft 37: mflops = 17.5054 (norm. = 0.398998), norm. avg. (of 20) = 0.423891 fft 38: mflops = 9.9203 (norm. = 0.226112), norm. avg. (of 20) = 0.173111 fft 39: mflops = 12.7719 (norm. = 0.291108), norm. avg. (of 20) = 0.18834 fft 40: mflops = 15.6271 (norm. = 0.356185), norm. avg. (of 20) = 0.253309 fft 41: mflops = 3.96138 (norm. = 0.0902909), norm. avg. (of 20) = 0.0620651 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. Brenner 1. CWP (min N) 2. CWP (best N) 3. FFTPACK 4. FFTPACK (f2c) 5. FFTW 6. FFTW_ESTIMATE 7. Frigo-old 8. GSL 9. Nielsen 10. Singleton 11. Singleton (f2c) 12. Temperton 13. Temperton (f2c) 14. Valkenburg Computing normalized averages (15 transforms). Benchmarking for array size = 6: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.19 s, 262144 iters, t-(init.)=1.11 s t(norm)=0.273009, mflops=18.3144 2. CWP (best N) (N=15): elapsed time t=1.05 s, 131072 iters, t-(init.)=0.99 s t(norm)=0.48699, mflops=10.2672 3. FFTPACK: elapsed time t=1.66 s, 524288 iters, t-(init.)=1.51 s t(norm)=0.185696, mflops=26.9258 (err=1.2e-16) 4. FFTPACK (f2c): elapsed time t=1.75 s, 524288 iters, t-(init.)=1.61 s t(norm)=0.197993, mflops=25.2534 (err=1.5e-16) FFTW_MEASURE plan: (cost = 1.296997e-06) FFTW_NOTW 6 5. 
FFTW: elapsed time t=1.45 s, 1048576 iters, t-(init.)=1.16 s t(norm)=0.0713268, mflops=70.0999 (err=8.7e-17) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 6. FFTW_ESTIMATE: elapsed time t=1.43 s, 1048576 iters, t-(init.)=1.14 s t(norm)=0.070097, mflops=71.3297 (err=8.7e-17) 7. Frigo-old: elapsed time t=1.28 s, 262144 iters, t-(init.)=1.2 s t(norm)=0.295145, mflops=16.9408 (err=2.6e-16) 8. GSL: elapsed time t=1.06 s, 262144 iters, t-(init.)=0.99 s t(norm)=0.243495, mflops=20.5343 (err=8.2e-17) 9. Nielsen: elapsed time t=1.03 s, 65536 iters, t-(init.)=1.01 s t(norm)=0.993656, mflops=5.03192 (err=5.7e-16) 10. Singleton: elapsed time t=1.13 s, 131072 iters, t-(init.)=1.1 s t(norm)=0.5411, mflops=9.24044 (err=1.2e-16) 11. Singleton (f2c): elapsed time t=1.98 s, 262144 iters, t-(init.)=1.91 s t(norm)=0.469773, mflops=10.6434 (err=1.2e-16) 12. Temperton: elapsed time t=1.86 s, 262144 iters, t-(init.)=1.79 s t(norm)=0.440258, mflops=11.357 (err=1.2e-16) 13. Temperton (f2c): elapsed time t=1.97 s, 262144 iters, t-(init.)=1.89 s t(norm)=0.464854, mflops=10.7561 (err=2.1e-16) 14. Valkenburg: elapsed time t=1.07 s, 65536 iters, t-(init.)=1.06 s t(norm)=1.04285, mflops=4.79457 (err=2.4e-16) Top mflops for N=6 = 71.3297 Normalized results and averages for N=6: fft 0: mflops = -1 (norm. = -0.0140194), norm. avg. (of 0) = -1 fft 1: mflops = 18.3144 (norm. = 0.256757), norm. avg. (of 1) = 0.256757 fft 2: mflops = 10.2672 (norm. = 0.143939), norm. avg. (of 1) = 0.143939 fft 3: mflops = 26.9258 (norm. = 0.377483), norm. avg. (of 1) = 0.377483 fft 4: mflops = 25.2534 (norm. = 0.354037), norm. avg. (of 1) = 0.354037 fft 5: mflops = 70.0999 (norm. = 0.982759), norm. avg. (of 1) = 0.982759 fft 6: mflops = 71.3297 (norm. = 1), norm. avg. (of 1) = 1 fft 7: mflops = 16.9408 (norm. = 0.2375), norm. avg. (of 1) = 0.2375 fft 8: mflops = 20.5343 (norm. = 0.287879), norm. avg. (of 1) = 0.287879 fft 9: mflops = 5.03192 (norm. = 0.0705446), norm. avg. 
(of 1) = 0.0705446 fft 10: mflops = 9.24044 (norm. = 0.129545), norm. avg. (of 1) = 0.129545 fft 11: mflops = 10.6434 (norm. = 0.149215), norm. avg. (of 1) = 0.149215 fft 12: mflops = 11.357 (norm. = 0.159218), norm. avg. (of 1) = 0.159218 fft 13: mflops = 10.7561 (norm. = 0.150794), norm. avg. (of 1) = 0.150794 fft 14: mflops = 4.79457 (norm. = 0.067217), norm. avg. (of 1) = 0.067217 Benchmarking for array size = 9: 0. Brenner: elapsed time t=1.08 s, 32768 iters, t-(init.)=1.07 s t(norm)=1.14457, mflops=4.36845 (err=4.8e-16) 1. CWP (min N): elapsed time t=1.35 s, 262144 iters, t-(init.)=1.26 s t(norm)=0.168476, mflops=29.6777 2. CWP (best N) (N=15): elapsed time t=1.03 s, 131072 iters, t-(init.)=0.96 s t(norm)=0.256726, mflops=19.476 3. FFTPACK: elapsed time t=1.34 s, 262144 iters, t-(init.)=1.24 s t(norm)=0.165802, mflops=30.1564 (err=1.1e-16) 4. FFTPACK (f2c): elapsed time t=1.23 s, 262144 iters, t-(init.)=1.13 s t(norm)=0.151094, mflops=33.092 (err=2.5e-16) FFTW_MEASURE plan: (cost = 3.509521e-06) FFTW_NOTW 9 5. FFTW: elapsed time t=1.85 s, 524288 iters, t-(init.)=1.66 s t(norm)=0.11098, mflops=45.053 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.86 s, 524288 iters, t-(init.)=1.68 s t(norm)=0.112318, mflops=44.5166 (err=2.1e-16) 7. Frigo-old: elapsed time t=1.43 s, 131072 iters, t-(init.)=1.38 s t(norm)=0.369044, mflops=13.5485 (err=3.1e-16) 8. GSL: elapsed time t=1.04 s, 131072 iters, t-(init.)=0.99 s t(norm)=0.264749, mflops=18.8858 (err=1.7e-16) 9. Nielsen: elapsed time t=1.29 s, 65536 iters, t-(init.)=1.27 s t(norm)=0.679254, mflops=7.36102 (err=9.6e-16) 10. Singleton: elapsed time t=1.44 s, 131072 iters, t-(init.)=1.39 s t(norm)=0.371718, mflops=13.4511 (err=1.5e-16) 11. Singleton (f2c): elapsed time t=1.19 s, 131072 iters, t-(init.)=1.14 s t(norm)=0.304862, mflops=16.4009 (err=1.5e-16) 12. 
Temperton: elapsed time t=1.3 s, 131072 iters, t-(init.)=1.25 s t(norm)=0.334279, mflops=14.9576 (err=1.5e-16) 13. Temperton (f2c): elapsed time t=1.38 s, 131072 iters, t-(init.)=1.34 s t(norm)=0.358347, mflops=13.953 (err=1.3e-16) 14. Valkenburg: elapsed time t=1.94 s, 65536 iters, t-(init.)=1.91 s t(norm)=1.02156, mflops=4.8945 (err=4.0e-16) Top mflops for N=9 = 45.053 Normalized results and averages for N=9: fft 0: mflops = 4.36845 (norm. = 0.0969626), norm. avg. (of 1) = 0.0969626 fft 1: mflops = 29.6777 (norm. = 0.65873), norm. avg. (of 2) = 0.457743 fft 2: mflops = 19.476 (norm. = 0.432292), norm. avg. (of 2) = 0.288116 fft 3: mflops = 30.1564 (norm. = 0.669355), norm. avg. (of 2) = 0.523419 fft 4: mflops = 33.092 (norm. = 0.734513), norm. avg. (of 2) = 0.544275 fft 5: mflops = 45.053 (norm. = 1), norm. avg. (of 2) = 0.991379 fft 6: mflops = 44.5166 (norm. = 0.988095), norm. avg. (of 2) = 0.994048 fft 7: mflops = 13.5485 (norm. = 0.300725), norm. avg. (of 2) = 0.269112 fft 8: mflops = 18.8858 (norm. = 0.419192), norm. avg. (of 2) = 0.353535 fft 9: mflops = 7.36102 (norm. = 0.163386), norm. avg. (of 2) = 0.116965 fft 10: mflops = 13.4511 (norm. = 0.298561), norm. avg. (of 2) = 0.214053 fft 11: mflops = 16.4009 (norm. = 0.364035), norm. avg. (of 2) = 0.256625 fft 12: mflops = 14.9576 (norm. = 0.332), norm. avg. (of 2) = 0.245609 fft 13: mflops = 13.953 (norm. = 0.309701), norm. avg. (of 2) = 0.230248 fft 14: mflops = 4.8945 (norm. = 0.108639), norm. avg. (of 2) = 0.0879279 Benchmarking for array size = 12: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.66 s, 262144 iters, t-(init.)=1.54 s t(norm)=0.136557, mflops=36.6147 2. CWP (best N) (N=15): elapsed time t=1.03 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.172027, mflops=29.0652 3. FFTPACK: elapsed time t=1.54 s, 262144 iters, t-(init.)=1.43 s t(norm)=0.126803, mflops=39.4312 (err=2.0e-16) 4. 
FFTPACK (f2c): elapsed time t=1.43 s, 262144 iters, t-(init.)=1.32 s t(norm)=0.117049, mflops=42.7171 (err=2.6e-16) FFTW_MEASURE plan: (cost = 2.822876e-06) FFTW_NOTW 12 5. FFTW: elapsed time t=1.51 s, 524288 iters, t-(init.)=1.28 s t(norm)=0.0567511, mflops=88.104 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.49 s, 524288 iters, t-(init.)=1.26 s t(norm)=0.0558644, mflops=89.5025 (err=1.4e-16) 7. Frigo-old: elapsed time t=1.27 s, 131072 iters, t-(init.)=1.21 s t(norm)=0.21459, mflops=23.3002 (err=2.9e-16) 8. GSL: elapsed time t=1.91 s, 262144 iters, t-(init.)=1.8 s t(norm)=0.159612, mflops=31.3259 (err=3.2e-16) 9. Nielsen: elapsed time t=1.56 s, 65536 iters, t-(init.)=1.53 s t(norm)=0.542682, mflops=9.21349 (err=6.0e-16) 10. Singleton: elapsed time t=1.85 s, 131072 iters, t-(init.)=1.8 s t(norm)=0.319225, mflops=15.6629 (err=2.5e-16) 11. Singleton (f2c): elapsed time t=1.59 s, 131072 iters, t-(init.)=1.53 s t(norm)=0.271341, mflops=18.427 (err=2.5e-16) 12. Temperton: elapsed time t=1.7 s, 131072 iters, t-(init.)=1.65 s t(norm)=0.292623, mflops=17.0868 (err=1.7e-16) 13. Temperton (f2c): elapsed time t=1.51 s, 131072 iters, t-(init.)=1.45 s t(norm)=0.257153, mflops=19.4436 (err=1.7e-16) 14. Valkenburg: elapsed time t=1.45 s, 32768 iters, t-(init.)=1.43 s t(norm)=1.01443, mflops=4.9289 (err=4.2e-16) Top mflops for N=12 = 89.5025 Normalized results and averages for N=12: fft 0: mflops = -1 (norm. = -0.0111729), norm. avg. (of 1) = 0.0969626 fft 1: mflops = 36.6147 (norm. = 0.409091), norm. avg. (of 3) = 0.441526 fft 2: mflops = 29.0652 (norm. = 0.324742), norm. avg. (of 3) = 0.300324 fft 3: mflops = 39.4312 (norm. = 0.440559), norm. avg. (of 3) = 0.495799 fft 4: mflops = 42.7171 (norm. = 0.477273), norm. avg. (of 3) = 0.521941 fft 5: mflops = 88.104 (norm. = 0.984375), norm. avg. (of 3) = 0.989045 fft 6: mflops = 89.5025 (norm. = 1), norm. avg. (of 3) = 0.996032 fft 7: mflops = 23.3002 (norm. = 0.260331), norm. 
avg. (of 3) = 0.266185 fft 8: mflops = 31.3259 (norm. = 0.35), norm. avg. (of 3) = 0.352357 fft 9: mflops = 9.21349 (norm. = 0.102941), norm. avg. (of 3) = 0.112291 fft 10: mflops = 15.6629 (norm. = 0.175), norm. avg. (of 3) = 0.201036 fft 11: mflops = 18.427 (norm. = 0.205882), norm. avg. (of 3) = 0.239711 fft 12: mflops = 17.0868 (norm. = 0.190909), norm. avg. (of 3) = 0.227376 fft 13: mflops = 19.4436 (norm. = 0.217241), norm. avg. (of 3) = 0.225912 fft 14: mflops = 4.9289 (norm. = 0.0550699), norm. avg. (of 3) = 0.0769752 Benchmarking for array size = 15: 0. Brenner: elapsed time t=1.66 s, 32768 iters, t-(init.)=1.64 s t(norm)=0.854027, mflops=5.85462 (err=4.1e-16) 1. CWP (min N): elapsed time t=1.03 s, 131072 iters, t-(init.)=0.96 s t(norm)=0.12498, mflops=40.0066 2. CWP (best N): elapsed time t=1.07 s, 131072 iters, t-(init.)=1 s t(norm)=0.130187, mflops=38.4063 3. FFTPACK: elapsed time t=1.31 s, 131072 iters, t-(init.)=1.24 s t(norm)=0.161432, mflops=30.9728 (err=1.4e-16) 4. FFTPACK (f2c): elapsed time t=1.9 s, 262144 iters, t-(init.)=1.76 s t(norm)=0.114565, mflops=43.6435 (err=3.0e-16) FFTW_MEASURE plan: (cost = 7.019043e-06) FFTW_NOTW 15 5. FFTW: elapsed time t=1.81 s, 262144 iters, t-(init.)=1.67 s t(norm)=0.108706, mflops=45.9956 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.81 s, 262144 iters, t-(init.)=1.68 s t(norm)=0.109357, mflops=45.7218 (err=1.9e-16) 7. Frigo-old: elapsed time t=1.31 s, 65536 iters, t-(init.)=1.27 s t(norm)=0.330675, mflops=15.1206 (err=2.6e-16) 8. GSL: elapsed time t=1.45 s, 65536 iters, t-(init.)=1.42 s t(norm)=0.369731, mflops=13.5233 (err=1.4e-16) 9. Nielsen: elapsed time t=1.89 s, 65536 iters, t-(init.)=1.85 s t(norm)=0.481692, mflops=10.3801 (err=4.3e-15) 10. Singleton: elapsed time t=1.68 s, 65536 iters, t-(init.)=1.64 s t(norm)=0.427013, mflops=11.7092 (err=2.2e-16) 11. 
Singleton (f2c): elapsed time t=1.27 s, 65536 iters, t-(init.)=1.24 s t(norm)=0.322864, mflops=15.4864 (err=2.2e-16) 12. Temperton: elapsed time t=1.51 s, 131072 iters, t-(init.)=1.44 s t(norm)=0.187469, mflops=26.671 (err=1.8e-16) 13. Temperton (f2c): elapsed time t=1.71 s, 131072 iters, t-(init.)=1.65 s t(norm)=0.214809, mflops=23.2765 (err=1.8e-16) 14. Valkenburg: elapsed time t=1.11 s, 16384 iters, t-(init.)=1.11 s t(norm)=1.15606, mflops=4.32503 (err=2.2e-16) Top mflops for N=15 = 45.9956 Normalized results and averages for N=15: fft 0: mflops = 5.85462 (norm. = 0.127287), norm. avg. (of 2) = 0.112125 fft 1: mflops = 40.0066 (norm. = 0.869792), norm. avg. (of 4) = 0.548592 fft 2: mflops = 38.4063 (norm. = 0.835), norm. avg. (of 4) = 0.433993 fft 3: mflops = 30.9728 (norm. = 0.673387), norm. avg. (of 4) = 0.540196 fft 4: mflops = 43.6435 (norm. = 0.948864), norm. avg. (of 4) = 0.628672 fft 5: mflops = 45.9956 (norm. = 1), norm. avg. (of 4) = 0.991783 fft 6: mflops = 45.7218 (norm. = 0.994048), norm. avg. (of 4) = 0.995536 fft 7: mflops = 15.1206 (norm. = 0.32874), norm. avg. (of 4) = 0.281824 fft 8: mflops = 13.5233 (norm. = 0.294014), norm. avg. (of 4) = 0.337771 fft 9: mflops = 10.3801 (norm. = 0.225676), norm. avg. (of 4) = 0.140637 fft 10: mflops = 11.7092 (norm. = 0.254573), norm. avg. (of 4) = 0.21442 fft 11: mflops = 15.4864 (norm. = 0.336694), norm. avg. (of 4) = 0.263956 fft 12: mflops = 26.671 (norm. = 0.579861), norm. avg. (of 4) = 0.315497 fft 13: mflops = 23.2765 (norm. = 0.506061), norm. avg. (of 4) = 0.295949 fft 14: mflops = 4.32503 (norm. = 0.0940315), norm. avg. (of 4) = 0.0812393 Benchmarking for array size = 18: 0. Brenner: elapsed time t=1.14 s, 16384 iters, t-(init.)=1.13 s t(norm)=0.918878, mflops=5.44142 (err=4.1e-16) 1. CWP (min N): elapsed time t=1.24 s, 131072 iters, t-(init.)=1.16 s t(norm)=0.117909, mflops=42.4055 2. CWP (best N) (N=28): elapsed time t=1.5 s, 131072 iters, t-(init.)=1.39 s t(norm)=0.141288, mflops=35.3888 3. 
FFTPACK: elapsed time t=1.62 s, 131072 iters, t-(init.)=1.54 s t(norm)=0.156534, mflops=31.9418 (err=2.5e-16) 4. FFTPACK (f2c): elapsed time t=1.61 s, 131072 iters, t-(init.)=1.53 s t(norm)=0.155518, mflops=32.1506 (err=2.8e-16) FFTW_MEASURE plan: (cost = 9.155273e-06) FFTW_TWIDDLE 6 FFTW_NOTW 3 5. FFTW: elapsed time t=1.23 s, 131072 iters, t-(init.)=1.16 s t(norm)=0.117909, mflops=42.4055 (err=2.2e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.67 s, 131072 iters, t-(init.)=1.6 s t(norm)=0.162633, mflops=30.744 (err=2.2e-16) 7. Frigo-old: elapsed time t=1.62 s, 65536 iters, t-(init.)=1.58 s t(norm)=0.321201, mflops=15.5666 (err=3.5e-16) 8. GSL: elapsed time t=1.54 s, 131072 iters, t-(init.)=1.46 s t(norm)=0.148403, mflops=33.6921 (err=2.1e-16) 9. Nielsen: elapsed time t=1.42 s, 32768 iters, t-(init.)=1.4 s t(norm)=0.569216, mflops=8.78401 (err=8.7e-16) 10. Singleton: elapsed time t=1.48 s, 65536 iters, t-(init.)=1.44 s t(norm)=0.29274, mflops=17.08 (err=2.1e-16) 11. Singleton (f2c): elapsed time t=1.16 s, 65536 iters, t-(init.)=1.13 s t(norm)=0.229719, mflops=21.7657 (err=2.1e-16) 12. Temperton: elapsed time t=1.27 s, 65536 iters, t-(init.)=1.23 s t(norm)=0.250049, mflops=19.9961 (err=2.2e-16) 13. Temperton (f2c): elapsed time t=1.46 s, 65536 iters, t-(init.)=1.42 s t(norm)=0.288674, mflops=17.3206 (err=2.6e-16) 14. Valkenburg: elapsed time t=1.24 s, 16384 iters, t-(init.)=1.23 s t(norm)=1.00019, mflops=4.99903 (err=4.2e-16) Top mflops for N=18 = 42.4055 Normalized results and averages for N=18: fft 0: mflops = 5.44142 (norm. = 0.128319), norm. avg. (of 3) = 0.117523 fft 1: mflops = 42.4055 (norm. = 1), norm. avg. (of 5) = 0.638874 fft 2: mflops = 35.3888 (norm. = 0.834532), norm. avg. (of 5) = 0.514101 fft 3: mflops = 31.9418 (norm. = 0.753247), norm. avg. (of 5) = 0.582806 fft 4: mflops = 32.1506 (norm. = 0.75817), norm. avg. (of 5) = 0.654571 fft 5: mflops = 42.4055 (norm. = 1), norm. avg. 
(of 5) = 0.993427 fft 6: mflops = 30.744 (norm. = 0.725), norm. avg. (of 5) = 0.941429 fft 7: mflops = 15.5666 (norm. = 0.367089), norm. avg. (of 5) = 0.298877 fft 8: mflops = 33.6921 (norm. = 0.794521), norm. avg. (of 5) = 0.429121 fft 9: mflops = 8.78401 (norm. = 0.207143), norm. avg. (of 5) = 0.153938 fft 10: mflops = 17.08 (norm. = 0.402778), norm. avg. (of 5) = 0.252092 fft 11: mflops = 21.7657 (norm. = 0.513274), norm. avg. (of 5) = 0.31382 fft 12: mflops = 19.9961 (norm. = 0.471545), norm. avg. (of 5) = 0.346707 fft 13: mflops = 17.3206 (norm. = 0.408451), norm. avg. (of 5) = 0.31845 fft 14: mflops = 4.99903 (norm. = 0.117886), norm. avg. (of 5) = 0.0885687 Benchmarking for array size = 24: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.33 s, 131072 iters, t-(init.)=1.23 s t(norm)=0.0852802, mflops=58.6303 2. CWP (best N) (N=28): elapsed time t=1.49 s, 131072 iters, t-(init.)=1.38 s t(norm)=0.0956802, mflops=52.2574 3. FFTPACK: elapsed time t=1.95 s, 131072 iters, t-(init.)=1.86 s t(norm)=0.12896, mflops=38.7716 (err=1.7e-16) 4. FFTPACK (f2c): elapsed time t=1.96 s, 131072 iters, t-(init.)=1.86 s t(norm)=0.12896, mflops=38.7716 (err=2.5e-16) FFTW_MEASURE plan: (cost = 1.068115e-05) FFTW_TWIDDLE 6 FFTW_NOTW 4 5. FFTW: elapsed time t=1.43 s, 131072 iters, t-(init.)=1.34 s t(norm)=0.0929069, mflops=53.8173 (err=2.0e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.76 s, 131072 iters, t-(init.)=1.66 s t(norm)=0.115094, mflops=43.4429 (err=2.1e-16) 7. Frigo-old: elapsed time t=1.91 s, 65536 iters, t-(init.)=1.86 s t(norm)=0.257921, mflops=19.3858 (err=3.8e-16) 8. GSL: elapsed time t=1.3 s, 65536 iters, t-(init.)=1.26 s t(norm)=0.17472, mflops=28.6172 (err=2.1e-16) 9. Nielsen: elapsed time t=1.61 s, 32768 iters, t-(init.)=1.59 s t(norm)=0.440961, mflops=11.3389 (err=1.7e-15) 10. 
Singleton: elapsed time t=1.93 s, 65536 iters, t-(init.)=1.88 s t(norm)=0.260694, mflops=19.1796 (err=2.1e-16) 11. Singleton (f2c): elapsed time t=1.59 s, 65536 iters, t-(init.)=1.54 s t(norm)=0.213547, mflops=23.414 (err=2.1e-16) 12. Temperton: elapsed time t=1.86 s, 65536 iters, t-(init.)=1.81 s t(norm)=0.250987, mflops=19.9213 (err=2.4e-16) 13. Temperton (f2c): elapsed time t=1.57 s, 65536 iters, t-(init.)=1.52 s t(norm)=0.210774, mflops=23.7221 (err=2.4e-16) 14. Valkenburg: elapsed time t=1.82 s, 16384 iters, t-(init.)=1.81 s t(norm)=1.00395, mflops=4.98033 (err=6.1e-16) Top mflops for N=24 = 58.6303 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.017056), norm. avg. (of 3) = 0.117523 fft 1: mflops = 58.6303 (norm. = 1), norm. avg. (of 6) = 0.699062 fft 2: mflops = 52.2574 (norm. = 0.891304), norm. avg. (of 6) = 0.576968 fft 3: mflops = 38.7716 (norm. = 0.66129), norm. avg. (of 6) = 0.595887 fft 4: mflops = 38.7716 (norm. = 0.66129), norm. avg. (of 6) = 0.655691 fft 5: mflops = 53.8173 (norm. = 0.91791), norm. avg. (of 6) = 0.980841 fft 6: mflops = 43.4429 (norm. = 0.740964), norm. avg. (of 6) = 0.908018 fft 7: mflops = 19.3858 (norm. = 0.330645), norm. avg. (of 6) = 0.304172 fft 8: mflops = 28.6172 (norm. = 0.488095), norm. avg. (of 6) = 0.43895 fft 9: mflops = 11.3389 (norm. = 0.193396), norm. avg. (of 6) = 0.160514 fft 10: mflops = 19.1796 (norm. = 0.327128), norm. avg. (of 6) = 0.264598 fft 11: mflops = 23.414 (norm. = 0.399351), norm. avg. (of 6) = 0.328075 fft 12: mflops = 19.9213 (norm. = 0.339779), norm. avg. (of 6) = 0.345552 fft 13: mflops = 23.7221 (norm. = 0.404605), norm. avg. (of 6) = 0.332809 fft 14: mflops = 4.98033 (norm. = 0.0849448), norm. avg. (of 6) = 0.0879647 Benchmarking for array size = 36: 0. Brenner: elapsed time t=1.19 s, 8192 iters, t-(init.)=1.18 s t(norm)=0.773936, mflops=6.46048 (err=1.4e-15) 1. CWP (min N): elapsed time t=1.04 s, 65536 iters, t-(init.)=0.97 s t(norm)=0.0795253, mflops=62.8731 2. 
CWP (best N): elapsed time t=1.06 s, 65536 iters, t-(init.)=1 s t(norm)=0.0819848, mflops=60.9869 3. FFTPACK: elapsed time t=1.63 s, 65536 iters, t-(init.)=1.56 s t(norm)=0.127896, mflops=39.0942 (err=4.3e-16) 4. FFTPACK (f2c): elapsed time t=1.53 s, 65536 iters, t-(init.)=1.46 s t(norm)=0.119698, mflops=41.7719 (err=1.1e-15) FFTW_MEASURE plan: (cost = 1.831055e-05) FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_NOTW 6 5. FFTW: elapsed time t=1.18 s, 65536 iters, t-(init.)=1.11 s t(norm)=0.0910031, mflops=54.9432 (err=4.5e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.81 s, 65536 iters, t-(init.)=1.74 s t(norm)=0.142654, mflops=35.05 (err=6.2e-16) 7. Frigo-old: elapsed time t=1.63 s, 32768 iters, t-(init.)=1.6 s t(norm)=0.262351, mflops=19.0584 (err=6.6e-16) 8. GSL: elapsed time t=1.23 s, 65536 iters, t-(init.)=1.16 s t(norm)=0.0951024, mflops=52.5749 (err=4.2e-16) 9. Nielsen: elapsed time t=1.21 s, 16384 iters, t-(init.)=1.19 s t(norm)=0.390248, mflops=12.8124 (err=1.6e-15) 10. Singleton: elapsed time t=1.82 s, 32768 iters, t-(init.)=1.78 s t(norm)=0.291866, mflops=17.1312 (err=4.2e-16) 11. Singleton (f2c): elapsed time t=1.25 s, 32768 iters, t-(init.)=1.22 s t(norm)=0.200043, mflops=24.9946 (err=4.1e-16) 12. Temperton: elapsed time t=1.37 s, 32768 iters, t-(init.)=1.33 s t(norm)=0.21808, mflops=22.9274 (err=6.9e-16) 13. Temperton (f2c): elapsed time t=1.33 s, 32768 iters, t-(init.)=1.29 s t(norm)=0.211521, mflops=23.6383 (err=3.5e-16) 14. Valkenburg: elapsed time t=1.52 s, 8192 iters, t-(init.)=1.52 s t(norm)=0.996935, mflops=5.01537 (err=8.4e-16) Top mflops for N=36 = 62.8731 Normalized results and averages for N=36: fft 0: mflops = 6.46048 (norm. = 0.102754), norm. avg. (of 4) = 0.113831 fft 1: mflops = 62.8731 (norm. = 1), norm. avg. (of 7) = 0.742053 fft 2: mflops = 60.9869 (norm. = 0.97), norm. avg. (of 7) = 0.633116 fft 3: mflops = 39.0942 (norm. = 0.621795), norm. avg. 
(of 7) = 0.599588 fft 4: mflops = 41.7719 (norm. = 0.664384), norm. avg. (of 7) = 0.656933 fft 5: mflops = 54.9432 (norm. = 0.873874), norm. avg. (of 7) = 0.96556 fft 6: mflops = 35.05 (norm. = 0.557471), norm. avg. (of 7) = 0.85794 fft 7: mflops = 19.0584 (norm. = 0.303125), norm. avg. (of 7) = 0.304022 fft 8: mflops = 52.5749 (norm. = 0.836207), norm. avg. (of 7) = 0.495701 fft 9: mflops = 12.8124 (norm. = 0.203782), norm. avg. (of 7) = 0.166695 fft 10: mflops = 17.1312 (norm. = 0.272472), norm. avg. (of 7) = 0.265722 fft 11: mflops = 24.9946 (norm. = 0.397541), norm. avg. (of 7) = 0.337999 fft 12: mflops = 22.9274 (norm. = 0.364662), norm. avg. (of 7) = 0.348282 fft 13: mflops = 23.6383 (norm. = 0.375969), norm. avg. (of 7) = 0.338975 fft 14: mflops = 5.01537 (norm. = 0.0797697), norm. avg. (of 7) = 0.086794 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.1 s, 4096 iters, t-(init.)=1.09 s t(norm)=0.526171, mflops=9.50261 (err=5.1e-16) 1. CWP (min N): elapsed time t=1.94 s, 65536 iters, t-(init.)=1.8 s t(norm)=0.0543067, mflops=92.0698 2. CWP (best N) (N=84): elapsed time t=1.16 s, 32768 iters, t-(init.)=1.09 s t(norm)=0.0657714, mflops=76.0209 3. FFTPACK: elapsed time t=1.17 s, 16384 iters, t-(init.)=1.13 s t(norm)=0.13637, mflops=36.6649 (err=4.3e-16) 4. FFTPACK (f2c): elapsed time t=1.64 s, 32768 iters, t-(init.)=1.57 s t(norm)=0.0947349, mflops=52.7788 (err=4.7e-16) FFTW_MEASURE plan: (cost = 4.882813e-05) FFTW_TWIDDLE 2 FFTW_TWIDDLE 5 FFTW_NOTW 8 5. FFTW: elapsed time t=1.57 s, 32768 iters, t-(init.)=1.5 s t(norm)=0.0905111, mflops=55.2419 (err=2.6e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 6. FFTW_ESTIMATE: elapsed time t=1.28 s, 16384 iters, t-(init.)=1.24 s t(norm)=0.149645, mflops=33.4124 (err=4.6e-16) 7. Frigo-old: elapsed time t=1.38 s, 16384 iters, t-(init.)=1.33 s t(norm)=0.160506, mflops=31.1514 (err=3.3e-16) 8. 
GSL: elapsed time t=1.3 s, 8192 iters, t-(init.)=1.28 s t(norm)=0.308945, mflops=16.1841 (err=4.1e-16) 9. Nielsen: elapsed time t=1.3 s, 8192 iters, t-(init.)=1.28 s t(norm)=0.308945, mflops=16.1841 (err=8.1e-15) 10. Singleton: elapsed time t=1.26 s, 8192 iters, t-(init.)=1.24 s t(norm)=0.29929, mflops=16.7062 (err=4.3e-16) 11. Singleton (f2c): elapsed time t=1.67 s, 16384 iters, t-(init.)=1.63 s t(norm)=0.196711, mflops=25.418 (err=3.5e-16) 12. Temperton: elapsed time t=1.64 s, 16384 iters, t-(init.)=1.6 s t(norm)=0.19309, mflops=25.8946 (err=3.2e-16) 13. Temperton (f2c): elapsed time t=1.24 s, 16384 iters, t-(init.)=1.2 s t(norm)=0.144818, mflops=34.5262 (err=4.0e-16) 14. Valkenburg: elapsed time t=1.09 s, 2048 iters, t-(init.)=1.08 s t(norm)=1.04269, mflops=4.7953 (err=5.4e-16) Top mflops for N=80 = 92.0698 Normalized results and averages for N=80: fft 0: mflops = 9.50261 (norm. = 0.103211), norm. avg. (of 5) = 0.111707 fft 1: mflops = 92.0698 (norm. = 1), norm. avg. (of 8) = 0.774296 fft 2: mflops = 76.0209 (norm. = 0.825688), norm. avg. (of 8) = 0.657187 fft 3: mflops = 36.6649 (norm. = 0.39823), norm. avg. (of 8) = 0.574418 fft 4: mflops = 52.7788 (norm. = 0.573248), norm. avg. (of 8) = 0.646472 fft 5: mflops = 55.2419 (norm. = 0.6), norm. avg. (of 8) = 0.919865 fft 6: mflops = 33.4124 (norm. = 0.362903), norm. avg. (of 8) = 0.79606 fft 7: mflops = 31.1514 (norm. = 0.338346), norm. avg. (of 8) = 0.308313 fft 8: mflops = 16.1841 (norm. = 0.175781), norm. avg. (of 8) = 0.455711 fft 9: mflops = 16.1841 (norm. = 0.175781), norm. avg. (of 8) = 0.167831 fft 10: mflops = 16.7062 (norm. = 0.181452), norm. avg. (of 8) = 0.255189 fft 11: mflops = 25.418 (norm. = 0.276074), norm. avg. (of 8) = 0.330258 fft 12: mflops = 25.8946 (norm. = 0.28125), norm. avg. (of 8) = 0.339903 fft 13: mflops = 34.5262 (norm. = 0.375), norm. avg. (of 8) = 0.343478 fft 14: mflops = 4.7953 (norm. = 0.0520833), norm. avg. (of 8) = 0.0824551 Benchmarking for array size = 108: 0. 
Brenner: elapsed time t=1.08 s, 2048 iters, t-(init.)=1.07 s t(norm)=0.716163, mflops=6.98165 (err=8.7e-16) 1. CWP (min N) (N=110): elapsed time t=1.74 s, 32768 iters, t-(init.)=1.65 s t(norm)=0.0690227, mflops=72.4399 2. CWP (best N) (N=112): elapsed time t=1.35 s, 32768 iters, t-(init.)=1.25 s t(norm)=0.0522899, mflops=95.6207 3. FFTPACK: elapsed time t=1.39 s, 16384 iters, t-(init.)=1.34 s t(norm)=0.11211, mflops=44.5992 (err=3.4e-16) 4. FFTPACK (f2c): elapsed time t=1.26 s, 16384 iters, t-(init.)=1.22 s t(norm)=0.10207, mflops=48.986 (err=7.1e-16) FFTW_MEASURE plan: (cost = 7.568359e-05) FFTW_TWIDDLE 3 FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.24 s, 16384 iters, t-(init.)=1.19 s t(norm)=0.0995601, mflops=50.2209 (err=2.6e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.57 s, 16384 iters, t-(init.)=1.53 s t(norm)=0.128006, mflops=39.0607 (err=3.0e-16) 7. Frigo-old: elapsed time t=1.74 s, 8192 iters, t-(init.)=1.71 s t(norm)=0.286131, mflops=17.4745 (err=5.6e-16) 8. GSL: elapsed time t=1.3 s, 16384 iters, t-(init.)=1.25 s t(norm)=0.10458, mflops=47.8103 (err=3.2e-16) 9. Nielsen: elapsed time t=1.04 s, 4096 iters, t-(init.)=1.03 s t(norm)=0.344695, mflops=14.5056 (err=1.2e-15) 10. Singleton: elapsed time t=1.34 s, 8192 iters, t-(init.)=1.32 s t(norm)=0.220873, mflops=22.6375 (err=3.3e-16) 11. Singleton (f2c): elapsed time t=1.86 s, 16384 iters, t-(init.)=1.81 s t(norm)=0.151432, mflops=33.0182 (err=3.3e-16) 12. Temperton: elapsed time t=1.3 s, 8192 iters, t-(init.)=1.28 s t(norm)=0.21418, mflops=23.3449 (err=2.9e-16) 13. Temperton (f2c): elapsed time t=1.33 s, 8192 iters, t-(init.)=1.3 s t(norm)=0.217526, mflops=22.9857 (err=3.1e-16) 14. Valkenburg: elapsed time t=1.46 s, 2048 iters, t-(init.)=1.45 s t(norm)=0.970501, mflops=5.15198 (err=6.6e-16) Top mflops for N=108 = 95.6207 Normalized results and averages for N=108: fft 0: mflops = 6.98165 (norm. = 0.073014), norm. avg. 
(of 6) = 0.105258 fft 1: mflops = 72.4399 (norm. = 0.757576), norm. avg. (of 9) = 0.772438 fft 2: mflops = 95.6207 (norm. = 1), norm. avg. (of 9) = 0.695278 fft 3: mflops = 44.5992 (norm. = 0.466418), norm. avg. (of 9) = 0.562418 fft 4: mflops = 48.986 (norm. = 0.512295), norm. avg. (of 9) = 0.631564 fft 5: mflops = 50.2209 (norm. = 0.52521), norm. avg. (of 9) = 0.876014 fft 6: mflops = 39.0607 (norm. = 0.408497), norm. avg. (of 9) = 0.752998 fft 7: mflops = 17.4745 (norm. = 0.182749), norm. avg. (of 9) = 0.294361 fft 8: mflops = 47.8103 (norm. = 0.5), norm. avg. (of 9) = 0.460632 fft 9: mflops = 14.5056 (norm. = 0.151699), norm. avg. (of 9) = 0.166039 fft 10: mflops = 22.6375 (norm. = 0.236742), norm. avg. (of 9) = 0.253139 fft 11: mflops = 33.0182 (norm. = 0.345304), norm. avg. (of 9) = 0.33193 fft 12: mflops = 23.3449 (norm. = 0.244141), norm. avg. (of 9) = 0.329263 fft 13: mflops = 22.9857 (norm. = 0.240385), norm. avg. (of 9) = 0.332023 fft 14: mflops = 5.15198 (norm. = 0.0538793), norm. avg. (of 9) = 0.0792801 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1 s, 1024 iters, t-(init.)=1 s t(norm)=0.602819, mflops=8.29436 (err=6.2e-16) 1. CWP (min N): elapsed time t=1.76 s, 16384 iters, t-(init.)=1.67 s t(norm)=0.0629193, mflops=79.4669 2. CWP (best N): elapsed time t=1.76 s, 16384 iters, t-(init.)=1.67 s t(norm)=0.0629193, mflops=79.4669 3. FFTPACK: elapsed time t=1.01 s, 4096 iters, t-(init.)=0.99 s t(norm)=0.149198, mflops=33.5126 (err=3.7e-16) 4. FFTPACK (f2c): elapsed time t=1.37 s, 4096 iters, t-(init.)=1.34 s t(norm)=0.201945, mflops=24.7593 (err=4.2e-16) FFTW_MEASURE plan: (cost = 2.343750e-04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 5. FFTW: elapsed time t=1.95 s, 8192 iters, t-(init.)=1.91 s t(norm)=0.143923, mflops=34.7408 (err=3.5e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. 
FFTW_ESTIMATE: elapsed time t=1.94 s, 8192 iters, t-(init.)=1.89 s t(norm)=0.142416, mflops=35.1084 (err=3.5e-16) 7. Frigo-old: elapsed time t=1.09 s, 2048 iters, t-(init.)=1.08 s t(norm)=0.325523, mflops=15.3599 (err=4.6e-16) 8. GSL: elapsed time t=1.11 s, 2048 iters, t-(init.)=1.1 s t(norm)=0.331551, mflops=15.0806 (err=3.6e-16) 9. Nielsen: elapsed time t=1.31 s, 2048 iters, t-(init.)=1.3 s t(norm)=0.391833, mflops=12.7605 (err=8.6e-15) 10. Singleton: elapsed time t=1.73 s, 4096 iters, t-(init.)=1.7 s t(norm)=0.256198, mflops=19.5161 (err=3.4e-16) 11. Singleton (f2c): elapsed time t=1.68 s, 4096 iters, t-(init.)=1.65 s t(norm)=0.248663, mflops=20.1075 (err=3.4e-16) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.02 s, 512 iters, t-(init.)=1.01 s t(norm)=1.2177, mflops=4.10612 (err=6.1e-16) Top mflops for N=210 = 79.4669 Normalized results and averages for N=210: fft 0: mflops = 8.29436 (norm. = 0.104375), norm. avg. (of 7) = 0.105132 fft 1: mflops = 79.4669 (norm. = 1), norm. avg. (of 10) = 0.795195 fft 2: mflops = 79.4669 (norm. = 1), norm. avg. (of 10) = 0.72575 fft 3: mflops = 33.5126 (norm. = 0.421717), norm. avg. (of 10) = 0.548348 fft 4: mflops = 24.7593 (norm. = 0.311567), norm. avg. (of 10) = 0.599564 fft 5: mflops = 34.7408 (norm. = 0.437173), norm. avg. (of 10) = 0.83213 fft 6: mflops = 35.1084 (norm. = 0.441799), norm. avg. (of 10) = 0.721878 fft 7: mflops = 15.3599 (norm. = 0.193287), norm. avg. (of 10) = 0.284254 fft 8: mflops = 15.0806 (norm. = 0.189773), norm. avg. (of 10) = 0.433546 fft 9: mflops = 12.7605 (norm. = 0.160577), norm. avg. (of 10) = 0.165493 fft 10: mflops = 19.5161 (norm. = 0.245588), norm. avg. (of 10) = 0.252384 fft 11: mflops = 20.1075 (norm. = 0.25303), norm. avg. (of 10) = 0.32404 fft 12: mflops = -1 (norm. = -0.0125839), norm. avg. (of 9) = 0.329263 fft 13: mflops = -1 (norm. = -0.0125839), norm. avg. 
(of 9) = 0.332023 fft 14: mflops = 4.10612 (norm. = 0.0516708), norm. avg. (of 10) = 0.0765191 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.33 s, 512 iters, t-(init.)=1.32 s t(norm)=0.569808, mflops=8.77488 (err=6.6e-16) 1. CWP (min N): elapsed time t=1.05 s, 4096 iters, t-(init.)=0.99 s t(norm)=0.0534195, mflops=93.5988 2. CWP (best N): elapsed time t=1.05 s, 4096 iters, t-(init.)=0.99 s t(norm)=0.0534195, mflops=93.5988 3. FFTPACK: elapsed time t=1.37 s, 2048 iters, t-(init.)=1.34 s t(norm)=0.14461, mflops=34.5757 (err=4.6e-16) 4. FFTPACK (f2c): elapsed time t=1.72 s, 2048 iters, t-(init.)=1.69 s t(norm)=0.182382, mflops=27.415 (err=6.0e-16) FFTW_MEASURE plan: (cost = 5.273438e-04) FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_NOTW 14 5. FFTW: elapsed time t=1.07 s, 2048 iters, t-(init.)=1.03 s t(norm)=0.111156, mflops=44.9819 (err=4.8e-16) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.16 s, 2048 iters, t-(init.)=1.13 s t(norm)=0.121948, mflops=41.0012 (err=4.5e-16) 7. Frigo-old: elapsed time t=1.45 s, 1024 iters, t-(init.)=1.43 s t(norm)=0.308646, mflops=16.1998 (err=6.0e-16) 8. GSL: elapsed time t=1.01 s, 1024 iters, t-(init.)=0.99 s t(norm)=0.213678, mflops=23.3997 (err=5.7e-16) 9. Nielsen: elapsed time t=1.89 s, 1024 iters, t-(init.)=1.87 s t(norm)=0.403614, mflops=12.3881 (err=5.4e-15) 10. Singleton: elapsed time t=1.88 s, 2048 iters, t-(init.)=1.85 s t(norm)=0.199649, mflops=25.044 (err=6.6e-16) 11. Singleton (f2c): elapsed time t=1.9 s, 2048 iters, t-(init.)=1.87 s t(norm)=0.201807, mflops=24.7761 (err=6.6e-16) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.3 s, 256 iters, t-(init.)=1.3 s t(norm)=1.12235, mflops=4.45494 (err=7.6e-16) Top mflops for N=504 = 93.5988 Normalized results and averages for N=504: fft 0: mflops = 8.77488 (norm. 
= 0.09375), norm. avg. (of 8) = 0.103709 fft 1: mflops = 93.5988 (norm. = 1), norm. avg. (of 11) = 0.813813 fft 2: mflops = 93.5988 (norm. = 1), norm. avg. (of 11) = 0.750682 fft 3: mflops = 34.5757 (norm. = 0.369403), norm. avg. (of 11) = 0.53208 fft 4: mflops = 27.415 (norm. = 0.292899), norm. avg. (of 11) = 0.571686 fft 5: mflops = 44.9819 (norm. = 0.480583), norm. avg. (of 11) = 0.800171 fft 6: mflops = 41.0012 (norm. = 0.438053), norm. avg. (of 11) = 0.696075 fft 7: mflops = 16.1998 (norm. = 0.173077), norm. avg. (of 11) = 0.274147 fft 8: mflops = 23.3997 (norm. = 0.25), norm. avg. (of 11) = 0.41686 fft 9: mflops = 12.3881 (norm. = 0.132353), norm. avg. (of 11) = 0.16248 fft 10: mflops = 25.044 (norm. = 0.267568), norm. avg. (of 11) = 0.253764 fft 11: mflops = 24.7761 (norm. = 0.264706), norm. avg. (of 11) = 0.318646 fft 12: mflops = -1 (norm. = -0.0106839), norm. avg. (of 9) = 0.329263 fft 13: mflops = -1 (norm. = -0.0106839), norm. avg. (of 9) = 0.332023 fft 14: mflops = 4.45494 (norm. = 0.0475962), norm. avg. (of 11) = 0.0738898 Benchmarking for array size = 1000: 0. Brenner: elapsed time t=1.38 s, 256 iters, t-(init.)=1.37 s t(norm)=0.536994, mflops=9.3111 (err=8.0e-16) 1. CWP (min N) (N=1001): elapsed time t=1.47 s, 2048 iters, t-(init.)=1.39 s t(norm)=0.0681041, mflops=73.417 2. CWP (best N) (N=1008): elapsed time t=1.22 s, 2048 iters, t-(init.)=1.14 s t(norm)=0.0558552, mflops=89.5172 3. FFTPACK: elapsed time t=1.66 s, 1024 iters, t-(init.)=1.61 s t(norm)=0.157766, mflops=31.6924 (err=6.1e-16) 4. FFTPACK (f2c): elapsed time t=1.53 s, 1024 iters, t-(init.)=1.49 s t(norm)=0.146007, mflops=34.2448 (err=7.8e-16) FFTW_MEASURE plan: (cost = 1.328125e-03) FFTW_TWIDDLE 2 FFTW_TWIDDLE 5 FFTW_TWIDDLE 2 FFTW_TWIDDLE 5 FFTW_NOTW 10 5. FFTW: elapsed time t=1.34 s, 1024 iters, t-(init.)=1.29 s t(norm)=0.126409, mflops=39.5541 (err=5.9e-16) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. 
FFTW_ESTIMATE: elapsed time t=1.45 s, 1024 iters, t-(init.)=1.41 s t(norm)=0.138168, mflops=36.1878 (err=6.3e-16) 7. Frigo-old: elapsed time t=1.51 s, 512 iters, t-(init.)=1.48 s t(norm)=0.290055, mflops=17.2381 (err=6.3e-16) 8. GSL: elapsed time t=1.78 s, 512 iters, t-(init.)=1.76 s t(norm)=0.34493, mflops=14.4957 (err=6.3e-16) 9. Nielsen: elapsed time t=1.35 s, 512 iters, t-(init.)=1.33 s t(norm)=0.260657, mflops=19.1823 (err=1.3e-14) 10. Singleton: elapsed time t=1.46 s, 512 iters, t-(init.)=1.44 s t(norm)=0.282216, mflops=17.7169 (err=9.0e-16) 11. Singleton (f2c): elapsed time t=1.06 s, 512 iters, t-(init.)=1.04 s t(norm)=0.203822, mflops=24.5312 (err=9.1e-16) 12. Temperton: elapsed time t=1.38 s, 1024 iters, t-(init.)=1.34 s t(norm)=0.131309, mflops=38.0782 (err=6.3e-16) 13. Temperton (f2c): elapsed time t=1.62 s, 1024 iters, t-(init.)=1.58 s t(norm)=0.154827, mflops=32.2942 (err=6.4e-16) 14. Valkenburg: elapsed time t=1.45 s, 128 iters, t-(init.)=1.45 s t(norm)=1.1367, mflops=4.39869 (err=7.2e-16) Top mflops for N=1000 = 89.5172 Normalized results and averages for N=1000: fft 0: mflops = 9.3111 (norm. = 0.104015), norm. avg. (of 9) = 0.103743 fft 1: mflops = 73.417 (norm. = 0.820144), norm. avg. (of 12) = 0.814341 fft 2: mflops = 89.5172 (norm. = 1), norm. avg. (of 12) = 0.771458 fft 3: mflops = 31.6924 (norm. = 0.354037), norm. avg. (of 12) = 0.517244 fft 4: mflops = 34.2448 (norm. = 0.38255), norm. avg. (of 12) = 0.555924 fft 5: mflops = 39.5541 (norm. = 0.44186), norm. avg. (of 12) = 0.770312 fft 6: mflops = 36.1878 (norm. = 0.404255), norm. avg. (of 12) = 0.671757 fft 7: mflops = 17.2381 (norm. = 0.192568), norm. avg. (of 12) = 0.267348 fft 8: mflops = 14.4957 (norm. = 0.161932), norm. avg. (of 12) = 0.395616 fft 9: mflops = 19.1823 (norm. = 0.214286), norm. avg. (of 12) = 0.166797 fft 10: mflops = 17.7169 (norm. = 0.197917), norm. avg. (of 12) = 0.24911 fft 11: mflops = 24.5312 (norm. = 0.274038), norm. avg. 
(of 12) = 0.314929 fft 12: mflops = 38.0782 (norm. = 0.425373), norm. avg. (of 10) = 0.338874 fft 13: mflops = 32.2942 (norm. = 0.360759), norm. avg. (of 10) = 0.334897 fft 14: mflops = 4.39869 (norm. = 0.0491379), norm. avg. (of 12) = 0.0718271 Benchmarking for array size = 1960: 0. Brenner: elapsed time t=1.42 s, 128 iters, t-(init.)=1.41 s t(norm)=0.513889, mflops=9.72973 (err=7.3e-16) 1. CWP (min N) (N=1980): elapsed time t=1.48 s, 1024 iters, t-(init.)=1.4 s t(norm)=0.0637805, mflops=78.3938 2. CWP (best N) (N=1980): elapsed time t=1.48 s, 1024 iters, t-(init.)=1.4 s t(norm)=0.0637805, mflops=78.3938 3. FFTPACK: elapsed time t=1.95 s, 512 iters, t-(init.)=1.91 s t(norm)=0.17403, mflops=28.7307 (err=5.6e-16) 4. FFTPACK (f2c): elapsed time t=1.41 s, 256 iters, t-(init.)=1.39 s t(norm)=0.2533, mflops=19.7395 (err=6.3e-16) FFTW_MEASURE plan: (cost = 3.125000e-03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_TWIDDLE 2 FFTW_TWIDDLE 5 FFTW_NOTW 14 5. FFTW: elapsed time t=1.65 s, 512 iters, t-(init.)=1.61 s t(norm)=0.146695, mflops=34.0843 (err=5.3e-16) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.1 s, 256 iters, t-(init.)=1.08 s t(norm)=0.196809, mflops=25.4054 (err=5.6e-16) 7. Frigo-old: elapsed time t=1.78 s, 256 iters, t-(init.)=1.76 s t(norm)=0.320725, mflops=15.5897 (err=6.9e-16) 8. GSL: elapsed time t=1.08 s, 128 iters, t-(init.)=1.07 s t(norm)=0.389972, mflops=12.8214 (err=7.0e-16) 9. Nielsen: elapsed time t=1.16 s, 128 iters, t-(init.)=1.15 s t(norm)=0.419129, mflops=11.9295 (err=1.5e-14) 10. Singleton: elapsed time t=1.31 s, 256 iters, t-(init.)=1.29 s t(norm)=0.235077, mflops=21.2696 (err=7.7e-16) 11. Singleton (f2c): elapsed time t=1.5 s, 256 iters, t-(init.)=1.48 s t(norm)=0.269701, mflops=18.5391 (err=7.7e-16) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. 
Valkenburg: elapsed time t=1.74 s, 64 iters, t-(init.)=1.73 s t(norm)=1.26103, mflops=3.96501 (err=6.1e-16) Top mflops for N=1960 = 78.3938 Normalized results and averages for N=1960: fft 0: mflops = 9.72973 (norm. = 0.124113), norm. avg. (of 10) = 0.10578 fft 1: mflops = 78.3938 (norm. = 1), norm. avg. (of 13) = 0.828622 fft 2: mflops = 78.3938 (norm. = 1), norm. avg. (of 13) = 0.789038 fft 3: mflops = 28.7307 (norm. = 0.366492), norm. avg. (of 13) = 0.505647 fft 4: mflops = 19.7395 (norm. = 0.251799), norm. avg. (of 13) = 0.53253 fft 5: mflops = 34.0843 (norm. = 0.434783), norm. avg. (of 13) = 0.744502 fft 6: mflops = 25.4054 (norm. = 0.324074), norm. avg. (of 13) = 0.645012 fft 7: mflops = 15.5897 (norm. = 0.198864), norm. avg. (of 13) = 0.26208 fft 8: mflops = 12.8214 (norm. = 0.163551), norm. avg. (of 13) = 0.377765 fft 9: mflops = 11.9295 (norm. = 0.152174), norm. avg. (of 13) = 0.165672 fft 10: mflops = 21.2696 (norm. = 0.271318), norm. avg. (of 13) = 0.250819 fft 11: mflops = 18.5391 (norm. = 0.236486), norm. avg. (of 13) = 0.308895 fft 12: mflops = -1 (norm. = -0.0127561), norm. avg. (of 10) = 0.338874 fft 13: mflops = -1 (norm. = -0.0127561), norm. avg. (of 10) = 0.334897 fft 14: mflops = 3.96501 (norm. = 0.050578), norm. avg. (of 13) = 0.0701926 Benchmarking for array size = 4725: 0. Brenner: elapsed time t=1.08 s, 32 iters, t-(init.)=1.08 s t(norm)=0.585188, mflops=8.54427 (err=1.4e-15) 1. CWP (min N) (N=5005): elapsed time t=1.12 s, 256 iters, t-(init.)=1.07 s t(norm)=0.0724711, mflops=68.993 2. CWP (best N) (N=5040): elapsed time t=1.9 s, 512 iters, t-(init.)=1.8 s t(norm)=0.060957, mflops=82.025 3. FFTPACK: elapsed time t=1.23 s, 128 iters, t-(init.)=1.21 s t(norm)=0.163907, mflops=30.5052 (err=1.2e-15) 4. FFTPACK (f2c): elapsed time t=1.47 s, 128 iters, t-(init.)=1.45 s t(norm)=0.196417, mflops=25.456 (err=1.3e-15) FFTW_MEASURE plan: (cost = 9.062500e-03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 3 FFTW_TWIDDLE 3 FFTW_NOTW 15 5. 
FFTW: elapsed time t=1.15 s, 128 iters, t-(init.)=1.13 s t(norm)=0.15307, mflops=32.6648 (err=1.2e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.54 s, 128 iters, t-(init.)=1.51 s t(norm)=0.204545, mflops=24.4445 (err=1.2e-15) 7. Frigo-old: elapsed time t=1.36 s, 64 iters, t-(init.)=1.35 s t(norm)=0.365742, mflops=13.6708 (err=1.3e-15) 8. GSL: elapsed time t=1.2 s, 64 iters, t-(init.)=1.19 s t(norm)=0.322395, mflops=15.5089 (err=1.3e-15) 9. Nielsen: elapsed time t=1.42 s, 64 iters, t-(init.)=1.4 s t(norm)=0.379288, mflops=13.1826 (err=4.3e-14) 10. Singleton: elapsed time t=1.06 s, 64 iters, t-(init.)=1.05 s t(norm)=0.284466, mflops=17.5768 (err=1.8e-15) 11. Singleton (f2c): elapsed time t=1.77 s, 128 iters, t-(init.)=1.75 s t(norm)=0.237055, mflops=21.0921 (err=1.8e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.11 s, 16 iters, t-(init.)=1.11 s t(norm)=1.20289, mflops=4.15667 (err=1.3e-15) Top mflops for N=4725 = 82.025 Normalized results and averages for N=4725: fft 0: mflops = 8.54427 (norm. = 0.104167), norm. avg. (of 11) = 0.105633 fft 1: mflops = 68.993 (norm. = 0.841121), norm. avg. (of 14) = 0.829515 fft 2: mflops = 82.025 (norm. = 1), norm. avg. (of 14) = 0.804107 fft 3: mflops = 30.5052 (norm. = 0.371901), norm. avg. (of 14) = 0.496094 fft 4: mflops = 25.456 (norm. = 0.310345), norm. avg. (of 14) = 0.51666 fft 5: mflops = 32.6648 (norm. = 0.39823), norm. avg. (of 14) = 0.719768 fft 6: mflops = 24.4445 (norm. = 0.298013), norm. avg. (of 14) = 0.620227 fft 7: mflops = 13.6708 (norm. = 0.166667), norm. avg. (of 14) = 0.255265 fft 8: mflops = 15.5089 (norm. = 0.189076), norm. avg. (of 14) = 0.364287 fft 9: mflops = 13.1826 (norm. = 0.160714), norm. avg. (of 14) = 0.165318 fft 10: mflops = 17.5768 (norm. = 0.214286), norm. avg. 
(of 14) = 0.248209 fft 11: mflops = 21.0921 (norm. = 0.257143), norm. avg. (of 14) = 0.305198 fft 12: mflops = -1 (norm. = -0.0121914), norm. avg. (of 10) = 0.338874 fft 13: mflops = -1 (norm. = -0.0121914), norm. avg. (of 10) = 0.334897 fft 14: mflops = 4.15667 (norm. = 0.0506757), norm. avg. (of 14) = 0.0687985 Benchmarking for array size = 10368: 0. Brenner: elapsed time t=1.23 s, 16 iters, t-(init.)=1.23 s t(norm)=0.555826, mflops=8.99561 (err=1.1e-15) 1. CWP (min N) (N=10920): elapsed time t=1.26 s, 128 iters, t-(init.)=1.2 s t(norm)=0.0677837, mflops=73.764 2. CWP (best N) (N=11088): elapsed time t=1.17 s, 128 iters, t-(init.)=1.11 s t(norm)=0.0626999, mflops=79.7449 3. FFTPACK: elapsed time t=1.32 s, 64 iters, t-(init.)=1.29 s t(norm)=0.145735, mflops=34.3089 (err=9.8e-16) 4. FFTPACK (f2c): elapsed time t=1.17 s, 64 iters, t-(init.)=1.14 s t(norm)=0.128789, mflops=38.8232 (err=1.1e-15) FFTW_MEASURE plan: (cost = 1.312500e-02) FFTW_TWIDDLE 8 FFTW_TWIDDLE 3 FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.67 s, 128 iters, t-(init.)=1.62 s t(norm)=0.091508, mflops=54.64 (err=9.2e-16) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.15 s, 64 iters, t-(init.)=1.12 s t(norm)=0.12653, mflops=39.5164 (err=9.4e-16) 7. Frigo-old: elapsed time t=1.08 s, 32 iters, t-(init.)=1.07 s t(norm)=0.241762, mflops=20.6815 (err=1.0e-15) 8. GSL: elapsed time t=1.07 s, 64 iters, t-(init.)=1.04 s t(norm)=0.117492, mflops=42.5562 (err=9.4e-16) 9. Nielsen: elapsed time t=1.53 s, 32 iters, t-(init.)=1.51 s t(norm)=0.341178, mflops=14.6551 (err=1.1e-14) 10. Singleton: elapsed time t=1.11 s, 32 iters, t-(init.)=1.1 s t(norm)=0.24854, mflops=20.1175 (err=1.3e-15) 11. Singleton (f2c): elapsed time t=1.66 s, 64 iters, t-(init.)=1.63 s t(norm)=0.184146, mflops=27.1524 (err=1.3e-15) 12. 
Temperton: elapsed time t=1.11 s, 32 iters, t-(init.)=1.1 s t(norm)=0.24854, mflops=20.1175 (err=9.7e-16) 13. Temperton (f2c): elapsed time t=1 s, 32 iters, t-(init.)=0.99 s t(norm)=0.223686, mflops=22.3527 (err=9.7e-16) 14. Valkenburg: elapsed time t=1.11 s, 8 iters, t-(init.)=1.11 s t(norm)=1.0032, mflops=4.98406 (err=1.3e-15) Top mflops for N=10368 = 79.7449 Normalized results and averages for N=10368: fft 0: mflops = 8.99561 (norm. = 0.112805), norm. avg. (of 12) = 0.106231 fft 1: mflops = 73.764 (norm. = 0.925), norm. avg. (of 15) = 0.835881 fft 2: mflops = 79.7449 (norm. = 1), norm. avg. (of 15) = 0.817167 fft 3: mflops = 34.3089 (norm. = 0.430233), norm. avg. (of 15) = 0.491703 fft 4: mflops = 38.8232 (norm. = 0.486842), norm. avg. (of 15) = 0.514672 fft 5: mflops = 54.64 (norm. = 0.685185), norm. avg. (of 15) = 0.717463 fft 6: mflops = 39.5164 (norm. = 0.495536), norm. avg. (of 15) = 0.611914 fft 7: mflops = 20.6815 (norm. = 0.259346), norm. avg. (of 15) = 0.255537 fft 8: mflops = 42.5562 (norm. = 0.533654), norm. avg. (of 15) = 0.375578 fft 9: mflops = 14.6551 (norm. = 0.183775), norm. avg. (of 15) = 0.166548 fft 10: mflops = 20.1175 (norm. = 0.252273), norm. avg. (of 15) = 0.24848 fft 11: mflops = 27.1524 (norm. = 0.340491), norm. avg. (of 15) = 0.307551 fft 12: mflops = 20.1175 (norm. = 0.252273), norm. avg. (of 11) = 0.331001 fft 13: mflops = 22.3527 (norm. = 0.280303), norm. avg. (of 11) = 0.329934 fft 14: mflops = 4.98406 (norm. = 0.0625), norm. avg. (of 15) = 0.0683786 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.05 s, 4 iters, t-(init.)=1.04 s t(norm)=0.654157, mflops=7.64343 (err=3.6e-15) 1. CWP (min N) (N=27720): elapsed time t=1.99 s, 64 iters, t-(init.)=1.81 s t(norm)=0.0711553, mflops=70.2688 2. CWP (best N) (N=27720): elapsed time t=2.01 s, 64 iters, t-(init.)=1.83 s t(norm)=0.0719415, mflops=69.5009 3. FFTPACK: elapsed time t=1.33 s, 16 iters, t-(init.)=1.29 s t(norm)=0.202852, mflops=24.6486 (err=3.4e-15) 4. 
FFTPACK (f2c): elapsed time t=1.21 s, 16 iters, t-(init.)=1.17 s t(norm)=0.183982, mflops=27.1766 (err=3.5e-15) FFTW_MEASURE plan: (cost = 6.500000e-02) FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 3 FFTW_TWIDDLE 5 FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.05 s, 16 iters, t-(init.)=1.01 s t(norm)=0.158822, mflops=31.4818 (err=3.5e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1 s, 16 iters, t-(init.)=0.95 s t(norm)=0.149387, mflops=33.4702 (err=3.5e-15) 7. Frigo-old: elapsed time t=1.18 s, 8 iters, t-(init.)=1.16 s t(norm)=0.364818, mflops=13.7055 (err=3.6e-15) 8. GSL: elapsed time t=1.77 s, 16 iters, t-(init.)=1.73 s t(norm)=0.272041, mflops=18.3796 (err=3.4e-15) 9. Nielsen: elapsed time t=1.05 s, 8 iters, t-(init.)=1.03 s t(norm)=0.323933, mflops=15.4353 (err=2.0e-13) 10. Singleton: elapsed time t=1 s, 8 iters, t-(init.)=0.98 s t(norm)=0.308209, mflops=16.2228 (err=5.0e-15) 11. Singleton (f2c): elapsed time t=1.52 s, 16 iters, t-(init.)=1.48 s t(norm)=0.232729, mflops=21.4842 (err=5.0e-15) 12. Temperton: elapsed time t=1.23 s, 16 iters, t-(init.)=1.19 s t(norm)=0.187127, mflops=26.7199 (err=3.6e-15) 13. Temperton (f2c): elapsed time t=1.41 s, 16 iters, t-(init.)=1.34 s t(norm)=0.210714, mflops=23.7288 (err=3.6e-15) 14. Valkenburg: elapsed time t=1.89 s, 4 iters, t-(init.)=1.88 s t(norm)=1.18251, mflops=4.22828 (err=3.4e-15) Top mflops for N=27000 = 70.2688 Normalized results and averages for N=27000: fft 0: mflops = 7.64343 (norm. = 0.108774), norm. avg. (of 13) = 0.106427 fft 1: mflops = 70.2688 (norm. = 1), norm. avg. (of 16) = 0.846138 fft 2: mflops = 69.5009 (norm. = 0.989071), norm. avg. (of 16) = 0.827911 fft 3: mflops = 24.6486 (norm. = 0.350775), norm. avg. (of 16) = 0.482895 fft 4: mflops = 27.1766 (norm. = 0.386752), norm. avg. (of 16) = 0.506677 fft 5: mflops = 31.4818 (norm. = 0.44802), norm. avg. 
(of 16) = 0.700623 fft 6: mflops = 33.4702 (norm. = 0.476316), norm. avg. (of 16) = 0.603439 fft 7: mflops = 13.7055 (norm. = 0.195043), norm. avg. (of 16) = 0.251756 fft 8: mflops = 18.3796 (norm. = 0.261561), norm. avg. (of 16) = 0.368452 fft 9: mflops = 15.4353 (norm. = 0.21966), norm. avg. (of 16) = 0.169868 fft 10: mflops = 16.2228 (norm. = 0.230867), norm. avg. (of 16) = 0.247379 fft 11: mflops = 21.4842 (norm. = 0.305743), norm. avg. (of 16) = 0.307438 fft 12: mflops = 26.7199 (norm. = 0.380252), norm. avg. (of 12) = 0.335105 fft 13: mflops = 23.7288 (norm. = 0.337687), norm. avg. (of 12) = 0.33058 fft 14: mflops = 4.22828 (norm. = 0.0601729), norm. avg. (of 16) = 0.0678657 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.7 s, 2 iters, t-(init.)=1.67 s t(norm)=0.681532, mflops=7.33641 (err=4.7e-15) 1. CWP (min N) (N=80080): elapsed time t=1.02 s, 8 iters, t-(init.)=0.92 s t(norm)=0.0938637, mflops=53.2687 2. CWP (best N) (N=80080): elapsed time t=1.01 s, 8 iters, t-(init.)=0.9 s t(norm)=0.0918232, mflops=54.4525 3. FFTPACK: elapsed time t=1.33 s, 4 iters, t-(init.)=1.28 s t(norm)=0.261186, mflops=19.1435 (err=4.7e-15) 4. FFTPACK (f2c): elapsed time t=1.41 s, 4 iters, t-(init.)=1.36 s t(norm)=0.27751, mflops=18.0174 (err=4.7e-15) FFTW_MEASURE plan: (cost = 2.000000e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_NOTW 15 5. FFTW: elapsed time t=1.53 s, 8 iters, t-(init.)=1.43 s t(norm)=0.145897, mflops=34.2708 (err=4.7e-15) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 8 FFTW_TWIDDLE 7 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.76 s, 8 iters, t-(init.)=1.66 s t(norm)=0.169363, mflops=29.5224 (err=4.7e-15) 7. Frigo-old: elapsed time t=1.83 s, 4 iters, t-(init.)=1.78 s t(norm)=0.363212, mflops=13.7661 (err=4.7e-15) 8. GSL: elapsed time t=1.54 s, 4 iters, t-(init.)=1.49 s t(norm)=0.304037, mflops=16.4454 (err=4.7e-15) 9. 
Nielsen: elapsed time t=1.01 s, 2 iters, t-(init.)=0.99 s t(norm)=0.404022, mflops=12.3756 (err=4.8e-13) 10. Singleton: elapsed time t=1.8 s, 4 iters, t-(init.)=1.76 s t(norm)=0.359131, mflops=13.9225 (err=6.1e-15) 11. Singleton (f2c): elapsed time t=1.62 s, 4 iters, t-(init.)=1.58 s t(norm)=0.322401, mflops=15.5086 (err=6.1e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.59 s, 1 iters, t-(init.)=1.58 s t(norm)=1.28961, mflops=3.87716 (err=4.5e-15) Top mflops for N=75600 = 54.4525 Normalized results and averages for N=75600: fft 0: mflops = 7.33641 (norm. = 0.134731), norm. avg. (of 14) = 0.108448 fft 1: mflops = 53.2687 (norm. = 0.978261), norm. avg. (of 17) = 0.85391 fft 2: mflops = 54.4525 (norm. = 1), norm. avg. (of 17) = 0.838033 fft 3: mflops = 19.1435 (norm. = 0.351563), norm. avg. (of 17) = 0.47517 fft 4: mflops = 18.0174 (norm. = 0.330882), norm. avg. (of 17) = 0.496336 fft 5: mflops = 34.2708 (norm. = 0.629371), norm. avg. (of 17) = 0.696431 fft 6: mflops = 29.5224 (norm. = 0.542169), norm. avg. (of 17) = 0.599835 fft 7: mflops = 13.7661 (norm. = 0.252809), norm. avg. (of 17) = 0.251818 fft 8: mflops = 16.4454 (norm. = 0.302013), norm. avg. (of 17) = 0.364544 fft 9: mflops = 12.3756 (norm. = 0.227273), norm. avg. (of 17) = 0.173245 fft 10: mflops = 13.9225 (norm. = 0.255682), norm. avg. (of 17) = 0.247868 fft 11: mflops = 15.5086 (norm. = 0.28481), norm. avg. (of 17) = 0.306107 fft 12: mflops = -1 (norm. = -0.0183646), norm. avg. (of 12) = 0.335105 fft 13: mflops = -1 (norm. = -0.0183646), norm. avg. (of 12) = 0.33058 fft 14: mflops = 3.87716 (norm. = 0.0712025), norm. avg. (of 17) = 0.068062 Benchmarking for array size = 165375: 0. Brenner: elapsed time t=2.12 s, 1 iters, t-(init.)=2.1 s t(norm)=0.732514, mflops=6.82581 (err=1.2e-14) 1. CWP (min N) (N=180180): elapsed time t=1.42 s, 4 iters, t-(init.)=1.26 s t(norm)=0.109877, mflops=45.5054 2. 
CWP (best N) (N=180180): elapsed time t=1.39 s, 4 iters, t-(init.)=1.26 s t(norm)=0.109877, mflops=45.5054 3. FFTPACK: elapsed time t=1.12 s, 1 iters, t-(init.)=1.09 s t(norm)=0.38021, mflops=13.1506 (err=1.2e-14) 4. FFTPACK (f2c): elapsed time t=1.28 s, 1 iters, t-(init.)=1.25 s t(norm)=0.43602, mflops=11.4674 (err=1.2e-14) FFTW_MEASURE plan: (cost = 5.800000e-01) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_TWIDDLE 3 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.12 s, 2 iters, t-(init.)=1.06 s t(norm)=0.184873, mflops=27.0456 (err=1.2e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.34 s, 2 iters, t-(init.)=1.29 s t(norm)=0.224986, mflops=22.2236 (err=1.2e-14) 7. Frigo-old: elapsed time t=1.47 s, 1 iters, t-(init.)=1.45 s t(norm)=0.505784, mflops=9.88565 (err=1.2e-14) 8. GSL: elapsed time t=1.21 s, 1 iters, t-(init.)=1.18 s t(norm)=0.411603, mflops=12.1476 (err=1.2e-14) 9. Nielsen: elapsed time t=1.3 s, 1 iters, t-(init.)=1.27 s t(norm)=0.442997, mflops=11.2868 (err=1.7e-12) 10. Singleton: elapsed time t=1.07 s, 1 iters, t-(init.)=1.04 s t(norm)=0.362769, mflops=13.7829 (err=1.8e-14) 11. Singleton (f2c): elapsed time t=1.05 s, 1 iters, t-(init.)=1.02 s t(norm)=0.355793, mflops=14.0531 (err=1.8e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=4.07 s, 1 iters, t-(init.)=4.04 s t(norm)=1.40922, mflops=3.54807 (err=1.2e-14) Top mflops for N=165375 = 45.5054 Normalized results and averages for N=165375: fft 0: mflops = 6.82581 (norm. = 0.15), norm. avg. (of 15) = 0.111218 fft 1: mflops = 45.5054 (norm. = 1), norm. avg. (of 18) = 0.862026 fft 2: mflops = 45.5054 (norm. = 1), norm. avg. (of 18) = 0.847032 fft 3: mflops = 13.1506 (norm. = 0.288991), norm. avg. (of 18) = 0.464826 fft 4: mflops = 11.4674 (norm. 
= 0.252), norm. avg. (of 18) = 0.482762 fft 5: mflops = 27.0456 (norm. = 0.59434), norm. avg. (of 18) = 0.69076 fft 6: mflops = 22.2236 (norm. = 0.488372), norm. avg. (of 18) = 0.593642 fft 7: mflops = 9.88565 (norm. = 0.217241), norm. avg. (of 18) = 0.249897 fft 8: mflops = 12.1476 (norm. = 0.266949), norm. avg. (of 18) = 0.359122 fft 9: mflops = 11.2868 (norm. = 0.248031), norm. avg. (of 18) = 0.1774 fft 10: mflops = 13.7829 (norm. = 0.302885), norm. avg. (of 18) = 0.250924 fft 11: mflops = 14.0531 (norm. = 0.308824), norm. avg. (of 18) = 0.306258 fft 12: mflops = -1 (norm. = -0.0219754), norm. avg. (of 12) = 0.335105 fft 13: mflops = -1 (norm. = -0.0219754), norm. avg. (of 12) = 0.33058 fft 14: mflops = 3.54807 (norm. = 0.0779703), norm. avg. (of 18) = 0.0686125 Benchmarking for array size = 362880: 0. Brenner: elapsed time t=4.73 s, 1 iters, t-(init.)=4.67 s t(norm)=0.696799, mflops=7.17567 (err=7.7e-15) 1. CWP (min N) (N=720720): elapsed time t=1.5 s, 1 iters, t-(init.)=1.37 s t(norm)=0.204414, mflops=24.4601 2. CWP (best N) (N=720720): elapsed time t=1.51 s, 1 iters, t-(init.)=1.38 s t(norm)=0.205906, mflops=24.2829 3. FFTPACK: elapsed time t=1.84 s, 1 iters, t-(init.)=1.78 s t(norm)=0.265589, mflops=18.8261 (err=7.5e-15) 4. FFTPACK (f2c): elapsed time t=1.91 s, 1 iters, t-(init.)=1.85 s t(norm)=0.276034, mflops=18.1137 (err=7.5e-15) FFTW_MEASURE plan: (cost = 1.010000e+00) FFTW_TWIDDLE 8 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_TWIDDLE 2 FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.98 s, 2 iters, t-(init.)=1.86 s t(norm)=0.138763, mflops=36.0327 (err=7.6e-15) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.28 s, 1 iters, t-(init.)=1.22 s t(norm)=0.182033, mflops=27.4675 (err=7.6e-15) 7. Frigo-old: elapsed time t=2.52 s, 1 iters, t-(init.)=2.46 s t(norm)=0.36705, mflops=13.6221 (err=7.6e-15) 8. 
GSL: elapsed time t=1.77 s, 1 iters, t-(init.)=1.71 s t(norm)=0.255145, mflops=19.5967 (err=7.5e-15) 9. Nielsen: elapsed time t=3.04 s, 1 iters, t-(init.)=2.98 s t(norm)=0.444638, mflops=11.2451 (err=3.5e-12) 10. Singleton: elapsed time t=2.65 s, 1 iters, t-(init.)=2.59 s t(norm)=0.386447, mflops=12.9384 (err=1.1e-14) 11. Singleton (f2c): elapsed time t=2.49 s, 1 iters, t-(init.)=2.39 s t(norm)=0.356606, mflops=14.0211 (err=1.1e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=9.01 s, 1 iters, t-(init.)=8.95 s t(norm)=1.33541, mflops=3.74418 (err=7.9e-15) Top mflops for N=362880 = 36.0327 Normalized results and averages for N=362880: fft 0: mflops = 7.17567 (norm. = 0.199143), norm. avg. (of 16) = 0.116714 fft 1: mflops = 24.4601 (norm. = 0.678832), norm. avg. (of 19) = 0.852384 fft 2: mflops = 24.2829 (norm. = 0.673913), norm. avg. (of 19) = 0.83792 fft 3: mflops = 18.8261 (norm. = 0.522472), norm. avg. (of 19) = 0.46786 fft 4: mflops = 18.1137 (norm. = 0.502703), norm. avg. (of 19) = 0.483811 fft 5: mflops = 36.0327 (norm. = 1), norm. avg. (of 19) = 0.707035 fft 6: mflops = 27.4675 (norm. = 0.762295), norm. avg. (of 19) = 0.602519 fft 7: mflops = 13.6221 (norm. = 0.378049), norm. avg. (of 19) = 0.256642 fft 8: mflops = 19.5967 (norm. = 0.54386), norm. avg. (of 19) = 0.368845 fft 9: mflops = 11.2451 (norm. = 0.312081), norm. avg. (of 19) = 0.184488 fft 10: mflops = 12.9384 (norm. = 0.359073), norm. avg. (of 19) = 0.256616 fft 11: mflops = 14.0211 (norm. = 0.389121), norm. avg. (of 19) = 0.310619 fft 12: mflops = -1 (norm. = -0.0277526), norm. avg. (of 12) = 0.335105 fft 13: mflops = -1 (norm. = -0.0277526), norm. avg. (of 12) = 0.33058 fft 14: mflops = 3.74418 (norm. = 0.103911), norm. avg. 
(of 19) = 0.0704703 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) 512x128x64 (64.0236 MB) 256x128x256 (128.012 MB) Maximum array size N = 8388608 Benchmarking FFTs: 0. FFTW 1. HARM 2. HARM (f2c) 3. NR (C) 4. NR (F) 5. PDA 6. PDA (f2c) 7. Singleton 8. Singleton (f2c) 9. Temperton 10. Temperton (f2c) Computing normalized averages (11 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.05 s, 32768 iters, t-(init.)=0.99 s t(norm)=0.0786781, mflops=63.5501 (err=3.0e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. Skipping fft (all dimensions must be > 4 for HARM). 3. NR (C): elapsed time t=1.39 s, 32768 iters, t-(init.)=1.34 s t(norm)=0.106494, mflops=46.9512 (err=3.0e-16) 4. NR (F): elapsed time t=1.08 s, 16384 iters, t-(init.)=1.05 s t(norm)=0.166893, mflops=29.9593 (err=3.0e-16) 5. PDA: elapsed time t=1.03 s, 8192 iters, t-(init.)=1.02 s t(norm)=0.324249, mflops=15.4202 (err=2.9e-16) 6. PDA (f2c): elapsed time t=1.16 s, 8192 iters, t-(init.)=1.14 s t(norm)=0.362396, mflops=13.7971 (err=3.3e-16) 7. Singleton: elapsed time t=1.47 s, 32768 iters, t-(init.)=1.4 s t(norm)=0.111262, mflops=44.939 (err=3.0e-16) 8. Singleton (f2c): elapsed time t=1.91 s, 32768 iters, t-(init.)=1.85 s t(norm)=0.147025, mflops=34.0079 (err=2.2e-16) 9. Temperton: elapsed time t=1.2 s, 32768 iters, t-(init.)=1.15 s t(norm)=0.0913938, mflops=54.7083 (err=4.1e-16) 10. Temperton (f2c): elapsed time t=1.03 s, 16384 iters, t-(init.)=1 s t(norm)=0.158946, mflops=31.4573 (err=3.0e-16) Top mflops for N=64 = 63.5501 Normalized results and averages for N=64: fft 0: mflops = 63.5501 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.0157356), norm. avg. (of 0) = -1 fft 2: mflops = -1 (norm. 
= -0.0157356), norm. avg. (of 0) = -1 fft 3: mflops = 46.9512 (norm. = 0.738806), norm. avg. (of 1) = 0.738806 fft 4: mflops = 29.9593 (norm. = 0.471429), norm. avg. (of 1) = 0.471429 fft 5: mflops = 15.4202 (norm. = 0.242647), norm. avg. (of 1) = 0.242647 fft 6: mflops = 13.7971 (norm. = 0.217105), norm. avg. (of 1) = 0.217105 fft 7: mflops = 44.939 (norm. = 0.707143), norm. avg. (of 1) = 0.707143 fft 8: mflops = 34.0079 (norm. = 0.535135), norm. avg. (of 1) = 0.535135 fft 9: mflops = 54.7083 (norm. = 0.86087), norm. avg. (of 1) = 0.86087 fft 10: mflops = 31.4573 (norm. = 0.495), norm. avg. (of 1) = 0.495 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.08 s, 2048 iters, t-(init.)=1.05 s t(norm)=0.111262, mflops=44.939 (err=2.8e-16) 1. HARM: elapsed time t=1.31 s, 2048 iters, t-(init.)=1.27 s t(norm)=0.134574, mflops=37.1543 (err=3.3e-16) 2. HARM (f2c): elapsed time t=1.49 s, 2048 iters, t-(init.)=1.46 s t(norm)=0.154707, mflops=32.3191 (err=3.2e-16) 3. NR (C): elapsed time t=1.53 s, 4096 iters, t-(init.)=1.46 s t(norm)=0.0773536, mflops=64.6382 (err=3.1e-16) 4. NR (F): elapsed time t=1.11 s, 2048 iters, t-(init.)=1.08 s t(norm)=0.114441, mflops=43.6907 (err=3.1e-16) 5. PDA: elapsed time t=1.78 s, 2048 iters, t-(init.)=1.75 s t(norm)=0.185437, mflops=26.9634 (err=2.5e-16) 6. PDA (f2c): elapsed time t=1.99 s, 2048 iters, t-(init.)=1.96 s t(norm)=0.207689, mflops=24.0744 (err=2.6e-16) 7. Singleton: elapsed time t=1.21 s, 2048 iters, t-(init.)=1.17 s t(norm)=0.123978, mflops=40.3298 (err=3.4e-16) 8. Singleton (f2c): elapsed time t=1.72 s, 4096 iters, t-(init.)=1.66 s t(norm)=0.08795, mflops=56.8505 (err=3.4e-16) 9. Temperton: elapsed time t=1.24 s, 4096 iters, t-(init.)=1.17 s t(norm)=0.0619888, mflops=80.6597 (err=2.8e-16) 10. 
Temperton (f2c): elapsed time t=1.31 s, 2048 iters, t-(init.)=1.27 s t(norm)=0.134574, mflops=37.1543 (err=3.0e-16) Top mflops for N=512 = 80.6597 Normalized results and averages for N=512: fft 0: mflops = 44.939 (norm. = 0.557143), norm. avg. (of 2) = 0.778571 fft 1: mflops = 37.1543 (norm. = 0.46063), norm. avg. (of 1) = 0.46063 fft 2: mflops = 32.3191 (norm. = 0.400685), norm. avg. (of 1) = 0.400685 fft 3: mflops = 64.6382 (norm. = 0.80137), norm. avg. (of 2) = 0.770088 fft 4: mflops = 43.6907 (norm. = 0.541667), norm. avg. (of 2) = 0.506548 fft 5: mflops = 26.9634 (norm. = 0.334286), norm. avg. (of 2) = 0.288466 fft 6: mflops = 24.0744 (norm. = 0.298469), norm. avg. (of 2) = 0.257787 fft 7: mflops = 40.3298 (norm. = 0.5), norm. avg. (of 2) = 0.603571 fft 8: mflops = 56.8505 (norm. = 0.704819), norm. avg. (of 2) = 0.619977 fft 9: mflops = 80.6597 (norm. = 1), norm. avg. (of 2) = 0.930435 fft 10: mflops = 37.1543 (norm. = 0.46063), norm. avg. (of 2) = 0.477815 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.39 s, 256 iters, t-(init.)=1.35 s t(norm)=0.107288, mflops=46.6034 (err=3.1e-16) 1. HARM: elapsed time t=1.59 s, 256 iters, t-(init.)=1.55 s t(norm)=0.123183, mflops=40.59 (err=3.2e-16) 2. HARM (f2c): elapsed time t=1.98 s, 256 iters, t-(init.)=1.94 s t(norm)=0.154177, mflops=32.4302 (err=3.1e-16) 3. NR (C): elapsed time t=1.15 s, 128 iters, t-(init.)=1.13 s t(norm)=0.179609, mflops=27.8383 (err=3.5e-16) 4. NR (F): elapsed time t=1.25 s, 128 iters, t-(init.)=1.23 s t(norm)=0.195503, mflops=25.575 (err=3.5e-16) 5. PDA: elapsed time t=1.02 s, 128 iters, t-(init.)=1 s t(norm)=0.158946, mflops=31.4573 (err=2.9e-16) 6. PDA (f2c): elapsed time t=1.94 s, 256 iters, t-(init.)=1.9 s t(norm)=0.150998, mflops=33.1129 (err=2.9e-16) 7. Singleton: elapsed time t=1.73 s, 256 iters, t-(init.)=1.69 s t(norm)=0.134309, mflops=37.2276 (err=3.5e-16) 8. 
Singleton (f2c): elapsed time t=1.11 s, 128 iters, t-(init.)=1.08 s t(norm)=0.171661, mflops=29.1271 (err=3.3e-16) 9. Temperton: elapsed time t=1 s, 256 iters, t-(init.)=0.96 s t(norm)=0.0762939, mflops=65.536 (err=3.3e-16) 10. Temperton (f2c): elapsed time t=1.81 s, 256 iters, t-(init.)=1.77 s t(norm)=0.140667, mflops=35.5449 (err=3.1e-16) Top mflops for N=4096 = 65.536 Normalized results and averages for N=4096: fft 0: mflops = 46.6034 (norm. = 0.711111), norm. avg. (of 3) = 0.756085 fft 1: mflops = 40.59 (norm. = 0.619355), norm. avg. (of 2) = 0.539992 fft 2: mflops = 32.4302 (norm. = 0.494845), norm. avg. (of 2) = 0.447765 fft 3: mflops = 27.8383 (norm. = 0.424779), norm. avg. (of 3) = 0.654985 fft 4: mflops = 25.575 (norm. = 0.390244), norm. avg. (of 3) = 0.46778 fft 5: mflops = 31.4573 (norm. = 0.48), norm. avg. (of 3) = 0.352311 fft 6: mflops = 33.1129 (norm. = 0.505263), norm. avg. (of 3) = 0.340279 fft 7: mflops = 37.2276 (norm. = 0.568047), norm. avg. (of 3) = 0.59173 fft 8: mflops = 29.1271 (norm. = 0.444444), norm. avg. (of 3) = 0.561466 fft 9: mflops = 65.536 (norm. = 1), norm. avg. (of 3) = 0.953623 fft 10: mflops = 35.5449 (norm. = 0.542373), norm. avg. (of 3) = 0.499334 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.11 s, 16 iters, t-(init.)=1.05 s t(norm)=0.133514, mflops=37.4491 (err=4.3e-16) 1. HARM: elapsed time t=1.33 s, 16 iters, t-(init.)=1.27 s t(norm)=0.161489, mflops=30.9619 (err=4.6e-16) 2. HARM (f2c): elapsed time t=1.59 s, 16 iters, t-(init.)=1.53 s t(norm)=0.19455, mflops=25.7004 (err=4.3e-16) 3. NR (C): elapsed time t=1.22 s, 8 iters, t-(init.)=1.19 s t(norm)=0.302633, mflops=16.5217 (err=4.2e-16) 4. NR (F): elapsed time t=1.27 s, 8 iters, t-(init.)=1.24 s t(norm)=0.315348, mflops=15.8555 (err=4.2e-16) 5. PDA: elapsed time t=1.43 s, 16 iters, t-(init.)=1.37 s t(norm)=0.174205, mflops=28.7019 (err=3.5e-16) 6. 
PDA (f2c): elapsed time t=1.4 s, 16 iters, t-(init.)=1.34 s t(norm)=0.17039, mflops=29.3445 (err=3.6e-16) 7. Singleton: elapsed time t=1.86 s, 16 iters, t-(init.)=1.8 s t(norm)=0.228882, mflops=21.8453 (err=4.4e-16) 8. Singleton (f2c): elapsed time t=1.06 s, 8 iters, t-(init.)=1.03 s t(norm)=0.261943, mflops=19.0882 (err=4.2e-16) 9. Temperton: elapsed time t=1.09 s, 16 iters, t-(init.)=1.03 s t(norm)=0.130971, mflops=38.1763 (err=4.2e-16) 10. Temperton (f2c): elapsed time t=1.57 s, 16 iters, t-(init.)=1.51 s t(norm)=0.192006, mflops=26.0408 (err=4.0e-16) Top mflops for N=32768 = 38.1763 Normalized results and averages for N=32768: fft 0: mflops = 37.4491 (norm. = 0.980952), norm. avg. (of 4) = 0.812302 fft 1: mflops = 30.9619 (norm. = 0.811024), norm. avg. (of 3) = 0.630336 fft 2: mflops = 25.7004 (norm. = 0.673203), norm. avg. (of 3) = 0.522911 fft 3: mflops = 16.5217 (norm. = 0.432773), norm. avg. (of 4) = 0.599432 fft 4: mflops = 15.8555 (norm. = 0.415323), norm. avg. (of 4) = 0.454665 fft 5: mflops = 28.7019 (norm. = 0.751825), norm. avg. (of 4) = 0.452189 fft 6: mflops = 29.3445 (norm. = 0.768657), norm. avg. (of 4) = 0.447374 fft 7: mflops = 21.8453 (norm. = 0.572222), norm. avg. (of 4) = 0.586853 fft 8: mflops = 19.0882 (norm. = 0.5), norm. avg. (of 4) = 0.5461 fft 9: mflops = 38.1763 (norm. = 1), norm. avg. (of 4) = 0.965217 fft 10: mflops = 26.0408 (norm. = 0.682119), norm. avg. (of 4) = 0.545031 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.63 s, 2 iters, t-(init.)=1.54 s t(norm)=0.163184, mflops=30.6402 (err=4.2e-16) 1. HARM: elapsed time t=1.73 s, 2 iters, t-(init.)=1.64 s t(norm)=0.173781, mflops=28.7719 (err=4.9e-16) 2. HARM (f2c): elapsed time t=1.02 s, 1 iters, t-(init.)=0.98 s t(norm)=0.207689, mflops=24.0744 (err=4.3e-16) 3. NR (C): elapsed time t=2.07 s, 1 iters, t-(init.)=2.03 s t(norm)=0.430213, mflops=11.6221 (err=4.9e-16) 4. 
NR (F): elapsed time t=2.08 s, 1 iters, t-(init.)=2.03 s t(norm)=0.430213, mflops=11.6221 (err=4.9e-16) 5. PDA: elapsed time t=1.86 s, 2 iters, t-(init.)=1.77 s t(norm)=0.187556, mflops=26.6587 (err=4.4e-16) 6. PDA (f2c): elapsed time t=1.71 s, 2 iters, t-(init.)=1.62 s t(norm)=0.171661, mflops=29.1271 (err=4.4e-16) 7. Singleton: elapsed time t=1.29 s, 1 iters, t-(init.)=1.25 s t(norm)=0.26491, mflops=18.8744 (err=5.0e-16) 8. Singleton (f2c): elapsed time t=1.54 s, 1 iters, t-(init.)=1.49 s t(norm)=0.315772, mflops=15.8342 (err=4.8e-16) 9. Temperton: elapsed time t=1.4 s, 2 iters, t-(init.)=1.31 s t(norm)=0.138813, mflops=36.0198 (err=4.8e-16) 10. Temperton (f2c): elapsed time t=1.96 s, 2 iters, t-(init.)=1.86 s t(norm)=0.197093, mflops=25.3688 (err=4.4e-16) Top mflops for N=262144 = 36.0198 Normalized results and averages for N=262144: fft 0: mflops = 30.6402 (norm. = 0.850649), norm. avg. (of 5) = 0.819971 fft 1: mflops = 28.7719 (norm. = 0.79878), norm. avg. (of 4) = 0.672447 fft 2: mflops = 24.0744 (norm. = 0.668367), norm. avg. (of 4) = 0.559275 fft 3: mflops = 11.6221 (norm. = 0.32266), norm. avg. (of 5) = 0.544078 fft 4: mflops = 11.6221 (norm. = 0.32266), norm. avg. (of 5) = 0.428264 fft 5: mflops = 26.6587 (norm. = 0.740113), norm. avg. (of 5) = 0.509774 fft 6: mflops = 29.1271 (norm. = 0.808642), norm. avg. (of 5) = 0.519627 fft 7: mflops = 18.8744 (norm. = 0.524), norm. avg. (of 5) = 0.574282 fft 8: mflops = 15.8342 (norm. = 0.439597), norm. avg. (of 5) = 0.524799 fft 9: mflops = 36.0198 (norm. = 1), norm. avg. (of 5) = 0.972174 fft 10: mflops = 25.3688 (norm. = 0.704301), norm. avg. (of 5) = 0.576885 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.71 s, 1 iters, t-(init.)=1.62 s t(norm)=0.162627, mflops=30.7453 (err=5.5e-16) 1. HARM: elapsed time t=1.91 s, 1 iters, t-(init.)=1.83 s t(norm)=0.183708, mflops=27.2171 (err=6.1e-16) 2. 
HARM (f2c): elapsed time t=2.25 s, 1 iters, t-(init.)=2.16 s t(norm)=0.216835, mflops=23.059 (err=5.9e-16) 3. NR (C): elapsed time t=4.36 s, 1 iters, t-(init.)=4.27 s t(norm)=0.428652, mflops=11.6645 (err=5.7e-16) 4. NR (F): elapsed time t=4.51 s, 1 iters, t-(init.)=4.42 s t(norm)=0.44371, mflops=11.2686 (err=5.7e-16) 5. PDA: elapsed time t=2.02 s, 1 iters, t-(init.)=1.93 s t(norm)=0.193746, mflops=25.8069 (err=4.9e-16) 6. PDA (f2c): elapsed time t=1.87 s, 1 iters, t-(init.)=1.78 s t(norm)=0.178688, mflops=27.9817 (err=5.1e-16) 7. Singleton: elapsed time t=3.15 s, 1 iters, t-(init.)=3.06 s t(norm)=0.307184, mflops=16.2769 (err=7.2e-16) 8. Singleton (f2c): elapsed time t=3.66 s, 1 iters, t-(init.)=3.57 s t(norm)=0.358381, mflops=13.9516 (err=7.1e-16) 9. Temperton: elapsed time t=1.47 s, 1 iters, t-(init.)=1.38 s t(norm)=0.138534, mflops=36.0923 (err=5.6e-16) 10. Temperton (f2c): elapsed time t=2.07 s, 1 iters, t-(init.)=1.98 s t(norm)=0.198766, mflops=25.1552 (err=5.5e-16) Top mflops for N=524288 = 36.0923 Normalized results and averages for N=524288: fft 0: mflops = 30.7453 (norm. = 0.851852), norm. avg. (of 6) = 0.825285 fft 1: mflops = 27.2171 (norm. = 0.754098), norm. avg. (of 5) = 0.688777 fft 2: mflops = 23.059 (norm. = 0.638889), norm. avg. (of 5) = 0.575198 fft 3: mflops = 11.6645 (norm. = 0.323185), norm. avg. (of 6) = 0.507262 fft 4: mflops = 11.2686 (norm. = 0.312217), norm. avg. (of 6) = 0.408923 fft 5: mflops = 25.8069 (norm. = 0.715026), norm. avg. (of 6) = 0.543983 fft 6: mflops = 27.9817 (norm. = 0.775281), norm. avg. (of 6) = 0.562236 fft 7: mflops = 16.2769 (norm. = 0.45098), norm. avg. (of 6) = 0.553732 fft 8: mflops = 13.9516 (norm. = 0.386555), norm. avg. (of 6) = 0.501758 fft 9: mflops = 36.0923 (norm. = 1), norm. avg. (of 6) = 0.976812 fft 10: mflops = 25.1552 (norm. = 0.69697), norm. avg. (of 6) = 0.596899 Benchmarking for array size = 16x1024x64 (power of 2): 0. 
FFTW: elapsed time t=3.39 s, 1 iters, t-(init.)=3.2 s t(norm)=0.152588, mflops=32.768 (err=6.2e-16) 1. HARM: elapsed time t=3.98 s, 1 iters, t-(init.)=3.8 s t(norm)=0.181198, mflops=27.5941 (err=5.8e-16) 2. HARM (f2c): elapsed time t=4.69 s, 1 iters, t-(init.)=4.51 s t(norm)=0.215054, mflops=23.25 (err=5.5e-16) 3. NR (C): elapsed time t=9.02 s, 1 iters, t-(init.)=8.84 s t(norm)=0.421524, mflops=11.8617 (err=7.3e-16) 4. NR (F): elapsed time t=9.4 s, 1 iters, t-(init.)=9.21 s t(norm)=0.439167, mflops=11.3852 (err=7.3e-16) 5. PDA: elapsed time t=4.47 s, 1 iters, t-(init.)=4.28 s t(norm)=0.204086, mflops=24.4994 (err=6.0e-16) 6. PDA (f2c): elapsed time t=4.2 s, 1 iters, t-(init.)=4.01 s t(norm)=0.191212, mflops=26.149 (err=6.0e-16) 7. Singleton: elapsed time t=5.94 s, 1 iters, t-(init.)=5.76 s t(norm)=0.274658, mflops=18.2044 (err=7.1e-16) 8. Singleton (f2c): elapsed time t=6.99 s, 1 iters, t-(init.)=6.81 s t(norm)=0.324726, mflops=15.3976 (err=6.9e-16) 9. Skipping fft (Temperton can't handle dimensions > 256). 10. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 32.768 Normalized results and averages for N=1048576: fft 0: mflops = 32.768 (norm. = 1), norm. avg. (of 7) = 0.850244 fft 1: mflops = 27.5941 (norm. = 0.842105), norm. avg. (of 6) = 0.714332 fft 2: mflops = 23.25 (norm. = 0.709534), norm. avg. (of 6) = 0.597587 fft 3: mflops = 11.8617 (norm. = 0.361991), norm. avg. (of 7) = 0.486509 fft 4: mflops = 11.3852 (norm. = 0.347448), norm. avg. (of 7) = 0.400141 fft 5: mflops = 24.4994 (norm. = 0.747664), norm. avg. (of 7) = 0.57308 fft 6: mflops = 26.149 (norm. = 0.798005), norm. avg. (of 7) = 0.595917 fft 7: mflops = 18.2044 (norm. = 0.555556), norm. avg. (of 7) = 0.553993 fft 8: mflops = 15.3976 (norm. = 0.469897), norm. avg. (of 7) = 0.497207 fft 9: mflops = -1 (norm. = -0.0305176), norm. avg. (of 6) = 0.976812 fft 10: mflops = -1 (norm. = -0.0305176), norm. avg. 
(of 6) = 0.596899 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=5.47 s, 1 iters, t-(init.)=5.11 s t(norm)=0.11603, mflops=43.0922 (err=6.9e-16) 1. HARM: elapsed time t=8.27 s, 1 iters, t-(init.)=7.91 s t(norm)=0.179609, mflops=27.8383 (err=6.3e-16) 2. HARM (f2c): elapsed time t=9.86 s, 1 iters, t-(init.)=9.5 s t(norm)=0.215712, mflops=23.179 (err=6.1e-16) 3. NR (C): elapsed time t=19.58 s, 1 iters, t-(init.)=19.22 s t(norm)=0.43642, mflops=11.4569 (err=7.5e-16) 4. NR (F): elapsed time t=20.19 s, 1 iters, t-(init.)=19.83 s t(norm)=0.450271, mflops=11.1044 (err=7.5e-16) 5. PDA: elapsed time t=8.22 s, 1 iters, t-(init.)=7.86 s t(norm)=0.178473, mflops=28.0154 (err=6.7e-16) 6. PDA (f2c): elapsed time t=7.48 s, 1 iters, t-(init.)=7.12 s t(norm)=0.161671, mflops=30.9271 (err=6.7e-16) 7. Singleton: elapsed time t=16.64 s, 1 iters, t-(init.)=16.28 s t(norm)=0.369662, mflops=13.5259 (err=8.0e-16) 8. Singleton (f2c): elapsed time t=17.78 s, 1 iters, t-(init.)=17.42 s t(norm)=0.395548, mflops=12.6407 (err=7.9e-16) 9. Temperton: elapsed time t=8.48 s, 1 iters, t-(init.)=8.11 s t(norm)=0.18415, mflops=27.1518 (err=7.0e-16) 10. Temperton (f2c): elapsed time t=11.39 s, 1 iters, t-(init.)=11.02 s t(norm)=0.250226, mflops=19.9819 (err=7.1e-16) Top mflops for N=2097152 = 43.0922 Normalized results and averages for N=2097152: fft 0: mflops = 43.0922 (norm. = 1), norm. avg. (of 8) = 0.868963 fft 1: mflops = 27.8383 (norm. = 0.646018), norm. avg. (of 7) = 0.704573 fft 2: mflops = 23.179 (norm. = 0.537895), norm. avg. (of 7) = 0.58906 fft 3: mflops = 11.4569 (norm. = 0.265869), norm. avg. (of 8) = 0.458929 fft 4: mflops = 11.1044 (norm. = 0.25769), norm. avg. (of 8) = 0.382335 fft 5: mflops = 28.0154 (norm. = 0.650127), norm. avg. (of 8) = 0.582711 fft 6: mflops = 30.9271 (norm. = 0.717697), norm. avg. (of 8) = 0.61114 fft 7: mflops = 13.5259 (norm. = 0.313882), norm. avg. (of 8) = 0.523979 fft 8: mflops = 12.6407 (norm. = 0.293341), norm. avg. 
(of 8) = 0.471724 fft 9: mflops = 27.1518 (norm. = 0.630086), norm. avg. (of 7) = 0.927279 fft 10: mflops = 19.9819 (norm. = 0.463702), norm. avg. (of 7) = 0.577871 Benchmarking for array size = 512x128x64 (power of 2): 0. FFTW: elapsed time t=12.82 s, 1 iters, t-(init.)=12.1 s t(norm)=0.13113, mflops=38.13 (err=7.7e-16) 1. HARM: elapsed time t=17.73 s, 1 iters, t-(init.)=17 s t(norm)=0.184233, mflops=27.1396 (err=7.3e-16) 2. HARM (f2c): elapsed time t=21.02 s, 1 iters, t-(init.)=20.3 s t(norm)=0.219995, mflops=22.7278 (err=7.0e-16) 3. NR (C): elapsed time t=42.77 s, 1 iters, t-(init.)=42.05 s t(norm)=0.455705, mflops=10.972 (err=8.1e-16) 4. NR (F): elapsed time t=43.96 s, 1 iters, t-(init.)=43.23 s t(norm)=0.468493, mflops=10.6725 (err=8.1e-16) 5. PDA: elapsed time t=17.93 s, 1 iters, t-(init.)=17.2 s t(norm)=0.1864, mflops=26.824 (err=7.6e-16) 6. PDA (f2c): elapsed time t=16.4 s, 1 iters, t-(init.)=15.68 s t(norm)=0.169927, mflops=29.4243 (err=7.6e-16) 7. Singleton: elapsed time t=31.6 s, 1 iters, t-(init.)=30.87 s t(norm)=0.334545, mflops=14.9457 (err=9.6e-16) 8. Singleton (f2c): elapsed time t=35.8 s, 1 iters, t-(init.)=35.07 s t(norm)=0.380061, mflops=13.1558 (err=9.4e-16) 9. Skipping fft (Temperton can't handle dimensions > 256). 10. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=4194304 = 38.13 Normalized results and averages for N=4194304: fft 0: mflops = 38.13 (norm. = 1), norm. avg. (of 9) = 0.883523 fft 1: mflops = 27.1396 (norm. = 0.711765), norm. avg. (of 8) = 0.705472 fft 2: mflops = 22.7278 (norm. = 0.596059), norm. avg. (of 8) = 0.589935 fft 3: mflops = 10.972 (norm. = 0.287753), norm. avg. (of 9) = 0.439909 fft 4: mflops = 10.6725 (norm. = 0.279898), norm. avg. (of 9) = 0.370953 fft 5: mflops = 26.824 (norm. = 0.703488), norm. avg. (of 9) = 0.596131 fft 6: mflops = 29.4243 (norm. = 0.771684), norm. avg. (of 9) = 0.628978 fft 7: mflops = 14.9457 (norm. = 0.391966), norm. avg. (of 9) = 0.509311 fft 8: mflops = 13.1558 (norm. 
= 0.345024), norm. avg. (of 9) = 0.457646 fft 9: mflops = -1 (norm. = -0.026226), norm. avg. (of 7) = 0.927279 fft 10: mflops = -1 (norm. = -0.026226), norm. avg. (of 7) = 0.577871 Benchmarking for array size = 256x128x256 (power of 2): 0. FFTW: elapsed time t=29.4 s, 1 iters, t-(init.)=27.94 s t(norm)=0.144813, mflops=34.5272 (err=7.3e-16) 1. HARM: elapsed time t=35.94 s, 1 iters, t-(init.)=34.48 s t(norm)=0.17871, mflops=27.9782 (err=7.5e-16) 2. HARM (f2c): elapsed time t=42.12 s, 1 iters, t-(init.)=40.66 s t(norm)=0.210741, mflops=23.7258 (err=7.3e-16) 3. NR (C): elapsed time t=89.34 s, 1 iters, t-(init.)=87.88 s t(norm)=0.455483, mflops=10.9774 (err=7.6e-16) 4. NR (F): elapsed time t=91.11 s, 1 iters, t-(init.)=89.64 s t(norm)=0.464605, mflops=10.7618 (err=7.6e-16) 5. PDA: elapsed time t=37.83 s, 1 iters, t-(init.)=36.31 s t(norm)=0.188195, mflops=26.5682 (err=6.8e-16) 6. PDA (f2c): elapsed time t=33.8 s, 1 iters, t-(init.)=32.35 s t(norm)=0.16767, mflops=29.8204 (err=6.9e-16) 7. Singleton: elapsed time t=65.11 s, 1 iters, t-(init.)=63.65 s t(norm)=0.329899, mflops=15.1562 (err=9.6e-16) 8. Singleton (f2c): elapsed time t=75.05 s, 1 iters, t-(init.)=73.59 s t(norm)=0.381418, mflops=13.109 (err=9.4e-16) 9. Temperton: elapsed time t=35.03 s, 1 iters, t-(init.)=33.57 s t(norm)=0.173994, mflops=28.7367 (err=7.4e-16) 10. Temperton (f2c): elapsed time t=47.32 s, 1 iters, t-(init.)=45.85 s t(norm)=0.237641, mflops=21.0401 (err=7.3e-16) Top mflops for N=8388608 = 34.5272 Normalized results and averages for N=8388608: fft 0: mflops = 34.5272 (norm. = 1), norm. avg. (of 10) = 0.895171 fft 1: mflops = 27.9782 (norm. = 0.810325), norm. avg. (of 9) = 0.717122 fft 2: mflops = 23.7258 (norm. = 0.687162), norm. avg. (of 9) = 0.600738 fft 3: mflops = 10.9774 (norm. = 0.317934), norm. avg. (of 10) = 0.427712 fft 4: mflops = 10.7618 (norm. = 0.311691), norm. avg. (of 10) = 0.365027 fft 5: mflops = 26.5682 (norm. = 0.769485), norm. avg. 
(of 10) = 0.613466 fft 6: mflops = 29.8204 (norm. = 0.863679), norm. avg. (of 10) = 0.652448 fft 7: mflops = 15.1562 (norm. = 0.438963), norm. avg. (of 10) = 0.502276 fft 8: mflops = 13.109 (norm. = 0.379671), norm. avg. (of 10) = 0.449848 fft 9: mflops = 28.7367 (norm. = 0.832291), norm. avg. (of 8) = 0.915406 fft 10: mflops = 21.0401 (norm. = 0.609378), norm. avg. (of 8) = 0.581809 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) 120x120x120 (26.3728 MB) 144x144x144 (45.5692 MB) 180x180x180 (88.9976 MB) Maximum array size N = 5832000 Benchmarking FFTs: 0. FFTW 1. PDA 2. PDA (f2c) 3. Singleton 4. Singleton (f2c) 5. Temperton 6. Temperton (f2c) Computing normalized averages (7 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1.01 s, 8192 iters, t-(init.)=0.99 s t(norm)=0.138792, mflops=36.0251 (err=2.9e-16) 1. PDA: elapsed time t=1.07 s, 4096 iters, t-(init.)=1.06 s t(norm)=0.297212, mflops=16.823 (err=2.5e-16) 2. PDA (f2c): elapsed time t=1.44 s, 4096 iters, t-(init.)=1.42 s t(norm)=0.398152, mflops=12.558 (err=2.4e-16) 3. Singleton: elapsed time t=1.47 s, 8192 iters, t-(init.)=1.43 s t(norm)=0.200478, mflops=24.9404 (err=3.2e-16) 4. Singleton (f2c): elapsed time t=1.08 s, 8192 iters, t-(init.)=1.06 s t(norm)=0.148606, mflops=33.6461 (err=2.6e-16) 5. Temperton: elapsed time t=1.09 s, 16384 iters, t-(init.)=1.04 s t(norm)=0.072901, mflops=68.5862 (err=2.1e-16) 6. 
Temperton (f2c): elapsed time t=1.28 s, 8192 iters, t-(init.)=1.25 s t(norm)=0.175243, mflops=28.5319 (err=1.7e-16) Top mflops for N=125 = 68.5862 Normalized results and averages for N=125: fft 0: mflops = 36.0251 (norm. = 0.525253), norm. avg. (of 1) = 0.525253 fft 1: mflops = 16.823 (norm. = 0.245283), norm. avg. (of 1) = 0.245283 fft 2: mflops = 12.558 (norm. = 0.183099), norm. avg. (of 1) = 0.183099 fft 3: mflops = 24.9404 (norm. = 0.363636), norm. avg. (of 1) = 0.363636 fft 4: mflops = 33.6461 (norm. = 0.490566), norm. avg. (of 1) = 0.490566 fft 5: mflops = 68.5862 (norm. = 1), norm. avg. (of 1) = 1 fft 6: mflops = 28.5319 (norm. = 0.416), norm. avg. (of 1) = 0.416 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.88 s, 16384 iters, t-(init.)=1.79 s t(norm)=0.0652235, mflops=76.6595 (err=2.3e-16) 1. PDA: elapsed time t=1.69 s, 4096 iters, t-(init.)=1.66 s t(norm)=0.241946, mflops=20.6657 (err=3.7e-16) 2. PDA (f2c): elapsed time t=1.27 s, 2048 iters, t-(init.)=1.26 s t(norm)=0.367292, mflops=13.6132 (err=3.8e-16) 3. Singleton: elapsed time t=1.99 s, 8192 iters, t-(init.)=1.93 s t(norm)=0.140649, mflops=35.5494 (err=3.0e-16) 4. Singleton (f2c): elapsed time t=1.45 s, 8192 iters, t-(init.)=1.41 s t(norm)=0.102754, mflops=48.6598 (err=3.0e-16) 5. Temperton: elapsed time t=1.43 s, 8192 iters, t-(init.)=1.38 s t(norm)=0.100568, mflops=49.7176 (err=2.6e-16) 6. Temperton (f2c): elapsed time t=1.96 s, 8192 iters, t-(init.)=1.91 s t(norm)=0.139192, mflops=35.9216 (err=2.9e-16) Top mflops for N=216 = 76.6595 Normalized results and averages for N=216: fft 0: mflops = 76.6595 (norm. = 1), norm. avg. (of 2) = 0.762626 fft 1: mflops = 20.6657 (norm. = 0.269578), norm. avg. (of 2) = 0.257431 fft 2: mflops = 13.6132 (norm. = 0.177579), norm. avg. (of 2) = 0.180339 fft 3: mflops = 35.5494 (norm. = 0.463731), norm. avg. (of 2) = 0.413683 fft 4: mflops = 48.6598 (norm. = 0.634752), norm. avg. (of 2) = 0.562659 fft 5: mflops = 49.7176 (norm. = 0.648551), norm. avg. 
(of 2) = 0.824275 fft 6: mflops = 35.9216 (norm. = 0.468586), norm. avg. (of 2) = 0.442293 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.01 s, 2048 iters, t-(init.)=0.99 s t(norm)=0.167337, mflops=29.8798 (err=2.1e-16) 1. PDA: elapsed time t=1.15 s, 1024 iters, t-(init.)=1.14 s t(norm)=0.385383, mflops=12.9741 (err=3.7e-16) 2. PDA (f2c): elapsed time t=1.37 s, 1024 iters, t-(init.)=1.36 s t(norm)=0.459755, mflops=10.8754 (err=3.4e-16) 3. Singleton: elapsed time t=1.3 s, 4096 iters, t-(init.)=1.26 s t(norm)=0.106487, mflops=46.9539 (err=4.2e-16) 4. Singleton (f2c): elapsed time t=1.61 s, 2048 iters, t-(init.)=1.59 s t(norm)=0.268754, mflops=18.6044 (err=4.2e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 46.9539 Normalized results and averages for N=343: fft 0: mflops = 29.8798 (norm. = 0.636364), norm. avg. (of 3) = 0.720539 fft 1: mflops = 12.9741 (norm. = 0.276316), norm. avg. (of 3) = 0.263726 fft 2: mflops = 10.8754 (norm. = 0.231618), norm. avg. (of 3) = 0.197432 fft 3: mflops = 46.9539 (norm. = 1), norm. avg. (of 3) = 0.609122 fft 4: mflops = 18.6044 (norm. = 0.396226), norm. avg. (of 3) = 0.507181 fft 5: mflops = -1 (norm. = -0.0212975), norm. avg. (of 2) = 0.824275 fft 6: mflops = -1 (norm. = -0.0212975), norm. avg. (of 2) = 0.442293 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.48 s, 1024 iters, t-(init.)=1.45 s t(norm)=0.204254, mflops=24.4793 (err=3.6e-16) 1. PDA: elapsed time t=1.32 s, 1024 iters, t-(init.)=1.3 s t(norm)=0.183124, mflops=27.3039 (err=3.2e-16) 2. PDA (f2c): elapsed time t=1.25 s, 512 iters, t-(init.)=1.24 s t(norm)=0.349345, mflops=14.3125 (err=3.6e-16) 3. Singleton: elapsed time t=1.42 s, 1024 iters, t-(init.)=1.39 s t(norm)=0.195802, mflops=25.536 (err=3.6e-16) 4. Singleton (f2c): elapsed time t=1.03 s, 1024 iters, t-(init.)=1 s t(norm)=0.140865, mflops=35.495 (err=3.6e-16) 5. 
Temperton: elapsed time t=1.95 s, 2048 iters, t-(init.)=1.89 s t(norm)=0.133117, mflops=37.5609 (err=3.8e-16) 6. Temperton (f2c): elapsed time t=1.97 s, 2048 iters, t-(init.)=1.92 s t(norm)=0.13523, mflops=36.974 (err=3.5e-16) Top mflops for N=729 = 37.5609 Normalized results and averages for N=729: fft 0: mflops = 24.4793 (norm. = 0.651724), norm. avg. (of 4) = 0.703335 fft 1: mflops = 27.3039 (norm. = 0.726923), norm. avg. (of 4) = 0.379525 fft 2: mflops = 14.3125 (norm. = 0.381048), norm. avg. (of 4) = 0.243336 fft 3: mflops = 25.536 (norm. = 0.679856), norm. avg. (of 4) = 0.626806 fft 4: mflops = 35.495 (norm. = 0.945), norm. avg. (of 4) = 0.616636 fft 5: mflops = 37.5609 (norm. = 1), norm. avg. (of 3) = 0.88285 fft 6: mflops = 36.974 (norm. = 0.984375), norm. avg. (of 3) = 0.622987 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.29 s, 512 iters, t-(init.)=1.27 s t(norm)=0.248898, mflops=20.0885 (err=2.7e-16) 1. PDA: elapsed time t=1.02 s, 512 iters, t-(init.)=1 s t(norm)=0.195983, mflops=25.5124 (err=2.8e-16) 2. PDA (f2c): elapsed time t=1.32 s, 512 iters, t-(init.)=1.3 s t(norm)=0.254778, mflops=19.6249 (err=3.1e-16) 3. Singleton: elapsed time t=1.15 s, 512 iters, t-(init.)=1.13 s t(norm)=0.221461, mflops=22.5774 (err=3.8e-16) 4. Singleton (f2c): elapsed time t=1.82 s, 1024 iters, t-(init.)=1.78 s t(norm)=0.174425, mflops=28.6656 (err=3.7e-16) 5. Temperton: elapsed time t=1.58 s, 2048 iters, t-(init.)=1.5 s t(norm)=0.0734937, mflops=68.0331 (err=2.7e-16) 6. Temperton (f2c): elapsed time t=1.71 s, 1024 iters, t-(init.)=1.67 s t(norm)=0.163646, mflops=30.5538 (err=2.8e-16) Top mflops for N=1000 = 68.0331 Normalized results and averages for N=1000: fft 0: mflops = 20.0885 (norm. = 0.295276), norm. avg. (of 5) = 0.621723 fft 1: mflops = 25.5124 (norm. = 0.375), norm. avg. (of 5) = 0.37862 fft 2: mflops = 19.6249 (norm. = 0.288462), norm. avg. (of 5) = 0.252361 fft 3: mflops = 22.5774 (norm. = 0.331858), norm. avg. 
(of 5) = 0.567816 fft 4: mflops = 28.6656 (norm. = 0.421348), norm. avg. (of 5) = 0.577579 fft 5: mflops = 68.0331 (norm. = 1), norm. avg. (of 4) = 0.912138 fft 6: mflops = 30.5538 (norm. = 0.449102), norm. avg. (of 4) = 0.579516 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.56 s, 512 iters, t-(init.)=1.53 s t(norm)=0.21633, mflops=23.1128 (err=2.4e-16) 1. PDA: elapsed time t=1.29 s, 256 iters, t-(init.)=1.27 s t(norm)=0.359137, mflops=13.9223 (err=4.2e-16) 2. PDA (f2c): elapsed time t=1.55 s, 256 iters, t-(init.)=1.53 s t(norm)=0.432661, mflops=11.5564 (err=4.3e-16) 3. Singleton: elapsed time t=1.81 s, 1024 iters, t-(init.)=1.76 s t(norm)=0.124425, mflops=40.1848 (err=3.8e-16) 4. Singleton (f2c): elapsed time t=1.26 s, 256 iters, t-(init.)=1.25 s t(norm)=0.353481, mflops=14.145 (err=3.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 40.1848 Normalized results and averages for N=1331: fft 0: mflops = 23.1128 (norm. = 0.575163), norm. avg. (of 6) = 0.613963 fft 1: mflops = 13.9223 (norm. = 0.346457), norm. avg. (of 6) = 0.373259 fft 2: mflops = 11.5564 (norm. = 0.287582), norm. avg. (of 6) = 0.258231 fft 3: mflops = 40.1848 (norm. = 1), norm. avg. (of 6) = 0.639847 fft 4: mflops = 14.145 (norm. = 0.352), norm. avg. (of 6) = 0.539982 fft 5: mflops = -1 (norm. = -0.0248851), norm. avg. (of 4) = 0.912138 fft 6: mflops = -1 (norm. = -0.0248851), norm. avg. (of 4) = 0.579516 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.42 s, 512 iters, t-(init.)=1.39 s t(norm)=0.146082, mflops=34.2275 (err=3.5e-16) 1. PDA: elapsed time t=1.33 s, 512 iters, t-(init.)=1.3 s t(norm)=0.136623, mflops=36.5971 (err=3.1e-16) 2. PDA (f2c): elapsed time t=1.1 s, 256 iters, t-(init.)=1.08 s t(norm)=0.227004, mflops=22.026 (err=3.5e-16) 3. Singleton: elapsed time t=1.73 s, 512 iters, t-(init.)=1.7 s t(norm)=0.178661, mflops=27.986 (err=3.7e-16) 4. 
Singleton (f2c): elapsed time t=1.47 s, 512 iters, t-(init.)=1.43 s t(norm)=0.150285, mflops=33.2701 (err=3.7e-16) 5. Temperton: elapsed time t=1.43 s, 512 iters, t-(init.)=1.39 s t(norm)=0.146082, mflops=34.2275 (err=3.9e-16) 6. Temperton (f2c): elapsed time t=1.28 s, 512 iters, t-(init.)=1.25 s t(norm)=0.131368, mflops=38.0609 (err=3.9e-16) Top mflops for N=1728 = 38.0609 Normalized results and averages for N=1728: fft 0: mflops = 34.2275 (norm. = 0.899281), norm. avg. (of 7) = 0.654723 fft 1: mflops = 36.5971 (norm. = 0.961538), norm. avg. (of 7) = 0.457299 fft 2: mflops = 22.026 (norm. = 0.578704), norm. avg. (of 7) = 0.304013 fft 3: mflops = 27.986 (norm. = 0.735294), norm. avg. (of 7) = 0.653482 fft 4: mflops = 33.2701 (norm. = 0.874126), norm. avg. (of 7) = 0.587717 fft 5: mflops = 34.2275 (norm. = 0.899281), norm. avg. (of 5) = 0.909566 fft 6: mflops = 38.0609 (norm. = 1), norm. avg. (of 5) = 0.663613 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.55 s, 256 iters, t-(init.)=1.53 s t(norm)=0.245046, mflops=20.4044 (err=2.3e-16) 1. PDA: elapsed time t=1.19 s, 128 iters, t-(init.)=1.18 s t(norm)=0.377979, mflops=13.2283 (err=8.7e-16) 2. PDA (f2c): elapsed time t=1.44 s, 128 iters, t-(init.)=1.43 s t(norm)=0.458059, mflops=10.9156 (err=8.4e-16) 3. Singleton: elapsed time t=1.84 s, 512 iters, t-(init.)=1.79 s t(norm)=0.143344, mflops=34.8812 (err=5.3e-16) 4. Singleton (f2c): elapsed time t=1.25 s, 128 iters, t-(init.)=1.24 s t(norm)=0.397198, mflops=12.5882 (err=5.3e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 34.8812 Normalized results and averages for N=2197: fft 0: mflops = 20.4044 (norm. = 0.584967), norm. avg. (of 8) = 0.646003 fft 1: mflops = 13.2283 (norm. = 0.379237), norm. avg. (of 8) = 0.447542 fft 2: mflops = 10.9156 (norm. = 0.312937), norm. avg. (of 8) = 0.305128 fft 3: mflops = 34.8812 (norm. = 1), norm. avg. 
(of 8) = 0.696797 fft 4: mflops = 12.5882 (norm. = 0.360887), norm. avg. (of 8) = 0.559363 fft 5: mflops = -1 (norm. = -0.0286687), norm. avg. (of 5) = 0.909566 fft 6: mflops = -1 (norm. = -0.0286687), norm. avg. (of 5) = 0.663613 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.1 s, 128 iters, t-(init.)=1.09 s t(norm)=0.271699, mflops=18.4027 (err=2.6e-16) 1. PDA: elapsed time t=1.93 s, 256 iters, t-(init.)=1.91 s t(norm)=0.238048, mflops=21.0042 (err=3.7e-16) 2. PDA (f2c): elapsed time t=1.15 s, 128 iters, t-(init.)=1.13 s t(norm)=0.281669, mflops=17.7513 (err=3.6e-16) 3. Singleton: elapsed time t=1.24 s, 256 iters, t-(init.)=1.21 s t(norm)=0.150805, mflops=33.1553 (err=4.1e-16) 4. Singleton (f2c): elapsed time t=1.1 s, 128 iters, t-(init.)=1.08 s t(norm)=0.269206, mflops=18.5731 (err=4.1e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 33.1553 Normalized results and averages for N=2744: fft 0: mflops = 18.4027 (norm. = 0.555046), norm. avg. (of 9) = 0.635897 fft 1: mflops = 21.0042 (norm. = 0.633508), norm. avg. (of 9) = 0.468204 fft 2: mflops = 17.7513 (norm. = 0.535398), norm. avg. (of 9) = 0.330714 fft 3: mflops = 33.1553 (norm. = 1), norm. avg. (of 9) = 0.730486 fft 4: mflops = 18.5731 (norm. = 0.560185), norm. avg. (of 9) = 0.559455 fft 5: mflops = -1 (norm. = -0.0301611), norm. avg. (of 5) = 0.909566 fft 6: mflops = -1 (norm. = -0.0301611), norm. avg. (of 5) = 0.663613 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.35 s, 128 iters, t-(init.)=1.34 s t(norm)=0.264648, mflops=18.893 (err=3.0e-16) 1. PDA: elapsed time t=1.66 s, 256 iters, t-(init.)=1.63 s t(norm)=0.160961, mflops=31.0634 (err=3.5e-16) 2. PDA (f2c): elapsed time t=1.37 s, 128 iters, t-(init.)=1.35 s t(norm)=0.266623, mflops=18.7531 (err=3.4e-16) 3. Singleton: elapsed time t=1.36 s, 128 iters, t-(init.)=1.35 s t(norm)=0.266623, mflops=18.7531 (err=4.2e-16) 4. 
Singleton (f2c): elapsed time t=1.96 s, 256 iters, t-(init.)=1.93 s t(norm)=0.190586, mflops=26.2349 (err=4.1e-16) 5. Temperton: elapsed time t=1.69 s, 512 iters, t-(init.)=1.62 s t(norm)=0.0799869, mflops=62.5102 (err=3.8e-16) 6. Temperton (f2c): elapsed time t=1.54 s, 256 iters, t-(init.)=1.51 s t(norm)=0.149111, mflops=33.532 (err=3.8e-16) Top mflops for N=3375 = 62.5102 Normalized results and averages for N=3375: fft 0: mflops = 18.893 (norm. = 0.302239), norm. avg. (of 10) = 0.602531 fft 1: mflops = 31.0634 (norm. = 0.496933), norm. avg. (of 10) = 0.471077 fft 2: mflops = 18.7531 (norm. = 0.3), norm. avg. (of 10) = 0.327643 fft 3: mflops = 18.7531 (norm. = 0.3), norm. avg. (of 10) = 0.687438 fft 4: mflops = 26.2349 (norm. = 0.419689), norm. avg. (of 10) = 0.545478 fft 5: mflops = 62.5102 (norm. = 1), norm. avg. (of 6) = 0.924639 fft 6: mflops = 33.532 (norm. = 0.536424), norm. avg. (of 6) = 0.642415 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.74 s, 32 iters, t-(init.)=1.72 s t(norm)=0.22794, mflops=21.9356 (err=3.3e-16) 1. PDA: elapsed time t=1.27 s, 32 iters, t-(init.)=1.25 s t(norm)=0.165654, mflops=30.1834 (err=3.8e-16) 2. PDA (f2c): elapsed time t=1.61 s, 32 iters, t-(init.)=1.59 s t(norm)=0.210712, mflops=23.7291 (err=3.8e-16) 3. Singleton: elapsed time t=1.58 s, 32 iters, t-(init.)=1.56 s t(norm)=0.206736, mflops=24.1854 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.57 s, 32 iters, t-(init.)=1.54 s t(norm)=0.204086, mflops=24.4995 (err=4.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 30.1834 Normalized results and averages for N=16800: fft 0: mflops = 21.9356 (norm. = 0.726744), norm. avg. (of 11) = 0.613823 fft 1: mflops = 30.1834 (norm. = 1), norm. avg. (of 11) = 0.519161 fft 2: mflops = 23.7291 (norm. = 0.786164), norm. avg. (of 11) = 0.369326 fft 3: mflops = 24.1854 (norm. = 0.801282), norm. avg. 
(of 11) = 0.697787 fft 4: mflops = 24.4995 (norm. = 0.811688), norm. avg. (of 11) = 0.569679 fft 5: mflops = -1 (norm. = -0.0331308), norm. avg. (of 6) = 0.924639 fft 6: mflops = -1 (norm. = -0.0331308), norm. avg. (of 6) = 0.642415 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.15 s, 4 iters, t-(init.)=1.07 s t(norm)=0.144364, mflops=34.6347 (err=4.3e-16) 1. PDA: elapsed time t=1.17 s, 4 iters, t-(init.)=1.1 s t(norm)=0.148411, mflops=33.6901 (err=4.0e-16) 2. PDA (f2c): elapsed time t=1.64 s, 4 iters, t-(init.)=1.57 s t(norm)=0.211824, mflops=23.6045 (err=4.2e-16) 3. Singleton: elapsed time t=1.46 s, 2 iters, t-(init.)=1.42 s t(norm)=0.383171, mflops=13.049 (err=3.7e-16) 4. Singleton (f2c): elapsed time t=1.17 s, 2 iters, t-(init.)=1.13 s t(norm)=0.304918, mflops=16.3978 (err=3.6e-16) 5. Temperton: elapsed time t=1.72 s, 4 iters, t-(init.)=1.64 s t(norm)=0.221268, mflops=22.597 (err=4.9e-16) 6. Temperton (f2c): elapsed time t=1.57 s, 4 iters, t-(init.)=1.49 s t(norm)=0.20103, mflops=24.8719 (err=4.8e-16) Top mflops for N=110592 = 34.6347 Normalized results and averages for N=110592: fft 0: mflops = 34.6347 (norm. = 1), norm. avg. (of 12) = 0.646005 fft 1: mflops = 33.6901 (norm. = 0.972727), norm. avg. (of 12) = 0.556958 fft 2: mflops = 23.6045 (norm. = 0.681529), norm. avg. (of 12) = 0.395343 fft 3: mflops = 13.049 (norm. = 0.376761), norm. avg. (of 12) = 0.671035 fft 4: mflops = 16.3978 (norm. = 0.473451), norm. avg. (of 12) = 0.56166 fft 5: mflops = 22.597 (norm. = 0.652439), norm. avg. (of 7) = 0.885753 fft 6: mflops = 24.8719 (norm. = 0.718121), norm. avg. (of 7) = 0.65323 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.17 s, 2 iters, t-(init.)=1.13 s t(norm)=0.285109, mflops=17.5371 (err=4.3e-16) 1. PDA: elapsed time t=1.04 s, 2 iters, t-(init.)=1 s t(norm)=0.252309, mflops=19.8169 (err=5.7e-16) 2. PDA (f2c): elapsed time t=1.28 s, 2 iters, t-(init.)=1.24 s t(norm)=0.312863, mflops=15.9814 (err=5.7e-16) 3. 
Singleton: elapsed time t=1.08 s, 2 iters, t-(init.)=1.04 s t(norm)=0.262402, mflops=19.0548 (err=6.8e-16) 4. Singleton (f2c): elapsed time t=1.64 s, 2 iters, t-(init.)=1.6 s t(norm)=0.403695, mflops=12.3856 (err=6.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 19.8169 Normalized results and averages for N=117649: fft 0: mflops = 17.5371 (norm. = 0.884956), norm. avg. (of 13) = 0.664386 fft 1: mflops = 19.8169 (norm. = 1), norm. avg. (of 13) = 0.591038 fft 2: mflops = 15.9814 (norm. = 0.806452), norm. avg. (of 13) = 0.426967 fft 3: mflops = 19.0548 (norm. = 0.961538), norm. avg. (of 13) = 0.693381 fft 4: mflops = 12.3856 (norm. = 0.625), norm. avg. (of 13) = 0.566532 fft 5: mflops = -1 (norm. = -0.0504619), norm. avg. (of 7) = 0.885753 fft 6: mflops = -1 (norm. = -0.0504619), norm. avg. (of 7) = 0.65323 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.7 s, 2 iters, t-(init.)=1.63 s t(norm)=0.212924, mflops=23.4826 (err=4.8e-16) 1. PDA: elapsed time t=1.31 s, 2 iters, t-(init.)=1.23 s t(norm)=0.160672, mflops=31.1192 (err=4.4e-16) 2. PDA (f2c): elapsed time t=1.96 s, 2 iters, t-(init.)=1.89 s t(norm)=0.246887, mflops=20.2522 (err=4.4e-16) 3. Singleton: elapsed time t=1.55 s, 1 iters, t-(init.)=1.51 s t(norm)=0.394496, mflops=12.6744 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.38 s, 1 iters, t-(init.)=1.34 s t(norm)=0.350083, mflops=14.2823 (err=5.4e-16) 5. Temperton: elapsed time t=1.26 s, 2 iters, t-(init.)=1.18 s t(norm)=0.154141, mflops=32.4378 (err=5.8e-16) 6. Temperton (f2c): elapsed time t=1.46 s, 2 iters, t-(init.)=1.39 s t(norm)=0.181573, mflops=27.5372 (err=5.4e-16) Top mflops for N=216000 = 32.4378 Normalized results and averages for N=216000: fft 0: mflops = 23.4826 (norm. = 0.723926), norm. avg. (of 14) = 0.668638 fft 1: mflops = 31.1192 (norm. = 0.95935), norm. avg. (of 14) = 0.617346 fft 2: mflops = 20.2522 (norm. 
= 0.624339), norm. avg. (of 14) = 0.441065 fft 3: mflops = 12.6744 (norm. = 0.390728), norm. avg. (of 14) = 0.671763 fft 4: mflops = 14.2823 (norm. = 0.440299), norm. avg. (of 14) = 0.557516 fft 5: mflops = 32.4378 (norm. = 1), norm. avg. (of 8) = 0.900034 fft 6: mflops = 27.5372 (norm. = 0.848921), norm. avg. (of 8) = 0.677691 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.8 s, 2 iters, t-(init.)=1.72 s t(norm)=0.198773, mflops=25.1543 (err=4.1e-16) 1. PDA: elapsed time t=1.5 s, 2 iters, t-(init.)=1.42 s t(norm)=0.164103, mflops=30.4686 (err=4.7e-16) 2. PDA (f2c): elapsed time t=1.06 s, 1 iters, t-(init.)=1.02 s t(norm)=0.235754, mflops=21.2085 (err=4.7e-16) 3. Singleton: elapsed time t=1.66 s, 1 iters, t-(init.)=1.62 s t(norm)=0.374433, mflops=13.3535 (err=5.2e-16) 4. Singleton (f2c): elapsed time t=1.65 s, 1 iters, t-(init.)=1.61 s t(norm)=0.372122, mflops=13.4365 (err=5.2e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 30.4686 Normalized results and averages for N=241920: fft 0: mflops = 25.1543 (norm. = 0.825581), norm. avg. (of 15) = 0.679101 fft 1: mflops = 30.4686 (norm. = 1), norm. avg. (of 15) = 0.642857 fft 2: mflops = 21.2085 (norm. = 0.696078), norm. avg. (of 15) = 0.458066 fft 3: mflops = 13.3535 (norm. = 0.438272), norm. avg. (of 15) = 0.656197 fft 4: mflops = 13.4365 (norm. = 0.440994), norm. avg. (of 15) = 0.549747 fft 5: mflops = -1 (norm. = -0.0328207), norm. avg. (of 8) = 0.900034 fft 6: mflops = -1 (norm. = -0.0328207), norm. avg. (of 8) = 0.677691 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.71 s, 1 iters, t-(init.)=1.64 s t(norm)=0.208033, mflops=24.0346 (err=5.3e-16) 1. PDA: elapsed time t=1.6 s, 1 iters, t-(init.)=1.53 s t(norm)=0.19408, mflops=25.7626 (err=5.8e-16) 2. PDA (f2c): elapsed time t=1.96 s, 1 iters, t-(init.)=1.89 s t(norm)=0.239746, mflops=20.8554 (err=5.8e-16) 3. 
Singleton: elapsed time t=3.11 s, 1 iters, t-(init.)=3.03 s t(norm)=0.384354, mflops=13.0088 (err=6.6e-16) 4. Singleton (f2c): elapsed time t=2.62 s, 1 iters, t-(init.)=2.55 s t(norm)=0.323467, mflops=15.4575 (err=6.6e-16) 5. Temperton: elapsed time t=1.75 s, 2 iters, t-(init.)=1.59 s t(norm)=0.100845, mflops=49.5808 (err=7.0e-16) 6. Temperton (f2c): elapsed time t=1.46 s, 1 iters, t-(init.)=1.39 s t(norm)=0.176321, mflops=28.3574 (err=7.0e-16) Top mflops for N=421875 = 49.5808 Normalized results and averages for N=421875: fft 0: mflops = 24.0346 (norm. = 0.484756), norm. avg. (of 16) = 0.666955 fft 1: mflops = 25.7626 (norm. = 0.519608), norm. avg. (of 16) = 0.635154 fft 2: mflops = 20.8554 (norm. = 0.420635), norm. avg. (of 16) = 0.455726 fft 3: mflops = 13.0088 (norm. = 0.262376), norm. avg. (of 16) = 0.631583 fft 4: mflops = 15.4575 (norm. = 0.311765), norm. avg. (of 16) = 0.534874 fft 5: mflops = 49.5808 (norm. = 1), norm. avg. (of 9) = 0.911141 fft 6: mflops = 28.3574 (norm. = 0.571942), norm. avg. (of 9) = 0.665941 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.9 s, 1 iters, t-(init.)=1.81 s t(norm)=0.186397, mflops=26.8245 (err=3.7e-16) 1. PDA: elapsed time t=1.66 s, 1 iters, t-(init.)=1.57 s t(norm)=0.161681, mflops=30.9251 (err=3.8e-16) 2. PDA (f2c): elapsed time t=2.17 s, 1 iters, t-(init.)=2.09 s t(norm)=0.215231, mflops=23.2308 (err=3.7e-16) 3. Singleton: elapsed time t=4.29 s, 1 iters, t-(init.)=4.2 s t(norm)=0.432522, mflops=11.5601 (err=4.8e-16) 4. Singleton (f2c): elapsed time t=3.45 s, 1 iters, t-(init.)=3.36 s t(norm)=0.346018, mflops=14.4501 (err=4.7e-16) 5. Temperton: elapsed time t=2.1 s, 1 iters, t-(init.)=2.01 s t(norm)=0.206993, mflops=24.1554 (err=4.8e-16) 6. Temperton (f2c): elapsed time t=2.15 s, 1 iters, t-(init.)=2.06 s t(norm)=0.212142, mflops=23.5691 (err=5.2e-16) Top mflops for N=512000 = 30.9251 Normalized results and averages for N=512000: fft 0: mflops = 26.8245 (norm. = 0.867403), norm. avg. 
(of 17) = 0.678746 fft 1: mflops = 30.9251 (norm. = 1), norm. avg. (of 17) = 0.656615 fft 2: mflops = 23.2308 (norm. = 0.751196), norm. avg. (of 17) = 0.473107 fft 3: mflops = 11.5601 (norm. = 0.37381), norm. avg. (of 17) = 0.61642 fft 4: mflops = 14.4501 (norm. = 0.467262), norm. avg. (of 17) = 0.530896 fft 5: mflops = 24.1554 (norm. = 0.781095), norm. avg. (of 10) = 0.898136 fft 6: mflops = 23.5691 (norm. = 0.762136), norm. avg. (of 10) = 0.675561 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=2.76 s, 1 iters, t-(init.)=2.66 s t(norm)=0.234026, mflops=21.3651 (err=5.5e-16) 1. PDA: elapsed time t=2.18 s, 1 iters, t-(init.)=2.08 s t(norm)=0.182998, mflops=27.3227 (err=4.7e-16) 2. PDA (f2c): elapsed time t=3.07 s, 1 iters, t-(init.)=2.97 s t(norm)=0.2613, mflops=19.1351 (err=5.0e-16) 3. Singleton: elapsed time t=4.32 s, 1 iters, t-(init.)=4.22 s t(norm)=0.371274, mflops=13.4671 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=4.84 s, 1 iters, t-(init.)=4.74 s t(norm)=0.417024, mflops=11.9897 (err=5.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 27.3227 Normalized results and averages for N=592704: fft 0: mflops = 21.3651 (norm. = 0.781955), norm. avg. (of 18) = 0.68448 fft 1: mflops = 27.3227 (norm. = 1), norm. avg. (of 18) = 0.675692 fft 2: mflops = 19.1351 (norm. = 0.700337), norm. avg. (of 18) = 0.485731 fft 3: mflops = 13.4671 (norm. = 0.492891), norm. avg. (of 18) = 0.609557 fft 4: mflops = 11.9897 (norm. = 0.438819), norm. avg. (of 18) = 0.525781 fft 5: mflops = -1 (norm. = -0.0365996), norm. avg. (of 10) = 0.898136 fft 6: mflops = -1 (norm. = -0.0365996), norm. avg. (of 10) = 0.675561 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=3.02 s, 1 iters, t-(init.)=2.86 s t(norm)=0.163636, mflops=30.5557 (err=4.6e-16) 1. PDA: elapsed time t=2.8 s, 1 iters, t-(init.)=2.65 s t(norm)=0.15162, mflops=32.9771 (err=4.9e-16) 2. 
PDA (f2c): elapsed time t=3.7 s, 1 iters, t-(init.)=3.55 s t(norm)=0.203114, mflops=24.6167 (err=4.8e-16) 3. Singleton: elapsed time t=8.08 s, 1 iters, t-(init.)=7.93 s t(norm)=0.453717, mflops=11.0201 (err=5.5e-16) 4. Singleton (f2c): elapsed time t=6.9 s, 1 iters, t-(init.)=6.75 s t(norm)=0.386203, mflops=12.9466 (err=5.5e-16) 5. Temperton: elapsed time t=4.47 s, 1 iters, t-(init.)=4.31 s t(norm)=0.246598, mflops=20.2759 (err=5.2e-16) 6. Temperton (f2c): elapsed time t=4.16 s, 1 iters, t-(init.)=4 s t(norm)=0.228861, mflops=21.8473 (err=5.2e-16) Top mflops for N=884736 = 32.9771 Normalized results and averages for N=884736: fft 0: mflops = 30.5557 (norm. = 0.926573), norm. avg. (of 19) = 0.697221 fft 1: mflops = 32.9771 (norm. = 1), norm. avg. (of 19) = 0.692761 fft 2: mflops = 24.6167 (norm. = 0.746479), norm. avg. (of 19) = 0.499454 fft 3: mflops = 11.0201 (norm. = 0.334174), norm. avg. (of 19) = 0.595064 fft 4: mflops = 12.9466 (norm. = 0.392593), norm. avg. (of 19) = 0.518771 fft 5: mflops = 20.2759 (norm. = 0.614849), norm. avg. (of 11) = 0.872383 fft 6: mflops = 21.8473 (norm. = 0.6625), norm. avg. (of 11) = 0.674373 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=6.02 s, 1 iters, t-(init.)=5.82 s t(norm)=0.249595, mflops=20.0324 (err=4.7e-16) 1. PDA: elapsed time t=5.09 s, 1 iters, t-(init.)=4.89 s t(norm)=0.209712, mflops=23.8423 (err=5.5e-16) 2. PDA (f2c): elapsed time t=6.08 s, 1 iters, t-(init.)=5.88 s t(norm)=0.252169, mflops=19.828 (err=5.5e-16) 3. Singleton: elapsed time t=8.58 s, 1 iters, t-(init.)=8.38 s t(norm)=0.359383, mflops=13.9127 (err=6.5e-16) 4. Singleton (f2c): elapsed time t=9.18 s, 1 iters, t-(init.)=8.98 s t(norm)=0.385115, mflops=12.9831 (err=6.5e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 23.8423 Normalized results and averages for N=1157625: fft 0: mflops = 20.0324 (norm. = 0.840206), norm. avg. 
(of 20) = 0.704371 fft 1: mflops = 23.8423 (norm. = 1), norm. avg. (of 20) = 0.708123 fft 2: mflops = 19.828 (norm. = 0.831633), norm. avg. (of 20) = 0.516063 fft 3: mflops = 13.9127 (norm. = 0.583532), norm. avg. (of 20) = 0.594487 fft 4: mflops = 12.9831 (norm. = 0.544543), norm. avg. (of 20) = 0.52006 fft 5: mflops = -1 (norm. = -0.0419423), norm. avg. (of 11) = 0.872383 fft 6: mflops = -1 (norm. = -0.0419423), norm. avg. (of 11) = 0.674373 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=5.39 s, 1 iters, t-(init.)=5.15 s t(norm)=0.179495, mflops=27.8559 (err=5.1e-16) 1. PDA: elapsed time t=5.46 s, 1 iters, t-(init.)=5.21 s t(norm)=0.181587, mflops=27.5351 (err=5.6e-16) 2. PDA (f2c): elapsed time t=6.87 s, 1 iters, t-(init.)=6.63 s t(norm)=0.231079, mflops=21.6377 (err=5.7e-16) 3. Singleton: elapsed time t=11.3 s, 1 iters, t-(init.)=11.05 s t(norm)=0.385131, mflops=12.9826 (err=6.3e-16) 4. Singleton (f2c): elapsed time t=11.48 s, 1 iters, t-(init.)=11.24 s t(norm)=0.391753, mflops=12.7631 (err=6.3e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 27.8559 Normalized results and averages for N=1404928: fft 0: mflops = 27.8559 (norm. = 1), norm. avg. (of 21) = 0.718448 fft 1: mflops = 27.5351 (norm. = 0.988484), norm. avg. (of 21) = 0.721473 fft 2: mflops = 21.6377 (norm. = 0.776772), norm. avg. (of 21) = 0.528478 fft 3: mflops = 12.9826 (norm. = 0.466063), norm. avg. (of 21) = 0.588372 fft 4: mflops = 12.7631 (norm. = 0.458185), norm. avg. (of 21) = 0.517113 fft 5: mflops = -1 (norm. = -0.0358991), norm. avg. (of 11) = 0.872383 fft 6: mflops = -1 (norm. = -0.0358991), norm. avg. (of 11) = 0.674373 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=7.83 s, 1 iters, t-(init.)=7.53 s t(norm)=0.210304, mflops=23.7751 (err=4.8e-16) 1. PDA: elapsed time t=5.82 s, 1 iters, t-(init.)=5.52 s t(norm)=0.154167, mflops=32.4324 (err=4.8e-16) 2. 
PDA (f2c): elapsed time t=8.1 s, 1 iters, t-(init.)=7.8 s t(norm)=0.217845, mflops=22.9521 (err=4.9e-16) 3. Singleton: elapsed time t=16.31 s, 1 iters, t-(init.)=16.01 s t(norm)=0.44714, mflops=11.1822 (err=5.6e-16) 4. Singleton (f2c): elapsed time t=15.21 s, 1 iters, t-(init.)=14.91 s t(norm)=0.416419, mflops=12.0071 (err=5.6e-16) 5. Temperton: elapsed time t=7.04 s, 1 iters, t-(init.)=6.74 s t(norm)=0.18824, mflops=26.5618 (err=5.8e-16) 6. Temperton (f2c): elapsed time t=7.38 s, 1 iters, t-(init.)=7.08 s t(norm)=0.197736, mflops=25.2862 (err=5.7e-16) Top mflops for N=1728000 = 32.4324 Normalized results and averages for N=1728000: fft 0: mflops = 23.7751 (norm. = 0.733068), norm. avg. (of 22) = 0.719113 fft 1: mflops = 32.4324 (norm. = 1), norm. avg. (of 22) = 0.734134 fft 2: mflops = 22.9521 (norm. = 0.707692), norm. avg. (of 22) = 0.536624 fft 3: mflops = 11.1822 (norm. = 0.344785), norm. avg. (of 22) = 0.577299 fft 4: mflops = 12.0071 (norm. = 0.370221), norm. avg. (of 22) = 0.510436 fft 5: mflops = 26.5618 (norm. = 0.818991), norm. avg. (of 12) = 0.867934 fft 6: mflops = 25.2862 (norm. = 0.779661), norm. avg. (of 12) = 0.683147 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=11.7 s, 1 iters, t-(init.)=11.18 s t(norm)=0.174068, mflops=28.7244 (err=6.5e-16) 1. PDA: elapsed time t=9.75 s, 1 iters, t-(init.)=9.23 s t(norm)=0.143707, mflops=34.793 (err=5.7e-16) 2. PDA (f2c): elapsed time t=14.73 s, 1 iters, t-(init.)=14.21 s t(norm)=0.221244, mflops=22.5995 (err=5.8e-16) 3. Singleton: elapsed time t=29.76 s, 1 iters, t-(init.)=29.24 s t(norm)=0.455254, mflops=10.9829 (err=6.0e-16) 4. Singleton (f2c): elapsed time t=25.33 s, 1 iters, t-(init.)=24.81 s t(norm)=0.386281, mflops=12.9439 (err=5.8e-16) 5. Temperton: elapsed time t=15.32 s, 1 iters, t-(init.)=14.81 s t(norm)=0.230585, mflops=21.6839 (err=6.6e-16) 6. 
Temperton (f2c): elapsed time t=14.03 s, 1 iters, t-(init.)=13.51 s t(norm)=0.210345, mflops=23.7705 (err=6.6e-16) Top mflops for N=2985984 = 34.793 Normalized results and averages for N=2985984: fft 0: mflops = 28.7244 (norm. = 0.825581), norm. avg. (of 23) = 0.723742 fft 1: mflops = 34.793 (norm. = 1), norm. avg. (of 23) = 0.745693 fft 2: mflops = 22.5995 (norm. = 0.649543), norm. avg. (of 23) = 0.541534 fft 3: mflops = 10.9829 (norm. = 0.315663), norm. avg. (of 23) = 0.565924 fft 4: mflops = 12.9439 (norm. = 0.372027), norm. avg. (of 23) = 0.504419 fft 5: mflops = 21.6839 (norm. = 0.623228), norm. avg. (of 13) = 0.84911 fft 6: mflops = 23.7705 (norm. = 0.683198), norm. avg. (of 13) = 0.683151 Benchmarking for array size = 180x180x180: 0. FFTW: elapsed time t=27.77 s, 1 iters, t-(init.)=26.75 s t(norm)=0.204078, mflops=24.5005 (err=7.5e-16) 1. PDA: elapsed time t=20.56 s, 1 iters, t-(init.)=19.55 s t(norm)=0.149148, mflops=33.5236 (err=8.5e-16) 2. PDA (f2c): elapsed time t=32.39 s, 1 iters, t-(init.)=31.38 s t(norm)=0.2394, mflops=20.8855 (err=8.5e-16) 3. Singleton: elapsed time t=58.22 s, 1 iters, t-(init.)=57.2 s t(norm)=0.436383, mflops=11.4578 (err=7.1e-16) 4. Singleton (f2c): elapsed time t=52.85 s, 1 iters, t-(init.)=51.83 s t(norm)=0.395415, mflops=12.6449 (err=7.1e-16) 5. Temperton: elapsed time t=22.67 s, 1 iters, t-(init.)=21.65 s t(norm)=0.16517, mflops=30.2719 (err=7.8e-16) 6. Temperton (f2c): elapsed time t=24.96 s, 1 iters, t-(init.)=23.95 s t(norm)=0.182716, mflops=27.3648 (err=7.9e-16) Top mflops for N=5832000 = 33.5236 Normalized results and averages for N=5832000: fft 0: mflops = 24.5005 (norm. = 0.730841), norm. avg. (of 24) = 0.724038 fft 1: mflops = 33.5236 (norm. = 1), norm. avg. (of 24) = 0.756289 fft 2: mflops = 20.8855 (norm. = 0.623008), norm. avg. (of 24) = 0.544928 fft 3: mflops = 11.4578 (norm. = 0.341783), norm. avg. (of 24) = 0.556585 fft 4: mflops = 12.6449 (norm. = 0.377195), norm. avg. 
(of 24) = 0.499118 fft 5: mflops = 30.2719 (norm. = 0.903002), norm. avg. (of 14) = 0.85296 fft 6: mflops = 27.3648 (norm. = 0.816284), norm. avg. (of 14) = 0.692661 ------------------------------------------------------ @@@@ bench.1d.p2.dat N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Nielsen, NR (C), NR (F), Ooura (C), Ooura (F), QFT, Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg 2, 28.1497, 29.1271, 18.7246, 1.33747, 5.29584, 3.85506, 4.85452, 5.40503, 17.4763, 3.77186, 3.36082, , 10.3819, 8.45626, 35.2463, 36.4722, 48.2104, , 5.76141, 4.94611, 4.16102, 23.0456, , , , , 2.27951, 1.62822, 5.29584, 4.80998, 33.2881, 29.7468, , , , 4.51972, 4.40578, 15.5345, 15.6504, 3.2768, 3.19688, 4.63972 4, 48.2104, 48.771, 22.0753, 4.85452, 9.70904, 4.40578, 15.7681, 9.98644, 14.6654, 13.4433, 5.40503, 17.1898, 23.0456, 23.0456, 90.2001, 91.1805, 93.2068, , 9.79978, 9.36229, 7.88403, 69.3273, 30.1748, 29.7468, 28.7281, 4.99322, 3.97188, 5.57753, 9.70904, 8.192, 47.6625, 41.943, , 3.94202, 14.5636, 9.61996, 11.336, 21.6201, 12.9454, 8.88624, 10.3819, 4.80998 8, 31.1458, 30.541, 22.9615, 6.0033, 16.7326, 6.24152, 12.483, 11.6508, 13.5592, 33.1129, 16.0496, 15.7286, 38.5979, 35.7469, 133.861, 133.861, 182.361, 49.9322, 16.2151, 15.5729, 12.3848, 96.7916, 39.3216, 38.8361, 38.13, 10.2802, 5.91302, 9.53251, 15.2705, 12.6844, 82.2413, 62.2916, , 4.79532, 13.6771, 13.4433, 14.4299, 30.8405, 11.826, 12.7875, 15.5729, 4.79532 16, 30.8405, 31.0689, 23.6966, 9.53251, 21.7321, 7.08497, 16.0088, 13.2731, 14.2663, 60.787, 30.1748, 15.4202, 44.1506, 54.4715, 63.5501, 63.0722, 180.4, 51.15, 17.9244, 21.9597, 17.4763, 116.508, 29.3308, 33.5544, 38.4799, 14.8734, 8.192, 
11.7818, 22.55, 18.0789, 90.2001, 70.4925, 77.6723, 12.4092, 13.2731, 14.0748, 21.8453, 40.3298, 12.1223, 15.4202, 22.0753, 4.94611 32, 25.9549, 25.9549, 25.9549, 10.1606, 29.9593, 7.44727, 19.563, 15.1528, 15.4202, 49.4611, 52.9584, 16.487, 41.943, 45.5903, 85.2501, 44.4312, 167.772, 56.6798, 20.011, 29.7891, 24.0499, 112.75, 30.3057, 39.126, 43.6907, 19.8594, 9.78149, 13.3068, 31.775, 24.7306, 101.803, 78.8403, 63.1672, 13.1072, 12.9774, 17.1336, 27.0252, 52.4288, 12.9774, 16.1817, 22.2156, 5.04123 64, 33.825, 34.7594, 28.0869, 13.2173, 34.1927, 7.41917, 17.5739, 15.8875, 17.0039, 62.9146, 57.7198, 17.9756, 43.3894, 53.7731, 71.9024, 51.15, 32.768, 62.9146, 19.9097, 33.1129, 27.3542, 68.7591, 29.1271, 36.7921, 43.3894, 23.4756, 11.5652, 14.4299, 40.3298, 30.541, 99.078, 79.1378, 54.7083, 21.6947, 12.6844, 16.1319, 27.8383, 61.6809, 13.7971, 17.9756, 29.3993, 5.07375 128, 29.8375, 29.8375, 30.5835, 11.6879, 40.5527, 7.39923, 19.0156, 17.1496, 18.5354, 72.6736, 79.3517, 19.6258, 49.262, 59.1938, 91.7504, 47.0515, 29.1271, 61.1669, 22.1085, 36.7002, 30.8405, 69.9051, 30.8405, 40.5527, 46.1637, 26.5943, 12.3155, 16.239, 48.6095, 35.9805, 106.377, 84.3682, 52.057, 21.4621, 12.483, 22.5154, 35.6312, 71.2624, 14.9188, 17.8156, 26.0285, 5.18364 256, 39.1991, 39.9458, 33.2881, 13.6179, 40.3298, 7.3327, 20.5603, 17.6231, 20.5603, 83.8861, 89.2405, 21.1834, 43.9194, 57.8525, 71.6975, 57.8525, 27.2357, 61.2307, 20.7639, 40.3298, 33.825, 59.9186, 30.8405, 41.1206, 46.8637, 28.9262, 13.4433, 15.8875, 53.7731, 38.8361, 109.655, 85.598, 47.127, 29.1271, 12.1223, 17.05, 30.8405, 76.2601, 16.257, 18.0789, 28.7281, 5.24288 512, 33.2295, 33.9467, 34.4423, 13.2545, 34.9525, 7.3728, 19.1813, 18.432, 21.4481, 90.7422, 91.6231, 21.6449, 37.1543, 42.5098, 78.6432, 56.1737, 35.4781, 61.2804, 20.6956, 40.6775, 33.9467, 58.616, 33.2295, 43.2898, 48.6453, 28.0869, 13.1072, 15.3201, 54.5502, 41.3912, 106.036, 85.7926, 42.1303, 27.5941, 11.4529, 18.5771, 32.768, 67.8934, 16.7326, 17.4763, 
24.9661, 5.17389 1024, 28.4939, 29.4544, 24.2726, 14.0184, 29.1271, 7.1624, 18.8593, 17.7124, 18.0789, 89.6219, 89.6219, 18.3317, 35.9101, 42.2813, 69.4421, 59.9186, 48.5452, 45.1972, 19.563, 25.7004, 24.2726, 34.0447, 31.9688, 40.6425, 41.2825, 22.5986, 13.7971, 15.3301, 30.3057, 28.4939, 78.8403, 70.3742, 39.4202, 30.6601, 11.2027, 16.5914, 28.9662, 40.6425, 14.7272, 17.8329, 26.2144, 5.16031 2048, 26.4549, 27.2036, 23.4438, 12.9891, 28.8358, 7.06761, 20.0249, 18.0224, 18.1357, 68.6568, 74.4151, 18.2505, 36.969, 42.4056, 80.6597, 49.717, 49.717, 42.0961, 20.7452, 25.2946, 24.4372, 32.9552, 27.7268, 32.5829, 34.3284, 22.8856, 13.8634, 15.1768, 30.1946, 28.2704, 81.2277, 70.3313, 35.5998, 28.5503, 10.9227, 18.4845, 28.2704, 38.4478, 14.9408, 17.5828, 24.0299, 5.22388 4096, 28.5975, 29.6767, 23.6521, 14.8383, 25.575, 7.02171, 18.8367, 18.1834, 18.6138, 65.8791, 77.6723, 18.7246, 34.7594, 41.1206, 68.7591, 48.3958, 31.775, 43.9962, 19.5387, 25.1658, 24.3855, 31.775, 27.3542, 31.775, 33.1129, 23.3017, 13.6771, 14.8383, 29.3993, 27.8383, 78.6432, 69.5189, 33.825, 36.1578, 10.3478, 16.7326, 28.8599, 36.5782, 15.1237, 17.7725, 25.1658, 5.17389 8192, 27.7063, 28.6376, 23.9991, 14.3188, 23.5026, 7.04106, 19.3629, 18.521, 19.2535, 70.2654, 70.2654, 19.3629, 30.4274, 32.768, 65.536, 41.3075, 37.4491, 42.5984, 19.0384, 24.3419, 24.6947, , 27.7063, 31.8493, 33.0861, 23.8313, 13.4168, 15.0791, 28.8803, 27.7063, 78.3419, 68.1574, 31.5544, 32.4559, 10.0232, 17.2115, 28.8803, 35.4987, 15.4903, 17.0394, 24.1693, 5.01158 16384, 25.3105, 26.0285, 20.5029, 14.9188, 19.8379, 6.74635, 18.5354, 17.4763, 17.1496, 71.2624, 71.2624, 16.6819, 28.0154, 31.6381, 52.806, 43.9523, 28.672, 36.7002, 17.3114, 19.1147, 19.4181, , 27.1853, 31.1018, 31.6381, 21.214, 11.8388, 14.336, 21.5883, 21.4621, 64.956, 58.2542, 26.9854, 35.6312, 8.41747, 15.2917, 24.7974, 30.3307, 14.4489, 16.5316, 22.7951, 4.77867 32768, 21.1406, 21.4872, 17.5543, 13.3747, 15.6038, 6.64216, 16.6617, 16.2486, 15.2409, 66.0867, 
66.6468, 15.0082, 24.8871, 26.5686, 50.7375, 41.8315, 25.2062, 34.1927, 16.8041, 15.8555, 16.384, , 25.0456, 27.3067, 27.4976, 18.3746, 9.8798, 12.6844, 18.0374, 17.5543, 54.9952, 50.4123, 22.8614, 29.3445, 7.28178, 14.5636, 20.9157, 26.0408, 13.4663, 14.8945, 18.3746, 4.42811 65536, 16.6441, 16.7772, 13.3577, 14.0748, 13.7069, 6.31672, 14.6654, 14.4631, 11.8483, 51.7815, 52.1032, 11.5865, 24.9661, 27.5941, 49.3448, 38.4799, 28.1497, 26.7153, 16.257, 12.483, 12.7875, , 21.8453, 23.173, 23.0456, 14.7687, 10.1803, 11.5865, 13.6179, 12.6334, 43.2402, 40.3298, 19.9729, 27.06, 6.94421, 12.483, 17.7725, 20.1649, 10.8101, 13.6179, 17.6231, 4.22813 131072, 15.6917, 16.2644, 12.1761, 12.6604, 14.2835, 6.22409, 14.6594, 14.014, 11.1411, 49.5161, 50.3553, 10.8166, 21.6332, 23.8313, 50.0724, 37.1371, 23.2107, 24.486, 15.582, 11.6661, 11.9797, , 18.4151, 20.0741, 19.2088, 13.7545, 9.94743, 11.6661, 12.8059, 12.4482, 40.5132, 38.0893, 17.2731, 22.737, 6.9632, 12.1761, 16.2644, 18.5685, 10.037, 12.8059, 16.0304, 4.081 262144, 15.3201, 15.1237, 11.9156, 14.386, 13.8782, 6.22506, 14.2126, 14.0434, 10.9227, 35.2134, 35.7469, 10.6755, 22.2575, 24.8347, 47.1859, 38.0532, 26.9634, 24.9661, 15.8342, 11.3428, 11.7378, , 16.9734, 18.1484, 18.0099, 13.8782, 10.2578, 11.2347, 12.483, 12.1613, 41.7575, 39.3216, 15.6245, 26.5089, 6.87841, 11.9156, 16.6148, 18.1484, 9.74916, 13.0348, 16.271, 4.08183 524288, 15.0022, 15.139, 11.6101, 12.6095, 14.2307, 6.18725, 14.1097, 13.721, 10.8513, 36.3557, 35.8326, 10.4418, 22.6397, 24.7798, 44.8715, 33.6536, 24.2963, 23.8313, 16.384, 11.2432, 11.6645, , 16.9991, 18.2445, 17.8521, 13.3532, 10.0418, 10.9467, 12.2678, 11.9156, 40.4938, 38.3134, 14.8236, 22.1366, 6.87947, 11.8027, 15.7121, 17.4763, 9.92178, 12.3591, 15.2316, 3.99417 1048576, 14.6041, 15.0011, 11.4349, 15.0657, 13.8517, 6.18264, 14.2858, 14.0184, 10.6239, , , 10.4963, 23.1474, 25.2669, 43.8735, 32.9741, 21.7096, , 16.487, 11.0609, 11.3728, , , 17.5347, 17.218, 13.1896, 10.2802, 11.2388, 
12.1223, 11.625, 41.4457, 38.8361, 14.5434, 25.1457, 6.78251, 11.729, 16.2822, 17.5054, 9.9203, 12.7719, 15.6271, 3.96138 Norm. Avg., 0.327328, 0.333825, 0.261215, 0.163363, 0.265998, 0.0860977, 0.211789, 0.197907, 0.202086, 0.688413, 0.687468, 0.196147, 0.384314, 0.432748, 0.809972, 0.627549, 0.580639, 0.497323, 0.224183, 0.248384, 0.23201, 0.559204, 0.31897, 0.370146, 0.385684, 0.235952, 0.137867, 0.160049, 0.291842, 0.255597, 0.837125, 0.726629, 0.401631, 0.329957, 0.121985, 0.184926, 0.279333, 0.423891, 0.173111, 0.18834, 0.253309, 0.0620651 ------------------------------------------------------ @@@@ bench.1d.np2.dat N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, Nielsen, Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg 6, , 18.3144, 10.2672, 26.9258, 25.2534, 70.0999, 71.3297, 16.9408, 20.5343, 5.03192, 9.24044, 10.6434, 11.357, 10.7561, 4.79457 9, 4.36845, 29.6777, 19.476, 30.1564, 33.092, 45.053, 44.5166, 13.5485, 18.8858, 7.36102, 13.4511, 16.4009, 14.9576, 13.953, 4.8945 12, , 36.6147, 29.0652, 39.4312, 42.7171, 88.104, 89.5025, 23.3002, 31.3259, 9.21349, 15.6629, 18.427, 17.0868, 19.4436, 4.9289 15, 5.85462, 40.0066, 38.4063, 30.9728, 43.6435, 45.9956, 45.7218, 15.1206, 13.5233, 10.3801, 11.7092, 15.4864, 26.671, 23.2765, 4.32503 18, 5.44142, 42.4055, 35.3888, 31.9418, 32.1506, 42.4055, 30.744, 15.5666, 33.6921, 8.78401, 17.08, 21.7657, 19.9961, 17.3206, 4.99903 24, , 58.6303, 52.2574, 38.7716, 38.7716, 53.8173, 43.4429, 19.3858, 28.6172, 11.3389, 19.1796, 23.414, 19.9213, 23.7221, 4.98033 36, 6.46048, 62.8731, 60.9869, 39.0942, 41.7719, 54.9432, 35.05, 19.0584, 52.5749, 12.8124, 17.1312, 24.9946, 22.9274, 23.6383, 5.01537 80, 9.50261, 92.0698, 76.0209, 36.6649, 52.7788, 55.2419, 33.4124, 31.1514, 16.1841, 16.1841, 16.7062, 25.418, 25.8946, 34.5262, 4.7953 108, 6.98165, 72.4399, 95.6207, 44.5992, 48.986, 50.2209, 39.0607, 17.4745, 47.8103, 14.5056, 22.6375, 33.0182, 23.3449, 22.9857, 
5.15198 210, 8.29436, 79.4669, 79.4669, 33.5126, 24.7593, 34.7408, 35.1084, 15.3599, 15.0806, 12.7605, 19.5161, 20.1075, , , 4.10612 504, 8.77488, 93.5988, 93.5988, 34.5757, 27.415, 44.9819, 41.0012, 16.1998, 23.3997, 12.3881, 25.044, 24.7761, , , 4.45494 1000, 9.3111, 73.417, 89.5172, 31.6924, 34.2448, 39.5541, 36.1878, 17.2381, 14.4957, 19.1823, 17.7169, 24.5312, 38.0782, 32.2942, 4.39869 1960, 9.72973, 78.3938, 78.3938, 28.7307, 19.7395, 34.0843, 25.4054, 15.5897, 12.8214, 11.9295, 21.2696, 18.5391, , , 3.96501 4725, 8.54427, 68.993, 82.025, 30.5052, 25.456, 32.6648, 24.4445, 13.6708, 15.5089, 13.1826, 17.5768, 21.0921, , , 4.15667 10368, 8.99561, 73.764, 79.7449, 34.3089, 38.8232, 54.64, 39.5164, 20.6815, 42.5562, 14.6551, 20.1175, 27.1524, 20.1175, 22.3527, 4.98406 27000, 7.64343, 70.2688, 69.5009, 24.6486, 27.1766, 31.4818, 33.4702, 13.7055, 18.3796, 15.4353, 16.2228, 21.4842, 26.7199, 23.7288, 4.22828 75600, 7.33641, 53.2687, 54.4525, 19.1435, 18.0174, 34.2708, 29.5224, 13.7661, 16.4454, 12.3756, 13.9225, 15.5086, , , 3.87716 165375, 6.82581, 45.5054, 45.5054, 13.1506, 11.4674, 27.0456, 22.2236, 9.88565, 12.1476, 11.2868, 13.7829, 14.0531, , , 3.54807 362880, 7.17567, 24.4601, 24.2829, 18.8261, 18.1137, 36.0327, 27.4675, 13.6221, 19.5967, 11.2451, 12.9384, 14.0211, , , 3.74418 Norm. 
Avg., 0.116714, 0.852384, 0.83792, 0.46786, 0.483811, 0.707035, 0.602519, 0.256642, 0.368845, 0.184488, 0.256616, 0.310619, 0.335105, 0.33058, 0.0704703
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM, HARM (f2c), NR (C), NR (F), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
4x4x4, 63.5501, , , 46.9512, 29.9593, 15.4202, 13.7971, 44.939, 34.0079, 54.7083, 31.4573
8x8x8, 44.939, 37.1543, 32.3191, 64.6382, 43.6907, 26.9634, 24.0744, 40.3298, 56.8505, 80.6597, 37.1543
16x16x16, 46.6034, 40.59, 32.4302, 27.8383, 25.575, 31.4573, 33.1129, 37.2276, 29.1271, 65.536, 35.5449
32x32x32, 37.4491, 30.9619, 25.7004, 16.5217, 15.8555, 28.7019, 29.3445, 21.8453, 19.0882, 38.1763, 26.0408
64x64x64, 30.6402, 28.7719, 24.0744, 11.6221, 11.6221, 26.6587, 29.1271, 18.8744, 15.8342, 36.0198, 25.3688
256x64x32, 30.7453, 27.2171, 23.059, 11.6645, 11.2686, 25.8069, 27.9817, 16.2769, 13.9516, 36.0923, 25.1552
16x1024x64, 32.768, 27.5941, 23.25, 11.8617, 11.3852, 24.4994, 26.149, 18.2044, 15.3976, ,
128x128x128, 43.0922, 27.8383, 23.179, 11.4569, 11.1044, 28.0154, 30.9271, 13.5259, 12.6407, 27.1518, 19.9819
512x128x64, 38.13, 27.1396, 22.7278, 10.972, 10.6725, 26.824, 29.4243, 14.9457, 13.1558, ,
256x128x256, 34.5272, 27.9782, 23.7258, 10.9774, 10.7618, 26.5682, 29.8204, 15.1562, 13.109, 28.7367, 21.0401
Norm.
Avg., 0.895171, 0.717122, 0.600738, 0.427712, 0.365027, 0.613466, 0.652448, 0.502276, 0.449848, 0.915406, 0.581809
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
5x5x5, 36.0251, 16.823, 12.558, 24.9404, 33.6461, 68.5862, 28.5319
6x6x6, 76.6595, 20.6657, 13.6132, 35.5494, 48.6598, 49.7176, 35.9216
7x7x7, 29.8798, 12.9741, 10.8754, 46.9539, 18.6044, ,
9x9x9, 24.4793, 27.3039, 14.3125, 25.536, 35.495, 37.5609, 36.974
10x10x10, 20.0885, 25.5124, 19.6249, 22.5774, 28.6656, 68.0331, 30.5538
11x11x11, 23.1128, 13.9223, 11.5564, 40.1848, 14.145, ,
12x12x12, 34.2275, 36.5971, 22.026, 27.986, 33.2701, 34.2275, 38.0609
13x13x13, 20.4044, 13.2283, 10.9156, 34.8812, 12.5882, ,
14x14x14, 18.4027, 21.0042, 17.7513, 33.1553, 18.5731, ,
15x15x15, 18.893, 31.0634, 18.7531, 18.7531, 26.2349, 62.5102, 33.532
24x25x28, 21.9356, 30.1834, 23.7291, 24.1854, 24.4995, ,
48x48x48, 34.6347, 33.6901, 23.6045, 13.049, 16.3978, 22.597, 24.8719
49x49x49, 17.5371, 19.8169, 15.9814, 19.0548, 12.3856, ,
60x60x60, 23.4826, 31.1192, 20.2522, 12.6744, 14.2823, 32.4378, 27.5372
72x60x56, 25.1543, 30.4686, 21.2085, 13.3535, 13.4365, ,
75x75x75, 24.0346, 25.7626, 20.8554, 13.0088, 15.4575, 49.5808, 28.3574
80x80x80, 26.8245, 30.9251, 23.2308, 11.5601, 14.4501, 24.1554, 23.5691
84x84x84, 21.3651, 27.3227, 19.1351, 13.4671, 11.9897, ,
96x96x96, 30.5557, 32.9771, 24.6167, 11.0201, 12.9466, 20.2759, 21.8473
105x105x105, 20.0324, 23.8423, 19.828, 13.9127, 12.9831, ,
112x112x112, 27.8559, 27.5351, 21.6377, 12.9826, 12.7631, ,
120x120x120, 23.7751, 32.4324, 22.9521, 11.1822, 12.0071, 26.5618, 25.2862
144x144x144, 28.7244, 34.793, 22.5995, 10.9829, 12.9439, 21.6839, 23.7705
180x180x180, 24.5005, 33.5236, 20.8855, 11.4578, 12.6449, 30.2719, 27.3648
Norm. Avg., 0.724038, 0.756289, 0.544928, 0.556585, 0.499118, 0.85296, 0.692661
@@@@ end