To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ name = Steven G. Johnson
@ email = stevenj@alum.mit.edu
@ organization = MIT
@ computer manufacturer = SGI
@ computer model = Origin 2000
@ CPU manufacturer = SGI
@ CPU model = MIPS R10000
@ CPU speed = 195 MHz
@ RAM = 4096 MB
@ L2 cache size = 4 MB
@ operating system = IRIX 6.4 (IP27)
@ C compiler = MIPSpro.71 cc
@ C compiler flags = -DUSE_SCSL -DUSE_SGIMATH -mips4 -O3 -64 -mips4 -WOPT:rsv_bits=4020 -IPA
@ Fortran compiler = MIPSpro.71 f77
@ Fortran compiler flags = -64 -mips4 -O3 -IPA
@ remarks = This is a 32 processor machine, but only 1 processor was used.
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
FFT Benchmark Program by M. Frigo and S. G. Johnson.
email: fftw@theory.lcs.mit.edu
www: http://theory.lcs.mit.edu/~fftw
Using FFTW V1.1 ($Id: executor.c,v 1.34 1997/04/30 13:15:56 fftw Exp $)
Maximum memory to use: 200 MB
Factors to allow: 2
Using double precision.
Measuring speed of 1D transforms:
Benchmarking for sizes:
2 (0.000335693 MB)
4 (0.000579834 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)
262144 (25.4987 MB)
524288 (50.9973 MB)
1048576 (96 MB)
2097152 (192 MB)
Maximum array size = 2097152
Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Bailey
5. Beauregard
6. Bergland
7. Brenner
8. Burrus
9. CWP (min N)
10. CWP (best N)
11. Edelblute
12. FFTPACK
13. FFTPACK (f2c)
14. FFTW
15. FFTW_ESTIMATE
16. Frigo-old
17. Green
18. GSL
19. GSL DIT
20. GSL DIF
21. Krukar
22. Mayer (Buneman)
23. Mayer (simple)
24. Mayer (lookup)
25. Monro
26. NAPACK (f2c)
27. Nielsen
28. NR (C)
29.
NR (F) 30. Ooura (C) 31. Ooura (F) 32. Ransom 33. SCIPORT 34. Singleton 35. Singleton (f2c) 36. Sorensen 37. Sorensen DIT 38. Temperton 39. Temperton (f2c) 40. Valkenburg 41. SCSL 42. SGIMATH Computing normalized averages (43 transforms). Benchmarking for array size = 2 (power of 2): 0. Arndt DIF: elapsed time t=1.18 s, 4194304 iters, t-(init.)=0.84 s t(norm)=0.100136, mflops=49.9322 (err=1.7e-17) 1. Arndt DIT: elapsed time t=1.36 s, 4194304 iters, t-(init.)=1.02 s t(norm)=0.121593, mflops=41.1206 (err=1.7e-17) 2. Arndt Split-Radix: elapsed time t=1.72 s, 4194304 iters, t-(init.)=1.38 s t(norm)=0.164509, mflops=30.3935 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.09 s, 131072 iters, t-(init.)=1.07 s t(norm)=4.08173, mflops=1.22497 (err=1.7e-17) 4. Bailey: elapsed time t=1.84 s, 2097152 iters, t-(init.)=1.69 s t(norm)=0.402927, mflops=12.4092 (err=1.7e-17) 5. Beauregard: elapsed time t=1.1 s, 1048576 iters, t-(init.)=0.98 s t(norm)=0.4673, mflops=10.6998 (err=1.7e-17) 6. Bergland: elapsed time t=1.24 s, 524288 iters, t-(init.)=1.19 s t(norm)=1.13487, mflops=4.40578 (err=1.7e-17) 7. Brenner: elapsed time t=1.56 s, 1048576 iters, t-(init.)=1.47 s t(norm)=0.700951, mflops=7.13317 (err=1.7e-17) 8. Burrus: elapsed time t=1.16 s, 2097152 iters, t-(init.)=1 s t(norm)=0.238419, mflops=20.9715 (err=1.7e-17) 9. CWP (min N): elapsed time t=1.63 s, 524288 iters, t-(init.)=1.59 s t(norm)=1.51634, mflops=3.29741 10. CWP (best N) (N=3): elapsed time t=1.62 s, 524288 iters, t-(init.)=1.56 s t(norm)=1.48773, mflops=3.36082 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.14 s, 1048576 iters, t-(init.)=1.06 s t(norm)=0.505447, mflops=9.89223 (err=1.7e-17) 13. FFTPACK (f2c): elapsed time t=1.68 s, 1048576 iters, t-(init.)=1.58 s t(norm)=0.753403, mflops=6.63656 (err=1.7e-17) FFTW_MEASURE plan: (cost = 4.768372e-07) FFTW_NOTW 2 14. 
FFTW: elapsed time t=1.01 s, 2097152 iters, t-(init.)=0.86 s t(norm)=0.20504, mflops=24.3855 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 15. FFTW_ESTIMATE: elapsed time t=1.01 s, 2097152 iters, t-(init.)=0.85 s t(norm)=0.202656, mflops=24.6724 (err=1.7e-17) 16. Frigo-old: elapsed time t=1.16 s, 4194304 iters, t-(init.)=0.86 s t(norm)=0.10252, mflops=48.771 (err=1.7e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.91 s, 1048576 iters, t-(init.)=1.83 s t(norm)=0.872612, mflops=5.72992 (err=1.7e-17) 19. GSL DIT: elapsed time t=1.13 s, 1048576 iters, t-(init.)=1.05 s t(norm)=0.500679, mflops=9.98644 (err=1.7e-17) 20. GSL DIF: elapsed time t=2.01 s, 2097152 iters, t-(init.)=1.83 s t(norm)=0.436306, mflops=11.4598 (err=1.7e-17) 21. Krukar: elapsed time t=1.66 s, 4194304 iters, t-(init.)=1.27 s t(norm)=0.151396, mflops=33.026 (err=1.7e-17) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.02 s, 524288 iters, t-(init.)=0.98 s t(norm)=0.934601, mflops=5.34988 (err=1.7e-17) 27. Nielsen: elapsed time t=1.15 s, 262144 iters, t-(init.)=1.13 s t(norm)=2.1553, mflops=2.31986 (err=1.7e-17) 28. NR (C): elapsed time t=1.01 s, 1048576 iters, t-(init.)=0.91 s t(norm)=0.433922, mflops=11.5228 (err=1.7e-17) 29. NR (F): elapsed time t=1.46 s, 1048576 iters, t-(init.)=1.38 s t(norm)=0.658035, mflops=7.59838 (err=1.7e-17) 30. Ooura (C): elapsed time t=1.04 s, 4194304 iters, t-(init.)=0.64 s t(norm)=0.0762939, mflops=65.536 (err=1.7e-17) 31. Ooura (F): elapsed time t=1.98 s, 4194304 iters, t-(init.)=1.64 s t(norm)=0.195503, mflops=25.575 (err=1.7e-17) 32. Skipping fft (Ransom doesn't work for N=2). 33. Skipping fft (SCIPORT can't handle N < 4). 34. Singleton: elapsed time t=1.79 s, 1048576 iters, t-(init.)=1.69 s t(norm)=0.805855, mflops=6.20459 (err=1.7e-17) 35. 
Singleton (f2c): elapsed time t=1.68 s, 1048576 iters, t-(init.)=1.6 s t(norm)=0.762939, mflops=6.5536 (err=1.7e-17) 36. Sorensen: elapsed time t=1.19 s, 2097152 iters, t-(init.)=1.02 s t(norm)=0.243187, mflops=20.5603 (err=1.7e-17) 37. Sorensen DIT: elapsed time t=1.26 s, 2097152 iters, t-(init.)=1.07 s t(norm)=0.255108, mflops=19.5996 (err=1.7e-17) 38. Temperton: elapsed time t=1.73 s, 524288 iters, t-(init.)=1.68 s t(norm)=1.60217, mflops=3.12076 (err=1.7e-17) 39. Temperton (f2c): elapsed time t=1.98 s, 524288 iters, t-(init.)=1.94 s t(norm)=1.85013, mflops=2.70252 (err=1.7e-17) 40. Valkenburg: elapsed time t=1.54 s, 1048576 iters, t-(init.)=1.47 s t(norm)=0.700951, mflops=7.13317 (err=1.7e-17) 41. SCSL: elapsed time t=1.73 s, 2097152 iters, t-(init.)=1.54 s t(norm)=0.367165, mflops=13.6179 (err=1.7e-17) 42. SGIMATH: elapsed time t=1.09 s, 1048576 iters, t-(init.)=1.01 s t(norm)=0.481606, mflops=10.3819 (err=1.7e-17) Top mflops for N=2 = 65.536 Normalized results and averages for N=2: fft 0: mflops = 49.9322 (norm. = 0.761905), norm. avg. (of 1) = 0.761905 fft 1: mflops = 41.1206 (norm. = 0.627451), norm. avg. (of 1) = 0.627451 fft 2: mflops = 30.3935 (norm. = 0.463768), norm. avg. (of 1) = 0.463768 fft 3: mflops = 1.22497 (norm. = 0.0186916), norm. avg. (of 1) = 0.0186916 fft 4: mflops = 12.4092 (norm. = 0.189349), norm. avg. (of 1) = 0.189349 fft 5: mflops = 10.6998 (norm. = 0.163265), norm. avg. (of 1) = 0.163265 fft 6: mflops = 4.40578 (norm. = 0.0672269), norm. avg. (of 1) = 0.0672269 fft 7: mflops = 7.13317 (norm. = 0.108844), norm. avg. (of 1) = 0.108844 fft 8: mflops = 20.9715 (norm. = 0.32), norm. avg. (of 1) = 0.32 fft 9: mflops = 3.29741 (norm. = 0.0503145), norm. avg. (of 1) = 0.0503145 fft 10: mflops = 3.36082 (norm. = 0.0512821), norm. avg. (of 1) = 0.0512821 fft 11: mflops = -1 (norm. = -0.0152588), norm. avg. (of 0) = -1 fft 12: mflops = 9.89223 (norm. = 0.150943), norm. avg. (of 1) = 0.150943 fft 13: mflops = 6.63656 (norm. = 0.101266), norm. 
avg. (of 1) = 0.101266 fft 14: mflops = 24.3855 (norm. = 0.372093), norm. avg. (of 1) = 0.372093 fft 15: mflops = 24.6724 (norm. = 0.376471), norm. avg. (of 1) = 0.376471 fft 16: mflops = 48.771 (norm. = 0.744186), norm. avg. (of 1) = 0.744186 fft 17: mflops = -1 (norm. = -0.0152588), norm. avg. (of 0) = -1 fft 18: mflops = 5.72992 (norm. = 0.0874317), norm. avg. (of 1) = 0.0874317 fft 19: mflops = 9.98644 (norm. = 0.152381), norm. avg. (of 1) = 0.152381 fft 20: mflops = 11.4598 (norm. = 0.174863), norm. avg. (of 1) = 0.174863 fft 21: mflops = 33.026 (norm. = 0.503937), norm. avg. (of 1) = 0.503937 fft 22: mflops = -1 (norm. = -0.0152588), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.0152588), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.0152588), norm. avg. (of 0) = -1 fft 25: mflops = -1 (norm. = -0.0152588), norm. avg. (of 0) = -1 fft 26: mflops = 5.34988 (norm. = 0.0816327), norm. avg. (of 1) = 0.0816327 fft 27: mflops = 2.31986 (norm. = 0.0353982), norm. avg. (of 1) = 0.0353982 fft 28: mflops = 11.5228 (norm. = 0.175824), norm. avg. (of 1) = 0.175824 fft 29: mflops = 7.59838 (norm. = 0.115942), norm. avg. (of 1) = 0.115942 fft 30: mflops = 65.536 (norm. = 1), norm. avg. (of 1) = 1 fft 31: mflops = 25.575 (norm. = 0.390244), norm. avg. (of 1) = 0.390244 fft 32: mflops = -1 (norm. = -0.0152588), norm. avg. (of 0) = -1 fft 33: mflops = -1 (norm. = -0.0152588), norm. avg. (of 0) = -1 fft 34: mflops = 6.20459 (norm. = 0.0946746), norm. avg. (of 1) = 0.0946746 fft 35: mflops = 6.5536 (norm. = 0.1), norm. avg. (of 1) = 0.1 fft 36: mflops = 20.5603 (norm. = 0.313725), norm. avg. (of 1) = 0.313725 fft 37: mflops = 19.5996 (norm. = 0.299065), norm. avg. (of 1) = 0.299065 fft 38: mflops = 3.12076 (norm. = 0.047619), norm. avg. (of 1) = 0.047619 fft 39: mflops = 2.70252 (norm. = 0.0412371), norm. avg. (of 1) = 0.0412371 fft 40: mflops = 7.13317 (norm. = 0.108844), norm. avg. (of 1) = 0.108844 fft 41: mflops = 13.6179 (norm. = 0.207792), norm. avg. 
(of 1) = 0.207792 fft 42: mflops = 10.3819 (norm. = 0.158416), norm. avg. (of 1) = 0.158416 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.33 s, 2097152 iters, t-(init.)=1.2 s t(norm)=0.0715256, mflops=69.9051 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.29 s, 2097152 iters, t-(init.)=1.16 s t(norm)=0.0691414, mflops=72.3156 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.99 s, 2097152 iters, t-(init.)=1.86 s t(norm)=0.110865, mflops=45.1 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.89 s, 262144 iters, t-(init.)=1.88 s t(norm)=0.896454, mflops=5.57753 (err=1.3e-16) 4. Bailey: elapsed time t=1.92 s, 1048576 iters, t-(init.)=1.86 s t(norm)=0.221729, mflops=22.55 (err=1.3e-16) 5. Beauregard: elapsed time t=1.27 s, 524288 iters, t-(init.)=1.23 s t(norm)=0.293255, mflops=17.05 (err=6.5e-17) 6. Bergland: elapsed time t=1.5 s, 524288 iters, t-(init.)=1.46 s t(norm)=0.348091, mflops=14.3641 (err=5.3e-17) 7. Brenner: elapsed time t=1.42 s, 524288 iters, t-(init.)=1.39 s t(norm)=0.331402, mflops=15.0874 (err=5.3e-17) 8. Burrus: elapsed time t=1.32 s, 1048576 iters, t-(init.)=1.26 s t(norm)=0.150204, mflops=33.2881 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.77 s, 524288 iters, t-(init.)=1.74 s t(norm)=0.414848, mflops=12.0526 10. CWP (best N) (N=15): elapsed time t=1.59 s, 262144 iters, t-(init.)=1.53 s t(norm)=0.729561, mflops=6.85344 11. Edelblute: elapsed time t=1.87 s, 2097152 iters, t-(init.)=1.68 s t(norm)=0.100136, mflops=49.9322 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.34 s, 1048576 iters, t-(init.)=1.27 s t(norm)=0.151396, mflops=33.026 (err=5.3e-17) 13. FFTPACK (f2c): elapsed time t=1.19 s, 524288 iters, t-(init.)=1.14 s t(norm)=0.271797, mflops=18.3961 (err=5.3e-17) FFTW_MEASURE plan: (cost = 4.577637e-07) FFTW_NOTW 4 14. FFTW: elapsed time t=1.03 s, 2097152 iters, t-(init.)=0.91 s t(norm)=0.0542402, mflops=92.1825 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 15. 
FFTW_ESTIMATE: elapsed time t=1.09 s, 2097152 iters, t-(init.)=0.97 s t(norm)=0.0578165, mflops=86.4805 (err=5.3e-17) 16. Frigo-old: elapsed time t=1.8 s, 4194304 iters, t-(init.)=1.56 s t(norm)=0.0464916, mflops=107.546 (err=5.3e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.04 s, 524288 iters, t-(init.)=1.01 s t(norm)=0.240803, mflops=20.7639 (err=5.3e-17) 19. GSL DIT: elapsed time t=1.25 s, 524288 iters, t-(init.)=1.22 s t(norm)=0.290871, mflops=17.1898 (err=6.5e-17) 20. GSL DIF: elapsed time t=1.16 s, 524288 iters, t-(init.)=1.13 s t(norm)=0.269413, mflops=18.5589 (err=6.5e-17) 21. Krukar: elapsed time t=1.83 s, 4194304 iters, t-(init.)=1.59 s t(norm)=0.0473857, mflops=105.517 (err=5.3e-17) 22. Mayer (Buneman): elapsed time t=1.05 s, 1048576 iters, t-(init.)=1 s t(norm)=0.119209, mflops=41.943 (err=1.3e-16) 23. Mayer (simple): elapsed time t=1 s, 1048576 iters, t-(init.)=0.94 s t(norm)=0.112057, mflops=44.6203 24. Mayer (lookup): elapsed time t=1.24 s, 1048576 iters, t-(init.)=1.18 s t(norm)=0.140667, mflops=35.5449 (err=1.3e-16) 25. Monro: elapsed time t=1.03 s, 131072 iters, t-(init.)=1.03 s t(norm)=0.982285, mflops=5.09017 (err=1.3e-16) 26. NAPACK (f2c): elapsed time t=1.01 s, 262144 iters, t-(init.)=1 s t(norm)=0.476837, mflops=10.4858 (err=1.6e-16) 27. Nielsen: elapsed time t=1.2 s, 262144 iters, t-(init.)=1.19 s t(norm)=0.567436, mflops=8.81156 (err=1.3e-16) 28. NR (C): elapsed time t=1.16 s, 524288 iters, t-(init.)=1.13 s t(norm)=0.269413, mflops=18.5589 (err=6.5e-17) 29. NR (F): elapsed time t=1.56 s, 524288 iters, t-(init.)=1.53 s t(norm)=0.36478, mflops=13.7069 (err=6.5e-17) 30. Ooura (C): elapsed time t=1.07 s, 2097152 iters, t-(init.)=0.91 s t(norm)=0.0542402, mflops=92.1825 (err=5.3e-17) 31. Ooura (F): elapsed time t=1.41 s, 2097152 iters, t-(init.)=1.28 s t(norm)=0.0762939, mflops=65.536 (err=5.3e-17) 32. 
Ransom: elapsed time t=1.57 s, 262144 iters, t-(init.)=1.55 s t(norm)=0.739098, mflops=6.76501 (err=2.4e-16) 33. SCIPORT: elapsed time t=1.01 s, 1048576 iters, t-(init.)=0.95 s t(norm)=0.113249, mflops=44.1506 (err=6.5e-17) 34. Singleton: elapsed time t=1.96 s, 1048576 iters, t-(init.)=1.9 s t(norm)=0.226498, mflops=22.0753 (err=5.3e-17) 35. Singleton (f2c): elapsed time t=1.84 s, 1048576 iters, t-(init.)=1.77 s t(norm)=0.211, mflops=23.6966 (err=5.3e-17) 36. Sorensen: elapsed time t=1.15 s, 1048576 iters, t-(init.)=1.08 s t(norm)=0.128746, mflops=38.8361 (err=1.3e-16) 37. Sorensen DIT: elapsed time t=1.35 s, 1048576 iters, t-(init.)=1.3 s t(norm)=0.154972, mflops=32.2639 (err=1.3e-16) 38. Temperton: elapsed time t=1.01 s, 262144 iters, t-(init.)=0.99 s t(norm)=0.472069, mflops=10.5917 (err=5.3e-17) 39. Temperton (f2c): elapsed time t=1.15 s, 262144 iters, t-(init.)=1.14 s t(norm)=0.543594, mflops=9.19804 (err=5.3e-17) 40. Valkenburg: elapsed time t=1.54 s, 262144 iters, t-(init.)=1.52 s t(norm)=0.724792, mflops=6.89853 (err=1.4e-16) 41. SCSL: elapsed time t=1.83 s, 2097152 iters, t-(init.)=1.71 s t(norm)=0.101924, mflops=49.0562 (err=5.3e-17) 42. SGIMATH: elapsed time t=1.83 s, 2097152 iters, t-(init.)=1.71 s t(norm)=0.101924, mflops=49.0562 (err=5.3e-17) Top mflops for N=4 = 107.546 Normalized results and averages for N=4: fft 0: mflops = 69.9051 (norm. = 0.65), norm. avg. (of 2) = 0.705952 fft 1: mflops = 72.3156 (norm. = 0.672414), norm. avg. (of 2) = 0.649932 fft 2: mflops = 45.1 (norm. = 0.419355), norm. avg. (of 2) = 0.441561 fft 3: mflops = 5.57753 (norm. = 0.0518617), norm. avg. (of 2) = 0.0352766 fft 4: mflops = 22.55 (norm. = 0.209677), norm. avg. (of 2) = 0.199513 fft 5: mflops = 17.05 (norm. = 0.158537), norm. avg. (of 2) = 0.160901 fft 6: mflops = 14.3641 (norm. = 0.133562), norm. avg. (of 2) = 0.100394 fft 7: mflops = 15.0874 (norm. = 0.140288), norm. avg. (of 2) = 0.124566 fft 8: mflops = 33.2881 (norm. = 0.309524), norm. avg. 
(of 2) = 0.314762 fft 9: mflops = 12.0526 (norm. = 0.112069), norm. avg. (of 2) = 0.0811917 fft 10: mflops = 6.85344 (norm. = 0.0637255), norm. avg. (of 2) = 0.0575038 fft 11: mflops = 49.9322 (norm. = 0.464286), norm. avg. (of 1) = 0.464286 fft 12: mflops = 33.026 (norm. = 0.307087), norm. avg. (of 2) = 0.229015 fft 13: mflops = 18.3961 (norm. = 0.171053), norm. avg. (of 2) = 0.136159 fft 14: mflops = 92.1825 (norm. = 0.857143), norm. avg. (of 2) = 0.614618 fft 15: mflops = 86.4805 (norm. = 0.804124), norm. avg. (of 2) = 0.590297 fft 16: mflops = 107.546 (norm. = 1), norm. avg. (of 2) = 0.872093 fft 17: mflops = -1 (norm. = -0.00929832), norm. avg. (of 0) = -1 fft 18: mflops = 20.7639 (norm. = 0.193069), norm. avg. (of 2) = 0.140251 fft 19: mflops = 17.1898 (norm. = 0.159836), norm. avg. (of 2) = 0.156109 fft 20: mflops = 18.5589 (norm. = 0.172566), norm. avg. (of 2) = 0.173715 fft 21: mflops = 105.517 (norm. = 0.981132), norm. avg. (of 2) = 0.742535 fft 22: mflops = 41.943 (norm. = 0.39), norm. avg. (of 1) = 0.39 fft 23: mflops = 44.6203 (norm. = 0.414894), norm. avg. (of 1) = 0.414894 fft 24: mflops = 35.5449 (norm. = 0.330508), norm. avg. (of 1) = 0.330508 fft 25: mflops = 5.09017 (norm. = 0.0473301), norm. avg. (of 1) = 0.0473301 fft 26: mflops = 10.4858 (norm. = 0.0975), norm. avg. (of 2) = 0.0895663 fft 27: mflops = 8.81156 (norm. = 0.0819328), norm. avg. (of 2) = 0.0586655 fft 28: mflops = 18.5589 (norm. = 0.172566), norm. avg. (of 2) = 0.174195 fft 29: mflops = 13.7069 (norm. = 0.127451), norm. avg. (of 2) = 0.121697 fft 30: mflops = 92.1825 (norm. = 0.857143), norm. avg. (of 2) = 0.928571 fft 31: mflops = 65.536 (norm. = 0.609375), norm. avg. (of 2) = 0.499809 fft 32: mflops = 6.76501 (norm. = 0.0629032), norm. avg. (of 1) = 0.0629032 fft 33: mflops = 44.1506 (norm. = 0.410526), norm. avg. (of 1) = 0.410526 fft 34: mflops = 22.0753 (norm. = 0.205263), norm. avg. (of 2) = 0.149969 fft 35: mflops = 23.6966 (norm. = 0.220339), norm. avg. 
(of 2) = 0.160169 fft 36: mflops = 38.8361 (norm. = 0.361111), norm. avg. (of 2) = 0.337418 fft 37: mflops = 32.2639 (norm. = 0.3), norm. avg. (of 2) = 0.299533 fft 38: mflops = 10.5917 (norm. = 0.0984848), norm. avg. (of 2) = 0.0730519 fft 39: mflops = 9.19804 (norm. = 0.0855263), norm. avg. (of 2) = 0.0633817 fft 40: mflops = 6.89853 (norm. = 0.0641447), norm. avg. (of 2) = 0.0864941 fft 41: mflops = 49.0562 (norm. = 0.45614), norm. avg. (of 2) = 0.331966 fft 42: mflops = 49.0562 (norm. = 0.45614), norm. avg. (of 2) = 0.307278 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.18 s, 1048576 iters, t-(init.)=1.08 s t(norm)=0.0429153, mflops=116.508 (err=1.2e-16) 1. Arndt DIT: elapsed time t=1.16 s, 1048576 iters, t-(init.)=1.05 s t(norm)=0.0417233, mflops=119.837 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.24 s, 524288 iters, t-(init.)=1.19 s t(norm)=0.0945727, mflops=52.8694 (err=1.1e-16) 3. Arndt 4-step: elapsed time t=1.21 s, 65536 iters, t-(init.)=1.2 s t(norm)=0.762939, mflops=6.5536 (err=1.5e-16) 4. Bailey: elapsed time t=1.54 s, 524288 iters, t-(init.)=1.49 s t(norm)=0.118415, mflops=42.2245 (err=1.3e-16) 5. Beauregard: elapsed time t=1.82 s, 262144 iters, t-(init.)=1.79 s t(norm)=0.284513, mflops=17.5739 (err=1.2e-16) 6. Bergland: elapsed time t=1.26 s, 262144 iters, t-(init.)=1.24 s t(norm)=0.197093, mflops=25.3688 (err=1.3e-16) 7. Brenner: elapsed time t=1.4 s, 262144 iters, t-(init.)=1.38 s t(norm)=0.219345, mflops=22.7951 (err=1.2e-16) 8. Burrus: elapsed time t=1.8 s, 524288 iters, t-(init.)=1.74 s t(norm)=0.138283, mflops=36.1578 (err=1.5e-16) 9. CWP (min N): elapsed time t=1.06 s, 262144 iters, t-(init.)=1.03 s t(norm)=0.163714, mflops=30.541 10. CWP (best N) (N=15): elapsed time t=1.6 s, 262144 iters, t-(init.)=1.55 s t(norm)=0.246366, mflops=20.295 11. Edelblute: elapsed time t=1.48 s, 524288 iters, t-(init.)=1.42 s t(norm)=0.112851, mflops=44.306 (err=1.5e-16) 12. 
FFTPACK: elapsed time t=1.32 s, 524288 iters, t-(init.)=1.27 s t(norm)=0.100931, mflops=49.539 (err=1.2e-16) 13. FFTPACK (f2c): elapsed time t=1.37 s, 262144 iters, t-(init.)=1.34 s t(norm)=0.212987, mflops=23.4756 (err=1.2e-16) FFTW_MEASURE plan: (cost = 8.010864e-07) FFTW_NOTW 8 14. FFTW: elapsed time t=1.65 s, 2097152 iters, t-(init.)=1.44 s t(norm)=0.0286102, mflops=174.763 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.65 s, 2097152 iters, t-(init.)=1.45 s t(norm)=0.0288089, mflops=173.557 (err=1.2e-16) 16. Frigo-old: elapsed time t=1.14 s, 2097152 iters, t-(init.)=0.93 s t(norm)=0.0184774, mflops=270.6 (err=1.4e-16) 17. Green: elapsed time t=1.24 s, 1048576 iters, t-(init.)=1.11 s t(norm)=0.0441074, mflops=113.36 (err=1.4e-16) 18. GSL: elapsed time t=1.56 s, 524288 iters, t-(init.)=1.51 s t(norm)=0.120004, mflops=41.6653 (err=1.2e-16) 19. GSL DIT: elapsed time t=1.17 s, 262144 iters, t-(init.)=1.14 s t(norm)=0.181198, mflops=27.5941 (err=1.2e-16) 20. GSL DIF: elapsed time t=1.16 s, 262144 iters, t-(init.)=1.14 s t(norm)=0.181198, mflops=27.5941 (err=1.4e-16) 21. Krukar: elapsed time t=1.74 s, 2097152 iters, t-(init.)=1.53 s t(norm)=0.0303984, mflops=164.483 (err=1.2e-16) 22. Mayer (Buneman): elapsed time t=1.04 s, 524288 iters, t-(init.)=0.99 s t(norm)=0.0786781, mflops=63.5501 (err=1.5e-16) 23. Mayer (simple): elapsed time t=1.06 s, 524288 iters, t-(init.)=1.01 s t(norm)=0.0802676, mflops=62.2916 24. Mayer (lookup): elapsed time t=1.15 s, 524288 iters, t-(init.)=1.1 s t(norm)=0.0874201, mflops=57.1951 (err=1.5e-16) 25. Monro: elapsed time t=1.34 s, 131072 iters, t-(init.)=1.32 s t(norm)=0.419617, mflops=11.9156 (err=1.1e-08) 26. NAPACK (f2c): elapsed time t=1.91 s, 262144 iters, t-(init.)=1.88 s t(norm)=0.298818, mflops=16.7326 (err=1.7e-16) 27. Nielsen: elapsed time t=1.47 s, 262144 iters, t-(init.)=1.44 s t(norm)=0.228882, mflops=21.8453 (err=6.9e-16) 28. 
NR (C): elapsed time t=1.16 s, 262144 iters, t-(init.)=1.13 s t(norm)=0.179609, mflops=27.8383 (err=1.2e-16) 29. NR (F): elapsed time t=1.52 s, 262144 iters, t-(init.)=1.49 s t(norm)=0.236829, mflops=21.1123 (err=1.2e-16) 30. Ooura (C): elapsed time t=1.06 s, 1048576 iters, t-(init.)=0.94 s t(norm)=0.0373522, mflops=133.861 (err=1.3e-16) 31. Ooura (F): elapsed time t=1.44 s, 1048576 iters, t-(init.)=1.34 s t(norm)=0.0532468, mflops=93.9023 (err=1.3e-16) 32. Ransom: elapsed time t=1.25 s, 65536 iters, t-(init.)=1.24 s t(norm)=0.788371, mflops=6.34219 (err=3.9e-16) 33. SCIPORT: elapsed time t=1.11 s, 524288 iters, t-(init.)=1.06 s t(norm)=0.0842412, mflops=59.3534 (err=1.2e-16) 34. Singleton: elapsed time t=1.39 s, 262144 iters, t-(init.)=1.36 s t(norm)=0.216166, mflops=23.1304 (err=1.4e-16) 35. Singleton (f2c): elapsed time t=1.43 s, 262144 iters, t-(init.)=1.4 s t(norm)=0.222524, mflops=22.4695 (err=1.4e-16) 36. Sorensen: elapsed time t=1.28 s, 524288 iters, t-(init.)=1.23 s t(norm)=0.0977516, mflops=51.15 (err=1.2e-16) 37. Sorensen DIT: elapsed time t=1.76 s, 524288 iters, t-(init.)=1.71 s t(norm)=0.135899, mflops=36.7921 (err=1.2e-16) 38. Temperton: elapsed time t=1.48 s, 262144 iters, t-(init.)=1.46 s t(norm)=0.232061, mflops=21.5461 (err=4.6e-09) 39. Temperton (f2c): elapsed time t=1.08 s, 131072 iters, t-(init.)=1.06 s t(norm)=0.336965, mflops=14.8383 (err=1.4e-16) 40. Valkenburg: elapsed time t=1.15 s, 65536 iters, t-(init.)=1.15 s t(norm)=0.73115, mflops=6.83854 (err=1.4e-16) 41. SCSL: elapsed time t=1.17 s, 1048576 iters, t-(init.)=1.07 s t(norm)=0.042518, mflops=117.597 (err=1.4e-16) 42. SGIMATH: elapsed time t=1.17 s, 1048576 iters, t-(init.)=1.06 s t(norm)=0.0421206, mflops=118.707 (err=1.4e-16) Top mflops for N=8 = 270.6 Normalized results and averages for N=8: fft 0: mflops = 116.508 (norm. = 0.430556), norm. avg. (of 3) = 0.614153 fft 1: mflops = 119.837 (norm. = 0.442857), norm. avg. (of 3) = 0.580907 fft 2: mflops = 52.8694 (norm. = 0.195378), norm. 
avg. (of 3) = 0.3595 fft 3: mflops = 6.5536 (norm. = 0.0242187), norm. avg. (of 3) = 0.0315907 fft 4: mflops = 42.2245 (norm. = 0.15604), norm. avg. (of 3) = 0.185022 fft 5: mflops = 17.5739 (norm. = 0.0649441), norm. avg. (of 3) = 0.128915 fft 6: mflops = 25.3688 (norm. = 0.09375), norm. avg. (of 3) = 0.0981795 fft 7: mflops = 22.7951 (norm. = 0.0842391), norm. avg. (of 3) = 0.111123 fft 8: mflops = 36.1578 (norm. = 0.133621), norm. avg. (of 3) = 0.254381 fft 9: mflops = 30.541 (norm. = 0.112864), norm. avg. (of 3) = 0.0917492 fft 10: mflops = 20.295 (norm. = 0.075), norm. avg. (of 3) = 0.0633358 fft 11: mflops = 44.306 (norm. = 0.163732), norm. avg. (of 2) = 0.314009 fft 12: mflops = 49.539 (norm. = 0.183071), norm. avg. (of 3) = 0.2137 fft 13: mflops = 23.4756 (norm. = 0.0867537), norm. avg. (of 3) = 0.119691 fft 14: mflops = 174.763 (norm. = 0.645833), norm. avg. (of 3) = 0.625023 fft 15: mflops = 173.557 (norm. = 0.641379), norm. avg. (of 3) = 0.607325 fft 16: mflops = 270.6 (norm. = 1), norm. avg. (of 3) = 0.914729 fft 17: mflops = 113.36 (norm. = 0.418919), norm. avg. (of 1) = 0.418919 fft 18: mflops = 41.6653 (norm. = 0.153974), norm. avg. (of 3) = 0.144825 fft 19: mflops = 27.5941 (norm. = 0.101974), norm. avg. (of 3) = 0.138064 fft 20: mflops = 27.5941 (norm. = 0.101974), norm. avg. (of 3) = 0.149801 fft 21: mflops = 164.483 (norm. = 0.607843), norm. avg. (of 3) = 0.697637 fft 22: mflops = 63.5501 (norm. = 0.234848), norm. avg. (of 2) = 0.312424 fft 23: mflops = 62.2916 (norm. = 0.230198), norm. avg. (of 2) = 0.322546 fft 24: mflops = 57.1951 (norm. = 0.211364), norm. avg. (of 2) = 0.270936 fft 25: mflops = 11.9156 (norm. = 0.0440341), norm. avg. (of 2) = 0.0456821 fft 26: mflops = 16.7326 (norm. = 0.0618351), norm. avg. (of 3) = 0.0803226 fft 27: mflops = 21.8453 (norm. = 0.0807292), norm. avg. (of 3) = 0.0660201 fft 28: mflops = 27.8383 (norm. = 0.102876), norm. avg. (of 3) = 0.150422 fft 29: mflops = 21.1123 (norm. = 0.0780201), norm. avg. 
(of 3) = 0.107138 fft 30: mflops = 133.861 (norm. = 0.494681), norm. avg. (of 3) = 0.783941 fft 31: mflops = 93.9023 (norm. = 0.347015), norm. avg. (of 3) = 0.448878 fft 32: mflops = 6.34219 (norm. = 0.0234375), norm. avg. (of 2) = 0.0431704 fft 33: mflops = 59.3534 (norm. = 0.21934), norm. avg. (of 2) = 0.314933 fft 34: mflops = 23.1304 (norm. = 0.0854779), norm. avg. (of 3) = 0.128472 fft 35: mflops = 22.4695 (norm. = 0.0830357), norm. avg. (of 3) = 0.134458 fft 36: mflops = 51.15 (norm. = 0.189024), norm. avg. (of 3) = 0.287954 fft 37: mflops = 36.7921 (norm. = 0.135965), norm. avg. (of 3) = 0.24501 fft 38: mflops = 21.5461 (norm. = 0.0796233), norm. avg. (of 3) = 0.0752424 fft 39: mflops = 14.8383 (norm. = 0.0548349), norm. avg. (of 3) = 0.0605328 fft 40: mflops = 6.83854 (norm. = 0.0252717), norm. avg. (of 3) = 0.0660867 fft 41: mflops = 117.597 (norm. = 0.434579), norm. avg. (of 3) = 0.366171 fft 42: mflops = 118.707 (norm. = 0.438679), norm. avg. (of 3) = 0.351078 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.15 s, 262144 iters, t-(init.)=1.11 s t(norm)=0.0661612, mflops=75.573 (err=1.5e-16) 1. Arndt DIT: elapsed time t=1.11 s, 262144 iters, t-(init.)=1.06 s t(norm)=0.0631809, mflops=79.1378 (err=2.2e-16) 2. Arndt Split-Radix: elapsed time t=1.51 s, 262144 iters, t-(init.)=1.46 s t(norm)=0.0870228, mflops=57.4562 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.43 s, 65536 iters, t-(init.)=1.41 s t(norm)=0.33617, mflops=14.8734 (err=1.6e-16) 4. Bailey: elapsed time t=1.52 s, 262144 iters, t-(init.)=1.47 s t(norm)=0.0876188, mflops=57.0654 (err=1.6e-16) 5. Beauregard: elapsed time t=1.85 s, 131072 iters, t-(init.)=1.82 s t(norm)=0.216961, mflops=23.0456 (err=2.3e-16) 6. Bergland: elapsed time t=1.09 s, 131072 iters, t-(init.)=1.06 s t(norm)=0.126362, mflops=39.5689 (err=2.6e-16) 7. Brenner: elapsed time t=1.35 s, 131072 iters, t-(init.)=1.32 s t(norm)=0.157356, mflops=31.775 (err=2.1e-16) 8. 
Burrus: elapsed time t=1.02 s, 131072 iters, t-(init.)=0.99 s t(norm)=0.118017, mflops=42.3667 (err=1.4e-16) 9. CWP (min N): elapsed time t=1.26 s, 262144 iters, t-(init.)=1.21 s t(norm)=0.0721216, mflops=69.3273 10. CWP (best N) (N=28): elapsed time t=1.05 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.120401, mflops=41.5278 11. Edelblute: elapsed time t=1.87 s, 262144 iters, t-(init.)=1.82 s t(norm)=0.10848, mflops=46.0913 (err=1.4e-16) 12. FFTPACK: elapsed time t=1.74 s, 524288 iters, t-(init.)=1.64 s t(norm)=0.0488758, mflops=102.3 (err=1.8e-16) 13. FFTPACK (f2c): elapsed time t=1.19 s, 131072 iters, t-(init.)=1.16 s t(norm)=0.138283, mflops=36.1578 (err=1.8e-16) FFTW_MEASURE plan: (cost = 1.678467e-06) FFTW_NOTW 16 14. FFTW: elapsed time t=1.63 s, 1048576 iters, t-(init.)=1.43 s t(norm)=0.0213087, mflops=234.646 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.67 s, 1048576 iters, t-(init.)=1.47 s t(norm)=0.0219047, mflops=228.261 (err=1.7e-16) 16. Frigo-old: elapsed time t=1.25 s, 1048576 iters, t-(init.)=1.05 s t(norm)=0.0156462, mflops=319.566 (err=1.8e-16) 17. Green: elapsed time t=1.02 s, 262144 iters, t-(init.)=0.96 s t(norm)=0.0572205, mflops=87.3813 (err=1.9e-16) 18. GSL: elapsed time t=1.15 s, 262144 iters, t-(init.)=1.1 s t(norm)=0.0655651, mflops=76.2601 (err=1.8e-16) 19. GSL DIT: elapsed time t=1.14 s, 131072 iters, t-(init.)=1.11 s t(norm)=0.132322, mflops=37.7865 (err=2.1e-16) 20. GSL DIF: elapsed time t=1.08 s, 131072 iters, t-(init.)=1.06 s t(norm)=0.126362, mflops=39.5689 (err=2.8e-16) 21. Krukar: elapsed time t=1.92 s, 1048576 iters, t-(init.)=1.73 s t(norm)=0.025779, mflops=193.956 (err=2.0e-16) 22. Mayer (Buneman): elapsed time t=1.48 s, 262144 iters, t-(init.)=1.43 s t(norm)=0.0852346, mflops=58.6616 (err=1.7e-16) 23. Mayer (simple): elapsed time t=1.29 s, 262144 iters, t-(init.)=1.24 s t(norm)=0.0739098, mflops=67.6501 24. 
Mayer (lookup): elapsed time t=1.37 s, 262144 iters, t-(init.)=1.32 s t(norm)=0.0786781, mflops=63.5501 (err=1.9e-16) 25. Monro: elapsed time t=1.12 s, 65536 iters, t-(init.)=1.11 s t(norm)=0.264645, mflops=18.8933 (err=2.1e-08) 26. NAPACK (f2c): elapsed time t=1.56 s, 131072 iters, t-(init.)=1.54 s t(norm)=0.183582, mflops=27.2357 (err=3.3e-16) 27. Nielsen: elapsed time t=1.11 s, 65536 iters, t-(init.)=1.1 s t(norm)=0.26226, mflops=19.065 (err=1.6e-16) 28. NR (C): elapsed time t=1.12 s, 131072 iters, t-(init.)=1.1 s t(norm)=0.13113, mflops=38.13 (err=2.1e-16) 29. NR (F): elapsed time t=1.41 s, 131072 iters, t-(init.)=1.39 s t(norm)=0.165701, mflops=30.1748 (err=2.1e-16) 30. Ooura (C): elapsed time t=1.27 s, 524288 iters, t-(init.)=1.16 s t(norm)=0.0345707, mflops=144.631 (err=2.0e-16) 31. Ooura (F): elapsed time t=1.42 s, 524288 iters, t-(init.)=1.32 s t(norm)=0.0393391, mflops=127.1 (err=2.0e-16) 32. Ransom: elapsed time t=1.92 s, 131072 iters, t-(init.)=1.9 s t(norm)=0.226498, mflops=22.0753 (err=4.2e-16) 33. SCIPORT: elapsed time t=1.19 s, 262144 iters, t-(init.)=1.14 s t(norm)=0.0679493, mflops=73.5843 (err=2.6e-16) 34. Singleton: elapsed time t=1.51 s, 262144 iters, t-(init.)=1.46 s t(norm)=0.0870228, mflops=57.4562 (err=1.7e-16) 35. Singleton (f2c): elapsed time t=1.49 s, 262144 iters, t-(init.)=1.44 s t(norm)=0.0858307, mflops=58.2542 (err=1.7e-16) 36. Sorensen: elapsed time t=1.3 s, 262144 iters, t-(init.)=1.25 s t(norm)=0.0745058, mflops=67.1089 (err=1.7e-16) 37. Sorensen DIT: elapsed time t=1.03 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.120401, mflops=41.5278 (err=1.9e-16) 38. Temperton: elapsed time t=1.56 s, 131072 iters, t-(init.)=1.54 s t(norm)=0.183582, mflops=27.2357 (err=1.7e-08) 39. Temperton (f2c): elapsed time t=1.53 s, 131072 iters, t-(init.)=1.51 s t(norm)=0.180006, mflops=27.7768 (err=1.8e-16) 40. Valkenburg: elapsed time t=1.46 s, 32768 iters, t-(init.)=1.45 s t(norm)=0.691414, mflops=7.23156 (err=2.9e-16) 41. 
SCSL: elapsed time t=1.95 s, 1048576 iters, t-(init.)=1.76 s t(norm)=0.026226, mflops=190.65 (err=1.8e-16) 42. SGIMATH: elapsed time t=1.96 s, 1048576 iters, t-(init.)=1.77 s t(norm)=0.0263751, mflops=189.573 (err=1.8e-16) Top mflops for N=16 = 319.566 Normalized results and averages for N=16: fft 0: mflops = 75.573 (norm. = 0.236486), norm. avg. (of 4) = 0.519737 fft 1: mflops = 79.1378 (norm. = 0.247642), norm. avg. (of 4) = 0.497591 fft 2: mflops = 57.4562 (norm. = 0.179795), norm. avg. (of 4) = 0.314574 fft 3: mflops = 14.8734 (norm. = 0.0465426), norm. avg. (of 4) = 0.0353286 fft 4: mflops = 57.0654 (norm. = 0.178571), norm. avg. (of 4) = 0.18341 fft 5: mflops = 23.0456 (norm. = 0.0721154), norm. avg. (of 4) = 0.114715 fft 6: mflops = 39.5689 (norm. = 0.123821), norm. avg. (of 4) = 0.10459 fft 7: mflops = 31.775 (norm. = 0.0994318), norm. avg. (of 4) = 0.108201 fft 8: mflops = 42.3667 (norm. = 0.132576), norm. avg. (of 4) = 0.22393 fft 9: mflops = 69.3273 (norm. = 0.216942), norm. avg. (of 4) = 0.123047 fft 10: mflops = 41.5278 (norm. = 0.12995), norm. avg. (of 4) = 0.0799895 fft 11: mflops = 46.0913 (norm. = 0.144231), norm. avg. (of 3) = 0.257416 fft 12: mflops = 102.3 (norm. = 0.320122), norm. avg. (of 4) = 0.240306 fft 13: mflops = 36.1578 (norm. = 0.113147), norm. avg. (of 4) = 0.118055 fft 14: mflops = 234.646 (norm. = 0.734266), norm. avg. (of 4) = 0.652334 fft 15: mflops = 228.261 (norm. = 0.714286), norm. avg. (of 4) = 0.634065 fft 16: mflops = 319.566 (norm. = 1), norm. avg. (of 4) = 0.936047 fft 17: mflops = 87.3813 (norm. = 0.273438), norm. avg. (of 2) = 0.346178 fft 18: mflops = 76.2601 (norm. = 0.238636), norm. avg. (of 4) = 0.168278 fft 19: mflops = 37.7865 (norm. = 0.118243), norm. avg. (of 4) = 0.133108 fft 20: mflops = 39.5689 (norm. = 0.123821), norm. avg. (of 4) = 0.143306 fft 21: mflops = 193.956 (norm. = 0.606936), norm. avg. (of 4) = 0.674962 fft 22: mflops = 58.6616 (norm. = 0.183566), norm. avg. 
(of 3) = 0.269472 fft 23: mflops = 67.6501 (norm. = 0.211694), norm. avg. (of 3) = 0.285595 fft 24: mflops = 63.5501 (norm. = 0.198864), norm. avg. (of 3) = 0.246912 fft 25: mflops = 18.8933 (norm. = 0.0591216), norm. avg. (of 3) = 0.0501619 fft 26: mflops = 27.2357 (norm. = 0.0852273), norm. avg. (of 4) = 0.0815488 fft 27: mflops = 19.065 (norm. = 0.0596591), norm. avg. (of 4) = 0.0644298 fft 28: mflops = 38.13 (norm. = 0.119318), norm. avg. (of 4) = 0.142646 fft 29: mflops = 30.1748 (norm. = 0.0944245), norm. avg. (of 4) = 0.103959 fft 30: mflops = 144.631 (norm. = 0.452586), norm. avg. (of 4) = 0.701102 fft 31: mflops = 127.1 (norm. = 0.397727), norm. avg. (of 4) = 0.43609 fft 32: mflops = 22.0753 (norm. = 0.0690789), norm. avg. (of 3) = 0.0518066 fft 33: mflops = 73.5843 (norm. = 0.230263), norm. avg. (of 3) = 0.28671 fft 34: mflops = 57.4562 (norm. = 0.179795), norm. avg. (of 4) = 0.141303 fft 35: mflops = 58.2542 (norm. = 0.182292), norm. avg. (of 4) = 0.146417 fft 36: mflops = 67.1089 (norm. = 0.21), norm. avg. (of 4) = 0.268465 fft 37: mflops = 41.5278 (norm. = 0.12995), norm. avg. (of 4) = 0.216245 fft 38: mflops = 27.2357 (norm. = 0.0852273), norm. avg. (of 4) = 0.0777386 fft 39: mflops = 27.7768 (norm. = 0.0869205), norm. avg. (of 4) = 0.0671297 fft 40: mflops = 7.23156 (norm. = 0.0226293), norm. avg. (of 4) = 0.0552223 fft 41: mflops = 190.65 (norm. = 0.596591), norm. avg. (of 4) = 0.423776 fft 42: mflops = 189.573 (norm. = 0.59322), norm. avg. (of 4) = 0.411614 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.16 s, 131072 iters, t-(init.)=1.12 s t(norm)=0.0534058, mflops=93.6229 (err=2.9e-16) 1. Arndt DIT: elapsed time t=1.12 s, 131072 iters, t-(init.)=1.07 s t(norm)=0.0510216, mflops=97.9978 (err=2.7e-16) 2. Arndt Split-Radix: elapsed time t=1.67 s, 131072 iters, t-(init.)=1.63 s t(norm)=0.0777245, mflops=64.3298 (err=3.4e-16) 3. 
Arndt 4-step: elapsed time t=1.67 s, 32768 iters, t-(init.)=1.66 s t(norm)=0.31662, mflops=15.7918 (err=2.7e-16) 4. Bailey: elapsed time t=1.3 s, 131072 iters, t-(init.)=1.25 s t(norm)=0.0596046, mflops=83.8861 (err=2.6e-16) 5. Beauregard: elapsed time t=1.02 s, 32768 iters, t-(init.)=1.01 s t(norm)=0.192642, mflops=25.9549 (err=2.3e-16) 6. Bergland: elapsed time t=1.96 s, 131072 iters, t-(init.)=1.91 s t(norm)=0.0910759, mflops=54.8993 (err=3.0e-16) 7. Brenner: elapsed time t=1.32 s, 65536 iters, t-(init.)=1.3 s t(norm)=0.123978, mflops=40.3298 (err=2.1e-16) 8. Burrus: elapsed time t=1.13 s, 65536 iters, t-(init.)=1.11 s t(norm)=0.105858, mflops=47.2332 (err=2.7e-16) 9. CWP (min N) (N=33): elapsed time t=1.23 s, 131072 iters, t-(init.)=1.19 s t(norm)=0.0567436, mflops=88.1156 10. CWP (best N) (N=35): elapsed time t=1.22 s, 131072 iters, t-(init.)=1.17 s t(norm)=0.0557899, mflops=89.6219 11. Edelblute: elapsed time t=1.05 s, 65536 iters, t-(init.)=1.03 s t(norm)=0.0982285, mflops=50.9017 (err=2.7e-16) 12. FFTPACK: elapsed time t=1.04 s, 131072 iters, t-(init.)=0.99 s t(norm)=0.0472069, mflops=105.917 (err=2.1e-16) 13. FFTPACK (f2c): elapsed time t=1 s, 32768 iters, t-(init.)=0.99 s t(norm)=0.188828, mflops=26.4792 (err=2.1e-16) FFTW_MEASURE plan: (cost = 4.119873e-06) FFTW_NOTW 32 14. FFTW: elapsed time t=1.08 s, 262144 iters, t-(init.)=0.99 s t(norm)=0.0236034, mflops=211.834 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.09 s, 262144 iters, t-(init.)=1 s t(norm)=0.0238419, mflops=209.715 (err=2.1e-16) 16. Frigo-old: elapsed time t=1.64 s, 524288 iters, t-(init.)=1.46 s t(norm)=0.0174046, mflops=287.281 (err=2.2e-16) 17. Green: elapsed time t=1.75 s, 262144 iters, t-(init.)=1.65 s t(norm)=0.0393391, mflops=127.1 (err=2.1e-16) 18. GSL: elapsed time t=1.39 s, 131072 iters, t-(init.)=1.34 s t(norm)=0.0638962, mflops=78.2519 (err=2.0e-16) 19. 
GSL DIT: elapsed time t=1.07 s, 65536 iters, t-(init.)=1.04 s t(norm)=0.0991821, mflops=50.4123 (err=2.2e-16) 20. GSL DIF: elapsed time t=1.06 s, 65536 iters, t-(init.)=1.04 s t(norm)=0.0991821, mflops=50.4123 (err=2.5e-16) 21. Krukar: elapsed time t=1.13 s, 262144 iters, t-(init.)=1.04 s t(norm)=0.0247955, mflops=201.649 (err=2.2e-16) 22. Mayer (Buneman): elapsed time t=1.62 s, 131072 iters, t-(init.)=1.57 s t(norm)=0.0748634, mflops=66.7883 (err=2.8e-16) 23. Mayer (simple): elapsed time t=1.31 s, 131072 iters, t-(init.)=1.26 s t(norm)=0.0600815, mflops=83.2203 24. Mayer (lookup): elapsed time t=1.36 s, 131072 iters, t-(init.)=1.32 s t(norm)=0.0629425, mflops=79.4376 (err=2.9e-16) 25. Monro: elapsed time t=1.88 s, 65536 iters, t-(init.)=1.86 s t(norm)=0.177383, mflops=28.1875 (err=3.7e-08) 26. NAPACK (f2c): elapsed time t=1.54 s, 65536 iters, t-(init.)=1.51 s t(norm)=0.144005, mflops=34.7211 (err=1.3e-15) 27. Nielsen: elapsed time t=1.76 s, 65536 iters, t-(init.)=1.73 s t(norm)=0.164986, mflops=30.3057 (err=1.5e-15) 28. NR (C): elapsed time t=1.1 s, 65536 iters, t-(init.)=1.08 s t(norm)=0.102997, mflops=48.5452 (err=2.1e-16) 29. NR (F): elapsed time t=1.32 s, 65536 iters, t-(init.)=1.3 s t(norm)=0.123978, mflops=40.3298 (err=2.1e-16) 30. Ooura (C): elapsed time t=1.37 s, 262144 iters, t-(init.)=1.27 s t(norm)=0.0302792, mflops=165.13 (err=2.6e-16) 31. Ooura (F): elapsed time t=1.52 s, 262144 iters, t-(init.)=1.43 s t(norm)=0.0340939, mflops=146.654 (err=2.6e-16) 32. Ransom: elapsed time t=1.47 s, 32768 iters, t-(init.)=1.46 s t(norm)=0.278473, mflops=17.9551 (err=7.5e-16) 33. SCIPORT: elapsed time t=1.23 s, 131072 iters, t-(init.)=1.18 s t(norm)=0.0562668, mflops=88.8624 (err=1.7e-16) 34. Singleton: elapsed time t=1.4 s, 131072 iters, t-(init.)=1.36 s t(norm)=0.0648499, mflops=77.1012 (err=2.2e-16) 35. Singleton (f2c): elapsed time t=1.41 s, 131072 iters, t-(init.)=1.37 s t(norm)=0.0653267, mflops=76.5384 (err=2.2e-16) 36. 
Sorensen: elapsed time t=1.2 s, 131072 iters, t-(init.)=1.15 s t(norm)=0.0548363, mflops=91.1805 (err=2.7e-16) 37. Sorensen DIT: elapsed time t=1.12 s, 65536 iters, t-(init.)=1.1 s t(norm)=0.104904, mflops=47.6625 (err=2.4e-16) 38. Temperton: elapsed time t=1.48 s, 65536 iters, t-(init.)=1.46 s t(norm)=0.139236, mflops=35.9101 (err=3.1e-08) 39. Temperton (f2c): elapsed time t=1.6 s, 65536 iters, t-(init.)=1.58 s t(norm)=0.150681, mflops=33.1828 (err=1.7e-16) 40. Valkenburg: elapsed time t=1.8 s, 16384 iters, t-(init.)=1.8 s t(norm)=0.686646, mflops=7.28178 (err=4.2e-16) 41. SCSL: elapsed time t=1.08 s, 262144 iters, t-(init.)=0.99 s t(norm)=0.0236034, mflops=211.834 (err=1.8e-16) 42. SGIMATH: elapsed time t=1.96 s, 524288 iters, t-(init.)=1.77 s t(norm)=0.0211, mflops=236.966 (err=1.8e-16) Top mflops for N=32 = 287.281 Normalized results and averages for N=32: fft 0: mflops = 93.6229 (norm. = 0.325893), norm. avg. (of 5) = 0.480968 fft 1: mflops = 97.9978 (norm. = 0.341121), norm. avg. (of 5) = 0.466297 fft 2: mflops = 64.3298 (norm. = 0.223926), norm. avg. (of 5) = 0.296444 fft 3: mflops = 15.7918 (norm. = 0.0549699), norm. avg. (of 5) = 0.0392569 fft 4: mflops = 83.8861 (norm. = 0.292), norm. avg. (of 5) = 0.205128 fft 5: mflops = 25.9549 (norm. = 0.0903465), norm. avg. (of 5) = 0.109842 fft 6: mflops = 54.8993 (norm. = 0.191099), norm. avg. (of 5) = 0.121892 fft 7: mflops = 40.3298 (norm. = 0.140385), norm. avg. (of 5) = 0.114637 fft 8: mflops = 47.2332 (norm. = 0.164414), norm. avg. (of 5) = 0.212027 fft 9: mflops = 88.1156 (norm. = 0.306723), norm. avg. (of 5) = 0.159782 fft 10: mflops = 89.6219 (norm. = 0.311966), norm. avg. (of 5) = 0.126385 fft 11: mflops = 50.9017 (norm. = 0.177184), norm. avg. (of 4) = 0.237358 fft 12: mflops = 105.917 (norm. = 0.368687), norm. avg. (of 5) = 0.265982 fft 13: mflops = 26.4792 (norm. = 0.0921717), norm. avg. (of 5) = 0.112878 fft 14: mflops = 211.834 (norm. = 0.737374), norm. avg. 
(of 5) = 0.669342 fft 15: mflops = 209.715 (norm. = 0.73), norm. avg. (of 5) = 0.653252 fft 16: mflops = 287.281 (norm. = 1), norm. avg. (of 5) = 0.948837 fft 17: mflops = 127.1 (norm. = 0.442424), norm. avg. (of 3) = 0.37826 fft 18: mflops = 78.2519 (norm. = 0.272388), norm. avg. (of 5) = 0.1891 fft 19: mflops = 50.4123 (norm. = 0.175481), norm. avg. (of 5) = 0.141583 fft 20: mflops = 50.4123 (norm. = 0.175481), norm. avg. (of 5) = 0.149741 fft 21: mflops = 201.649 (norm. = 0.701923), norm. avg. (of 5) = 0.680354 fft 22: mflops = 66.7883 (norm. = 0.232484), norm. avg. (of 4) = 0.260225 fft 23: mflops = 83.2203 (norm. = 0.289683), norm. avg. (of 4) = 0.286617 fft 24: mflops = 79.4376 (norm. = 0.276515), norm. avg. (of 4) = 0.254313 fft 25: mflops = 28.1875 (norm. = 0.0981183), norm. avg. (of 4) = 0.062151 fft 26: mflops = 34.7211 (norm. = 0.120861), norm. avg. (of 5) = 0.0894112 fft 27: mflops = 30.3057 (norm. = 0.105491), norm. avg. (of 5) = 0.0726421 fft 28: mflops = 48.5452 (norm. = 0.168981), norm. avg. (of 5) = 0.147913 fft 29: mflops = 40.3298 (norm. = 0.140385), norm. avg. (of 5) = 0.111244 fft 30: mflops = 165.13 (norm. = 0.574803), norm. avg. (of 5) = 0.675843 fft 31: mflops = 146.654 (norm. = 0.51049), norm. avg. (of 5) = 0.45097 fft 32: mflops = 17.9551 (norm. = 0.0625), norm. avg. (of 4) = 0.0544799 fft 33: mflops = 88.8624 (norm. = 0.309322), norm. avg. (of 4) = 0.292363 fft 34: mflops = 77.1012 (norm. = 0.268382), norm. avg. (of 5) = 0.166719 fft 35: mflops = 76.5384 (norm. = 0.266423), norm. avg. (of 5) = 0.170418 fft 36: mflops = 91.1805 (norm. = 0.317391), norm. avg. (of 5) = 0.27825 fft 37: mflops = 47.6625 (norm. = 0.165909), norm. avg. (of 5) = 0.206178 fft 38: mflops = 35.9101 (norm. = 0.125), norm. avg. (of 5) = 0.0871909 fft 39: mflops = 33.1828 (norm. = 0.115506), norm. avg. (of 5) = 0.076805 fft 40: mflops = 7.28178 (norm. = 0.0253472), norm. avg. (of 5) = 0.0492473 fft 41: mflops = 211.834 (norm. = 0.737374), norm. avg. 
(of 5) = 0.486495 fft 42: mflops = 236.966 (norm. = 0.824859), norm. avg. (of 5) = 0.494263 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.39 s, 65536 iters, t-(init.)=1.34 s t(norm)=0.0532468, mflops=93.9023 (err=5.8e-16) 1. Arndt DIT: elapsed time t=1.37 s, 65536 iters, t-(init.)=1.33 s t(norm)=0.0528495, mflops=94.6084 (err=5.7e-16) 2. Arndt Split-Radix: elapsed time t=1.81 s, 65536 iters, t-(init.)=1.77 s t(norm)=0.0703335, mflops=71.0899 (err=5.8e-16) 3. Arndt 4-step: elapsed time t=1.04 s, 16384 iters, t-(init.)=1.02 s t(norm)=0.162125, mflops=30.8405 (err=5.3e-16) 4. Bailey: elapsed time t=1.48 s, 65536 iters, t-(init.)=1.44 s t(norm)=0.0572205, mflops=87.3813 (err=5.6e-16) 5. Beauregard: elapsed time t=1.14 s, 16384 iters, t-(init.)=1.13 s t(norm)=0.179609, mflops=27.8383 (err=6.0e-16) 6. Bergland: elapsed time t=1 s, 32768 iters, t-(init.)=0.97 s t(norm)=0.0770887, mflops=64.8604 (err=6.3e-16) 7. Brenner: elapsed time t=1.38 s, 32768 iters, t-(init.)=1.36 s t(norm)=0.108083, mflops=46.2607 (err=5.8e-16) 8. Burrus: elapsed time t=1.2 s, 32768 iters, t-(init.)=1.18 s t(norm)=0.093778, mflops=53.3174 (err=5.7e-16) 9. CWP (min N) (N=65): elapsed time t=1.13 s, 65536 iters, t-(init.)=1.08 s t(norm)=0.0429153, mflops=116.508 10. CWP (best N) (N=84): elapsed time t=1.08 s, 65536 iters, t-(init.)=1.02 s t(norm)=0.0405312, mflops=123.362 11. Edelblute: elapsed time t=1.14 s, 32768 iters, t-(init.)=1.12 s t(norm)=0.0890096, mflops=56.1737 (err=5.7e-16) 12. FFTPACK: elapsed time t=1.66 s, 131072 iters, t-(init.)=1.57 s t(norm)=0.0311931, mflops=160.292 (err=5.7e-16) 13. FFTPACK (f2c): elapsed time t=1.87 s, 32768 iters, t-(init.)=1.84 s t(norm)=0.14623, mflops=34.1927 (err=5.7e-16) FFTW_MEASURE plan: (cost = 9.765625e-06) FFTW_TWIDDLE 8 FFTW_NOTW 8 14. FFTW: elapsed time t=1.24 s, 131072 iters, t-(init.)=1.15 s t(norm)=0.0228484, mflops=218.833 (err=5.6e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.4 s, 131072 iters, t-(init.)=1.31 s t(norm)=0.0260274, mflops=192.106 (err=5.3e-16) 16. Frigo-old: elapsed time t=1.37 s, 131072 iters, t-(init.)=1.28 s t(norm)=0.0254313, mflops=196.608 (err=5.5e-16) 17. Green: elapsed time t=1.54 s, 131072 iters, t-(init.)=1.45 s t(norm)=0.0288089, mflops=173.557 (err=5.7e-16) 18. GSL: elapsed time t=1.19 s, 65536 iters, t-(init.)=1.15 s t(norm)=0.0456969, mflops=109.417 (err=5.7e-16) 19. GSL DIT: elapsed time t=1.13 s, 32768 iters, t-(init.)=1.11 s t(norm)=0.0882149, mflops=56.6798 (err=5.6e-16) 20. GSL DIF: elapsed time t=1.1 s, 32768 iters, t-(init.)=1.08 s t(norm)=0.0858307, mflops=58.2542 (err=5.4e-16) 21. Krukar: elapsed time t=1.4 s, 131072 iters, t-(init.)=1.32 s t(norm)=0.026226, mflops=190.65 (err=5.9e-16) 22. Mayer (Buneman): elapsed time t=1.88 s, 65536 iters, t-(init.)=1.84 s t(norm)=0.073115, mflops=68.3854 (err=5.3e-16) 23. Mayer (simple): elapsed time t=1.47 s, 65536 iters, t-(init.)=1.42 s t(norm)=0.0564257, mflops=88.6121 24. Mayer (lookup): elapsed time t=1.48 s, 65536 iters, t-(init.)=1.43 s t(norm)=0.0568231, mflops=87.9924 (err=5.2e-16) 25. Monro: elapsed time t=1.82 s, 32768 iters, t-(init.)=1.8 s t(norm)=0.143051, mflops=34.9525 (err=3.4e-08) 26. NAPACK (f2c): elapsed time t=1.44 s, 32768 iters, t-(init.)=1.41 s t(norm)=0.112057, mflops=44.6203 (err=1.8e-15) 27. Nielsen: elapsed time t=1.5 s, 32768 iters, t-(init.)=1.47 s t(norm)=0.116825, mflops=42.799 (err=1.8e-15) 28. NR (C): elapsed time t=1.14 s, 32768 iters, t-(init.)=1.12 s t(norm)=0.0890096, mflops=56.1737 (err=5.4e-16) 29. NR (F): elapsed time t=1.3 s, 32768 iters, t-(init.)=1.28 s t(norm)=0.101725, mflops=49.152 (err=5.4e-16) 30. Ooura (C): elapsed time t=1.51 s, 131072 iters, t-(init.)=1.41 s t(norm)=0.0280142, mflops=178.481 (err=5.7e-16) 31. Ooura (F): elapsed time t=1.52 s, 131072 iters, t-(init.)=1.43 s t(norm)=0.0284115, mflops=175.985 (err=5.7e-16) 32. 
Ransom: elapsed time t=1.65 s, 32768 iters, t-(init.)=1.63 s t(norm)=0.129541, mflops=38.5979 (err=9.0e-16) 33. SCIPORT: elapsed time t=1.3 s, 65536 iters, t-(init.)=1.26 s t(norm)=0.0500679, mflops=99.8644 (err=5.6e-16) 34. Singleton: elapsed time t=1.22 s, 65536 iters, t-(init.)=1.18 s t(norm)=0.046889, mflops=106.635 (err=9.2e-16) 35. Singleton (f2c): elapsed time t=1.21 s, 65536 iters, t-(init.)=1.17 s t(norm)=0.0464916, mflops=107.546 (err=9.2e-16) 36. Sorensen: elapsed time t=1.15 s, 65536 iters, t-(init.)=1.1 s t(norm)=0.0437101, mflops=114.39 (err=5.7e-16) 37. Sorensen DIT: elapsed time t=1.19 s, 32768 iters, t-(init.)=1.17 s t(norm)=0.0929832, mflops=53.7731 (err=5.7e-16) 38. Temperton: elapsed time t=1.23 s, 32768 iters, t-(init.)=1.2 s t(norm)=0.0953674, mflops=52.4288 (err=3.8e-08) 39. Temperton (f2c): elapsed time t=1.35 s, 32768 iters, t-(init.)=1.33 s t(norm)=0.105699, mflops=47.3042 (err=5.7e-16) 40. Valkenburg: elapsed time t=1.04 s, 4096 iters, t-(init.)=1.03 s t(norm)=0.654856, mflops=7.63526 (err=7.8e-16) 41. SCSL: elapsed time t=1.06 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.0192722, mflops=259.441 (err=5.7e-16) 42. SGIMATH: elapsed time t=1.08 s, 131072 iters, t-(init.)=0.99 s t(norm)=0.0196695, mflops=254.2 (err=5.7e-16) Top mflops for N=64 = 259.441 Normalized results and averages for N=64: fft 0: mflops = 93.9023 (norm. = 0.36194), norm. avg. (of 6) = 0.46113 fft 1: mflops = 94.6084 (norm. = 0.364662), norm. avg. (of 6) = 0.449358 fft 2: mflops = 71.0899 (norm. = 0.274011), norm. avg. (of 6) = 0.292706 fft 3: mflops = 30.8405 (norm. = 0.118873), norm. avg. (of 6) = 0.0525262 fft 4: mflops = 87.3813 (norm. = 0.336806), norm. avg. (of 6) = 0.227074 fft 5: mflops = 27.8383 (norm. = 0.107301), norm. avg. (of 6) = 0.109418 fft 6: mflops = 64.8604 (norm. = 0.25), norm. avg. (of 6) = 0.143243 fft 7: mflops = 46.2607 (norm. = 0.178309), norm. avg. (of 6) = 0.125249 fft 8: mflops = 53.3174 (norm. = 0.205508), norm. avg. 
(of 6) = 0.210941 fft 9: mflops = 116.508 (norm. = 0.449074), norm. avg. (of 6) = 0.207998 fft 10: mflops = 123.362 (norm. = 0.47549), norm. avg. (of 6) = 0.184569 fft 11: mflops = 56.1737 (norm. = 0.216518), norm. avg. (of 5) = 0.23319 fft 12: mflops = 160.292 (norm. = 0.617834), norm. avg. (of 6) = 0.324624 fft 13: mflops = 34.1927 (norm. = 0.131793), norm. avg. (of 6) = 0.116031 fft 14: mflops = 218.833 (norm. = 0.843478), norm. avg. (of 6) = 0.698364 fft 15: mflops = 192.106 (norm. = 0.740458), norm. avg. (of 6) = 0.667786 fft 16: mflops = 196.608 (norm. = 0.757812), norm. avg. (of 6) = 0.917 fft 17: mflops = 173.557 (norm. = 0.668966), norm. avg. (of 4) = 0.450937 fft 18: mflops = 109.417 (norm. = 0.421739), norm. avg. (of 6) = 0.227873 fft 19: mflops = 56.6798 (norm. = 0.218468), norm. avg. (of 6) = 0.154397 fft 20: mflops = 58.2542 (norm. = 0.224537), norm. avg. (of 6) = 0.162207 fft 21: mflops = 190.65 (norm. = 0.734848), norm. avg. (of 6) = 0.689437 fft 22: mflops = 68.3854 (norm. = 0.263587), norm. avg. (of 5) = 0.260897 fft 23: mflops = 88.6121 (norm. = 0.341549), norm. avg. (of 5) = 0.297603 fft 24: mflops = 87.9924 (norm. = 0.339161), norm. avg. (of 5) = 0.271282 fft 25: mflops = 34.9525 (norm. = 0.134722), norm. avg. (of 5) = 0.0766653 fft 26: mflops = 44.6203 (norm. = 0.171986), norm. avg. (of 6) = 0.103174 fft 27: mflops = 42.799 (norm. = 0.164966), norm. avg. (of 6) = 0.0880294 fft 28: mflops = 56.1737 (norm. = 0.216518), norm. avg. (of 6) = 0.159347 fft 29: mflops = 49.152 (norm. = 0.189453), norm. avg. (of 6) = 0.124279 fft 30: mflops = 178.481 (norm. = 0.687943), norm. avg. (of 6) = 0.677859 fft 31: mflops = 175.985 (norm. = 0.678322), norm. avg. (of 6) = 0.488862 fft 32: mflops = 38.5979 (norm. = 0.148773), norm. avg. (of 5) = 0.0733385 fft 33: mflops = 99.8644 (norm. = 0.384921), norm. avg. (of 5) = 0.310874 fft 34: mflops = 106.635 (norm. = 0.411017), norm. avg. (of 6) = 0.207435 fft 35: mflops = 107.546 (norm. = 0.41453), norm. avg. 
(of 6) = 0.211103 fft 36: mflops = 114.39 (norm. = 0.440909), norm. avg. (of 6) = 0.30536 fft 37: mflops = 53.7731 (norm. = 0.207265), norm. avg. (of 6) = 0.206359 fft 38: mflops = 52.4288 (norm. = 0.202083), norm. avg. (of 6) = 0.10634 fft 39: mflops = 47.3042 (norm. = 0.182331), norm. avg. (of 6) = 0.0943927 fft 40: mflops = 7.63526 (norm. = 0.0294296), norm. avg. (of 6) = 0.0459444 fft 41: mflops = 259.441 (norm. = 1), norm. avg. (of 6) = 0.572079 fft 42: mflops = 254.2 (norm. = 0.979798), norm. avg. (of 6) = 0.575185 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.43 s, 32768 iters, t-(init.)=1.39 s t(norm)=0.0473431, mflops=105.612 (err=3.6e-16) 1. Arndt DIT: elapsed time t=1.41 s, 32768 iters, t-(init.)=1.37 s t(norm)=0.0466619, mflops=107.154 (err=3.4e-16) 2. Arndt Split-Radix: elapsed time t=1.94 s, 32768 iters, t-(init.)=1.9 s t(norm)=0.0647136, mflops=77.2635 (err=4.2e-16) 3. Arndt 4-step: elapsed time t=1.27 s, 8192 iters, t-(init.)=1.26 s t(norm)=0.171661, mflops=29.1271 (err=3.1e-16) 4. Bailey: elapsed time t=1.42 s, 32768 iters, t-(init.)=1.37 s t(norm)=0.0466619, mflops=107.154 (err=3.4e-16) 5. Beauregard: elapsed time t=1.31 s, 8192 iters, t-(init.)=1.29 s t(norm)=0.175749, mflops=28.4497 (err=3.5e-16) 6. Bergland: elapsed time t=1.05 s, 16384 iters, t-(init.)=1.03 s t(norm)=0.0701632, mflops=71.2624 (err=3.9e-16) 7. Brenner: elapsed time t=1.4 s, 16384 iters, t-(init.)=1.38 s t(norm)=0.094005, mflops=53.1886 (err=4.2e-16) 8. Burrus: elapsed time t=1.27 s, 16384 iters, t-(init.)=1.24 s t(norm)=0.0844683, mflops=59.1938 (err=3.3e-16) 9. CWP (min N) (N=130): elapsed time t=1.04 s, 32768 iters, t-(init.)=1 s t(norm)=0.0340598, mflops=146.801 10. CWP (best N) (N=140): elapsed time t=1.84 s, 65536 iters, t-(init.)=1.74 s t(norm)=0.029632, mflops=168.736 11. Edelblute: elapsed time t=1.22 s, 16384 iters, t-(init.)=1.2 s t(norm)=0.0817435, mflops=61.1669 (err=3.3e-16) 12. 
FFTPACK: elapsed time t=1.77 s, 65536 iters, t-(init.)=1.69 s t(norm)=0.0287805, mflops=173.729 (err=3.4e-16) 13. FFTPACK (f2c): elapsed time t=1.04 s, 8192 iters, t-(init.)=1.03 s t(norm)=0.140326, mflops=35.6312 (err=3.4e-16) FFTW_MEASURE plan: (cost = 1.953125e-05) FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.3 s, 65536 iters, t-(init.)=1.22 s t(norm)=0.0207765, mflops=240.657 (err=3.5e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.53 s, 65536 iters, t-(init.)=1.44 s t(norm)=0.0245231, mflops=203.89 (err=3.5e-16) 16. Frigo-old: elapsed time t=1.38 s, 65536 iters, t-(init.)=1.29 s t(norm)=0.0219686, mflops=227.598 (err=3.4e-16) 17. Green: elapsed time t=1.74 s, 65536 iters, t-(init.)=1.65 s t(norm)=0.0280993, mflops=177.94 (err=4.2e-16) 18. GSL: elapsed time t=1.21 s, 32768 iters, t-(init.)=1.16 s t(norm)=0.0395094, mflops=126.552 (err=3.3e-16) 19. GSL DIT: elapsed time t=1.18 s, 16384 iters, t-(init.)=1.16 s t(norm)=0.0790187, mflops=63.2761 (err=3.5e-16) 20. GSL DIF: elapsed time t=1.19 s, 16384 iters, t-(init.)=1.17 s t(norm)=0.0796999, mflops=62.7353 (err=3.7e-16) 21. Krukar: elapsed time t=1.73 s, 65536 iters, t-(init.)=1.64 s t(norm)=0.027929, mflops=179.025 (err=3.6e-16) 22. Mayer (Buneman): elapsed time t=1.98 s, 32768 iters, t-(init.)=1.94 s t(norm)=0.066076, mflops=75.6704 (err=3.3e-16) 23. Mayer (simple): elapsed time t=1.52 s, 32768 iters, t-(init.)=1.47 s t(norm)=0.0500679, mflops=99.8644 24. Mayer (lookup): elapsed time t=1.52 s, 32768 iters, t-(init.)=1.48 s t(norm)=0.0504085, mflops=99.1896 (err=3.4e-16) 25. Monro: elapsed time t=1.77 s, 16384 iters, t-(init.)=1.75 s t(norm)=0.119209, mflops=41.943 (err=5.2e-08) 26. NAPACK (f2c): elapsed time t=1.6 s, 16384 iters, t-(init.)=1.58 s t(norm)=0.107629, mflops=46.4559 (err=2.1e-15) 27. Nielsen: elapsed time t=1.06 s, 8192 iters, t-(init.)=1.05 s t(norm)=0.143051, mflops=34.9525 (err=1.0e-15) 28. 
NR (C): elapsed time t=1.19 s, 16384 iters, t-(init.)=1.17 s t(norm)=0.0796999, mflops=62.7353 (err=3.6e-16) 29. NR (F): elapsed time t=1.33 s, 16384 iters, t-(init.)=1.31 s t(norm)=0.0892367, mflops=56.0308 (err=3.6e-16) 30. Ooura (C): elapsed time t=1.69 s, 65536 iters, t-(init.)=1.61 s t(norm)=0.0274181, mflops=182.361 (err=3.5e-16) 31. Ooura (F): elapsed time t=1.7 s, 65536 iters, t-(init.)=1.61 s t(norm)=0.0274181, mflops=182.361 (err=3.5e-16) 32. Ransom: elapsed time t=1.06 s, 8192 iters, t-(init.)=1.05 s t(norm)=0.143051, mflops=34.9525 (err=9.6e-16) 33. SCIPORT: elapsed time t=1.37 s, 32768 iters, t-(init.)=1.32 s t(norm)=0.0449589, mflops=111.213 (err=3.7e-16) 34. Singleton: elapsed time t=1.41 s, 32768 iters, t-(init.)=1.36 s t(norm)=0.0463213, mflops=107.942 (err=4.2e-16) 35. Singleton (f2c): elapsed time t=1.44 s, 32768 iters, t-(init.)=1.39 s t(norm)=0.0473431, mflops=105.612 (err=4.2e-16) 36. Sorensen: elapsed time t=1.1 s, 32768 iters, t-(init.)=1.05 s t(norm)=0.0357628, mflops=139.81 (err=3.0e-16) 37. Sorensen DIT: elapsed time t=1.24 s, 16384 iters, t-(init.)=1.21 s t(norm)=0.0824247, mflops=60.6614 (err=3.3e-16) 38. Temperton: elapsed time t=1.4 s, 16384 iters, t-(init.)=1.38 s t(norm)=0.094005, mflops=53.1886 (err=4.7e-08) 39. Temperton (f2c): elapsed time t=1.48 s, 16384 iters, t-(init.)=1.46 s t(norm)=0.0994546, mflops=50.2742 (err=3.6e-16) 40. Valkenburg: elapsed time t=1.21 s, 2048 iters, t-(init.)=1.21 s t(norm)=0.659398, mflops=7.58268 (err=5.3e-16) 41. SCSL: elapsed time t=1.07 s, 65536 iters, t-(init.)=0.98 s t(norm)=0.0166893, mflops=299.593 (err=3.4e-16) 42. SGIMATH: elapsed time t=1.06 s, 65536 iters, t-(init.)=0.98 s t(norm)=0.0166893, mflops=299.593 (err=3.4e-16) Top mflops for N=128 = 299.593 Normalized results and averages for N=128: fft 0: mflops = 105.612 (norm. = 0.352518), norm. avg. (of 7) = 0.445614 fft 1: mflops = 107.154 (norm. = 0.357664), norm. avg. (of 7) = 0.436259 fft 2: mflops = 77.2635 (norm. = 0.257895), norm. avg. 
(of 7) = 0.287733 fft 3: mflops = 29.1271 (norm. = 0.0972222), norm. avg. (of 7) = 0.0589113 fft 4: mflops = 107.154 (norm. = 0.357664), norm. avg. (of 7) = 0.24573 fft 5: mflops = 28.4497 (norm. = 0.0949612), norm. avg. (of 7) = 0.107353 fft 6: mflops = 71.2624 (norm. = 0.237864), norm. avg. (of 7) = 0.15676 fft 7: mflops = 53.1886 (norm. = 0.177536), norm. avg. (of 7) = 0.132719 fft 8: mflops = 59.1938 (norm. = 0.197581), norm. avg. (of 7) = 0.209032 fft 9: mflops = 146.801 (norm. = 0.49), norm. avg. (of 7) = 0.248284 fft 10: mflops = 168.736 (norm. = 0.563218), norm. avg. (of 7) = 0.238662 fft 11: mflops = 61.1669 (norm. = 0.204167), norm. avg. (of 6) = 0.228353 fft 12: mflops = 173.729 (norm. = 0.579882), norm. avg. (of 7) = 0.361089 fft 13: mflops = 35.6312 (norm. = 0.118932), norm. avg. (of 7) = 0.116445 fft 14: mflops = 240.657 (norm. = 0.803279), norm. avg. (of 7) = 0.713352 fft 15: mflops = 203.89 (norm. = 0.680556), norm. avg. (of 7) = 0.66961 fft 16: mflops = 227.598 (norm. = 0.75969), norm. avg. (of 7) = 0.894527 fft 17: mflops = 177.94 (norm. = 0.593939), norm. avg. (of 5) = 0.479537 fft 18: mflops = 126.552 (norm. = 0.422414), norm. avg. (of 7) = 0.255665 fft 19: mflops = 63.2761 (norm. = 0.211207), norm. avg. (of 7) = 0.162513 fft 20: mflops = 62.7353 (norm. = 0.209402), norm. avg. (of 7) = 0.168949 fft 21: mflops = 179.025 (norm. = 0.597561), norm. avg. (of 7) = 0.676312 fft 22: mflops = 75.6704 (norm. = 0.252577), norm. avg. (of 6) = 0.259511 fft 23: mflops = 99.8644 (norm. = 0.333333), norm. avg. (of 6) = 0.303558 fft 24: mflops = 99.1896 (norm. = 0.331081), norm. avg. (of 6) = 0.281249 fft 25: mflops = 41.943 (norm. = 0.14), norm. avg. (of 6) = 0.0872211 fft 26: mflops = 46.4559 (norm. = 0.155063), norm. avg. (of 7) = 0.110586 fft 27: mflops = 34.9525 (norm. = 0.116667), norm. avg. (of 7) = 0.0921205 fft 28: mflops = 62.7353 (norm. = 0.209402), norm. avg. (of 7) = 0.166498 fft 29: mflops = 56.0308 (norm. = 0.187023), norm. avg. 
(of 7) = 0.133243 fft 30: mflops = 182.361 (norm. = 0.608696), norm. avg. (of 7) = 0.667979 fft 31: mflops = 182.361 (norm. = 0.608696), norm. avg. (of 7) = 0.505981 fft 32: mflops = 34.9525 (norm. = 0.116667), norm. avg. (of 6) = 0.0805599 fft 33: mflops = 111.213 (norm. = 0.371212), norm. avg. (of 6) = 0.320931 fft 34: mflops = 107.942 (norm. = 0.360294), norm. avg. (of 7) = 0.229272 fft 35: mflops = 105.612 (norm. = 0.352518), norm. avg. (of 7) = 0.231305 fft 36: mflops = 139.81 (norm. = 0.466667), norm. avg. (of 7) = 0.328404 fft 37: mflops = 60.6614 (norm. = 0.202479), norm. avg. (of 7) = 0.205805 fft 38: mflops = 53.1886 (norm. = 0.177536), norm. avg. (of 7) = 0.116511 fft 39: mflops = 50.2742 (norm. = 0.167808), norm. avg. (of 7) = 0.104881 fft 40: mflops = 7.58268 (norm. = 0.0253099), norm. avg. (of 7) = 0.0429966 fft 41: mflops = 299.593 (norm. = 1), norm. avg. (of 7) = 0.633211 fft 42: mflops = 299.593 (norm. = 1), norm. avg. (of 7) = 0.635873 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.59 s, 16384 iters, t-(init.)=1.55 s t(norm)=0.0461936, mflops=108.24 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.59 s, 16384 iters, t-(init.)=1.55 s t(norm)=0.0461936, mflops=108.24 (err=1.0e-15) 2. Arndt Split-Radix: elapsed time t=1.02 s, 8192 iters, t-(init.)=1 s t(norm)=0.0596046, mflops=83.8861 (err=1.0e-15) 3. Arndt 4-step: elapsed time t=1.14 s, 4096 iters, t-(init.)=1.13 s t(norm)=0.134706, mflops=37.1177 (err=1.0e-15) 4. Bailey: elapsed time t=1.73 s, 16384 iters, t-(init.)=1.68 s t(norm)=0.0500679, mflops=99.8644 (err=1.0e-15) 5. Beauregard: elapsed time t=1.43 s, 4096 iters, t-(init.)=1.42 s t(norm)=0.169277, mflops=29.5374 (err=1.1e-15) 6. Bergland: elapsed time t=1.06 s, 8192 iters, t-(init.)=1.04 s t(norm)=0.0619888, mflops=80.6597 (err=1.0e-15) 7. Brenner: elapsed time t=1.52 s, 8192 iters, t-(init.)=1.5 s t(norm)=0.089407, mflops=55.9241 (err=1.1e-15) 8. 
Burrus: elapsed time t=1.32 s, 8192 iters, t-(init.)=1.3 s t(norm)=0.077486, mflops=64.5278 (err=1.0e-15) 9. CWP (min N) (N=260): elapsed time t=1.91 s, 32768 iters, t-(init.)=1.82 s t(norm)=0.0271201, mflops=184.365 10. CWP (best N) (N=280): elapsed time t=1.67 s, 32768 iters, t-(init.)=1.57 s t(norm)=0.0233948, mflops=213.722 11. Edelblute: elapsed time t=1.28 s, 8192 iters, t-(init.)=1.26 s t(norm)=0.0751019, mflops=66.5763 (err=1.0e-15) 12. FFTPACK: elapsed time t=1.68 s, 32768 iters, t-(init.)=1.6 s t(norm)=0.0238419, mflops=209.715 (err=1.1e-15) 13. FFTPACK (f2c): elapsed time t=1.03 s, 4096 iters, t-(init.)=1.02 s t(norm)=0.121593, mflops=41.1206 (err=1.1e-15) FFTW_MEASURE plan: (cost = 4.516602e-05) FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.73 s, 32768 iters, t-(init.)=1.65 s t(norm)=0.0245869, mflops=203.36 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.82 s, 32768 iters, t-(init.)=1.73 s t(norm)=0.025779, mflops=193.956 (err=1.0e-15) 16. Frigo-old: elapsed time t=1.79 s, 32768 iters, t-(init.)=1.71 s t(norm)=0.025481, mflops=196.225 (err=1.0e-15) 17. Green: elapsed time t=1.78 s, 32768 iters, t-(init.)=1.69 s t(norm)=0.025183, mflops=198.547 (err=1.1e-15) 18. GSL: elapsed time t=1.26 s, 16384 iters, t-(init.)=1.22 s t(norm)=0.0363588, mflops=137.518 (err=1.1e-15) 19. GSL DIT: elapsed time t=1.22 s, 8192 iters, t-(init.)=1.2 s t(norm)=0.0715256, mflops=69.9051 (err=1.0e-15) 20. GSL DIF: elapsed time t=1.24 s, 8192 iters, t-(init.)=1.22 s t(norm)=0.0727177, mflops=68.7591 (err=1.1e-15) 21. Krukar: elapsed time t=1.04 s, 16384 iters, t-(init.)=1 s t(norm)=0.0298023, mflops=167.772 (err=1.1e-15) 22. Mayer (Buneman): elapsed time t=1.08 s, 8192 iters, t-(init.)=1.05 s t(norm)=0.0625849, mflops=79.8915 (err=9.8e-16) 23. Mayer (simple): elapsed time t=1.66 s, 16384 iters, t-(init.)=1.62 s t(norm)=0.0482798, mflops=103.563 24. 
Mayer (lookup): elapsed time t=1.62 s, 16384 iters, t-(init.)=1.57 s t(norm)=0.0467896, mflops=106.861 (err=9.6e-16) 25. Monro: elapsed time t=1.84 s, 8192 iters, t-(init.)=1.82 s t(norm)=0.10848, mflops=46.0913 (err=8.5e-08) 26. NAPACK (f2c): elapsed time t=1.57 s, 8192 iters, t-(init.)=1.54 s t(norm)=0.0917912, mflops=54.4715 (err=4.9e-15) 27. Nielsen: elapsed time t=1.97 s, 8192 iters, t-(init.)=1.94 s t(norm)=0.115633, mflops=43.2402 (err=4.3e-15) 28. NR (C): elapsed time t=1.27 s, 8192 iters, t-(init.)=1.24 s t(norm)=0.0739098, mflops=67.6501 (err=1.1e-15) 29. NR (F): elapsed time t=1.36 s, 8192 iters, t-(init.)=1.34 s t(norm)=0.0798702, mflops=62.6016 (err=1.1e-15) 30. Ooura (C): elapsed time t=1.84 s, 32768 iters, t-(init.)=1.75 s t(norm)=0.026077, mflops=191.74 (err=1.0e-15) 31. Ooura (F): elapsed time t=1.76 s, 32768 iters, t-(init.)=1.67 s t(norm)=0.0248849, mflops=200.925 (err=1.0e-15) 32. Ransom: elapsed time t=1.6 s, 8192 iters, t-(init.)=1.57 s t(norm)=0.0935793, mflops=53.4306 (err=1.8e-15) 33. SCIPORT: elapsed time t=1.47 s, 16384 iters, t-(init.)=1.42 s t(norm)=0.0423193, mflops=118.149 (err=1.1e-15) 34. Singleton: elapsed time t=1.28 s, 16384 iters, t-(init.)=1.23 s t(norm)=0.0366569, mflops=136.4 (err=1.7e-15) 35. Singleton (f2c): elapsed time t=1.28 s, 16384 iters, t-(init.)=1.24 s t(norm)=0.0369549, mflops=135.3 (err=1.7e-15) 36. Sorensen: elapsed time t=1.12 s, 16384 iters, t-(init.)=1.08 s t(norm)=0.0321865, mflops=155.345 (err=9.6e-16) 37. Sorensen DIT: elapsed time t=1.32 s, 8192 iters, t-(init.)=1.3 s t(norm)=0.077486, mflops=64.5278 (err=9.8e-16) 38. Temperton: elapsed time t=1.52 s, 8192 iters, t-(init.)=1.5 s t(norm)=0.089407, mflops=55.9241 (err=9.5e-08) 39. Temperton (f2c): elapsed time t=1.51 s, 8192 iters, t-(init.)=1.49 s t(norm)=0.0888109, mflops=56.2994 (err=1.1e-15) 40. Valkenburg: elapsed time t=1.36 s, 1024 iters, t-(init.)=1.36 s t(norm)=0.648499, mflops=7.71012 (err=1.1e-15) 41. 
SCSL: elapsed time t=1.15 s, 32768 iters, t-(init.)=1.06 s t(norm)=0.0157952, mflops=316.551 (err=1.1e-15) 42. SGIMATH: elapsed time t=1.16 s, 32768 iters, t-(init.)=1.07 s t(norm)=0.0159442, mflops=313.593 (err=1.1e-15) Top mflops for N=256 = 316.551 Normalized results and averages for N=256: fft 0: mflops = 108.24 (norm. = 0.341935), norm. avg. (of 8) = 0.432654 fft 1: mflops = 108.24 (norm. = 0.341935), norm. avg. (of 8) = 0.424468 fft 2: mflops = 83.8861 (norm. = 0.265), norm. avg. (of 8) = 0.284891 fft 3: mflops = 37.1177 (norm. = 0.117257), norm. avg. (of 8) = 0.0662045 fft 4: mflops = 99.8644 (norm. = 0.315476), norm. avg. (of 8) = 0.254448 fft 5: mflops = 29.5374 (norm. = 0.0933099), norm. avg. (of 8) = 0.105597 fft 6: mflops = 80.6597 (norm. = 0.254808), norm. avg. (of 8) = 0.169016 fft 7: mflops = 55.9241 (norm. = 0.176667), norm. avg. (of 8) = 0.138212 fft 8: mflops = 64.5278 (norm. = 0.203846), norm. avg. (of 8) = 0.208384 fft 9: mflops = 184.365 (norm. = 0.582418), norm. avg. (of 8) = 0.290051 fft 10: mflops = 213.722 (norm. = 0.675159), norm. avg. (of 8) = 0.293224 fft 11: mflops = 66.5763 (norm. = 0.210317), norm. avg. (of 7) = 0.225776 fft 12: mflops = 209.715 (norm. = 0.6625), norm. avg. (of 8) = 0.398766 fft 13: mflops = 41.1206 (norm. = 0.129902), norm. avg. (of 8) = 0.118127 fft 14: mflops = 203.36 (norm. = 0.642424), norm. avg. (of 8) = 0.704486 fft 15: mflops = 193.956 (norm. = 0.612717), norm. avg. (of 8) = 0.662499 fft 16: mflops = 196.225 (norm. = 0.619883), norm. avg. (of 8) = 0.860196 fft 17: mflops = 198.547 (norm. = 0.627219), norm. avg. (of 6) = 0.504151 fft 18: mflops = 137.518 (norm. = 0.434426), norm. avg. (of 8) = 0.27801 fft 19: mflops = 69.9051 (norm. = 0.220833), norm. avg. (of 8) = 0.169803 fft 20: mflops = 68.7591 (norm. = 0.217213), norm. avg. (of 8) = 0.174982 fft 21: mflops = 167.772 (norm. = 0.53), norm. avg. (of 8) = 0.658023 fft 22: mflops = 79.8915 (norm. = 0.252381), norm. avg. 
(of 7) = 0.258492 fft 23: mflops = 103.563 (norm. = 0.32716), norm. avg. (of 7) = 0.30693 fft 24: mflops = 106.861 (norm. = 0.33758), norm. avg. (of 7) = 0.289296 fft 25: mflops = 46.0913 (norm. = 0.145604), norm. avg. (of 7) = 0.0955615 fft 26: mflops = 54.4715 (norm. = 0.172078), norm. avg. (of 8) = 0.118273 fft 27: mflops = 43.2402 (norm. = 0.136598), norm. avg. (of 8) = 0.0976801 fft 28: mflops = 67.6501 (norm. = 0.21371), norm. avg. (of 8) = 0.172399 fft 29: mflops = 62.6016 (norm. = 0.197761), norm. avg. (of 8) = 0.141307 fft 30: mflops = 191.74 (norm. = 0.605714), norm. avg. (of 8) = 0.660196 fft 31: mflops = 200.925 (norm. = 0.634731), norm. avg. (of 8) = 0.522075 fft 32: mflops = 53.4306 (norm. = 0.16879), norm. avg. (of 7) = 0.0931642 fft 33: mflops = 118.149 (norm. = 0.373239), norm. avg. (of 7) = 0.328403 fft 34: mflops = 136.4 (norm. = 0.430894), norm. avg. (of 8) = 0.254475 fft 35: mflops = 135.3 (norm. = 0.427419), norm. avg. (of 8) = 0.25582 fft 36: mflops = 155.345 (norm. = 0.490741), norm. avg. (of 8) = 0.348696 fft 37: mflops = 64.5278 (norm. = 0.203846), norm. avg. (of 8) = 0.20556 fft 38: mflops = 55.9241 (norm. = 0.176667), norm. avg. (of 8) = 0.12403 fft 39: mflops = 56.2994 (norm. = 0.177852), norm. avg. (of 8) = 0.114002 fft 40: mflops = 7.71012 (norm. = 0.0243566), norm. avg. (of 8) = 0.0406666 fft 41: mflops = 316.551 (norm. = 1), norm. avg. (of 8) = 0.67906 fft 42: mflops = 313.593 (norm. = 0.990654), norm. avg. (of 8) = 0.680221 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.67 s, 8192 iters, t-(init.)=1.63 s t(norm)=0.0431803, mflops=115.794 (err=1.1e-15) 1. Arndt DIT: elapsed time t=1.64 s, 8192 iters, t-(init.)=1.6 s t(norm)=0.0423855, mflops=117.965 (err=1.1e-15) 2. Arndt Split-Radix: elapsed time t=1.1 s, 4096 iters, t-(init.)=1.08 s t(norm)=0.0572205, mflops=87.3813 (err=1.1e-15) 3. Arndt 4-step: elapsed time t=1.24 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.130335, mflops=38.3625 (err=1.0e-15) 4. 
Bailey: elapsed time t=1.73 s, 8192 iters, t-(init.)=1.68 s t(norm)=0.0445048, mflops=112.347 (err=1.1e-15) 5. Beauregard: elapsed time t=1.6 s, 2048 iters, t-(init.)=1.59 s t(norm)=0.168482, mflops=29.6767 (err=1.0e-15) 6. Bergland: elapsed time t=1.1 s, 4096 iters, t-(init.)=1.08 s t(norm)=0.0572205, mflops=87.3813 (err=1.0e-15) 7. Brenner: elapsed time t=1.57 s, 4096 iters, t-(init.)=1.55 s t(norm)=0.082122, mflops=60.8851 (err=9.9e-16) 8. Burrus: elapsed time t=1.39 s, 4096 iters, t-(init.)=1.37 s t(norm)=0.0725852, mflops=68.8846 (err=1.1e-15) 9. CWP (min N) (N=520): elapsed time t=1.95 s, 16384 iters, t-(init.)=1.86 s t(norm)=0.0246366, mflops=202.95 10. CWP (best N) (N=560): elapsed time t=1.88 s, 16384 iters, t-(init.)=1.79 s t(norm)=0.0237094, mflops=210.887 11. Edelblute: elapsed time t=1.35 s, 4096 iters, t-(init.)=1.32 s t(norm)=0.0699361, mflops=71.4938 (err=1.1e-15) 12. FFTPACK: elapsed time t=1.14 s, 8192 iters, t-(init.)=1.1 s t(norm)=0.02914, mflops=171.585 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.4 s, 2048 iters, t-(init.)=1.39 s t(norm)=0.14729, mflops=33.9467 (err=1.0e-15) FFTW_MEASURE plan: (cost = 1.074219e-04) FFTW_TWIDDLE 16 FFTW_NOTW 32 14. FFTW: elapsed time t=1.86 s, 16384 iters, t-(init.)=1.77 s t(norm)=0.0234445, mflops=213.27 (err=9.7e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.87 s, 16384 iters, t-(init.)=1.78 s t(norm)=0.0235769, mflops=212.072 (err=9.7e-16) 16. Frigo-old: elapsed time t=1.04 s, 8192 iters, t-(init.)=1 s t(norm)=0.026491, mflops=188.744 (err=9.5e-16) 17. Green: elapsed time t=1.82 s, 16384 iters, t-(init.)=1.73 s t(norm)=0.0229147, mflops=218.201 (err=9.8e-16) 18. GSL: elapsed time t=1.64 s, 8192 iters, t-(init.)=1.6 s t(norm)=0.0423855, mflops=117.965 (err=1.0e-15) 19. GSL DIT: elapsed time t=1.33 s, 4096 iters, t-(init.)=1.31 s t(norm)=0.0694063, mflops=72.0396 (err=1.2e-15) 20. 
GSL DIF: elapsed time t=1.34 s, 4096 iters, t-(init.)=1.32 s t(norm)=0.0699361, mflops=71.4938 (err=1.1e-15) 21. Krukar: elapsed time t=1.23 s, 8192 iters, t-(init.)=1.19 s t(norm)=0.0315242, mflops=158.608 (err=1.0e-15) 22. Mayer (Buneman): elapsed time t=1.13 s, 4096 iters, t-(init.)=1.11 s t(norm)=0.0588099, mflops=85.0197 (err=1.0e-15) 23. Mayer (simple): elapsed time t=1.72 s, 8192 iters, t-(init.)=1.68 s t(norm)=0.0445048, mflops=112.347 24. Mayer (lookup): elapsed time t=1.73 s, 8192 iters, t-(init.)=1.69 s t(norm)=0.0447697, mflops=111.683 (err=9.8e-16) 25. Monro: elapsed time t=1.86 s, 4096 iters, t-(init.)=1.84 s t(norm)=0.0974867, mflops=51.289 (err=8.4e-08) 26. NAPACK (f2c): elapsed time t=1.84 s, 4096 iters, t-(init.)=1.82 s t(norm)=0.0964271, mflops=51.8527 (err=6.9e-15) 27. Nielsen: elapsed time t=1.86 s, 4096 iters, t-(init.)=1.84 s t(norm)=0.0974867, mflops=51.289 (err=3.7e-15) 28. NR (C): elapsed time t=1.35 s, 4096 iters, t-(init.)=1.33 s t(norm)=0.0704659, mflops=70.9563 (err=1.1e-15) 29. NR (F): elapsed time t=1.42 s, 4096 iters, t-(init.)=1.4 s t(norm)=0.0741747, mflops=67.4085 (err=1.1e-15) 30. Ooura (C): elapsed time t=1.01 s, 8192 iters, t-(init.)=0.97 s t(norm)=0.0256962, mflops=194.581 (err=9.8e-16) 31. Ooura (F): elapsed time t=1.97 s, 16384 iters, t-(init.)=1.88 s t(norm)=0.0249015, mflops=200.791 (err=9.8e-16) 32. Ransom: elapsed time t=1.91 s, 4096 iters, t-(init.)=1.89 s t(norm)=0.100136, mflops=49.9322 (err=1.4e-15) 33. SCIPORT: elapsed time t=1.67 s, 8192 iters, t-(init.)=1.63 s t(norm)=0.0431803, mflops=115.794 (err=1.0e-15) 34. Singleton: elapsed time t=1.39 s, 8192 iters, t-(init.)=1.35 s t(norm)=0.0357628, mflops=139.81 (err=1.2e-15) 35. Singleton (f2c): elapsed time t=1.38 s, 8192 iters, t-(init.)=1.34 s t(norm)=0.0354979, mflops=140.853 (err=1.2e-15) 36. Sorensen: elapsed time t=1.14 s, 8192 iters, t-(init.)=1.1 s t(norm)=0.02914, mflops=171.585 (err=1.0e-15) 37. 
Sorensen DIT: elapsed time t=1.38 s, 4096 iters, t-(init.)=1.35 s t(norm)=0.0715256, mflops=69.9051 (err=1.1e-15) 38. Temperton: elapsed time t=1.75 s, 4096 iters, t-(init.)=1.73 s t(norm)=0.0916587, mflops=54.5502 (err=1.0e-07) 39. Temperton (f2c): elapsed time t=1.86 s, 4096 iters, t-(init.)=1.84 s t(norm)=0.0974867, mflops=51.289 (err=9.9e-16) 40. Valkenburg: elapsed time t=1.5 s, 512 iters, t-(init.)=1.49 s t(norm)=0.631544, mflops=7.9171 (err=1.3e-15) 41. SCSL: elapsed time t=1.24 s, 16384 iters, t-(init.)=1.15 s t(norm)=0.0152323, mflops=328.25 (err=9.8e-16) 42. SGIMATH: elapsed time t=1.48 s, 16384 iters, t-(init.)=1.4 s t(norm)=0.0185437, mflops=269.634 (err=9.8e-16) Top mflops for N=512 = 328.25 Normalized results and averages for N=512: fft 0: mflops = 115.794 (norm. = 0.352761), norm. avg. (of 9) = 0.423777 fft 1: mflops = 117.965 (norm. = 0.359375), norm. avg. (of 9) = 0.417236 fft 2: mflops = 87.3813 (norm. = 0.266204), norm. avg. (of 9) = 0.282815 fft 3: mflops = 38.3625 (norm. = 0.11687), norm. avg. (of 9) = 0.071834 fft 4: mflops = 112.347 (norm. = 0.342262), norm. avg. (of 9) = 0.264205 fft 5: mflops = 29.6767 (norm. = 0.0904088), norm. avg. (of 9) = 0.10391 fft 6: mflops = 87.3813 (norm. = 0.266204), norm. avg. (of 9) = 0.179815 fft 7: mflops = 60.8851 (norm. = 0.185484), norm. avg. (of 9) = 0.143465 fft 8: mflops = 68.8846 (norm. = 0.209854), norm. avg. (of 9) = 0.208547 fft 9: mflops = 202.95 (norm. = 0.61828), norm. avg. (of 9) = 0.32652 fft 10: mflops = 210.887 (norm. = 0.642458), norm. avg. (of 9) = 0.332028 fft 11: mflops = 71.4938 (norm. = 0.217803), norm. avg. (of 8) = 0.22478 fft 12: mflops = 171.585 (norm. = 0.522727), norm. avg. (of 9) = 0.412539 fft 13: mflops = 33.9467 (norm. = 0.103417), norm. avg. (of 9) = 0.116493 fft 14: mflops = 213.27 (norm. = 0.649718), norm. avg. (of 9) = 0.698401 fft 15: mflops = 212.072 (norm. = 0.646067), norm. avg. (of 9) = 0.660673 fft 16: mflops = 188.744 (norm. = 0.575), norm. avg. 
(of 9) = 0.828508 fft 17: mflops = 218.201 (norm. = 0.66474), norm. avg. (of 7) = 0.527092 fft 18: mflops = 117.965 (norm. = 0.359375), norm. avg. (of 9) = 0.28705 fft 19: mflops = 72.0396 (norm. = 0.219466), norm. avg. (of 9) = 0.175321 fft 20: mflops = 71.4938 (norm. = 0.217803), norm. avg. (of 9) = 0.17974 fft 21: mflops = 158.608 (norm. = 0.483193), norm. avg. (of 9) = 0.638597 fft 22: mflops = 85.0197 (norm. = 0.259009), norm. avg. (of 8) = 0.258557 fft 23: mflops = 112.347 (norm. = 0.342262), norm. avg. (of 8) = 0.311347 fft 24: mflops = 111.683 (norm. = 0.340237), norm. avg. (of 8) = 0.295664 fft 25: mflops = 51.289 (norm. = 0.15625), norm. avg. (of 8) = 0.103148 fft 26: mflops = 51.8527 (norm. = 0.157967), norm. avg. (of 9) = 0.122683 fft 27: mflops = 51.289 (norm. = 0.15625), norm. avg. (of 9) = 0.104188 fft 28: mflops = 70.9563 (norm. = 0.216165), norm. avg. (of 9) = 0.177262 fft 29: mflops = 67.4085 (norm. = 0.205357), norm. avg. (of 9) = 0.148424 fft 30: mflops = 194.581 (norm. = 0.592784), norm. avg. (of 9) = 0.652706 fft 31: mflops = 200.791 (norm. = 0.611702), norm. avg. (of 9) = 0.532033 fft 32: mflops = 49.9322 (norm. = 0.152116), norm. avg. (of 8) = 0.100533 fft 33: mflops = 115.794 (norm. = 0.352761), norm. avg. (of 8) = 0.331448 fft 34: mflops = 139.81 (norm. = 0.425926), norm. avg. (of 9) = 0.273525 fft 35: mflops = 140.853 (norm. = 0.429104), norm. avg. (of 9) = 0.275073 fft 36: mflops = 171.585 (norm. = 0.522727), norm. avg. (of 9) = 0.368033 fft 37: mflops = 69.9051 (norm. = 0.212963), norm. avg. (of 9) = 0.206383 fft 38: mflops = 54.5502 (norm. = 0.166185), norm. avg. (of 9) = 0.128714 fft 39: mflops = 51.289 (norm. = 0.15625), norm. avg. (of 9) = 0.118696 fft 40: mflops = 7.9171 (norm. = 0.0241191), norm. avg. (of 9) = 0.038828 fft 41: mflops = 328.25 (norm. = 1), norm. avg. (of 9) = 0.71472 fft 42: mflops = 269.634 (norm. = 0.821429), norm. avg. (of 9) = 0.695911 Benchmarking for array size = 1024 (power of 2): 0. 
Arndt DIF: elapsed time t=1.82 s, 4096 iters, t-(init.)=1.78 s t(norm)=0.0424385, mflops=117.818 (err=1.8e-15) 1. Arndt DIT: elapsed time t=1.82 s, 4096 iters, t-(init.)=1.77 s t(norm)=0.0422001, mflops=118.483 (err=1.9e-15) 2. Arndt Split-Radix: elapsed time t=1.15 s, 2048 iters, t-(init.)=1.13 s t(norm)=0.0538826, mflops=92.7943 (err=1.8e-15) 3. Arndt 4-step: elapsed time t=1.13 s, 1024 iters, t-(init.)=1.12 s t(norm)=0.106812, mflops=46.8114 (err=1.8e-15) 4. Bailey: elapsed time t=1.26 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.058651, mflops=85.2501 (err=1.8e-15) 5. Beauregard: elapsed time t=1.84 s, 1024 iters, t-(init.)=1.83 s t(norm)=0.174522, mflops=28.6496 (err=2.0e-15) 6. Bergland: elapsed time t=1.17 s, 2048 iters, t-(init.)=1.14 s t(norm)=0.0543594, mflops=91.9804 (err=2.1e-15) 7. Brenner: elapsed time t=1.67 s, 2048 iters, t-(init.)=1.65 s t(norm)=0.0786781, mflops=63.5501 (err=1.9e-15) 8. Burrus: elapsed time t=1.44 s, 2048 iters, t-(init.)=1.42 s t(norm)=0.0677109, mflops=73.8434 (err=1.8e-15) 9. CWP (min N) (N=1040): elapsed time t=1.07 s, 4096 iters, t-(init.)=1.03 s t(norm)=0.0245571, mflops=203.607 10. CWP (best N) (N=1040): elapsed time t=1.06 s, 4096 iters, t-(init.)=1.01 s t(norm)=0.0240803, mflops=207.639 11. Edelblute: elapsed time t=1.41 s, 2048 iters, t-(init.)=1.39 s t(norm)=0.0662804, mflops=75.4371 (err=1.8e-15) 12. FFTPACK: elapsed time t=1.49 s, 4096 iters, t-(init.)=1.44 s t(norm)=0.0343323, mflops=145.636 (err=1.9e-15) 13. FFTPACK (f2c): elapsed time t=1.46 s, 1024 iters, t-(init.)=1.45 s t(norm)=0.138283, mflops=36.1578 (err=1.9e-15) FFTW_MEASURE plan: (cost = 2.539063e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.09 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.0247955, mflops=201.649 (err=2.0e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.23 s, 4096 iters, t-(init.)=1.18 s t(norm)=0.0281334, mflops=177.725 (err=1.9e-15) 16. 
Frigo-old: elapsed time t=1.22 s, 4096 iters, t-(init.)=1.18 s t(norm)=0.0281334, mflops=177.725 (err=1.9e-15) 17. Green: elapsed time t=1 s, 4096 iters, t-(init.)=0.95 s t(norm)=0.0226498, mflops=220.753 (err=2.0e-15) 18. GSL: elapsed time t=1.68 s, 4096 iters, t-(init.)=1.64 s t(norm)=0.0391006, mflops=127.875 (err=1.9e-15) 19. GSL DIT: elapsed time t=1.49 s, 2048 iters, t-(init.)=1.46 s t(norm)=0.0696182, mflops=71.8203 (err=2.1e-15) 20. GSL DIF: elapsed time t=1.52 s, 2048 iters, t-(init.)=1.5 s t(norm)=0.0715256, mflops=69.9051 (err=2.2e-15) 21. Krukar: elapsed time t=1.19 s, 2048 iters, t-(init.)=1.17 s t(norm)=0.0557899, mflops=89.6219 (err=1.9e-15) 22. Mayer (Buneman): elapsed time t=1.21 s, 2048 iters, t-(init.)=1.19 s t(norm)=0.0567436, mflops=88.1156 (err=1.8e-15) 23. Mayer (simple): elapsed time t=1.87 s, 4096 iters, t-(init.)=1.83 s t(norm)=0.0436306, mflops=114.598 24. Mayer (lookup): elapsed time t=1.83 s, 4096 iters, t-(init.)=1.79 s t(norm)=0.0426769, mflops=117.159 (err=1.8e-15) 25. Monro: elapsed time t=1.95 s, 2048 iters, t-(init.)=1.93 s t(norm)=0.0920296, mflops=54.3304 (err=1.0e-07) 26. NAPACK (f2c): elapsed time t=1.85 s, 2048 iters, t-(init.)=1.83 s t(norm)=0.0872612, mflops=57.2992 (err=1.6e-14) 27. Nielsen: elapsed time t=1.25 s, 1024 iters, t-(init.)=1.24 s t(norm)=0.118256, mflops=42.2813 (err=7.5e-15) 28. NR (C): elapsed time t=1.45 s, 2048 iters, t-(init.)=1.43 s t(norm)=0.0681877, mflops=73.327 (err=2.0e-15) 29. NR (F): elapsed time t=1.48 s, 2048 iters, t-(init.)=1.46 s t(norm)=0.0696182, mflops=71.8203 (err=2.0e-15) 30. Ooura (C): elapsed time t=1.08 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.0247955, mflops=201.649 (err=2.2e-15) 31. Ooura (F): elapsed time t=1.04 s, 4096 iters, t-(init.)=1 s t(norm)=0.0238419, mflops=209.715 (err=2.2e-15) 32. Ransom: elapsed time t=1.66 s, 2048 iters, t-(init.)=1.64 s t(norm)=0.0782013, mflops=63.9376 (err=2.3e-15) 33. 
SCIPORT: elapsed time t=1.02 s, 2048 iters, t-(init.)=1 s t(norm)=0.0476837, mflops=104.858 (err=2.0e-15) 34. Singleton: elapsed time t=1.42 s, 4096 iters, t-(init.)=1.37 s t(norm)=0.0326633, mflops=153.077 (err=2.8e-15) 35. Singleton (f2c): elapsed time t=1.4 s, 4096 iters, t-(init.)=1.35 s t(norm)=0.0321865, mflops=155.345 (err=2.8e-15) 36. Sorensen: elapsed time t=1.19 s, 4096 iters, t-(init.)=1.15 s t(norm)=0.0274181, mflops=182.361 (err=1.8e-15) 37. Sorensen DIT: elapsed time t=1.44 s, 2048 iters, t-(init.)=1.42 s t(norm)=0.0677109, mflops=73.8434 (err=1.9e-15) 38. Temperton: elapsed time t=1.68 s, 2048 iters, t-(init.)=1.66 s t(norm)=0.079155, mflops=63.1672 (err=1.1e-07) 39. Temperton (f2c): elapsed time t=1.77 s, 2048 iters, t-(init.)=1.75 s t(norm)=0.0834465, mflops=59.9186 (err=1.9e-15) 40. Valkenburg: elapsed time t=1.68 s, 256 iters, t-(init.)=1.68 s t(norm)=0.640869, mflops=7.8019 (err=2.4e-15) 41. SCSL: elapsed time t=1.9 s, 8192 iters, t-(init.)=1.81 s t(norm)=0.0215769, mflops=231.73 (err=1.9e-15) 42. SGIMATH: elapsed time t=1.83 s, 8192 iters, t-(init.)=1.74 s t(norm)=0.0207424, mflops=241.052 (err=1.9e-15) Top mflops for N=1024 = 241.052 Normalized results and averages for N=1024: fft 0: mflops = 117.818 (norm. = 0.488764), norm. avg. (of 10) = 0.430276 fft 1: mflops = 118.483 (norm. = 0.491525), norm. avg. (of 10) = 0.424665 fft 2: mflops = 92.7943 (norm. = 0.384956), norm. avg. (of 10) = 0.293029 fft 3: mflops = 46.8114 (norm. = 0.194196), norm. avg. (of 10) = 0.0840702 fft 4: mflops = 85.2501 (norm. = 0.353659), norm. avg. (of 10) = 0.27315 fft 5: mflops = 28.6496 (norm. = 0.118852), norm. avg. (of 10) = 0.105404 fft 6: mflops = 91.9804 (norm. = 0.381579), norm. avg. (of 10) = 0.199991 fft 7: mflops = 63.5501 (norm. = 0.263636), norm. avg. (of 10) = 0.155482 fft 8: mflops = 73.8434 (norm. = 0.306338), norm. avg. (of 10) = 0.218326 fft 9: mflops = 203.607 (norm. = 0.84466), norm. avg. (of 10) = 0.378334 fft 10: mflops = 207.639 (norm. 
= 0.861386), norm. avg. (of 10) = 0.384964 fft 11: mflops = 75.4371 (norm. = 0.31295), norm. avg. (of 9) = 0.234576 fft 12: mflops = 145.636 (norm. = 0.604167), norm. avg. (of 10) = 0.431702 fft 13: mflops = 36.1578 (norm. = 0.15), norm. avg. (of 10) = 0.119844 fft 14: mflops = 201.649 (norm. = 0.836538), norm. avg. (of 10) = 0.712215 fft 15: mflops = 177.725 (norm. = 0.737288), norm. avg. (of 10) = 0.668335 fft 16: mflops = 177.725 (norm. = 0.737288), norm. avg. (of 10) = 0.819386 fft 17: mflops = 220.753 (norm. = 0.915789), norm. avg. (of 8) = 0.575679 fft 18: mflops = 127.875 (norm. = 0.530488), norm. avg. (of 10) = 0.311394 fft 19: mflops = 71.8203 (norm. = 0.297945), norm. avg. (of 10) = 0.187583 fft 20: mflops = 69.9051 (norm. = 0.29), norm. avg. (of 10) = 0.190766 fft 21: mflops = 89.6219 (norm. = 0.371795), norm. avg. (of 10) = 0.611917 fft 22: mflops = 88.1156 (norm. = 0.365546), norm. avg. (of 9) = 0.270444 fft 23: mflops = 114.598 (norm. = 0.47541), norm. avg. (of 9) = 0.329576 fft 24: mflops = 117.159 (norm. = 0.486034), norm. avg. (of 9) = 0.316816 fft 25: mflops = 54.3304 (norm. = 0.225389), norm. avg. (of 9) = 0.11673 fft 26: mflops = 57.2992 (norm. = 0.237705), norm. avg. (of 10) = 0.134185 fft 27: mflops = 42.2813 (norm. = 0.175403), norm. avg. (of 10) = 0.111309 fft 28: mflops = 73.327 (norm. = 0.304196), norm. avg. (of 10) = 0.189956 fft 29: mflops = 71.8203 (norm. = 0.297945), norm. avg. (of 10) = 0.163376 fft 30: mflops = 201.649 (norm. = 0.836538), norm. avg. (of 10) = 0.671089 fft 31: mflops = 209.715 (norm. = 0.87), norm. avg. (of 10) = 0.56583 fft 32: mflops = 63.9376 (norm. = 0.265244), norm. avg. (of 9) = 0.118834 fft 33: mflops = 104.858 (norm. = 0.435), norm. avg. (of 9) = 0.342954 fft 34: mflops = 153.077 (norm. = 0.635036), norm. avg. (of 10) = 0.309676 fft 35: mflops = 155.345 (norm. = 0.644444), norm. avg. (of 10) = 0.312011 fft 36: mflops = 182.361 (norm. = 0.756522), norm. avg. (of 10) = 0.406882 fft 37: mflops = 73.8434 (norm. 
= 0.306338), norm. avg. (of 10) = 0.216378 fft 38: mflops = 63.1672 (norm. = 0.262048), norm. avg. (of 10) = 0.142047 fft 39: mflops = 59.9186 (norm. = 0.248571), norm. avg. (of 10) = 0.131684 fft 40: mflops = 7.8019 (norm. = 0.0323661), norm. avg. (of 10) = 0.0381818 fft 41: mflops = 231.73 (norm. = 0.961326), norm. avg. (of 10) = 0.73938 fft 42: mflops = 241.052 (norm. = 1), norm. avg. (of 10) = 0.72632 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.19 s, 1024 iters, t-(init.)=1.17 s t(norm)=0.0507181, mflops=98.5841 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.16 s, 1024 iters, t-(init.)=1.14 s t(norm)=0.0494177, mflops=101.178 (err=1.4e-15) 2. Arndt Split-Radix: elapsed time t=1.44 s, 1024 iters, t-(init.)=1.42 s t(norm)=0.0615553, mflops=81.2277 (err=1.5e-15) 3. Arndt 4-step: elapsed time t=1.38 s, 512 iters, t-(init.)=1.37 s t(norm)=0.118776, mflops=42.0961 (err=1.4e-15) 4. Bailey: elapsed time t=1.65 s, 512 iters, t-(init.)=1.64 s t(norm)=0.142184, mflops=35.1657 (err=1.4e-15) 5. Beauregard: elapsed time t=1.03 s, 256 iters, t-(init.)=1.02 s t(norm)=0.176863, mflops=28.2704 (err=1.4e-15) 6. Bergland: elapsed time t=1.25 s, 1024 iters, t-(init.)=1.22 s t(norm)=0.0528856, mflops=94.5437 (err=1.5e-15) 7. Brenner: elapsed time t=1.86 s, 1024 iters, t-(init.)=1.84 s t(norm)=0.0797619, mflops=62.6866 (err=1.4e-15) 8. Burrus: elapsed time t=1.85 s, 1024 iters, t-(init.)=1.82 s t(norm)=0.0788949, mflops=63.3755 (err=1.4e-15) 9. CWP (min N) (N=2145): elapsed time t=1.35 s, 2048 iters, t-(init.)=1.29 s t(norm)=0.02796, mflops=178.827 10. CWP (best N) (N=2184): elapsed time t=1.25 s, 2048 iters, t-(init.)=1.19 s t(norm)=0.0257926, mflops=193.854 11. Edelblute: elapsed time t=1.73 s, 1024 iters, t-(init.)=1.71 s t(norm)=0.0741265, mflops=67.4523 (err=1.4e-15) 12. FFTPACK: elapsed time t=1.75 s, 1024 iters, t-(init.)=1.73 s t(norm)=0.0749935, mflops=66.6725 (err=1.4e-15) 13. 
FFTPACK (f2c): elapsed time t=1.02 s, 256 iters, t-(init.)=1.02 s t(norm)=0.176863, mflops=28.2704 (err=1.4e-15) FFTW_MEASURE plan: (cost = 6.835937e-04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.47 s, 2048 iters, t-(init.)=1.42 s t(norm)=0.0307777, mflops=162.455 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.56 s, 2048 iters, t-(init.)=1.51 s t(norm)=0.0327284, mflops=152.773 (err=1.4e-15) 16. Frigo-old: elapsed time t=1.15 s, 1024 iters, t-(init.)=1.13 s t(norm)=0.0489842, mflops=102.074 (err=1.4e-15) 17. Green: elapsed time t=1.37 s, 2048 iters, t-(init.)=1.32 s t(norm)=0.0286102, mflops=174.763 (err=1.4e-15) 18. GSL: elapsed time t=1.96 s, 1024 iters, t-(init.)=1.94 s t(norm)=0.0840967, mflops=59.4553 (err=1.4e-15) 19. GSL DIT: elapsed time t=1.73 s, 1024 iters, t-(init.)=1.71 s t(norm)=0.0741265, mflops=67.4523 (err=2.0e-15) 20. GSL DIF: elapsed time t=1.76 s, 1024 iters, t-(init.)=1.74 s t(norm)=0.075427, mflops=66.2893 (err=2.3e-15) 21. Krukar: elapsed time t=1.45 s, 1024 iters, t-(init.)=1.43 s t(norm)=0.0619888, mflops=80.6597 (err=1.4e-15) 22. Mayer (Buneman): elapsed time t=1.29 s, 1024 iters, t-(init.)=1.27 s t(norm)=0.055053, mflops=90.8215 (err=1.4e-15) 23. Mayer (simple): elapsed time t=1.01 s, 1024 iters, t-(init.)=0.99 s t(norm)=0.0429153, mflops=116.508 24. Mayer (lookup): elapsed time t=1.06 s, 1024 iters, t-(init.)=1.04 s t(norm)=0.0450828, mflops=110.907 (err=1.4e-15) 25. Monro: elapsed time t=1.15 s, 512 iters, t-(init.)=1.14 s t(norm)=0.0988353, mflops=50.5892 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.81 s, 512 iters, t-(init.)=1.8 s t(norm)=0.156056, mflops=32.0398 (err=1.5e-14) 27. Nielsen: elapsed time t=1.43 s, 512 iters, t-(init.)=1.42 s t(norm)=0.123111, mflops=40.6139 (err=1.2e-14) 28. NR (C): elapsed time t=1.62 s, 1024 iters, t-(init.)=1.6 s t(norm)=0.0693581, mflops=72.0896 (err=1.5e-15) 29. 
NR (F): elapsed time t=1.58 s, 1024 iters, t-(init.)=1.56 s t(norm)=0.0676242, mflops=73.9381 (err=1.5e-15) 30. Ooura (C): elapsed time t=1.44 s, 2048 iters, t-(init.)=1.39 s t(norm)=0.0301274, mflops=165.962 (err=1.4e-15) 31. Ooura (F): elapsed time t=1.38 s, 2048 iters, t-(init.)=1.34 s t(norm)=0.0290437, mflops=172.154 (err=1.4e-15) 32. Ransom: elapsed time t=1.94 s, 1024 iters, t-(init.)=1.92 s t(norm)=0.0832298, mflops=60.0747 (err=2.0e-15) 33. SCIPORT: elapsed time t=1.32 s, 512 iters, t-(init.)=1.31 s t(norm)=0.113574, mflops=44.0242 (err=1.4e-15) 34. Singleton: elapsed time t=1.74 s, 2048 iters, t-(init.)=1.69 s t(norm)=0.0366298, mflops=136.501 (err=1.9e-15) 35. Singleton (f2c): elapsed time t=1.78 s, 2048 iters, t-(init.)=1.74 s t(norm)=0.0377135, mflops=132.579 (err=1.9e-15) 36. Sorensen: elapsed time t=1.14 s, 1024 iters, t-(init.)=1.12 s t(norm)=0.0485507, mflops=102.985 (err=1.4e-15) 37. Sorensen DIT: elapsed time t=1.03 s, 512 iters, t-(init.)=1.02 s t(norm)=0.0884316, mflops=56.5409 (err=1.4e-15) 38. Temperton: elapsed time t=1.1 s, 512 iters, t-(init.)=1.08 s t(norm)=0.0936335, mflops=53.3997 (err=1.1e-07) 39. Temperton (f2c): elapsed time t=1.11 s, 512 iters, t-(init.)=1.1 s t(norm)=0.0953674, mflops=52.4288 (err=1.4e-15) 40. Valkenburg: elapsed time t=1.47 s, 64 iters, t-(init.)=1.47 s t(norm)=1.01956, mflops=4.90405 (err=1.7e-15) 41. SCSL: elapsed time t=1.74 s, 2048 iters, t-(init.)=1.69 s t(norm)=0.0366298, mflops=136.501 (err=1.4e-15) 42. SGIMATH: elapsed time t=1.74 s, 2048 iters, t-(init.)=1.7 s t(norm)=0.0368465, mflops=135.698 (err=1.4e-15) Top mflops for N=2048 = 193.854 Normalized results and averages for N=2048: fft 0: mflops = 98.5841 (norm. = 0.508547), norm. avg. (of 11) = 0.437391 fft 1: mflops = 101.178 (norm. = 0.52193), norm. avg. (of 11) = 0.433507 fft 2: mflops = 81.2277 (norm. = 0.419014), norm. avg. (of 11) = 0.304482 fft 3: mflops = 42.0961 (norm. = 0.217153), norm. avg. (of 11) = 0.0961687 fft 4: mflops = 35.1657 (norm. 
= 0.181402), norm. avg. (of 11) = 0.26481 fft 5: mflops = 28.2704 (norm. = 0.145833), norm. avg. (of 11) = 0.10908 fft 6: mflops = 94.5437 (norm. = 0.487705), norm. avg. (of 11) = 0.226147 fft 7: mflops = 62.6866 (norm. = 0.32337), norm. avg. (of 11) = 0.170744 fft 8: mflops = 63.3755 (norm. = 0.326923), norm. avg. (of 11) = 0.228199 fft 9: mflops = 178.827 (norm. = 0.922481), norm. avg. (of 11) = 0.427802 fft 10: mflops = 193.854 (norm. = 1), norm. avg. (of 11) = 0.440876 fft 11: mflops = 67.4523 (norm. = 0.347953), norm. avg. (of 10) = 0.245914 fft 12: mflops = 66.6725 (norm. = 0.343931), norm. avg. (of 11) = 0.423723 fft 13: mflops = 28.2704 (norm. = 0.145833), norm. avg. (of 11) = 0.122206 fft 14: mflops = 162.455 (norm. = 0.838028), norm. avg. (of 11) = 0.723652 fft 15: mflops = 152.773 (norm. = 0.788079), norm. avg. (of 11) = 0.67922 fft 16: mflops = 102.074 (norm. = 0.526549), norm. avg. (of 11) = 0.792764 fft 17: mflops = 174.763 (norm. = 0.901515), norm. avg. (of 9) = 0.611883 fft 18: mflops = 59.4553 (norm. = 0.306701), norm. avg. (of 11) = 0.310967 fft 19: mflops = 67.4523 (norm. = 0.347953), norm. avg. (of 11) = 0.202162 fft 20: mflops = 66.2893 (norm. = 0.341954), norm. avg. (of 11) = 0.20451 fft 21: mflops = 80.6597 (norm. = 0.416084), norm. avg. (of 11) = 0.594114 fft 22: mflops = 90.8215 (norm. = 0.468504), norm. avg. (of 10) = 0.29025 fft 23: mflops = 116.508 (norm. = 0.60101), norm. avg. (of 10) = 0.356719 fft 24: mflops = 110.907 (norm. = 0.572115), norm. avg. (of 10) = 0.342346 fft 25: mflops = 50.5892 (norm. = 0.260965), norm. avg. (of 10) = 0.131153 fft 26: mflops = 32.0398 (norm. = 0.165278), norm. avg. (of 11) = 0.137012 fft 27: mflops = 40.6139 (norm. = 0.209507), norm. avg. (of 11) = 0.120236 fft 28: mflops = 72.0896 (norm. = 0.371875), norm. avg. (of 11) = 0.206494 fft 29: mflops = 73.9381 (norm. = 0.38141), norm. avg. (of 11) = 0.183197 fft 30: mflops = 165.962 (norm. = 0.856115), norm. avg. 
(of 11) = 0.687909 fft 31: mflops = 172.154 (norm. = 0.88806), norm. avg. (of 11) = 0.595124 fft 32: mflops = 60.0747 (norm. = 0.309896), norm. avg. (of 10) = 0.137941 fft 33: mflops = 44.0242 (norm. = 0.227099), norm. avg. (of 10) = 0.331368 fft 34: mflops = 136.501 (norm. = 0.704142), norm. avg. (of 11) = 0.345537 fft 35: mflops = 132.579 (norm. = 0.683908), norm. avg. (of 11) = 0.345819 fft 36: mflops = 102.985 (norm. = 0.53125), norm. avg. (of 11) = 0.418188 fft 37: mflops = 56.5409 (norm. = 0.291667), norm. avg. (of 11) = 0.223223 fft 38: mflops = 53.3997 (norm. = 0.275463), norm. avg. (of 11) = 0.154176 fft 39: mflops = 52.4288 (norm. = 0.270455), norm. avg. (of 11) = 0.144299 fft 40: mflops = 4.90405 (norm. = 0.0252976), norm. avg. (of 11) = 0.0370105 fft 41: mflops = 136.501 (norm. = 0.704142), norm. avg. (of 11) = 0.736177 fft 42: mflops = 135.698 (norm. = 0.7), norm. avg. (of 11) = 0.723927 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.33 s, 256 iters, t-(init.)=1.3 s t(norm)=0.103315, mflops=48.3958 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.29 s, 256 iters, t-(init.)=1.26 s t(norm)=0.100136, mflops=49.9322 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.46 s, 256 iters, t-(init.)=1.42 s t(norm)=0.112851, mflops=44.306 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.51 s, 256 iters, t-(init.)=1.47 s t(norm)=0.116825, mflops=42.799 (err=3.7e-15) 4. Bailey: elapsed time t=1.49 s, 128 iters, t-(init.)=1.48 s t(norm)=0.23524, mflops=21.2549 (err=3.7e-15) 5. Beauregard: elapsed time t=1.29 s, 128 iters, t-(init.)=1.28 s t(norm)=0.203451, mflops=24.576 (err=3.8e-15) 6. Bergland: elapsed time t=1.96 s, 512 iters, t-(init.)=1.9 s t(norm)=0.0754992, mflops=66.2259 (err=3.9e-15) 7. Brenner: elapsed time t=1.4 s, 256 iters, t-(init.)=1.37 s t(norm)=0.108878, mflops=45.923 (err=3.8e-15) 8. Burrus: elapsed time t=1.7 s, 256 iters, t-(init.)=1.67 s t(norm)=0.13272, mflops=37.6734 (err=3.7e-15) 9. 
CWP (min N) (N=4290): elapsed time t=1.77 s, 1024 iters, t-(init.)=1.63 s t(norm)=0.0323852, mflops=154.392 10. CWP (best N) (N=4368): elapsed time t=1.64 s, 1024 iters, t-(init.)=1.5 s t(norm)=0.0298023, mflops=167.772 11. Edelblute: elapsed time t=1.61 s, 256 iters, t-(init.)=1.58 s t(norm)=0.125567, mflops=39.8193 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.24 s, 256 iters, t-(init.)=1.21 s t(norm)=0.0961622, mflops=51.9955 (err=3.8e-15) 13. FFTPACK (f2c): elapsed time t=1.01 s, 128 iters, t-(init.)=0.99 s t(norm)=0.157356, mflops=31.775 (err=3.8e-15) FFTW_MEASURE plan: (cost = 2.109375e-03) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.09 s, 512 iters, t-(init.)=1.03 s t(norm)=0.0409285, mflops=122.164 (err=3.8e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.06 s, 512 iters, t-(init.)=0.99 s t(norm)=0.0393391, mflops=127.1 (err=3.8e-15) 16. Frigo-old: elapsed time t=1.58 s, 512 iters, t-(init.)=1.51 s t(norm)=0.060002, mflops=83.3305 (err=3.8e-15) 17. Green: elapsed time t=1.25 s, 512 iters, t-(init.)=1.18 s t(norm)=0.046889, mflops=106.635 (err=3.8e-15) 18. GSL: elapsed time t=1.38 s, 256 iters, t-(init.)=1.34 s t(norm)=0.106494, mflops=46.9512 (err=3.8e-15) 19. GSL DIT: elapsed time t=1.67 s, 256 iters, t-(init.)=1.64 s t(norm)=0.130335, mflops=38.3625 (err=4.1e-15) 20. GSL DIF: elapsed time t=1.71 s, 256 iters, t-(init.)=1.68 s t(norm)=0.133514, mflops=37.4491 (err=4.3e-15) 21. Krukar: elapsed time t=1.09 s, 256 iters, t-(init.)=1.06 s t(norm)=0.0842412, mflops=59.3534 (err=3.8e-15) 22. Mayer (Buneman): elapsed time t=1.5 s, 512 iters, t-(init.)=1.43 s t(norm)=0.0568231, mflops=87.9924 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.21 s, 512 iters, t-(init.)=1.15 s t(norm)=0.0456969, mflops=109.417 24. Mayer (lookup): elapsed time t=1.33 s, 512 iters, t-(init.)=1.26 s t(norm)=0.0500679, mflops=99.8644 (err=3.7e-15) 25. 
Monro: elapsed time t=1.9 s, 256 iters, t-(init.)=1.87 s t(norm)=0.148614, mflops=33.6441 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.95 s, 256 iters, t-(init.)=1.92 s t(norm)=0.152588, mflops=32.768 (err=4.9e-14) 27. Nielsen: elapsed time t=1.72 s, 256 iters, t-(init.)=1.69 s t(norm)=0.134309, mflops=37.2276 (err=2.6e-14) 28. NR (C): elapsed time t=1.67 s, 256 iters, t-(init.)=1.64 s t(norm)=0.130335, mflops=38.3625 (err=3.9e-15) 29. NR (F): elapsed time t=1.55 s, 256 iters, t-(init.)=1.52 s t(norm)=0.120799, mflops=41.3912 (err=3.9e-15) 30. Ooura (C): elapsed time t=1.35 s, 512 iters, t-(init.)=1.28 s t(norm)=0.0508626, mflops=98.304 (err=3.9e-15) 31. Ooura (F): elapsed time t=1.39 s, 512 iters, t-(init.)=1.32 s t(norm)=0.0524521, mflops=95.3251 (err=3.9e-15) 32. Ransom: elapsed time t=1.03 s, 256 iters, t-(init.)=1 s t(norm)=0.0794729, mflops=62.9146 (err=4.3e-15) 33. SCIPORT: elapsed time t=1.52 s, 256 iters, t-(init.)=1.48 s t(norm)=0.11762, mflops=42.5098 (err=3.8e-15) 34. Singleton: elapsed time t=1.88 s, 512 iters, t-(init.)=1.81 s t(norm)=0.0719229, mflops=69.5189 (err=5.8e-15) 35. Singleton (f2c): elapsed time t=1.85 s, 512 iters, t-(init.)=1.78 s t(norm)=0.0707308, mflops=70.6905 (err=5.8e-15) 36. Sorensen: elapsed time t=1.05 s, 256 iters, t-(init.)=1.01 s t(norm)=0.0802676, mflops=62.2916 (err=3.7e-15) 37. Sorensen DIT: elapsed time t=1.86 s, 256 iters, t-(init.)=1.83 s t(norm)=0.145435, mflops=34.3795 (err=3.7e-15) 38. Temperton: elapsed time t=1.63 s, 256 iters, t-(init.)=1.59 s t(norm)=0.126362, mflops=39.5689 (err=1.2e-07) 39. Temperton (f2c): elapsed time t=1.61 s, 256 iters, t-(init.)=1.58 s t(norm)=0.125567, mflops=39.8193 (err=3.8e-15) 40. Valkenburg: elapsed time t=1.34 s, 32 iters, t-(init.)=1.34 s t(norm)=0.851949, mflops=5.8689 (err=4.0e-15) 41. SCSL: elapsed time t=1.93 s, 1024 iters, t-(init.)=1.79 s t(norm)=0.0355641, mflops=140.591 (err=3.8e-15) 42. 
SGIMATH: elapsed time t=1.93 s, 1024 iters, t-(init.)=1.8 s t(norm)=0.0357628, mflops=139.81 (err=3.8e-15) Top mflops for N=4096 = 167.772 Normalized results and averages for N=4096: fft 0: mflops = 48.3958 (norm. = 0.288462), norm. avg. (of 12) = 0.424981 fft 1: mflops = 49.9322 (norm. = 0.297619), norm. avg. (of 12) = 0.422183 fft 2: mflops = 44.306 (norm. = 0.264085), norm. avg. (of 12) = 0.301116 fft 3: mflops = 42.799 (norm. = 0.255102), norm. avg. (of 12) = 0.109413 fft 4: mflops = 21.2549 (norm. = 0.126689), norm. avg. (of 12) = 0.2533 fft 5: mflops = 24.576 (norm. = 0.146484), norm. avg. (of 12) = 0.112197 fft 6: mflops = 66.2259 (norm. = 0.394737), norm. avg. (of 12) = 0.240196 fft 7: mflops = 45.923 (norm. = 0.273723), norm. avg. (of 12) = 0.179326 fft 8: mflops = 37.6734 (norm. = 0.224551), norm. avg. (of 12) = 0.227895 fft 9: mflops = 154.392 (norm. = 0.920245), norm. avg. (of 12) = 0.468839 fft 10: mflops = 167.772 (norm. = 1), norm. avg. (of 12) = 0.48747 fft 11: mflops = 39.8193 (norm. = 0.237342), norm. avg. (of 11) = 0.245135 fft 12: mflops = 51.9955 (norm. = 0.309917), norm. avg. (of 12) = 0.414239 fft 13: mflops = 31.775 (norm. = 0.189394), norm. avg. (of 12) = 0.127805 fft 14: mflops = 122.164 (norm. = 0.728155), norm. avg. (of 12) = 0.724027 fft 15: mflops = 127.1 (norm. = 0.757576), norm. avg. (of 12) = 0.68575 fft 16: mflops = 83.3305 (norm. = 0.496689), norm. avg. (of 12) = 0.768091 fft 17: mflops = 106.635 (norm. = 0.635593), norm. avg. (of 10) = 0.614254 fft 18: mflops = 46.9512 (norm. = 0.279851), norm. avg. (of 12) = 0.308374 fft 19: mflops = 38.3625 (norm. = 0.228659), norm. avg. (of 12) = 0.204371 fft 20: mflops = 37.4491 (norm. = 0.223214), norm. avg. (of 12) = 0.206069 fft 21: mflops = 59.3534 (norm. = 0.353774), norm. avg. (of 12) = 0.574086 fft 22: mflops = 87.9924 (norm. = 0.524476), norm. avg. (of 11) = 0.311544 fft 23: mflops = 109.417 (norm. = 0.652174), norm. avg. (of 11) = 0.383579 fft 24: mflops = 99.8644 (norm. 
= 0.595238), norm. avg. (of 11) = 0.365336 fft 25: mflops = 33.6441 (norm. = 0.200535), norm. avg. (of 11) = 0.137461 fft 26: mflops = 32.768 (norm. = 0.195313), norm. avg. (of 12) = 0.14187 fft 27: mflops = 37.2276 (norm. = 0.221893), norm. avg. (of 12) = 0.128708 fft 28: mflops = 38.3625 (norm. = 0.228659), norm. avg. (of 12) = 0.208341 fft 29: mflops = 41.3912 (norm. = 0.246711), norm. avg. (of 12) = 0.18849 fft 30: mflops = 98.304 (norm. = 0.585938), norm. avg. (of 12) = 0.679412 fft 31: mflops = 95.3251 (norm. = 0.568182), norm. avg. (of 12) = 0.592879 fft 32: mflops = 62.9146 (norm. = 0.375), norm. avg. (of 11) = 0.159491 fft 33: mflops = 42.5098 (norm. = 0.253378), norm. avg. (of 11) = 0.324278 fft 34: mflops = 69.5189 (norm. = 0.414365), norm. avg. (of 12) = 0.351272 fft 35: mflops = 70.6905 (norm. = 0.421348), norm. avg. (of 12) = 0.352114 fft 36: mflops = 62.2916 (norm. = 0.371287), norm. avg. (of 12) = 0.41428 fft 37: mflops = 34.3795 (norm. = 0.204918), norm. avg. (of 12) = 0.221697 fft 38: mflops = 39.5689 (norm. = 0.235849), norm. avg. (of 12) = 0.160982 fft 39: mflops = 39.8193 (norm. = 0.237342), norm. avg. (of 12) = 0.152053 fft 40: mflops = 5.8689 (norm. = 0.0349813), norm. avg. (of 12) = 0.0368414 fft 41: mflops = 140.591 (norm. = 0.837989), norm. avg. (of 12) = 0.744661 fft 42: mflops = 139.81 (norm. = 0.833333), norm. avg. (of 12) = 0.733044 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.42 s, 128 iters, t-(init.)=1.39 s t(norm)=0.10197, mflops=49.0341 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.39 s, 128 iters, t-(init.)=1.36 s t(norm)=0.099769, mflops=50.1158 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.79 s, 128 iters, t-(init.)=1.76 s t(norm)=0.129113, mflops=38.7258 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.73 s, 128 iters, t-(init.)=1.7 s t(norm)=0.124711, mflops=40.0926 (err=3.7e-15) 4. 
Bailey: elapsed time t=1.63 s, 64 iters, t-(init.)=1.62 s t(norm)=0.237685, mflops=21.0362 (err=3.7e-15) 5. Beauregard: elapsed time t=1.41 s, 64 iters, t-(init.)=1.4 s t(norm)=0.205407, mflops=24.3419 (err=3.7e-15) 6. Bergland: elapsed time t=1.14 s, 128 iters, t-(init.)=1.11 s t(norm)=0.0814291, mflops=61.4031 (err=3.7e-15) 7. Brenner: elapsed time t=1.54 s, 128 iters, t-(init.)=1.51 s t(norm)=0.110773, mflops=45.1374 (err=3.7e-15) 8. Burrus: elapsed time t=1.01 s, 64 iters, t-(init.)=0.99 s t(norm)=0.145252, mflops=34.4229 (err=3.7e-15) 9. CWP (min N) (N=8580): elapsed time t=1.9 s, 512 iters, t-(init.)=1.76 s t(norm)=0.0322782, mflops=154.903 10. CWP (best N) (N=9240): elapsed time t=1.78 s, 512 iters, t-(init.)=1.63 s t(norm)=0.029894, mflops=167.258 11. Edelblute: elapsed time t=1.89 s, 128 iters, t-(init.)=1.86 s t(norm)=0.136449, mflops=36.6438 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.57 s, 128 iters, t-(init.)=1.54 s t(norm)=0.112974, mflops=44.2581 (err=3.7e-15) 13. FFTPACK (f2c): elapsed time t=1.4 s, 64 iters, t-(init.)=1.39 s t(norm)=0.20394, mflops=24.5171 (err=3.7e-15) FFTW_MEASURE plan: (cost = 4.531250e-03) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 14. FFTW: elapsed time t=1.22 s, 256 iters, t-(init.)=1.15 s t(norm)=0.0421817, mflops=118.535 (err=3.7e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.22 s, 256 iters, t-(init.)=1.15 s t(norm)=0.0421817, mflops=118.535 (err=3.7e-15) 16. Frigo-old: elapsed time t=1.66 s, 256 iters, t-(init.)=1.6 s t(norm)=0.0586877, mflops=85.1968 (err=3.7e-15) 17. Green: elapsed time t=1.5 s, 256 iters, t-(init.)=1.43 s t(norm)=0.0524521, mflops=95.3251 (err=3.7e-15) 18. GSL: elapsed time t=1.69 s, 128 iters, t-(init.)=1.66 s t(norm)=0.121777, mflops=41.0587 (err=3.7e-15) 19. GSL DIT: elapsed time t=1.91 s, 128 iters, t-(init.)=1.88 s t(norm)=0.137916, mflops=36.254 (err=4.3e-15) 20. 
GSL DIF: elapsed time t=1.92 s, 128 iters, t-(init.)=1.89 s t(norm)=0.13865, mflops=36.0621 (err=4.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.27 s, 128 iters, t-(init.)=1.24 s t(norm)=0.0909659, mflops=54.9657 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.14 s, 128 iters, t-(init.)=1.11 s t(norm)=0.0814291, mflops=61.4031 24. Mayer (lookup): elapsed time t=1.23 s, 128 iters, t-(init.)=1.2 s t(norm)=0.0880315, mflops=56.7979 (err=3.7e-15) 25. Monro: elapsed time t=1.04 s, 64 iters, t-(init.)=1.02 s t(norm)=0.149654, mflops=33.4105 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.2 s, 64 iters, t-(init.)=1.18 s t(norm)=0.173129, mflops=28.8803 (err=4.5e-14) 27. Nielsen: elapsed time t=1.1 s, 64 iters, t-(init.)=1.08 s t(norm)=0.158457, mflops=31.5544 (err=1.1e-14) 28. NR (C): elapsed time t=1.87 s, 128 iters, t-(init.)=1.83 s t(norm)=0.134248, mflops=37.2445 (err=3.8e-15) 29. NR (F): elapsed time t=1.77 s, 128 iters, t-(init.)=1.74 s t(norm)=0.127646, mflops=39.1709 (err=3.8e-15) 30. Ooura (C): elapsed time t=1.5 s, 256 iters, t-(init.)=1.44 s t(norm)=0.0528189, mflops=94.6631 (err=3.7e-15) 31. Ooura (F): elapsed time t=1.53 s, 256 iters, t-(init.)=1.47 s t(norm)=0.0539193, mflops=92.7312 (err=3.7e-15) 32. Ransom: elapsed time t=1.19 s, 128 iters, t-(init.)=1.16 s t(norm)=0.0850971, mflops=58.7564 (err=4.8e-15) 33. SCIPORT: elapsed time t=1.71 s, 128 iters, t-(init.)=1.67 s t(norm)=0.12251, mflops=40.8128 (err=3.7e-15) 34. Singleton: elapsed time t=1.09 s, 128 iters, t-(init.)=1.06 s t(norm)=0.0777611, mflops=64.2995 (err=5.6e-15) 35. Singleton (f2c): elapsed time t=1.09 s, 128 iters, t-(init.)=1.06 s t(norm)=0.0777611, mflops=64.2995 (err=5.6e-15) 36. Sorensen: elapsed time t=1.33 s, 128 iters, t-(init.)=1.29 s t(norm)=0.0946338, mflops=52.8352 (err=3.7e-15) 37. Sorensen DIT: elapsed time t=1.11 s, 64 iters, t-(init.)=1.1 s t(norm)=0.161391, mflops=30.9807 (err=3.7e-15) 38. 
Temperton: elapsed time t=1.92 s, 128 iters, t-(init.)=1.89 s t(norm)=0.13865, mflops=36.0621 (err=1.4e-07) 39. Temperton (f2c): elapsed time t=1.94 s, 128 iters, t-(init.)=1.91 s t(norm)=0.140117, mflops=35.6845 (err=3.7e-15) 40. Valkenburg: elapsed time t=1.41 s, 16 iters, t-(init.)=1.4 s t(norm)=0.821627, mflops=6.08549 (err=3.8e-15) 41. SCSL: elapsed time t=1.93 s, 512 iters, t-(init.)=1.8 s t(norm)=0.0330118, mflops=151.461 (err=3.7e-15) 42. SGIMATH: elapsed time t=1.89 s, 512 iters, t-(init.)=1.75 s t(norm)=0.0320948, mflops=155.788 (err=3.7e-15) Top mflops for N=8192 = 167.258 Normalized results and averages for N=8192: fft 0: mflops = 49.0341 (norm. = 0.293165), norm. avg. (of 13) = 0.414841 fft 1: mflops = 50.1158 (norm. = 0.299632), norm. avg. (of 13) = 0.412756 fft 2: mflops = 38.7258 (norm. = 0.231534), norm. avg. (of 13) = 0.295763 fft 3: mflops = 40.0926 (norm. = 0.239706), norm. avg. (of 13) = 0.119436 fft 4: mflops = 21.0362 (norm. = 0.125772), norm. avg. (of 13) = 0.24349 fft 5: mflops = 24.3419 (norm. = 0.145536), norm. avg. (of 13) = 0.114761 fft 6: mflops = 61.4031 (norm. = 0.367117), norm. avg. (of 13) = 0.249959 fft 7: mflops = 45.1374 (norm. = 0.269868), norm. avg. (of 13) = 0.186291 fft 8: mflops = 34.4229 (norm. = 0.205808), norm. avg. (of 13) = 0.226196 fft 9: mflops = 154.903 (norm. = 0.926136), norm. avg. (of 13) = 0.504016 fft 10: mflops = 167.258 (norm. = 1), norm. avg. (of 13) = 0.526895 fft 11: mflops = 36.6438 (norm. = 0.219086), norm. avg. (of 12) = 0.242964 fft 12: mflops = 44.2581 (norm. = 0.26461), norm. avg. (of 13) = 0.402729 fft 13: mflops = 24.5171 (norm. = 0.146583), norm. avg. (of 13) = 0.12925 fft 14: mflops = 118.535 (norm. = 0.708696), norm. avg. (of 13) = 0.722848 fft 15: mflops = 118.535 (norm. = 0.708696), norm. avg. (of 13) = 0.687515 fft 16: mflops = 85.1968 (norm. = 0.509375), norm. avg. (of 13) = 0.74819 fft 17: mflops = 95.3251 (norm. = 0.56993), norm. avg. (of 11) = 0.610225 fft 18: mflops = 41.0587 (norm. 
= 0.245482), norm. avg. (of 13) = 0.303537 fft 19: mflops = 36.254 (norm. = 0.216755), norm. avg. (of 13) = 0.205323 fft 20: mflops = 36.0621 (norm. = 0.215608), norm. avg. (of 13) = 0.206803 fft 21: mflops = -1 (norm. = -0.0059788), norm. avg. (of 12) = 0.574086 fft 22: mflops = 54.9657 (norm. = 0.328629), norm. avg. (of 12) = 0.312967 fft 23: mflops = 61.4031 (norm. = 0.367117), norm. avg. (of 12) = 0.382207 fft 24: mflops = 56.7979 (norm. = 0.339583), norm. avg. (of 12) = 0.36319 fft 25: mflops = 33.4105 (norm. = 0.199755), norm. avg. (of 12) = 0.142652 fft 26: mflops = 28.8803 (norm. = 0.172669), norm. avg. (of 13) = 0.14424 fft 27: mflops = 31.5544 (norm. = 0.188657), norm. avg. (of 13) = 0.133319 fft 28: mflops = 37.2445 (norm. = 0.222678), norm. avg. (of 13) = 0.209444 fft 29: mflops = 39.1709 (norm. = 0.234195), norm. avg. (of 13) = 0.192006 fft 30: mflops = 94.6631 (norm. = 0.565972), norm. avg. (of 13) = 0.670686 fft 31: mflops = 92.7312 (norm. = 0.554422), norm. avg. (of 13) = 0.58992 fft 32: mflops = 58.7564 (norm. = 0.351293), norm. avg. (of 12) = 0.175475 fft 33: mflops = 40.8128 (norm. = 0.244012), norm. avg. (of 12) = 0.317589 fft 34: mflops = 64.2995 (norm. = 0.384434), norm. avg. (of 13) = 0.353823 fft 35: mflops = 64.2995 (norm. = 0.384434), norm. avg. (of 13) = 0.3546 fft 36: mflops = 52.8352 (norm. = 0.315891), norm. avg. (of 13) = 0.406711 fft 37: mflops = 30.9807 (norm. = 0.185227), norm. avg. (of 13) = 0.218892 fft 38: mflops = 36.0621 (norm. = 0.215608), norm. avg. (of 13) = 0.165184 fft 39: mflops = 35.6845 (norm. = 0.213351), norm. avg. (of 13) = 0.156768 fft 40: mflops = 6.08549 (norm. = 0.0363839), norm. avg. (of 13) = 0.0368062 fft 41: mflops = 151.461 (norm. = 0.905556), norm. avg. (of 13) = 0.757038 fft 42: mflops = 155.788 (norm. = 0.931429), norm. avg. (of 13) = 0.748304 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.68 s, 64 iters, t-(init.)=1.64 s t(norm)=0.111716, mflops=44.7563 (err=6.8e-15) 1. 
Arndt DIT: elapsed time t=1.62 s, 64 iters, t-(init.)=1.59 s t(norm)=0.10831, mflops=46.1637 (err=6.8e-15) 2. Arndt Split-Radix: elapsed time t=1 s, 32 iters, t-(init.)=0.98 s t(norm)=0.133514, mflops=37.4491 (err=6.8e-15) 3. Arndt 4-step: elapsed time t=1.56 s, 64 iters, t-(init.)=1.53 s t(norm)=0.104223, mflops=47.9741 (err=6.8e-15) 4. Bailey: elapsed time t=1.85 s, 32 iters, t-(init.)=1.84 s t(norm)=0.25068, mflops=19.9457 (err=6.8e-15) 5. Beauregard: elapsed time t=1.52 s, 32 iters, t-(init.)=1.5 s t(norm)=0.204359, mflops=24.4668 (err=6.8e-15) 6. Bergland: elapsed time t=1.22 s, 64 iters, t-(init.)=1.19 s t(norm)=0.0810623, mflops=61.6809 (err=6.8e-15) 7. Brenner: elapsed time t=1.68 s, 64 iters, t-(init.)=1.65 s t(norm)=0.112397, mflops=44.485 (err=6.8e-15) 8. Burrus: elapsed time t=1.13 s, 32 iters, t-(init.)=1.12 s t(norm)=0.152588, mflops=32.768 (err=6.8e-15) 9. CWP (min N) (N=17160): elapsed time t=1.92 s, 256 iters, t-(init.)=1.78 s t(norm)=0.0303132, mflops=164.945 10. CWP (best N) (N=17160): elapsed time t=1.92 s, 256 iters, t-(init.)=1.78 s t(norm)=0.0303132, mflops=164.945 11. Edelblute: elapsed time t=1.04 s, 32 iters, t-(init.)=1.02 s t(norm)=0.138964, mflops=35.9805 (err=6.8e-15) 12. FFTPACK: elapsed time t=1.62 s, 64 iters, t-(init.)=1.59 s t(norm)=0.10831, mflops=46.1637 (err=6.8e-15) 13. FFTPACK (f2c): elapsed time t=1.37 s, 32 iters, t-(init.)=1.36 s t(norm)=0.185285, mflops=26.9854 (err=6.8e-15) FFTW_MEASURE plan: (cost = 1.031250e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.36 s, 128 iters, t-(init.)=1.29 s t(norm)=0.0439371, mflops=113.799 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.45 s, 128 iters, t-(init.)=1.38 s t(norm)=0.0470025, mflops=106.377 (err=6.8e-15) 16. Frigo-old: elapsed time t=1.98 s, 128 iters, t-(init.)=1.91 s t(norm)=0.0650542, mflops=76.859 (err=6.8e-15) 17. 
Green: elapsed time t=1.66 s, 128 iters, t-(init.)=1.59 s t(norm)=0.0541551, mflops=92.3274 (err=6.8e-15) 18. GSL: elapsed time t=1.75 s, 64 iters, t-(init.)=1.72 s t(norm)=0.117166, mflops=42.6746 (err=6.8e-15) 19. GSL DIT: elapsed time t=1.07 s, 32 iters, t-(init.)=1.05 s t(norm)=0.143051, mflops=34.9525 (err=7.2e-15) 20. GSL DIF: elapsed time t=1.08 s, 32 iters, t-(init.)=1.07 s t(norm)=0.145776, mflops=34.2992 (err=7.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.41 s, 64 iters, t-(init.)=1.37 s t(norm)=0.0933238, mflops=53.5769 (err=6.8e-15) 23. Mayer (simple): elapsed time t=1.32 s, 64 iters, t-(init.)=1.29 s t(norm)=0.0878743, mflops=56.8995 24. Mayer (lookup): elapsed time t=1.34 s, 64 iters, t-(init.)=1.3 s t(norm)=0.0885555, mflops=56.4618 (err=6.8e-15) 25. Monro: elapsed time t=1.16 s, 32 iters, t-(init.)=1.14 s t(norm)=0.155313, mflops=32.1931 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.27 s, 32 iters, t-(init.)=1.25 s t(norm)=0.170299, mflops=29.3601 (err=2.3e-13) 27. Nielsen: elapsed time t=1.12 s, 32 iters, t-(init.)=1.1 s t(norm)=0.149863, mflops=33.3638 (err=1.4e-13) 28. NR (C): elapsed time t=1.03 s, 32 iters, t-(init.)=1.02 s t(norm)=0.138964, mflops=35.9805 (err=7.0e-15) 29. NR (F): elapsed time t=1 s, 32 iters, t-(init.)=0.98 s t(norm)=0.133514, mflops=37.4491 (err=7.0e-15) 30. Ooura (C): elapsed time t=1.79 s, 128 iters, t-(init.)=1.73 s t(norm)=0.0589234, mflops=84.8559 (err=6.8e-15) 31. Ooura (F): elapsed time t=1.89 s, 128 iters, t-(init.)=1.82 s t(norm)=0.0619888, mflops=80.6597 (err=6.8e-15) 32. Ransom: elapsed time t=1.09 s, 64 iters, t-(init.)=1.06 s t(norm)=0.0722068, mflops=69.2456 (err=7.3e-15) 33. SCIPORT: elapsed time t=1.91 s, 64 iters, t-(init.)=1.88 s t(norm)=0.128065, mflops=39.0427 (err=6.8e-15) 34. Singleton: elapsed time t=1.23 s, 64 iters, t-(init.)=1.2 s t(norm)=0.0817435, mflops=61.1669 (err=1.0e-14) 35. 
Singleton (f2c): elapsed time t=1.19 s, 64 iters, t-(init.)=1.15 s t(norm)=0.0783375, mflops=63.8264 (err=1.0e-14) 36. Sorensen: elapsed time t=1.68 s, 64 iters, t-(init.)=1.64 s t(norm)=0.111716, mflops=44.7563 (err=6.8e-15) 37. Sorensen DIT: elapsed time t=1.23 s, 32 iters, t-(init.)=1.21 s t(norm)=0.164849, mflops=30.3307 (err=6.8e-15) 38. Temperton: elapsed time t=1.01 s, 32 iters, t-(init.)=0.99 s t(norm)=0.134877, mflops=37.0709 (err=1.5e-07) 39. Temperton (f2c): elapsed time t=1.02 s, 32 iters, t-(init.)=1 s t(norm)=0.136239, mflops=36.7002 (err=6.8e-15) 40. Valkenburg: elapsed time t=1.51 s, 8 iters, t-(init.)=1.51 s t(norm)=0.822885, mflops=6.07619 (err=6.9e-15) 41. SCSL: elapsed time t=1.03 s, 128 iters, t-(init.)=0.96 s t(norm)=0.0326974, mflops=152.917 (err=6.8e-15) 42. SGIMATH: elapsed time t=1.02 s, 128 iters, t-(init.)=0.95 s t(norm)=0.0323568, mflops=154.527 (err=6.8e-15) Top mflops for N=16384 = 164.945 Normalized results and averages for N=16384: fft 0: mflops = 44.7563 (norm. = 0.271341), norm. avg. (of 14) = 0.404591 fft 1: mflops = 46.1637 (norm. = 0.279874), norm. avg. (of 14) = 0.403264 fft 2: mflops = 37.4491 (norm. = 0.227041), norm. avg. (of 14) = 0.290854 fft 3: mflops = 47.9741 (norm. = 0.29085), norm. avg. (of 14) = 0.13168 fft 4: mflops = 19.9457 (norm. = 0.120924), norm. avg. (of 14) = 0.234735 fft 5: mflops = 24.4668 (norm. = 0.148333), norm. avg. (of 14) = 0.117159 fft 6: mflops = 61.6809 (norm. = 0.37395), norm. avg. (of 14) = 0.258816 fft 7: mflops = 44.485 (norm. = 0.269697), norm. avg. (of 14) = 0.192248 fft 8: mflops = 32.768 (norm. = 0.198661), norm. avg. (of 14) = 0.224229 fft 9: mflops = 164.945 (norm. = 1), norm. avg. (of 14) = 0.539443 fft 10: mflops = 164.945 (norm. = 1), norm. avg. (of 14) = 0.560688 fft 11: mflops = 35.9805 (norm. = 0.218137), norm. avg. (of 13) = 0.241054 fft 12: mflops = 46.1637 (norm. = 0.279874), norm. avg. (of 14) = 0.393954 fft 13: mflops = 26.9854 (norm. = 0.163603), norm. avg. 
(of 14) = 0.131703 fft 14: mflops = 113.799 (norm. = 0.689922), norm. avg. (of 14) = 0.720496 fft 15: mflops = 106.377 (norm. = 0.644928), norm. avg. (of 14) = 0.684473 fft 16: mflops = 76.859 (norm. = 0.465969), norm. avg. (of 14) = 0.728031 fft 17: mflops = 92.3274 (norm. = 0.559748), norm. avg. (of 12) = 0.606018 fft 18: mflops = 42.6746 (norm. = 0.258721), norm. avg. (of 14) = 0.300335 fft 19: mflops = 34.9525 (norm. = 0.211905), norm. avg. (of 14) = 0.205793 fft 20: mflops = 34.2992 (norm. = 0.207944), norm. avg. (of 14) = 0.206884 fft 21: mflops = -1 (norm. = -0.00606264), norm. avg. (of 12) = 0.574086 fft 22: mflops = 53.5769 (norm. = 0.324818), norm. avg. (of 13) = 0.313879 fft 23: mflops = 56.8995 (norm. = 0.344961), norm. avg. (of 13) = 0.379342 fft 24: mflops = 56.4618 (norm. = 0.342308), norm. avg. (of 13) = 0.361584 fft 25: mflops = 32.1931 (norm. = 0.195175), norm. avg. (of 13) = 0.146692 fft 26: mflops = 29.3601 (norm. = 0.178), norm. avg. (of 14) = 0.146651 fft 27: mflops = 33.3638 (norm. = 0.202273), norm. avg. (of 14) = 0.138245 fft 28: mflops = 35.9805 (norm. = 0.218137), norm. avg. (of 14) = 0.210065 fft 29: mflops = 37.4491 (norm. = 0.227041), norm. avg. (of 14) = 0.194508 fft 30: mflops = 84.8559 (norm. = 0.514451), norm. avg. (of 14) = 0.659526 fft 31: mflops = 80.6597 (norm. = 0.489011), norm. avg. (of 14) = 0.582712 fft 32: mflops = 69.2456 (norm. = 0.419811), norm. avg. (of 13) = 0.19427 fft 33: mflops = 39.0427 (norm. = 0.236702), norm. avg. (of 13) = 0.311367 fft 34: mflops = 61.1669 (norm. = 0.370833), norm. avg. (of 14) = 0.355038 fft 35: mflops = 63.8264 (norm. = 0.386957), norm. avg. (of 14) = 0.356911 fft 36: mflops = 44.7563 (norm. = 0.271341), norm. avg. (of 14) = 0.397042 fft 37: mflops = 30.3307 (norm. = 0.183884), norm. avg. (of 14) = 0.216391 fft 38: mflops = 37.0709 (norm. = 0.224747), norm. avg. (of 14) = 0.169439 fft 39: mflops = 36.7002 (norm. = 0.2225), norm. avg. (of 14) = 0.161463 fft 40: mflops = 6.07619 (norm. 
= 0.0368377), norm. avg. (of 14) = 0.0368085 fft 41: mflops = 152.917 (norm. = 0.927083), norm. avg. (of 14) = 0.769184 fft 42: mflops = 154.527 (norm. = 0.936842), norm. avg. (of 14) = 0.761771 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.73 s, 32 iters, t-(init.)=1.69 s t(norm)=0.107447, mflops=46.5344 (err=1.4e-14) 1. Arndt DIT: elapsed time t=1.72 s, 32 iters, t-(init.)=1.68 s t(norm)=0.106812, mflops=46.8114 (err=1.4e-14) 2. Arndt Split-Radix: elapsed time t=1.12 s, 16 iters, t-(init.)=1.11 s t(norm)=0.141144, mflops=35.4249 (err=1.4e-14) 3. Arndt 4-step: elapsed time t=1.81 s, 32 iters, t-(init.)=1.78 s t(norm)=0.113169, mflops=44.1816 (err=1.4e-14) 4. Bailey: elapsed time t=1.96 s, 16 iters, t-(init.)=1.95 s t(norm)=0.247955, mflops=20.1649 (err=1.4e-14) 5. Beauregard: elapsed time t=1.64 s, 16 iters, t-(init.)=1.63 s t(norm)=0.207265, mflops=24.1237 (err=1.4e-14) 6. Bergland: elapsed time t=1.31 s, 32 iters, t-(init.)=1.28 s t(norm)=0.0813802, mflops=61.44 (err=1.4e-14) 7. Brenner: elapsed time t=1.82 s, 32 iters, t-(init.)=1.79 s t(norm)=0.113805, mflops=43.9347 (err=1.4e-14) 8. Burrus: elapsed time t=1.23 s, 16 iters, t-(init.)=1.21 s t(norm)=0.153859, mflops=32.4972 (err=1.4e-14) 9. CWP (min N) (N=34320): elapsed time t=1.04 s, 64 iters, t-(init.)=0.97 s t(norm)=0.0308355, mflops=162.151 10. CWP (best N) (N=34320): elapsed time t=1.03 s, 64 iters, t-(init.)=0.96 s t(norm)=0.0305176, mflops=163.84 11. Edelblute: elapsed time t=1.15 s, 16 iters, t-(init.)=1.13 s t(norm)=0.143687, mflops=34.7979 (err=1.4e-14) 12. FFTPACK: elapsed time t=1.78 s, 32 iters, t-(init.)=1.75 s t(norm)=0.111262, mflops=44.939 (err=1.4e-14) 13. FFTPACK (f2c): elapsed time t=1.55 s, 16 iters, t-(init.)=1.53 s t(norm)=0.19455, mflops=25.7004 (err=1.4e-14) FFTW_MEASURE plan: (cost = 2.250000e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. 
FFTW: elapsed time t=1.51 s, 64 iters, t-(init.)=1.45 s t(norm)=0.0460943, mflops=108.473 (err=1.4e-14) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.51 s, 64 iters, t-(init.)=1.44 s t(norm)=0.0457764, mflops=109.227 (err=1.4e-14) 16. Frigo-old: elapsed time t=1.2 s, 32 iters, t-(init.)=1.16 s t(norm)=0.0737508, mflops=67.7959 (err=1.4e-14) 17. Green: elapsed time t=1.83 s, 64 iters, t-(init.)=1.77 s t(norm)=0.0562668, mflops=88.8624 (err=1.4e-14) 18. GSL: elapsed time t=1.93 s, 32 iters, t-(init.)=1.9 s t(norm)=0.120799, mflops=41.3912 (err=1.4e-14) 19. GSL DIT: elapsed time t=1.19 s, 16 iters, t-(init.)=1.17 s t(norm)=0.148773, mflops=33.6082 (err=1.4e-14) 20. GSL DIF: elapsed time t=1.17 s, 16 iters, t-(init.)=1.15 s t(norm)=0.14623, mflops=34.1927 (err=1.4e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.59 s, 32 iters, t-(init.)=1.56 s t(norm)=0.0991821, mflops=50.4123 (err=1.4e-14) 23. Mayer (simple): elapsed time t=1.49 s, 32 iters, t-(init.)=1.46 s t(norm)=0.0928243, mflops=53.8652 24. Mayer (lookup): elapsed time t=1.52 s, 32 iters, t-(init.)=1.48 s t(norm)=0.0940959, mflops=53.1373 (err=1.4e-14) 25. Monro: elapsed time t=1.23 s, 16 iters, t-(init.)=1.22 s t(norm)=0.155131, mflops=32.2308 (err=1.5e-07) 26. NAPACK (f2c): elapsed time t=1.5 s, 16 iters, t-(init.)=1.48 s t(norm)=0.188192, mflops=26.5686 (err=5.6e-13) 27. Nielsen: elapsed time t=1.11 s, 16 iters, t-(init.)=1.09 s t(norm)=0.138601, mflops=36.0749 (err=2.3e-13) 28. NR (C): elapsed time t=1.13 s, 16 iters, t-(init.)=1.11 s t(norm)=0.141144, mflops=35.4249 (err=1.4e-14) 29. NR (F): elapsed time t=1.12 s, 16 iters, t-(init.)=1.11 s t(norm)=0.141144, mflops=35.4249 (err=1.4e-14) 30. Ooura (C): elapsed time t=1.91 s, 64 iters, t-(init.)=1.84 s t(norm)=0.058492, mflops=85.4817 (err=1.4e-14) 31. 
Ooura (F): elapsed time t=1.01 s, 32 iters, t-(init.)=0.97 s t(norm)=0.0616709, mflops=81.0755 (err=1.4e-14) 32. Ransom: elapsed time t=1.24 s, 32 iters, t-(init.)=1.21 s t(norm)=0.0769297, mflops=64.9944 (err=1.5e-14) 33. SCIPORT: elapsed time t=1.07 s, 16 iters, t-(init.)=1.05 s t(norm)=0.133514, mflops=37.4491 (err=1.4e-14) 34. Singleton: elapsed time t=1.45 s, 32 iters, t-(init.)=1.42 s t(norm)=0.0902812, mflops=55.3825 (err=2.1e-14) 35. Singleton (f2c): elapsed time t=1.44 s, 32 iters, t-(init.)=1.41 s t(norm)=0.0896454, mflops=55.7753 (err=2.1e-14) 36. Sorensen: elapsed time t=1.92 s, 32 iters, t-(init.)=1.89 s t(norm)=0.120163, mflops=41.6102 (err=1.4e-14) 37. Sorensen DIT: elapsed time t=1.37 s, 16 iters, t-(init.)=1.35 s t(norm)=0.171661, mflops=29.1271 (err=1.4e-14) 38. Temperton: elapsed time t=1.14 s, 16 iters, t-(init.)=1.12 s t(norm)=0.142415, mflops=35.1086 (err=1.5e-07) 39. Temperton (f2c): elapsed time t=1.12 s, 16 iters, t-(init.)=1.1 s t(norm)=0.139872, mflops=35.7469 (err=1.4e-14) 40. Valkenburg: elapsed time t=1.68 s, 4 iters, t-(init.)=1.67 s t(norm)=0.849406, mflops=5.88647 (err=1.4e-14) 41. SCSL: elapsed time t=1.04 s, 64 iters, t-(init.)=0.97 s t(norm)=0.0308355, mflops=162.151 (err=1.4e-14) 42. SGIMATH: elapsed time t=1.03 s, 64 iters, t-(init.)=0.96 s t(norm)=0.0305176, mflops=163.84 (err=1.4e-14) Top mflops for N=32768 = 163.84 Normalized results and averages for N=32768: fft 0: mflops = 46.5344 (norm. = 0.284024), norm. avg. (of 15) = 0.396553 fft 1: mflops = 46.8114 (norm. = 0.285714), norm. avg. (of 15) = 0.395428 fft 2: mflops = 35.4249 (norm. = 0.216216), norm. avg. (of 15) = 0.285878 fft 3: mflops = 44.1816 (norm. = 0.269663), norm. avg. (of 15) = 0.140878 fft 4: mflops = 20.1649 (norm. = 0.123077), norm. avg. (of 15) = 0.227291 fft 5: mflops = 24.1237 (norm. = 0.147239), norm. avg. (of 15) = 0.119164 fft 6: mflops = 61.44 (norm. = 0.375), norm. avg. (of 15) = 0.266561 fft 7: mflops = 43.9347 (norm. = 0.268156), norm. avg. 
(of 15) = 0.197309 fft 8: mflops = 32.4972 (norm. = 0.198347), norm. avg. (of 15) = 0.222503 fft 9: mflops = 162.151 (norm. = 0.989691), norm. avg. (of 15) = 0.56946 fft 10: mflops = 163.84 (norm. = 1), norm. avg. (of 15) = 0.589976 fft 11: mflops = 34.7979 (norm. = 0.212389), norm. avg. (of 14) = 0.239007 fft 12: mflops = 44.939 (norm. = 0.274286), norm. avg. (of 15) = 0.385976 fft 13: mflops = 25.7004 (norm. = 0.156863), norm. avg. (of 15) = 0.133381 fft 14: mflops = 108.473 (norm. = 0.662069), norm. avg. (of 15) = 0.716601 fft 15: mflops = 109.227 (norm. = 0.666667), norm. avg. (of 15) = 0.683286 fft 16: mflops = 67.7959 (norm. = 0.413793), norm. avg. (of 15) = 0.707082 fft 17: mflops = 88.8624 (norm. = 0.542373), norm. avg. (of 13) = 0.601123 fft 18: mflops = 41.3912 (norm. = 0.252632), norm. avg. (of 15) = 0.297155 fft 19: mflops = 33.6082 (norm. = 0.205128), norm. avg. (of 15) = 0.205749 fft 20: mflops = 34.1927 (norm. = 0.208696), norm. avg. (of 15) = 0.207005 fft 21: mflops = -1 (norm. = -0.00610352), norm. avg. (of 12) = 0.574086 fft 22: mflops = 50.4123 (norm. = 0.307692), norm. avg. (of 14) = 0.313437 fft 23: mflops = 53.8652 (norm. = 0.328767), norm. avg. (of 14) = 0.375729 fft 24: mflops = 53.1373 (norm. = 0.324324), norm. avg. (of 14) = 0.358922 fft 25: mflops = 32.2308 (norm. = 0.196721), norm. avg. (of 14) = 0.150266 fft 26: mflops = 26.5686 (norm. = 0.162162), norm. avg. (of 15) = 0.147685 fft 27: mflops = 36.0749 (norm. = 0.220183), norm. avg. (of 15) = 0.143707 fft 28: mflops = 35.4249 (norm. = 0.216216), norm. avg. (of 15) = 0.210475 fft 29: mflops = 35.4249 (norm. = 0.216216), norm. avg. (of 15) = 0.195956 fft 30: mflops = 85.4817 (norm. = 0.521739), norm. avg. (of 15) = 0.65034 fft 31: mflops = 81.0755 (norm. = 0.494845), norm. avg. (of 15) = 0.576855 fft 32: mflops = 64.9944 (norm. = 0.396694), norm. avg. (of 14) = 0.208729 fft 33: mflops = 37.4491 (norm. = 0.228571), norm. avg. (of 14) = 0.305453 fft 34: mflops = 55.3825 (norm. 
= 0.338028), norm. avg. (of 15) = 0.353904 fft 35: mflops = 55.7753 (norm. = 0.340426), norm. avg. (of 15) = 0.355812 fft 36: mflops = 41.6102 (norm. = 0.253968), norm. avg. (of 15) = 0.387504 fft 37: mflops = 29.1271 (norm. = 0.177778), norm. avg. (of 15) = 0.213817 fft 38: mflops = 35.1086 (norm. = 0.214286), norm. avg. (of 15) = 0.172429 fft 39: mflops = 35.7469 (norm. = 0.218182), norm. avg. (of 15) = 0.165244 fft 40: mflops = 5.88647 (norm. = 0.0359281), norm. avg. (of 15) = 0.0367498 fft 41: mflops = 162.151 (norm. = 0.989691), norm. avg. (of 15) = 0.783884 fft 42: mflops = 163.84 (norm. = 1), norm. avg. (of 15) = 0.777653 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.02 s, 8 iters, t-(init.)=1.01 s t(norm)=0.120401, mflops=41.5278 (err=1.7e-14) 1. Arndt DIT: elapsed time t=1.96 s, 16 iters, t-(init.)=1.93 s t(norm)=0.115037, mflops=43.4643 (err=1.7e-14) 2. Arndt Split-Radix: elapsed time t=1.22 s, 8 iters, t-(init.)=1.2 s t(norm)=0.143051, mflops=34.9525 (err=1.7e-14) 3. Arndt 4-step: elapsed time t=1.66 s, 16 iters, t-(init.)=1.63 s t(norm)=0.0971556, mflops=51.4639 (err=1.7e-14) 4. Bailey: elapsed time t=1.09 s, 4 iters, t-(init.)=1.08 s t(norm)=0.257492, mflops=19.4181 (err=1.7e-14) 5. Beauregard: elapsed time t=1.71 s, 8 iters, t-(init.)=1.7 s t(norm)=0.202656, mflops=24.6724 (err=1.7e-14) 6. Bergland: elapsed time t=1.46 s, 16 iters, t-(init.)=1.43 s t(norm)=0.0852346, mflops=58.6616 (err=1.7e-14) 7. Brenner: elapsed time t=1.96 s, 16 iters, t-(init.)=1.93 s t(norm)=0.115037, mflops=43.4643 (err=1.7e-14) 8. Burrus: elapsed time t=1.33 s, 8 iters, t-(init.)=1.31 s t(norm)=0.156164, mflops=32.0176 (err=1.7e-14) 9. CWP (min N) (N=72072): elapsed time t=1.11 s, 32 iters, t-(init.)=1.04 s t(norm)=0.0309944, mflops=161.319 10. CWP (best N) (N=72072): elapsed time t=1.1 s, 32 iters, t-(init.)=1.02 s t(norm)=0.0303984, mflops=164.483 11. 
Edelblute: elapsed time t=1.25 s, 8 iters, t-(init.)=1.23 s t(norm)=0.146627, mflops=34.1 (err=1.7e-14) 12. FFTPACK: elapsed time t=1.89 s, 16 iters, t-(init.)=1.85 s t(norm)=0.110269, mflops=45.3438 (err=1.7e-14) 13. FFTPACK (f2c): elapsed time t=1.52 s, 8 iters, t-(init.)=1.51 s t(norm)=0.180006, mflops=27.7768 (err=1.7e-14) FFTW_MEASURE plan: (cost = 5.000000e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=1.64 s, 32 iters, t-(init.)=1.57 s t(norm)=0.0467896, mflops=106.861 (err=1.7e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.87 s, 32 iters, t-(init.)=1.8 s t(norm)=0.0536442, mflops=93.2068 (err=1.7e-14) 16. Frigo-old: elapsed time t=1.44 s, 16 iters, t-(init.)=1.41 s t(norm)=0.0840425, mflops=59.4937 (err=1.7e-14) 17. Green: elapsed time t=1.05 s, 16 iters, t-(init.)=1.02 s t(norm)=0.0607967, mflops=82.2413 (err=1.7e-14) 18. GSL: elapsed time t=1.04 s, 8 iters, t-(init.)=1.02 s t(norm)=0.121593, mflops=41.1206 (err=1.7e-14) 19. GSL DIT: elapsed time t=1.28 s, 8 iters, t-(init.)=1.27 s t(norm)=0.151396, mflops=33.026 (err=1.7e-14) 20. GSL DIF: elapsed time t=1.23 s, 8 iters, t-(init.)=1.21 s t(norm)=0.144243, mflops=34.6637 (err=1.8e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.71 s, 16 iters, t-(init.)=1.67 s t(norm)=0.0995398, mflops=50.2312 (err=1.7e-14) 23. Mayer (simple): elapsed time t=1.63 s, 16 iters, t-(init.)=1.6 s t(norm)=0.0953674, mflops=52.4288 24. Mayer (lookup): elapsed time t=1.63 s, 16 iters, t-(init.)=1.59 s t(norm)=0.0947714, mflops=52.7585 (err=1.7e-14) 25. Monro: elapsed time t=1.33 s, 8 iters, t-(init.)=1.31 s t(norm)=0.156164, mflops=32.0176 (err=1.6e-07) 26. NAPACK (f2c): elapsed time t=1.57 s, 8 iters, t-(init.)=1.55 s t(norm)=0.184774, mflops=27.06 (err=8.6e-13) 27. 
Nielsen: elapsed time t=1.33 s, 8 iters, t-(init.)=1.32 s t(norm)=0.157356, mflops=31.775 (err=2.6e-13) 28. NR (C): elapsed time t=1.23 s, 8 iters, t-(init.)=1.21 s t(norm)=0.144243, mflops=34.6637 (err=1.7e-14) 29. NR (F): elapsed time t=1.24 s, 8 iters, t-(init.)=1.22 s t(norm)=0.145435, mflops=34.3795 (err=1.7e-14) 30. Ooura (C): elapsed time t=1.11 s, 16 iters, t-(init.)=1.07 s t(norm)=0.063777, mflops=78.3982 (err=1.7e-14) 31. Ooura (F): elapsed time t=1.21 s, 16 iters, t-(init.)=1.18 s t(norm)=0.0703335, mflops=71.0899 (err=1.7e-14) 32. Ransom: elapsed time t=1.3 s, 16 iters, t-(init.)=1.27 s t(norm)=0.0756979, mflops=66.052 (err=1.7e-14) 33. SCIPORT: elapsed time t=1.24 s, 8 iters, t-(init.)=1.22 s t(norm)=0.145435, mflops=34.3795 (err=1.7e-14) 34. Singleton: elapsed time t=1.51 s, 16 iters, t-(init.)=1.48 s t(norm)=0.0882149, mflops=56.6798 (err=2.3e-14) 35. Singleton (f2c): elapsed time t=1.47 s, 16 iters, t-(init.)=1.44 s t(norm)=0.0858307, mflops=58.2542 (err=2.3e-14) 36. Sorensen: elapsed time t=1.07 s, 8 iters, t-(init.)=1.05 s t(norm)=0.12517, mflops=39.9458 (err=1.7e-14) 37. Sorensen DIT: elapsed time t=1.5 s, 8 iters, t-(init.)=1.49 s t(norm)=0.177622, mflops=28.1497 (err=1.7e-14) 38. Temperton: elapsed time t=1.24 s, 8 iters, t-(init.)=1.22 s t(norm)=0.145435, mflops=34.3795 (err=1.7e-07) 39. Temperton (f2c): elapsed time t=1.21 s, 8 iters, t-(init.)=1.19 s t(norm)=0.141859, mflops=35.2463 (err=1.7e-14) 40. Valkenburg: elapsed time t=1.02 s, 1 iters, t-(init.)=1.02 s t(norm)=0.972748, mflops=5.14008 (err=1.7e-14) 41. SCSL: elapsed time t=1.1 s, 32 iters, t-(init.)=1.03 s t(norm)=0.0306964, mflops=162.886 (err=1.7e-14) 42. SGIMATH: elapsed time t=1.1 s, 32 iters, t-(init.)=1.04 s t(norm)=0.0309944, mflops=161.319 (err=1.7e-14) Top mflops for N=65536 = 164.483 Normalized results and averages for N=65536: fft 0: mflops = 41.5278 (norm. = 0.252475), norm. avg. (of 16) = 0.387548 fft 1: mflops = 43.4643 (norm. = 0.264249), norm. avg. 
(of 16) = 0.387229 fft 2: mflops = 34.9525 (norm. = 0.2125), norm. avg. (of 16) = 0.281292 fft 3: mflops = 51.4639 (norm. = 0.312883), norm. avg. (of 16) = 0.151629 fft 4: mflops = 19.4181 (norm. = 0.118056), norm. avg. (of 16) = 0.220464 fft 5: mflops = 24.6724 (norm. = 0.15), norm. avg. (of 16) = 0.121092 fft 6: mflops = 58.6616 (norm. = 0.356643), norm. avg. (of 16) = 0.272192 fft 7: mflops = 43.4643 (norm. = 0.264249), norm. avg. (of 16) = 0.201493 fft 8: mflops = 32.0176 (norm. = 0.194656), norm. avg. (of 16) = 0.220763 fft 9: mflops = 161.319 (norm. = 0.980769), norm. avg. (of 16) = 0.595167 fft 10: mflops = 164.483 (norm. = 1), norm. avg. (of 16) = 0.615602 fft 11: mflops = 34.1 (norm. = 0.207317), norm. avg. (of 15) = 0.236894 fft 12: mflops = 45.3438 (norm. = 0.275676), norm. avg. (of 16) = 0.379082 fft 13: mflops = 27.7768 (norm. = 0.168874), norm. avg. (of 16) = 0.135599 fft 14: mflops = 106.861 (norm. = 0.649682), norm. avg. (of 16) = 0.712419 fft 15: mflops = 93.2068 (norm. = 0.566667), norm. avg. (of 16) = 0.675997 fft 16: mflops = 59.4937 (norm. = 0.361702), norm. avg. (of 16) = 0.685496 fft 17: mflops = 82.2413 (norm. = 0.5), norm. avg. (of 14) = 0.5939 fft 18: mflops = 41.1206 (norm. = 0.25), norm. avg. (of 16) = 0.294208 fft 19: mflops = 33.026 (norm. = 0.200787), norm. avg. (of 16) = 0.205439 fft 20: mflops = 34.6637 (norm. = 0.210744), norm. avg. (of 16) = 0.207239 fft 21: mflops = -1 (norm. = -0.00607967), norm. avg. (of 12) = 0.574086 fft 22: mflops = 50.2312 (norm. = 0.305389), norm. avg. (of 15) = 0.3129 fft 23: mflops = 52.4288 (norm. = 0.31875), norm. avg. (of 15) = 0.371931 fft 24: mflops = 52.7585 (norm. = 0.320755), norm. avg. (of 15) = 0.356378 fft 25: mflops = 32.0176 (norm. = 0.194656), norm. avg. (of 15) = 0.153225 fft 26: mflops = 27.06 (norm. = 0.164516), norm. avg. (of 16) = 0.148737 fft 27: mflops = 31.775 (norm. = 0.193182), norm. avg. (of 16) = 0.146799 fft 28: mflops = 34.6637 (norm. = 0.210744), norm. avg. 
(of 16) = 0.210492 fft 29: mflops = 34.3795 (norm. = 0.209016), norm. avg. (of 16) = 0.196772 fft 30: mflops = 78.3982 (norm. = 0.476636), norm. avg. (of 16) = 0.639484 fft 31: mflops = 71.0899 (norm. = 0.432203), norm. avg. (of 16) = 0.567814 fft 32: mflops = 66.052 (norm. = 0.401575), norm. avg. (of 15) = 0.221585 fft 33: mflops = 34.3795 (norm. = 0.209016), norm. avg. (of 15) = 0.299024 fft 34: mflops = 56.6798 (norm. = 0.344595), norm. avg. (of 16) = 0.353322 fft 35: mflops = 58.2542 (norm. = 0.354167), norm. avg. (of 16) = 0.355709 fft 36: mflops = 39.9458 (norm. = 0.242857), norm. avg. (of 16) = 0.378463 fft 37: mflops = 28.1497 (norm. = 0.171141), norm. avg. (of 16) = 0.21115 fft 38: mflops = 34.3795 (norm. = 0.209016), norm. avg. (of 16) = 0.174715 fft 39: mflops = 35.2463 (norm. = 0.214286), norm. avg. (of 16) = 0.16831 fft 40: mflops = 5.14008 (norm. = 0.03125), norm. avg. (of 16) = 0.036406 fft 41: mflops = 162.886 (norm. = 0.990291), norm. avg. (of 16) = 0.796785 fft 42: mflops = 161.319 (norm. = 0.980769), norm. avg. (of 16) = 0.790348 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.51 s, 4 iters, t-(init.)=1.49 s t(norm)=0.167173, mflops=29.909 (err=3.3e-14) 1. Arndt DIT: elapsed time t=1.41 s, 4 iters, t-(init.)=1.39 s t(norm)=0.155954, mflops=32.0608 (err=3.3e-14) 2. Arndt Split-Radix: elapsed time t=1.85 s, 4 iters, t-(init.)=1.83 s t(norm)=0.20532, mflops=24.3522 (err=3.3e-14) 3. Arndt 4-step: elapsed time t=1.99 s, 8 iters, t-(init.)=1.96 s t(norm)=0.109953, mflops=45.474 (err=3.3e-14) 4. Bailey: elapsed time t=1.52 s, 2 iters, t-(init.)=1.51 s t(norm)=0.338835, mflops=14.7565 (err=3.3e-14) 5. Beauregard: elapsed time t=1.84 s, 4 iters, t-(init.)=1.83 s t(norm)=0.20532, mflops=24.3522 (err=3.3e-14) 6. Bergland: elapsed time t=1.74 s, 8 iters, t-(init.)=1.7 s t(norm)=0.0953674, mflops=52.4288 (err=3.4e-14) 7. 
Brenner: elapsed time t=1.16 s, 4 iters, t-(init.)=1.14 s t(norm)=0.127905, mflops=39.0916 (err=3.3e-14) 8. Burrus: elapsed time t=1.63 s, 4 iters, t-(init.)=1.62 s t(norm)=0.181759, mflops=27.5089 (err=3.3e-14) 9. CWP (min N) (N=144144): elapsed time t=1.19 s, 16 iters, t-(init.)=1.11 s t(norm)=0.0311347, mflops=160.593 10. CWP (best N) (N=144144): elapsed time t=1.19 s, 16 iters, t-(init.)=1.11 s t(norm)=0.0311347, mflops=160.593 11. Edelblute: elapsed time t=1.55 s, 4 iters, t-(init.)=1.54 s t(norm)=0.172783, mflops=28.938 (err=3.3e-14) 12. FFTPACK: elapsed time t=1.64 s, 4 iters, t-(init.)=1.62 s t(norm)=0.181759, mflops=27.5089 (err=3.3e-14) 13. FFTPACK (f2c): elapsed time t=1.19 s, 2 iters, t-(init.)=1.19 s t(norm)=0.267029, mflops=18.7246 (err=3.3e-14) FFTW_MEASURE plan: (cost = 1.550000e-01) FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 16 14. FFTW: elapsed time t=1.25 s, 8 iters, t-(init.)=1.21 s t(norm)=0.0678792, mflops=73.6603 (err=3.3e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.21 s, 8 iters, t-(init.)=1.17 s t(norm)=0.0656352, mflops=76.1786 (err=3.3e-14) 16. Frigo-old: elapsed time t=1.92 s, 8 iters, t-(init.)=1.88 s t(norm)=0.105465, mflops=47.409 (err=3.3e-14) 17. Green: elapsed time t=1.45 s, 8 iters, t-(init.)=1.42 s t(norm)=0.0796599, mflops=62.7669 (err=3.3e-14) 18. GSL: elapsed time t=1.45 s, 4 iters, t-(init.)=1.44 s t(norm)=0.161564, mflops=30.9476 (err=3.3e-14) 19. GSL DIT: elapsed time t=1.72 s, 4 iters, t-(init.)=1.7 s t(norm)=0.190735, mflops=26.2144 (err=3.5e-14) 20. GSL DIF: elapsed time t=1.64 s, 4 iters, t-(init.)=1.62 s t(norm)=0.181759, mflops=27.5089 (err=3.5e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.91 s, 8 iters, t-(init.)=1.88 s t(norm)=0.105465, mflops=47.409 (err=3.3e-14) 23. 
Mayer (simple): elapsed time t=1.81 s, 8 iters, t-(init.)=1.77 s t(norm)=0.0992943, mflops=50.3553 24. Mayer (lookup): elapsed time t=1.83 s, 8 iters, t-(init.)=1.79 s t(norm)=0.100416, mflops=49.7927 (err=3.3e-14) 25. Monro: elapsed time t=1.79 s, 4 iters, t-(init.)=1.77 s t(norm)=0.198589, mflops=25.1777 (err=1.7e-07) 26. NAPACK (f2c): elapsed time t=1.8 s, 4 iters, t-(init.)=1.78 s t(norm)=0.199711, mflops=25.0362 (err=2.0e-12) 27. Nielsen: elapsed time t=1.75 s, 4 iters, t-(init.)=1.73 s t(norm)=0.194101, mflops=25.7598 (err=9.2e-13) 28. NR (C): elapsed time t=1.65 s, 4 iters, t-(init.)=1.63 s t(norm)=0.182881, mflops=27.3402 (err=3.4e-14) 29. NR (F): elapsed time t=1.62 s, 4 iters, t-(init.)=1.61 s t(norm)=0.180637, mflops=27.6798 (err=3.4e-14) 30. Ooura (C): elapsed time t=1.25 s, 8 iters, t-(init.)=1.22 s t(norm)=0.0684402, mflops=73.0565 (err=3.4e-14) 31. Ooura (F): elapsed time t=1.33 s, 8 iters, t-(init.)=1.29 s t(norm)=0.0723671, mflops=69.0922 (err=3.4e-14) 32. Ransom: elapsed time t=1.48 s, 8 iters, t-(init.)=1.44 s t(norm)=0.0807818, mflops=61.8951 (err=3.3e-14) 33. SCIPORT: elapsed time t=1.74 s, 4 iters, t-(init.)=1.72 s t(norm)=0.192979, mflops=25.9096 (err=3.3e-14) 34. Singleton: elapsed time t=1.9 s, 8 iters, t-(init.)=1.87 s t(norm)=0.104904, mflops=47.6625 (err=4.8e-14) 35. Singleton (f2c): elapsed time t=1.85 s, 8 iters, t-(init.)=1.81 s t(norm)=0.101538, mflops=49.2425 (err=4.8e-14) 36. Sorensen: elapsed time t=1.28 s, 4 iters, t-(init.)=1.27 s t(norm)=0.14249, mflops=35.0901 (err=3.3e-14) 37. Sorensen DIT: elapsed time t=1.81 s, 4 iters, t-(init.)=1.8 s t(norm)=0.201955, mflops=24.758 (err=3.3e-14) 38. Temperton: elapsed time t=1.6 s, 4 iters, t-(init.)=1.58 s t(norm)=0.177271, mflops=28.2054 (err=1.9e-07) 39. Temperton (f2c): elapsed time t=1.65 s, 4 iters, t-(init.)=1.63 s t(norm)=0.182881, mflops=27.3402 (err=3.3e-14) 40. Valkenburg: elapsed time t=1.89 s, 1 iters, t-(init.)=1.89 s t(norm)=0.848209, mflops=5.89477 (err=3.4e-14) 41. 
SCSL: elapsed time t=1.33 s, 16 iters, t-(init.)=1.26 s t(norm)=0.035342, mflops=141.475 (err=3.3e-14) 42. SGIMATH: elapsed time t=1.33 s, 16 iters, t-(init.)=1.26 s t(norm)=0.035342, mflops=141.475 (err=3.3e-14) Top mflops for N=131072 = 160.593 Normalized results and averages for N=131072: fft 0: mflops = 29.909 (norm. = 0.186242), norm. avg. (of 17) = 0.375707 fft 1: mflops = 32.0608 (norm. = 0.19964), norm. avg. (of 17) = 0.376194 fft 2: mflops = 24.3522 (norm. = 0.151639), norm. avg. (of 17) = 0.273666 fft 3: mflops = 45.474 (norm. = 0.283163), norm. avg. (of 17) = 0.159366 fft 4: mflops = 14.7565 (norm. = 0.0918874), norm. avg. (of 17) = 0.212901 fft 5: mflops = 24.3522 (norm. = 0.151639), norm. avg. (of 17) = 0.122889 fft 6: mflops = 52.4288 (norm. = 0.326471), norm. avg. (of 17) = 0.275384 fft 7: mflops = 39.0916 (norm. = 0.243421), norm. avg. (of 17) = 0.203959 fft 8: mflops = 27.5089 (norm. = 0.171296), norm. avg. (of 17) = 0.217853 fft 9: mflops = 160.593 (norm. = 1), norm. avg. (of 17) = 0.61898 fft 10: mflops = 160.593 (norm. = 1), norm. avg. (of 17) = 0.638214 fft 11: mflops = 28.938 (norm. = 0.180195), norm. avg. (of 16) = 0.23335 fft 12: mflops = 27.5089 (norm. = 0.171296), norm. avg. (of 17) = 0.366859 fft 13: mflops = 18.7246 (norm. = 0.116597), norm. avg. (of 17) = 0.134481 fft 14: mflops = 73.6603 (norm. = 0.458678), norm. avg. (of 17) = 0.697493 fft 15: mflops = 76.1786 (norm. = 0.474359), norm. avg. (of 17) = 0.664136 fft 16: mflops = 47.409 (norm. = 0.295213), norm. avg. (of 17) = 0.662538 fft 17: mflops = 62.7669 (norm. = 0.390845), norm. avg. (of 15) = 0.580363 fft 18: mflops = 30.9476 (norm. = 0.192708), norm. avg. (of 17) = 0.288237 fft 19: mflops = 26.2144 (norm. = 0.163235), norm. avg. (of 17) = 0.202956 fft 20: mflops = 27.5089 (norm. = 0.171296), norm. avg. (of 17) = 0.205124 fft 21: mflops = -1 (norm. = -0.00622693), norm. avg. (of 12) = 0.574086 fft 22: mflops = 47.409 (norm. = 0.295213), norm. avg. 
(of 16) = 0.311795 fft 23: mflops = 50.3553 (norm. = 0.313559), norm. avg. (of 16) = 0.368283 fft 24: mflops = 49.7927 (norm. = 0.310056), norm. avg. (of 16) = 0.353483 fft 25: mflops = 25.1777 (norm. = 0.15678), norm. avg. (of 16) = 0.153447 fft 26: mflops = 25.0362 (norm. = 0.155899), norm. avg. (of 17) = 0.149158 fft 27: mflops = 25.7598 (norm. = 0.160405), norm. avg. (of 17) = 0.1476 fft 28: mflops = 27.3402 (norm. = 0.170245), norm. avg. (of 17) = 0.208124 fft 29: mflops = 27.6798 (norm. = 0.17236), norm. avg. (of 17) = 0.195336 fft 30: mflops = 73.0565 (norm. = 0.454918), norm. avg. (of 17) = 0.628627 fft 31: mflops = 69.0922 (norm. = 0.430233), norm. avg. (of 17) = 0.559721 fft 32: mflops = 61.8951 (norm. = 0.385417), norm. avg. (of 16) = 0.231825 fft 33: mflops = 25.9096 (norm. = 0.161337), norm. avg. (of 16) = 0.290419 fft 34: mflops = 47.6625 (norm. = 0.296791), norm. avg. (of 17) = 0.349997 fft 35: mflops = 49.2425 (norm. = 0.30663), norm. avg. (of 17) = 0.352822 fft 36: mflops = 35.0901 (norm. = 0.218504), norm. avg. (of 17) = 0.369054 fft 37: mflops = 24.758 (norm. = 0.154167), norm. avg. (of 17) = 0.207798 fft 38: mflops = 28.2054 (norm. = 0.175633), norm. avg. (of 17) = 0.174769 fft 39: mflops = 27.3402 (norm. = 0.170245), norm. avg. (of 17) = 0.168423 fft 40: mflops = 5.89477 (norm. = 0.0367063), norm. avg. (of 17) = 0.0364237 fft 41: mflops = 141.475 (norm. = 0.880952), norm. avg. (of 17) = 0.801736 fft 42: mflops = 141.475 (norm. = 0.880952), norm. avg. (of 17) = 0.795678 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=1.07 s, 1 iters, t-(init.)=1.06 s t(norm)=0.224643, mflops=22.2575 (err=4.3e-14) 1. Arndt DIT: elapsed time t=1 s, 1 iters, t-(init.)=0.99 s t(norm)=0.209808, mflops=23.8313 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=1.2 s, 1 iters, t-(init.)=1.19 s t(norm)=0.252194, mflops=19.826 (err=4.3e-14) 3. 
Arndt 4-step: elapsed time t=1.96 s, 4 iters, t-(init.)=1.92 s t(norm)=0.101725, mflops=49.152 (err=4.3e-14) 4. Bailey: elapsed time t=3.35 s, 1 iters, t-(init.)=3.33 s t(norm)=0.705719, mflops=7.08497 (err=4.3e-14) 5. Beauregard: elapsed time t=1.03 s, 1 iters, t-(init.)=1.02 s t(norm)=0.216166, mflops=23.1304 (err=4.4e-14) 6. Bergland: elapsed time t=1.06 s, 2 iters, t-(init.)=1.04 s t(norm)=0.110202, mflops=45.3711 (err=4.4e-14) 7. Brenner: elapsed time t=1.54 s, 2 iters, t-(init.)=1.52 s t(norm)=0.161065, mflops=31.0434 (err=4.4e-14) 8. Burrus: elapsed time t=1.3 s, 1 iters, t-(init.)=1.29 s t(norm)=0.273387, mflops=18.2891 (err=4.3e-14) 9. CWP (min N) (N=360360): elapsed time t=1.02 s, 4 iters, t-(init.)=0.94 s t(norm)=0.049803, mflops=100.396 10. CWP (best N) (N=360360): elapsed time t=1.03 s, 4 iters, t-(init.)=0.95 s t(norm)=0.0503328, mflops=99.3388 11. Edelblute: elapsed time t=1.23 s, 1 iters, t-(init.)=1.23 s t(norm)=0.260671, mflops=19.1813 (err=4.3e-14) 12. FFTPACK: elapsed time t=1.5 s, 1 iters, t-(init.)=1.49 s t(norm)=0.315772, mflops=15.8342 (err=4.4e-14) 13. FFTPACK (f2c): elapsed time t=2.2 s, 1 iters, t-(init.)=2.18 s t(norm)=0.462002, mflops=10.8225 (err=4.4e-14) FFTW_MEASURE plan: (cost = 4.000000e-01) FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 14. FFTW: elapsed time t=1.61 s, 4 iters, t-(init.)=1.49 s t(norm)=0.078943, mflops=63.3368 (err=4.4e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.79 s, 4 iters, t-(init.)=1.69 s t(norm)=0.0895394, mflops=55.8413 (err=4.4e-14) 16. Frigo-old: elapsed time t=1.64 s, 2 iters, t-(init.)=1.61 s t(norm)=0.170602, mflops=29.308 (err=4.4e-14) 17. Green: elapsed time t=1 s, 2 iters, t-(init.)=0.98 s t(norm)=0.103845, mflops=48.1489 (err=4.4e-14) 18. GSL: elapsed time t=1.93 s, 1 iters, t-(init.)=1.92 s t(norm)=0.406901, mflops=12.288 (err=4.4e-14) 19. 
GSL DIT: elapsed time t=1.01 s, 1 iters, t-(init.)=1.01 s t(norm)=0.214047, mflops=23.3594 (err=4.6e-14) 20. GSL DIF: elapsed time t=1.94 s, 2 iters, t-(init.)=1.92 s t(norm)=0.203451, mflops=24.576 (err=4.6e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.29 s, 2 iters, t-(init.)=1.27 s t(norm)=0.134574, mflops=37.1543 (err=4.3e-14) 23. Mayer (simple): elapsed time t=1.24 s, 2 iters, t-(init.)=1.22 s t(norm)=0.129276, mflops=38.677 24. Mayer (lookup): elapsed time t=1.22 s, 2 iters, t-(init.)=1.2 s t(norm)=0.127157, mflops=39.3216 (err=4.3e-14) 25. Monro: elapsed time t=1.21 s, 1 iters, t-(init.)=1.2 s t(norm)=0.254313, mflops=19.6608 (err=1.8e-07) 26. NAPACK (f2c): elapsed time t=2.41 s, 1 iters, t-(init.)=2.38 s t(norm)=0.504388, mflops=9.91301 (err=3.7e-12) 27. Nielsen: elapsed time t=1.19 s, 1 iters, t-(init.)=1.17 s t(norm)=0.247955, mflops=20.1649 (err=2.1e-12) 28. NR (C): elapsed time t=1.98 s, 2 iters, t-(init.)=1.96 s t(norm)=0.207689, mflops=24.0744 (err=4.4e-14) 29. NR (F): elapsed time t=1.95 s, 2 iters, t-(init.)=1.94 s t(norm)=0.20557, mflops=24.3226 (err=4.4e-14) 30. Ooura (C): elapsed time t=1.62 s, 4 iters, t-(init.)=1.58 s t(norm)=0.0837114, mflops=59.729 (err=4.4e-14) 31. Ooura (F): elapsed time t=1.76 s, 4 iters, t-(init.)=1.73 s t(norm)=0.0916587, mflops=54.5502 (err=4.4e-14) 32. Ransom: elapsed time t=1.37 s, 4 iters, t-(init.)=1.33 s t(norm)=0.0704659, mflops=70.9563 (err=4.3e-14) 33. SCIPORT: elapsed time t=1.99 s, 1 iters, t-(init.)=1.97 s t(norm)=0.417497, mflops=11.9761 (err=4.4e-14) 34. Singleton: elapsed time t=1.19 s, 2 iters, t-(init.)=1.18 s t(norm)=0.125037, mflops=39.9881 (err=6.0e-14) 35. Singleton (f2c): elapsed time t=1.14 s, 2 iters, t-(init.)=1.12 s t(norm)=0.118679, mflops=42.1303 (err=6.0e-14) 36. Sorensen: elapsed time t=1.01 s, 1 iters, t-(init.)=1 s t(norm)=0.211928, mflops=23.593 (err=4.3e-14) 37. 
Sorensen DIT: elapsed time t=1.43 s, 1 iters, t-(init.)=1.42 s t(norm)=0.300937, mflops=16.6148 (err=4.3e-14) 38. Temperton: elapsed time t=1.41 s, 1 iters, t-(init.)=1.4 s t(norm)=0.296699, mflops=16.8521 (err=2.0e-07) 39. Temperton (f2c): elapsed time t=1.75 s, 1 iters, t-(init.)=1.74 s t(norm)=0.368754, mflops=13.5592 (err=4.4e-14) 40. Valkenburg: elapsed time t=4.54 s, 1 iters, t-(init.)=4.53 s t(norm)=0.960032, mflops=5.20816 (err=4.4e-14) 41. SCSL: elapsed time t=1.06 s, 4 iters, t-(init.)=0.98 s t(norm)=0.0519223, mflops=96.2978 (err=4.4e-14) 42. SGIMATH: elapsed time t=1.53 s, 8 iters, t-(init.)=1.46 s t(norm)=0.0386768, mflops=129.276 (err=4.4e-14) Top mflops for N=262144 = 129.276 Normalized results and averages for N=262144: fft 0: mflops = 22.2575 (norm. = 0.17217), norm. avg. (of 18) = 0.364399 fft 1: mflops = 23.8313 (norm. = 0.184343), norm. avg. (of 18) = 0.365536 fft 2: mflops = 19.826 (norm. = 0.153361), norm. avg. (of 18) = 0.266982 fft 3: mflops = 49.152 (norm. = 0.380208), norm. avg. (of 18) = 0.171635 fft 4: mflops = 7.08497 (norm. = 0.0548048), norm. avg. (of 18) = 0.204118 fft 5: mflops = 23.1304 (norm. = 0.178922), norm. avg. (of 18) = 0.126002 fft 6: mflops = 45.3711 (norm. = 0.350962), norm. avg. (of 18) = 0.279583 fft 7: mflops = 31.0434 (norm. = 0.240132), norm. avg. (of 18) = 0.205969 fft 8: mflops = 18.2891 (norm. = 0.141473), norm. avg. (of 18) = 0.21361 fft 9: mflops = 100.396 (norm. = 0.776596), norm. avg. (of 18) = 0.627737 fft 10: mflops = 99.3388 (norm. = 0.768421), norm. avg. (of 18) = 0.645448 fft 11: mflops = 19.1813 (norm. = 0.148374), norm. avg. (of 17) = 0.228352 fft 12: mflops = 15.8342 (norm. = 0.122483), norm. avg. (of 18) = 0.353283 fft 13: mflops = 10.8225 (norm. = 0.0837156), norm. avg. (of 18) = 0.131661 fft 14: mflops = 63.3368 (norm. = 0.489933), norm. avg. (of 18) = 0.685962 fft 15: mflops = 55.8413 (norm. = 0.431953), norm. avg. (of 18) = 0.651237 fft 16: mflops = 29.308 (norm. = 0.226708), norm. avg. 
(of 18) = 0.638325 fft 17: mflops = 48.1489 (norm. = 0.372449), norm. avg. (of 16) = 0.567368 fft 18: mflops = 12.288 (norm. = 0.0950521), norm. avg. (of 18) = 0.277505 fft 19: mflops = 23.3594 (norm. = 0.180693), norm. avg. (of 18) = 0.201719 fft 20: mflops = 24.576 (norm. = 0.190104), norm. avg. (of 18) = 0.20429 fft 21: mflops = -1 (norm. = -0.00773536), norm. avg. (of 12) = 0.574086 fft 22: mflops = 37.1543 (norm. = 0.287402), norm. avg. (of 17) = 0.31036 fft 23: mflops = 38.677 (norm. = 0.29918), norm. avg. (of 17) = 0.364218 fft 24: mflops = 39.3216 (norm. = 0.304167), norm. avg. (of 17) = 0.350582 fft 25: mflops = 19.6608 (norm. = 0.152083), norm. avg. (of 17) = 0.153367 fft 26: mflops = 9.91301 (norm. = 0.0766807), norm. avg. (of 18) = 0.145132 fft 27: mflops = 20.1649 (norm. = 0.155983), norm. avg. (of 18) = 0.148065 fft 28: mflops = 24.0744 (norm. = 0.186224), norm. avg. (of 18) = 0.206908 fft 29: mflops = 24.3226 (norm. = 0.188144), norm. avg. (of 18) = 0.194936 fft 30: mflops = 59.729 (norm. = 0.462025), norm. avg. (of 18) = 0.619371 fft 31: mflops = 54.5502 (norm. = 0.421965), norm. avg. (of 18) = 0.552068 fft 32: mflops = 70.9563 (norm. = 0.548872), norm. avg. (of 17) = 0.250475 fft 33: mflops = 11.9761 (norm. = 0.0926396), norm. avg. (of 17) = 0.278785 fft 34: mflops = 39.9881 (norm. = 0.309322), norm. avg. (of 18) = 0.347737 fft 35: mflops = 42.1303 (norm. = 0.325893), norm. avg. (of 18) = 0.351326 fft 36: mflops = 23.593 (norm. = 0.1825), norm. avg. (of 18) = 0.35869 fft 37: mflops = 16.6148 (norm. = 0.128521), norm. avg. (of 18) = 0.203394 fft 38: mflops = 16.8521 (norm. = 0.130357), norm. avg. (of 18) = 0.172302 fft 39: mflops = 13.5592 (norm. = 0.104885), norm. avg. (of 18) = 0.164894 fft 40: mflops = 5.20816 (norm. = 0.040287), norm. avg. (of 18) = 0.0366383 fft 41: mflops = 96.2978 (norm. = 0.744898), norm. avg. (of 18) = 0.798578 fft 42: mflops = 129.276 (norm. = 1), norm. avg. 
(of 18) = 0.807029 Benchmarking for array size = 524288 (power of 2): 0. Arndt DIF: elapsed time t=7.79 s, 1 iters, t-(init.)=7.75 s t(norm)=0.777997, mflops=6.42676 (err=1.1e-13) 1. Arndt DIT: elapsed time t=8.6 s, 1 iters, t-(init.)=8.55 s t(norm)=0.858307, mflops=5.82542 (err=1.1e-13) 2. Arndt Split-Radix: elapsed time t=9.05 s, 1 iters, t-(init.)=8.98 s t(norm)=0.901473, mflops=5.54648 (err=1.1e-13) 3. Arndt 4-step: elapsed time t=1.89 s, 1 iters, t-(init.)=1.85 s t(norm)=0.185716, mflops=26.9229 (err=1.1e-13) 4. Bailey: elapsed time t=11.82 s, 1 iters, t-(init.)=11.78 s t(norm)=1.18256, mflops=4.22813 (err=1.1e-13) 5. Beauregard: elapsed time t=2.79 s, 1 iters, t-(init.)=2.75 s t(norm)=0.276064, mflops=18.1118 (err=1.1e-13) 6. Bergland: elapsed time t=2.8 s, 1 iters, t-(init.)=2.76 s t(norm)=0.277067, mflops=18.0461 (err=1.1e-13) 7. Brenner: elapsed time t=3.72 s, 1 iters, t-(init.)=3.68 s t(norm)=0.369423, mflops=13.5346 (err=1.1e-13) 8. Burrus: elapsed time t=10.41 s, 1 iters, t-(init.)=10.37 s t(norm)=1.04101, mflops=4.80302 (err=1.1e-13) 9. CWP (min N) (N=720720): elapsed time t=1.28 s, 2 iters, t-(init.)=1.18 s t(norm)=0.0592282, mflops=84.4193 10. CWP (best N) (N=720720): elapsed time t=1.37 s, 2 iters, t-(init.)=1.26 s t(norm)=0.0632437, mflops=79.0593 11. Edelblute: elapsed time t=8.74 s, 1 iters, t-(init.)=8.7 s t(norm)=0.873365, mflops=5.72498 (err=1.1e-13) 12. FFTPACK: elapsed time t=8.03 s, 1 iters, t-(init.)=7.99 s t(norm)=0.80209, mflops=6.23371 (err=1.1e-13) 13. FFTPACK (f2c): elapsed time t=6.57 s, 1 iters, t-(init.)=6.53 s t(norm)=0.655526, mflops=7.62747 (err=1.1e-13) FFTW_MEASURE plan: (cost = 1.530000e+00) FFTW_TWIDDLE 2 FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 14. FFTW: elapsed time t=1.16 s, 1 iters, t-(init.)=1.12 s t(norm)=0.112433, mflops=44.4709 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 5.976883e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.82 s, 1 iters, t-(init.)=1.78 s t(norm)=0.178688, mflops=27.9817 (err=1.1e-13) 16. Frigo-old: elapsed time t=2.3 s, 1 iters, t-(init.)=2.26 s t(norm)=0.226874, mflops=22.0387 (err=1.1e-13) 17. Green: elapsed time t=2.3 s, 1 iters, t-(init.)=2.25 s t(norm)=0.22587, mflops=22.1366 (err=1.1e-13) 18. GSL: elapsed time t=5.94 s, 1 iters, t-(init.)=5.9 s t(norm)=0.592282, mflops=8.44193 (err=1.1e-13) 19. GSL DIT: elapsed time t=5.42 s, 1 iters, t-(init.)=5.38 s t(norm)=0.540081, mflops=9.25787 (err=1.1e-13) 20. GSL DIF: elapsed time t=7.04 s, 1 iters, t-(init.)=7 s t(norm)=0.702707, mflops=7.11534 (err=1.1e-13) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.57 s, 1 iters, t-(init.)=1.53 s t(norm)=0.153592, mflops=32.5538 (err=1.1e-13) 23. Mayer (simple): elapsed time t=1.55 s, 1 iters, t-(init.)=1.5 s t(norm)=0.15058, mflops=33.2049 24. Mayer (lookup): elapsed time t=2.61 s, 1 iters, t-(init.)=2.57 s t(norm)=0.257994, mflops=19.3803 (err=1.1e-13) 25. Monro: elapsed time t=8.6 s, 1 iters, t-(init.)=8.56 s t(norm)=0.859311, mflops=5.81862 (err=1.9e-07) 26. NAPACK (f2c): elapsed time t=4.83 s, 1 iters, t-(init.)=4.79 s t(norm)=0.480853, mflops=10.3982 (err=7.9e-12) 27. Nielsen: elapsed time t=6.18 s, 1 iters, t-(init.)=6.14 s t(norm)=0.616375, mflops=8.11195 (err=4.4e-12) 28. NR (C): elapsed time t=7.01 s, 1 iters, t-(init.)=6.97 s t(norm)=0.699696, mflops=7.14596 (err=1.1e-13) 29. NR (F): elapsed time t=6.03 s, 1 iters, t-(init.)=5.98 s t(norm)=0.600313, mflops=8.32899 (err=1.1e-13) 30. Ooura (C): elapsed time t=1.22 s, 1 iters, t-(init.)=1.17 s t(norm)=0.117453, mflops=42.5704 (err=1.1e-13) 31. Ooura (F): elapsed time t=1.29 s, 1 iters, t-(init.)=1.25 s t(norm)=0.125483, mflops=39.8459 (err=1.1e-13) 32. Ransom: elapsed time t=1.06 s, 1 iters, t-(init.)=1.01 s t(norm)=0.101391, mflops=49.3142 (err=1.1e-13) 33. 
SCIPORT: elapsed time t=4.82 s, 1 iters, t-(init.)=4.78 s t(norm)=0.479849, mflops=10.4199 (err=1.1e-13) 34. Singleton: elapsed time t=4.34 s, 1 iters, t-(init.)=4.29 s t(norm)=0.430659, mflops=11.6101 (err=1.6e-13) 35. Singleton (f2c): elapsed time t=4 s, 1 iters, t-(init.)=3.96 s t(norm)=0.397532, mflops=12.5776 (err=1.6e-13) 36. Sorensen: elapsed time t=4.36 s, 1 iters, t-(init.)=4.31 s t(norm)=0.432667, mflops=11.5562 (err=1.1e-13) 37. Sorensen DIT: elapsed time t=8.83 s, 1 iters, t-(init.)=8.79 s t(norm)=0.8824, mflops=5.66637 (err=1.1e-13) 38. Temperton: elapsed time t=4.86 s, 1 iters, t-(init.)=4.77 s t(norm)=0.478845, mflops=10.4418 (err=2.1e-07) 39. Temperton (f2c): elapsed time t=5.52 s, 1 iters, t-(init.)=5.48 s t(norm)=0.55012, mflops=9.08893 (err=1.1e-13) 40. Valkenburg: elapsed time t=13.59 s, 1 iters, t-(init.)=13.55 s t(norm)=1.36024, mflops=3.67582 (err=1.1e-13) 41. SCSL: elapsed time t=1.19 s, 2 iters, t-(init.)=1.11 s t(norm)=0.0557147, mflops=89.743 (err=1.1e-13) 42. SGIMATH: elapsed time t=1.16 s, 2 iters, t-(init.)=1.09 s t(norm)=0.0547108, mflops=91.3897 (err=1.1e-13) Top mflops for N=524288 = 91.3897 Normalized results and averages for N=524288: fft 0: mflops = 6.42676 (norm. = 0.0703226), norm. avg. (of 19) = 0.348921 fft 1: mflops = 5.82542 (norm. = 0.0637427), norm. avg. (of 19) = 0.349652 fft 2: mflops = 5.54648 (norm. = 0.0606904), norm. avg. (of 19) = 0.256125 fft 3: mflops = 26.9229 (norm. = 0.294595), norm. avg. (of 19) = 0.178107 fft 4: mflops = 4.22813 (norm. = 0.0462649), norm. avg. (of 19) = 0.19581 fft 5: mflops = 18.1118 (norm. = 0.198182), norm. avg. (of 19) = 0.129801 fft 6: mflops = 18.0461 (norm. = 0.197464), norm. avg. (of 19) = 0.275261 fft 7: mflops = 13.5346 (norm. = 0.148098), norm. avg. (of 19) = 0.202923 fft 8: mflops = 4.80302 (norm. = 0.0525554), norm. avg. (of 19) = 0.205133 fft 9: mflops = 84.4193 (norm. = 0.923729), norm. avg. (of 19) = 0.643315 fft 10: mflops = 79.0593 (norm. = 0.865079), norm. avg. 
(of 19) = 0.657007 fft 11: mflops = 5.72498 (norm. = 0.0626437), norm. avg. (of 18) = 0.219146 fft 12: mflops = 6.23371 (norm. = 0.0682103), norm. avg. (of 19) = 0.338279 fft 13: mflops = 7.62747 (norm. = 0.0834609), norm. avg. (of 19) = 0.129124 fft 14: mflops = 44.4709 (norm. = 0.486607), norm. avg. (of 19) = 0.675469 fft 15: mflops = 27.9817 (norm. = 0.30618), norm. avg. (of 19) = 0.633076 fft 16: mflops = 22.0387 (norm. = 0.24115), norm. avg. (of 19) = 0.617421 fft 17: mflops = 22.1366 (norm. = 0.242222), norm. avg. (of 17) = 0.548242 fft 18: mflops = 8.44193 (norm. = 0.0923729), norm. avg. (of 19) = 0.267761 fft 19: mflops = 9.25787 (norm. = 0.101301), norm. avg. (of 19) = 0.196434 fft 20: mflops = 7.11534 (norm. = 0.0778571), norm. avg. (of 19) = 0.197636 fft 21: mflops = -1 (norm. = -0.0109422), norm. avg. (of 12) = 0.574086 fft 22: mflops = 32.5538 (norm. = 0.356209), norm. avg. (of 18) = 0.312907 fft 23: mflops = 33.2049 (norm. = 0.363333), norm. avg. (of 18) = 0.364169 fft 24: mflops = 19.3803 (norm. = 0.212062), norm. avg. (of 18) = 0.342886 fft 25: mflops = 5.81862 (norm. = 0.0636682), norm. avg. (of 18) = 0.148384 fft 26: mflops = 10.3982 (norm. = 0.113779), norm. avg. (of 19) = 0.143482 fft 27: mflops = 8.11195 (norm. = 0.0887622), norm. avg. (of 19) = 0.144944 fft 28: mflops = 7.14596 (norm. = 0.0781923), norm. avg. (of 19) = 0.200133 fft 29: mflops = 8.32899 (norm. = 0.0911371), norm. avg. (of 19) = 0.189473 fft 30: mflops = 42.5704 (norm. = 0.465812), norm. avg. (of 19) = 0.611289 fft 31: mflops = 39.8459 (norm. = 0.436), norm. avg. (of 19) = 0.545959 fft 32: mflops = 49.3142 (norm. = 0.539604), norm. avg. (of 18) = 0.266537 fft 33: mflops = 10.4199 (norm. = 0.114017), norm. avg. (of 18) = 0.269631 fft 34: mflops = 11.6101 (norm. = 0.12704), norm. avg. (of 19) = 0.336122 fft 35: mflops = 12.5776 (norm. = 0.137626), norm. avg. (of 19) = 0.340079 fft 36: mflops = 11.5562 (norm. = 0.12645), norm. avg. (of 19) = 0.346467 fft 37: mflops = 5.66637 (norm. 
= 0.0620023), norm. avg. (of 19) = 0.195952 fft 38: mflops = 10.4418 (norm. = 0.114256), norm. avg. (of 19) = 0.169247 fft 39: mflops = 9.08893 (norm. = 0.0994526), norm. avg. (of 19) = 0.161449 fft 40: mflops = 3.67582 (norm. = 0.0402214), norm. avg. (of 19) = 0.0368269 fft 41: mflops = 89.743 (norm. = 0.981982), norm. avg. (of 19) = 0.808231 fft 42: mflops = 91.3897 (norm. = 1), norm. avg. (of 19) = 0.817185 Benchmarking for array size = 1048576 (power of 2): 0. Arndt DIF: elapsed time t=17.62 s, 1 iters, t-(init.)=17.54 s t(norm)=0.836372, mflops=5.9782 (err=1.9e-13) 1. Arndt DIT: elapsed time t=17.67 s, 1 iters, t-(init.)=17.54 s t(norm)=0.836372, mflops=5.9782 (err=1.9e-13) 2. Arndt Split-Radix: elapsed time t=21.71 s, 1 iters, t-(init.)=21.63 s t(norm)=1.0314, mflops=4.84779 (err=1.9e-13) 3. Arndt 4-step: elapsed time t=4.06 s, 1 iters, t-(init.)=3.98 s t(norm)=0.189781, mflops=26.3461 (err=1.9e-13) 4. Bailey: elapsed time t=28.99 s, 1 iters, t-(init.)=28.9 s t(norm)=1.37806, mflops=3.62829 (err=1.9e-13) 5. Beauregard: elapsed time t=6.4 s, 1 iters, t-(init.)=6.31 s t(norm)=0.300884, mflops=16.6177 (err=1.9e-13) 6. Bergland: elapsed time t=7.39 s, 1 iters, t-(init.)=7.3 s t(norm)=0.348091, mflops=14.3641 (err=1.9e-13) 7. Brenner: elapsed time t=8.82 s, 1 iters, t-(init.)=8.74 s t(norm)=0.416756, mflops=11.9974 (err=1.9e-13) 8. Burrus: elapsed time t=22.52 s, 1 iters, t-(init.)=22.44 s t(norm)=1.07002, mflops=4.6728 (err=1.9e-13) 9. Skipping fft (this transform size is too big for CWP). 10. Skipping fft (this transform size is too big for CWP). 11. Edelblute: elapsed time t=20.09 s, 1 iters, t-(init.)=19.89 s t(norm)=0.948429, mflops=5.27188 (err=1.9e-13) 12. FFTPACK: elapsed time t=13.87 s, 1 iters, t-(init.)=13.78 s t(norm)=0.657082, mflops=7.6094 (err=1.9e-13) 13. 
FFTPACK (f2c): elapsed time t=12.32 s, 1 iters, t-(init.)=12.24 s t(norm)=0.583649, mflops=8.5668 (err=1.9e-13) FFTW_MEASURE plan: (cost = 3.760000e+00) FFTW_TWIDDLE 32 FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=3.93 s, 1 iters, t-(init.)=3.84 s t(norm)=0.183105, mflops=27.3067 (err=1.9e-13) FFTW_ESTIMATE plan: (cost = 1.195377e+07) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=3.97 s, 1 iters, t-(init.)=3.88 s t(norm)=0.185013, mflops=27.0252 (err=1.9e-13) 16. Frigo-old: elapsed time t=7.99 s, 1 iters, t-(init.)=7.9 s t(norm)=0.376701, mflops=13.2731 (err=1.9e-13) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=13.83 s, 1 iters, t-(init.)=13.74 s t(norm)=0.655174, mflops=7.63156 (err=1.9e-13) 19. GSL DIT: elapsed time t=14.12 s, 1 iters, t-(init.)=14.01 s t(norm)=0.668049, mflops=7.48448 (err=1.9e-13) 20. GSL DIF: elapsed time t=15.92 s, 1 iters, t-(init.)=15.84 s t(norm)=0.75531, mflops=6.6198 (err=1.9e-13) 21. Skipping fft (Krukar can't handle N > 4096). 22. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 23. Mayer (simple): elapsed time t=9.94 s, 1 iters, t-(init.)=9.84 s t(norm)=0.469208, mflops=10.6563 24. Mayer (lookup): elapsed time t=11.08 s, 1 iters, t-(init.)=11 s t(norm)=0.524521, mflops=9.53251 (err=1.9e-13) 25. Monro: elapsed time t=21.11 s, 1 iters, t-(init.)=21.02 s t(norm)=1.00231, mflops=4.98847 (err=2.0e-07) 26. NAPACK (f2c): elapsed time t=13.14 s, 1 iters, t-(init.)=13.02 s t(norm)=0.620842, mflops=8.05358 (err=1.5e-11) 27. Nielsen: elapsed time t=12.21 s, 1 iters, t-(init.)=12.13 s t(norm)=0.578403, mflops=8.64448 (err=8.1e-12) 28. NR (C): elapsed time t=16.4 s, 1 iters, t-(init.)=16.32 s t(norm)=0.778198, mflops=6.4251 (err=1.9e-13) 29. NR (F): elapsed time t=14.57 s, 1 iters, t-(init.)=14.47 s t(norm)=0.689983, mflops=7.24655 (err=1.9e-13) 30. 
Ooura (C): elapsed time t=4.65 s, 1 iters, t-(init.)=4.56 s t(norm)=0.217438, mflops=22.9951 (err=1.9e-13) 31. Ooura (F): elapsed time t=4.94 s, 1 iters, t-(init.)=4.84 s t(norm)=0.230789, mflops=21.6648 (err=1.9e-13) 32. Ransom: elapsed time t=1.81 s, 1 iters, t-(init.)=1.73 s t(norm)=0.0824928, mflops=60.6113 (err=1.9e-13) 33. SCIPORT: elapsed time t=11.95 s, 1 iters, t-(init.)=11.87 s t(norm)=0.566006, mflops=8.83383 (err=1.9e-13) 34. Singleton: elapsed time t=8.59 s, 1 iters, t-(init.)=8.51 s t(norm)=0.405788, mflops=12.3217 (err=2.6e-13) 35. Singleton (f2c): elapsed time t=8.74 s, 1 iters, t-(init.)=8.66 s t(norm)=0.412941, mflops=12.1083 (err=2.6e-13) 36. Sorensen: elapsed time t=10.36 s, 1 iters, t-(init.)=10.27 s t(norm)=0.489712, mflops=10.2101 (err=1.9e-13) 37. Sorensen DIT: elapsed time t=21.24 s, 1 iters, t-(init.)=21.16 s t(norm)=1.00899, mflops=4.95546 (err=1.9e-13) 38. Temperton: elapsed time t=10.87 s, 1 iters, t-(init.)=10.78 s t(norm)=0.51403, mflops=9.72705 (err=2.3e-07) 39. Temperton (f2c): elapsed time t=13.25 s, 1 iters, t-(init.)=13.14 s t(norm)=0.626564, mflops=7.98003 (err=1.9e-13) 40. Valkenburg: elapsed time t=29.81 s, 1 iters, t-(init.)=29.73 s t(norm)=1.41764, mflops=3.527 (err=1.9e-13) 41. SCSL: elapsed time t=1.19 s, 1 iters, t-(init.)=1.1 s t(norm)=0.0524521, mflops=95.3251 (err=1.9e-13) 42. SGIMATH: elapsed time t=1.37 s, 1 iters, t-(init.)=1.29 s t(norm)=0.061512, mflops=81.285 (err=1.9e-13) Top mflops for N=1048576 = 95.3251 Normalized results and averages for N=1048576: fft 0: mflops = 5.9782 (norm. = 0.0627138), norm. avg. (of 20) = 0.334611 fft 1: mflops = 5.9782 (norm. = 0.0627138), norm. avg. (of 20) = 0.335305 fft 2: mflops = 4.84779 (norm. = 0.0508553), norm. avg. (of 20) = 0.245861 fft 3: mflops = 26.3461 (norm. = 0.276382), norm. avg. (of 20) = 0.18302 fft 4: mflops = 3.62829 (norm. = 0.0380623), norm. avg. (of 20) = 0.187922 fft 5: mflops = 16.6177 (norm. = 0.174326), norm. avg. 
(of 20) = 0.132027 fft 6: mflops = 14.3641 (norm. = 0.150685), norm. avg. (of 20) = 0.269032 fft 7: mflops = 11.9974 (norm. = 0.125858), norm. avg. (of 20) = 0.199069 fft 8: mflops = 4.6728 (norm. = 0.0490196), norm. avg. (of 20) = 0.197328 fft 9: mflops = -1 (norm. = -0.0104904), norm. avg. (of 19) = 0.643315 fft 10: mflops = -1 (norm. = -0.0104904), norm. avg. (of 19) = 0.657007 fft 11: mflops = 5.27188 (norm. = 0.0553042), norm. avg. (of 19) = 0.210523 fft 12: mflops = 7.6094 (norm. = 0.0798258), norm. avg. (of 20) = 0.325356 fft 13: mflops = 8.5668 (norm. = 0.0898693), norm. avg. (of 20) = 0.127161 fft 14: mflops = 27.3067 (norm. = 0.286458), norm. avg. (of 20) = 0.656019 fft 15: mflops = 27.0252 (norm. = 0.283505), norm. avg. (of 20) = 0.615598 fft 16: mflops = 13.2731 (norm. = 0.139241), norm. avg. (of 20) = 0.593512 fft 17: mflops = -1 (norm. = -0.0104904), norm. avg. (of 17) = 0.548242 fft 18: mflops = 7.63156 (norm. = 0.0800582), norm. avg. (of 20) = 0.258376 fft 19: mflops = 7.48448 (norm. = 0.0785153), norm. avg. (of 20) = 0.190538 fft 20: mflops = 6.6198 (norm. = 0.0694444), norm. avg. (of 20) = 0.191226 fft 21: mflops = -1 (norm. = -0.0104904), norm. avg. (of 12) = 0.574086 fft 22: mflops = -1 (norm. = -0.0104904), norm. avg. (of 18) = 0.312907 fft 23: mflops = 10.6563 (norm. = 0.111789), norm. avg. (of 19) = 0.350885 fft 24: mflops = 9.53251 (norm. = 0.1), norm. avg. (of 19) = 0.330103 fft 25: mflops = 4.98847 (norm. = 0.0523311), norm. avg. (of 19) = 0.143328 fft 26: mflops = 8.05358 (norm. = 0.0844854), norm. avg. (of 20) = 0.140532 fft 27: mflops = 8.64448 (norm. = 0.0906843), norm. avg. (of 20) = 0.142231 fft 28: mflops = 6.4251 (norm. = 0.067402), norm. avg. (of 20) = 0.193496 fft 29: mflops = 7.24655 (norm. = 0.0760194), norm. avg. (of 20) = 0.183801 fft 30: mflops = 22.9951 (norm. = 0.241228), norm. avg. (of 20) = 0.592786 fft 31: mflops = 21.6648 (norm. = 0.227273), norm. avg. (of 20) = 0.530025 fft 32: mflops = 60.6113 (norm. 
= 0.635838), norm. avg. (of 19) = 0.285974 fft 33: mflops = 8.83383 (norm. = 0.0926706), norm. avg. (of 19) = 0.260317 fft 34: mflops = 12.3217 (norm. = 0.12926), norm. avg. (of 20) = 0.325778 fft 35: mflops = 12.1083 (norm. = 0.127021), norm. avg. (of 20) = 0.329426 fft 36: mflops = 10.2101 (norm. = 0.107108), norm. avg. (of 20) = 0.334499 fft 37: mflops = 4.95546 (norm. = 0.0519849), norm. avg. (of 20) = 0.188754 fft 38: mflops = 9.72705 (norm. = 0.102041), norm. avg. (of 20) = 0.165887 fft 39: mflops = 7.98003 (norm. = 0.0837139), norm. avg. (of 20) = 0.157562 fft 40: mflops = 3.527 (norm. = 0.0369997), norm. avg. (of 20) = 0.0368356 fft 41: mflops = 95.3251 (norm. = 1), norm. avg. (of 20) = 0.817819 fft 42: mflops = 81.285 (norm. = 0.852713), norm. avg. (of 20) = 0.818962 Benchmarking for array size = 2097152 (power of 2): 0. Arndt DIF: elapsed time t=35.22 s, 1 iters, t-(init.)=35.05 s t(norm)=0.795864, mflops=6.28248 (err=2.7e-13) 1. Arndt DIT: elapsed time t=35.14 s, 1 iters, t-(init.)=34.97 s t(norm)=0.794047, mflops=6.29685 (err=2.7e-13) 2. Arndt Split-Radix: elapsed time t=46.29 s, 1 iters, t-(init.)=46.1 s t(norm)=1.04677, mflops=4.77659 (err=2.7e-13) 3. Arndt 4-step: elapsed time t=13.06 s, 1 iters, t-(init.)=12.88 s t(norm)=0.29246, mflops=17.0963 (err=2.7e-13) 4. Bailey: elapsed time t=63.09 s, 1 iters, t-(init.)=62.91 s t(norm)=1.42847, mflops=3.50025 (err=2.7e-13) 5. Beauregard: elapsed time t=14.23 s, 1 iters, t-(init.)=14.06 s t(norm)=0.319254, mflops=15.6615 (err=2.7e-13) 6. Skipping fft (Bergland doesn't work for N > 2^20). 7. Brenner: elapsed time t=22.48 s, 1 iters, t-(init.)=22.31 s t(norm)=0.506583, mflops=9.87006 (err=2.7e-13) 8. Burrus: elapsed time t=47.2 s, 1 iters, t-(init.)=47.03 s t(norm)=1.06789, mflops=4.68214 (err=2.7e-13) 9. Skipping fft (this transform size is too big for CWP). 10. Skipping fft (this transform size is too big for CWP). 11. 
Edelblute: elapsed time t=43.72 s, 1 iters, t-(init.)=43.55 s t(norm)=0.988869, mflops=5.05628 (err=2.7e-13) 12. FFTPACK: elapsed time t=35.51 s, 1 iters, t-(init.)=35.24 s t(norm)=0.800178, mflops=6.24861 (err=2.7e-13) 13. FFTPACK (f2c): elapsed time t=33.02 s, 1 iters, t-(init.)=32.85 s t(norm)=0.74591, mflops=6.70323 (err=2.7e-13) FFTW_MEASURE plan: (cost = 7.940000e+00) FFTW_TWIDDLE 32 FFTW_TWIDDLE 64 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 14. FFTW: elapsed time t=8.49 s, 1 iters, t-(init.)=8.32 s t(norm)=0.188918, mflops=26.4665 (err=2.7e-13) FFTW_ESTIMATE plan: (cost = 2.390753e+07) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=9.26 s, 1 iters, t-(init.)=9.09 s t(norm)=0.206402, mflops=24.2245 (err=2.7e-13) 16. Frigo-old: elapsed time t=14.63 s, 1 iters, t-(init.)=14.46 s t(norm)=0.328336, mflops=15.2283 (err=2.7e-13) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=29.02 s, 1 iters, t-(init.)=28.85 s t(norm)=0.655083, mflops=7.63262 (err=2.7e-13) 19. GSL DIT: elapsed time t=33.22 s, 1 iters, t-(init.)=33.05 s t(norm)=0.750451, mflops=6.66266 (err=2.7e-13) 20. GSL DIF: elapsed time t=33.88 s, 1 iters, t-(init.)=33.71 s t(norm)=0.765437, mflops=6.53221 (err=2.7e-13) 21. Skipping fft (Krukar can't handle N > 4096). 22. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 23. Mayer (simple): elapsed time t=27.29 s, 1 iters, t-(init.)=27.12 s t(norm)=0.615801, mflops=8.1195 24. Mayer (lookup): elapsed time t=28.97 s, 1 iters, t-(init.)=28.79 s t(norm)=0.653721, mflops=7.64852 (err=2.7e-13) 25. Skipping fft (Monro can't handle N > 2^20). 26. NAPACK (f2c): elapsed time t=28.6 s, 1 iters, t-(init.)=28.41 s t(norm)=0.645093, mflops=7.75083 (err=1.9e-11) 27. Nielsen: elapsed time t=25.72 s, 1 iters, t-(init.)=25.55 s t(norm)=0.580152, mflops=8.61843 (err=6.3e-12) 28. NR (C): elapsed time t=34.25 s, 1 iters, t-(init.)=34.04 s t(norm)=0.77293, mflops=6.46889 (err=2.7e-13) 29. 
NR (F): elapsed time t=35.55 s, 1 iters, t-(init.)=35.38 s t(norm)=0.803357, mflops=6.22388 (err=2.7e-13) 30. Ooura (C): elapsed time t=9.21 s, 1 iters, t-(init.)=9.04 s t(norm)=0.205267, mflops=24.3585 (err=2.7e-13) 31. Ooura (F): elapsed time t=9.52 s, 1 iters, t-(init.)=9.33 s t(norm)=0.211852, mflops=23.6014 (err=2.7e-13) 32. Ransom: elapsed time t=5.34 s, 1 iters, t-(init.)=5.08 s t(norm)=0.115349, mflops=43.3466 (err=2.7e-13) 33. SCIPORT: elapsed time t=26.75 s, 1 iters, t-(init.)=26.58 s t(norm)=0.60354, mflops=8.28446 (err=2.7e-13) 34. Singleton: elapsed time t=24.61 s, 1 iters, t-(init.)=24.44 s t(norm)=0.554948, mflops=9.00986 (err=3.7e-13) 35. Singleton (f2c): elapsed time t=22.09 s, 1 iters, t-(init.)=21.91 s t(norm)=0.4975, mflops=10.0502 (err=3.7e-13) 36. Sorensen: elapsed time t=27.25 s, 1 iters, t-(init.)=27.03 s t(norm)=0.613758, mflops=8.14654 (err=2.7e-13) 37. Sorensen DIT: elapsed time t=52.27 s, 1 iters, t-(init.)=52.1 s t(norm)=1.18301, mflops=4.22651 (err=2.7e-13) 38. Temperton: elapsed time t=28.14 s, 1 iters, t-(init.)=27.88 s t(norm)=0.633058, mflops=7.89817 (err=2.4e-07) 39. Temperton (f2c): elapsed time t=29.28 s, 1 iters, t-(init.)=29.11 s t(norm)=0.660987, mflops=7.56444 (err=2.7e-13) 40. Valkenburg: elapsed time t=63.52 s, 1 iters, t-(init.)=63.35 s t(norm)=1.43846, mflops=3.47594 (err=2.7e-13) 41. SCSL: elapsed time t=2.52 s, 1 iters, t-(init.)=2.34 s t(norm)=0.0531333, mflops=94.103 (err=2.7e-13) 42. SGIMATH: elapsed time t=2.85 s, 1 iters, t-(init.)=2.68 s t(norm)=0.0608535, mflops=82.1645 (err=2.7e-13) Top mflops for N=2097152 = 94.103 Normalized results and averages for N=2097152: fft 0: mflops = 6.28248 (norm. = 0.0667618), norm. avg. (of 21) = 0.321856 fft 1: mflops = 6.29685 (norm. = 0.0669145), norm. avg. (of 21) = 0.322525 fft 2: mflops = 4.77659 (norm. = 0.0507592), norm. avg. (of 21) = 0.236571 fft 3: mflops = 17.0963 (norm. = 0.181677), norm. avg. (of 21) = 0.182956 fft 4: mflops = 3.50025 (norm. = 0.037196), norm. avg. 
(of 21) = 0.180745 fft 5: mflops = 15.6615 (norm. = 0.16643), norm. avg. (of 21) = 0.133665 fft 6: mflops = -1 (norm. = -0.0106267), norm. avg. (of 20) = 0.269032 fft 7: mflops = 9.87006 (norm. = 0.104886), norm. avg. (of 21) = 0.194585 fft 8: mflops = 4.68214 (norm. = 0.0497555), norm. avg. (of 21) = 0.1903 fft 9: mflops = -1 (norm. = -0.0106267), norm. avg. (of 19) = 0.643315 fft 10: mflops = -1 (norm. = -0.0106267), norm. avg. (of 19) = 0.657007 fft 11: mflops = 5.05628 (norm. = 0.0537313), norm. avg. (of 20) = 0.202683 fft 12: mflops = 6.24861 (norm. = 0.0664018), norm. avg. (of 21) = 0.313025 fft 13: mflops = 6.70323 (norm. = 0.0712329), norm. avg. (of 21) = 0.124498 fft 14: mflops = 26.4665 (norm. = 0.28125), norm. avg. (of 21) = 0.638173 fft 15: mflops = 24.2245 (norm. = 0.257426), norm. avg. (of 21) = 0.598542 fft 16: mflops = 15.2283 (norm. = 0.161826), norm. avg. (of 21) = 0.572956 fft 17: mflops = -1 (norm. = -0.0106267), norm. avg. (of 17) = 0.548242 fft 18: mflops = 7.63262 (norm. = 0.0811092), norm. avg. (of 21) = 0.249935 fft 19: mflops = 6.66266 (norm. = 0.0708018), norm. avg. (of 21) = 0.184837 fft 20: mflops = 6.53221 (norm. = 0.0694156), norm. avg. (of 21) = 0.185426 fft 21: mflops = -1 (norm. = -0.0106267), norm. avg. (of 12) = 0.574086 fft 22: mflops = -1 (norm. = -0.0106267), norm. avg. (of 18) = 0.312907 fft 23: mflops = 8.1195 (norm. = 0.0862832), norm. avg. (of 20) = 0.337655 fft 24: mflops = 7.64852 (norm. = 0.0812782), norm. avg. (of 20) = 0.317661 fft 25: mflops = -1 (norm. = -0.0106267), norm. avg. (of 19) = 0.143328 fft 26: mflops = 7.75083 (norm. = 0.0823654), norm. avg. (of 21) = 0.137762 fft 27: mflops = 8.61843 (norm. = 0.0915851), norm. avg. (of 21) = 0.13982 fft 28: mflops = 6.46889 (norm. = 0.0687427), norm. avg. (of 21) = 0.187556 fft 29: mflops = 6.22388 (norm. = 0.0661391), norm. avg. (of 21) = 0.178198 fft 30: mflops = 24.3585 (norm. = 0.25885), norm. avg. (of 21) = 0.576884 fft 31: mflops = 23.6014 (norm. = 0.250804), norm. 
avg. (of 21) = 0.516728
fft 32: mflops = 43.3466 (norm. = 0.46063), norm. avg. (of 20) = 0.294707
fft 33: mflops = 8.28446 (norm. = 0.0880361), norm. avg. (of 20) = 0.251703
fft 34: mflops = 9.00986 (norm. = 0.0957447), norm. avg. (of 21) = 0.314825
fft 35: mflops = 10.0502 (norm. = 0.106801), norm. avg. (of 21) = 0.318825
fft 36: mflops = 8.14654 (norm. = 0.0865705), norm. avg. (of 21) = 0.322693
fft 37: mflops = 4.22651 (norm. = 0.0449136), norm. avg. (of 21) = 0.181904
fft 38: mflops = 7.89817 (norm. = 0.0839311), norm. avg. (of 21) = 0.161984
fft 39: mflops = 7.56444 (norm. = 0.0803847), norm. avg. (of 21) = 0.153887
fft 40: mflops = 3.47594 (norm. = 0.0369376), norm. avg. (of 21) = 0.0368404
fft 41: mflops = 94.103 (norm. = 1), norm. avg. (of 21) = 0.826495
fft 42: mflops = 82.1645 (norm. = 0.873134), norm. avg. (of 21) = 0.821541
------------------------------------------------------
@@@@ bench.1d.np2.log
FFT Benchmark Program by M. Frigo and S. G. Johnson.
email: fftw@theory.lcs.mit.edu
www: http://theory.lcs.mit.edu/~fftw
Using FFTW V1.1 ($Id: executor.c,v 1.34 1997/04/30 13:15:56 fftw Exp $)
Maximum memory to use: 200 MB
Factors to allow: anything but 2
Using double precision.
Measuring speed of 1D transforms:
Benchmarking for sizes:
6 (0.000686646 MB)
9 (0.000915527 MB)
12 (0.00114441 MB)
15 (0.00137329 MB)
18 (0.00180054 MB)
24 (0.0022583 MB)
36 (0.0032959 MB)
80 (0.00738525 MB)
108 (0.00994873 MB)
210 (0.0192261 MB)
504 (0.0461426 MB)
1000 (0.0916748 MB)
1960 (0.179749 MB)
4725 (0.437393 MB)
10368 (0.960205 MB)
27000 (2.48291 MB)
75600 (6.98975 MB)
165375 (15.3664 MB)
362880 (38.6829 MB)
Maximum array size = 720720
Benchmarking FFTs:
0. Brenner
1. CWP (min N)
2. CWP (best N)
3. FFTPACK
4. FFTPACK (f2c)
5. FFTW
6. FFTW_ESTIMATE
7. Frigo-old
8. GSL
9. NAPACK (f2c)
10. Nielsen
11. Singleton
12. Singleton (f2c)
13. Temperton
14. Temperton (f2c)
15. Valkenburg
16. SCSL
17. SGIMATH
Computing normalized averages (18 transforms).
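Note on the derived columns: the t(norm), mflops, norm., and norm. avg. values printed in this log are consistent with the arithmetic sketched below. This is inferred from the printed numbers, not taken from the benchmark program's source, so treat it as an assumption. The function names are hypothetical and exist only for illustration. The sketch assumes t(norm) is microseconds per transform divided by N*log2(N), mflops is 5/t(norm), norm. is mflops divided by the top mflops for that size, and norm. avg. is a running mean over the sizes a transform actually ran (entries with mflops = -1 are skipped and leave the average unchanged).

```python
import math


def fft_mflops(n, seconds, iters):
    """Return (t_norm, mflops) for `iters` transforms of size `n`.

    `seconds` is the measured time with initialization subtracted,
    i.e. the "t-(init.)" column of the log (assumed interpretation).
    """
    usec_per_transform = seconds / iters * 1e6
    t_norm = usec_per_transform / (n * math.log2(n))
    # 5*N*log2(N) is the conventional flop-count estimate for a complex FFT.
    return t_norm, 5.0 / t_norm


def normalize(mflops_by_fft):
    """Divide every mflops value by the best one for this array size."""
    top = max(mflops_by_fft.values())
    return {name: m / top for name, m in mflops_by_fft.items()}


def update_running_avg(prev_avg, prev_count, norm):
    """Fold one normalized score into the running mean.

    A negative score marks a skipped transform (mflops = -1 in the log),
    which leaves both the average and the count unchanged.
    """
    if norm < 0:
        return prev_avg, prev_count
    count = prev_count + 1
    return (prev_avg * prev_count + norm) / count, count


if __name__ == "__main__":
    # FFTW at N=6 in this log: t-(init.)=1.28 s over 2097152 iterations.
    print(fft_mflops(6, 1.28, 2097152))
    # CWP (min N): norm. 0.132231 at N=6, then 0.242347 at N=9.
    print(update_running_avg(0.132231, 1, 0.242347))
```

Running the same arithmetic against other rows of the log is a quick way to check a transcription, since the printed times are rounded to 0.01 s and small differences in the last digit are expected.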
Benchmarking for array size = 6: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.24 s, 262144 iters, t-(init.)=1.21 s t(norm)=0.297605, mflops=16.8008 2. CWP (best N) (N=15): elapsed time t=1.6 s, 262144 iters, t-(init.)=1.55 s t(norm)=0.381229, mflops=13.1155 3. FFTPACK: elapsed time t=1.2 s, 524288 iters, t-(init.)=1.15 s t(norm)=0.141424, mflops=35.3547 (err=1.4e-16) 4. FFTPACK (f2c): elapsed time t=1.12 s, 262144 iters, t-(init.)=1.09 s t(norm)=0.26809, mflops=18.6504 (err=1.8e-16) FFTW_MEASURE plan: (cost = 7.057190e-07) FFTW_NOTW 6 5. FFTW: elapsed time t=1.48 s, 2097152 iters, t-(init.)=1.28 s t(norm)=0.0393527, mflops=127.056 (err=1.1e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 6. FFTW_ESTIMATE: elapsed time t=1.57 s, 2097152 iters, t-(init.)=1.37 s t(norm)=0.0421197, mflops=118.709 (err=1.1e-16) 7. Frigo-old: elapsed time t=1.08 s, 262144 iters, t-(init.)=1.06 s t(norm)=0.260712, mflops=19.1783 (err=3.3e-16) 8. GSL: elapsed time t=1.28 s, 524288 iters, t-(init.)=1.22 s t(norm)=0.150032, mflops=33.3262 (err=1.2e-16) 9. NAPACK (f2c): elapsed time t=1.18 s, 262144 iters, t-(init.)=1.16 s t(norm)=0.285307, mflops=17.525 (err=4.7e-16) 10. Nielsen: elapsed time t=1.36 s, 131072 iters, t-(init.)=1.34 s t(norm)=0.659158, mflops=7.58544 (err=2.7e-16) 11. Singleton: elapsed time t=1.33 s, 262144 iters, t-(init.)=1.3 s t(norm)=0.319741, mflops=15.6377 (err=1.0e-16) 12. Singleton (f2c): elapsed time t=1.37 s, 262144 iters, t-(init.)=1.34 s t(norm)=0.329579, mflops=15.1709 (err=1.0e-16) 13. Temperton: elapsed time t=1.53 s, 262144 iters, t-(init.)=1.51 s t(norm)=0.371391, mflops=13.4629 (err=3.7e-16) 14. Temperton (f2c): elapsed time t=1.18 s, 131072 iters, t-(init.)=1.17 s t(norm)=0.575533, mflops=8.68759 (err=1.0e-16) 15. Valkenburg: elapsed time t=1.4 s, 131072 iters, t-(init.)=1.39 s t(norm)=0.683753, mflops=7.31258 (err=3.0e-16) 16. 
SCSL: elapsed time t=1.13 s, 524288 iters, t-(init.)=1.08 s t(norm)=0.132815, mflops=37.6462 (err=1.8e-16) 17. SGIMATH: elapsed time t=1.03 s, 524288 iters, t-(init.)=0.98 s t(norm)=0.120518, mflops=41.4877 (err=1.8e-16) Top mflops for N=6 = 127.056 Normalized results and averages for N=6: fft 0: mflops = -1 (norm. = -0.00787054), norm. avg. (of 0) = -1 fft 1: mflops = 16.8008 (norm. = 0.132231), norm. avg. (of 1) = 0.132231 fft 2: mflops = 13.1155 (norm. = 0.103226), norm. avg. (of 1) = 0.103226 fft 3: mflops = 35.3547 (norm. = 0.278261), norm. avg. (of 1) = 0.278261 fft 4: mflops = 18.6504 (norm. = 0.146789), norm. avg. (of 1) = 0.146789 fft 5: mflops = 127.056 (norm. = 1), norm. avg. (of 1) = 1 fft 6: mflops = 118.709 (norm. = 0.934307), norm. avg. (of 1) = 0.934307 fft 7: mflops = 19.1783 (norm. = 0.150943), norm. avg. (of 1) = 0.150943 fft 8: mflops = 33.3262 (norm. = 0.262295), norm. avg. (of 1) = 0.262295 fft 9: mflops = 17.525 (norm. = 0.137931), norm. avg. (of 1) = 0.137931 fft 10: mflops = 7.58544 (norm. = 0.0597015), norm. avg. (of 1) = 0.0597015 fft 11: mflops = 15.6377 (norm. = 0.123077), norm. avg. (of 1) = 0.123077 fft 12: mflops = 15.1709 (norm. = 0.119403), norm. avg. (of 1) = 0.119403 fft 13: mflops = 13.4629 (norm. = 0.10596), norm. avg. (of 1) = 0.10596 fft 14: mflops = 8.68759 (norm. = 0.0683761), norm. avg. (of 1) = 0.0683761 fft 15: mflops = 7.31258 (norm. = 0.057554), norm. avg. (of 1) = 0.057554 fft 16: mflops = 37.6462 (norm. = 0.296296), norm. avg. (of 1) = 0.296296 fft 17: mflops = 41.4877 (norm. = 0.326531), norm. avg. (of 1) = 0.326531 Benchmarking for array size = 9: 0. Brenner: elapsed time t=1.54 s, 65536 iters, t-(init.)=1.53 s t(norm)=0.818314, mflops=6.11012 (err=3.6e-16) 1. CWP (min N): elapsed time t=1 s, 262144 iters, t-(init.)=0.98 s t(norm)=0.131037, mflops=38.1571 2. CWP (best N) (N=15): elapsed time t=1.59 s, 262144 iters, t-(init.)=1.54 s t(norm)=0.205916, mflops=24.2818 3. 
FFTPACK: elapsed time t=1.35 s, 524288 iters, t-(init.)=1.3 s t(norm)=0.0869124, mflops=57.5292 (err=1.5e-16) 4. FFTPACK (f2c): elapsed time t=1.6 s, 262144 iters, t-(init.)=1.57 s t(norm)=0.209927, mflops=23.8178 (err=2.5e-16) FFTW_MEASURE plan: (cost = 9.918213e-07) FFTW_NOTW 9 5. FFTW: elapsed time t=1.09 s, 1048576 iters, t-(init.)=0.98 s t(norm)=0.0327593, mflops=152.628 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.06 s, 1048576 iters, t-(init.)=0.95 s t(norm)=0.0317565, mflops=157.448 (err=1.4e-16) 7. Frigo-old: elapsed time t=1.09 s, 131072 iters, t-(init.)=1.07 s t(norm)=0.286142, mflops=17.4738 (err=3.5e-16) 8. GSL: elapsed time t=1.92 s, 524288 iters, t-(init.)=1.87 s t(norm)=0.12502, mflops=39.9935 (err=1.5e-16) 9. NAPACK (f2c): elapsed time t=1.54 s, 262144 iters, t-(init.)=1.51 s t(norm)=0.201904, mflops=24.7642 (err=6.2e-16) 10. Nielsen: elapsed time t=1.58 s, 131072 iters, t-(init.)=1.57 s t(norm)=0.419854, mflops=11.9089 (err=4.7e-16) 11. Singleton: elapsed time t=1.37 s, 262144 iters, t-(init.)=1.34 s t(norm)=0.179173, mflops=27.9059 (err=1.5e-16) 12. Singleton (f2c): elapsed time t=1.36 s, 262144 iters, t-(init.)=1.33 s t(norm)=0.177836, mflops=28.1158 (err=1.5e-16) 13. Temperton: elapsed time t=1.69 s, 262144 iters, t-(init.)=1.66 s t(norm)=0.221961, mflops=22.5265 (err=1.1e-08) 14. Temperton (f2c): elapsed time t=1.62 s, 131072 iters, t-(init.)=1.61 s t(norm)=0.430551, mflops=11.613 (err=1.5e-16) 15. Valkenburg: elapsed time t=1.25 s, 65536 iters, t-(init.)=1.24 s t(norm)=0.663209, mflops=7.5391 (err=3.1e-16) 16. SCSL: elapsed time t=1.32 s, 524288 iters, t-(init.)=1.27 s t(norm)=0.0849068, mflops=58.8881 (err=2.5e-16) 17. SGIMATH: elapsed time t=1.35 s, 524288 iters, t-(init.)=1.3 s t(norm)=0.0869124, mflops=57.5292 (err=2.5e-16) Top mflops for N=9 = 157.448 Normalized results and averages for N=9: fft 0: mflops = 6.11012 (norm. = 0.0388072), norm. avg. 
(of 1) = 0.0388072 fft 1: mflops = 38.1571 (norm. = 0.242347), norm. avg. (of 2) = 0.187289 fft 2: mflops = 24.2818 (norm. = 0.154221), norm. avg. (of 2) = 0.128723 fft 3: mflops = 57.5292 (norm. = 0.365385), norm. avg. (of 2) = 0.321823 fft 4: mflops = 23.8178 (norm. = 0.151274), norm. avg. (of 2) = 0.149031 fft 5: mflops = 152.628 (norm. = 0.969388), norm. avg. (of 2) = 0.984694 fft 6: mflops = 157.448 (norm. = 1), norm. avg. (of 2) = 0.967153 fft 7: mflops = 17.4738 (norm. = 0.110981), norm. avg. (of 2) = 0.130962 fft 8: mflops = 39.9935 (norm. = 0.254011), norm. avg. (of 2) = 0.258153 fft 9: mflops = 24.7642 (norm. = 0.157285), norm. avg. (of 2) = 0.147608 fft 10: mflops = 11.9089 (norm. = 0.0756369), norm. avg. (of 2) = 0.0676692 fft 11: mflops = 27.9059 (norm. = 0.177239), norm. avg. (of 2) = 0.150158 fft 12: mflops = 28.1158 (norm. = 0.178571), norm. avg. (of 2) = 0.148987 fft 13: mflops = 22.5265 (norm. = 0.143072), norm. avg. (of 2) = 0.124516 fft 14: mflops = 11.613 (norm. = 0.0737578), norm. avg. (of 2) = 0.0710669 fft 15: mflops = 7.5391 (norm. = 0.0478831), norm. avg. (of 2) = 0.0527185 fft 16: mflops = 58.8881 (norm. = 0.374016), norm. avg. (of 2) = 0.335156 fft 17: mflops = 57.5292 (norm. = 0.365385), norm. avg. (of 2) = 0.345958 Benchmarking for array size = 12: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.47 s, 262144 iters, t-(init.)=1.43 s t(norm)=0.126803, mflops=39.4312 2. CWP (best N) (N=15): elapsed time t=1.6 s, 262144 iters, t-(init.)=1.55 s t(norm)=0.137444, mflops=36.3784 3. FFTPACK: elapsed time t=1.5 s, 524288 iters, t-(init.)=1.42 s t(norm)=0.0629582, mflops=79.4177 (err=1.7e-16) 4. FFTPACK (f2c): elapsed time t=1.98 s, 262144 iters, t-(init.)=1.93 s t(norm)=0.17114, mflops=29.2158 (err=2.2e-16) FFTW_MEASURE plan: (cost = 1.182556e-06) FFTW_NOTW 12 5. 
FFTW: elapsed time t=1.18 s, 1048576 iters, t-(init.)=1.03 s t(norm)=0.0228334, mflops=218.977 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.26 s, 1048576 iters, t-(init.)=1.11 s t(norm)=0.0246069, mflops=203.195 (err=1.2e-16) 7. Frigo-old: elapsed time t=1 s, 131072 iters, t-(init.)=0.98 s t(norm)=0.1738, mflops=28.7687 (err=2.8e-16) 8. GSL: elapsed time t=1.96 s, 524288 iters, t-(init.)=1.89 s t(norm)=0.0837965, mflops=59.6683 (err=1.6e-16) 9. NAPACK (f2c): elapsed time t=1.1 s, 131072 iters, t-(init.)=1.08 s t(norm)=0.191535, mflops=26.1049 (err=5.5e-16) 10. Nielsen: elapsed time t=1.86 s, 131072 iters, t-(init.)=1.84 s t(norm)=0.326319, mflops=15.3224 (err=5.0e-16) 11. Singleton: elapsed time t=1.82 s, 262144 iters, t-(init.)=1.78 s t(norm)=0.157839, mflops=31.6779 (err=1.5e-16) 12. Singleton (f2c): elapsed time t=1.84 s, 262144 iters, t-(init.)=1.8 s t(norm)=0.159612, mflops=31.3259 (err=1.5e-16) 13. Temperton: elapsed time t=1.89 s, 262144 iters, t-(init.)=1.85 s t(norm)=0.164046, mflops=30.4792 (err=5.4e-16) 14. Temperton (f2c): elapsed time t=1.68 s, 131072 iters, t-(init.)=1.66 s t(norm)=0.294396, mflops=16.9839 (err=1.4e-16) 15. Valkenburg: elapsed time t=1.9 s, 65536 iters, t-(init.)=1.89 s t(norm)=0.670372, mflops=7.45854 (err=2.6e-16) 16. SCSL: elapsed time t=1.43 s, 524288 iters, t-(init.)=1.36 s t(norm)=0.060298, mflops=82.9214 (err=1.5e-16) 17. SGIMATH: elapsed time t=1.39 s, 524288 iters, t-(init.)=1.31 s t(norm)=0.0580812, mflops=86.0864 (err=1.5e-16) Top mflops for N=12 = 218.977 Normalized results and averages for N=12: fft 0: mflops = -1 (norm. = -0.00456669), norm. avg. (of 1) = 0.0388072 fft 1: mflops = 39.4312 (norm. = 0.18007), norm. avg. (of 3) = 0.184883 fft 2: mflops = 36.3784 (norm. = 0.166129), norm. avg. (of 3) = 0.141192 fft 3: mflops = 79.4177 (norm. = 0.362676), norm. avg. (of 3) = 0.335441 fft 4: mflops = 29.2158 (norm. = 0.13342), norm. avg. 
(of 3) = 0.143828 fft 5: mflops = 218.977 (norm. = 1), norm. avg. (of 3) = 0.989796 fft 6: mflops = 203.195 (norm. = 0.927928), norm. avg. (of 3) = 0.954078 fft 7: mflops = 28.7687 (norm. = 0.131378), norm. avg. (of 3) = 0.131101 fft 8: mflops = 59.6683 (norm. = 0.272487), norm. avg. (of 3) = 0.262931 fft 9: mflops = 26.1049 (norm. = 0.119213), norm. avg. (of 3) = 0.138143 fft 10: mflops = 15.3224 (norm. = 0.0699728), norm. avg. (of 3) = 0.0684371 fft 11: mflops = 31.6779 (norm. = 0.144663), norm. avg. (of 3) = 0.148326 fft 12: mflops = 31.3259 (norm. = 0.143056), norm. avg. (of 3) = 0.14701 fft 13: mflops = 30.4792 (norm. = 0.139189), norm. avg. (of 3) = 0.129407 fft 14: mflops = 16.9839 (norm. = 0.0775602), norm. avg. (of 3) = 0.0732314 fft 15: mflops = 7.45854 (norm. = 0.0340608), norm. avg. (of 3) = 0.0464993 fft 16: mflops = 82.9214 (norm. = 0.378676), norm. avg. (of 3) = 0.349663 fft 17: mflops = 86.0864 (norm. = 0.39313), norm. avg. (of 3) = 0.361682 Benchmarking for array size = 15: 0. Brenner: elapsed time t=1.04 s, 32768 iters, t-(init.)=1.04 s t(norm)=0.541578, mflops=9.23228 (err=3.6e-16) 1. CWP (min N): elapsed time t=1.53 s, 262144 iters, t-(init.)=1.48 s t(norm)=0.0963384, mflops=51.9004 2. CWP (best N): elapsed time t=1.59 s, 262144 iters, t-(init.)=1.53 s t(norm)=0.099593, mflops=50.2043 3. FFTPACK: elapsed time t=1.78 s, 524288 iters, t-(init.)=1.66 s t(norm)=0.0540276, mflops=92.5453 (err=3.0e-16) 4. FFTPACK (f2c): elapsed time t=1.23 s, 131072 iters, t-(init.)=1.2 s t(norm)=0.156224, mflops=32.0052 (err=4.5e-16) FFTW_MEASURE plan: (cost = 2.059937e-06) FFTW_NOTW 15 5. FFTW: elapsed time t=1.07 s, 524288 iters, t-(init.)=0.97 s t(norm)=0.0315703, mflops=158.376 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.06 s, 524288 iters, t-(init.)=0.95 s t(norm)=0.0309194, mflops=161.711 (err=1.9e-16) 7. 
Frigo-old: elapsed time t=1.64 s, 131072 iters, t-(init.)=1.61 s t(norm)=0.209601, mflops=23.8548 (err=3.9e-16) 8. GSL: elapsed time t=1.22 s, 262144 iters, t-(init.)=1.16 s t(norm)=0.0755085, mflops=66.2178 (err=2.3e-16) 9. NAPACK (f2c): elapsed time t=1.84 s, 131072 iters, t-(init.)=1.81 s t(norm)=0.235638, mflops=21.2189 (err=1.1e-15) 10. Nielsen: elapsed time t=1.09 s, 65536 iters, t-(init.)=1.08 s t(norm)=0.281204, mflops=17.7807 (err=4.7e-15) 11. Singleton: elapsed time t=1.18 s, 131072 iters, t-(init.)=1.15 s t(norm)=0.149715, mflops=33.3968 (err=2.9e-16) 12. Singleton (f2c): elapsed time t=1.21 s, 131072 iters, t-(init.)=1.18 s t(norm)=0.153621, mflops=32.5477 (err=2.9e-16) 13. Temperton: elapsed time t=1.89 s, 262144 iters, t-(init.)=1.84 s t(norm)=0.119772, mflops=41.746 (err=7.9e-16) 14. Temperton (f2c): elapsed time t=1.23 s, 65536 iters, t-(init.)=1.22 s t(norm)=0.317656, mflops=15.7403 (err=2.0e-16) 15. Valkenburg: elapsed time t=1.42 s, 32768 iters, t-(init.)=1.41 s t(norm)=0.734255, mflops=6.80963 (err=4.6e-16) 16. SCSL: elapsed time t=1.87 s, 524288 iters, t-(init.)=1.76 s t(norm)=0.0572823, mflops=87.287 (err=4.5e-16) 17. SGIMATH: elapsed time t=1.86 s, 524288 iters, t-(init.)=1.75 s t(norm)=0.0569568, mflops=87.7858 (err=4.5e-16) Top mflops for N=15 = 161.711 Normalized results and averages for N=15: fft 0: mflops = 9.23228 (norm. = 0.0570913), norm. avg. (of 2) = 0.0479493 fft 1: mflops = 51.9004 (norm. = 0.320946), norm. avg. (of 4) = 0.218899 fft 2: mflops = 50.2043 (norm. = 0.310458), norm. avg. (of 4) = 0.183508 fft 3: mflops = 92.5453 (norm. = 0.572289), norm. avg. (of 4) = 0.394653 fft 4: mflops = 32.0052 (norm. = 0.197917), norm. avg. (of 4) = 0.15735 fft 5: mflops = 158.376 (norm. = 0.979381), norm. avg. (of 4) = 0.987192 fft 6: mflops = 161.711 (norm. = 1), norm. avg. (of 4) = 0.965559 fft 7: mflops = 23.8548 (norm. = 0.147516), norm. avg. (of 4) = 0.135204 fft 8: mflops = 66.2178 (norm. = 0.409483), norm. avg. 
(of 4) = 0.299569 fft 9: mflops = 21.2189 (norm. = 0.131215), norm. avg. (of 4) = 0.136411 fft 10: mflops = 17.7807 (norm. = 0.109954), norm. avg. (of 4) = 0.0788162 fft 11: mflops = 33.3968 (norm. = 0.206522), norm. avg. (of 4) = 0.162875 fft 12: mflops = 32.5477 (norm. = 0.201271), norm. avg. (of 4) = 0.160575 fft 13: mflops = 41.746 (norm. = 0.258152), norm. avg. (of 4) = 0.161593 fft 14: mflops = 15.7403 (norm. = 0.0973361), norm. avg. (of 4) = 0.0792575 fft 15: mflops = 6.80963 (norm. = 0.0421099), norm. avg. (of 4) = 0.0454019 fft 16: mflops = 87.287 (norm. = 0.539773), norm. avg. (of 4) = 0.39719 fft 17: mflops = 87.7858 (norm. = 0.542857), norm. avg. (of 4) = 0.406976 Benchmarking for array size = 18: 0. Brenner: elapsed time t=1.22 s, 32768 iters, t-(init.)=1.21 s t(norm)=0.491966, mflops=10.1633 (err=4.1e-16) 1. CWP (min N): elapsed time t=1.81 s, 262144 iters, t-(init.)=1.75 s t(norm)=0.0889401, mflops=56.2176 2. CWP (best N) (N=28): elapsed time t=1.05 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.102662, mflops=48.7034 3. FFTPACK: elapsed time t=1.46 s, 262144 iters, t-(init.)=1.41 s t(norm)=0.0716603, mflops=69.7737 (err=2.6e-16) 4. FFTPACK (f2c): elapsed time t=1.18 s, 65536 iters, t-(init.)=1.16 s t(norm)=0.235818, mflops=21.2028 (err=2.9e-16) FFTW_MEASURE plan: (cost = 2.975464e-06) FFTW_TWIDDLE 2 FFTW_NOTW 9 5. FFTW: elapsed time t=1.53 s, 524288 iters, t-(init.)=1.42 s t(norm)=0.0360842, mflops=138.565 (err=2.2e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.55 s, 524288 iters, t-(init.)=1.44 s t(norm)=0.0365925, mflops=136.64 (err=2.2e-16) 7. Frigo-old: elapsed time t=1.2 s, 65536 iters, t-(init.)=1.19 s t(norm)=0.241917, mflops=20.6683 (err=4.7e-16) 8. GSL: elapsed time t=1.36 s, 262144 iters, t-(init.)=1.3 s t(norm)=0.0660698, mflops=75.6776 (err=2.3e-16) 9. NAPACK (f2c): elapsed time t=1.48 s, 131072 iters, t-(init.)=1.45 s t(norm)=0.147386, mflops=33.9244 (err=7.8e-16) 10. 
Nielsen: elapsed time t=1.88 s, 65536 iters, t-(init.)=1.87 s t(norm)=0.380155, mflops=13.1525 (err=6.6e-16) 11. Singleton: elapsed time t=1.17 s, 131072 iters, t-(init.)=1.14 s t(norm)=0.115876, mflops=43.1495 (err=2.1e-16) 12. Singleton (f2c): elapsed time t=1.16 s, 131072 iters, t-(init.)=1.13 s t(norm)=0.11486, mflops=43.5314 (err=2.1e-16) 13. Temperton: elapsed time t=1.51 s, 131072 iters, t-(init.)=1.48 s t(norm)=0.150436, mflops=33.2368 (err=2.7e-08) 14. Temperton (f2c): elapsed time t=1.59 s, 65536 iters, t-(init.)=1.58 s t(norm)=0.321201, mflops=15.5666 (err=2.9e-16) 15. Valkenburg: elapsed time t=1.61 s, 32768 iters, t-(init.)=1.6 s t(norm)=0.650533, mflops=7.68601 (err=4.8e-16) 16. SCSL: elapsed time t=1.16 s, 262144 iters, t-(init.)=1.1 s t(norm)=0.0559052, mflops=89.4372 (err=2.9e-16) 17. SGIMATH: elapsed time t=1.11 s, 262144 iters, t-(init.)=1.06 s t(norm)=0.0538723, mflops=92.8121 (err=2.9e-16) Top mflops for N=18 = 138.565 Normalized results and averages for N=18: fft 0: mflops = 10.1633 (norm. = 0.0733471), norm. avg. (of 3) = 0.0564152 fft 1: mflops = 56.2176 (norm. = 0.405714), norm. avg. (of 5) = 0.256262 fft 2: mflops = 48.7034 (norm. = 0.351485), norm. avg. (of 5) = 0.217104 fft 3: mflops = 69.7737 (norm. = 0.503546), norm. avg. (of 5) = 0.416431 fft 4: mflops = 21.2028 (norm. = 0.153017), norm. avg. (of 5) = 0.156483 fft 5: mflops = 138.565 (norm. = 1), norm. avg. (of 5) = 0.989754 fft 6: mflops = 136.64 (norm. = 0.986111), norm. avg. (of 5) = 0.969669 fft 7: mflops = 20.6683 (norm. = 0.14916), norm. avg. (of 5) = 0.137995 fft 8: mflops = 75.6776 (norm. = 0.546154), norm. avg. (of 5) = 0.348886 fft 9: mflops = 33.9244 (norm. = 0.244828), norm. avg. (of 5) = 0.158094 fft 10: mflops = 13.1525 (norm. = 0.0949198), norm. avg. (of 5) = 0.082037 fft 11: mflops = 43.1495 (norm. = 0.311404), norm. avg. (of 5) = 0.192581 fft 12: mflops = 43.5314 (norm. = 0.314159), norm. avg. (of 5) = 0.191292 fft 13: mflops = 33.2368 (norm. = 0.239865), norm. avg. 
(of 5) = 0.177248 fft 14: mflops = 15.5666 (norm. = 0.112342), norm. avg. (of 5) = 0.0858744 fft 15: mflops = 7.68601 (norm. = 0.0554687), norm. avg. (of 5) = 0.0474153 fft 16: mflops = 89.4372 (norm. = 0.645455), norm. avg. (of 5) = 0.446843 fft 17: mflops = 92.8121 (norm. = 0.669811), norm. avg. (of 5) = 0.459543 Benchmarking for array size = 24: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.72 s, 262144 iters, t-(init.)=1.65 s t(norm)=0.0572001, mflops=87.4124 2. CWP (best N) (N=28): elapsed time t=1.05 s, 131072 iters, t-(init.)=1.01 s t(norm)=0.0700268, mflops=71.4012 3. FFTPACK: elapsed time t=1.64 s, 262144 iters, t-(init.)=1.58 s t(norm)=0.0547735, mflops=91.2851 (err=2.3e-16) 4. FFTPACK (f2c): elapsed time t=1.48 s, 65536 iters, t-(init.)=1.46 s t(norm)=0.202454, mflops=24.697 (err=2.7e-16) FFTW_MEASURE plan: (cost = 3.356934e-06) FFTW_TWIDDLE 2 FFTW_NOTW 12 5. FFTW: elapsed time t=1.8 s, 524288 iters, t-(init.)=1.66 s t(norm)=0.0287734, mflops=173.772 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.84 s, 524288 iters, t-(init.)=1.7 s t(norm)=0.0294667, mflops=169.683 (err=2.1e-16) 7. Frigo-old: elapsed time t=1.91 s, 131072 iters, t-(init.)=1.87 s t(norm)=0.129654, mflops=38.5643 (err=3.4e-16) 8. GSL: elapsed time t=1.57 s, 262144 iters, t-(init.)=1.5 s t(norm)=0.0520001, mflops=96.1536 (err=2.1e-16) 9. NAPACK (f2c): elapsed time t=1.01 s, 65536 iters, t-(init.)=0.99 s t(norm)=0.13728, mflops=36.4218 (err=8.0e-16) 10. Nielsen: elapsed time t=1.44 s, 65536 iters, t-(init.)=1.42 s t(norm)=0.196907, mflops=25.3927 (err=1.5e-15) 11. Singleton: elapsed time t=1.77 s, 131072 iters, t-(init.)=1.74 s t(norm)=0.12064, mflops=41.4455 (err=2.3e-16) 12. Singleton (f2c): elapsed time t=1.8 s, 131072 iters, t-(init.)=1.77 s t(norm)=0.12272, mflops=40.7431 (err=2.3e-16) 13. 
Temperton: elapsed time t=1.49 s, 131072 iters, t-(init.)=1.46 s t(norm)=0.101227, mflops=49.394 (err=4.5e-09) 14. Temperton (f2c): elapsed time t=1.58 s, 65536 iters, t-(init.)=1.56 s t(norm)=0.21632, mflops=23.1139 (err=2.8e-16) 15. Valkenburg: elapsed time t=1.19 s, 16384 iters, t-(init.)=1.19 s t(norm)=0.660055, mflops=7.57513 (err=4.8e-16) 16. SCSL: elapsed time t=1 s, 262144 iters, t-(init.)=0.93 s t(norm)=0.0322401, mflops=155.087 (err=2.6e-16) 17. SGIMATH: elapsed time t=1.01 s, 262144 iters, t-(init.)=0.94 s t(norm)=0.0325867, mflops=153.437 (err=2.6e-16) Top mflops for N=24 = 173.772 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.00575468), norm. avg. (of 3) = 0.0564152 fft 1: mflops = 87.4124 (norm. = 0.50303), norm. avg. (of 6) = 0.29739 fft 2: mflops = 71.4012 (norm. = 0.410891), norm. avg. (of 6) = 0.249402 fft 3: mflops = 91.2851 (norm. = 0.525316), norm. avg. (of 6) = 0.434579 fft 4: mflops = 24.697 (norm. = 0.142123), norm. avg. (of 6) = 0.15409 fft 5: mflops = 173.772 (norm. = 1), norm. avg. (of 6) = 0.991462 fft 6: mflops = 169.683 (norm. = 0.976471), norm. avg. (of 6) = 0.970803 fft 7: mflops = 38.5643 (norm. = 0.221925), norm. avg. (of 6) = 0.151984 fft 8: mflops = 96.1536 (norm. = 0.553333), norm. avg. (of 6) = 0.38296 fft 9: mflops = 36.4218 (norm. = 0.209596), norm. avg. (of 6) = 0.166678 fft 10: mflops = 25.3927 (norm. = 0.146127), norm. avg. (of 6) = 0.0927186 fft 11: mflops = 41.4455 (norm. = 0.238506), norm. avg. (of 6) = 0.200235 fft 12: mflops = 40.7431 (norm. = 0.234463), norm. avg. (of 6) = 0.198487 fft 13: mflops = 49.394 (norm. = 0.284247), norm. avg. (of 6) = 0.195081 fft 14: mflops = 23.1139 (norm. = 0.133013), norm. avg. (of 6) = 0.0937308 fft 15: mflops = 7.57513 (norm. = 0.0435924), norm. avg. (of 6) = 0.0467782 fft 16: mflops = 155.087 (norm. = 0.892473), norm. avg. (of 6) = 0.521115 fft 17: mflops = 153.437 (norm. = 0.882979), norm. avg. (of 6) = 0.530115 Benchmarking for array size = 36: 0. 
Brenner: elapsed time t=1.23 s, 16384 iters, t-(init.)=1.23 s t(norm)=0.403365, mflops=12.3957 (err=5.3e-16) 1. CWP (min N): elapsed time t=1.16 s, 131072 iters, t-(init.)=1.11 s t(norm)=0.0455016, mflops=109.886 2. CWP (best N): elapsed time t=1.16 s, 131072 iters, t-(init.)=1.11 s t(norm)=0.0455016, mflops=109.886 3. FFTPACK: elapsed time t=1.09 s, 131072 iters, t-(init.)=1.04 s t(norm)=0.0426321, mflops=117.283 (err=4.0e-16) 4. FFTPACK (f2c): elapsed time t=1.15 s, 32768 iters, t-(init.)=1.14 s t(norm)=0.186925, mflops=26.7486 (err=4.8e-16) FFTW_MEASURE plan: (cost = 5.645752e-06) FFTW_TWIDDLE 3 FFTW_NOTW 12 5. FFTW: elapsed time t=1.48 s, 262144 iters, t-(init.)=1.38 s t(norm)=0.0282848, mflops=176.774 (err=4.0e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.48 s, 262144 iters, t-(init.)=1.38 s t(norm)=0.0282848, mflops=176.774 (err=4.0e-16) 7. Frigo-old: elapsed time t=1.2 s, 32768 iters, t-(init.)=1.18 s t(norm)=0.193484, mflops=25.8419 (err=5.6e-16) 8. GSL: elapsed time t=1.11 s, 131072 iters, t-(init.)=1.06 s t(norm)=0.0434519, mflops=115.07 (err=4.6e-16) 9. NAPACK (f2c): elapsed time t=1.41 s, 65536 iters, t-(init.)=1.39 s t(norm)=0.113959, mflops=43.8755 (err=1.8e-15) 10. Nielsen: elapsed time t=1.47 s, 32768 iters, t-(init.)=1.46 s t(norm)=0.239396, mflops=20.8859 (err=1.5e-15) 11. Singleton: elapsed time t=1.84 s, 131072 iters, t-(init.)=1.79 s t(norm)=0.0733764, mflops=68.1418 (err=4.7e-16) 12. Singleton (f2c): elapsed time t=1.87 s, 131072 iters, t-(init.)=1.81 s t(norm)=0.0741962, mflops=67.3889 (err=4.7e-16) 13. Temperton: elapsed time t=1.06 s, 65536 iters, t-(init.)=1.03 s t(norm)=0.0844443, mflops=59.2106 (err=5.1e-08) 14. Temperton (f2c): elapsed time t=1.33 s, 32768 iters, t-(init.)=1.32 s t(norm)=0.21644, mflops=23.1011 (err=3.8e-16) 15. Valkenburg: elapsed time t=1.95 s, 16384 iters, t-(init.)=1.94 s t(norm)=0.636202, mflops=7.85914 (err=6.1e-16) 16. 
SCSL: elapsed time t=1.7 s, 262144 iters, t-(init.)=1.6 s t(norm)=0.0327939, mflops=152.467 (err=4.3e-16) 17. SGIMATH: elapsed time t=1.7 s, 262144 iters, t-(init.)=1.59 s t(norm)=0.032589, mflops=153.426 (err=4.3e-16) Top mflops for N=36 = 176.774 Normalized results and averages for N=36: fft 0: mflops = 12.3957 (norm. = 0.070122), norm. avg. (of 4) = 0.0598419 fft 1: mflops = 109.886 (norm. = 0.621622), norm. avg. (of 7) = 0.343709 fft 2: mflops = 109.886 (norm. = 0.621622), norm. avg. (of 7) = 0.302576 fft 3: mflops = 117.283 (norm. = 0.663462), norm. avg. (of 7) = 0.467276 fft 4: mflops = 26.7486 (norm. = 0.151316), norm. avg. (of 7) = 0.153694 fft 5: mflops = 176.774 (norm. = 1), norm. avg. (of 7) = 0.992681 fft 6: mflops = 176.774 (norm. = 1), norm. avg. (of 7) = 0.974974 fft 7: mflops = 25.8419 (norm. = 0.146186), norm. avg. (of 7) = 0.151156 fft 8: mflops = 115.07 (norm. = 0.650943), norm. avg. (of 7) = 0.421244 fft 9: mflops = 43.8755 (norm. = 0.248201), norm. avg. (of 7) = 0.178324 fft 10: mflops = 20.8859 (norm. = 0.118151), norm. avg. (of 7) = 0.0963517 fft 11: mflops = 68.1418 (norm. = 0.385475), norm. avg. (of 7) = 0.226698 fft 12: mflops = 67.3889 (norm. = 0.381215), norm. avg. (of 7) = 0.224591 fft 13: mflops = 59.2106 (norm. = 0.334951), norm. avg. (of 7) = 0.215062 fft 14: mflops = 23.1011 (norm. = 0.130682), norm. avg. (of 7) = 0.0990095 fft 15: mflops = 7.85914 (norm. = 0.0444588), norm. avg. (of 7) = 0.0464468 fft 16: mflops = 152.467 (norm. = 0.8625), norm. avg. (of 7) = 0.569884 fft 17: mflops = 153.426 (norm. = 0.867925), norm. avg. (of 7) = 0.578374 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.02 s, 8192 iters, t-(init.)=1.01 s t(norm)=0.243777, mflops=20.5106 (err=3.7e-16) 1. CWP (min N): elapsed time t=1.11 s, 65536 iters, t-(init.)=1.06 s t(norm)=0.0319806, mflops=156.345 2. CWP (best N) (N=84): elapsed time t=1.09 s, 65536 iters, t-(init.)=1.03 s t(norm)=0.0310755, mflops=160.899 3. 
FFTPACK: elapsed time t=1.01 s, 65536 iters, t-(init.)=0.95 s t(norm)=0.0286618, mflops=174.448 (err=3.4e-16) 4. FFTPACK (f2c): elapsed time t=1.15 s, 16384 iters, t-(init.)=1.14 s t(norm)=0.137577, mflops=36.3433 (err=4.2e-16) FFTW_MEASURE plan: (cost = 1.281738e-05) FFTW_TWIDDLE 8 FFTW_NOTW 10 5. FFTW: elapsed time t=1.71 s, 131072 iters, t-(init.)=1.6 s t(norm)=0.0241363, mflops=207.157 (err=3.4e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 6. FFTW_ESTIMATE: elapsed time t=1.8 s, 131072 iters, t-(init.)=1.69 s t(norm)=0.025494, mflops=196.125 (err=4.2e-16) 7. Frigo-old: elapsed time t=1.43 s, 32768 iters, t-(init.)=1.4 s t(norm)=0.084477, mflops=59.1877 (err=3.5e-16) 8. GSL: elapsed time t=1.5 s, 65536 iters, t-(init.)=1.44 s t(norm)=0.0434453, mflops=115.087 (err=3.0e-16) 9. NAPACK (f2c): elapsed time t=1.3 s, 16384 iters, t-(init.)=1.28 s t(norm)=0.154472, mflops=32.3683 (err=5.2e-16) 10. Nielsen: elapsed time t=1.77 s, 32768 iters, t-(init.)=1.74 s t(norm)=0.104993, mflops=47.6223 (err=5.0e-15) 11. Singleton: elapsed time t=1.69 s, 65536 iters, t-(init.)=1.64 s t(norm)=0.0494794, mflops=101.052 (err=4.3e-16) 12. Singleton (f2c): elapsed time t=1.69 s, 65536 iters, t-(init.)=1.63 s t(norm)=0.0491777, mflops=101.672 (err=4.3e-16) 13. Temperton: elapsed time t=1.16 s, 32768 iters, t-(init.)=1.13 s t(norm)=0.068185, mflops=73.3299 (err=5.3e-08) 14. Temperton (f2c): elapsed time t=1.05 s, 16384 iters, t-(init.)=1.04 s t(norm)=0.125509, mflops=39.8379 (err=3.6e-16) 15. Valkenburg: elapsed time t=1.39 s, 4096 iters, t-(init.)=1.38 s t(norm)=0.666162, mflops=7.50569 (err=4.3e-16) 16. SCSL: elapsed time t=1.65 s, 131072 iters, t-(init.)=1.54 s t(norm)=0.0232312, mflops=215.228 (err=4.2e-16) 17. SGIMATH: elapsed time t=1.65 s, 131072 iters, t-(init.)=1.54 s t(norm)=0.0232312, mflops=215.228 (err=4.2e-16) Top mflops for N=80 = 215.228 Normalized results and averages for N=80: fft 0: mflops = 20.5106 (norm. = 0.095297), norm. avg. 
(of 5) = 0.0669329 fft 1: mflops = 156.345 (norm. = 0.726415), norm. avg. (of 8) = 0.391547 fft 2: mflops = 160.899 (norm. = 0.747573), norm. avg. (of 8) = 0.3582 fft 3: mflops = 174.448 (norm. = 0.810526), norm. avg. (of 8) = 0.510183 fft 4: mflops = 36.3433 (norm. = 0.16886), norm. avg. (of 8) = 0.155589 fft 5: mflops = 207.157 (norm. = 0.9625), norm. avg. (of 8) = 0.988909 fft 6: mflops = 196.125 (norm. = 0.911243), norm. avg. (of 8) = 0.967007 fft 7: mflops = 59.1877 (norm. = 0.275), norm. avg. (of 8) = 0.166636 fft 8: mflops = 115.087 (norm. = 0.534722), norm. avg. (of 8) = 0.435429 fft 9: mflops = 32.3683 (norm. = 0.150391), norm. avg. (of 8) = 0.174832 fft 10: mflops = 47.6223 (norm. = 0.221264), norm. avg. (of 8) = 0.111966 fft 11: mflops = 101.052 (norm. = 0.469512), norm. avg. (of 8) = 0.25705 fft 12: mflops = 101.672 (norm. = 0.472393), norm. avg. (of 8) = 0.255566 fft 13: mflops = 73.3299 (norm. = 0.340708), norm. avg. (of 8) = 0.230768 fft 14: mflops = 39.8379 (norm. = 0.185096), norm. avg. (of 8) = 0.10977 fft 15: mflops = 7.50569 (norm. = 0.0348732), norm. avg. (of 8) = 0.0450001 fft 16: mflops = 215.228 (norm. = 1), norm. avg. (of 8) = 0.623649 fft 17: mflops = 215.228 (norm. = 1), norm. avg. (of 8) = 0.631077 Benchmarking for array size = 108: 0. Brenner: elapsed time t=1.06 s, 4096 iters, t-(init.)=1.06 s t(norm)=0.354735, mflops=14.095 (err=6.5e-16) 1. CWP (min N) (N=110): elapsed time t=1.77 s, 65536 iters, t-(init.)=1.7 s t(norm)=0.0355572, mflops=140.619 2. CWP (best N) (N=112): elapsed time t=1.47 s, 65536 iters, t-(init.)=1.39 s t(norm)=0.0290732, mflops=171.98 3. FFTPACK: elapsed time t=1.45 s, 65536 iters, t-(init.)=1.37 s t(norm)=0.0286549, mflops=174.49 (err=3.5e-16) 4. FFTPACK (f2c): elapsed time t=1.96 s, 16384 iters, t-(init.)=1.94 s t(norm)=0.162308, mflops=30.8056 (err=4.0e-16) FFTW_MEASURE plan: (cost = 1.831055e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 5. 
FFTW: elapsed time t=1.22 s, 65536 iters, t-(init.)=1.15 s t(norm)=0.0240534, mflops=207.871 (err=3.9e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.21 s, 65536 iters, t-(init.)=1.13 s t(norm)=0.0236351, mflops=211.55 (err=3.9e-16) 7. Frigo-old: elapsed time t=1.26 s, 8192 iters, t-(init.)=1.25 s t(norm)=0.20916, mflops=23.9052 (err=5.6e-16) 8. GSL: elapsed time t=1.02 s, 32768 iters, t-(init.)=0.98 s t(norm)=0.0409953, mflops=121.965 (err=3.9e-16) 9. NAPACK (f2c): elapsed time t=1.1 s, 16384 iters, t-(init.)=1.08 s t(norm)=0.090357, mflops=55.336 (err=2.4e-15) 10. Nielsen: elapsed time t=1.25 s, 8192 iters, t-(init.)=1.24 s t(norm)=0.207487, mflops=24.098 (err=1.2e-15) 11. Singleton: elapsed time t=1.54 s, 32768 iters, t-(init.)=1.5 s t(norm)=0.0627479, mflops=79.6839 (err=4.5e-16) 12. Singleton (f2c): elapsed time t=1.59 s, 32768 iters, t-(init.)=1.56 s t(norm)=0.0652579, mflops=76.6191 (err=4.5e-16) 13. Temperton: elapsed time t=1.65 s, 32768 iters, t-(init.)=1.62 s t(norm)=0.0677678, mflops=73.7814 (err=7.4e-08) 14. Temperton (f2c): elapsed time t=1.29 s, 8192 iters, t-(init.)=1.29 s t(norm)=0.215853, mflops=23.1639 (err=3.3e-16) 15. Valkenburg: elapsed time t=1.88 s, 4096 iters, t-(init.)=1.88 s t(norm)=0.629153, mflops=7.9472 (err=5.7e-16) 16. SCSL: elapsed time t=1.26 s, 65536 iters, t-(init.)=1.19 s t(norm)=0.02489, mflops=200.884 (err=5.3e-16) 17. SGIMATH: elapsed time t=1.2 s, 65536 iters, t-(init.)=1.13 s t(norm)=0.0236351, mflops=211.55 (err=5.3e-16) Top mflops for N=108 = 211.55 Normalized results and averages for N=108: fft 0: mflops = 14.095 (norm. = 0.0666274), norm. avg. (of 6) = 0.066882 fft 1: mflops = 140.619 (norm. = 0.664706), norm. avg. (of 9) = 0.421898 fft 2: mflops = 171.98 (norm. = 0.81295), norm. avg. (of 9) = 0.408728 fft 3: mflops = 174.49 (norm. = 0.824818), norm. avg. (of 9) = 0.545142 fft 4: mflops = 30.8056 (norm. = 0.145619), norm. avg. 
(of 9) = 0.154482 fft 5: mflops = 207.871 (norm. = 0.982609), norm. avg. (of 9) = 0.988209 fft 6: mflops = 211.55 (norm. = 1), norm. avg. (of 9) = 0.970673 fft 7: mflops = 23.9052 (norm. = 0.113), norm. avg. (of 9) = 0.160677 fft 8: mflops = 121.965 (norm. = 0.576531), norm. avg. (of 9) = 0.451107 fft 9: mflops = 55.336 (norm. = 0.261574), norm. avg. (of 9) = 0.18447 fft 10: mflops = 24.098 (norm. = 0.113911), norm. avg. (of 9) = 0.112182 fft 11: mflops = 79.6839 (norm. = 0.376667), norm. avg. (of 9) = 0.27034 fft 12: mflops = 76.6191 (norm. = 0.362179), norm. avg. (of 9) = 0.267412 fft 13: mflops = 73.7814 (norm. = 0.348765), norm. avg. (of 9) = 0.243879 fft 14: mflops = 23.1639 (norm. = 0.109496), norm. avg. (of 9) = 0.10974 fft 15: mflops = 7.9472 (norm. = 0.0375665), norm. avg. (of 9) = 0.0441742 fft 16: mflops = 200.884 (norm. = 0.94958), norm. avg. (of 9) = 0.659863 fft 17: mflops = 211.55 (norm. = 1), norm. avg. (of 9) = 0.672069 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1.76 s, 4096 iters, t-(init.)=1.75 s t(norm)=0.263734, mflops=18.9585 (err=6.9e-16) 1. CWP (min N): elapsed time t=1.44 s, 32768 iters, t-(init.)=1.37 s t(norm)=0.0258082, mflops=193.737 2. CWP (best N): elapsed time t=1.43 s, 32768 iters, t-(init.)=1.36 s t(norm)=0.0256198, mflops=195.161 3. FFTPACK: elapsed time t=1.8 s, 16384 iters, t-(init.)=1.76 s t(norm)=0.0663101, mflops=75.4032 (err=5.2e-16) 4. FFTPACK (f2c): elapsed time t=1.8 s, 4096 iters, t-(init.)=1.8 s t(norm)=0.271269, mflops=18.4319 (err=6.0e-16) FFTW_MEASURE plan: (cost = 5.126953e-05) FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_NOTW 10 5. FFTW: elapsed time t=1.69 s, 32768 iters, t-(init.)=1.62 s t(norm)=0.0305177, mflops=163.839 (err=5.0e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.78 s, 32768 iters, t-(init.)=1.7 s t(norm)=0.0320248, mflops=156.129 (err=4.9e-16) 7. 
Frigo-old: elapsed time t=1.15 s, 4096 iters, t-(init.)=1.14 s t(norm)=0.171804, mflops=29.103 (err=5.5e-16) 8. GSL: elapsed time t=1.09 s, 16384 iters, t-(init.)=1.05 s t(norm)=0.03956, mflops=126.39 (err=6.3e-16) 9. NAPACK (f2c): elapsed time t=1.19 s, 4096 iters, t-(init.)=1.18 s t(norm)=0.177832, mflops=28.1165 (err=1.5e-14) 10. Nielsen: elapsed time t=1.05 s, 4096 iters, t-(init.)=1.05 s t(norm)=0.15824, mflops=31.5975 (err=7.4e-15) 11. Singleton: elapsed time t=1.17 s, 8192 iters, t-(init.)=1.15 s t(norm)=0.0866553, mflops=57.6999 (err=6.3e-16) 12. Singleton (f2c): elapsed time t=1.14 s, 8192 iters, t-(init.)=1.12 s t(norm)=0.0843947, mflops=59.2454 (err=6.3e-16) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.27 s, 1024 iters, t-(init.)=1.26 s t(norm)=0.759553, mflops=6.58282 (err=7.2e-16) 16. SCSL: elapsed time t=1.37 s, 16384 iters, t-(init.)=1.33 s t(norm)=0.0501094, mflops=99.7817 (err=6.1e-16) 17. SGIMATH: elapsed time t=1.37 s, 16384 iters, t-(init.)=1.33 s t(norm)=0.0501094, mflops=99.7817 (err=6.1e-16) Top mflops for N=210 = 195.161 Normalized results and averages for N=210: fft 0: mflops = 18.9585 (norm. = 0.0971429), norm. avg. (of 7) = 0.071205 fft 1: mflops = 193.737 (norm. = 0.992701), norm. avg. (of 10) = 0.478978 fft 2: mflops = 195.161 (norm. = 1), norm. avg. (of 10) = 0.467855 fft 3: mflops = 75.4032 (norm. = 0.386364), norm. avg. (of 10) = 0.529264 fft 4: mflops = 18.4319 (norm. = 0.0944444), norm. avg. (of 10) = 0.148478 fft 5: mflops = 163.839 (norm. = 0.839506), norm. avg. (of 10) = 0.973338 fft 6: mflops = 156.129 (norm. = 0.8), norm. avg. (of 10) = 0.953606 fft 7: mflops = 29.103 (norm. = 0.149123), norm. avg. (of 10) = 0.159521 fft 8: mflops = 126.39 (norm. = 0.647619), norm. avg. (of 10) = 0.470758 fft 9: mflops = 28.1165 (norm. = 0.144068), norm. avg. (of 10) = 0.18043 fft 10: mflops = 31.5975 (norm. = 0.161905), norm. avg. 
(of 10) = 0.117154 fft 11: mflops = 57.6999 (norm. = 0.295652), norm. avg. (of 10) = 0.272872 fft 12: mflops = 59.2454 (norm. = 0.303571), norm. avg. (of 10) = 0.271028 fft 13: mflops = -1 (norm. = -0.00512397), norm. avg. (of 9) = 0.243879 fft 14: mflops = -1 (norm. = -0.00512397), norm. avg. (of 9) = 0.10974 fft 15: mflops = 6.58282 (norm. = 0.0337302), norm. avg. (of 10) = 0.0431298 fft 16: mflops = 99.7817 (norm. = 0.511278), norm. avg. (of 10) = 0.645005 fft 17: mflops = 99.7817 (norm. = 0.511278), norm. avg. (of 10) = 0.655989 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.16 s, 1024 iters, t-(init.)=1.16 s t(norm)=0.25037, mflops=19.9704 (err=1.5e-15) 1. CWP (min N): elapsed time t=1.6 s, 16384 iters, t-(init.)=1.51 s t(norm)=0.0203696, mflops=245.464 2. CWP (best N): elapsed time t=1.62 s, 16384 iters, t-(init.)=1.54 s t(norm)=0.0207743, mflops=240.683 3. FFTPACK: elapsed time t=1.14 s, 4096 iters, t-(init.)=1.12 s t(norm)=0.0604342, mflops=82.7346 (err=1.3e-15) 4. FFTPACK (f2c): elapsed time t=1.23 s, 1024 iters, t-(init.)=1.22 s t(norm)=0.26332, mflops=18.9883 (err=1.3e-15) FFTW_MEASURE plan: (cost = 1.220703e-04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 6 FFTW_NOTW 12 5. FFTW: elapsed time t=1.02 s, 8192 iters, t-(init.)=0.97 s t(norm)=0.0261702, mflops=191.057 (err=1.2e-15) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.21 s, 8192 iters, t-(init.)=1.17 s t(norm)=0.0315661, mflops=158.398 (err=1.2e-15) 7. Frigo-old: elapsed time t=1.4 s, 2048 iters, t-(init.)=1.39 s t(norm)=0.150006, mflops=33.3319 (err=1.3e-15) 8. GSL: elapsed time t=1.37 s, 8192 iters, t-(init.)=1.33 s t(norm)=0.0358828, mflops=139.343 (err=1.3e-15) 9. NAPACK (f2c): elapsed time t=1.27 s, 2048 iters, t-(init.)=1.26 s t(norm)=0.135977, mflops=36.7709 (err=4.1e-14) 10. Nielsen: elapsed time t=1.55 s, 2048 iters, t-(init.)=1.54 s t(norm)=0.166194, mflops=30.0853 (err=6.0e-15) 11. 
Singleton: elapsed time t=1.26 s, 4096 iters, t-(init.)=1.24 s t(norm)=0.0669093, mflops=74.728 (err=1.9e-15) 12. Singleton (f2c): elapsed time t=1.24 s, 4096 iters, t-(init.)=1.21 s t(norm)=0.0652905, mflops=76.5808 (err=1.9e-15) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.61 s, 512 iters, t-(init.)=1.6 s t(norm)=0.690677, mflops=7.23928 (err=1.4e-15) 16. SCSL: elapsed time t=1.27 s, 4096 iters, t-(init.)=1.25 s t(norm)=0.0674489, mflops=74.1302 (err=1.3e-15) 17. SGIMATH: elapsed time t=1.28 s, 4096 iters, t-(init.)=1.26 s t(norm)=0.0679885, mflops=73.5419 (err=1.3e-15) Top mflops for N=504 = 245.464 Normalized results and averages for N=504: fft 0: mflops = 19.9704 (norm. = 0.0813578), norm. avg. (of 8) = 0.0724741 fft 1: mflops = 245.464 (norm. = 1), norm. avg. (of 11) = 0.526344 fft 2: mflops = 240.683 (norm. = 0.980519), norm. avg. (of 11) = 0.514461 fft 3: mflops = 82.7346 (norm. = 0.337054), norm. avg. (of 11) = 0.511791 fft 4: mflops = 18.9883 (norm. = 0.0773566), norm. avg. (of 11) = 0.142012 fft 5: mflops = 191.057 (norm. = 0.778351), norm. avg. (of 11) = 0.955612 fft 6: mflops = 158.398 (norm. = 0.645299), norm. avg. (of 11) = 0.925578 fft 7: mflops = 33.3319 (norm. = 0.135791), norm. avg. (of 11) = 0.157364 fft 8: mflops = 139.343 (norm. = 0.567669), norm. avg. (of 11) = 0.479568 fft 9: mflops = 36.7709 (norm. = 0.149802), norm. avg. (of 11) = 0.177646 fft 10: mflops = 30.0853 (norm. = 0.122565), norm. avg. (of 11) = 0.117646 fft 11: mflops = 74.728 (norm. = 0.304435), norm. avg. (of 11) = 0.275741 fft 12: mflops = 76.5808 (norm. = 0.311983), norm. avg. (of 11) = 0.274751 fft 13: mflops = -1 (norm. = -0.00407391), norm. avg. (of 9) = 0.243879 fft 14: mflops = -1 (norm. = -0.00407391), norm. avg. (of 9) = 0.10974 fft 15: mflops = 7.23928 (norm. = 0.0294922), norm. avg. (of 11) = 0.04189 fft 16: mflops = 74.1302 (norm. = 0.302), norm. avg. 
(of 11) = 0.613822 fft 17: mflops = 73.5419 (norm. = 0.299603), norm. avg. (of 11) = 0.623591 Benchmarking for array size = 1000: 0. Brenner: elapsed time t=1.12 s, 512 iters, t-(init.)=1.11 s t(norm)=0.217541, mflops=22.9842 (err=1.2e-15) 1. CWP (min N) (N=1001): elapsed time t=1.16 s, 4096 iters, t-(init.)=1.12 s t(norm)=0.0274376, mflops=182.231 2. CWP (best N) (N=1008): elapsed time t=1.79 s, 8192 iters, t-(init.)=1.71 s t(norm)=0.0209457, mflops=238.713 3. FFTPACK: elapsed time t=1.28 s, 4096 iters, t-(init.)=1.23 s t(norm)=0.0301324, mflops=165.934 (err=1.0e-15) 4. FFTPACK (f2c): elapsed time t=1.49 s, 1024 iters, t-(init.)=1.48 s t(norm)=0.145027, mflops=34.4762 (err=1.1e-15) FFTW_MEASURE plan: (cost = 2.832031e-04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 5. FFTW: elapsed time t=1.22 s, 4096 iters, t-(init.)=1.17 s t(norm)=0.0286625, mflops=174.444 (err=1.0e-15) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.21 s, 4096 iters, t-(init.)=1.17 s t(norm)=0.0286625, mflops=174.444 (err=1.0e-15) 7. Frigo-old: elapsed time t=1.31 s, 1024 iters, t-(init.)=1.3 s t(norm)=0.127389, mflops=39.2499 (err=1.1e-15) 8. GSL: elapsed time t=1.67 s, 4096 iters, t-(init.)=1.63 s t(norm)=0.0399316, mflops=125.214 (err=1.0e-15) 9. NAPACK (f2c): elapsed time t=1.68 s, 1024 iters, t-(init.)=1.67 s t(norm)=0.163646, mflops=30.5538 (err=1.7e-14) 10. Nielsen: elapsed time t=1.81 s, 2048 iters, t-(init.)=1.79 s t(norm)=0.0877024, mflops=57.011 (err=1.5e-14) 11. Singleton: elapsed time t=1.85 s, 4096 iters, t-(init.)=1.81 s t(norm)=0.0443412, mflops=112.762 (err=1.5e-15) 12. Singleton (f2c): elapsed time t=1.88 s, 4096 iters, t-(init.)=1.84 s t(norm)=0.0450761, mflops=110.924 (err=1.5e-15) 13. Temperton: elapsed time t=1.08 s, 2048 iters, t-(init.)=1.06 s t(norm)=0.0519355, mflops=96.2732 (err=1.3e-07) 14. 
Temperton (f2c): elapsed time t=1.59 s, 1024 iters, t-(init.)=1.58 s t(norm)=0.154827, mflops=32.2942 (err=1.0e-15) 15. Valkenburg: elapsed time t=1.82 s, 256 iters, t-(init.)=1.81 s t(norm)=0.709459, mflops=7.04763 (err=1.1e-15) 16. SCSL: elapsed time t=1.02 s, 4096 iters, t-(init.)=0.97 s t(norm)=0.0237629, mflops=210.412 (err=1.2e-15) 17. SGIMATH: elapsed time t=1.01 s, 4096 iters, t-(init.)=0.97 s t(norm)=0.0237629, mflops=210.412 (err=1.2e-15) Top mflops for N=1000 = 238.713 Normalized results and averages for N=1000: fft 0: mflops = 22.9842 (norm. = 0.0962838), norm. avg. (of 9) = 0.0751196 fft 1: mflops = 182.231 (norm. = 0.763393), norm. avg. (of 12) = 0.546098 fft 2: mflops = 238.713 (norm. = 1), norm. avg. (of 12) = 0.554923 fft 3: mflops = 165.934 (norm. = 0.695122), norm. avg. (of 12) = 0.527068 fft 4: mflops = 34.4762 (norm. = 0.144426), norm. avg. (of 12) = 0.142213 fft 5: mflops = 174.444 (norm. = 0.730769), norm. avg. (of 12) = 0.936875 fft 6: mflops = 174.444 (norm. = 0.730769), norm. avg. (of 12) = 0.909344 fft 7: mflops = 39.2499 (norm. = 0.164423), norm. avg. (of 12) = 0.157952 fft 8: mflops = 125.214 (norm. = 0.52454), norm. avg. (of 12) = 0.483316 fft 9: mflops = 30.5538 (norm. = 0.127994), norm. avg. (of 12) = 0.173508 fft 10: mflops = 57.011 (norm. = 0.238827), norm. avg. (of 12) = 0.127745 fft 11: mflops = 112.762 (norm. = 0.472376), norm. avg. (of 12) = 0.292127 fft 12: mflops = 110.924 (norm. = 0.464674), norm. avg. (of 12) = 0.290578 fft 13: mflops = 96.2732 (norm. = 0.403302), norm. avg. (of 10) = 0.259821 fft 14: mflops = 32.2942 (norm. = 0.135285), norm. avg. (of 10) = 0.112294 fft 15: mflops = 7.04763 (norm. = 0.0295235), norm. avg. (of 12) = 0.0408594 fft 16: mflops = 210.412 (norm. = 0.881443), norm. avg. (of 12) = 0.636124 fft 17: mflops = 210.412 (norm. = 0.881443), norm. avg. (of 12) = 0.645078 Benchmarking for array size = 1960: 0. 
Brenner: elapsed time t=1.19 s, 256 iters, t-(init.)=1.18 s t(norm)=0.215032, mflops=23.2524 (err=2.9e-15) 1. CWP (min N) (N=1980): elapsed time t=1 s, 2048 iters, t-(init.)=0.96 s t(norm)=0.0218676, mflops=228.649 2. CWP (best N) (N=1980): elapsed time t=1 s, 2048 iters, t-(init.)=0.96 s t(norm)=0.0218676, mflops=228.649 3. FFTPACK: elapsed time t=1.86 s, 1024 iters, t-(init.)=1.84 s t(norm)=0.0838258, mflops=59.6475 (err=2.7e-15) 4. FFTPACK (f2c): elapsed time t=1.86 s, 256 iters, t-(init.)=1.86 s t(norm)=0.338948, mflops=14.7515 (err=2.8e-15) FFTW_MEASURE plan: (cost = 7.812500e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 7 5. FFTW: elapsed time t=1.71 s, 2048 iters, t-(init.)=1.67 s t(norm)=0.0380405, mflops=131.439 (err=2.8e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.65 s, 2048 iters, t-(init.)=1.61 s t(norm)=0.0366738, mflops=136.337 (err=2.8e-15) 7. Frigo-old: elapsed time t=1.52 s, 512 iters, t-(init.)=1.51 s t(norm)=0.137584, mflops=36.3415 (err=2.8e-15) 8. GSL: elapsed time t=1.2 s, 1024 iters, t-(init.)=1.18 s t(norm)=0.0537579, mflops=93.0096 (err=2.8e-15) 9. NAPACK (f2c): elapsed time t=1.1 s, 256 iters, t-(init.)=1.1 s t(norm)=0.200453, mflops=24.9435 (err=1.3e-13) 10. Nielsen: elapsed time t=1.73 s, 512 iters, t-(init.)=1.72 s t(norm)=0.156718, mflops=31.9045 (err=1.7e-14) 11. Singleton: elapsed time t=1.43 s, 1024 iters, t-(init.)=1.41 s t(norm)=0.0642361, mflops=77.8378 (err=4.3e-15) 12. Singleton (f2c): elapsed time t=1.4 s, 1024 iters, t-(init.)=1.38 s t(norm)=0.0628694, mflops=79.53 (err=4.3e-15) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.1 s, 64 iters, t-(init.)=1.1 s t(norm)=0.801812, mflops=6.23587 (err=2.7e-15) 16. SCSL: elapsed time t=1.21 s, 512 iters, t-(init.)=1.2 s t(norm)=0.109338, mflops=45.7297 (err=2.8e-15) 17. 
SGIMATH: elapsed time t=1.21 s, 512 iters, t-(init.)=1.2 s t(norm)=0.109338, mflops=45.7297 (err=2.8e-15) Top mflops for N=1960 = 228.649 Normalized results and averages for N=1960: fft 0: mflops = 23.2524 (norm. = 0.101695), norm. avg. (of 10) = 0.0777771 fft 1: mflops = 228.649 (norm. = 1), norm. avg. (of 13) = 0.581013 fft 2: mflops = 228.649 (norm. = 1), norm. avg. (of 13) = 0.589159 fft 3: mflops = 59.6475 (norm. = 0.26087), norm. avg. (of 13) = 0.506591 fft 4: mflops = 14.7515 (norm. = 0.0645161), norm. avg. (of 13) = 0.136237 fft 5: mflops = 131.439 (norm. = 0.57485), norm. avg. (of 13) = 0.909027 fft 6: mflops = 136.337 (norm. = 0.596273), norm. avg. (of 13) = 0.885262 fft 7: mflops = 36.3415 (norm. = 0.15894), norm. avg. (of 13) = 0.158028 fft 8: mflops = 93.0096 (norm. = 0.40678), norm. avg. (of 13) = 0.477428 fft 9: mflops = 24.9435 (norm. = 0.109091), norm. avg. (of 13) = 0.168553 fft 10: mflops = 31.9045 (norm. = 0.139535), norm. avg. (of 13) = 0.128651 fft 11: mflops = 77.8378 (norm. = 0.340426), norm. avg. (of 13) = 0.295842 fft 12: mflops = 79.53 (norm. = 0.347826), norm. avg. (of 13) = 0.294982 fft 13: mflops = -1 (norm. = -0.00437352), norm. avg. (of 10) = 0.259821 fft 14: mflops = -1 (norm. = -0.00437352), norm. avg. (of 10) = 0.112294 fft 15: mflops = 6.23587 (norm. = 0.0272727), norm. avg. (of 13) = 0.0398143 fft 16: mflops = 45.7297 (norm. = 0.2), norm. avg. (of 13) = 0.602576 fft 17: mflops = 45.7297 (norm. = 0.2), norm. avg. (of 13) = 0.610842 Benchmarking for array size = 4725: 0. Brenner: elapsed time t=1.98 s, 128 iters, t-(init.)=1.96 s t(norm)=0.265502, mflops=18.8323 (err=1.9e-15) 1. CWP (min N) (N=5005): elapsed time t=1 s, 512 iters, t-(init.)=0.92 s t(norm)=0.0311558, mflops=160.484 2. CWP (best N) (N=5040): elapsed time t=1.72 s, 1024 iters, t-(init.)=1.56 s t(norm)=0.0264147, mflops=189.288 3. FFTPACK: elapsed time t=1.97 s, 512 iters, t-(init.)=1.89 s t(norm)=0.0640049, mflops=78.119 (err=1.8e-15) 4. 
FFTPACK (f2c): elapsed time t=1.71 s, 128 iters, t-(init.)=1.69 s t(norm)=0.228928, mflops=21.841 (err=1.9e-15) FFTW_MEASURE plan: (cost = 2.031250e-03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.11 s, 512 iters, t-(init.)=1.02 s t(norm)=0.0345423, mflops=144.75 (err=1.8e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.11 s, 512 iters, t-(init.)=1.03 s t(norm)=0.034881, mflops=143.345 (err=1.8e-15) 7. Frigo-old: elapsed time t=1.6 s, 128 iters, t-(init.)=1.58 s t(norm)=0.214027, mflops=23.3615 (err=1.9e-15) 8. GSL: elapsed time t=1.55 s, 512 iters, t-(init.)=1.47 s t(norm)=0.0497816, mflops=100.439 (err=1.9e-15) 9. NAPACK (f2c): elapsed time t=1.35 s, 128 iters, t-(init.)=1.33 s t(norm)=0.180162, mflops=27.7528 (err=3.5e-13) 10. Nielsen: elapsed time t=1.4 s, 128 iters, t-(init.)=1.38 s t(norm)=0.186935, mflops=26.7473 (err=3.8e-14) 11. Singleton: elapsed time t=1.17 s, 256 iters, t-(init.)=1.13 s t(norm)=0.0765349, mflops=65.3296 (err=2.4e-15) 12. Singleton (f2c): elapsed time t=1.15 s, 256 iters, t-(init.)=1.12 s t(norm)=0.0758576, mflops=65.9129 (err=2.4e-15) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.39 s, 32 iters, t-(init.)=1.38 s t(norm)=0.74774, mflops=6.68682 (err=1.8e-15) 16. SCSL: elapsed time t=1.66 s, 512 iters, t-(init.)=1.58 s t(norm)=0.0535067, mflops=93.4462 (err=1.8e-15) 17. SGIMATH: elapsed time t=1.66 s, 512 iters, t-(init.)=1.58 s t(norm)=0.0535067, mflops=93.4462 (err=1.8e-15) Top mflops for N=4725 = 189.288 Normalized results and averages for N=4725: fft 0: mflops = 18.8323 (norm. = 0.0994898), norm. avg. (of 11) = 0.079751 fft 1: mflops = 160.484 (norm. = 0.847826), norm. avg. (of 14) = 0.600072 fft 2: mflops = 189.288 (norm. = 1), norm. avg. (of 14) = 0.618505 fft 3: mflops = 78.119 (norm. 
= 0.412698), norm. avg. (of 14) = 0.499885 fft 4: mflops = 21.841 (norm. = 0.115385), norm. avg. (of 14) = 0.134747 fft 5: mflops = 144.75 (norm. = 0.764706), norm. avg. (of 14) = 0.898719 fft 6: mflops = 143.345 (norm. = 0.757282), norm. avg. (of 14) = 0.87612 fft 7: mflops = 23.3615 (norm. = 0.123418), norm. avg. (of 14) = 0.155556 fft 8: mflops = 100.439 (norm. = 0.530612), norm. avg. (of 14) = 0.481227 fft 9: mflops = 27.7528 (norm. = 0.146617), norm. avg. (of 14) = 0.166986 fft 10: mflops = 26.7473 (norm. = 0.141304), norm. avg. (of 14) = 0.129555 fft 11: mflops = 65.3296 (norm. = 0.345133), norm. avg. (of 14) = 0.299363 fft 12: mflops = 65.9129 (norm. = 0.348214), norm. avg. (of 14) = 0.298784 fft 13: mflops = -1 (norm. = -0.00528294), norm. avg. (of 10) = 0.259821 fft 14: mflops = -1 (norm. = -0.00528294), norm. avg. (of 10) = 0.112294 fft 15: mflops = 6.68682 (norm. = 0.0353261), norm. avg. (of 14) = 0.0394937 fft 16: mflops = 93.4462 (norm. = 0.493671), norm. avg. (of 14) = 0.594797 fft 17: mflops = 93.4462 (norm. = 0.493671), norm. avg. (of 14) = 0.602472 Benchmarking for array size = 10368: 0. Brenner: elapsed time t=1.14 s, 32 iters, t-(init.)=1.13 s t(norm)=0.255319, mflops=19.5834 (err=3.1e-15) 1. CWP (min N) (N=10920): elapsed time t=1.26 s, 256 iters, t-(init.)=1.17 s t(norm)=0.0330446, mflops=151.311 2. CWP (best N) (N=11088): elapsed time t=1.08 s, 256 iters, t-(init.)=0.99 s t(norm)=0.0279608, mflops=178.822 3. FFTPACK: elapsed time t=1.4 s, 256 iters, t-(init.)=1.32 s t(norm)=0.037281, mflops=134.116 (err=3.0e-15) 4. FFTPACK (f2c): elapsed time t=1.47 s, 64 iters, t-(init.)=1.44 s t(norm)=0.162681, mflops=30.735 (err=3.0e-15) FFTW_MEASURE plan: (cost = 4.062500e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 8 FFTW_NOTW 16 5. FFTW: elapsed time t=1.1 s, 256 iters, t-(init.)=1.01 s t(norm)=0.0285256, mflops=175.281 (err=3.0e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 6. 
FFTW_ESTIMATE: elapsed time t=1.18 s, 256 iters, t-(init.)=1.1 s t(norm)=0.0310675, mflops=160.94 (err=3.0e-15) 7. Frigo-old: elapsed time t=1.42 s, 64 iters, t-(init.)=1.4 s t(norm)=0.158162, mflops=31.6132 (err=3.1e-15) 8. GSL: elapsed time t=1.61 s, 256 iters, t-(init.)=1.53 s t(norm)=0.0432121, mflops=115.708 (err=3.0e-15) 9. NAPACK (f2c): elapsed time t=1.9 s, 128 iters, t-(init.)=1.86 s t(norm)=0.105065, mflops=47.5897 (err=8.1e-14) 10. Nielsen: elapsed time t=1.59 s, 64 iters, t-(init.)=1.57 s t(norm)=0.177367, mflops=28.1901 (err=8.4e-15) 11. Singleton: elapsed time t=1.27 s, 128 iters, t-(init.)=1.23 s t(norm)=0.0694783, mflops=71.9649 (err=4.4e-15) 12. Singleton (f2c): elapsed time t=1.31 s, 128 iters, t-(init.)=1.27 s t(norm)=0.0717378, mflops=69.6983 (err=4.4e-15) 13. Temperton: elapsed time t=1.74 s, 128 iters, t-(init.)=1.7 s t(norm)=0.0960269, mflops=52.0687 (err=2.1e-07) 14. Temperton (f2c): elapsed time t=1.61 s, 64 iters, t-(init.)=1.59 s t(norm)=0.179627, mflops=27.8355 (err=3.0e-15) 15. Valkenburg: elapsed time t=1.4 s, 16 iters, t-(init.)=1.39 s t(norm)=0.628129, mflops=7.96015 (err=3.0e-15) 16. SCSL: elapsed time t=1.34 s, 256 iters, t-(init.)=1.26 s t(norm)=0.0355864, mflops=140.503 (err=3.0e-15) 17. SGIMATH: elapsed time t=1.33 s, 256 iters, t-(init.)=1.24 s t(norm)=0.0350216, mflops=142.769 (err=3.0e-15) Top mflops for N=10368 = 178.822 Normalized results and averages for N=10368: fft 0: mflops = 19.5834 (norm. = 0.109513), norm. avg. (of 12) = 0.0822312 fft 1: mflops = 151.311 (norm. = 0.846154), norm. avg. (of 15) = 0.616477 fft 2: mflops = 178.822 (norm. = 1), norm. avg. (of 15) = 0.643938 fft 3: mflops = 134.116 (norm. = 0.75), norm. avg. (of 15) = 0.516559 fft 4: mflops = 30.735 (norm. = 0.171875), norm. avg. (of 15) = 0.137222 fft 5: mflops = 175.281 (norm. = 0.980198), norm. avg. (of 15) = 0.904151 fft 6: mflops = 160.94 (norm. = 0.9), norm. avg. (of 15) = 0.877712 fft 7: mflops = 31.6132 (norm. = 0.176786), norm. avg. 
(of 15) = 0.156971 fft 8: mflops = 115.708 (norm. = 0.647059), norm. avg. (of 15) = 0.492283 fft 9: mflops = 47.5897 (norm. = 0.266129), norm. avg. (of 15) = 0.173596 fft 10: mflops = 28.1901 (norm. = 0.157643), norm. avg. (of 15) = 0.131428 fft 11: mflops = 71.9649 (norm. = 0.402439), norm. avg. (of 15) = 0.306235 fft 12: mflops = 69.6983 (norm. = 0.389764), norm. avg. (of 15) = 0.30485 fft 13: mflops = 52.0687 (norm. = 0.291176), norm. avg. (of 11) = 0.262672 fft 14: mflops = 27.8355 (norm. = 0.15566), norm. avg. (of 11) = 0.116237 fft 15: mflops = 7.96015 (norm. = 0.0445144), norm. avg. (of 15) = 0.0398284 fft 16: mflops = 140.503 (norm. = 0.785714), norm. avg. (of 15) = 0.607525 fft 17: mflops = 142.769 (norm. = 0.798387), norm. avg. (of 15) = 0.615533 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.61 s, 16 iters, t-(init.)=1.59 s t(norm)=0.250026, mflops=19.9979 (err=5.6e-15) 1. CWP (min N) (N=27720): elapsed time t=1.44 s, 128 iters, t-(init.)=1.33 s t(norm)=0.0261427, mflops=191.258 2. CWP (best N) (N=27720): elapsed time t=1.44 s, 128 iters, t-(init.)=1.33 s t(norm)=0.0261427, mflops=191.258 3. FFTPACK: elapsed time t=1.9 s, 128 iters, t-(init.)=1.79 s t(norm)=0.0351845, mflops=142.108 (err=5.5e-15) 4. FFTPACK (f2c): elapsed time t=1.98 s, 32 iters, t-(init.)=1.95 s t(norm)=0.153318, mflops=32.6119 (err=5.5e-15) FFTW_MEASURE plan: (cost = 1.312500e-02) FFTW_TWIDDLE 6 FFTW_TWIDDLE 10 FFTW_TWIDDLE 3 FFTW_TWIDDLE 10 FFTW_NOTW 15 5. FFTW: elapsed time t=1.8 s, 128 iters, t-(init.)=1.69 s t(norm)=0.0332189, mflops=150.517 (err=5.6e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.75 s, 128 iters, t-(init.)=1.64 s t(norm)=0.0322361, mflops=155.106 (err=5.6e-15) 7. Frigo-old: elapsed time t=1.17 s, 16 iters, t-(init.)=1.16 s t(norm)=0.182409, mflops=27.4109 (err=5.7e-15) 8. 
GSL: elapsed time t=1.11 s, 64 iters, t-(init.)=1.05 s t(norm)=0.0412779, mflops=121.13 (err=5.5e-15) 9. NAPACK (f2c): elapsed time t=1.01 s, 16 iters, t-(init.)=0.99 s t(norm)=0.155677, mflops=32.1178 (err=1.1e-12) 10. Nielsen: elapsed time t=1.81 s, 32 iters, t-(init.)=1.78 s t(norm)=0.139952, mflops=35.7266 (err=2.0e-13) 11. Singleton: elapsed time t=1.9 s, 64 iters, t-(init.)=1.84 s t(norm)=0.0723347, mflops=69.1232 (err=7.7e-15) 12. Singleton (f2c): elapsed time t=1.91 s, 64 iters, t-(init.)=1.85 s t(norm)=0.0727278, mflops=68.7495 (err=7.7e-15) 13. Temperton: elapsed time t=1.09 s, 32 iters, t-(init.)=1.07 s t(norm)=0.0841284, mflops=59.433 (err=1.4e-07) 14. Temperton (f2c): elapsed time t=1.35 s, 16 iters, t-(init.)=1.33 s t(norm)=0.209142, mflops=23.9073 (err=5.6e-15) 15. Valkenburg: elapsed time t=1.11 s, 4 iters, t-(init.)=1.1 s t(norm)=0.691897, mflops=7.22651 (err=5.5e-15) 16. SCSL: elapsed time t=1.04 s, 64 iters, t-(init.)=0.99 s t(norm)=0.0389192, mflops=128.471 (err=5.6e-15) 17. SGIMATH: elapsed time t=1.03 s, 64 iters, t-(init.)=0.97 s t(norm)=0.0381329, mflops=131.12 (err=5.6e-15) Top mflops for N=27000 = 191.258 Normalized results and averages for N=27000: fft 0: mflops = 19.9979 (norm. = 0.10456), norm. avg. (of 13) = 0.0839488 fft 1: mflops = 191.258 (norm. = 1), norm. avg. (of 16) = 0.640447 fft 2: mflops = 191.258 (norm. = 1), norm. avg. (of 16) = 0.666192 fft 3: mflops = 142.108 (norm. = 0.743017), norm. avg. (of 16) = 0.530713 fft 4: mflops = 32.6119 (norm. = 0.170513), norm. avg. (of 16) = 0.139303 fft 5: mflops = 150.517 (norm. = 0.786982), norm. avg. (of 16) = 0.896828 fft 6: mflops = 155.106 (norm. = 0.810976), norm. avg. (of 16) = 0.873541 fft 7: mflops = 27.4109 (norm. = 0.143319), norm. avg. (of 16) = 0.156118 fft 8: mflops = 121.13 (norm. = 0.633333), norm. avg. (of 16) = 0.501098 fft 9: mflops = 32.1178 (norm. = 0.167929), norm. avg. (of 16) = 0.173241 fft 10: mflops = 35.7266 (norm. = 0.186798), norm. avg. 
(of 16) = 0.134888 fft 11: mflops = 69.1232 (norm. = 0.361413), norm. avg. (of 16) = 0.309684 fft 12: mflops = 68.7495 (norm. = 0.359459), norm. avg. (of 16) = 0.308263 fft 13: mflops = 59.433 (norm. = 0.310748), norm. avg. (of 12) = 0.266678 fft 14: mflops = 23.9073 (norm. = 0.125), norm. avg. (of 12) = 0.116967 fft 15: mflops = 7.22651 (norm. = 0.0377841), norm. avg. (of 16) = 0.0397007 fft 16: mflops = 128.471 (norm. = 0.671717), norm. avg. (of 16) = 0.611537 fft 17: mflops = 131.12 (norm. = 0.685567), norm. avg. (of 16) = 0.61991 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.26 s, 4 iters, t-(init.)=1.25 s t(norm)=0.255064, mflops=19.6029 (err=1.1e-14) 1. CWP (min N) (N=80080): elapsed time t=1.28 s, 32 iters, t-(init.)=1.2 s t(norm)=0.0306077, mflops=163.357 2. CWP (best N) (N=80080): elapsed time t=1.28 s, 32 iters, t-(init.)=1.2 s t(norm)=0.0306077, mflops=163.357 3. FFTPACK: elapsed time t=1.21 s, 16 iters, t-(init.)=1.17 s t(norm)=0.0596851, mflops=83.7731 (err=1.1e-14) 4. FFTPACK (f2c): elapsed time t=1.03 s, 4 iters, t-(init.)=1.02 s t(norm)=0.208133, mflops=24.0232 (err=1.1e-14) FFTW_MEASURE plan: (cost = 4.125000e-02) FFTW_TWIDDLE 7 FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_NOTW 12 5. FFTW: elapsed time t=1.4 s, 32 iters, t-(init.)=1.32 s t(norm)=0.0336685, mflops=148.507 (err=1.1e-14) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.43 s, 32 iters, t-(init.)=1.36 s t(norm)=0.0346888, mflops=144.139 (err=1.1e-14) 7. Frigo-old: elapsed time t=1.77 s, 8 iters, t-(init.)=1.75 s t(norm)=0.178545, mflops=28.0041 (err=1.1e-14) 8. GSL: elapsed time t=1.96 s, 32 iters, t-(init.)=1.88 s t(norm)=0.0479521, mflops=104.271 (err=1.1e-14) 9. NAPACK (f2c): elapsed time t=1.59 s, 8 iters, t-(init.)=1.57 s t(norm)=0.16018, mflops=31.2148 (err=5.1e-12) 10. 
Nielsen: elapsed time t=1.59 s, 8 iters, t-(init.)=1.57 s t(norm)=0.16018, mflops=31.2148 (err=4.8e-13) 11. Singleton: elapsed time t=1.59 s, 16 iters, t-(init.)=1.55 s t(norm)=0.0790699, mflops=63.2352 (err=1.5e-14) 12. Singleton (f2c): elapsed time t=1.58 s, 16 iters, t-(init.)=1.54 s t(norm)=0.0785598, mflops=63.6458 (err=1.5e-14) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=1.79 s, 2 iters, t-(init.)=1.79 s t(norm)=0.730504, mflops=6.84459 (err=1.1e-14) 16. SCSL: elapsed time t=1.08 s, 16 iters, t-(init.)=1.04 s t(norm)=0.0530534, mflops=94.2447 (err=1.1e-14) 17. SGIMATH: elapsed time t=1.09 s, 16 iters, t-(init.)=1.05 s t(norm)=0.0535635, mflops=93.3471 (err=1.1e-14) Top mflops for N=75600 = 163.357 Normalized results and averages for N=75600: fft 0: mflops = 19.6029 (norm. = 0.12), norm. avg. (of 14) = 0.0865239 fft 1: mflops = 163.357 (norm. = 1), norm. avg. (of 17) = 0.661597 fft 2: mflops = 163.357 (norm. = 1), norm. avg. (of 17) = 0.685828 fft 3: mflops = 83.7731 (norm. = 0.512821), norm. avg. (of 17) = 0.52966 fft 4: mflops = 24.0232 (norm. = 0.147059), norm. avg. (of 17) = 0.139759 fft 5: mflops = 148.507 (norm. = 0.909091), norm. avg. (of 17) = 0.897549 fft 6: mflops = 144.139 (norm. = 0.882353), norm. avg. (of 17) = 0.874059 fft 7: mflops = 28.0041 (norm. = 0.171429), norm. avg. (of 17) = 0.157019 fft 8: mflops = 104.271 (norm. = 0.638298), norm. avg. (of 17) = 0.509169 fft 9: mflops = 31.2148 (norm. = 0.191083), norm. avg. (of 17) = 0.174291 fft 10: mflops = 31.2148 (norm. = 0.191083), norm. avg. (of 17) = 0.138194 fft 11: mflops = 63.2352 (norm. = 0.387097), norm. avg. (of 17) = 0.314237 fft 12: mflops = 63.6458 (norm. = 0.38961), norm. avg. (of 17) = 0.313048 fft 13: mflops = -1 (norm. = -0.00612154), norm. avg. (of 12) = 0.266678 fft 14: mflops = -1 (norm. = -0.00612154), norm. avg. (of 12) = 0.116967 fft 15: mflops = 6.84459 (norm. 
= 0.0418994), norm. avg. (of 17) = 0.03983 fft 16: mflops = 94.2447 (norm. = 0.576923), norm. avg. (of 17) = 0.609501 fft 17: mflops = 93.3471 (norm. = 0.571429), norm. avg. (of 17) = 0.617059 Benchmarking for array size = 165375: 0. Brenner: elapsed time t=1.13 s, 1 iters, t-(init.)=1.12 s t(norm)=0.390674, mflops=12.7984 (err=2.7e-14) 1. CWP (min N) (N=180180): elapsed time t=1.59 s, 16 iters, t-(init.)=1.49 s t(norm)=0.0324835, mflops=153.924 2. CWP (best N) (N=180180): elapsed time t=1.59 s, 16 iters, t-(init.)=1.49 s t(norm)=0.0324835, mflops=153.924 3. FFTPACK: elapsed time t=1.26 s, 4 iters, t-(init.)=1.24 s t(norm)=0.108133, mflops=46.2393 (err=2.7e-14) 4. FFTPACK (f2c): elapsed time t=1.9 s, 2 iters, t-(init.)=1.88 s t(norm)=0.327887, mflops=15.2491 (err=2.7e-14) FFTW_MEASURE plan: (cost = 1.850000e-01) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.54 s, 8 iters, t-(init.)=1.49 s t(norm)=0.064967, mflops=76.9621 (err=2.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.41 s, 8 iters, t-(init.)=1.35 s t(norm)=0.0588627, mflops=84.9434 (err=2.7e-14) 7. Frigo-old: elapsed time t=1.7 s, 2 iters, t-(init.)=1.68 s t(norm)=0.293006, mflops=17.0645 (err=2.7e-14) 8. GSL: elapsed time t=1.66 s, 8 iters, t-(init.)=1.61 s t(norm)=0.0701993, mflops=71.2258 (err=2.7e-14) 9. NAPACK (f2c): elapsed time t=1.28 s, 2 iters, t-(init.)=1.27 s t(norm)=0.221498, mflops=22.5735 (err=1.6e-11) 10. Nielsen: elapsed time t=1.3 s, 2 iters, t-(init.)=1.28 s t(norm)=0.223242, mflops=22.3972 (err=1.6e-12) 11. Singleton: elapsed time t=1.35 s, 4 iters, t-(init.)=1.33 s t(norm)=0.115981, mflops=43.1104 (err=4.0e-14) 12. Singleton (f2c): elapsed time t=1.35 s, 4 iters, t-(init.)=1.33 s t(norm)=0.115981, mflops=43.1104 (err=4.0e-14) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=2.49 s, 1 iters, t-(init.)=2.48 s t(norm)=0.865064, mflops=5.77992 (err=2.7e-14) 16. SCSL: elapsed time t=1.63 s, 8 iters, t-(init.)=1.59 s t(norm)=0.0693272, mflops=72.1217 (err=2.7e-14) 17. SGIMATH: elapsed time t=1.62 s, 8 iters, t-(init.)=1.57 s t(norm)=0.0684552, mflops=73.0405 (err=2.7e-14) Top mflops for N=165375 = 153.924 Normalized results and averages for N=165375: fft 0: mflops = 12.7984 (norm. = 0.0831473), norm. avg. (of 15) = 0.0862988 fft 1: mflops = 153.924 (norm. = 1), norm. avg. (of 18) = 0.680397 fft 2: mflops = 153.924 (norm. = 1), norm. avg. (of 18) = 0.703282 fft 3: mflops = 46.2393 (norm. = 0.300403), norm. avg. (of 18) = 0.516924 fft 4: mflops = 15.2491 (norm. = 0.0990691), norm. avg. (of 18) = 0.137499 fft 5: mflops = 76.9621 (norm. = 0.5), norm. avg. (of 18) = 0.875463 fft 6: mflops = 84.9434 (norm. = 0.551852), norm. avg. (of 18) = 0.856159 fft 7: mflops = 17.0645 (norm. = 0.110863), norm. avg. (of 18) = 0.154454 fft 8: mflops = 71.2258 (norm. = 0.462733), norm. avg. (of 18) = 0.506589 fft 9: mflops = 22.5735 (norm. = 0.146654), norm. avg. (of 18) = 0.172756 fft 10: mflops = 22.3972 (norm. = 0.145508), norm. avg. (of 18) = 0.1386 fft 11: mflops = 43.1104 (norm. = 0.280075), norm. avg. (of 18) = 0.312339 fft 12: mflops = 43.1104 (norm. = 0.280075), norm. avg. (of 18) = 0.311216 fft 13: mflops = -1 (norm. = -0.0064967), norm. avg. (of 12) = 0.266678 fft 14: mflops = -1 (norm. = -0.0064967), norm. avg. (of 12) = 0.116967 fft 15: mflops = 5.77992 (norm. = 0.0375504), norm. avg. (of 18) = 0.0397034 fft 16: mflops = 72.1217 (norm. = 0.468553), norm. avg. (of 18) = 0.601671 fft 17: mflops = 73.0405 (norm. = 0.474522), norm. avg. (of 18) = 0.60914 Benchmarking for array size = 362880: 0. Brenner: elapsed time t=3.25 s, 1 iters, t-(init.)=3.23 s t(norm)=0.48194, mflops=10.3747 (err=1.1e-13) 1. 
CWP (min N) (N=720720): elapsed time t=1.28 s, 2 iters, t-(init.)=1.18 s t(norm)=0.0880324, mflops=56.7973 2. CWP (best N) (N=720720): elapsed time t=1.27 s, 2 iters, t-(init.)=1.16 s t(norm)=0.0865403, mflops=57.7765 3. FFTPACK: elapsed time t=1.52 s, 2 iters, t-(init.)=1.47 s t(norm)=0.109667, mflops=45.5924 (err=1.1e-13) 4. FFTPACK (f2c): elapsed time t=1.87 s, 1 iters, t-(init.)=1.84 s t(norm)=0.274542, mflops=18.2122 (err=1.1e-13) FFTW_MEASURE plan: (cost = 4.900000e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_NOTW 12 5. FFTW: elapsed time t=1.01 s, 2 iters, t-(init.)=0.96 s t(norm)=0.0716196, mflops=69.8133 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1 s, 2 iters, t-(init.)=0.95 s t(norm)=0.0708735, mflops=70.5482 (err=1.1e-13) 7. Frigo-old: elapsed time t=1.9 s, 1 iters, t-(init.)=1.87 s t(norm)=0.279018, mflops=17.92 (err=1.1e-13) 8. GSL: elapsed time t=1.14 s, 2 iters, t-(init.)=1.09 s t(norm)=0.0813181, mflops=61.487 (err=1.1e-13) 9. NAPACK (f2c): elapsed time t=1.68 s, 1 iters, t-(init.)=1.66 s t(norm)=0.247684, mflops=20.187 (err=3.4e-11) 10. Nielsen: elapsed time t=2.45 s, 1 iters, t-(init.)=2.42 s t(norm)=0.361082, mflops=13.8473 (err=3.5e-12) 11. Singleton: elapsed time t=1.24 s, 1 iters, t-(init.)=1.23 s t(norm)=0.183525, mflops=27.2442 (err=1.6e-13) 12. Singleton (f2c): elapsed time t=1.26 s, 1 iters, t-(init.)=1.24 s t(norm)=0.185017, mflops=27.0245 (err=1.6e-13) 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 15. Valkenburg: elapsed time t=5.92 s, 1 iters, t-(init.)=5.89 s t(norm)=0.878832, mflops=5.68937 (err=1.1e-13) 16. SCSL: elapsed time t=1.8 s, 4 iters, t-(init.)=1.71 s t(norm)=0.0637862, mflops=78.3869 (err=1.1e-13) 17. 
SGIMATH: elapsed time t=1.81 s, 4 iters, t-(init.)=1.72 s t(norm)=0.0641592, mflops=77.9312 (err=1.1e-13) Top mflops for N=362880 = 78.3869 Normalized results and averages for N=362880: fft 0: mflops = 10.3747 (norm. = 0.132353), norm. avg. (of 16) = 0.0891771 fft 1: mflops = 56.7973 (norm. = 0.724576), norm. avg. (of 19) = 0.682723 fft 2: mflops = 57.7765 (norm. = 0.737069), norm. avg. (of 19) = 0.70506 fft 3: mflops = 45.5924 (norm. = 0.581633), norm. avg. (of 19) = 0.520329 fft 4: mflops = 18.2122 (norm. = 0.232337), norm. avg. (of 19) = 0.14249 fft 5: mflops = 69.8133 (norm. = 0.890625), norm. avg. (of 19) = 0.876261 fft 6: mflops = 70.5482 (norm. = 0.9), norm. avg. (of 19) = 0.858466 fft 7: mflops = 17.92 (norm. = 0.22861), norm. avg. (of 19) = 0.158357 fft 8: mflops = 61.487 (norm. = 0.784404), norm. avg. (of 19) = 0.521211 fft 9: mflops = 20.187 (norm. = 0.25753), norm. avg. (of 19) = 0.177217 fft 10: mflops = 13.8473 (norm. = 0.176653), norm. avg. (of 19) = 0.140603 fft 11: mflops = 27.2442 (norm. = 0.347561), norm. avg. (of 19) = 0.314193 fft 12: mflops = 27.0245 (norm. = 0.344758), norm. avg. (of 19) = 0.312981 fft 13: mflops = -1 (norm. = -0.0127572), norm. avg. (of 12) = 0.266678 fft 14: mflops = -1 (norm. = -0.0127572), norm. avg. (of 12) = 0.116967 fft 15: mflops = 5.68937 (norm. = 0.0725806), norm. avg. (of 19) = 0.0414337 fft 16: mflops = 78.3869 (norm. = 1), norm. avg. (of 19) = 0.622635 fft 17: mflops = 77.9312 (norm. = 0.994186), norm. avg. (of 19) = 0.629405 ------------------------------------------------------ @@@@ bench.3d.p2.log FFT Benchmark Program by M. Frigo and S. G. Johnson. email: fftw@theory.lcs.mit.edu www: http://theory.lcs.mit.edu/~fftw Using FFTW V1.1 ($Id: executor.c,v 1.34 1997/04/30 13:15:56 fftw Exp $) Maximum memory to use: 200 MB Factors to allow: 2 Using double precision. 
Measuring speed of 3D transforms: Benchmarking for sizes: 4x4x4 (0.00282288 MB) 8x8x8 (0.0166779 MB) 16x16x16 (0.126419 MB) 32x32x32 (1.00215 MB) 64x64x64 (8.00362 MB) 256x64x32 (16.0061 MB) 16x1024x64 (32.0175 MB) 128x128x128 (64.0065 MB) 512x128x64 (128.011 MB) Maximum array size N = 4194304 Benchmarking FFTs: 0. FFTW 1. HARM 2. HARM (f2c) 3. NR (C) 4. NR (F) 5. PDA 6. PDA (f2c) 7. Singleton 8. Singleton (f2c) 9. Temperton 10. Temperton (f2c) 11. SCSL Computing normalized averages (12 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.36 s, 131072 iters, t-(init.)=1.27 s t(norm)=0.0252326, mflops=198.156 (err=1.9e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. Skipping fft (all dimensions must be > 4 for HARM). 3. NR (C): elapsed time t=1.63 s, 65536 iters, t-(init.)=1.58 s t(norm)=0.0627836, mflops=79.6387 (err=2.4e-16) 4. NR (F): elapsed time t=1.23 s, 32768 iters, t-(init.)=1.21 s t(norm)=0.0961622, mflops=51.9955 (err=2.4e-16) 5. PDA: elapsed time t=1.46 s, 16384 iters, t-(init.)=1.45 s t(norm)=0.230471, mflops=21.6947 (err=2.8e-16) 6. PDA (f2c): elapsed time t=1.13 s, 8192 iters, t-(init.)=1.12 s t(norm)=0.356038, mflops=14.0434 (err=2.8e-16) 7. Singleton: elapsed time t=1.1 s, 65536 iters, t-(init.)=1.06 s t(norm)=0.0421206, mflops=118.707 (err=1.9e-16) 8. Singleton (f2c): elapsed time t=1.05 s, 65536 iters, t-(init.)=1 s t(norm)=0.0397364, mflops=125.829 (err=1.9e-16) 9. Temperton: elapsed time t=1.9 s, 65536 iters, t-(init.)=1.86 s t(norm)=0.0739098, mflops=67.6501 (err=1.9e-16) 10. Temperton (f2c): elapsed time t=1.1 s, 32768 iters, t-(init.)=1.08 s t(norm)=0.0858307, mflops=58.2542 (err=1.9e-16) 11. SCSL: elapsed time t=1.53 s, 65536 iters, t-(init.)=1.49 s t(norm)=0.0592073, mflops=84.4491 (err=1.9e-16) Top mflops for N=64 = 198.156 Normalized results and averages for N=64: fft 0: mflops = 198.156 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.00504653), norm. avg. 
(of 0) = -1 fft 2: mflops = -1 (norm. = -0.00504653), norm. avg. (of 0) = -1 fft 3: mflops = 79.6387 (norm. = 0.401899), norm. avg. (of 1) = 0.401899 fft 4: mflops = 51.9955 (norm. = 0.262397), norm. avg. (of 1) = 0.262397 fft 5: mflops = 21.6947 (norm. = 0.109483), norm. avg. (of 1) = 0.109483 fft 6: mflops = 14.0434 (norm. = 0.0708705), norm. avg. (of 1) = 0.0708705 fft 7: mflops = 118.707 (norm. = 0.599057), norm. avg. (of 1) = 0.599057 fft 8: mflops = 125.829 (norm. = 0.635), norm. avg. (of 1) = 0.635 fft 9: mflops = 67.6501 (norm. = 0.341398), norm. avg. (of 1) = 0.341398 fft 10: mflops = 58.2542 (norm. = 0.293981), norm. avg. (of 1) = 0.293981 fft 11: mflops = 84.4491 (norm. = 0.426174), norm. avg. (of 1) = 0.426174 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.39 s, 16384 iters, t-(init.)=1.3 s t(norm)=0.0172191, mflops=290.375 (err=3.4e-16) 1. HARM: elapsed time t=1.59 s, 8192 iters, t-(init.)=1.55 s t(norm)=0.041061, mflops=121.77 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.8 s, 8192 iters, t-(init.)=1.76 s t(norm)=0.0466241, mflops=107.241 (err=4.0e-16) 3. NR (C): elapsed time t=1.61 s, 8192 iters, t-(init.)=1.56 s t(norm)=0.0413259, mflops=120.99 (err=3.5e-16) 4. NR (F): elapsed time t=1.16 s, 4096 iters, t-(init.)=1.13 s t(norm)=0.0598696, mflops=83.5149 (err=3.5e-16) 5. PDA: elapsed time t=1.16 s, 2048 iters, t-(init.)=1.15 s t(norm)=0.121858, mflops=41.0312 (err=3.1e-16) 6. PDA (f2c): elapsed time t=1.12 s, 1024 iters, t-(init.)=1.12 s t(norm)=0.237359, mflops=21.0651 (err=3.1e-16) 7. Singleton: elapsed time t=1.66 s, 8192 iters, t-(init.)=1.62 s t(norm)=0.0429153, mflops=116.508 (err=3.5e-16) 8. Singleton (f2c): elapsed time t=1.69 s, 8192 iters, t-(init.)=1.64 s t(norm)=0.0434452, mflops=115.088 (err=3.5e-16) 9. Temperton: elapsed time t=1.26 s, 8192 iters, t-(init.)=1.21 s t(norm)=0.0320541, mflops=155.987 (err=1.3e-08) 10. 
Temperton (f2c): elapsed time t=1.48 s, 8192 iters, t-(init.)=1.43 s t(norm)=0.0378821, mflops=131.989 (err=3.3e-16) 11. SCSL: elapsed time t=1.84 s, 16384 iters, t-(init.)=1.76 s t(norm)=0.023312, mflops=214.481 (err=3.5e-16) Top mflops for N=512 = 290.375 Normalized results and averages for N=512: fft 0: mflops = 290.375 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 121.77 (norm. = 0.419355), norm. avg. (of 1) = 0.419355 fft 2: mflops = 107.241 (norm. = 0.369318), norm. avg. (of 1) = 0.369318 fft 3: mflops = 120.99 (norm. = 0.416667), norm. avg. (of 2) = 0.409283 fft 4: mflops = 83.5149 (norm. = 0.287611), norm. avg. (of 2) = 0.275004 fft 5: mflops = 41.0312 (norm. = 0.141304), norm. avg. (of 2) = 0.125394 fft 6: mflops = 21.0651 (norm. = 0.0725446), norm. avg. (of 2) = 0.0717076 fft 7: mflops = 116.508 (norm. = 0.401235), norm. avg. (of 2) = 0.500146 fft 8: mflops = 115.088 (norm. = 0.396341), norm. avg. (of 2) = 0.515671 fft 9: mflops = 155.987 (norm. = 0.53719), norm. avg. (of 2) = 0.439294 fft 10: mflops = 131.989 (norm. = 0.454545), norm. avg. (of 2) = 0.374263 fft 11: mflops = 214.481 (norm. = 0.738636), norm. avg. (of 2) = 0.582405 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.85 s, 1024 iters, t-(init.)=1.72 s t(norm)=0.0341733, mflops=146.313 (err=4.2e-16) 1. HARM: elapsed time t=1.38 s, 512 iters, t-(init.)=1.31 s t(norm)=0.0520547, mflops=96.0528 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.48 s, 512 iters, t-(init.)=1.41 s t(norm)=0.0560284, mflops=89.2405 (err=4.0e-16) 3. NR (C): elapsed time t=1.43 s, 256 iters, t-(init.)=1.4 s t(norm)=0.111262, mflops=44.939 (err=4.6e-16) 4. NR (F): elapsed time t=1.5 s, 256 iters, t-(init.)=1.46 s t(norm)=0.11603, mflops=43.0922 (err=4.6e-16) 5. PDA: elapsed time t=1.11 s, 256 iters, t-(init.)=1.07 s t(norm)=0.085036, mflops=58.7987 (err=3.9e-16) 6. PDA (f2c): elapsed time t=1.12 s, 128 iters, t-(init.)=1.11 s t(norm)=0.17643, mflops=28.3399 (err=3.9e-16) 7. 
Singleton: elapsed time t=1.95 s, 512 iters, t-(init.)=1.88 s t(norm)=0.0747045, mflops=66.9304 (err=4.1e-16) 8. Singleton (f2c): elapsed time t=1.94 s, 512 iters, t-(init.)=1.88 s t(norm)=0.0747045, mflops=66.9304 (err=4.1e-16) 9. Temperton: elapsed time t=1.81 s, 512 iters, t-(init.)=1.75 s t(norm)=0.0695388, mflops=71.9024 (err=6.3e-08) 10. Temperton (f2c): elapsed time t=1.58 s, 512 iters, t-(init.)=1.52 s t(norm)=0.0603994, mflops=82.7823 (err=4.5e-16) 11. SCSL: elapsed time t=1.45 s, 1024 iters, t-(init.)=1.32 s t(norm)=0.026226, mflops=190.65 (err=4.5e-16) Top mflops for N=4096 = 190.65 Normalized results and averages for N=4096: fft 0: mflops = 146.313 (norm. = 0.767442), norm. avg. (of 3) = 0.922481 fft 1: mflops = 96.0528 (norm. = 0.503817), norm. avg. (of 2) = 0.461586 fft 2: mflops = 89.2405 (norm. = 0.468085), norm. avg. (of 2) = 0.418702 fft 3: mflops = 44.939 (norm. = 0.235714), norm. avg. (of 3) = 0.351427 fft 4: mflops = 43.0922 (norm. = 0.226027), norm. avg. (of 3) = 0.258678 fft 5: mflops = 58.7987 (norm. = 0.308411), norm. avg. (of 3) = 0.186399 fft 6: mflops = 28.3399 (norm. = 0.148649), norm. avg. (of 3) = 0.0973546 fft 7: mflops = 66.9304 (norm. = 0.351064), norm. avg. (of 3) = 0.450452 fft 8: mflops = 66.9304 (norm. = 0.351064), norm. avg. (of 3) = 0.460802 fft 9: mflops = 71.9024 (norm. = 0.377143), norm. avg. (of 3) = 0.418577 fft 10: mflops = 82.7823 (norm. = 0.434211), norm. avg. (of 3) = 0.394246 fft 11: mflops = 190.65 (norm. = 1), norm. avg. (of 3) = 0.721604 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.23 s, 64 iters, t-(init.)=1.16 s t(norm)=0.0368754, mflops=135.592 (err=5.2e-16) 1. HARM: elapsed time t=1.03 s, 32 iters, t-(init.)=0.99 s t(norm)=0.0629425, mflops=79.4376 (err=5.0e-16) 2. HARM (f2c): elapsed time t=1.1 s, 32 iters, t-(init.)=1.07 s t(norm)=0.0680288, mflops=73.4983 (err=5.0e-16) 3. 
NR (C): elapsed time t=1.11 s, 16 iters, t-(init.)=1.09 s t(norm)=0.138601, mflops=36.0749 (err=5.2e-16) 4. NR (F): elapsed time t=1.11 s, 16 iters, t-(init.)=1.09 s t(norm)=0.138601, mflops=36.0749 (err=5.2e-16) 5. PDA: elapsed time t=1.23 s, 32 iters, t-(init.)=1.19 s t(norm)=0.0756582, mflops=66.0867 (err=4.3e-16) 6. PDA (f2c): elapsed time t=1.64 s, 16 iters, t-(init.)=1.63 s t(norm)=0.207265, mflops=24.1237 (err=4.3e-16) 7. Singleton: elapsed time t=1.55 s, 32 iters, t-(init.)=1.52 s t(norm)=0.096639, mflops=51.7389 (err=5.3e-16) 8. Singleton (f2c): elapsed time t=1.57 s, 32 iters, t-(init.)=1.53 s t(norm)=0.0972748, mflops=51.4008 (err=5.3e-16) 9. Temperton: elapsed time t=1.18 s, 32 iters, t-(init.)=1.15 s t(norm)=0.073115, mflops=68.3854 (err=9.6e-08) 10. Temperton (f2c): elapsed time t=1.14 s, 32 iters, t-(init.)=1.11 s t(norm)=0.0705719, mflops=70.8497 (err=4.9e-16) 11. SCSL: elapsed time t=1.58 s, 128 iters, t-(init.)=1.44 s t(norm)=0.0228882, mflops=218.453 (err=4.9e-16) Top mflops for N=32768 = 218.453 Normalized results and averages for N=32768: fft 0: mflops = 135.592 (norm. = 0.62069), norm. avg. (of 4) = 0.847033 fft 1: mflops = 79.4376 (norm. = 0.363636), norm. avg. (of 3) = 0.428936 fft 2: mflops = 73.4983 (norm. = 0.336449), norm. avg. (of 3) = 0.391284 fft 3: mflops = 36.0749 (norm. = 0.165138), norm. avg. (of 4) = 0.304854 fft 4: mflops = 36.0749 (norm. = 0.165138), norm. avg. (of 4) = 0.235293 fft 5: mflops = 66.0867 (norm. = 0.302521), norm. avg. (of 4) = 0.21543 fft 6: mflops = 24.1237 (norm. = 0.110429), norm. avg. (of 4) = 0.100623 fft 7: mflops = 51.7389 (norm. = 0.236842), norm. avg. (of 4) = 0.397049 fft 8: mflops = 51.4008 (norm. = 0.235294), norm. avg. (of 4) = 0.404425 fft 9: mflops = 68.3854 (norm. = 0.313043), norm. avg. (of 4) = 0.392194 fft 10: mflops = 70.8497 (norm. = 0.324324), norm. avg. (of 4) = 0.376765 fft 11: mflops = 218.453 (norm. = 1), norm. avg. 
(of 4) = 0.791203 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.21 s, 4 iters, t-(init.)=1.17 s t(norm)=0.0619888, mflops=80.6597 (err=1.2e-15) 1. HARM: elapsed time t=1.43 s, 4 iters, t-(init.)=1.39 s t(norm)=0.0736448, mflops=67.8934 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.42 s, 4 iters, t-(init.)=1.39 s t(norm)=0.0736448, mflops=67.8934 (err=1.2e-15) 3. NR (C): elapsed time t=1.99 s, 2 iters, t-(init.)=1.97 s t(norm)=0.208749, mflops=23.9522 (err=1.3e-15) 4. NR (F): elapsed time t=1 s, 1 iters, t-(init.)=0.99 s t(norm)=0.209808, mflops=23.8313 (err=1.3e-15) 5. PDA: elapsed time t=1.31 s, 2 iters, t-(init.)=1.29 s t(norm)=0.136693, mflops=36.5782 (err=1.2e-15) 6. PDA (f2c): elapsed time t=1.21 s, 1 iters, t-(init.)=1.2 s t(norm)=0.254313, mflops=19.6608 (err=1.2e-15) 7. Singleton: elapsed time t=1.24 s, 2 iters, t-(init.)=1.22 s t(norm)=0.129276, mflops=38.677 (err=1.7e-15) 8. Singleton (f2c): elapsed time t=1.25 s, 2 iters, t-(init.)=1.23 s t(norm)=0.130335, mflops=38.3625 (err=1.7e-15) 9. Temperton: elapsed time t=1.81 s, 4 iters, t-(init.)=1.77 s t(norm)=0.093778, mflops=53.3174 (err=1.3e-07) 10. Temperton (f2c): elapsed time t=1.78 s, 4 iters, t-(init.)=1.74 s t(norm)=0.0921885, mflops=54.2367 (err=1.3e-15) 11. SCSL: elapsed time t=1.14 s, 8 iters, t-(init.)=1.06 s t(norm)=0.0280804, mflops=178.06 (err=1.3e-15) Top mflops for N=262144 = 178.06 Normalized results and averages for N=262144: fft 0: mflops = 80.6597 (norm. = 0.452991), norm. avg. (of 5) = 0.768225 fft 1: mflops = 67.8934 (norm. = 0.381295), norm. avg. (of 4) = 0.417026 fft 2: mflops = 67.8934 (norm. = 0.381295), norm. avg. (of 4) = 0.388787 fft 3: mflops = 23.9522 (norm. = 0.134518), norm. avg. (of 5) = 0.270787 fft 4: mflops = 23.8313 (norm. = 0.133838), norm. avg. (of 5) = 0.215002 fft 5: mflops = 36.5782 (norm. = 0.205426), norm. avg. (of 5) = 0.213429 fft 6: mflops = 19.6608 (norm. = 0.110417), norm. avg. (of 5) = 0.102582 fft 7: mflops = 38.677 (norm. 
= 0.217213), norm. avg. (of 5) = 0.361082 fft 8: mflops = 38.3625 (norm. = 0.215447), norm. avg. (of 5) = 0.366629 fft 9: mflops = 53.3174 (norm. = 0.299435), norm. avg. (of 5) = 0.373642 fft 10: mflops = 54.2367 (norm. = 0.304598), norm. avg. (of 5) = 0.362332 fft 11: mflops = 178.06 (norm. = 1), norm. avg. (of 5) = 0.832962 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.24 s, 1 iters, t-(init.)=1.2 s t(norm)=0.120464, mflops=41.5061 (err=1.2e-15) 1. HARM: elapsed time t=1.8 s, 1 iters, t-(init.)=1.76 s t(norm)=0.176681, mflops=28.2996 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.52 s, 1 iters, t-(init.)=1.48 s t(norm)=0.148572, mflops=33.6536 (err=1.2e-15) 3. NR (C): elapsed time t=5.58 s, 1 iters, t-(init.)=5.54 s t(norm)=0.556143, mflops=8.9905 (err=1.3e-15) 4. NR (F): elapsed time t=5.59 s, 1 iters, t-(init.)=5.55 s t(norm)=0.557147, mflops=8.9743 (err=1.3e-15) 5. PDA: elapsed time t=1.85 s, 1 iters, t-(init.)=1.81 s t(norm)=0.1817, mflops=27.5179 (err=1.2e-15) 6. PDA (f2c): elapsed time t=2.96 s, 1 iters, t-(init.)=2.92 s t(norm)=0.293129, mflops=17.0573 (err=1.2e-15) 7. Singleton: elapsed time t=3.7 s, 1 iters, t-(init.)=3.67 s t(norm)=0.368419, mflops=13.5715 (err=1.7e-15) 8. Singleton (f2c): elapsed time t=3.77 s, 1 iters, t-(init.)=3.74 s t(norm)=0.375447, mflops=13.3175 (err=1.7e-15) 9. Temperton: elapsed time t=2.17 s, 1 iters, t-(init.)=2.13 s t(norm)=0.213824, mflops=23.3837 (err=1.5e-07) 10. Temperton (f2c): elapsed time t=2.21 s, 1 iters, t-(init.)=2.17 s t(norm)=0.217839, mflops=22.9527 (err=1.3e-15) 11. SCSL: elapsed time t=1.08 s, 2 iters, t-(init.)=1.01 s t(norm)=0.0506953, mflops=98.6284 (err=1.3e-15) Top mflops for N=524288 = 98.6284 Normalized results and averages for N=524288: fft 0: mflops = 41.5061 (norm. = 0.420833), norm. avg. (of 6) = 0.710326 fft 1: mflops = 28.2996 (norm. = 0.286932), norm. avg. (of 5) = 0.391007 fft 2: mflops = 33.6536 (norm. = 0.341216), norm. avg. 
(of 5) = 0.379273 fft 3: mflops = 8.9905 (norm. = 0.0911552), norm. avg. (of 6) = 0.240848 fft 4: mflops = 8.9743 (norm. = 0.090991), norm. avg. (of 6) = 0.194334 fft 5: mflops = 27.5179 (norm. = 0.279006), norm. avg. (of 6) = 0.224359 fft 6: mflops = 17.0573 (norm. = 0.172945), norm. avg. (of 6) = 0.114309 fft 7: mflops = 13.5715 (norm. = 0.137602), norm. avg. (of 6) = 0.323835 fft 8: mflops = 13.3175 (norm. = 0.135027), norm. avg. (of 6) = 0.328029 fft 9: mflops = 23.3837 (norm. = 0.237089), norm. avg. (of 6) = 0.350883 fft 10: mflops = 22.9527 (norm. = 0.232719), norm. avg. (of 6) = 0.34073 fft 11: mflops = 98.6284 (norm. = 1), norm. avg. (of 6) = 0.860802 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=2.45 s, 1 iters, t-(init.)=2.37 s t(norm)=0.11301, mflops=44.2437 (err=2.0e-15) 1. HARM: elapsed time t=3.86 s, 1 iters, t-(init.)=3.77 s t(norm)=0.179768, mflops=27.8137 (err=1.9e-15) 2. HARM (f2c): elapsed time t=3.31 s, 1 iters, t-(init.)=3.23 s t(norm)=0.154018, mflops=32.4637 (err=1.9e-15) 3. NR (C): elapsed time t=14.2 s, 1 iters, t-(init.)=14.12 s t(norm)=0.673294, mflops=7.42618 (err=2.0e-15) 4. NR (F): elapsed time t=14.02 s, 1 iters, t-(init.)=13.94 s t(norm)=0.664711, mflops=7.52207 (err=2.0e-15) 5. PDA: elapsed time t=3.72 s, 1 iters, t-(init.)=3.64 s t(norm)=0.173569, mflops=28.807 (err=2.0e-15) 6. PDA (f2c): elapsed time t=5.93 s, 1 iters, t-(init.)=5.84 s t(norm)=0.278473, mflops=17.9551 (err=2.0e-15) 7. Singleton: elapsed time t=8.33 s, 1 iters, t-(init.)=8.25 s t(norm)=0.393391, mflops=12.71 (err=2.8e-15) 8. Singleton (f2c): elapsed time t=8.56 s, 1 iters, t-(init.)=8.47 s t(norm)=0.403881, mflops=12.3799 (err=2.8e-15) 9. Skipping fft (Temperton can't handle dimensions > 256). 10. Skipping fft (Temperton can't handle dimensions > 256). 11. 
SCSL: elapsed time t=1.99 s, 2 iters, t-(init.)=1.82 s t(norm)=0.0433922, mflops=115.228 (err=2.0e-15) Top mflops for N=1048576 = 115.228 Normalized results and averages for N=1048576: fft 0: mflops = 44.2437 (norm. = 0.383966), norm. avg. (of 7) = 0.663703 fft 1: mflops = 27.8137 (norm. = 0.241379), norm. avg. (of 6) = 0.366069 fft 2: mflops = 32.4637 (norm. = 0.281734), norm. avg. (of 6) = 0.363016 fft 3: mflops = 7.42618 (norm. = 0.0644476), norm. avg. (of 7) = 0.215648 fft 4: mflops = 7.52207 (norm. = 0.0652798), norm. avg. (of 7) = 0.175897 fft 5: mflops = 28.807 (norm. = 0.25), norm. avg. (of 7) = 0.228022 fft 6: mflops = 17.9551 (norm. = 0.155822), norm. avg. (of 7) = 0.12024 fft 7: mflops = 12.71 (norm. = 0.110303), norm. avg. (of 7) = 0.293331 fft 8: mflops = 12.3799 (norm. = 0.107438), norm. avg. (of 7) = 0.296516 fft 9: mflops = -1 (norm. = -0.00867844), norm. avg. (of 6) = 0.350883 fft 10: mflops = -1 (norm. = -0.00867844), norm. avg. (of 6) = 0.34073 fft 11: mflops = 115.228 (norm. = 1), norm. avg. (of 7) = 0.880687 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=5.7 s, 1 iters, t-(init.)=5.53 s t(norm)=0.125567, mflops=39.8193 (err=7.4e-16) 1. HARM: elapsed time t=7.78 s, 1 iters, t-(init.)=7.61 s t(norm)=0.172797, mflops=28.9357 (err=7.0e-16) 2. HARM (f2c): elapsed time t=7.01 s, 1 iters, t-(init.)=6.84 s t(norm)=0.155313, mflops=32.1931 (err=7.0e-16) 3. NR (C): elapsed time t=31.52 s, 1 iters, t-(init.)=31.35 s t(norm)=0.71185, mflops=7.02395 (err=7.8e-16) 4. NR (F): elapsed time t=31.52 s, 1 iters, t-(init.)=31.34 s t(norm)=0.711623, mflops=7.0262 (err=7.8e-16) 5. PDA: elapsed time t=8.35 s, 1 iters, t-(init.)=8.18 s t(norm)=0.185739, mflops=26.9194 (err=7.0e-16) 6. PDA (f2c): elapsed time t=13.4 s, 1 iters, t-(init.)=13.22 s t(norm)=0.30018, mflops=16.6567 (err=7.0e-16) 7. Singleton: elapsed time t=25.67 s, 1 iters, t-(init.)=25.5 s t(norm)=0.579017, mflops=8.63533 (err=8.4e-16) 8. 
Singleton (f2c): elapsed time t=26.13 s, 1 iters, t-(init.)=25.96 s t(norm)=0.589462, mflops=8.48232 (err=8.4e-16) 9. Temperton: elapsed time t=16.14 s, 1 iters, t-(init.)=15.97 s t(norm)=0.362623, mflops=13.7884 (err=1.5e-07) 10. Temperton (f2c): elapsed time t=16.86 s, 1 iters, t-(init.)=16.7 s t(norm)=0.379199, mflops=13.1857 (err=7.3e-16) 11. SCSL: elapsed time t=1.95 s, 1 iters, t-(init.)=1.78 s t(norm)=0.0404176, mflops=123.708 (err=7.3e-16) Top mflops for N=2097152 = 123.708 Normalized results and averages for N=2097152: fft 0: mflops = 39.8193 (norm. = 0.321881), norm. avg. (of 8) = 0.620975 fft 1: mflops = 28.9357 (norm. = 0.233903), norm. avg. (of 7) = 0.347188 fft 2: mflops = 32.1931 (norm. = 0.260234), norm. avg. (of 7) = 0.348333 fft 3: mflops = 7.02395 (norm. = 0.0567783), norm. avg. (of 8) = 0.19579 fft 4: mflops = 7.0262 (norm. = 0.0567964), norm. avg. (of 8) = 0.16101 fft 5: mflops = 26.9194 (norm. = 0.217604), norm. avg. (of 8) = 0.226719 fft 6: mflops = 16.6567 (norm. = 0.134644), norm. avg. (of 8) = 0.12204 fft 7: mflops = 8.63533 (norm. = 0.0698039), norm. avg. (of 8) = 0.26539 fft 8: mflops = 8.48232 (norm. = 0.068567), norm. avg. (of 8) = 0.268022 fft 9: mflops = 13.7884 (norm. = 0.111459), norm. avg. (of 7) = 0.31668 fft 10: mflops = 13.1857 (norm. = 0.106587), norm. avg. (of 7) = 0.307281 fft 11: mflops = 123.708 (norm. = 1), norm. avg. (of 8) = 0.895601 Benchmarking for array size = 512x128x64 (power of 2): 0. FFTW: elapsed time t=12.17 s, 1 iters, t-(init.)=11.83 s t(norm)=0.128204, mflops=39.0003 (err=1.3e-15) 1. HARM: elapsed time t=22.1 s, 1 iters, t-(init.)=21.75 s t(norm)=0.235709, mflops=21.2126 (err=1.2e-15) 2. HARM (f2c): elapsed time t=17.34 s, 1 iters, t-(init.)=17 s t(norm)=0.184233, mflops=27.1396 (err=1.2e-15) 3. NR (C): elapsed time t=68.76 s, 1 iters, t-(init.)=68.42 s t(norm)=0.741482, mflops=6.74325 (err=1.4e-15) 4. NR (F): elapsed time t=67.64 s, 1 iters, t-(init.)=67.3 s t(norm)=0.729344, mflops=6.85547 (err=1.4e-15) 5. 
PDA: elapsed time t=17.56 s, 1 iters, t-(init.)=17.21 s t(norm)=0.186508, mflops=26.8085 (err=1.3e-15) 6. PDA (f2c): elapsed time t=28.84 s, 1 iters, t-(init.)=28.49 s t(norm)=0.308752, mflops=16.1942 (err=1.3e-15) 7. Singleton: elapsed time t=49.87 s, 1 iters, t-(init.)=49.52 s t(norm)=0.536659, mflops=9.31691 (err=1.6e-15) 8. Singleton (f2c): elapsed time t=50.16 s, 1 iters, t-(init.)=49.81 s t(norm)=0.539801, mflops=9.26267 (err=1.6e-15) 9. Skipping fft (Temperton can't handle dimensions > 256). 10. Skipping fft (Temperton can't handle dimensions > 256). 11. SCSL: elapsed time t=4.13 s, 1 iters, t-(init.)=3.79 s t(norm)=0.041073, mflops=121.734 (err=1.4e-15) Top mflops for N=4194304 = 121.734 Normalized results and averages for N=4194304: fft 0: mflops = 39.0003 (norm. = 0.320372), norm. avg. (of 9) = 0.587575 fft 1: mflops = 21.2126 (norm. = 0.174253), norm. avg. (of 8) = 0.325571 fft 2: mflops = 27.1396 (norm. = 0.222941), norm. avg. (of 8) = 0.332659 fft 3: mflops = 6.74325 (norm. = 0.0553932), norm. avg. (of 9) = 0.18019 fft 4: mflops = 6.85547 (norm. = 0.056315), norm. avg. (of 9) = 0.149377 fft 5: mflops = 26.8085 (norm. = 0.220221), norm. avg. (of 9) = 0.225997 fft 6: mflops = 16.1942 (norm. = 0.133029), norm. avg. (of 9) = 0.123261 fft 7: mflops = 9.31691 (norm. = 0.0765347), norm. avg. (of 9) = 0.244406 fft 8: mflops = 9.26267 (norm. = 0.0760891), norm. avg. (of 9) = 0.246696 fft 9: mflops = -1 (norm. = -0.0082146), norm. avg. (of 7) = 0.31668 fft 10: mflops = -1 (norm. = -0.0082146), norm. avg. (of 7) = 0.307281 fft 11: mflops = 121.734 (norm. = 1), norm. avg. (of 9) = 0.907201 ------------------------------------------------------ @@@@ bench.3d.np2.log FFT Benchmark Program by M. Frigo and S. G. Johnson. email: fftw@theory.lcs.mit.edu www: http://theory.lcs.mit.edu/~fftw Using FFTW V1.1 ($Id: executor.c,v 1.34 1997/04/30 13:15:56 fftw Exp $) Maximum memory to use: 200 MB Factors to allow: anything but 2 Using double precision. 
Measuring speed of 3D transforms: Benchmarking for sizes: 5x5x5 (0.00473022 MB) 6x6x6 (0.0075531 MB) 7x7x7 (0.0114746 MB) 9x9x9 (0.0233459 MB) 10x10x10 (0.031662 MB) 11x11x11 (0.0418091 MB) 12x12x12 (0.0539703 MB) 13x13x13 (0.0683289 MB) 14x14x14 (0.0850677 MB) 15x15x15 (0.10437 MB) 24x25x28 (0.514557 MB) 48x48x48 (3.37788 MB) 49x49x49 (3.59329 MB) 60x60x60 (6.59523 MB) 72x60x56 (7.38637 MB) 75x75x75 (12.8787 MB) 80x80x80 (15.6293 MB) 84x84x84 (18.0924 MB) 96x96x96 (27.0051 MB) 105x105x105 (35.3334 MB) 112x112x112 (42.8808 MB) 120x120x120 (52.7406 MB) 144x144x144 (91.1323 MB) 180x180x180 (177.987 MB) Maximum array size N = 5832000 Benchmarking FFTs: 0. FFTW 1. PDA 2. PDA (f2c) 3. Singleton 4. Singleton (f2c) 5. Temperton 6. Temperton (f2c) 7. SCSL Computing normalized averages (8 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1.78 s, 65536 iters, t-(init.)=1.69 s t(norm)=0.029616, mflops=168.828 (err=3.9e-16) 1. PDA: elapsed time t=1.25 s, 8192 iters, t-(init.)=1.24 s t(norm)=0.173841, mflops=28.7619 (err=3.0e-16) 2. PDA (f2c): elapsed time t=1.06 s, 4096 iters, t-(init.)=1.06 s t(norm)=0.297212, mflops=16.823 (err=3.0e-16) 3. Singleton: elapsed time t=1.79 s, 65536 iters, t-(init.)=1.71 s t(norm)=0.0299665, mflops=166.853 (err=3.4e-16) 4. Singleton (f2c): elapsed time t=1.75 s, 65536 iters, t-(init.)=1.67 s t(norm)=0.0292655, mflops=170.849 (err=3.4e-16) 5. Temperton: elapsed time t=1.26 s, 32768 iters, t-(init.)=1.22 s t(norm)=0.0427592, mflops=116.934 (err=5.5e-16) 6. Temperton (f2c): elapsed time t=1.08 s, 8192 iters, t-(init.)=1.07 s t(norm)=0.150008, mflops=33.3316 (err=2.5e-16) 7. SCSL: elapsed time t=1.77 s, 32768 iters, t-(init.)=1.72 s t(norm)=0.0602835, mflops=82.9414 (err=5.3e-16) Top mflops for N=125 = 170.849 Normalized results and averages for N=125: fft 0: mflops = 168.828 (norm. = 0.988166), norm. avg. (of 1) = 0.988166 fft 1: mflops = 28.7619 (norm. = 0.168347), norm. avg. (of 1) = 0.168347 fft 2: mflops = 16.823 (norm. 
= 0.098467), norm. avg. (of 1) = 0.098467 fft 3: mflops = 166.853 (norm. = 0.976608), norm. avg. (of 1) = 0.976608 fft 4: mflops = 170.849 (norm. = 1), norm. avg. (of 1) = 1 fft 5: mflops = 116.934 (norm. = 0.684426), norm. avg. (of 1) = 0.684426 fft 6: mflops = 33.3316 (norm. = 0.195093), norm. avg. (of 1) = 0.195093 fft 7: mflops = 82.9414 (norm. = 0.485465), norm. avg. (of 1) = 0.485465 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.3 s, 32768 iters, t-(init.)=1.23 s t(norm)=0.0224092, mflops=223.123 (err=2.9e-16) 1. PDA: elapsed time t=1.22 s, 4096 iters, t-(init.)=1.21 s t(norm)=0.176358, mflops=28.3514 (err=3.6e-16) 2. PDA (f2c): elapsed time t=1.08 s, 2048 iters, t-(init.)=1.07 s t(norm)=0.311907, mflops=16.0304 (err=3.6e-16) 3. Singleton: elapsed time t=1.54 s, 16384 iters, t-(init.)=1.5 s t(norm)=0.0546565, mflops=91.4804 (err=2.9e-16) 4. Singleton (f2c): elapsed time t=1.55 s, 16384 iters, t-(init.)=1.52 s t(norm)=0.0553853, mflops=90.2767 (err=2.9e-16) 5. Temperton: elapsed time t=1.31 s, 16384 iters, t-(init.)=1.27 s t(norm)=0.0462759, mflops=108.048 (err=2.1e-15) 6. Temperton (f2c): elapsed time t=1.9 s, 8192 iters, t-(init.)=1.88 s t(norm)=0.137006, mflops=36.4948 (err=3.1e-16) 7. SCSL: elapsed time t=1.72 s, 16384 iters, t-(init.)=1.68 s t(norm)=0.0612153, mflops=81.6789 (err=6.0e-16) Top mflops for N=216 = 223.123 Normalized results and averages for N=216: fft 0: mflops = 223.123 (norm. = 1), norm. avg. (of 2) = 0.994083 fft 1: mflops = 28.3514 (norm. = 0.127066), norm. avg. (of 2) = 0.147706 fft 2: mflops = 16.0304 (norm. = 0.0718458), norm. avg. (of 2) = 0.0851564 fft 3: mflops = 91.4804 (norm. = 0.41), norm. avg. (of 2) = 0.693304 fft 4: mflops = 90.2767 (norm. = 0.404605), norm. avg. (of 2) = 0.702303 fft 5: mflops = 108.048 (norm. = 0.484252), norm. avg. (of 2) = 0.584339 fft 6: mflops = 36.4948 (norm. = 0.163564), norm. avg. (of 2) = 0.179329 fft 7: mflops = 81.6789 (norm. = 0.366071), norm. avg. 
(of 2) = 0.425768 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.46 s, 16384 iters, t-(init.)=1.4 s t(norm)=0.0295798, mflops=169.034 (err=3.7e-16) 1. PDA: elapsed time t=1.25 s, 1024 iters, t-(init.)=1.25 s t(norm)=0.422569, mflops=11.8324 (err=4.9e-16) 2. PDA (f2c): elapsed time t=1.45 s, 512 iters, t-(init.)=1.45 s t(norm)=0.980359, mflops=5.10017 (err=4.9e-16) 3. Singleton: elapsed time t=1.67 s, 8192 iters, t-(init.)=1.64 s t(norm)=0.0693013, mflops=72.1487 (err=5.8e-16) 4. Singleton (f2c): elapsed time t=1.53 s, 8192 iters, t-(init.)=1.5 s t(norm)=0.0633853, mflops=78.8826 (err=5.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 7. SCSL: elapsed time t=1.6 s, 4096 iters, t-(init.)=1.58 s t(norm)=0.133532, mflops=37.4443 (err=6.0e-16) Top mflops for N=343 = 169.034 Normalized results and averages for N=343: fft 0: mflops = 169.034 (norm. = 1), norm. avg. (of 3) = 0.996055 fft 1: mflops = 11.8324 (norm. = 0.07), norm. avg. (of 3) = 0.121804 fft 2: mflops = 5.10017 (norm. = 0.0301724), norm. avg. (of 3) = 0.0668284 fft 3: mflops = 72.1487 (norm. = 0.426829), norm. avg. (of 3) = 0.604479 fft 4: mflops = 78.8826 (norm. = 0.466667), norm. avg. (of 3) = 0.623757 fft 5: mflops = -1 (norm. = -0.00591596), norm. avg. (of 2) = 0.584339 fft 6: mflops = -1 (norm. = -0.00591596), norm. avg. (of 2) = 0.179329 fft 7: mflops = 37.4443 (norm. = 0.221519), norm. avg. (of 3) = 0.357685 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.32 s, 8192 iters, t-(init.)=1.26 s t(norm)=0.0221862, mflops=225.365 (err=5.3e-16) 1. PDA: elapsed time t=1.53 s, 2048 iters, t-(init.)=1.51 s t(norm)=0.106353, mflops=47.0133 (err=4.1e-16) 2. PDA (f2c): elapsed time t=1.66 s, 1024 iters, t-(init.)=1.65 s t(norm)=0.232427, mflops=21.5121 (err=4.1e-16) 3. Singleton: elapsed time t=1.18 s, 4096 iters, t-(init.)=1.15 s t(norm)=0.0404986, mflops=123.461 (err=4.5e-16) 4. 
Singleton (f2c): elapsed time t=1.26 s, 4096 iters, t-(init.)=1.23 s t(norm)=0.0433159, mflops=115.431 (err=4.5e-16) 5. Temperton: elapsed time t=1.7 s, 8192 iters, t-(init.)=1.64 s t(norm)=0.0288773, mflops=173.147 (err=6.0e-08) 6. Temperton (f2c): elapsed time t=1.14 s, 1024 iters, t-(init.)=1.14 s t(norm)=0.160586, mflops=31.136 (err=4.8e-16) 7. SCSL: elapsed time t=1.22 s, 4096 iters, t-(init.)=1.18 s t(norm)=0.0415551, mflops=120.322 (err=8.4e-16) Top mflops for N=729 = 225.365 Normalized results and averages for N=729: fft 0: mflops = 225.365 (norm. = 1), norm. avg. (of 4) = 0.997041 fft 1: mflops = 47.0133 (norm. = 0.208609), norm. avg. (of 4) = 0.143506 fft 2: mflops = 21.5121 (norm. = 0.0954545), norm. avg. (of 4) = 0.0739849 fft 3: mflops = 123.461 (norm. = 0.547826), norm. avg. (of 4) = 0.590316 fft 4: mflops = 115.431 (norm. = 0.512195), norm. avg. (of 4) = 0.595867 fft 5: mflops = 173.147 (norm. = 0.768293), norm. avg. (of 3) = 0.645657 fft 6: mflops = 31.136 (norm. = 0.138158), norm. avg. (of 3) = 0.165605 fft 7: mflops = 120.322 (norm. = 0.533898), norm. avg. (of 4) = 0.401738 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.86 s, 8192 iters, t-(init.)=1.77 s t(norm)=0.0216806, mflops=230.621 (err=4.8e-16) 1. PDA: elapsed time t=1.01 s, 1024 iters, t-(init.)=1 s t(norm)=0.0979915, mflops=51.0248 (err=4.7e-16) 2. PDA (f2c): elapsed time t=1.07 s, 512 iters, t-(init.)=1.06 s t(norm)=0.207742, mflops=24.0683 (err=4.7e-16) 3. Singleton: elapsed time t=1.83 s, 4096 iters, t-(init.)=1.79 s t(norm)=0.0438512, mflops=114.022 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.8 s, 4096 iters, t-(init.)=1.76 s t(norm)=0.0431163, mflops=115.965 (err=5.4e-16) 5. Temperton: elapsed time t=1.1 s, 4096 iters, t-(init.)=1.06 s t(norm)=0.0259678, mflops=192.546 (err=6.6e-16) 6. Temperton (f2c): elapsed time t=1.07 s, 1024 iters, t-(init.)=1.06 s t(norm)=0.103871, mflops=48.1366 (err=3.9e-16) 7. 
SCSL: elapsed time t=1.57 s, 4096 iters, t-(init.)=1.53 s t(norm)=0.0374818, mflops=133.398 (err=7.0e-16) Top mflops for N=1000 = 230.621 Normalized results and averages for N=1000: fft 0: mflops = 230.621 (norm. = 1), norm. avg. (of 5) = 0.997633 fft 1: mflops = 51.0248 (norm. = 0.22125), norm. avg. (of 5) = 0.159054 fft 2: mflops = 24.0683 (norm. = 0.104363), norm. avg. (of 5) = 0.0800606 fft 3: mflops = 114.022 (norm. = 0.494413), norm. avg. (of 5) = 0.571135 fft 4: mflops = 115.965 (norm. = 0.502841), norm. avg. (of 5) = 0.577262 fft 5: mflops = 192.546 (norm. = 0.834906), norm. avg. (of 4) = 0.692969 fft 6: mflops = 48.1366 (norm. = 0.208726), norm. avg. (of 4) = 0.176385 fft 7: mflops = 133.398 (norm. = 0.578431), norm. avg. (of 5) = 0.437077 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.17 s, 2048 iters, t-(init.)=1.15 s t(norm)=0.0406503, mflops=123 (err=4.4e-16) 1. PDA: elapsed time t=1.49 s, 256 iters, t-(init.)=1.49 s t(norm)=0.421349, mflops=11.8666 (err=5.3e-16) 2. PDA (f2c): elapsed time t=1.85 s, 128 iters, t-(init.)=1.85 s t(norm)=1.0463, mflops=4.77873 (err=5.3e-16) 3. Singleton: elapsed time t=1.1 s, 1024 iters, t-(init.)=1.09 s t(norm)=0.0770588, mflops=64.8855 (err=6.6e-16) 4. Singleton (f2c): elapsed time t=1.03 s, 1024 iters, t-(init.)=1.01 s t(norm)=0.0714031, mflops=70.0249 (err=6.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 7. SCSL: elapsed time t=1.06 s, 512 iters, t-(init.)=1.06 s t(norm)=0.149876, mflops=33.3609 (err=5.8e-16) Top mflops for N=1331 = 123 Normalized results and averages for N=1331: fft 0: mflops = 123 (norm. = 1), norm. avg. (of 6) = 0.998028 fft 1: mflops = 11.8666 (norm. = 0.0964765), norm. avg. (of 6) = 0.148625 fft 2: mflops = 4.77873 (norm. = 0.0388514), norm. avg. (of 6) = 0.0731924 fft 3: mflops = 64.8855 (norm. = 0.527523), norm. avg. (of 6) = 0.563867 fft 4: mflops = 70.0249 (norm. = 0.569307), norm. avg. 
(of 6) = 0.575936 fft 5: mflops = -1 (norm. = -0.00813006), norm. avg. (of 4) = 0.692969 fft 6: mflops = -1 (norm. = -0.00813006), norm. avg. (of 4) = 0.176385 fft 7: mflops = 33.3609 (norm. = 0.271226), norm. avg. (of 6) = 0.409435 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.45 s, 4096 iters, t-(init.)=1.38 s t(norm)=0.0181288, mflops=275.804 (err=3.9e-16) 1. PDA: elapsed time t=1.56 s, 1024 iters, t-(init.)=1.54 s t(norm)=0.0809228, mflops=61.7872 (err=3.8e-16) 2. PDA (f2c): elapsed time t=1.79 s, 512 iters, t-(init.)=1.79 s t(norm)=0.188119, mflops=26.5789 (err=3.8e-16) 3. Singleton: elapsed time t=1.63 s, 2048 iters, t-(init.)=1.59 s t(norm)=0.0417751, mflops=119.689 (err=4.0e-16) 4. Singleton (f2c): elapsed time t=1.66 s, 2048 iters, t-(init.)=1.62 s t(norm)=0.0425633, mflops=117.472 (err=4.0e-16) 5. Temperton: elapsed time t=1.65 s, 4096 iters, t-(init.)=1.58 s t(norm)=0.0207562, mflops=240.892 (err=1.8e-15) 6. Temperton (f2c): elapsed time t=1.53 s, 1024 iters, t-(init.)=1.51 s t(norm)=0.0793464, mflops=63.0148 (err=3.9e-16) 7. SCSL: elapsed time t=1.56 s, 2048 iters, t-(init.)=1.53 s t(norm)=0.0401987, mflops=124.382 (err=5.1e-16) Top mflops for N=1728 = 275.804 Normalized results and averages for N=1728: fft 0: mflops = 275.804 (norm. = 1), norm. avg. (of 7) = 0.998309 fft 1: mflops = 61.7872 (norm. = 0.224026), norm. avg. (of 7) = 0.159396 fft 2: mflops = 26.5789 (norm. = 0.0963687), norm. avg. (of 7) = 0.0765033 fft 3: mflops = 119.689 (norm. = 0.433962), norm. avg. (of 7) = 0.545309 fft 4: mflops = 117.472 (norm. = 0.425926), norm. avg. (of 7) = 0.554506 fft 5: mflops = 240.892 (norm. = 0.873418), norm. avg. (of 5) = 0.729059 fft 6: mflops = 63.0148 (norm. = 0.228477), norm. avg. (of 5) = 0.186804 fft 7: mflops = 124.382 (norm. = 0.45098), norm. avg. (of 7) = 0.41537 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.12 s, 1024 iters, t-(init.)=1.09 s t(norm)=0.0436437, mflops=114.564 (err=7.8e-16) 1. 
PDA: elapsed time t=1.34 s, 128 iters, t-(init.)=1.34 s t(norm)=0.42923, mflops=11.6488 (err=1.2e-15) 2. PDA (f2c): elapsed time t=1.72 s, 64 iters, t-(init.)=1.72 s t(norm)=1.1019, mflops=4.5376 (err=1.2e-15) 3. Singleton: elapsed time t=1.91 s, 1024 iters, t-(init.)=1.87 s t(norm)=0.074875, mflops=66.7779 (err=1.5e-15) 4. Singleton (f2c): elapsed time t=1.72 s, 1024 iters, t-(init.)=1.68 s t(norm)=0.0672674, mflops=74.3302 (err=1.5e-15) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 7. SCSL: elapsed time t=1.98 s, 512 iters, t-(init.)=1.96 s t(norm)=0.156957, mflops=31.8558 (err=1.1e-15) Top mflops for N=2197 = 114.564 Normalized results and averages for N=2197: fft 0: mflops = 114.564 (norm. = 1), norm. avg. (of 8) = 0.998521 fft 1: mflops = 11.6488 (norm. = 0.101679), norm. avg. (of 8) = 0.152182 fft 2: mflops = 4.5376 (norm. = 0.0396076), norm. avg. (of 8) = 0.0718913 fft 3: mflops = 66.7779 (norm. = 0.582888), norm. avg. (of 8) = 0.550006 fft 4: mflops = 74.3302 (norm. = 0.64881), norm. avg. (of 8) = 0.566294 fft 5: mflops = -1 (norm. = -0.00872875), norm. avg. (of 5) = 0.729059 fft 6: mflops = -1 (norm. = -0.00872875), norm. avg. (of 5) = 0.186804 fft 7: mflops = 31.8558 (norm. = 0.278061), norm. avg. (of 8) = 0.398207 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.08 s, 1024 iters, t-(init.)=1 s t(norm)=0.0311581, mflops=160.472 (err=4.1e-16) 1. PDA: elapsed time t=1.65 s, 256 iters, t-(init.)=1.63 s t(norm)=0.203151, mflops=24.6122 (err=5.0e-16) 2. PDA (f2c): elapsed time t=1.25 s, 64 iters, t-(init.)=1.24 s t(norm)=0.618177, mflops=8.0883 (err=5.0e-16) 3. Singleton: elapsed time t=1.28 s, 512 iters, t-(init.)=1.24 s t(norm)=0.0772722, mflops=64.7064 (err=5.6e-16) 4. Singleton (f2c): elapsed time t=1.22 s, 512 iters, t-(init.)=1.18 s t(norm)=0.0735332, mflops=67.9965 (err=5.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 7. SCSL: elapsed time t=1.1 s, 256 iters, t-(init.)=1.08 s t(norm)=0.134603, mflops=37.1462 (err=5.8e-16) Top mflops for N=2744 = 160.472 Normalized results and averages for N=2744: fft 0: mflops = 160.472 (norm. = 1), norm. avg. (of 9) = 0.998685 fft 1: mflops = 24.6122 (norm. = 0.153374), norm. avg. (of 9) = 0.152314 fft 2: mflops = 8.0883 (norm. = 0.0504032), norm. avg. (of 9) = 0.0695038 fft 3: mflops = 64.7064 (norm. = 0.403226), norm. avg. (of 9) = 0.533697 fft 4: mflops = 67.9965 (norm. = 0.423729), norm. avg. (of 9) = 0.550453 fft 5: mflops = -1 (norm. = -0.00623163), norm. avg. (of 5) = 0.729059 fft 6: mflops = -1 (norm. = -0.00623163), norm. avg. (of 5) = 0.186804 fft 7: mflops = 37.1462 (norm. = 0.231481), norm. avg. (of 9) = 0.379682 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.3 s, 1024 iters, t-(init.)=1.19 s t(norm)=0.0293779, mflops=170.196 (err=5.4e-16) 1. PDA: elapsed time t=1.49 s, 512 iters, t-(init.)=1.44 s t(norm)=0.0710995, mflops=70.324 (err=5.3e-16) 2. PDA (f2c): elapsed time t=1.82 s, 256 iters, t-(init.)=1.8 s t(norm)=0.177749, mflops=28.1296 (err=5.3e-16) 3. Singleton: elapsed time t=1.17 s, 512 iters, t-(init.)=1.12 s t(norm)=0.0552996, mflops=90.4166 (err=6.7e-16) 4. Singleton (f2c): elapsed time t=1.2 s, 512 iters, t-(init.)=1.15 s t(norm)=0.0567808, mflops=88.0579 (err=6.7e-16) 5. Temperton: elapsed time t=1.15 s, 1024 iters, t-(init.)=1.04 s t(norm)=0.0256748, mflops=194.743 (err=2.1e-15) 6. Temperton (f2c): elapsed time t=1.39 s, 256 iters, t-(init.)=1.37 s t(norm)=0.135286, mflops=36.9586 (err=5.2e-16) 7. SCSL: elapsed time t=1.7 s, 1024 iters, t-(init.)=1.59 s t(norm)=0.0392528, mflops=127.379 (err=6.4e-16) Top mflops for N=3375 = 194.743 Normalized results and averages for N=3375: fft 0: mflops = 170.196 (norm. = 0.87395), norm. avg. (of 10) = 0.986212 fft 1: mflops = 70.324 (norm. = 0.361111), norm. avg. (of 10) = 0.173194 fft 2: mflops = 28.1296 (norm. 
= 0.144444), norm. avg. (of 10) = 0.0769978 fft 3: mflops = 90.4166 (norm. = 0.464286), norm. avg. (of 10) = 0.526756 fft 4: mflops = 88.0579 (norm. = 0.452174), norm. avg. (of 10) = 0.540625 fft 5: mflops = 194.743 (norm. = 1), norm. avg. (of 6) = 0.774216 fft 6: mflops = 36.9586 (norm. = 0.189781), norm. avg. (of 6) = 0.1873 fft 7: mflops = 127.379 (norm. = 0.654088), norm. avg. (of 10) = 0.407122 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.38 s, 128 iters, t-(init.)=1.31 s t(norm)=0.0434014, mflops=115.204 (err=5.1e-16) 1. PDA: elapsed time t=1.3 s, 64 iters, t-(init.)=1.26 s t(norm)=0.0834896, mflops=59.8877 (err=4.9e-16) 2. PDA (f2c): elapsed time t=1.95 s, 32 iters, t-(init.)=1.93 s t(norm)=0.25577, mflops=19.5488 (err=4.9e-16) 3. Singleton: elapsed time t=1.11 s, 64 iters, t-(init.)=1.08 s t(norm)=0.0715625, mflops=69.869 (err=5.3e-16) 4. Singleton (f2c): elapsed time t=1.08 s, 64 iters, t-(init.)=1.05 s t(norm)=0.0695747, mflops=71.8652 (err=5.3e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 7. SCSL: elapsed time t=1.85 s, 128 iters, t-(init.)=1.78 s t(norm)=0.0589728, mflops=84.7848 (err=6.7e-16) Top mflops for N=16800 = 115.204 Normalized results and averages for N=16800: fft 0: mflops = 115.204 (norm. = 1), norm. avg. (of 11) = 0.987465 fft 1: mflops = 59.8877 (norm. = 0.519841), norm. avg. (of 11) = 0.204707 fft 2: mflops = 19.5488 (norm. = 0.169689), norm. avg. (of 11) = 0.0854243 fft 3: mflops = 69.869 (norm. = 0.606481), norm. avg. (of 11) = 0.534004 fft 4: mflops = 71.8652 (norm. = 0.62381), norm. avg. (of 11) = 0.548188 fft 5: mflops = -1 (norm. = -0.00868027), norm. avg. (of 6) = 0.774216 fft 6: mflops = -1 (norm. = -0.00868027), norm. avg. (of 6) = 0.1873 fft 7: mflops = 84.7848 (norm. = 0.735955), norm. avg. (of 11) = 0.437016 Benchmarking for array size = 48x48x48: 0. 
FFTW: elapsed time t=1.28 s, 16 iters, t-(init.)=1.22 s t(norm)=0.0411505, mflops=121.505 (err=6.7e-16) 1. PDA: elapsed time t=1.02 s, 8 iters, t-(init.)=0.99 s t(norm)=0.0667852, mflops=74.8669 (err=6.4e-16) 2. PDA (f2c): elapsed time t=1.45 s, 4 iters, t-(init.)=1.43 s t(norm)=0.192935, mflops=25.9155 (err=6.4e-16) 3. Singleton: elapsed time t=1.4 s, 8 iters, t-(init.)=1.37 s t(norm)=0.0924199, mflops=54.1009 (err=6.5e-16) 4. Singleton (f2c): elapsed time t=1.39 s, 8 iters, t-(init.)=1.36 s t(norm)=0.0917453, mflops=54.4987 (err=6.5e-16) 5. Temperton: elapsed time t=1.02 s, 8 iters, t-(init.)=0.99 s t(norm)=0.0667852, mflops=74.8669 (err=1.0e-07) 6. Temperton (f2c): elapsed time t=1.36 s, 8 iters, t-(init.)=1.33 s t(norm)=0.0897215, mflops=55.728 (err=7.3e-16) 7. SCSL: elapsed time t=1.95 s, 32 iters, t-(init.)=1.84 s t(norm)=0.0310315, mflops=161.127 (err=6.9e-16) Top mflops for N=110592 = 161.127 Normalized results and averages for N=110592: fft 0: mflops = 121.505 (norm. = 0.754098), norm. avg. (of 12) = 0.968018 fft 1: mflops = 74.8669 (norm. = 0.464646), norm. avg. (of 12) = 0.226369 fft 2: mflops = 25.9155 (norm. = 0.160839), norm. avg. (of 12) = 0.0917089 fft 3: mflops = 54.1009 (norm. = 0.335766), norm. avg. (of 12) = 0.517484 fft 4: mflops = 54.4987 (norm. = 0.338235), norm. avg. (of 12) = 0.530691 fft 5: mflops = 74.8669 (norm. = 0.464646), norm. avg. (of 7) = 0.729992 fft 6: mflops = 55.728 (norm. = 0.345865), norm. avg. (of 7) = 0.209952 fft 7: mflops = 161.127 (norm. = 1), norm. avg. (of 12) = 0.483931 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.43 s, 16 iters, t-(init.)=1.37 s t(norm)=0.043208, mflops=115.719 (err=6.8e-16) 1. PDA: elapsed time t=1.37 s, 4 iters, t-(init.)=1.35 s t(norm)=0.170309, mflops=29.3584 (err=7.6e-16) 2. PDA (f2c): elapsed time t=1.22 s, 1 iters, t-(init.)=1.21 s t(norm)=0.610588, mflops=8.18882 (err=7.6e-16) 3. 
Singleton: elapsed time t=1.66 s, 8 iters, t-(init.)=1.63 s t(norm)=0.102816, mflops=48.6306 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.57 s, 8 iters, t-(init.)=1.54 s t(norm)=0.0971391, mflops=51.4726 (err=1.0e-15) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 7. SCSL: elapsed time t=1.03 s, 4 iters, t-(init.)=1.01 s t(norm)=0.127416, mflops=39.2415 (err=8.3e-16) Top mflops for N=117649 = 115.719 Normalized results and averages for N=117649: fft 0: mflops = 115.719 (norm. = 1), norm. avg. (of 13) = 0.970478 fft 1: mflops = 29.3584 (norm. = 0.253704), norm. avg. (of 13) = 0.228472 fft 2: mflops = 8.18882 (norm. = 0.0707645), norm. avg. (of 13) = 0.0900978 fft 3: mflops = 48.6306 (norm. = 0.420245), norm. avg. (of 13) = 0.510004 fft 4: mflops = 51.4726 (norm. = 0.444805), norm. avg. (of 13) = 0.524085 fft 5: mflops = -1 (norm. = -0.00864159), norm. avg. (of 7) = 0.729992 fft 6: mflops = -1 (norm. = -0.00864159), norm. avg. (of 7) = 0.209952 fft 7: mflops = 39.2415 (norm. = 0.339109), norm. avg. (of 13) = 0.472791 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.36 s, 8 iters, t-(init.)=1.3 s t(norm)=0.0424541, mflops=117.774 (err=7.6e-16) 1. PDA: elapsed time t=1.49 s, 4 iters, t-(init.)=1.46 s t(norm)=0.0953584, mflops=52.4338 (err=7.7e-16) 2. PDA (f2c): elapsed time t=1.71 s, 2 iters, t-(init.)=1.69 s t(norm)=0.220761, mflops=22.6489 (err=7.7e-16) 3. Singleton: elapsed time t=1.86 s, 4 iters, t-(init.)=1.83 s t(norm)=0.119525, mflops=41.8324 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.87 s, 4 iters, t-(init.)=1.84 s t(norm)=0.120178, mflops=41.6051 (err=1.0e-15) 5. Temperton: elapsed time t=1.01 s, 8 iters, t-(init.)=0.95 s t(norm)=0.0310241, mflops=161.165 (err=2.0e-15) 6. Temperton (f2c): elapsed time t=1.61 s, 4 iters, t-(init.)=1.58 s t(norm)=0.103196, mflops=48.4515 (err=7.3e-16) 7. 
SCSL: elapsed time t=1.12 s, 8 iters, t-(init.)=1.06 s t(norm)=0.0346164, mflops=144.44 (err=9.0e-16) Top mflops for N=216000 = 161.165 Normalized results and averages for N=216000: fft 0: mflops = 117.774 (norm. = 0.730769), norm. avg. (of 14) = 0.953356 fft 1: mflops = 52.4338 (norm. = 0.325342), norm. avg. (of 14) = 0.235391 fft 2: mflops = 22.6489 (norm. = 0.140533), norm. avg. (of 14) = 0.0937003 fft 3: mflops = 41.8324 (norm. = 0.259563), norm. avg. (of 14) = 0.492116 fft 4: mflops = 41.6051 (norm. = 0.258152), norm. avg. (of 14) = 0.50509 fft 5: mflops = 161.165 (norm. = 1), norm. avg. (of 8) = 0.763743 fft 6: mflops = 48.4515 (norm. = 0.300633), norm. avg. (of 8) = 0.221287 fft 7: mflops = 144.44 (norm. = 0.896226), norm. avg. (of 14) = 0.503037 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.85 s, 8 iters, t-(init.)=1.78 s t(norm)=0.0514268, mflops=97.2256 (err=7.5e-16) 1. PDA: elapsed time t=1.04 s, 2 iters, t-(init.)=1.03 s t(norm)=0.119033, mflops=42.0052 (err=8.0e-16) 2. PDA (f2c): elapsed time t=1.26 s, 1 iters, t-(init.)=1.25 s t(norm)=0.288915, mflops=17.3062 (err=8.0e-16) 3. Singleton: elapsed time t=1.19 s, 2 iters, t-(init.)=1.17 s t(norm)=0.135212, mflops=36.979 (err=9.3e-16) 4. Singleton (f2c): elapsed time t=1.19 s, 2 iters, t-(init.)=1.18 s t(norm)=0.136368, mflops=36.6656 (err=9.3e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 7. SCSL: elapsed time t=1.85 s, 8 iters, t-(init.)=1.78 s t(norm)=0.0514268, mflops=97.2256 (err=9.6e-16) Top mflops for N=241920 = 97.2256 Normalized results and averages for N=241920: fft 0: mflops = 97.2256 (norm. = 1), norm. avg. (of 15) = 0.956466 fft 1: mflops = 42.0052 (norm. = 0.432039), norm. avg. (of 15) = 0.248501 fft 2: mflops = 17.3062 (norm. = 0.178), norm. avg. (of 15) = 0.0993202 fft 3: mflops = 36.979 (norm. = 0.380342), norm. avg. (of 15) = 0.484664 fft 4: mflops = 36.6656 (norm. = 0.377119), norm. avg. 
(of 15) = 0.496558 fft 5: mflops = -1 (norm. = -0.0102854), norm. avg. (of 8) = 0.763743 fft 6: mflops = -1 (norm. = -0.0102854), norm. avg. (of 8) = 0.221287 fft 7: mflops = 97.2256 (norm. = 1), norm. avg. (of 15) = 0.536168 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.95 s, 4 iters, t-(init.)=1.84 s t(norm)=0.0583508, mflops=85.6886 (err=7.6e-16) 1. PDA: elapsed time t=1.82 s, 2 iters, t-(init.)=1.77 s t(norm)=0.112262, mflops=44.5387 (err=7.9e-16) 2. PDA (f2c): elapsed time t=1.87 s, 1 iters, t-(init.)=1.85 s t(norm)=0.234672, mflops=21.3063 (err=7.9e-16) 3. Singleton: elapsed time t=1.34 s, 1 iters, t-(init.)=1.31 s t(norm)=0.166173, mflops=30.0891 (err=9.8e-16) 4. Singleton (f2c): elapsed time t=1.37 s, 1 iters, t-(init.)=1.34 s t(norm)=0.169979, mflops=29.4155 (err=9.8e-16) 5. Temperton: elapsed time t=1.21 s, 2 iters, t-(init.)=1.11 s t(norm)=0.0704016, mflops=71.0212 (err=1.4e-07) 6. Temperton (f2c): elapsed time t=1.31 s, 1 iters, t-(init.)=1.25 s t(norm)=0.158562, mflops=31.5334 (err=9.7e-16) 7. SCSL: elapsed time t=1.47 s, 4 iters, t-(init.)=1.33 s t(norm)=0.0421775, mflops=118.547 (err=1.0e-15) Top mflops for N=421875 = 118.547 Normalized results and averages for N=421875: fft 0: mflops = 85.6886 (norm. = 0.722826), norm. avg. (of 16) = 0.941863 fft 1: mflops = 44.5387 (norm. = 0.375706), norm. avg. (of 16) = 0.256451 fft 2: mflops = 21.3063 (norm. = 0.17973), norm. avg. (of 16) = 0.104346 fft 3: mflops = 30.0891 (norm. = 0.253817), norm. avg. (of 16) = 0.470236 fft 4: mflops = 29.4155 (norm. = 0.248134), norm. avg. (of 16) = 0.481032 fft 5: mflops = 71.0212 (norm. = 0.599099), norm. avg. (of 9) = 0.745449 fft 6: mflops = 31.5334 (norm. = 0.266), norm. avg. (of 9) = 0.226255 fft 7: mflops = 118.547 (norm. = 1), norm. avg. (of 16) = 0.565157 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.24 s, 2 iters, t-(init.)=1.17 s t(norm)=0.0602442, mflops=82.9956 (err=6.7e-16) 1. 
PDA: elapsed time t=1.29 s, 1 iters, t-(init.)=1.25 s t(norm)=0.128727, mflops=38.8419 (err=6.1e-16) 2. PDA (f2c): elapsed time t=2.32 s, 1 iters, t-(init.)=2.29 s t(norm)=0.235828, mflops=21.2019 (err=6.1e-16) 3. Singleton: elapsed time t=1.77 s, 1 iters, t-(init.)=1.74 s t(norm)=0.179188, mflops=27.9037 (err=7.9e-16) 4. Singleton (f2c): elapsed time t=1.78 s, 1 iters, t-(init.)=1.74 s t(norm)=0.179188, mflops=27.9037 (err=7.9e-16) 5. Temperton: elapsed time t=1.58 s, 2 iters, t-(init.)=1.51 s t(norm)=0.077751, mflops=64.3078 (err=1.7e-07) 6. Temperton (f2c): elapsed time t=1.04 s, 1 iters, t-(init.)=1 s t(norm)=0.102982, mflops=48.5524 (err=6.7e-16) 7. SCSL: elapsed time t=1.58 s, 4 iters, t-(init.)=1.44 s t(norm)=0.0370733, mflops=134.868 (err=7.6e-16) Top mflops for N=512000 = 134.868 Normalized results and averages for N=512000: fft 0: mflops = 82.9956 (norm. = 0.615385), norm. avg. (of 17) = 0.922658 fft 1: mflops = 38.8419 (norm. = 0.288), norm. avg. (of 17) = 0.258307 fft 2: mflops = 21.2019 (norm. = 0.157205), norm. avg. (of 17) = 0.107455 fft 3: mflops = 27.9037 (norm. = 0.206897), norm. avg. (of 17) = 0.454745 fft 4: mflops = 27.9037 (norm. = 0.206897), norm. avg. (of 17) = 0.464906 fft 5: mflops = 64.3078 (norm. = 0.476821), norm. avg. (of 10) = 0.718586 fft 6: mflops = 48.5524 (norm. = 0.36), norm. avg. (of 10) = 0.23963 fft 7: mflops = 134.868 (norm. = 1), norm. avg. (of 17) = 0.590736 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.41 s, 2 iters, t-(init.)=1.33 s t(norm)=0.0585065, mflops=85.4606 (err=7.0e-16) 1. PDA: elapsed time t=1.92 s, 1 iters, t-(init.)=1.88 s t(norm)=0.165402, mflops=30.2294 (err=7.0e-16) 2. PDA (f2c): elapsed time t=4.54 s, 1 iters, t-(init.)=4.49 s t(norm)=0.395029, mflops=12.6573 (err=7.0e-16) 3. Singleton: elapsed time t=2.76 s, 1 iters, t-(init.)=2.72 s t(norm)=0.239305, mflops=20.8939 (err=8.9e-16) 4. 
Singleton (f2c): elapsed time t=2.75 s, 1 iters, t-(init.)=2.71 s t(norm)=0.238425, mflops=20.971 (err=8.9e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 7. SCSL: elapsed time t=1.82 s, 2 iters, t-(init.)=1.73 s t(norm)=0.0761025, mflops=65.7009 (err=9.7e-16) Top mflops for N=592704 = 85.4606 Normalized results and averages for N=592704: fft 0: mflops = 85.4606 (norm. = 1), norm. avg. (of 18) = 0.926955 fft 1: mflops = 30.2294 (norm. = 0.353723), norm. avg. (of 18) = 0.263608 fft 2: mflops = 12.6573 (norm. = 0.148107), norm. avg. (of 18) = 0.109714 fft 3: mflops = 20.8939 (norm. = 0.244485), norm. avg. (of 18) = 0.443064 fft 4: mflops = 20.971 (norm. = 0.245387), norm. avg. (of 18) = 0.452711 fft 5: mflops = -1 (norm. = -0.0117013), norm. avg. (of 10) = 0.718586 fft 6: mflops = -1 (norm. = -0.0117013), norm. avg. (of 10) = 0.23963 fft 7: mflops = 65.7009 (norm. = 0.768786), norm. avg. (of 18) = 0.600628 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=1.36 s, 1 iters, t-(init.)=1.29 s t(norm)=0.0738077, mflops=67.7436 (err=7.7e-16) 1. PDA: elapsed time t=2.33 s, 1 iters, t-(init.)=2.26 s t(norm)=0.129306, mflops=38.6678 (err=6.5e-16) 2. PDA (f2c): elapsed time t=4.48 s, 1 iters, t-(init.)=4.41 s t(norm)=0.252319, mflops=19.8162 (err=6.5e-16) 3. Singleton: elapsed time t=5.74 s, 1 iters, t-(init.)=5.68 s t(norm)=0.324983, mflops=15.3854 (err=7.0e-16) 4. Singleton (f2c): elapsed time t=5.89 s, 1 iters, t-(init.)=5.83 s t(norm)=0.333565, mflops=14.9896 (err=7.0e-16) 5. Temperton: elapsed time t=1.78 s, 1 iters, t-(init.)=1.7 s t(norm)=0.0972659, mflops=51.4055 (err=1.6e-07) 6. Temperton (f2c): elapsed time t=2.07 s, 1 iters, t-(init.)=2 s t(norm)=0.11443, mflops=43.6947 (err=7.7e-16) 7. 
SCSL: elapsed time t=1.45 s, 2 iters, t-(init.)=1.31 s t(norm)=0.037476, mflops=133.419 (err=7.7e-16) Top mflops for N=884736 = 133.419 Normalized results and averages for N=884736: fft 0: mflops = 67.7436 (norm. = 0.507752), norm. avg. (of 19) = 0.904892 fft 1: mflops = 38.6678 (norm. = 0.289823), norm. avg. (of 19) = 0.264988 fft 2: mflops = 19.8162 (norm. = 0.148526), norm. avg. (of 19) = 0.111756 fft 3: mflops = 15.3854 (norm. = 0.115317), norm. avg. (of 19) = 0.425814 fft 4: mflops = 14.9896 (norm. = 0.11235), norm. avg. (of 19) = 0.434797 fft 5: mflops = 51.4055 (norm. = 0.385294), norm. avg. (of 11) = 0.688287 fft 6: mflops = 43.6947 (norm. = 0.3275), norm. avg. (of 11) = 0.247618 fft 7: mflops = 133.419 (norm. = 1), norm. avg. (of 19) = 0.621647 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=1.66 s, 1 iters, t-(init.)=1.56 s t(norm)=0.0669019, mflops=74.7363 (err=7.5e-16) 1. PDA: elapsed time t=3.79 s, 1 iters, t-(init.)=3.69 s t(norm)=0.158249, mflops=31.5958 (err=7.3e-16) 2. PDA (f2c): elapsed time t=8.98 s, 1 iters, t-(init.)=8.89 s t(norm)=0.381255, mflops=13.1146 (err=7.3e-16) 3. Singleton: elapsed time t=5.73 s, 1 iters, t-(init.)=5.64 s t(norm)=0.241876, mflops=20.6718 (err=7.8e-16) 4. Singleton (f2c): elapsed time t=5.69 s, 1 iters, t-(init.)=5.6 s t(norm)=0.240161, mflops=20.8194 (err=7.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 7. SCSL: elapsed time t=1.84 s, 1 iters, t-(init.)=1.74 s t(norm)=0.0746213, mflops=67.005 (err=9.6e-16) Top mflops for N=1157625 = 74.7363 Normalized results and averages for N=1157625: fft 0: mflops = 74.7363 (norm. = 1), norm. avg. (of 20) = 0.909647 fft 1: mflops = 31.5958 (norm. = 0.422764), norm. avg. (of 20) = 0.272876 fft 2: mflops = 13.1146 (norm. = 0.175478), norm. avg. (of 20) = 0.114942 fft 3: mflops = 20.6718 (norm. = 0.276596), norm. avg. (of 20) = 0.418354 fft 4: mflops = 20.8194 (norm. = 0.278571), norm. avg. 
(of 20) = 0.426986 fft 5: mflops = -1 (norm. = -0.0133804), norm. avg. (of 11) = 0.688287 fft 6: mflops = -1 (norm. = -0.0133804), norm. avg. (of 11) = 0.247618 fft 7: mflops = 67.005 (norm. = 0.896552), norm. avg. (of 20) = 0.635393 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=2.07 s, 1 iters, t-(init.)=1.96 s t(norm)=0.0683128, mflops=73.1927 (err=5.5e-16) 1. PDA: elapsed time t=4.73 s, 1 iters, t-(init.)=4.61 s t(norm)=0.160675, mflops=31.1188 (err=5.6e-16) 2. PDA (f2c): elapsed time t=10.85 s, 1 iters, t-(init.)=10.73 s t(norm)=0.373978, mflops=13.3698 (err=5.6e-16) 3. Singleton: elapsed time t=7.05 s, 1 iters, t-(init.)=6.93 s t(norm)=0.241535, mflops=20.701 (err=6.4e-16) 4. Singleton (f2c): elapsed time t=7.51 s, 1 iters, t-(init.)=7.4 s t(norm)=0.257916, mflops=19.3862 (err=6.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 7. SCSL: elapsed time t=2.12 s, 1 iters, t-(init.)=2.01 s t(norm)=0.0700555, mflops=71.372 (err=6.6e-16) Top mflops for N=1404928 = 73.1927 Normalized results and averages for N=1404928: fft 0: mflops = 73.1927 (norm. = 1), norm. avg. (of 21) = 0.91395 fft 1: mflops = 31.1188 (norm. = 0.425163), norm. avg. (of 21) = 0.280128 fft 2: mflops = 13.3698 (norm. = 0.182665), norm. avg. (of 21) = 0.118167 fft 3: mflops = 20.701 (norm. = 0.282828), norm. avg. (of 21) = 0.4119 fft 4: mflops = 19.3862 (norm. = 0.264865), norm. avg. (of 21) = 0.419266 fft 5: mflops = -1 (norm. = -0.0136626), norm. avg. (of 11) = 0.688287 fft 6: mflops = -1 (norm. = -0.0136626), norm. avg. (of 11) = 0.247618 fft 7: mflops = 71.372 (norm. = 0.975124), norm. avg. (of 21) = 0.65157 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=2.4 s, 1 iters, t-(init.)=2.26 s t(norm)=0.0631191, mflops=79.2153 (err=7.5e-16) 1. PDA: elapsed time t=4.15 s, 1 iters, t-(init.)=3.97 s t(norm)=0.110877, mflops=45.0949 (err=8.1e-16) 2. 
PDA (f2c): elapsed time t=8.51 s, 1 iters, t-(init.)=8.37 s t(norm)=0.233764, mflops=21.3891 (err=8.1e-16) 3. Singleton: elapsed time t=12.39 s, 1 iters, t-(init.)=12.22 s t(norm)=0.34129, mflops=14.6503 (err=9.6e-16) 4. Singleton (f2c): elapsed time t=12.37 s, 1 iters, t-(init.)=12.24 s t(norm)=0.341849, mflops=14.6264 (err=9.6e-16) 5. Temperton: elapsed time t=2.43 s, 1 iters, t-(init.)=2.29 s t(norm)=0.063957, mflops=78.1776 (err=1.1e-08) 6. Temperton (f2c): elapsed time t=4.46 s, 1 iters, t-(init.)=4.29 s t(norm)=0.119815, mflops=41.7311 (err=7.1e-16) 7. SCSL: elapsed time t=1.62 s, 1 iters, t-(init.)=1.48 s t(norm)=0.0413346, mflops=120.964 (err=8.9e-16) Top mflops for N=1728000 = 120.964 Normalized results and averages for N=1728000: fft 0: mflops = 79.2153 (norm. = 0.654867), norm. avg. (of 22) = 0.902173 fft 1: mflops = 45.0949 (norm. = 0.372796), norm. avg. (of 22) = 0.28434 fft 2: mflops = 21.3891 (norm. = 0.176822), norm. avg. (of 22) = 0.120833 fft 3: mflops = 14.6503 (norm. = 0.121113), norm. avg. (of 22) = 0.398682 fft 4: mflops = 14.6264 (norm. = 0.120915), norm. avg. (of 22) = 0.405704 fft 5: mflops = 78.1776 (norm. = 0.646288), norm. avg. (of 12) = 0.684787 fft 6: mflops = 41.7311 (norm. = 0.344988), norm. avg. (of 12) = 0.255732 fft 7: mflops = 120.964 (norm. = 1), norm. avg. (of 22) = 0.667408 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=4.75 s, 1 iters, t-(init.)=4.49 s t(norm)=0.0699074, mflops=71.5232 (err=1.2e-15) 1. PDA: elapsed time t=7.42 s, 1 iters, t-(init.)=7.18 s t(norm)=0.11179, mflops=44.7269 (err=1.2e-15) 2. PDA (f2c): elapsed time t=15.24 s, 1 iters, t-(init.)=14.97 s t(norm)=0.233076, mflops=21.4522 (err=1.2e-15) 3. Singleton: elapsed time t=23.84 s, 1 iters, t-(init.)=23.59 s t(norm)=0.367286, mflops=13.6134 (err=1.6e-15) 4. Singleton (f2c): elapsed time t=23.28 s, 1 iters, t-(init.)=22.76 s t(norm)=0.354363, mflops=14.1098 (err=1.6e-15) 5. 
Temperton: elapsed time t=6.15 s, 1 iters, t-(init.)=5.91 s t(norm)=0.0920162, mflops=54.3383 (err=1.8e-07) 6. Temperton (f2c): elapsed time t=8.45 s, 1 iters, t-(init.)=8.2 s t(norm)=0.12767, mflops=39.1633 (err=1.2e-15) 7. SCSL: elapsed time t=2.81 s, 1 iters, t-(init.)=2.56 s t(norm)=0.0398581, mflops=125.445 (err=1.3e-15) Top mflops for N=2985984 = 125.445 Normalized results and averages for N=2985984: fft 0: mflops = 71.5232 (norm. = 0.570156), norm. avg. (of 23) = 0.887738 fft 1: mflops = 44.7269 (norm. = 0.356546), norm. avg. (of 23) = 0.28748 fft 2: mflops = 21.4522 (norm. = 0.171009), norm. avg. (of 23) = 0.123015 fft 3: mflops = 13.6134 (norm. = 0.108521), norm. avg. (of 23) = 0.386067 fft 4: mflops = 14.1098 (norm. = 0.112478), norm. avg. (of 23) = 0.392955 fft 5: mflops = 54.3383 (norm. = 0.433164), norm. avg. (of 13) = 0.665431 fft 6: mflops = 39.1633 (norm. = 0.312195), norm. avg. (of 13) = 0.260075 fft 7: mflops = 125.445 (norm. = 1), norm. avg. (of 23) = 0.681868 Benchmarking for array size = 180x180x180: 0. FFTW: elapsed time t=9.75 s, 1 iters, t-(init.)=9.28 s t(norm)=0.0707978, mflops=70.6236 (err=9.7e-16) 1. PDA: elapsed time t=14.44 s, 1 iters, t-(init.)=13.96 s t(norm)=0.106502, mflops=46.9475 (err=9.5e-16) 2. PDA (f2c): elapsed time t=30.03 s, 1 iters, t-(init.)=29.54 s t(norm)=0.225363, mflops=22.1864 (err=9.5e-16) 3. Singleton: elapsed time t=51.35 s, 1 iters, t-(init.)=50.87 s t(norm)=0.388091, mflops=12.8836 (err=1.2e-15) 4. Singleton (f2c): elapsed time t=51.59 s, 1 iters, t-(init.)=51.1 s t(norm)=0.389846, mflops=12.8256 (err=1.2e-15) 5. Temperton: elapsed time t=9.79 s, 1 iters, t-(init.)=9.17 s t(norm)=0.0699586, mflops=71.4708 (err=9.9e-08) 6. Temperton (f2c): elapsed time t=18.55 s, 1 iters, t-(init.)=17.94 s t(norm)=0.136866, mflops=36.5322 (err=9.4e-16) 7. 
SCSL: elapsed time t=6.13 s, 1 iters, t-(init.)=5.65 s t(norm)=0.0431043, mflops=115.998 (err=1.2e-15) Top mflops for N=5832000 = 115.998 Normalized results and averages for N=5832000: fft 0: mflops = 70.6236 (norm. = 0.608836), norm. avg. (of 24) = 0.876117 fft 1: mflops = 46.9475 (norm. = 0.404728), norm. avg. (of 24) = 0.292365 fft 2: mflops = 22.1864 (norm. = 0.191266), norm. avg. (of 24) = 0.125859 fft 3: mflops = 12.8836 (norm. = 0.111067), norm. avg. (of 24) = 0.374608 fft 4: mflops = 12.8256 (norm. = 0.110568), norm. avg. (of 24) = 0.381189 fft 5: mflops = 71.4708 (norm. = 0.61614), norm. avg. (of 14) = 0.661911 fft 6: mflops = 36.5322 (norm. = 0.314939), norm. avg. (of 14) = 0.263994 fft 7: mflops = 115.998 (norm. = 1), norm. avg. (of 24) = 0.695124 ------------------------------------------------------ @@@@ bench.1d.p2.dat N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Nielsen, NR (C), NR (F), Ooura (C), Ooura (F), Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg, SCSL, SGIMATH 2, 49.9322, 41.1206, 30.3935, 1.22497, 12.4092, 10.6998, 4.40578, 7.13317, 20.9715, 3.29741, 3.36082, , 9.89223, 6.63656, 24.3855, 24.6724, 48.771, , 5.72992, 9.98644, 11.4598, 33.026, , , , , 5.34988, 2.31986, 11.5228, 7.59838, 65.536, 25.575, , , 6.20459, 6.5536, 20.5603, 19.5996, 3.12076, 2.70252, 7.13317, 13.6179, 10.3819 4, 69.9051, 72.3156, 45.1, 5.57753, 22.55, 17.05, 14.3641, 15.0874, 33.2881, 12.0526, 6.85344, 49.9322, 33.026, 18.3961, 92.1825, 86.4805, 107.546, , 20.7639, 17.1898, 18.5589, 105.517, 41.943, 44.6203, 35.5449, 5.09017, 10.4858, 8.81156, 18.5589, 13.7069, 92.1825, 65.536, 6.76501, 44.1506, 22.0753, 23.6966, 38.8361, 32.2639, 10.5917, 9.19804, 
6.89853, 49.0562, 49.0562 8, 116.508, 119.837, 52.8694, 6.5536, 42.2245, 17.5739, 25.3688, 22.7951, 36.1578, 30.541, 20.295, 44.306, 49.539, 23.4756, 174.763, 173.557, 270.6, 113.36, 41.6653, 27.5941, 27.5941, 164.483, 63.5501, 62.2916, 57.1951, 11.9156, 16.7326, 21.8453, 27.8383, 21.1123, 133.861, 93.9023, 6.34219, 59.3534, 23.1304, 22.4695, 51.15, 36.7921, 21.5461, 14.8383, 6.83854, 117.597, 118.707 16, 75.573, 79.1378, 57.4562, 14.8734, 57.0654, 23.0456, 39.5689, 31.775, 42.3667, 69.3273, 41.5278, 46.0913, 102.3, 36.1578, 234.646, 228.261, 319.566, 87.3813, 76.2601, 37.7865, 39.5689, 193.956, 58.6616, 67.6501, 63.5501, 18.8933, 27.2357, 19.065, 38.13, 30.1748, 144.631, 127.1, 22.0753, 73.5843, 57.4562, 58.2542, 67.1089, 41.5278, 27.2357, 27.7768, 7.23156, 190.65, 189.573 32, 93.6229, 97.9978, 64.3298, 15.7918, 83.8861, 25.9549, 54.8993, 40.3298, 47.2332, 88.1156, 89.6219, 50.9017, 105.917, 26.4792, 211.834, 209.715, 287.281, 127.1, 78.2519, 50.4123, 50.4123, 201.649, 66.7883, 83.2203, 79.4376, 28.1875, 34.7211, 30.3057, 48.5452, 40.3298, 165.13, 146.654, 17.9551, 88.8624, 77.1012, 76.5384, 91.1805, 47.6625, 35.9101, 33.1828, 7.28178, 211.834, 236.966 64, 93.9023, 94.6084, 71.0899, 30.8405, 87.3813, 27.8383, 64.8604, 46.2607, 53.3174, 116.508, 123.362, 56.1737, 160.292, 34.1927, 218.833, 192.106, 196.608, 173.557, 109.417, 56.6798, 58.2542, 190.65, 68.3854, 88.6121, 87.9924, 34.9525, 44.6203, 42.799, 56.1737, 49.152, 178.481, 175.985, 38.5979, 99.8644, 106.635, 107.546, 114.39, 53.7731, 52.4288, 47.3042, 7.63526, 259.441, 254.2 128, 105.612, 107.154, 77.2635, 29.1271, 107.154, 28.4497, 71.2624, 53.1886, 59.1938, 146.801, 168.736, 61.1669, 173.729, 35.6312, 240.657, 203.89, 227.598, 177.94, 126.552, 63.2761, 62.7353, 179.025, 75.6704, 99.8644, 99.1896, 41.943, 46.4559, 34.9525, 62.7353, 56.0308, 182.361, 182.361, 34.9525, 111.213, 107.942, 105.612, 139.81, 60.6614, 53.1886, 50.2742, 7.58268, 299.593, 299.593 256, 108.24, 108.24, 83.8861, 37.1177, 99.8644, 29.5374, 
80.6597, 55.9241, 64.5278, 184.365, 213.722, 66.5763, 209.715, 41.1206, 203.36, 193.956, 196.225, 198.547, 137.518, 69.9051, 68.7591, 167.772, 79.8915, 103.563, 106.861, 46.0913, 54.4715, 43.2402, 67.6501, 62.6016, 191.74, 200.925, 53.4306, 118.149, 136.4, 135.3, 155.345, 64.5278, 55.9241, 56.2994, 7.71012, 316.551, 313.593 512, 115.794, 117.965, 87.3813, 38.3625, 112.347, 29.6767, 87.3813, 60.8851, 68.8846, 202.95, 210.887, 71.4938, 171.585, 33.9467, 213.27, 212.072, 188.744, 218.201, 117.965, 72.0396, 71.4938, 158.608, 85.0197, 112.347, 111.683, 51.289, 51.8527, 51.289, 70.9563, 67.4085, 194.581, 200.791, 49.9322, 115.794, 139.81, 140.853, 171.585, 69.9051, 54.5502, 51.289, 7.9171, 328.25, 269.634 1024, 117.818, 118.483, 92.7943, 46.8114, 85.2501, 28.6496, 91.9804, 63.5501, 73.8434, 203.607, 207.639, 75.4371, 145.636, 36.1578, 201.649, 177.725, 177.725, 220.753, 127.875, 71.8203, 69.9051, 89.6219, 88.1156, 114.598, 117.159, 54.3304, 57.2992, 42.2813, 73.327, 71.8203, 201.649, 209.715, 63.9376, 104.858, 153.077, 155.345, 182.361, 73.8434, 63.1672, 59.9186, 7.8019, 231.73, 241.052 2048, 98.5841, 101.178, 81.2277, 42.0961, 35.1657, 28.2704, 94.5437, 62.6866, 63.3755, 178.827, 193.854, 67.4523, 66.6725, 28.2704, 162.455, 152.773, 102.074, 174.763, 59.4553, 67.4523, 66.2893, 80.6597, 90.8215, 116.508, 110.907, 50.5892, 32.0398, 40.6139, 72.0896, 73.9381, 165.962, 172.154, 60.0747, 44.0242, 136.501, 132.579, 102.985, 56.5409, 53.3997, 52.4288, 4.90405, 136.501, 135.698 4096, 48.3958, 49.9322, 44.306, 42.799, 21.2549, 24.576, 66.2259, 45.923, 37.6734, 154.392, 167.772, 39.8193, 51.9955, 31.775, 122.164, 127.1, 83.3305, 106.635, 46.9512, 38.3625, 37.4491, 59.3534, 87.9924, 109.417, 99.8644, 33.6441, 32.768, 37.2276, 38.3625, 41.3912, 98.304, 95.3251, 62.9146, 42.5098, 69.5189, 70.6905, 62.2916, 34.3795, 39.5689, 39.8193, 5.8689, 140.591, 139.81 8192, 49.0341, 50.1158, 38.7258, 40.0926, 21.0362, 24.3419, 61.4031, 45.1374, 34.4229, 154.903, 167.258, 36.6438, 44.2581, 
24.5171, 118.535, 118.535, 85.1968, 95.3251, 41.0587, 36.254, 36.0621, , 54.9657, 61.4031, 56.7979, 33.4105, 28.8803, 31.5544, 37.2445, 39.1709, 94.6631, 92.7312, 58.7564, 40.8128, 64.2995, 64.2995, 52.8352, 30.9807, 36.0621, 35.6845, 6.08549, 151.461, 155.788 16384, 44.7563, 46.1637, 37.4491, 47.9741, 19.9457, 24.4668, 61.6809, 44.485, 32.768, 164.945, 164.945, 35.9805, 46.1637, 26.9854, 113.799, 106.377, 76.859, 92.3274, 42.6746, 34.9525, 34.2992, , 53.5769, 56.8995, 56.4618, 32.1931, 29.3601, 33.3638, 35.9805, 37.4491, 84.8559, 80.6597, 69.2456, 39.0427, 61.1669, 63.8264, 44.7563, 30.3307, 37.0709, 36.7002, 6.07619, 152.917, 154.527 32768, 46.5344, 46.8114, 35.4249, 44.1816, 20.1649, 24.1237, 61.44, 43.9347, 32.4972, 162.151, 163.84, 34.7979, 44.939, 25.7004, 108.473, 109.227, 67.7959, 88.8624, 41.3912, 33.6082, 34.1927, , 50.4123, 53.8652, 53.1373, 32.2308, 26.5686, 36.0749, 35.4249, 35.4249, 85.4817, 81.0755, 64.9944, 37.4491, 55.3825, 55.7753, 41.6102, 29.1271, 35.1086, 35.7469, 5.88647, 162.151, 163.84 65536, 41.5278, 43.4643, 34.9525, 51.4639, 19.4181, 24.6724, 58.6616, 43.4643, 32.0176, 161.319, 164.483, 34.1, 45.3438, 27.7768, 106.861, 93.2068, 59.4937, 82.2413, 41.1206, 33.026, 34.6637, , 50.2312, 52.4288, 52.7585, 32.0176, 27.06, 31.775, 34.6637, 34.3795, 78.3982, 71.0899, 66.052, 34.3795, 56.6798, 58.2542, 39.9458, 28.1497, 34.3795, 35.2463, 5.14008, 162.886, 161.319 131072, 29.909, 32.0608, 24.3522, 45.474, 14.7565, 24.3522, 52.4288, 39.0916, 27.5089, 160.593, 160.593, 28.938, 27.5089, 18.7246, 73.6603, 76.1786, 47.409, 62.7669, 30.9476, 26.2144, 27.5089, , 47.409, 50.3553, 49.7927, 25.1777, 25.0362, 25.7598, 27.3402, 27.6798, 73.0565, 69.0922, 61.8951, 25.9096, 47.6625, 49.2425, 35.0901, 24.758, 28.2054, 27.3402, 5.89477, 141.475, 141.475 262144, 22.2575, 23.8313, 19.826, 49.152, 7.08497, 23.1304, 45.3711, 31.0434, 18.2891, 100.396, 99.3388, 19.1813, 15.8342, 10.8225, 63.3368, 55.8413, 29.308, 48.1489, 12.288, 23.3594, 24.576, , 37.1543, 38.677, 
39.3216, 19.6608, 9.91301, 20.1649, 24.0744, 24.3226, 59.729, 54.5502, 70.9563, 11.9761, 39.9881, 42.1303, 23.593, 16.6148, 16.8521, 13.5592, 5.20816, 96.2978, 129.276 524288, 6.42676, 5.82542, 5.54648, 26.9229, 4.22813, 18.1118, 18.0461, 13.5346, 4.80302, 84.4193, 79.0593, 5.72498, 6.23371, 7.62747, 44.4709, 27.9817, 22.0387, 22.1366, 8.44193, 9.25787, 7.11534, , 32.5538, 33.2049, 19.3803, 5.81862, 10.3982, 8.11195, 7.14596, 8.32899, 42.5704, 39.8459, 49.3142, 10.4199, 11.6101, 12.5776, 11.5562, 5.66637, 10.4418, 9.08893, 3.67582, 89.743, 91.3897 1048576, 5.9782, 5.9782, 4.84779, 26.3461, 3.62829, 16.6177, 14.3641, 11.9974, 4.6728, , , 5.27188, 7.6094, 8.5668, 27.3067, 27.0252, 13.2731, , 7.63156, 7.48448, 6.6198, , , 10.6563, 9.53251, 4.98847, 8.05358, 8.64448, 6.4251, 7.24655, 22.9951, 21.6648, 60.6113, 8.83383, 12.3217, 12.1083, 10.2101, 4.95546, 9.72705, 7.98003, 3.527, 95.3251, 81.285 2097152, 6.28248, 6.29685, 4.77659, 17.0963, 3.50025, 15.6615, , 9.87006, 4.68214, , , 5.05628, 6.24861, 6.70323, 26.4665, 24.2245, 15.2283, , 7.63262, 6.66266, 6.53221, , , 8.1195, 7.64852, , 7.75083, 8.61843, 6.46889, 6.22388, 24.3585, 23.6014, 43.3466, 8.28446, 9.00986, 10.0502, 8.14654, 4.22651, 7.89817, 7.56444, 3.47594, 94.103, 82.1645 Norm. 
Avg., 0.321856, 0.322525, 0.236571, 0.182956, 0.180745, 0.133665, 0.269032, 0.194585, 0.1903, 0.643315, 0.657007, 0.202683, 0.313025, 0.124498, 0.638173, 0.598542, 0.572956, 0.548242, 0.249935, 0.184837, 0.185426, 0.574086, 0.312907, 0.337655, 0.317661, 0.143328, 0.137762, 0.13982, 0.187556, 0.178198, 0.576884, 0.516728, 0.294707, 0.251703, 0.314825, 0.318825, 0.322693, 0.181904, 0.161984, 0.153887, 0.0368404, 0.826495, 0.821541
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Nielsen, Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg, SCSL, SGIMATH
6, , 16.8008, 13.1155, 35.3547, 18.6504, 127.056, 118.709, 19.1783, 33.3262, 17.525, 7.58544, 15.6377, 15.1709, 13.4629, 8.68759, 7.31258, 37.6462, 41.4877
9, 6.11012, 38.1571, 24.2818, 57.5292, 23.8178, 152.628, 157.448, 17.4738, 39.9935, 24.7642, 11.9089, 27.9059, 28.1158, 22.5265, 11.613, 7.5391, 58.8881, 57.5292
12, , 39.4312, 36.3784, 79.4177, 29.2158, 218.977, 203.195, 28.7687, 59.6683, 26.1049, 15.3224, 31.6779, 31.3259, 30.4792, 16.9839, 7.45854, 82.9214, 86.0864
15, 9.23228, 51.9004, 50.2043, 92.5453, 32.0052, 158.376, 161.711, 23.8548, 66.2178, 21.2189, 17.7807, 33.3968, 32.5477, 41.746, 15.7403, 6.80963, 87.287, 87.7858
18, 10.1633, 56.2176, 48.7034, 69.7737, 21.2028, 138.565, 136.64, 20.6683, 75.6776, 33.9244, 13.1525, 43.1495, 43.5314, 33.2368, 15.5666, 7.68601, 89.4372, 92.8121
24, , 87.4124, 71.4012, 91.2851, 24.697, 173.772, 169.683, 38.5643, 96.1536, 36.4218, 25.3927, 41.4455, 40.7431, 49.394, 23.1139, 7.57513, 155.087, 153.437
36, 12.3957, 109.886, 109.886, 117.283, 26.7486, 176.774, 176.774, 25.8419, 115.07, 43.8755, 20.8859, 68.1418, 67.3889, 59.2106, 23.1011, 7.85914, 152.467, 153.426
80, 20.5106, 156.345, 160.899, 174.448, 36.3433, 207.157, 196.125, 59.1877, 115.087, 32.3683, 47.6223, 101.052, 101.672, 73.3299, 39.8379, 7.50569, 215.228, 215.228
108, 14.095, 140.619, 171.98, 174.49, 30.8056, 207.871, 211.55, 23.9052, 121.965, 55.336, 24.098, 79.6839, 76.6191, 73.7814, 23.1639, 7.9472, 200.884, 211.55
210, 18.9585, 193.737, 195.161, 75.4032, 18.4319, 163.839, 156.129, 29.103, 126.39, 28.1165, 31.5975, 57.6999, 59.2454, , , 6.58282, 99.7817, 99.7817
504, 19.9704, 245.464, 240.683, 82.7346, 18.9883, 191.057, 158.398, 33.3319, 139.343, 36.7709, 30.0853, 74.728, 76.5808, , , 7.23928, 74.1302, 73.5419
1000, 22.9842, 182.231, 238.713, 165.934, 34.4762, 174.444, 174.444, 39.2499, 125.214, 30.5538, 57.011, 112.762, 110.924, 96.2732, 32.2942, 7.04763, 210.412, 210.412
1960, 23.2524, 228.649, 228.649, 59.6475, 14.7515, 131.439, 136.337, 36.3415, 93.0096, 24.9435, 31.9045, 77.8378, 79.53, , , 6.23587, 45.7297, 45.7297
4725, 18.8323, 160.484, 189.288, 78.119, 21.841, 144.75, 143.345, 23.3615, 100.439, 27.7528, 26.7473, 65.3296, 65.9129, , , 6.68682, 93.4462, 93.4462
10368, 19.5834, 151.311, 178.822, 134.116, 30.735, 175.281, 160.94, 31.6132, 115.708, 47.5897, 28.1901, 71.9649, 69.6983, 52.0687, 27.8355, 7.96015, 140.503, 142.769
27000, 19.9979, 191.258, 191.258, 142.108, 32.6119, 150.517, 155.106, 27.4109, 121.13, 32.1178, 35.7266, 69.1232, 68.7495, 59.433, 23.9073, 7.22651, 128.471, 131.12
75600, 19.6029, 163.357, 163.357, 83.7731, 24.0232, 148.507, 144.139, 28.0041, 104.271, 31.2148, 31.2148, 63.2352, 63.6458, , , 6.84459, 94.2447, 93.3471
165375, 12.7984, 153.924, 153.924, 46.2393, 15.2491, 76.9621, 84.9434, 17.0645, 71.2258, 22.5735, 22.3972, 43.1104, 43.1104, , , 5.77992, 72.1217, 73.0405
362880, 10.3747, 56.7973, 57.7765, 45.5924, 18.2122, 69.8133, 70.5482, 17.92, 61.487, 20.187, 13.8473, 27.2442, 27.0245, , , 5.68937, 78.3869, 77.9312
Norm. Avg., 0.0891771, 0.682723, 0.70506, 0.520329, 0.14249, 0.876261, 0.858466, 0.158357, 0.521211, 0.177217, 0.140603, 0.314193, 0.312981, 0.266678, 0.116967, 0.0414337, 0.622635, 0.629405
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM, HARM (f2c), NR (C), NR (F), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c), SCSL
4x4x4, 198.156, , , 79.6387, 51.9955, 21.6947, 14.0434, 118.707, 125.829, 67.6501, 58.2542, 84.4491
8x8x8, 290.375, 121.77, 107.241, 120.99, 83.5149, 41.0312, 21.0651, 116.508, 115.088, 155.987, 131.989, 214.481
16x16x16, 146.313, 96.0528, 89.2405, 44.939, 43.0922, 58.7987, 28.3399, 66.9304, 66.9304, 71.9024, 82.7823, 190.65
32x32x32, 135.592, 79.4376, 73.4983, 36.0749, 36.0749, 66.0867, 24.1237, 51.7389, 51.4008, 68.3854, 70.8497, 218.453
64x64x64, 80.6597, 67.8934, 67.8934, 23.9522, 23.8313, 36.5782, 19.6608, 38.677, 38.3625, 53.3174, 54.2367, 178.06
256x64x32, 41.5061, 28.2996, 33.6536, 8.9905, 8.9743, 27.5179, 17.0573, 13.5715, 13.3175, 23.3837, 22.9527, 98.6284
16x1024x64, 44.2437, 27.8137, 32.4637, 7.42618, 7.52207, 28.807, 17.9551, 12.71, 12.3799, , , 115.228
128x128x128, 39.8193, 28.9357, 32.1931, 7.02395, 7.0262, 26.9194, 16.6567, 8.63533, 8.48232, 13.7884, 13.1857, 123.708
512x128x64, 39.0003, 21.2126, 27.1396, 6.74325, 6.85547, 26.8085, 16.1942, 9.31691, 9.26267, , , 121.734
Norm. Avg., 0.587575, 0.325571, 0.332659, 0.18019, 0.149377, 0.225997, 0.123261, 0.244406, 0.246696, 0.31668, 0.307281, 0.907201
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c), SCSL
5x5x5, 168.828, 28.7619, 16.823, 166.853, 170.849, 116.934, 33.3316, 82.9414
6x6x6, 223.123, 28.3514, 16.0304, 91.4804, 90.2767, 108.048, 36.4948, 81.6789
7x7x7, 169.034, 11.8324, 5.10017, 72.1487, 78.8826, , , 37.4443
9x9x9, 225.365, 47.0133, 21.5121, 123.461, 115.431, 173.147, 31.136, 120.322
10x10x10, 230.621, 51.0248, 24.0683, 114.022, 115.965, 192.546, 48.1366, 133.398
11x11x11, 123, 11.8666, 4.77873, 64.8855, 70.0249, , , 33.3609
12x12x12, 275.804, 61.7872, 26.5789, 119.689, 117.472, 240.892, 63.0148, 124.382
13x13x13, 114.564, 11.6488, 4.5376, 66.7779, 74.3302, , , 31.8558
14x14x14, 160.472, 24.6122, 8.0883, 64.7064, 67.9965, , , 37.1462
15x15x15, 170.196, 70.324, 28.1296, 90.4166, 88.0579, 194.743, 36.9586, 127.379
24x25x28, 115.204, 59.8877, 19.5488, 69.869, 71.8652, , , 84.7848
48x48x48, 121.505, 74.8669, 25.9155, 54.1009, 54.4987, 74.8669, 55.728, 161.127
49x49x49, 115.719, 29.3584, 8.18882, 48.6306, 51.4726, , , 39.2415
60x60x60, 117.774, 52.4338, 22.6489, 41.8324, 41.6051, 161.165, 48.4515, 144.44
72x60x56, 97.2256, 42.0052, 17.3062, 36.979, 36.6656, , , 97.2256
75x75x75, 85.6886, 44.5387, 21.3063, 30.0891, 29.4155, 71.0212, 31.5334, 118.547
80x80x80, 82.9956, 38.8419, 21.2019, 27.9037, 27.9037, 64.3078, 48.5524, 134.868
84x84x84, 85.4606, 30.2294, 12.6573, 20.8939, 20.971, , , 65.7009
96x96x96, 67.7436, 38.6678, 19.8162, 15.3854, 14.9896, 51.4055, 43.6947, 133.419
105x105x105, 74.7363, 31.5958, 13.1146, 20.6718, 20.8194, , , 67.005
112x112x112, 73.1927, 31.1188, 13.3698, 20.701, 19.3862, , , 71.372
120x120x120, 79.2153, 45.0949, 21.3891, 14.6503, 14.6264, 78.1776, 41.7311, 120.964
144x144x144, 71.5232, 44.7269, 21.4522, 13.6134, 14.1098, 54.3383, 39.1633, 125.445
180x180x180, 70.6236, 46.9475, 22.1864, 12.8836, 12.8256, 71.4708, 36.5322, 115.998
Norm. Avg., 0.876117, 0.292365, 0.125859, 0.374608, 0.381189, 0.661911, 0.263994, 0.695124
@@@@ end