To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Roger Koch
@ submitter email = PRIVATE
@ submitter organization = IBM Research
@ computer manufacturer = IBM
@ computer model = IntelliStation M Pro
@ CPU manufacturer = Intel
@ CPU model = Pentium II
@ CPU speed = 300 MHz
@ RAM = 256 MB
@ L2 cache size = 512KB
@ operating system = Windows NT 4.0
@ C compiler = Intel C/C++ V 2.4
@ C compiler flags = all
@ Fortran compiler = NONE
@ Fortran compiler flags = NONE
@ remarks = Intel, with default speed opts (/O2)
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
2 (0.000228882 MB)
4 (0.000534058 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)
262144 (25.4987 MB)
524288 (50.9973 MB)
1048576 (96 MB)
2097152 (192 MB)
Maximum array size = 2097152

Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Beauregard
5. Bergland
6. CWP (min N)
7. CWP (best N)
8. Edelblute
9. FFTPACK (f2c)
10. FFTW
11. FFTW_ESTIMATE
12. Frigo-old
13. Green
14. GSL
15. GSL DIT
16. GSL DIF
17. Krukar
18. Mayer (Buneman)
19. Mayer (simple)
20. Mayer (lookup)
21. NAPACK (f2c)
22. Ooura (C)
23. Ransom
24. Singleton (f2c)
25. Temperton (f2c)
26. Valkenburg

Computing normalized averages (27 transforms).

Benchmarking for array size = 2 (power of 2):
0. Arndt DIF: elapsed time t=1.522 s, 4194304 iters, t-(init.)=1.061 s t(norm)=0.126481, mflops=39.5316 (err=1.5e-017)
1. Arndt DIT: elapsed time t=1.442 s, 4194304 iters, t-(init.)=0.982 s t(norm)=0.117064, mflops=42.7119 (err=1.5e-017)
2. 
Arndt Split-Radix: elapsed time t=1.672 s, 4194304 iters, t-(init.)=1.211 s t(norm)=0.144362, mflops=34.635 (err=1.5e-017)
3. Arndt 4-step: elapsed time t=1.532 s, 262144 iters, t-(init.)=1.502 s t(norm)=2.86484, mflops=1.7453 (err=1.5e-017)
4. Beauregard: elapsed time t=1.622 s, 1048576 iters, t-(init.)=1.502 s t(norm)=0.716209, mflops=6.9812 (err=1.2e-016)
5. Bergland: elapsed time t=1.492 s, 1048576 iters, t-(init.)=1.371 s t(norm)=0.653744, mflops=7.64826 (err=1.2e-016)
6. CWP (min N): elapsed time t=1.092 s, 524288 iters, t-(init.)=1.032 s t(norm)=0.984192, mflops=5.08031
7. CWP (best N) (N=3): elapsed time t=1.231 s, 524288 iters, t-(init.)=1.16 s t(norm)=1.10626, mflops=4.51972
8. Skipping fft (Edelblute can't handle N <= 2).
9. FFTPACK (f2c): elapsed time t=1.042 s, 1048576 iters, t-(init.)=0.922 s t(norm)=0.439644, mflops=11.3728 (err=1.2e-016)
FFTW_MEASURE plan: (cost = 4.577637e-007) FFTW_NOTW 2
10. FFTW: elapsed time t=1.051 s, 2097152 iters, t-(init.)=0.81 s t(norm)=0.193119, mflops=25.8908 (err=1.2e-016)
FFTW_ESTIMATE plan: (cost = 1.820000e+002) FFTW_NOTW 2
11. FFTW_ESTIMATE: elapsed time t=1.062 s, 2097152 iters, t-(init.)=0.822 s t(norm)=0.19598, mflops=25.5128 (err=1.2e-016)
12. Frigo-old: elapsed time t=1.091 s, 4194304 iters, t-(init.)=0.6 s t(norm)=0.0715256, mflops=69.9051 (err=1.2e-016)
13. Skipping fft (Green can't handle this size.).
14. GSL: elapsed time t=1.742 s, 2097152 iters, t-(init.)=1.502 s t(norm)=0.358105, mflops=13.9624 (err=1.2e-016)
15. GSL DIT: elapsed time t=1.232 s, 1048576 iters, t-(init.)=1.112 s t(norm)=0.530243, mflops=9.42964 (err=1.2e-016)
16. GSL DIF: elapsed time t=1.272 s, 1048576 iters, t-(init.)=1.152 s t(norm)=0.549316, mflops=9.10222 (err=1.2e-016)
17. Krukar: elapsed time t=1.031 s, 2097152 iters, t-(init.)=0.791 s t(norm)=0.188589, mflops=26.5127 (err=1.2e-016)
18. Skipping fft (Mayer can't handle N <= 2).
19. Skipping fft (Mayer can't handle N <= 2).
20. Skipping fft (Mayer can't handle N <= 2).
21. 
NAPACK (f2c): elapsed time t=1.252 s, 524288 iters, t-(init.)=1.192 s t(norm)=1.13678, mflops=4.39839 (err=2.1e-016)
22. Ooura (C): elapsed time t=1.963 s, 4194304 iters, t-(init.)=1.483 s t(norm)=0.176787, mflops=28.2826 (err=1.2e-016)
23. Skipping fft (Ransom doesn't work for N=2).
24. Singleton (f2c): elapsed time t=1.423 s, 1048576 iters, t-(init.)=1.303 s t(norm)=0.621319, mflops=8.0474 (err=1.2e-016)
25. Temperton (f2c): elapsed time t=1.452 s, 524288 iters, t-(init.)=1.392 s t(norm)=1.32751, mflops=3.76644 (err=1.2e-016)
26. Valkenburg: elapsed time t=1.392 s, 1048576 iters, t-(init.)=1.272 s t(norm)=0.606537, mflops=8.24352 (err=2.1e-016)

Top mflops for N=2 = 69.9051

Normalized results and averages for N=2:
fft 0: mflops = 39.5316 (norm. = 0.565504), norm. avg. (of 1) = 0.565504
fft 1: mflops = 42.7119 (norm. = 0.610998), norm. avg. (of 1) = 0.610998
fft 2: mflops = 34.635 (norm. = 0.495458), norm. avg. (of 1) = 0.495458
fft 3: mflops = 1.7453 (norm. = 0.0249667), norm. avg. (of 1) = 0.0249667
fft 4: mflops = 6.9812 (norm. = 0.0998668), norm. avg. (of 1) = 0.0998668
fft 5: mflops = 7.64826 (norm. = 0.109409), norm. avg. (of 1) = 0.109409
fft 6: mflops = 5.08031 (norm. = 0.0726744), norm. avg. (of 1) = 0.0726744
fft 7: mflops = 4.51972 (norm. = 0.0646552), norm. avg. (of 1) = 0.0646552
fft 8: mflops = -1 (norm. = -0.0143051), norm. avg. (of 0) = -1
fft 9: mflops = 11.3728 (norm. = 0.16269), norm. avg. (of 1) = 0.16269
fft 10: mflops = 25.8908 (norm. = 0.37037), norm. avg. (of 1) = 0.37037
fft 11: mflops = 25.5128 (norm. = 0.364964), norm. avg. (of 1) = 0.364964
fft 12: mflops = 69.9051 (norm. = 1), norm. avg. (of 1) = 1
fft 13: mflops = -1 (norm. = -0.0143051), norm. avg. (of 0) = -1
fft 14: mflops = 13.9624 (norm. = 0.199734), norm. avg. (of 1) = 0.199734
fft 15: mflops = 9.42964 (norm. = 0.134892), norm. avg. (of 1) = 0.134892
fft 16: mflops = 9.10222 (norm. = 0.130208), norm. avg. (of 1) = 0.130208
fft 17: mflops = 26.5127 (norm. = 0.379267), norm. avg. 
(of 1) = 0.379267
fft 18: mflops = -1 (norm. = -0.0143051), norm. avg. (of 0) = -1
fft 19: mflops = -1 (norm. = -0.0143051), norm. avg. (of 0) = -1
fft 20: mflops = -1 (norm. = -0.0143051), norm. avg. (of 0) = -1
fft 21: mflops = 4.39839 (norm. = 0.0629195), norm. avg. (of 1) = 0.0629195
fft 22: mflops = 28.2826 (norm. = 0.404585), norm. avg. (of 1) = 0.404585
fft 23: mflops = -1 (norm. = -0.0143051), norm. avg. (of 0) = -1
fft 24: mflops = 8.0474 (norm. = 0.115119), norm. avg. (of 1) = 0.115119
fft 25: mflops = 3.76644 (norm. = 0.0538793), norm. avg. (of 1) = 0.0538793
fft 26: mflops = 8.24352 (norm. = 0.117925), norm. avg. (of 1) = 0.117925

Benchmarking for array size = 4 (power of 2):
0. Arndt DIF: elapsed time t=1.623 s, 2097152 iters, t-(init.)=1.293 s t(norm)=0.0770688, mflops=64.8771 (err=1.0e-016)
1. Arndt DIT: elapsed time t=1.652 s, 2097152 iters, t-(init.)=1.331 s t(norm)=0.0793338, mflops=63.0249 (err=1.0e-016)
2. Arndt Split-Radix: elapsed time t=1.242 s, 1048576 iters, t-(init.)=1.072 s t(norm)=0.127792, mflops=39.126 (err=1.0e-016)
3. Arndt 4-step: elapsed time t=1.362 s, 262144 iters, t-(init.)=1.322 s t(norm)=0.630379, mflops=7.93174 (err=1.0e-016)
4. Beauregard: elapsed time t=1.362 s, 262144 iters, t-(init.)=1.322 s t(norm)=0.630379, mflops=7.93174 (err=1.9e-016)
5. Bergland: elapsed time t=1.792 s, 1048576 iters, t-(init.)=1.621 s t(norm)=0.193238, mflops=25.8748 (err=1.6e-016)
6. CWP (min N): elapsed time t=1.301 s, 524288 iters, t-(init.)=1.21 s t(norm)=0.288486, mflops=17.3318
7. CWP (best N) (N=15): elapsed time t=1.773 s, 262144 iters, t-(init.)=1.653 s t(norm)=0.788212, mflops=6.34347
8. Edelblute: elapsed time t=1.241 s, 1048576 iters, t-(init.)=1.07 s t(norm)=0.127554, mflops=39.1991 (err=1.0e-016)
9. FFTPACK (f2c): elapsed time t=1.592 s, 1048576 iters, t-(init.)=1.422 s t(norm)=0.169516, mflops=29.4958 (err=1.6e-016)
FFTW_MEASURE plan: (cost = 1.029968e-006) FFTW_NOTW 4
10. 
FFTW: elapsed time t=1.182 s, 1048576 iters, t-(init.)=1.002 s t(norm)=0.119448, mflops=41.8593 (err=1.6e-016)
FFTW_ESTIMATE plan: (cost = 3.176000e+002) FFTW_NOTW 4
11. FFTW_ESTIMATE: elapsed time t=1.181 s, 1048576 iters, t-(init.)=1 s t(norm)=0.119209, mflops=41.943 (err=1.6e-016)
12. Frigo-old: elapsed time t=1.582 s, 4194304 iters, t-(init.)=0.891 s t(norm)=0.0265539, mflops=188.296 (err=1.6e-016)
13. Skipping fft (Green can't handle this size.).
14. GSL: elapsed time t=1.292 s, 1048576 iters, t-(init.)=1.122 s t(norm)=0.133753, mflops=37.3824 (err=1.6e-016)
15. GSL DIT: elapsed time t=1.242 s, 524288 iters, t-(init.)=1.152 s t(norm)=0.274658, mflops=18.2044 (err=1.9e-016)
16. GSL DIF: elapsed time t=1.302 s, 524288 iters, t-(init.)=1.212 s t(norm)=0.288963, mflops=17.3032 (err=1.9e-016)
17. Krukar: elapsed time t=1.372 s, 2097152 iters, t-(init.)=1.032 s t(norm)=0.061512, mflops=81.285 (err=1.6e-016)
18. Mayer (Buneman): elapsed time t=1.973 s, 2097152 iters, t-(init.)=1.623 s t(norm)=0.0967383, mflops=51.6858 (err=1.0e-016)
19. Mayer (simple): elapsed time t=1.863 s, 2097152 iters, t-(init.)=1.533 s t(norm)=0.0913739, mflops=54.7202
20. Mayer (lookup): elapsed time t=1.933 s, 2097152 iters, t-(init.)=1.593 s t(norm)=0.0949502, mflops=52.6592 (err=1.0e-016)
21. NAPACK (f2c): elapsed time t=1.352 s, 262144 iters, t-(init.)=1.312 s t(norm)=0.62561, mflops=7.9922 (err=2.5e-016)
22. Ooura (C): elapsed time t=1.232 s, 1048576 iters, t-(init.)=1.062 s t(norm)=0.1266, mflops=39.4944 (err=1.6e-016)
23. Ransom: elapsed time t=1.852 s, 262144 iters, t-(init.)=1.812 s t(norm)=0.864029, mflops=5.78684 (err=2.3e-016)
24. Singleton (f2c): elapsed time t=1.833 s, 1048576 iters, t-(init.)=1.663 s t(norm)=0.198245, mflops=25.2213 (err=1.6e-016)
25. Temperton (f2c): elapsed time t=1.713 s, 524288 iters, t-(init.)=1.623 s t(norm)=0.386953, mflops=12.9215 (err=1.6e-016)
26. 
Valkenburg: elapsed time t=1.282 s, 262144 iters, t-(init.)=1.232 s t(norm)=0.587463, mflops=8.51117 (err=2.5e-016)

Top mflops for N=4 = 188.296

Normalized results and averages for N=4:
fft 0: mflops = 64.8771 (norm. = 0.344548), norm. avg. (of 2) = 0.455026
fft 1: mflops = 63.0249 (norm. = 0.334711), norm. avg. (of 2) = 0.472854
fft 2: mflops = 39.126 (norm. = 0.207789), norm. avg. (of 2) = 0.351624
fft 3: mflops = 7.93174 (norm. = 0.0421237), norm. avg. (of 2) = 0.0335452
fft 4: mflops = 7.93174 (norm. = 0.0421237), norm. avg. (of 2) = 0.0709953
fft 5: mflops = 25.8748 (norm. = 0.137415), norm. avg. (of 2) = 0.123412
fft 6: mflops = 17.3318 (norm. = 0.0920455), norm. avg. (of 2) = 0.0823599
fft 7: mflops = 6.34347 (norm. = 0.0336887), norm. avg. (of 2) = 0.049172
fft 8: mflops = 39.1991 (norm. = 0.208178), norm. avg. (of 1) = 0.208178
fft 9: mflops = 29.4958 (norm. = 0.156646), norm. avg. (of 2) = 0.159668
fft 10: mflops = 41.8593 (norm. = 0.222305), norm. avg. (of 2) = 0.296338
fft 11: mflops = 41.943 (norm. = 0.22275), norm. avg. (of 2) = 0.293857
fft 12: mflops = 188.296 (norm. = 1), norm. avg. (of 2) = 1
fft 13: mflops = -1 (norm. = -0.00531077), norm. avg. (of 0) = -1
fft 14: mflops = 37.3824 (norm. = 0.198529), norm. avg. (of 2) = 0.199132
fft 15: mflops = 18.2044 (norm. = 0.0966797), norm. avg. (of 2) = 0.115786
fft 16: mflops = 17.3032 (norm. = 0.0918936), norm. avg. (of 2) = 0.111051
fft 17: mflops = 81.285 (norm. = 0.431686), norm. avg. (of 2) = 0.405476
fft 18: mflops = 51.6858 (norm. = 0.274492), norm. avg. (of 1) = 0.274492
fft 19: mflops = 54.7202 (norm. = 0.290607), norm. avg. (of 1) = 0.290607
fft 20: mflops = 52.6592 (norm. = 0.279661), norm. avg. (of 1) = 0.279661
fft 21: mflops = 7.9922 (norm. = 0.0424447), norm. avg. (of 2) = 0.0526821
fft 22: mflops = 39.4944 (norm. = 0.209746), norm. avg. (of 2) = 0.307166
fft 23: mflops = 5.78684 (norm. = 0.0307326), norm. avg. (of 1) = 0.0307326
fft 24: mflops = 25.2213 (norm. = 0.133945), norm. avg. 
(of 2) = 0.124532
fft 25: mflops = 12.9215 (norm. = 0.0686229), norm. avg. (of 2) = 0.0612511
fft 26: mflops = 8.51117 (norm. = 0.0452009), norm. avg. (of 2) = 0.0815627

Benchmarking for array size = 8 (power of 2):
0. Arndt DIF: elapsed time t=1.652 s, 1048576 iters, t-(init.)=1.371 s t(norm)=0.0544786, mflops=91.7791 (err=2.2e-016)
1. Arndt DIT: elapsed time t=1.683 s, 1048576 iters, t-(init.)=1.403 s t(norm)=0.0557502, mflops=89.6858 (err=2.2e-016)
2. Arndt Split-Radix: elapsed time t=1.782 s, 524288 iters, t-(init.)=1.632 s t(norm)=0.1297, mflops=38.5506 (err=1.8e-016)
3. Arndt 4-step: elapsed time t=1.833 s, 131072 iters, t-(init.)=1.803 s t(norm)=0.573158, mflops=8.72359 (err=2.0e-016)
4. Beauregard: elapsed time t=1.362 s, 131072 iters, t-(init.)=1.322 s t(norm)=0.420252, mflops=11.8976 (err=1.1e-016)
5. Bergland: elapsed time t=1.512 s, 524288 iters, t-(init.)=1.362 s t(norm)=0.108242, mflops=46.1928 (err=1.2e-016)
6. CWP (min N): elapsed time t=1.843 s, 524288 iters, t-(init.)=1.693 s t(norm)=0.134548, mflops=37.1616
7. CWP (best N) (N=15): elapsed time t=1.763 s, 262144 iters, t-(init.)=1.643 s t(norm)=0.261148, mflops=19.1462
8. Edelblute: elapsed time t=1.051 s, 262144 iters, t-(init.)=0.981 s t(norm)=0.155926, mflops=32.0665 (err=2.2e-016)
9. FFTPACK (f2c): elapsed time t=1.031 s, 262144 iters, t-(init.)=0.951 s t(norm)=0.151157, mflops=33.0781 (err=1.1e-016)
FFTW_MEASURE plan: (cost = 2.441406e-006) FFTW_TWIDDLE 4 FFTW_NOTW 2
10. FFTW: elapsed time t=1.302 s, 524288 iters, t-(init.)=1.152 s t(norm)=0.0915527, mflops=54.6133 (err=1.1e-016)
FFTW_ESTIMATE plan: (cost = 4.688000e+002) FFTW_NOTW 8
11. FFTW_ESTIMATE: elapsed time t=1.422 s, 524288 iters, t-(init.)=1.271 s t(norm)=0.10101, mflops=49.5 (err=1.1e-016)
12. Frigo-old: elapsed time t=1.592 s, 2097152 iters, t-(init.)=0.981 s t(norm)=0.0194907, mflops=256.532 (err=1.2e-016)
13. Green: elapsed time t=1.492 s, 1048576 iters, t-(init.)=1.201 s t(norm)=0.0477235, mflops=104.77 (err=1.5e-016)
14. 
GSL: elapsed time t=1.241 s, 524288 iters, t-(init.)=1.1 s t(norm)=0.0874201, mflops=57.1951 (err=1.1e-016)
15. GSL DIT: elapsed time t=1.192 s, 262144 iters, t-(init.)=1.122 s t(norm)=0.178337, mflops=28.0368 (err=1.3e-016)
16. GSL DIF: elapsed time t=1.312 s, 262144 iters, t-(init.)=1.242 s t(norm)=0.197411, mflops=25.3279 (err=1.2e-016)
17. Krukar: elapsed time t=1.221 s, 1048576 iters, t-(init.)=0.92 s t(norm)=0.0365575, mflops=136.771 (err=1.2e-016)
18. Mayer (Buneman): elapsed time t=1.002 s, 524288 iters, t-(init.)=0.842 s t(norm)=0.0669161, mflops=74.7204 (err=2.0e-016)
19. Mayer (simple): elapsed time t=2.003 s, 1048576 iters, t-(init.)=1.713 s t(norm)=0.0680685, mflops=73.4554
20. Mayer (lookup): elapsed time t=1.933 s, 1048576 iters, t-(init.)=1.623 s t(norm)=0.0644922, mflops=77.5287 (err=2.0e-016)
21. NAPACK (f2c): elapsed time t=1.533 s, 131072 iters, t-(init.)=1.493 s t(norm)=0.474612, mflops=10.5349 (err=2.7e-016)
22. Ooura (C): elapsed time t=1.432 s, 524288 iters, t-(init.)=1.292 s t(norm)=0.102679, mflops=48.6955 (err=1.2e-016)
23. Ransom: elapsed time t=1.191 s, 65536 iters, t-(init.)=1.171 s t(norm)=0.744502, mflops=6.7159 (err=2.5e-016)
24. Singleton (f2c): elapsed time t=1.352 s, 262144 iters, t-(init.)=1.282 s t(norm)=0.203768, mflops=24.5377 (err=1.6e-016)
25. Temperton (f2c): elapsed time t=1.282 s, 262144 iters, t-(init.)=1.202 s t(norm)=0.191053, mflops=26.1708 (err=1.1e-016)
26. Valkenburg: elapsed time t=1.863 s, 131072 iters, t-(init.)=1.823 s t(norm)=0.579516, mflops=8.62789 (err=2.0e-016)

Top mflops for N=8 = 256.532

Normalized results and averages for N=8:
fft 0: mflops = 91.7791 (norm. = 0.357768), norm. avg. (of 3) = 0.422607
fft 1: mflops = 89.6858 (norm. = 0.349608), norm. avg. (of 3) = 0.431772
fft 2: mflops = 38.5506 (norm. = 0.150276), norm. avg. (of 3) = 0.284508
fft 3: mflops = 8.72359 (norm. = 0.0340058), norm. avg. (of 3) = 0.0336987
fft 4: mflops = 11.8976 (norm. = 0.0463786), norm. avg. 
(of 3) = 0.0627897
fft 5: mflops = 46.1928 (norm. = 0.180066), norm. avg. (of 3) = 0.142297
fft 6: mflops = 37.1616 (norm. = 0.144861), norm. avg. (of 3) = 0.103194
fft 7: mflops = 19.1462 (norm. = 0.0746348), norm. avg. (of 3) = 0.0576596
fft 8: mflops = 32.0665 (norm. = 0.125), norm. avg. (of 2) = 0.166589
fft 9: mflops = 33.0781 (norm. = 0.128943), norm. avg. (of 3) = 0.149426
fft 10: mflops = 54.6133 (norm. = 0.212891), norm. avg. (of 3) = 0.268522
fft 11: mflops = 49.5 (norm. = 0.192958), norm. avg. (of 3) = 0.260224
fft 12: mflops = 256.532 (norm. = 1), norm. avg. (of 3) = 1
fft 13: mflops = 104.77 (norm. = 0.40841), norm. avg. (of 1) = 0.40841
fft 14: mflops = 57.1951 (norm. = 0.222955), norm. avg. (of 3) = 0.207073
fft 15: mflops = 28.0368 (norm. = 0.109291), norm. avg. (of 3) = 0.113621
fft 16: mflops = 25.3279 (norm. = 0.0987319), norm. avg. (of 3) = 0.106945
fft 17: mflops = 136.771 (norm. = 0.533152), norm. avg. (of 3) = 0.448035
fft 18: mflops = 74.7204 (norm. = 0.291271), norm. avg. (of 2) = 0.282881
fft 19: mflops = 73.4554 (norm. = 0.28634), norm. avg. (of 2) = 0.288473
fft 20: mflops = 77.5287 (norm. = 0.302218), norm. avg. (of 2) = 0.29094
fft 21: mflops = 10.5349 (norm. = 0.0410666), norm. avg. (of 3) = 0.0488103
fft 22: mflops = 48.6955 (norm. = 0.189822), norm. avg. (of 3) = 0.268051
fft 23: mflops = 6.7159 (norm. = 0.0261795), norm. avg. (of 2) = 0.0284561
fft 24: mflops = 24.5377 (norm. = 0.0956513), norm. avg. (of 3) = 0.114905
fft 25: mflops = 26.1708 (norm. = 0.102017), norm. avg. (of 3) = 0.0748399
fft 26: mflops = 8.62789 (norm. = 0.0336327), norm. avg. (of 3) = 0.0655861

Benchmarking for array size = 16 (power of 2):
0. Arndt DIF: elapsed time t=1.903 s, 262144 iters, t-(init.)=1.793 s t(norm)=0.106871, mflops=46.7853 (err=1.3e-016)
1. Arndt DIT: elapsed time t=1.853 s, 262144 iters, t-(init.)=1.743 s t(norm)=0.103891, mflops=48.1274 (err=1.5e-016)
2. 
Arndt Split-Radix: elapsed time t=1.162 s, 131072 iters, t-(init.)=1.102 s t(norm)=0.131369, mflops=38.0608 (err=1.1e-016)
3. Arndt 4-step: elapsed time t=1.402 s, 65536 iters, t-(init.)=1.372 s t(norm)=0.32711, mflops=15.2854 (err=1.3e-016)
4. Beauregard: elapsed time t=1.523 s, 65536 iters, t-(init.)=1.493 s t(norm)=0.355959, mflops=14.0466 (err=2.2e-016)
5. Bergland: elapsed time t=1.492 s, 262144 iters, t-(init.)=1.372 s t(norm)=0.0817776, mflops=61.1415 (err=1.9e-016)
6. CWP (min N): elapsed time t=1.432 s, 262144 iters, t-(init.)=1.302 s t(norm)=0.0776052, mflops=64.4286
7. CWP (best N) (N=28): elapsed time t=1.492 s, 131072 iters, t-(init.)=1.392 s t(norm)=0.165939, mflops=30.1315
8. Edelblute: elapsed time t=1.462 s, 131072 iters, t-(init.)=1.402 s t(norm)=0.167131, mflops=29.9166 (err=1.4e-016)
9. FFTPACK (f2c): elapsed time t=1.973 s, 262144 iters, t-(init.)=1.843 s t(norm)=0.109851, mflops=45.516 (err=2.0e-016)
FFTW_MEASURE plan: (cost = 4.898071e-006) FFTW_TWIDDLE 8 FFTW_NOTW 2
10. FFTW: elapsed time t=1.351 s, 262144 iters, t-(init.)=1.22 s t(norm)=0.0727177, mflops=68.7591 (err=2.1e-016)
FFTW_ESTIMATE plan: (cost = 4.256000e+002) FFTW_NOTW 16
11. FFTW_ESTIMATE: elapsed time t=1.843 s, 262144 iters, t-(init.)=1.723 s t(norm)=0.102699, mflops=48.6861 (err=2.1e-016)
12. Frigo-old: elapsed time t=1.762 s, 1048576 iters, t-(init.)=1.261 s t(norm)=0.0187904, mflops=266.094 (err=2.1e-016)
13. Green: elapsed time t=1.932 s, 524288 iters, t-(init.)=1.701 s t(norm)=0.0506938, mflops=98.6315 (err=2.2e-016)
14. GSL: elapsed time t=1.092 s, 262144 iters, t-(init.)=0.972 s t(norm)=0.0579357, mflops=86.3026 (err=2.0e-016)
15. GSL DIT: elapsed time t=1.121 s, 131072 iters, t-(init.)=1.061 s t(norm)=0.126481, mflops=39.5316 (err=2.4e-016)
16. GSL DIF: elapsed time t=1.202 s, 131072 iters, t-(init.)=1.142 s t(norm)=0.136137, mflops=36.7277 (err=2.4e-016)
17. 
Krukar: elapsed time t=1.272 s, 524288 iters, t-(init.)=1.022 s t(norm)=0.030458, mflops=164.161 (err=2.5e-016)
18. Mayer (Buneman): elapsed time t=1.492 s, 262144 iters, t-(init.)=1.361 s t(norm)=0.0811219, mflops=61.6356 (err=1.3e-016)
19. Mayer (simple): elapsed time t=1.292 s, 262144 iters, t-(init.)=1.182 s t(norm)=0.0704527, mflops=70.9696
20. Mayer (lookup): elapsed time t=1.252 s, 262144 iters, t-(init.)=1.132 s t(norm)=0.0674725, mflops=74.1043 (err=1.4e-016)
21. NAPACK (f2c): elapsed time t=1.482 s, 65536 iters, t-(init.)=1.452 s t(norm)=0.346184, mflops=14.4432 (err=3.0e-016)
22. Ooura (C): elapsed time t=1.702 s, 262144 iters, t-(init.)=1.581 s t(norm)=0.0942349, mflops=53.0589 (err=1.6e-016)
23. Ransom: elapsed time t=1.052 s, 65536 iters, t-(init.)=1.022 s t(norm)=0.243664, mflops=20.5201 (err=4.1e-016)
24. Singleton (f2c): elapsed time t=1.592 s, 262144 iters, t-(init.)=1.462 s t(norm)=0.087142, mflops=57.3776 (err=2.2e-016)
25. Temperton (f2c): elapsed time t=1.041 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.115633, mflops=43.2402 (err=2.0e-016)
26. Valkenburg: elapsed time t=1.212 s, 32768 iters, t-(init.)=1.202 s t(norm)=0.573158, mflops=8.72359 (err=2.4e-016)

Top mflops for N=16 = 266.094

Normalized results and averages for N=16:
fft 0: mflops = 46.7853 (norm. = 0.175823), norm. avg. (of 4) = 0.360911
fft 1: mflops = 48.1274 (norm. = 0.180866), norm. avg. (of 4) = 0.369046
fft 2: mflops = 38.0608 (norm. = 0.143035), norm. avg. (of 4) = 0.24914
fft 3: mflops = 15.2854 (norm. = 0.0574435), norm. avg. (of 4) = 0.0396349
fft 4: mflops = 14.0466 (norm. = 0.052788), norm. avg. (of 4) = 0.0602893
fft 5: mflops = 61.1415 (norm. = 0.229774), norm. avg. (of 4) = 0.164166
fft 6: mflops = 64.4286 (norm. = 0.242127), norm. avg. (of 4) = 0.137927
fft 7: mflops = 30.1315 (norm. = 0.113236), norm. avg. (of 4) = 0.0715538
fft 8: mflops = 29.9166 (norm. = 0.112429), norm. avg. (of 3) = 0.148535
fft 9: mflops = 45.516 (norm. = 0.171053), norm. avg. 
(of 4) = 0.154833
fft 10: mflops = 68.7591 (norm. = 0.258402), norm. avg. (of 4) = 0.265992
fft 11: mflops = 48.6861 (norm. = 0.182966), norm. avg. (of 4) = 0.240909
fft 12: mflops = 266.094 (norm. = 1), norm. avg. (of 4) = 1
fft 13: mflops = 98.6315 (norm. = 0.370664), norm. avg. (of 2) = 0.389537
fft 14: mflops = 86.3026 (norm. = 0.324331), norm. avg. (of 4) = 0.236387
fft 15: mflops = 39.5316 (norm. = 0.148563), norm. avg. (of 4) = 0.122356
fft 16: mflops = 36.7277 (norm. = 0.138025), norm. avg. (of 4) = 0.114715
fft 17: mflops = 164.161 (norm. = 0.616928), norm. avg. (of 4) = 0.490258
fft 18: mflops = 61.6356 (norm. = 0.231631), norm. avg. (of 3) = 0.265798
fft 19: mflops = 70.9696 (norm. = 0.266709), norm. avg. (of 3) = 0.281218
fft 20: mflops = 74.1043 (norm. = 0.278489), norm. avg. (of 3) = 0.28679
fft 21: mflops = 14.4432 (norm. = 0.0542786), norm. avg. (of 4) = 0.0501774
fft 22: mflops = 53.0589 (norm. = 0.199399), norm. avg. (of 4) = 0.250888
fft 23: mflops = 20.5201 (norm. = 0.0771159), norm. avg. (of 3) = 0.044676
fft 24: mflops = 57.3776 (norm. = 0.215629), norm. avg. (of 4) = 0.140086
fft 25: mflops = 43.2402 (norm. = 0.1625), norm. avg. (of 4) = 0.0967549
fft 26: mflops = 8.72359 (norm. = 0.0327839), norm. avg. (of 4) = 0.0573855

Benchmarking for array size = 32 (power of 2):
0. Arndt DIF: elapsed time t=1.943 s, 131072 iters, t-(init.)=1.843 s t(norm)=0.0878811, mflops=56.8951 (err=2.6e-016)
1. Arndt DIT: elapsed time t=1.913 s, 131072 iters, t-(init.)=1.813 s t(norm)=0.0864506, mflops=57.8365 (err=2.5e-016)
2. Arndt Split-Radix: elapsed time t=1.322 s, 65536 iters, t-(init.)=1.272 s t(norm)=0.121307, mflops=41.2176 (err=2.2e-016)
3. Arndt 4-step: elapsed time t=1.752 s, 32768 iters, t-(init.)=1.732 s t(norm)=0.330353, mflops=15.1353 (err=2.5e-016)
4. Beauregard: elapsed time t=1.782 s, 32768 iters, t-(init.)=1.751 s t(norm)=0.333977, mflops=14.9711 (err=2.9e-016)
5. 
Bergland: elapsed time t=1.402 s, 131072 iters, t-(init.)=1.292 s t(norm)=0.0616074, mflops=81.1591 (err=2.8e-016)
6. CWP (min N) (N=33): elapsed time t=1.942 s, 131072 iters, t-(init.)=1.832 s t(norm)=0.0873566, mflops=57.2367
7. CWP (best N) (N=35): elapsed time t=1.943 s, 131072 iters, t-(init.)=1.823 s t(norm)=0.0869274, mflops=57.5193
8. Edelblute: elapsed time t=1.672 s, 65536 iters, t-(init.)=1.622 s t(norm)=0.154686, mflops=32.3236 (err=2.7e-016)
9. FFTPACK (f2c): elapsed time t=1.402 s, 65536 iters, t-(init.)=1.342 s t(norm)=0.127983, mflops=39.0677 (err=3.0e-016)
FFTW_MEASURE plan: (cost = 1.037598e-005) FFTW_TWIDDLE 16 FFTW_NOTW 2
10. FFTW: elapsed time t=1.382 s, 131072 iters, t-(init.)=1.272 s t(norm)=0.0606537, mflops=82.4352 (err=2.9e-016)
FFTW_ESTIMATE plan: (cost = 3.200000e+001) FFTW_NOTW 32
11. FFTW_ESTIMATE: elapsed time t=1.101 s, 65536 iters, t-(init.)=1.04 s t(norm)=0.0991821, mflops=50.4123 (err=3.2e-016)
12. Frigo-old: elapsed time t=1.052 s, 262144 iters, t-(init.)=0.832 s t(norm)=0.0198364, mflops=252.062 (err=3.2e-016)
13. Green: elapsed time t=1.993 s, 262144 iters, t-(init.)=1.793 s t(norm)=0.0427485, mflops=116.963 (err=2.9e-016)
14. GSL: elapsed time t=1.352 s, 131072 iters, t-(init.)=1.252 s t(norm)=0.0597, mflops=83.7521 (err=2.9e-016)
15. GSL DIT: elapsed time t=1.072 s, 65536 iters, t-(init.)=1.012 s t(norm)=0.0965118, mflops=51.8071 (err=3.6e-016)
16. GSL DIF: elapsed time t=1.161 s, 65536 iters, t-(init.)=1.1 s t(norm)=0.104904, mflops=47.6625 (err=3.2e-016)
17. Krukar: elapsed time t=1.402 s, 262144 iters, t-(init.)=1.182 s t(norm)=0.0281811, mflops=177.424 (err=2.9e-016)
18. Mayer (Buneman): elapsed time t=1.552 s, 131072 iters, t-(init.)=1.431 s t(norm)=0.0682354, mflops=73.2758 (err=2.3e-016)
19. Mayer (simple): elapsed time t=1.252 s, 131072 iters, t-(init.)=1.152 s t(norm)=0.0549316, mflops=91.0222
20. 
Mayer (lookup): elapsed time t=1.242 s, 131072 iters, t-(init.)=1.132 s t(norm)=0.053978, mflops=92.6304 (err=2.4e-016)
21. NAPACK (f2c): elapsed time t=1.683 s, 32768 iters, t-(init.)=1.663 s t(norm)=0.317192, mflops=15.7633 (err=4.3e-016)
22. Ooura (C): elapsed time t=1.032 s, 65536 iters, t-(init.)=0.982 s t(norm)=0.0936508, mflops=53.3898 (err=2.7e-016)
23. Ransom: elapsed time t=1.312 s, 32768 iters, t-(init.)=1.282 s t(norm)=0.244522, mflops=20.448 (err=6.6e-016)
24. Singleton (f2c): elapsed time t=1.602 s, 131072 iters, t-(init.)=1.492 s t(norm)=0.0711441, mflops=70.2799 (err=3.0e-016)
25. Temperton (f2c): elapsed time t=1.062 s, 65536 iters, t-(init.)=1.002 s t(norm)=0.0955582, mflops=52.3242 (err=3.0e-016)
26. Valkenburg: elapsed time t=1.472 s, 16384 iters, t-(init.)=1.452 s t(norm)=0.553894, mflops=9.027 (err=2.6e-016)

Top mflops for N=32 = 252.062

Normalized results and averages for N=32:
fft 0: mflops = 56.8951 (norm. = 0.225719), norm. avg. (of 5) = 0.333872
fft 1: mflops = 57.8365 (norm. = 0.229454), norm. avg. (of 5) = 0.341127
fft 2: mflops = 41.2176 (norm. = 0.163522), norm. avg. (of 5) = 0.232016
fft 3: mflops = 15.1353 (norm. = 0.0600462), norm. avg. (of 5) = 0.0437172
fft 4: mflops = 14.9711 (norm. = 0.0593946), norm. avg. (of 5) = 0.0601104
fft 5: mflops = 81.1591 (norm. = 0.321981), norm. avg. (of 5) = 0.195729
fft 6: mflops = 57.2367 (norm. = 0.227074), norm. avg. (of 5) = 0.155757
fft 7: mflops = 57.5193 (norm. = 0.228195), norm. avg. (of 5) = 0.102882
fft 8: mflops = 32.3236 (norm. = 0.128237), norm. avg. (of 4) = 0.143461
fft 9: mflops = 39.0677 (norm. = 0.154993), norm. avg. (of 5) = 0.154865
fft 10: mflops = 82.4352 (norm. = 0.327044), norm. avg. (of 5) = 0.278202
fft 11: mflops = 50.4123 (norm. = 0.2), norm. avg. (of 5) = 0.232728
fft 12: mflops = 252.062 (norm. = 1), norm. avg. (of 5) = 1
fft 13: mflops = 116.963 (norm. = 0.464027), norm. avg. (of 3) = 0.414367
fft 14: mflops = 83.7521 (norm. = 0.332268), norm. avg. 
(of 5) = 0.255563
fft 15: mflops = 51.8071 (norm. = 0.205534), norm. avg. (of 5) = 0.138992
fft 16: mflops = 47.6625 (norm. = 0.189091), norm. avg. (of 5) = 0.12959
fft 17: mflops = 177.424 (norm. = 0.703892), norm. avg. (of 5) = 0.532985
fft 18: mflops = 73.2758 (norm. = 0.290706), norm. avg. (of 4) = 0.272025
fft 19: mflops = 91.0222 (norm. = 0.361111), norm. avg. (of 4) = 0.301192
fft 20: mflops = 92.6304 (norm. = 0.367491), norm. avg. (of 4) = 0.306965
fft 21: mflops = 15.7633 (norm. = 0.0625376), norm. avg. (of 5) = 0.0526494
fft 22: mflops = 53.3898 (norm. = 0.211813), norm. avg. (of 5) = 0.243073
fft 23: mflops = 20.448 (norm. = 0.0811232), norm. avg. (of 4) = 0.0537878
fft 24: mflops = 70.2799 (norm. = 0.27882), norm. avg. (of 5) = 0.167833
fft 25: mflops = 52.3242 (norm. = 0.207585), norm. avg. (of 5) = 0.118921
fft 26: mflops = 9.027 (norm. = 0.0358127), norm. avg. (of 5) = 0.0530709

Benchmarking for array size = 64 (power of 2):
0. Arndt DIF: elapsed time t=1.231 s, 32768 iters, t-(init.)=1.191 s t(norm)=0.0946522, mflops=52.825 (err=5.7e-016)
1. Arndt DIT: elapsed time t=1.192 s, 32768 iters, t-(init.)=1.142 s t(norm)=0.090758, mflops=55.0916 (err=5.4e-016)
2. Arndt Split-Radix: elapsed time t=1.472 s, 32768 iters, t-(init.)=1.422 s t(norm)=0.11301, mflops=44.2437 (err=5.8e-016)
3. Arndt 4-step: elapsed time t=1.432 s, 16384 iters, t-(init.)=1.412 s t(norm)=0.224431, mflops=22.2785 (err=5.5e-016)
4. Beauregard: elapsed time t=1.092 s, 8192 iters, t-(init.)=1.072 s t(norm)=0.34078, mflops=14.6722 (err=5.7e-016)
5. Bergland: elapsed time t=1.512 s, 65536 iters, t-(init.)=1.412 s t(norm)=0.0561078, mflops=89.1141 (err=5.3e-016)
6. CWP (min N) (N=65): elapsed time t=1.001 s, 32768 iters, t-(init.)=0.951 s t(norm)=0.0755787, mflops=66.1562
7. CWP (best N) (N=84): elapsed time t=1.232 s, 32768 iters, t-(init.)=1.162 s t(norm)=0.0923475, mflops=54.1433
8. 
Edelblute: elapsed time t=1.853 s, 32768 iters, t-(init.)=1.803 s t(norm)=0.14329, mflops=34.8944 (err=5.7e-016)
9. FFTPACK (f2c): elapsed time t=1.442 s, 32768 iters, t-(init.)=1.392 s t(norm)=0.110626, mflops=45.1972 (err=5.4e-016)
FFTW_MEASURE plan: (cost = 2.386475e-005) FFTW_TWIDDLE 32 FFTW_NOTW 2
10. FFTW: elapsed time t=1.572 s, 65536 iters, t-(init.)=1.471 s t(norm)=0.0584523, mflops=85.5399 (err=5.9e-016)
FFTW_ESTIMATE plan: (cost = 7.680000e+002) FFTW_TWIDDLE 2 FFTW_NOTW 32
11. FFTW_ESTIMATE: elapsed time t=1.222 s, 32768 iters, t-(init.)=1.172 s t(norm)=0.0931422, mflops=53.6814 (err=5.7e-016)
12. Frigo-old: elapsed time t=1.843 s, 131072 iters, t-(init.)=1.633 s t(norm)=0.0324448, mflops=154.108 (err=5.7e-016)
13. Green: elapsed time t=1.932 s, 131072 iters, t-(init.)=1.751 s t(norm)=0.0347892, mflops=143.723 (err=5.7e-016)
14. GSL: elapsed time t=1.321 s, 65536 iters, t-(init.)=1.23 s t(norm)=0.0488758, mflops=102.3 (err=5.4e-016)
15. GSL DIT: elapsed time t=1.152 s, 32768 iters, t-(init.)=1.102 s t(norm)=0.0875791, mflops=57.0913 (err=5.8e-016)
16. GSL DIF: elapsed time t=1.242 s, 32768 iters, t-(init.)=1.192 s t(norm)=0.0947316, mflops=52.7807 (err=5.7e-016)
17. Krukar: elapsed time t=1.582 s, 32768 iters, t-(init.)=1.522 s t(norm)=0.120958, mflops=41.3368 (err=6.0e-016)
18. Mayer (Buneman): elapsed time t=1.853 s, 65536 iters, t-(init.)=1.753 s t(norm)=0.069658, mflops=71.7793 (err=5.1e-016)
19. Mayer (simple): elapsed time t=1.462 s, 65536 iters, t-(init.)=1.372 s t(norm)=0.0545184, mflops=91.7122
20. Mayer (lookup): elapsed time t=1.472 s, 65536 iters, t-(init.)=1.362 s t(norm)=0.054121, mflops=92.3856 (err=5.6e-016)
21. NAPACK (f2c): elapsed time t=1.702 s, 16384 iters, t-(init.)=1.682 s t(norm)=0.267347, mflops=18.7023 (err=1.0e-015)
22. Ooura (C): elapsed time t=1.142 s, 32768 iters, t-(init.)=1.092 s t(norm)=0.0867844, mflops=57.6141 (err=5.3e-016)
23. 
Ransom: elapsed time t=1.693 s, 32768 iters, t-(init.)=1.643 s t(norm)=0.130574, mflops=38.2925 (err=8.4e-016)
24. Singleton (f2c): elapsed time t=1.522 s, 65536 iters, t-(init.)=1.412 s t(norm)=0.0561078, mflops=89.1141 (err=8.6e-016)
25. Temperton (f2c): elapsed time t=1.893 s, 65536 iters, t-(init.)=1.793 s t(norm)=0.0712474, mflops=70.178 (err=5.4e-016)
26. Valkenburg: elapsed time t=1.713 s, 8192 iters, t-(init.)=1.693 s t(norm)=0.53819, mflops=9.2904 (err=6.7e-016)

Top mflops for N=64 = 154.108

Normalized results and averages for N=64:
fft 0: mflops = 52.825 (norm. = 0.342779), norm. avg. (of 6) = 0.335357
fft 1: mflops = 55.0916 (norm. = 0.357487), norm. avg. (of 6) = 0.343854
fft 2: mflops = 44.2437 (norm. = 0.287096), norm. avg. (of 6) = 0.241196
fft 3: mflops = 22.2785 (norm. = 0.144564), norm. avg. (of 6) = 0.0605251
fft 4: mflops = 14.6722 (norm. = 0.0952076), norm. avg. (of 6) = 0.0659599
fft 5: mflops = 89.1141 (norm. = 0.578258), norm. avg. (of 6) = 0.259484
fft 6: mflops = 66.1562 (norm. = 0.429285), norm. avg. (of 6) = 0.201345
fft 7: mflops = 54.1433 (norm. = 0.351334), norm. avg. (of 6) = 0.144291
fft 8: mflops = 34.8944 (norm. = 0.226428), norm. avg. (of 5) = 0.160054
fft 9: mflops = 45.1972 (norm. = 0.293283), norm. avg. (of 6) = 0.177934
fft 10: mflops = 85.5399 (norm. = 0.555065), norm. avg. (of 6) = 0.324346
fft 11: mflops = 53.6814 (norm. = 0.348336), norm. avg. (of 6) = 0.251996
fft 12: mflops = 154.108 (norm. = 1), norm. avg. (of 6) = 1
fft 13: mflops = 143.723 (norm. = 0.93261), norm. avg. (of 4) = 0.543928
fft 14: mflops = 102.3 (norm. = 0.663821), norm. avg. (of 6) = 0.323606
fft 15: mflops = 57.0913 (norm. = 0.370463), norm. avg. (of 6) = 0.17757
fft 16: mflops = 52.7807 (norm. = 0.342492), norm. avg. (of 6) = 0.165074
fft 17: mflops = 41.3368 (norm. = 0.268233), norm. avg. (of 6) = 0.488859
fft 18: mflops = 71.7793 (norm. = 0.465773), norm. avg. (of 5) = 0.310774
fft 19: mflops = 91.7122 (norm. = 0.595117), norm. avg. 
(of 5) = 0.359977 fft 20: mflops = 92.3856 (norm. = 0.599486), norm. avg. (of 5) = 0.365469 fft 21: mflops = 18.7023 (norm. = 0.121359), norm. avg. (of 6) = 0.0641009 fft 22: mflops = 57.6141 (norm. = 0.373855), norm. avg. (of 6) = 0.26487 fft 23: mflops = 38.2925 (norm. = 0.248478), norm. avg. (of 5) = 0.092726 fft 24: mflops = 89.1141 (norm. = 0.578258), norm. avg. (of 6) = 0.236237 fft 25: mflops = 70.178 (norm. = 0.455382), norm. avg. (of 6) = 0.174998 fft 26: mflops = 9.2904 (norm. = 0.060285), norm. avg. (of 6) = 0.0542733 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.251 s, 16384 iters, t-(init.)=1.211 s t(norm)=0.0824928, mflops=60.6113 (err=3.7e-016) 1. Arndt DIT: elapsed time t=1.212 s, 16384 iters, t-(init.)=1.172 s t(norm)=0.0798362, mflops=62.6283 (err=3.6e-016) 2. Arndt Split-Radix: elapsed time t=1.562 s, 16384 iters, t-(init.)=1.522 s t(norm)=0.103678, mflops=48.2262 (err=3.8e-016) 3. Arndt 4-step: elapsed time t=1.853 s, 8192 iters, t-(init.)=1.833 s t(norm)=0.249726, mflops=20.0219 (err=3.3e-016) 4. Beauregard: elapsed time t=1.282 s, 4096 iters, t-(init.)=1.272 s t(norm)=0.346592, mflops=14.4262 (err=4.0e-016) 5. Bergland: elapsed time t=1.642 s, 32768 iters, t-(init.)=1.542 s t(norm)=0.0525202, mflops=95.2015 (err=3.2e-016) 6. CWP (min N) (N=130): elapsed time t=1.142 s, 16384 iters, t-(init.)=1.092 s t(norm)=0.0743866, mflops=67.2164 7. CWP (best N) (N=140): elapsed time t=1.072 s, 16384 iters, t-(init.)=1.022 s t(norm)=0.0696182, mflops=71.8203 8. Edelblute: elapsed time t=1.973 s, 16384 iters, t-(init.)=1.923 s t(norm)=0.130994, mflops=38.1697 (err=3.7e-016) 9. FFTPACK (f2c): elapsed time t=1.743 s, 16384 iters, t-(init.)=1.693 s t(norm)=0.115326, mflops=43.3552 (err=3.7e-016) FFTW_MEASURE plan: (cost = 5.371094e-005) FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. 
FFTW: elapsed time t=1.782 s, 32768 iters, t-(init.)=1.682 s t(norm)=0.0572886, mflops=87.2774 (err=3.8e-016) FFTW_ESTIMATE plan: (cost = 1.075200e+003) FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.292 s, 16384 iters, t-(init.)=1.242 s t(norm)=0.0846045, mflops=59.0985 (err=3.5e-016) 12. Frigo-old: elapsed time t=1.933 s, 65536 iters, t-(init.)=1.723 s t(norm)=0.0293425, mflops=170.401 (err=3.5e-016) 13. Green: elapsed time t=1.302 s, 32768 iters, t-(init.)=1.212 s t(norm)=0.0412805, mflops=121.123 (err=4.0e-016) 14. GSL: elapsed time t=1.412 s, 32768 iters, t-(init.)=1.322 s t(norm)=0.0450271, mflops=111.044 (err=3.6e-016) 15. GSL DIT: elapsed time t=1.191 s, 16384 iters, t-(init.)=1.14 s t(norm)=0.0776563, mflops=64.3862 (err=3.8e-016) 16. GSL DIF: elapsed time t=1.302 s, 16384 iters, t-(init.)=1.252 s t(norm)=0.0852857, mflops=58.6265 (err=3.4e-016) 17. Krukar: elapsed time t=1.081 s, 8192 iters, t-(init.)=1.051 s t(norm)=0.143187, mflops=34.9193 (err=3.5e-016) 18. Mayer (Buneman): elapsed time t=1.953 s, 32768 iters, t-(init.)=1.853 s t(norm)=0.0631128, mflops=79.2232 (err=3.4e-016) 19. Mayer (simple): elapsed time t=1.513 s, 32768 iters, t-(init.)=1.433 s t(norm)=0.0488077, mflops=102.443 20. Mayer (lookup): elapsed time t=1.492 s, 32768 iters, t-(init.)=1.392 s t(norm)=0.0474112, mflops=105.46 (err=3.8e-016) 21. NAPACK (f2c): elapsed time t=1.963 s, 8192 iters, t-(init.)=1.943 s t(norm)=0.264713, mflops=18.8884 (err=1.1e-015) 22. Ooura (C): elapsed time t=1.352 s, 16384 iters, t-(init.)=1.312 s t(norm)=0.0893729, mflops=55.9454 (err=3.3e-016) 23. Ransom: elapsed time t=1.042 s, 8192 iters, t-(init.)=1.022 s t(norm)=0.139236, mflops=35.9101 (err=9.2e-016) 24. Singleton (f2c): elapsed time t=1.733 s, 32768 iters, t-(init.)=1.623 s t(norm)=0.0552791, mflops=90.4502 (err=4.1e-016) 25. Temperton (f2c): elapsed time t=1.082 s, 16384 iters, t-(init.)=1.032 s t(norm)=0.0702994, mflops=71.1243 (err=3.4e-016) 26. 
Valkenburg: elapsed time t=1.973 s, 4096 iters, t-(init.)=1.963 s t(norm)=0.534875, mflops=9.34798 (err=5.4e-016) Top mflops for N=128 = 170.401 Normalized results and averages for N=128: fft 0: mflops = 60.6113 (norm. = 0.355698), norm. avg. (of 7) = 0.338263 fft 1: mflops = 62.6283 (norm. = 0.367534), norm. avg. (of 7) = 0.347237 fft 2: mflops = 48.2262 (norm. = 0.283016), norm. avg. (of 7) = 0.24717 fft 3: mflops = 20.0219 (norm. = 0.117499), norm. avg. (of 7) = 0.0686641 fft 4: mflops = 14.4262 (norm. = 0.08466), norm. avg. (of 7) = 0.0686313 fft 5: mflops = 95.2015 (norm. = 0.55869), norm. avg. (of 7) = 0.302228 fft 6: mflops = 67.2164 (norm. = 0.39446), norm. avg. (of 7) = 0.228932 fft 7: mflops = 71.8203 (norm. = 0.421477), norm. avg. (of 7) = 0.183889 fft 8: mflops = 38.1697 (norm. = 0.223999), norm. avg. (of 6) = 0.170712 fft 9: mflops = 43.3552 (norm. = 0.25443), norm. avg. (of 7) = 0.188862 fft 10: mflops = 87.2774 (norm. = 0.512188), norm. avg. (of 7) = 0.351181 fft 11: mflops = 59.0985 (norm. = 0.34682), norm. avg. (of 7) = 0.265542 fft 12: mflops = 170.401 (norm. = 1), norm. avg. (of 7) = 1 fft 13: mflops = 121.123 (norm. = 0.710809), norm. avg. (of 5) = 0.577304 fft 14: mflops = 111.044 (norm. = 0.651664), norm. avg. (of 7) = 0.370472 fft 15: mflops = 64.3862 (norm. = 0.377851), norm. avg. (of 7) = 0.206182 fft 16: mflops = 58.6265 (norm. = 0.34405), norm. avg. (of 7) = 0.190642 fft 17: mflops = 34.9193 (norm. = 0.204924), norm. avg. (of 7) = 0.448297 fft 18: mflops = 79.2232 (norm. = 0.464922), norm. avg. (of 6) = 0.336466 fft 19: mflops = 102.443 (norm. = 0.601186), norm. avg. (of 6) = 0.400178 fft 20: mflops = 105.46 (norm. = 0.618894), norm. avg. (of 6) = 0.407707 fft 21: mflops = 18.8884 (norm. = 0.110847), norm. avg. (of 7) = 0.0707789 fft 22: mflops = 55.9454 (norm. = 0.328316), norm. avg. (of 7) = 0.273934 fft 23: mflops = 35.9101 (norm. = 0.210739), norm. avg. (of 6) = 0.112395 fft 24: mflops = 90.4502 (norm. = 0.530807), norm. avg. 
(of 7) = 0.278319 fft 25: mflops = 71.1243 (norm. = 0.417393), norm. avg. (of 7) = 0.209626 fft 26: mflops = 9.34798 (norm. = 0.0548586), norm. avg. (of 7) = 0.0543569 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.422 s, 8192 iters, t-(init.)=1.382 s t(norm)=0.0823736, mflops=60.699 (err=7.3e-016) 1. Arndt DIT: elapsed time t=1.372 s, 8192 iters, t-(init.)=1.332 s t(norm)=0.0793934, mflops=62.9775 (err=7.7e-016) 2. Arndt Split-Radix: elapsed time t=1.643 s, 8192 iters, t-(init.)=1.593 s t(norm)=0.0949502, mflops=52.6592 (err=7.4e-016) 3. Arndt 4-step: elapsed time t=1.763 s, 4096 iters, t-(init.)=1.743 s t(norm)=0.207782, mflops=24.0637 (err=7.4e-016) 4. Beauregard: elapsed time t=1.472 s, 2048 iters, t-(init.)=1.462 s t(norm)=0.348568, mflops=14.3444 (err=8.8e-016) 5. Bergland: elapsed time t=1.692 s, 16384 iters, t-(init.)=1.592 s t(norm)=0.0474453, mflops=105.385 (err=7.7e-016) 6. CWP (min N) (N=260): elapsed time t=1.131 s, 8192 iters, t-(init.)=1.081 s t(norm)=0.0644326, mflops=77.6004 7. CWP (best N) (N=280): elapsed time t=1.112 s, 8192 iters, t-(init.)=1.062 s t(norm)=0.0633001, mflops=78.9888 8. Edelblute: elapsed time t=1.032 s, 4096 iters, t-(init.)=1.012 s t(norm)=0.12064, mflops=41.4457 (err=7.4e-016) 9. FFTPACK (f2c): elapsed time t=1.843 s, 8192 iters, t-(init.)=1.793 s t(norm)=0.106871, mflops=46.7853 (err=8.5e-016) FFTW_MEASURE plan: (cost = 1.171875e-004) FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_NOTW 2 10. FFTW: elapsed time t=1.953 s, 16384 iters, t-(init.)=1.853 s t(norm)=0.0552237, mflops=90.5408 (err=8.3e-016) FFTW_ESTIMATE plan: (cost = 9.216000e+002) FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.362 s, 8192 iters, t-(init.)=1.312 s t(norm)=0.0782013, mflops=63.9376 (err=8.3e-016) 12. Frigo-old: elapsed time t=1.051 s, 16384 iters, t-(init.)=0.951 s t(norm)=0.028342, mflops=176.417 (err=8.4e-016) 13. 
Green: elapsed time t=1.392 s, 16384 iters, t-(init.)=1.302 s t(norm)=0.0388026, mflops=128.857 (err=8.4e-016) 14. GSL: elapsed time t=1.462 s, 16384 iters, t-(init.)=1.372 s t(norm)=0.0408888, mflops=122.283 (err=8.5e-016) 15. GSL DIT: elapsed time t=1.281 s, 8192 iters, t-(init.)=1.23 s t(norm)=0.0733137, mflops=68.2001 (err=8.7e-016) 16. GSL DIF: elapsed time t=1.382 s, 8192 iters, t-(init.)=1.332 s t(norm)=0.0793934, mflops=62.9775 (err=8.4e-016) 17. Krukar: elapsed time t=1.923 s, 8192 iters, t-(init.)=1.873 s t(norm)=0.111639, mflops=44.787 (err=8.6e-016) 18. Mayer (Buneman): elapsed time t=1.081 s, 8192 iters, t-(init.)=1.031 s t(norm)=0.0614524, mflops=81.3638 (err=7.4e-016) 19. Mayer (simple): elapsed time t=1.642 s, 16384 iters, t-(init.)=1.562 s t(norm)=0.0465512, mflops=107.409 20. Mayer (lookup): elapsed time t=1.622 s, 16384 iters, t-(init.)=1.522 s t(norm)=0.0453591, mflops=110.231 (err=7.2e-016) 21. NAPACK (f2c): elapsed time t=1.022 s, 2048 iters, t-(init.)=1.012 s t(norm)=0.24128, mflops=20.7228 (err=2.8e-015) 22. Ooura (C): elapsed time t=1.432 s, 8192 iters, t-(init.)=1.392 s t(norm)=0.0829697, mflops=60.263 (err=7.7e-016) 23. Ransom: elapsed time t=1.633 s, 8192 iters, t-(init.)=1.593 s t(norm)=0.0949502, mflops=52.6592 (err=1.3e-015) 24. Singleton (f2c): elapsed time t=1.673 s, 16384 iters, t-(init.)=1.573 s t(norm)=0.0468791, mflops=106.657 (err=1.2e-015) 25. Temperton (f2c): elapsed time t=1.062 s, 8192 iters, t-(init.)=1.012 s t(norm)=0.0603199, mflops=82.8914 (err=8.5e-016) 26. Valkenburg: elapsed time t=1.101 s, 1024 iters, t-(init.)=1.101 s t(norm)=0.524998, mflops=9.52385 (err=8.9e-016) Top mflops for N=256 = 176.417 Normalized results and averages for N=256: fft 0: mflops = 60.699 (norm. = 0.344067), norm. avg. (of 8) = 0.338988 fft 1: mflops = 62.9775 (norm. = 0.356982), norm. avg. (of 8) = 0.348455 fft 2: mflops = 52.6592 (norm. = 0.298493), norm. avg. (of 8) = 0.253586 fft 3: mflops = 24.0637 (norm. = 0.136403), norm. avg. 
(of 8) = 0.0771315 fft 4: mflops = 14.3444 (norm. = 0.0813098), norm. avg. (of 8) = 0.0702161 fft 5: mflops = 105.385 (norm. = 0.597362), norm. avg. (of 8) = 0.339119 fft 6: mflops = 77.6004 (norm. = 0.43987), norm. avg. (of 8) = 0.2553 fft 7: mflops = 78.9888 (norm. = 0.44774), norm. avg. (of 8) = 0.21687 fft 8: mflops = 41.4457 (norm. = 0.234931), norm. avg. (of 7) = 0.179886 fft 9: mflops = 46.7853 (norm. = 0.265198), norm. avg. (of 8) = 0.198404 fft 10: mflops = 90.5408 (norm. = 0.513222), norm. avg. (of 8) = 0.371436 fft 11: mflops = 63.9376 (norm. = 0.362424), norm. avg. (of 8) = 0.277652 fft 12: mflops = 176.417 (norm. = 1), norm. avg. (of 8) = 1 fft 13: mflops = 128.857 (norm. = 0.730415), norm. avg. (of 6) = 0.602822 fft 14: mflops = 122.283 (norm. = 0.693149), norm. avg. (of 8) = 0.410806 fft 15: mflops = 68.2001 (norm. = 0.386585), norm. avg. (of 8) = 0.228732 fft 16: mflops = 62.9775 (norm. = 0.356982), norm. avg. (of 8) = 0.211434 fft 17: mflops = 44.787 (norm. = 0.253871), norm. avg. (of 8) = 0.423994 fft 18: mflops = 81.3638 (norm. = 0.461203), norm. avg. (of 7) = 0.354285 fft 19: mflops = 107.409 (norm. = 0.608835), norm. avg. (of 7) = 0.429986 fft 20: mflops = 110.231 (norm. = 0.624836), norm. avg. (of 7) = 0.438725 fft 21: mflops = 20.7228 (norm. = 0.117465), norm. avg. (of 8) = 0.0766147 fft 22: mflops = 60.263 (norm. = 0.341595), norm. avg. (of 8) = 0.282391 fft 23: mflops = 52.6592 (norm. = 0.298493), norm. avg. (of 7) = 0.13898 fft 24: mflops = 106.657 (norm. = 0.604577), norm. avg. (of 8) = 0.319101 fft 25: mflops = 82.8914 (norm. = 0.469862), norm. avg. (of 8) = 0.242155 fft 26: mflops = 9.52385 (norm. = 0.053985), norm. avg. (of 8) = 0.0543104 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.432 s, 4096 iters, t-(init.)=1.382 s t(norm)=0.073221, mflops=68.2864 (err=9.4e-016) 1. Arndt DIT: elapsed time t=1.382 s, 4096 iters, t-(init.)=1.342 s t(norm)=0.0711017, mflops=70.3218 (err=9.2e-016) 2. 
Arndt Split-Radix: elapsed time t=1.772 s, 4096 iters, t-(init.)=1.731 s t(norm)=0.0917117, mflops=54.5187 (err=9.2e-016) 3. Arndt 4-step: elapsed time t=1.021 s, 1024 iters, t-(init.)=1.011 s t(norm)=0.214259, mflops=23.3363 (err=9.0e-016) 4. Beauregard: elapsed time t=1.652 s, 1024 iters, t-(init.)=1.642 s t(norm)=0.347985, mflops=14.3684 (err=9.6e-016) 5. Bergland: elapsed time t=1.802 s, 8192 iters, t-(init.)=1.701 s t(norm)=0.0450611, mflops=110.96 (err=1.0e-015) 6. CWP (min N) (N=520): elapsed time t=1.182 s, 4096 iters, t-(init.)=1.132 s t(norm)=0.0599755, mflops=83.3673 7. CWP (best N) (N=560): elapsed time t=1.171 s, 4096 iters, t-(init.)=1.12 s t(norm)=0.0593397, mflops=84.2606 8. Edelblute: elapsed time t=1.112 s, 2048 iters, t-(init.)=1.092 s t(norm)=0.115712, mflops=43.2105 (err=9.2e-016) 9. FFTPACK (f2c): elapsed time t=1.221 s, 2048 iters, t-(init.)=1.191 s t(norm)=0.126203, mflops=39.6187 (err=9.3e-016) FFTW_MEASURE plan: (cost = 2.832031e-004) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=1.161 s, 4096 iters, t-(init.)=1.11 s t(norm)=0.0588099, mflops=85.0197 (err=8.9e-016) FFTW_ESTIMATE plan: (cost = 1.843200e+003) FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.592 s, 4096 iters, t-(init.)=1.542 s t(norm)=0.0816981, mflops=61.2009 (err=8.7e-016) 12. Frigo-old: elapsed time t=1.513 s, 8192 iters, t-(init.)=1.413 s t(norm)=0.0374317, mflops=133.577 (err=8.5e-016) 13. Green: elapsed time t=1.432 s, 8192 iters, t-(init.)=1.342 s t(norm)=0.0355509, mflops=140.644 (err=9.5e-016) 14. GSL: elapsed time t=1.912 s, 8192 iters, t-(init.)=1.821 s t(norm)=0.04824, mflops=103.648 (err=9.1e-016) 15. GSL DIT: elapsed time t=1.432 s, 4096 iters, t-(init.)=1.382 s t(norm)=0.073221, mflops=68.2864 (err=1.1e-015) 16. GSL DIF: elapsed time t=1.492 s, 4096 iters, t-(init.)=1.442 s t(norm)=0.0763999, mflops=65.4451 (err=1.1e-015) 17. 
Krukar: elapsed time t=1.111 s, 2048 iters, t-(init.)=1.081 s t(norm)=0.114547, mflops=43.6502 (err=9.3e-016) 18. Mayer (Buneman): elapsed time t=1.112 s, 4096 iters, t-(init.)=1.062 s t(norm)=0.0562668, mflops=88.8624 (err=8.7e-016) 19. Mayer (simple): elapsed time t=1.683 s, 8192 iters, t-(init.)=1.593 s t(norm)=0.0422001, mflops=118.483 20. Mayer (lookup): elapsed time t=1.662 s, 8192 iters, t-(init.)=1.561 s t(norm)=0.0413524, mflops=120.912 (err=8.8e-016) 21. NAPACK (f2c): elapsed time t=1.182 s, 1024 iters, t-(init.)=1.172 s t(norm)=0.248379, mflops=20.1305 (err=7.6e-015) 22. Ooura (C): elapsed time t=1.633 s, 4096 iters, t-(init.)=1.593 s t(norm)=0.0844002, mflops=59.2416 (err=9.6e-016) 23. Ransom: elapsed time t=1.922 s, 4096 iters, t-(init.)=1.882 s t(norm)=0.0997119, mflops=50.1444 (err=1.2e-015) 24. Singleton (f2c): elapsed time t=1.813 s, 8192 iters, t-(init.)=1.713 s t(norm)=0.045379, mflops=110.183 (err=1.1e-015) 25. Temperton (f2c): elapsed time t=1.281 s, 4096 iters, t-(init.)=1.231 s t(norm)=0.0652207, mflops=76.6627 (err=9.3e-016) 26. Valkenburg: elapsed time t=1.262 s, 512 iters, t-(init.)=1.262 s t(norm)=0.534905, mflops=9.34745 (err=1.4e-015) Top mflops for N=512 = 140.644 Normalized results and averages for N=512: fft 0: mflops = 68.2864 (norm. = 0.485528), norm. avg. (of 9) = 0.35527 fft 1: mflops = 70.3218 (norm. = 0.5), norm. avg. (of 9) = 0.365293 fft 2: mflops = 54.5187 (norm. = 0.387637), norm. avg. (of 9) = 0.26848 fft 3: mflops = 23.3363 (norm. = 0.165925), norm. avg. (of 9) = 0.0869974 fft 4: mflops = 14.3684 (norm. = 0.102162), norm. avg. (of 9) = 0.0737657 fft 5: mflops = 110.96 (norm. = 0.788948), norm. avg. (of 9) = 0.3891 fft 6: mflops = 83.3673 (norm. = 0.592756), norm. avg. (of 9) = 0.292795 fft 7: mflops = 84.2606 (norm. = 0.599107), norm. avg. (of 9) = 0.259341 fft 8: mflops = 43.2105 (norm. = 0.307234), norm. avg. (of 8) = 0.195804 fft 9: mflops = 39.6187 (norm. = 0.281696), norm. avg. 
(of 9) = 0.207659 fft 10: mflops = 85.0197 (norm. = 0.604505), norm. avg. (of 9) = 0.397332 fft 11: mflops = 61.2009 (norm. = 0.435149), norm. avg. (of 9) = 0.295152 fft 12: mflops = 133.577 (norm. = 0.949752), norm. avg. (of 9) = 0.994417 fft 13: mflops = 140.644 (norm. = 1), norm. avg. (of 7) = 0.659562 fft 14: mflops = 103.648 (norm. = 0.736958), norm. avg. (of 9) = 0.447045 fft 15: mflops = 68.2864 (norm. = 0.485528), norm. avg. (of 9) = 0.257265 fft 16: mflops = 65.4451 (norm. = 0.465326), norm. avg. (of 9) = 0.239644 fft 17: mflops = 43.6502 (norm. = 0.310361), norm. avg. (of 9) = 0.411368 fft 18: mflops = 88.8624 (norm. = 0.631827), norm. avg. (of 8) = 0.388978 fft 19: mflops = 118.483 (norm. = 0.842436), norm. avg. (of 8) = 0.481542 fft 20: mflops = 120.912 (norm. = 0.859705), norm. avg. (of 8) = 0.491348 fft 21: mflops = 20.1305 (norm. = 0.143131), norm. avg. (of 9) = 0.0840054 fft 22: mflops = 59.2416 (norm. = 0.421218), norm. avg. (of 9) = 0.297816 fft 23: mflops = 50.1444 (norm. = 0.356536), norm. avg. (of 8) = 0.166175 fft 24: mflops = 110.183 (norm. = 0.783421), norm. avg. (of 9) = 0.370692 fft 25: mflops = 76.6627 (norm. = 0.545085), norm. avg. (of 9) = 0.275814 fft 26: mflops = 9.34745 (norm. = 0.066462), norm. avg. (of 9) = 0.0556606 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.623 s, 2048 iters, t-(init.)=1.573 s t(norm)=0.0750065, mflops=66.6609 (err=1.7e-015) 1. Arndt DIT: elapsed time t=1.532 s, 2048 iters, t-(init.)=1.482 s t(norm)=0.0706673, mflops=70.7541 (err=1.7e-015) 2. Arndt Split-Radix: elapsed time t=1.873 s, 2048 iters, t-(init.)=1.823 s t(norm)=0.0869274, mflops=57.5193 (err=1.6e-015) 3. Arndt 4-step: elapsed time t=1.913 s, 1024 iters, t-(init.)=1.883 s t(norm)=0.179577, mflops=27.8432 (err=1.6e-015) 4. Beauregard: elapsed time t=1.863 s, 512 iters, t-(init.)=1.853 s t(norm)=0.353432, mflops=14.147 (err=1.8e-015) 5. 
Bergland: elapsed time t=1.032 s, 2048 iters, t-(init.)=0.982 s t(norm)=0.0468254, mflops=106.78 (err=1.7e-015) 6. CWP (min N) (N=1040): elapsed time t=1.281 s, 2048 iters, t-(init.)=1.231 s t(norm)=0.0586987, mflops=85.1808 7. CWP (best N) (N=1040): elapsed time t=1.282 s, 2048 iters, t-(init.)=1.232 s t(norm)=0.0587463, mflops=85.1117 8. Edelblute: elapsed time t=1.172 s, 1024 iters, t-(init.)=1.152 s t(norm)=0.109863, mflops=45.5111 (err=1.7e-015) 9. FFTPACK (f2c): elapsed time t=1.502 s, 1024 iters, t-(init.)=1.472 s t(norm)=0.140381, mflops=35.6174 (err=1.7e-015) FFTW_MEASURE plan: (cost = 6.835938e-004) FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=1.442 s, 2048 iters, t-(init.)=1.392 s t(norm)=0.0663757, mflops=75.3287 (err=1.7e-015) FFTW_ESTIMATE plan: (cost = 1.126400e+004) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.723 s, 2048 iters, t-(init.)=1.673 s t(norm)=0.0797749, mflops=62.6764 (err=1.7e-015) 12. Frigo-old: elapsed time t=1.312 s, 2048 iters, t-(init.)=1.262 s t(norm)=0.0601768, mflops=83.0884 (err=1.7e-015) 13. Green: elapsed time t=1.972 s, 4096 iters, t-(init.)=1.871 s t(norm)=0.0446081, mflops=112.087 (err=1.7e-015) 14. GSL: elapsed time t=1.432 s, 2048 iters, t-(init.)=1.382 s t(norm)=0.0658989, mflops=75.8738 (err=1.7e-015) 15. GSL DIT: elapsed time t=1.642 s, 2048 iters, t-(init.)=1.591 s t(norm)=0.0758648, mflops=65.9067 (err=1.9e-015) 16. GSL DIF: elapsed time t=1.692 s, 2048 iters, t-(init.)=1.642 s t(norm)=0.0782967, mflops=63.8597 (err=1.9e-015) 17. Krukar: elapsed time t=1.442 s, 1024 iters, t-(init.)=1.422 s t(norm)=0.135612, mflops=36.8698 (err=1.7e-015) 18. Mayer (Buneman): elapsed time t=1.202 s, 2048 iters, t-(init.)=1.152 s t(norm)=0.0549316, mflops=91.0222 (err=1.6e-015) 19. Mayer (simple): elapsed time t=1.843 s, 4096 iters, t-(init.)=1.743 s t(norm)=0.0415564, mflops=120.319 20. 
Mayer (lookup): elapsed time t=1.883 s, 4096 iters, t-(init.)=1.783 s t(norm)=0.04251, mflops=117.619 (err=1.6e-015) 21. NAPACK (f2c): elapsed time t=1.482 s, 512 iters, t-(init.)=1.472 s t(norm)=0.280762, mflops=17.8087 (err=1.6e-014) 22. Ooura (C): elapsed time t=1.803 s, 2048 iters, t-(init.)=1.753 s t(norm)=0.0835896, mflops=59.8161 (err=1.6e-015) 23. Ransom: elapsed time t=1.783 s, 2048 iters, t-(init.)=1.733 s t(norm)=0.0826359, mflops=60.5064 (err=1.9e-015) 24. Singleton (f2c): elapsed time t=1.002 s, 2048 iters, t-(init.)=0.962 s t(norm)=0.0458717, mflops=109 (err=2.6e-015) 25. Temperton (f2c): elapsed time t=1.492 s, 2048 iters, t-(init.)=1.442 s t(norm)=0.0687599, mflops=72.7168 (err=1.7e-015) 26. Valkenburg: elapsed time t=1.462 s, 256 iters, t-(init.)=1.462 s t(norm)=0.557709, mflops=8.96525 (err=1.8e-015) Top mflops for N=1024 = 120.319 Normalized results and averages for N=1024: fft 0: mflops = 66.6609 (norm. = 0.554037), norm. avg. (of 10) = 0.375147 fft 1: mflops = 70.7541 (norm. = 0.588057), norm. avg. (of 10) = 0.38757 fft 2: mflops = 57.5193 (norm. = 0.478058), norm. avg. (of 10) = 0.289438 fft 3: mflops = 27.8432 (norm. = 0.231413), norm. avg. (of 10) = 0.101439 fft 4: mflops = 14.147 (norm. = 0.11758), norm. avg. (of 10) = 0.0781471 fft 5: mflops = 106.78 (norm. = 0.887475), norm. avg. (of 10) = 0.438938 fft 6: mflops = 85.1808 (norm. = 0.707961), norm. avg. (of 10) = 0.334312 fft 7: mflops = 85.1117 (norm. = 0.707386), norm. avg. (of 10) = 0.304146 fft 8: mflops = 45.5111 (norm. = 0.378255), norm. avg. (of 9) = 0.216077 fft 9: mflops = 35.6174 (norm. = 0.296026), norm. avg. (of 10) = 0.216496 fft 10: mflops = 75.3287 (norm. = 0.626078), norm. avg. (of 10) = 0.420207 fft 11: mflops = 62.6764 (norm. = 0.520921), norm. avg. (of 10) = 0.317729 fft 12: mflops = 83.0884 (norm. = 0.690571), norm. avg. (of 10) = 0.964032 fft 13: mflops = 112.087 (norm. = 0.931587), norm. avg. (of 8) = 0.693565 fft 14: mflops = 75.8738 (norm. = 0.630608), norm. avg. 
(of 10) = 0.465402 fft 15: mflops = 65.9067 (norm. = 0.547769), norm. avg. (of 10) = 0.286316 fft 16: mflops = 63.8597 (norm. = 0.530755), norm. avg. (of 10) = 0.268755 fft 17: mflops = 36.8698 (norm. = 0.306435), norm. avg. (of 10) = 0.400875 fft 18: mflops = 91.0222 (norm. = 0.75651), norm. avg. (of 9) = 0.429815 fft 19: mflops = 120.319 (norm. = 1), norm. avg. (of 9) = 0.539149 fft 20: mflops = 117.619 (norm. = 0.977566), norm. avg. (of 9) = 0.545372 fft 21: mflops = 17.8087 (norm. = 0.148013), norm. avg. (of 10) = 0.0904062 fft 22: mflops = 59.8161 (norm. = 0.497148), norm. avg. (of 10) = 0.31775 fft 23: mflops = 60.5064 (norm. = 0.502885), norm. avg. (of 9) = 0.203587 fft 24: mflops = 109 (norm. = 0.905925), norm. avg. (of 10) = 0.424215 fft 25: mflops = 72.7168 (norm. = 0.604369), norm. avg. (of 10) = 0.30867 fft 26: mflops = 8.96525 (norm. = 0.0745127), norm. avg. (of 10) = 0.0575458 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.453 s, 512 iters, t-(init.)=1.393 s t(norm)=0.12077, mflops=41.4011 (err=1.4e-015) 1. Arndt DIT: elapsed time t=1.442 s, 512 iters, t-(init.)=1.382 s t(norm)=0.119816, mflops=41.7306 (err=1.4e-015) 2. Arndt Split-Radix: elapsed time t=1.002 s, 256 iters, t-(init.)=0.972 s t(norm)=0.16854, mflops=29.6665 (err=1.4e-015) 3. Arndt 4-step: elapsed time t=1.302 s, 256 iters, t-(init.)=1.272 s t(norm)=0.220559, mflops=22.6697 (err=1.3e-015) 4. Beauregard: elapsed time t=1.112 s, 128 iters, t-(init.)=1.102 s t(norm)=0.382163, mflops=13.0834 (err=1.4e-015) 5. Bergland: elapsed time t=1.693 s, 1024 iters, t-(init.)=1.573 s t(norm)=0.0681877, mflops=73.327 (err=1.4e-015) 6. CWP (min N) (N=2145): elapsed time t=1.863 s, 1024 iters, t-(init.)=1.743 s t(norm)=0.075557, mflops=66.1752 7. CWP (best N) (N=2184): elapsed time t=1.833 s, 1024 iters, t-(init.)=1.703 s t(norm)=0.0738231, mflops=67.7295 8. Edelblute: elapsed time t=1.122 s, 256 iters, t-(init.)=1.092 s t(norm)=0.189348, mflops=26.4064 (err=1.4e-015) 9. 
FFTPACK (f2c): elapsed time t=1.712 s, 512 iters, t-(init.)=1.652 s t(norm)=0.143225, mflops=34.9102 (err=1.4e-015) FFTW_MEASURE plan: (cost = 1.562500e-003) FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=1.642 s, 1024 iters, t-(init.)=1.521 s t(norm)=0.0659336, mflops=75.8339 (err=1.4e-015) FFTW_ESTIMATE plan: (cost = 1.269760e+004) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.022 s, 512 iters, t-(init.)=0.962 s t(norm)=0.0834032, mflops=59.9498 (err=1.4e-015) 12. Frigo-old: elapsed time t=1.542 s, 1024 iters, t-(init.)=1.422 s t(norm)=0.061642, mflops=81.1135 (err=1.4e-015) 13. Green: elapsed time t=1.412 s, 1024 iters, t-(init.)=1.292 s t(norm)=0.0560067, mflops=89.275 (err=1.4e-015) 14. GSL: elapsed time t=1.532 s, 1024 iters, t-(init.)=1.412 s t(norm)=0.0612086, mflops=81.6879 (err=1.4e-015) 15. GSL DIT: elapsed time t=1.643 s, 512 iters, t-(init.)=1.583 s t(norm)=0.137242, mflops=36.4319 (err=1.9e-015) 16. GSL DIF: elapsed time t=1.633 s, 512 iters, t-(init.)=1.573 s t(norm)=0.136375, mflops=36.6635 (err=2.3e-015) 17. Krukar: elapsed time t=1.632 s, 512 iters, t-(init.)=1.572 s t(norm)=0.136289, mflops=36.6868 (err=1.4e-015) 18. Mayer (Buneman): elapsed time t=1.482 s, 1024 iters, t-(init.)=1.362 s t(norm)=0.0590411, mflops=84.6868 (err=1.3e-015) 19. Mayer (simple): elapsed time t=1.171 s, 1024 iters, t-(init.)=1.051 s t(norm)=0.0455596, mflops=109.746 20. Mayer (lookup): elapsed time t=1.402 s, 1024 iters, t-(init.)=1.281 s t(norm)=0.0555299, mflops=90.0417 (err=1.3e-015) 21. NAPACK (f2c): elapsed time t=1.703 s, 256 iters, t-(init.)=1.673 s t(norm)=0.29009, mflops=17.236 (err=1.5e-014) 22. Ooura (C): elapsed time t=1.182 s, 512 iters, t-(init.)=1.122 s t(norm)=0.0972748, mflops=51.4008 (err=1.4e-015) 23. Ransom: elapsed time t=1.272 s, 512 iters, t-(init.)=1.212 s t(norm)=0.105078, mflops=47.5839 (err=2.0e-015) 24. 
Singleton (f2c): elapsed time t=1.983 s, 1024 iters, t-(init.)=1.863 s t(norm)=0.0807589, mflops=61.9127 (err=1.8e-015) 25. Temperton (f2c): elapsed time t=1.052 s, 512 iters, t-(init.)=0.992 s t(norm)=0.0860041, mflops=58.1368 (err=1.4e-015) 26. Valkenburg: elapsed time t=1.693 s, 128 iters, t-(init.)=1.683 s t(norm)=0.583649, mflops=8.5668 (err=1.6e-015) Top mflops for N=2048 = 109.746 Normalized results and averages for N=2048: fft 0: mflops = 41.4011 (norm. = 0.377243), norm. avg. (of 11) = 0.375338 fft 1: mflops = 41.7306 (norm. = 0.380246), norm. avg. (of 11) = 0.386904 fft 2: mflops = 29.6665 (norm. = 0.270319), norm. avg. (of 11) = 0.2877 fft 3: mflops = 22.6697 (norm. = 0.206564), norm. avg. (of 11) = 0.110996 fft 4: mflops = 13.0834 (norm. = 0.119215), norm. avg. (of 11) = 0.0818805 fft 5: mflops = 73.327 (norm. = 0.66815), norm. avg. (of 11) = 0.459775 fft 6: mflops = 66.1752 (norm. = 0.602983), norm. avg. (of 11) = 0.358736 fft 7: mflops = 67.7295 (norm. = 0.617146), norm. avg. (of 11) = 0.3326 fft 8: mflops = 26.4064 (norm. = 0.240614), norm. avg. (of 10) = 0.21853 fft 9: mflops = 34.9102 (norm. = 0.318099), norm. avg. (of 11) = 0.225732 fft 10: mflops = 75.8339 (norm. = 0.690993), norm. avg. (of 11) = 0.444824 fft 11: mflops = 59.9498 (norm. = 0.546258), norm. avg. (of 11) = 0.338504 fft 12: mflops = 81.1135 (norm. = 0.7391), norm. avg. (of 11) = 0.943584 fft 13: mflops = 89.275 (norm. = 0.813467), norm. avg. (of 9) = 0.706888 fft 14: mflops = 81.6879 (norm. = 0.744334), norm. avg. (of 11) = 0.490759 fft 15: mflops = 36.4319 (norm. = 0.331965), norm. avg. (of 11) = 0.290465 fft 16: mflops = 36.6635 (norm. = 0.334075), norm. avg. (of 11) = 0.274694 fft 17: mflops = 36.6868 (norm. = 0.334288), norm. avg. (of 11) = 0.394821 fft 18: mflops = 84.6868 (norm. = 0.771659), norm. avg. (of 10) = 0.463999 fft 19: mflops = 109.746 (norm. = 1), norm. avg. (of 10) = 0.585234 fft 20: mflops = 90.0417 (norm. = 0.820453), norm. avg. 
(of 10) = 0.57288 fft 21: mflops = 17.236 (norm. = 0.157053), norm. avg. (of 11) = 0.096465 fft 22: mflops = 51.4008 (norm. = 0.46836), norm. avg. (of 11) = 0.331441 fft 23: mflops = 47.5839 (norm. = 0.433581), norm. avg. (of 10) = 0.226586 fft 24: mflops = 61.9127 (norm. = 0.564144), norm. avg. (of 11) = 0.436936 fft 25: mflops = 58.1368 (norm. = 0.529738), norm. avg. (of 11) = 0.328767 fft 26: mflops = 8.5668 (norm. = 0.07806), norm. avg. (of 11) = 0.0594107 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.753 s, 256 iters, t-(init.)=1.693 s t(norm)=0.134548, mflops=37.1616 (err=3.8e-015) 1. Arndt DIT: elapsed time t=1.753 s, 256 iters, t-(init.)=1.693 s t(norm)=0.134548, mflops=37.1616 (err=3.8e-015) 2. Arndt Split-Radix: elapsed time t=1.161 s, 128 iters, t-(init.)=1.131 s t(norm)=0.179768, mflops=27.8137 (err=3.8e-015) 3. Arndt 4-step: elapsed time t=1.252 s, 128 iters, t-(init.)=1.222 s t(norm)=0.194232, mflops=25.7425 (err=3.8e-015) 4. Beauregard: elapsed time t=1.222 s, 64 iters, t-(init.)=1.212 s t(norm)=0.385284, mflops=12.9774 (err=3.8e-015) 5. Bergland: elapsed time t=1.673 s, 512 iters, t-(init.)=1.553 s t(norm)=0.0617107, mflops=81.0233 (err=3.9e-015) 6. CWP (min N) (N=4290): elapsed time t=1.092 s, 256 iters, t-(init.)=1.022 s t(norm)=0.0812213, mflops=61.5602 7. CWP (best N) (N=4368): elapsed time t=1.902 s, 512 iters, t-(init.)=1.771 s t(norm)=0.0703732, mflops=71.0498 8. Edelblute: elapsed time t=1.282 s, 128 iters, t-(init.)=1.252 s t(norm)=0.199, mflops=25.1256 (err=3.8e-015) 9. FFTPACK (f2c): elapsed time t=1.813 s, 256 iters, t-(init.)=1.753 s t(norm)=0.139316, mflops=35.8897 (err=3.8e-015) FFTW_MEASURE plan: (cost = 3.453125e-003) FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=1.773 s, 512 iters, t-(init.)=1.653 s t(norm)=0.0656843, mflops=76.1217 (err=3.8e-015) FFTW_ESTIMATE plan: (cost = 2.539520e+004) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. 
FFTW_ESTIMATE: elapsed time t=1.032 s, 256 iters, t-(init.)=0.972 s t(norm)=0.0772476, mflops=64.7269 (err=3.8e-015) 12. Frigo-old: elapsed time t=1.653 s, 512 iters, t-(init.)=1.533 s t(norm)=0.0609159, mflops=82.0803 (err=3.8e-015) 13. Green: elapsed time t=1.472 s, 512 iters, t-(init.)=1.352 s t(norm)=0.0537237, mflops=93.0689 (err=3.9e-015) 14. GSL: elapsed time t=1.642 s, 512 iters, t-(init.)=1.522 s t(norm)=0.0604788, mflops=82.6735 (err=3.8e-015) 15. GSL DIT: elapsed time t=1.812 s, 256 iters, t-(init.)=1.752 s t(norm)=0.139236, mflops=35.9101 (err=4.0e-015) 16. GSL DIF: elapsed time t=1.793 s, 256 iters, t-(init.)=1.733 s t(norm)=0.137726, mflops=36.3038 (err=4.3e-015) 17. Krukar: elapsed time t=1.743 s, 256 iters, t-(init.)=1.683 s t(norm)=0.133753, mflops=37.3824 (err=3.9e-015) 18. Mayer (Buneman): elapsed time t=1.432 s, 256 iters, t-(init.)=1.372 s t(norm)=0.109037, mflops=45.8561 (err=3.8e-015) 19. Mayer (simple): elapsed time t=1.322 s, 256 iters, t-(init.)=1.262 s t(norm)=0.100295, mflops=49.8531 20. Mayer (lookup): elapsed time t=1.412 s, 256 iters, t-(init.)=1.352 s t(norm)=0.107447, mflops=46.5344 (err=3.8e-015) 21. NAPACK (f2c): elapsed time t=1.752 s, 128 iters, t-(init.)=1.722 s t(norm)=0.273705, mflops=18.2679 (err=5.1e-014) 22. Ooura (C): elapsed time t=1.222 s, 256 iters, t-(init.)=1.162 s t(norm)=0.0923475, mflops=54.1433 (err=3.9e-015) 23. Ransom: elapsed time t=1.052 s, 256 iters, t-(init.)=0.992 s t(norm)=0.0788371, mflops=63.4219 (err=4.6e-015) 24. Singleton (f2c): elapsed time t=1.792 s, 512 iters, t-(init.)=1.672 s t(norm)=0.0664393, mflops=75.2567 (err=6.0e-015) 25. Temperton (f2c): elapsed time t=1.101 s, 256 iters, t-(init.)=1.04 s t(norm)=0.0826518, mflops=60.4948 (err=3.8e-015) 26. Valkenburg: elapsed time t=1.823 s, 64 iters, t-(init.)=1.803 s t(norm)=0.573158, mflops=8.72359 (err=4.0e-015) Top mflops for N=4096 = 93.0689 Normalized results and averages for N=4096: fft 0: mflops = 37.1616 (norm. = 0.399291), norm. avg. 
(of 12) = 0.377334 fft 1: mflops = 37.1616 (norm. = 0.399291), norm. avg. (of 12) = 0.387936 fft 2: mflops = 27.8137 (norm. = 0.298851), norm. avg. (of 12) = 0.288629 fft 3: mflops = 25.7425 (norm. = 0.276596), norm. avg. (of 12) = 0.124796 fft 4: mflops = 12.9774 (norm. = 0.139439), norm. avg. (of 12) = 0.0866771 fft 5: mflops = 81.0233 (norm. = 0.870573), norm. avg. (of 12) = 0.494008 fft 6: mflops = 61.5602 (norm. = 0.661448), norm. avg. (of 12) = 0.383962 fft 7: mflops = 71.0498 (norm. = 0.763411), norm. avg. (of 12) = 0.368501 fft 8: mflops = 25.1256 (norm. = 0.269968), norm. avg. (of 11) = 0.223207 fft 9: mflops = 35.8897 (norm. = 0.385625), norm. avg. (of 12) = 0.239057 fft 10: mflops = 76.1217 (norm. = 0.817907), norm. avg. (of 12) = 0.475914 fft 11: mflops = 64.7269 (norm. = 0.695473), norm. avg. (of 12) = 0.368251 fft 12: mflops = 82.0803 (norm. = 0.881931), norm. avg. (of 12) = 0.938446 fft 13: mflops = 93.0689 (norm. = 1), norm. avg. (of 10) = 0.736199 fft 14: mflops = 82.6735 (norm. = 0.888305), norm. avg. (of 12) = 0.523888 fft 15: mflops = 35.9101 (norm. = 0.385845), norm. avg. (of 12) = 0.298414 fft 16: mflops = 36.3038 (norm. = 0.390075), norm. avg. (of 12) = 0.284309 fft 17: mflops = 37.3824 (norm. = 0.401664), norm. avg. (of 12) = 0.395392 fft 18: mflops = 45.8561 (norm. = 0.492711), norm. avg. (of 11) = 0.46661 fft 19: mflops = 49.8531 (norm. = 0.535658), norm. avg. (of 11) = 0.580727 fft 20: mflops = 46.5344 (norm. = 0.5), norm. avg. (of 11) = 0.566254 fft 21: mflops = 18.2679 (norm. = 0.196283), norm. avg. (of 12) = 0.104783 fft 22: mflops = 54.1433 (norm. = 0.581756), norm. avg. (of 12) = 0.352301 fft 23: mflops = 63.4219 (norm. = 0.681452), norm. avg. (of 11) = 0.267938 fft 24: mflops = 75.2567 (norm. = 0.808612), norm. avg. (of 12) = 0.467909 fft 25: mflops = 60.4948 (norm. = 0.65), norm. avg. (of 12) = 0.355536 fft 26: mflops = 8.72359 (norm. = 0.0937327), norm. avg. (of 12) = 0.0622709 Benchmarking for array size = 8192 (power of 2): 0. 
Arndt DIF: elapsed time t=1.832 s, 128 iters, t-(init.)=1.772 s t(norm)=0.129993, mflops=38.4636 (err=3.6e-015) 1. Arndt DIT: elapsed time t=1.863 s, 128 iters, t-(init.)=1.803 s t(norm)=0.132267, mflops=37.8022 (err=3.6e-015) 2. Arndt Split-Radix: elapsed time t=1.251 s, 64 iters, t-(init.)=1.221 s t(norm)=0.179144, mflops=27.9105 (err=3.6e-015) 3. Arndt 4-step: elapsed time t=1.462 s, 64 iters, t-(init.)=1.432 s t(norm)=0.210102, mflops=23.798 (err=3.6e-015) 4. Beauregard: elapsed time t=1.332 s, 32 iters, t-(init.)=1.322 s t(norm)=0.387925, mflops=12.8891 (err=3.7e-015) 5. Bergland: elapsed time t=1.982 s, 256 iters, t-(init.)=1.871 s t(norm)=0.0686279, mflops=72.8567 (err=3.7e-015) 6. CWP (min N) (N=8580): elapsed time t=1.092 s, 128 iters, t-(init.)=1.022 s t(norm)=0.0749735, mflops=66.6903 7. CWP (best N) (N=9240): elapsed time t=1.112 s, 128 iters, t-(init.)=1.042 s t(norm)=0.0764407, mflops=65.4102 8. Edelblute: elapsed time t=1.382 s, 64 iters, t-(init.)=1.352 s t(norm)=0.198364, mflops=25.2062 (err=3.6e-015) 9. FFTPACK (f2c): elapsed time t=1.081 s, 64 iters, t-(init.)=1.051 s t(norm)=0.154202, mflops=32.425 (err=3.7e-015) FFTW_MEASURE plan: (cost = 7.812500e-003) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=2.002 s, 256 iters, t-(init.)=1.881 s t(norm)=0.0689947, mflops=72.4694 (err=3.7e-015) FFTW_ESTIMATE plan: (cost = 5.079040e+004) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.082 s, 128 iters, t-(init.)=1.022 s t(norm)=0.0749735, mflops=66.6903 (err=3.7e-015) 12. Frigo-old: elapsed time t=1.932 s, 256 iters, t-(init.)=1.811 s t(norm)=0.0664271, mflops=75.2705 (err=3.7e-015) 13. Green: elapsed time t=1.692 s, 256 iters, t-(init.)=1.571 s t(norm)=0.0576239, mflops=86.7695 (err=3.7e-015) 14. GSL: elapsed time t=1.022 s, 128 iters, t-(init.)=0.962 s t(norm)=0.0705719, mflops=70.8497 (err=3.7e-015) 15. 
GSL DIT: elapsed time t=1.972 s, 128 iters, t-(init.)=1.911 s t(norm)=0.14019, mflops=35.6659 (err=4.4e-015) 16. GSL DIF: elapsed time t=1.953 s, 128 iters, t-(init.)=1.893 s t(norm)=0.13887, mflops=36.005 (err=4.4e-015) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.602 s, 128 iters, t-(init.)=1.542 s t(norm)=0.11312, mflops=44.2007 (err=3.6e-015) 19. Mayer (simple): elapsed time t=1.472 s, 128 iters, t-(init.)=1.412 s t(norm)=0.103584, mflops=48.2701 20. Mayer (lookup): elapsed time t=1.582 s, 128 iters, t-(init.)=1.522 s t(norm)=0.111653, mflops=44.7815 (err=3.6e-015) 21. NAPACK (f2c): elapsed time t=1.953 s, 64 iters, t-(init.)=1.923 s t(norm)=0.282141, mflops=17.7216 (err=4.5e-014) 22. Ooura (C): elapsed time t=1.372 s, 128 iters, t-(init.)=1.312 s t(norm)=0.0962477, mflops=51.9493 (err=3.7e-015) 23. Ransom: elapsed time t=1.282 s, 128 iters, t-(init.)=1.222 s t(norm)=0.0896454, mflops=55.7753 (err=4.8e-015) 24. Singleton (f2c): elapsed time t=1.032 s, 128 iters, t-(init.)=0.972 s t(norm)=0.0713055, mflops=70.1208 (err=5.7e-015) 25. Temperton (f2c): elapsed time t=1.281 s, 128 iters, t-(init.)=1.22 s t(norm)=0.0894987, mflops=55.8668 (err=3.7e-015) 26. Valkenburg: elapsed time t=1.001 s, 16 iters, t-(init.)=0.991 s t(norm)=0.581595, mflops=8.59705 (err=3.8e-015) Top mflops for N=8192 = 86.7695 Normalized results and averages for N=8192: fft 0: mflops = 38.4636 (norm. = 0.443284), norm. avg. (of 13) = 0.382407 fft 1: mflops = 37.8022 (norm. = 0.435663), norm. avg. (of 13) = 0.391607 fft 2: mflops = 27.9105 (norm. = 0.321663), norm. avg. (of 13) = 0.29117 fft 3: mflops = 23.798 (norm. = 0.274267), norm. avg. (of 13) = 0.136294 fft 4: mflops = 12.8891 (norm. = 0.148544), norm. avg. (of 13) = 0.091436 fft 5: mflops = 72.8567 (norm. = 0.839658), norm. avg. (of 13) = 0.520597 fft 6: mflops = 66.6903 (norm. = 0.768591), norm. avg. (of 13) = 0.413549 fft 7: mflops = 65.4102 (norm. = 0.753839), norm. avg. 
(of 13) = 0.398142 fft 8: mflops = 25.2062 (norm. = 0.290496), norm. avg. (of 12) = 0.228814 fft 9: mflops = 32.425 (norm. = 0.373692), norm. avg. (of 13) = 0.249413 fft 10: mflops = 72.4694 (norm. = 0.835194), norm. avg. (of 13) = 0.503551 fft 11: mflops = 66.6903 (norm. = 0.768591), norm. avg. (of 13) = 0.399047 fft 12: mflops = 75.2705 (norm. = 0.867477), norm. avg. (of 13) = 0.932987 fft 13: mflops = 86.7695 (norm. = 1), norm. avg. (of 11) = 0.760181 fft 14: mflops = 70.8497 (norm. = 0.816528), norm. avg. (of 13) = 0.546399 fft 15: mflops = 35.6659 (norm. = 0.411041), norm. avg. (of 13) = 0.307077 fft 16: mflops = 36.005 (norm. = 0.41495), norm. avg. (of 13) = 0.294358 fft 17: mflops = -1 (norm. = -0.0115248), norm. avg. (of 12) = 0.395392 fft 18: mflops = 44.2007 (norm. = 0.509403), norm. avg. (of 12) = 0.470176 fft 19: mflops = 48.2701 (norm. = 0.556303), norm. avg. (of 12) = 0.578692 fft 20: mflops = 44.7815 (norm. = 0.516097), norm. avg. (of 12) = 0.562075 fft 21: mflops = 17.7216 (norm. = 0.204238), norm. avg. (of 13) = 0.112434 fft 22: mflops = 51.9493 (norm. = 0.598704), norm. avg. (of 13) = 0.371255 fft 23: mflops = 55.7753 (norm. = 0.642799), norm. avg. (of 12) = 0.299176 fft 24: mflops = 70.1208 (norm. = 0.808128), norm. avg. (of 13) = 0.49408 fft 25: mflops = 55.8668 (norm. = 0.643852), norm. avg. (of 13) = 0.377714 fft 26: mflops = 8.59705 (norm. = 0.0990792), norm. avg. (of 13) = 0.0651023 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.052 s, 32 iters, t-(init.)=1.022 s t(norm)=0.139236, mflops=35.9101 (err=6.9e-015) 1. Arndt DIT: elapsed time t=1.072 s, 32 iters, t-(init.)=1.042 s t(norm)=0.141961, mflops=35.2209 (err=6.9e-015) 2. Arndt Split-Radix: elapsed time t=1.342 s, 32 iters, t-(init.)=1.312 s t(norm)=0.178746, mflops=27.9727 (err=6.9e-015) 3. Arndt 4-step: elapsed time t=1.292 s, 32 iters, t-(init.)=1.262 s t(norm)=0.171934, mflops=29.081 (err=6.9e-015) 4. 
Beauregard: elapsed time t=1.432 s, 16 iters, t-(init.)=1.422 s t(norm)=0.387464, mflops=12.9044 (err=6.9e-015) 5. Bergland: elapsed time t=1.041 s, 64 iters, t-(init.)=0.981 s t(norm)=0.0668253, mflops=74.8219 (err=6.8e-015) 6. CWP (min N) (N=17160): elapsed time t=1.142 s, 64 iters, t-(init.)=1.082 s t(norm)=0.0737054, mflops=67.8376 7. CWP (best N) (N=17160): elapsed time t=1.132 s, 64 iters, t-(init.)=1.072 s t(norm)=0.0730242, mflops=68.4704 8. Edelblute: elapsed time t=1.472 s, 32 iters, t-(init.)=1.442 s t(norm)=0.196457, mflops=25.4509 (err=6.9e-015) 9. FFTPACK (f2c): elapsed time t=1.362 s, 32 iters, t-(init.)=1.332 s t(norm)=0.181471, mflops=27.5527 (err=6.9e-015) FFTW_MEASURE plan: (cost = 2.000000e-002) FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=1.312 s, 64 iters, t-(init.)=1.252 s t(norm)=0.0852857, mflops=58.6265 (err=6.8e-015) FFTW_ESTIMATE plan: (cost = 1.441792e+005) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.573 s, 64 iters, t-(init.)=1.513 s t(norm)=0.103065, mflops=48.5131 (err=6.8e-015) 12. Frigo-old: elapsed time t=1.392 s, 64 iters, t-(init.)=1.332 s t(norm)=0.0907353, mflops=55.1053 (err=6.8e-015) 13. Green: elapsed time t=1.943 s, 128 iters, t-(init.)=1.823 s t(norm)=0.062091, mflops=80.527 (err=6.9e-015) 14. GSL: elapsed time t=1.552 s, 64 iters, t-(init.)=1.492 s t(norm)=0.101634, mflops=49.1959 (err=6.9e-015) 15. GSL DIT: elapsed time t=1.062 s, 32 iters, t-(init.)=1.042 s t(norm)=0.141961, mflops=35.2209 (err=7.3e-015) 16. GSL DIF: elapsed time t=1.052 s, 32 iters, t-(init.)=1.022 s t(norm)=0.139236, mflops=35.9101 (err=7.4e-015) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.733 s, 64 iters, t-(init.)=1.673 s t(norm)=0.113964, mflops=43.8735 (err=6.9e-015) 19. Mayer (simple): elapsed time t=1.613 s, 64 iters, t-(init.)=1.553 s t(norm)=0.10579, mflops=47.2636 20. 
Mayer (lookup): elapsed time t=1.713 s, 64 iters, t-(init.)=1.653 s t(norm)=0.112602, mflops=44.4043 (err=6.9e-015) 21. NAPACK (f2c): elapsed time t=1.151 s, 16 iters, t-(init.)=1.131 s t(norm)=0.308173, mflops=16.2247 (err=2.3e-013) 22. Ooura (C): elapsed time t=1.402 s, 64 iters, t-(init.)=1.341 s t(norm)=0.0913484, mflops=54.7355 (err=6.8e-015) 23. Ransom: elapsed time t=1.112 s, 64 iters, t-(init.)=1.052 s t(norm)=0.0716618, mflops=69.7722 (err=7.5e-015) 24. Singleton (f2c): elapsed time t=1.032 s, 64 iters, t-(init.)=0.972 s t(norm)=0.0662122, mflops=75.5147 (err=1.0e-014) 25. Temperton (f2c): elapsed time t=1.372 s, 64 iters, t-(init.)=1.312 s t(norm)=0.0893729, mflops=55.9454 (err=6.9e-015) 26. Valkenburg: elapsed time t=1.152 s, 8 iters, t-(init.)=1.142 s t(norm)=0.622341, mflops=8.03419 (err=6.9e-015) Top mflops for N=16384 = 80.527 Normalized results and averages for N=16384: fft 0: mflops = 35.9101 (norm. = 0.445939), norm. avg. (of 14) = 0.386945 fft 1: mflops = 35.2209 (norm. = 0.43738), norm. avg. (of 14) = 0.394877 fft 2: mflops = 27.9727 (norm. = 0.34737), norm. avg. (of 14) = 0.295185 fft 3: mflops = 29.081 (norm. = 0.361133), norm. avg. (of 14) = 0.152354 fft 4: mflops = 12.9044 (norm. = 0.16025), norm. avg. (of 14) = 0.0963513 fft 5: mflops = 74.8219 (norm. = 0.929154), norm. avg. (of 14) = 0.549779 fft 6: mflops = 67.8376 (norm. = 0.842421), norm. avg. (of 14) = 0.444183 fft 7: mflops = 68.4704 (norm. = 0.85028), norm. avg. (of 14) = 0.430438 fft 8: mflops = 25.4509 (norm. = 0.316054), norm. avg. (of 13) = 0.235525 fft 9: mflops = 27.5527 (norm. = 0.342155), norm. avg. (of 14) = 0.256038 fft 10: mflops = 58.6265 (norm. = 0.728035), norm. avg. (of 14) = 0.519586 fft 11: mflops = 48.5131 (norm. = 0.602445), norm. avg. (of 14) = 0.413575 fft 12: mflops = 55.1053 (norm. = 0.684309), norm. avg. (of 14) = 0.915224 fft 13: mflops = 80.527 (norm. = 1), norm. avg. (of 12) = 0.780166 fft 14: mflops = 49.1959 (norm. = 0.610925), norm. avg. 
(of 14) = 0.551008 fft 15: mflops = 35.2209 (norm. = 0.43738), norm. avg. (of 14) = 0.316385 fft 16: mflops = 35.9101 (norm. = 0.445939), norm. avg. (of 14) = 0.305185 fft 17: mflops = -1 (norm. = -0.0124182), norm. avg. (of 12) = 0.395392 fft 18: mflops = 43.8735 (norm. = 0.54483), norm. avg. (of 13) = 0.475918 fft 19: mflops = 47.2636 (norm. = 0.586929), norm. avg. (of 13) = 0.579325 fft 20: mflops = 44.4043 (norm. = 0.551422), norm. avg. (of 13) = 0.561255 fft 21: mflops = 16.2247 (norm. = 0.201481), norm. avg. (of 14) = 0.118794 fft 22: mflops = 54.7355 (norm. = 0.679717), norm. avg. (of 14) = 0.393288 fft 23: mflops = 69.7722 (norm. = 0.866445), norm. avg. (of 13) = 0.342812 fft 24: mflops = 75.5147 (norm. = 0.937757), norm. avg. (of 14) = 0.525771 fft 25: mflops = 55.9454 (norm. = 0.694741), norm. avg. (of 14) = 0.400359 fft 26: mflops = 8.03419 (norm. = 0.0997701), norm. avg. (of 14) = 0.0675786 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.261 s, 16 iters, t-(init.)=1.22 s t(norm)=0.155131, mflops=32.2308 (err=1.4e-014) 1. Arndt DIT: elapsed time t=1.292 s, 16 iters, t-(init.)=1.252 s t(norm)=0.1592, mflops=31.407 (err=1.4e-014) 2. Arndt Split-Radix: elapsed time t=1.623 s, 16 iters, t-(init.)=1.583 s t(norm)=0.201289, mflops=24.8399 (err=1.4e-014) 3. Arndt 4-step: elapsed time t=1.613 s, 16 iters, t-(init.)=1.573 s t(norm)=0.200017, mflops=24.9978 (err=1.4e-014) 4. Beauregard: elapsed time t=1.613 s, 8 iters, t-(init.)=1.603 s t(norm)=0.407664, mflops=12.265 (err=1.4e-014) 5. Bergland: elapsed time t=1.252 s, 32 iters, t-(init.)=1.182 s t(norm)=0.0751495, mflops=66.534 (err=1.4e-014) 6. CWP (min N) (N=34320): elapsed time t=1.251 s, 32 iters, t-(init.)=1.17 s t(norm)=0.0743866, mflops=67.2164 7. CWP (best N) (N=34320): elapsed time t=1.252 s, 32 iters, t-(init.)=1.172 s t(norm)=0.0745138, mflops=67.1017 8. Edelblute: elapsed time t=1.742 s, 16 iters, t-(init.)=1.702 s t(norm)=0.21642, mflops=23.1032 (err=1.4e-014) 9. 
FFTPACK (f2c): elapsed time t=1.793 s, 16 iters, t-(init.)=1.753 s t(norm)=0.222905, mflops=22.431 (err=1.4e-014) FFTW_MEASURE plan: (cost = 5.025000e-002) FFTW_TWIDDLE 32 FFTW_TWIDDLE 2 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=1.642 s, 32 iters, t-(init.)=1.562 s t(norm)=0.0993093, mflops=50.3478 (err=1.4e-014) FFTW_ESTIMATE plan: (cost = 2.883584e+005) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.783 s, 32 iters, t-(init.)=1.713 s t(norm)=0.10891, mflops=45.9096 (err=1.4e-014) 12. Frigo-old: elapsed time t=1.002 s, 16 iters, t-(init.)=0.962 s t(norm)=0.122325, mflops=40.8748 (err=1.4e-014) 13. Green: elapsed time t=1.201 s, 32 iters, t-(init.)=1.13 s t(norm)=0.0718435, mflops=69.5958 (err=1.4e-014) 14. GSL: elapsed time t=1.362 s, 16 iters, t-(init.)=1.332 s t(norm)=0.169373, mflops=29.5207 (err=1.4e-014) 15. GSL DIT: elapsed time t=1.362 s, 16 iters, t-(init.)=1.332 s t(norm)=0.169373, mflops=29.5207 (err=1.4e-014) 16. GSL DIF: elapsed time t=1.362 s, 16 iters, t-(init.)=1.322 s t(norm)=0.168101, mflops=29.744 (err=1.4e-014) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.903 s, 32 iters, t-(init.)=1.833 s t(norm)=0.116539, mflops=42.9041 (err=1.4e-014) 19. Mayer (simple): elapsed time t=1.782 s, 32 iters, t-(init.)=1.711 s t(norm)=0.108782, mflops=45.9633 20. Mayer (lookup): elapsed time t=1.011 s, 16 iters, t-(init.)=0.97 s t(norm)=0.123342, mflops=40.5377 (err=1.4e-014) 21. NAPACK (f2c): elapsed time t=1.002 s, 4 iters, t-(init.)=0.992 s t(norm)=0.504557, mflops=9.90968 (err=5.7e-013) 22. Ooura (C): elapsed time t=1.792 s, 32 iters, t-(init.)=1.711 s t(norm)=0.108782, mflops=45.9633 (err=1.4e-014) 23. Ransom: elapsed time t=1.582 s, 32 iters, t-(init.)=1.512 s t(norm)=0.0961304, mflops=52.0127 (err=1.5e-014) 24. Singleton (f2c): elapsed time t=1.602 s, 32 iters, t-(init.)=1.532 s t(norm)=0.0974019, mflops=51.3337 (err=2.1e-014) 25. 
Temperton (f2c): elapsed time t=1.933 s, 32 iters, t-(init.)=1.863 s t(norm)=0.118446, mflops=42.2132 (err=1.4e-014) 26. Valkenburg: elapsed time t=1.442 s, 4 iters, t-(init.)=1.432 s t(norm)=0.728353, mflops=6.8648 (err=1.4e-014) Top mflops for N=32768 = 69.5958 Normalized results and averages for N=32768: fft 0: mflops = 32.2308 (norm. = 0.463115), norm. avg. (of 15) = 0.392023 fft 1: mflops = 31.407 (norm. = 0.451278), norm. avg. (of 15) = 0.398637 fft 2: mflops = 24.8399 (norm. = 0.356917), norm. avg. (of 15) = 0.2993 fft 3: mflops = 24.9978 (norm. = 0.359186), norm. avg. (of 15) = 0.166142 fft 4: mflops = 12.265 (norm. = 0.176232), norm. avg. (of 15) = 0.101677 fft 5: mflops = 66.534 (norm. = 0.956007), norm. avg. (of 15) = 0.576861 fft 6: mflops = 67.2164 (norm. = 0.965812), norm. avg. (of 15) = 0.478958 fft 7: mflops = 67.1017 (norm. = 0.964164), norm. avg. (of 15) = 0.46602 fft 8: mflops = 23.1032 (norm. = 0.331962), norm. avg. (of 14) = 0.242413 fft 9: mflops = 22.431 (norm. = 0.322305), norm. avg. (of 15) = 0.260455 fft 10: mflops = 50.3478 (norm. = 0.723431), norm. avg. (of 15) = 0.533175 fft 11: mflops = 45.9096 (norm. = 0.659661), norm. avg. (of 15) = 0.429981 fft 12: mflops = 40.8748 (norm. = 0.587318), norm. avg. (of 15) = 0.893364 fft 13: mflops = 69.5958 (norm. = 1), norm. avg. (of 13) = 0.797076 fft 14: mflops = 29.5207 (norm. = 0.424174), norm. avg. (of 15) = 0.542552 fft 15: mflops = 29.5207 (norm. = 0.424174), norm. avg. (of 15) = 0.323571 fft 16: mflops = 29.744 (norm. = 0.427383), norm. avg. (of 15) = 0.313332 fft 17: mflops = -1 (norm. = -0.0143687), norm. avg. (of 12) = 0.395392 fft 18: mflops = 42.9041 (norm. = 0.616476), norm. avg. (of 14) = 0.485958 fft 19: mflops = 45.9633 (norm. = 0.660432), norm. avg. (of 14) = 0.585119 fft 20: mflops = 40.5377 (norm. = 0.582474), norm. avg. (of 14) = 0.562771 fft 21: mflops = 9.90968 (norm. = 0.142389), norm. avg. (of 15) = 0.120367 fft 22: mflops = 45.9633 (norm. = 0.660432), norm. avg. 
(of 15) = 0.411098 fft 23: mflops = 52.0127 (norm. = 0.747354), norm. avg. (of 14) = 0.371708 fft 24: mflops = 51.3337 (norm. = 0.737598), norm. avg. (of 15) = 0.539893 fft 25: mflops = 42.2132 (norm. = 0.606549), norm. avg. (of 15) = 0.414105 fft 26: mflops = 6.8648 (norm. = 0.0986383), norm. avg. (of 15) = 0.0696492 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.452 s, 4 iters, t-(init.)=1.402 s t(norm)=0.334263, mflops=14.9583 (err=1.7e-014) 1. Arndt DIT: elapsed time t=1.472 s, 4 iters, t-(init.)=1.432 s t(norm)=0.341415, mflops=14.6449 (err=1.7e-014) 2. Arndt Split-Radix: elapsed time t=1.752 s, 4 iters, t-(init.)=1.712 s t(norm)=0.408173, mflops=12.2497 (err=1.7e-014) 3. Arndt 4-step: elapsed time t=1.001 s, 4 iters, t-(init.)=0.951 s t(norm)=0.226736, mflops=22.0521 (err=1.7e-014) 4. Beauregard: elapsed time t=1.021 s, 2 iters, t-(init.)=1.001 s t(norm)=0.477314, mflops=10.4753 (err=1.7e-014) 5. Bergland: elapsed time t=1.292 s, 8 iters, t-(init.)=1.212 s t(norm)=0.144482, mflops=34.6065 (err=1.7e-014) 6. CWP (min N) (N=72072): elapsed time t=1.001 s, 8 iters, t-(init.)=0.9 s t(norm)=0.107288, mflops=46.6034 7. CWP (best N) (N=72072): elapsed time t=1.002 s, 8 iters, t-(init.)=0.902 s t(norm)=0.107527, mflops=46.5 8. Edelblute: elapsed time t=1.813 s, 4 iters, t-(init.)=1.773 s t(norm)=0.422716, mflops=11.8283 (err=1.7e-014) 9. FFTPACK (f2c): elapsed time t=1.983 s, 8 iters, t-(init.)=1.903 s t(norm)=0.226855, mflops=22.0405 (err=1.7e-014) FFTW_MEASURE plan: (cost = 1.200000e-001) FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=1.843 s, 16 iters, t-(init.)=1.673 s t(norm)=0.0997186, mflops=50.1411 (err=1.7e-014) FFTW_ESTIMATE plan: (cost = 5.767168e+005) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.161 s, 8 iters, t-(init.)=1.081 s t(norm)=0.128865, mflops=38.8002 (err=1.7e-014) 12. 
Frigo-old: elapsed time t=1.232 s, 8 iters, t-(init.)=1.142 s t(norm)=0.136137, mflops=36.7277 (err=1.7e-014) 13. Green: elapsed time t=1.162 s, 8 iters, t-(init.)=1.082 s t(norm)=0.128984, mflops=38.7644 (err=1.7e-014) 14. GSL: elapsed time t=1.472 s, 8 iters, t-(init.)=1.382 s t(norm)=0.164747, mflops=30.3495 (err=1.7e-014) 15. GSL DIT: elapsed time t=1.562 s, 4 iters, t-(init.)=1.522 s t(norm)=0.362873, mflops=13.7789 (err=1.7e-014) 16. GSL DIF: elapsed time t=1.532 s, 4 iters, t-(init.)=1.492 s t(norm)=0.355721, mflops=14.056 (err=1.8e-014) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.322 s, 8 iters, t-(init.)=1.242 s t(norm)=0.148058, mflops=33.7706 (err=1.7e-014) 19. Mayer (simple): elapsed time t=1.252 s, 8 iters, t-(init.)=1.172 s t(norm)=0.139713, mflops=35.7876 20. Mayer (lookup): elapsed time t=1.392 s, 8 iters, t-(init.)=1.302 s t(norm)=0.15521, mflops=32.2143 (err=1.7e-014) 21. NAPACK (f2c): elapsed time t=1.092 s, 2 iters, t-(init.)=1.072 s t(norm)=0.511169, mflops=9.78149 (err=8.7e-013) 22. Ooura (C): elapsed time t=1.342 s, 8 iters, t-(init.)=1.252 s t(norm)=0.14925, mflops=33.5008 (err=1.7e-014) 23. Ransom: elapsed time t=1.221 s, 8 iters, t-(init.)=1.13 s t(norm)=0.134706, mflops=37.1177 (err=1.7e-014) 24. Singleton (f2c): elapsed time t=1.522 s, 8 iters, t-(init.)=1.441 s t(norm)=0.171781, mflops=29.1069 (err=2.3e-014) 25. Temperton (f2c): elapsed time t=1.642 s, 8 iters, t-(init.)=1.562 s t(norm)=0.186205, mflops=26.8521 (err=1.7e-014) 26. Valkenburg: elapsed time t=1.832 s, 2 iters, t-(init.)=1.812 s t(norm)=0.864029, mflops=5.78684 (err=1.7e-014) Top mflops for N=65536 = 50.1411 Normalized results and averages for N=65536: fft 0: mflops = 14.9583 (norm. = 0.298324), norm. avg. (of 16) = 0.386167 fft 1: mflops = 14.6449 (norm. = 0.292074), norm. avg. (of 16) = 0.391977 fft 2: mflops = 12.2497 (norm. = 0.244305), norm. avg. (of 16) = 0.295863 fft 3: mflops = 22.0521 (norm. = 0.4398), norm. avg. 
(of 16) = 0.183246 fft 4: mflops = 10.4753 (norm. = 0.208916), norm. avg. (of 16) = 0.108379 fft 5: mflops = 34.6065 (norm. = 0.690182), norm. avg. (of 16) = 0.583944 fft 6: mflops = 46.6034 (norm. = 0.929444), norm. avg. (of 16) = 0.507113 fft 7: mflops = 46.5 (norm. = 0.927384), norm. avg. (of 16) = 0.494855 fft 8: mflops = 11.8283 (norm. = 0.2359), norm. avg. (of 15) = 0.241979 fft 9: mflops = 22.0405 (norm. = 0.439569), norm. avg. (of 16) = 0.27165 fft 10: mflops = 50.1411 (norm. = 1), norm. avg. (of 16) = 0.562352 fft 11: mflops = 38.8002 (norm. = 0.773821), norm. avg. (of 16) = 0.451471 fft 12: mflops = 36.7277 (norm. = 0.732487), norm. avg. (of 16) = 0.883309 fft 13: mflops = 38.7644 (norm. = 0.773105), norm. avg. (of 14) = 0.795364 fft 14: mflops = 30.3495 (norm. = 0.605282), norm. avg. (of 16) = 0.546473 fft 15: mflops = 13.7789 (norm. = 0.274803), norm. avg. (of 16) = 0.320523 fft 16: mflops = 14.056 (norm. = 0.280328), norm. avg. (of 16) = 0.311269 fft 17: mflops = -1 (norm. = -0.0199437), norm. avg. (of 12) = 0.395392 fft 18: mflops = 33.7706 (norm. = 0.67351), norm. avg. (of 15) = 0.498462 fft 19: mflops = 35.7876 (norm. = 0.713737), norm. avg. (of 15) = 0.593693 fft 20: mflops = 32.2143 (norm. = 0.642473), norm. avg. (of 15) = 0.568084 fft 21: mflops = 9.78149 (norm. = 0.195079), norm. avg. (of 16) = 0.125037 fft 22: mflops = 33.5008 (norm. = 0.668131), norm. avg. (of 16) = 0.427162 fft 23: mflops = 37.1177 (norm. = 0.740265), norm. avg. (of 15) = 0.396279 fft 24: mflops = 29.1069 (norm. = 0.5805), norm. avg. (of 16) = 0.542431 fft 25: mflops = 26.8521 (norm. = 0.535531), norm. avg. (of 16) = 0.421694 fft 26: mflops = 5.78684 (norm. = 0.115411), norm. avg. (of 16) = 0.0725093 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.703 s, 2 iters, t-(init.)=1.653 s t(norm)=0.370923, mflops=13.4799 (err=3.4e-014) 1. Arndt DIT: elapsed time t=1.712 s, 2 iters, t-(init.)=1.661 s t(norm)=0.372718, mflops=13.415 (err=3.4e-014) 2. 
Arndt Split-Radix: elapsed time t=1.131 s, 1 iters, t-(init.)=1.1 s t(norm)=0.493667, mflops=10.1283 (err=3.4e-014) 3. Arndt 4-step: elapsed time t=1.342 s, 2 iters, t-(init.)=1.282 s t(norm)=0.287673, mflops=17.3808 (err=3.4e-014) 4. Beauregard: elapsed time t=1.111 s, 1 iters, t-(init.)=1.091 s t(norm)=0.489628, mflops=10.2118 (err=3.4e-014) 5. Bergland: elapsed time t=1.512 s, 4 iters, t-(init.)=1.412 s t(norm)=0.158422, mflops=31.5612 (err=3.4e-014) 6. CWP (min N) (N=144144): elapsed time t=1.112 s, 4 iters, t-(init.)=1.002 s t(norm)=0.112421, mflops=44.4755 7. CWP (best N) (N=144144): elapsed time t=1.101 s, 4 iters, t-(init.)=0.991 s t(norm)=0.111187, mflops=44.9692 8. Edelblute: elapsed time t=1.172 s, 1 iters, t-(init.)=1.152 s t(norm)=0.517004, mflops=9.67111 (err=3.4e-014) 9. FFTPACK (f2c): elapsed time t=1.212 s, 2 iters, t-(init.)=1.162 s t(norm)=0.260746, mflops=19.1758 (err=3.4e-014) FFTW_MEASURE plan: (cost = 2.700000e-001) FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=1.032 s, 4 iters, t-(init.)=0.932 s t(norm)=0.104568, mflops=47.816 (err=3.4e-014) FFTW_ESTIMATE plan: (cost = 1.153434e+006) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.222 s, 4 iters, t-(init.)=1.122 s t(norm)=0.125885, mflops=39.7188 (err=3.4e-014) 12. Frigo-old: elapsed time t=1.402 s, 4 iters, t-(init.)=1.301 s t(norm)=0.145968, mflops=34.254 (err=3.4e-014) 13. Green: elapsed time t=1.452 s, 4 iters, t-(init.)=1.352 s t(norm)=0.15169, mflops=32.9619 (err=3.4e-014) 14. GSL: elapsed time t=1.842 s, 4 iters, t-(init.)=1.742 s t(norm)=0.195447, mflops=25.5824 (err=3.4e-014) 15. GSL DIT: elapsed time t=1.892 s, 2 iters, t-(init.)=1.842 s t(norm)=0.413334, mflops=12.0968 (err=3.5e-014) 16. GSL DIF: elapsed time t=1.833 s, 2 iters, t-(init.)=1.783 s t(norm)=0.400094, mflops=12.497 (err=3.6e-014) 17. Skipping fft (Krukar can't handle N > 4096). 18. 
Mayer (Buneman): elapsed time t=1.422 s, 2 iters, t-(init.)=1.372 s t(norm)=0.307869, mflops=16.2407 (err=3.4e-014) 19. Mayer (simple): elapsed time t=1.402 s, 2 iters, t-(init.)=1.352 s t(norm)=0.303381, mflops=16.4809 20. Mayer (lookup): elapsed time t=1.472 s, 2 iters, t-(init.)=1.422 s t(norm)=0.319088, mflops=15.6696 (err=3.4e-014) 21. NAPACK (f2c): elapsed time t=1.212 s, 1 iters, t-(init.)=1.182 s t(norm)=0.530467, mflops=9.42565 (err=2.0e-012) 22. Ooura (C): elapsed time t=1.552 s, 4 iters, t-(init.)=1.452 s t(norm)=0.16291, mflops=30.6918 (err=3.4e-014) 23. Ransom: elapsed time t=1.582 s, 4 iters, t-(init.)=1.472 s t(norm)=0.165154, mflops=30.2748 (err=3.4e-014) 24. Singleton (f2c): elapsed time t=2.003 s, 4 iters, t-(init.)=1.903 s t(norm)=0.213511, mflops=23.418 (err=4.9e-014) 25. Temperton (f2c): elapsed time t=1.031 s, 2 iters, t-(init.)=0.981 s t(norm)=0.22013, mflops=22.7138 (err=3.4e-014) 26. Valkenburg: elapsed time t=1.973 s, 1 iters, t-(init.)=1.953 s t(norm)=0.876483, mflops=5.70462 (err=3.4e-014) Top mflops for N=131072 = 47.816 Normalized results and averages for N=131072: fft 0: mflops = 13.4799 (norm. = 0.281912), norm. avg. (of 17) = 0.380034 fft 1: mflops = 13.415 (norm. = 0.280554), norm. avg. (of 17) = 0.385423 fft 2: mflops = 10.1283 (norm. = 0.211818), norm. avg. (of 17) = 0.290919 fft 3: mflops = 17.3808 (norm. = 0.363495), norm. avg. (of 17) = 0.193849 fft 4: mflops = 10.2118 (norm. = 0.213566), norm. avg. (of 17) = 0.114567 fft 5: mflops = 31.5612 (norm. = 0.660057), norm. avg. (of 17) = 0.588421 fft 6: mflops = 44.4755 (norm. = 0.93014), norm. avg. (of 17) = 0.531997 fft 7: mflops = 44.9692 (norm. = 0.940464), norm. avg. (of 17) = 0.521067 fft 8: mflops = 9.67111 (norm. = 0.202257), norm. avg. (of 16) = 0.239496 fft 9: mflops = 19.1758 (norm. = 0.401033), norm. avg. (of 17) = 0.279261 fft 10: mflops = 47.816 (norm. = 1), norm. avg. (of 17) = 0.588096 fft 11: mflops = 39.7188 (norm. = 0.83066), norm. avg. 
(of 17) = 0.473776 fft 12: mflops = 34.254 (norm. = 0.716372), norm. avg. (of 17) = 0.873489 fft 13: mflops = 32.9619 (norm. = 0.689349), norm. avg. (of 15) = 0.788296 fft 14: mflops = 25.5824 (norm. = 0.535017), norm. avg. (of 17) = 0.545799 fft 15: mflops = 12.0968 (norm. = 0.252986), norm. avg. (of 17) = 0.31655 fft 16: mflops = 12.497 (norm. = 0.261357), norm. avg. (of 17) = 0.308333 fft 17: mflops = -1 (norm. = -0.0209135), norm. avg. (of 12) = 0.395392 fft 18: mflops = 16.2407 (norm. = 0.33965), norm. avg. (of 16) = 0.488536 fft 19: mflops = 16.4809 (norm. = 0.344675), norm. avg. (of 16) = 0.57813 fft 20: mflops = 15.6696 (norm. = 0.327707), norm. avg. (of 16) = 0.553061 fft 21: mflops = 9.42565 (norm. = 0.197124), norm. avg. (of 17) = 0.129277 fft 22: mflops = 30.6918 (norm. = 0.641873), norm. avg. (of 17) = 0.439792 fft 23: mflops = 30.2748 (norm. = 0.633152), norm. avg. (of 16) = 0.411083 fft 24: mflops = 23.418 (norm. = 0.489753), norm. avg. (of 17) = 0.539332 fft 25: mflops = 22.7138 (norm. = 0.475025), norm. avg. (of 17) = 0.424831 fft 26: mflops = 5.70462 (norm. = 0.119304), norm. avg. (of 17) = 0.0752619 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=1.983 s, 1 iters, t-(init.)=1.933 s t(norm)=0.409656, mflops=12.2054 (err=4.3e-014) 1. Arndt DIT: elapsed time t=1.993 s, 1 iters, t-(init.)=1.943 s t(norm)=0.411775, mflops=12.1425 (err=4.3e-014) 2. Arndt Split-Radix: elapsed time t=2.484 s, 1 iters, t-(init.)=2.434 s t(norm)=0.515832, mflops=9.69308 (err=4.3e-014) 3. Arndt 4-step: elapsed time t=1.162 s, 1 iters, t-(init.)=1.112 s t(norm)=0.235664, mflops=21.2167 (err=4.3e-014) 4. Beauregard: elapsed time t=2.364 s, 1 iters, t-(init.)=2.314 s t(norm)=0.490401, mflops=10.1957 (err=4.4e-014) 5. Bergland: elapsed time t=1.562 s, 2 iters, t-(init.)=1.452 s t(norm)=0.153859, mflops=32.4972 (err=4.4e-014) 6. CWP (min N) (N=360360): elapsed time t=1.612 s, 2 iters, t-(init.)=1.472 s t(norm)=0.155979, mflops=32.0557 7. 
CWP (best N) (N=360360): elapsed time t=1.613 s, 2 iters, t-(init.)=1.473 s t(norm)=0.156085, mflops=32.0339 8. Edelblute: elapsed time t=2.563 s, 1 iters, t-(init.)=2.503 s t(norm)=0.530455, mflops=9.42587 (err=4.3e-014) 9. FFTPACK (f2c): elapsed time t=1.262 s, 1 iters, t-(init.)=1.212 s t(norm)=0.256856, mflops=19.4661 (err=4.4e-014) FFTW_MEASURE plan: (cost = 5.900000e-001) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_TWIDDLE 2 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=1.112 s, 2 iters, t-(init.)=1.002 s t(norm)=0.106176, mflops=47.0917 (err=4.4e-014) FFTW_ESTIMATE plan: (cost = 2.988442e+006) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.291 s, 2 iters, t-(init.)=1.19 s t(norm)=0.126097, mflops=39.652 (err=4.4e-014) 12. Frigo-old: elapsed time t=1.612 s, 2 iters, t-(init.)=1.512 s t(norm)=0.160217, mflops=31.2076 (err=4.4e-014) 13. Green: elapsed time t=1.503 s, 2 iters, t-(init.)=1.403 s t(norm)=0.148667, mflops=33.6322 (err=4.4e-014) 14. GSL: elapsed time t=1.862 s, 2 iters, t-(init.)=1.762 s t(norm)=0.186708, mflops=26.7798 (err=4.4e-014) 15. GSL DIT: elapsed time t=2.013 s, 1 iters, t-(init.)=1.963 s t(norm)=0.416014, mflops=12.0188 (err=4.6e-014) 16. GSL DIF: elapsed time t=1.973 s, 1 iters, t-(init.)=1.923 s t(norm)=0.407537, mflops=12.2688 (err=4.6e-014) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.722 s, 1 iters, t-(init.)=1.672 s t(norm)=0.354343, mflops=14.1106 (err=4.3e-014) 19. Mayer (simple): elapsed time t=1.703 s, 1 iters, t-(init.)=1.653 s t(norm)=0.350316, mflops=14.2728 20. Mayer (lookup): elapsed time t=1.783 s, 1 iters, t-(init.)=1.733 s t(norm)=0.367271, mflops=13.6139 (err=4.3e-014) 21. NAPACK (f2c): elapsed time t=2.484 s, 1 iters, t-(init.)=2.434 s t(norm)=0.515832, mflops=9.69308 (err=3.7e-012) 22. 
Ooura (C): elapsed time t=1.582 s, 2 iters, t-(init.)=1.472 s t(norm)=0.155979, mflops=32.0557 (err=4.4e-014) 23. Ransom: elapsed time t=1.312 s, 2 iters, t-(init.)=1.202 s t(norm)=0.127369, mflops=39.2562 (err=4.3e-014) 24. Singleton (f2c): elapsed time t=1.001 s, 1 iters, t-(init.)=0.941 s t(norm)=0.199424, mflops=25.0722 (err=6.1e-014) 25. Temperton (f2c): elapsed time t=1.052 s, 1 iters, t-(init.)=1.002 s t(norm)=0.212351, mflops=23.5459 (err=4.4e-014) 26. Valkenburg: elapsed time t=4.226 s, 1 iters, t-(init.)=4.176 s t(norm)=0.88501, mflops=5.64966 (err=4.4e-014) Top mflops for N=262144 = 47.0917 Normalized results and averages for N=262144: fft 0: mflops = 12.2054 (norm. = 0.259183), norm. avg. (of 18) = 0.37332 fft 1: mflops = 12.1425 (norm. = 0.257849), norm. avg. (of 18) = 0.378335 fft 2: mflops = 9.69308 (norm. = 0.205834), norm. avg. (of 18) = 0.286192 fft 3: mflops = 21.2167 (norm. = 0.45054), norm. avg. (of 18) = 0.208109 fft 4: mflops = 10.1957 (norm. = 0.216508), norm. avg. (of 18) = 0.12023 fft 5: mflops = 32.4972 (norm. = 0.690083), norm. avg. (of 18) = 0.594069 fft 6: mflops = 32.0557 (norm. = 0.680707), norm. avg. (of 18) = 0.540259 fft 7: mflops = 32.0339 (norm. = 0.680244), norm. avg. (of 18) = 0.52991 fft 8: mflops = 9.42587 (norm. = 0.20016), norm. avg. (of 17) = 0.237182 fft 9: mflops = 19.4661 (norm. = 0.413366), norm. avg. (of 18) = 0.286711 fft 10: mflops = 47.0917 (norm. = 1), norm. avg. (of 18) = 0.610979 fft 11: mflops = 39.652 (norm. = 0.842017), norm. avg. (of 18) = 0.494234 fft 12: mflops = 31.2076 (norm. = 0.662698), norm. avg. (of 18) = 0.861779 fft 13: mflops = 33.6322 (norm. = 0.714184), norm. avg. (of 16) = 0.783664 fft 14: mflops = 26.7798 (norm. = 0.568672), norm. avg. (of 18) = 0.54707 fft 15: mflops = 12.0188 (norm. = 0.255222), norm. avg. (of 18) = 0.313143 fft 16: mflops = 12.2688 (norm. = 0.26053), norm. avg. (of 18) = 0.305677 fft 17: mflops = -1 (norm. = -0.0212351), norm. avg. 
(of 12) = 0.395392 fft 18: mflops = 14.1106 (norm. = 0.299641), norm. avg. (of 17) = 0.477424 fft 19: mflops = 14.2728 (norm. = 0.303085), norm. avg. (of 17) = 0.561951 fft 20: mflops = 13.6139 (norm. = 0.289094), norm. avg. (of 17) = 0.537533 fft 21: mflops = 9.69308 (norm. = 0.205834), norm. avg. (of 18) = 0.13353 fft 22: mflops = 32.0557 (norm. = 0.680707), norm. avg. (of 18) = 0.453176 fft 23: mflops = 39.2562 (norm. = 0.833611), norm. avg. (of 17) = 0.435938 fft 24: mflops = 25.0722 (norm. = 0.532412), norm. avg. (of 18) = 0.538948 fft 25: mflops = 23.5459 (norm. = 0.5), norm. avg. (of 18) = 0.429007 fft 26: mflops = 5.64966 (norm. = 0.119971), norm. avg. (of 18) = 0.0777458 Benchmarking for array size = 524288 (power of 2): 0. Arndt DIF: elapsed time t=4.026 s, 1 iters, t-(init.)=3.926 s t(norm)=0.394118, mflops=12.6865 (err=1.1e-013) 1. Arndt DIT: elapsed time t=4.026 s, 1 iters, t-(init.)=3.926 s t(norm)=0.394118, mflops=12.6865 (err=1.1e-013) 2. Arndt Split-Radix: elapsed time t=5.408 s, 1 iters, t-(init.)=5.298 s t(norm)=0.531849, mflops=9.40116 (err=1.1e-013) 3. Arndt 4-step: elapsed time t=3.034 s, 1 iters, t-(init.)=2.933 s t(norm)=0.294434, mflops=16.9817 (err=1.1e-013) 4. Beauregard: elapsed time t=4.987 s, 1 iters, t-(init.)=4.877 s t(norm)=0.489586, mflops=10.2127 (err=1.1e-013) 5. Bergland: elapsed time t=1.843 s, 1 iters, t-(init.)=1.743 s t(norm)=0.174974, mflops=28.5757 (err=1.1e-013) 6. CWP (min N) (N=720720): elapsed time t=1.632 s, 1 iters, t-(init.)=1.482 s t(norm)=0.148773, mflops=33.6082 7. CWP (best N) (N=720720): elapsed time t=1.632 s, 1 iters, t-(init.)=1.492 s t(norm)=0.149777, mflops=33.3829 8. Edelblute: elapsed time t=5.538 s, 1 iters, t-(init.)=5.428 s t(norm)=0.544899, mflops=9.17601 (err=1.1e-013) 9. 
FFTPACK (f2c): elapsed time t=2.574 s, 1 iters, t-(init.)=2.464 s t(norm)=0.247353, mflops=20.214 (err=1.1e-013) FFTW_MEASURE plan: (cost = 1.212000e+000) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=1.172 s, 1 iters, t-(init.)=1.072 s t(norm)=0.107615, mflops=46.4621 (err=1.1e-013) FFTW_ESTIMATE plan: (cost = 5.976883e+006) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.392 s, 1 iters, t-(init.)=1.292 s t(norm)=0.1297, mflops=38.5506 (err=1.1e-013) 12. Frigo-old: elapsed time t=1.802 s, 1 iters, t-(init.)=1.702 s t(norm)=0.170858, mflops=29.264 (err=1.1e-013) 13. Green: elapsed time t=1.682 s, 1 iters, t-(init.)=1.572 s t(norm)=0.157808, mflops=31.6841 (err=1.1e-013) 14. GSL: elapsed time t=1.893 s, 1 iters, t-(init.)=1.793 s t(norm)=0.179993, mflops=27.7788 (err=1.1e-013) 15. GSL DIT: elapsed time t=4.336 s, 1 iters, t-(init.)=4.235 s t(norm)=0.425138, mflops=11.7609 (err=1.1e-013) 16. GSL DIF: elapsed time t=4.236 s, 1 iters, t-(init.)=4.136 s t(norm)=0.4152, mflops=12.0424 (err=1.1e-013) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=3.745 s, 1 iters, t-(init.)=3.645 s t(norm)=0.36591, mflops=13.6646 (err=1.1e-013) 19. Mayer (simple): elapsed time t=3.695 s, 1 iters, t-(init.)=3.585 s t(norm)=0.359887, mflops=13.8933 20. Mayer (lookup): elapsed time t=3.846 s, 1 iters, t-(init.)=3.746 s t(norm)=0.376049, mflops=13.2961 (err=1.1e-013) 21. NAPACK (f2c): elapsed time t=5.398 s, 1 iters, t-(init.)=5.298 s t(norm)=0.531849, mflops=9.40116 (err=7.9e-012) 22. Ooura (C): elapsed time t=1.713 s, 1 iters, t-(init.)=1.603 s t(norm)=0.16092, mflops=31.0713 (err=1.1e-013) 23. Ransom: elapsed time t=1.672 s, 1 iters, t-(init.)=1.562 s t(norm)=0.156804, mflops=31.8869 (err=1.1e-013) 24. Singleton (f2c): elapsed time t=2.513 s, 1 iters, t-(init.)=2.413 s t(norm)=0.242233, mflops=20.6413 (err=1.6e-013) 25. 
Temperton (f2c): elapsed time t=2.364 s, 1 iters, t-(init.)=2.254 s t(norm)=0.226272, mflops=22.0973 (err=1.1e-013) 26. Valkenburg: elapsed time t=9.083 s, 1 iters, t-(init.)=8.983 s t(norm)=0.901774, mflops=5.54462 (err=1.1e-013) Top mflops for N=524288 = 46.4621 Normalized results and averages for N=524288: fft 0: mflops = 12.6865 (norm. = 0.273051), norm. avg. (of 19) = 0.368043 fft 1: mflops = 12.6865 (norm. = 0.273051), norm. avg. (of 19) = 0.372794 fft 2: mflops = 9.40116 (norm. = 0.202341), norm. avg. (of 19) = 0.281779 fft 3: mflops = 16.9817 (norm. = 0.365496), norm. avg. (of 19) = 0.216393 fft 4: mflops = 10.2127 (norm. = 0.219807), norm. avg. (of 19) = 0.125471 fft 5: mflops = 28.5757 (norm. = 0.615032), norm. avg. (of 19) = 0.595172 fft 6: mflops = 33.6082 (norm. = 0.723347), norm. avg. (of 19) = 0.549895 fft 7: mflops = 33.3829 (norm. = 0.718499), norm. avg. (of 19) = 0.539836 fft 8: mflops = 9.17601 (norm. = 0.197494), norm. avg. (of 18) = 0.234978 fft 9: mflops = 20.214 (norm. = 0.435065), norm. avg. (of 19) = 0.294519 fft 10: mflops = 46.4621 (norm. = 1), norm. avg. (of 19) = 0.631454 fft 11: mflops = 38.5506 (norm. = 0.829721), norm. avg. (of 19) = 0.511891 fft 12: mflops = 29.264 (norm. = 0.629847), norm. avg. (of 19) = 0.849572 fft 13: mflops = 31.6841 (norm. = 0.681934), norm. avg. (of 17) = 0.77768 fft 14: mflops = 27.7788 (norm. = 0.597881), norm. avg. (of 19) = 0.549744 fft 15: mflops = 11.7609 (norm. = 0.253129), norm. avg. (of 19) = 0.309984 fft 16: mflops = 12.0424 (norm. = 0.259188), norm. avg. (of 19) = 0.303231 fft 17: mflops = -1 (norm. = -0.0215229), norm. avg. (of 12) = 0.395392 fft 18: mflops = 13.6646 (norm. = 0.294102), norm. avg. (of 18) = 0.46724 fft 19: mflops = 13.8933 (norm. = 0.299024), norm. avg. (of 18) = 0.547343 fft 20: mflops = 13.2961 (norm. = 0.286172), norm. avg. (of 18) = 0.523569 fft 21: mflops = 9.40116 (norm. = 0.202341), norm. avg. (of 19) = 0.137152 fft 22: mflops = 31.0713 (norm. = 0.668746), norm. avg. 
(of 19) = 0.464522 fft 23: mflops = 31.8869 (norm. = 0.6863), norm. avg. (of 18) = 0.449847 fft 24: mflops = 20.6413 (norm. = 0.44426), norm. avg. (of 19) = 0.533964 fft 25: mflops = 22.0973 (norm. = 0.475599), norm. avg. (of 19) = 0.43146 fft 26: mflops = 5.54462 (norm. = 0.119337), norm. avg. (of 19) = 0.0799348 Benchmarking for array size = 1048576 (power of 2): 0. Arndt DIF: elapsed time t=9.093 s, 1 iters, t-(init.)=8.883 s t(norm)=0.423574, mflops=11.8043 (err=1.9e-013) 1. Arndt DIT: elapsed time t=9.083 s, 1 iters, t-(init.)=8.872 s t(norm)=0.42305, mflops=11.8189 (err=1.9e-013) 2. Arndt Split-Radix: elapsed time t=11.396 s, 1 iters, t-(init.)=11.195 s t(norm)=0.533819, mflops=9.36647 (err=1.9e-013) 3. Arndt 4-step: elapsed time t=4.756 s, 1 iters, t-(init.)=4.545 s t(norm)=0.216722, mflops=23.071 (err=1.9e-013) 4. Beauregard: elapsed time t=10.526 s, 1 iters, t-(init.)=10.316 s t(norm)=0.491905, mflops=10.1646 (err=1.9e-013) 5. Bergland: elapsed time t=3.765 s, 1 iters, t-(init.)=3.554 s t(norm)=0.169468, mflops=29.5041 (err=1.9e-013) 6. Skipping fft (this transform size is too big for CWP). 7. Skipping fft (this transform size is too big for CWP). 8. Edelblute: elapsed time t=11.656 s, 1 iters, t-(init.)=11.445 s t(norm)=0.54574, mflops=9.16187 (err=1.9e-013) 9. FFTPACK (f2c): elapsed time t=5.548 s, 1 iters, t-(init.)=5.337 s t(norm)=0.254488, mflops=19.6473 (err=1.9e-013) FFTW_MEASURE plan: (cost = 2.543000e+000) FFTW_TWIDDLE 16 FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=2.474 s, 1 iters, t-(init.)=2.274 s t(norm)=0.108433, mflops=46.1115 (err=1.9e-013) FFTW_ESTIMATE plan: (cost = 1.195377e+007) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=2.894 s, 1 iters, t-(init.)=2.684 s t(norm)=0.127983, mflops=39.0677 (err=1.9e-013) 12. 
Frigo-old: elapsed time t=4.547 s, 1 iters, t-(init.)=4.347 s t(norm)=0.207281, mflops=24.1218 (err=1.9e-013) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=3.786 s, 1 iters, t-(init.)=3.576 s t(norm)=0.170517, mflops=29.3226 (err=1.9e-013) 15. GSL DIT: elapsed time t=9.183 s, 1 iters, t-(init.)=8.972 s t(norm)=0.427818, mflops=11.6872 (err=1.9e-013) 16. GSL DIF: elapsed time t=9.003 s, 1 iters, t-(init.)=8.792 s t(norm)=0.419235, mflops=11.9265 (err=1.9e-013) 17. Skipping fft (Krukar can't handle N > 4096). 18. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 19. Mayer (simple): elapsed time t=7.931 s, 1 iters, t-(init.)=7.721 s t(norm)=0.368166, mflops=13.5808 20. Mayer (lookup): elapsed time t=8.232 s, 1 iters, t-(init.)=8.022 s t(norm)=0.382519, mflops=13.0713 (err=1.9e-013) 21. NAPACK (f2c): elapsed time t=11.006 s, 1 iters, t-(init.)=10.795 s t(norm)=0.514746, mflops=9.71353 (err=1.5e-011) 22. Ooura (C): elapsed time t=3.475 s, 1 iters, t-(init.)=3.265 s t(norm)=0.155687, mflops=32.1157 (err=1.9e-013) 23. Ransom: elapsed time t=2.834 s, 1 iters, t-(init.)=2.624 s t(norm)=0.125122, mflops=39.961 (err=1.9e-013) 24. Singleton (f2c): elapsed time t=4.526 s, 1 iters, t-(init.)=4.316 s t(norm)=0.205803, mflops=24.2951 (err=2.6e-013) 25. Temperton (f2c): elapsed time t=4.707 s, 1 iters, t-(init.)=4.497 s t(norm)=0.214434, mflops=23.3172 (err=1.9e-013) 26. Valkenburg: elapsed time t=19.387 s, 1 iters, t-(init.)=19.176 s t(norm)=0.914383, mflops=5.46817 (err=1.9e-013) Top mflops for N=1048576 = 46.1115 Normalized results and averages for N=1048576: fft 0: mflops = 11.8043 (norm. = 0.255995), norm. avg. (of 20) = 0.36244 fft 1: mflops = 11.8189 (norm. = 0.256312), norm. avg. (of 20) = 0.36697 fft 2: mflops = 9.36647 (norm. = 0.203126), norm. avg. (of 20) = 0.277846 fft 3: mflops = 23.071 (norm. = 0.50033), norm. avg. (of 20) = 0.23059 fft 4: mflops = 10.1646 (norm. = 0.220434), norm. avg. (of 20) = 0.130219 fft 5: mflops = 29.5041 (norm. 
= 0.639842), norm. avg. (of 20) = 0.597406 fft 6: mflops = -1 (norm. = -0.0216866), norm. avg. (of 19) = 0.549895 fft 7: mflops = -1 (norm. = -0.0216866), norm. avg. (of 19) = 0.539836 fft 8: mflops = 9.16187 (norm. = 0.198689), norm. avg. (of 19) = 0.233068 fft 9: mflops = 19.6473 (norm. = 0.426082), norm. avg. (of 20) = 0.301097 fft 10: mflops = 46.1115 (norm. = 1), norm. avg. (of 20) = 0.649881 fft 11: mflops = 39.0677 (norm. = 0.847243), norm. avg. (of 20) = 0.528659 fft 12: mflops = 24.1218 (norm. = 0.523119), norm. avg. (of 20) = 0.833249 fft 13: mflops = -1 (norm. = -0.0216866), norm. avg. (of 17) = 0.77768 fft 14: mflops = 29.3226 (norm. = 0.635906), norm. avg. (of 20) = 0.554052 fft 15: mflops = 11.6872 (norm. = 0.253455), norm. avg. (of 20) = 0.307158 fft 16: mflops = 11.9265 (norm. = 0.258644), norm. avg. (of 20) = 0.301001 fft 17: mflops = -1 (norm. = -0.0216866), norm. avg. (of 12) = 0.395392 fft 18: mflops = -1 (norm. = -0.0216866), norm. avg. (of 18) = 0.46724 fft 19: mflops = 13.5808 (norm. = 0.294521), norm. avg. (of 19) = 0.534037 fft 20: mflops = 13.0713 (norm. = 0.28347), norm. avg. (of 19) = 0.510932 fft 21: mflops = 9.71353 (norm. = 0.210653), norm. avg. (of 20) = 0.140827 fft 22: mflops = 32.1157 (norm. = 0.696478), norm. avg. (of 20) = 0.47612 fft 23: mflops = 39.961 (norm. = 0.866616), norm. avg. (of 19) = 0.471782 fft 24: mflops = 24.2951 (norm. = 0.526877), norm. avg. (of 20) = 0.53361 fft 25: mflops = 23.3172 (norm. = 0.50567), norm. avg. (of 20) = 0.43517 fft 26: mflops = 5.46817 (norm. = 0.118586), norm. avg. (of 20) = 0.0818673 Benchmarking for array size = 2097152 (power of 2): 0. Arndt DIF: elapsed time t=18.086 s, 1 iters, t-(init.)=17.665 s t(norm)=0.401111, mflops=12.4654 (err=2.7e-013) 1. Arndt DIT: elapsed time t=18.126 s, 1 iters, t-(init.)=17.705 s t(norm)=0.402019, mflops=12.4372 (err=2.7e-013) 2. Arndt Split-Radix: elapsed time t=24.004 s, 1 iters, t-(init.)=23.583 s t(norm)=0.535488, mflops=9.33728 (err=2.7e-013) 3. 
Arndt 4-step: elapsed time t=12.498 s, 1 iters, t-(init.)=12.077 s t(norm)=0.274227, mflops=18.2331 (err=2.7e-013) 4. Beauregard: elapsed time t=22.112 s, 1 iters, t-(init.)=21.692 s t(norm)=0.49255, mflops=10.1513 (err=2.7e-013) 5. Skipping fft (Bergland doesn't work for N > 2^20). 6. Skipping fft (this transform size is too big for CWP). 7. Skipping fft (this transform size is too big for CWP). 8. Edelblute: elapsed time t=24.595 s, 1 iters, t-(init.)=24.174 s t(norm)=0.548908, mflops=9.109 (err=2.7e-013) 9. FFTPACK (f2c): elapsed time t=12.738 s, 1 iters, t-(init.)=12.327 s t(norm)=0.279903, mflops=17.8633 (err=2.7e-013) FFTW_MEASURE plan: (cost = 5.388000e+000) FFTW_TWIDDLE 32 FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 2 10. FFTW: elapsed time t=5.148 s, 1 iters, t-(init.)=4.738 s t(norm)=0.107584, mflops=46.4755 (err=2.7e-013) FFTW_ESTIMATE plan: (cost = 2.390753e+007) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=6.159 s, 1 iters, t-(init.)=5.749 s t(norm)=0.13054, mflops=38.3025 (err=2.7e-013) 12. Frigo-old: elapsed time t=9.844 s, 1 iters, t-(init.)=9.423 s t(norm)=0.213964, mflops=23.3685 (err=2.7e-013) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=8.863 s, 1 iters, t-(init.)=8.443 s t(norm)=0.191711, mflops=26.0809 (err=2.7e-013) 15. GSL DIT: elapsed time t=18.837 s, 1 iters, t-(init.)=18.417 s t(norm)=0.418186, mflops=11.9564 (err=2.7e-013) 16. GSL DIF: elapsed time t=18.968 s, 1 iters, t-(init.)=18.548 s t(norm)=0.421161, mflops=11.872 (err=2.7e-013) 17. Skipping fft (Krukar can't handle N > 4096). 18. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 19. Mayer (simple): elapsed time t=16.584 s, 1 iters, t-(init.)=16.164 s t(norm)=0.367028, mflops=13.6229 20. Mayer (lookup): elapsed time t=17.155 s, 1 iters, t-(init.)=16.735 s t(norm)=0.379994, mflops=13.1581 (err=2.7e-013) 21. 
NAPACK (f2c): elapsed time t=23.745 s, 1 iters, t-(init.)=23.325 s t(norm)=0.52963, mflops=9.44056 (err=1.9e-011) 22. Ooura (C): elapsed time t=7.461 s, 1 iters, t-(init.)=7.04 s t(norm)=0.159854, mflops=31.2785 (err=2.7e-013) 23. Ransom: elapsed time t=7.892 s, 1 iters, t-(init.)=7.472 s t(norm)=0.169663, mflops=29.4701 (err=2.7e-013) 24. Singleton (f2c): elapsed time t=9.985 s, 1 iters, t-(init.)=9.575 s t(norm)=0.217415, mflops=22.9975 (err=3.7e-013) 25. Temperton (f2c): elapsed time t=10.455 s, 1 iters, t-(init.)=10.034 s t(norm)=0.227837, mflops=21.9455 (err=2.7e-013) 26. Valkenburg: elapsed time t=41.69 s, 1 iters, t-(init.)=41.269 s t(norm)=0.937076, mflops=5.33575 (err=2.7e-013) Top mflops for N=2097152 = 46.4755 Normalized results and averages for N=2097152: fft 0: mflops = 12.4654 (norm. = 0.268214), norm. avg. (of 21) = 0.357953 fft 1: mflops = 12.4372 (norm. = 0.267608), norm. avg. (of 21) = 0.362238 fft 2: mflops = 9.33728 (norm. = 0.200907), norm. avg. (of 21) = 0.274182 fft 3: mflops = 18.2331 (norm. = 0.392316), norm. avg. (of 21) = 0.238291 fft 4: mflops = 10.1513 (norm. = 0.218422), norm. avg. (of 21) = 0.134419 fft 5: mflops = -1 (norm. = -0.0215167), norm. avg. (of 20) = 0.597406 fft 6: mflops = -1 (norm. = -0.0215167), norm. avg. (of 19) = 0.549895 fft 7: mflops = -1 (norm. = -0.0215167), norm. avg. (of 19) = 0.539836 fft 8: mflops = 9.109 (norm. = 0.195996), norm. avg. (of 20) = 0.231214 fft 9: mflops = 17.8633 (norm. = 0.38436), norm. avg. (of 21) = 0.305062 fft 10: mflops = 46.4755 (norm. = 1), norm. avg. (of 21) = 0.666554 fft 11: mflops = 38.3025 (norm. = 0.824143), norm. avg. (of 21) = 0.54273 fft 12: mflops = 23.3685 (norm. = 0.502812), norm. avg. (of 21) = 0.817514 fft 13: mflops = -1 (norm. = -0.0215167), norm. avg. (of 17) = 0.77768 fft 14: mflops = 26.0809 (norm. = 0.561175), norm. avg. (of 21) = 0.554391 fft 15: mflops = 11.9564 (norm. = 0.257262), norm. avg. (of 21) = 0.304782 fft 16: mflops = 11.872 (norm. = 0.255445), norm. avg. 
(of 21) = 0.298832 fft 17: mflops = -1 (norm. = -0.0215167), norm. avg. (of 12) = 0.395392 fft 18: mflops = -1 (norm. = -0.0215167), norm. avg. (of 18) = 0.46724 fft 19: mflops = 13.6229 (norm. = 0.293121), norm. avg. (of 20) = 0.521991 fft 20: mflops = 13.1581 (norm. = 0.283119), norm. avg. (of 20) = 0.499541 fft 21: mflops = 9.44056 (norm. = 0.20313), norm. avg. (of 21) = 0.143794 fft 22: mflops = 31.2785 (norm. = 0.673011), norm. avg. (of 21) = 0.485496 fft 23: mflops = 29.4701 (norm. = 0.634101), norm. avg. (of 20) = 0.479898 fft 24: mflops = 22.9975 (norm. = 0.49483), norm. avg. (of 21) = 0.531763 fft 25: mflops = 21.9455 (norm. = 0.472195), norm. avg. (of 21) = 0.436933 fft 26: mflops = 5.33575 (norm. = 0.114808), norm. avg. (of 21) = 0.0834359 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. CWP (min N) 1. CWP (best N) 2. FFTPACK (f2c) 3. FFTW 4. FFTW_ESTIMATE 5. Frigo-old 6. GSL 7. NAPACK (f2c) 8. Singleton (f2c) 9. Temperton (f2c) 10. Valkenburg Computing normalized averages (11 transforms). Benchmarking for array size = 6: 0. CWP (min N): elapsed time t=1.002 s, 262144 iters, t-(init.)=0.942 s t(norm)=0.231689, mflops=21.5806 1. CWP (best N) (N=15): elapsed time t=1.773 s, 262144 iters, t-(init.)=1.653 s t(norm)=0.406563, mflops=12.2982 2. FFTPACK (f2c): elapsed time t=1.672 s, 524288 iters, t-(init.)=1.542 s t(norm)=0.189631, mflops=26.367 (err=2.6e-016) FFTW_MEASURE plan: (cost = 1.754761e-006) FFTW_NOTW 6 3. 
FFTW: elapsed time t=1.963 s, 1048576 iters, t-(init.)=1.712 s t(norm)=0.105268, mflops=47.4976 (err=1.6e-016) FFTW_ESTIMATE plan: (cost = 4.116000e+002) FFTW_NOTW 6 4. FFTW_ESTIMATE: elapsed time t=1.963 s, 1048576 iters, t-(init.)=1.702 s t(norm)=0.104654, mflops=47.7767 (err=1.6e-016) 5. Frigo-old: elapsed time t=2.003 s, 524288 iters, t-(init.)=1.872 s t(norm)=0.230213, mflops=21.719 (err=5.2e-016) 6. GSL: elapsed time t=1.112 s, 524288 iters, t-(init.)=0.992 s t(norm)=0.121993, mflops=40.9858 (err=1.4e-016) 7. NAPACK (f2c): elapsed time t=1.082 s, 131072 iters, t-(init.)=1.052 s t(norm)=0.517488, mflops=9.66206 (err=8.3e-016) 8. Singleton (f2c): elapsed time t=1.222 s, 262144 iters, t-(init.)=1.152 s t(norm)=0.283339, mflops=17.6467 (err=1.7e-016) 9. Temperton (f2c): elapsed time t=1.412 s, 262144 iters, t-(init.)=1.352 s t(norm)=0.33253, mflops=15.0362 (err=1.6e-016) 10. Valkenburg: elapsed time t=1.191 s, 131072 iters, t-(init.)=1.161 s t(norm)=0.571106, mflops=8.75494 (err=4.7e-016) Top mflops for N=6 = 47.7767 Normalized results and averages for N=6: fft 0: mflops = 21.5806 (norm. = 0.451699), norm. avg. (of 1) = 0.451699 fft 1: mflops = 12.2982 (norm. = 0.257411), norm. avg. (of 1) = 0.257411 fft 2: mflops = 26.367 (norm. = 0.551881), norm. avg. (of 1) = 0.551881 fft 3: mflops = 47.4976 (norm. = 0.994159), norm. avg. (of 1) = 0.994159 fft 4: mflops = 47.7767 (norm. = 1), norm. avg. (of 1) = 1 fft 5: mflops = 21.719 (norm. = 0.454594), norm. avg. (of 1) = 0.454594 fft 6: mflops = 40.9858 (norm. = 0.857863), norm. avg. (of 1) = 0.857863 fft 7: mflops = 9.66206 (norm. = 0.202234), norm. avg. (of 1) = 0.202234 fft 8: mflops = 17.6467 (norm. = 0.369358), norm. avg. (of 1) = 0.369358 fft 9: mflops = 15.0362 (norm. = 0.314719), norm. avg. (of 1) = 0.314719 fft 10: mflops = 8.75494 (norm. = 0.183247), norm. avg. (of 1) = 0.183247 Benchmarking for array size = 9: 0. 
CWP (min N): elapsed time t=1.091 s, 262144 iters, t-(init.)=1.01 s t(norm)=0.135049, mflops=37.0237 1. CWP (best N) (N=15): elapsed time t=1.773 s, 262144 iters, t-(init.)=1.663 s t(norm)=0.222362, mflops=22.4858 2. FFTPACK (f2c): elapsed time t=1.261 s, 262144 iters, t-(init.)=1.17 s t(norm)=0.156442, mflops=31.9606 (err=2.3e-016) FFTW_MEASURE plan: (cost = 3.356934e-006) FFTW_TWIDDLE 3 FFTW_NOTW 3 3. FFTW: elapsed time t=1.792 s, 524288 iters, t-(init.)=1.632 s t(norm)=0.109109, mflops=45.8259 (err=1.1e-016) FFTW_ESTIMATE plan: (cost = 4.851000e+002) FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.963 s, 524288 iters, t-(init.)=1.793 s t(norm)=0.119872, mflops=41.7111 (err=1.5e-016) 5. Frigo-old: elapsed time t=1.051 s, 131072 iters, t-(init.)=1.011 s t(norm)=0.270365, mflops=18.4935 (err=5.5e-016) 6. GSL: elapsed time t=1.923 s, 524288 iters, t-(init.)=1.763 s t(norm)=0.117867, mflops=42.4208 (err=1.1e-016) 7. NAPACK (f2c): elapsed time t=1.652 s, 131072 iters, t-(init.)=1.622 s t(norm)=0.43376, mflops=11.5271 (err=7.6e-016) 8. Singleton (f2c): elapsed time t=1.322 s, 262144 iters, t-(init.)=1.232 s t(norm)=0.164732, mflops=30.3522 (err=9.3e-017) 9. Temperton (f2c): elapsed time t=1.372 s, 262144 iters, t-(init.)=1.282 s t(norm)=0.171418, mflops=29.1685 (err=1.1e-016) 10. Valkenburg: elapsed time t=1.061 s, 65536 iters, t-(init.)=1.041 s t(norm)=0.556774, mflops=8.9803 (err=4.5e-016) Top mflops for N=9 = 45.8259 Normalized results and averages for N=9: fft 0: mflops = 37.0237 (norm. = 0.807921), norm. avg. (of 2) = 0.62981 fft 1: mflops = 22.4858 (norm. = 0.490679), norm. avg. (of 2) = 0.374045 fft 2: mflops = 31.9606 (norm. = 0.697436), norm. avg. (of 2) = 0.624658 fft 3: mflops = 45.8259 (norm. = 1), norm. avg. (of 2) = 0.997079 fft 4: mflops = 41.7111 (norm. = 0.910206), norm. avg. (of 2) = 0.955103 fft 5: mflops = 18.4935 (norm. = 0.403561), norm. avg. (of 2) = 0.429077 fft 6: mflops = 42.4208 (norm. = 0.925695), norm. avg. 
(of 2) = 0.891779 fft 7: mflops = 11.5271 (norm. = 0.251541), norm. avg. (of 2) = 0.226888 fft 8: mflops = 30.3522 (norm. = 0.662338), norm. avg. (of 2) = 0.515848 fft 9: mflops = 29.1685 (norm. = 0.636505), norm. avg. (of 2) = 0.475612 fft 10: mflops = 8.9803 (norm. = 0.195965), norm. avg. (of 2) = 0.189606 Benchmarking for array size = 12: 0. CWP (min N): elapsed time t=1.442 s, 262144 iters, t-(init.)=1.342 s t(norm)=0.119, mflops=42.0168 1. CWP (best N) (N=15): elapsed time t=1.772 s, 262144 iters, t-(init.)=1.651 s t(norm)=0.1464, mflops=34.153 2. FFTPACK (f2c): elapsed time t=1.612 s, 262144 iters, t-(init.)=1.512 s t(norm)=0.134074, mflops=37.2927 (err=2.3e-016) FFTW_MEASURE plan: (cost = 3.982544e-006) FFTW_TWIDDLE 6 FFTW_NOTW 2 3. FFTW: elapsed time t=1.022 s, 262144 iters, t-(init.)=0.922 s t(norm)=0.081757, mflops=61.1568 (err=1.7e-016) FFTW_ESTIMATE plan: (cost = 4.920000e+002) FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.131 s, 262144 iters, t-(init.)=1.031 s t(norm)=0.0914225, mflops=54.6912 (err=2.0e-016) 5. Frigo-old: elapsed time t=1.893 s, 262144 iters, t-(init.)=1.793 s t(norm)=0.158992, mflops=31.4482 (err=3.3e-016) 6. GSL: elapsed time t=1.002 s, 262144 iters, t-(init.)=0.912 s t(norm)=0.0808703, mflops=61.8274 (err=2.0e-016) 7. NAPACK (f2c): elapsed time t=1.191 s, 65536 iters, t-(init.)=1.171 s t(norm)=0.415347, mflops=12.0381 (err=9.5e-016) 8. Singleton (f2c): elapsed time t=1.762 s, 262144 iters, t-(init.)=1.661 s t(norm)=0.147287, mflops=33.9474 (err=2.1e-016) 9. Temperton (f2c): elapsed time t=1.813 s, 262144 iters, t-(init.)=1.713 s t(norm)=0.151898, mflops=32.9169 (err=1.6e-016) 10. Valkenburg: elapsed time t=1.633 s, 65536 iters, t-(init.)=1.613 s t(norm)=0.572122, mflops=8.7394 (err=4.7e-016) Top mflops for N=12 = 61.8274 Normalized results and averages for N=12: fft 0: mflops = 42.0168 (norm. = 0.679583), norm. avg. (of 3) = 0.646401 fft 1: mflops = 34.153 (norm. = 0.552392), norm. avg. 
(of 3) = 0.433494 fft 2: mflops = 37.2927 (norm. = 0.603175), norm. avg. (of 3) = 0.617497 fft 3: mflops = 61.1568 (norm. = 0.989154), norm. avg. (of 3) = 0.994438 fft 4: mflops = 54.6912 (norm. = 0.884578), norm. avg. (of 3) = 0.931595 fft 5: mflops = 31.4482 (norm. = 0.508645), norm. avg. (of 3) = 0.4556 fft 6: mflops = 61.8274 (norm. = 1), norm. avg. (of 3) = 0.927853 fft 7: mflops = 12.0381 (norm. = 0.194705), norm. avg. (of 3) = 0.21616 fft 8: mflops = 33.9474 (norm. = 0.549067), norm. avg. (of 3) = 0.526921 fft 9: mflops = 32.9169 (norm. = 0.532399), norm. avg. (of 3) = 0.494541 fft 10: mflops = 8.7394 (norm. = 0.141352), norm. avg. (of 3) = 0.173521 Benchmarking for array size = 15: 0. CWP (min N): elapsed time t=1.773 s, 262144 iters, t-(init.)=1.653 s t(norm)=0.1076, mflops=46.4686 1. CWP (best N): elapsed time t=1.773 s, 262144 iters, t-(init.)=1.653 s t(norm)=0.1076, mflops=46.4686 2. FFTPACK (f2c): elapsed time t=1.071 s, 131072 iters, t-(init.)=1.011 s t(norm)=0.131619, mflops=37.9884 (err=3.5e-016) FFTW_MEASURE plan: (cost = 5.798340e-006) FFTW_TWIDDLE 5 FFTW_NOTW 3 3. FFTW: elapsed time t=1.572 s, 262144 iters, t-(init.)=1.452 s t(norm)=0.0945158, mflops=52.9012 (err=1.9e-016) FFTW_ESTIMATE plan: (cost = 4.485000e+002) FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.673 s, 262144 iters, t-(init.)=1.553 s t(norm)=0.10109, mflops=49.4608 (err=2.0e-016) 5. Frigo-old: elapsed time t=1.923 s, 131072 iters, t-(init.)=1.863 s t(norm)=0.242538, mflops=20.6153 (err=3.8e-016) 6. GSL: elapsed time t=1.642 s, 262144 iters, t-(init.)=1.531 s t(norm)=0.0996581, mflops=50.1715 (err=2.0e-016) 7. NAPACK (f2c): elapsed time t=1.102 s, 32768 iters, t-(init.)=1.092 s t(norm)=0.568657, mflops=8.79265 (err=1.5e-015) 8. Singleton (f2c): elapsed time t=1.162 s, 131072 iters, t-(init.)=1.102 s t(norm)=0.143466, mflops=34.8514 (err=2.4e-016) 9. Temperton (f2c): elapsed time t=1.141 s, 131072 iters, t-(init.)=1.08 s t(norm)=0.140602, mflops=35.5614 (err=2.2e-016) 10. 
Valkenburg: elapsed time t=1.192 s, 32768 iters, t-(init.)=1.172 s t(norm)=0.610317, mflops=8.19247 (err=4.0e-016) Top mflops for N=15 = 52.9012 Normalized results and averages for N=15: fft 0: mflops = 46.4686 (norm. = 0.878403), norm. avg. (of 4) = 0.704401 fft 1: mflops = 46.4686 (norm. = 0.878403), norm. avg. (of 4) = 0.544721 fft 2: mflops = 37.9884 (norm. = 0.718101), norm. avg. (of 4) = 0.642648 fft 3: mflops = 52.9012 (norm. = 1), norm. avg. (of 4) = 0.995828 fft 4: mflops = 49.4608 (norm. = 0.934965), norm. avg. (of 4) = 0.932437 fft 5: mflops = 20.6153 (norm. = 0.389694), norm. avg. (of 4) = 0.439123 fft 6: mflops = 50.1715 (norm. = 0.9484), norm. avg. (of 4) = 0.932989 fft 7: mflops = 8.79265 (norm. = 0.166209), norm. avg. (of 4) = 0.203672 fft 8: mflops = 34.8514 (norm. = 0.658802), norm. avg. (of 4) = 0.559891 fft 9: mflops = 35.5614 (norm. = 0.672222), norm. avg. (of 4) = 0.538961 fft 10: mflops = 8.19247 (norm. = 0.154863), norm. avg. (of 4) = 0.168857 Benchmarking for array size = 18: 0. CWP (min N): elapsed time t=1.131 s, 131072 iters, t-(init.)=1.061 s t(norm)=0.107846, mflops=46.3623 1. CWP (best N) (N=28): elapsed time t=1.493 s, 131072 iters, t-(init.)=1.393 s t(norm)=0.141593, mflops=35.3126 2. FFTPACK (f2c): elapsed time t=1.693 s, 131072 iters, t-(init.)=1.623 s t(norm)=0.164971, mflops=30.3083 (err=2.6e-016) FFTW_MEASURE plan: (cost = 6.408691e-006) FFTW_TWIDDLE 9 FFTW_NOTW 2 3. FFTW: elapsed time t=1.672 s, 262144 iters, t-(init.)=1.532 s t(norm)=0.0778607, mflops=64.2173 (err=2.1e-016) FFTW_ESTIMATE plan: (cost = 1.168200e+003) FFTW_TWIDDLE 2 FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.112 s, 131072 iters, t-(init.)=1.042 s t(norm)=0.105915, mflops=47.2077 (err=1.8e-016) 5. Frigo-old: elapsed time t=1.172 s, 65536 iters, t-(init.)=1.132 s t(norm)=0.230126, mflops=21.7272 (err=4.0e-016) 6. GSL: elapsed time t=1.512 s, 262144 iters, t-(init.)=1.391 s t(norm)=0.0706946, mflops=70.7267 (err=2.3e-016) 7. 
NAPACK (f2c): elapsed time t=1.843 s, 65536 iters, t-(init.)=1.813 s t(norm)=0.368568, mflops=13.566 (err=9.5e-016) 8. Singleton (f2c): elapsed time t=1.201 s, 131072 iters, t-(init.)=1.131 s t(norm)=0.114961, mflops=43.4929 (err=2.7e-016) 9. Temperton (f2c): elapsed time t=1.423 s, 131072 iters, t-(init.)=1.353 s t(norm)=0.137527, mflops=36.3566 (err=2.1e-016) 10. Valkenburg: elapsed time t=1.362 s, 32768 iters, t-(init.)=1.342 s t(norm)=0.545635, mflops=9.16364 (err=5.3e-016) Top mflops for N=18 = 70.7267 Normalized results and averages for N=18: fft 0: mflops = 46.3623 (norm. = 0.655514), norm. avg. (of 5) = 0.694624 fft 1: mflops = 35.3126 (norm. = 0.499282), norm. avg. (of 5) = 0.535634 fft 2: mflops = 30.3083 (norm. = 0.428527), norm. avg. (of 5) = 0.599824 fft 3: mflops = 64.2173 (norm. = 0.907963), norm. avg. (of 5) = 0.978255 fft 4: mflops = 47.2077 (norm. = 0.667466), norm. avg. (of 5) = 0.879443 fft 5: mflops = 21.7272 (norm. = 0.3072), norm. avg. (of 5) = 0.412739 fft 6: mflops = 70.7267 (norm. = 1), norm. avg. (of 5) = 0.946391 fft 7: mflops = 13.566 (norm. = 0.191809), norm. avg. (of 5) = 0.2013 fft 8: mflops = 43.4929 (norm. = 0.614943), norm. avg. (of 5) = 0.570901 fft 9: mflops = 36.3566 (norm. = 0.514043), norm. avg. (of 5) = 0.533978 fft 10: mflops = 9.16364 (norm. = 0.129564), norm. avg. (of 5) = 0.160998 Benchmarking for array size = 24: 0. CWP (min N): elapsed time t=1.272 s, 131072 iters, t-(init.)=1.192 s t(norm)=0.0826455, mflops=60.4993 1. CWP (best N) (N=28): elapsed time t=1.492 s, 131072 iters, t-(init.)=1.392 s t(norm)=0.0965122, mflops=51.8069 2. FFTPACK (f2c): elapsed time t=1.102 s, 65536 iters, t-(init.)=1.052 s t(norm)=0.145878, mflops=34.2753 (err=2.5e-016) FFTW_MEASURE plan: (cost = 9.155273e-006) FFTW_TWIDDLE 8 FFTW_NOTW 3 3. FFTW: elapsed time t=1.242 s, 131072 iters, t-(init.)=1.152 s t(norm)=0.0798722, mflops=62.6 (err=1.9e-016) FFTW_ESTIMATE plan: (cost = 1.248000e+003) FFTW_TWIDDLE 2 FFTW_NOTW 12 4. 
FFTW_ESTIMATE: elapsed time t=1.302 s, 131072 iters, t-(init.)=1.222 s t(norm)=0.0847255, mflops=59.0141 (err=2.4e-016) 5. Frigo-old: elapsed time t=1.933 s, 131072 iters, t-(init.)=1.843 s t(norm)=0.127782, mflops=39.1293 (err=3.7e-016) 6. GSL: elapsed time t=1.743 s, 262144 iters, t-(init.)=1.583 s t(norm)=0.0548775, mflops=91.1121 (err=1.5e-016) 7. NAPACK (f2c): elapsed time t=1.242 s, 32768 iters, t-(init.)=1.222 s t(norm)=0.338902, mflops=14.7535 (err=8.0e-016) 8. Singleton (f2c): elapsed time t=1.803 s, 131072 iters, t-(init.)=1.713 s t(norm)=0.118768, mflops=42.0988 (err=2.5e-016) 9. Temperton (f2c): elapsed time t=1.522 s, 131072 iters, t-(init.)=1.432 s t(norm)=0.0992856, mflops=50.3598 (err=2.1e-016) 10. Valkenburg: elapsed time t=1.011 s, 16384 iters, t-(init.)=1.001 s t(norm)=0.555223, mflops=9.0054 (err=4.8e-016) Top mflops for N=24 = 91.1121 Normalized results and averages for N=24: fft 0: mflops = 60.4993 (norm. = 0.66401), norm. avg. (of 6) = 0.689521 fft 1: mflops = 51.8069 (norm. = 0.568606), norm. avg. (of 6) = 0.541129 fft 2: mflops = 34.2753 (norm. = 0.376188), norm. avg. (of 6) = 0.562551 fft 3: mflops = 62.6 (norm. = 0.687066), norm. avg. (of 6) = 0.929724 fft 4: mflops = 59.0141 (norm. = 0.647709), norm. avg. (of 6) = 0.840821 fft 5: mflops = 39.1293 (norm. = 0.429463), norm. avg. (of 6) = 0.415526 fft 6: mflops = 91.1121 (norm. = 1), norm. avg. (of 6) = 0.955326 fft 7: mflops = 14.7535 (norm. = 0.161927), norm. avg. (of 6) = 0.194738 fft 8: mflops = 42.0988 (norm. = 0.462055), norm. avg. (of 6) = 0.55276 fft 9: mflops = 50.3598 (norm. = 0.552723), norm. avg. (of 6) = 0.537102 fft 10: mflops = 9.0054 (norm. = 0.0988387), norm. avg. (of 6) = 0.150638 Benchmarking for array size = 36: 0. CWP (min N): elapsed time t=1.963 s, 131072 iters, t-(init.)=1.833 s t(norm)=0.0751391, mflops=66.5433 1. CWP (best N): elapsed time t=1.963 s, 131072 iters, t-(init.)=1.833 s t(norm)=0.0751391, mflops=66.5433 2. 
FFTPACK (f2c): elapsed time t=1.762 s, 65536 iters, t-(init.)=1.702 s t(norm)=0.139538, mflops=35.8325 (err=3.9e-016) FFTW_MEASURE plan: (cost = 1.403809e-005) FFTW_TWIDDLE 6 FFTW_NOTW 6 3. FFTW: elapsed time t=1.883 s, 131072 iters, t-(init.)=1.763 s t(norm)=0.0722696, mflops=69.1854 (err=3.6e-016) FFTW_ESTIMATE plan: (cost = 1.803600e+003) FFTW_TWIDDLE 3 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.032 s, 65536 iters, t-(init.)=0.962 s t(norm)=0.0788694, mflops=63.396 (err=3.6e-016) 5. Frigo-old: elapsed time t=1.171 s, 32768 iters, t-(init.)=1.141 s t(norm)=0.187089, mflops=26.7252 (err=5.6e-016) 6. GSL: elapsed time t=1.352 s, 131072 iters, t-(init.)=1.242 s t(norm)=0.0509126, mflops=98.2076 (err=3.9e-016) 7. NAPACK (f2c): elapsed time t=1.933 s, 32768 iters, t-(init.)=1.913 s t(norm)=0.313674, mflops=15.9401 (err=1.4e-015) 8. Singleton (f2c): elapsed time t=1.032 s, 65536 iters, t-(init.)=0.972 s t(norm)=0.0796892, mflops=62.7437 (err=4.7e-016) 9. Temperton (f2c): elapsed time t=1.052 s, 65536 iters, t-(init.)=0.992 s t(norm)=0.0813289, mflops=61.4787 (err=3.5e-016) 10. Valkenburg: elapsed time t=1.663 s, 16384 iters, t-(init.)=1.643 s t(norm)=0.538804, mflops=9.27981 (err=4.7e-016) Top mflops for N=36 = 98.2076 Normalized results and averages for N=36: fft 0: mflops = 66.5433 (norm. = 0.677578), norm. avg. (of 7) = 0.687815 fft 1: mflops = 66.5433 (norm. = 0.677578), norm. avg. (of 7) = 0.560622 fft 2: mflops = 35.8325 (norm. = 0.364865), norm. avg. (of 7) = 0.53431 fft 3: mflops = 69.1854 (norm. = 0.704481), norm. avg. (of 7) = 0.897546 fft 4: mflops = 63.396 (norm. = 0.64553), norm. avg. (of 7) = 0.812922 fft 5: mflops = 26.7252 (norm. = 0.27213), norm. avg. (of 7) = 0.395041 fft 6: mflops = 98.2076 (norm. = 1), norm. avg. (of 7) = 0.961708 fft 7: mflops = 15.9401 (norm. = 0.162311), norm. avg. (of 7) = 0.190105 fft 8: mflops = 62.7437 (norm. = 0.638889), norm. avg. (of 7) = 0.565064 fft 9: mflops = 61.4787 (norm. = 0.626008), norm. avg. 
(of 7) = 0.549803 fft 10: mflops = 9.27981 (norm. = 0.0944918), norm. avg. (of 7) = 0.142617 Benchmarking for array size = 80: 0. CWP (min N): elapsed time t=2.003 s, 65536 iters, t-(init.)=1.873 s t(norm)=0.0565091, mflops=88.4813 1. CWP (best N) (N=84): elapsed time t=1.231 s, 32768 iters, t-(init.)=1.161 s t(norm)=0.0700556, mflops=71.3719 2. FFTPACK (f2c): elapsed time t=1.903 s, 32768 iters, t-(init.)=1.843 s t(norm)=0.111208, mflops=44.9608 (err=3.9e-016) FFTW_MEASURE plan: (cost = 3.417969e-005) FFTW_TWIDDLE 5 FFTW_TWIDDLE 8 FFTW_NOTW 2 3. FFTW: elapsed time t=1.172 s, 32768 iters, t-(init.)=1.102 s t(norm)=0.0664955, mflops=75.1931 (err=3.8e-016) FFTW_ESTIMATE plan: (cost = 2.600000e+003) FFTW_TWIDDLE 5 FFTW_NOTW 16 4. FFTW_ESTIMATE: elapsed time t=1.392 s, 32768 iters, t-(init.)=1.322 s t(norm)=0.0797704, mflops=62.6799 (err=3.6e-016) 5. Frigo-old: elapsed time t=1.872 s, 32768 iters, t-(init.)=1.801 s t(norm)=0.108674, mflops=46.0093 (err=4.2e-016) 6. GSL: elapsed time t=1.001 s, 32768 iters, t-(init.)=0.951 s t(norm)=0.057384, mflops=87.1323 (err=3.4e-016) 7. NAPACK (f2c): elapsed time t=1.753 s, 8192 iters, t-(init.)=1.743 s t(norm)=0.420696, mflops=11.8851 (err=5.2e-016) 8. Singleton (f2c): elapsed time t=1.001 s, 32768 iters, t-(init.)=0.94 s t(norm)=0.0567203, mflops=88.1519 (err=4.7e-016) 9. Temperton (f2c): elapsed time t=1.532 s, 32768 iters, t-(init.)=1.472 s t(norm)=0.0888215, mflops=56.2926 (err=4.1e-016) 10. Valkenburg: elapsed time t=1.181 s, 4096 iters, t-(init.)=1.171 s t(norm)=0.565272, mflops=8.8453 (err=4.9e-016) Top mflops for N=80 = 88.4813 Normalized results and averages for N=80: fft 0: mflops = 88.4813 (norm. = 1), norm. avg. (of 8) = 0.726838 fft 1: mflops = 71.3719 (norm. = 0.806632), norm. avg. (of 8) = 0.591373 fft 2: mflops = 44.9608 (norm. = 0.508139), norm. avg. (of 8) = 0.531039 fft 3: mflops = 75.1931 (norm. = 0.849819), norm. avg. (of 8) = 0.89158 fft 4: mflops = 62.6799 (norm. = 0.708396), norm. avg. 
(of 8) = 0.799856 fft 5: mflops = 46.0093 (norm. = 0.519989), norm. avg. (of 8) = 0.410659 fft 6: mflops = 87.1323 (norm. = 0.984753), norm. avg. (of 8) = 0.964589 fft 7: mflops = 11.8851 (norm. = 0.134323), norm. avg. (of 8) = 0.183132 fft 8: mflops = 88.1519 (norm. = 0.996277), norm. avg. (of 8) = 0.618966 fft 9: mflops = 56.2926 (norm. = 0.636209), norm. avg. (of 8) = 0.560604 fft 10: mflops = 8.8453 (norm. = 0.099968), norm. avg. (of 8) = 0.137286 Benchmarking for array size = 108: 0. CWP (min N) (N=110): elapsed time t=1.833 s, 32768 iters, t-(init.)=1.743 s t(norm)=0.0729131, mflops=68.5748 1. CWP (best N) (N=112): elapsed time t=1.482 s, 32768 iters, t-(init.)=1.402 s t(norm)=0.0586484, mflops=85.2538 2. FFTPACK (f2c): elapsed time t=1.652 s, 16384 iters, t-(init.)=1.602 s t(norm)=0.13403, mflops=37.3052 (err=4.5e-016) FFTW_MEASURE plan: (cost = 5.126953e-005) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_NOTW 2 3. FFTW: elapsed time t=1.743 s, 32768 iters, t-(init.)=1.663 s t(norm)=0.0695665, mflops=71.8736 (err=3.1e-016) FFTW_ESTIMATE plan: (cost = 4.633200e+003) FFTW_TWIDDLE 9 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.812 s, 32768 iters, t-(init.)=1.732 s t(norm)=0.072453, mflops=69.0103 (err=3.2e-016) 5. Frigo-old: elapsed time t=1.222 s, 8192 iters, t-(init.)=1.202 s t(norm)=0.201128, mflops=24.8598 (err=5.4e-016) 6. GSL: elapsed time t=1.221 s, 32768 iters, t-(init.)=1.151 s t(norm)=0.0481486, mflops=103.845 (err=3.0e-016) 7. NAPACK (f2c): elapsed time t=1.692 s, 8192 iters, t-(init.)=1.672 s t(norm)=0.279772, mflops=17.8717 (err=3.1e-015) 8. Singleton (f2c): elapsed time t=1.742 s, 32768 iters, t-(init.)=1.652 s t(norm)=0.0691064, mflops=72.3522 (err=3.8e-016) 9. Temperton (f2c): elapsed time t=1.732 s, 32768 iters, t-(init.)=1.641 s t(norm)=0.0686462, mflops=72.8372 (err=3.5e-016) 10. 
Valkenburg: elapsed time t=1.582 s, 4096 iters, t-(init.)=1.572 s t(norm)=0.526079, mflops=9.50428 (err=5.5e-016) Top mflops for N=108 = 103.845 Normalized results and averages for N=108: fft 0: mflops = 68.5748 (norm. = 0.660356), norm. avg. (of 9) = 0.719451 fft 1: mflops = 85.2538 (norm. = 0.82097), norm. avg. (of 9) = 0.616884 fft 2: mflops = 37.3052 (norm. = 0.359238), norm. avg. (of 9) = 0.51195 fft 3: mflops = 71.8736 (norm. = 0.692123), norm. avg. (of 9) = 0.869418 fft 4: mflops = 69.0103 (norm. = 0.66455), norm. avg. (of 9) = 0.784822 fft 5: mflops = 24.8598 (norm. = 0.239393), norm. avg. (of 9) = 0.39163 fft 6: mflops = 103.845 (norm. = 1), norm. avg. (of 9) = 0.968523 fft 7: mflops = 17.8717 (norm. = 0.172099), norm. avg. (of 9) = 0.181906 fft 8: mflops = 72.3522 (norm. = 0.696731), norm. avg. (of 9) = 0.627606 fft 9: mflops = 72.8372 (norm. = 0.701402), norm. avg. (of 9) = 0.576248 fft 10: mflops = 9.50428 (norm. = 0.0915235), norm. avg. (of 9) = 0.132202 Benchmarking for array size = 210: 0. CWP (min N): elapsed time t=1.883 s, 16384 iters, t-(init.)=1.803 s t(norm)=0.0679302, mflops=73.6049 1. CWP (best N): elapsed time t=1.883 s, 16384 iters, t-(init.)=1.803 s t(norm)=0.0679302, mflops=73.6049 2. FFTPACK (f2c): elapsed time t=1.342 s, 4096 iters, t-(init.)=1.322 s t(norm)=0.199232, mflops=25.0964 (err=5.4e-016) FFTW_MEASURE plan: (cost = 1.171875e-004) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_NOTW 3 3. FFTW: elapsed time t=1.963 s, 16384 iters, t-(init.)=1.873 s t(norm)=0.0705676, mflops=70.8541 (err=4.3e-016) FFTW_ESTIMATE plan: (cost = 9.324000e+003) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.071 s, 8192 iters, t-(init.)=1.031 s t(norm)=0.0776884, mflops=64.3597 (err=4.2e-016) 5. Frigo-old: elapsed time t=1.463 s, 4096 iters, t-(init.)=1.443 s t(norm)=0.217467, mflops=22.992 (err=5.2e-016) 6. GSL: elapsed time t=1.472 s, 16384 iters, t-(init.)=1.402 s t(norm)=0.0528221, mflops=94.6574 (err=5.3e-016) 7. 
NAPACK (f2c): elapsed time t=1.763 s, 2048 iters, t-(init.)=1.753 s t(norm)=0.528371, mflops=9.46304 (err=1.3e-014) 8. Singleton (f2c): elapsed time t=1.111 s, 8192 iters, t-(init.)=1.061 s t(norm)=0.0799489, mflops=62.5399 (err=5.6e-016) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1.081 s, 1024 iters, t-(init.)=1.081 s t(norm)=0.651648, mflops=7.67286 (err=5.8e-016) Top mflops for N=210 = 94.6574 Normalized results and averages for N=210: fft 0: mflops = 73.6049 (norm. = 0.777593), norm. avg. (of 10) = 0.725266 fft 1: mflops = 73.6049 (norm. = 0.777593), norm. avg. (of 10) = 0.632955 fft 2: mflops = 25.0964 (norm. = 0.265129), norm. avg. (of 10) = 0.487268 fft 3: mflops = 70.8541 (norm. = 0.748532), norm. avg. (of 10) = 0.85733 fft 4: mflops = 64.3597 (norm. = 0.679922), norm. avg. (of 10) = 0.774332 fft 5: mflops = 22.992 (norm. = 0.242897), norm. avg. (of 10) = 0.376756 fft 6: mflops = 94.6574 (norm. = 1), norm. avg. (of 10) = 0.971671 fft 7: mflops = 9.46304 (norm. = 0.0999715), norm. avg. (of 10) = 0.173713 fft 8: mflops = 62.5399 (norm. = 0.660697), norm. avg. (of 10) = 0.630916 fft 9: mflops = -1 (norm. = -0.0105644), norm. avg. (of 9) = 0.576248 fft 10: mflops = 7.67286 (norm. = 0.0810592), norm. avg. (of 10) = 0.127087 Benchmarking for array size = 504: 0. CWP (min N): elapsed time t=1.112 s, 4096 iters, t-(init.)=1.072 s t(norm)=0.0578442, mflops=86.4391 1. CWP (best N): elapsed time t=1.112 s, 4096 iters, t-(init.)=1.062 s t(norm)=0.0573046, mflops=87.2531 2. FFTPACK (f2c): elapsed time t=1.862 s, 2048 iters, t-(init.)=1.842 s t(norm)=0.198785, mflops=25.1528 (err=1.4e-015) FFTW_MEASURE plan: (cost = 3.515625e-004) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 4 FFTW_NOTW 2 3. FFTW: elapsed time t=1.492 s, 4096 iters, t-(init.)=1.441 s t(norm)=0.0777551, mflops=64.3045 (err=1.3e-015) FFTW_ESTIMATE plan: (cost = 2.147040e+004) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 4. 
FFTW_ESTIMATE: elapsed time t=1.542 s, 4096 iters, t-(init.)=1.502 s t(norm)=0.0810466, mflops=61.6929 (err=1.3e-015) 5. Frigo-old: elapsed time t=1.762 s, 2048 iters, t-(init.)=1.742 s t(norm)=0.187994, mflops=26.5967 (err=1.4e-015) 6. GSL: elapsed time t=1.021 s, 4096 iters, t-(init.)=0.981 s t(norm)=0.0529339, mflops=94.4575 (err=1.4e-015) 7. NAPACK (f2c): elapsed time t=1.873 s, 1024 iters, t-(init.)=1.863 s t(norm)=0.402103, mflops=12.4346 (err=4.1e-014) 8. Singleton (f2c): elapsed time t=1.332 s, 4096 iters, t-(init.)=1.292 s t(norm)=0.0697152, mflops=71.7204 (err=1.9e-015) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1.402 s, 512 iters, t-(init.)=1.402 s t(norm)=0.605205, mflops=8.26166 (err=1.4e-015) Top mflops for N=504 = 94.4575 Normalized results and averages for N=504: fft 0: mflops = 86.4391 (norm. = 0.915112), norm. avg. (of 11) = 0.742524 fft 1: mflops = 87.2531 (norm. = 0.923729), norm. avg. (of 11) = 0.659389 fft 2: mflops = 25.1528 (norm. = 0.266287), norm. avg. (of 11) = 0.467179 fft 3: mflops = 64.3045 (norm. = 0.680777), norm. avg. (of 11) = 0.841279 fft 4: mflops = 61.6929 (norm. = 0.653129), norm. avg. (of 11) = 0.763314 fft 5: mflops = 26.5967 (norm. = 0.281573), norm. avg. (of 11) = 0.368103 fft 6: mflops = 94.4575 (norm. = 1), norm. avg. (of 11) = 0.974246 fft 7: mflops = 12.4346 (norm. = 0.131643), norm. avg. (of 11) = 0.169888 fft 8: mflops = 71.7204 (norm. = 0.759288), norm. avg. (of 11) = 0.642586 fft 9: mflops = -1 (norm. = -0.0105868), norm. avg. (of 9) = 0.576248 fft 10: mflops = 8.26166 (norm. = 0.0874643), norm. avg. (of 11) = 0.123485 Benchmarking for array size = 1000: 0. CWP (min N) (N=1001): elapsed time t=1.412 s, 2048 iters, t-(init.)=1.362 s t(norm)=0.0667322, mflops=74.9263 1. CWP (best N) (N=1008): elapsed time t=1.182 s, 2048 iters, t-(init.)=1.132 s t(norm)=0.0554632, mflops=90.1499 2. 
FFTPACK (f2c): elapsed time t=1.502 s, 1024 iters, t-(init.)=1.482 s t(norm)=0.145223, mflops=34.4297 (err=1.0e-015) FFTW_MEASURE plan: (cost = 7.441406e-004) FFTW_TWIDDLE 10 FFTW_TWIDDLE 5 FFTW_TWIDDLE 10 FFTW_NOTW 2 3. FFTW: elapsed time t=1.512 s, 2048 iters, t-(init.)=1.462 s t(norm)=0.0716318, mflops=69.8014 (err=9.0e-016) FFTW_ESTIMATE plan: (cost = 5.220000e+004) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 4. FFTW_ESTIMATE: elapsed time t=1.582 s, 2048 iters, t-(init.)=1.532 s t(norm)=0.0750615, mflops=66.612 (err=9.1e-016) 5. Frigo-old: elapsed time t=1.963 s, 1024 iters, t-(init.)=1.933 s t(norm)=0.189418, mflops=26.3967 (err=9.3e-016) 6. GSL: elapsed time t=1.482 s, 2048 iters, t-(init.)=1.432 s t(norm)=0.0701619, mflops=71.2637 (err=9.1e-016) 7. NAPACK (f2c): elapsed time t=1.332 s, 256 iters, t-(init.)=1.332 s t(norm)=0.522099, mflops=9.57673 (err=1.6e-014) 8. Singleton (f2c): elapsed time t=1.211 s, 2048 iters, t-(init.)=1.161 s t(norm)=0.0568841, mflops=87.898 (err=1.2e-015) 9. Temperton (f2c): elapsed time t=1.432 s, 2048 iters, t-(init.)=1.381 s t(norm)=0.0676632, mflops=73.8955 (err=9.2e-016) 10. Valkenburg: elapsed time t=1.582 s, 256 iters, t-(init.)=1.582 s t(norm)=0.62009, mflops=8.06334 (err=1.0e-015) Top mflops for N=1000 = 90.1499 Normalized results and averages for N=1000: fft 0: mflops = 74.9263 (norm. = 0.831131), norm. avg. (of 12) = 0.749908 fft 1: mflops = 90.1499 (norm. = 1), norm. avg. (of 12) = 0.687773 fft 2: mflops = 34.4297 (norm. = 0.381916), norm. avg. (of 12) = 0.460073 fft 3: mflops = 69.8014 (norm. = 0.774282), norm. avg. (of 12) = 0.835696 fft 4: mflops = 66.612 (norm. = 0.738903), norm. avg. (of 12) = 0.76128 fft 5: mflops = 26.3967 (norm. = 0.292809), norm. avg. (of 12) = 0.361829 fft 6: mflops = 71.2637 (norm. = 0.790503), norm. avg. (of 12) = 0.958934 fft 7: mflops = 9.57673 (norm. = 0.106231), norm. avg. (of 12) = 0.164584 fft 8: mflops = 87.898 (norm. = 0.975022), norm. avg. 
(of 12) = 0.670289 fft 9: mflops = 73.8955 (norm. = 0.819696), norm. avg. (of 10) = 0.600593 fft 10: mflops = 8.06334 (norm. = 0.0894437), norm. avg. (of 12) = 0.120648 Benchmarking for array size = 1960: 0. CWP (min N) (N=1980): elapsed time t=1.563 s, 1024 iters, t-(init.)=1.443 s t(norm)=0.0657395, mflops=76.0578 1. CWP (best N) (N=1980): elapsed time t=1.562 s, 1024 iters, t-(init.)=1.452 s t(norm)=0.0661495, mflops=75.5863 2. FFTPACK (f2c): elapsed time t=1.542 s, 256 iters, t-(init.)=1.512 s t(norm)=0.275532, mflops=18.1467 (err=2.8e-015) FFTW_MEASURE plan: (cost = 1.640625e-003) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_NOTW 4 3. FFTW: elapsed time t=1.722 s, 1024 iters, t-(init.)=1.612 s t(norm)=0.0734387, mflops=68.084 (err=2.8e-015) FFTW_ESTIMATE plan: (cost = 9.662800e+004) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.762 s, 1024 iters, t-(init.)=1.641 s t(norm)=0.0747599, mflops=66.8808 (err=2.8e-015) 5. Frigo-old: elapsed time t=1.202 s, 256 iters, t-(init.)=1.172 s t(norm)=0.213574, mflops=23.4111 (err=2.8e-015) 6. GSL: elapsed time t=1.613 s, 1024 iters, t-(init.)=1.493 s t(norm)=0.0680174, mflops=73.5106 (err=2.9e-015) 7. NAPACK (f2c): elapsed time t=1.603 s, 128 iters, t-(init.)=1.583 s t(norm)=0.576941, mflops=8.6664 (err=1.3e-013) 8. Singleton (f2c): elapsed time t=1.022 s, 512 iters, t-(init.)=0.962 s t(norm)=0.0876527, mflops=57.0433 (err=4.2e-015) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1.923 s, 128 iters, t-(init.)=1.913 s t(norm)=0.697212, mflops=7.17142 (err=2.7e-015) Top mflops for N=1960 = 76.0578 Normalized results and averages for N=1960: fft 0: mflops = 76.0578 (norm. = 1), norm. avg. (of 13) = 0.769146 fft 1: mflops = 75.5863 (norm. = 0.993802), norm. avg. (of 13) = 0.711314 fft 2: mflops = 18.1467 (norm. = 0.238591), norm. avg. (of 13) = 0.443036 fft 3: mflops = 68.084 (norm. = 0.895161), norm. avg. 
(of 13) = 0.840271 fft 4: mflops = 66.8808 (norm. = 0.879342), norm. avg. (of 13) = 0.770361 fft 5: mflops = 23.4111 (norm. = 0.307807), norm. avg. (of 13) = 0.357673 fft 6: mflops = 73.5106 (norm. = 0.96651), norm. avg. (of 13) = 0.959517 fft 7: mflops = 8.6664 (norm. = 0.113945), norm. avg. (of 13) = 0.160688 fft 8: mflops = 57.0433 (norm. = 0.75), norm. avg. (of 13) = 0.67642 fft 9: mflops = -1 (norm. = -0.0131479), norm. avg. (of 10) = 0.600593 fft 10: mflops = 7.17142 (norm. = 0.0942891), norm. avg. (of 13) = 0.118621 Benchmarking for array size = 4725: 0. CWP (min N) (N=5005): elapsed time t=1.172 s, 256 iters, t-(init.)=1.092 s t(norm)=0.0739612, mflops=67.603 1. CWP (best N) (N=5040): elapsed time t=1.032 s, 256 iters, t-(init.)=0.962 s t(norm)=0.0651563, mflops=76.7385 2. FFTPACK (f2c): elapsed time t=1.532 s, 128 iters, t-(init.)=1.492 s t(norm)=0.202106, mflops=24.7394 (err=1.9e-015) FFTW_MEASURE plan: (cost = 4.859375e-003) FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 3 3. FFTW: elapsed time t=1.242 s, 256 iters, t-(init.)=1.182 s t(norm)=0.0800569, mflops=62.4556 (err=1.8e-015) FFTW_ESTIMATE plan: (cost = 1.946700e+005) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.192 s, 256 iters, t-(init.)=1.122 s t(norm)=0.0759931, mflops=65.7954 (err=1.9e-015) 5. Frigo-old: elapsed time t=1.973 s, 128 iters, t-(init.)=1.943 s t(norm)=0.263199, mflops=18.997 (err=2.0e-015) 6. GSL: elapsed time t=1.923 s, 512 iters, t-(init.)=1.783 s t(norm)=0.0603813, mflops=82.807 (err=1.9e-015) 7. NAPACK (f2c): elapsed time t=1.912 s, 64 iters, t-(init.)=1.902 s t(norm)=0.51529, mflops=9.70327 (err=3.6e-013) 8. Singleton (f2c): elapsed time t=1.432 s, 256 iters, t-(init.)=1.361 s t(norm)=0.0921806, mflops=54.2414 (err=2.5e-015) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. 
Valkenburg: elapsed time t=1.202 s, 32 iters, t-(init.)=1.192 s t(norm)=0.645874, mflops=7.74145 (err=1.8e-015) Top mflops for N=4725 = 82.807 Normalized results and averages for N=4725: fft 0: mflops = 67.603 (norm. = 0.816392), norm. avg. (of 14) = 0.772521 fft 1: mflops = 76.7385 (norm. = 0.926715), norm. avg. (of 14) = 0.726699 fft 2: mflops = 24.7394 (norm. = 0.29876), norm. avg. (of 14) = 0.432731 fft 3: mflops = 62.4556 (norm. = 0.75423), norm. avg. (of 14) = 0.834125 fft 4: mflops = 65.7954 (norm. = 0.794563), norm. avg. (of 14) = 0.77209 fft 5: mflops = 18.997 (norm. = 0.229413), norm. avg. (of 14) = 0.348512 fft 6: mflops = 82.807 (norm. = 1), norm. avg. (of 14) = 0.962409 fft 7: mflops = 9.70327 (norm. = 0.117179), norm. avg. (of 14) = 0.157581 fft 8: mflops = 54.2414 (norm. = 0.655033), norm. avg. (of 14) = 0.674893 fft 9: mflops = -1 (norm. = -0.0120763), norm. avg. (of 10) = 0.600593 fft 10: mflops = 7.74145 (norm. = 0.0934878), norm. avg. (of 14) = 0.116826 Benchmarking for array size = 10368: 0. CWP (min N) (N=10920): elapsed time t=1.382 s, 128 iters, t-(init.)=1.302 s t(norm)=0.0735453, mflops=67.9853 1. CWP (best N) (N=11088): elapsed time t=1.272 s, 128 iters, t-(init.)=1.192 s t(norm)=0.0673318, mflops=74.2591 2. FFTPACK (f2c): elapsed time t=1.402 s, 64 iters, t-(init.)=1.372 s t(norm)=0.154999, mflops=32.2583 (err=3.1e-015) FFTW_MEASURE plan: (cost = 1.065625e-002) FFTW_TWIDDLE 3 FFTW_TWIDDLE 8 FFTW_TWIDDLE 9 FFTW_TWIDDLE 8 FFTW_NOTW 6 3. FFTW: elapsed time t=1.422 s, 128 iters, t-(init.)=1.352 s t(norm)=0.0763696, mflops=65.471 (err=3.1e-015) FFTW_ESTIMATE plan: (cost = 1.254528e+005) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 4. FFTW_ESTIMATE: elapsed time t=1.553 s, 128 iters, t-(init.)=1.483 s t(norm)=0.0837694, mflops=59.6877 (err=3.1e-015) 5. Frigo-old: elapsed time t=1.592 s, 64 iters, t-(init.)=1.552 s t(norm)=0.175334, mflops=28.517 (err=3.2e-015) 6. 
GSL: elapsed time t=1.092 s, 128 iters, t-(init.)=1.022 s t(norm)=0.0577291, mflops=86.6114 (err=3.0e-015) 7. NAPACK (f2c): elapsed time t=1.291 s, 32 iters, t-(init.)=1.271 s t(norm)=0.287177, mflops=17.4109 (err=8.2e-014) 8. Singleton (f2c): elapsed time t=1.753 s, 128 iters, t-(init.)=1.683 s t(norm)=0.0950667, mflops=52.5947 (err=4.4e-015) 9. Temperton (f2c): elapsed time t=1.602 s, 128 iters, t-(init.)=1.522 s t(norm)=0.0859723, mflops=58.1582 (err=3.0e-015) 10. Valkenburg: elapsed time t=1.282 s, 16 iters, t-(init.)=1.272 s t(norm)=0.574806, mflops=8.69859 (err=3.0e-015) Top mflops for N=10368 = 86.6114 Normalized results and averages for N=10368: fft 0: mflops = 67.9853 (norm. = 0.784946), norm. avg. (of 15) = 0.773349 fft 1: mflops = 74.2591 (norm. = 0.857383), norm. avg. (of 15) = 0.735412 fft 2: mflops = 32.2583 (norm. = 0.372449), norm. avg. (of 15) = 0.428712 fft 3: mflops = 65.471 (norm. = 0.755917), norm. avg. (of 15) = 0.828911 fft 4: mflops = 59.6877 (norm. = 0.689144), norm. avg. (of 15) = 0.76656 fft 5: mflops = 28.517 (norm. = 0.329253), norm. avg. (of 15) = 0.347228 fft 6: mflops = 86.6114 (norm. = 1), norm. avg. (of 15) = 0.964915 fft 7: mflops = 17.4109 (norm. = 0.201023), norm. avg. (of 15) = 0.160477 fft 8: mflops = 52.5947 (norm. = 0.607249), norm. avg. (of 15) = 0.670383 fft 9: mflops = 58.1582 (norm. = 0.671485), norm. avg. (of 11) = 0.607037 fft 10: mflops = 8.69859 (norm. = 0.100432), norm. avg. (of 15) = 0.115733 Benchmarking for array size = 27000: 0. CWP (min N) (N=27720): elapsed time t=1.853 s, 64 iters, t-(init.)=1.743 s t(norm)=0.0685214, mflops=72.9699 1. CWP (best N) (N=27720): elapsed time t=1.862 s, 64 iters, t-(init.)=1.752 s t(norm)=0.0688752, mflops=72.5951 2. FFTPACK (f2c): elapsed time t=1.362 s, 16 iters, t-(init.)=1.332 s t(norm)=0.209456, mflops=23.8714 (err=5.6e-015) FFTW_MEASURE plan: (cost = 4.000000e-002) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 3 3. 
FFTW: elapsed time t=1.302 s, 32 iters, t-(init.)=1.252 s t(norm)=0.098438, mflops=50.7934 (err=5.6e-015) FFTW_ESTIMATE plan: (cost = 1.231200e+006) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.312 s, 32 iters, t-(init.)=1.262 s t(norm)=0.0992243, mflops=50.3909 (err=5.7e-015) 5. Frigo-old: elapsed time t=1.812 s, 16 iters, t-(init.)=1.792 s t(norm)=0.281791, mflops=17.7437 (err=5.8e-015) 6. GSL: elapsed time t=1.532 s, 32 iters, t-(init.)=1.482 s t(norm)=0.116522, mflops=42.9105 (err=5.6e-015) 7. NAPACK (f2c): elapsed time t=1.883 s, 8 iters, t-(init.)=1.873 s t(norm)=0.589056, mflops=8.48816 (err=1.1e-012) 8. Singleton (f2c): elapsed time t=1.292 s, 32 iters, t-(init.)=1.242 s t(norm)=0.0976518, mflops=51.2023 (err=7.8e-015) 9. Temperton (f2c): elapsed time t=1.182 s, 32 iters, t-(init.)=1.132 s t(norm)=0.0890031, mflops=56.1778 (err=5.7e-015) 10. Valkenburg: elapsed time t=1.112 s, 4 iters, t-(init.)=1.102 s t(norm)=0.693155, mflops=7.2134 (err=5.6e-015) Top mflops for N=27000 = 72.9699 Normalized results and averages for N=27000: fft 0: mflops = 72.9699 (norm. = 1), norm. avg. (of 16) = 0.787515 fft 1: mflops = 72.5951 (norm. = 0.994863), norm. avg. (of 16) = 0.751627 fft 2: mflops = 23.8714 (norm. = 0.32714), norm. avg. (of 16) = 0.422364 fft 3: mflops = 50.7934 (norm. = 0.696086), norm. avg. (of 16) = 0.820609 fft 4: mflops = 50.3909 (norm. = 0.690571), norm. avg. (of 16) = 0.761811 fft 5: mflops = 17.7437 (norm. = 0.243164), norm. avg. (of 16) = 0.340724 fft 6: mflops = 42.9105 (norm. = 0.588057), norm. avg. (of 16) = 0.941361 fft 7: mflops = 8.48816 (norm. = 0.116324), norm. avg. (of 16) = 0.157717 fft 8: mflops = 51.2023 (norm. = 0.701691), norm. avg. (of 16) = 0.67234 fft 9: mflops = 56.1778 (norm. = 0.769876), norm. avg. (of 12) = 0.620607 fft 10: mflops = 7.2134 (norm. = 0.0988544), norm. avg. (of 16) = 0.114678 Benchmarking for array size = 75600: 0. 
CWP (min N) (N=80080): elapsed time t=1.132 s, 8 iters, t-(init.)=1.022 s t(norm)=0.10427, mflops=47.9523 1. CWP (best N) (N=80080): elapsed time t=1.121 s, 8 iters, t-(init.)=1.011 s t(norm)=0.103148, mflops=48.474 2. FFTPACK (f2c): elapsed time t=1.602 s, 4 iters, t-(init.)=1.542 s t(norm)=0.314647, mflops=15.8908 (err=1.1e-014) FFTW_MEASURE plan: (cost = 1.355000e-001) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 4 FFTW_TWIDDLE 10 FFTW_NOTW 6 3. FFTW: elapsed time t=1.081 s, 8 iters, t-(init.)=0.97 s t(norm)=0.098965, mflops=50.5229 (err=1.1e-014) FFTW_ESTIMATE plan: (cost = 2.971080e+006) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.061 s, 8 iters, t-(init.)=0.95 s t(norm)=0.0969245, mflops=51.5866 (err=1.1e-014) 5. Frigo-old: elapsed time t=1.553 s, 4 iters, t-(init.)=1.503 s t(norm)=0.306689, mflops=16.3031 (err=1.1e-014) 6. GSL: elapsed time t=1.693 s, 8 iters, t-(init.)=1.583 s t(norm)=0.161507, mflops=30.9585 (err=1.1e-014) 7. NAPACK (f2c): elapsed time t=1.703 s, 2 iters, t-(init.)=1.673 s t(norm)=0.682756, mflops=7.32326 (err=5.1e-012) 8. Singleton (f2c): elapsed time t=1.082 s, 4 iters, t-(init.)=1.032 s t(norm)=0.210581, mflops=23.7438 (err=1.5e-014) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1.973 s, 2 iters, t-(init.)=1.943 s t(norm)=0.792944, mflops=6.30562 (err=1.1e-014) Top mflops for N=75600 = 51.5866 Normalized results and averages for N=75600: fft 0: mflops = 47.9523 (norm. = 0.92955), norm. avg. (of 17) = 0.79587 fft 1: mflops = 48.474 (norm. = 0.939664), norm. avg. (of 17) = 0.762688 fft 2: mflops = 15.8908 (norm. = 0.308042), norm. avg. (of 17) = 0.415639 fft 3: mflops = 50.5229 (norm. = 0.979381), norm. avg. (of 17) = 0.829949 fft 4: mflops = 51.5866 (norm. = 1), norm. avg. (of 17) = 0.775822 fft 5: mflops = 16.3031 (norm. = 0.316035), norm. avg. (of 17) = 0.339272 fft 6: mflops = 30.9585 (norm. = 0.600126), norm. avg. 
(of 17) = 0.921289 fft 7: mflops = 7.32326 (norm. = 0.141961), norm. avg. (of 17) = 0.15679 fft 8: mflops = 23.7438 (norm. = 0.460271), norm. avg. (of 17) = 0.659865 fft 9: mflops = -1 (norm. = -0.0193849), norm. avg. (of 12) = 0.620607 fft 10: mflops = 6.30562 (norm. = 0.122234), norm. avg. (of 17) = 0.115122 Benchmarking for array size = 165375: 0. CWP (min N) (N=180180): elapsed time t=1.543 s, 4 iters, t-(init.)=1.403 s t(norm)=0.122347, mflops=40.8673 1. CWP (best N) (N=180180): elapsed time t=1.542 s, 4 iters, t-(init.)=1.391 s t(norm)=0.121301, mflops=41.2198 2. FFTPACK (f2c): elapsed time t=1.412 s, 1 iters, t-(init.)=1.382 s t(norm)=0.482064, mflops=10.3721 (err=2.7e-014) FFTW_MEASURE plan: (cost = 3.510000e-001) FFTW_TWIDDLE 5 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 3 3. FFTW: elapsed time t=1.351 s, 4 iters, t-(init.)=1.22 s t(norm)=0.106389, mflops=46.9974 (err=2.7e-014) FFTW_ESTIMATE plan: (cost = 8.367975e+006) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.352 s, 4 iters, t-(init.)=1.222 s t(norm)=0.106563, mflops=46.9204 (err=2.7e-014) 5. Frigo-old: elapsed time t=1.232 s, 1 iters, t-(init.)=1.202 s t(norm)=0.419277, mflops=11.9253 (err=2.7e-014) 6. GSL: elapsed time t=1.932 s, 4 iters, t-(init.)=1.801 s t(norm)=0.157055, mflops=31.8361 (err=2.7e-014) 7. NAPACK (f2c): elapsed time t=2.213 s, 1 iters, t-(init.)=2.193 s t(norm)=0.764954, mflops=6.53634 (err=1.6e-011) 8. Singleton (f2c): elapsed time t=1.392 s, 2 iters, t-(init.)=1.332 s t(norm)=0.232312, mflops=21.5228 (err=4.0e-014) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=2.334 s, 1 iters, t-(init.)=2.304 s t(norm)=0.803673, mflops=6.22144 (err=2.7e-014) Top mflops for N=165375 = 46.9974 Normalized results and averages for N=165375: fft 0: mflops = 40.8673 (norm. = 0.869565), norm. avg. 
(of 18) = 0.799964 fft 1: mflops = 41.2198 (norm. = 0.877067), norm. avg. (of 18) = 0.769043 fft 2: mflops = 10.3721 (norm. = 0.220695), norm. avg. (of 18) = 0.404809 fft 3: mflops = 46.9974 (norm. = 1), norm. avg. (of 18) = 0.839396 fft 4: mflops = 46.9204 (norm. = 0.998363), norm. avg. (of 18) = 0.788185 fft 5: mflops = 11.9253 (norm. = 0.253744), norm. avg. (of 18) = 0.33452 fft 6: mflops = 31.8361 (norm. = 0.677401), norm. avg. (of 18) = 0.907739 fft 7: mflops = 6.53634 (norm. = 0.139079), norm. avg. (of 18) = 0.155806 fft 8: mflops = 21.5228 (norm. = 0.457958), norm. avg. (of 18) = 0.648648 fft 9: mflops = -1 (norm. = -0.0212778), norm. avg. (of 12) = 0.620607 fft 10: mflops = 6.22144 (norm. = 0.132378), norm. avg. (of 18) = 0.116081 Benchmarking for array size = 362880: 0. CWP (min N) (N=720720): elapsed time t=1.622 s, 1 iters, t-(init.)=1.472 s t(norm)=0.219633, mflops=22.7652 1. CWP (best N) (N=720720): elapsed time t=1.613 s, 1 iters, t-(init.)=1.463 s t(norm)=0.21829, mflops=22.9053 2. FFTPACK (f2c): elapsed time t=2.273 s, 1 iters, t-(init.)=2.203 s t(norm)=0.328704, mflops=15.2113 (err=1.1e-013) FFTW_MEASURE plan: (cost = 7.410000e-001) FFTW_TWIDDLE 32 FFTW_TWIDDLE 9 FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 2 3. FFTW: elapsed time t=1.402 s, 2 iters, t-(init.)=1.262 s t(norm)=0.0941499, mflops=53.1068 (err=1.1e-013) FFTW_ESTIMATE plan: (cost = 7.511616e+006) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 4. FFTW_ESTIMATE: elapsed time t=1.622 s, 2 iters, t-(init.)=1.471 s t(norm)=0.109742, mflops=45.5614 (err=1.1e-013) 5. Frigo-old: elapsed time t=2.254 s, 1 iters, t-(init.)=2.184 s t(norm)=0.325869, mflops=15.3436 (err=1.1e-013) 6. GSL: elapsed time t=1.062 s, 1 iters, t-(init.)=0.992 s t(norm)=0.148014, mflops=33.7806 (err=1.1e-013) 7. NAPACK (f2c): elapsed time t=4.346 s, 1 iters, t-(init.)=4.266 s t(norm)=0.636519, mflops=7.85523 (err=3.4e-011) 8. 
Singleton (f2c): elapsed time t=1.963 s, 1 iters, t-(init.)=1.903 s t(norm)=0.283942, mflops=17.6092 (err=1.6e-013) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=5.898 s, 1 iters, t-(init.)=5.828 s t(norm)=0.869581, mflops=5.7499 (err=1.1e-013) Top mflops for N=362880 = 53.1068 Normalized results and averages for N=362880: fft 0: mflops = 22.7652 (norm. = 0.428668), norm. avg. (of 19) = 0.780422 fft 1: mflops = 22.9053 (norm. = 0.431306), norm. avg. (of 19) = 0.751267 fft 2: mflops = 15.2113 (norm. = 0.286428), norm. avg. (of 19) = 0.398578 fft 3: mflops = 53.1068 (norm. = 1), norm. avg. (of 19) = 0.847849 fft 4: mflops = 45.5614 (norm. = 0.85792), norm. avg. (of 19) = 0.791856 fft 5: mflops = 15.3436 (norm. = 0.288919), norm. avg. (of 19) = 0.33212 fft 6: mflops = 33.7806 (norm. = 0.636089), norm. avg. (of 19) = 0.893442 fft 7: mflops = 7.85523 (norm. = 0.147914), norm. avg. (of 19) = 0.155391 fft 8: mflops = 17.6092 (norm. = 0.331582), norm. avg. (of 19) = 0.63196 fft 9: mflops = -1 (norm. = -0.01883), norm. avg. (of 12) = 0.620607 fft 10: mflops = 5.7499 (norm. = 0.10827), norm. avg. (of 19) = 0.11567 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) 512x128x64 (64.0236 MB) 256x128x256 (128.012 MB) Maximum array size N = 8388608 Benchmarking FFTs: 0. FFTW 1. HARM (f2c) 2. PDA (f2c) 3. Singleton (f2c) 4. Temperton (f2c) Computing normalized averages (5 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.032 s, 65536 iters, t-(init.)=0.932 s t(norm)=0.0370344, mflops=135.01 (err=1.7e-016) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. 
PDA (f2c): elapsed time t=1.121 s, 8192 iters, t-(init.)=1.101 s t(norm)=0.349998, mflops=14.2858 (err=1.9e-016) 3. Singleton (f2c): elapsed time t=1.232 s, 65536 iters, t-(init.)=1.132 s t(norm)=0.0449816, mflops=111.156 (err=1.7e-016) 4. Temperton (f2c): elapsed time t=1.021 s, 32768 iters, t-(init.)=0.971 s t(norm)=0.0771681, mflops=64.7936 (err=1.7e-016) Top mflops for N=64 = 135.01 Normalized results and averages for N=64: fft 0: mflops = 135.01 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.00740687), norm. avg. (of 0) = -1 fft 2: mflops = 14.2858 (norm. = 0.105813), norm. avg. (of 1) = 0.105813 fft 3: mflops = 111.156 (norm. = 0.823322), norm. avg. (of 1) = 0.823322 fft 4: mflops = 64.7936 (norm. = 0.479918), norm. avg. (of 1) = 0.479918 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.242 s, 8192 iters, t-(init.)=1.142 s t(norm)=0.0302527, mflops=165.275 (err=3.1e-016) 1. HARM (f2c): elapsed time t=1.663 s, 8192 iters, t-(init.)=1.563 s t(norm)=0.0414054, mflops=120.757 (err=3.2e-016) 2. PDA (f2c): elapsed time t=1.101 s, 1024 iters, t-(init.)=1.091 s t(norm)=0.231213, mflops=21.6251 (err=2.8e-016) 3. Singleton (f2c): elapsed time t=1.953 s, 8192 iters, t-(init.)=1.853 s t(norm)=0.0490877, mflops=101.858 (err=3.1e-016) 4. Temperton (f2c): elapsed time t=1.653 s, 8192 iters, t-(init.)=1.553 s t(norm)=0.0411405, mflops=121.535 (err=3.1e-016) Top mflops for N=512 = 165.275 Normalized results and averages for N=512: fft 0: mflops = 165.275 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 120.757 (norm. = 0.730646), norm. avg. (of 1) = 0.730646 fft 2: mflops = 21.6251 (norm. = 0.130843), norm. avg. (of 2) = 0.118328 fft 3: mflops = 101.858 (norm. = 0.616298), norm. avg. (of 2) = 0.71981 fft 4: mflops = 121.535 (norm. = 0.735351), norm. avg. (of 2) = 0.607634 Benchmarking for array size = 16x16x16 (power of 2): 0. 
FFTW: elapsed time t=1.162 s, 256 iters, t-(init.)=1.102 s t(norm)=0.0875791, mflops=57.0913 (err=3.8e-016) 1. HARM (f2c): elapsed time t=1.232 s, 512 iters, t-(init.)=1.112 s t(norm)=0.0441869, mflops=113.156 (err=3.7e-016) 2. PDA (f2c): elapsed time t=1.162 s, 128 iters, t-(init.)=1.132 s t(norm)=0.179927, mflops=27.7891 (err=4.0e-016) 3. Singleton (f2c): elapsed time t=1.993 s, 512 iters, t-(init.)=1.873 s t(norm)=0.0744263, mflops=67.1805 (err=3.8e-016) 4. Temperton (f2c): elapsed time t=1.502 s, 512 iters, t-(init.)=1.382 s t(norm)=0.0549157, mflops=91.0486 (err=4.1e-016) Top mflops for N=4096 = 113.156 Normalized results and averages for N=4096: fft 0: mflops = 57.0913 (norm. = 0.504537), norm. avg. (of 3) = 0.834846 fft 1: mflops = 113.156 (norm. = 1), norm. avg. (of 2) = 0.865323 fft 2: mflops = 27.7891 (norm. = 0.245583), norm. avg. (of 3) = 0.160746 fft 3: mflops = 67.1805 (norm. = 0.5937), norm. avg. (of 3) = 0.677773 fft 4: mflops = 91.0486 (norm. = 0.804631), norm. avg. (of 3) = 0.6733 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.311 s, 32 iters, t-(init.)=1.241 s t(norm)=0.0789007, mflops=63.3708 (err=4.7e-016) 1. HARM (f2c): elapsed time t=1.943 s, 64 iters, t-(init.)=1.813 s t(norm)=0.0576337, mflops=86.7548 (err=5.3e-016) 2. PDA (f2c): elapsed time t=1.612 s, 16 iters, t-(init.)=1.582 s t(norm)=0.201162, mflops=24.8556 (err=4.1e-016) 3. Singleton (f2c): elapsed time t=1.592 s, 32 iters, t-(init.)=1.522 s t(norm)=0.0967662, mflops=51.671 (err=4.9e-016) 4. Temperton (f2c): elapsed time t=1.152 s, 32 iters, t-(init.)=1.092 s t(norm)=0.0694275, mflops=72.0176 (err=4.6e-016) Top mflops for N=32768 = 86.7548 Normalized results and averages for N=32768: fft 0: mflops = 63.3708 (norm. = 0.730459), norm. avg. (of 4) = 0.808749 fft 1: mflops = 86.7548 (norm. = 1), norm. avg. (of 3) = 0.910215 fft 2: mflops = 24.8556 (norm. = 0.286504), norm. avg. (of 4) = 0.192186 fft 3: mflops = 51.671 (norm. = 0.595598), norm. avg. 
(of 4) = 0.657229 fft 4: mflops = 72.0176 (norm. = 0.830128), norm. avg. (of 4) = 0.712507 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.242 s, 2 iters, t-(init.)=1.132 s t(norm)=0.119951, mflops=41.6837 (err=1.3e-015) 1. HARM (f2c): elapsed time t=1.241 s, 2 iters, t-(init.)=1.13 s t(norm)=0.119739, mflops=41.7575 (err=1.2e-015) 2. PDA (f2c): elapsed time t=1.202 s, 1 iters, t-(init.)=1.152 s t(norm)=0.244141, mflops=20.48 (err=1.2e-015) 3. Singleton (f2c): elapsed time t=1.182 s, 1 iters, t-(init.)=1.132 s t(norm)=0.239902, mflops=20.8418 (err=1.7e-015) 4. Temperton (f2c): elapsed time t=1.302 s, 2 iters, t-(init.)=1.192 s t(norm)=0.126309, mflops=39.5855 (err=1.3e-015) Top mflops for N=262144 = 41.7575 Normalized results and averages for N=262144: fft 0: mflops = 41.6837 (norm. = 0.998233), norm. avg. (of 5) = 0.846646 fft 1: mflops = 41.7575 (norm. = 1), norm. avg. (of 4) = 0.932662 fft 2: mflops = 20.48 (norm. = 0.490451), norm. avg. (of 5) = 0.251839 fft 3: mflops = 20.8418 (norm. = 0.499117), norm. avg. (of 5) = 0.625607 fft 4: mflops = 39.5855 (norm. = 0.947987), norm. avg. (of 5) = 0.759603 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.241 s, 1 iters, t-(init.)=1.14 s t(norm)=0.114441, mflops=43.6907 (err=1.2e-015) 1. HARM (f2c): elapsed time t=1.382 s, 1 iters, t-(init.)=1.282 s t(norm)=0.128696, mflops=38.8513 (err=1.2e-015) 2. PDA (f2c): elapsed time t=2.463 s, 1 iters, t-(init.)=2.363 s t(norm)=0.237214, mflops=21.078 (err=1.2e-015) 3. Singleton (f2c): elapsed time t=2.644 s, 1 iters, t-(init.)=2.544 s t(norm)=0.255384, mflops=19.5784 (err=1.8e-015) 4. Temperton (f2c): elapsed time t=1.332 s, 1 iters, t-(init.)=1.222 s t(norm)=0.122673, mflops=40.7589 (err=1.2e-015) Top mflops for N=524288 = 43.6907 Normalized results and averages for N=524288: fft 0: mflops = 43.6907 (norm. = 1), norm. avg. (of 6) = 0.872205 fft 1: mflops = 38.8513 (norm. = 0.889236), norm. avg. 
(of 5) = 0.923976 fft 2: mflops = 21.078 (norm. = 0.482438), norm. avg. (of 6) = 0.290272 fft 3: mflops = 19.5784 (norm. = 0.448113), norm. avg. (of 6) = 0.596025 fft 4: mflops = 40.7589 (norm. = 0.932897), norm. avg. (of 6) = 0.788485 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=2.574 s, 1 iters, t-(init.)=2.364 s t(norm)=0.112724, mflops=44.356 (err=2.0e-015) 1. HARM (f2c): elapsed time t=2.784 s, 1 iters, t-(init.)=2.573 s t(norm)=0.12269, mflops=40.7531 (err=2.0e-015) 2. PDA (f2c): elapsed time t=5.688 s, 1 iters, t-(init.)=5.477 s t(norm)=0.261164, mflops=19.1451 (err=2.0e-015) 3. Singleton (f2c): elapsed time t=5.017 s, 1 iters, t-(init.)=4.807 s t(norm)=0.229216, mflops=21.8135 (err=2.8e-015) 4. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 44.356 Normalized results and averages for N=1048576: fft 0: mflops = 44.356 (norm. = 1), norm. avg. (of 7) = 0.890461 fft 1: mflops = 40.7531 (norm. = 0.918772), norm. avg. (of 6) = 0.923109 fft 2: mflops = 19.1451 (norm. = 0.431623), norm. avg. (of 7) = 0.310465 fft 3: mflops = 21.8135 (norm. = 0.491783), norm. avg. (of 7) = 0.581133 fft 4: mflops = -1 (norm. = -0.0225449), norm. avg. (of 6) = 0.788485 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=5.247 s, 1 iters, t-(init.)=4.826 s t(norm)=0.109582, mflops=45.628 (err=7.8e-016) 1. HARM (f2c): elapsed time t=6.389 s, 1 iters, t-(init.)=5.968 s t(norm)=0.135513, mflops=36.8969 (err=7.3e-016) 2. PDA (f2c): elapsed time t=10.144 s, 1 iters, t-(init.)=9.733 s t(norm)=0.221003, mflops=22.6242 (err=7.0e-016) 3. Singleton (f2c): elapsed time t=15.031 s, 1 iters, t-(init.)=14.61 s t(norm)=0.331742, mflops=15.0719 (err=8.9e-016) 4. Temperton (f2c): elapsed time t=7.882 s, 1 iters, t-(init.)=7.462 s t(norm)=0.169436, mflops=29.5096 (err=7.7e-016) Top mflops for N=2097152 = 45.628 Normalized results and averages for N=2097152: fft 0: mflops = 45.628 (norm. = 1), norm. avg. 
(of 8) = 0.904154 fft 1: mflops = 36.8969 (norm. = 0.808646), norm. avg. (of 7) = 0.906757 fft 2: mflops = 22.6242 (norm. = 0.495839), norm. avg. (of 8) = 0.333637 fft 3: mflops = 15.0719 (norm. = 0.330322), norm. avg. (of 8) = 0.549781 fft 4: mflops = 29.5096 (norm. = 0.646744), norm. avg. (of 7) = 0.768236 Benchmarking for array size = 512x128x64 (power of 2): 0. FFTW: elapsed time t=10.986 s, 1 iters, t-(init.)=10.145 s t(norm)=0.109943, mflops=45.4779 (err=1.4e-015) 1. HARM (f2c): elapsed time t=12.928 s, 1 iters, t-(init.)=12.086 s t(norm)=0.130978, mflops=38.1742 (err=1.2e-015) 2. PDA (f2c): elapsed time t=21.612 s, 1 iters, t-(init.)=20.771 s t(norm)=0.2251, mflops=22.2124 (err=1.3e-015) 3. Singleton (f2c): elapsed time t=26.978 s, 1 iters, t-(init.)=26.136 s t(norm)=0.283241, mflops=17.6528 (err=1.6e-015) 4. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=4194304 = 45.4779 Normalized results and averages for N=4194304: fft 0: mflops = 45.4779 (norm. = 1), norm. avg. (of 9) = 0.914803 fft 1: mflops = 38.1742 (norm. = 0.839401), norm. avg. (of 8) = 0.898338 fft 2: mflops = 22.2124 (norm. = 0.488421), norm. avg. (of 9) = 0.350835 fft 3: mflops = 17.6528 (norm. = 0.388162), norm. avg. (of 9) = 0.531824 fft 4: mflops = -1 (norm. = -0.0219887), norm. avg. (of 7) = 0.768236 Benchmarking for array size = 256x128x256 (power of 2): 0. FFTW: elapsed time t=22.252 s, 1 iters, t-(init.)=20.58 s t(norm)=0.106666, mflops=46.8751 (err=1.5e-015) 1. HARM (f2c): elapsed time t=25.727 s, 1 iters, t-(init.)=24.054 s t(norm)=0.124672, mflops=40.1052 (err=1.4e-015) 2. PDA (f2c): elapsed time t=41.87 s, 1 iters, t-(init.)=40.187 s t(norm)=0.20829, mflops=24.005 (err=1.4e-015) 3. Singleton (f2c): elapsed time t=53.838 s, 1 iters, t-(init.)=52.156 s t(norm)=0.270325, mflops=18.4962 (err=2.1e-015) 4. 
Temperton (f2c): elapsed time t=30.734 s, 1 iters, t-(init.)=29.052 s t(norm)=0.150577, mflops=33.2056 (err=1.5e-015) Top mflops for N=8388608 = 46.8751 Normalized results and averages for N=8388608: fft 0: mflops = 46.8751 (norm. = 1), norm. avg. (of 10) = 0.923323 fft 1: mflops = 40.1052 (norm. = 0.855575), norm. avg. (of 9) = 0.893586 fft 2: mflops = 24.005 (norm. = 0.512106), norm. avg. (of 10) = 0.366962 fft 3: mflops = 18.4962 (norm. = 0.394585), norm. avg. (of 10) = 0.5181 fft 4: mflops = 33.2056 (norm. = 0.708385), norm. avg. (of 8) = 0.760755 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) 120x120x120 (26.3728 MB) 144x144x144 (45.5692 MB) 180x180x180 (88.9976 MB) 240x240x240 (210.949 MB) Maximum array size N = 13824000 Benchmarking FFTs: 0. FFTW 1. PDA (f2c) 2. Singleton (f2c) 3. Temperton (f2c) Computing normalized averages (4 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1.372 s, 32768 iters, t-(init.)=1.271 s t(norm)=0.0445467, mflops=112.242 (err=2.3e-016) 1. PDA (f2c): elapsed time t=1.102 s, 4096 iters, t-(init.)=1.092 s t(norm)=0.306184, mflops=16.33 (err=2.6e-016) 2. Singleton (f2c): elapsed time t=1.162 s, 32768 iters, t-(init.)=1.062 s t(norm)=0.0372216, mflops=134.331 (err=2.9e-016) 3. 
Temperton (f2c): elapsed time t=1.081 s, 16384 iters, t-(init.)=1.031 s t(norm)=0.0722701, mflops=69.1849 (err=2.1e-016) Top mflops for N=125 = 134.331 Normalized results and averages for N=125: fft 0: mflops = 112.242 (norm. = 0.835563), norm. avg. (of 1) = 0.835563 fft 1: mflops = 16.33 (norm. = 0.121566), norm. avg. (of 1) = 0.121566 fft 2: mflops = 134.331 (norm. = 1), norm. avg. (of 1) = 1 fft 3: mflops = 69.1849 (norm. = 0.515034), norm. avg. (of 1) = 0.515034 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.022 s, 16384 iters, t-(init.)=0.942 s t(norm)=0.0343243, mflops=145.669 (err=2.5e-016) 1. PDA (f2c): elapsed time t=1.002 s, 2048 iters, t-(init.)=0.992 s t(norm)=0.289169, mflops=17.2909 (err=2.9e-016) 2. Singleton (f2c): elapsed time t=1.773 s, 16384 iters, t-(init.)=1.693 s t(norm)=0.061689, mflops=81.0517 (err=2.6e-016) 3. Temperton (f2c): elapsed time t=1.753 s, 16384 iters, t-(init.)=1.673 s t(norm)=0.0609603, mflops=82.0207 (err=2.5e-016) Top mflops for N=216 = 145.669 Normalized results and averages for N=216: fft 0: mflops = 145.669 (norm. = 1), norm. avg. (of 2) = 0.917781 fft 1: mflops = 17.2909 (norm. = 0.1187), norm. avg. (of 2) = 0.120133 fft 2: mflops = 81.0517 (norm. = 0.556409), norm. avg. (of 2) = 0.778204 fft 3: mflops = 82.0207 (norm. = 0.56306), norm. avg. (of 2) = 0.539047 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.101 s, 8192 iters, t-(init.)=1.031 s t(norm)=0.0435668, mflops=114.766 (err=4.1e-016) 1. PDA (f2c): elapsed time t=1.483 s, 1024 iters, t-(init.)=1.473 s t(norm)=0.497955, mflops=10.0411 (err=4.1e-016) 2. Singleton (f2c): elapsed time t=1.492 s, 8192 iters, t-(init.)=1.422 s t(norm)=0.0600893, mflops=83.2095 (err=6.4e-016) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 114.766 Normalized results and averages for N=343: fft 0: mflops = 114.766 (norm. = 1), norm. avg. (of 3) = 0.945188 fft 1: mflops = 10.0411 (norm. = 0.0874915), norm. avg. 
(of 3) = 0.109252 fft 2: mflops = 83.2095 (norm. = 0.725035), norm. avg. (of 3) = 0.760481 fft 3: mflops = -1 (norm. = -0.00871337), norm. avg. (of 2) = 0.539047 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.172 s, 4096 iters, t-(init.)=1.102 s t(norm)=0.0388082, mflops=128.839 (err=4.6e-016) 1. PDA (f2c): elapsed time t=1.633 s, 1024 iters, t-(init.)=1.613 s t(norm)=0.227215, mflops=22.0056 (err=4.4e-016) 2. Singleton (f2c): elapsed time t=1.362 s, 4096 iters, t-(init.)=1.292 s t(norm)=0.0454993, mflops=109.892 (err=4.1e-016) 3. Temperton (f2c): elapsed time t=1.492 s, 4096 iters, t-(init.)=1.422 s t(norm)=0.0500774, mflops=99.8454 (err=4.2e-016) Top mflops for N=729 = 128.839 Normalized results and averages for N=729: fft 0: mflops = 128.839 (norm. = 1), norm. avg. (of 4) = 0.958891 fft 1: mflops = 22.0056 (norm. = 0.1708), norm. avg. (of 4) = 0.124639 fft 2: mflops = 109.892 (norm. = 0.852941), norm. avg. (of 4) = 0.783596 fft 3: mflops = 99.8454 (norm. = 0.774965), norm. avg. (of 3) = 0.617686 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.542 s, 4096 iters, t-(init.)=1.442 s t(norm)=0.0353259, mflops=141.539 (err=3.5e-016) 1. PDA (f2c): elapsed time t=1.081 s, 512 iters, t-(init.)=1.061 s t(norm)=0.207938, mflops=24.0456 (err=3.9e-016) 2. Singleton (f2c): elapsed time t=1.212 s, 2048 iters, t-(init.)=1.162 s t(norm)=0.0569331, mflops=87.8224 (err=4.6e-016) 3. Temperton (f2c): elapsed time t=1.271 s, 2048 iters, t-(init.)=1.22 s t(norm)=0.0597748, mflops=83.6472 (err=3.6e-016) Top mflops for N=1000 = 141.539 Normalized results and averages for N=1000: fft 0: mflops = 141.539 (norm. = 1), norm. avg. (of 5) = 0.967113 fft 1: mflops = 24.0456 (norm. = 0.169887), norm. avg. (of 5) = 0.133689 fft 2: mflops = 87.8224 (norm. = 0.620482), norm. avg. (of 5) = 0.750973 fft 3: mflops = 83.6472 (norm. = 0.590984), norm. avg. (of 4) = 0.611011 Benchmarking for array size = 11x11x11: 0. 
FFTW: elapsed time t=1.582 s, 2048 iters, t-(init.)=1.422 s t(norm)=0.050265, mflops=99.4728 (err=4.0e-016) 1. PDA (f2c): elapsed time t=1.833 s, 256 iters, t-(init.)=1.813 s t(norm)=0.512689, mflops=9.75251 (err=4.9e-016) 2. Singleton (f2c): elapsed time t=1.192 s, 1024 iters, t-(init.)=1.112 s t(norm)=0.0786142, mflops=63.6018 (err=4.6e-016) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 99.4728 Normalized results and averages for N=1331: fft 0: mflops = 99.4728 (norm. = 1), norm. avg. (of 6) = 0.972594 fft 1: mflops = 9.75251 (norm. = 0.0980419), norm. avg. (of 6) = 0.127748 fft 2: mflops = 63.6018 (norm. = 0.639388), norm. avg. (of 6) = 0.732376 fft 3: mflops = -1 (norm. = -0.010053), norm. avg. (of 4) = 0.611011 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.452 s, 2048 iters, t-(init.)=1.252 s t(norm)=0.0328946, mflops=152.001 (err=3.8e-016) 1. PDA (f2c): elapsed time t=1.913 s, 512 iters, t-(init.)=1.863 s t(norm)=0.195791, mflops=25.5374 (err=3.7e-016) 2. Singleton (f2c): elapsed time t=1.542 s, 1024 iters, t-(init.)=1.441 s t(norm)=0.0757207, mflops=66.0322 (err=4.0e-016) 3. Temperton (f2c): elapsed time t=1.052 s, 1024 iters, t-(init.)=0.952 s t(norm)=0.050025, mflops=99.95 (err=3.9e-016) Top mflops for N=1728 = 152.001 Normalized results and averages for N=1728: fft 0: mflops = 152.001 (norm. = 1), norm. avg. (of 7) = 0.976509 fft 1: mflops = 25.5374 (norm. = 0.168009), norm. avg. (of 7) = 0.133499 fft 2: mflops = 66.0322 (norm. = 0.434421), norm. avg. (of 7) = 0.689811 fft 3: mflops = 99.95 (norm. = 0.657563), norm. avg. (of 5) = 0.620321 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.483 s, 1024 iters, t-(init.)=1.353 s t(norm)=0.0541743, mflops=92.2947 (err=4.7e-016) 1. PDA (f2c): elapsed time t=1.693 s, 128 iters, t-(init.)=1.673 s t(norm)=0.535897, mflops=9.33015 (err=8.8e-016) 2. 
Singleton (f2c): elapsed time t=1.192 s, 512 iters, t-(init.)=1.122 s t(norm)=0.08985, mflops=55.6483 (err=8.5e-016) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 92.2947 Normalized results and averages for N=2197: fft 0: mflops = 92.2947 (norm. = 1), norm. avg. (of 8) = 0.979445 fft 1: mflops = 9.33015 (norm. = 0.101091), norm. avg. (of 8) = 0.129448 fft 2: mflops = 55.6483 (norm. = 0.602941), norm. avg. (of 8) = 0.678952 fft 3: mflops = -1 (norm. = -0.0108349), norm. avg. (of 5) = 0.620321 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.562 s, 1024 iters, t-(init.)=1.402 s t(norm)=0.0436837, mflops=114.459 (err=4.1e-016) 1. PDA (f2c): elapsed time t=1.462 s, 128 iters, t-(init.)=1.441 s t(norm)=0.359191, mflops=13.9202 (err=4.7e-016) 2. Singleton (f2c): elapsed time t=1.512 s, 512 iters, t-(init.)=1.432 s t(norm)=0.0892369, mflops=56.0307 (err=5.7e-016) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 114.459 Normalized results and averages for N=2744: fft 0: mflops = 114.459 (norm. = 1), norm. avg. (of 9) = 0.981729 fft 1: mflops = 13.9202 (norm. = 0.121617), norm. avg. (of 9) = 0.128578 fft 2: mflops = 56.0307 (norm. = 0.489525), norm. avg. (of 9) = 0.657905 fft 3: mflops = -1 (norm. = -0.00873674), norm. avg. (of 5) = 0.620321 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.843 s, 1024 iters, t-(init.)=1.653 s t(norm)=0.0408081, mflops=122.525 (err=5.2e-016) 1. PDA (f2c): elapsed time t=1.993 s, 256 iters, t-(init.)=1.943 s t(norm)=0.19187, mflops=26.0593 (err=5.4e-016) 2. Singleton (f2c): elapsed time t=1.643 s, 512 iters, t-(init.)=1.543 s t(norm)=0.076185, mflops=65.6297 (err=6.6e-016) 3. Temperton (f2c): elapsed time t=1.322 s, 512 iters, t-(init.)=1.222 s t(norm)=0.0603358, mflops=82.8696 (err=5.2e-016) Top mflops for N=3375 = 122.525 Normalized results and averages for N=3375: fft 0: mflops = 122.525 (norm. = 1), norm. avg. 
(of 10) = 0.983556 fft 1: mflops = 26.0593 (norm. = 0.212687), norm. avg. (of 10) = 0.136989 fft 2: mflops = 65.6297 (norm. = 0.535645), norm. avg. (of 10) = 0.645679 fft 3: mflops = 82.8696 (norm. = 0.67635), norm. avg. (of 6) = 0.629659 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.092 s, 64 iters, t-(init.)=1.032 s t(norm)=0.068382, mflops=73.1187 (err=4.4e-016) 1. PDA (f2c): elapsed time t=1.652 s, 32 iters, t-(init.)=1.622 s t(norm)=0.214953, mflops=23.2609 (err=4.6e-016) 2. Singleton (f2c): elapsed time t=1.422 s, 64 iters, t-(init.)=1.362 s t(norm)=0.0902483, mflops=55.4027 (err=5.5e-016) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 73.1187 Normalized results and averages for N=16800: fft 0: mflops = 73.1187 (norm. = 1), norm. avg. (of 11) = 0.985051 fft 1: mflops = 23.2609 (norm. = 0.318126), norm. avg. (of 11) = 0.153456 fft 2: mflops = 55.4027 (norm. = 0.757709), norm. avg. (of 11) = 0.655863 fft 3: mflops = -1 (norm. = -0.0136764), norm. avg. (of 6) = 0.629659 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.733 s, 8 iters, t-(init.)=1.563 s t(norm)=0.10544, mflops=47.4205 (err=7.5e-016) 1. PDA (f2c): elapsed time t=1.622 s, 4 iters, t-(init.)=1.542 s t(norm)=0.208046, mflops=24.0332 (err=7.1e-016) 2. Singleton (f2c): elapsed time t=1.873 s, 4 iters, t-(init.)=1.773 s t(norm)=0.239212, mflops=20.9019 (err=6.7e-016) 3. Temperton (f2c): elapsed time t=1.753 s, 8 iters, t-(init.)=1.583 s t(norm)=0.106789, mflops=46.8214 (err=8.0e-016) Top mflops for N=110592 = 47.4205 Normalized results and averages for N=110592: fft 0: mflops = 47.4205 (norm. = 1), norm. avg. (of 12) = 0.986297 fft 1: mflops = 24.0332 (norm. = 0.506809), norm. avg. (of 12) = 0.182902 fft 2: mflops = 20.9019 (norm. = 0.440778), norm. avg. (of 12) = 0.63794 fft 3: mflops = 46.8214 (norm. = 0.987366), norm. avg. (of 7) = 0.68076 Benchmarking for array size = 49x49x49: 0. 
FFTW: elapsed time t=1.462 s, 8 iters, t-(init.)=1.272 s t(norm)=0.0802343, mflops=62.3175 (err=6.7e-016) 1. PDA (f2c): elapsed time t=1.742 s, 2 iters, t-(init.)=1.702 s t(norm)=0.42943, mflops=11.6433 (err=7.0e-016) 2. Singleton (f2c): elapsed time t=1.513 s, 4 iters, t-(init.)=1.423 s t(norm)=0.179518, mflops=27.8524 (err=9.3e-016) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 62.3175 Normalized results and averages for N=117649: fft 0: mflops = 62.3175 (norm. = 1), norm. avg. (of 13) = 0.987351 fft 1: mflops = 11.6433 (norm. = 0.186839), norm. avg. (of 13) = 0.183205 fft 2: mflops = 27.8524 (norm. = 0.446943), norm. avg. (of 13) = 0.623248 fft 3: mflops = -1 (norm. = -0.0160469), norm. avg. (of 7) = 0.68076 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.673 s, 4 iters, t-(init.)=1.503 s t(norm)=0.0981669, mflops=50.9337 (err=7.5e-016) 1. PDA (f2c): elapsed time t=1.632 s, 2 iters, t-(init.)=1.542 s t(norm)=0.201428, mflops=24.8227 (err=7.4e-016) 2. Singleton (f2c): elapsed time t=1.161 s, 1 iters, t-(init.)=1.121 s t(norm)=0.292868, mflops=17.0725 (err=1.0e-015) 3. Temperton (f2c): elapsed time t=1.873 s, 4 iters, t-(init.)=1.703 s t(norm)=0.11123, mflops=44.952 (err=7.4e-016) Top mflops for N=216000 = 50.9337 Normalized results and averages for N=216000: fft 0: mflops = 50.9337 (norm. = 1), norm. avg. (of 14) = 0.988254 fft 1: mflops = 24.8227 (norm. = 0.487354), norm. avg. (of 14) = 0.20493 fft 2: mflops = 17.0725 (norm. = 0.335192), norm. avg. (of 14) = 0.602672 fft 3: mflops = 44.952 (norm. = 0.88256), norm. avg. (of 8) = 0.705985 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.853 s, 4 iters, t-(init.)=1.663 s t(norm)=0.096093, mflops=52.0329 (err=7.4e-016) 1. PDA (f2c): elapsed time t=1.072 s, 1 iters, t-(init.)=1.022 s t(norm)=0.236217, mflops=21.167 (err=7.8e-016) 2. 
Singleton (f2c): elapsed time t=1.382 s, 1 iters, t-(init.)=1.331 s t(norm)=0.307636, mflops=16.253 (err=9.5e-016) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 52.0329 Normalized results and averages for N=241920: fft 0: mflops = 52.0329 (norm. = 1), norm. avg. (of 15) = 0.989038 fft 1: mflops = 21.167 (norm. = 0.4068), norm. avg. (of 15) = 0.218388 fft 2: mflops = 16.253 (norm. = 0.312359), norm. avg. (of 15) = 0.583318 fft 3: mflops = -1 (norm. = -0.0192186), norm. avg. (of 8) = 0.705985 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.522 s, 2 iters, t-(init.)=1.352 s t(norm)=0.0857504, mflops=58.3088 (err=7.9e-016) 1. PDA (f2c): elapsed time t=1.632 s, 1 iters, t-(init.)=1.552 s t(norm)=0.196871, mflops=25.3974 (err=7.4e-016) 2. Singleton (f2c): elapsed time t=1.862 s, 1 iters, t-(init.)=1.771 s t(norm)=0.224651, mflops=22.2568 (err=9.1e-016) 3. Temperton (f2c): elapsed time t=1.753 s, 2 iters, t-(init.)=1.593 s t(norm)=0.101036, mflops=49.4874 (err=9.0e-016) Top mflops for N=421875 = 58.3088 Normalized results and averages for N=421875: fft 0: mflops = 58.3088 (norm. = 1), norm. avg. (of 16) = 0.989723 fft 1: mflops = 25.3974 (norm. = 0.435567), norm. avg. (of 16) = 0.231962 fft 2: mflops = 22.2568 (norm. = 0.381705), norm. avg. (of 16) = 0.570717 fft 3: mflops = 49.4874 (norm. = 0.848713), norm. avg. (of 9) = 0.721844 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.081 s, 1 iters, t-(init.)=0.98 s t(norm)=0.100922, mflops=49.5433 (err=7.4e-016) 1. PDA (f2c): elapsed time t=1.973 s, 1 iters, t-(init.)=1.873 s t(norm)=0.192884, mflops=25.9223 (err=7.1e-016) 2. Singleton (f2c): elapsed time t=2.433 s, 1 iters, t-(init.)=2.332 s t(norm)=0.240153, mflops=20.8201 (err=9.7e-016) 3. 
Temperton (f2c): elapsed time t=1.221 s, 1 iters, t-(init.)=1.111 s t(norm)=0.114412, mflops=43.7015 (err=7.7e-016) Top mflops for N=512000 = 49.5433 Normalized results and averages for N=512000: fft 0: mflops = 49.5433 (norm. = 1), norm. avg. (of 17) = 0.990327 fft 1: mflops = 25.9223 (norm. = 0.523225), norm. avg. (of 17) = 0.249095 fft 2: mflops = 20.8201 (norm. = 0.42024), norm. avg. (of 17) = 0.561866 fft 3: mflops = 43.7015 (norm. = 0.882088), norm. avg. (of 10) = 0.737868 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.122 s, 1 iters, t-(init.)=1.002 s t(norm)=0.0881557, mflops=56.7178 (err=7.8e-016) 1. PDA (f2c): elapsed time t=3.345 s, 1 iters, t-(init.)=3.225 s t(norm)=0.283735, mflops=17.6221 (err=7.2e-016) 2. Singleton (f2c): elapsed time t=3.575 s, 1 iters, t-(init.)=3.455 s t(norm)=0.30397, mflops=16.449 (err=9.6e-016) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 56.7178 Normalized results and averages for N=592704: fft 0: mflops = 56.7178 (norm. = 1), norm. avg. (of 18) = 0.990865 fft 1: mflops = 17.6221 (norm. = 0.310698), norm. avg. (of 18) = 0.252517 fft 2: mflops = 16.449 (norm. = 0.290014), norm. avg. (of 18) = 0.546763 fft 3: mflops = -1 (norm. = -0.0176311), norm. avg. (of 10) = 0.737868 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=1.873 s, 1 iters, t-(init.)=1.703 s t(norm)=0.0974376, mflops=51.3149 (err=7.4e-016) 1. PDA (f2c): elapsed time t=4.206 s, 1 iters, t-(init.)=4.036 s t(norm)=0.230921, mflops=21.6525 (err=6.4e-016) 2. Singleton (f2c): elapsed time t=5.588 s, 1 iters, t-(init.)=5.418 s t(norm)=0.309992, mflops=16.1294 (err=7.0e-016) 3. Temperton (f2c): elapsed time t=2.683 s, 1 iters, t-(init.)=2.512 s t(norm)=0.143725, mflops=34.7887 (err=7.5e-016) Top mflops for N=884736 = 51.3149 Normalized results and averages for N=884736: fft 0: mflops = 51.3149 (norm. = 1), norm. avg. (of 19) = 0.991345 fft 1: mflops = 21.6525 (norm. = 0.421952), norm. avg. 
(of 19) = 0.261435 fft 2: mflops = 16.1294 (norm. = 0.314323), norm. avg. (of 19) = 0.534529 fft 3: mflops = 34.7887 (norm. = 0.677946), norm. avg. (of 11) = 0.732421 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=2.253 s, 1 iters, t-(init.)=2.023 s t(norm)=0.086758, mflops=57.6316 (err=7.6e-016) 1. PDA (f2c): elapsed time t=6.64 s, 1 iters, t-(init.)=6.41 s t(norm)=0.274898, mflops=18.1886 (err=7.4e-016) 2. Singleton (f2c): elapsed time t=5.628 s, 1 iters, t-(init.)=5.397 s t(norm)=0.231455, mflops=21.6025 (err=8.2e-016) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 57.6316 Normalized results and averages for N=1157625: fft 0: mflops = 57.6316 (norm. = 1), norm. avg. (of 20) = 0.991778 fft 1: mflops = 18.1886 (norm. = 0.315601), norm. avg. (of 20) = 0.264143 fft 2: mflops = 21.6025 (norm. = 0.374838), norm. avg. (of 20) = 0.526544 fft 3: mflops = -1 (norm. = -0.0173516), norm. avg. (of 11) = 0.732421 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=3.155 s, 1 iters, t-(init.)=2.875 s t(norm)=0.100204, mflops=49.8983 (err=5.2e-016) 1. PDA (f2c): elapsed time t=8.092 s, 1 iters, t-(init.)=7.812 s t(norm)=0.272275, mflops=18.3638 (err=5.9e-016) 2. Singleton (f2c): elapsed time t=6.98 s, 1 iters, t-(init.)=6.7 s t(norm)=0.233518, mflops=21.4116 (err=7.0e-016) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 49.8983 Normalized results and averages for N=1404928: fft 0: mflops = 49.8983 (norm. = 1), norm. avg. (of 21) = 0.99217 fft 1: mflops = 18.3638 (norm. = 0.368024), norm. avg. (of 21) = 0.26909 fft 2: mflops = 21.4116 (norm. = 0.429104), norm. avg. (of 21) = 0.521904 fft 3: mflops = -1 (norm. = -0.0200408), norm. avg. (of 11) = 0.732421 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=3.766 s, 1 iters, t-(init.)=3.426 s t(norm)=0.0956841, mflops=52.2553 (err=7.4e-016) 1. 
PDA (f2c): elapsed time t=7.301 s, 1 iters, t-(init.)=6.951 s t(norm)=0.194133, mflops=25.7555 (err=7.9e-016) 2. Singleton (f2c): elapsed time t=13.359 s, 1 iters, t-(init.)=13.009 s t(norm)=0.363326, mflops=13.7617 (err=9.4e-016) 3. Temperton (f2c): elapsed time t=4.817 s, 1 iters, t-(init.)=4.467 s t(norm)=0.124758, mflops=40.0776 (err=7.0e-016) Top mflops for N=1728000 = 52.2553 Normalized results and averages for N=1728000: fft 0: mflops = 52.2553 (norm. = 1), norm. avg. (of 22) = 0.992526 fft 1: mflops = 25.7555 (norm. = 0.492879), norm. avg. (of 22) = 0.279262 fft 2: mflops = 13.7617 (norm. = 0.263356), norm. avg. (of 22) = 0.510152 fft 3: mflops = 40.0776 (norm. = 0.766958), norm. avg. (of 12) = 0.735299 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=7.02 s, 1 iters, t-(init.)=6.419 s t(norm)=0.0999411, mflops=50.0295 (err=1.2e-015) 1. PDA (f2c): elapsed time t=14.04 s, 1 iters, t-(init.)=13.449 s t(norm)=0.209395, mflops=23.8783 (err=1.2e-015) 2. Singleton (f2c): elapsed time t=20.21 s, 1 iters, t-(init.)=19.61 s t(norm)=0.305319, mflops=16.3763 (err=1.6e-015) 3. Temperton (f2c): elapsed time t=8.742 s, 1 iters, t-(init.)=8.151 s t(norm)=0.126908, mflops=39.3988 (err=1.2e-015) Top mflops for N=2985984 = 50.0295 Normalized results and averages for N=2985984: fft 0: mflops = 50.0295 (norm. = 1), norm. avg. (of 23) = 0.992851 fft 1: mflops = 23.8783 (norm. = 0.477285), norm. avg. (of 23) = 0.287872 fft 2: mflops = 16.3763 (norm. = 0.327333), norm. avg. (of 23) = 0.502204 fft 3: mflops = 39.3988 (norm. = 0.787511), norm. avg. (of 13) = 0.739315 Benchmarking for array size = 180x180x180: 0. FFTW: elapsed time t=13.88 s, 1 iters, t-(init.)=12.709 s t(norm)=0.0969579, mflops=51.5688 (err=9.4e-016) 1. PDA (f2c): elapsed time t=26.669 s, 1 iters, t-(init.)=25.498 s t(norm)=0.194526, mflops=25.7035 (err=9.3e-016) 2. Singleton (f2c): elapsed time t=45.506 s, 1 iters, t-(init.)=44.345 s t(norm)=0.338311, mflops=14.7793 (err=1.2e-015) 3. 
Temperton (f2c): elapsed time t=15.732 s, 1 iters, t-(init.)=14.56 s t(norm)=0.111079, mflops=45.0129 (err=9.5e-016) Top mflops for N=5832000 = 51.5688 Normalized results and averages for N=5832000: fft 0: mflops = 51.5688 (norm. = 1), norm. avg. (of 24) = 0.993148 fft 1: mflops = 25.7035 (norm. = 0.498431), norm. avg. (of 24) = 0.296645 fft 2: mflops = 14.7793 (norm. = 0.286594), norm. avg. (of 24) = 0.49322 fft 3: mflops = 45.0129 (norm. = 0.872871), norm. avg. (of 14) = 0.748855 Benchmarking for array size = 240x240x240: 0. FFTW: elapsed time t=37.634 s, 1 iters, t-(init.)=34.87 s t(norm)=0.106339, mflops=47.0196 (err=1.5e-015) 1. PDA (f2c): elapsed time t=72.024 s, 1 iters, t-(init.)=69.26 s t(norm)=0.211214, mflops=23.6727 (err=1.5e-015) 2. Singleton (f2c): elapsed time t=104.971 s, 1 iters, t-(init.)=102.217 s t(norm)=0.311718, mflops=16.0401 (err=2.1e-015) 3. Temperton (f2c): elapsed time t=47.018 s, 1 iters, t-(init.)=44.264 s t(norm)=0.134986, mflops=37.0408 (err=1.5e-015) Top mflops for N=13824000 = 47.0196 Normalized results and averages for N=13824000: fft 0: mflops = 47.0196 (norm. = 1), norm. avg. (of 25) = 0.993423 fft 1: mflops = 23.6727 (norm. = 0.503465), norm. avg. (of 25) = 0.304918 fft 2: mflops = 16.0401 (norm. = 0.341137), norm. avg. (of 25) = 0.487137 fft 3: mflops = 37.0408 (norm. = 0.787773), norm. avg. 
(of 15) = 0.751449
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Beauregard, Bergland, CWP (min N), CWP (best N), Edelblute, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), NAPACK (f2c), Ooura (C), Ransom, Singleton (f2c), Temperton (f2c), Valkenburg
2, 39.5316, 42.7119, 34.635, 1.7453, 6.9812, 7.64826, 5.08031, 4.51972, , 11.3728, 25.8908, 25.5128, 69.9051, , 13.9624, 9.42964, 9.10222, 26.5127, , , , 4.39839, 28.2826, , 8.0474, 3.76644, 8.24352
4, 64.8771, 63.0249, 39.126, 7.93174, 7.93174, 25.8748, 17.3318, 6.34347, 39.1991, 29.4958, 41.8593, 41.943, 188.296, , 37.3824, 18.2044, 17.3032, 81.285, 51.6858, 54.7202, 52.6592, 7.9922, 39.4944, 5.78684, 25.2213, 12.9215, 8.51117
8, 91.7791, 89.6858, 38.5506, 8.72359, 11.8976, 46.1928, 37.1616, 19.1462, 32.0665, 33.0781, 54.6133, 49.5, 256.532, 104.77, 57.1951, 28.0368, 25.3279, 136.771, 74.7204, 73.4554, 77.5287, 10.5349, 48.6955, 6.7159, 24.5377, 26.1708, 8.62789
16, 46.7853, 48.1274, 38.0608, 15.2854, 14.0466, 61.1415, 64.4286, 30.1315, 29.9166, 45.516, 68.7591, 48.6861, 266.094, 98.6315, 86.3026, 39.5316, 36.7277, 164.161, 61.6356, 70.9696, 74.1043, 14.4432, 53.0589, 20.5201, 57.3776, 43.2402, 8.72359
32, 56.8951, 57.8365, 41.2176, 15.1353, 14.9711, 81.1591, 57.2367, 57.5193, 32.3236, 39.0677, 82.4352, 50.4123, 252.062, 116.963, 83.7521, 51.8071, 47.6625, 177.424, 73.2758, 91.0222, 92.6304, 15.7633, 53.3898, 20.448, 70.2799, 52.3242, 9.027
64, 52.825, 55.0916, 44.2437, 22.2785, 14.6722, 89.1141, 66.1562, 54.1433, 34.8944, 45.1972, 85.5399, 53.6814, 154.108, 143.723, 102.3, 57.0913, 52.7807, 41.3368, 71.7793, 91.7122, 92.3856, 18.7023, 57.6141, 38.2925, 89.1141, 70.178, 9.2904
128, 60.6113, 62.6283, 48.2262, 20.0219, 14.4262, 95.2015, 67.2164, 71.8203, 38.1697, 43.3552, 87.2774, 59.0985, 170.401, 121.123, 111.044, 64.3862, 58.6265, 34.9193, 79.2232, 102.443, 105.46, 18.8884, 55.9454, 35.9101, 90.4502, 71.1243, 9.34798
256, 60.699, 62.9775, 52.6592, 24.0637, 14.3444, 105.385, 77.6004, 78.9888, 41.4457, 46.7853, 90.5408, 63.9376, 176.417, 128.857, 122.283, 68.2001, 62.9775, 44.787, 81.3638, 107.409, 110.231, 20.7228, 60.263, 52.6592, 106.657, 82.8914, 9.52385
512, 68.2864, 70.3218, 54.5187, 23.3363, 14.3684, 110.96, 83.3673, 84.2606, 43.2105, 39.6187, 85.0197, 61.2009, 133.577, 140.644, 103.648, 68.2864, 65.4451, 43.6502, 88.8624, 118.483, 120.912, 20.1305, 59.2416, 50.1444, 110.183, 76.6627, 9.34745
1024, 66.6609, 70.7541, 57.5193, 27.8432, 14.147, 106.78, 85.1808, 85.1117, 45.5111, 35.6174, 75.3287, 62.6764, 83.0884, 112.087, 75.8738, 65.9067, 63.8597, 36.8698, 91.0222, 120.319, 117.619, 17.8087, 59.8161, 60.5064, 109, 72.7168, 8.96525
2048, 41.4011, 41.7306, 29.6665, 22.6697, 13.0834, 73.327, 66.1752, 67.7295, 26.4064, 34.9102, 75.8339, 59.9498, 81.1135, 89.275, 81.6879, 36.4319, 36.6635, 36.6868, 84.6868, 109.746, 90.0417, 17.236, 51.4008, 47.5839, 61.9127, 58.1368, 8.5668
4096, 37.1616, 37.1616, 27.8137, 25.7425, 12.9774, 81.0233, 61.5602, 71.0498, 25.1256, 35.8897, 76.1217, 64.7269, 82.0803, 93.0689, 82.6735, 35.9101, 36.3038, 37.3824, 45.8561, 49.8531, 46.5344, 18.2679, 54.1433, 63.4219, 75.2567, 60.4948, 8.72359
8192, 38.4636, 37.8022, 27.9105, 23.798, 12.8891, 72.8567, 66.6903, 65.4102, 25.2062, 32.425, 72.4694, 66.6903, 75.2705, 86.7695, 70.8497, 35.6659, 36.005, , 44.2007, 48.2701, 44.7815, 17.7216, 51.9493, 55.7753, 70.1208, 55.8668, 8.59705
16384, 35.9101, 35.2209, 27.9727, 29.081, 12.9044, 74.8219, 67.8376, 68.4704, 25.4509, 27.5527, 58.6265, 48.5131, 55.1053, 80.527, 49.1959, 35.2209, 35.9101, , 43.8735, 47.2636, 44.4043, 16.2247, 54.7355, 69.7722, 75.5147, 55.9454, 8.03419
32768, 32.2308, 31.407, 24.8399, 24.9978, 12.265, 66.534, 67.2164, 67.1017, 23.1032, 22.431, 50.3478, 45.9096, 40.8748, 69.5958, 29.5207, 29.5207, 29.744, , 42.9041, 45.9633, 40.5377, 9.90968, 45.9633, 52.0127, 51.3337, 42.2132, 6.8648
65536, 14.9583, 14.6449, 12.2497, 22.0521, 10.4753, 34.6065, 46.6034, 46.5, 11.8283, 22.0405, 50.1411, 38.8002, 36.7277, 38.7644, 30.3495, 13.7789, 14.056, , 33.7706, 35.7876, 32.2143, 9.78149, 33.5008, 37.1177, 29.1069, 26.8521, 5.78684
131072, 13.4799, 13.415, 10.1283, 17.3808, 10.2118, 31.5612, 44.4755, 44.9692, 9.67111, 19.1758, 47.816, 39.7188, 34.254, 32.9619, 25.5824, 12.0968, 12.497, , 16.2407, 16.4809, 15.6696, 9.42565, 30.6918, 30.2748, 23.418, 22.7138, 5.70462
262144, 12.2054, 12.1425, 9.69308, 21.2167, 10.1957, 32.4972, 32.0557, 32.0339, 9.42587, 19.4661, 47.0917, 39.652, 31.2076, 33.6322, 26.7798, 12.0188, 12.2688, , 14.1106, 14.2728, 13.6139, 9.69308, 32.0557, 39.2562, 25.0722, 23.5459, 5.64966
524288, 12.6865, 12.6865, 9.40116, 16.9817, 10.2127, 28.5757, 33.6082, 33.3829, 9.17601, 20.214, 46.4621, 38.5506, 29.264, 31.6841, 27.7788, 11.7609, 12.0424, , 13.6646, 13.8933, 13.2961, 9.40116, 31.0713, 31.8869, 20.6413, 22.0973, 5.54462
1048576, 11.8043, 11.8189, 9.36647, 23.071, 10.1646, 29.5041, , , 9.16187, 19.6473, 46.1115, 39.0677, 24.1218, , 29.3226, 11.6872, 11.9265, , , 13.5808, 13.0713, 9.71353, 32.1157, 39.961, 24.2951, 23.3172, 5.46817
2097152, 12.4654, 12.4372, 9.33728, 18.2331, 10.1513, , , , 9.109, 17.8633, 46.4755, 38.3025, 23.3685, , 26.0809, 11.9564, 11.872, , , 13.6229, 13.1581, 9.44056, 31.2785, 29.4701, 22.9975, 21.9455, 5.33575
Norm. Avg., 0.357953, 0.362238, 0.274182, 0.238291, 0.134419, 0.597406, 0.549895, 0.539836, 0.231214, 0.305062, 0.666554, 0.54273, 0.817514, 0.77768, 0.554391, 0.304782, 0.298832, 0.395392, 0.46724, 0.521991, 0.499541, 0.143794, 0.485496, 0.479898, 0.531763, 0.436933, 0.0834359
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, CWP (min N), CWP (best N), FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Singleton (f2c), Temperton (f2c), Valkenburg
6, 21.5806, 12.2982, 26.367, 47.4976, 47.7767, 21.719, 40.9858, 9.66206, 17.6467, 15.0362, 8.75494
9, 37.0237, 22.4858, 31.9606, 45.8259, 41.7111, 18.4935, 42.4208, 11.5271, 30.3522, 29.1685, 8.9803
12, 42.0168, 34.153, 37.2927, 61.1568, 54.6912, 31.4482, 61.8274, 12.0381, 33.9474, 32.9169, 8.7394
15, 46.4686, 46.4686, 37.9884, 52.9012, 49.4608, 20.6153, 50.1715, 8.79265, 34.8514, 35.5614, 8.19247
18, 46.3623, 35.3126, 30.3083, 64.2173, 47.2077, 21.7272, 70.7267, 13.566, 43.4929, 36.3566, 9.16364
24, 60.4993, 51.8069, 34.2753, 62.6, 59.0141, 39.1293, 91.1121, 14.7535, 42.0988, 50.3598, 9.0054
36, 66.5433, 66.5433, 35.8325, 69.1854, 63.396, 26.7252, 98.2076, 15.9401, 62.7437, 61.4787, 9.27981
80, 88.4813, 71.3719, 44.9608, 75.1931, 62.6799, 46.0093, 87.1323, 11.8851, 88.1519, 56.2926, 8.8453
108, 68.5748, 85.2538, 37.3052, 71.8736, 69.0103, 24.8598, 103.845, 17.8717, 72.3522, 72.8372, 9.50428
210, 73.6049, 73.6049, 25.0964, 70.8541, 64.3597, 22.992, 94.6574, 9.46304, 62.5399, , 7.67286
504, 86.4391, 87.2531, 25.1528, 64.3045, 61.6929, 26.5967, 94.4575, 12.4346, 71.7204, , 8.26166
1000, 74.9263, 90.1499, 34.4297, 69.8014, 66.612, 26.3967, 71.2637, 9.57673, 87.898, 73.8955, 8.06334
1960, 76.0578, 75.5863, 18.1467, 68.084, 66.8808, 23.4111, 73.5106, 8.6664, 57.0433, , 7.17142
4725, 67.603, 76.7385, 24.7394, 62.4556, 65.7954, 18.997, 82.807, 9.70327, 54.2414, , 7.74145
10368, 67.9853, 74.2591, 32.2583, 65.471, 59.6877, 28.517, 86.6114, 17.4109, 52.5947, 58.1582, 8.69859
27000, 72.9699, 72.5951, 23.8714, 50.7934, 50.3909, 17.7437, 42.9105, 8.48816, 51.2023, 56.1778, 7.2134
75600, 47.9523, 48.474, 15.8908, 50.5229, 51.5866, 16.3031, 30.9585, 7.32326, 23.7438, , 6.30562
165375, 40.8673, 41.2198, 10.3721, 46.9974, 46.9204, 11.9253, 31.8361, 6.53634, 21.5228, , 6.22144
362880, 22.7652, 22.9053, 15.2113, 53.1068, 45.5614, 15.3436, 33.7806, 7.85523, 17.6092, , 5.7499
Norm. Avg., 0.780422, 0.751267, 0.398578, 0.847849, 0.791856, 0.33212, 0.893442, 0.155391, 0.63196, 0.620607, 0.11567
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM (f2c), PDA (f2c), Singleton (f2c), Temperton (f2c)
4x4x4, 135.01, , 14.2858, 111.156, 64.7936
8x8x8, 165.275, 120.757, 21.6251, 101.858, 121.535
16x16x16, 57.0913, 113.156, 27.7891, 67.1805, 91.0486
32x32x32, 63.3708, 86.7548, 24.8556, 51.671, 72.0176
64x64x64, 41.6837, 41.7575, 20.48, 20.8418, 39.5855
256x64x32, 43.6907, 38.8513, 21.078, 19.5784, 40.7589
16x1024x64, 44.356, 40.7531, 19.1451, 21.8135,
128x128x128, 45.628, 36.8969, 22.6242, 15.0719, 29.5096
512x128x64, 45.4779, 38.1742, 22.2124, 17.6528,
256x128x256, 46.8751, 40.1052, 24.005, 18.4962, 33.2056
Norm. Avg., 0.923323, 0.893586, 0.366962, 0.5181, 0.760755
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA (f2c), Singleton (f2c), Temperton (f2c)
5x5x5, 112.242, 16.33, 134.331, 69.1849
6x6x6, 145.669, 17.2909, 81.0517, 82.0207
7x7x7, 114.766, 10.0411, 83.2095,
9x9x9, 128.839, 22.0056, 109.892, 99.8454
10x10x10, 141.539, 24.0456, 87.8224, 83.6472
11x11x11, 99.4728, 9.75251, 63.6018,
12x12x12, 152.001, 25.5374, 66.0322, 99.95
13x13x13, 92.2947, 9.33015, 55.6483,
14x14x14, 114.459, 13.9202, 56.0307,
15x15x15, 122.525, 26.0593, 65.6297, 82.8696
24x25x28, 73.1187, 23.2609, 55.4027,
48x48x48, 47.4205, 24.0332, 20.9019, 46.8214
49x49x49, 62.3175, 11.6433, 27.8524,
60x60x60, 50.9337, 24.8227, 17.0725, 44.952
72x60x56, 52.0329, 21.167, 16.253,
75x75x75, 58.3088, 25.3974, 22.2568, 49.4874
80x80x80, 49.5433, 25.9223, 20.8201, 43.7015
84x84x84, 56.7178, 17.6221, 16.449,
96x96x96, 51.3149, 21.6525, 16.1294, 34.7887
105x105x105, 57.6316, 18.1886, 21.6025,
112x112x112, 49.8983, 18.3638, 21.4116,
120x120x120, 52.2553, 25.7555, 13.7617, 40.0776
144x144x144, 50.0295, 23.8783, 16.3763, 39.3988
180x180x180, 51.5688, 25.7035, 14.7793, 45.0129
240x240x240, 47.0196, 23.6727, 16.0401, 37.0408
Norm. Avg., 0.993423, 0.304918, 0.487137, 0.751449
@@@@ end