To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Tristram Scott
@ submitter email = t.scott@cow.mang.canterbury.ac.nz
@ submitter organization = University of Canterbury
@ computer manufacturer = SGI
@ computer model = O2
@ CPU manufacturer = SGI
@ CPU model = R5000
@ CPU speed = 180 MHz
@ RAM = 192 MB
@ L2 cache size = 512 kB
@ operating system = IRIX 6.3
@ C compiler = SGI cc 7.1
@ C compiler flags = -DUSE_SGIMATH -O3 -n32 -mips4
@ Fortran compiler = NONE
@ Fortran compiler flags = NONE
@ remarks =
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
2 (0.000335693 MB)
4 (0.000579834 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)
262144 (25.4987 MB)
524288 (50.9973 MB)
1048576 (96 MB)
Maximum array size = 1048576
Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Beauregard
5. Bergland
6. CWP (min N)
7. CWP (best N)
8. Edelblute
9. FFTPACK (f2c)
10. FFTW
11. FFTW_ESTIMATE
12. Frigo-old
13. Green
14. GSL
15. GSL DIT
16. GSL DIF
17. Krukar
18. Mayer (Buneman)
19. Mayer (simple)
20. Mayer (lookup)
21. NAPACK (f2c)
22. Ooura (C)
23. Ransom
24. Singleton (f2c)
25. Temperton (f2c)
26. Valkenburg
27. SGIMATH
Computing normalized averages (28 transforms).
Benchmarking for array size = 2 (power of 2): 0. Arndt DIF: elapsed time t=1.69 s, 2097152 iters, t-(init.)=1.22 s t(norm)=0.290871, mflops=17.1898 (err=1.7e-17) 1. Arndt DIT: elapsed time t=1.71 s, 2097152 iters, t-(init.)=1.23 s t(norm)=0.293255, mflops=17.05 (err=1.7e-17) 2. 
Arndt Split-Radix: elapsed time t=1.02 s, 1048576 iters, t-(init.)=0.77 s t(norm)=0.367165, mflops=13.6179 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.05 s, 65536 iters, t-(init.)=1.04 s t(norm)=7.93457, mflops=0.630154 (err=1.7e-17) 4. Beauregard: elapsed time t=1.21 s, 262144 iters, t-(init.)=1.15 s t(norm)=2.19345, mflops=2.27951 (err=1.7e-17) 5. Bergland: elapsed time t=1.37 s, 262144 iters, t-(init.)=1.31 s t(norm)=2.49863, mflops=2.0011 (err=1.7e-17) 6. CWP (min N): elapsed time t=1.13 s, 262144 iters, t-(init.)=1.07 s t(norm)=2.04086, mflops=2.44994 7. CWP (best N) (N=3): elapsed time t=1.18 s, 262144 iters, t-(init.)=1.11 s t(norm)=2.11716, mflops=2.36166 8. Skipping fft (Edelblute can't handle N <= 2). 9. FFTPACK (f2c): elapsed time t=1.13 s, 524288 iters, t-(init.)=1.01 s t(norm)=0.963211, mflops=5.19097 (err=1.7e-17) FFTW_MEASURE plan: (cost = 7.247925e-07) FFTW_NOTW 2 10. FFTW: elapsed time t=1.71 s, 2097152 iters, t-(init.)=1.22 s t(norm)=0.290871, mflops=17.1898 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 11. FFTW_ESTIMATE: elapsed time t=1.71 s, 2097152 iters, t-(init.)=1.22 s t(norm)=0.290871, mflops=17.1898 (err=1.7e-17) 12. Frigo-old: elapsed time t=1.24 s, 2097152 iters, t-(init.)=0.75 s t(norm)=0.178814, mflops=27.962 (err=1.7e-17) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=1.95 s, 1048576 iters, t-(init.)=1.7 s t(norm)=0.810623, mflops=6.16809 (err=1.7e-17) 15. GSL DIT: elapsed time t=1.4 s, 524288 iters, t-(init.)=1.27 s t(norm)=1.21117, mflops=4.12825 (err=1.7e-17) 16. GSL DIF: elapsed time t=1.44 s, 524288 iters, t-(init.)=1.31 s t(norm)=1.24931, mflops=4.0022 (err=1.7e-17) 17. Krukar: elapsed time t=1.07 s, 1048576 iters, t-(init.)=0.83 s t(norm)=0.395775, mflops=12.6334 (err=1.7e-17) 18. Skipping fft (Mayer can't handle N <= 2). 19. Skipping fft (Mayer can't handle N <= 2). 20. Skipping fft (Mayer can't handle N <= 2). 21. 
NAPACK (f2c): elapsed time t=1.01 s, 262144 iters, t-(init.)=0.95 s t(norm)=1.81198, mflops=2.75941 (err=1.7e-17) 22. Ooura (C): elapsed time t=1.06 s, 1048576 iters, t-(init.)=0.82 s t(norm)=0.391006, mflops=12.7875 (err=1.7e-17) 23. Skipping fft (Ransom doesn't work for N=2). 24. Singleton (f2c): elapsed time t=1.69 s, 524288 iters, t-(init.)=1.57 s t(norm)=1.49727, mflops=3.33941 (err=1.7e-17) 25. Temperton (f2c): elapsed time t=1.44 s, 262144 iters, t-(init.)=1.38 s t(norm)=2.63214, mflops=1.89959 (err=1.7e-17) 26. Valkenburg: elapsed time t=1.39 s, 524288 iters, t-(init.)=1.27 s t(norm)=1.21117, mflops=4.12825 (err=1.7e-17) 27. SGIMATH: elapsed time t=1.67 s, 262144 iters, t-(init.)=1.6 s t(norm)=3.05176, mflops=1.6384 (err=1.7e-17) Top mflops for N=2 = 27.962 Normalized results and averages for N=2: fft 0: mflops = 17.1898 (norm. = 0.614754), norm. avg. (of 1) = 0.614754 fft 1: mflops = 17.05 (norm. = 0.609756), norm. avg. (of 1) = 0.609756 fft 2: mflops = 13.6179 (norm. = 0.487013), norm. avg. (of 1) = 0.487013 fft 3: mflops = 0.630154 (norm. = 0.0225361), norm. avg. (of 1) = 0.0225361 fft 4: mflops = 2.27951 (norm. = 0.0815217), norm. avg. (of 1) = 0.0815217 fft 5: mflops = 2.0011 (norm. = 0.0715649), norm. avg. (of 1) = 0.0715649 fft 6: mflops = 2.44994 (norm. = 0.0876168), norm. avg. (of 1) = 0.0876168 fft 7: mflops = 2.36166 (norm. = 0.0844595), norm. avg. (of 1) = 0.0844595 fft 8: mflops = -1 (norm. = -0.0357628), norm. avg. (of 0) = -1 fft 9: mflops = 5.19097 (norm. = 0.185644), norm. avg. (of 1) = 0.185644 fft 10: mflops = 17.1898 (norm. = 0.614754), norm. avg. (of 1) = 0.614754 fft 11: mflops = 17.1898 (norm. = 0.614754), norm. avg. (of 1) = 0.614754 fft 12: mflops = 27.962 (norm. = 1), norm. avg. (of 1) = 1 fft 13: mflops = -1 (norm. = -0.0357628), norm. avg. (of 0) = -1 fft 14: mflops = 6.16809 (norm. = 0.220588), norm. avg. (of 1) = 0.220588 fft 15: mflops = 4.12825 (norm. = 0.147638), norm. avg. (of 1) = 0.147638 fft 16: mflops = 4.0022 (norm. 
= 0.14313), norm. avg. (of 1) = 0.14313 fft 17: mflops = 12.6334 (norm. = 0.451807), norm. avg. (of 1) = 0.451807 fft 18: mflops = -1 (norm. = -0.0357628), norm. avg. (of 0) = -1 fft 19: mflops = -1 (norm. = -0.0357628), norm. avg. (of 0) = -1 fft 20: mflops = -1 (norm. = -0.0357628), norm. avg. (of 0) = -1 fft 21: mflops = 2.75941 (norm. = 0.0986842), norm. avg. (of 1) = 0.0986842 fft 22: mflops = 12.7875 (norm. = 0.457317), norm. avg. (of 1) = 0.457317 fft 23: mflops = -1 (norm. = -0.0357628), norm. avg. (of 0) = -1 fft 24: mflops = 3.33941 (norm. = 0.119427), norm. avg. (of 1) = 0.119427 fft 25: mflops = 1.89959 (norm. = 0.0679348), norm. avg. (of 1) = 0.0679348 fft 26: mflops = 4.12825 (norm. = 0.147638), norm. avg. (of 1) = 0.147638 fft 27: mflops = 1.6384 (norm. = 0.0585938), norm. avg. (of 1) = 0.0585938 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.46 s, 1048576 iters, t-(init.)=1.23 s t(norm)=0.146627, mflops=34.1 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.49 s, 1048576 iters, t-(init.)=1.25 s t(norm)=0.149012, mflops=33.5544 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.18 s, 524288 iters, t-(init.)=1.06 s t(norm)=0.252724, mflops=19.7845 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.95 s, 131072 iters, t-(init.)=1.92 s t(norm)=1.83105, mflops=2.73067 (err=1.3e-16) 4. Beauregard: elapsed time t=1.51 s, 131072 iters, t-(init.)=1.48 s t(norm)=1.41144, mflops=3.54249 (err=6.5e-17) 5. Bergland: elapsed time t=1.7 s, 262144 iters, t-(init.)=1.64 s t(norm)=0.782013, mflops=6.39376 (err=5.3e-17) 6. CWP (min N): elapsed time t=1.19 s, 262144 iters, t-(init.)=1.13 s t(norm)=0.538826, mflops=9.27943 7. CWP (best N) (N=15): elapsed time t=1.16 s, 131072 iters, t-(init.)=1.09 s t(norm)=1.03951, mflops=4.80998 8. Edelblute: elapsed time t=1.18 s, 524288 iters, t-(init.)=1.07 s t(norm)=0.255108, mflops=19.5996 (err=1.3e-16) 9. 
FFTPACK (f2c): elapsed time t=1.51 s, 524288 iters, t-(init.)=1.4 s t(norm)=0.333786, mflops=14.9797 (err=5.3e-17) FFTW_MEASURE plan: (cost = 8.773804e-07) FFTW_NOTW 4 10. FFTW: elapsed time t=1.02 s, 1048576 iters, t-(init.)=0.79 s t(norm)=0.0941753, mflops=53.0925 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 11. FFTW_ESTIMATE: elapsed time t=1.03 s, 1048576 iters, t-(init.)=0.8 s t(norm)=0.0953674, mflops=52.4288 (err=5.3e-17) 12. Frigo-old: elapsed time t=1.44 s, 2097152 iters, t-(init.)=0.96 s t(norm)=0.0572205, mflops=87.3813 (err=5.3e-17) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=1.34 s, 524288 iters, t-(init.)=1.23 s t(norm)=0.293255, mflops=17.05 (err=5.3e-17) 15. GSL DIT: elapsed time t=1.37 s, 262144 iters, t-(init.)=1.32 s t(norm)=0.629425, mflops=7.94376 (err=6.5e-17) 16. GSL DIF: elapsed time t=1.39 s, 262144 iters, t-(init.)=1.34 s t(norm)=0.638962, mflops=7.82519 (err=6.5e-17) 17. Krukar: elapsed time t=1.2 s, 1048576 iters, t-(init.)=0.96 s t(norm)=0.114441, mflops=43.6907 (err=5.3e-17) 18. Mayer (Buneman): elapsed time t=1.24 s, 524288 iters, t-(init.)=1.13 s t(norm)=0.269413, mflops=18.5589 (err=1.3e-16) 19. Mayer (simple): elapsed time t=1.17 s, 524288 iters, t-(init.)=1.04 s t(norm)=0.247955, mflops=20.1649 20. Mayer (lookup): elapsed time t=1.28 s, 524288 iters, t-(init.)=1.16 s t(norm)=0.276566, mflops=18.0789 (err=1.3e-16) 21. NAPACK (f2c): elapsed time t=1.01 s, 131072 iters, t-(init.)=0.98 s t(norm)=0.934601, mflops=5.34988 (err=1.6e-16) 22. Ooura (C): elapsed time t=1.87 s, 1048576 iters, t-(init.)=1.63 s t(norm)=0.194311, mflops=25.7319 (err=5.3e-17) 23. Ransom: elapsed time t=1.73 s, 131072 iters, t-(init.)=1.7 s t(norm)=1.62125, mflops=3.08405 (err=2.4e-16) 24. Singleton (f2c): elapsed time t=1.93 s, 524288 iters, t-(init.)=1.81 s t(norm)=0.431538, mflops=11.5865 (err=5.3e-17) 25. 
Temperton (f2c): elapsed time t=1.6 s, 262144 iters, t-(init.)=1.54 s t(norm)=0.734329, mflops=6.80894 (err=5.3e-17) 26. Valkenburg: elapsed time t=1.24 s, 131072 iters, t-(init.)=1.2 s t(norm)=1.14441, mflops=4.36907 (err=1.4e-16) 27. SGIMATH: elapsed time t=1.88 s, 262144 iters, t-(init.)=1.82 s t(norm)=0.867844, mflops=5.76141 (err=5.3e-17) Top mflops for N=4 = 87.3813 Normalized results and averages for N=4: fft 0: mflops = 34.1 (norm. = 0.390244), norm. avg. (of 2) = 0.502499 fft 1: mflops = 33.5544 (norm. = 0.384), norm. avg. (of 2) = 0.496878 fft 2: mflops = 19.7845 (norm. = 0.226415), norm. avg. (of 2) = 0.356714 fft 3: mflops = 2.73067 (norm. = 0.03125), norm. avg. (of 2) = 0.026893 fft 4: mflops = 3.54249 (norm. = 0.0405405), norm. avg. (of 2) = 0.0610311 fft 5: mflops = 6.39376 (norm. = 0.0731707), norm. avg. (of 2) = 0.0723678 fft 6: mflops = 9.27943 (norm. = 0.106195), norm. avg. (of 2) = 0.0969058 fft 7: mflops = 4.80998 (norm. = 0.0550459), norm. avg. (of 2) = 0.0697527 fft 8: mflops = 19.5996 (norm. = 0.224299), norm. avg. (of 1) = 0.224299 fft 9: mflops = 14.9797 (norm. = 0.171429), norm. avg. (of 2) = 0.178536 fft 10: mflops = 53.0925 (norm. = 0.607595), norm. avg. (of 2) = 0.611175 fft 11: mflops = 52.4288 (norm. = 0.6), norm. avg. (of 2) = 0.607377 fft 12: mflops = 87.3813 (norm. = 1), norm. avg. (of 2) = 1 fft 13: mflops = -1 (norm. = -0.0114441), norm. avg. (of 0) = -1 fft 14: mflops = 17.05 (norm. = 0.195122), norm. avg. (of 2) = 0.207855 fft 15: mflops = 7.94376 (norm. = 0.0909091), norm. avg. (of 2) = 0.119273 fft 16: mflops = 7.82519 (norm. = 0.0895522), norm. avg. (of 2) = 0.116341 fft 17: mflops = 43.6907 (norm. = 0.5), norm. avg. (of 2) = 0.475904 fft 18: mflops = 18.5589 (norm. = 0.212389), norm. avg. (of 1) = 0.212389 fft 19: mflops = 20.1649 (norm. = 0.230769), norm. avg. (of 1) = 0.230769 fft 20: mflops = 18.0789 (norm. = 0.206897), norm. avg. (of 1) = 0.206897 fft 21: mflops = 5.34988 (norm. = 0.0612245), norm. avg. 
(of 2) = 0.0799544 fft 22: mflops = 25.7319 (norm. = 0.294479), norm. avg. (of 2) = 0.375898 fft 23: mflops = 3.08405 (norm. = 0.0352941), norm. avg. (of 1) = 0.0352941 fft 24: mflops = 11.5865 (norm. = 0.132597), norm. avg. (of 2) = 0.126012 fft 25: mflops = 6.80894 (norm. = 0.0779221), norm. avg. (of 2) = 0.0729284 fft 26: mflops = 4.36907 (norm. = 0.05), norm. avg. (of 2) = 0.0988189 fft 27: mflops = 5.76141 (norm. = 0.0659341), norm. avg. (of 2) = 0.0622639 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.21 s, 524288 iters, t-(init.)=1.05 s t(norm)=0.0834465, mflops=59.9186 (err=1.2e-16) 1. Arndt DIT: elapsed time t=1.26 s, 524288 iters, t-(init.)=1.09 s t(norm)=0.0866254, mflops=57.7198 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.43 s, 262144 iters, t-(init.)=1.34 s t(norm)=0.212987, mflops=23.4756 (err=1.1e-16) 3. Arndt 4-step: elapsed time t=1.19 s, 32768 iters, t-(init.)=1.18 s t(norm)=1.50045, mflops=3.33234 (err=1.5e-16) 4. Beauregard: elapsed time t=1.98 s, 65536 iters, t-(init.)=1.96 s t(norm)=1.24613, mflops=4.01241 (err=1.2e-16) 5. Bergland: elapsed time t=1.4 s, 131072 iters, t-(init.)=1.36 s t(norm)=0.432332, mflops=11.5652 (err=1.3e-16) 6. CWP (min N): elapsed time t=1.42 s, 262144 iters, t-(init.)=1.34 s t(norm)=0.212987, mflops=23.4756 7. CWP (best N) (N=15): elapsed time t=1.16 s, 131072 iters, t-(init.)=1.09 s t(norm)=0.346502, mflops=14.4299 8. Edelblute: elapsed time t=1.53 s, 262144 iters, t-(init.)=1.45 s t(norm)=0.230471, mflops=21.6947 (err=1.5e-16) 9. FFTPACK (f2c): elapsed time t=1.4 s, 262144 iters, t-(init.)=1.31 s t(norm)=0.208219, mflops=24.0132 (err=1.2e-16) FFTW_MEASURE plan: (cost = 1.373291e-06) FFTW_NOTW 8 10. FFTW: elapsed time t=1.6 s, 1048576 iters, t-(init.)=1.28 s t(norm)=0.0508626, mflops=98.304 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 11. FFTW_ESTIMATE: elapsed time t=1.6 s, 1048576 iters, t-(init.)=1.27 s t(norm)=0.0504653, mflops=99.078 (err=1.2e-16) 12. 
Frigo-old: elapsed time t=1.11 s, 1048576 iters, t-(init.)=0.79 s t(norm)=0.0313918, mflops=159.277 (err=1.4e-16) 13. Green: elapsed time t=1.68 s, 524288 iters, t-(init.)=1.52 s t(norm)=0.120799, mflops=41.3912 (err=1.4e-16) 14. GSL: elapsed time t=1.12 s, 262144 iters, t-(init.)=1.04 s t(norm)=0.165304, mflops=30.2474 (err=1.2e-16) 15. GSL DIT: elapsed time t=1.26 s, 131072 iters, t-(init.)=1.22 s t(norm)=0.387828, mflops=12.8923 (err=1.2e-16) 16. GSL DIF: elapsed time t=1.22 s, 131072 iters, t-(init.)=1.18 s t(norm)=0.375112, mflops=13.3294 (err=1.4e-16) 17. Krukar: elapsed time t=1.13 s, 524288 iters, t-(init.)=0.96 s t(norm)=0.0762939, mflops=65.536 (err=1.2e-16) 18. Mayer (Buneman): elapsed time t=1.06 s, 262144 iters, t-(init.)=0.98 s t(norm)=0.155767, mflops=32.0993 (err=1.5e-16) 19. Mayer (simple): elapsed time t=1.03 s, 262144 iters, t-(init.)=0.95 s t(norm)=0.150998, mflops=33.1129 20. Mayer (lookup): elapsed time t=1.08 s, 262144 iters, t-(init.)=1 s t(norm)=0.158946, mflops=31.4573 (err=1.5e-16) 21. NAPACK (f2c): elapsed time t=1.01 s, 65536 iters, t-(init.)=0.99 s t(norm)=0.629425, mflops=7.94376 (err=1.7e-16) 22. Ooura (C): elapsed time t=1.59 s, 524288 iters, t-(init.)=1.42 s t(norm)=0.112851, mflops=44.306 (err=1.3e-16) 23. Ransom: elapsed time t=1.25 s, 32768 iters, t-(init.)=1.24 s t(norm)=1.57674, mflops=3.1711 (err=3.9e-16) 24. Singleton (f2c): elapsed time t=1.48 s, 131072 iters, t-(init.)=1.44 s t(norm)=0.457764, mflops=10.9227 (err=1.4e-16) 25. Temperton (f2c): elapsed time t=1.43 s, 131072 iters, t-(init.)=1.39 s t(norm)=0.441869, mflops=11.3156 (err=1.4e-16) 26. Valkenburg: elapsed time t=1.75 s, 65536 iters, t-(init.)=1.73 s t(norm)=1.0999, mflops=4.54585 (err=1.4e-16) 27. SGIMATH: elapsed time t=1.67 s, 65536 iters, t-(init.)=1.65 s t(norm)=1.04904, mflops=4.76625 (err=1.2e-16) Top mflops for N=8 = 159.277 Normalized results and averages for N=8: fft 0: mflops = 59.9186 (norm. = 0.37619), norm. avg. 
(of 3) = 0.460396 fft 1: mflops = 57.7198 (norm. = 0.362385), norm. avg. (of 3) = 0.452047 fft 2: mflops = 23.4756 (norm. = 0.147388), norm. avg. (of 3) = 0.286939 fft 3: mflops = 3.33234 (norm. = 0.0209216), norm. avg. (of 3) = 0.0249026 fft 4: mflops = 4.01241 (norm. = 0.0251913), norm. avg. (of 3) = 0.0490845 fft 5: mflops = 11.5652 (norm. = 0.0726103), norm. avg. (of 3) = 0.0724486 fft 6: mflops = 23.4756 (norm. = 0.147388), norm. avg. (of 3) = 0.113733 fft 7: mflops = 14.4299 (norm. = 0.0905963), norm. avg. (of 3) = 0.0767006 fft 8: mflops = 21.6947 (norm. = 0.136207), norm. avg. (of 2) = 0.180253 fft 9: mflops = 24.0132 (norm. = 0.150763), norm. avg. (of 3) = 0.169278 fft 10: mflops = 98.304 (norm. = 0.617188), norm. avg. (of 3) = 0.613179 fft 11: mflops = 99.078 (norm. = 0.622047), norm. avg. (of 3) = 0.612267 fft 12: mflops = 159.277 (norm. = 1), norm. avg. (of 3) = 1 fft 13: mflops = 41.3912 (norm. = 0.259868), norm. avg. (of 1) = 0.259868 fft 14: mflops = 30.2474 (norm. = 0.189904), norm. avg. (of 3) = 0.201871 fft 15: mflops = 12.8923 (norm. = 0.0809426), norm. avg. (of 3) = 0.106497 fft 16: mflops = 13.3294 (norm. = 0.0836864), norm. avg. (of 3) = 0.105456 fft 17: mflops = 65.536 (norm. = 0.411458), norm. avg. (of 3) = 0.454422 fft 18: mflops = 32.0993 (norm. = 0.201531), norm. avg. (of 2) = 0.20696 fft 19: mflops = 33.1129 (norm. = 0.207895), norm. avg. (of 2) = 0.219332 fft 20: mflops = 31.4573 (norm. = 0.1975), norm. avg. (of 2) = 0.202198 fft 21: mflops = 7.94376 (norm. = 0.0498737), norm. avg. (of 3) = 0.0699275 fft 22: mflops = 44.306 (norm. = 0.278169), norm. avg. (of 3) = 0.343322 fft 23: mflops = 3.1711 (norm. = 0.0199093), norm. avg. (of 2) = 0.0276017 fft 24: mflops = 10.9227 (norm. = 0.0685764), norm. avg. (of 3) = 0.106867 fft 25: mflops = 11.3156 (norm. = 0.0710432), norm. avg. (of 3) = 0.0723 fft 26: mflops = 4.54585 (norm. = 0.0285405), norm. avg. (of 3) = 0.0753928 fft 27: mflops = 4.76625 (norm. = 0.0299242), norm. avg. 
(of 3) = 0.051484 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.1 s, 131072 iters, t-(init.)=1.05 s t(norm)=0.12517, mflops=39.9458 (err=1.5e-16) 1. Arndt DIT: elapsed time t=1.09 s, 131072 iters, t-(init.)=1.04 s t(norm)=0.123978, mflops=40.3298 (err=2.2e-16) 2. Arndt Split-Radix: elapsed time t=1.66 s, 131072 iters, t-(init.)=1.6 s t(norm)=0.190735, mflops=26.2144 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.58 s, 32768 iters, t-(init.)=1.57 s t(norm)=0.748634, mflops=6.67883 (err=1.6e-16) 4. Beauregard: elapsed time t=1.05 s, 16384 iters, t-(init.)=1.05 s t(norm)=1.00136, mflops=4.99322 (err=2.3e-16) 5. Bergland: elapsed time t=1.21 s, 65536 iters, t-(init.)=1.18 s t(norm)=0.281334, mflops=17.7725 (err=2.6e-16) 6. CWP (min N): elapsed time t=1.92 s, 262144 iters, t-(init.)=1.81 s t(norm)=0.107884, mflops=46.3459 7. CWP (best N) (N=28): elapsed time t=1.63 s, 131072 iters, t-(init.)=1.55 s t(norm)=0.184774, mflops=27.06 8. Edelblute: elapsed time t=1.84 s, 131072 iters, t-(init.)=1.77 s t(norm)=0.211, mflops=23.6966 (err=1.4e-16) 9. FFTPACK (f2c): elapsed time t=1.06 s, 131072 iters, t-(init.)=1 s t(norm)=0.119209, mflops=41.943 (err=1.8e-16) FFTW_MEASURE plan: (cost = 2.975464e-06) FFTW_NOTW 16 10. FFTW: elapsed time t=1.65 s, 524288 iters, t-(init.)=1.4 s t(norm)=0.0417233, mflops=119.837 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 11. FFTW_ESTIMATE: elapsed time t=1.66 s, 524288 iters, t-(init.)=1.44 s t(norm)=0.0429153, mflops=116.508 (err=1.7e-16) 12. Frigo-old: elapsed time t=1.2 s, 524288 iters, t-(init.)=0.95 s t(norm)=0.0283122, mflops=176.602 (err=1.8e-16) 13. Green: elapsed time t=1.03 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.115633, mflops=43.2402 (err=1.9e-16) 14. GSL: elapsed time t=1.91 s, 262144 iters, t-(init.)=1.8 s t(norm)=0.107288, mflops=46.6034 (err=1.8e-16) 15. GSL DIT: elapsed time t=1.19 s, 65536 iters, t-(init.)=1.16 s t(norm)=0.276566, mflops=18.0789 (err=2.1e-16) 16. 
GSL DIF: elapsed time t=1.13 s, 65536 iters, t-(init.)=1.1 s t(norm)=0.26226, mflops=19.065 (err=2.8e-16) 17. Krukar: elapsed time t=1.13 s, 262144 iters, t-(init.)=1.01 s t(norm)=0.0602007, mflops=83.0555 (err=2.0e-16) 18. Mayer (Buneman): elapsed time t=1.53 s, 131072 iters, t-(init.)=1.47 s t(norm)=0.175238, mflops=28.5327 (err=1.7e-16) 19. Mayer (simple): elapsed time t=1.26 s, 131072 iters, t-(init.)=1.21 s t(norm)=0.144243, mflops=34.6637 20. Mayer (lookup): elapsed time t=1.31 s, 131072 iters, t-(init.)=1.25 s t(norm)=0.149012, mflops=33.5544 (err=1.9e-16) 21. NAPACK (f2c): elapsed time t=1.9 s, 65536 iters, t-(init.)=1.88 s t(norm)=0.448227, mflops=11.1551 (err=3.3e-16) 22. Ooura (C): elapsed time t=1.5 s, 262144 iters, t-(init.)=1.38 s t(norm)=0.0822544, mflops=60.787 (err=2.0e-16) 23. Ransom: elapsed time t=1 s, 32768 iters, t-(init.)=0.99 s t(norm)=0.472069, mflops=10.5917 (err=4.2e-16) 24. Singleton (f2c): elapsed time t=1.5 s, 131072 iters, t-(init.)=1.44 s t(norm)=0.171661, mflops=29.1271 (err=1.7e-16) 25. Temperton (f2c): elapsed time t=1.13 s, 65536 iters, t-(init.)=1.1 s t(norm)=0.26226, mflops=19.065 (err=1.8e-16) 26. Valkenburg: elapsed time t=1.11 s, 16384 iters, t-(init.)=1.1 s t(norm)=1.04904, mflops=4.76625 (err=2.9e-16) 27. SGIMATH: elapsed time t=1.82 s, 32768 iters, t-(init.)=1.81 s t(norm)=0.863075, mflops=5.79324 (err=1.8e-16) Top mflops for N=16 = 176.602 Normalized results and averages for N=16: fft 0: mflops = 39.9458 (norm. = 0.22619), norm. avg. (of 4) = 0.401845 fft 1: mflops = 40.3298 (norm. = 0.228365), norm. avg. (of 4) = 0.396127 fft 2: mflops = 26.2144 (norm. = 0.148438), norm. avg. (of 4) = 0.252313 fft 3: mflops = 6.67883 (norm. = 0.0378185), norm. avg. (of 4) = 0.0281315 fft 4: mflops = 4.99322 (norm. = 0.0282738), norm. avg. (of 4) = 0.0438819 fft 5: mflops = 17.7725 (norm. = 0.100636), norm. avg. (of 4) = 0.0794954 fft 6: mflops = 46.3459 (norm. = 0.262431), norm. avg. (of 4) = 0.150908 fft 7: mflops = 27.06 (norm. 
= 0.153226), norm. avg. (of 4) = 0.0958319 fft 8: mflops = 23.6966 (norm. = 0.134181), norm. avg. (of 3) = 0.164896 fft 9: mflops = 41.943 (norm. = 0.2375), norm. avg. (of 4) = 0.186334 fft 10: mflops = 119.837 (norm. = 0.678571), norm. avg. (of 4) = 0.629527 fft 11: mflops = 116.508 (norm. = 0.659722), norm. avg. (of 4) = 0.624131 fft 12: mflops = 176.602 (norm. = 1), norm. avg. (of 4) = 1 fft 13: mflops = 43.2402 (norm. = 0.244845), norm. avg. (of 2) = 0.252357 fft 14: mflops = 46.6034 (norm. = 0.263889), norm. avg. (of 4) = 0.217376 fft 15: mflops = 18.0789 (norm. = 0.102371), norm. avg. (of 4) = 0.105465 fft 16: mflops = 19.065 (norm. = 0.107955), norm. avg. (of 4) = 0.106081 fft 17: mflops = 83.0555 (norm. = 0.470297), norm. avg. (of 4) = 0.458391 fft 18: mflops = 28.5327 (norm. = 0.161565), norm. avg. (of 3) = 0.191828 fft 19: mflops = 34.6637 (norm. = 0.196281), norm. avg. (of 3) = 0.211648 fft 20: mflops = 33.5544 (norm. = 0.19), norm. avg. (of 3) = 0.198132 fft 21: mflops = 11.1551 (norm. = 0.0631649), norm. avg. (of 4) = 0.0682368 fft 22: mflops = 60.787 (norm. = 0.344203), norm. avg. (of 4) = 0.343542 fft 23: mflops = 10.5917 (norm. = 0.0599747), norm. avg. (of 3) = 0.0383927 fft 24: mflops = 29.1271 (norm. = 0.164931), norm. avg. (of 4) = 0.121383 fft 25: mflops = 19.065 (norm. = 0.107955), norm. avg. (of 4) = 0.0812136 fft 26: mflops = 4.76625 (norm. = 0.0269886), norm. avg. (of 4) = 0.0632917 fft 27: mflops = 5.79324 (norm. = 0.0328039), norm. avg. (of 4) = 0.046814 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.1 s, 65536 iters, t-(init.)=1.06 s t(norm)=0.101089, mflops=49.4611 (err=2.9e-16) 1. Arndt DIT: elapsed time t=1.1 s, 65536 iters, t-(init.)=1.05 s t(norm)=0.100136, mflops=49.9322 (err=2.7e-16) 2. Arndt Split-Radix: elapsed time t=1.87 s, 65536 iters, t-(init.)=1.83 s t(norm)=0.174522, mflops=28.6496 (err=3.4e-16) 3. 
Arndt 4-step: elapsed time t=1.7 s, 16384 iters, t-(init.)=1.69 s t(norm)=0.644684, mflops=7.75574 (err=2.7e-16) 4. Beauregard: elapsed time t=1.12 s, 8192 iters, t-(init.)=1.11 s t(norm)=0.846863, mflops=5.90414 (err=2.3e-16) 5. Bergland: elapsed time t=1.11 s, 32768 iters, t-(init.)=1.09 s t(norm)=0.207901, mflops=24.0499 (err=3.0e-16) 6. CWP (min N) (N=33): elapsed time t=1.1 s, 65536 iters, t-(init.)=1.05 s t(norm)=0.100136, mflops=49.9322 7. CWP (best N) (N=35): elapsed time t=1.04 s, 65536 iters, t-(init.)=0.99 s t(norm)=0.0944138, mflops=52.9584 8. Edelblute: elapsed time t=1.05 s, 32768 iters, t-(init.)=1.03 s t(norm)=0.196457, mflops=25.4509 (err=2.7e-16) 9. FFTPACK (f2c): elapsed time t=1.32 s, 65536 iters, t-(init.)=1.27 s t(norm)=0.121117, mflops=41.2825 (err=2.1e-16) FFTW_MEASURE plan: (cost = 7.934570e-06) FFTW_NOTW 32 10. FFTW: elapsed time t=1.07 s, 131072 iters, t-(init.)=0.97 s t(norm)=0.0462532, mflops=108.101 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.07 s, 131072 iters, t-(init.)=0.98 s t(norm)=0.04673, mflops=106.998 (err=2.1e-16) 12. Frigo-old: elapsed time t=1.64 s, 262144 iters, t-(init.)=1.45 s t(norm)=0.0345707, mflops=144.631 (err=2.2e-16) 13. Green: elapsed time t=1.84 s, 131072 iters, t-(init.)=1.75 s t(norm)=0.0834465, mflops=59.9186 (err=2.1e-16) 14. GSL: elapsed time t=1.1 s, 65536 iters, t-(init.)=1.06 s t(norm)=0.101089, mflops=49.4611 (err=2.0e-16) 15. GSL DIT: elapsed time t=1.2 s, 32768 iters, t-(init.)=1.18 s t(norm)=0.225067, mflops=22.2156 (err=2.2e-16) 16. GSL DIF: elapsed time t=1.11 s, 32768 iters, t-(init.)=1.08 s t(norm)=0.205994, mflops=24.2726 (err=2.5e-16) 17. Krukar: elapsed time t=1.24 s, 131072 iters, t-(init.)=1.16 s t(norm)=0.0553131, mflops=90.3945 (err=2.2e-16) 18. Mayer (Buneman): elapsed time t=1.6 s, 65536 iters, t-(init.)=1.56 s t(norm)=0.148773, mflops=33.6082 (err=2.8e-16) 19. 
Mayer (simple): elapsed time t=1.25 s, 65536 iters, t-(init.)=1.2 s t(norm)=0.114441, mflops=43.6907 20. Mayer (lookup): elapsed time t=1.29 s, 65536 iters, t-(init.)=1.24 s t(norm)=0.118256, mflops=42.2813 (err=2.9e-16) 21. NAPACK (f2c): elapsed time t=1.97 s, 32768 iters, t-(init.)=1.95 s t(norm)=0.371933, mflops=13.4433 (err=1.3e-15) 22. Ooura (C): elapsed time t=1.51 s, 131072 iters, t-(init.)=1.42 s t(norm)=0.0677109, mflops=73.8434 (err=2.6e-16) 23. Ransom: elapsed time t=1.36 s, 16384 iters, t-(init.)=1.34 s t(norm)=0.511169, mflops=9.78149 (err=7.5e-16) 24. Singleton (f2c): elapsed time t=1.51 s, 65536 iters, t-(init.)=1.47 s t(norm)=0.14019, mflops=35.6659 (err=2.2e-16) 25. Temperton (f2c): elapsed time t=1.36 s, 32768 iters, t-(init.)=1.34 s t(norm)=0.255585, mflops=19.563 (err=1.7e-16) 26. Valkenburg: elapsed time t=1.32 s, 8192 iters, t-(init.)=1.32 s t(norm)=1.00708, mflops=4.96485 (err=4.2e-16) 27. SGIMATH: elapsed time t=1.13 s, 16384 iters, t-(init.)=1.12 s t(norm)=0.427246, mflops=11.7029 (err=2.0e-16) Top mflops for N=32 = 144.631 Normalized results and averages for N=32: fft 0: mflops = 49.4611 (norm. = 0.341981), norm. avg. (of 5) = 0.389872 fft 1: mflops = 49.9322 (norm. = 0.345238), norm. avg. (of 5) = 0.385949 fft 2: mflops = 28.6496 (norm. = 0.198087), norm. avg. (of 5) = 0.241468 fft 3: mflops = 7.75574 (norm. = 0.0536243), norm. avg. (of 5) = 0.0332301 fft 4: mflops = 5.90414 (norm. = 0.0408221), norm. avg. (of 5) = 0.0432699 fft 5: mflops = 24.0499 (norm. = 0.166284), norm. avg. (of 5) = 0.0968532 fft 6: mflops = 49.9322 (norm. = 0.345238), norm. avg. (of 5) = 0.189774 fft 7: mflops = 52.9584 (norm. = 0.366162), norm. avg. (of 5) = 0.149898 fft 8: mflops = 25.4509 (norm. = 0.175971), norm. avg. (of 4) = 0.167664 fft 9: mflops = 41.2825 (norm. = 0.285433), norm. avg. (of 5) = 0.206154 fft 10: mflops = 108.101 (norm. = 0.747423), norm. avg. (of 5) = 0.653106 fft 11: mflops = 106.998 (norm. = 0.739796), norm. avg. 
(of 5) = 0.647264 fft 12: mflops = 144.631 (norm. = 1), norm. avg. (of 5) = 1 fft 13: mflops = 59.9186 (norm. = 0.414286), norm. avg. (of 3) = 0.306333 fft 14: mflops = 49.4611 (norm. = 0.341981), norm. avg. (of 5) = 0.242297 fft 15: mflops = 22.2156 (norm. = 0.153602), norm. avg. (of 5) = 0.115092 fft 16: mflops = 24.2726 (norm. = 0.167824), norm. avg. (of 5) = 0.118429 fft 17: mflops = 90.3945 (norm. = 0.625), norm. avg. (of 5) = 0.491713 fft 18: mflops = 33.6082 (norm. = 0.232372), norm. avg. (of 4) = 0.201964 fft 19: mflops = 43.6907 (norm. = 0.302083), norm. avg. (of 4) = 0.234257 fft 20: mflops = 42.2813 (norm. = 0.292339), norm. avg. (of 4) = 0.221684 fft 21: mflops = 13.4433 (norm. = 0.0929487), norm. avg. (of 5) = 0.0731792 fft 22: mflops = 73.8434 (norm. = 0.510563), norm. avg. (of 5) = 0.376946 fft 23: mflops = 9.78149 (norm. = 0.0676306), norm. avg. (of 4) = 0.0457022 fft 24: mflops = 35.6659 (norm. = 0.246599), norm. avg. (of 5) = 0.146426 fft 25: mflops = 19.563 (norm. = 0.135261), norm. avg. (of 5) = 0.0920232 fft 26: mflops = 4.96485 (norm. = 0.0343277), norm. avg. (of 5) = 0.0574989 fft 27: mflops = 11.7029 (norm. = 0.0809152), norm. avg. (of 5) = 0.0536342 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.34 s, 32768 iters, t-(init.)=1.3 s t(norm)=0.103315, mflops=48.3958 (err=5.8e-16) 1. Arndt DIT: elapsed time t=1.34 s, 32768 iters, t-(init.)=1.3 s t(norm)=0.103315, mflops=48.3958 (err=5.7e-16) 2. Arndt Split-Radix: elapsed time t=1.03 s, 16384 iters, t-(init.)=1.01 s t(norm)=0.160535, mflops=31.1458 (err=5.8e-16) 3. Arndt 4-step: elapsed time t=1.08 s, 8192 iters, t-(init.)=1.07 s t(norm)=0.340144, mflops=14.6997 (err=5.3e-16) 4. Beauregard: elapsed time t=1.23 s, 4096 iters, t-(init.)=1.22 s t(norm)=0.775655, mflops=6.44616 (err=6.0e-16) 5. Bergland: elapsed time t=1.07 s, 16384 iters, t-(init.)=1.05 s t(norm)=0.166893, mflops=29.9593 (err=6.3e-16) 6. 
CWP (min N) (N=65): elapsed time t=1.04 s, 32768 iters, t-(init.)=1 s t(norm)=0.0794729, mflops=62.9146 7. CWP (best N) (N=84): elapsed time t=1.05 s, 32768 iters, t-(init.)=1.01 s t(norm)=0.0802676, mflops=62.2916 8. Edelblute: elapsed time t=1.15 s, 16384 iters, t-(init.)=1.13 s t(norm)=0.179609, mflops=27.8383 (err=5.7e-16) 9. FFTPACK (f2c): elapsed time t=1.21 s, 32768 iters, t-(init.)=1.17 s t(norm)=0.0929832, mflops=53.7731 (err=5.7e-16) FFTW_MEASURE plan: (cost = 1.770020e-05) FFTW_TWIDDLE 8 FFTW_NOTW 8 10. FFTW: elapsed time t=1.18 s, 65536 iters, t-(init.)=1.1 s t(norm)=0.0437101, mflops=114.39 (err=5.6e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.49 s, 65536 iters, t-(init.)=1.41 s t(norm)=0.0560284, mflops=89.2405 (err=5.3e-16) 12. Frigo-old: elapsed time t=1.25 s, 65536 iters, t-(init.)=1.18 s t(norm)=0.046889, mflops=106.635 (err=5.5e-16) 13. Green: elapsed time t=1.51 s, 65536 iters, t-(init.)=1.44 s t(norm)=0.0572205, mflops=87.3813 (err=5.7e-16) 14. GSL: elapsed time t=1.08 s, 32768 iters, t-(init.)=1.04 s t(norm)=0.0826518, mflops=60.4948 (err=5.7e-16) 15. GSL DIT: elapsed time t=1.26 s, 16384 iters, t-(init.)=1.24 s t(norm)=0.197093, mflops=25.3688 (err=5.6e-16) 16. GSL DIF: elapsed time t=1.15 s, 16384 iters, t-(init.)=1.13 s t(norm)=0.179609, mflops=27.8383 (err=5.4e-16) 17. Krukar: elapsed time t=1.43 s, 65536 iters, t-(init.)=1.35 s t(norm)=0.0536442, mflops=93.2068 (err=5.9e-16) 18. Mayer (Buneman): elapsed time t=1.9 s, 32768 iters, t-(init.)=1.86 s t(norm)=0.14782, mflops=33.825 (err=5.3e-16) 19. Mayer (simple): elapsed time t=1.45 s, 32768 iters, t-(init.)=1.42 s t(norm)=0.112851, mflops=44.306 20. Mayer (lookup): elapsed time t=1.48 s, 32768 iters, t-(init.)=1.44 s t(norm)=0.114441, mflops=43.6907 (err=5.2e-16) 21. NAPACK (f2c): elapsed time t=1.03 s, 8192 iters, t-(init.)=1.02 s t(norm)=0.324249, mflops=15.4202 (err=1.8e-15) 22. 
Ooura (C): elapsed time t=1.65 s, 65536 iters, t-(init.)=1.58 s t(norm)=0.0627836, mflops=79.6387 (err=5.7e-16) 23. Ransom: elapsed time t=1.61 s, 16384 iters, t-(init.)=1.59 s t(norm)=0.252724, mflops=19.7845 (err=9.0e-16) 24. Singleton (f2c): elapsed time t=1.3 s, 32768 iters, t-(init.)=1.26 s t(norm)=0.100136, mflops=49.9322 (err=9.2e-16) 25. Temperton (f2c): elapsed time t=1.14 s, 16384 iters, t-(init.)=1.12 s t(norm)=0.178019, mflops=28.0869 (err=5.7e-16) 26. Valkenburg: elapsed time t=1.54 s, 4096 iters, t-(init.)=1.53 s t(norm)=0.972748, mflops=5.14008 (err=7.8e-16) 27. SGIMATH: elapsed time t=1.94 s, 16384 iters, t-(init.)=1.92 s t(norm)=0.305176, mflops=16.384 (err=5.7e-16) Top mflops for N=64 = 114.39 Normalized results and averages for N=64: fft 0: mflops = 48.3958 (norm. = 0.423077), norm. avg. (of 6) = 0.395406 fft 1: mflops = 48.3958 (norm. = 0.423077), norm. avg. (of 6) = 0.392137 fft 2: mflops = 31.1458 (norm. = 0.272277), norm. avg. (of 6) = 0.246603 fft 3: mflops = 14.6997 (norm. = 0.128505), norm. avg. (of 6) = 0.0491092 fft 4: mflops = 6.44616 (norm. = 0.0563525), norm. avg. (of 6) = 0.0454503 fft 5: mflops = 29.9593 (norm. = 0.261905), norm. avg. (of 6) = 0.124362 fft 6: mflops = 62.9146 (norm. = 0.55), norm. avg. (of 6) = 0.249811 fft 7: mflops = 62.2916 (norm. = 0.544554), norm. avg. (of 6) = 0.215674 fft 8: mflops = 27.8383 (norm. = 0.243363), norm. avg. (of 5) = 0.182804 fft 9: mflops = 53.7731 (norm. = 0.470085), norm. avg. (of 6) = 0.250142 fft 10: mflops = 114.39 (norm. = 1), norm. avg. (of 6) = 0.710922 fft 11: mflops = 89.2405 (norm. = 0.780142), norm. avg. (of 6) = 0.66941 fft 12: mflops = 106.635 (norm. = 0.932203), norm. avg. (of 6) = 0.988701 fft 13: mflops = 87.3813 (norm. = 0.763889), norm. avg. (of 4) = 0.420722 fft 14: mflops = 60.4948 (norm. = 0.528846), norm. avg. (of 6) = 0.290055 fft 15: mflops = 25.3688 (norm. = 0.221774), norm. avg. (of 6) = 0.132873 fft 16: mflops = 27.8383 (norm. = 0.243363), norm. avg. 
(of 6) = 0.139252 fft 17: mflops = 93.2068 (norm. = 0.814815), norm. avg. (of 6) = 0.545563 fft 18: mflops = 33.825 (norm. = 0.295699), norm. avg. (of 5) = 0.220711 fft 19: mflops = 44.306 (norm. = 0.387324), norm. avg. (of 5) = 0.26487 fft 20: mflops = 43.6907 (norm. = 0.381944), norm. avg. (of 5) = 0.253736 fft 21: mflops = 15.4202 (norm. = 0.134804), norm. avg. (of 6) = 0.08345 fft 22: mflops = 79.6387 (norm. = 0.696203), norm. avg. (of 6) = 0.430156 fft 23: mflops = 19.7845 (norm. = 0.172956), norm. avg. (of 5) = 0.0711529 fft 24: mflops = 49.9322 (norm. = 0.436508), norm. avg. (of 6) = 0.194773 fft 25: mflops = 28.0869 (norm. = 0.245536), norm. avg. (of 6) = 0.117609 fft 26: mflops = 5.14008 (norm. = 0.0449346), norm. avg. (of 6) = 0.0554049 fft 27: mflops = 16.384 (norm. = 0.143229), norm. avg. (of 6) = 0.0685667 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.39 s, 16384 iters, t-(init.)=1.36 s t(norm)=0.0926426, mflops=53.9708 (err=3.6e-16) 1. Arndt DIT: elapsed time t=1.39 s, 16384 iters, t-(init.)=1.35 s t(norm)=0.0919615, mflops=54.3706 (err=3.4e-16) 2. Arndt Split-Radix: elapsed time t=1.12 s, 8192 iters, t-(init.)=1.1 s t(norm)=0.149863, mflops=33.3638 (err=4.2e-16) 3. Arndt 4-step: elapsed time t=1.3 s, 4096 iters, t-(init.)=1.3 s t(norm)=0.354222, mflops=14.1154 (err=3.1e-16) 4. Beauregard: elapsed time t=1.33 s, 2048 iters, t-(init.)=1.33 s t(norm)=0.724792, mflops=6.89853 (err=3.5e-16) 5. Bergland: elapsed time t=1.12 s, 8192 iters, t-(init.)=1.1 s t(norm)=0.149863, mflops=33.3638 (err=3.9e-16) 6. CWP (min N) (N=130): elapsed time t=1.05 s, 16384 iters, t-(init.)=1.01 s t(norm)=0.0688008, mflops=72.6736 7. CWP (best N) (N=140): elapsed time t=1.82 s, 32768 iters, t-(init.)=1.75 s t(norm)=0.0596046, mflops=83.8861 8. Edelblute: elapsed time t=1.25 s, 8192 iters, t-(init.)=1.23 s t(norm)=0.167574, mflops=29.8375 (err=3.3e-16) 9. 
FFTPACK (f2c): elapsed time t=1.38 s, 16384 iters, t-(init.)=1.35 s t(norm)=0.0919615, mflops=54.3706 (err=3.4e-16) FFTW_MEASURE plan: (cost = 3.784180e-05) FFTW_TWIDDLE 8 FFTW_NOTW 16 10. FFTW: elapsed time t=1.26 s, 32768 iters, t-(init.)=1.19 s t(norm)=0.0405312, mflops=123.362 (err=3.5e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.5 s, 32768 iters, t-(init.)=1.43 s t(norm)=0.0487055, mflops=102.658 (err=3.5e-16) 12. Frigo-old: elapsed time t=1.26 s, 32768 iters, t-(init.)=1.19 s t(norm)=0.0405312, mflops=123.362 (err=3.4e-16) 13. Green: elapsed time t=1.79 s, 32768 iters, t-(init.)=1.71 s t(norm)=0.0582423, mflops=85.8483 (err=4.2e-16) 14. GSL: elapsed time t=1.13 s, 16384 iters, t-(init.)=1.1 s t(norm)=0.0749316, mflops=66.7276 (err=3.3e-16) 15. GSL DIT: elapsed time t=1.37 s, 8192 iters, t-(init.)=1.35 s t(norm)=0.183923, mflops=27.1853 (err=3.5e-16) 16. GSL DIF: elapsed time t=1.23 s, 8192 iters, t-(init.)=1.21 s t(norm)=0.164849, mflops=30.3307 (err=3.7e-16) 17. Krukar: elapsed time t=1.15 s, 16384 iters, t-(init.)=1.11 s t(norm)=0.0756127, mflops=66.1264 (err=3.6e-16) 18. Mayer (Buneman): elapsed time t=1 s, 8192 iters, t-(init.)=0.98 s t(norm)=0.133514, mflops=37.4491 (err=3.3e-16) 19. Mayer (simple): elapsed time t=1.5 s, 16384 iters, t-(init.)=1.47 s t(norm)=0.100136, mflops=49.9322 20. Mayer (lookup): elapsed time t=1.55 s, 16384 iters, t-(init.)=1.52 s t(norm)=0.103542, mflops=48.2897 (err=3.4e-16) 21. NAPACK (f2c): elapsed time t=1.16 s, 4096 iters, t-(init.)=1.15 s t(norm)=0.31335, mflops=15.9566 (err=2.1e-15) 22. Ooura (C): elapsed time t=1.79 s, 32768 iters, t-(init.)=1.71 s t(norm)=0.0582423, mflops=85.8483 (err=3.5e-16) 23. Ransom: elapsed time t=1.02 s, 4096 iters, t-(init.)=1.01 s t(norm)=0.275203, mflops=18.1684 (err=9.6e-16) 24. Singleton (f2c): elapsed time t=1.68 s, 16384 iters, t-(init.)=1.64 s t(norm)=0.111716, mflops=44.7563 (err=4.2e-16) 25. 
Temperton (f2c): elapsed time t=1.27 s, 8192 iters, t-(init.)=1.25 s t(norm)=0.170299, mflops=29.3601 (err=3.6e-16) 26. Valkenburg: elapsed time t=1.76 s, 2048 iters, t-(init.)=1.76 s t(norm)=0.959124, mflops=5.21309 (err=5.3e-16) 27. SGIMATH: elapsed time t=1.81 s, 8192 iters, t-(init.)=1.79 s t(norm)=0.243868, mflops=20.5029 (err=3.3e-16) Top mflops for N=128 = 123.362 Normalized results and averages for N=128: fft 0: mflops = 53.9708 (norm. = 0.4375), norm. avg. (of 7) = 0.40142 fft 1: mflops = 54.3706 (norm. = 0.440741), norm. avg. (of 7) = 0.39908 fft 2: mflops = 33.3638 (norm. = 0.270455), norm. avg. (of 7) = 0.25001 fft 3: mflops = 14.1154 (norm. = 0.114423), norm. avg. (of 7) = 0.0584397 fft 4: mflops = 6.89853 (norm. = 0.0559211), norm. avg. (of 7) = 0.0469461 fft 5: mflops = 33.3638 (norm. = 0.270455), norm. avg. (of 7) = 0.145232 fft 6: mflops = 72.6736 (norm. = 0.589109), norm. avg. (of 7) = 0.298283 fft 7: mflops = 83.8861 (norm. = 0.68), norm. avg. (of 7) = 0.282006 fft 8: mflops = 29.8375 (norm. = 0.24187), norm. avg. (of 6) = 0.192648 fft 9: mflops = 54.3706 (norm. = 0.440741), norm. avg. (of 7) = 0.277371 fft 10: mflops = 123.362 (norm. = 1), norm. avg. (of 7) = 0.752219 fft 11: mflops = 102.658 (norm. = 0.832168), norm. avg. (of 7) = 0.692661 fft 12: mflops = 123.362 (norm. = 1), norm. avg. (of 7) = 0.990315 fft 13: mflops = 85.8483 (norm. = 0.695906), norm. avg. (of 5) = 0.475759 fft 14: mflops = 66.7276 (norm. = 0.540909), norm. avg. (of 7) = 0.325891 fft 15: mflops = 27.1853 (norm. = 0.22037), norm. avg. (of 7) = 0.145372 fft 16: mflops = 30.3307 (norm. = 0.245868), norm. avg. (of 7) = 0.154483 fft 17: mflops = 66.1264 (norm. = 0.536036), norm. avg. (of 7) = 0.544202 fft 18: mflops = 37.4491 (norm. = 0.303571), norm. avg. (of 6) = 0.234521 fft 19: mflops = 49.9322 (norm. = 0.404762), norm. avg. (of 6) = 0.288186 fft 20: mflops = 48.2897 (norm. = 0.391447), norm. avg. (of 6) = 0.276688 fft 21: mflops = 15.9566 (norm. = 0.129348), norm. avg. 
(of 7) = 0.0900068 fft 22: mflops = 85.8483 (norm. = 0.695906), norm. avg. (of 7) = 0.46812 fft 23: mflops = 18.1684 (norm. = 0.147277), norm. avg. (of 6) = 0.0838403 fft 24: mflops = 44.7563 (norm. = 0.362805), norm. avg. (of 7) = 0.218777 fft 25: mflops = 29.3601 (norm. = 0.238), norm. avg. (of 7) = 0.134807 fft 26: mflops = 5.21309 (norm. = 0.0422585), norm. avg. (of 7) = 0.0535268 fft 27: mflops = 20.5029 (norm. = 0.166201), norm. avg. (of 7) = 0.0825145 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.56 s, 8192 iters, t-(init.)=1.53 s t(norm)=0.0911951, mflops=54.8275 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.6 s, 8192 iters, t-(init.)=1.57 s t(norm)=0.0935793, mflops=53.4306 (err=1.0e-15) 2. Arndt Split-Radix: elapsed time t=1.21 s, 4096 iters, t-(init.)=1.2 s t(norm)=0.143051, mflops=34.9525 (err=1.0e-15) 3. Arndt 4-step: elapsed time t=1.19 s, 2048 iters, t-(init.)=1.18 s t(norm)=0.281334, mflops=17.7725 (err=1.0e-15) 4. Beauregard: elapsed time t=1.43 s, 1024 iters, t-(init.)=1.43 s t(norm)=0.681877, mflops=7.3327 (err=1.1e-15) 5. Bergland: elapsed time t=1.14 s, 4096 iters, t-(init.)=1.13 s t(norm)=0.134706, mflops=37.1177 (err=1.0e-15) 6. CWP (min N) (N=260): elapsed time t=2 s, 16384 iters, t-(init.)=1.94 s t(norm)=0.0578165, mflops=86.4805 7. CWP (best N) (N=280): elapsed time t=1.73 s, 16384 iters, t-(init.)=1.66 s t(norm)=0.0494719, mflops=101.068 8. Edelblute: elapsed time t=1.32 s, 4096 iters, t-(init.)=1.31 s t(norm)=0.156164, mflops=32.0176 (err=1.0e-15) 9. FFTPACK (f2c): elapsed time t=1.39 s, 8192 iters, t-(init.)=1.36 s t(norm)=0.0810623, mflops=61.6809 (err=1.1e-15) FFTW_MEASURE plan: (cost = 8.544922e-05) FFTW_TWIDDLE 16 FFTW_NOTW 16 10. FFTW: elapsed time t=1.43 s, 16384 iters, t-(init.)=1.36 s t(norm)=0.0405312, mflops=123.362 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 11. 
FFTW_ESTIMATE: elapsed time t=1.56 s, 16384 iters, t-(init.)=1.49 s t(norm)=0.0444055, mflops=112.599 (err=1.0e-15) 12. Frigo-old: elapsed time t=1.51 s, 16384 iters, t-(init.)=1.45 s t(norm)=0.0432134, mflops=115.705 (err=1.0e-15) 13. Green: elapsed time t=1.88 s, 16384 iters, t-(init.)=1.81 s t(norm)=0.0539422, mflops=92.6918 (err=1.1e-15) 14. GSL: elapsed time t=1.19 s, 8192 iters, t-(init.)=1.15 s t(norm)=0.0685453, mflops=72.9444 (err=1.1e-15) 15. GSL DIT: elapsed time t=1.42 s, 4096 iters, t-(init.)=1.41 s t(norm)=0.168085, mflops=29.7468 (err=1.0e-15) 16. GSL DIF: elapsed time t=1.27 s, 4096 iters, t-(init.)=1.25 s t(norm)=0.149012, mflops=33.5544 (err=1.1e-15) 17. Krukar: elapsed time t=1.41 s, 8192 iters, t-(init.)=1.37 s t(norm)=0.0816584, mflops=61.2307 (err=1.1e-15) 18. Mayer (Buneman): elapsed time t=1.12 s, 4096 iters, t-(init.)=1.11 s t(norm)=0.132322, mflops=37.7865 (err=9.8e-16) 19. Mayer (simple): elapsed time t=1.69 s, 8192 iters, t-(init.)=1.66 s t(norm)=0.0989437, mflops=50.5338 20. Mayer (lookup): elapsed time t=1.73 s, 8192 iters, t-(init.)=1.69 s t(norm)=0.100732, mflops=49.6367 (err=9.6e-16) 21. NAPACK (f2c): elapsed time t=1.26 s, 2048 iters, t-(init.)=1.26 s t(norm)=0.300407, mflops=16.6441 (err=4.9e-15) 22. Ooura (C): elapsed time t=1.98 s, 16384 iters, t-(init.)=1.91 s t(norm)=0.0569224, mflops=87.8388 (err=1.0e-15) 23. Ransom: elapsed time t=1.59 s, 4096 iters, t-(init.)=1.57 s t(norm)=0.187159, mflops=26.7153 (err=1.8e-15) 24. Singleton (f2c): elapsed time t=1.41 s, 8192 iters, t-(init.)=1.36 s t(norm)=0.0810623, mflops=61.6809 (err=1.7e-15) 25. Temperton (f2c): elapsed time t=1.29 s, 4096 iters, t-(init.)=1.27 s t(norm)=0.151396, mflops=33.026 (err=1.1e-15) 26. Valkenburg: elapsed time t=1 s, 512 iters, t-(init.)=1 s t(norm)=0.953674, mflops=5.24288 (err=1.1e-15) 27. 
SGIMATH: elapsed time t=1.35 s, 2048 iters, t-(init.)=1.34 s t(norm)=0.319481, mflops=15.6504 (err=1.1e-15) Top mflops for N=256 = 123.362 Normalized results and averages for N=256: fft 0: mflops = 54.8275 (norm. = 0.444444), norm. avg. (of 8) = 0.406798 fft 1: mflops = 53.4306 (norm. = 0.433121), norm. avg. (of 8) = 0.403335 fft 2: mflops = 34.9525 (norm. = 0.283333), norm. avg. (of 8) = 0.254176 fft 3: mflops = 17.7725 (norm. = 0.144068), norm. avg. (of 8) = 0.0691432 fft 4: mflops = 7.3327 (norm. = 0.0594406), norm. avg. (of 8) = 0.0485079 fft 5: mflops = 37.1177 (norm. = 0.300885), norm. avg. (of 8) = 0.164689 fft 6: mflops = 86.4805 (norm. = 0.701031), norm. avg. (of 8) = 0.348626 fft 7: mflops = 101.068 (norm. = 0.819277), norm. avg. (of 8) = 0.349165 fft 8: mflops = 32.0176 (norm. = 0.259542), norm. avg. (of 7) = 0.202205 fft 9: mflops = 61.6809 (norm. = 0.5), norm. avg. (of 8) = 0.305199 fft 10: mflops = 123.362 (norm. = 1), norm. avg. (of 8) = 0.783191 fft 11: mflops = 112.599 (norm. = 0.912752), norm. avg. (of 8) = 0.720173 fft 12: mflops = 115.705 (norm. = 0.937931), norm. avg. (of 8) = 0.983767 fft 13: mflops = 92.6918 (norm. = 0.751381), norm. avg. (of 6) = 0.521696 fft 14: mflops = 72.9444 (norm. = 0.591304), norm. avg. (of 8) = 0.359068 fft 15: mflops = 29.7468 (norm. = 0.241135), norm. avg. (of 8) = 0.157343 fft 16: mflops = 33.5544 (norm. = 0.272), norm. avg. (of 8) = 0.169172 fft 17: mflops = 61.2307 (norm. = 0.49635), norm. avg. (of 8) = 0.53822 fft 18: mflops = 37.7865 (norm. = 0.306306), norm. avg. (of 7) = 0.244776 fft 19: mflops = 50.5338 (norm. = 0.409639), norm. avg. (of 7) = 0.305536 fft 20: mflops = 49.6367 (norm. = 0.402367), norm. avg. (of 7) = 0.294642 fft 21: mflops = 16.6441 (norm. = 0.134921), norm. avg. (of 8) = 0.0956211 fft 22: mflops = 87.8388 (norm. = 0.712042), norm. avg. (of 8) = 0.49861 fft 23: mflops = 26.7153 (norm. = 0.216561), norm. avg. (of 7) = 0.1028 fft 24: mflops = 61.6809 (norm. = 0.5), norm. avg. 
(of 8) = 0.25393 fft 25: mflops = 33.026 (norm. = 0.267717), norm. avg. (of 8) = 0.151421 fft 26: mflops = 5.24288 (norm. = 0.0425), norm. avg. (of 8) = 0.0521485 fft 27: mflops = 15.6504 (norm. = 0.126866), norm. avg. (of 8) = 0.0880584 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.62 s, 4096 iters, t-(init.)=1.59 s t(norm)=0.0842412, mflops=59.3534 (err=1.1e-15) 1. Arndt DIT: elapsed time t=1.66 s, 4096 iters, t-(init.)=1.62 s t(norm)=0.0858307, mflops=58.2542 (err=1.1e-15) 2. Arndt Split-Radix: elapsed time t=1.3 s, 2048 iters, t-(init.)=1.29 s t(norm)=0.136693, mflops=36.5782 (err=1.1e-15) 3. Arndt 4-step: elapsed time t=1.27 s, 1024 iters, t-(init.)=1.26 s t(norm)=0.267029, mflops=18.7246 (err=1.0e-15) 4. Beauregard: elapsed time t=1.54 s, 512 iters, t-(init.)=1.54 s t(norm)=0.652737, mflops=7.66005 (err=1.0e-15) 5. Bergland: elapsed time t=1.16 s, 2048 iters, t-(init.)=1.15 s t(norm)=0.121858, mflops=41.0312 (err=1.0e-15) 6. CWP (min N) (N=520): elapsed time t=1 s, 4096 iters, t-(init.)=0.96 s t(norm)=0.0508626, mflops=98.304 7. CWP (best N) (N=560): elapsed time t=1.94 s, 8192 iters, t-(init.)=1.87 s t(norm)=0.0495381, mflops=100.932 8. Edelblute: elapsed time t=1.42 s, 2048 iters, t-(init.)=1.41 s t(norm)=0.149409, mflops=33.4652 (err=1.1e-15) 9. FFTPACK (f2c): elapsed time t=1.76 s, 4096 iters, t-(init.)=1.72 s t(norm)=0.0911289, mflops=54.8673 (err=1.0e-15) FFTW_MEASURE plan: (cost = 2.050781e-04) FFTW_TWIDDLE 16 FFTW_NOTW 32 10. FFTW: elapsed time t=1.08 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.0551012, mflops=90.7422 (err=9.7e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.09 s, 4096 iters, t-(init.)=1.06 s t(norm)=0.0561608, mflops=89.03 (err=9.7e-16) 12. Frigo-old: elapsed time t=1.9 s, 8192 iters, t-(init.)=1.83 s t(norm)=0.0484784, mflops=103.139 (err=9.5e-16) 13. 
Green: elapsed time t=1.85 s, 8192 iters, t-(init.)=1.79 s t(norm)=0.0474188, mflops=105.443 (err=9.8e-16) 14. GSL: elapsed time t=1.42 s, 4096 iters, t-(init.)=1.39 s t(norm)=0.0736448, mflops=67.8934 (err=1.0e-15) 15. GSL DIT: elapsed time t=1.56 s, 2048 iters, t-(init.)=1.54 s t(norm)=0.163184, mflops=30.6402 (err=1.2e-15) 16. GSL DIF: elapsed time t=1.4 s, 2048 iters, t-(init.)=1.39 s t(norm)=0.14729, mflops=33.9467 (err=1.1e-15) 17. Krukar: elapsed time t=1.51 s, 4096 iters, t-(init.)=1.48 s t(norm)=0.0784132, mflops=63.7648 (err=1.0e-15) 18. Mayer (Buneman): elapsed time t=1.16 s, 2048 iters, t-(init.)=1.14 s t(norm)=0.120799, mflops=41.3912 (err=1.0e-15) 19. Mayer (simple): elapsed time t=1.76 s, 4096 iters, t-(init.)=1.73 s t(norm)=0.0916587, mflops=54.5502 20. Mayer (lookup): elapsed time t=1.77 s, 4096 iters, t-(init.)=1.73 s t(norm)=0.0916587, mflops=54.5502 (err=9.8e-16) 21. NAPACK (f2c): elapsed time t=1.4 s, 1024 iters, t-(init.)=1.39 s t(norm)=0.294579, mflops=16.9734 (err=6.9e-15) 22. Ooura (C): elapsed time t=1.09 s, 4096 iters, t-(init.)=1.06 s t(norm)=0.0561608, mflops=89.03 (err=9.8e-16) 23. Ransom: elapsed time t=1.85 s, 2048 iters, t-(init.)=1.83 s t(norm)=0.193914, mflops=25.7847 (err=1.4e-15) 24. Singleton (f2c): elapsed time t=1.59 s, 4096 iters, t-(init.)=1.56 s t(norm)=0.0826518, mflops=60.4948 (err=1.2e-15) 25. Temperton (f2c): elapsed time t=1.67 s, 2048 iters, t-(init.)=1.66 s t(norm)=0.1759, mflops=28.4253 (err=9.9e-16) 26. Valkenburg: elapsed time t=1.28 s, 256 iters, t-(init.)=1.28 s t(norm)=1.08507, mflops=4.608 (err=1.3e-15) 27. SGIMATH: elapsed time t=1.56 s, 2048 iters, t-(init.)=1.54 s t(norm)=0.163184, mflops=30.6402 (err=1.0e-15) Top mflops for N=512 = 105.443 Normalized results and averages for N=512: fft 0: mflops = 59.3534 (norm. = 0.562893), norm. avg. (of 9) = 0.424142 fft 1: mflops = 58.2542 (norm. = 0.552469), norm. avg. (of 9) = 0.419906 fft 2: mflops = 36.5782 (norm. = 0.346899), norm. avg. 
(of 9) = 0.264478 fft 3: mflops = 18.7246 (norm. = 0.177579), norm. avg. (of 9) = 0.0811917 fft 4: mflops = 7.66005 (norm. = 0.0726461), norm. avg. (of 9) = 0.05119 fft 5: mflops = 41.0312 (norm. = 0.38913), norm. avg. (of 9) = 0.189627 fft 6: mflops = 98.304 (norm. = 0.932292), norm. avg. (of 9) = 0.413478 fft 7: mflops = 100.932 (norm. = 0.957219), norm. avg. (of 9) = 0.416727 fft 8: mflops = 33.4652 (norm. = 0.317376), norm. avg. (of 8) = 0.216601 fft 9: mflops = 54.8673 (norm. = 0.520349), norm. avg. (of 9) = 0.329105 fft 10: mflops = 90.7422 (norm. = 0.860577), norm. avg. (of 9) = 0.79179 fft 11: mflops = 89.03 (norm. = 0.84434), norm. avg. (of 9) = 0.733969 fft 12: mflops = 103.139 (norm. = 0.978142), norm. avg. (of 9) = 0.983142 fft 13: mflops = 105.443 (norm. = 1), norm. avg. (of 7) = 0.590025 fft 14: mflops = 67.8934 (norm. = 0.643885), norm. avg. (of 9) = 0.390714 fft 15: mflops = 30.6402 (norm. = 0.290584), norm. avg. (of 9) = 0.172147 fft 16: mflops = 33.9467 (norm. = 0.321942), norm. avg. (of 9) = 0.186147 fft 17: mflops = 63.7648 (norm. = 0.60473), norm. avg. (of 9) = 0.54561 fft 18: mflops = 41.3912 (norm. = 0.392544), norm. avg. (of 8) = 0.263247 fft 19: mflops = 54.5502 (norm. = 0.517341), norm. avg. (of 8) = 0.332012 fft 20: mflops = 54.5502 (norm. = 0.517341), norm. avg. (of 8) = 0.322479 fft 21: mflops = 16.9734 (norm. = 0.160971), norm. avg. (of 9) = 0.102882 fft 22: mflops = 89.03 (norm. = 0.84434), norm. avg. (of 9) = 0.537025 fft 23: mflops = 25.7847 (norm. = 0.244536), norm. avg. (of 8) = 0.120517 fft 24: mflops = 60.4948 (norm. = 0.573718), norm. avg. (of 9) = 0.289462 fft 25: mflops = 28.4253 (norm. = 0.269578), norm. avg. (of 9) = 0.16455 fft 26: mflops = 4.608 (norm. = 0.0437012), norm. avg. (of 9) = 0.0512099 fft 27: mflops = 30.6402 (norm. = 0.290584), norm. avg. (of 9) = 0.110561 Benchmarking for array size = 1024 (power of 2): 0. 
Arndt DIF: elapsed time t=1.79 s, 2048 iters, t-(init.)=1.76 s t(norm)=0.0839233, mflops=59.5782 (err=1.8e-15) 1. Arndt DIT: elapsed time t=1.86 s, 2048 iters, t-(init.)=1.83 s t(norm)=0.0872612, mflops=57.2992 (err=1.9e-15) 2. Arndt Split-Radix: elapsed time t=1.41 s, 1024 iters, t-(init.)=1.4 s t(norm)=0.133514, mflops=37.4491 (err=1.8e-15) 3. Arndt 4-step: elapsed time t=1.15 s, 512 iters, t-(init.)=1.14 s t(norm)=0.217438, mflops=22.9951 (err=1.8e-15) 4. Beauregard: elapsed time t=1.63 s, 256 iters, t-(init.)=1.62 s t(norm)=0.617981, mflops=8.09086 (err=2.0e-15) 5. Bergland: elapsed time t=1.25 s, 1024 iters, t-(init.)=1.24 s t(norm)=0.118256, mflops=42.2813 (err=2.1e-15) 6. CWP (min N) (N=1040): elapsed time t=1.12 s, 2048 iters, t-(init.)=1.09 s t(norm)=0.0519753, mflops=96.1996 7. CWP (best N) (N=1040): elapsed time t=1.12 s, 2048 iters, t-(init.)=1.09 s t(norm)=0.0519753, mflops=96.1996 8. Edelblute: elapsed time t=1.49 s, 1024 iters, t-(init.)=1.48 s t(norm)=0.141144, mflops=35.4249 (err=1.8e-15) 9. FFTPACK (f2c): elapsed time t=1.2 s, 1024 iters, t-(init.)=1.18 s t(norm)=0.112534, mflops=44.4312 (err=1.9e-15) FFTW_MEASURE plan: (cost = 5.859375e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 16 10. FFTW: elapsed time t=1.2 s, 2048 iters, t-(init.)=1.17 s t(norm)=0.0557899, mflops=89.6219 (err=2.0e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.35 s, 2048 iters, t-(init.)=1.32 s t(norm)=0.0629425, mflops=79.4376 (err=1.9e-15) 12. Frigo-old: elapsed time t=1.42 s, 2048 iters, t-(init.)=1.39 s t(norm)=0.0662804, mflops=75.4371 (err=1.9e-15) 13. Green: elapsed time t=1.08 s, 2048 iters, t-(init.)=1.05 s t(norm)=0.0500679, mflops=99.8644 (err=2.0e-15) 14. GSL: elapsed time t=1.93 s, 2048 iters, t-(init.)=1.9 s t(norm)=0.0905991, mflops=55.1882 (err=1.9e-15) 15. GSL DIT: elapsed time t=1.7 s, 1024 iters, t-(init.)=1.69 s t(norm)=0.161171, mflops=31.023 (err=2.1e-15) 16. 
GSL DIF: elapsed time t=1.5 s, 1024 iters, t-(init.)=1.48 s t(norm)=0.141144, mflops=35.4249 (err=2.2e-15) 17. Krukar: elapsed time t=1.34 s, 1024 iters, t-(init.)=1.33 s t(norm)=0.126839, mflops=39.4202 (err=1.9e-15) 18. Mayer (Buneman): elapsed time t=1.28 s, 1024 iters, t-(init.)=1.26 s t(norm)=0.120163, mflops=41.6102 (err=1.8e-15) 19. Mayer (simple): elapsed time t=1.96 s, 2048 iters, t-(init.)=1.93 s t(norm)=0.0920296, mflops=54.3304 20. Mayer (lookup): elapsed time t=1 s, 1024 iters, t-(init.)=0.98 s t(norm)=0.0934601, mflops=53.4988 (err=1.8e-15) 21. NAPACK (f2c): elapsed time t=1.54 s, 512 iters, t-(init.)=1.53 s t(norm)=0.291824, mflops=17.1336 (err=1.6e-14) 22. Ooura (C): elapsed time t=1.19 s, 2048 iters, t-(init.)=1.16 s t(norm)=0.0553131, mflops=90.3945 (err=2.2e-15) 23. Ransom: elapsed time t=1.67 s, 1024 iters, t-(init.)=1.65 s t(norm)=0.157356, mflops=31.775 (err=2.3e-15) 24. Singleton (f2c): elapsed time t=1.62 s, 2048 iters, t-(init.)=1.59 s t(norm)=0.0758171, mflops=65.9482 (err=2.8e-15) 25. Temperton (f2c): elapsed time t=1.61 s, 1024 iters, t-(init.)=1.59 s t(norm)=0.151634, mflops=32.9741 (err=1.9e-15) 26. Valkenburg: elapsed time t=1.35 s, 128 iters, t-(init.)=1.35 s t(norm)=1.02997, mflops=4.85452 (err=2.4e-15) 27. SGIMATH: elapsed time t=1.6 s, 1024 iters, t-(init.)=1.58 s t(norm)=0.150681, mflops=33.1828 (err=1.9e-15) Top mflops for N=1024 = 99.8644 Normalized results and averages for N=1024: fft 0: mflops = 59.5782 (norm. = 0.596591), norm. avg. (of 10) = 0.441387 fft 1: mflops = 57.2992 (norm. = 0.57377), norm. avg. (of 10) = 0.435292 fft 2: mflops = 37.4491 (norm. = 0.375), norm. avg. (of 10) = 0.275531 fft 3: mflops = 22.9951 (norm. = 0.230263), norm. avg. (of 10) = 0.0960988 fft 4: mflops = 8.09086 (norm. = 0.0810185), norm. avg. (of 10) = 0.0541728 fft 5: mflops = 42.2813 (norm. = 0.423387), norm. avg. (of 10) = 0.213003 fft 6: mflops = 96.1996 (norm. = 0.963303), norm. avg. (of 10) = 0.46846 fft 7: mflops = 96.1996 (norm. 
= 0.963303), norm. avg. (of 10) = 0.471384 fft 8: mflops = 35.4249 (norm. = 0.35473), norm. avg. (of 9) = 0.231949 fft 9: mflops = 44.4312 (norm. = 0.444915), norm. avg. (of 10) = 0.340686 fft 10: mflops = 89.6219 (norm. = 0.897436), norm. avg. (of 10) = 0.802354 fft 11: mflops = 79.4376 (norm. = 0.795455), norm. avg. (of 10) = 0.740118 fft 12: mflops = 75.4371 (norm. = 0.755396), norm. avg. (of 10) = 0.960367 fft 13: mflops = 99.8644 (norm. = 1), norm. avg. (of 8) = 0.641272 fft 14: mflops = 55.1882 (norm. = 0.552632), norm. avg. (of 10) = 0.406906 fft 15: mflops = 31.023 (norm. = 0.310651), norm. avg. (of 10) = 0.185998 fft 16: mflops = 35.4249 (norm. = 0.35473), norm. avg. (of 10) = 0.203005 fft 17: mflops = 39.4202 (norm. = 0.394737), norm. avg. (of 10) = 0.530523 fft 18: mflops = 41.6102 (norm. = 0.416667), norm. avg. (of 9) = 0.280294 fft 19: mflops = 54.3304 (norm. = 0.544041), norm. avg. (of 9) = 0.355571 fft 20: mflops = 53.4988 (norm. = 0.535714), norm. avg. (of 9) = 0.346172 fft 21: mflops = 17.1336 (norm. = 0.171569), norm. avg. (of 10) = 0.109751 fft 22: mflops = 90.3945 (norm. = 0.905172), norm. avg. (of 10) = 0.573839 fft 23: mflops = 31.775 (norm. = 0.318182), norm. avg. (of 9) = 0.14248 fft 24: mflops = 65.9482 (norm. = 0.660377), norm. avg. (of 10) = 0.326554 fft 25: mflops = 32.9741 (norm. = 0.330189), norm. avg. (of 10) = 0.181114 fft 26: mflops = 4.85452 (norm. = 0.0486111), norm. avg. (of 10) = 0.05095 fft 27: mflops = 33.1828 (norm. = 0.332278), norm. avg. (of 10) = 0.132733 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.97 s, 1024 iters, t-(init.)=1.94 s t(norm)=0.0840967, mflops=59.4553 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.01 s, 512 iters, t-(init.)=0.99 s t(norm)=0.0858307, mflops=58.2542 (err=1.4e-15) 2. Arndt Split-Radix: elapsed time t=1.55 s, 512 iters, t-(init.)=1.53 s t(norm)=0.132647, mflops=37.6939 (err=1.5e-15) 3. 
Arndt 4-step: elapsed time t=1.34 s, 256 iters, t-(init.)=1.33 s t(norm)=0.230616, mflops=21.6811 (err=1.4e-15) 4. Beauregard: elapsed time t=1.81 s, 128 iters, t-(init.)=1.8 s t(norm)=0.624223, mflops=8.00996 (err=1.4e-15) 5. Bergland: elapsed time t=1.35 s, 512 iters, t-(init.)=1.33 s t(norm)=0.115308, mflops=43.3622 (err=1.5e-15) 6. CWP (min N) (N=2145): elapsed time t=1.59 s, 1024 iters, t-(init.)=1.54 s t(norm)=0.0667572, mflops=74.8983 7. CWP (best N) (N=2184): elapsed time t=1.43 s, 1024 iters, t-(init.)=1.37 s t(norm)=0.0593879, mflops=84.1922 8. Edelblute: elapsed time t=1.64 s, 512 iters, t-(init.)=1.63 s t(norm)=0.141317, mflops=35.3814 (err=1.4e-15) 9. FFTPACK (f2c): elapsed time t=1.16 s, 256 iters, t-(init.)=1.15 s t(norm)=0.199405, mflops=25.0746 (err=1.4e-15) FFTW_MEASURE plan: (cost = 1.562500e-03) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 16 10. FFTW: elapsed time t=1.69 s, 1024 iters, t-(init.)=1.66 s t(norm)=0.0719591, mflops=69.484 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.83 s, 1024 iters, t-(init.)=1.8 s t(norm)=0.0780279, mflops=64.0796 (err=1.4e-15) 12. Frigo-old: elapsed time t=1.3 s, 512 iters, t-(init.)=1.28 s t(norm)=0.110973, mflops=45.056 (err=1.4e-15) 13. Green: elapsed time t=1.36 s, 1024 iters, t-(init.)=1.33 s t(norm)=0.0576539, mflops=86.7243 (err=1.4e-15) 14. GSL: elapsed time t=1.87 s, 512 iters, t-(init.)=1.85 s t(norm)=0.160391, mflops=31.1739 (err=1.4e-15) 15. GSL DIT: elapsed time t=1.88 s, 512 iters, t-(init.)=1.85 s t(norm)=0.160391, mflops=31.1739 (err=2.0e-15) 16. GSL DIF: elapsed time t=1.69 s, 512 iters, t-(init.)=1.67 s t(norm)=0.144785, mflops=34.5339 (err=2.3e-15) 17. Krukar: elapsed time t=1.65 s, 512 iters, t-(init.)=1.63 s t(norm)=0.141317, mflops=35.3814 (err=1.4e-15) 18. Mayer (Buneman): elapsed time t=1.33 s, 512 iters, t-(init.)=1.32 s t(norm)=0.114441, mflops=43.6907 (err=1.4e-15) 19. 
Mayer (simple): elapsed time t=1.03 s, 512 iters, t-(init.)=1.01 s t(norm)=0.0875646, mflops=57.1007 20. Mayer (lookup): elapsed time t=1.1 s, 512 iters, t-(init.)=1.08 s t(norm)=0.0936335, mflops=53.3997 (err=1.4e-15) 21. NAPACK (f2c): elapsed time t=1.29 s, 128 iters, t-(init.)=1.28 s t(norm)=0.443892, mflops=11.264 (err=1.5e-14) 22. Ooura (C): elapsed time t=1.41 s, 1024 iters, t-(init.)=1.37 s t(norm)=0.0593879, mflops=84.1922 (err=1.4e-15) 23. Ransom: elapsed time t=1.96 s, 512 iters, t-(init.)=1.95 s t(norm)=0.16906, mflops=29.5752 (err=2.0e-15) 24. Singleton (f2c): elapsed time t=1.04 s, 512 iters, t-(init.)=1.03 s t(norm)=0.0892986, mflops=55.9919 (err=1.9e-15) 25. Temperton (f2c): elapsed time t=1.03 s, 256 iters, t-(init.)=1.02 s t(norm)=0.176863, mflops=28.2704 (err=1.4e-15) 26. Valkenburg: elapsed time t=1.55 s, 64 iters, t-(init.)=1.55 s t(norm)=1.07505, mflops=4.65094 (err=1.7e-15) 27. SGIMATH: elapsed time t=1.01 s, 256 iters, t-(init.)=1 s t(norm)=0.173395, mflops=28.8358 (err=1.4e-15) Top mflops for N=2048 = 86.7243 Normalized results and averages for N=2048: fft 0: mflops = 59.4553 (norm. = 0.685567), norm. avg. (of 11) = 0.463585 fft 1: mflops = 58.2542 (norm. = 0.671717), norm. avg. (of 11) = 0.456785 fft 2: mflops = 37.6939 (norm. = 0.434641), norm. avg. (of 11) = 0.289995 fft 3: mflops = 21.6811 (norm. = 0.25), norm. avg. (of 11) = 0.11009 fft 4: mflops = 8.00996 (norm. = 0.0923611), norm. avg. (of 11) = 0.0576445 fft 5: mflops = 43.3622 (norm. = 0.5), norm. avg. (of 11) = 0.239093 fft 6: mflops = 74.8983 (norm. = 0.863636), norm. avg. (of 11) = 0.504385 fft 7: mflops = 84.1922 (norm. = 0.970803), norm. avg. (of 11) = 0.516786 fft 8: mflops = 35.3814 (norm. = 0.407975), norm. avg. (of 10) = 0.249551 fft 9: mflops = 25.0746 (norm. = 0.28913), norm. avg. (of 11) = 0.335999 fft 10: mflops = 69.484 (norm. = 0.801205), norm. avg. (of 11) = 0.80225 fft 11: mflops = 64.0796 (norm. = 0.738889), norm. avg. 
(of 11) = 0.740006 fft 12: mflops = 45.056 (norm. = 0.519531), norm. avg. (of 11) = 0.920291 fft 13: mflops = 86.7243 (norm. = 1), norm. avg. (of 9) = 0.681131 fft 14: mflops = 31.1739 (norm. = 0.359459), norm. avg. (of 11) = 0.402593 fft 15: mflops = 31.1739 (norm. = 0.359459), norm. avg. (of 11) = 0.201767 fft 16: mflops = 34.5339 (norm. = 0.398204), norm. avg. (of 11) = 0.22075 fft 17: mflops = 35.3814 (norm. = 0.407975), norm. avg. (of 11) = 0.519382 fft 18: mflops = 43.6907 (norm. = 0.503788), norm. avg. (of 10) = 0.302643 fft 19: mflops = 57.1007 (norm. = 0.658416), norm. avg. (of 10) = 0.385855 fft 20: mflops = 53.3997 (norm. = 0.615741), norm. avg. (of 10) = 0.373129 fft 21: mflops = 11.264 (norm. = 0.129883), norm. avg. (of 11) = 0.111581 fft 22: mflops = 84.1922 (norm. = 0.970803), norm. avg. (of 11) = 0.609927 fft 23: mflops = 29.5752 (norm. = 0.341026), norm. avg. (of 10) = 0.162335 fft 24: mflops = 55.9919 (norm. = 0.645631), norm. avg. (of 11) = 0.355561 fft 25: mflops = 28.2704 (norm. = 0.32598), norm. avg. (of 11) = 0.194283 fft 26: mflops = 4.65094 (norm. = 0.053629), norm. avg. (of 11) = 0.0511935 fft 27: mflops = 28.8358 (norm. = 0.3325), norm. avg. (of 11) = 0.150894 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.39 s, 128 iters, t-(init.)=1.35 s t(norm)=0.214577, mflops=23.3017 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.26 s, 128 iters, t-(init.)=1.21 s t(norm)=0.192324, mflops=25.9978 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.7 s, 128 iters, t-(init.)=1.66 s t(norm)=0.26385, mflops=18.9502 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.6 s, 128 iters, t-(init.)=1.56 s t(norm)=0.247955, mflops=20.1649 (err=3.7e-15) 4. Beauregard: elapsed time t=1.04 s, 32 iters, t-(init.)=1.03 s t(norm)=0.654856, mflops=7.63526 (err=3.8e-15) 5. Bergland: elapsed time t=1.11 s, 128 iters, t-(init.)=1.06 s t(norm)=0.168482, mflops=29.6767 (err=3.9e-15) 6. 
CWP (min N) (N=4290): elapsed time t=1.1 s, 256 iters, t-(init.)=1.02 s t(norm)=0.0810623, mflops=61.6809 7. CWP (best N) (N=4368): elapsed time t=1.9 s, 512 iters, t-(init.)=1.72 s t(norm)=0.0683467, mflops=73.1565 8. Edelblute: elapsed time t=1.74 s, 128 iters, t-(init.)=1.7 s t(norm)=0.270208, mflops=18.5043 (err=3.7e-15) 9. FFTPACK (f2c): elapsed time t=1.46 s, 128 iters, t-(init.)=1.42 s t(norm)=0.225703, mflops=22.153 (err=3.8e-15) FFTW_MEASURE plan: (cost = 5.312500e-03) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 16 10. FFTW: elapsed time t=1.33 s, 256 iters, t-(init.)=1.23 s t(norm)=0.0977516, mflops=51.15 (err=3.8e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.4 s, 256 iters, t-(init.)=1.32 s t(norm)=0.104904, mflops=47.6625 (err=3.8e-15) 12. Frigo-old: elapsed time t=1.94 s, 256 iters, t-(init.)=1.86 s t(norm)=0.14782, mflops=33.825 (err=3.8e-15) 13. Green: elapsed time t=1.48 s, 256 iters, t-(init.)=1.39 s t(norm)=0.110467, mflops=45.2623 (err=3.8e-15) 14. GSL: elapsed time t=1.39 s, 128 iters, t-(init.)=1.35 s t(norm)=0.214577, mflops=23.3017 (err=3.8e-15) 15. GSL DIT: elapsed time t=1.98 s, 128 iters, t-(init.)=1.94 s t(norm)=0.308355, mflops=16.2151 (err=4.1e-15) 16. GSL DIF: elapsed time t=1.91 s, 128 iters, t-(init.)=1.87 s t(norm)=0.297228, mflops=16.8221 (err=4.3e-15) 17. Krukar: elapsed time t=1.12 s, 128 iters, t-(init.)=1.07 s t(norm)=0.170072, mflops=29.3993 (err=3.8e-15) 18. Mayer (Buneman): elapsed time t=1.57 s, 256 iters, t-(init.)=1.49 s t(norm)=0.118415, mflops=42.2245 (err=3.7e-15) 19. Mayer (simple): elapsed time t=1.26 s, 256 iters, t-(init.)=1.17 s t(norm)=0.0929832, mflops=53.7731 20. Mayer (lookup): elapsed time t=1.62 s, 256 iters, t-(init.)=1.54 s t(norm)=0.122388, mflops=40.8536 (err=3.7e-15) 21. NAPACK (f2c): elapsed time t=1.53 s, 64 iters, t-(init.)=1.51 s t(norm)=0.480016, mflops=10.4163 (err=4.9e-14) 22. 
Ooura (C): elapsed time t=1.56 s, 256 iters, t-(init.)=1.47 s t(norm)=0.116825, mflops=42.799 (err=3.9e-15) 23. Ransom: elapsed time t=1.12 s, 128 iters, t-(init.)=1.08 s t(norm)=0.171661, mflops=29.1271 (err=4.3e-15) 24. Singleton (f2c): elapsed time t=1.12 s, 128 iters, t-(init.)=1.08 s t(norm)=0.171661, mflops=29.1271 (err=5.8e-15) 25. Temperton (f2c): elapsed time t=1.65 s, 128 iters, t-(init.)=1.61 s t(norm)=0.255903, mflops=19.5387 (err=3.8e-15) 26. Valkenburg: elapsed time t=1.88 s, 32 iters, t-(init.)=1.87 s t(norm)=1.18891, mflops=4.20552 (err=4.0e-15) 27. SGIMATH: elapsed time t=1.47 s, 128 iters, t-(init.)=1.43 s t(norm)=0.227292, mflops=21.9981 (err=3.8e-15) Top mflops for N=4096 = 73.1565 Normalized results and averages for N=4096: fft 0: mflops = 23.3017 (norm. = 0.318519), norm. avg. (of 12) = 0.451496 fft 1: mflops = 25.9978 (norm. = 0.355372), norm. avg. (of 12) = 0.448334 fft 2: mflops = 18.9502 (norm. = 0.259036), norm. avg. (of 12) = 0.287415 fft 3: mflops = 20.1649 (norm. = 0.275641), norm. avg. (of 12) = 0.123886 fft 4: mflops = 7.63526 (norm. = 0.104369), norm. avg. (of 12) = 0.0615382 fft 5: mflops = 29.6767 (norm. = 0.40566), norm. avg. (of 12) = 0.252974 fft 6: mflops = 61.6809 (norm. = 0.843137), norm. avg. (of 12) = 0.532615 fft 7: mflops = 73.1565 (norm. = 1), norm. avg. (of 12) = 0.557054 fft 8: mflops = 18.5043 (norm. = 0.252941), norm. avg. (of 11) = 0.24986 fft 9: mflops = 22.153 (norm. = 0.302817), norm. avg. (of 12) = 0.333234 fft 10: mflops = 51.15 (norm. = 0.699187), norm. avg. (of 12) = 0.793661 fft 11: mflops = 47.6625 (norm. = 0.651515), norm. avg. (of 12) = 0.732632 fft 12: mflops = 33.825 (norm. = 0.462366), norm. avg. (of 12) = 0.882131 fft 13: mflops = 45.2623 (norm. = 0.618705), norm. avg. (of 10) = 0.674888 fft 14: mflops = 23.3017 (norm. = 0.318519), norm. avg. (of 12) = 0.395587 fft 15: mflops = 16.2151 (norm. = 0.221649), norm. avg. (of 12) = 0.203424 fft 16: mflops = 16.8221 (norm. = 0.229947), norm. avg. 
(of 12) = 0.221517 fft 17: mflops = 29.3993 (norm. = 0.401869), norm. avg. (of 12) = 0.50959 fft 18: mflops = 42.2245 (norm. = 0.577181), norm. avg. (of 11) = 0.327601 fft 19: mflops = 53.7731 (norm. = 0.735043), norm. avg. (of 11) = 0.417599 fft 20: mflops = 40.8536 (norm. = 0.558442), norm. avg. (of 11) = 0.389976 fft 21: mflops = 10.4163 (norm. = 0.142384), norm. avg. (of 12) = 0.114148 fft 22: mflops = 42.799 (norm. = 0.585034), norm. avg. (of 12) = 0.607853 fft 23: mflops = 29.1271 (norm. = 0.398148), norm. avg. (of 11) = 0.183772 fft 24: mflops = 29.1271 (norm. = 0.398148), norm. avg. (of 12) = 0.35911 fft 25: mflops = 19.5387 (norm. = 0.267081), norm. avg. (of 12) = 0.20035 fft 26: mflops = 4.20552 (norm. = 0.0574866), norm. avg. (of 12) = 0.051718 fft 27: mflops = 21.9981 (norm. = 0.300699), norm. avg. (of 12) = 0.163377 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1 s, 32 iters, t-(init.)=0.96 s t(norm)=0.281701, mflops=17.7493 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.02 s, 32 iters, t-(init.)=0.99 s t(norm)=0.290504, mflops=17.2115 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.34 s, 32 iters, t-(init.)=1.31 s t(norm)=0.384404, mflops=13.0071 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.1 s, 32 iters, t-(init.)=1.07 s t(norm)=0.313979, mflops=15.9246 (err=3.7e-15) 4. Beauregard: elapsed time t=1.2 s, 16 iters, t-(init.)=1.18 s t(norm)=0.692514, mflops=7.22007 (err=3.7e-15) 5. Bergland: elapsed time t=1.65 s, 64 iters, t-(init.)=1.59 s t(norm)=0.233283, mflops=21.4332 (err=3.7e-15) 6. CWP (min N) (N=8580): elapsed time t=1.43 s, 128 iters, t-(init.)=1.29 s t(norm)=0.0946338, mflops=52.8352 7. CWP (best N) (N=9240): elapsed time t=1.47 s, 128 iters, t-(init.)=1.32 s t(norm)=0.0968346, mflops=51.6344 8. Edelblute: elapsed time t=1.36 s, 32 iters, t-(init.)=1.33 s t(norm)=0.390273, mflops=12.8115 (err=3.7e-15) 9. 
FFTPACK (f2c): elapsed time t=1.12 s, 32 iters, t-(init.)=1.09 s t(norm)=0.319848, mflops=15.6324 (err=3.7e-15) FFTW_MEASURE plan: (cost = 1.437500e-02) FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 16 10. FFTW: elapsed time t=1.92 s, 128 iters, t-(init.)=1.79 s t(norm)=0.131314, mflops=38.0768 (err=3.7e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.92 s, 128 iters, t-(init.)=1.79 s t(norm)=0.131314, mflops=38.0768 (err=3.7e-15) 12. Frigo-old: elapsed time t=1.42 s, 64 iters, t-(init.)=1.36 s t(norm)=0.199538, mflops=25.0579 (err=3.7e-15) 13. Green: elapsed time t=1.19 s, 64 iters, t-(init.)=1.13 s t(norm)=0.165793, mflops=30.1582 (err=3.7e-15) 14. GSL: elapsed time t=1.03 s, 32 iters, t-(init.)=1 s t(norm)=0.293438, mflops=17.0394 (err=3.7e-15) 15. GSL DIT: elapsed time t=1.47 s, 32 iters, t-(init.)=1.44 s t(norm)=0.422551, mflops=11.8329 (err=4.3e-15) 16. GSL DIF: elapsed time t=1.41 s, 32 iters, t-(init.)=1.38 s t(norm)=0.404945, mflops=12.3474 (err=4.3e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.68 s, 64 iters, t-(init.)=1.62 s t(norm)=0.237685, mflops=21.0362 (err=3.7e-15) 19. Mayer (simple): elapsed time t=1.53 s, 64 iters, t-(init.)=1.47 s t(norm)=0.215677, mflops=23.1828 20. Mayer (lookup): elapsed time t=1.69 s, 64 iters, t-(init.)=1.62 s t(norm)=0.237685, mflops=21.0362 (err=3.7e-15) 21. NAPACK (f2c): elapsed time t=1.05 s, 16 iters, t-(init.)=1.03 s t(norm)=0.604483, mflops=8.27153 (err=4.5e-14) 22. Ooura (C): elapsed time t=1.05 s, 64 iters, t-(init.)=0.99 s t(norm)=0.145252, mflops=34.4229 (err=3.7e-15) 23. Ransom: elapsed time t=1.55 s, 64 iters, t-(init.)=1.49 s t(norm)=0.218611, mflops=22.8716 (err=4.8e-15) 24. Singleton (f2c): elapsed time t=1.72 s, 64 iters, t-(init.)=1.66 s t(norm)=0.243554, mflops=20.5293 (err=5.6e-15) 25. 
Temperton (f2c): elapsed time t=1.17 s, 32 iters, t-(init.)=1.14 s t(norm)=0.33452, mflops=14.9468 (err=3.7e-15) 26. Valkenburg: elapsed time t=1.08 s, 8 iters, t-(init.)=1.07 s t(norm)=1.25592, mflops=3.98116 (err=3.8e-15) 27. SGIMATH: elapsed time t=1.4 s, 64 iters, t-(init.)=1.34 s t(norm)=0.196604, mflops=25.4319 (err=3.7e-15) Top mflops for N=8192 = 52.8352 Normalized results and averages for N=8192: fft 0: mflops = 17.7493 (norm. = 0.335938), norm. avg. (of 13) = 0.442607 fft 1: mflops = 17.2115 (norm. = 0.325758), norm. avg. (of 13) = 0.438905 fft 2: mflops = 13.0071 (norm. = 0.246183), norm. avg. (of 13) = 0.284243 fft 3: mflops = 15.9246 (norm. = 0.301402), norm. avg. (of 13) = 0.137541 fft 4: mflops = 7.22007 (norm. = 0.136653), norm. avg. (of 13) = 0.0673162 fft 5: mflops = 21.4332 (norm. = 0.40566), norm. avg. (of 13) = 0.264719 fft 6: mflops = 52.8352 (norm. = 1), norm. avg. (of 13) = 0.568567 fft 7: mflops = 51.6344 (norm. = 0.977273), norm. avg. (of 13) = 0.589378 fft 8: mflops = 12.8115 (norm. = 0.242481), norm. avg. (of 12) = 0.249245 fft 9: mflops = 15.6324 (norm. = 0.295872), norm. avg. (of 13) = 0.33036 fft 10: mflops = 38.0768 (norm. = 0.72067), norm. avg. (of 13) = 0.788047 fft 11: mflops = 38.0768 (norm. = 0.72067), norm. avg. (of 13) = 0.731711 fft 12: mflops = 25.0579 (norm. = 0.474265), norm. avg. (of 13) = 0.850756 fft 13: mflops = 30.1582 (norm. = 0.570796), norm. avg. (of 11) = 0.665425 fft 14: mflops = 17.0394 (norm. = 0.3225), norm. avg. (of 13) = 0.389964 fft 15: mflops = 11.8329 (norm. = 0.223958), norm. avg. (of 13) = 0.205003 fft 16: mflops = 12.3474 (norm. = 0.233696), norm. avg. (of 13) = 0.222454 fft 17: mflops = -1 (norm. = -0.0189268), norm. avg. (of 12) = 0.50959 fft 18: mflops = 21.0362 (norm. = 0.398148), norm. avg. (of 12) = 0.33348 fft 19: mflops = 23.1828 (norm. = 0.438776), norm. avg. (of 12) = 0.419364 fft 20: mflops = 21.0362 (norm. = 0.398148), norm. avg. (of 12) = 0.390657 fft 21: mflops = 8.27153 (norm. 
= 0.156553), norm. avg. (of 13) = 0.11741 fft 22: mflops = 34.4229 (norm. = 0.651515), norm. avg. (of 13) = 0.611211 fft 23: mflops = 22.8716 (norm. = 0.432886), norm. avg. (of 12) = 0.204532 fft 24: mflops = 20.5293 (norm. = 0.388554), norm. avg. (of 13) = 0.361375 fft 25: mflops = 14.9468 (norm. = 0.282895), norm. avg. (of 13) = 0.206699 fft 26: mflops = 3.98116 (norm. = 0.0753505), norm. avg. (of 13) = 0.0535359 fft 27: mflops = 25.4319 (norm. = 0.481343), norm. avg. (of 13) = 0.187836 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.4 s, 16 iters, t-(init.)=1.36 s t(norm)=0.370571, mflops=13.4927 (err=6.8e-15) 1. Arndt DIT: elapsed time t=1.37 s, 16 iters, t-(init.)=1.34 s t(norm)=0.365121, mflops=13.6941 (err=6.8e-15) 2. Arndt Split-Radix: elapsed time t=1.77 s, 16 iters, t-(init.)=1.74 s t(norm)=0.474112, mflops=10.546 (err=6.8e-15) 3. Arndt 4-step: elapsed time t=1.07 s, 16 iters, t-(init.)=1.04 s t(norm)=0.283378, mflops=17.6443 (err=6.8e-15) 4. Beauregard: elapsed time t=1.33 s, 8 iters, t-(init.)=1.31 s t(norm)=0.713893, mflops=7.00385 (err=6.8e-15) 5. Bergland: elapsed time t=1.93 s, 32 iters, t-(init.)=1.86 s t(norm)=0.253405, mflops=19.7313 (err=6.8e-15) 6. CWP (min N) (N=17160): elapsed time t=1.63 s, 64 iters, t-(init.)=1.47 s t(norm)=0.100136, mflops=49.9322 7. CWP (best N) (N=17160): elapsed time t=1.62 s, 64 iters, t-(init.)=1.47 s t(norm)=0.100136, mflops=49.9322 8. Edelblute: elapsed time t=1.79 s, 16 iters, t-(init.)=1.76 s t(norm)=0.479562, mflops=10.4262 (err=6.8e-15) 9. FFTPACK (f2c): elapsed time t=1.34 s, 16 iters, t-(init.)=1.3 s t(norm)=0.354222, mflops=14.1154 (err=6.8e-15) FFTW_MEASURE plan: (cost = 3.625000e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_NOTW 16 10. FFTW: elapsed time t=1.2 s, 32 iters, t-(init.)=1.12 s t(norm)=0.152588, mflops=32.768 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. 
FFTW_ESTIMATE: elapsed time t=1.27 s, 32 iters, t-(init.)=1.2 s t(norm)=0.163487, mflops=30.5835 (err=6.8e-15) 12. Frigo-old: elapsed time t=1.86 s, 32 iters, t-(init.)=1.78 s t(norm)=0.242506, mflops=20.6181 (err=6.8e-15) 13. Green: elapsed time t=1.51 s, 32 iters, t-(init.)=1.43 s t(norm)=0.194822, mflops=25.6644 (err=6.8e-15) 14. GSL: elapsed time t=1.27 s, 16 iters, t-(init.)=1.23 s t(norm)=0.335148, mflops=14.9188 (err=6.8e-15) 15. GSL DIT: elapsed time t=1.87 s, 16 iters, t-(init.)=1.84 s t(norm)=0.50136, mflops=9.97287 (err=7.2e-15) 16. GSL DIF: elapsed time t=1.81 s, 16 iters, t-(init.)=1.77 s t(norm)=0.482287, mflops=10.3673 (err=7.3e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.15 s, 16 iters, t-(init.)=1.11 s t(norm)=0.302451, mflops=16.5316 (err=6.8e-15) 19. Mayer (simple): elapsed time t=1.07 s, 16 iters, t-(init.)=1.03 s t(norm)=0.280653, mflops=17.8156 20. Mayer (lookup): elapsed time t=1.15 s, 16 iters, t-(init.)=1.11 s t(norm)=0.302451, mflops=16.5316 (err=6.8e-15) 21. NAPACK (f2c): elapsed time t=1.32 s, 8 iters, t-(init.)=1.3 s t(norm)=0.708444, mflops=7.05772 (err=2.3e-13) 22. Ooura (C): elapsed time t=1.41 s, 32 iters, t-(init.)=1.34 s t(norm)=0.182561, mflops=27.3882 (err=6.8e-15) 23. Ransom: elapsed time t=1.51 s, 32 iters, t-(init.)=1.43 s t(norm)=0.194822, mflops=25.6644 (err=7.3e-15) 24. Singleton (f2c): elapsed time t=1.1 s, 16 iters, t-(init.)=1.06 s t(norm)=0.288827, mflops=17.3114 (err=1.0e-14) 25. Temperton (f2c): elapsed time t=1.42 s, 16 iters, t-(init.)=1.38 s t(norm)=0.37602, mflops=13.2972 (err=6.8e-15) 26. Valkenburg: elapsed time t=1.28 s, 4 iters, t-(init.)=1.27 s t(norm)=1.38419, mflops=3.61222 (err=6.9e-15) 27. SGIMATH: elapsed time t=1.69 s, 32 iters, t-(init.)=1.61 s t(norm)=0.219345, mflops=22.7951 (err=6.8e-15) Top mflops for N=16384 = 49.9322 Normalized results and averages for N=16384: fft 0: mflops = 13.4927 (norm. = 0.270221), norm. avg. 
(of 14) = 0.430294 fft 1: mflops = 13.6941 (norm. = 0.274254), norm. avg. (of 14) = 0.427145 fft 2: mflops = 10.546 (norm. = 0.211207), norm. avg. (of 14) = 0.279027 fft 3: mflops = 17.6443 (norm. = 0.353365), norm. avg. (of 14) = 0.152957 fft 4: mflops = 7.00385 (norm. = 0.140267), norm. avg. (of 14) = 0.072527 fft 5: mflops = 19.7313 (norm. = 0.395161), norm. avg. (of 14) = 0.274036 fft 6: mflops = 49.9322 (norm. = 1), norm. avg. (of 14) = 0.599384 fft 7: mflops = 49.9322 (norm. = 1), norm. avg. (of 14) = 0.618708 fft 8: mflops = 10.4262 (norm. = 0.208807), norm. avg. (of 13) = 0.246134 fft 9: mflops = 14.1154 (norm. = 0.282692), norm. avg. (of 14) = 0.326955 fft 10: mflops = 32.768 (norm. = 0.65625), norm. avg. (of 14) = 0.778633 fft 11: mflops = 30.5835 (norm. = 0.6125), norm. avg. (of 14) = 0.723196 fft 12: mflops = 20.6181 (norm. = 0.412921), norm. avg. (of 14) = 0.819483 fft 13: mflops = 25.6644 (norm. = 0.513986), norm. avg. (of 12) = 0.652805 fft 14: mflops = 14.9188 (norm. = 0.29878), norm. avg. (of 14) = 0.383451 fft 15: mflops = 9.97287 (norm. = 0.199728), norm. avg. (of 14) = 0.204627 fft 16: mflops = 10.3673 (norm. = 0.207627), norm. avg. (of 14) = 0.221394 fft 17: mflops = -1 (norm. = -0.0200272), norm. avg. (of 12) = 0.50959 fft 18: mflops = 16.5316 (norm. = 0.331081), norm. avg. (of 13) = 0.333296 fft 19: mflops = 17.8156 (norm. = 0.356796), norm. avg. (of 13) = 0.414551 fft 20: mflops = 16.5316 (norm. = 0.331081), norm. avg. (of 13) = 0.386074 fft 21: mflops = 7.05772 (norm. = 0.141346), norm. avg. (of 14) = 0.11912 fft 22: mflops = 27.3882 (norm. = 0.548507), norm. avg. (of 14) = 0.606732 fft 23: mflops = 25.6644 (norm. = 0.513986), norm. avg. (of 13) = 0.228336 fft 24: mflops = 17.3114 (norm. = 0.346698), norm. avg. (of 14) = 0.360326 fft 25: mflops = 13.2972 (norm. = 0.266304), norm. avg. (of 14) = 0.210957 fft 26: mflops = 3.61222 (norm. = 0.0723425), norm. avg. (of 14) = 0.0548792 fft 27: mflops = 22.7951 (norm. = 0.456522), norm. avg. 
(of 14) = 0.207028 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.64 s, 8 iters, t-(init.)=1.59 s t(norm)=0.404358, mflops=12.3653 (err=1.4e-14) 1. Arndt DIT: elapsed time t=1.66 s, 8 iters, t-(init.)=1.61 s t(norm)=0.409444, mflops=12.2117 (err=1.4e-14) 2. Arndt Split-Radix: elapsed time t=1.12 s, 4 iters, t-(init.)=1.1 s t(norm)=0.559489, mflops=8.93673 (err=1.4e-14) 3. Arndt 4-step: elapsed time t=1.44 s, 8 iters, t-(init.)=1.39 s t(norm)=0.353495, mflops=14.1445 (err=1.4e-14) 4. Beauregard: elapsed time t=1.48 s, 4 iters, t-(init.)=1.46 s t(norm)=0.742594, mflops=6.73315 (err=1.4e-14) 5. Bergland: elapsed time t=1.17 s, 8 iters, t-(init.)=1.13 s t(norm)=0.287374, mflops=17.3989 (err=1.4e-14) 6. CWP (min N) (N=34320): elapsed time t=1.91 s, 32 iters, t-(init.)=1.73 s t(norm)=0.10999, mflops=45.4585 7. CWP (best N) (N=34320): elapsed time t=1.91 s, 32 iters, t-(init.)=1.73 s t(norm)=0.10999, mflops=45.4585 8. Edelblute: elapsed time t=1.11 s, 4 iters, t-(init.)=1.07 s t(norm)=0.54423, mflops=9.18729 (err=1.4e-14) 9. FFTPACK (f2c): elapsed time t=1.7 s, 8 iters, t-(init.)=1.66 s t(norm)=0.42216, mflops=11.8439 (err=1.4e-14) FFTW_MEASURE plan: (cost = 8.750000e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_NOTW 16 10. FFTW: elapsed time t=1.4 s, 16 iters, t-(init.)=1.31 s t(norm)=0.166575, mflops=30.0165 (err=1.4e-14) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.46 s, 16 iters, t-(init.)=1.37 s t(norm)=0.174205, mflops=28.7019 (err=1.4e-14) 12. Frigo-old: elapsed time t=1.16 s, 8 iters, t-(init.)=1.12 s t(norm)=0.284831, mflops=17.5543 (err=1.4e-14) 13. Green: elapsed time t=1.9 s, 16 iters, t-(init.)=1.81 s t(norm)=0.230153, mflops=21.7246 (err=1.4e-14) 14. GSL: elapsed time t=1.48 s, 8 iters, t-(init.)=1.44 s t(norm)=0.366211, mflops=13.6533 (err=1.4e-14) 15. 
GSL DIT: elapsed time t=1.15 s, 4 iters, t-(init.)=1.13 s t(norm)=0.574748, mflops=8.69947 (err=1.4e-14) 16. GSL DIF: elapsed time t=1.13 s, 4 iters, t-(init.)=1.1 s t(norm)=0.559489, mflops=8.93673 (err=1.4e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.45 s, 8 iters, t-(init.)=1.4 s t(norm)=0.356038, mflops=14.0434 (err=1.4e-14) 19. Mayer (simple): elapsed time t=1.39 s, 8 iters, t-(init.)=1.34 s t(norm)=0.34078, mflops=14.6722 20. Mayer (lookup): elapsed time t=1.49 s, 8 iters, t-(init.)=1.44 s t(norm)=0.366211, mflops=13.6533 (err=1.4e-14) 21. NAPACK (f2c): elapsed time t=1.73 s, 4 iters, t-(init.)=1.71 s t(norm)=0.869751, mflops=5.74877 (err=5.6e-13) 22. Ooura (C): elapsed time t=1.66 s, 16 iters, t-(init.)=1.57 s t(norm)=0.199636, mflops=25.0456 (err=1.4e-14) 23. Ransom: elapsed time t=1.95 s, 16 iters, t-(init.)=1.86 s t(norm)=0.236511, mflops=21.1406 (err=1.5e-14) 24. Singleton (f2c): elapsed time t=1.57 s, 8 iters, t-(init.)=1.52 s t(norm)=0.386556, mflops=12.9347 (err=2.1e-14) 25. Temperton (f2c): elapsed time t=1.75 s, 8 iters, t-(init.)=1.71 s t(norm)=0.434875, mflops=11.4975 (err=1.4e-14) 26. Valkenburg: elapsed time t=1.53 s, 2 iters, t-(init.)=1.52 s t(norm)=1.54622, mflops=3.23368 (err=1.4e-14) 27. SGIMATH: elapsed time t=1.01 s, 8 iters, t-(init.)=0.97 s t(norm)=0.246684, mflops=20.2689 (err=1.4e-14) Top mflops for N=32768 = 45.4585 Normalized results and averages for N=32768: fft 0: mflops = 12.3653 (norm. = 0.272013), norm. avg. (of 15) = 0.419741 fft 1: mflops = 12.2117 (norm. = 0.268634), norm. avg. (of 15) = 0.416577 fft 2: mflops = 8.93673 (norm. = 0.196591), norm. avg. (of 15) = 0.273531 fft 3: mflops = 14.1445 (norm. = 0.311151), norm. avg. (of 15) = 0.163503 fft 4: mflops = 6.73315 (norm. = 0.148116), norm. avg. (of 15) = 0.0775663 fft 5: mflops = 17.3989 (norm. = 0.382743), norm. avg. (of 15) = 0.281284 fft 6: mflops = 45.4585 (norm. = 1), norm. avg. 
(of 15) = 0.626092 fft 7: mflops = 45.4585 (norm. = 1), norm. avg. (of 15) = 0.644128 fft 8: mflops = 9.18729 (norm. = 0.202103), norm. avg. (of 14) = 0.242989 fft 9: mflops = 11.8439 (norm. = 0.260542), norm. avg. (of 15) = 0.322527 fft 10: mflops = 30.0165 (norm. = 0.660305), norm. avg. (of 15) = 0.770744 fft 11: mflops = 28.7019 (norm. = 0.631387), norm. avg. (of 15) = 0.717076 fft 12: mflops = 17.5543 (norm. = 0.386161), norm. avg. (of 15) = 0.790594 fft 13: mflops = 21.7246 (norm. = 0.477901), norm. avg. (of 13) = 0.639351 fft 14: mflops = 13.6533 (norm. = 0.300347), norm. avg. (of 15) = 0.377911 fft 15: mflops = 8.69947 (norm. = 0.191372), norm. avg. (of 15) = 0.203743 fft 16: mflops = 8.93673 (norm. = 0.196591), norm. avg. (of 15) = 0.219741 fft 17: mflops = -1 (norm. = -0.0219981), norm. avg. (of 12) = 0.50959 fft 18: mflops = 14.0434 (norm. = 0.308929), norm. avg. (of 14) = 0.331555 fft 19: mflops = 14.6722 (norm. = 0.322761), norm. avg. (of 14) = 0.407995 fft 20: mflops = 13.6533 (norm. = 0.300347), norm. avg. (of 14) = 0.379951 fft 21: mflops = 5.74877 (norm. = 0.126462), norm. avg. (of 15) = 0.119609 fft 22: mflops = 25.0456 (norm. = 0.550955), norm. avg. (of 15) = 0.603014 fft 23: mflops = 21.1406 (norm. = 0.465054), norm. avg. (of 14) = 0.245244 fft 24: mflops = 12.9347 (norm. = 0.284539), norm. avg. (of 15) = 0.355274 fft 25: mflops = 11.4975 (norm. = 0.252924), norm. avg. (of 15) = 0.213755 fft 26: mflops = 3.23368 (norm. = 0.0711349), norm. avg. (of 15) = 0.0559629 fft 27: mflops = 20.2689 (norm. = 0.445876), norm. avg. (of 15) = 0.222951 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.16 s, 2 iters, t-(init.)=1.14 s t(norm)=0.543594, mflops=9.19804 (err=1.7e-14) 1. Arndt DIT: elapsed time t=1.15 s, 2 iters, t-(init.)=1.13 s t(norm)=0.538826, mflops=9.27943 (err=1.7e-14) 2. Arndt Split-Radix: elapsed time t=1.46 s, 2 iters, t-(init.)=1.43 s t(norm)=0.681877, mflops=7.3327 (err=1.7e-14) 3. 
Arndt 4-step: elapsed time t=1.42 s, 4 iters, t-(init.)=1.36 s t(norm)=0.324249, mflops=15.4202 (err=1.7e-14) 4. Beauregard: elapsed time t=1.66 s, 2 iters, t-(init.)=1.64 s t(norm)=0.782013, mflops=6.39376 (err=1.7e-14) 5. Bergland: elapsed time t=1.49 s, 4 iters, t-(init.)=1.42 s t(norm)=0.338554, mflops=14.7687 (err=1.7e-14) 6. CWP (min N) (N=72072): elapsed time t=1.2 s, 8 iters, t-(init.)=1.08 s t(norm)=0.128746, mflops=38.8361 7. CWP (best N) (N=72072): elapsed time t=1.21 s, 8 iters, t-(init.)=1.09 s t(norm)=0.129938, mflops=38.4799 8. Edelblute: elapsed time t=1.44 s, 2 iters, t-(init.)=1.41 s t(norm)=0.67234, mflops=7.43671 (err=1.7e-14) 9. FFTPACK (f2c): elapsed time t=1.82 s, 4 iters, t-(init.)=1.75 s t(norm)=0.417233, mflops=11.9837 (err=1.7e-14) FFTW_MEASURE plan: (cost = 2.200000e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 16 10. FFTW: elapsed time t=1.7 s, 8 iters, t-(init.)=1.58 s t(norm)=0.188351, mflops=26.5462 (err=1.7e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.76 s, 8 iters, t-(init.)=1.63 s t(norm)=0.194311, mflops=25.7319 (err=1.7e-14) 12. Frigo-old: elapsed time t=1.43 s, 4 iters, t-(init.)=1.37 s t(norm)=0.326633, mflops=15.3077 (err=1.7e-14) 13. Green: elapsed time t=1.2 s, 4 iters, t-(init.)=1.14 s t(norm)=0.271797, mflops=18.3961 (err=1.7e-14) 14. GSL: elapsed time t=1.68 s, 4 iters, t-(init.)=1.63 s t(norm)=0.388622, mflops=12.866 (err=1.7e-14) 15. GSL DIT: elapsed time t=1.49 s, 2 iters, t-(init.)=1.46 s t(norm)=0.696182, mflops=7.18203 (err=1.7e-14) 16. GSL DIF: elapsed time t=1.46 s, 2 iters, t-(init.)=1.44 s t(norm)=0.686646, mflops=7.28178 (err=1.8e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.79 s, 4 iters, t-(init.)=1.74 s t(norm)=0.414848, mflops=12.0526 (err=1.7e-14) 19. 
Mayer (simple): elapsed time t=1.71 s, 4 iters, t-(init.)=1.65 s t(norm)=0.393391, mflops=12.71 20. Mayer (lookup): elapsed time t=1.86 s, 4 iters, t-(init.)=1.81 s t(norm)=0.431538, mflops=11.5865 (err=1.7e-14) 21. NAPACK (f2c): elapsed time t=1.93 s, 2 iters, t-(init.)=1.9 s t(norm)=0.905991, mflops=5.51882 (err=8.6e-13) 22. Ooura (C): elapsed time t=1.07 s, 4 iters, t-(init.)=1.02 s t(norm)=0.243187, mflops=20.5603 (err=1.7e-14) 23. Ransom: elapsed time t=1.06 s, 4 iters, t-(init.)=1.01 s t(norm)=0.240803, mflops=20.7639 (err=1.7e-14) 24. Singleton (f2c): elapsed time t=1.75 s, 4 iters, t-(init.)=1.69 s t(norm)=0.402927, mflops=12.4092 (err=2.3e-14) 25. Temperton (f2c): elapsed time t=1.05 s, 2 iters, t-(init.)=1.02 s t(norm)=0.486374, mflops=10.2802 (err=1.7e-14) 26. Valkenburg: elapsed time t=1.77 s, 1 iters, t-(init.)=1.76 s t(norm)=1.67847, mflops=2.97891 (err=1.7e-14) 27. SGIMATH: elapsed time t=1.1 s, 4 iters, t-(init.)=1.05 s t(norm)=0.25034, mflops=19.9729 (err=1.7e-14) Top mflops for N=65536 = 38.8361 Normalized results and averages for N=65536: fft 0: mflops = 9.19804 (norm. = 0.236842), norm. avg. (of 16) = 0.40831 fft 1: mflops = 9.27943 (norm. = 0.238938), norm. avg. (of 16) = 0.405475 fft 2: mflops = 7.3327 (norm. = 0.188811), norm. avg. (of 16) = 0.268236 fft 3: mflops = 15.4202 (norm. = 0.397059), norm. avg. (of 16) = 0.1781 fft 4: mflops = 6.39376 (norm. = 0.164634), norm. avg. (of 16) = 0.083008 fft 5: mflops = 14.7687 (norm. = 0.380282), norm. avg. (of 16) = 0.287471 fft 6: mflops = 38.8361 (norm. = 1), norm. avg. (of 16) = 0.649461 fft 7: mflops = 38.4799 (norm. = 0.990826), norm. avg. (of 16) = 0.665796 fft 8: mflops = 7.43671 (norm. = 0.191489), norm. avg. (of 15) = 0.239556 fft 9: mflops = 11.9837 (norm. = 0.308571), norm. avg. (of 16) = 0.321655 fft 10: mflops = 26.5462 (norm. = 0.683544), norm. avg. (of 16) = 0.765294 fft 11: mflops = 25.7319 (norm. = 0.662577), norm. avg. (of 16) = 0.71367 fft 12: mflops = 15.3077 (norm. 
= 0.394161), norm. avg. (of 16) = 0.765817 fft 13: mflops = 18.3961 (norm. = 0.473684), norm. avg. (of 14) = 0.627518 fft 14: mflops = 12.866 (norm. = 0.331288), norm. avg. (of 16) = 0.374997 fft 15: mflops = 7.18203 (norm. = 0.184932), norm. avg. (of 16) = 0.202567 fft 16: mflops = 7.28178 (norm. = 0.1875), norm. avg. (of 16) = 0.217726 fft 17: mflops = -1 (norm. = -0.0257492), norm. avg. (of 12) = 0.50959 fft 18: mflops = 12.0526 (norm. = 0.310345), norm. avg. (of 15) = 0.330141 fft 19: mflops = 12.71 (norm. = 0.327273), norm. avg. (of 15) = 0.402613 fft 20: mflops = 11.5865 (norm. = 0.298343), norm. avg. (of 15) = 0.37451 fft 21: mflops = 5.51882 (norm. = 0.142105), norm. avg. (of 16) = 0.121015 fft 22: mflops = 20.5603 (norm. = 0.529412), norm. avg. (of 16) = 0.598414 fft 23: mflops = 20.7639 (norm. = 0.534653), norm. avg. (of 15) = 0.264538 fft 24: mflops = 12.4092 (norm. = 0.319527), norm. avg. (of 16) = 0.35304 fft 25: mflops = 10.2802 (norm. = 0.264706), norm. avg. (of 16) = 0.216939 fft 26: mflops = 2.97891 (norm. = 0.0767045), norm. avg. (of 16) = 0.0572593 fft 27: mflops = 19.9729 (norm. = 0.514286), norm. avg. (of 16) = 0.24116 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.28 s, 1 iters, t-(init.)=1.25 s t(norm)=0.560985, mflops=8.9129 (err=3.3e-14) 1. Arndt DIT: elapsed time t=1.26 s, 1 iters, t-(init.)=1.23 s t(norm)=0.552009, mflops=9.05782 (err=3.3e-14) 2. Arndt Split-Radix: elapsed time t=1.69 s, 1 iters, t-(init.)=1.66 s t(norm)=0.744988, mflops=6.71152 (err=3.3e-14) 3. Arndt 4-step: elapsed time t=1.92 s, 2 iters, t-(init.)=1.86 s t(norm)=0.417373, mflops=11.9797 (err=3.3e-14) 4. Beauregard: elapsed time t=1.81 s, 1 iters, t-(init.)=1.78 s t(norm)=0.798842, mflops=6.25906 (err=3.3e-14) 5. Bergland: elapsed time t=1.69 s, 2 iters, t-(init.)=1.63 s t(norm)=0.365762, mflops=13.6701 (err=3.4e-14) 6. CWP (min N) (N=144144): elapsed time t=1.32 s, 4 iters, t-(init.)=1.19 s t(norm)=0.133514, mflops=37.4491 7. 
CWP (best N) (N=144144): elapsed time t=1.31 s, 4 iters, t-(init.)=1.16 s t(norm)=0.130148, mflops=38.4177 8. Edelblute: elapsed time t=1.71 s, 1 iters, t-(init.)=1.68 s t(norm)=0.753964, mflops=6.63162 (err=3.3e-14) 9. FFTPACK (f2c): elapsed time t=1.13 s, 1 iters, t-(init.)=1.1 s t(norm)=0.493667, mflops=10.1283 (err=3.3e-14) FFTW_MEASURE plan: (cost = 4.600000e-01) FFTW_TWIDDLE 32 FFTW_TWIDDLE 2 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 16 10. FFTW: elapsed time t=1.84 s, 4 iters, t-(init.)=1.72 s t(norm)=0.192979, mflops=25.9096 (err=3.3e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.91 s, 4 iters, t-(init.)=1.78 s t(norm)=0.199711, mflops=25.0362 (err=3.3e-14) 12. Frigo-old: elapsed time t=1.69 s, 2 iters, t-(init.)=1.63 s t(norm)=0.365762, mflops=13.6701 (err=3.3e-14) 13. Green: elapsed time t=1.47 s, 2 iters, t-(init.)=1.41 s t(norm)=0.316395, mflops=15.803 (err=3.3e-14) 14. GSL: elapsed time t=1 s, 1 iters, t-(init.)=0.97 s t(norm)=0.435324, mflops=11.4857 (err=3.3e-14) 15. GSL DIT: elapsed time t=1.74 s, 1 iters, t-(init.)=1.71 s t(norm)=0.767427, mflops=6.51527 (err=3.5e-14) 16. GSL DIF: elapsed time t=1.72 s, 1 iters, t-(init.)=1.69 s t(norm)=0.758452, mflops=6.59238 (err=3.5e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.16 s, 1 iters, t-(init.)=1.13 s t(norm)=0.50713, mflops=9.8594 (err=3.3e-14) 19. Mayer (simple): elapsed time t=1.12 s, 1 iters, t-(init.)=1.09 s t(norm)=0.489179, mflops=10.2212 20. Mayer (lookup): elapsed time t=1.21 s, 1 iters, t-(init.)=1.18 s t(norm)=0.52957, mflops=9.44163 (err=3.3e-14) 21. NAPACK (f2c): elapsed time t=2.14 s, 1 iters, t-(init.)=2.11 s t(norm)=0.946942, mflops=5.28015 (err=2.0e-12) 22. Ooura (C): elapsed time t=1.17 s, 2 iters, t-(init.)=1.11 s t(norm)=0.249077, mflops=20.0741 (err=3.4e-14) 23. 
Ransom: elapsed time t=1.3 s, 2 iters, t-(init.)=1.23 s t(norm)=0.276005, mflops=18.1156 (err=3.3e-14) 24. Singleton (f2c): elapsed time t=1.06 s, 1 iters, t-(init.)=1.02 s t(norm)=0.457764, mflops=10.9227 (err=4.8e-14) 25. Temperton (f2c): elapsed time t=1.25 s, 1 iters, t-(init.)=1.22 s t(norm)=0.547521, mflops=9.13207 (err=3.3e-14) 26. Valkenburg: elapsed time t=3.9 s, 1 iters, t-(init.)=3.87 s t(norm)=1.73681, mflops=2.87884 (err=3.4e-14) 27. SGIMATH: elapsed time t=1.31 s, 2 iters, t-(init.)=1.25 s t(norm)=0.280492, mflops=17.8258 (err=3.3e-14) Top mflops for N=131072 = 38.4177 Normalized results and averages for N=131072: fft 0: mflops = 8.9129 (norm. = 0.232), norm. avg. (of 17) = 0.397939 fft 1: mflops = 9.05782 (norm. = 0.235772), norm. avg. (of 17) = 0.395492 fft 2: mflops = 6.71152 (norm. = 0.174699), norm. avg. (of 17) = 0.262734 fft 3: mflops = 11.9797 (norm. = 0.311828), norm. avg. (of 17) = 0.185967 fft 4: mflops = 6.25906 (norm. = 0.162921), norm. avg. (of 17) = 0.0877088 fft 5: mflops = 13.6701 (norm. = 0.355828), norm. avg. (of 17) = 0.291492 fft 6: mflops = 37.4491 (norm. = 0.97479), norm. avg. (of 17) = 0.668598 fft 7: mflops = 38.4177 (norm. = 1), norm. avg. (of 17) = 0.685456 fft 8: mflops = 6.63162 (norm. = 0.172619), norm. avg. (of 16) = 0.235372 fft 9: mflops = 10.1283 (norm. = 0.263636), norm. avg. (of 17) = 0.318242 fft 10: mflops = 25.9096 (norm. = 0.674419), norm. avg. (of 17) = 0.759948 fft 11: mflops = 25.0362 (norm. = 0.651685), norm. avg. (of 17) = 0.710023 fft 12: mflops = 13.6701 (norm. = 0.355828), norm. avg. (of 17) = 0.7417 fft 13: mflops = 15.803 (norm. = 0.411348), norm. avg. (of 15) = 0.613106 fft 14: mflops = 11.4857 (norm. = 0.298969), norm. avg. (of 17) = 0.370525 fft 15: mflops = 6.51527 (norm. = 0.169591), norm. avg. (of 17) = 0.200627 fft 16: mflops = 6.59238 (norm. = 0.171598), norm. avg. (of 17) = 0.215012 fft 17: mflops = -1 (norm. = -0.0260297), norm. avg. (of 12) = 0.50959 fft 18: mflops = 9.8594 (norm. 
= 0.256637), norm. avg. (of 16) = 0.325547 fft 19: mflops = 10.2212 (norm. = 0.266055), norm. avg. (of 16) = 0.394078 fft 20: mflops = 9.44163 (norm. = 0.245763), norm. avg. (of 16) = 0.366463 fft 21: mflops = 5.28015 (norm. = 0.137441), norm. avg. (of 17) = 0.121981 fft 22: mflops = 20.0741 (norm. = 0.522523), norm. avg. (of 17) = 0.59395 fft 23: mflops = 18.1156 (norm. = 0.471545), norm. avg. (of 16) = 0.277476 fft 24: mflops = 10.9227 (norm. = 0.284314), norm. avg. (of 17) = 0.348997 fft 25: mflops = 9.13207 (norm. = 0.237705), norm. avg. (of 17) = 0.218161 fft 26: mflops = 2.87884 (norm. = 0.0749354), norm. avg. (of 17) = 0.058299 fft 27: mflops = 17.8258 (norm. = 0.464), norm. avg. (of 17) = 0.254268 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=2.97 s, 1 iters, t-(init.)=2.91 s t(norm)=0.616709, mflops=8.10755 (err=4.3e-14) 1. Arndt DIT: elapsed time t=2.92 s, 1 iters, t-(init.)=2.86 s t(norm)=0.606113, mflops=8.24929 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=3.73 s, 1 iters, t-(init.)=3.67 s t(norm)=0.777774, mflops=6.4286 (err=4.3e-14) 3. Arndt 4-step: elapsed time t=1.64 s, 1 iters, t-(init.)=1.58 s t(norm)=0.334846, mflops=14.9323 (err=4.3e-14) 4. Beauregard: elapsed time t=3.8 s, 1 iters, t-(init.)=3.74 s t(norm)=0.792609, mflops=6.30828 (err=4.4e-14) 5. Bergland: elapsed time t=1.76 s, 1 iters, t-(init.)=1.7 s t(norm)=0.360277, mflops=13.8782 (err=4.4e-14) 6. CWP (min N) (N=360360): elapsed time t=1.89 s, 2 iters, t-(init.)=1.71 s t(norm)=0.181198, mflops=27.5941 7. CWP (best N) (N=360360): elapsed time t=1.89 s, 2 iters, t-(init.)=1.72 s t(norm)=0.182258, mflops=27.4337 8. Edelblute: elapsed time t=3.7 s, 1 iters, t-(init.)=3.63 s t(norm)=0.769297, mflops=6.49944 (err=4.3e-14) 9. FFTPACK (f2c): elapsed time t=2.22 s, 1 iters, t-(init.)=2.15 s t(norm)=0.455644, mflops=10.9735 (err=4.4e-14) FFTW_MEASURE plan: (cost = 1.030000e+00) FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 8 10. 
FFTW: elapsed time t=1.02 s, 1 iters, t-(init.)=0.96 s t(norm)=0.203451, mflops=24.576 (err=4.4e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.06 s, 1 iters, t-(init.)=1 s t(norm)=0.211928, mflops=23.593 (err=4.4e-14) 12. Frigo-old: elapsed time t=1.94 s, 1 iters, t-(init.)=1.88 s t(norm)=0.398424, mflops=12.5494 (err=4.4e-14) 13. Green: elapsed time t=1.55 s, 1 iters, t-(init.)=1.48 s t(norm)=0.313653, mflops=15.9412 (err=4.4e-14) 14. GSL: elapsed time t=2.02 s, 1 iters, t-(init.)=1.96 s t(norm)=0.415378, mflops=12.0372 (err=4.4e-14) 15. GSL DIT: elapsed time t=3.75 s, 1 iters, t-(init.)=3.69 s t(norm)=0.782013, mflops=6.39376 (err=4.6e-14) 16. GSL DIF: elapsed time t=3.71 s, 1 iters, t-(init.)=3.65 s t(norm)=0.773536, mflops=6.46382 (err=4.6e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=2.61 s, 1 iters, t-(init.)=2.55 s t(norm)=0.540415, mflops=9.25214 (err=4.3e-14) 19. Mayer (simple): elapsed time t=2.54 s, 1 iters, t-(init.)=2.48 s t(norm)=0.525581, mflops=9.51329 20. Mayer (lookup): elapsed time t=2.7 s, 1 iters, t-(init.)=2.64 s t(norm)=0.559489, mflops=8.93673 (err=4.3e-14) 21. NAPACK (f2c): elapsed time t=4.42 s, 1 iters, t-(init.)=4.36 s t(norm)=0.924004, mflops=5.41123 (err=3.7e-12) 22. Ooura (C): elapsed time t=1.27 s, 1 iters, t-(init.)=1.2 s t(norm)=0.254313, mflops=19.6608 (err=4.4e-14) 23. Ransom: elapsed time t=1.15 s, 1 iters, t-(init.)=1.09 s t(norm)=0.231001, mflops=21.6449 (err=4.3e-14) 24. Singleton (f2c): elapsed time t=2.19 s, 1 iters, t-(init.)=2.13 s t(norm)=0.451406, mflops=11.0765 (err=6.0e-14) 25. Temperton (f2c): elapsed time t=2.59 s, 1 iters, t-(init.)=2.53 s t(norm)=0.536177, mflops=9.32528 (err=4.4e-14) 26. Valkenburg: elapsed time t=8.57 s, 1 iters, t-(init.)=8.51 s t(norm)=1.8035, mflops=2.77238 (err=4.4e-14) 27. 
SGIMATH: elapsed time t=1.32 s, 1 iters, t-(init.)=1.26 s t(norm)=0.267029, mflops=18.7246 (err=4.4e-14) Top mflops for N=262144 = 27.5941 Normalized results and averages for N=262144: fft 0: mflops = 8.10755 (norm. = 0.293814), norm. avg. (of 18) = 0.392154 fft 1: mflops = 8.24929 (norm. = 0.298951), norm. avg. (of 18) = 0.390129 fft 2: mflops = 6.4286 (norm. = 0.23297), norm. avg. (of 18) = 0.26108 fft 3: mflops = 14.9323 (norm. = 0.541139), norm. avg. (of 18) = 0.205699 fft 4: mflops = 6.30828 (norm. = 0.22861), norm. avg. (of 18) = 0.0955366 fft 5: mflops = 13.8782 (norm. = 0.502941), norm. avg. (of 18) = 0.303239 fft 6: mflops = 27.5941 (norm. = 1), norm. avg. (of 18) = 0.687009 fft 7: mflops = 27.4337 (norm. = 0.994186), norm. avg. (of 18) = 0.702607 fft 8: mflops = 6.49944 (norm. = 0.235537), norm. avg. (of 17) = 0.235382 fft 9: mflops = 10.9735 (norm. = 0.397674), norm. avg. (of 18) = 0.322655 fft 10: mflops = 24.576 (norm. = 0.890625), norm. avg. (of 18) = 0.767208 fft 11: mflops = 23.593 (norm. = 0.855), norm. avg. (of 18) = 0.718078 fft 12: mflops = 12.5494 (norm. = 0.454787), norm. avg. (of 18) = 0.725761 fft 13: mflops = 15.9412 (norm. = 0.577703), norm. avg. (of 16) = 0.610894 fft 14: mflops = 12.0372 (norm. = 0.436224), norm. avg. (of 18) = 0.374175 fft 15: mflops = 6.39376 (norm. = 0.231707), norm. avg. (of 18) = 0.202354 fft 16: mflops = 6.46382 (norm. = 0.234247), norm. avg. (of 18) = 0.216081 fft 17: mflops = -1 (norm. = -0.0362396), norm. avg. (of 12) = 0.50959 fft 18: mflops = 9.25214 (norm. = 0.335294), norm. avg. (of 17) = 0.32612 fft 19: mflops = 9.51329 (norm. = 0.344758), norm. avg. (of 17) = 0.391177 fft 20: mflops = 8.93673 (norm. = 0.323864), norm. avg. (of 17) = 0.363957 fft 21: mflops = 5.41123 (norm. = 0.196101), norm. avg. (of 18) = 0.126099 fft 22: mflops = 19.6608 (norm. = 0.7125), norm. avg. (of 18) = 0.600536 fft 23: mflops = 21.6449 (norm. = 0.784404), norm. avg. (of 17) = 0.307295 fft 24: mflops = 11.0765 (norm. 
= 0.401408), norm. avg. (of 18) = 0.351909 fft 25: mflops = 9.32528 (norm. = 0.337945), norm. avg. (of 18) = 0.224815 fft 26: mflops = 2.77238 (norm. = 0.10047), norm. avg. (of 18) = 0.0606419 fft 27: mflops = 18.7246 (norm. = 0.678571), norm. avg. (of 18) = 0.27784 Benchmarking for array size = 524288 (power of 2): 0. Arndt DIF: elapsed time t=6.11 s, 1 iters, t-(init.)=5.99 s t(norm)=0.601317, mflops=8.31509 (err=1.1e-13) 1. Arndt DIT: elapsed time t=6.02 s, 1 iters, t-(init.)=5.9 s t(norm)=0.592282, mflops=8.44193 (err=1.1e-13) 2. Arndt Split-Radix: elapsed time t=8 s, 1 iters, t-(init.)=7.88 s t(norm)=0.791048, mflops=6.32073 (err=1.1e-13) 3. Arndt 4-step: elapsed time t=4.41 s, 1 iters, t-(init.)=4.28 s t(norm)=0.429655, mflops=11.6372 (err=1.1e-13) 4. Beauregard: elapsed time t=7.96 s, 1 iters, t-(init.)=7.83 s t(norm)=0.786028, mflops=6.36109 (err=1.1e-13) 5. Bergland: elapsed time t=3.9 s, 1 iters, t-(init.)=3.77 s t(norm)=0.378458, mflops=13.2115 (err=1.1e-13) 6. CWP (min N) (N=720720): elapsed time t=1.92 s, 1 iters, t-(init.)=1.75 s t(norm)=0.175677, mflops=28.4613 7. CWP (best N) (N=720720): elapsed time t=1.92 s, 1 iters, t-(init.)=1.75 s t(norm)=0.175677, mflops=28.4613 8. Edelblute: elapsed time t=7.98 s, 1 iters, t-(init.)=7.84 s t(norm)=0.787032, mflops=6.35298 (err=1.1e-13) 9. FFTPACK (f2c): elapsed time t=4.91 s, 1 iters, t-(init.)=4.78 s t(norm)=0.479849, mflops=10.4199 (err=1.1e-13) FFTW_MEASURE plan: (cost = 2.180000e+00) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 16 10. FFTW: elapsed time t=2.12 s, 1 iters, t-(init.)=2 s t(norm)=0.200774, mflops=24.9037 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 5.976883e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=2.23 s, 1 iters, t-(init.)=2.11 s t(norm)=0.211816, mflops=23.6054 (err=1.1e-13) 12. Frigo-old: elapsed time t=4.3 s, 1 iters, t-(init.)=4.18 s t(norm)=0.419617, mflops=11.9156 (err=1.1e-13) 13. 
Green: elapsed time t=3.34 s, 1 iters, t-(init.)=3.21 s t(norm)=0.322242, mflops=15.5163 (err=1.1e-13) 14. GSL: elapsed time t=4.18 s, 1 iters, t-(init.)=4.06 s t(norm)=0.40757, mflops=12.2678 (err=1.1e-13) 15. GSL DIT: elapsed time t=8.03 s, 1 iters, t-(init.)=7.91 s t(norm)=0.794059, mflops=6.29676 (err=1.1e-13) 16. GSL DIF: elapsed time t=7.94 s, 1 iters, t-(init.)=7.81 s t(norm)=0.784021, mflops=6.37738 (err=1.1e-13) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=5.77 s, 1 iters, t-(init.)=5.65 s t(norm)=0.567185, mflops=8.81546 (err=1.1e-13) 19. Mayer (simple): elapsed time t=5.61 s, 1 iters, t-(init.)=5.48 s t(norm)=0.55012, mflops=9.08893 20. Mayer (lookup): elapsed time t=5.93 s, 1 iters, t-(init.)=5.81 s t(norm)=0.583247, mflops=8.5727 (err=1.1e-13) 21. NAPACK (f2c): elapsed time t=9.64 s, 1 iters, t-(init.)=9.52 s t(norm)=0.955682, mflops=5.23187 (err=7.9e-12) 22. Ooura (C): elapsed time t=2.7 s, 1 iters, t-(init.)=2.57 s t(norm)=0.257994, mflops=19.3803 (err=1.1e-13) 23. Ransom: elapsed time t=2.8 s, 1 iters, t-(init.)=2.68 s t(norm)=0.269037, mflops=18.5848 (err=1.1e-13) 24. Singleton (f2c): elapsed time t=5.39 s, 1 iters, t-(init.)=5.26 s t(norm)=0.528034, mflops=9.46908 (err=1.6e-13) 25. Temperton (f2c): elapsed time t=5.73 s, 1 iters, t-(init.)=5.61 s t(norm)=0.56317, mflops=8.87832 (err=1.1e-13) 26. Valkenburg: elapsed time t=18.92 s, 1 iters, t-(init.)=18.8 s t(norm)=1.88727, mflops=2.64933 (err=1.1e-13) 27. SGIMATH: elapsed time t=2.79 s, 1 iters, t-(init.)=2.67 s t(norm)=0.268033, mflops=18.6544 (err=1.1e-13) Top mflops for N=524288 = 28.4613 Normalized results and averages for N=524288: fft 0: mflops = 8.31509 (norm. = 0.292154), norm. avg. (of 19) = 0.386891 fft 1: mflops = 8.44193 (norm. = 0.29661), norm. avg. (of 19) = 0.385207 fft 2: mflops = 6.32073 (norm. = 0.222081), norm. avg. (of 19) = 0.259028 fft 3: mflops = 11.6372 (norm. = 0.408879), norm. avg. (of 19) = 0.216392 fft 4: mflops = 6.36109 (norm. 
= 0.223499), norm. avg. (of 19) = 0.102272 fft 5: mflops = 13.2115 (norm. = 0.464191), norm. avg. (of 19) = 0.31171 fft 6: mflops = 28.4613 (norm. = 1), norm. avg. (of 19) = 0.703482 fft 7: mflops = 28.4613 (norm. = 1), norm. avg. (of 19) = 0.718259 fft 8: mflops = 6.35298 (norm. = 0.223214), norm. avg. (of 18) = 0.234706 fft 9: mflops = 10.4199 (norm. = 0.366109), norm. avg. (of 19) = 0.324942 fft 10: mflops = 24.9037 (norm. = 0.875), norm. avg. (of 19) = 0.772882 fft 11: mflops = 23.6054 (norm. = 0.829384), norm. avg. (of 19) = 0.723936 fft 12: mflops = 11.9156 (norm. = 0.41866), norm. avg. (of 19) = 0.709597 fft 13: mflops = 15.5163 (norm. = 0.545171), norm. avg. (of 17) = 0.607028 fft 14: mflops = 12.2678 (norm. = 0.431034), norm. avg. (of 19) = 0.377167 fft 15: mflops = 6.29676 (norm. = 0.221239), norm. avg. (of 19) = 0.203348 fft 16: mflops = 6.37738 (norm. = 0.224072), norm. avg. (of 19) = 0.216502 fft 17: mflops = -1 (norm. = -0.0351354), norm. avg. (of 12) = 0.50959 fft 18: mflops = 8.81546 (norm. = 0.309735), norm. avg. (of 18) = 0.32521 fft 19: mflops = 9.08893 (norm. = 0.319343), norm. avg. (of 18) = 0.387186 fft 20: mflops = 8.5727 (norm. = 0.301205), norm. avg. (of 18) = 0.360471 fft 21: mflops = 5.23187 (norm. = 0.183824), norm. avg. (of 19) = 0.129137 fft 22: mflops = 19.3803 (norm. = 0.680934), norm. avg. (of 19) = 0.604767 fft 23: mflops = 18.5848 (norm. = 0.652985), norm. avg. (of 18) = 0.3265 fft 24: mflops = 9.46908 (norm. = 0.3327), norm. avg. (of 19) = 0.350898 fft 25: mflops = 8.87832 (norm. = 0.311943), norm. avg. (of 19) = 0.229401 fft 26: mflops = 2.64933 (norm. = 0.0930851), norm. avg. (of 19) = 0.0623494 fft 27: mflops = 18.6544 (norm. = 0.655431), norm. avg. (of 19) = 0.297714 Benchmarking for array size = 1048576 (power of 2): 0. Arndt DIF: elapsed time t=13.89 s, 1 iters, t-(init.)=13.64 s t(norm)=0.650406, mflops=7.68751 (err=1.9e-13) 1. 
Arndt DIT: elapsed time t=13.54 s, 1 iters, t-(init.)=13.29 s t(norm)=0.633717, mflops=7.88996 (err=1.9e-13) 2. Arndt Split-Radix: elapsed time t=17.13 s, 1 iters, t-(init.)=16.88 s t(norm)=0.804901, mflops=6.21194 (err=1.9e-13) 3. Arndt 4-step: elapsed time t=7.29 s, 1 iters, t-(init.)=7.04 s t(norm)=0.335693, mflops=14.8945 (err=1.9e-13) 4. Beauregard: elapsed time t=16.67 s, 1 iters, t-(init.)=16.42 s t(norm)=0.782967, mflops=6.38597 (err=1.9e-13) 5. Bergland: elapsed time t=8.2 s, 1 iters, t-(init.)=7.95 s t(norm)=0.379086, mflops=13.1896 (err=1.9e-13) 6. Skipping fft (this transform size is too big for CWP). 7. Skipping fft (this transform size is too big for CWP). 8. Edelblute: elapsed time t=17.16 s, 1 iters, t-(init.)=16.92 s t(norm)=0.806808, mflops=6.19726 (err=1.9e-13) 9. FFTPACK (f2c): elapsed time t=9.78 s, 1 iters, t-(init.)=9.52 s t(norm)=0.453949, mflops=11.0145 (err=1.9e-13) FFTW_MEASURE plan: (cost = 4.980000e+00) FFTW_TWIDDLE 32 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_NOTW 16 10. FFTW: elapsed time t=4.89 s, 1 iters, t-(init.)=4.6 s t(norm)=0.219345, mflops=22.7951 (err=1.9e-13) FFTW_ESTIMATE plan: (cost = 1.195377e+07) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=5.21 s, 1 iters, t-(init.)=4.92 s t(norm)=0.234604, mflops=21.3125 (err=1.9e-13) 12. Frigo-old: elapsed time t=9.42 s, 1 iters, t-(init.)=9.18 s t(norm)=0.437737, mflops=11.4224 (err=1.9e-13) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=8.57 s, 1 iters, t-(init.)=8.33 s t(norm)=0.397205, mflops=12.5879 (err=1.9e-13) 15. GSL DIT: elapsed time t=17.13 s, 1 iters, t-(init.)=16.89 s t(norm)=0.805378, mflops=6.20827 (err=1.9e-13) 16. GSL DIF: elapsed time t=16.91 s, 1 iters, t-(init.)=16.65 s t(norm)=0.793934, mflops=6.29775 (err=1.9e-13) 17. Skipping fft (Krukar can't handle N > 4096). 18. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 19. 
Mayer (simple): elapsed time t=11.99 s, 1 iters, t-(init.)=11.75 s t(norm)=0.560284, mflops=8.92405 20. Mayer (lookup): elapsed time t=12.64 s, 1 iters, t-(init.)=12.39 s t(norm)=0.590801, mflops=8.46308 (err=1.9e-13) 21. NAPACK (f2c): elapsed time t=19.78 s, 1 iters, t-(init.)=19.54 s t(norm)=0.93174, mflops=5.36631 (err=1.5e-11) 22. Ooura (C): elapsed time t=5.76 s, 1 iters, t-(init.)=5.52 s t(norm)=0.263214, mflops=18.9959 (err=1.9e-13) 23. Ransom: elapsed time t=4.87 s, 1 iters, t-(init.)=4.62 s t(norm)=0.220299, mflops=22.6965 (err=1.9e-13) 24. Singleton (f2c): elapsed time t=9.99 s, 1 iters, t-(init.)=9.74 s t(norm)=0.464439, mflops=10.7657 (err=2.6e-13) 25. Temperton (f2c): elapsed time t=11.81 s, 1 iters, t-(init.)=11.55 s t(norm)=0.550747, mflops=9.07858 (err=1.9e-13) 26. Valkenburg: elapsed time t=40.34 s, 1 iters, t-(init.)=40.09 s t(norm)=1.91164, mflops=2.61556 (err=1.9e-13) 27. SGIMATH: elapsed time t=5.51 s, 1 iters, t-(init.)=5.27 s t(norm)=0.251293, mflops=19.8971 (err=1.9e-13) Top mflops for N=1048576 = 22.7951 Normalized results and averages for N=1048576: fft 0: mflops = 7.68751 (norm. = 0.337243), norm. avg. (of 20) = 0.384409 fft 1: mflops = 7.88996 (norm. = 0.346125), norm. avg. (of 20) = 0.383253 fft 2: mflops = 6.21194 (norm. = 0.272512), norm. avg. (of 20) = 0.259702 fft 3: mflops = 14.8945 (norm. = 0.653409), norm. avg. (of 20) = 0.238243 fft 4: mflops = 6.38597 (norm. = 0.280146), norm. avg. (of 20) = 0.111165 fft 5: mflops = 13.1896 (norm. = 0.578616), norm. avg. (of 20) = 0.325056 fft 6: mflops = -1 (norm. = -0.043869), norm. avg. (of 19) = 0.703482 fft 7: mflops = -1 (norm. = -0.043869), norm. avg. (of 19) = 0.718259 fft 8: mflops = 6.19726 (norm. = 0.271868), norm. avg. (of 19) = 0.236662 fft 9: mflops = 11.0145 (norm. = 0.483193), norm. avg. (of 20) = 0.332855 fft 10: mflops = 22.7951 (norm. = 1), norm. avg. (of 20) = 0.784237 fft 11: mflops = 21.3125 (norm. = 0.934959), norm. avg. (of 20) = 0.734487 fft 12: mflops = 11.4224 (norm. 
= 0.501089), norm. avg. (of 20) = 0.699172 fft 13: mflops = -1 (norm. = -0.043869), norm. avg. (of 17) = 0.607028 fft 14: mflops = 12.5879 (norm. = 0.552221), norm. avg. (of 20) = 0.38592 fft 15: mflops = 6.20827 (norm. = 0.272351), norm. avg. (of 20) = 0.206798 fft 16: mflops = 6.29775 (norm. = 0.276276), norm. avg. (of 20) = 0.21949 fft 17: mflops = -1 (norm. = -0.043869), norm. avg. (of 12) = 0.50959 fft 18: mflops = -1 (norm. = -0.043869), norm. avg. (of 18) = 0.32521 fft 19: mflops = 8.92405 (norm. = 0.391489), norm. avg. (of 19) = 0.387413 fft 20: mflops = 8.46308 (norm. = 0.371267), norm. avg. (of 19) = 0.361039 fft 21: mflops = 5.36631 (norm. = 0.235415), norm. avg. (of 20) = 0.134451 fft 22: mflops = 18.9959 (norm. = 0.833333), norm. avg. (of 20) = 0.616196 fft 23: mflops = 22.6965 (norm. = 0.995671), norm. avg. (of 19) = 0.36172 fft 24: mflops = 10.7657 (norm. = 0.472279), norm. avg. (of 20) = 0.356967 fft 25: mflops = 9.07858 (norm. = 0.398268), norm. avg. (of 20) = 0.237844 fft 26: mflops = 2.61556 (norm. = 0.114742), norm. avg. (of 20) = 0.064969 fft 27: mflops = 19.8971 (norm. = 0.872865), norm. avg. (of 20) = 0.326471 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. CWP (min N) 1. CWP (best N) 2. FFTPACK (f2c) 3. FFTW 4. FFTW_ESTIMATE 5. Frigo-old 6. GSL 7. NAPACK (f2c) 8. Singleton (f2c) 9. Temperton (f2c) 10. Valkenburg 11. SGIMATH Computing normalized averages (12 transforms). Benchmarking for array size = 6: 0. 
CWP (min N): elapsed time t=1.64 s, 262144 iters, t-(init.)=1.56 s t(norm)=0.383689, mflops=13.0314 1. CWP (best N) (N=15): elapsed time t=1.16 s, 131072 iters, t-(init.)=1.1 s t(norm)=0.5411, mflops=9.24044 2. FFTPACK (f2c): elapsed time t=1.23 s, 262144 iters, t-(init.)=1.14 s t(norm)=0.280388, mflops=17.8324 (err=1.8e-16) FFTW_MEASURE plan: (cost = 1.220703e-06) FFTW_NOTW 6 3. FFTW: elapsed time t=1.49 s, 1048576 iters, t-(init.)=1.15 s t(norm)=0.0707119, mflops=70.7095 (err=1.1e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 4. FFTW_ESTIMATE: elapsed time t=1.49 s, 1048576 iters, t-(init.)=1.16 s t(norm)=0.0713268, mflops=70.0999 (err=1.1e-16) 5. Frigo-old: elapsed time t=1.05 s, 131072 iters, t-(init.)=1 s t(norm)=0.491909, mflops=10.1645 (err=3.3e-16) 6. GSL: elapsed time t=1.96 s, 524288 iters, t-(init.)=1.8 s t(norm)=0.221359, mflops=22.5877 (err=1.2e-16) 7. NAPACK (f2c): elapsed time t=1.33 s, 131072 iters, t-(init.)=1.29 s t(norm)=0.634562, mflops=7.87945 (err=4.7e-16) 8. Singleton (f2c): elapsed time t=1.25 s, 131072 iters, t-(init.)=1.2 s t(norm)=0.590291, mflops=8.47041 (err=1.0e-16) 9. Temperton (f2c): elapsed time t=1.44 s, 131072 iters, t-(init.)=1.39 s t(norm)=0.683753, mflops=7.31258 (err=1.0e-16) 10. Valkenburg: elapsed time t=1.12 s, 65536 iters, t-(init.)=1.1 s t(norm)=1.0822, mflops=4.62022 (err=3.0e-16) 11. SGIMATH: elapsed time t=1.45 s, 65536 iters, t-(init.)=1.42 s t(norm)=1.39702, mflops=3.57904 (err=1.8e-16) Top mflops for N=6 = 70.7095 Normalized results and averages for N=6: fft 0: mflops = 13.0314 (norm. = 0.184295), norm. avg. (of 1) = 0.184295 fft 1: mflops = 9.24044 (norm. = 0.130682), norm. avg. (of 1) = 0.130682 fft 2: mflops = 17.8324 (norm. = 0.252193), norm. avg. (of 1) = 0.252193 fft 3: mflops = 70.7095 (norm. = 1), norm. avg. (of 1) = 1 fft 4: mflops = 70.0999 (norm. = 0.991379), norm. avg. (of 1) = 0.991379 fft 5: mflops = 10.1645 (norm. = 0.14375), norm. avg. (of 1) = 0.14375 fft 6: mflops = 22.5877 (norm. 
= 0.319444), norm. avg. (of 1) = 0.319444 fft 7: mflops = 7.87945 (norm. = 0.111434), norm. avg. (of 1) = 0.111434 fft 8: mflops = 8.47041 (norm. = 0.119792), norm. avg. (of 1) = 0.119792 fft 9: mflops = 7.31258 (norm. = 0.103417), norm. avg. (of 1) = 0.103417 fft 10: mflops = 4.62022 (norm. = 0.0653409), norm. avg. (of 1) = 0.0653409 fft 11: mflops = 3.57904 (norm. = 0.0506162), norm. avg. (of 1) = 0.0506162 Benchmarking for array size = 9: 0. CWP (min N): elapsed time t=1.6 s, 262144 iters, t-(init.)=1.51 s t(norm)=0.201904, mflops=24.7642 1. CWP (best N) (N=15): elapsed time t=1.16 s, 131072 iters, t-(init.)=1.09 s t(norm)=0.291491, mflops=17.1532 2. FFTPACK (f2c): elapsed time t=1.61 s, 262144 iters, t-(init.)=1.51 s t(norm)=0.201904, mflops=24.7642 (err=2.5e-16) FFTW_MEASURE plan: (cost = 2.059937e-06) FFTW_NOTW 9 3. FFTW: elapsed time t=1.18 s, 524288 iters, t-(init.)=0.96 s t(norm)=0.0641815, mflops=77.9041 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.18 s, 524288 iters, t-(init.)=0.99 s t(norm)=0.0661872, mflops=75.5433 (err=1.4e-16) 5. Frigo-old: elapsed time t=1.15 s, 65536 iters, t-(init.)=1.12 s t(norm)=0.599027, mflops=8.34687 (err=3.8e-16) 6. GSL: elapsed time t=1.62 s, 262144 iters, t-(init.)=1.53 s t(norm)=0.204579, mflops=24.4405 (err=1.5e-16) 7. NAPACK (f2c): elapsed time t=1.71 s, 131072 iters, t-(init.)=1.67 s t(norm)=0.446596, mflops=11.1958 (err=6.2e-16) 8. Singleton (f2c): elapsed time t=1.36 s, 131072 iters, t-(init.)=1.31 s t(norm)=0.350324, mflops=14.2725 (err=1.5e-16) 9. Temperton (f2c): elapsed time t=1.56 s, 131072 iters, t-(init.)=1.51 s t(norm)=0.403809, mflops=12.3821 (err=1.5e-16) 10. Valkenburg: elapsed time t=1 s, 32768 iters, t-(init.)=0.98 s t(norm)=1.0483, mflops=4.76964 (err=3.1e-16) 11. 
SGIMATH: elapsed time t=1.24 s, 32768 iters, t-(init.)=1.23 s t(norm)=1.31572, mflops=3.8002 (err=2.5e-16) Top mflops for N=9 = 77.9041 Normalized results and averages for N=9: fft 0: mflops = 24.7642 (norm. = 0.317881), norm. avg. (of 2) = 0.251088 fft 1: mflops = 17.1532 (norm. = 0.220183), norm. avg. (of 2) = 0.175433 fft 2: mflops = 24.7642 (norm. = 0.317881), norm. avg. (of 2) = 0.285037 fft 3: mflops = 77.9041 (norm. = 1), norm. avg. (of 2) = 1 fft 4: mflops = 75.5433 (norm. = 0.969697), norm. avg. (of 2) = 0.980538 fft 5: mflops = 8.34687 (norm. = 0.107143), norm. avg. (of 2) = 0.125446 fft 6: mflops = 24.4405 (norm. = 0.313725), norm. avg. (of 2) = 0.316585 fft 7: mflops = 11.1958 (norm. = 0.143713), norm. avg. (of 2) = 0.127573 fft 8: mflops = 14.2725 (norm. = 0.183206), norm. avg. (of 2) = 0.151499 fft 9: mflops = 12.3821 (norm. = 0.15894), norm. avg. (of 2) = 0.131179 fft 10: mflops = 4.76964 (norm. = 0.0612245), norm. avg. (of 2) = 0.0632827 fft 11: mflops = 3.8002 (norm. = 0.0487805), norm. avg. (of 2) = 0.0496983 Benchmarking for array size = 12: 0. CWP (min N): elapsed time t=1 s, 131072 iters, t-(init.)=0.95 s t(norm)=0.16848, mflops=29.6771 1. CWP (best N) (N=15): elapsed time t=1.15 s, 131072 iters, t-(init.)=1.08 s t(norm)=0.191535, mflops=26.1049 2. FFTPACK (f2c): elapsed time t=1.85 s, 262144 iters, t-(init.)=1.76 s t(norm)=0.156065, mflops=32.0378 (err=2.2e-16) FFTW_MEASURE plan: (cost = 2.059937e-06) FFTW_NOTW 12 3. FFTW: elapsed time t=1.2 s, 524288 iters, t-(init.)=1 s t(norm)=0.0443368, mflops=112.773 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.2 s, 524288 iters, t-(init.)=1.02 s t(norm)=0.0452235, mflops=110.562 (err=1.2e-16) 5. Frigo-old: elapsed time t=1.92 s, 131072 iters, t-(init.)=1.88 s t(norm)=0.333413, mflops=14.9964 (err=2.9e-16) 6. GSL: elapsed time t=1.62 s, 262144 iters, t-(init.)=1.52 s t(norm)=0.134784, mflops=37.0964 (err=1.6e-16) 7. 
NAPACK (f2c): elapsed time t=1.37 s, 65536 iters, t-(init.)=1.35 s t(norm)=0.478837, mflops=10.442 (err=5.5e-16) 8. Singleton (f2c): elapsed time t=1.88 s, 131072 iters, t-(init.)=1.83 s t(norm)=0.324545, mflops=15.4062 (err=1.5e-16) 9. Temperton (f2c): elapsed time t=1.78 s, 131072 iters, t-(init.)=1.73 s t(norm)=0.306811, mflops=16.2967 (err=1.4e-16) 10. Valkenburg: elapsed time t=1.45 s, 32768 iters, t-(init.)=1.43 s t(norm)=1.01443, mflops=4.9289 (err=2.6e-16) 11. SGIMATH: elapsed time t=1.1 s, 32768 iters, t-(init.)=1.08 s t(norm)=0.76614, mflops=6.52623 (err=2.2e-16) Top mflops for N=12 = 112.773 Normalized results and averages for N=12: fft 0: mflops = 29.6771 (norm. = 0.263158), norm. avg. (of 3) = 0.255111 fft 1: mflops = 26.1049 (norm. = 0.231481), norm. avg. (of 3) = 0.194116 fft 2: mflops = 32.0378 (norm. = 0.284091), norm. avg. (of 3) = 0.284722 fft 3: mflops = 112.773 (norm. = 1), norm. avg. (of 3) = 1 fft 4: mflops = 110.562 (norm. = 0.980392), norm. avg. (of 3) = 0.980489 fft 5: mflops = 14.9964 (norm. = 0.132979), norm. avg. (of 3) = 0.127957 fft 6: mflops = 37.0964 (norm. = 0.328947), norm. avg. (of 3) = 0.320706 fft 7: mflops = 10.442 (norm. = 0.0925926), norm. avg. (of 3) = 0.115913 fft 8: mflops = 15.4062 (norm. = 0.136612), norm. avg. (of 3) = 0.146537 fft 9: mflops = 16.2967 (norm. = 0.144509), norm. avg. (of 3) = 0.135622 fft 10: mflops = 4.9289 (norm. = 0.0437063), norm. avg. (of 3) = 0.0567572 fft 11: mflops = 6.52623 (norm. = 0.0578704), norm. avg. (of 3) = 0.0524224 Benchmarking for array size = 15: 0. CWP (min N): elapsed time t=1.16 s, 131072 iters, t-(init.)=1.1 s t(norm)=0.143206, mflops=34.9148 1. CWP (best N): elapsed time t=1.16 s, 131072 iters, t-(init.)=1.1 s t(norm)=0.143206, mflops=34.9148 2. FFTPACK (f2c): elapsed time t=1.13 s, 131072 iters, t-(init.)=1.05 s t(norm)=0.136696, mflops=36.5774 (err=4.5e-16) FFTW_MEASURE plan: (cost = 3.967285e-06) FFTW_NOTW 15 3. 
FFTW: elapsed time t=1.11 s, 262144 iters, t-(init.)=0.98 s t(norm)=0.0637916, mflops=78.3802 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.11 s, 262144 iters, t-(init.)=0.98 s t(norm)=0.0637916, mflops=78.3802 (err=1.9e-16) 5. Frigo-old: elapsed time t=1.85 s, 65536 iters, t-(init.)=1.81 s t(norm)=0.471277, mflops=10.6095 (err=3.9e-16) 6. GSL: elapsed time t=1.13 s, 131072 iters, t-(init.)=1.06 s t(norm)=0.137998, mflops=36.2324 (err=2.3e-16) 7. NAPACK (f2c): elapsed time t=1.25 s, 32768 iters, t-(init.)=1.23 s t(norm)=0.64052, mflops=7.80616 (err=1.1e-15) 8. Singleton (f2c): elapsed time t=1.18 s, 65536 iters, t-(init.)=1.14 s t(norm)=0.296826, mflops=16.8449 (err=2.9e-16) 9. Temperton (f2c): elapsed time t=1.05 s, 65536 iters, t-(init.)=1.01 s t(norm)=0.262978, mflops=19.013 (err=2.0e-16) 10. Valkenburg: elapsed time t=1.11 s, 16384 iters, t-(init.)=1.1 s t(norm)=1.14565, mflops=4.36435 (err=4.6e-16) 11. SGIMATH: elapsed time t=1.36 s, 32768 iters, t-(init.)=1.34 s t(norm)=0.697802, mflops=7.16535 (err=4.5e-16) Top mflops for N=15 = 78.3802 Normalized results and averages for N=15: fft 0: mflops = 34.9148 (norm. = 0.445455), norm. avg. (of 4) = 0.302697 fft 1: mflops = 34.9148 (norm. = 0.445455), norm. avg. (of 4) = 0.25695 fft 2: mflops = 36.5774 (norm. = 0.466667), norm. avg. (of 4) = 0.330208 fft 3: mflops = 78.3802 (norm. = 1), norm. avg. (of 4) = 1 fft 4: mflops = 78.3802 (norm. = 1), norm. avg. (of 4) = 0.985367 fft 5: mflops = 10.6095 (norm. = 0.135359), norm. avg. (of 4) = 0.129808 fft 6: mflops = 36.2324 (norm. = 0.462264), norm. avg. (of 4) = 0.356095 fft 7: mflops = 7.80616 (norm. = 0.0995935), norm. avg. (of 4) = 0.111833 fft 8: mflops = 16.8449 (norm. = 0.214912), norm. avg. (of 4) = 0.163631 fft 9: mflops = 19.013 (norm. = 0.242574), norm. avg. (of 4) = 0.16236 fft 10: mflops = 4.36435 (norm. = 0.0556818), norm. avg. (of 4) = 0.0564884 fft 11: mflops = 7.16535 (norm. = 0.0914179), norm. avg. 
(of 4) = 0.0621712 Benchmarking for array size = 18: 0. CWP (min N): elapsed time t=1.4 s, 131072 iters, t-(init.)=1.32 s t(norm)=0.134172, mflops=37.2655 1. CWP (best N) (N=28): elapsed time t=1.61 s, 131072 iters, t-(init.)=1.53 s t(norm)=0.155518, mflops=32.1506 2. FFTPACK (f2c): elapsed time t=1.81 s, 131072 iters, t-(init.)=1.74 s t(norm)=0.176864, mflops=28.2704 (err=2.9e-16) FFTW_MEASURE plan: (cost = 6.103516e-06) FFTW_TWIDDLE 2 FFTW_NOTW 9 3. FFTW: elapsed time t=1.69 s, 262144 iters, t-(init.)=1.54 s t(norm)=0.0782672, mflops=63.8837 (err=2.2e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.71 s, 262144 iters, t-(init.)=1.57 s t(norm)=0.0797919, mflops=62.663 (err=2.2e-16) 5. Frigo-old: elapsed time t=1.26 s, 32768 iters, t-(init.)=1.24 s t(norm)=0.504163, mflops=9.91743 (err=4.6e-16) 6. GSL: elapsed time t=1.23 s, 131072 iters, t-(init.)=1.16 s t(norm)=0.117909, mflops=42.4055 (err=2.3e-16) 7. NAPACK (f2c): elapsed time t=1.9 s, 65536 iters, t-(init.)=1.87 s t(norm)=0.380155, mflops=13.1525 (err=7.8e-16) 8. Singleton (f2c): elapsed time t=1.25 s, 65536 iters, t-(init.)=1.22 s t(norm)=0.248016, mflops=20.16 (err=2.1e-16) 9. Temperton (f2c): elapsed time t=1.58 s, 65536 iters, t-(init.)=1.54 s t(norm)=0.313069, mflops=15.9709 (err=2.9e-16) 10. Valkenburg: elapsed time t=1.25 s, 16384 iters, t-(init.)=1.24 s t(norm)=1.00833, mflops=4.95871 (err=4.8e-16) 11. SGIMATH: elapsed time t=1.69 s, 32768 iters, t-(init.)=1.67 s t(norm)=0.678994, mflops=7.36384 (err=2.7e-16) Top mflops for N=18 = 63.8837 Normalized results and averages for N=18: fft 0: mflops = 37.2655 (norm. = 0.583333), norm. avg. (of 5) = 0.358824 fft 1: mflops = 32.1506 (norm. = 0.503268), norm. avg. (of 5) = 0.306214 fft 2: mflops = 28.2704 (norm. = 0.442529), norm. avg. (of 5) = 0.352672 fft 3: mflops = 63.8837 (norm. = 1), norm. avg. (of 5) = 1 fft 4: mflops = 62.663 (norm. = 0.980892), norm. avg. 
(of 5) = 0.984472 fft 5: mflops = 9.91743 (norm. = 0.155242), norm. avg. (of 5) = 0.134895 fft 6: mflops = 42.4055 (norm. = 0.663793), norm. avg. (of 5) = 0.417635 fft 7: mflops = 13.1525 (norm. = 0.205882), norm. avg. (of 5) = 0.130643 fft 8: mflops = 20.16 (norm. = 0.315574), norm. avg. (of 5) = 0.194019 fft 9: mflops = 15.9709 (norm. = 0.25), norm. avg. (of 5) = 0.179888 fft 10: mflops = 4.95871 (norm. = 0.077621), norm. avg. (of 5) = 0.0607149 fft 11: mflops = 7.36384 (norm. = 0.115269), norm. avg. (of 5) = 0.0727909 Benchmarking for array size = 24: 0. CWP (min N): elapsed time t=1.38 s, 131072 iters, t-(init.)=1.31 s t(norm)=0.0908269, mflops=55.0498 1. CWP (best N) (N=28): elapsed time t=1.63 s, 131072 iters, t-(init.)=1.55 s t(norm)=0.107467, mflops=46.526 2. FFTPACK (f2c): elapsed time t=1.09 s, 65536 iters, t-(init.)=1.06 s t(norm)=0.146987, mflops=34.0166 (err=2.7e-16) FFTW_MEASURE plan: (cost = 7.019043e-06) FFTW_TWIDDLE 2 FFTW_NOTW 12 3. FFTW: elapsed time t=1.86 s, 262144 iters, t-(init.)=1.71 s t(norm)=0.0592801, mflops=84.3453 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.86 s, 262144 iters, t-(init.)=1.71 s t(norm)=0.0592801, mflops=84.3453 (err=2.1e-16) 5. Frigo-old: elapsed time t=1.85 s, 65536 iters, t-(init.)=1.81 s t(norm)=0.250987, mflops=19.9213 (err=3.5e-16) 6. GSL: elapsed time t=1.43 s, 131072 iters, t-(init.)=1.36 s t(norm)=0.0942935, mflops=53.0259 (err=2.1e-16) 7. NAPACK (f2c): elapsed time t=1.37 s, 32768 iters, t-(init.)=1.35 s t(norm)=0.374401, mflops=13.3547 (err=8.0e-16) 8. Singleton (f2c): elapsed time t=1.92 s, 65536 iters, t-(init.)=1.88 s t(norm)=0.260694, mflops=19.1796 (err=2.3e-16) 9. Temperton (f2c): elapsed time t=1.64 s, 65536 iters, t-(init.)=1.6 s t(norm)=0.221867, mflops=22.536 (err=2.8e-16) 10. Valkenburg: elapsed time t=1.8 s, 16384 iters, t-(init.)=1.79 s t(norm)=0.992856, mflops=5.03598 (err=4.8e-16) 11. 
SGIMATH: elapsed time t=1.96 s, 32768 iters, t-(init.)=1.94 s t(norm)=0.538028, mflops=9.2932 (err=2.6e-16) Top mflops for N=24 = 84.3453 Normalized results and averages for N=24: fft 0: mflops = 55.0498 (norm. = 0.652672), norm. avg. (of 6) = 0.407799 fft 1: mflops = 46.526 (norm. = 0.551613), norm. avg. (of 6) = 0.347114 fft 2: mflops = 34.0166 (norm. = 0.403302), norm. avg. (of 6) = 0.36111 fft 3: mflops = 84.3453 (norm. = 1), norm. avg. (of 6) = 1 fft 4: mflops = 84.3453 (norm. = 1), norm. avg. (of 6) = 0.98706 fft 5: mflops = 19.9213 (norm. = 0.236188), norm. avg. (of 6) = 0.151777 fft 6: mflops = 53.0259 (norm. = 0.628676), norm. avg. (of 6) = 0.452809 fft 7: mflops = 13.3547 (norm. = 0.158333), norm. avg. (of 6) = 0.135258 fft 8: mflops = 19.1796 (norm. = 0.227394), norm. avg. (of 6) = 0.199582 fft 9: mflops = 22.536 (norm. = 0.267188), norm. avg. (of 6) = 0.194438 fft 10: mflops = 5.03598 (norm. = 0.0597067), norm. avg. (of 6) = 0.0605469 fft 11: mflops = 9.2932 (norm. = 0.11018), norm. avg. (of 6) = 0.0790225 Benchmarking for array size = 36: 0. CWP (min N): elapsed time t=1 s, 65536 iters, t-(init.)=0.95 s t(norm)=0.0778856, mflops=64.1968 1. CWP (best N): elapsed time t=1 s, 65536 iters, t-(init.)=0.95 s t(norm)=0.0778856, mflops=64.1968 2. FFTPACK (f2c): elapsed time t=1.59 s, 65536 iters, t-(init.)=1.55 s t(norm)=0.127076, mflops=39.3464 (err=4.8e-16) FFTW_MEASURE plan: (cost = 1.159668e-05) FFTW_TWIDDLE 6 FFTW_NOTW 6 3. FFTW: elapsed time t=1.58 s, 131072 iters, t-(init.)=1.49 s t(norm)=0.0610787, mflops=81.8616 (err=4.1e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.53 s, 131072 iters, t-(init.)=1.42 s t(norm)=0.0582092, mflops=85.8971 (err=4.0e-16) 5. Frigo-old: elapsed time t=1.22 s, 16384 iters, t-(init.)=1.2 s t(norm)=0.393527, mflops=12.7056 (err=5.7e-16) 6. GSL: elapsed time t=1.07 s, 65536 iters, t-(init.)=1.02 s t(norm)=0.0836245, mflops=59.7911 (err=4.6e-16) 7. 
NAPACK (f2c): elapsed time t=1.97 s, 32768 iters, t-(init.)=1.95 s t(norm)=0.319741, mflops=15.6377 (err=1.8e-15) 8. Singleton (f2c): elapsed time t=1.05 s, 32768 iters, t-(init.)=1.02 s t(norm)=0.167249, mflops=29.8955 (err=4.7e-16) 9. Temperton (f2c): elapsed time t=1.14 s, 32768 iters, t-(init.)=1.11 s t(norm)=0.182006, mflops=27.4716 (err=3.8e-16) 10. Valkenburg: elapsed time t=1.49 s, 8192 iters, t-(init.)=1.48 s t(norm)=0.9707, mflops=5.15092 (err=6.1e-16) 11. SGIMATH: elapsed time t=1.4 s, 16384 iters, t-(init.)=1.39 s t(norm)=0.455835, mflops=10.9689 (err=4.9e-16) Top mflops for N=36 = 85.8971 Normalized results and averages for N=36: fft 0: mflops = 64.1968 (norm. = 0.747368), norm. avg. (of 7) = 0.456309 fft 1: mflops = 64.1968 (norm. = 0.747368), norm. avg. (of 7) = 0.404293 fft 2: mflops = 39.3464 (norm. = 0.458065), norm. avg. (of 7) = 0.374961 fft 3: mflops = 81.8616 (norm. = 0.95302), norm. avg. (of 7) = 0.993289 fft 4: mflops = 85.8971 (norm. = 1), norm. avg. (of 7) = 0.988909 fft 5: mflops = 12.7056 (norm. = 0.147917), norm. avg. (of 7) = 0.151225 fft 6: mflops = 59.7911 (norm. = 0.696078), norm. avg. (of 7) = 0.487561 fft 7: mflops = 15.6377 (norm. = 0.182051), norm. avg. (of 7) = 0.141943 fft 8: mflops = 29.8955 (norm. = 0.348039), norm. avg. (of 7) = 0.22079 fft 9: mflops = 27.4716 (norm. = 0.31982), norm. avg. (of 7) = 0.21235 fft 10: mflops = 5.15092 (norm. = 0.0599662), norm. avg. (of 7) = 0.0604639 fft 11: mflops = 10.9689 (norm. = 0.127698), norm. avg. (of 7) = 0.0859761 Benchmarking for array size = 80: 0. CWP (min N): elapsed time t=1.98 s, 65536 iters, t-(init.)=1.89 s t(norm)=0.057022, mflops=87.6855 1. CWP (best N) (N=84): elapsed time t=1.05 s, 32768 iters, t-(init.)=1.01 s t(norm)=0.0609441, mflops=82.0424 2. FFTPACK (f2c): elapsed time t=1.55 s, 32768 iters, t-(init.)=1.51 s t(norm)=0.0911145, mflops=54.876 (err=4.2e-16) FFTW_MEASURE plan: (cost = 2.685547e-05) FFTW_TWIDDLE 10 FFTW_NOTW 8 3. 
FFTW: elapsed time t=1.73 s, 65536 iters, t-(init.)=1.64 s t(norm)=0.0494794, mflops=101.052 (err=3.5e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 4. FFTW_ESTIMATE: elapsed time t=1.85 s, 65536 iters, t-(init.)=1.76 s t(norm)=0.0530998, mflops=94.1622 (err=4.2e-16) 5. Frigo-old: elapsed time t=1.61 s, 16384 iters, t-(init.)=1.59 s t(norm)=0.191884, mflops=26.0575 (err=3.5e-16) 6. GSL: elapsed time t=1.42 s, 32768 iters, t-(init.)=1.37 s t(norm)=0.0826668, mflops=60.4838 (err=3.0e-16) 7. NAPACK (f2c): elapsed time t=1 s, 4096 iters, t-(init.)=0.99 s t(norm)=0.477899, mflops=10.4625 (err=5.2e-16) 8. Singleton (f2c): elapsed time t=1.89 s, 32768 iters, t-(init.)=1.85 s t(norm)=0.11163, mflops=44.7907 (err=4.3e-16) 9. Temperton (f2c): elapsed time t=1.16 s, 16384 iters, t-(init.)=1.14 s t(norm)=0.137577, mflops=36.3433 (err=3.6e-16) 10. Valkenburg: elapsed time t=1.05 s, 2048 iters, t-(init.)=1.04 s t(norm)=1.00407, mflops=4.97973 (err=4.3e-16) 11. SGIMATH: elapsed time t=1.24 s, 8192 iters, t-(init.)=1.23 s t(norm)=0.296876, mflops=16.842 (err=4.2e-16) Top mflops for N=80 = 101.052 Normalized results and averages for N=80: fft 0: mflops = 87.6855 (norm. = 0.867725), norm. avg. (of 8) = 0.507736 fft 1: mflops = 82.0424 (norm. = 0.811881), norm. avg. (of 8) = 0.455241 fft 2: mflops = 54.876 (norm. = 0.543046), norm. avg. (of 8) = 0.395972 fft 3: mflops = 101.052 (norm. = 1), norm. avg. (of 8) = 0.994128 fft 4: mflops = 94.1622 (norm. = 0.931818), norm. avg. (of 8) = 0.981772 fft 5: mflops = 26.0575 (norm. = 0.257862), norm. avg. (of 8) = 0.164555 fft 6: mflops = 60.4838 (norm. = 0.59854), norm. avg. (of 8) = 0.501434 fft 7: mflops = 10.4625 (norm. = 0.103535), norm. avg. (of 8) = 0.137142 fft 8: mflops = 44.7907 (norm. = 0.443243), norm. avg. (of 8) = 0.248596 fft 9: mflops = 36.3433 (norm. = 0.359649), norm. avg. (of 8) = 0.230762 fft 10: mflops = 4.97973 (norm. = 0.0492788), norm. avg. (of 8) = 0.0590658 fft 11: mflops = 16.842 (norm. 
= 0.166667), norm. avg. (of 8) = 0.0960624 Benchmarking for array size = 108: 0. CWP (min N) (N=110): elapsed time t=1.76 s, 32768 iters, t-(init.)=1.69 s t(norm)=0.070696, mflops=70.7254 1. CWP (best N) (N=112): elapsed time t=1.34 s, 32768 iters, t-(init.)=1.28 s t(norm)=0.0535449, mflops=93.3796 2. FFTPACK (f2c): elapsed time t=1.35 s, 16384 iters, t-(init.)=1.32 s t(norm)=0.110436, mflops=45.2749 (err=4.0e-16) FFTW_MEASURE plan: (cost = 3.784180e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 3. FFTW: elapsed time t=1.27 s, 32768 iters, t-(init.)=1.21 s t(norm)=0.0506167, mflops=98.7817 (err=3.9e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.27 s, 32768 iters, t-(init.)=1.21 s t(norm)=0.0506167, mflops=98.7817 (err=3.9e-16) 5. Frigo-old: elapsed time t=1.32 s, 4096 iters, t-(init.)=1.31 s t(norm)=0.438399, mflops=11.4051 (err=6.0e-16) 6. GSL: elapsed time t=1.98 s, 32768 iters, t-(init.)=1.92 s t(norm)=0.0803174, mflops=62.253 (err=3.9e-16) 7. NAPACK (f2c): elapsed time t=1.65 s, 8192 iters, t-(init.)=1.64 s t(norm)=0.274418, mflops=18.2204 (err=2.4e-15) 8. Singleton (f2c): elapsed time t=1.99 s, 16384 iters, t-(init.)=1.96 s t(norm)=0.163981, mflops=30.4913 (err=4.5e-16) 9. Temperton (f2c): elapsed time t=1.91 s, 16384 iters, t-(init.)=1.88 s t(norm)=0.157288, mflops=31.7888 (err=3.3e-16) 10. Valkenburg: elapsed time t=1.43 s, 2048 iters, t-(init.)=1.42 s t(norm)=0.950422, mflops=5.26082 (err=5.7e-16) 11. SGIMATH: elapsed time t=1.73 s, 8192 iters, t-(init.)=1.71 s t(norm)=0.286131, mflops=17.4745 (err=5.0e-16) Top mflops for N=108 = 98.7817 Normalized results and averages for N=108: fft 0: mflops = 70.7254 (norm. = 0.715976), norm. avg. (of 9) = 0.530874 fft 1: mflops = 93.3796 (norm. = 0.945312), norm. avg. (of 9) = 0.509694 fft 2: mflops = 45.2749 (norm. = 0.458333), norm. avg. (of 9) = 0.402901 fft 3: mflops = 98.7817 (norm. = 1), norm. avg. (of 9) = 0.99478 fft 4: mflops = 98.7817 (norm. = 1), norm. avg. 
(of 9) = 0.983798 fft 5: mflops = 11.4051 (norm. = 0.115458), norm. avg. (of 9) = 0.1591 fft 6: mflops = 62.253 (norm. = 0.630208), norm. avg. (of 9) = 0.515742 fft 7: mflops = 18.2204 (norm. = 0.184451), norm. avg. (of 9) = 0.142398 fft 8: mflops = 30.4913 (norm. = 0.308673), norm. avg. (of 9) = 0.255272 fft 9: mflops = 31.7888 (norm. = 0.321809), norm. avg. (of 9) = 0.240878 fft 10: mflops = 5.26082 (norm. = 0.053257), norm. avg. (of 9) = 0.0584204 fft 11: mflops = 17.4745 (norm. = 0.176901), norm. avg. (of 9) = 0.105044 Benchmarking for array size = 210: 0. CWP (min N): elapsed time t=1.56 s, 16384 iters, t-(init.)=1.51 s t(norm)=0.0568911, mflops=87.8872 1. CWP (best N): elapsed time t=1.57 s, 16384 iters, t-(init.)=1.51 s t(norm)=0.0568911, mflops=87.8872 2. FFTPACK (f2c): elapsed time t=1.06 s, 4096 iters, t-(init.)=1.05 s t(norm)=0.15824, mflops=31.5975 (err=6.0e-16) FFTW_MEASURE plan: (cost = 1.123047e-04) FFTW_TWIDDLE 3 FFTW_TWIDDLE 7 FFTW_NOTW 10 3. FFTW: elapsed time t=1.93 s, 16384 iters, t-(init.)=1.87 s t(norm)=0.0704545, mflops=70.9678 (err=4.6e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1 s, 8192 iters, t-(init.)=0.98 s t(norm)=0.0738454, mflops=67.709 (err=4.9e-16) 5. Frigo-old: elapsed time t=1.3 s, 2048 iters, t-(init.)=1.3 s t(norm)=0.391833, mflops=12.7605 (err=5.8e-16) 6. GSL: elapsed time t=1.09 s, 8192 iters, t-(init.)=1.06 s t(norm)=0.0798736, mflops=62.5989 (err=6.3e-16) 7. NAPACK (f2c): elapsed time t=1.92 s, 2048 iters, t-(init.)=1.91 s t(norm)=0.575693, mflops=8.68519 (err=1.5e-14) 8. Singleton (f2c): elapsed time t=1.32 s, 4096 iters, t-(init.)=1.31 s t(norm)=0.197423, mflops=25.3263 (err=6.3e-16) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1 s, 512 iters, t-(init.)=1 s t(norm)=1.20564, mflops=4.14718 (err=7.2e-16) 11. 
SGIMATH: elapsed time t=1.34 s, 2048 iters, t-(init.)=1.33 s t(norm)=0.400875, mflops=12.4727 (err=6.2e-16) Top mflops for N=210 = 87.8872 Normalized results and averages for N=210: fft 0: mflops = 87.8872 (norm. = 1), norm. avg. (of 10) = 0.577786 fft 1: mflops = 87.8872 (norm. = 1), norm. avg. (of 10) = 0.558724 fft 2: mflops = 31.5975 (norm. = 0.359524), norm. avg. (of 10) = 0.398563 fft 3: mflops = 70.9678 (norm. = 0.807487), norm. avg. (of 10) = 0.976051 fft 4: mflops = 67.709 (norm. = 0.770408), norm. avg. (of 10) = 0.962459 fft 5: mflops = 12.7605 (norm. = 0.145192), norm. avg. (of 10) = 0.157709 fft 6: mflops = 62.5989 (norm. = 0.712264), norm. avg. (of 10) = 0.535394 fft 7: mflops = 8.68519 (norm. = 0.098822), norm. avg. (of 10) = 0.138041 fft 8: mflops = 25.3263 (norm. = 0.288168), norm. avg. (of 10) = 0.258561 fft 9: mflops = -1 (norm. = -0.0113782), norm. avg. (of 9) = 0.240878 fft 10: mflops = 4.14718 (norm. = 0.0471875), norm. avg. (of 10) = 0.0572971 fft 11: mflops = 12.4727 (norm. = 0.141917), norm. avg. (of 10) = 0.108732 Benchmarking for array size = 504: 0. CWP (min N): elapsed time t=1.66 s, 8192 iters, t-(init.)=1.59 s t(norm)=0.0428975, mflops=116.557 1. CWP (best N): elapsed time t=1.67 s, 8192 iters, t-(init.)=1.61 s t(norm)=0.0434371, mflops=115.109 2. FFTPACK (f2c): elapsed time t=1.48 s, 2048 iters, t-(init.)=1.47 s t(norm)=0.15864, mflops=31.5179 (err=1.3e-15) FFTW_MEASURE plan: (cost = 2.539063e-04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 7 FFTW_NOTW 12 3. FFTW: elapsed time t=1.12 s, 4096 iters, t-(init.)=1.09 s t(norm)=0.0588154, mflops=85.0117 (err=1.2e-15) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.19 s, 4096 iters, t-(init.)=1.16 s t(norm)=0.0625926, mflops=79.8817 (err=1.2e-15) 5. Frigo-old: elapsed time t=1.57 s, 1024 iters, t-(init.)=1.56 s t(norm)=0.336705, mflops=14.8498 (err=1.3e-15) 6. 
GSL: elapsed time t=1.38 s, 4096 iters, t-(init.)=1.35 s t(norm)=0.0728448, mflops=68.6391 (err=1.3e-15) 7. NAPACK (f2c): elapsed time t=1.01 s, 512 iters, t-(init.)=1.01 s t(norm)=0.43599, mflops=11.4682 (err=4.1e-14) 8. Singleton (f2c): elapsed time t=1.57 s, 2048 iters, t-(init.)=1.56 s t(norm)=0.168352, mflops=29.6996 (err=1.9e-15) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1.28 s, 256 iters, t-(init.)=1.28 s t(norm)=1.10508, mflops=4.52455 (err=1.4e-15) 11. SGIMATH: elapsed time t=1.28 s, 1024 iters, t-(init.)=1.27 s t(norm)=0.274112, mflops=18.2407 (err=1.3e-15) Top mflops for N=504 = 116.557 Normalized results and averages for N=504: fft 0: mflops = 116.557 (norm. = 1), norm. avg. (of 11) = 0.616169 fft 1: mflops = 115.109 (norm. = 0.987578), norm. avg. (of 11) = 0.597711 fft 2: mflops = 31.5179 (norm. = 0.270408), norm. avg. (of 11) = 0.386913 fft 3: mflops = 85.0117 (norm. = 0.729358), norm. avg. (of 11) = 0.953624 fft 4: mflops = 79.8817 (norm. = 0.685345), norm. avg. (of 11) = 0.937266 fft 5: mflops = 14.8498 (norm. = 0.127404), norm. avg. (of 11) = 0.154954 fft 6: mflops = 68.6391 (norm. = 0.588889), norm. avg. (of 11) = 0.540257 fft 7: mflops = 11.4682 (norm. = 0.0983911), norm. avg. (of 11) = 0.134436 fft 8: mflops = 29.6996 (norm. = 0.254808), norm. avg. (of 11) = 0.25822 fft 9: mflops = -1 (norm. = -0.0085795), norm. avg. (of 9) = 0.240878 fft 10: mflops = 4.52455 (norm. = 0.0388184), norm. avg. (of 11) = 0.0556172 fft 11: mflops = 18.2407 (norm. = 0.156496), norm. avg. (of 11) = 0.113074 Benchmarking for array size = 1000: 0. CWP (min N) (N=1001): elapsed time t=1.27 s, 2048 iters, t-(init.)=1.23 s t(norm)=0.0602648, mflops=82.9672 1. CWP (best N) (N=1008): elapsed time t=1.89 s, 4096 iters, t-(init.)=1.83 s t(norm)=0.0448311, mflops=111.53 2. 
FFTPACK (f2c): elapsed time t=1.14 s, 1024 iters, t-(init.)=1.12 s t(norm)=0.109751, mflops=45.5579 (err=1.1e-15) FFTW_MEASURE plan: (cost = 7.617188e-04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 3. FFTW: elapsed time t=1.57 s, 2048 iters, t-(init.)=1.53 s t(norm)=0.0749635, mflops=66.6991 (err=1.0e-15) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 4. FFTW_ESTIMATE: elapsed time t=1.58 s, 2048 iters, t-(init.)=1.55 s t(norm)=0.0759434, mflops=65.8385 (err=1.0e-15) 5. Frigo-old: elapsed time t=1.66 s, 512 iters, t-(init.)=1.66 s t(norm)=0.325332, mflops=15.3689 (err=1.1e-15) 6. GSL: elapsed time t=1.93 s, 2048 iters, t-(init.)=1.9 s t(norm)=0.093092, mflops=53.7103 (err=1.0e-15) 7. NAPACK (f2c): elapsed time t=1.42 s, 256 iters, t-(init.)=1.41 s t(norm)=0.552672, mflops=9.04695 (err=1.7e-14) 8. Singleton (f2c): elapsed time t=1.24 s, 1024 iters, t-(init.)=1.23 s t(norm)=0.12053, mflops=41.4836 (err=1.5e-15) 9. Temperton (f2c): elapsed time t=1.23 s, 1024 iters, t-(init.)=1.21 s t(norm)=0.11857, mflops=42.1693 (err=1.0e-15) 10. Valkenburg: elapsed time t=1.46 s, 128 iters, t-(init.)=1.46 s t(norm)=1.14454, mflops=4.36856 (err=1.1e-15) 11. SGIMATH: elapsed time t=1.76 s, 1024 iters, t-(init.)=1.74 s t(norm)=0.170505, mflops=29.3246 (err=1.1e-15) Top mflops for N=1000 = 111.53 Normalized results and averages for N=1000: fft 0: mflops = 82.9672 (norm. = 0.743902), norm. avg. (of 12) = 0.626814 fft 1: mflops = 111.53 (norm. = 1), norm. avg. (of 12) = 0.631235 fft 2: mflops = 45.5579 (norm. = 0.408482), norm. avg. (of 12) = 0.38871 fft 3: mflops = 66.6991 (norm. = 0.598039), norm. avg. (of 12) = 0.923992 fft 4: mflops = 65.8385 (norm. = 0.590323), norm. avg. (of 12) = 0.908354 fft 5: mflops = 15.3689 (norm. = 0.137801), norm. avg. (of 12) = 0.153525 fft 6: mflops = 53.7103 (norm. = 0.481579), norm. avg. (of 12) = 0.535367 fft 7: mflops = 9.04695 (norm. = 0.081117), norm. avg. (of 12) = 0.129993 fft 8: mflops = 41.4836 (norm. 
= 0.371951), norm. avg. (of 12) = 0.267698 fft 9: mflops = 42.1693 (norm. = 0.378099), norm. avg. (of 10) = 0.2546 fft 10: mflops = 4.36856 (norm. = 0.0391695), norm. avg. (of 12) = 0.0542466 fft 11: mflops = 29.3246 (norm. = 0.262931), norm. avg. (of 12) = 0.125562 Benchmarking for array size = 1960: 0. CWP (min N) (N=1980): elapsed time t=1.19 s, 1024 iters, t-(init.)=1.16 s t(norm)=0.0528467, mflops=94.6132 1. CWP (best N) (N=1980): elapsed time t=1.19 s, 1024 iters, t-(init.)=1.15 s t(norm)=0.0523912, mflops=95.436 2. FFTPACK (f2c): elapsed time t=1.44 s, 256 iters, t-(init.)=1.43 s t(norm)=0.260589, mflops=19.1873 (err=2.8e-15) FFTW_MEASURE plan: (cost = 2.031250e-03) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_NOTW 8 3. FFTW: elapsed time t=1.09 s, 512 iters, t-(init.)=1.08 s t(norm)=0.0984043, mflops=50.8108 (err=2.8e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.01 s, 512 iters, t-(init.)=0.99 s t(norm)=0.0902039, mflops=55.43 (err=2.8e-15) 5. Frigo-old: elapsed time t=1.91 s, 256 iters, t-(init.)=1.91 s t(norm)=0.34806, mflops=14.3654 (err=2.8e-15) 6. GSL: elapsed time t=1.48 s, 512 iters, t-(init.)=1.47 s t(norm)=0.133939, mflops=37.3304 (err=2.8e-15) 7. NAPACK (f2c): elapsed time t=1.82 s, 128 iters, t-(init.)=1.81 s t(norm)=0.659673, mflops=7.57951 (err=1.3e-13) 8. Singleton (f2c): elapsed time t=1.8 s, 512 iters, t-(init.)=1.78 s t(norm)=0.162185, mflops=30.829 (err=4.3e-15) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1.83 s, 64 iters, t-(init.)=1.82 s t(norm)=1.32664, mflops=3.76893 (err=2.7e-15) 11. SGIMATH: elapsed time t=1.74 s, 256 iters, t-(init.)=1.73 s t(norm)=0.315258, mflops=15.86 (err=2.8e-15) Top mflops for N=1960 = 95.436 Normalized results and averages for N=1960: fft 0: mflops = 94.6132 (norm. = 0.991379), norm. avg. (of 13) = 0.654857 fft 1: mflops = 95.436 (norm. = 1), norm. avg. 
(of 13) = 0.659602 fft 2: mflops = 19.1873 (norm. = 0.201049), norm. avg. (of 13) = 0.374275 fft 3: mflops = 50.8108 (norm. = 0.532407), norm. avg. (of 13) = 0.89387 fft 4: mflops = 55.43 (norm. = 0.580808), norm. avg. (of 13) = 0.883159 fft 5: mflops = 14.3654 (norm. = 0.150524), norm. avg. (of 13) = 0.153294 fft 6: mflops = 37.3304 (norm. = 0.391156), norm. avg. (of 13) = 0.524274 fft 7: mflops = 7.57951 (norm. = 0.0794199), norm. avg. (of 13) = 0.126103 fft 8: mflops = 30.829 (norm. = 0.323034), norm. avg. (of 13) = 0.271954 fft 9: mflops = -1 (norm. = -0.0104782), norm. avg. (of 10) = 0.2546 fft 10: mflops = 3.76893 (norm. = 0.0394918), norm. avg. (of 13) = 0.0531116 fft 11: mflops = 15.86 (norm. = 0.166185), norm. avg. (of 13) = 0.128687 Benchmarking for array size = 4725: 0. CWP (min N) (N=5005): elapsed time t=1.32 s, 256 iters, t-(init.)=1.21 s t(norm)=0.0819534, mflops=61.0103 1. CWP (best N) (N=5040): elapsed time t=1.18 s, 256 iters, t-(init.)=1.06 s t(norm)=0.0717938, mflops=69.6439 2. FFTPACK (f2c): elapsed time t=1.86 s, 128 iters, t-(init.)=1.8 s t(norm)=0.243828, mflops=20.5062 (err=1.9e-15) FFTW_MEASURE plan: (cost = 6.406250e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 3. FFTW: elapsed time t=1.63 s, 256 iters, t-(init.)=1.52 s t(norm)=0.10295, mflops=48.5674 (err=1.9e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.73 s, 256 iters, t-(init.)=1.62 s t(norm)=0.109723, mflops=45.5694 (err=1.8e-15) 5. Frigo-old: elapsed time t=1.1 s, 32 iters, t-(init.)=1.08 s t(norm)=0.585188, mflops=8.54427 (err=1.9e-15) 6. GSL: elapsed time t=1.12 s, 128 iters, t-(init.)=1.07 s t(norm)=0.144942, mflops=34.4965 (err=1.9e-15) 7. NAPACK (f2c): elapsed time t=1.15 s, 32 iters, t-(init.)=1.14 s t(norm)=0.617698, mflops=8.09457 (err=3.5e-13) 8. Singleton (f2c): elapsed time t=1.51 s, 128 iters, t-(init.)=1.46 s t(norm)=0.197772, mflops=25.2817 (err=2.4e-15) 9. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1.19 s, 16 iters, t-(init.)=1.18 s t(norm)=1.27874, mflops=3.91009 (err=1.8e-15) 11. SGIMATH: elapsed time t=1.67 s, 128 iters, t-(init.)=1.62 s t(norm)=0.219445, mflops=22.7847 (err=1.9e-15) Top mflops for N=4725 = 69.6439 Normalized results and averages for N=4725: fft 0: mflops = 61.0103 (norm. = 0.876033), norm. avg. (of 14) = 0.670656 fft 1: mflops = 69.6439 (norm. = 1), norm. avg. (of 14) = 0.683916 fft 2: mflops = 20.5062 (norm. = 0.294444), norm. avg. (of 14) = 0.368572 fft 3: mflops = 48.5674 (norm. = 0.697368), norm. avg. (of 14) = 0.879834 fft 4: mflops = 45.5694 (norm. = 0.654321), norm. avg. (of 14) = 0.866813 fft 5: mflops = 8.54427 (norm. = 0.122685), norm. avg. (of 14) = 0.151107 fft 6: mflops = 34.4965 (norm. = 0.495327), norm. avg. (of 14) = 0.522207 fft 7: mflops = 8.09457 (norm. = 0.116228), norm. avg. (of 14) = 0.125397 fft 8: mflops = 25.2817 (norm. = 0.363014), norm. avg. (of 14) = 0.278459 fft 9: mflops = -1 (norm. = -0.0143588), norm. avg. (of 10) = 0.2546 fft 10: mflops = 3.91009 (norm. = 0.0561441), norm. avg. (of 14) = 0.0533282 fft 11: mflops = 22.7847 (norm. = 0.32716), norm. avg. (of 14) = 0.142864 Benchmarking for array size = 10368: 0. CWP (min N) (N=10920): elapsed time t=1.61 s, 128 iters, t-(init.)=1.47 s t(norm)=0.083035, mflops=60.2155 1. CWP (best N) (N=11088): elapsed time t=1.51 s, 128 iters, t-(init.)=1.37 s t(norm)=0.0773864, mflops=64.6108 2. FFTPACK (f2c): elapsed time t=1.1 s, 32 iters, t-(init.)=1.07 s t(norm)=0.241762, mflops=20.6815 (err=3.0e-15) FFTW_MEASURE plan: (cost = 1.562500e-02) FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 16 FFTW_NOTW 8 3. FFTW: elapsed time t=1.93 s, 128 iters, t-(init.)=1.8 s t(norm)=0.101676, mflops=49.176 (err=3.0e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 4. 
FFTW_ESTIMATE: elapsed time t=1.02 s, 64 iters, t-(init.)=0.96 s t(norm)=0.108454, mflops=46.1025 (err=3.0e-15) 5. Frigo-old: elapsed time t=1.98 s, 32 iters, t-(init.)=1.95 s t(norm)=0.440594, mflops=11.3483 (err=3.1e-15) 6. GSL: elapsed time t=1.59 s, 64 iters, t-(init.)=1.52 s t(norm)=0.171719, mflops=29.1174 (err=3.0e-15) 7. NAPACK (f2c): elapsed time t=1.11 s, 16 iters, t-(init.)=1.09 s t(norm)=0.492562, mflops=10.151 (err=8.1e-14) 8. Singleton (f2c): elapsed time t=1.79 s, 64 iters, t-(init.)=1.72 s t(norm)=0.194313, mflops=25.7316 (err=4.4e-15) 9. Temperton (f2c): elapsed time t=1.02 s, 32 iters, t-(init.)=0.99 s t(norm)=0.223686, mflops=22.3527 (err=3.0e-15) 10. Valkenburg: elapsed time t=1.31 s, 8 iters, t-(init.)=1.3 s t(norm)=1.17492, mflops=4.25562 (err=3.0e-15) 11. SGIMATH: elapsed time t=1.6 s, 64 iters, t-(init.)=1.53 s t(norm)=0.172848, mflops=28.9271 (err=3.0e-15) Top mflops for N=10368 = 64.6108 Normalized results and averages for N=10368: fft 0: mflops = 60.2155 (norm. = 0.931973), norm. avg. (of 15) = 0.688077 fft 1: mflops = 64.6108 (norm. = 1), norm. avg. (of 15) = 0.704988 fft 2: mflops = 20.6815 (norm. = 0.320093), norm. avg. (of 15) = 0.36534 fft 3: mflops = 49.176 (norm. = 0.761111), norm. avg. (of 15) = 0.871919 fft 4: mflops = 46.1025 (norm. = 0.713542), norm. avg. (of 15) = 0.856595 fft 5: mflops = 11.3483 (norm. = 0.175641), norm. avg. (of 15) = 0.152743 fft 6: mflops = 29.1174 (norm. = 0.450658), norm. avg. (of 15) = 0.517437 fft 7: mflops = 10.151 (norm. = 0.15711), norm. avg. (of 15) = 0.127512 fft 8: mflops = 25.7316 (norm. = 0.398256), norm. avg. (of 15) = 0.286445 fft 9: mflops = 22.3527 (norm. = 0.34596), norm. avg. (of 11) = 0.262906 fft 10: mflops = 4.25562 (norm. = 0.0658654), norm. avg. (of 15) = 0.054164 fft 11: mflops = 28.9271 (norm. = 0.447712), norm. avg. (of 15) = 0.163187 Benchmarking for array size = 27000: 0. CWP (min N) (N=27720): elapsed time t=1.43 s, 32 iters, t-(init.)=1.28 s t(norm)=0.10064, mflops=49.6823 1. 
CWP (best N) (N=27720): elapsed time t=1.44 s, 32 iters, t-(init.)=1.3 s t(norm)=0.102212, mflops=48.9179 2. FFTPACK (f2c): elapsed time t=1.02 s, 8 iters, t-(init.)=0.98 s t(norm)=0.308209, mflops=16.2228 (err=5.5e-15) FFTW_MEASURE plan: (cost = 6.250000e-02) FFTW_TWIDDLE 10 FFTW_TWIDDLE 5 FFTW_TWIDDLE 10 FFTW_TWIDDLE 6 FFTW_NOTW 9 3. FFTW: elapsed time t=1.93 s, 32 iters, t-(init.)=1.78 s t(norm)=0.139952, mflops=35.7266 (err=5.6e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.91 s, 32 iters, t-(init.)=1.76 s t(norm)=0.138379, mflops=36.1326 (err=5.6e-15) 5. Frigo-old: elapsed time t=1.99 s, 8 iters, t-(init.)=1.95 s t(norm)=0.613272, mflops=8.15299 (err=5.7e-15) 6. GSL: elapsed time t=1.52 s, 16 iters, t-(init.)=1.45 s t(norm)=0.228011, mflops=21.9287 (err=5.5e-15) 7. NAPACK (f2c): elapsed time t=1.34 s, 4 iters, t-(init.)=1.33 s t(norm)=0.836566, mflops=5.97681 (err=1.1e-12) 8. Singleton (f2c): elapsed time t=1.89 s, 16 iters, t-(init.)=1.82 s t(norm)=0.286194, mflops=17.4707 (err=7.7e-15) 9. Temperton (f2c): elapsed time t=1.79 s, 16 iters, t-(init.)=1.72 s t(norm)=0.270469, mflops=18.4864 (err=5.6e-15) 10. Valkenburg: elapsed time t=1.13 s, 2 iters, t-(init.)=1.12 s t(norm)=1.40895, mflops=3.54873 (err=5.5e-15) 11. SGIMATH: elapsed time t=1.37 s, 16 iters, t-(init.)=1.29 s t(norm)=0.202852, mflops=24.6486 (err=5.6e-15) Top mflops for N=27000 = 49.6823 Normalized results and averages for N=27000: fft 0: mflops = 49.6823 (norm. = 1), norm. avg. (of 16) = 0.707572 fft 1: mflops = 48.9179 (norm. = 0.984615), norm. avg. (of 16) = 0.722465 fft 2: mflops = 16.2228 (norm. = 0.326531), norm. avg. (of 16) = 0.362915 fft 3: mflops = 35.7266 (norm. = 0.719101), norm. avg. (of 16) = 0.862368 fft 4: mflops = 36.1326 (norm. = 0.727273), norm. avg. (of 16) = 0.848512 fft 5: mflops = 8.15299 (norm. = 0.164103), norm. avg. 
(of 16) = 0.153453 fft 6: mflops = 21.9287 (norm. = 0.441379), norm. avg. (of 16) = 0.512683 fft 7: mflops = 5.97681 (norm. = 0.120301), norm. avg. (of 16) = 0.127061 fft 8: mflops = 17.4707 (norm. = 0.351648), norm. avg. (of 16) = 0.29052 fft 9: mflops = 18.4864 (norm. = 0.372093), norm. avg. (of 12) = 0.272005 fft 10: mflops = 3.54873 (norm. = 0.0714286), norm. avg. (of 16) = 0.055243 fft 11: mflops = 24.6486 (norm. = 0.496124), norm. avg. (of 16) = 0.183995 Benchmarking for array size = 75600: 0. CWP (min N) (N=80080): elapsed time t=1.37 s, 8 iters, t-(init.)=1.23 s t(norm)=0.125492, mflops=39.8433 1. CWP (best N) (N=80080): elapsed time t=1.37 s, 8 iters, t-(init.)=1.23 s t(norm)=0.125492, mflops=39.8433 2. FFTPACK (f2c): elapsed time t=1.24 s, 2 iters, t-(init.)=1.21 s t(norm)=0.493805, mflops=10.1255 (err=1.1e-14) FFTW_MEASURE plan: (cost = 2.100000e-01) FFTW_TWIDDLE 6 FFTW_TWIDDLE 9 FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 14 3. FFTW: elapsed time t=1.72 s, 8 iters, t-(init.)=1.58 s t(norm)=0.161201, mflops=31.0172 (err=1.1e-14) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.69 s, 8 iters, t-(init.)=1.55 s t(norm)=0.15814, mflops=31.6176 (err=1.1e-14) 5. Frigo-old: elapsed time t=1.74 s, 2 iters, t-(init.)=1.7 s t(norm)=0.693775, mflops=7.20695 (err=1.1e-14) 6. GSL: elapsed time t=1.51 s, 4 iters, t-(init.)=1.45 s t(norm)=0.295875, mflops=16.899 (err=1.1e-14) 7. NAPACK (f2c): elapsed time t=1.17 s, 1 iters, t-(init.)=1.15 s t(norm)=0.938637, mflops=5.32687 (err=5.1e-12) 8. Singleton (f2c): elapsed time t=1.9 s, 4 iters, t-(init.)=1.83 s t(norm)=0.373414, mflops=13.39 (err=1.5e-14) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1.99 s, 1 iters, t-(init.)=1.98 s t(norm)=1.61609, mflops=3.09389 (err=1.1e-14) 11. 
SGIMATH: elapsed time t=1.24 s, 4 iters, t-(init.)=1.18 s t(norm)=0.240781, mflops=20.7658 (err=1.1e-14) Top mflops for N=75600 = 39.8433 Normalized results and averages for N=75600: fft 0: mflops = 39.8433 (norm. = 1), norm. avg. (of 17) = 0.724774 fft 1: mflops = 39.8433 (norm. = 1), norm. avg. (of 17) = 0.73879 fft 2: mflops = 10.1255 (norm. = 0.254132), norm. avg. (of 17) = 0.356516 fft 3: mflops = 31.0172 (norm. = 0.778481), norm. avg. (of 17) = 0.857434 fft 4: mflops = 31.6176 (norm. = 0.793548), norm. avg. (of 17) = 0.845279 fft 5: mflops = 7.20695 (norm. = 0.180882), norm. avg. (of 17) = 0.155066 fft 6: mflops = 16.899 (norm. = 0.424138), norm. avg. (of 17) = 0.507475 fft 7: mflops = 5.32687 (norm. = 0.133696), norm. avg. (of 17) = 0.127451 fft 8: mflops = 13.39 (norm. = 0.336066), norm. avg. (of 17) = 0.293199 fft 9: mflops = -1 (norm. = -0.0250983), norm. avg. (of 12) = 0.272005 fft 10: mflops = 3.09389 (norm. = 0.0776515), norm. avg. (of 17) = 0.0565612 fft 11: mflops = 20.7658 (norm. = 0.521186), norm. avg. (of 17) = 0.20383 Benchmarking for array size = 165375: 0. CWP (min N) (N=180180): elapsed time t=1.85 s, 4 iters, t-(init.)=1.68 s t(norm)=0.146503, mflops=34.129 1. CWP (best N) (N=180180): elapsed time t=1.85 s, 4 iters, t-(init.)=1.69 s t(norm)=0.147375, mflops=33.9271 2. FFTPACK (f2c): elapsed time t=2.11 s, 1 iters, t-(init.)=2.07 s t(norm)=0.72205, mflops=6.92473 (err=2.7e-14) FFTW_MEASURE plan: (cost = 5.600000e-01) FFTW_TWIDDLE 5 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.1 s, 2 iters, t-(init.)=1.01 s t(norm)=0.176152, mflops=28.3845 (err=2.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.14 s, 2 iters, t-(init.)=1.06 s t(norm)=0.184873, mflops=27.0456 (err=2.7e-14) 5. 
Frigo-old: elapsed time t=3.06 s, 1 iters, t-(init.)=3.02 s t(norm)=1.05343, mflops=4.74642 (err=2.7e-14) 6. GSL: elapsed time t=1.7 s, 2 iters, t-(init.)=1.62 s t(norm)=0.282541, mflops=17.6965 (err=2.7e-14) 7. NAPACK (f2c): elapsed time t=2.99 s, 1 iters, t-(init.)=2.96 s t(norm)=1.0325, mflops=4.84263 (err=1.6e-11) 8. Singleton (f2c): elapsed time t=1.27 s, 1 iters, t-(init.)=1.23 s t(norm)=0.429044, mflops=11.6538 (err=4.0e-14) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=5.34 s, 1 iters, t-(init.)=5.3 s t(norm)=1.84873, mflops=2.70456 (err=2.7e-14) 11. SGIMATH: elapsed time t=1.62 s, 2 iters, t-(init.)=1.54 s t(norm)=0.268589, mflops=18.6158 (err=2.7e-14) Top mflops for N=165375 = 34.129 Normalized results and averages for N=165375: fft 0: mflops = 34.129 (norm. = 1), norm. avg. (of 18) = 0.740064 fft 1: mflops = 33.9271 (norm. = 0.994083), norm. avg. (of 18) = 0.752973 fft 2: mflops = 6.92473 (norm. = 0.202899), norm. avg. (of 18) = 0.347982 fft 3: mflops = 28.3845 (norm. = 0.831683), norm. avg. (of 18) = 0.856003 fft 4: mflops = 27.0456 (norm. = 0.792453), norm. avg. (of 18) = 0.842344 fft 5: mflops = 4.74642 (norm. = 0.139073), norm. avg. (of 18) = 0.154178 fft 6: mflops = 17.6965 (norm. = 0.518519), norm. avg. (of 18) = 0.508088 fft 7: mflops = 4.84263 (norm. = 0.141892), norm. avg. (of 18) = 0.128253 fft 8: mflops = 11.6538 (norm. = 0.341463), norm. avg. (of 18) = 0.295881 fft 9: mflops = -1 (norm. = -0.0293006), norm. avg. (of 12) = 0.272005 fft 10: mflops = 2.70456 (norm. = 0.0792453), norm. avg. (of 18) = 0.0578214 fft 11: mflops = 18.6158 (norm. = 0.545455), norm. avg. (of 18) = 0.222809 Benchmarking for array size = 362880: 0. CWP (min N) (N=720720): elapsed time t=1.94 s, 1 iters, t-(init.)=1.77 s t(norm)=0.264097, mflops=18.9324 1. CWP (best N) (N=720720): elapsed time t=1.94 s, 1 iters, t-(init.)=1.77 s t(norm)=0.264097, mflops=18.9324 2. 
FFTPACK (f2c): elapsed time t=3.58 s, 1 iters, t-(init.)=3.5 s t(norm)=0.522226, mflops=9.5744 (err=1.1e-13) FFTW_MEASURE plan: (cost = 1.180000e+00) FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_TWIDDLE 3 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.21 s, 1 iters, t-(init.)=1.12 s t(norm)=0.167112, mflops=29.92 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 4. FFTW_ESTIMATE: elapsed time t=1.21 s, 1 iters, t-(init.)=1.12 s t(norm)=0.167112, mflops=29.92 (err=1.1e-13) 5. Frigo-old: elapsed time t=5.14 s, 1 iters, t-(init.)=5.06 s t(norm)=0.75499, mflops=6.62261 (err=1.1e-13) 6. GSL: elapsed time t=1.87 s, 1 iters, t-(init.)=1.79 s t(norm)=0.267081, mflops=18.7209 (err=1.1e-13) 7. NAPACK (f2c): elapsed time t=6.3 s, 1 iters, t-(init.)=6.21 s t(norm)=0.926578, mflops=5.3962 (err=3.4e-11) 8. Singleton (f2c): elapsed time t=3.18 s, 1 iters, t-(init.)=3.1 s t(norm)=0.462543, mflops=10.8098 (err=1.6e-13) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=11.91 s, 1 iters, t-(init.)=11.82 s t(norm)=1.76363, mflops=2.83506 (err=1.1e-13) 11. SGIMATH: elapsed time t=1.8 s, 1 iters, t-(init.)=1.71 s t(norm)=0.255145, mflops=19.5967 (err=1.1e-13) Top mflops for N=362880 = 29.92 Normalized results and averages for N=362880: fft 0: mflops = 18.9324 (norm. = 0.632768), norm. avg. (of 19) = 0.734417 fft 1: mflops = 18.9324 (norm. = 0.632768), norm. avg. (of 19) = 0.746647 fft 2: mflops = 9.5744 (norm. = 0.32), norm. avg. (of 19) = 0.346509 fft 3: mflops = 29.92 (norm. = 1), norm. avg. (of 19) = 0.863582 fft 4: mflops = 29.92 (norm. = 1), norm. avg. (of 19) = 0.850642 fft 5: mflops = 6.62261 (norm. = 0.221344), norm. avg. (of 19) = 0.157713 fft 6: mflops = 18.7209 (norm. = 0.625698), norm. avg. (of 19) = 0.514278 fft 7: mflops = 5.3962 (norm. = 0.180354), norm. avg. (of 19) = 0.130996 fft 8: mflops = 10.8098 (norm. = 0.36129), norm. 
avg. (of 19) = 0.299323 fft 9: mflops = -1 (norm. = -0.0334225), norm. avg. (of 12) = 0.272005 fft 10: mflops = 2.83506 (norm. = 0.0947547), norm. avg. (of 19) = 0.0597653 fft 11: mflops = 19.5967 (norm. = 0.654971), norm. avg. (of 19) = 0.245555 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) 512x128x64 (64.0236 MB) Maximum array size N = 4194304 Benchmarking FFTs: 0. FFTW 1. HARM (f2c) 2. PDA (f2c) 3. Singleton (f2c) 4. Temperton (f2c) Computing normalized averages (5 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.28 s, 65536 iters, t-(init.)=1.2 s t(norm)=0.0476837, mflops=104.858 (err=1.9e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. PDA (f2c): elapsed time t=1.48 s, 8192 iters, t-(init.)=1.47 s t(norm)=0.4673, mflops=10.6998 (err=2.8e-16) 3. Singleton (f2c): elapsed time t=1.13 s, 32768 iters, t-(init.)=1.1 s t(norm)=0.0874201, mflops=57.1951 (err=1.9e-16) 4. Temperton (f2c): elapsed time t=1.72 s, 32768 iters, t-(init.)=1.69 s t(norm)=0.134309, mflops=37.2276 (err=1.9e-16) Top mflops for N=64 = 104.858 Normalized results and averages for N=64: fft 0: mflops = 104.858 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.00953674), norm. avg. (of 0) = -1 fft 2: mflops = 10.6998 (norm. = 0.102041), norm. avg. (of 1) = 0.102041 fft 3: mflops = 57.1951 (norm. = 0.545455), norm. avg. (of 1) = 0.545455 fft 4: mflops = 37.2276 (norm. = 0.35503), norm. avg. (of 1) = 0.35503 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.29 s, 8192 iters, t-(init.)=1.23 s t(norm)=0.0325839, mflops=153.45 (err=3.4e-16) 1. HARM (f2c): elapsed time t=1.01 s, 2048 iters, t-(init.)=1 s t(norm)=0.105964, mflops=47.1859 (err=4.0e-16) 2. 
PDA (f2c): elapsed time t=1.29 s, 1024 iters, t-(init.)=1.28 s t(norm)=0.271267, mflops=18.432 (err=3.1e-16) 3. Singleton (f2c): elapsed time t=1.21 s, 2048 iters, t-(init.)=1.2 s t(norm)=0.127157, mflops=39.3216 (err=3.5e-16) 4. Temperton (f2c): elapsed time t=1.47 s, 4096 iters, t-(init.)=1.44 s t(norm)=0.0762939, mflops=65.536 (err=3.3e-16) Top mflops for N=512 = 153.45 Normalized results and averages for N=512: fft 0: mflops = 153.45 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 47.1859 (norm. = 0.3075), norm. avg. (of 1) = 0.3075 fft 2: mflops = 18.432 (norm. = 0.120117), norm. avg. (of 2) = 0.111079 fft 3: mflops = 39.3216 (norm. = 0.25625), norm. avg. (of 2) = 0.400852 fft 4: mflops = 65.536 (norm. = 0.427083), norm. avg. (of 2) = 0.391056 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.82 s, 512 iters, t-(init.)=1.65 s t(norm)=0.0655651, mflops=76.2601 (err=4.2e-16) 1. HARM (f2c): elapsed time t=1.55 s, 256 iters, t-(init.)=1.47 s t(norm)=0.116825, mflops=42.799 (err=4.0e-16) 2. PDA (f2c): elapsed time t=1.29 s, 128 iters, t-(init.)=1.25 s t(norm)=0.198682, mflops=25.1658 (err=3.9e-16) 3. Singleton (f2c): elapsed time t=1.15 s, 128 iters, t-(init.)=1.11 s t(norm)=0.17643, mflops=28.3399 (err=4.1e-16) 4. Temperton (f2c): elapsed time t=1.67 s, 256 iters, t-(init.)=1.58 s t(norm)=0.125567, mflops=39.8193 (err=4.5e-16) Top mflops for N=4096 = 76.2601 Normalized results and averages for N=4096: fft 0: mflops = 76.2601 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 42.799 (norm. = 0.561224), norm. avg. (of 2) = 0.434362 fft 2: mflops = 25.1658 (norm. = 0.33), norm. avg. (of 3) = 0.184053 fft 3: mflops = 28.3399 (norm. = 0.371622), norm. avg. (of 3) = 0.391109 fft 4: mflops = 39.8193 (norm. = 0.522152), norm. avg. (of 3) = 0.434755 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.1 s, 16 iters, t-(init.)=1.01 s t(norm)=0.128428, mflops=38.9323 (err=5.2e-16) 1. 
HARM (f2c): elapsed time t=1.78 s, 16 iters, t-(init.)=1.69 s t(norm)=0.214895, mflops=23.2672 (err=5.0e-16) 2. PDA (f2c): elapsed time t=1.18 s, 8 iters, t-(init.)=1.13 s t(norm)=0.287374, mflops=17.3989 (err=4.3e-16) 3. Singleton (f2c): elapsed time t=1.66 s, 8 iters, t-(init.)=1.61 s t(norm)=0.409444, mflops=12.2117 (err=5.3e-16) 4. Temperton (f2c): elapsed time t=1.96 s, 16 iters, t-(init.)=1.87 s t(norm)=0.237783, mflops=21.0276 (err=4.9e-16) Top mflops for N=32768 = 38.9323 Normalized results and averages for N=32768: fft 0: mflops = 38.9323 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 23.2672 (norm. = 0.597633), norm. avg. (of 3) = 0.488786 fft 2: mflops = 17.3989 (norm. = 0.446903), norm. avg. (of 4) = 0.249765 fft 3: mflops = 12.2117 (norm. = 0.313665), norm. avg. (of 4) = 0.371748 fft 4: mflops = 21.0276 (norm. = 0.540107), norm. avg. (of 4) = 0.461093 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.82 s, 2 iters, t-(init.)=1.7 s t(norm)=0.180138, mflops=27.7564 (err=1.3e-15) 1. HARM (f2c): elapsed time t=1.18 s, 1 iters, t-(init.)=1.12 s t(norm)=0.237359, mflops=21.0651 (err=1.2e-15) 2. PDA (f2c): elapsed time t=1.51 s, 1 iters, t-(init.)=1.45 s t(norm)=0.307295, mflops=16.271 (err=1.2e-15) 3. Singleton (f2c): elapsed time t=2.42 s, 1 iters, t-(init.)=2.36 s t(norm)=0.500149, mflops=9.99702 (err=1.7e-15) 4. Temperton (f2c): elapsed time t=1.42 s, 1 iters, t-(init.)=1.36 s t(norm)=0.288222, mflops=17.3478 (err=1.3e-15) Top mflops for N=262144 = 27.7564 Normalized results and averages for N=262144: fft 0: mflops = 27.7564 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 21.0651 (norm. = 0.758929), norm. avg. (of 4) = 0.556322 fft 2: mflops = 16.271 (norm. = 0.586207), norm. avg. (of 5) = 0.317054 fft 3: mflops = 9.99702 (norm. = 0.360169), norm. avg. (of 5) = 0.369432 fft 4: mflops = 17.3478 (norm. = 0.625), norm. avg. (of 5) = 0.493874 Benchmarking for array size = 256x64x32 (power of 2): 0. 
FFTW: elapsed time t=1.82 s, 1 iters, t-(init.)=1.69 s t(norm)=0.169654, mflops=29.4718 (err=1.2e-15) 1. HARM (f2c): elapsed time t=2.63 s, 1 iters, t-(init.)=2.51 s t(norm)=0.251971, mflops=19.8436 (err=1.2e-15) 2. PDA (f2c): elapsed time t=2.87 s, 1 iters, t-(init.)=2.75 s t(norm)=0.276064, mflops=18.1118 (err=1.2e-15) 3. Singleton (f2c): elapsed time t=5.4 s, 1 iters, t-(init.)=5.27 s t(norm)=0.529038, mflops=9.45111 (err=1.7e-15) 4. Temperton (f2c): elapsed time t=2.84 s, 1 iters, t-(init.)=2.72 s t(norm)=0.273052, mflops=18.3115 (err=1.3e-15) Top mflops for N=524288 = 29.4718 Normalized results and averages for N=524288: fft 0: mflops = 29.4718 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 19.8436 (norm. = 0.673307), norm. avg. (of 5) = 0.579719 fft 2: mflops = 18.1118 (norm. = 0.614545), norm. avg. (of 6) = 0.366636 fft 3: mflops = 9.45111 (norm. = 0.320683), norm. avg. (of 6) = 0.361307 fft 4: mflops = 18.3115 (norm. = 0.621324), norm. avg. (of 6) = 0.515116 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=3.69 s, 1 iters, t-(init.)=3.43 s t(norm)=0.163555, mflops=30.5707 (err=2.0e-15) 1. HARM (f2c): elapsed time t=5.51 s, 1 iters, t-(init.)=5.26 s t(norm)=0.250816, mflops=19.9349 (err=1.9e-15) 2. PDA (f2c): elapsed time t=6.26 s, 1 iters, t-(init.)=6 s t(norm)=0.286102, mflops=17.4763 (err=2.0e-15) 3. Singleton (f2c): elapsed time t=10.35 s, 1 iters, t-(init.)=10.11 s t(norm)=0.482082, mflops=10.3717 (err=2.8e-15) 4. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 30.5707 Normalized results and averages for N=1048576: fft 0: mflops = 30.5707 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 19.9349 (norm. = 0.652091), norm. avg. (of 6) = 0.591781 fft 2: mflops = 17.4763 (norm. = 0.571667), norm. avg. (of 7) = 0.395926 fft 3: mflops = 10.3717 (norm. = 0.339268), norm. avg. (of 7) = 0.358159 fft 4: mflops = -1 (norm. = -0.032711), norm. avg. 
(of 6) = 0.515116 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=8.54 s, 1 iters, t-(init.)=8.05 s t(norm)=0.182788, mflops=27.3542 (err=7.4e-16) 1. HARM (f2c): elapsed time t=11.69 s, 1 iters, t-(init.)=11.18 s t(norm)=0.253859, mflops=19.696 (err=7.0e-16) 2. PDA (f2c): elapsed time t=13.6 s, 1 iters, t-(init.)=13.1 s t(norm)=0.297456, mflops=16.8092 (err=7.0e-16) 3. Singleton (f2c): elapsed time t=30.61 s, 1 iters, t-(init.)=30.11 s t(norm)=0.683694, mflops=7.31322 (err=8.4e-16) 4. Temperton (f2c): elapsed time t=17.37 s, 1 iters, t-(init.)=16.86 s t(norm)=0.382832, mflops=13.0606 (err=7.3e-16) Top mflops for N=2097152 = 27.3542 Normalized results and averages for N=2097152: fft 0: mflops = 27.3542 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 19.696 (norm. = 0.720036), norm. avg. (of 7) = 0.610103 fft 2: mflops = 16.8092 (norm. = 0.614504), norm. avg. (of 8) = 0.423248 fft 3: mflops = 7.31322 (norm. = 0.267353), norm. avg. (of 8) = 0.346808 fft 4: mflops = 13.0606 (norm. = 0.477461), norm. avg. (of 7) = 0.509737 Benchmarking for array size = 512x128x64 (power of 2): 0. FFTW: elapsed time t=18.58 s, 1 iters, t-(init.)=17.59 s t(norm)=0.190626, mflops=26.2293 (err=1.4e-15) 1. HARM (f2c): elapsed time t=24.83 s, 1 iters, t-(init.)=23.84 s t(norm)=0.258359, mflops=19.3529 (err=1.2e-15) 2. PDA (f2c): elapsed time t=29.05 s, 1 iters, t-(init.)=28.06 s t(norm)=0.304092, mflops=16.4424 (err=1.3e-15) 3. Singleton (f2c): elapsed time t=60.73 s, 1 iters, t-(init.)=59.73 s t(norm)=0.647306, mflops=7.72432 (err=1.6e-15) 4. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=4194304 = 26.2293 Normalized results and averages for N=4194304: fft 0: mflops = 26.2293 (norm. = 1), norm. avg. (of 9) = 1 fft 1: mflops = 19.3529 (norm. = 0.737836), norm. avg. (of 8) = 0.626069 fft 2: mflops = 16.4424 (norm. = 0.626871), norm. avg. (of 9) = 0.445873 fft 3: mflops = 7.72432 (norm. = 0.294492), norm. avg. 
(of 9) = 0.340995 fft 4: mflops = -1 (norm. = -0.0381253), norm. avg. (of 7) = 0.509737 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) 120x120x120 (26.3728 MB) 144x144x144 (45.5692 MB) 180x180x180 (88.9976 MB) Maximum array size N = 5832000 Benchmarking FFTs: 0. FFTW 1. PDA (f2c) 2. Singleton (f2c) 3. Temperton (f2c) Computing normalized averages (4 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1 s, 16384 iters, t-(init.)=0.97 s t(norm)=0.0679942, mflops=73.5357 (err=3.9e-16) 1. PDA (f2c): elapsed time t=1.36 s, 4096 iters, t-(init.)=1.36 s t(norm)=0.381328, mflops=13.1121 (err=3.0e-16) 2. Singleton (f2c): elapsed time t=1.81 s, 32768 iters, t-(init.)=1.74 s t(norm)=0.0609845, mflops=81.9881 (err=3.4e-16) 3. Temperton (f2c): elapsed time t=1.59 s, 16384 iters, t-(init.)=1.55 s t(norm)=0.10865, mflops=46.0191 (err=2.5e-16) Top mflops for N=125 = 81.9881 Normalized results and averages for N=125: fft 0: mflops = 73.5357 (norm. = 0.896907), norm. avg. (of 1) = 0.896907 fft 1: mflops = 13.1121 (norm. = 0.159926), norm. avg. (of 1) = 0.159926 fft 2: mflops = 81.9881 (norm. = 1), norm. avg. (of 1) = 1 fft 3: mflops = 46.0191 (norm. = 0.56129), norm. avg. (of 1) = 0.56129 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.26 s, 16384 iters, t-(init.)=1.2 s t(norm)=0.0437252, mflops=114.35 (err=2.9e-16) 1. 
PDA (f2c): elapsed time t=1.26 s, 2048 iters, t-(init.)=1.25 s t(norm)=0.364377, mflops=13.7221 (err=3.6e-16) 2. Singleton (f2c): elapsed time t=1.95 s, 8192 iters, t-(init.)=1.93 s t(norm)=0.140649, mflops=35.5494 (err=2.9e-16) 3. Temperton (f2c): elapsed time t=1.76 s, 8192 iters, t-(init.)=1.73 s t(norm)=0.126074, mflops=39.6591 (err=3.1e-16) Top mflops for N=216 = 114.35 Normalized results and averages for N=216: fft 0: mflops = 114.35 (norm. = 1), norm. avg. (of 2) = 0.948454 fft 1: mflops = 13.7221 (norm. = 0.12), norm. avg. (of 2) = 0.139963 fft 2: mflops = 35.5494 (norm. = 0.310881), norm. avg. (of 2) = 0.65544 fft 3: mflops = 39.6591 (norm. = 0.346821), norm. avg. (of 2) = 0.454056 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.77 s, 8192 iters, t-(init.)=1.72 s t(norm)=0.0726818, mflops=68.793 (err=3.7e-16) 1. PDA (f2c): elapsed time t=1.11 s, 512 iters, t-(init.)=1.11 s t(norm)=0.750482, mflops=6.66238 (err=4.9e-16) 2. Singleton (f2c): elapsed time t=1.7 s, 4096 iters, t-(init.)=1.68 s t(norm)=0.141983, mflops=35.2155 (err=5.8e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 68.793 Normalized results and averages for N=343: fft 0: mflops = 68.793 (norm. = 1), norm. avg. (of 3) = 0.965636 fft 1: mflops = 6.66238 (norm. = 0.0968468), norm. avg. (of 3) = 0.125591 fft 2: mflops = 35.2155 (norm. = 0.511905), norm. avg. (of 3) = 0.607595 fft 3: mflops = -1 (norm. = -0.0145364), norm. avg. (of 2) = 0.454056 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.51 s, 4096 iters, t-(init.)=1.46 s t(norm)=0.0514156, mflops=97.2467 (err=5.3e-16) 1. PDA (f2c): elapsed time t=1.8 s, 1024 iters, t-(init.)=1.77 s t(norm)=0.249331, mflops=20.0537 (err=4.1e-16) 2. Singleton (f2c): elapsed time t=1.72 s, 2048 iters, t-(init.)=1.7 s t(norm)=0.119735, mflops=41.7589 (err=4.5e-16) 3. 
Temperton (f2c): elapsed time t=1.19 s, 2048 iters, t-(init.)=1.16 s t(norm)=0.0817015, mflops=61.1984 (err=4.8e-16) Top mflops for N=729 = 97.2467 Normalized results and averages for N=729: fft 0: mflops = 97.2467 (norm. = 1), norm. avg. (of 4) = 0.974227 fft 1: mflops = 20.0537 (norm. = 0.206215), norm. avg. (of 4) = 0.145747 fft 2: mflops = 41.7589 (norm. = 0.429412), norm. avg. (of 4) = 0.563049 fft 3: mflops = 61.1984 (norm. = 0.62931), norm. avg. (of 3) = 0.512474 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.06 s, 2048 iters, t-(init.)=1.03 s t(norm)=0.0504656, mflops=99.0773 (err=4.8e-16) 1. PDA (f2c): elapsed time t=1.17 s, 512 iters, t-(init.)=1.16 s t(norm)=0.22734, mflops=21.9935 (err=4.7e-16) 2. Singleton (f2c): elapsed time t=1.09 s, 1024 iters, t-(init.)=1.07 s t(norm)=0.104851, mflops=47.6867 (err=5.4e-16) 3. Temperton (f2c): elapsed time t=1.91 s, 2048 iters, t-(init.)=1.88 s t(norm)=0.092112, mflops=54.2817 (err=3.9e-16) Top mflops for N=1000 = 99.0773 Normalized results and averages for N=1000: fft 0: mflops = 99.0773 (norm. = 1), norm. avg. (of 5) = 0.979381 fft 1: mflops = 21.9935 (norm. = 0.221983), norm. avg. (of 5) = 0.160994 fft 2: mflops = 47.6867 (norm. = 0.481308), norm. avg. (of 5) = 0.546701 fft 3: mflops = 54.2817 (norm. = 0.547872), norm. avg. (of 4) = 0.521323 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.33 s, 1024 iters, t-(init.)=1.31 s t(norm)=0.092612, mflops=53.9887 (err=4.4e-16) 1. PDA (f2c): elapsed time t=1.29 s, 128 iters, t-(init.)=1.28 s t(norm)=0.723929, mflops=6.90676 (err=5.3e-16) 2. Singleton (f2c): elapsed time t=1.19 s, 512 iters, t-(init.)=1.18 s t(norm)=0.166843, mflops=29.9683 (err=6.6e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 53.9887 Normalized results and averages for N=1331: fft 0: mflops = 53.9887 (norm. = 1), norm. avg. (of 6) = 0.982818 fft 1: mflops = 6.90676 (norm. = 0.12793), norm. avg. 
(of 6) = 0.155483 fft 2: mflops = 29.9683 (norm. = 0.555085), norm. avg. (of 6) = 0.548098 fft 3: mflops = -1 (norm. = -0.0185224), norm. avg. (of 4) = 0.521323 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.42 s, 2048 iters, t-(init.)=1.36 s t(norm)=0.0357322, mflops=139.93 (err=3.9e-16) 1. PDA (f2c): elapsed time t=1.94 s, 512 iters, t-(init.)=1.93 s t(norm)=0.202833, mflops=24.6509 (err=3.8e-16) 2. Singleton (f2c): elapsed time t=1.18 s, 512 iters, t-(init.)=1.16 s t(norm)=0.12191, mflops=41.0139 (err=4.0e-16) 3. Temperton (f2c): elapsed time t=1.26 s, 1024 iters, t-(init.)=1.24 s t(norm)=0.0651587, mflops=76.7358 (err=3.9e-16) Top mflops for N=1728 = 139.93 Normalized results and averages for N=1728: fft 0: mflops = 139.93 (norm. = 1), norm. avg. (of 7) = 0.985272 fft 1: mflops = 24.6509 (norm. = 0.176166), norm. avg. (of 7) = 0.158438 fft 2: mflops = 41.0139 (norm. = 0.293103), norm. avg. (of 7) = 0.511671 fft 3: mflops = 76.7358 (norm. = 0.548387), norm. avg. (of 5) = 0.526736 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.5 s, 512 iters, t-(init.)=1.47 s t(norm)=0.117718, mflops=42.4744 (err=7.8e-16) 1. PDA (f2c): elapsed time t=1.17 s, 64 iters, t-(init.)=1.16 s t(norm)=0.743145, mflops=6.72817 (err=1.2e-15) 2. Singleton (f2c): elapsed time t=1.08 s, 256 iters, t-(init.)=1.07 s t(norm)=0.171372, mflops=29.1763 (err=1.5e-15) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 42.4744 Normalized results and averages for N=2197: fft 0: mflops = 42.4744 (norm. = 1), norm. avg. (of 8) = 0.987113 fft 1: mflops = 6.72817 (norm. = 0.158405), norm. avg. (of 8) = 0.158434 fft 2: mflops = 29.1763 (norm. = 0.686916), norm. avg. (of 8) = 0.533576 fft 3: mflops = -1 (norm. = -0.0235436), norm. avg. (of 5) = 0.526736 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.23 s, 512 iters, t-(init.)=1.14 s t(norm)=0.0710405, mflops=70.3824 (err=4.1e-16) 1. 
PDA (f2c): elapsed time t=1.84 s, 128 iters, t-(init.)=1.82 s t(norm)=0.453662, mflops=11.0214 (err=5.0e-16) 2. Singleton (f2c): elapsed time t=1.44 s, 256 iters, t-(init.)=1.4 s t(norm)=0.174486, mflops=28.6557 (err=5.6e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 70.3824 Normalized results and averages for N=2744: fft 0: mflops = 70.3824 (norm. = 1), norm. avg. (of 9) = 0.988545 fft 1: mflops = 11.0214 (norm. = 0.156593), norm. avg. (of 9) = 0.158229 fft 2: mflops = 28.6557 (norm. = 0.407143), norm. avg. (of 9) = 0.519528 fft 3: mflops = -1 (norm. = -0.0142081), norm. avg. (of 5) = 0.526736 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.43 s, 512 iters, t-(init.)=1.28 s t(norm)=0.0631995, mflops=79.1145 (err=5.4e-16) 1. PDA (f2c): elapsed time t=1.93 s, 256 iters, t-(init.)=1.85 s t(norm)=0.182686, mflops=27.3694 (err=5.3e-16) 2. Singleton (f2c): elapsed time t=1.52 s, 256 iters, t-(init.)=1.46 s t(norm)=0.144174, mflops=34.6803 (err=6.7e-16) 3. Temperton (f2c): elapsed time t=1.94 s, 512 iters, t-(init.)=1.81 s t(norm)=0.0893681, mflops=55.9484 (err=5.2e-16) Top mflops for N=3375 = 79.1145 Normalized results and averages for N=3375: fft 0: mflops = 79.1145 (norm. = 1), norm. avg. (of 10) = 0.989691 fft 1: mflops = 27.3694 (norm. = 0.345946), norm. avg. (of 10) = 0.177001 fft 2: mflops = 34.6803 (norm. = 0.438356), norm. avg. (of 10) = 0.511411 fft 3: mflops = 55.9484 (norm. = 0.707182), norm. avg. (of 6) = 0.556811 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.82 s, 64 iters, t-(init.)=1.67 s t(norm)=0.110657, mflops=45.1847 (err=4.7e-16) 1. PDA (f2c): elapsed time t=1.81 s, 32 iters, t-(init.)=1.74 s t(norm)=0.23059, mflops=21.6835 (err=4.9e-16) 2. Singleton (f2c): elapsed time t=1.97 s, 32 iters, t-(init.)=1.9 s t(norm)=0.251794, mflops=19.8575 (err=5.3e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 
Top mflops for N=16800 = 45.1847 Normalized results and averages for N=16800: fft 0: mflops = 45.1847 (norm. = 1), norm. avg. (of 11) = 0.990628 fft 1: mflops = 21.6835 (norm. = 0.479885), norm. avg. (of 11) = 0.204536 fft 2: mflops = 19.8575 (norm. = 0.439474), norm. avg. (of 11) = 0.504871 fft 3: mflops = -1 (norm. = -0.0221314), norm. avg. (of 6) = 0.556811 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.11 s, 4 iters, t-(init.)=1.01 s t(norm)=0.136269, mflops=36.6922 (err=6.7e-16) 1. PDA (f2c): elapsed time t=1.95 s, 4 iters, t-(init.)=1.85 s t(norm)=0.249601, mflops=20.032 (err=6.4e-16) 2. Singleton (f2c): elapsed time t=1.85 s, 2 iters, t-(init.)=1.8 s t(norm)=0.48571, mflops=10.2942 (err=6.5e-16) 3. Temperton (f2c): elapsed time t=1.86 s, 4 iters, t-(init.)=1.76 s t(norm)=0.237458, mflops=21.0563 (err=7.3e-16) Top mflops for N=110592 = 36.6922 Normalized results and averages for N=110592: fft 0: mflops = 36.6922 (norm. = 1), norm. avg. (of 12) = 0.991409 fft 1: mflops = 20.032 (norm. = 0.545946), norm. avg. (of 12) = 0.232987 fft 2: mflops = 10.2942 (norm. = 0.280556), norm. avg. (of 12) = 0.486178 fft 3: mflops = 21.0563 (norm. = 0.573864), norm. avg. (of 7) = 0.559247 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.23 s, 4 iters, t-(init.)=1.11 s t(norm)=0.140032, mflops=35.7062 (err=6.8e-16) 1. PDA (f2c): elapsed time t=1.74 s, 2 iters, t-(init.)=1.68 s t(norm)=0.42388, mflops=11.7958 (err=7.6e-16) 2. Singleton (f2c): elapsed time t=1.61 s, 2 iters, t-(init.)=1.55 s t(norm)=0.391079, mflops=12.7851 (err=1.0e-15) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 35.7062 Normalized results and averages for N=117649: fft 0: mflops = 35.7062 (norm. = 1), norm. avg. (of 13) = 0.99207 fft 1: mflops = 11.7958 (norm. = 0.330357), norm. avg. (of 13) = 0.240477 fft 2: mflops = 12.7851 (norm. = 0.358065), norm. avg. (of 13) = 0.476323 fft 3: mflops = -1 (norm. = -0.0280063), norm. avg. 
(of 7) = 0.559247 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.03 s, 2 iters, t-(init.)=0.93 s t(norm)=0.121484, mflops=41.1577 (err=7.6e-16) 1. PDA (f2c): elapsed time t=1.69 s, 2 iters, t-(init.)=1.59 s t(norm)=0.207698, mflops=24.0734 (err=7.7e-16) 2. Singleton (f2c): elapsed time t=2.15 s, 1 iters, t-(init.)=2.1 s t(norm)=0.548637, mflops=9.11349 (err=1.0e-15) 3. Temperton (f2c): elapsed time t=1.63 s, 2 iters, t-(init.)=1.53 s t(norm)=0.199861, mflops=25.0174 (err=7.3e-16) Top mflops for N=216000 = 41.1577 Normalized results and averages for N=216000: fft 0: mflops = 41.1577 (norm. = 1), norm. avg. (of 14) = 0.992636 fft 1: mflops = 24.0734 (norm. = 0.584906), norm. avg. (of 14) = 0.265079 fft 2: mflops = 9.11349 (norm. = 0.221429), norm. avg. (of 14) = 0.458117 fft 3: mflops = 25.0174 (norm. = 0.607843), norm. avg. (of 8) = 0.565321 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.22 s, 2 iters, t-(init.)=1.11 s t(norm)=0.128278, mflops=38.9778 (err=7.5e-16) 1. PDA (f2c): elapsed time t=1.15 s, 1 iters, t-(init.)=1.1 s t(norm)=0.254245, mflops=19.6661 (err=8.0e-16) 2. Singleton (f2c): elapsed time t=2.6 s, 1 iters, t-(init.)=2.54 s t(norm)=0.587074, mflops=8.51681 (err=9.3e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 38.9778 Normalized results and averages for N=241920: fft 0: mflops = 38.9778 (norm. = 1), norm. avg. (of 15) = 0.993127 fft 1: mflops = 19.6661 (norm. = 0.504545), norm. avg. (of 15) = 0.281043 fft 2: mflops = 8.51681 (norm. = 0.218504), norm. avg. (of 15) = 0.442142 fft 3: mflops = -1 (norm. = -0.0256556), norm. avg. (of 8) = 0.565321 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.15 s, 1 iters, t-(init.)=1.05 s t(norm)=0.133192, mflops=37.5398 (err=7.6e-16) 1. PDA (f2c): elapsed time t=1.66 s, 1 iters, t-(init.)=1.55 s t(norm)=0.196617, mflops=25.4302 (err=7.9e-16) 2. 
Singleton (f2c): elapsed time t=3.7 s, 1 iters, t-(init.)=3.6 s t(norm)=0.456659, mflops=10.9491 (err=9.8e-16) 3. Temperton (f2c): elapsed time t=1.47 s, 1 iters, t-(init.)=1.37 s t(norm)=0.173784, mflops=28.7713 (err=9.7e-16) Top mflops for N=421875 = 37.5398 Normalized results and averages for N=421875: fft 0: mflops = 37.5398 (norm. = 1), norm. avg. (of 16) = 0.993557 fft 1: mflops = 25.4302 (norm. = 0.677419), norm. avg. (of 16) = 0.305817 fft 2: mflops = 10.9491 (norm. = 0.291667), norm. avg. (of 16) = 0.432738 fft 3: mflops = 28.7713 (norm. = 0.766423), norm. avg. (of 9) = 0.587666 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.53 s, 1 iters, t-(init.)=1.41 s t(norm)=0.145204, mflops=34.4343 (err=6.7e-16) 1. PDA (f2c): elapsed time t=2.39 s, 1 iters, t-(init.)=2.27 s t(norm)=0.233768, mflops=21.3887 (err=6.1e-16) 2. Singleton (f2c): elapsed time t=5.06 s, 1 iters, t-(init.)=4.94 s t(norm)=0.508729, mflops=9.82842 (err=7.9e-16) 3. Temperton (f2c): elapsed time t=2.52 s, 1 iters, t-(init.)=2.4 s t(norm)=0.247156, mflops=20.2302 (err=6.7e-16) Top mflops for N=512000 = 34.4343 Normalized results and averages for N=512000: fft 0: mflops = 34.4343 (norm. = 1), norm. avg. (of 17) = 0.993936 fft 1: mflops = 21.3887 (norm. = 0.621145), norm. avg. (of 17) = 0.324366 fft 2: mflops = 9.82842 (norm. = 0.285425), norm. avg. (of 17) = 0.424072 fft 3: mflops = 20.2302 (norm. = 0.5875), norm. avg. (of 10) = 0.587649 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.5 s, 1 iters, t-(init.)=1.36 s t(norm)=0.119652, mflops=41.7877 (err=7.0e-16) 1. PDA (f2c): elapsed time t=3.28 s, 1 iters, t-(init.)=3.14 s t(norm)=0.276256, mflops=18.0991 (err=7.0e-16) 2. Singleton (f2c): elapsed time t=6.98 s, 1 iters, t-(init.)=6.84 s t(norm)=0.601781, mflops=8.30867 (err=8.9e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 41.7877 Normalized results and averages for N=592704: fft 0: mflops = 41.7877 (norm. = 1), norm. 
avg. (of 18) = 0.994273 fft 1: mflops = 18.0991 (norm. = 0.433121), norm. avg. (of 18) = 0.330408 fft 2: mflops = 8.30867 (norm. = 0.19883), norm. avg. (of 18) = 0.411559 fft 3: mflops = -1 (norm. = -0.0239305), norm. avg. (of 10) = 0.587649 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=3.18 s, 1 iters, t-(init.)=2.98 s t(norm)=0.170501, mflops=29.3253 (err=7.7e-16) 1. PDA (f2c): elapsed time t=5.16 s, 1 iters, t-(init.)=4.96 s t(norm)=0.283788, mflops=17.6188 (err=6.5e-16) 2. Singleton (f2c): elapsed time t=11.83 s, 1 iters, t-(init.)=11.62 s t(norm)=0.664841, mflops=7.52059 (err=7.0e-16) 3. Temperton (f2c): elapsed time t=5.61 s, 1 iters, t-(init.)=5.4 s t(norm)=0.308962, mflops=16.1832 (err=7.7e-16) Top mflops for N=884736 = 29.3253 Normalized results and averages for N=884736: fft 0: mflops = 29.3253 (norm. = 1), norm. avg. (of 19) = 0.994574 fft 1: mflops = 17.6188 (norm. = 0.600806), norm. avg. (of 19) = 0.344639 fft 2: mflops = 7.52059 (norm. = 0.256454), norm. avg. (of 19) = 0.403395 fft 3: mflops = 16.1832 (norm. = 0.551852), norm. avg. (of 11) = 0.584395 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=3.45 s, 1 iters, t-(init.)=3.18 s t(norm)=0.136377, mflops=36.6631 (err=7.5e-16) 1. PDA (f2c): elapsed time t=6.78 s, 1 iters, t-(init.)=6.5 s t(norm)=0.278758, mflops=17.9367 (err=7.3e-16) 2. Singleton (f2c): elapsed time t=11.72 s, 1 iters, t-(init.)=11.45 s t(norm)=0.491043, mflops=10.1824 (err=7.8e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 36.6631 Normalized results and averages for N=1157625: fft 0: mflops = 36.6631 (norm. = 1), norm. avg. (of 20) = 0.994845 fft 1: mflops = 17.9367 (norm. = 0.489231), norm. avg. (of 20) = 0.351869 fft 2: mflops = 10.1824 (norm. = 0.277729), norm. avg. (of 20) = 0.397112 fft 3: mflops = -1 (norm. = -0.0272754), norm. avg. (of 11) = 0.584395 Benchmarking for array size = 112x112x112: 0. 
FFTW: elapsed time t=4.61 s, 1 iters, t-(init.)=4.28 s t(norm)=0.149173, mflops=33.5181 (err=5.5e-16) 1. PDA (f2c): elapsed time t=9.12 s, 1 iters, t-(init.)=8.79 s t(norm)=0.306362, mflops=16.3206 (err=5.6e-16) 2. Singleton (f2c): elapsed time t=15.71 s, 1 iters, t-(init.)=15.38 s t(norm)=0.536047, mflops=9.32755 (err=6.4e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 33.5181 Normalized results and averages for N=1404928: fft 0: mflops = 33.5181 (norm. = 1), norm. avg. (of 21) = 0.995091 fft 1: mflops = 16.3206 (norm. = 0.486917), norm. avg. (of 21) = 0.3583 fft 2: mflops = 9.32755 (norm. = 0.278283), norm. avg. (of 21) = 0.391454 fft 3: mflops = -1 (norm. = -0.0298346), norm. avg. (of 11) = 0.584395 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=4.81 s, 1 iters, t-(init.)=4.4 s t(norm)=0.122887, mflops=40.6879 (err=7.5e-16) 1. PDA (f2c): elapsed time t=8.15 s, 1 iters, t-(init.)=7.74 s t(norm)=0.216169, mflops=23.1301 (err=8.1e-16) 2. Singleton (f2c): elapsed time t=26.03 s, 1 iters, t-(init.)=25.63 s t(norm)=0.715815, mflops=6.98504 (err=9.6e-16) 3. Temperton (f2c): elapsed time t=9.28 s, 1 iters, t-(init.)=8.88 s t(norm)=0.248008, mflops=20.1607 (err=7.1e-16) Top mflops for N=1728000 = 40.6879 Normalized results and averages for N=1728000: fft 0: mflops = 40.6879 (norm. = 1), norm. avg. (of 22) = 0.995314 fft 1: mflops = 23.1301 (norm. = 0.568475), norm. avg. (of 22) = 0.367853 fft 2: mflops = 6.98504 (norm. = 0.171674), norm. avg. (of 22) = 0.381464 fft 3: mflops = 20.1607 (norm. = 0.495495), norm. avg. (of 12) = 0.576987 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=10.28 s, 1 iters, t-(init.)=9.57 s t(norm)=0.149001, mflops=33.5569 (err=1.2e-15) 1. PDA (f2c): elapsed time t=16.7 s, 1 iters, t-(init.)=15.99 s t(norm)=0.248957, mflops=20.0838 (err=1.2e-15) 2. Singleton (f2c): elapsed time t=43.23 s, 1 iters, t-(init.)=42.51 s t(norm)=0.661862, mflops=7.55444 (err=1.6e-15) 3. 
Temperton (f2c): elapsed time t=18.49 s, 1 iters, t-(init.)=17.78 s t(norm)=0.276827, mflops=18.0618 (err=1.2e-15) Top mflops for N=2985984 = 33.5569 Normalized results and averages for N=2985984: fft 0: mflops = 33.5569 (norm. = 1), norm. avg. (of 23) = 0.995518 fft 1: mflops = 20.0838 (norm. = 0.598499), norm. avg. (of 23) = 0.377881 fft 2: mflops = 7.55444 (norm. = 0.225124), norm. avg. (of 23) = 0.374666 fft 3: mflops = 18.0618 (norm. = 0.538245), norm. avg. (of 13) = 0.574007 Benchmarking for array size = 180x180x180: 0. FFTW: elapsed time t=21.43 s, 1 iters, t-(init.)=20.05 s t(norm)=0.152963, mflops=32.6876 (err=9.7e-16) 1. PDA (f2c): elapsed time t=29.14 s, 1 iters, t-(init.)=27.76 s t(norm)=0.211783, mflops=23.6091 (err=9.5e-16) 2. Singleton (f2c): elapsed time t=93.58 s, 1 iters, t-(init.)=92.2 s t(norm)=0.703401, mflops=7.10832 (err=1.2e-15) 3. Temperton (f2c): elapsed time t=30.89 s, 1 iters, t-(init.)=29.5 s t(norm)=0.225058, mflops=22.2165 (err=9.4e-16) Top mflops for N=5832000 = 32.6876 Normalized results and averages for N=5832000: fft 0: mflops = 32.6876 (norm. = 1), norm. avg. (of 24) = 0.995704 fft 1: mflops = 23.6091 (norm. = 0.722262), norm. avg. (of 24) = 0.39223 fft 2: mflops = 7.10832 (norm. = 0.217462), norm. avg. (of 24) = 0.368116 fft 3: mflops = 22.2165 (norm. = 0.679661), norm. avg. 
(of 14) = 0.581553
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Beauregard, Bergland, CWP (min N), CWP (best N), Edelblute, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), NAPACK (f2c), Ooura (C), Ransom, Singleton (f2c), Temperton (f2c), Valkenburg, SGIMATH
2, 17.1898, 17.05, 13.6179, 0.630154, 2.27951, 2.0011, 2.44994, 2.36166, , 5.19097, 17.1898, 17.1898, 27.962, , 6.16809, 4.12825, 4.0022, 12.6334, , , , 2.75941, 12.7875, , 3.33941, 1.89959, 4.12825, 1.6384
4, 34.1, 33.5544, 19.7845, 2.73067, 3.54249, 6.39376, 9.27943, 4.80998, 19.5996, 14.9797, 53.0925, 52.4288, 87.3813, , 17.05, 7.94376, 7.82519, 43.6907, 18.5589, 20.1649, 18.0789, 5.34988, 25.7319, 3.08405, 11.5865, 6.80894, 4.36907, 5.76141
8, 59.9186, 57.7198, 23.4756, 3.33234, 4.01241, 11.5652, 23.4756, 14.4299, 21.6947, 24.0132, 98.304, 99.078, 159.277, 41.3912, 30.2474, 12.8923, 13.3294, 65.536, 32.0993, 33.1129, 31.4573, 7.94376, 44.306, 3.1711, 10.9227, 11.3156, 4.54585, 4.76625
16, 39.9458, 40.3298, 26.2144, 6.67883, 4.99322, 17.7725, 46.3459, 27.06, 23.6966, 41.943, 119.837, 116.508, 176.602, 43.2402, 46.6034, 18.0789, 19.065, 83.0555, 28.5327, 34.6637, 33.5544, 11.1551, 60.787, 10.5917, 29.1271, 19.065, 4.76625, 5.79324
32, 49.4611, 49.9322, 28.6496, 7.75574, 5.90414, 24.0499, 49.9322, 52.9584, 25.4509, 41.2825, 108.101, 106.998, 144.631, 59.9186, 49.4611, 22.2156, 24.2726, 90.3945, 33.6082, 43.6907, 42.2813, 13.4433, 73.8434, 9.78149, 35.6659, 19.563, 4.96485, 11.7029
64, 48.3958, 48.3958, 31.1458, 14.6997, 6.44616, 29.9593, 62.9146, 62.2916, 27.8383, 53.7731, 114.39, 89.2405, 106.635, 87.3813, 60.4948, 25.3688, 27.8383, 93.2068, 33.825, 44.306, 43.6907, 15.4202, 79.6387, 19.7845, 49.9322, 28.0869, 5.14008, 16.384
128, 53.9708, 54.3706, 33.3638, 14.1154, 6.89853, 33.3638, 72.6736, 83.8861, 29.8375, 54.3706, 123.362, 102.658, 123.362, 85.8483, 66.7276, 27.1853, 30.3307, 66.1264, 37.4491, 49.9322, 48.2897, 15.9566, 85.8483, 18.1684, 44.7563, 29.3601, 5.21309, 20.5029
256, 54.8275, 53.4306, 34.9525, 17.7725, 7.3327, 37.1177, 86.4805, 101.068, 32.0176, 61.6809, 123.362, 112.599, 115.705, 92.6918, 72.9444, 29.7468, 33.5544, 61.2307, 37.7865, 50.5338, 49.6367, 16.6441, 87.8388, 26.7153, 61.6809, 33.026, 5.24288, 15.6504
512, 59.3534, 58.2542, 36.5782, 18.7246, 7.66005, 41.0312, 98.304, 100.932, 33.4652, 54.8673, 90.7422, 89.03, 103.139, 105.443, 67.8934, 30.6402, 33.9467, 63.7648, 41.3912, 54.5502, 54.5502, 16.9734, 89.03, 25.7847, 60.4948, 28.4253, 4.608, 30.6402
1024, 59.5782, 57.2992, 37.4491, 22.9951, 8.09086, 42.2813, 96.1996, 96.1996, 35.4249, 44.4312, 89.6219, 79.4376, 75.4371, 99.8644, 55.1882, 31.023, 35.4249, 39.4202, 41.6102, 54.3304, 53.4988, 17.1336, 90.3945, 31.775, 65.9482, 32.9741, 4.85452, 33.1828
2048, 59.4553, 58.2542, 37.6939, 21.6811, 8.00996, 43.3622, 74.8983, 84.1922, 35.3814, 25.0746, 69.484, 64.0796, 45.056, 86.7243, 31.1739, 31.1739, 34.5339, 35.3814, 43.6907, 57.1007, 53.3997, 11.264, 84.1922, 29.5752, 55.9919, 28.2704, 4.65094, 28.8358
4096, 23.3017, 25.9978, 18.9502, 20.1649, 7.63526, 29.6767, 61.6809, 73.1565, 18.5043, 22.153, 51.15, 47.6625, 33.825, 45.2623, 23.3017, 16.2151, 16.8221, 29.3993, 42.2245, 53.7731, 40.8536, 10.4163, 42.799, 29.1271, 29.1271, 19.5387, 4.20552, 21.9981
8192, 17.7493, 17.2115, 13.0071, 15.9246, 7.22007, 21.4332, 52.8352, 51.6344, 12.8115, 15.6324, 38.0768, 38.0768, 25.0579, 30.1582, 17.0394, 11.8329, 12.3474, , 21.0362, 23.1828, 21.0362, 8.27153, 34.4229, 22.8716, 20.5293, 14.9468, 3.98116, 25.4319
16384, 13.4927, 13.6941, 10.546, 17.6443, 7.00385, 19.7313, 49.9322, 49.9322, 10.4262, 14.1154, 32.768, 30.5835, 20.6181, 25.6644, 14.9188, 9.97287, 10.3673, , 16.5316, 17.8156, 16.5316, 7.05772, 27.3882, 25.6644, 17.3114, 13.2972, 3.61222, 22.7951
32768, 12.3653, 12.2117, 8.93673, 14.1445, 6.73315, 17.3989, 45.4585, 45.4585, 9.18729, 11.8439, 30.0165, 28.7019, 17.5543, 21.7246, 13.6533, 8.69947, 8.93673, , 14.0434, 14.6722, 13.6533, 5.74877, 25.0456, 21.1406, 12.9347, 11.4975, 3.23368, 20.2689
65536, 9.19804, 9.27943, 7.3327, 15.4202, 6.39376, 14.7687, 38.8361, 38.4799, 7.43671, 11.9837, 26.5462, 25.7319, 15.3077, 18.3961, 12.866, 7.18203, 7.28178, , 12.0526, 12.71, 11.5865, 5.51882, 20.5603, 20.7639, 12.4092, 10.2802, 2.97891, 19.9729
131072, 8.9129, 9.05782, 6.71152, 11.9797, 6.25906, 13.6701, 37.4491, 38.4177, 6.63162, 10.1283, 25.9096, 25.0362, 13.6701, 15.803, 11.4857, 6.51527, 6.59238, , 9.8594, 10.2212, 9.44163, 5.28015, 20.0741, 18.1156, 10.9227, 9.13207, 2.87884, 17.8258
262144, 8.10755, 8.24929, 6.4286, 14.9323, 6.30828, 13.8782, 27.5941, 27.4337, 6.49944, 10.9735, 24.576, 23.593, 12.5494, 15.9412, 12.0372, 6.39376, 6.46382, , 9.25214, 9.51329, 8.93673, 5.41123, 19.6608, 21.6449, 11.0765, 9.32528, 2.77238, 18.7246
524288, 8.31509, 8.44193, 6.32073, 11.6372, 6.36109, 13.2115, 28.4613, 28.4613, 6.35298, 10.4199, 24.9037, 23.6054, 11.9156, 15.5163, 12.2678, 6.29676, 6.37738, , 8.81546, 9.08893, 8.5727, 5.23187, 19.3803, 18.5848, 9.46908, 8.87832, 2.64933, 18.6544
1048576, 7.68751, 7.88996, 6.21194, 14.8945, 6.38597, 13.1896, , , 6.19726, 11.0145, 22.7951, 21.3125, 11.4224, , 12.5879, 6.20827, 6.29775, , , 8.92405, 8.46308, 5.36631, 18.9959, 22.6965, 10.7657, 9.07858, 2.61556, 19.8971
Norm. Avg., 0.384409, 0.383253, 0.259702, 0.238243, 0.111165, 0.325056, 0.703482, 0.718259, 0.236662, 0.332855, 0.784237, 0.734487, 0.699172, 0.607028, 0.38592, 0.206798, 0.21949, 0.50959, 0.32521, 0.387413, 0.361039, 0.134451, 0.616196, 0.36172, 0.356967, 0.237844, 0.064969, 0.326471
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, CWP (min N), CWP (best N), FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Singleton (f2c), Temperton (f2c), Valkenburg, SGIMATH
6, 13.0314, 9.24044, 17.8324, 70.7095, 70.0999, 10.1645, 22.5877, 7.87945, 8.47041, 7.31258, 4.62022, 3.57904
9, 24.7642, 17.1532, 24.7642, 77.9041, 75.5433, 8.34687, 24.4405, 11.1958, 14.2725, 12.3821, 4.76964, 3.8002
12, 29.6771, 26.1049, 32.0378, 112.773, 110.562, 14.9964, 37.0964, 10.442, 15.4062, 16.2967, 4.9289, 6.52623
15, 34.9148, 34.9148, 36.5774, 78.3802, 78.3802, 10.6095, 36.2324, 7.80616, 16.8449, 19.013, 4.36435, 7.16535
18, 37.2655, 32.1506, 28.2704, 63.8837, 62.663, 9.91743, 42.4055, 13.1525, 20.16, 15.9709, 4.95871, 7.36384
24, 55.0498, 46.526, 34.0166, 84.3453, 84.3453, 19.9213, 53.0259, 13.3547, 19.1796, 22.536, 5.03598, 9.2932
36, 64.1968, 64.1968, 39.3464, 81.8616, 85.8971, 12.7056, 59.7911, 15.6377, 29.8955, 27.4716, 5.15092, 10.9689
80, 87.6855, 82.0424, 54.876, 101.052, 94.1622, 26.0575, 60.4838, 10.4625, 44.7907, 36.3433, 4.97973, 16.842
108, 70.7254, 93.3796, 45.2749, 98.7817, 98.7817, 11.4051, 62.253, 18.2204, 30.4913, 31.7888, 5.26082, 17.4745
210, 87.8872, 87.8872, 31.5975, 70.9678, 67.709, 12.7605, 62.5989, 8.68519, 25.3263, , 4.14718, 12.4727
504, 116.557, 115.109, 31.5179, 85.0117, 79.8817, 14.8498, 68.6391, 11.4682, 29.6996, , 4.52455, 18.2407
1000, 82.9672, 111.53, 45.5579, 66.6991, 65.8385, 15.3689, 53.7103, 9.04695, 41.4836, 42.1693, 4.36856, 29.3246
1960, 94.6132, 95.436, 19.1873, 50.8108, 55.43, 14.3654, 37.3304, 7.57951, 30.829, , 3.76893, 15.86
4725, 61.0103, 69.6439, 20.5062, 48.5674, 45.5694, 8.54427, 34.4965, 8.09457, 25.2817, , 3.91009, 22.7847
10368, 60.2155, 64.6108, 20.6815, 49.176, 46.1025, 11.3483, 29.1174, 10.151, 25.7316, 22.3527, 4.25562, 28.9271
27000, 49.6823, 48.9179, 16.2228, 35.7266, 36.1326, 8.15299, 21.9287, 5.97681, 17.4707, 18.4864, 3.54873, 24.6486
75600, 39.8433, 39.8433, 10.1255, 31.0172, 31.6176, 7.20695, 16.899, 5.32687, 13.39, , 3.09389, 20.7658
165375, 34.129, 33.9271, 6.92473, 28.3845, 27.0456, 4.74642, 17.6965, 4.84263, 11.6538, , 2.70456, 18.6158
362880, 18.9324, 18.9324, 9.5744, 29.92, 29.92, 6.62261, 18.7209, 5.3962, 10.8098, , 2.83506, 19.5967
Norm. Avg., 0.734417, 0.746647, 0.346509, 0.863582, 0.850642, 0.157713, 0.514278, 0.130996, 0.299323, 0.272005, 0.0597653, 0.245555
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM (f2c), PDA (f2c), Singleton (f2c), Temperton (f2c)
4x4x4, 104.858, , 10.6998, 57.1951, 37.2276
8x8x8, 153.45, 47.1859, 18.432, 39.3216, 65.536
16x16x16, 76.2601, 42.799, 25.1658, 28.3399, 39.8193
32x32x32, 38.9323, 23.2672, 17.3989, 12.2117, 21.0276
64x64x64, 27.7564, 21.0651, 16.271, 9.99702, 17.3478
256x64x32, 29.4718, 19.8436, 18.1118, 9.45111, 18.3115
16x1024x64, 30.5707, 19.9349, 17.4763, 10.3717,
128x128x128, 27.3542, 19.696, 16.8092, 7.31322, 13.0606
512x128x64, 26.2293, 19.3529, 16.4424, 7.72432,
Norm. Avg., 1, 0.626069, 0.445873, 0.340995, 0.509737
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA (f2c), Singleton (f2c), Temperton (f2c)
5x5x5, 73.5357, 13.1121, 81.9881, 46.0191
6x6x6, 114.35, 13.7221, 35.5494, 39.6591
7x7x7, 68.793, 6.66238, 35.2155,
9x9x9, 97.2467, 20.0537, 41.7589, 61.1984
10x10x10, 99.0773, 21.9935, 47.6867, 54.2817
11x11x11, 53.9887, 6.90676, 29.9683,
12x12x12, 139.93, 24.6509, 41.0139, 76.7358
13x13x13, 42.4744, 6.72817, 29.1763,
14x14x14, 70.3824, 11.0214, 28.6557,
15x15x15, 79.1145, 27.3694, 34.6803, 55.9484
24x25x28, 45.1847, 21.6835, 19.8575,
48x48x48, 36.6922, 20.032, 10.2942, 21.0563
49x49x49, 35.7062, 11.7958, 12.7851,
60x60x60, 41.1577, 24.0734, 9.11349, 25.0174
72x60x56, 38.9778, 19.6661, 8.51681,
75x75x75, 37.5398, 25.4302, 10.9491, 28.7713
80x80x80, 34.4343, 21.3887, 9.82842, 20.2302
84x84x84, 41.7877, 18.0991, 8.30867,
96x96x96, 29.3253, 17.6188, 7.52059, 16.1832
105x105x105, 36.6631, 17.9367, 10.1824,
112x112x112, 33.5181, 16.3206, 9.32755,
120x120x120, 40.6879, 23.1301, 6.98504, 20.1607
144x144x144, 33.5569, 20.0838, 7.55444, 18.0618
180x180x180, 32.6876, 23.6091, 7.10832, 22.2165
Norm. Avg., 0.995704, 0.39223, 0.368116, 0.581553
@@@@ end