To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Thomas L. Ferrell
@ submitter email = f44@ornl.gov
@ submitter organization = Oak Ridge National Lab
@ computer manufacturer = Apple
@ computer model = PowerMac9600/300
@ CPU manufacturer = Motorola/IBM
@ CPU model = PowerPC 604e Mach5
@ CPU speed = 300 MHz
@ RAM = 196 MB
@ L2 cache size = 1 Mb
@ operating system = MacOS 7.6.1
@ C compiler = Metrowerks CodeWarrior Pro 2
@ C compiler flags = all
@ Fortran compiler = NONE
@ Fortran compiler flags = NONE
@ remarks = AppleShare, AppleGuide off, 7.6.1 base on
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
2 (0.000228882 MB)
4 (0.000534058 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)
262144 (25.4987 MB)
524288 (50.9973 MB)
1048576 (96 MB)
Maximum array size = 1048576
Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Beauregard
5. Bergland
6. CWP (min N)
7. CWP (best N)
8. Edelblute
9. FFTPACK (f2c)
10. FFTW
11. FFTW_ESTIMATE
12. Frigo-old
13. Green
14. GSL
15. GSL DIT
16. GSL DIF
17. Krukar
18. Mayer (Buneman)
19. Mayer (simple)
20. Mayer (lookup)
21. NAPACK (f2c)
22. Nielsen
23. NR (C)
24. Ooura (C)
25. QFT
26. Ransom
27. Singleton (f2c)
28. Temperton (f2c)
29. Valkenburg
Computing normalized averages (30 transforms).
Benchmarking for array size = 2 (power of 2):
0. Arndt DIF: elapsed time t=1.46563 s, 4194304 iters, t-(init.)=1.19821 s t(norm)=0.142838, mflops=35.0048 (err=1.7e-17)
1. Arndt DIT: elapsed time t=1.52086 s, 4194304 iters, t-(init.)=1.25347 s t(norm)=0.149425, mflops=33.4616 (err=1.7e-17)
2. 
Arndt Split-Radix: elapsed time t=1.15917 s, 2097152 iters, t-(init.)=1.02546 s t(norm)=0.244489, mflops=20.4508 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.87068 s, 262144 iters, t-(init.)=1.85389 s t(norm)=3.53601, mflops=1.41402 (err=1.7e-17) 4. Beauregard: elapsed time t=1.08509 s, 524288 iters, t-(init.)=1.05163 s t(norm)=1.00292, mflops=4.98546 (err=1.7e-17) 5. Bergland: elapsed time t=1.88449 s, 1048576 iters, t-(init.)=1.81726 s t(norm)=0.866537, mflops=5.77009 (err=1.7e-17) 6. CWP (min N): elapsed time t=1.08584 s, 524288 iters, t-(init.)=1.05235 s t(norm)=1.0036, mflops=4.98205 7. CWP (best N) (N=3): elapsed time t=1.15811 s, 524288 iters, t-(init.)=1.11763 s t(norm)=1.06586, mflops=4.69105 8. Skipping fft (Edelblute can't handle N <= 2). 9. FFTPACK (f2c): elapsed time t=1.50576 s, 1048576 iters, t-(init.)=1.43538 s t(norm)=0.684443, mflops=7.30521 (err=1.7e-17) FFTW_MEASURE plan: (cost = 3.227654e-07) FFTW_NOTW 2 10. FFTW: elapsed time t=1.46409 s, 4194304 iters, t-(init.)=1.18218 s t(norm)=0.140926, mflops=35.4795 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 11. FFTW_ESTIMATE: elapsed time t=1.43663 s, 4194304 iters, t-(init.)=1.16893 s t(norm)=0.139347, mflops=35.8815 (err=1.7e-17) 12. Frigo-old: elapsed time t=1.04183 s, 4194304 iters, t-(init.)=0.774364 s t(norm)=0.0923114, mflops=54.1645 (err=1.7e-17) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=1.71111 s, 2097152 iters, t-(init.)=1.57735 s t(norm)=0.376069, mflops=13.2954 (err=1.7e-17) 15. GSL DIT: elapsed time t=1.24248 s, 1048576 iters, t-(init.)=1.17211 s t(norm)=0.558906, mflops=8.94605 (err=1.7e-17) 16. GSL DIF: elapsed time t=1.28598 s, 1048576 iters, t-(init.)=1.21899 s t(norm)=0.581262, mflops=8.60198 (err=1.7e-17) 17. Krukar: elapsed time t=1.67543 s, 4194304 iters, t-(init.)=1.40774 s t(norm)=0.167816, mflops=29.7946 (err=1.7e-17) 18. Skipping fft (Mayer can't handle N <= 2). 19. Skipping fft (Mayer can't handle N <= 2). 20. 
Skipping fft (Mayer can't handle N <= 2). 21. NAPACK (f2c): elapsed time t=1.59869 s, 524288 iters, t-(init.)=1.5652 s t(norm)=1.49269, mflops=3.34966 (err=1.7e-17) 22. Nielsen: elapsed time t=1.04836 s, 262144 iters, t-(init.)=1.0316 s t(norm)=1.96763, mflops=2.54113 (err=1.7e-17) 23. NR (C): elapsed time t=1.08091 s, 1048576 iters, t-(init.)=1.01404 s t(norm)=0.483533, mflops=10.3406 (err=1.7e-17) 24. Ooura (C): elapsed time t=1.53459 s, 4194304 iters, t-(init.)=1.26709 s t(norm)=0.151049, mflops=33.1019 (err=1.7e-17) 25. Skipping fft (QFT requires N >= 16). 26. Skipping fft (Ransom doesn't work for N=2). 27. Singleton (f2c): elapsed time t=1.51785 s, 1048576 iters, t-(init.)=1.45036 s t(norm)=0.691585, mflops=7.22977 (err=1.7e-17) 28. Temperton (f2c): elapsed time t=1.6115 s, 524288 iters, t-(init.)=1.57802 s t(norm)=1.50492, mflops=3.32244 (err=1.7e-17) 29. Valkenburg: elapsed time t=1.27771 s, 1048576 iters, t-(init.)=1.20705 s t(norm)=0.575568, mflops=8.68707 (err=1.7e-17) Top mflops for N=2 = 54.1645 Normalized results and averages for N=2: fft 0: mflops = 35.0048 (norm. = 0.646268), norm. avg. (of 1) = 0.646268 fft 1: mflops = 33.4616 (norm. = 0.617777), norm. avg. (of 1) = 0.617777 fft 2: mflops = 20.4508 (norm. = 0.377569), norm. avg. (of 1) = 0.377569 fft 3: mflops = 1.41402 (norm. = 0.0261061), norm. avg. (of 1) = 0.0261061 fft 4: mflops = 4.98546 (norm. = 0.092043), norm. avg. (of 1) = 0.092043 fft 5: mflops = 5.77009 (norm. = 0.106529), norm. avg. (of 1) = 0.106529 fft 6: mflops = 4.98205 (norm. = 0.0919799), norm. avg. (of 1) = 0.0919799 fft 7: mflops = 4.69105 (norm. = 0.0866075), norm. avg. (of 1) = 0.0866075 fft 8: mflops = -1 (norm. = -0.0184623), norm. avg. (of 0) = -1 fft 9: mflops = 7.30521 (norm. = 0.134871), norm. avg. (of 1) = 0.134871 fft 10: mflops = 35.4795 (norm. = 0.655033), norm. avg. (of 1) = 0.655033 fft 11: mflops = 35.8815 (norm. = 0.662455), norm. avg. (of 1) = 0.662455 fft 12: mflops = 54.1645 (norm. = 1), norm. avg. 
(of 1) = 1 fft 13: mflops = -1 (norm. = -0.0184623), norm. avg. (of 0) = -1 fft 14: mflops = 13.2954 (norm. = 0.245464), norm. avg. (of 1) = 0.245464 fft 15: mflops = 8.94605 (norm. = 0.165165), norm. avg. (of 1) = 0.165165 fft 16: mflops = 8.60198 (norm. = 0.158812), norm. avg. (of 1) = 0.158812 fft 17: mflops = 29.7946 (norm. = 0.550076), norm. avg. (of 1) = 0.550076 fft 18: mflops = -1 (norm. = -0.0184623), norm. avg. (of 0) = -1 fft 19: mflops = -1 (norm. = -0.0184623), norm. avg. (of 0) = -1 fft 20: mflops = -1 (norm. = -0.0184623), norm. avg. (of 0) = -1 fft 21: mflops = 3.34966 (norm. = 0.0618424), norm. avg. (of 1) = 0.0618424 fft 22: mflops = 2.54113 (norm. = 0.0469151), norm. avg. (of 1) = 0.0469151 fft 23: mflops = 10.3406 (norm. = 0.19091), norm. avg. (of 1) = 0.19091 fft 24: mflops = 33.1019 (norm. = 0.611137), norm. avg. (of 1) = 0.611137 fft 25: mflops = -1 (norm. = -0.0184623), norm. avg. (of 0) = -1 fft 26: mflops = -1 (norm. = -0.0184623), norm. avg. (of 0) = -1 fft 27: mflops = 7.22977 (norm. = 0.133478), norm. avg. (of 1) = 0.133478 fft 28: mflops = 3.32244 (norm. = 0.0613398), norm. avg. (of 1) = 0.0613398 fft 29: mflops = 8.68707 (norm. = 0.160383), norm. avg. (of 1) = 0.160383 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.37287 s, 2097152 iters, t-(init.)=1.18271 s t(norm)=0.070495, mflops=70.927 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.36016 s, 2097152 iters, t-(init.)=1.17013 s t(norm)=0.0697453, mflops=71.6894 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.40931 s, 1048576 iters, t-(init.)=1.31428 s t(norm)=0.156675, mflops=31.9132 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.60421 s, 262144 iters, t-(init.)=1.58043 s t(norm)=0.753607, mflops=6.63476 (err=1.3e-16) 4. Beauregard: elapsed time t=1.86158 s, 524288 iters, t-(init.)=1.81401 s t(norm)=0.432494, mflops=11.5609 (err=5.3e-17) 5. 
Bergland: elapsed time t=1.06761 s, 524288 iters, t-(init.)=1.01975 s t(norm)=0.243126, mflops=20.5654 (err=5.3e-17) 6. CWP (min N): elapsed time t=1.14566 s, 524288 iters, t-(init.)=1.09814 s t(norm)=0.261816, mflops=19.0974 7. CWP (best N) (N=15): elapsed time t=1.16456 s, 262144 iters, t-(init.)=1.10208 s t(norm)=0.525512, mflops=9.51454 8. Edelblute: elapsed time t=1.50969 s, 1048576 iters, t-(init.)=1.41455 s t(norm)=0.168627, mflops=29.6512 (err=1.3e-16) 9. FFTPACK (f2c): elapsed time t=1.16334 s, 524288 iters, t-(init.)=1.11403 s t(norm)=0.265605, mflops=18.8249 (err=5.3e-17) FFTW_MEASURE plan: (cost = 4.070816e-07) FFTW_NOTW 4 10. FFTW: elapsed time t=1.84473 s, 4194304 iters, t-(init.)=1.45072 s t(norm)=0.0432349, mflops=115.647 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 11. FFTW_ESTIMATE: elapsed time t=1.81634 s, 4194304 iters, t-(init.)=1.43644 s t(norm)=0.0428092, mflops=116.797 (err=5.3e-17) 12. Frigo-old: elapsed time t=1.35144 s, 4194304 iters, t-(init.)=0.971396 s t(norm)=0.0289499, mflops=172.712 (err=5.3e-17) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=1.29708 s, 1048576 iters, t-(init.)=1.20206 s t(norm)=0.143296, mflops=34.8927 (err=5.3e-17) 15. GSL DIT: elapsed time t=1.20177 s, 524288 iters, t-(init.)=1.15248 s t(norm)=0.274773, mflops=18.1969 (err=6.4e-17) 16. GSL DIF: elapsed time t=1.23746 s, 524288 iters, t-(init.)=1.18989 s t(norm)=0.283692, mflops=17.6247 (err=6.4e-17) 17. Krukar: elapsed time t=1.19716 s, 2097152 iters, t-(init.)=1.00708 s t(norm)=0.0600268, mflops=83.2961 (err=5.3e-17) 18. Mayer (Buneman): elapsed time t=1.27132 s, 1048576 iters, t-(init.)=1.1763 s t(norm)=0.140226, mflops=35.6568 (err=1.3e-16) 19. Mayer (simple): elapsed time t=1.22454 s, 1048576 iters, t-(init.)=1.12941 s t(norm)=0.134636, mflops=37.1371 20. Mayer (lookup): elapsed time t=1.324 s, 1048576 iters, t-(init.)=1.2289 s t(norm)=0.146497, mflops=34.1304 (err=1.3e-16) 21. 
NAPACK (f2c): elapsed time t=1.43732 s, 262144 iters, t-(init.)=1.41353 s t(norm)=0.674025, mflops=7.41812 (err=5.3e-17) 22. Nielsen: elapsed time t=1.16336 s, 262144 iters, t-(init.)=1.13953 s t(norm)=0.543368, mflops=9.20186 (err=1.3e-16) 23. NR (C): elapsed time t=1.1332 s, 524288 iters, t-(init.)=1.08565 s t(norm)=0.25884, mflops=19.317 (err=6.4e-17) 24. Ooura (C): elapsed time t=1.49972 s, 2097152 iters, t-(init.)=1.30966 s t(norm)=0.0780619, mflops=64.0517 (err=5.3e-17) 25. Skipping fft (QFT requires N >= 16). 26. Ransom: elapsed time t=1.96347 s, 262144 iters, t-(init.)=1.93969 s t(norm)=0.924917, mflops=5.40589 (err=2.4e-16) 27. Singleton (f2c): elapsed time t=1.74256 s, 1048576 iters, t-(init.)=1.64759 s t(norm)=0.196408, mflops=25.4573 (err=5.3e-17) 28. Temperton (f2c): elapsed time t=1.02923 s, 262144 iters, t-(init.)=1.0054 s t(norm)=0.479414, mflops=10.4294 (err=5.3e-17) 29. Valkenburg: elapsed time t=1.15819 s, 262144 iters, t-(init.)=1.13355 s t(norm)=0.540518, mflops=9.25039 (err=5.3e-17) Top mflops for N=4 = 172.712 Normalized results and averages for N=4: fft 0: mflops = 70.927 (norm. = 0.410665), norm. avg. (of 2) = 0.528467 fft 1: mflops = 71.6894 (norm. = 0.41508), norm. avg. (of 2) = 0.516428 fft 2: mflops = 31.9132 (norm. = 0.184777), norm. avg. (of 2) = 0.281173 fft 3: mflops = 6.63476 (norm. = 0.0384151), norm. avg. (of 2) = 0.0322606 fft 4: mflops = 11.5609 (norm. = 0.066937), norm. avg. (of 2) = 0.07949 fft 5: mflops = 20.5654 (norm. = 0.119073), norm. avg. (of 2) = 0.112801 fft 6: mflops = 19.0974 (norm. = 0.110573), norm. avg. (of 2) = 0.101277 fft 7: mflops = 9.51454 (norm. = 0.0550889), norm. avg. (of 2) = 0.0708482 fft 8: mflops = 29.6512 (norm. = 0.17168), norm. avg. (of 1) = 0.17168 fft 9: mflops = 18.8249 (norm. = 0.108996), norm. avg. (of 2) = 0.121933 fft 10: mflops = 115.647 (norm. = 0.669595), norm. avg. (of 2) = 0.662314 fft 11: mflops = 116.797 (norm. = 0.676253), norm. avg. (of 2) = 0.669354 fft 12: mflops = 172.712 (norm. 
= 1), norm. avg. (of 2) = 1 fft 13: mflops = -1 (norm. = -0.00578997), norm. avg. (of 0) = -1 fft 14: mflops = 34.8927 (norm. = 0.202028), norm. avg. (of 2) = 0.223746 fft 15: mflops = 18.1969 (norm. = 0.105359), norm. avg. (of 2) = 0.135262 fft 16: mflops = 17.6247 (norm. = 0.102047), norm. avg. (of 2) = 0.130429 fft 17: mflops = 83.2961 (norm. = 0.482282), norm. avg. (of 2) = 0.516179 fft 18: mflops = 35.6568 (norm. = 0.206452), norm. avg. (of 1) = 0.206452 fft 19: mflops = 37.1371 (norm. = 0.215023), norm. avg. (of 1) = 0.215023 fft 20: mflops = 34.1304 (norm. = 0.197614), norm. avg. (of 1) = 0.197614 fft 21: mflops = 7.41812 (norm. = 0.0429507), norm. avg. (of 2) = 0.0523966 fft 22: mflops = 9.20186 (norm. = 0.0532785), norm. avg. (of 2) = 0.0500968 fft 23: mflops = 19.317 (norm. = 0.111845), norm. avg. (of 2) = 0.151377 fft 24: mflops = 64.0517 (norm. = 0.370858), norm. avg. (of 2) = 0.490997 fft 25: mflops = -1 (norm. = -0.00578997), norm. avg. (of 0) = -1 fft 26: mflops = 5.40589 (norm. = 0.0312999), norm. avg. (of 1) = 0.0312999 fft 27: mflops = 25.4573 (norm. = 0.147397), norm. avg. (of 2) = 0.140437 fft 28: mflops = 10.4294 (norm. = 0.0603859), norm. avg. (of 2) = 0.0608628 fft 29: mflops = 9.25039 (norm. = 0.0535595), norm. avg. (of 2) = 0.106971 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.26738 s, 1048576 iters, t-(init.)=1.11604 s t(norm)=0.0443474, mflops=112.746 (err=1.1e-16) 1. Arndt DIT: elapsed time t=1.27815 s, 1048576 iters, t-(init.)=1.12671 s t(norm)=0.0447714, mflops=111.678 (err=1.1e-16) 2. Arndt Split-Radix: elapsed time t=1.61657 s, 524288 iters, t-(init.)=1.54087 s t(norm)=0.122457, mflops=40.8305 (err=7.7e-17) 3. Arndt 4-step: elapsed time t=1.80498 s, 131072 iters, t-(init.)=1.78604 s t(norm)=0.567766, mflops=8.80644 (err=9.0e-17) 4. Beauregard: elapsed time t=1.94327 s, 262144 iters, t-(init.)=1.90541 s t(norm)=0.302857, mflops=16.5094 (err=1.5e-16) 5. 
Bergland: elapsed time t=1.80527 s, 524288 iters, t-(init.)=1.72928 s t(norm)=0.137431, mflops=36.3819 (err=1.6e-16) 6. CWP (min N): elapsed time t=1.42592 s, 524288 iters, t-(init.)=1.35015 s t(norm)=0.1073, mflops=46.5982 7. CWP (best N) (N=15): elapsed time t=1.16188 s, 262144 iters, t-(init.)=1.09937 s t(norm)=0.174741, mflops=28.6138 8. Edelblute: elapsed time t=1.99522 s, 524288 iters, t-(init.)=1.91939 s t(norm)=0.152539, mflops=32.7784 (err=8.3e-17) 9. FFTPACK (f2c): elapsed time t=1.04045 s, 262144 iters, t-(init.)=1.00172 s t(norm)=0.15922, mflops=31.4032 (err=1.5e-16) FFTW_MEASURE plan: (cost = 7.183609e-07) FFTW_NOTW 8 10. FFTW: elapsed time t=1.60522 s, 2097152 iters, t-(init.)=1.29547 s t(norm)=0.0257386, mflops=194.26 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 11. FFTW_ESTIMATE: elapsed time t=1.59149 s, 2097152 iters, t-(init.)=1.28885 s t(norm)=0.0256072, mflops=195.258 (err=1.4e-16) 12. Frigo-old: elapsed time t=1.27431 s, 2097152 iters, t-(init.)=0.971759 s t(norm)=0.0193071, mflops=258.972 (err=1.4e-16) 13. Green: elapsed time t=1.66148 s, 1048576 iters, t-(init.)=1.51019 s t(norm)=0.0600096, mflops=83.3199 (err=1.4e-16) 14. GSL: elapsed time t=1.34476 s, 524288 iters, t-(init.)=1.2693 s t(norm)=0.100875, mflops=49.5661 (err=1.4e-16) 15. GSL DIT: elapsed time t=1.02816 s, 262144 iters, t-(init.)=0.989429 s t(norm)=0.157266, mflops=31.7934 (err=1.5e-16) 16. GSL DIF: elapsed time t=1.05527 s, 262144 iters, t-(init.)=1.0174 s t(norm)=0.161712, mflops=30.9192 (err=1.6e-16) 17. Krukar: elapsed time t=1.24593 s, 1048576 iters, t-(init.)=1.09446 s t(norm)=0.0434898, mflops=114.969 (err=1.5e-16) 18. Mayer (Buneman): elapsed time t=1.12021 s, 524288 iters, t-(init.)=1.04458 s t(norm)=0.0830161, mflops=60.2293 (err=1.1e-16) 19. Mayer (simple): elapsed time t=1.11069 s, 524288 iters, t-(init.)=1.03502 s t(norm)=0.0822558, mflops=60.786 20. 
Mayer (lookup): elapsed time t=1.15465 s, 524288 iters, t-(init.)=1.07893 s t(norm)=0.0857459, mflops=58.3118 (err=1.1e-16) 21. NAPACK (f2c): elapsed time t=1.42535 s, 131072 iters, t-(init.)=1.40638 s t(norm)=0.447075, mflops=11.1838 (err=1.7e-16) 22. Nielsen: elapsed time t=1.57613 s, 262144 iters, t-(init.)=1.53828 s t(norm)=0.244504, mflops=20.4496 (err=7.5e-16) 23. NR (C): elapsed time t=1.98677 s, 524288 iters, t-(init.)=1.91106 s t(norm)=0.151877, mflops=32.9214 (err=1.6e-16) 24. Ooura (C): elapsed time t=1.20877 s, 1048576 iters, t-(init.)=1.05749 s t(norm)=0.042021, mflops=118.988 (err=1.5e-16) 25. Skipping fft (QFT requires N >= 16). 26. Ransom: elapsed time t=1.25175 s, 65536 iters, t-(init.)=1.2422 s t(norm)=0.789772, mflops=6.33094 (err=3.1e-16) 27. Singleton (f2c): elapsed time t=1.13215 s, 262144 iters, t-(init.)=1.09424 s t(norm)=0.173926, mflops=28.7479 (err=1.4e-16) 28. Temperton (f2c): elapsed time t=1.08241 s, 131072 iters, t-(init.)=1.06343 s t(norm)=0.338055, mflops=14.7905 (err=1.4e-16) 29. Valkenburg: elapsed time t=1.6362 s, 131072 iters, t-(init.)=1.61682 s t(norm)=0.513974, mflops=9.72813 (err=1.4e-16) Top mflops for N=8 = 258.972 Normalized results and averages for N=8: fft 0: mflops = 112.746 (norm. = 0.435361), norm. avg. (of 3) = 0.497432 fft 1: mflops = 111.678 (norm. = 0.431238), norm. avg. (of 3) = 0.488032 fft 2: mflops = 40.8305 (norm. = 0.157664), norm. avg. (of 3) = 0.240003 fft 3: mflops = 8.80644 (norm. = 0.0340054), norm. avg. (of 3) = 0.0328422 fft 4: mflops = 16.5094 (norm. = 0.0637499), norm. avg. (of 3) = 0.0742433 fft 5: mflops = 36.3819 (norm. = 0.140486), norm. avg. (of 3) = 0.122029 fft 6: mflops = 46.5982 (norm. = 0.179936), norm. avg. (of 3) = 0.127496 fft 7: mflops = 28.6138 (norm. = 0.11049), norm. avg. (of 3) = 0.0840621 fft 8: mflops = 32.7784 (norm. = 0.126571), norm. avg. (of 2) = 0.149125 fft 9: mflops = 31.4032 (norm. = 0.121261), norm. avg. (of 3) = 0.121709 fft 10: mflops = 194.26 (norm. 
= 0.750122), norm. avg. (of 3) = 0.691583 fft 11: mflops = 195.258 (norm. = 0.753973), norm. avg. (of 3) = 0.69756 fft 12: mflops = 258.972 (norm. = 1), norm. avg. (of 3) = 1 fft 13: mflops = 83.3199 (norm. = 0.321734), norm. avg. (of 1) = 0.321734 fft 14: mflops = 49.5661 (norm. = 0.191396), norm. avg. (of 3) = 0.212963 fft 15: mflops = 31.7934 (norm. = 0.122768), norm. avg. (of 3) = 0.131097 fft 16: mflops = 30.9192 (norm. = 0.119392), norm. avg. (of 3) = 0.12675 fft 17: mflops = 114.969 (norm. = 0.443946), norm. avg. (of 3) = 0.492101 fft 18: mflops = 60.2293 (norm. = 0.232571), norm. avg. (of 2) = 0.219511 fft 19: mflops = 60.786 (norm. = 0.234721), norm. avg. (of 2) = 0.224872 fft 20: mflops = 58.3118 (norm. = 0.225167), norm. avg. (of 2) = 0.21139 fft 21: mflops = 11.1838 (norm. = 0.0431854), norm. avg. (of 3) = 0.0493262 fft 22: mflops = 20.4496 (norm. = 0.0789645), norm. avg. (of 3) = 0.0597194 fft 23: mflops = 32.9214 (norm. = 0.127123), norm. avg. (of 3) = 0.143293 fft 24: mflops = 118.988 (norm. = 0.459464), norm. avg. (of 3) = 0.480486 fft 25: mflops = -1 (norm. = -0.00386142), norm. avg. (of 0) = -1 fft 26: mflops = 6.33094 (norm. = 0.0244464), norm. avg. (of 2) = 0.0278732 fft 27: mflops = 28.7479 (norm. = 0.111008), norm. avg. (of 3) = 0.130628 fft 28: mflops = 14.7905 (norm. = 0.0571123), norm. avg. (of 3) = 0.0596126 fft 29: mflops = 9.72813 (norm. = 0.0375644), norm. avg. (of 3) = 0.0838357 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.21022 s, 262144 iters, t-(init.)=1.14416 s t(norm)=0.0681975, mflops=73.3165 (err=1.9e-16) 1. Arndt DIT: elapsed time t=1.20452 s, 262144 iters, t-(init.)=1.13851 s t(norm)=0.0678602, mflops=73.6808 (err=1.9e-16) 2. Arndt Split-Radix: elapsed time t=1.74543 s, 262144 iters, t-(init.)=1.67943 s t(norm)=0.100102, mflops=49.9493 (err=1.5e-16) 3. Arndt 4-step: elapsed time t=1.24889 s, 65536 iters, t-(init.)=1.23236 s t(norm)=0.293818, mflops=17.0173 (err=2.0e-16) 4. 
Beauregard: elapsed time t=1.13871 s, 65536 iters, t-(init.)=1.12214 s t(norm)=0.26754, mflops=18.6888 (err=2.3e-16) 5. Bergland: elapsed time t=1.44224 s, 262144 iters, t-(init.)=1.37612 s t(norm)=0.0820233, mflops=60.9583 (err=2.6e-16) 6. CWP (min N): elapsed time t=1.07669 s, 262144 iters, t-(init.)=1.01069 s t(norm)=0.0602418, mflops=82.9989 7. CWP (best N) (N=28): elapsed time t=1.49845 s, 262144 iters, t-(init.)=1.39027 s t(norm)=0.0828665, mflops=60.338 8. Edelblute: elapsed time t=1.15913 s, 131072 iters, t-(init.)=1.12609 s t(norm)=0.13424, mflops=37.2467 (err=1.6e-16) 9. FFTPACK (f2c): elapsed time t=1.87265 s, 262144 iters, t-(init.)=1.80577 s t(norm)=0.107633, mflops=46.4543 (err=2.1e-16) FFTW_MEASURE plan: (cost = 1.400543e-06) FFTW_NOTW 16 10. FFTW: elapsed time t=1.54723 s, 1048576 iters, t-(init.)=1.2798 s t(norm)=0.0190705, mflops=262.185 (err=2.2e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 11. FFTW_ESTIMATE: elapsed time t=1.54023 s, 1048576 iters, t-(init.)=1.2764 s t(norm)=0.0190198, mflops=262.884 (err=2.2e-16) 12. Frigo-old: elapsed time t=1.47481 s, 1048576 iters, t-(init.)=1.21081 s t(norm)=0.0180424, mflops=277.125 (err=2.2e-16) 13. Green: elapsed time t=1.40488 s, 524288 iters, t-(init.)=1.27292 s t(norm)=0.0379359, mflops=131.801 (err=2.6e-16) 14. GSL: elapsed time t=1.19399 s, 262144 iters, t-(init.)=1.12798 s t(norm)=0.0672327, mflops=74.3686 (err=2.1e-16) 15. GSL DIT: elapsed time t=1.87394 s, 262144 iters, t-(init.)=1.80726 s t(norm)=0.107721, mflops=46.4161 (err=3.1e-16) 16. GSL DIF: elapsed time t=1.89724 s, 262144 iters, t-(init.)=1.83092 s t(norm)=0.109132, mflops=45.8163 (err=2.5e-16) 17. Krukar: elapsed time t=1.39232 s, 524288 iters, t-(init.)=1.26024 s t(norm)=0.0375582, mflops=133.127 (err=1.7e-16) 18. Mayer (Buneman): elapsed time t=1.48639 s, 262144 iters, t-(init.)=1.42022 s t(norm)=0.0846514, mflops=59.0658 (err=2.3e-16) 19. 
Mayer (simple): elapsed time t=1.25228 s, 262144 iters, t-(init.)=1.18628 s t(norm)=0.0707076, mflops=70.7138 20. Mayer (lookup): elapsed time t=1.26421 s, 262144 iters, t-(init.)=1.1982 s t(norm)=0.0714185, mflops=70.0099 (err=2.1e-16) 21. NAPACK (f2c): elapsed time t=1.34284 s, 65536 iters, t-(init.)=1.32631 s t(norm)=0.316217, mflops=15.8119 (err=2.7e-16) 22. Nielsen: elapsed time t=1.09174 s, 65536 iters, t-(init.)=1.07517 s t(norm)=0.25634, mflops=19.5054 (err=1.8e-16) 23. NR (C): elapsed time t=1.81723 s, 262144 iters, t-(init.)=1.75123 s t(norm)=0.104381, mflops=47.9013 (err=2.9e-16) 24. Ooura (C): elapsed time t=1.21367 s, 524288 iters, t-(init.)=1.08171 s t(norm)=0.0322376, mflops=155.098 (err=2.5e-16) 25. QFT: elapsed time t=1.98279 s, 524288 iters, t-(init.)=1.84896 s t(norm)=0.0551033, mflops=90.7386 (err=1.4e-16) 26. Ransom: elapsed time t=1.03476 s, 65536 iters, t-(init.)=1.01822 s t(norm)=0.242764, mflops=20.5962 (err=5.0e-16) 27. Singleton (f2c): elapsed time t=1.17436 s, 262144 iters, t-(init.)=1.10805 s t(norm)=0.0660452, mflops=75.7057 (err=2.0e-16) 28. Temperton (f2c): elapsed time t=1.00987 s, 65536 iters, t-(init.)=0.993235 s t(norm)=0.236806, mflops=21.1144 (err=2.1e-16) 29. Valkenburg: elapsed time t=1.04519 s, 32768 iters, t-(init.)=1.03681 s t(norm)=0.49439, mflops=10.1135 (err=2.5e-16) Top mflops for N=16 = 277.125 Normalized results and averages for N=16: fft 0: mflops = 73.3165 (norm. = 0.264561), norm. avg. (of 4) = 0.439214 fft 1: mflops = 73.6808 (norm. = 0.265876), norm. avg. (of 4) = 0.432493 fft 2: mflops = 49.9493 (norm. = 0.180241), norm. avg. (of 4) = 0.225063 fft 3: mflops = 17.0173 (norm. = 0.0614067), norm. avg. (of 4) = 0.0399833 fft 4: mflops = 18.6888 (norm. = 0.0674382), norm. avg. (of 4) = 0.072542 fft 5: mflops = 60.9583 (norm. = 0.219967), norm. avg. (of 4) = 0.146514 fft 6: mflops = 82.9989 (norm. = 0.2995), norm. avg. (of 4) = 0.170497 fft 7: mflops = 60.338 (norm. = 0.217729), norm. avg. 
(of 4) = 0.117479 fft 8: mflops = 37.2467 (norm. = 0.134404), norm. avg. (of 3) = 0.144218 fft 9: mflops = 46.4543 (norm. = 0.16763), norm. avg. (of 4) = 0.133189 fft 10: mflops = 262.185 (norm. = 0.946091), norm. avg. (of 4) = 0.75521 fft 11: mflops = 262.884 (norm. = 0.948613), norm. avg. (of 4) = 0.760323 fft 12: mflops = 277.125 (norm. = 1), norm. avg. (of 4) = 1 fft 13: mflops = 131.801 (norm. = 0.475603), norm. avg. (of 2) = 0.398668 fft 14: mflops = 74.3686 (norm. = 0.268358), norm. avg. (of 4) = 0.226811 fft 15: mflops = 46.4161 (norm. = 0.167492), norm. avg. (of 4) = 0.140196 fft 16: mflops = 45.8163 (norm. = 0.165327), norm. avg. (of 4) = 0.136394 fft 17: mflops = 133.127 (norm. = 0.480386), norm. avg. (of 4) = 0.489172 fft 18: mflops = 59.0658 (norm. = 0.213138), norm. avg. (of 3) = 0.217387 fft 19: mflops = 70.7138 (norm. = 0.25517), norm. avg. (of 3) = 0.234971 fft 20: mflops = 70.0099 (norm. = 0.25263), norm. avg. (of 3) = 0.225137 fft 21: mflops = 15.8119 (norm. = 0.057057), norm. avg. (of 4) = 0.0512589 fft 22: mflops = 19.5054 (norm. = 0.0703848), norm. avg. (of 4) = 0.0623857 fft 23: mflops = 47.9013 (norm. = 0.172851), norm. avg. (of 4) = 0.150682 fft 24: mflops = 155.098 (norm. = 0.55967), norm. avg. (of 4) = 0.500282 fft 25: mflops = 90.7386 (norm. = 0.327429), norm. avg. (of 1) = 0.327429 fft 26: mflops = 20.5962 (norm. = 0.0743209), norm. avg. (of 3) = 0.0433558 fft 27: mflops = 75.7057 (norm. = 0.273183), norm. avg. (of 4) = 0.166266 fft 28: mflops = 21.1144 (norm. = 0.0761908), norm. avg. (of 4) = 0.0637572 fft 29: mflops = 10.1135 (norm. = 0.0364943), norm. avg. (of 4) = 0.0720003 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.24675 s, 131072 iters, t-(init.)=1.18558 s t(norm)=0.0565331, mflops=88.4438 (err=2.4e-16) 1. Arndt DIT: elapsed time t=1.23864 s, 131072 iters, t-(init.)=1.17747 s t(norm)=0.0561463, mflops=89.0531 (err=2.7e-16) 2. 
Arndt Split-Radix: elapsed time t=1.84164 s, 131072 iters, t-(init.)=1.78024 s t(norm)=0.0848885, mflops=58.9008 (err=3.0e-16) 3. Arndt 4-step: elapsed time t=1.29442 s, 32768 iters, t-(init.)=1.27908 s t(norm)=0.243966, mflops=20.4947 (err=2.4e-16) 4. Beauregard: elapsed time t=1.34418 s, 32768 iters, t-(init.)=1.32885 s t(norm)=0.253457, mflops=19.7272 (err=2.5e-16) 5. Bergland: elapsed time t=1.21586 s, 131072 iters, t-(init.)=1.15468 s t(norm)=0.0550593, mflops=90.8112 (err=2.6e-16) 6. CWP (min N) (N=33): elapsed time t=1.25137 s, 131072 iters, t-(init.)=1.18841 s t(norm)=0.056668, mflops=88.2332 7. CWP (best N) (N=35): elapsed time t=1.99162 s, 262144 iters, t-(init.)=1.85876 s t(norm)=0.0443163, mflops=112.825 8. Edelblute: elapsed time t=1.2676 s, 65536 iters, t-(init.)=1.23698 s t(norm)=0.117968, mflops=42.3844 (err=2.9e-16) 9. FFTPACK (f2c): elapsed time t=1.39098 s, 65536 iters, t-(init.)=1.36 s t(norm)=0.1297, mflops=38.5506 (err=2.3e-16) FFTW_MEASURE plan: (cost = 2.961517e-06) FFTW_NOTW 32 10. FFTW: elapsed time t=1.58548 s, 524288 iters, t-(init.)=1.33906 s t(norm)=0.0159629, mflops=313.227 (err=2.4e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.58154 s, 524288 iters, t-(init.)=1.33654 s t(norm)=0.0159328, mflops=313.817 (err=2.4e-16) 12. Frigo-old: elapsed time t=1.56932 s, 524288 iters, t-(init.)=1.32484 s t(norm)=0.0157933, mflops=316.59 (err=2.1e-16) 13. Green: elapsed time t=1.27784 s, 262144 iters, t-(init.)=1.15547 s t(norm)=0.0275486, mflops=181.498 (err=2.4e-16) 14. GSL: elapsed time t=1.57444 s, 131072 iters, t-(init.)=1.51324 s t(norm)=0.072157, mflops=69.2934 (err=2.3e-16) 15. GSL DIT: elapsed time t=1.78602 s, 131072 iters, t-(init.)=1.72427 s t(norm)=0.0822197, mflops=60.8126 (err=3.1e-16) 16. GSL DIF: elapsed time t=1.7657 s, 131072 iters, t-(init.)=1.70452 s t(norm)=0.081278, mflops=61.5172 (err=3.2e-16) 17. 
Krukar: elapsed time t=1.65354 s, 262144 iters, t-(init.)=1.53089 s t(norm)=0.0364993, mflops=136.989 (err=2.7e-16) 18. Mayer (Buneman): elapsed time t=1.57427 s, 131072 iters, t-(init.)=1.51312 s t(norm)=0.0721514, mflops=69.2987 (err=2.8e-16) 19. Mayer (simple): elapsed time t=1.27457 s, 131072 iters, t-(init.)=1.21341 s t(norm)=0.05786, mflops=86.4155 20. Mayer (lookup): elapsed time t=1.25156 s, 131072 iters, t-(init.)=1.19034 s t(norm)=0.0567599, mflops=88.0903 (err=2.6e-16) 21. NAPACK (f2c): elapsed time t=1.4675 s, 32768 iters, t-(init.)=1.45217 s t(norm)=0.27698, mflops=18.0519 (err=6.4e-16) 22. Nielsen: elapsed time t=1.90127 s, 65536 iters, t-(init.)=1.87062 s t(norm)=0.178397, mflops=28.0274 (err=1.1e-15) 23. NR (C): elapsed time t=1.70779 s, 131072 iters, t-(init.)=1.64664 s t(norm)=0.0785177, mflops=63.6799 (err=2.9e-16) 24. Ooura (C): elapsed time t=1.3167 s, 262144 iters, t-(init.)=1.19433 s t(norm)=0.0284751, mflops=175.592 (err=2.5e-16) 25. QFT: elapsed time t=1.32671 s, 131072 iters, t-(init.)=1.26513 s t(norm)=0.0603263, mflops=82.8826 (err=2.8e-16) 26. Ransom: elapsed time t=1.26242 s, 32768 iters, t-(init.)=1.24706 s t(norm)=0.237857, mflops=21.021 (err=7.4e-16) 27. Singleton (f2c): elapsed time t=1.11721 s, 131072 iters, t-(init.)=1.05601 s t(norm)=0.0503544, mflops=99.2962 (err=2.3e-16) 28. Temperton (f2c): elapsed time t=1.34574 s, 32768 iters, t-(init.)=1.33039 s t(norm)=0.253752, mflops=19.7043 (err=2.6e-16) 29. Valkenburg: elapsed time t=1.26495 s, 16384 iters, t-(init.)=1.25716 s t(norm)=0.479567, mflops=10.4261 (err=2.8e-16) Top mflops for N=32 = 316.59 Normalized results and averages for N=32: fft 0: mflops = 88.4438 (norm. = 0.279364), norm. avg. (of 5) = 0.407244 fft 1: mflops = 89.0531 (norm. = 0.281288), norm. avg. (of 5) = 0.402252 fft 2: mflops = 58.9008 (norm. = 0.186048), norm. avg. (of 5) = 0.21726 fft 3: mflops = 20.4947 (norm. = 0.0647356), norm. avg. (of 5) = 0.0449338 fft 4: mflops = 19.7272 (norm. = 0.0623114), norm. avg. 
(of 5) = 0.0704959 fft 5: mflops = 90.8112 (norm. = 0.286841), norm. avg. (of 5) = 0.174579 fft 6: mflops = 88.2332 (norm. = 0.278699), norm. avg. (of 5) = 0.192137 fft 7: mflops = 112.825 (norm. = 0.356376), norm. avg. (of 5) = 0.165258 fft 8: mflops = 42.3844 (norm. = 0.133878), norm. avg. (of 4) = 0.141633 fft 9: mflops = 38.5506 (norm. = 0.121768), norm. avg. (of 5) = 0.130905 fft 10: mflops = 313.227 (norm. = 0.989375), norm. avg. (of 5) = 0.802043 fft 11: mflops = 313.817 (norm. = 0.991241), norm. avg. (of 5) = 0.806507 fft 12: mflops = 316.59 (norm. = 1), norm. avg. (of 5) = 1 fft 13: mflops = 181.498 (norm. = 0.573289), norm. avg. (of 3) = 0.456875 fft 14: mflops = 69.2934 (norm. = 0.218874), norm. avg. (of 5) = 0.225224 fft 15: mflops = 60.8126 (norm. = 0.192086), norm. avg. (of 5) = 0.150574 fft 16: mflops = 61.5172 (norm. = 0.194312), norm. avg. (of 5) = 0.147978 fft 17: mflops = 136.989 (norm. = 0.432701), norm. avg. (of 5) = 0.477878 fft 18: mflops = 69.2987 (norm. = 0.218891), norm. avg. (of 4) = 0.217763 fft 19: mflops = 86.4155 (norm. = 0.272957), norm. avg. (of 4) = 0.244467 fft 20: mflops = 88.0903 (norm. = 0.278247), norm. avg. (of 4) = 0.238414 fft 21: mflops = 18.0519 (norm. = 0.0570196), norm. avg. (of 5) = 0.052411 fft 22: mflops = 28.0274 (norm. = 0.088529), norm. avg. (of 5) = 0.0676144 fft 23: mflops = 63.6799 (norm. = 0.201143), norm. avg. (of 5) = 0.160774 fft 24: mflops = 175.592 (norm. = 0.554634), norm. avg. (of 5) = 0.511152 fft 25: mflops = 82.8826 (norm. = 0.261798), norm. avg. (of 2) = 0.294613 fft 26: mflops = 21.021 (norm. = 0.0663983), norm. avg. (of 4) = 0.0491164 fft 27: mflops = 99.2962 (norm. = 0.313643), norm. avg. (of 5) = 0.195742 fft 28: mflops = 19.7043 (norm. = 0.0622392), norm. avg. (of 5) = 0.0634536 fft 29: mflops = 10.4261 (norm. = 0.0329324), norm. avg. (of 5) = 0.0641867 Benchmarking for array size = 64 (power of 2): 0. 
Arndt DIF: elapsed time t=1.43674 s, 65536 iters, t-(init.)=1.37792 s t(norm)=0.0547535, mflops=91.3184 (err=5.0e-16) 1. Arndt DIT: elapsed time t=1.43037 s, 65536 iters, t-(init.)=1.37185 s t(norm)=0.0545125, mflops=91.7221 (err=4.9e-16) 2. Arndt Split-Radix: elapsed time t=1.93223 s, 65536 iters, t-(init.)=1.87354 s t(norm)=0.0744477, mflops=67.1613 (err=4.5e-16) 3. Arndt 4-step: elapsed time t=1.04609 s, 16384 iters, t-(init.)=1.03134 s t(norm)=0.163927, mflops=30.5014 (err=4.9e-16) 4. Beauregard: elapsed time t=1.58405 s, 16384 iters, t-(init.)=1.56928 s t(norm)=0.24943, mflops=20.0457 (err=4.5e-16) 5. Bergland: elapsed time t=1.19753 s, 65536 iters, t-(init.)=1.13861 s t(norm)=0.0452445, mflops=110.511 (err=5.5e-16) 6. CWP (min N) (N=65): elapsed time t=1.30077 s, 65536 iters, t-(init.)=1.2412 s t(norm)=0.0493207, mflops=101.377 7. CWP (best N) (N=84): elapsed time t=1.02765 s, 65536 iters, t-(init.)=0.951331 s t(norm)=0.0378025, mflops=132.266 8. Edelblute: elapsed time t=1.34573 s, 32768 iters, t-(init.)=1.31632 s t(norm)=0.104612, mflops=47.7957 (err=4.6e-16) 9. FFTPACK (f2c): elapsed time t=1.40205 s, 32768 iters, t-(init.)=1.37249 s t(norm)=0.109076, mflops=45.8398 (err=4.4e-16) FFTW_MEASURE plan: (cost = 6.323730e-06) FFTW_NOTW 64 10. FFTW: elapsed time t=1.72051 s, 262144 iters, t-(init.)=1.48433 s t(norm)=0.0147455, mflops=339.087 (err=4.4e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.05864 s, 131072 iters, t-(init.)=0.941165 s t(norm)=0.0186993, mflops=267.39 (err=4.7e-16) 12. Frigo-old: elapsed time t=1.36627 s, 131072 iters, t-(init.)=1.24877 s t(norm)=0.0248107, mflops=201.526 (err=4.5e-16) 13. Green: elapsed time t=1.10753 s, 131072 iters, t-(init.)=0.989922 s t(norm)=0.019668, mflops=254.22 (err=4.6e-16) 14. GSL: elapsed time t=1.55822 s, 65536 iters, t-(init.)=1.49947 s t(norm)=0.0595834, mflops=83.916 (err=4.4e-16) 15. 
GSL DIT: elapsed time t=1.82851 s, 65536 iters, t-(init.)=1.76949 s t(norm)=0.0703133, mflops=71.1103 (err=4.6e-16) 16. GSL DIF: elapsed time t=1.76648 s, 65536 iters, t-(init.)=1.70759 s t(norm)=0.0678534, mflops=73.6883 (err=4.9e-16) 17. Krukar: elapsed time t=1.94928 s, 131072 iters, t-(init.)=1.83157 s t(norm)=0.03639, mflops=137.4 (err=5.2e-16) 18. Mayer (Buneman): elapsed time t=1.78163 s, 65536 iters, t-(init.)=1.72287 s t(norm)=0.0684608, mflops=73.0345 (err=4.8e-16) 19. Mayer (simple): elapsed time t=1.38792 s, 65536 iters, t-(init.)=1.32911 s t(norm)=0.0528142, mflops=94.6716 20. Mayer (lookup): elapsed time t=1.34295 s, 65536 iters, t-(init.)=1.28401 s t(norm)=0.051022, mflops=97.9969 (err=4.5e-16) 21. NAPACK (f2c): elapsed time t=1.51577 s, 16384 iters, t-(init.)=1.50103 s t(norm)=0.238582, mflops=20.9571 (err=1.1e-15) 22. Nielsen: elapsed time t=1.77493 s, 32768 iters, t-(init.)=1.74558 s t(norm)=0.138726, mflops=36.0423 (err=1.9e-15) 23. NR (C): elapsed time t=1.71 s, 65536 iters, t-(init.)=1.65124 s t(norm)=0.0656145, mflops=76.2027 (err=4.4e-16) 24. Ooura (C): elapsed time t=1.39473 s, 131072 iters, t-(init.)=1.27728 s t(norm)=0.0253772, mflops=197.027 (err=5.4e-16) 25. QFT: elapsed time t=1.64643 s, 65536 iters, t-(init.)=1.58745 s t(norm)=0.0630798, mflops=79.2647 (err=4.9e-16) 26. Ransom: elapsed time t=1.52112 s, 32768 iters, t-(init.)=1.49177 s t(norm)=0.118555, mflops=42.1745 (err=9.1e-16) 27. Singleton (f2c): elapsed time t=1.91101 s, 131072 iters, t-(init.)=1.79356 s t(norm)=0.0356348, mflops=140.312 (err=6.5e-16) 28. Temperton (f2c): elapsed time t=1.23028 s, 16384 iters, t-(init.)=1.21555 s t(norm)=0.193206, mflops=25.8791 (err=4.7e-16) 29. Valkenburg: elapsed time t=1.48329 s, 8192 iters, t-(init.)=1.4759 s t(norm)=0.469175, mflops=10.657 (err=6.0e-16) Top mflops for N=64 = 339.087 Normalized results and averages for N=64: fft 0: mflops = 91.3184 (norm. = 0.269307), norm. avg. (of 6) = 0.384254 fft 1: mflops = 91.7221 (norm. 
= 0.270497), norm. avg. (of 6) = 0.380293 fft 2: mflops = 67.1613 (norm. = 0.198065), norm. avg. (of 6) = 0.214061 fft 3: mflops = 30.5014 (norm. = 0.0899515), norm. avg. (of 6) = 0.0524367 fft 4: mflops = 20.0457 (norm. = 0.0591167), norm. avg. (of 6) = 0.0685994 fft 5: mflops = 110.511 (norm. = 0.325907), norm. avg. (of 6) = 0.199801 fft 6: mflops = 101.377 (norm. = 0.298972), norm. avg. (of 6) = 0.209943 fft 7: mflops = 132.266 (norm. = 0.390066), norm. avg. (of 6) = 0.202726 fft 8: mflops = 47.7957 (norm. = 0.140954), norm. avg. (of 5) = 0.141497 fft 9: mflops = 45.8398 (norm. = 0.135186), norm. avg. (of 6) = 0.131619 fft 10: mflops = 339.087 (norm. = 1), norm. avg. (of 6) = 0.835036 fft 11: mflops = 267.39 (norm. = 0.788559), norm. avg. (of 6) = 0.803516 fft 12: mflops = 201.526 (norm. = 0.594319), norm. avg. (of 6) = 0.932386 fft 13: mflops = 254.22 (norm. = 0.74972), norm. avg. (of 4) = 0.530086 fft 14: mflops = 83.916 (norm. = 0.247476), norm. avg. (of 6) = 0.228933 fft 15: mflops = 71.1103 (norm. = 0.209711), norm. avg. (of 6) = 0.16043 fft 16: mflops = 73.6883 (norm. = 0.217314), norm. avg. (of 6) = 0.159534 fft 17: mflops = 137.4 (norm. = 0.405207), norm. avg. (of 6) = 0.465766 fft 18: mflops = 73.0345 (norm. = 0.215386), norm. avg. (of 5) = 0.217287 fft 19: mflops = 94.6716 (norm. = 0.279196), norm. avg. (of 5) = 0.251413 fft 20: mflops = 97.9969 (norm. = 0.289002), norm. avg. (of 5) = 0.248532 fft 21: mflops = 20.9571 (norm. = 0.0618046), norm. avg. (of 6) = 0.0539766 fft 22: mflops = 36.0423 (norm. = 0.106292), norm. avg. (of 6) = 0.0740607 fft 23: mflops = 76.2027 (norm. = 0.224729), norm. avg. (of 6) = 0.171434 fft 24: mflops = 197.027 (norm. = 0.581052), norm. avg. (of 6) = 0.522802 fft 25: mflops = 79.2647 (norm. = 0.233759), norm. avg. (of 3) = 0.274329 fft 26: mflops = 42.1745 (norm. = 0.124377), norm. avg. (of 5) = 0.0641684 fft 27: mflops = 140.312 (norm. = 0.413794), norm. avg. (of 6) = 0.232084 fft 28: mflops = 25.8791 (norm. 
= 0.07632), norm. avg. (of 6) = 0.065598 fft 29: mflops = 10.657 (norm. = 0.0314286), norm. avg. (of 6) = 0.058727 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.50372 s, 32768 iters, t-(init.)=1.44616 s t(norm)=0.0492559, mflops=101.511 (err=4.0e-16) 1. Arndt DIT: elapsed time t=1.49507 s, 32768 iters, t-(init.)=1.43757 s t(norm)=0.0489632, mflops=102.117 (err=4.1e-16) 2. Arndt Split-Radix: elapsed time t=1.00978 s, 16384 iters, t-(init.)=0.980929 s t(norm)=0.0668205, mflops=74.8274 (err=4.4e-16) 3. Arndt 4-step: elapsed time t=1.20741 s, 8192 iters, t-(init.)=1.19298 s t(norm)=0.162531, mflops=30.7634 (err=4.0e-16) 4. Beauregard: elapsed time t=1.83429 s, 8192 iters, t-(init.)=1.81985 s t(norm)=0.247935, mflops=20.1666 (err=4.1e-16) 5. Bergland: elapsed time t=1.23304 s, 32768 iters, t-(init.)=1.17526 s t(norm)=0.0400291, mflops=124.909 (err=4.3e-16) 6. CWP (min N) (N=130): elapsed time t=1.28144 s, 32768 iters, t-(init.)=1.22296 s t(norm)=0.0416537, mflops=120.037 7. CWP (best N) (N=140): elapsed time t=1.70136 s, 65536 iters, t-(init.)=1.57576 s t(norm)=0.0268351, mflops=186.323 8. Edelblute: elapsed time t=1.40747 s, 16384 iters, t-(init.)=1.37862 s t(norm)=0.0939112, mflops=53.2418 (err=4.1e-16) 9. FFTPACK (f2c): elapsed time t=1.44997 s, 16384 iters, t-(init.)=1.421 s t(norm)=0.0967981, mflops=51.6539 (err=4.1e-16) FFTW_MEASURE plan: (cost = 1.642896e-05) FFTW_TWIDDLE 4 FFTW_NOTW 32 10. FFTW: elapsed time t=1.10871 s, 65536 iters, t-(init.)=0.993376 s t(norm)=0.0169171, mflops=295.559 (err=4.2e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.1082 s, 65536 iters, t-(init.)=0.993188 s t(norm)=0.0169139, mflops=295.615 (err=4.2e-16) 12. Frigo-old: elapsed time t=1.42683 s, 65536 iters, t-(init.)=1.31167 s t(norm)=0.0223376, mflops=223.838 (err=4.4e-16) 13. 
Green: elapsed time t=1.16559 s, 65536 iters, t-(init.)=1.05031 s t(norm)=0.0178867, mflops=279.537 (err=4.4e-16) 14. GSL: elapsed time t=1.75991 s, 32768 iters, t-(init.)=1.70237 s t(norm)=0.0579824, mflops=86.233 (err=4.2e-16) 15. GSL DIT: elapsed time t=1.92204 s, 32768 iters, t-(init.)=1.86427 s t(norm)=0.0634967, mflops=78.7442 (err=4.3e-16) 16. GSL DIF: elapsed time t=1.82266 s, 32768 iters, t-(init.)=1.76515 s t(norm)=0.0601207, mflops=83.1661 (err=4.6e-16) 17. Krukar: elapsed time t=1.12897 s, 16384 iters, t-(init.)=1.10014 s t(norm)=0.0749407, mflops=66.7194 (err=4.6e-16) 18. Mayer (Buneman): elapsed time t=1.89076 s, 32768 iters, t-(init.)=1.83326 s t(norm)=0.0624405, mflops=80.0763 (err=4.0e-16) 19. Mayer (simple): elapsed time t=1.43797 s, 32768 iters, t-(init.)=1.38037 s t(norm)=0.0470152, mflops=106.349 20. Mayer (lookup): elapsed time t=1.39085 s, 32768 iters, t-(init.)=1.33323 s t(norm)=0.0454097, mflops=110.109 (err=4.3e-16) 21. NAPACK (f2c): elapsed time t=1.72235 s, 8192 iters, t-(init.)=1.70791 s t(norm)=0.232684, mflops=21.4883 (err=1.2e-15) 22. Nielsen: elapsed time t=1.13342 s, 8192 iters, t-(init.)=1.11905 s t(norm)=0.152459, mflops=32.7958 (err=1.3e-15) 23. NR (C): elapsed time t=1.75564 s, 32768 iters, t-(init.)=1.69798 s t(norm)=0.057833, mflops=86.4559 (err=4.4e-16) 24. Ooura (C): elapsed time t=1.57461 s, 65536 iters, t-(init.)=1.45957 s t(norm)=0.0248563, mflops=201.156 (err=4.1e-16) 25. QFT: elapsed time t=1.9604 s, 32768 iters, t-(init.)=1.90268 s t(norm)=0.0648049, mflops=77.1547 (err=4.6e-16) 26. Ransom: elapsed time t=1.87502 s, 16384 iters, t-(init.)=1.84621 s t(norm)=0.125763, mflops=39.7573 (err=1.1e-15) 27. Singleton (f2c): elapsed time t=1.17436 s, 32768 iters, t-(init.)=1.1167 s t(norm)=0.0380345, mflops=131.46 (err=5.3e-16) 28. Temperton (f2c): elapsed time t=1.50002 s, 8192 iters, t-(init.)=1.48557 s t(norm)=0.202393, mflops=24.7044 (err=4.4e-16) 29. 
Valkenburg: elapsed time t=1.70363 s, 4096 iters, t-(init.)=1.69632 s t(norm)=0.462212, mflops=10.8175 (err=4.8e-16) Top mflops for N=128 = 295.615 Normalized results and averages for N=128: fft 0: mflops = 101.511 (norm. = 0.343388), norm. avg. (of 7) = 0.378416 fft 1: mflops = 102.117 (norm. = 0.345441), norm. avg. (of 7) = 0.375314 fft 2: mflops = 74.8274 (norm. = 0.253124), norm. avg. (of 7) = 0.219641 fft 3: mflops = 30.7634 (norm. = 0.104066), norm. avg. (of 7) = 0.0598123 fft 4: mflops = 20.1666 (norm. = 0.0682192), norm. avg. (of 7) = 0.068545 fft 5: mflops = 124.909 (norm. = 0.42254), norm. avg. (of 7) = 0.231621 fft 6: mflops = 120.037 (norm. = 0.406059), norm. avg. (of 7) = 0.23796 fft 7: mflops = 186.323 (norm. = 0.630291), norm. avg. (of 7) = 0.263807 fft 8: mflops = 53.2418 (norm. = 0.180105), norm. avg. (of 6) = 0.147932 fft 9: mflops = 51.6539 (norm. = 0.174734), norm. avg. (of 7) = 0.137778 fft 10: mflops = 295.559 (norm. = 0.999811), norm. avg. (of 7) = 0.858575 fft 11: mflops = 295.615 (norm. = 1), norm. avg. (of 7) = 0.831585 fft 12: mflops = 223.838 (norm. = 0.757193), norm. avg. (of 7) = 0.907359 fft 13: mflops = 279.537 (norm. = 0.945611), norm. avg. (of 5) = 0.613191 fft 14: mflops = 86.233 (norm. = 0.291707), norm. avg. (of 7) = 0.2379 fft 15: mflops = 78.7442 (norm. = 0.266374), norm. avg. (of 7) = 0.175565 fft 16: mflops = 83.1661 (norm. = 0.281332), norm. avg. (of 7) = 0.176934 fft 17: mflops = 66.7194 (norm. = 0.225697), norm. avg. (of 7) = 0.43147 fft 18: mflops = 80.0763 (norm. = 0.27088), norm. avg. (of 6) = 0.22622 fft 19: mflops = 106.349 (norm. = 0.359753), norm. avg. (of 6) = 0.26947 fft 20: mflops = 110.109 (norm. = 0.372473), norm. avg. (of 6) = 0.269189 fft 21: mflops = 21.4883 (norm. = 0.0726903), norm. avg. (of 7) = 0.05665 fft 22: mflops = 32.7958 (norm. = 0.110941), norm. avg. (of 7) = 0.0793293 fft 23: mflops = 86.4559 (norm. = 0.292461), norm. avg. (of 7) = 0.188723 fft 24: mflops = 201.156 (norm. = 0.680466), norm. avg. 
(of 7) = 0.545326 fft 25: mflops = 77.1547 (norm. = 0.260997), norm. avg. (of 4) = 0.270996 fft 26: mflops = 39.7573 (norm. = 0.13449), norm. avg. (of 6) = 0.0758887 fft 27: mflops = 131.46 (norm. = 0.444699), norm. avg. (of 7) = 0.262457 fft 28: mflops = 24.7044 (norm. = 0.0835694), norm. avg. (of 7) = 0.0681653 fft 29: mflops = 10.8175 (norm. = 0.0365934), norm. avg. (of 7) = 0.0555651 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.6135 s, 16384 iters, t-(init.)=1.55655 s t(norm)=0.0463888, mflops=107.785 (err=6.7e-16) 1. Arndt DIT: elapsed time t=1.61236 s, 16384 iters, t-(init.)=1.55539 s t(norm)=0.0463542, mflops=107.865 (err=7.1e-16) 2. Arndt Split-Radix: elapsed time t=1.05487 s, 8192 iters, t-(init.)=1.02635 s t(norm)=0.0611753, mflops=81.7323 (err=7.4e-16) 3. Arndt 4-step: elapsed time t=1.19913 s, 4096 iters, t-(init.)=1.18485 s t(norm)=0.141245, mflops=35.3994 (err=7.2e-16) 4. Beauregard: elapsed time t=1.04585 s, 2048 iters, t-(init.)=1.03869 s t(norm)=0.247643, mflops=20.1903 (err=7.8e-16) 5. Bergland: elapsed time t=1.2378 s, 16384 iters, t-(init.)=1.18085 s t(norm)=0.035192, mflops=142.078 (err=8.3e-16) 6. CWP (min N) (N=260): elapsed time t=1.23876 s, 16384 iters, t-(init.)=1.18088 s t(norm)=0.035193, mflops=142.074 7. CWP (best N) (N=280): elapsed time t=1.85993 s, 32768 iters, t-(init.)=1.73554 s t(norm)=0.0258615, mflops=193.337 8. Edelblute: elapsed time t=1.46305 s, 8192 iters, t-(init.)=1.43459 s t(norm)=0.0855082, mflops=58.4739 (err=7.0e-16) 9. FFTPACK (f2c): elapsed time t=1.52629 s, 8192 iters, t-(init.)=1.49781 s t(norm)=0.0892764, mflops=56.0059 (err=7.8e-16) FFTW_MEASURE plan: (cost = 3.489307e-05) FFTW_TWIDDLE 4 FFTW_NOTW 64 10. FFTW: elapsed time t=1.17405 s, 32768 iters, t-(init.)=1.06004 s t(norm)=0.0157958, mflops=316.54 (err=8.0e-16) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 11. 
FFTW_ESTIMATE: elapsed time t=1.189 s, 32768 iters, t-(init.)=1.07514 s t(norm)=0.0160208, mflops=312.094 (err=8.1e-16) 12. Frigo-old: elapsed time t=1.45349 s, 32768 iters, t-(init.)=1.33964 s t(norm)=0.0199622, mflops=250.473 (err=8.0e-16) 13. Green: elapsed time t=1.23435 s, 32768 iters, t-(init.)=1.12026 s t(norm)=0.0166932, mflops=299.523 (err=7.6e-16) 14. GSL: elapsed time t=1.79693 s, 16384 iters, t-(init.)=1.73996 s t(norm)=0.051855, mflops=96.4228 (err=7.8e-16) 15. GSL DIT: elapsed time t=1.03476 s, 8192 iters, t-(init.)=1.00623 s t(norm)=0.0599757, mflops=83.367 (err=7.7e-16) 16. GSL DIF: elapsed time t=1.93179 s, 16384 iters, t-(init.)=1.87488 s t(norm)=0.0558758, mflops=89.4842 (err=8.3e-16) 17. Krukar: elapsed time t=1.2712 s, 8192 iters, t-(init.)=1.24262 s t(norm)=0.0740662, mflops=67.5072 (err=7.7e-16) 18. Mayer (Buneman): elapsed time t=1.02083 s, 8192 iters, t-(init.)=0.992318 s t(norm)=0.0591468, mflops=84.5355 (err=7.0e-16) 19. Mayer (simple): elapsed time t=1.55006 s, 16384 iters, t-(init.)=1.49311 s t(norm)=0.0444981, mflops=112.364 20. Mayer (lookup): elapsed time t=1.49276 s, 16384 iters, t-(init.)=1.43586 s t(norm)=0.0427918, mflops=116.845 (err=7.1e-16) 21. NAPACK (f2c): elapsed time t=1.82549 s, 4096 iters, t-(init.)=1.8112 s t(norm)=0.215912, mflops=23.1576 (err=3.6e-15) 22. Nielsen: elapsed time t=1.13969 s, 4096 iters, t-(init.)=1.1254 s t(norm)=0.134158, mflops=37.2695 (err=3.4e-15) 23. NR (C): elapsed time t=1.85025 s, 16384 iters, t-(init.)=1.79329 s t(norm)=0.0534441, mflops=93.5557 (err=8.6e-16) 24. Ooura (C): elapsed time t=1.67183 s, 32768 iters, t-(init.)=1.55808 s t(norm)=0.0232171, mflops=215.358 (err=7.9e-16) 25. QFT: elapsed time t=1.1393 s, 8192 iters, t-(init.)=1.11069 s t(norm)=0.0662025, mflops=75.5259 (err=9.5e-16) 26. Ransom: elapsed time t=1.42785 s, 8192 iters, t-(init.)=1.39934 s t(norm)=0.0834072, mflops=59.9469 (err=1.7e-15) 27. 
Singleton (f2c): elapsed time t=1.00687 s, 16384 iters, t-(init.)=0.949973 s t(norm)=0.0283114, mflops=176.607 (err=1.3e-15) 28. Temperton (f2c): elapsed time t=1.5543 s, 4096 iters, t-(init.)=1.54 s t(norm)=0.183583, mflops=27.2357 (err=7.5e-16) 29. Valkenburg: elapsed time t=1.92368 s, 2048 iters, t-(init.)=1.91659 s t(norm)=0.456952, mflops=10.9421 (err=7.4e-16) Top mflops for N=256 = 316.54 Normalized results and averages for N=256: fft 0: mflops = 107.785 (norm. = 0.340509), norm. avg. (of 8) = 0.373678 fft 1: mflops = 107.865 (norm. = 0.340763), norm. avg. (of 8) = 0.370995 fft 2: mflops = 81.7323 (norm. = 0.258206), norm. avg. (of 8) = 0.224462 fft 3: mflops = 35.3994 (norm. = 0.111833), norm. avg. (of 8) = 0.0663148 fft 4: mflops = 20.1903 (norm. = 0.0637845), norm. avg. (of 8) = 0.06795 fft 5: mflops = 142.078 (norm. = 0.448846), norm. avg. (of 8) = 0.258774 fft 6: mflops = 142.074 (norm. = 0.448834), norm. avg. (of 8) = 0.264319 fft 7: mflops = 193.337 (norm. = 0.610784), norm. avg. (of 8) = 0.307179 fft 8: mflops = 58.4739 (norm. = 0.184729), norm. avg. (of 7) = 0.153189 fft 9: mflops = 56.0059 (norm. = 0.176932), norm. avg. (of 8) = 0.142672 fft 10: mflops = 316.54 (norm. = 1), norm. avg. (of 8) = 0.876253 fft 11: mflops = 312.094 (norm. = 0.985956), norm. avg. (of 8) = 0.850881 fft 12: mflops = 250.473 (norm. = 0.791285), norm. avg. (of 8) = 0.89285 fft 13: mflops = 299.523 (norm. = 0.946243), norm. avg. (of 6) = 0.6687 fft 14: mflops = 96.4228 (norm. = 0.304615), norm. avg. (of 8) = 0.24624 fft 15: mflops = 83.367 (norm. = 0.26337), norm. avg. (of 8) = 0.186541 fft 16: mflops = 89.4842 (norm. = 0.282695), norm. avg. (of 8) = 0.190154 fft 17: mflops = 67.5072 (norm. = 0.213266), norm. avg. (of 8) = 0.404195 fft 18: mflops = 84.5355 (norm. = 0.267061), norm. avg. (of 7) = 0.232054 fft 19: mflops = 112.364 (norm. = 0.354978), norm. avg. (of 7) = 0.281685 fft 20: mflops = 116.845 (norm. = 0.369131), norm. avg. 
(of 7) = 0.283466 fft 21: mflops = 23.1576 (norm. = 0.0731587), norm. avg. (of 8) = 0.0587136 fft 22: mflops = 37.2695 (norm. = 0.11774), norm. avg. (of 8) = 0.0841307 fft 23: mflops = 93.5557 (norm. = 0.295558), norm. avg. (of 8) = 0.202078 fft 24: mflops = 215.358 (norm. = 0.680351), norm. avg. (of 8) = 0.562204 fft 25: mflops = 75.5259 (norm. = 0.238599), norm. avg. (of 5) = 0.264516 fft 26: mflops = 59.9469 (norm. = 0.189382), norm. avg. (of 7) = 0.092102 fft 27: mflops = 176.607 (norm. = 0.557931), norm. avg. (of 8) = 0.299392 fft 28: mflops = 27.2357 (norm. = 0.0860418), norm. avg. (of 8) = 0.0703999 fft 29: mflops = 10.9421 (norm. = 0.0345678), norm. avg. (of 8) = 0.0529404 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.72244 s, 8192 iters, t-(init.)=1.66581 s t(norm)=0.0441289, mflops=113.305 (err=6.7e-16) 1. Arndt DIT: elapsed time t=1.721 s, 8192 iters, t-(init.)=1.6644 s t(norm)=0.0440916, mflops=113.4 (err=6.2e-16) 2. Arndt Split-Radix: elapsed time t=1.20109 s, 4096 iters, t-(init.)=1.17301 s t(norm)=0.0621485, mflops=80.4524 (err=6.5e-16) 3. Arndt 4-step: elapsed time t=1.271 s, 2048 iters, t-(init.)=1.25679 s t(norm)=0.133174, mflops=37.5447 (err=6.3e-16) 4. Beauregard: elapsed time t=1.17627 s, 1024 iters, t-(init.)=1.16923 s t(norm)=0.247792, mflops=20.1782 (err=6.8e-16) 5. Bergland: elapsed time t=1.27884 s, 8192 iters, t-(init.)=1.22223 s t(norm)=0.032378, mflops=154.426 (err=7.2e-16) 6. CWP (min N) (N=520): elapsed time t=1.34762 s, 8192 iters, t-(init.)=1.28997 s t(norm)=0.0341724, mflops=146.317 7. CWP (best N) (N=560): elapsed time t=1.09444 s, 8192 iters, t-(init.)=1.03255 s t(norm)=0.0273532, mflops=182.794 8. Edelblute: elapsed time t=1.62189 s, 4096 iters, t-(init.)=1.59353 s t(norm)=0.0844284, mflops=59.2218 (err=6.2e-16) 9. 
FFTPACK (f2c): elapsed time t=1.03099 s, 2048 iters, t-(init.)=1.01673 s t(norm)=0.107736, mflops=46.4096 (err=6.4e-16) FFTW_MEASURE plan: (cost = 7.607422e-05) FFTW_TWIDDLE 8 FFTW_NOTW 64 10. FFTW: elapsed time t=1.37996 s, 16384 iters, t-(init.)=1.26571 s t(norm)=0.0167649, mflops=298.242 (err=6.4e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.38236 s, 16384 iters, t-(init.)=1.2691 s t(norm)=0.0168099, mflops=297.444 (err=6.5e-16) 12. Frigo-old: elapsed time t=1.67704 s, 16384 iters, t-(init.)=1.56376 s t(norm)=0.0207128, mflops=241.397 (err=6.3e-16) 13. Green: elapsed time t=1.27913 s, 16384 iters, t-(init.)=1.16582 s t(norm)=0.0154419, mflops=323.795 (err=6.2e-16) 14. GSL: elapsed time t=1.13076 s, 4096 iters, t-(init.)=1.10239 s t(norm)=0.058407, mflops=85.6062 (err=6.4e-16) 15. GSL DIT: elapsed time t=1.15161 s, 4096 iters, t-(init.)=1.12324 s t(norm)=0.0595115, mflops=84.0174 (err=9.0e-16) 16. GSL DIF: elapsed time t=1.03805 s, 4096 iters, t-(init.)=1.00968 s t(norm)=0.0534947, mflops=93.4673 (err=7.8e-16) 17. Krukar: elapsed time t=1.35664 s, 4096 iters, t-(init.)=1.32826 s t(norm)=0.0703738, mflops=71.0492 (err=6.9e-16) 18. Mayer (Buneman): elapsed time t=1.06485 s, 4096 iters, t-(init.)=1.0365 s t(norm)=0.0549156, mflops=91.0488 (err=6.5e-16) 19. Mayer (simple): elapsed time t=1.60846 s, 8192 iters, t-(init.)=1.55172 s t(norm)=0.0411065, mflops=121.635 20. Mayer (lookup): elapsed time t=1.55583 s, 8192 iters, t-(init.)=1.49923 s t(norm)=0.039716, mflops=125.894 (err=6.5e-16) 21. NAPACK (f2c): elapsed time t=1.03378 s, 1024 iters, t-(init.)=1.02667 s t(norm)=0.21758, mflops=22.98 (err=6.7e-15) 22. Nielsen: elapsed time t=1.14808 s, 2048 iters, t-(init.)=1.13387 s t(norm)=0.12015, mflops=41.6148 (err=3.2e-15) 23. NR (C): elapsed time t=1.02025 s, 4096 iters, t-(init.)=0.991901 s t(norm)=0.0525528, mflops=95.1424 (err=7.1e-16) 24. 
Ooura (C): elapsed time t=1.87634 s, 16384 iters, t-(init.)=1.76311 s t(norm)=0.0233532, mflops=214.103 (err=6.9e-16) 25. QFT: elapsed time t=1.36034 s, 4096 iters, t-(init.)=1.33196 s t(norm)=0.07057, mflops=70.8516 (err=9.5e-16) 26. Ransom: elapsed time t=1.73068 s, 4096 iters, t-(init.)=1.70232 s t(norm)=0.0901919, mflops=55.4373 (err=1.5e-15) 27. Singleton (f2c): elapsed time t=1.11601 s, 8192 iters, t-(init.)=1.05935 s t(norm)=0.0280633, mflops=178.169 (err=8.4e-16) 28. Temperton (f2c): elapsed time t=1.94903 s, 2048 iters, t-(init.)=1.93475 s t(norm)=0.205014, mflops=24.3886 (err=6.4e-16) 29. Valkenburg: elapsed time t=1.07346 s, 512 iters, t-(init.)=1.06993 s t(norm)=0.453495, mflops=11.0255 (err=7.4e-16) Top mflops for N=512 = 323.795 Normalized results and averages for N=512: fft 0: mflops = 113.305 (norm. = 0.349927), norm. avg. (of 9) = 0.371039 fft 1: mflops = 113.4 (norm. = 0.350222), norm. avg. (of 9) = 0.368687 fft 2: mflops = 80.4524 (norm. = 0.248467), norm. avg. (of 9) = 0.227129 fft 3: mflops = 37.5447 (norm. = 0.115952), norm. avg. (of 9) = 0.0718301 fft 4: mflops = 20.1782 (norm. = 0.0623179), norm. avg. (of 9) = 0.0673242 fft 5: mflops = 154.426 (norm. = 0.476925), norm. avg. (of 9) = 0.283013 fft 6: mflops = 146.317 (norm. = 0.451881), norm. avg. (of 9) = 0.285159 fft 7: mflops = 182.794 (norm. = 0.564535), norm. avg. (of 9) = 0.335774 fft 8: mflops = 59.2218 (norm. = 0.182899), norm. avg. (of 8) = 0.156902 fft 9: mflops = 46.4096 (norm. = 0.14333), norm. avg. (of 9) = 0.142745 fft 10: mflops = 298.242 (norm. = 0.921084), norm. avg. (of 9) = 0.881234 fft 11: mflops = 297.444 (norm. = 0.91862), norm. avg. (of 9) = 0.858408 fft 12: mflops = 241.397 (norm. = 0.745524), norm. avg. (of 9) = 0.87648 fft 13: mflops = 323.795 (norm. = 1), norm. avg. (of 7) = 0.716028 fft 14: mflops = 85.6062 (norm. = 0.264384), norm. avg. (of 9) = 0.248256 fft 15: mflops = 84.0174 (norm. = 0.259477), norm. avg. (of 9) = 0.194645 fft 16: mflops = 93.4673 (norm. 
= 0.288662), norm. avg. (of 9) = 0.201099 fft 17: mflops = 71.0492 (norm. = 0.219426), norm. avg. (of 9) = 0.383665 fft 18: mflops = 91.0488 (norm. = 0.281193), norm. avg. (of 8) = 0.238196 fft 19: mflops = 121.635 (norm. = 0.375655), norm. avg. (of 8) = 0.293431 fft 20: mflops = 125.894 (norm. = 0.388807), norm. avg. (of 8) = 0.296634 fft 21: mflops = 22.98 (norm. = 0.0709709), norm. avg. (of 9) = 0.0600755 fft 22: mflops = 41.6148 (norm. = 0.128522), norm. avg. (of 9) = 0.0890631 fft 23: mflops = 95.1424 (norm. = 0.293835), norm. avg. (of 9) = 0.212273 fft 24: mflops = 214.103 (norm. = 0.66123), norm. avg. (of 9) = 0.573207 fft 25: mflops = 70.8516 (norm. = 0.218816), norm. avg. (of 6) = 0.2569 fft 26: mflops = 55.4373 (norm. = 0.171211), norm. avg. (of 8) = 0.101991 fft 27: mflops = 178.169 (norm. = 0.550252), norm. avg. (of 9) = 0.327265 fft 28: mflops = 24.3886 (norm. = 0.0753211), norm. avg. (of 9) = 0.0709467 fft 29: mflops = 11.0255 (norm. = 0.0340508), norm. avg. (of 9) = 0.0508416 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.81388 s, 4096 iters, t-(init.)=1.75713 s t(norm)=0.0418932, mflops=119.351 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.81877 s, 4096 iters, t-(init.)=1.76228 s t(norm)=0.0420159, mflops=119.002 (err=1.0e-15) 2. Arndt Split-Radix: elapsed time t=1.25649 s, 2048 iters, t-(init.)=1.22819 s t(norm)=0.0585647, mflops=85.3756 (err=1.0e-15) 3. Arndt 4-step: elapsed time t=1.20825 s, 1024 iters, t-(init.)=1.19406 s t(norm)=0.113875, mflops=43.9079 (err=1.0e-15) 4. Beauregard: elapsed time t=1.30692 s, 512 iters, t-(init.)=1.29989 s t(norm)=0.247935, mflops=20.1666 (err=1.1e-15) 5. Bergland: elapsed time t=1.38233 s, 4096 iters, t-(init.)=1.32575 s t(norm)=0.0316083, mflops=158.186 (err=1.1e-15) 6. CWP (min N) (N=1040): elapsed time t=1.50625 s, 4096 iters, t-(init.)=1.44881 s t(norm)=0.0345424, mflops=144.75 7. 
CWP (best N) (N=1040): elapsed time t=1.5064 s, 4096 iters, t-(init.)=1.44903 s t(norm)=0.0345477, mflops=144.727 8. Edelblute: elapsed time t=1.68079 s, 2048 iters, t-(init.)=1.6525 s t(norm)=0.0787971, mflops=63.4541 (err=1.0e-15) 9. FFTPACK (f2c): elapsed time t=1.84454 s, 1024 iters, t-(init.)=1.83031 s t(norm)=0.174552, mflops=28.6448 (err=1.1e-15) FFTW_MEASURE plan: (cost = 3.941562e-04) FFTW_TWIDDLE 16 FFTW_NOTW 64 10. FFTW: elapsed time t=1.58004 s, 4096 iters, t-(init.)=1.52338 s t(norm)=0.0363203, mflops=137.664 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.67087 s, 2048 iters, t-(init.)=1.64235 s t(norm)=0.0783133, mflops=63.8461 (err=1.1e-15) 12. Frigo-old: elapsed time t=1.68721 s, 2048 iters, t-(init.)=1.65887 s t(norm)=0.0791013, mflops=63.2101 (err=1.1e-15) 13. Green: elapsed time t=1.37525 s, 8192 iters, t-(init.)=1.26215 s t(norm)=0.015046, mflops=332.314 (err=1.1e-15) 14. GSL: elapsed time t=1.20729 s, 1024 iters, t-(init.)=1.19302 s t(norm)=0.113775, mflops=43.9463 (err=1.1e-15) 15. GSL DIT: elapsed time t=1.28824 s, 2048 iters, t-(init.)=1.25999 s t(norm)=0.0600808, mflops=83.2212 (err=1.3e-15) 16. GSL DIF: elapsed time t=1.12542 s, 2048 iters, t-(init.)=1.09712 s t(norm)=0.0523148, mflops=95.5752 (err=1.4e-15) 17. Krukar: elapsed time t=1.88544 s, 2048 iters, t-(init.)=1.85709 s t(norm)=0.088553, mflops=56.4634 (err=1.1e-15) 18. Mayer (Buneman): elapsed time t=1.12541 s, 2048 iters, t-(init.)=1.09717 s t(norm)=0.0523173, mflops=95.5707 (err=1.0e-15) 19. Mayer (simple): elapsed time t=1.72122 s, 4096 iters, t-(init.)=1.66463 s t(norm)=0.0396879, mflops=125.983 20. Mayer (lookup): elapsed time t=1.66387 s, 4096 iters, t-(init.)=1.60732 s t(norm)=0.0383214, mflops=130.475 (err=1.0e-15) 21. NAPACK (f2c): elapsed time t=1.2815 s, 512 iters, t-(init.)=1.27442 s t(norm)=0.243076, mflops=20.5697 (err=1.6e-14) 22. 
Nielsen: elapsed time t=1.5613 s, 1024 iters, t-(init.)=1.54694 s t(norm)=0.147528, mflops=33.892 (err=7.2e-15) 23. NR (C): elapsed time t=1.13561 s, 2048 iters, t-(init.)=1.10722 s t(norm)=0.0527962, mflops=94.7038 (err=1.2e-15) 24. Ooura (C): elapsed time t=1.98986 s, 8192 iters, t-(init.)=1.8767 s t(norm)=0.022372, mflops=223.494 (err=1.1e-15) 25. QFT: elapsed time t=1.99953 s, 1024 iters, t-(init.)=1.9849 s t(norm)=0.189295, mflops=26.4138 (err=1.4e-15) 26. Ransom: elapsed time t=1.44898 s, 2048 iters, t-(init.)=1.42074 s t(norm)=0.0677462, mflops=73.8049 (err=2.1e-15) 27. Singleton (f2c): elapsed time t=1.13887 s, 4096 iters, t-(init.)=1.08232 s t(norm)=0.0258045, mflops=193.765 (err=1.6e-15) 28. Temperton (f2c): elapsed time t=1.91362 s, 1024 iters, t-(init.)=1.8993 s t(norm)=0.181131, mflops=27.6043 (err=1.1e-15) 29. Valkenburg: elapsed time t=1.41174 s, 256 iters, t-(init.)=1.40804 s t(norm)=0.537126, mflops=9.30881 (err=1.1e-15) Top mflops for N=1024 = 332.314 Normalized results and averages for N=1024: fft 0: mflops = 119.351 (norm. = 0.359151), norm. avg. (of 10) = 0.36985 fft 1: mflops = 119.002 (norm. = 0.358102), norm. avg. (of 10) = 0.367629 fft 2: mflops = 85.3756 (norm. = 0.256912), norm. avg. (of 10) = 0.230107 fft 3: mflops = 43.9079 (norm. = 0.132128), norm. avg. (of 10) = 0.0778598 fft 4: mflops = 20.1666 (norm. = 0.0606853), norm. avg. (of 10) = 0.0666603 fft 5: mflops = 158.186 (norm. = 0.476014), norm. avg. (of 10) = 0.302313 fft 6: mflops = 144.75 (norm. = 0.435581), norm. avg. (of 10) = 0.300201 fft 7: mflops = 144.727 (norm. = 0.435514), norm. avg. (of 10) = 0.345748 fft 8: mflops = 63.4541 (norm. = 0.190946), norm. avg. (of 9) = 0.160685 fft 9: mflops = 28.6448 (norm. = 0.086198), norm. avg. (of 10) = 0.13709 fft 10: mflops = 137.664 (norm. = 0.414259), norm. avg. (of 10) = 0.834537 fft 11: mflops = 63.8461 (norm. = 0.192126), norm. avg. (of 10) = 0.79178 fft 12: mflops = 63.2101 (norm. = 0.190212), norm. avg. 
(of 10) = 0.807853 fft 13: mflops = 332.314 (norm. = 1), norm. avg. (of 8) = 0.751525 fft 14: mflops = 43.9463 (norm. = 0.132243), norm. avg. (of 10) = 0.236655 fft 15: mflops = 83.2212 (norm. = 0.250429), norm. avg. (of 10) = 0.200223 fft 16: mflops = 95.5752 (norm. = 0.287605), norm. avg. (of 10) = 0.20975 fft 17: mflops = 56.4634 (norm. = 0.16991), norm. avg. (of 10) = 0.36229 fft 18: mflops = 95.5707 (norm. = 0.287591), norm. avg. (of 9) = 0.243685 fft 19: mflops = 125.983 (norm. = 0.379108), norm. avg. (of 9) = 0.302951 fft 20: mflops = 130.475 (norm. = 0.392626), norm. avg. (of 9) = 0.3073 fft 21: mflops = 20.5697 (norm. = 0.0618984), norm. avg. (of 10) = 0.0602578 fft 22: mflops = 33.892 (norm. = 0.101988), norm. avg. (of 10) = 0.0903555 fft 23: mflops = 94.7038 (norm. = 0.284983), norm. avg. (of 10) = 0.219544 fft 24: mflops = 223.494 (norm. = 0.672538), norm. avg. (of 10) = 0.58314 fft 25: mflops = 26.4138 (norm. = 0.0794843), norm. avg. (of 7) = 0.231555 fft 26: mflops = 73.8049 (norm. = 0.222094), norm. avg. (of 9) = 0.115335 fft 27: mflops = 193.765 (norm. = 0.583077), norm. avg. (of 10) = 0.352846 fft 28: mflops = 27.6043 (norm. = 0.083067), norm. avg. (of 10) = 0.0721587 fft 29: mflops = 9.30881 (norm. = 0.0280121), norm. avg. (of 10) = 0.0485586 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.02925 s, 1024 iters, t-(init.)=0.999762 s t(norm)=0.0433385, mflops=115.371 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.04807 s, 1024 iters, t-(init.)=1.01839 s t(norm)=0.0441458, mflops=113.261 (err=1.4e-15) 2. Arndt Split-Radix: elapsed time t=1.47947 s, 1024 iters, t-(init.)=1.44997 s t(norm)=0.0628546, mflops=79.5486 (err=1.4e-15) 3. Arndt 4-step: elapsed time t=1.46733 s, 512 iters, t-(init.)=1.45243 s t(norm)=0.125922, mflops=39.707 (err=1.4e-15) 4. Beauregard: elapsed time t=1.47439 s, 256 iters, t-(init.)=1.46708 s t(norm)=0.254385, mflops=19.6552 (err=1.5e-15) 5. 
Bergland: elapsed time t=1.57125 s, 2048 iters, t-(init.)=1.5122 s t(norm)=0.0327761, mflops=152.55 (err=1.5e-15) 6. CWP (min N) (N=2145): elapsed time t=1.11342 s, 1024 iters, t-(init.)=1.03722 s t(norm)=0.0449621, mflops=111.205 7. CWP (best N) (N=2184): elapsed time t=1.00235 s, 1024 iters, t-(init.)=0.907533 s t(norm)=0.0393405, mflops=127.095 8. Edelblute: elapsed time t=1.91421 s, 1024 iters, t-(init.)=1.88447 s t(norm)=0.0816897, mflops=61.2072 (err=1.4e-15) 9. FFTPACK (f2c): elapsed time t=1.80096 s, 256 iters, t-(init.)=1.79306 s t(norm)=0.310908, mflops=16.082 (err=1.5e-15) FFTW_MEASURE plan: (cost = 1.102719e-03) FFTW_TWIDDLE 32 FFTW_NOTW 64 10. FFTW: elapsed time t=1.26754 s, 512 iters, t-(init.)=1.252 s t(norm)=0.108546, mflops=46.0635 (err=1.5e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.31042 s, 512 iters, t-(init.)=1.29505 s t(norm)=0.112278, mflops=44.5323 (err=1.5e-15) 12. Frigo-old: elapsed time t=1.06391 s, 256 iters, t-(init.)=1.0564 s t(norm)=0.183175, mflops=27.2962 (err=1.5e-15) 13. Green: elapsed time t=1.60019 s, 2048 iters, t-(init.)=1.54112 s t(norm)=0.0334028, mflops=149.688 (err=1.5e-15) 14. GSL: elapsed time t=1.4443 s, 256 iters, t-(init.)=1.43697 s t(norm)=0.249164, mflops=20.0671 (err=1.5e-15) 15. GSL DIT: elapsed time t=1.52884 s, 1024 iters, t-(init.)=1.49927 s t(norm)=0.0649914, mflops=76.9333 (err=2.1e-15) 16. GSL DIF: elapsed time t=1.313 s, 1024 iters, t-(init.)=1.28339 s t(norm)=0.0556333, mflops=89.8742 (err=2.2e-15) 17. Krukar: elapsed time t=1.20607 s, 512 iters, t-(init.)=1.19117 s t(norm)=0.103271, mflops=48.4162 (err=1.5e-15) 18. Mayer (Buneman): elapsed time t=1.22419 s, 1024 iters, t-(init.)=1.19463 s t(norm)=0.0517857, mflops=96.5518 (err=1.4e-15) 19. Mayer (simple): elapsed time t=1.88829 s, 2048 iters, t-(init.)=1.82943 s t(norm)=0.0396519, mflops=126.097 20. 
Mayer (lookup): elapsed time t=1.22843 s, 1024 iters, t-(init.)=1.19879 s t(norm)=0.0519661, mflops=96.2165 (err=1.4e-15) 21. NAPACK (f2c): elapsed time t=1.31025 s, 64 iters, t-(init.)=1.30831 s t(norm)=0.907417, mflops=5.51015 (err=1.5e-14) 22. Nielsen: elapsed time t=1.26517 s, 256 iters, t-(init.)=1.25688 s t(norm)=0.217937, mflops=22.9424 (err=1.2e-14) 23. NR (C): elapsed time t=1.3551 s, 1024 iters, t-(init.)=1.32549 s t(norm)=0.0574587, mflops=87.0191 (err=1.6e-15) 24. Ooura (C): elapsed time t=1.07054 s, 1024 iters, t-(init.)=1.04109 s t(norm)=0.04513, mflops=110.791 (err=1.4e-15) 25. QFT: elapsed time t=1.21146 s, 256 iters, t-(init.)=1.20332 s t(norm)=0.20865, mflops=23.9636 (err=1.9e-15) 26. Ransom: elapsed time t=1.99852 s, 1024 iters, t-(init.)=1.96908 s t(norm)=0.0853573, mflops=58.5773 (err=2.6e-15) 27. Singleton (f2c): elapsed time t=1.52698 s, 2048 iters, t-(init.)=1.46757 s t(norm)=0.0318088, mflops=157.189 (err=2.0e-15) 28. Temperton (f2c): elapsed time t=1.41872 s, 256 iters, t-(init.)=1.41119 s t(norm)=0.244694, mflops=20.4337 (err=1.5e-15) 29. Valkenburg: elapsed time t=1.12177 s, 64 iters, t-(init.)=1.11984 s t(norm)=0.776699, mflops=6.4375 (err=1.5e-15) Top mflops for N=2048 = 157.189 Normalized results and averages for N=2048: fft 0: mflops = 115.371 (norm. = 0.733962), norm. avg. (of 11) = 0.402951 fft 1: mflops = 113.261 (norm. = 0.720539), norm. avg. (of 11) = 0.399711 fft 2: mflops = 79.5486 (norm. = 0.506069), norm. avg. (of 11) = 0.255195 fft 3: mflops = 39.707 (norm. = 0.252607), norm. avg. (of 11) = 0.0937459 fft 4: mflops = 19.6552 (norm. = 0.125042), norm. avg. (of 11) = 0.0719677 fft 5: mflops = 152.55 (norm. = 0.970489), norm. avg. (of 11) = 0.363056 fft 6: mflops = 111.205 (norm. = 0.707458), norm. avg. (of 11) = 0.337225 fft 7: mflops = 127.095 (norm. = 0.808551), norm. avg. (of 11) = 0.387821 fft 8: mflops = 61.2072 (norm. = 0.389386), norm. avg. (of 10) = 0.183555 fft 9: mflops = 16.082 (norm. = 0.10231), norm. avg. 
(of 11) = 0.133929 fft 10: mflops = 46.0635 (norm. = 0.293045), norm. avg. (of 11) = 0.78531 fft 11: mflops = 44.5323 (norm. = 0.283304), norm. avg. (of 11) = 0.745554 fft 12: mflops = 27.2962 (norm. = 0.173652), norm. avg. (of 11) = 0.750199 fft 13: mflops = 149.688 (norm. = 0.95228), norm. avg. (of 9) = 0.773831 fft 14: mflops = 20.0671 (norm. = 0.127662), norm. avg. (of 11) = 0.226746 fft 15: mflops = 76.9333 (norm. = 0.489431), norm. avg. (of 11) = 0.226515 fft 16: mflops = 89.8742 (norm. = 0.571758), norm. avg. (of 11) = 0.24266 fft 17: mflops = 48.4162 (norm. = 0.308012), norm. avg. (of 11) = 0.357355 fft 18: mflops = 96.5518 (norm. = 0.614239), norm. avg. (of 10) = 0.28074 fft 19: mflops = 126.097 (norm. = 0.802201), norm. avg. (of 10) = 0.352876 fft 20: mflops = 96.2165 (norm. = 0.612106), norm. avg. (of 10) = 0.33778 fft 21: mflops = 5.51015 (norm. = 0.0350543), norm. avg. (of 11) = 0.0579666 fft 22: mflops = 22.9424 (norm. = 0.145954), norm. avg. (of 11) = 0.0954099 fft 23: mflops = 87.0191 (norm. = 0.553595), norm. avg. (of 11) = 0.249912 fft 24: mflops = 110.791 (norm. = 0.704826), norm. avg. (of 11) = 0.594202 fft 25: mflops = 23.9636 (norm. = 0.152451), norm. avg. (of 8) = 0.221667 fft 26: mflops = 58.5773 (norm. = 0.372655), norm. avg. (of 10) = 0.141067 fft 27: mflops = 157.189 (norm. = 1), norm. avg. (of 11) = 0.411678 fft 28: mflops = 20.4337 (norm. = 0.129994), norm. avg. (of 11) = 0.0774165 fft 29: mflops = 6.4375 (norm. = 0.0409539), norm. avg. (of 11) = 0.0478673 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.4025 s, 128 iters, t-(init.)=1.34797 s t(norm)=0.214254, mflops=23.3368 (err=2.5e-15) 1. Arndt DIT: elapsed time t=1.47937 s, 128 iters, t-(init.)=1.425 s t(norm)=0.226497, mflops=22.0753 (err=2.5e-15) 2. Arndt Split-Radix: elapsed time t=1.75518 s, 128 iters, t-(init.)=1.70071 s t(norm)=0.27032, mflops=18.4966 (err=2.5e-15) 3. 
Arndt 4-step: elapsed time t=1.11371 s, 128 iters, t-(init.)=1.05894 s t(norm)=0.168314, mflops=29.7064 (err=2.5e-15) 4. Beauregard: elapsed time t=1.95169 s, 128 iters, t-(init.)=1.89708 s t(norm)=0.301532, mflops=16.582 (err=2.6e-15) 5. Bergland: elapsed time t=1.27127 s, 256 iters, t-(init.)=1.16246 s t(norm)=0.0923843, mflops=54.1218 (err=2.5e-15) 6. CWP (min N) (N=4290): elapsed time t=1.88636 s, 512 iters, t-(init.)=1.65841 s t(norm)=0.0658995, mflops=75.8731 7. CWP (best N) (N=4368): elapsed time t=1.61221 s, 512 iters, t-(init.)=1.37963 s t(norm)=0.0548217, mflops=91.2048 8. Edelblute: elapsed time t=1.84757 s, 128 iters, t-(init.)=1.79319 s t(norm)=0.28502, mflops=17.5426 (err=2.5e-15) 9. FFTPACK (f2c): elapsed time t=1.86335 s, 128 iters, t-(init.)=1.80849 s t(norm)=0.287451, mflops=17.3943 (err=2.6e-15) FFTW_MEASURE plan: (cost = 2.580312e-03) FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_NOTW 64 10. FFTW: elapsed time t=1.46039 s, 256 iters, t-(init.)=1.35068 s t(norm)=0.107343, mflops=46.5798 (err=2.6e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.56493 s, 256 iters, t-(init.)=1.4539 s t(norm)=0.115546, mflops=43.2728 (err=2.6e-15) 12. Frigo-old: elapsed time t=1.17443 s, 128 iters, t-(init.)=1.12007 s t(norm)=0.178031, mflops=28.0851 (err=2.6e-15) 13. Green: elapsed time t=1.23771 s, 256 iters, t-(init.)=1.12872 s t(norm)=0.0897027, mflops=55.7397 (err=2.6e-15) 14. GSL: elapsed time t=1.41281 s, 128 iters, t-(init.)=1.35818 s t(norm)=0.215877, mflops=23.1613 (err=2.6e-15) 15. GSL DIT: elapsed time t=1.41978 s, 128 iters, t-(init.)=1.3654 s t(norm)=0.217025, mflops=23.0388 (err=3.0e-15) 16. GSL DIF: elapsed time t=1.40269 s, 128 iters, t-(init.)=1.34825 s t(norm)=0.214299, mflops=23.3319 (err=3.1e-15) 17. Krukar: elapsed time t=1.53794 s, 256 iters, t-(init.)=1.42868 s t(norm)=0.113541, mflops=44.0369 (err=2.6e-15) 18. 
Mayer (Buneman): elapsed time t=1.66364 s, 512 iters, t-(init.)=1.44609 s t(norm)=0.0574625, mflops=87.0133 (err=2.5e-15) 19. Mayer (simple): elapsed time t=1.35106 s, 512 iters, t-(init.)=1.13348 s t(norm)=0.0450405, mflops=111.011 20. Mayer (lookup): elapsed time t=1.19464 s, 256 iters, t-(init.)=1.08568 s t(norm)=0.0862822, mflops=57.9494 (err=2.5e-15) 21. NAPACK (f2c): elapsed time t=1.3758 s, 32 iters, t-(init.)=1.36125 s t(norm)=0.865458, mflops=5.77729 (err=4.7e-14) 22. Nielsen: elapsed time t=1.81079 s, 128 iters, t-(init.)=1.7553 s t(norm)=0.278997, mflops=17.9213 (err=2.2e-14) 23. NR (C): elapsed time t=1.39275 s, 128 iters, t-(init.)=1.33838 s t(norm)=0.212729, mflops=23.5041 (err=2.6e-15) 24. Ooura (C): elapsed time t=1.15017 s, 256 iters, t-(init.)=1.04135 s t(norm)=0.0827591, mflops=60.4163 (err=2.5e-15) 25. QFT: elapsed time t=1.4689 s, 128 iters, t-(init.)=1.41334 s t(norm)=0.224644, mflops=22.2574 (err=3.1e-15) 26. Ransom: elapsed time t=1.23243 s, 256 iters, t-(init.)=1.12347 s t(norm)=0.0892855, mflops=56.0002 (err=3.1e-15) 27. Singleton (f2c): elapsed time t=1.45956 s, 256 iters, t-(init.)=1.35064 s t(norm)=0.107339, mflops=46.5813 (err=3.8e-15) 28. Temperton (f2c): elapsed time t=1.67442 s, 128 iters, t-(init.)=1.61901 s t(norm)=0.257335, mflops=19.4299 (err=2.6e-15) 29. Valkenburg: elapsed time t=1.35481 s, 32 iters, t-(init.)=1.34118 s t(norm)=0.852699, mflops=5.86374 (err=2.5e-15) Top mflops for N=4096 = 111.011 Normalized results and averages for N=4096: fft 0: mflops = 23.3368 (norm. = 0.21022), norm. avg. (of 12) = 0.38689 fft 1: mflops = 22.0753 (norm. = 0.198857), norm. avg. (of 12) = 0.382973 fft 2: mflops = 18.4966 (norm. = 0.166619), norm. avg. (of 12) = 0.247813 fft 3: mflops = 29.7064 (norm. = 0.267599), norm. avg. (of 12) = 0.108234 fft 4: mflops = 16.582 (norm. = 0.149372), norm. avg. (of 12) = 0.0784181 fft 5: mflops = 54.1218 (norm. = 0.487535), norm. avg. (of 12) = 0.373429 fft 6: mflops = 75.8731 (norm. = 0.683473), norm. 
avg. (of 12) = 0.366079 fft 7: mflops = 91.2048 (norm. = 0.821583), norm. avg. (of 12) = 0.423968 fft 8: mflops = 17.5426 (norm. = 0.158026), norm. avg. (of 11) = 0.181234 fft 9: mflops = 17.3943 (norm. = 0.156689), norm. avg. (of 12) = 0.135825 fft 10: mflops = 46.5798 (norm. = 0.419596), norm. avg. (of 12) = 0.754834 fft 11: mflops = 43.2728 (norm. = 0.389806), norm. avg. (of 12) = 0.715909 fft 12: mflops = 28.0851 (norm. = 0.252993), norm. avg. (of 12) = 0.708765 fft 13: mflops = 55.7397 (norm. = 0.502109), norm. avg. (of 10) = 0.746659 fft 14: mflops = 23.1613 (norm. = 0.20864), norm. avg. (of 12) = 0.225237 fft 15: mflops = 23.0388 (norm. = 0.207536), norm. avg. (of 12) = 0.224933 fft 16: mflops = 23.3319 (norm. = 0.210176), norm. avg. (of 12) = 0.239953 fft 17: mflops = 44.0369 (norm. = 0.396689), norm. avg. (of 12) = 0.360633 fft 18: mflops = 87.0133 (norm. = 0.783825), norm. avg. (of 11) = 0.326475 fft 19: mflops = 111.011 (norm. = 1), norm. avg. (of 11) = 0.411706 fft 20: mflops = 57.9494 (norm. = 0.522014), norm. avg. (of 11) = 0.354529 fft 21: mflops = 5.77729 (norm. = 0.0520424), norm. avg. (of 12) = 0.0574729 fft 22: mflops = 17.9213 (norm. = 0.161437), norm. avg. (of 12) = 0.100912 fft 23: mflops = 23.5041 (norm. = 0.211727), norm. avg. (of 12) = 0.24673 fft 24: mflops = 60.4163 (norm. = 0.544237), norm. avg. (of 12) = 0.590038 fft 25: mflops = 22.2574 (norm. = 0.200497), norm. avg. (of 9) = 0.219314 fft 26: mflops = 56.0002 (norm. = 0.504455), norm. avg. (of 11) = 0.174103 fft 27: mflops = 46.5813 (norm. = 0.419609), norm. avg. (of 12) = 0.412339 fft 28: mflops = 19.4299 (norm. = 0.175027), norm. avg. (of 12) = 0.0855507 fft 29: mflops = 5.86374 (norm. = 0.0528212), norm. avg. (of 12) = 0.0482801 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.43205 s, 64 iters, t-(init.)=1.37763 s t(norm)=0.202125, mflops=24.7372 (err=3.0e-15) 1. 
Arndt DIT: elapsed time t=1.51334 s, 64 iters, t-(init.)=1.45892 s t(norm)=0.214052, mflops=23.3588 (err=3.0e-15) 2. Arndt Split-Radix: elapsed time t=1.01262 s, 32 iters, t-(init.)=0.985336 s t(norm)=0.289135, mflops=17.2929 (err=3.0e-15) 3. Arndt 4-step: elapsed time t=1.43531 s, 64 iters, t-(init.)=1.38072 s t(norm)=0.202578, mflops=24.6818 (err=2.9e-15) 4. Beauregard: elapsed time t=1.07315 s, 32 iters, t-(init.)=1.04587 s t(norm)=0.306898, mflops=16.2921 (err=2.9e-15) 5. Bergland: elapsed time t=1.52303 s, 128 iters, t-(init.)=1.41415 s t(norm)=0.103742, mflops=48.1967 (err=2.9e-15) 6. CWP (min N) (N=8580): elapsed time t=1.01581 s, 128 iters, t-(init.)=0.901737 s t(norm)=0.066151, mflops=75.5846 7. CWP (best N) (N=9240): elapsed time t=1.90425 s, 256 iters, t-(init.)=1.65835 s t(norm)=0.060828, mflops=82.1991 8. Edelblute: elapsed time t=1.05825 s, 32 iters, t-(init.)=1.03105 s t(norm)=0.30255, mflops=16.5262 (err=3.0e-15) 9. FFTPACK (f2c): elapsed time t=1.19293 s, 32 iters, t-(init.)=1.16568 s t(norm)=0.342054, mflops=14.6176 (err=2.9e-15) FFTW_MEASURE plan: (cost = 6.123000e-03) FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_TWIDDLE 4 FFTW_NOTW 32 10. FFTW: elapsed time t=1.58917 s, 128 iters, t-(init.)=1.47795 s t(norm)=0.108422, mflops=46.1163 (err=2.9e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.60454 s, 128 iters, t-(init.)=1.49357 s t(norm)=0.109568, mflops=45.6338 (err=2.9e-15) 12. Frigo-old: elapsed time t=1.20667 s, 64 iters, t-(init.)=1.15223 s t(norm)=0.169054, mflops=29.5763 (err=2.9e-15) 13. Green: elapsed time t=1.36559 s, 128 iters, t-(init.)=1.25651 s t(norm)=0.092177, mflops=54.2435 (err=2.9e-15) 14. GSL: elapsed time t=1.9588 s, 64 iters, t-(init.)=1.90441 s t(norm)=0.279414, mflops=17.8946 (err=2.9e-15) 15. GSL DIT: elapsed time t=1.55581 s, 64 iters, t-(init.)=1.5013 s t(norm)=0.22027, mflops=22.6994 (err=3.6e-15) 16. 
GSL DIF: elapsed time t=1.53986 s, 64 iters, t-(init.)=1.48528 s t(norm)=0.21792, mflops=22.9442 (err=3.6e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.35735 s, 64 iters, t-(init.)=1.30293 s t(norm)=0.191165, mflops=26.1554 (err=2.9e-15) 19. Mayer (simple): elapsed time t=1.29114 s, 64 iters, t-(init.)=1.2365 s t(norm)=0.181418, mflops=27.5606 20. Mayer (lookup): elapsed time t=1.52486 s, 64 iters, t-(init.)=1.47044 s t(norm)=0.215742, mflops=23.1758 (err=3.0e-15) 21. NAPACK (f2c): elapsed time t=1.53778 s, 16 iters, t-(init.)=1.52414 s t(norm)=0.894484, mflops=5.58982 (err=4.3e-14) 22. Nielsen: elapsed time t=1.05281 s, 32 iters, t-(init.)=1.02335 s t(norm)=0.30029, mflops=16.6506 (err=1.1e-14) 23. NR (C): elapsed time t=1.5286 s, 64 iters, t-(init.)=1.47419 s t(norm)=0.216292, mflops=23.1169 (err=3.0e-15) 24. Ooura (C): elapsed time t=1.2842 s, 128 iters, t-(init.)=1.1754 s t(norm)=0.0862266, mflops=57.9867 (err=2.9e-15) 25. QFT: elapsed time t=1.79564 s, 64 iters, t-(init.)=1.7388 s t(norm)=0.255116, mflops=19.599 (err=4.0e-15) 26. Ransom: elapsed time t=1.5987 s, 128 iters, t-(init.)=1.48964 s t(norm)=0.109279, mflops=45.7544 (err=4.1e-15) 27. Singleton (f2c): elapsed time t=1.72409 s, 128 iters, t-(init.)=1.61509 s t(norm)=0.118482, mflops=42.2004 (err=4.4e-15) 28. Temperton (f2c): elapsed time t=1.94306 s, 64 iters, t-(init.)=1.8884 s t(norm)=0.277065, mflops=18.0463 (err=2.9e-15) 29. Valkenburg: elapsed time t=1.56203 s, 16 iters, t-(init.)=1.54838 s t(norm)=0.908711, mflops=5.5023 (err=2.9e-15) Top mflops for N=8192 = 82.1991 Normalized results and averages for N=8192: fft 0: mflops = 24.7372 (norm. = 0.300942), norm. avg. (of 13) = 0.380279 fft 1: mflops = 23.3588 (norm. = 0.284174), norm. avg. (of 13) = 0.375373 fft 2: mflops = 17.2929 (norm. = 0.210379), norm. avg. (of 13) = 0.244934 fft 3: mflops = 24.6818 (norm. = 0.300269), norm. avg. (of 13) = 0.123006 fft 4: mflops = 16.2921 (norm. = 0.198203), norm. 
avg. (of 13) = 0.0876323 fft 5: mflops = 48.1967 (norm. = 0.586341), norm. avg. (of 13) = 0.389807 fft 6: mflops = 75.5846 (norm. = 0.919531), norm. avg. (of 13) = 0.408652 fft 7: mflops = 82.1991 (norm. = 1), norm. avg. (of 13) = 0.468278 fft 8: mflops = 16.5262 (norm. = 0.201051), norm. avg. (of 12) = 0.182886 fft 9: mflops = 14.6176 (norm. = 0.177831), norm. avg. (of 13) = 0.139057 fft 10: mflops = 46.1163 (norm. = 0.561032), norm. avg. (of 13) = 0.739926 fft 11: mflops = 45.6338 (norm. = 0.555162), norm. avg. (of 13) = 0.703544 fft 12: mflops = 29.5763 (norm. = 0.359813), norm. avg. (of 13) = 0.681922 fft 13: mflops = 54.2435 (norm. = 0.659904), norm. avg. (of 11) = 0.738772 fft 14: mflops = 17.8946 (norm. = 0.217698), norm. avg. (of 13) = 0.224657 fft 15: mflops = 22.6994 (norm. = 0.276152), norm. avg. (of 13) = 0.228873 fft 16: mflops = 22.9442 (norm. = 0.27913), norm. avg. (of 13) = 0.242966 fft 17: mflops = -1 (norm. = -0.0121656), norm. avg. (of 12) = 0.360633 fft 18: mflops = 26.1554 (norm. = 0.318196), norm. avg. (of 12) = 0.325785 fft 19: mflops = 27.5606 (norm. = 0.335291), norm. avg. (of 12) = 0.405338 fft 20: mflops = 23.1758 (norm. = 0.281947), norm. avg. (of 12) = 0.34848 fft 21: mflops = 5.58982 (norm. = 0.0680034), norm. avg. (of 13) = 0.0582829 fft 22: mflops = 16.6506 (norm. = 0.202564), norm. avg. (of 13) = 0.108732 fft 23: mflops = 23.1169 (norm. = 0.281231), norm. avg. (of 13) = 0.249384 fft 24: mflops = 57.9867 (norm. = 0.705443), norm. avg. (of 13) = 0.598916 fft 25: mflops = 19.599 (norm. = 0.238433), norm. avg. (of 10) = 0.221226 fft 26: mflops = 45.7544 (norm. = 0.556629), norm. avg. (of 12) = 0.20598 fft 27: mflops = 42.2004 (norm. = 0.513393), norm. avg. (of 13) = 0.420113 fft 28: mflops = 18.0463 (norm. = 0.219544), norm. avg. (of 13) = 0.0958579 fft 29: mflops = 5.5023 (norm. = 0.0669387), norm. avg. (of 13) = 0.0497154 Benchmarking for array size = 16384 (power of 2): 0. 
Arndt DIF: elapsed time t=1.71594 s, 32 iters, t-(init.)=1.66133 s t(norm)=0.226339, mflops=22.0908 (err=5.6e-15) 1. Arndt DIT: elapsed time t=1.8098 s, 32 iters, t-(init.)=1.75516 s t(norm)=0.239121, mflops=20.9099 (err=5.6e-15) 2. Arndt Split-Radix: elapsed time t=1.10269 s, 16 iters, t-(init.)=1.07542 s t(norm)=0.293028, mflops=17.0632 (err=5.6e-15) 3. Arndt 4-step: elapsed time t=1.22788 s, 32 iters, t-(init.)=1.17315 s t(norm)=0.159828, mflops=31.2835 (err=5.6e-15) 4. Beauregard: elapsed time t=1.14728 s, 16 iters, t-(init.)=1.11993 s t(norm)=0.305156, mflops=16.3851 (err=5.7e-15) 5. Bergland: elapsed time t=1.53779 s, 64 iters, t-(init.)=1.42836 s t(norm)=0.0972994, mflops=51.3878 (err=5.7e-15) 6. CWP (min N) (N=17160): elapsed time t=1.03176 s, 64 iters, t-(init.)=0.916307 s t(norm)=0.0624185, mflops=80.1045 7. CWP (best N) (N=17160): elapsed time t=1.03204 s, 64 iters, t-(init.)=0.916197 s t(norm)=0.062411, mflops=80.1141 8. Edelblute: elapsed time t=1.1465 s, 16 iters, t-(init.)=1.11921 s t(norm)=0.304961, mflops=16.3956 (err=5.6e-15) 9. FFTPACK (f2c): elapsed time t=1.23949 s, 16 iters, t-(init.)=1.21209 s t(norm)=0.330269, mflops=15.1392 (err=5.7e-15) FFTW_MEASURE plan: (cost = 1.444650e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_NOTW 64 10. FFTW: elapsed time t=1.59436 s, 64 iters, t-(init.)=1.48074 s t(norm)=0.100868, mflops=49.5699 (err=5.7e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.68943 s, 64 iters, t-(init.)=1.57592 s t(norm)=0.107351, mflops=46.5761 (err=5.7e-15) 12. Frigo-old: elapsed time t=1.39068 s, 32 iters, t-(init.)=1.33596 s t(norm)=0.18201, mflops=27.4711 (err=5.7e-15) 13. Green: elapsed time t=1.50579 s, 64 iters, t-(init.)=1.3964 s t(norm)=0.0951219, mflops=52.5641 (err=5.7e-15) 14. GSL: elapsed time t=1.9372 s, 32 iters, t-(init.)=1.88257 s t(norm)=0.25648, mflops=19.4947 (err=5.7e-15) 15. 
GSL DIT: elapsed time t=1.67489 s, 32 iters, t-(init.)=1.61989 s t(norm)=0.220692, mflops=22.656 (err=6.3e-15) 16. GSL DIF: elapsed time t=1.65954 s, 32 iters, t-(init.)=1.60485 s t(norm)=0.218644, mflops=22.8683 (err=6.4e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.5097 s, 32 iters, t-(init.)=1.4551 s t(norm)=0.198241, mflops=25.2218 (err=5.6e-15) 19. Mayer (simple): elapsed time t=1.44429 s, 32 iters, t-(init.)=1.38958 s t(norm)=0.189316, mflops=26.4109 20. Mayer (lookup): elapsed time t=1.66539 s, 32 iters, t-(init.)=1.61077 s t(norm)=0.21945, mflops=22.7843 (err=5.6e-15) 21. NAPACK (f2c): elapsed time t=1.60408 s, 8 iters, t-(init.)=1.58607 s t(norm)=0.864337, mflops=5.78478 (err=2.3e-13) 22. Nielsen: elapsed time t=1.06478 s, 16 iters, t-(init.)=1.03297 s t(norm)=0.281463, mflops=17.7644 (err=1.3e-13) 23. NR (C): elapsed time t=1.65391 s, 32 iters, t-(init.)=1.59932 s t(norm)=0.21789, mflops=22.9473 (err=5.6e-15) 24. Ooura (C): elapsed time t=1.30565 s, 64 iters, t-(init.)=1.1963 s t(norm)=0.0814917, mflops=61.356 (err=5.7e-15) 25. QFT: elapsed time t=1.07604 s, 16 iters, t-(init.)=1.0442 s t(norm)=0.284521, mflops=17.5734 (err=7.0e-15) 26. Ransom: elapsed time t=1.28025 s, 64 iters, t-(init.)=1.17088 s t(norm)=0.0797601, mflops=62.688 (err=6.0e-15) 27. Singleton (f2c): elapsed time t=1.72045 s, 64 iters, t-(init.)=1.6113 s t(norm)=0.109761, mflops=45.5534 (err=8.5e-15) 28. Temperton (f2c): elapsed time t=1.9456 s, 32 iters, t-(init.)=1.8909 s t(norm)=0.257614, mflops=19.4089 (err=5.7e-15) 29. Valkenburg: elapsed time t=1.75809 s, 8 iters, t-(init.)=1.74435 s t(norm)=0.950593, mflops=5.25987 (err=5.7e-15) Top mflops for N=16384 = 80.1141 Normalized results and averages for N=16384: fft 0: mflops = 22.0908 (norm. = 0.275742), norm. avg. (of 14) = 0.372812 fft 1: mflops = 20.9099 (norm. = 0.261001), norm. avg. (of 14) = 0.367204 fft 2: mflops = 17.0632 (norm. = 0.212986), norm. avg. 
(of 14) = 0.242652 fft 3: mflops = 31.2835 (norm. = 0.390487), norm. avg. (of 14) = 0.142111 fft 4: mflops = 16.3851 (norm. = 0.204521), norm. avg. (of 14) = 0.0959815 fft 5: mflops = 51.3878 (norm. = 0.641432), norm. avg. (of 14) = 0.40778 fft 6: mflops = 80.1045 (norm. = 0.99988), norm. avg. (of 14) = 0.450883 fft 7: mflops = 80.1141 (norm. = 1), norm. avg. (of 14) = 0.506258 fft 8: mflops = 16.3956 (norm. = 0.204652), norm. avg. (of 13) = 0.18456 fft 9: mflops = 15.1392 (norm. = 0.18897), norm. avg. (of 14) = 0.142622 fft 10: mflops = 49.5699 (norm. = 0.618741), norm. avg. (of 14) = 0.73127 fft 11: mflops = 46.5761 (norm. = 0.581372), norm. avg. (of 14) = 0.694817 fft 12: mflops = 27.4711 (norm. = 0.342899), norm. avg. (of 14) = 0.657706 fft 13: mflops = 52.5641 (norm. = 0.656115), norm. avg. (of 12) = 0.731884 fft 14: mflops = 19.4947 (norm. = 0.243337), norm. avg. (of 14) = 0.225992 fft 15: mflops = 22.656 (norm. = 0.282796), norm. avg. (of 14) = 0.232725 fft 16: mflops = 22.8683 (norm. = 0.285446), norm. avg. (of 14) = 0.246001 fft 17: mflops = -1 (norm. = -0.0124822), norm. avg. (of 12) = 0.360633 fft 18: mflops = 25.2218 (norm. = 0.314824), norm. avg. (of 13) = 0.324942 fft 19: mflops = 26.4109 (norm. = 0.329666), norm. avg. (of 13) = 0.399517 fft 20: mflops = 22.7843 (norm. = 0.284398), norm. avg. (of 13) = 0.343551 fft 21: mflops = 5.78478 (norm. = 0.0722068), norm. avg. (of 14) = 0.0592775 fft 22: mflops = 17.7644 (norm. = 0.221738), norm. avg. (of 14) = 0.116803 fft 23: mflops = 22.9473 (norm. = 0.286433), norm. avg. (of 14) = 0.25203 fft 24: mflops = 61.356 (norm. = 0.765857), norm. avg. (of 14) = 0.61084 fft 25: mflops = 17.5734 (norm. = 0.219354), norm. avg. (of 11) = 0.221056 fft 26: mflops = 62.688 (norm. = 0.782483), norm. avg. (of 13) = 0.250326 fft 27: mflops = 45.5534 (norm. = 0.568606), norm. avg. (of 14) = 0.430719 fft 28: mflops = 19.4089 (norm. = 0.242265), norm. avg. (of 14) = 0.106316 fft 29: mflops = 5.25987 (norm. = 0.0656548), norm. 
avg. (of 14) = 0.0508539 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.75185 s, 16 iters, t-(init.)=1.69569 s t(norm)=0.215619, mflops=23.1891 (err=5.2e-15) 1. Arndt DIT: elapsed time t=1.84383 s, 16 iters, t-(init.)=1.78753 s t(norm)=0.227297, mflops=21.9977 (err=5.2e-15) 2. Arndt Split-Radix: elapsed time t=1.20888 s, 8 iters, t-(init.)=1.18076 s t(norm)=0.300283, mflops=16.6509 (err=5.2e-15) 3. Arndt 4-step: elapsed time t=1.61307 s, 16 iters, t-(init.)=1.55686 s t(norm)=0.197965, mflops=25.257 (err=5.2e-15) 4. Beauregard: elapsed time t=1.25575 s, 8 iters, t-(init.)=1.22759 s t(norm)=0.312192, mflops=16.0158 (err=5.2e-15) 5. Bergland: elapsed time t=1.69749 s, 32 iters, t-(init.)=1.58516 s t(norm)=0.100782, mflops=49.6123 (err=5.2e-15) 6. CWP (min N) (N=34320): elapsed time t=1.12743 s, 32 iters, t-(init.)=1.00883 s t(norm)=0.0641399, mflops=77.9546 7. CWP (best N) (N=34320): elapsed time t=1.12818 s, 32 iters, t-(init.)=1.00969 s t(norm)=0.0641945, mflops=77.8883 8. Edelblute: elapsed time t=1.25332 s, 8 iters, t-(init.)=1.2249 s t(norm)=0.311509, mflops=16.0509 (err=5.2e-15) 9. FFTPACK (f2c): elapsed time t=1.30249 s, 8 iters, t-(init.)=1.26965 s t(norm)=0.322888, mflops=15.4852 (err=5.2e-15) FFTW_MEASURE plan: (cost = 4.601000e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 10. FFTW: elapsed time t=1.84978 s, 32 iters, t-(init.)=1.72886 s t(norm)=0.109918, mflops=45.4885 (err=5.2e-15) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.85618 s, 32 iters, t-(init.)=1.73512 s t(norm)=0.110316, mflops=45.3243 (err=5.2e-15) 12. Frigo-old: elapsed time t=1.88256 s, 16 iters, t-(init.)=1.82608 s t(norm)=0.232199, mflops=21.5333 (err=5.2e-15) 13. Green: elapsed time t=1.65698 s, 32 iters, t-(init.)=1.54351 s t(norm)=0.0981335, mflops=50.951 (err=5.2e-15) 14. 
GSL: elapsed time t=1.93521 s, 16 iters, t-(init.)=1.87838 s t(norm)=0.238848, mflops=20.9338 (err=5.2e-15) 15. GSL DIT: elapsed time t=1.83273 s, 16 iters, t-(init.)=1.77634 s t(norm)=0.225874, mflops=22.1363 (err=5.9e-15) 16. GSL DIF: elapsed time t=1.81543 s, 16 iters, t-(init.)=1.75924 s t(norm)=0.223698, mflops=22.3515 (err=6.0e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.64154 s, 16 iters, t-(init.)=1.58504 s t(norm)=0.201549, mflops=24.8079 (err=5.2e-15) 19. Mayer (simple): elapsed time t=1.57466 s, 16 iters, t-(init.)=1.51854 s t(norm)=0.193093, mflops=25.8943 20. Mayer (lookup): elapsed time t=1.79739 s, 16 iters, t-(init.)=1.74097 s t(norm)=0.221376, mflops=22.5861 (err=5.2e-15) 21. NAPACK (f2c): elapsed time t=1.78966 s, 4 iters, t-(init.)=1.77545 s t(norm)=0.903041, mflops=5.53685 (err=5.6e-13) 22. Nielsen: elapsed time t=1.13141 s, 8 iters, t-(init.)=1.09427 s t(norm)=0.278288, mflops=17.967 (err=2.3e-13) 23. NR (C): elapsed time t=1.8202 s, 16 iters, t-(init.)=1.76382 s t(norm)=0.224281, mflops=22.2934 (err=5.3e-15) 24. Ooura (C): elapsed time t=1.46097 s, 32 iters, t-(init.)=1.34831 s t(norm)=0.0857232, mflops=58.3272 (err=5.2e-15) 25. QFT: elapsed time t=1.26485 s, 8 iters, t-(init.)=1.22751 s t(norm)=0.312172, mflops=16.0168 (err=7.5e-15) 26. Ransom: elapsed time t=1.69882 s, 32 iters, t-(init.)=1.58642 s t(norm)=0.100862, mflops=49.5727 (err=6.4e-15) 27. Singleton (f2c): elapsed time t=1.1179 s, 16 iters, t-(init.)=1.06135 s t(norm)=0.134957, mflops=37.0487 (err=7.2e-15) 28. Temperton (f2c): elapsed time t=1.11822 s, 8 iters, t-(init.)=1.08984 s t(norm)=0.277161, mflops=18.04 (err=5.2e-15) 29. Valkenburg: elapsed time t=1.00477 s, 2 iters, t-(init.)=0.997496 s t(norm)=1.01471, mflops=4.92754 (err=5.2e-15) Top mflops for N=32768 = 77.9546 Normalized results and averages for N=32768: fft 0: mflops = 23.1891 (norm. = 0.297469), norm. avg. (of 15) = 0.367789 fft 1: mflops = 21.9977 (norm. 
= 0.282186), norm. avg. (of 15) = 0.361536 fft 2: mflops = 16.6509 (norm. = 0.213598), norm. avg. (of 15) = 0.240715 fft 3: mflops = 25.257 (norm. = 0.323996), norm. avg. (of 15) = 0.154237 fft 4: mflops = 16.0158 (norm. = 0.20545), norm. avg. (of 15) = 0.103279 fft 5: mflops = 49.6123 (norm. = 0.636425), norm. avg. (of 15) = 0.423023 fft 6: mflops = 77.9546 (norm. = 1), norm. avg. (of 15) = 0.48749 fft 7: mflops = 77.8883 (norm. = 0.999149), norm. avg. (of 15) = 0.539118 fft 8: mflops = 16.0509 (norm. = 0.205901), norm. avg. (of 14) = 0.186084 fft 9: mflops = 15.4852 (norm. = 0.198644), norm. avg. (of 15) = 0.146357 fft 10: mflops = 45.4885 (norm. = 0.583525), norm. avg. (of 15) = 0.721421 fft 11: mflops = 45.3243 (norm. = 0.581419), norm. avg. (of 15) = 0.687257 fft 12: mflops = 21.5333 (norm. = 0.276229), norm. avg. (of 15) = 0.632275 fft 13: mflops = 50.951 (norm. = 0.653598), norm. avg. (of 13) = 0.725862 fft 14: mflops = 20.9338 (norm. = 0.268538), norm. avg. (of 15) = 0.228828 fft 15: mflops = 22.1363 (norm. = 0.283963), norm. avg. (of 15) = 0.236141 fft 16: mflops = 22.3515 (norm. = 0.286725), norm. avg. (of 15) = 0.248716 fft 17: mflops = -1 (norm. = -0.012828), norm. avg. (of 12) = 0.360633 fft 18: mflops = 24.8079 (norm. = 0.318235), norm. avg. (of 14) = 0.324463 fft 19: mflops = 25.8943 (norm. = 0.332172), norm. avg. (of 14) = 0.394706 fft 20: mflops = 22.5861 (norm. = 0.289733), norm. avg. (of 14) = 0.339707 fft 21: mflops = 5.53685 (norm. = 0.0710266), norm. avg. (of 15) = 0.0600608 fft 22: mflops = 17.967 (norm. = 0.23048), norm. avg. (of 15) = 0.124382 fft 23: mflops = 22.2934 (norm. = 0.285979), norm. avg. (of 15) = 0.254294 fft 24: mflops = 58.3272 (norm. = 0.74822), norm. avg. (of 15) = 0.619999 fft 25: mflops = 16.0168 (norm. = 0.205463), norm. avg. (of 12) = 0.219757 fft 26: mflops = 49.5727 (norm. = 0.635918), norm. avg. (of 14) = 0.277869 fft 27: mflops = 37.0487 (norm. = 0.47526), norm. avg. (of 15) = 0.433689 fft 28: mflops = 18.04 (norm. 
= 0.231417), norm. avg. (of 15) = 0.114656 fft 29: mflops = 4.92754 (norm. = 0.0632103), norm. avg. (of 15) = 0.0516777 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.05945 s, 4 iters, t-(init.)=1.02939 s t(norm)=0.245426, mflops=20.3727 (err=1.6e-14) 1. Arndt DIT: elapsed time t=1.11067 s, 4 iters, t-(init.)=1.08041 s t(norm)=0.25759, mflops=19.4107 (err=1.6e-14) 2. Arndt Split-Radix: elapsed time t=1.33555 s, 4 iters, t-(init.)=1.30536 s t(norm)=0.311222, mflops=16.0657 (err=1.6e-14) 3. Arndt 4-step: elapsed time t=1.33456 s, 8 iters, t-(init.)=1.27475 s t(norm)=0.151962, mflops=32.9029 (err=1.6e-14) 4. Beauregard: elapsed time t=1.36967 s, 4 iters, t-(init.)=1.33954 s t(norm)=0.319371, mflops=15.6558 (err=1.6e-14) 5. Bergland: elapsed time t=1.00002 s, 8 iters, t-(init.)=0.939594 s t(norm)=0.112008, mflops=44.6395 (err=1.6e-14) 6. CWP (min N) (N=72072): elapsed time t=1.52951 s, 16 iters, t-(init.)=1.34213 s t(norm)=0.0799974, mflops=62.5021 7. CWP (best N) (N=72072): elapsed time t=1.53133 s, 16 iters, t-(init.)=1.34424 s t(norm)=0.0801229, mflops=62.4041 8. Edelblute: elapsed time t=1.38343 s, 4 iters, t-(init.)=1.35311 s t(norm)=0.322607, mflops=15.4987 (err=1.6e-14) 9. FFTPACK (f2c): elapsed time t=1.65218 s, 4 iters, t-(init.)=1.61356 s t(norm)=0.384702, mflops=12.9971 (err=1.6e-14) FFTW_MEASURE plan: (cost = 1.265920e-01) FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_NOTW 64 10. FFTW: elapsed time t=1.00729 s, 8 iters, t-(init.)=0.930516 s t(norm)=0.110926, mflops=45.075 (err=1.6e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.22164 s, 8 iters, t-(init.)=1.14472 s t(norm)=0.136461, mflops=36.6406 (err=1.6e-14) 12. Frigo-old: elapsed time t=1.132 s, 4 iters, t-(init.)=1.09273 s t(norm)=0.260528, mflops=19.1918 (err=1.6e-14) 13. Green: elapsed time t=1.82808 s, 16 iters, t-(init.)=1.70618 s t(norm)=0.101696, mflops=49.1659 (err=1.6e-14) 14. 
GSL: elapsed time t=1.06502 s, 4 iters, t-(init.)=1.03489 s t(norm)=0.246737, mflops=20.2645 (err=1.6e-14) 15. GSL DIT: elapsed time t=1.02444 s, 4 iters, t-(init.)=0.994242 s t(norm)=0.237046, mflops=21.093 (err=1.7e-14) 16. GSL DIF: elapsed time t=1.01266 s, 4 iters, t-(init.)=0.982425 s t(norm)=0.234228, mflops=21.3467 (err=1.8e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.79155 s, 8 iters, t-(init.)=1.73168 s t(norm)=0.206433, mflops=24.221 (err=1.6e-14) 19. Mayer (simple): elapsed time t=1.72319 s, 8 iters, t-(init.)=1.6631 s t(norm)=0.198257, mflops=25.2197 20. Mayer (lookup): elapsed time t=1.97711 s, 8 iters, t-(init.)=1.9165 s t(norm)=0.228465, mflops=21.8852 (err=1.6e-14) 21. NAPACK (f2c): elapsed time t=1.87911 s, 2 iters, t-(init.)=1.84683 s t(norm)=0.880638, mflops=5.6777 (err=8.7e-13) 22. Nielsen: elapsed time t=1.3601 s, 4 iters, t-(init.)=1.31236 s t(norm)=0.312891, mflops=15.98 (err=2.6e-13) 23. NR (C): elapsed time t=1.99119 s, 8 iters, t-(init.)=1.93093 s t(norm)=0.230185, mflops=21.7216 (err=1.6e-14) 24. Ooura (C): elapsed time t=1.63539 s, 16 iters, t-(init.)=1.51597 s t(norm)=0.0903587, mflops=55.335 (err=1.6e-14) 25. QFT: elapsed time t=1.89827 s, 4 iters, t-(init.)=1.85 s t(norm)=0.441076, mflops=11.3359 (err=1.9e-14) 26. Ransom: elapsed time t=1.05809 s, 8 iters, t-(init.)=0.997577 s t(norm)=0.11892, mflops=42.0449 (err=1.7e-14) 27. Singleton (f2c): elapsed time t=1.09053 s, 8 iters, t-(init.)=1.03056 s t(norm)=0.122853, mflops=40.6991 (err=2.4e-14) 28. Temperton (f2c): elapsed time t=1.18851 s, 4 iters, t-(init.)=1.15825 s t(norm)=0.276147, mflops=18.1063 (err=1.6e-14) 29. Valkenburg: elapsed time t=1.17496 s, 1 iters, t-(init.)=1.16674 s t(norm)=1.11269, mflops=4.49361 (err=1.6e-14) Top mflops for N=65536 = 62.5021 Normalized results and averages for N=65536: fft 0: mflops = 20.3727 (norm. = 0.325953), norm. avg. (of 16) = 0.365174 fft 1: mflops = 19.4107 (norm. = 0.31056), norm. avg. 
(of 16) = 0.35835 fft 2: mflops = 16.0657 (norm. = 0.257043), norm. avg. (of 16) = 0.241735 fft 3: mflops = 32.9029 (norm. = 0.526429), norm. avg. (of 16) = 0.177499 fft 4: mflops = 15.6558 (norm. = 0.250484), norm. avg. (of 16) = 0.11248 fft 5: mflops = 44.6395 (norm. = 0.714209), norm. avg. (of 16) = 0.441222 fft 6: mflops = 62.5021 (norm. = 1), norm. avg. (of 16) = 0.519522 fft 7: mflops = 62.4041 (norm. = 0.998433), norm. avg. (of 16) = 0.567825 fft 8: mflops = 15.4987 (norm. = 0.247971), norm. avg. (of 15) = 0.19021 fft 9: mflops = 12.9971 (norm. = 0.207946), norm. avg. (of 16) = 0.150206 fft 10: mflops = 45.075 (norm. = 0.721177), norm. avg. (of 16) = 0.721405 fft 11: mflops = 36.6406 (norm. = 0.58623), norm. avg. (of 16) = 0.680943 fft 12: mflops = 19.1918 (norm. = 0.307059), norm. avg. (of 16) = 0.611949 fft 13: mflops = 49.1659 (norm. = 0.786629), norm. avg. (of 14) = 0.730202 fft 14: mflops = 20.2645 (norm. = 0.324221), norm. avg. (of 16) = 0.23479 fft 15: mflops = 21.093 (norm. = 0.337476), norm. avg. (of 16) = 0.242474 fft 16: mflops = 21.3467 (norm. = 0.341536), norm. avg. (of 16) = 0.254517 fft 17: mflops = -1 (norm. = -0.0159995), norm. avg. (of 12) = 0.360633 fft 18: mflops = 24.221 (norm. = 0.387523), norm. avg. (of 15) = 0.328667 fft 19: mflops = 25.2197 (norm. = 0.403503), norm. avg. (of 15) = 0.395293 fft 20: mflops = 21.8852 (norm. = 0.350151), norm. avg. (of 15) = 0.340403 fft 21: mflops = 5.6777 (norm. = 0.0908403), norm. avg. (of 16) = 0.0619845 fft 22: mflops = 15.98 (norm. = 0.255672), norm. avg. (of 16) = 0.132588 fft 23: mflops = 21.7216 (norm. = 0.347534), norm. avg. (of 16) = 0.260121 fft 24: mflops = 55.335 (norm. = 0.885331), norm. avg. (of 16) = 0.636582 fft 25: mflops = 11.3359 (norm. = 0.181369), norm. avg. (of 13) = 0.216804 fft 26: mflops = 42.0449 (norm. = 0.672696), norm. avg. (of 15) = 0.30419 fft 27: mflops = 40.6991 (norm. = 0.651164), norm. avg. (of 16) = 0.447281 fft 28: mflops = 18.1063 (norm. = 0.289691), norm. avg. 
(of 16) = 0.125595 fft 29: mflops = 4.49361 (norm. = 0.0718953), norm. avg. (of 16) = 0.0529413 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.73656 s, 1 iters, t-(init.)=1.6867 s t(norm)=0.75697, mflops=6.60528 (err=3.9e-14) 1. Arndt DIT: elapsed time t=1.84428 s, 1 iters, t-(init.)=1.79435 s t(norm)=0.805281, mflops=6.20901 (err=3.9e-14) 2. Arndt Split-Radix: elapsed time t=2.47305 s, 1 iters, t-(init.)=2.42302 s t(norm)=1.08742, mflops=4.59802 (err=3.9e-14) 3. Arndt 4-step: elapsed time t=1.61271 s, 2 iters, t-(init.)=1.51293 s t(norm)=0.339493, mflops=14.7278 (err=3.9e-14) 4. Beauregard: elapsed time t=1.12877 s, 1 iters, t-(init.)=1.07884 s t(norm)=0.484169, mflops=10.327 (err=3.8e-14) 5. Bergland: elapsed time t=1.40623 s, 2 iters, t-(init.)=1.30656 s t(norm)=0.293184, mflops=17.0541 (err=3.9e-14) 6. CWP (min N) (N=144144): elapsed time t=1.34672 s, 4 iters, t-(init.)=1.12987 s t(norm)=0.126768, mflops=39.4421 7. CWP (best N) (N=144144): elapsed time t=1.34657 s, 4 iters, t-(init.)=1.12995 s t(norm)=0.126777, mflops=39.4393 8. Edelblute: elapsed time t=2.47315 s, 1 iters, t-(init.)=2.4233 s t(norm)=1.08755, mflops=4.59751 (err=3.9e-14) 9. FFTPACK (f2c): elapsed time t=1.27276 s, 1 iters, t-(init.)=1.223 s t(norm)=0.548869, mflops=9.10963 (err=3.8e-14) FFTW_MEASURE plan: (cost = 3.593510e-01) FFTW_TWIDDLE 2 FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_NOTW 64 10. FFTW: elapsed time t=1.46402 s, 4 iters, t-(init.)=1.26394 s t(norm)=0.14181, mflops=35.2584 (err=3.8e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.86929 s, 4 iters, t-(init.)=1.66984 s t(norm)=0.187351, mflops=26.6878 (err=3.8e-14) 12. Frigo-old: elapsed time t=1.50462 s, 2 iters, t-(init.)=1.40555 s t(norm)=0.315396, mflops=15.8531 (err=3.8e-14) 13. Green: elapsed time t=1.3399 s, 2 iters, t-(init.)=1.24854 s t(norm)=0.280166, mflops=17.8466 (err=3.8e-14) 14. 
GSL: elapsed time t=1.85026 s, 2 iters, t-(init.)=1.75019 s t(norm)=0.392731, mflops=12.7313 (err=3.8e-14) 15. GSL DIT: elapsed time t=1.67009 s, 1 iters, t-(init.)=1.62015 s t(norm)=0.727102, mflops=6.87661 (err=4.0e-14) 16. GSL DIF: elapsed time t=1.61923 s, 1 iters, t-(init.)=1.56927 s t(norm)=0.704269, mflops=7.09956 (err=4.2e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.21204 s, 2 iters, t-(init.)=1.1118 s t(norm)=0.249481, mflops=20.0416 (err=3.9e-14) 19. Mayer (simple): elapsed time t=1.17476 s, 2 iters, t-(init.)=1.07479 s t(norm)=0.241177, mflops=20.7317 20. Mayer (lookup): elapsed time t=1.32123 s, 2 iters, t-(init.)=1.22133 s t(norm)=0.274059, mflops=18.2443 (err=3.9e-14) 21. NAPACK (f2c): elapsed time t=2.11264 s, 1 iters, t-(init.)=2.06269 s t(norm)=0.92571, mflops=5.40126 (err=2.0e-12) 22. Nielsen: elapsed time t=1.38553 s, 1 iters, t-(init.)=1.33558 s t(norm)=0.599394, mflops=8.34175 (err=9.2e-13) 23. NR (C): elapsed time t=1.67687 s, 1 iters, t-(init.)=1.62687 s t(norm)=0.730119, mflops=6.8482 (err=3.9e-14) 24. Ooura (C): elapsed time t=1.13885 s, 2 iters, t-(init.)=1.03856 s t(norm)=0.233047, mflops=21.4549 (err=3.9e-14) 25. QFT: elapsed time t=1.13588 s, 1 iters, t-(init.)=1.08597 s t(norm)=0.48737, mflops=10.2591 (err=4.1e-14) 26. Ransom: elapsed time t=1.18066 s, 2 iters, t-(init.)=1.08065 s t(norm)=0.242491, mflops=20.6194 (err=3.9e-14) 27. Singleton (f2c): elapsed time t=1.76628 s, 2 iters, t-(init.)=1.66659 s t(norm)=0.373973, mflops=13.37 (err=5.7e-14) 28. Temperton (f2c): elapsed time t=1.06871 s, 1 iters, t-(init.)=1.01919 s t(norm)=0.457402, mflops=10.9313 (err=3.8e-14) 29. Valkenburg: elapsed time t=3.18503 s, 1 iters, t-(init.)=3.13522 s t(norm)=1.40705, mflops=3.55353 (err=3.9e-14) Top mflops for N=131072 = 39.4421 Normalized results and averages for N=131072: fft 0: mflops = 6.60528 (norm. = 0.167468), norm. avg. (of 17) = 0.353545 fft 1: mflops = 6.20901 (norm. = 0.157421), norm. avg. 
(of 17) = 0.346531 fft 2: mflops = 4.59802 (norm. = 0.116576), norm. avg. (of 17) = 0.234373 fft 3: mflops = 14.7278 (norm. = 0.373404), norm. avg. (of 17) = 0.189023 fft 4: mflops = 10.327 (norm. = 0.261826), norm. avg. (of 17) = 0.121265 fft 5: mflops = 17.0541 (norm. = 0.432384), norm. avg. (of 17) = 0.440703 fft 6: mflops = 39.4421 (norm. = 1), norm. avg. (of 17) = 0.547786 fft 7: mflops = 39.4393 (norm. = 0.999927), norm. avg. (of 17) = 0.593243 fft 8: mflops = 4.59751 (norm. = 0.116563), norm. avg. (of 16) = 0.185607 fft 9: mflops = 9.10963 (norm. = 0.230962), norm. avg. (of 17) = 0.154956 fft 10: mflops = 35.2584 (norm. = 0.893928), norm. avg. (of 17) = 0.731554 fft 11: mflops = 26.6878 (norm. = 0.676633), norm. avg. (of 17) = 0.680689 fft 12: mflops = 15.8531 (norm. = 0.401932), norm. avg. (of 17) = 0.599595 fft 13: mflops = 17.8466 (norm. = 0.452475), norm. avg. (of 15) = 0.711687 fft 14: mflops = 12.7313 (norm. = 0.322786), norm. avg. (of 17) = 0.239966 fft 15: mflops = 6.87661 (norm. = 0.174347), norm. avg. (of 17) = 0.238467 fft 16: mflops = 7.09956 (norm. = 0.179999), norm. avg. (of 17) = 0.250133 fft 17: mflops = -1 (norm. = -0.0253536), norm. avg. (of 12) = 0.360633 fft 18: mflops = 20.0416 (norm. = 0.508127), norm. avg. (of 16) = 0.339883 fft 19: mflops = 20.7317 (norm. = 0.525623), norm. avg. (of 16) = 0.403439 fft 20: mflops = 18.2443 (norm. = 0.462558), norm. avg. (of 16) = 0.348038 fft 21: mflops = 5.40126 (norm. = 0.136941), norm. avg. (of 17) = 0.0663937 fft 22: mflops = 8.34175 (norm. = 0.211493), norm. avg. (of 17) = 0.137229 fft 23: mflops = 6.8482 (norm. = 0.173626), norm. avg. (of 17) = 0.255033 fft 24: mflops = 21.4549 (norm. = 0.543958), norm. avg. (of 17) = 0.631134 fft 25: mflops = 10.2591 (norm. = 0.260106), norm. avg. (of 14) = 0.219897 fft 26: mflops = 20.6194 (norm. = 0.522775), norm. avg. (of 16) = 0.317852 fft 27: mflops = 13.37 (norm. = 0.338976), norm. avg. (of 17) = 0.44091 fft 28: mflops = 10.9313 (norm. = 0.277148), norm. 
avg. (of 17) = 0.13451 fft 29: mflops = 3.55353 (norm. = 0.0900949), norm. avg. (of 17) = 0.0551268 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=4.62419 s, 1 iters, t-(init.)=4.52424 s t(norm)=0.958811, mflops=5.21479 (err=4.3e-14) 1. Arndt DIT: elapsed time t=4.65318 s, 1 iters, t-(init.)=4.55334 s t(norm)=0.964979, mflops=5.18146 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=5.62808 s, 1 iters, t-(init.)=5.52783 s t(norm)=1.1715, mflops=4.26804 (err=4.3e-14) 3. Arndt 4-step: elapsed time t=1.52849 s, 1 iters, t-(init.)=1.42868 s t(norm)=0.302778, mflops=16.5138 (err=4.3e-14) 4. Beauregard: elapsed time t=2.39103 s, 1 iters, t-(init.)=2.29126 s t(norm)=0.48558, mflops=10.297 (err=4.3e-14) 5. Bergland: elapsed time t=1.55349 s, 1 iters, t-(init.)=1.45346 s t(norm)=0.308029, mflops=16.2323 (err=4.3e-14) 6. CWP (min N) (N=360360): elapsed time t=1.0821 s, 1 iters, t-(init.)=0.95289 s t(norm)=0.201944, mflops=24.7594 7. CWP (best N) (N=360360): elapsed time t=1.08415 s, 1 iters, t-(init.)=0.955127 s t(norm)=0.202418, mflops=24.7014 8. Edelblute: elapsed time t=5.6485 s, 1 iters, t-(init.)=5.54849 s t(norm)=1.17588, mflops=4.25214 (err=4.3e-14) 9. FFTPACK (f2c): elapsed time t=2.70769 s, 1 iters, t-(init.)=2.60773 s t(norm)=0.55265, mflops=9.04732 (err=4.3e-14) FFTW_MEASURE plan: (cost = 7.604780e-01) FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_NOTW 64 10. FFTW: elapsed time t=1.59857 s, 2 iters, t-(init.)=1.39875 s t(norm)=0.148217, mflops=33.7343 (err=4.3e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.97822 s, 2 iters, t-(init.)=1.77854 s t(norm)=0.188461, mflops=26.5307 (err=4.3e-14) 12. Frigo-old: elapsed time t=1.75804 s, 1 iters, t-(init.)=1.6585 s t(norm)=0.351483, mflops=14.2254 (err=4.3e-14) 13. Green: elapsed time t=1.51073 s, 1 iters, t-(init.)=1.41114 s t(norm)=0.29906, mflops=16.719 (err=4.3e-14) 14. 
GSL: elapsed time t=1.91733 s, 1 iters, t-(init.)=1.81751 s t(norm)=0.385181, mflops=12.9809 (err=4.3e-14) 15. GSL DIT: elapsed time t=3.64153 s, 1 iters, t-(init.)=3.54165 s t(norm)=0.750573, mflops=6.66158 (err=4.5e-14) 16. GSL DIF: elapsed time t=3.72056 s, 1 iters, t-(init.)=3.62083 s t(norm)=0.767354, mflops=6.51589 (err=4.7e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=3.25504 s, 1 iters, t-(init.)=3.15528 s t(norm)=0.668691, mflops=7.47729 (err=4.3e-14) 19. Mayer (simple): elapsed time t=3.22281 s, 1 iters, t-(init.)=3.12239 s t(norm)=0.66172, mflops=7.55606 20. Mayer (lookup): elapsed time t=3.35971 s, 1 iters, t-(init.)=3.25933 s t(norm)=0.690741, mflops=7.2386 (err=4.3e-14) 21. NAPACK (f2c): elapsed time t=4.31565 s, 1 iters, t-(init.)=4.21571 s t(norm)=0.893425, mflops=5.59644 (err=3.7e-12) 22. Nielsen: elapsed time t=3.50411 s, 1 iters, t-(init.)=3.40414 s t(norm)=0.721431, mflops=6.93067 (err=2.1e-12) 23. NR (C): elapsed time t=3.76276 s, 1 iters, t-(init.)=3.66286 s t(norm)=0.776261, mflops=6.44113 (err=4.3e-14) 24. Ooura (C): elapsed time t=1.15051 s, 1 iters, t-(init.)=1.05026 s t(norm)=0.222579, mflops=22.4639 (err=4.3e-14) 25. QFT: elapsed time t=2.95311 s, 1 iters, t-(init.)=2.85328 s t(norm)=0.604688, mflops=8.26872 (err=4.7e-14) 26. Ransom: elapsed time t=1.79555 s, 2 iters, t-(init.)=1.59574 s t(norm)=0.169091, mflops=29.5699 (err=4.3e-14) 27. Singleton (f2c): elapsed time t=1.90329 s, 1 iters, t-(init.)=1.80349 s t(norm)=0.38221, mflops=13.0818 (err=5.9e-14) 28. Temperton (f2c): elapsed time t=2.3798 s, 1 iters, t-(init.)=2.27983 s t(norm)=0.483158, mflops=10.3486 (err=4.3e-14) 29. Valkenburg: elapsed time t=7.0235 s, 1 iters, t-(init.)=6.9236 s t(norm)=1.4673, mflops=3.40761 (err=4.3e-14) Top mflops for N=262144 = 33.7343 Normalized results and averages for N=262144: fft 0: mflops = 5.21479 (norm. = 0.154584), norm. avg. (of 18) = 0.342491 fft 1: mflops = 5.18146 (norm. = 0.153596), norm. avg. 
(of 18) = 0.335812 fft 2: mflops = 4.26804 (norm. = 0.126519), norm. avg. (of 18) = 0.228381 fft 3: mflops = 16.5138 (norm. = 0.489525), norm. avg. (of 18) = 0.205717 fft 4: mflops = 10.297 (norm. = 0.305237), norm. avg. (of 18) = 0.131485 fft 5: mflops = 16.2323 (norm. = 0.48118), norm. avg. (of 18) = 0.442951 fft 6: mflops = 24.7594 (norm. = 0.733953), norm. avg. (of 18) = 0.558128 fft 7: mflops = 24.7014 (norm. = 0.732234), norm. avg. (of 18) = 0.600964 fft 8: mflops = 4.25214 (norm. = 0.126048), norm. avg. (of 17) = 0.182104 fft 9: mflops = 9.04732 (norm. = 0.268194), norm. avg. (of 18) = 0.161247 fft 10: mflops = 33.7343 (norm. = 1), norm. avg. (of 18) = 0.746467 fft 11: mflops = 26.5307 (norm. = 0.786461), norm. avg. (of 18) = 0.686566 fft 12: mflops = 14.2254 (norm. = 0.421691), norm. avg. (of 18) = 0.589711 fft 13: mflops = 16.719 (norm. = 0.49561), norm. avg. (of 16) = 0.698182 fft 14: mflops = 12.9809 (norm. = 0.384799), norm. avg. (of 18) = 0.248013 fft 15: mflops = 6.66158 (norm. = 0.197472), norm. avg. (of 18) = 0.236189 fft 16: mflops = 6.51589 (norm. = 0.193153), norm. avg. (of 18) = 0.246968 fft 17: mflops = -1 (norm. = -0.0296434), norm. avg. (of 12) = 0.360633 fft 18: mflops = 7.47729 (norm. = 0.221653), norm. avg. (of 17) = 0.332929 fft 19: mflops = 7.55606 (norm. = 0.223988), norm. avg. (of 17) = 0.392883 fft 20: mflops = 7.2386 (norm. = 0.214577), norm. avg. (of 17) = 0.340187 fft 21: mflops = 5.59644 (norm. = 0.165898), norm. avg. (of 18) = 0.0719217 fft 22: mflops = 6.93067 (norm. = 0.205449), norm. avg. (of 18) = 0.141019 fft 23: mflops = 6.44113 (norm. = 0.190937), norm. avg. (of 18) = 0.251472 fft 24: mflops = 22.4639 (norm. = 0.665907), norm. avg. (of 18) = 0.633065 fft 25: mflops = 8.26872 (norm. = 0.245113), norm. avg. (of 15) = 0.221578 fft 26: mflops = 29.5699 (norm. = 0.876554), norm. avg. (of 17) = 0.350717 fft 27: mflops = 13.0818 (norm. = 0.387789), norm. avg. (of 18) = 0.437959 fft 28: mflops = 10.3486 (norm. = 0.306767), norm. 
avg. (of 18) = 0.14408 fft 29: mflops = 3.40761 (norm. = 0.101013), norm. avg. (of 18) = 0.057676 Benchmarking for array size = 524288 (power of 2): 0. Arndt DIF: elapsed time t=9.34361 s, 1 iters, t-(init.)=9.14302 s t(norm)=0.917838, mflops=5.44758 (err=1.1e-13) 1. Arndt DIT: elapsed time t=9.4088 s, 1 iters, t-(init.)=9.20922 s t(norm)=0.924484, mflops=5.40842 (err=1.1e-13) 2. Arndt Split-Radix: elapsed time t=12.1008 s, 1 iters, t-(init.)=11.9009 s t(norm)=1.19469, mflops=4.18519 (err=1.1e-13) 3. Arndt 4-step: elapsed time t=4.6323 s, 1 iters, t-(init.)=4.43277 s t(norm)=0.444991, mflops=11.2362 (err=1.1e-13) 4. Beauregard: elapsed time t=5.0295 s, 1 iters, t-(init.)=4.82998 s t(norm)=0.484866, mflops=10.3121 (err=1.1e-13) 5. Bergland: elapsed time t=3.53369 s, 1 iters, t-(init.)=3.33392 s t(norm)=0.334681, mflops=14.9396 (err=1.1e-13) 6. CWP (min N) (N=720720): elapsed time t=2.25513 s, 1 iters, t-(init.)=1.98013 s t(norm)=0.198779, mflops=25.1536 7. CWP (best N) (N=720720): elapsed time t=2.2551 s, 1 iters, t-(init.)=1.98043 s t(norm)=0.198809, mflops=25.1498 8. Edelblute: elapsed time t=12.1609 s, 1 iters, t-(init.)=11.9609 s t(norm)=1.20072, mflops=4.16418 (err=1.1e-13) 9. FFTPACK (f2c): elapsed time t=5.44745 s, 1 iters, t-(init.)=5.24752 s t(norm)=0.526782, mflops=9.49159 (err=1.1e-13) FFTW_MEASURE plan: (cost = 1.760174e+00) FFTW_TWIDDLE 4 FFTW_TWIDDLE 2 FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_NOTW 64 10. FFTW: elapsed time t=1.79809 s, 1 iters, t-(init.)=1.59855 s t(norm)=0.160473, mflops=31.1579 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 5.976883e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=2.0628 s, 1 iters, t-(init.)=1.8632 s t(norm)=0.187041, mflops=26.7322 (err=1.1e-13) 12. Frigo-old: elapsed time t=4.13149 s, 1 iters, t-(init.)=3.93215 s t(norm)=0.394735, mflops=12.6667 (err=1.1e-13) 13. 
Green: elapsed time t=3.22129 s, 1 iters, t-(init.)=3.0218 s t(norm)=0.303349, mflops=16.4827 (err=1.1e-13) 14. GSL: elapsed time t=3.81003 s, 1 iters, t-(init.)=3.61006 s t(norm)=0.362402, mflops=13.7968 (err=1.1e-13) 15. GSL DIT: elapsed time t=7.70424 s, 1 iters, t-(init.)=7.50457 s t(norm)=0.75336, mflops=6.63694 (err=1.1e-13) 16. GSL DIF: elapsed time t=7.88304 s, 1 iters, t-(init.)=7.68351 s t(norm)=0.771323, mflops=6.48237 (err=1.1e-13) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=7.40071 s, 1 iters, t-(init.)=7.2005 s t(norm)=0.722835, mflops=6.91721 (err=1.1e-13) 19. Mayer (simple): elapsed time t=7.32746 s, 1 iters, t-(init.)=7.12789 s t(norm)=0.715546, mflops=6.98767 20. Mayer (lookup): elapsed time t=7.695 s, 1 iters, t-(init.)=7.49525 s t(norm)=0.752424, mflops=6.64519 (err=1.1e-13) 21. NAPACK (f2c): elapsed time t=9.3501 s, 1 iters, t-(init.)=9.14945 s t(norm)=0.918483, mflops=5.44376 (err=7.9e-12) 22. Nielsen: elapsed time t=7.78344 s, 1 iters, t-(init.)=7.58275 s t(norm)=0.761208, mflops=6.56851 (err=4.4e-12) 23. NR (C): elapsed time t=7.96033 s, 1 iters, t-(init.)=7.7602 s t(norm)=0.779021, mflops=6.41831 (err=1.1e-13) 24. Ooura (C): elapsed time t=2.49224 s, 1 iters, t-(init.)=2.29233 s t(norm)=0.23012, mflops=21.7278 (err=1.1e-13) 25. QFT: elapsed time t=8.73929 s, 1 iters, t-(init.)=8.53938 s t(norm)=0.857241, mflops=5.83267 (err=1.2e-13) 26. Ransom: elapsed time t=2.38257 s, 1 iters, t-(init.)=2.18308 s t(norm)=0.219152, mflops=22.8152 (err=1.1e-13) 27. Singleton (f2c): elapsed time t=4.70009 s, 1 iters, t-(init.)=4.50012 s t(norm)=0.451753, mflops=11.068 (err=1.6e-13) 28. Temperton (f2c): elapsed time t=5.32812 s, 1 iters, t-(init.)=5.12843 s t(norm)=0.514826, mflops=9.71201 (err=1.1e-13) 29. 
Valkenburg: elapsed time t=15.1233 s, 1 iters, t-(init.)=14.9238 s t(norm)=1.49816, mflops=3.33744 (err=1.1e-13) Top mflops for N=524288 = 31.1579 Normalized results and averages for N=524288: fft 0: mflops = 5.44758 (norm. = 0.174838), norm. avg. (of 19) = 0.333667 fft 1: mflops = 5.40842 (norm. = 0.173581), norm. avg. (of 19) = 0.327274 fft 2: mflops = 4.18519 (norm. = 0.134322), norm. avg. (of 19) = 0.223431 fft 3: mflops = 11.2362 (norm. = 0.360621), norm. avg. (of 19) = 0.21387 fft 4: mflops = 10.3121 (norm. = 0.330964), norm. avg. (of 19) = 0.141984 fft 5: mflops = 14.9396 (norm. = 0.479481), norm. avg. (of 19) = 0.444874 fft 6: mflops = 25.1536 (norm. = 0.807294), norm. avg. (of 19) = 0.571242 fft 7: mflops = 25.1498 (norm. = 0.807173), norm. avg. (of 19) = 0.611817 fft 8: mflops = 4.16418 (norm. = 0.133648), norm. avg. (of 18) = 0.179412 fft 9: mflops = 9.49159 (norm. = 0.304629), norm. avg. (of 19) = 0.168794 fft 10: mflops = 31.1579 (norm. = 1), norm. avg. (of 19) = 0.759811 fft 11: mflops = 26.7322 (norm. = 0.857959), norm. avg. (of 19) = 0.695586 fft 12: mflops = 12.6667 (norm. = 0.406534), norm. avg. (of 19) = 0.58007 fft 13: mflops = 16.4827 (norm. = 0.529005), norm. avg. (of 17) = 0.688231 fft 14: mflops = 13.7968 (norm. = 0.442804), norm. avg. (of 19) = 0.258265 fft 15: mflops = 6.63694 (norm. = 0.21301), norm. avg. (of 19) = 0.234969 fft 16: mflops = 6.48237 (norm. = 0.208049), norm. avg. (of 19) = 0.24492 fft 17: mflops = -1 (norm. = -0.0320946), norm. avg. (of 12) = 0.360633 fft 18: mflops = 6.91721 (norm. = 0.222005), norm. avg. (of 18) = 0.326766 fft 19: mflops = 6.98767 (norm. = 0.224267), norm. avg. (of 18) = 0.383515 fft 20: mflops = 6.64519 (norm. = 0.213275), norm. avg. (of 18) = 0.333137 fft 21: mflops = 5.44376 (norm. = 0.174715), norm. avg. (of 19) = 0.0773319 fft 22: mflops = 6.56851 (norm. = 0.210814), norm. avg. (of 19) = 0.144692 fft 23: mflops = 6.41831 (norm. = 0.205993), norm. avg. 
(of 19) = 0.249079 fft 24: mflops = 21.7278 (norm. = 0.697347), norm. avg. (of 19) = 0.636449 fft 25: mflops = 5.83267 (norm. = 0.187197), norm. avg. (of 16) = 0.219429 fft 26: mflops = 22.8152 (norm. = 0.732245), norm. avg. (of 18) = 0.371913 fft 27: mflops = 11.068 (norm. = 0.355223), norm. avg. (of 19) = 0.433604 fft 28: mflops = 9.71201 (norm. = 0.311704), norm. avg. (of 19) = 0.152902 fft 29: mflops = 3.33744 (norm. = 0.107114), norm. avg. (of 19) = 0.060278 Benchmarking for array size = 1048576 (power of 2): 0. Arndt DIF: elapsed time t=21.3139 s, 1 iters, t-(init.)=20.9148 s t(norm)=0.997294, mflops=5.01357 (err=1.8e-13) 1. Arndt DIT: elapsed time t=21.4865 s, 1 iters, t-(init.)=21.087 s t(norm)=1.00551, mflops=4.97262 (err=1.8e-13) 2. Arndt Split-Radix: elapsed time t=25.7934 s, 1 iters, t-(init.)=25.3941 s t(norm)=1.21088, mflops=4.12922 (err=1.8e-13) 3. Arndt 4-step: elapsed time t=6.45965 s, 1 iters, t-(init.)=6.06031 s t(norm)=0.288978, mflops=17.3023 (err=1.8e-13) 4. Beauregard: elapsed time t=10.5433 s, 1 iters, t-(init.)=10.1437 s t(norm)=0.483692, mflops=10.3372 (err=1.8e-13) 5. Bergland: elapsed time t=7.55696 s, 1 iters, t-(init.)=7.15757 s t(norm)=0.3413, mflops=14.6499 (err=1.8e-13) 6. Skipping fft (this transform size is too big for CWP). 7. Skipping fft (this transform size is too big for CWP). 8. Edelblute: elapsed time t=25.9348 s, 1 iters, t-(init.)=25.5353 s t(norm)=1.21762, mflops=4.10638 (err=1.8e-13) 9. FFTPACK (f2c): elapsed time t=11.6291 s, 1 iters, t-(init.)=11.2285 s t(norm)=0.535415, mflops=9.33856 (err=1.8e-13) FFTW_MEASURE plan: (cost = 3.607511e+00) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_NOTW 64 10. FFTW: elapsed time t=3.65437 s, 1 iters, t-(init.)=3.25378 s t(norm)=0.155152, mflops=32.2264 (err=1.8e-13) FFTW_ESTIMATE plan: (cost = 1.195377e+07) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. 
FFTW_ESTIMATE: elapsed time t=4.59747 s, 1 iters, t-(init.)=4.19793 s t(norm)=0.200173, mflops=24.9784 (err=1.8e-13) 12. Frigo-old: elapsed time t=8.86928 s, 1 iters, t-(init.)=8.47005 s t(norm)=0.403883, mflops=12.3798 (err=1.8e-13) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=7.87362 s, 1 iters, t-(init.)=7.47414 s t(norm)=0.356395, mflops=14.0294 (err=1.8e-13) 15. GSL DIT: elapsed time t=16.265 s, 1 iters, t-(init.)=15.8654 s t(norm)=0.756523, mflops=6.60918 (err=1.9e-13) 16. GSL DIF: elapsed time t=16.673 s, 1 iters, t-(init.)=16.2735 s t(norm)=0.775982, mflops=6.44345 (err=1.9e-13) 17. Skipping fft (Krukar can't handle N > 4096). 18. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 19. Mayer (simple): elapsed time t=15.5947 s, 1 iters, t-(init.)=15.1952 s t(norm)=0.724562, mflops=6.90072 20. Mayer (lookup): elapsed time t=16.3402 s, 1 iters, t-(init.)=15.9405 s t(norm)=0.760105, mflops=6.57804 (err=1.8e-13) 21. NAPACK (f2c): elapsed time t=19.067 s, 1 iters, t-(init.)=18.6675 s t(norm)=0.890138, mflops=5.61711 (err=1.5e-11) 22. Nielsen: elapsed time t=15.7829 s, 1 iters, t-(init.)=15.3835 s t(norm)=0.733542, mflops=6.81624 (err=8.1e-12) 23. NR (C): elapsed time t=16.8083 s, 1 iters, t-(init.)=16.4086 s t(norm)=0.782422, mflops=6.39041 (err=1.8e-13) 24. Ooura (C): elapsed time t=5.00243 s, 1 iters, t-(init.)=4.60262 s t(norm)=0.21947, mflops=22.7821 (err=1.8e-13) 25. QFT: elapsed time t=19.6983 s, 1 iters, t-(init.)=19.2987 s t(norm)=0.920236, mflops=5.43339 (err=1.9e-13) 26. Ransom: elapsed time t=3.62825 s, 1 iters, t-(init.)=3.22872 s t(norm)=0.153957, mflops=32.4766 (err=1.8e-13) 27. Singleton (f2c): elapsed time t=8.56582 s, 1 iters, t-(init.)=8.16654 s t(norm)=0.389411, mflops=12.8399 (err=2.6e-13) 28. Temperton (f2c): elapsed time t=10.8319 s, 1 iters, t-(init.)=10.4324 s t(norm)=0.497457, mflops=10.0511 (err=1.8e-13) 29. 
Valkenburg: elapsed time t=32.5732 s, 1 iters, t-(init.)=32.1741 s t(norm)=1.53418, mflops=3.25907 (err=1.8e-13) Top mflops for N=1048576 = 32.4766 Normalized results and averages for N=1048576: fft 0: mflops = 5.01357 (norm. = 0.154375), norm. avg. (of 20) = 0.324703 fft 1: mflops = 4.97262 (norm. = 0.153114), norm. avg. (of 20) = 0.318566 fft 2: mflops = 4.12922 (norm. = 0.127144), norm. avg. (of 20) = 0.218616 fft 3: mflops = 17.3023 (norm. = 0.532764), norm. avg. (of 20) = 0.229815 fft 4: mflops = 10.3372 (norm. = 0.318296), norm. avg. (of 20) = 0.1508 fft 5: mflops = 14.6499 (norm. = 0.451091), norm. avg. (of 20) = 0.445185 fft 6: mflops = -1 (norm. = -0.0307914), norm. avg. (of 19) = 0.571242 fft 7: mflops = -1 (norm. = -0.0307914), norm. avg. (of 19) = 0.611817 fft 8: mflops = 4.10638 (norm. = 0.126441), norm. avg. (of 19) = 0.176624 fft 9: mflops = 9.33856 (norm. = 0.287548), norm. avg. (of 20) = 0.174731 fft 10: mflops = 32.2264 (norm. = 0.992297), norm. avg. (of 20) = 0.771435 fft 11: mflops = 24.9784 (norm. = 0.769122), norm. avg. (of 20) = 0.699263 fft 12: mflops = 12.3798 (norm. = 0.381192), norm. avg. (of 20) = 0.570126 fft 13: mflops = -1 (norm. = -0.0307914), norm. avg. (of 17) = 0.688231 fft 14: mflops = 14.0294 (norm. = 0.431985), norm. avg. (of 20) = 0.266951 fft 15: mflops = 6.60918 (norm. = 0.203506), norm. avg. (of 20) = 0.233396 fft 16: mflops = 6.44345 (norm. = 0.198403), norm. avg. (of 20) = 0.242594 fft 17: mflops = -1 (norm. = -0.0307914), norm. avg. (of 12) = 0.360633 fft 18: mflops = -1 (norm. = -0.0307914), norm. avg. (of 18) = 0.326766 fft 19: mflops = 6.90072 (norm. = 0.212483), norm. avg. (of 19) = 0.374513 fft 20: mflops = 6.57804 (norm. = 0.202547), norm. avg. (of 19) = 0.326263 fft 21: mflops = 5.61711 (norm. = 0.172959), norm. avg. (of 20) = 0.0821132 fft 22: mflops = 6.81624 (norm. = 0.209882), norm. avg. (of 20) = 0.147952 fft 23: mflops = 6.39041 (norm. = 0.19677), norm. avg. (of 20) = 0.246463 fft 24: mflops = 22.7821 (norm. 
= 0.701494), norm. avg. (of 20) = 0.639701
fft 25: mflops = 5.43339 (norm. = 0.167302), norm. avg. (of 17) = 0.216363
fft 26: mflops = 32.4766 (norm. = 1), norm. avg. (of 19) = 0.40497
fft 27: mflops = 12.8399 (norm. = 0.395359), norm. avg. (of 20) = 0.431692
fft 28: mflops = 10.0511 (norm. = 0.309489), norm. avg. (of 20) = 0.160732
fft 29: mflops = 3.25907 (norm. = 0.100351), norm. avg. (of 20) = 0.0622817

------------------------------------------------------
@@@@ bench.1d.np2.log

Benchmarking for sizes:
6 (0.000686646 MB)
9 (0.000915527 MB)
12 (0.00114441 MB)
15 (0.00137329 MB)
18 (0.00180054 MB)
24 (0.0022583 MB)
36 (0.0032959 MB)
80 (0.00738525 MB)
108 (0.00994873 MB)
210 (0.0192261 MB)
504 (0.0461426 MB)
1000 (0.0916748 MB)
1960 (0.179749 MB)
4725 (0.437393 MB)
10368 (0.960205 MB)
27000 (2.48291 MB)
75600 (6.98975 MB)
165375 (15.3664 MB)
362880 (38.6829 MB)
Maximum array size = 720720

Benchmarking FFTs:
0. CWP (min N)
1. CWP (best N)
2. FFTPACK (f2c)
3. FFTW
4. FFTW_ESTIMATE
5. Frigo-old
6. GSL
7. NAPACK (f2c)
8. Nielsen
9. Singleton (f2c)
10. Temperton (f2c)
11. Valkenburg
Computing normalized averages (12 transforms).

Benchmarking for array size = 6:
0. CWP (min N): elapsed time t=1.54352 s, 524288 iters, t-(init.)=1.48188 s t(norm)=0.182238, mflops=27.4367
1. CWP (best N) (N=15): elapsed time t=1.16832 s, 262144 iters, t-(init.)=1.10585 s t(norm)=0.271988, mflops=18.3832
2. FFTPACK (f2c): elapsed time t=1.7977 s, 524288 iters, t-(init.)=1.73432 s t(norm)=0.213282, mflops=23.4432 (err=1.7e-16)
FFTW_MEASURE plan: (cost = 6.075516e-07) FFTW_NOTW 6
3. FFTW: elapsed time t=1.35881 s, 2097152 iters, t-(init.)=1.10554 s t(norm)=0.033989, mflops=147.107 (err=1.3e-16)
FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6
4. FFTW_ESTIMATE: elapsed time t=1.34539 s, 2097152 iters, t-(init.)=1.09897 s t(norm)=0.0337871, mflops=147.985 (err=1.3e-16)
5. 
Frigo-old: elapsed time t=1.59992 s, 524288 iters, t-(init.)=1.53828 s t(norm)=0.189174, mflops=26.4307 (err=3.2e-16) 6. GSL: elapsed time t=1.10377 s, 524288 iters, t-(init.)=1.04194 s t(norm)=0.128135, mflops=39.0212 (err=1.3e-16) 7. NAPACK (f2c): elapsed time t=1.0305 s, 131072 iters, t-(init.)=1.01503 s t(norm)=0.499302, mflops=10.014 (err=2.3e-16) 8. Nielsen: elapsed time t=1.32583 s, 131072 iters, t-(init.)=1.31039 s t(norm)=0.644592, mflops=7.75684 (err=2.7e-16) 9. Singleton (f2c): elapsed time t=1.0231 s, 262144 iters, t-(init.)=0.992273 s t(norm)=0.244054, mflops=20.4873 (err=1.3e-16) 10. Temperton (f2c): elapsed time t=1.01641 s, 131072 iters, t-(init.)=1.00096 s t(norm)=0.492383, mflops=10.1547 (err=1.2e-16) 11. Valkenburg: elapsed time t=1.04991 s, 131072 iters, t-(init.)=1.03403 s t(norm)=0.50865, mflops=9.82994 (err=2.1e-16) Top mflops for N=6 = 147.985 Normalized results and averages for N=6: fft 0: mflops = 27.4367 (norm. = 0.185401), norm. avg. (of 1) = 0.185401 fft 1: mflops = 18.3832 (norm. = 0.124223), norm. avg. (of 1) = 0.124223 fft 2: mflops = 23.4432 (norm. = 0.158415), norm. avg. (of 1) = 0.158415 fft 3: mflops = 147.107 (norm. = 0.994062), norm. avg. (of 1) = 0.994062 fft 4: mflops = 147.985 (norm. = 1), norm. avg. (of 1) = 1 fft 5: mflops = 26.4307 (norm. = 0.178604), norm. avg. (of 1) = 0.178604 fft 6: mflops = 39.0212 (norm. = 0.263683), norm. avg. (of 1) = 0.263683 fft 7: mflops = 10.014 (norm. = 0.0676688), norm. avg. (of 1) = 0.0676688 fft 8: mflops = 7.75684 (norm. = 0.0524163), norm. avg. (of 1) = 0.0524163 fft 9: mflops = 20.4873 (norm. = 0.138441), norm. avg. (of 1) = 0.138441 fft 10: mflops = 10.1547 (norm. = 0.0686196), norm. avg. (of 1) = 0.0686196 fft 11: mflops = 9.82994 (norm. = 0.0664251), norm. avg. (of 1) = 0.0664251 Benchmarking for array size = 9: 0. CWP (min N): elapsed time t=1.65219 s, 524288 iters, t-(init.)=1.56933 s t(norm)=0.104918, mflops=47.6561 1. 
CWP (best N) (N=15): elapsed time t=1.16103 s, 262144 iters, t-(init.)=1.09856 s t(norm)=0.146891, mflops=34.0389 2. FFTPACK (f2c): elapsed time t=1.31356 s, 262144 iters, t-(init.)=1.27133 s t(norm)=0.169992, mflops=29.4132 (err=2.8e-16) FFTW_MEASURE plan: (cost = 9.839478e-07) FFTW_NOTW 9 3. FFTW: elapsed time t=1.09873 s, 1048576 iters, t-(init.)=0.929764 s t(norm)=0.03108, mflops=160.875 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.09242 s, 1048576 iters, t-(init.)=0.926847 s t(norm)=0.0309825, mflops=161.381 (err=1.4e-16) 5. Frigo-old: elapsed time t=1.67224 s, 262144 iters, t-(init.)=1.63084 s t(norm)=0.218062, mflops=22.9293 (err=3.1e-16) 6. GSL: elapsed time t=1.05797 s, 262144 iters, t-(init.)=1.01662 s t(norm)=0.135933, mflops=36.7828 (err=1.4e-16) 7. NAPACK (f2c): elapsed time t=1.50874 s, 131072 iters, t-(init.)=1.48796 s t(norm)=0.397916, mflops=12.5655 (err=5.8e-16) 8. Nielsen: elapsed time t=1.58694 s, 131072 iters, t-(init.)=1.56622 s t(norm)=0.418844, mflops=11.9376 (err=4.5e-16) 9. Singleton (f2c): elapsed time t=1.01381 s, 262144 iters, t-(init.)=0.972401 s t(norm)=0.130021, mflops=38.4553 (err=1.7e-16) 10. Temperton (f2c): elapsed time t=1.29273 s, 131072 iters, t-(init.)=1.27201 s t(norm)=0.340164, mflops=14.6988 (err=1.7e-16) 11. Valkenburg: elapsed time t=1.8891 s, 131072 iters, t-(init.)=1.86785 s t(norm)=0.499505, mflops=10.0099 (err=2.6e-16) Top mflops for N=9 = 161.381 Normalized results and averages for N=9: fft 0: mflops = 47.6561 (norm. = 0.295301), norm. avg. (of 2) = 0.240351 fft 1: mflops = 34.0389 (norm. = 0.210922), norm. avg. (of 2) = 0.167573 fft 2: mflops = 29.4132 (norm. = 0.182259), norm. avg. (of 2) = 0.170337 fft 3: mflops = 160.875 (norm. = 0.996863), norm. avg. (of 2) = 0.995462 fft 4: mflops = 161.381 (norm. = 1), norm. avg. (of 2) = 1 fft 5: mflops = 22.9293 (norm. = 0.142081), norm. avg. (of 2) = 0.160342 fft 6: mflops = 36.7828 (norm. = 0.227925), norm. avg. 
(of 2) = 0.245804 fft 7: mflops = 12.5655 (norm. = 0.077862), norm. avg. (of 2) = 0.0727654 fft 8: mflops = 11.9376 (norm. = 0.0739715), norm. avg. (of 2) = 0.0631939 fft 9: mflops = 38.4553 (norm. = 0.238288), norm. avg. (of 2) = 0.188365 fft 10: mflops = 14.6988 (norm. = 0.0910812), norm. avg. (of 2) = 0.0798504 fft 11: mflops = 10.0099 (norm. = 0.0620265), norm. avg. (of 2) = 0.0642258 Benchmarking for array size = 12: 0. CWP (min N): elapsed time t=1.93969 s, 524288 iters, t-(init.)=1.83579 s t(norm)=0.0813933, mflops=61.4302 1. CWP (best N) (N=15): elapsed time t=1.1645 s, 262144 iters, t-(init.)=1.10199 s t(norm)=0.097717, mflops=51.1681 2. FFTPACK (f2c): elapsed time t=1.56162 s, 262144 iters, t-(init.)=1.50881 s t(norm)=0.133792, mflops=37.3715 (err=1.9e-16) FFTW_MEASURE plan: (cost = 1.041183e-06) FFTW_NOTW 12 3. FFTW: elapsed time t=1.15548 s, 1048576 iters, t-(init.)=0.944143 s t(norm)=0.0209301, mflops=238.89 (err=1.3e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.1485 s, 1048576 iters, t-(init.)=0.940976 s t(norm)=0.0208599, mflops=239.694 (err=1.3e-16) 5. Frigo-old: elapsed time t=1.46423 s, 262144 iters, t-(init.)=1.4123 s t(norm)=0.125234, mflops=39.9254 (err=2.3e-16) 6. GSL: elapsed time t=1.03091 s, 262144 iters, t-(init.)=0.978975 s t(norm)=0.0868092, mflops=57.5976 (err=1.5e-16) 7. NAPACK (f2c): elapsed time t=1.08714 s, 65536 iters, t-(init.)=1.07406 s t(norm)=0.380964, mflops=13.1246 (err=4.2e-16) 8. Nielsen: elapsed time t=1.84982 s, 131072 iters, t-(init.)=1.82386 s t(norm)=0.323456, mflops=15.458 (err=4.8e-16) 9. Singleton (f2c): elapsed time t=1.43408 s, 262144 iters, t-(init.)=1.38215 s t(norm)=0.12256, mflops=40.7963 (err=1.9e-16) 10. Temperton (f2c): elapsed time t=1.47563 s, 131072 iters, t-(init.)=1.44967 s t(norm)=0.257094, mflops=19.4481 (err=1.2e-16) 11. 
Valkenburg: elapsed time t=1.40181 s, 65536 iters, t-(init.)=1.38855 s t(norm)=0.492512, mflops=10.152 (err=1.9e-16) Top mflops for N=12 = 239.694 Normalized results and averages for N=12: fft 0: mflops = 61.4302 (norm. = 0.256286), norm. avg. (of 3) = 0.245663 fft 1: mflops = 51.1681 (norm. = 0.213473), norm. avg. (of 3) = 0.182873 fft 2: mflops = 37.3715 (norm. = 0.155913), norm. avg. (of 3) = 0.165529 fft 3: mflops = 238.89 (norm. = 0.996646), norm. avg. (of 3) = 0.995857 fft 4: mflops = 239.694 (norm. = 1), norm. avg. (of 3) = 1 fft 5: mflops = 39.9254 (norm. = 0.166568), norm. avg. (of 3) = 0.162418 fft 6: mflops = 57.5976 (norm. = 0.240296), norm. avg. (of 3) = 0.243968 fft 7: mflops = 13.1246 (norm. = 0.0547557), norm. avg. (of 3) = 0.0667622 fft 8: mflops = 15.458 (norm. = 0.0644907), norm. avg. (of 3) = 0.0636262 fft 9: mflops = 40.7963 (norm. = 0.170202), norm. avg. (of 3) = 0.18231 fft 10: mflops = 19.4481 (norm. = 0.0811373), norm. avg. (of 3) = 0.0802794 fft 11: mflops = 10.152 (norm. = 0.0423542), norm. avg. (of 3) = 0.0569352 Benchmarking for array size = 15: 0. CWP (min N): elapsed time t=1.15918 s, 262144 iters, t-(init.)=1.09656 s t(norm)=0.0713789, mflops=70.0488 1. CWP (best N): elapsed time t=1.16197 s, 262144 iters, t-(init.)=1.09952 s t(norm)=0.0715717, mflops=69.8601 2. FFTPACK (f2c): elapsed time t=1.00836 s, 131072 iters, t-(init.)=0.976642 s t(norm)=0.127146, mflops=39.3248 (err=3.6e-16) FFTW_MEASURE plan: (cost = 1.775940e-06) FFTW_NOTW 15 3. FFTW: elapsed time t=1.94364 s, 1048576 iters, t-(init.)=1.69011 s t(norm)=0.0275038, mflops=181.793 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.93669 s, 1048576 iters, t-(init.)=1.68685 s t(norm)=0.0274507, mflops=182.145 (err=1.7e-16) 5. Frigo-old: elapsed time t=1.41305 s, 131072 iters, t-(init.)=1.38179 s t(norm)=0.179891, mflops=27.7946 (err=2.7e-16) 6. 
GSL: elapsed time t=1.97999 s, 262144 iters, t-(init.)=1.91748 s t(norm)=0.124815, mflops=40.0592 (err=1.9e-16) 7. NAPACK (f2c): elapsed time t=1.06133 s, 32768 iters, t-(init.)=1.05329 s t(norm)=0.548497, mflops=9.11582 (err=9.4e-16) 8. Nielsen: elapsed time t=1.08187 s, 65536 iters, t-(init.)=1.06618 s t(norm)=0.277606, mflops=18.0112 (err=4.5e-15) 9. Singleton (f2c): elapsed time t=1.73509 s, 262144 iters, t-(init.)=1.67254 s t(norm)=0.108872, mflops=45.9256 (err=2.0e-16) 10. Temperton (f2c): elapsed time t=1.95295 s, 131072 iters, t-(init.)=1.92174 s t(norm)=0.250186, mflops=19.9852 (err=2.5e-16) 11. Valkenburg: elapsed time t=1.0585 s, 32768 iters, t-(init.)=1.05054 s t(norm)=0.547068, mflops=9.13964 (err=2.5e-16) Top mflops for N=15 = 182.145 Normalized results and averages for N=15: fft 0: mflops = 70.0488 (norm. = 0.384578), norm. avg. (of 4) = 0.280391 fft 1: mflops = 69.8601 (norm. = 0.383542), norm. avg. (of 4) = 0.23304 fft 2: mflops = 39.3248 (norm. = 0.215899), norm. avg. (of 4) = 0.178122 fft 3: mflops = 181.793 (norm. = 0.998069), norm. avg. (of 4) = 0.99641 fft 4: mflops = 182.145 (norm. = 1), norm. avg. (of 4) = 1 fft 5: mflops = 27.7946 (norm. = 0.152597), norm. avg. (of 4) = 0.159962 fft 6: mflops = 40.0592 (norm. = 0.219931), norm. avg. (of 4) = 0.237959 fft 7: mflops = 9.11582 (norm. = 0.0500472), norm. avg. (of 4) = 0.0625834 fft 8: mflops = 18.0112 (norm. = 0.0988839), norm. avg. (of 4) = 0.0724406 fft 9: mflops = 45.9256 (norm. = 0.252138), norm. avg. (of 4) = 0.199767 fft 10: mflops = 19.9852 (norm. = 0.109721), norm. avg. (of 4) = 0.0876399 fft 11: mflops = 9.13964 (norm. = 0.0501779), norm. avg. (of 4) = 0.0552459 Benchmarking for array size = 18: 0. CWP (min N): elapsed time t=1.39618 s, 262144 iters, t-(init.)=1.32304 s t(norm)=0.0672408, mflops=74.3597 1. CWP (best N) (N=28): elapsed time t=1.50194 s, 262144 iters, t-(init.)=1.39378 s t(norm)=0.0708358, mflops=70.5858 2. 
FFTPACK (f2c): elapsed time t=1.7812 s, 131072 iters, t-(init.)=1.74417 s t(norm)=0.177287, mflops=28.2028 (err=2.6e-16) FFTW_MEASURE plan: (cost = 2.697968e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6 3. FFTW: elapsed time t=1.44679 s, 524288 iters, t-(init.)=1.29873 s t(norm)=0.0330027, mflops=151.503 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.46063 s, 524288 iters, t-(init.)=1.3146 s t(norm)=0.0334058, mflops=149.675 (err=2.3e-16) 5. Frigo-old: elapsed time t=1.89688 s, 131072 iters, t-(init.)=1.85974 s t(norm)=0.189034, mflops=26.4502 (err=3.8e-16) 6. GSL: elapsed time t=1.60822 s, 262144 iters, t-(init.)=1.53517 s t(norm)=0.0780219, mflops=64.0846 (err=2.4e-16) 7. NAPACK (f2c): elapsed time t=1.65623 s, 65536 iters, t-(init.)=1.63789 s t(norm)=0.33297, mflops=15.0164 (err=6.0e-16) 8. Nielsen: elapsed time t=1.81435 s, 65536 iters, t-(init.)=1.79607 s t(norm)=0.365126, mflops=13.6939 (err=7.7e-16) 9. Singleton (f2c): elapsed time t=1.78773 s, 262144 iters, t-(init.)=1.71472 s t(norm)=0.0871472, mflops=57.3742 (err=1.7e-16) 10. Temperton (f2c): elapsed time t=1.46429 s, 65536 iters, t-(init.)=1.44601 s t(norm)=0.293962, mflops=17.009 (err=2.8e-16) 11. Valkenburg: elapsed time t=1.2016 s, 32768 iters, t-(init.)=1.19235 s t(norm)=0.48479, mflops=10.3137 (err=2.8e-16) Top mflops for N=18 = 151.503 Normalized results and averages for N=18: fft 0: mflops = 74.3597 (norm. = 0.490814), norm. avg. (of 5) = 0.322476 fft 1: mflops = 70.5858 (norm. = 0.465905), norm. avg. (of 5) = 0.279613 fft 2: mflops = 28.2028 (norm. = 0.186154), norm. avg. (of 5) = 0.179728 fft 3: mflops = 151.503 (norm. = 1), norm. avg. (of 5) = 0.997128 fft 4: mflops = 149.675 (norm. = 0.987935), norm. avg. (of 5) = 0.997587 fft 5: mflops = 26.4502 (norm. = 0.174586), norm. avg. (of 5) = 0.162887 fft 6: mflops = 64.0846 (norm. = 0.422993), norm. avg. (of 5) = 0.274966 fft 7: mflops = 15.0164 (norm. = 0.0991163), norm. avg. 
(of 5) = 0.06989 fft 8: mflops = 13.6939 (norm. = 0.0903872), norm. avg. (of 5) = 0.0760299 fft 9: mflops = 57.3742 (norm. = 0.378701), norm. avg. (of 5) = 0.235554 fft 10: mflops = 17.009 (norm. = 0.112269), norm. avg. (of 5) = 0.0925656 fft 11: mflops = 10.3137 (norm. = 0.0680764), norm. avg. (of 5) = 0.057812 Benchmarking for array size = 24: 0. CWP (min N): elapsed time t=1.50189 s, 262144 iters, t-(init.)=1.40762 s t(norm)=0.0487976, mflops=102.464 1. CWP (best N) (N=28): elapsed time t=1.50232 s, 262144 iters, t-(init.)=1.39386 s t(norm)=0.0483206, mflops=103.475 2. FFTPACK (f2c): elapsed time t=1.11188 s, 65536 iters, t-(init.)=1.0881 s t(norm)=0.150884, mflops=33.1381 (err=2.4e-16) FFTW_MEASURE plan: (cost = 3.033508e-06) FFTW_TWIDDLE 2 FFTW_NOTW 12 3. FFTW: elapsed time t=1.62387 s, 524288 iters, t-(init.)=1.43359 s t(norm)=0.0248489, mflops=201.216 (err=2.0e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.62032 s, 524288 iters, t-(init.)=1.43183 s t(norm)=0.0248185, mflops=201.463 (err=2.0e-16) 5. Frigo-old: elapsed time t=1.43339 s, 131072 iters, t-(init.)=1.38641 s t(norm)=0.0961249, mflops=52.0156 (err=2.7e-16) 6. GSL: elapsed time t=1.79681 s, 262144 iters, t-(init.)=1.70271 s t(norm)=0.0590273, mflops=84.7066 (err=2.2e-16) 7. NAPACK (f2c): elapsed time t=1.0808 s, 32768 iters, t-(init.)=1.06897 s t(norm)=0.296461, mflops=16.8656 (err=8.2e-16) 8. Nielsen: elapsed time t=1.56893 s, 65536 iters, t-(init.)=1.54531 s t(norm)=0.214284, mflops=23.3335 (err=1.4e-15) 9. Singleton (f2c): elapsed time t=1.37564 s, 131072 iters, t-(init.)=1.32852 s t(norm)=0.0921107, mflops=54.2825 (err=2.2e-16) 10. Temperton (f2c): elapsed time t=1.66159 s, 65536 iters, t-(init.)=1.63808 s t(norm)=0.227148, mflops=22.0121 (err=2.7e-16) 11. 
Valkenburg: elapsed time t=1.74011 s, 32768 iters, t-(init.)=1.72817 s t(norm)=0.479279, mflops=10.4323 (err=2.9e-16) Top mflops for N=24 = 201.463 Normalized results and averages for N=24: fft 0: mflops = 102.464 (norm. = 0.5086), norm. avg. (of 6) = 0.353497 fft 1: mflops = 103.475 (norm. = 0.513621), norm. avg. (of 6) = 0.318614 fft 2: mflops = 33.1381 (norm. = 0.164488), norm. avg. (of 6) = 0.177188 fft 3: mflops = 201.216 (norm. = 0.998776), norm. avg. (of 6) = 0.997403 fft 4: mflops = 201.463 (norm. = 1), norm. avg. (of 6) = 0.997989 fft 5: mflops = 52.0156 (norm. = 0.25819), norm. avg. (of 6) = 0.178771 fft 6: mflops = 84.7066 (norm. = 0.420458), norm. avg. (of 6) = 0.299214 fft 7: mflops = 16.8656 (norm. = 0.0837158), norm. avg. (of 6) = 0.0721943 fft 8: mflops = 23.3335 (norm. = 0.115821), norm. avg. (of 6) = 0.0826617 fft 9: mflops = 54.2825 (norm. = 0.269442), norm. avg. (of 6) = 0.241202 fft 10: mflops = 22.0121 (norm. = 0.109261), norm. avg. (of 6) = 0.0953482 fft 11: mflops = 10.4323 (norm. = 0.0517829), norm. avg. (of 6) = 0.0568072 Benchmarking for array size = 36: 0. CWP (min N): elapsed time t=1.06082 s, 131072 iters, t-(init.)=0.992579 s t(norm)=0.0406882, mflops=122.886 1. CWP (best N): elapsed time t=1.0619 s, 131072 iters, t-(init.)=0.993663 s t(norm)=0.0407326, mflops=122.752 2. FFTPACK (f2c): elapsed time t=1.73285 s, 65536 iters, t-(init.)=1.69852 s t(norm)=0.139253, mflops=35.9059 (err=3.7e-16) FFTW_MEASURE plan: (cost = 4.739136e-06) FFTW_TWIDDLE 3 FFTW_NOTW 12 3. FFTW: elapsed time t=1.27095 s, 262144 iters, t-(init.)=1.13374 s t(norm)=0.0232374, mflops=215.17 (err=3.5e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.26967 s, 262144 iters, t-(init.)=1.13334 s t(norm)=0.0232291, mflops=215.247 (err=3.5e-16) 5. Frigo-old: elapsed time t=1.87141 s, 65536 iters, t-(init.)=1.83727 s t(norm)=0.150628, mflops=33.1943 (err=4.8e-16) 6. 
GSL: elapsed time t=1.33289 s, 131072 iters, t-(init.)=1.26469 s t(norm)=0.0518428, mflops=96.4455 (err=2.8e-16) 7. NAPACK (f2c): elapsed time t=1.68679 s, 32768 iters, t-(init.)=1.66967 s t(norm)=0.273775, mflops=18.2632 (err=1.0e-15) 8. Nielsen: elapsed time t=1.49128 s, 32768 iters, t-(init.)=1.47416 s t(norm)=0.241718, mflops=20.6852 (err=9.7e-16) 9. Singleton (f2c): elapsed time t=1.45397 s, 131072 iters, t-(init.)=1.38578 s t(norm)=0.0568065, mflops=88.0181 (err=2.7e-16) 10. Temperton (f2c): elapsed time t=1.22907 s, 32768 iters, t-(init.)=1.21199 s t(norm)=0.198729, mflops=25.1599 (err=3.9e-16) 11. Valkenburg: elapsed time t=1.45256 s, 16384 iters, t-(init.)=1.4439 s t(norm)=0.473512, mflops=10.5594 (err=4.0e-16) Top mflops for N=36 = 215.247 Normalized results and averages for N=36: fft 0: mflops = 122.886 (norm. = 0.570905), norm. avg. (of 7) = 0.384555 fft 1: mflops = 122.752 (norm. = 0.570282), norm. avg. (of 7) = 0.354567 fft 2: mflops = 35.9059 (norm. = 0.166812), norm. avg. (of 7) = 0.175706 fft 3: mflops = 215.17 (norm. = 0.999643), norm. avg. (of 7) = 0.997723 fft 4: mflops = 215.247 (norm. = 1), norm. avg. (of 7) = 0.998276 fft 5: mflops = 33.1943 (norm. = 0.154215), norm. avg. (of 7) = 0.175263 fft 6: mflops = 96.4455 (norm. = 0.448068), norm. avg. (of 7) = 0.320479 fft 7: mflops = 18.2632 (norm. = 0.0848475), norm. avg. (of 7) = 0.0740019 fft 8: mflops = 20.6852 (norm. = 0.0960999), norm. avg. (of 7) = 0.0845814 fft 9: mflops = 88.0181 (norm. = 0.408916), norm. avg. (of 7) = 0.265161 fft 10: mflops = 25.1599 (norm. = 0.116888), norm. avg. (of 7) = 0.0984254 fft 11: mflops = 10.5594 (norm. = 0.049057), norm. avg. (of 7) = 0.0557 Benchmarking for array size = 80: 0. CWP (min N): elapsed time t=1.15789 s, 65536 iters, t-(init.)=1.08493 s t(norm)=0.0327328, mflops=152.752 1. CWP (best N) (N=84): elapsed time t=1.02778 s, 65536 iters, t-(init.)=0.951472 s t(norm)=0.0287063, mflops=174.178 2. 
FFTPACK (f2c): elapsed time t=1.84056 s, 32768 iters, t-(init.)=1.804 s t(norm)=0.108855, mflops=45.9328 (err=7.7e-16) FFTW_MEASURE plan: (cost = 1.120190e-05) FFTW_TWIDDLE 5 FFTW_NOTW 16 3. FFTW: elapsed time t=1.50498 s, 131072 iters, t-(init.)=1.35889 s t(norm)=0.0204991, mflops=243.913 (err=7.3e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 4. FFTW_ESTIMATE: elapsed time t=1.50491 s, 131072 iters, t-(init.)=1.35936 s t(norm)=0.0205061, mflops=243.83 (err=7.3e-16) 5. Frigo-old: elapsed time t=1.28033 s, 32768 iters, t-(init.)=1.24389 s t(norm)=0.0750572, mflops=66.6158 (err=7.1e-16) 6. GSL: elapsed time t=1.33325 s, 32768 iters, t-(init.)=1.29676 s t(norm)=0.0782476, mflops=63.8998 (err=6.9e-16) 7. NAPACK (f2c): elapsed time t=1.64457 s, 8192 iters, t-(init.)=1.63538 s t(norm)=0.394721, mflops=12.6672 (err=1.1e-15) 8. Nielsen: elapsed time t=1.09571 s, 16384 iters, t-(init.)=1.07745 s t(norm)=0.130028, mflops=38.4533 (err=5.4e-15) 9. Singleton (f2c): elapsed time t=1.29693 s, 65536 iters, t-(init.)=1.22409 s t(norm)=0.0369313, mflops=135.387 (err=1.3e-15) 10. Temperton (f2c): elapsed time t=1.42739 s, 16384 iters, t-(init.)=1.40912 s t(norm)=0.170055, mflops=29.4023 (err=7.0e-16) 11. Valkenburg: elapsed time t=1.03416 s, 4096 iters, t-(init.)=1.02961 s t(norm)=0.497017, mflops=10.06 (err=8.4e-16) Top mflops for N=80 = 243.913 Normalized results and averages for N=80: fft 0: mflops = 152.752 (norm. = 0.626255), norm. avg. (of 8) = 0.414768 fft 1: mflops = 174.178 (norm. = 0.714098), norm. avg. (of 8) = 0.399508 fft 2: mflops = 45.9328 (norm. = 0.188316), norm. avg. (of 8) = 0.177282 fft 3: mflops = 243.913 (norm. = 1), norm. avg. (of 8) = 0.998007 fft 4: mflops = 243.83 (norm. = 0.999656), norm. avg. (of 8) = 0.998449 fft 5: mflops = 66.6158 (norm. = 0.273113), norm. avg. (of 8) = 0.187494 fft 6: mflops = 63.8998 (norm. = 0.261977), norm. avg. (of 8) = 0.313166 fft 7: mflops = 12.6672 (norm. = 0.0519331), norm. avg. 
(of 8) = 0.0712433 fft 8: mflops = 38.4533 (norm. = 0.157652), norm. avg. (of 8) = 0.0937152 fft 9: mflops = 135.387 (norm. = 0.55506), norm. avg. (of 8) = 0.301399 fft 10: mflops = 29.4023 (norm. = 0.120544), norm. avg. (of 8) = 0.10119 fft 11: mflops = 10.06 (norm. = 0.0412442), norm. avg. (of 8) = 0.053893 Benchmarking for array size = 108: 0. CWP (min N) (N=110): elapsed time t=1.93719 s, 65536 iters, t-(init.)=1.83784 s t(norm)=0.0384402, mflops=130.072 1. CWP (best N) (N=112): elapsed time t=1.52647 s, 65536 iters, t-(init.)=1.42535 s t(norm)=0.0298126, mflops=167.714 2. FFTPACK (f2c): elapsed time t=1.39739 s, 16384 iters, t-(init.)=1.37299 s t(norm)=0.11487, mflops=43.5276 (err=4.7e-16) FFTW_MEASURE plan: (cost = 1.662817e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 3. FFTW: elapsed time t=1.10893 s, 65536 iters, t-(init.)=1.01123 s t(norm)=0.0211508, mflops=236.398 (err=3.7e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.10783 s, 65536 iters, t-(init.)=1.01035 s t(norm)=0.0211325, mflops=236.603 (err=3.7e-16) 5. Frigo-old: elapsed time t=1.98063 s, 16384 iters, t-(init.)=1.95625 s t(norm)=0.163668, mflops=30.5497 (err=5.5e-16) 6. GSL: elapsed time t=1.36631 s, 32768 iters, t-(init.)=1.31754 s t(norm)=0.0551155, mflops=90.7186 (err=4.7e-16) 7. NAPACK (f2c): elapsed time t=1.50806 s, 8192 iters, t-(init.)=1.49588 s t(norm)=0.250303, mflops=19.9758 (err=2.7e-15) 8. Nielsen: elapsed time t=1.2804 s, 8192 iters, t-(init.)=1.26816 s t(norm)=0.212198, mflops=23.5628 (err=1.1e-15) 9. Singleton (f2c): elapsed time t=1.27346 s, 32768 iters, t-(init.)=1.22473 s t(norm)=0.0512329, mflops=97.5935 (err=5.1e-16) 10. Temperton (f2c): elapsed time t=1.12281 s, 8192 iters, t-(init.)=1.11069 s t(norm)=0.18585, mflops=26.9034 (err=3.8e-16) 11. 
Valkenburg: elapsed time t=1.40057 s, 4096 iters, t-(init.)=1.39436 s t(norm)=0.466631, mflops=10.7151 (err=5.2e-16) Top mflops for N=108 = 236.603 Normalized results and averages for N=108: fft 0: mflops = 130.072 (norm. = 0.549748), norm. avg. (of 9) = 0.429765 fft 1: mflops = 167.714 (norm. = 0.708842), norm. avg. (of 9) = 0.433879 fft 2: mflops = 43.5276 (norm. = 0.183969), norm. avg. (of 9) = 0.178025 fft 3: mflops = 236.398 (norm. = 0.999133), norm. avg. (of 9) = 0.998132 fft 4: mflops = 236.603 (norm. = 1), norm. avg. (of 9) = 0.998621 fft 5: mflops = 30.5497 (norm. = 0.129118), norm. avg. (of 9) = 0.181008 fft 6: mflops = 90.7186 (norm. = 0.383421), norm. avg. (of 9) = 0.320973 fft 7: mflops = 19.9758 (norm. = 0.0844276), norm. avg. (of 9) = 0.0727082 fft 8: mflops = 23.5628 (norm. = 0.0995882), norm. avg. (of 9) = 0.0943678 fft 9: mflops = 97.5935 (norm. = 0.412478), norm. avg. (of 9) = 0.313741 fft 10: mflops = 26.9034 (norm. = 0.113707), norm. avg. (of 9) = 0.102581 fft 11: mflops = 10.7151 (norm. = 0.0452873), norm. avg. (of 9) = 0.0529368 Benchmarking for array size = 210: 0. CWP (min N): elapsed time t=1.52397 s, 32768 iters, t-(init.)=1.4304 s t(norm)=0.026946, mflops=185.556 1. CWP (best N): elapsed time t=1.52414 s, 32768 iters, t-(init.)=1.43038 s t(norm)=0.0269456, mflops=185.559 2. FFTPACK (f2c): elapsed time t=1.05333 s, 4096 iters, t-(init.)=1.0416 s t(norm)=0.156974, mflops=31.8523 (err=5.7e-16) FFTW_MEASURE plan: (cost = 4.324756e-05) FFTW_TWIDDLE 3 FFTW_TWIDDLE 7 FFTW_NOTW 10 3. FFTW: elapsed time t=1.45155 s, 32768 iters, t-(init.)=1.35782 s t(norm)=0.0255787, mflops=195.475 (err=4.6e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.51671 s, 32768 iters, t-(init.)=1.42281 s t(norm)=0.0268031, mflops=186.546 (err=4.6e-16) 5. Frigo-old: elapsed time t=1.02999 s, 4096 iters, t-(init.)=1.01821 s t(norm)=0.153449, mflops=32.5841 (err=5.8e-16) 6. 
GSL: elapsed time t=1.0459 s, 8192 iters, t-(init.)=1.02251 s t(norm)=0.0770488, mflops=64.894 (err=5.3e-16) 7. NAPACK (f2c): elapsed time t=1.61036 s, 2048 iters, t-(init.)=1.60436 s t(norm)=0.48357, mflops=10.3398 (err=1.4e-14) 8. Nielsen: elapsed time t=1.05543 s, 4096 iters, t-(init.)=1.04373 s t(norm)=0.157295, mflops=31.7874 (err=7.6e-15) 9. Singleton (f2c): elapsed time t=1.71184 s, 16384 iters, t-(init.)=1.66502 s t(norm)=0.0627315, mflops=79.7047 (err=6.7e-16) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.92001 s, 2048 iters, t-(init.)=1.91395 s t(norm)=0.576884, mflops=8.66726 (err=6.5e-16) Top mflops for N=210 = 195.475 Normalized results and averages for N=210: fft 0: mflops = 185.556 (norm. = 0.949258), norm. avg. (of 10) = 0.481715 fft 1: mflops = 185.559 (norm. = 0.949272), norm. avg. (of 10) = 0.485418 fft 2: mflops = 31.8523 (norm. = 0.162948), norm. avg. (of 10) = 0.176517 fft 3: mflops = 195.475 (norm. = 1), norm. avg. (of 10) = 0.998319 fft 4: mflops = 186.546 (norm. = 0.95432), norm. avg. (of 10) = 0.994191 fft 5: mflops = 32.5841 (norm. = 0.166692), norm. avg. (of 10) = 0.179576 fft 6: mflops = 64.894 (norm. = 0.331981), norm. avg. (of 10) = 0.322073 fft 7: mflops = 10.3398 (norm. = 0.0528956), norm. avg. (of 10) = 0.070727 fft 8: mflops = 31.7874 (norm. = 0.162616), norm. avg. (of 10) = 0.101193 fft 9: mflops = 79.7047 (norm. = 0.407749), norm. avg. (of 10) = 0.323141 fft 10: mflops = -1 (norm. = -0.00511574), norm. avg. (of 9) = 0.102581 fft 11: mflops = 8.66726 (norm. = 0.0443394), norm. avg. (of 10) = 0.0520771 Benchmarking for array size = 504: 0. CWP (min N): elapsed time t=1.95731 s, 16384 iters, t-(init.)=1.84552 s t(norm)=0.0248957, mflops=200.838 1. CWP (best N): elapsed time t=1.9566 s, 16384 iters, t-(init.)=1.84485 s t(norm)=0.0248867, mflops=200.911 2. 
FFTPACK (f2c): elapsed time t=1.49771 s, 2048 iters, t-(init.)=1.48369 s t(norm)=0.160118, mflops=31.227 (err=9.8e-16) FFTW_MEASURE plan: (cost = 9.970117e-05) FFTW_TWIDDLE 6 FFTW_TWIDDLE 7 FFTW_NOTW 12 3. FFTW: elapsed time t=1.65463 s, 16384 iters, t-(init.)=1.54302 s t(norm)=0.020815, mflops=240.212 (err=8.7e-16) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.907 s, 16384 iters, t-(init.)=1.79549 s t(norm)=0.0242207, mflops=206.435 (err=8.8e-16) 5. Frigo-old: elapsed time t=1.22527 s, 2048 iters, t-(init.)=1.21125 s t(norm)=0.130716, mflops=38.2507 (err=1.0e-15) 6. GSL: elapsed time t=1.08879 s, 4096 iters, t-(init.)=1.0609 s t(norm)=0.0572455, mflops=87.3431 (err=8.9e-16) 7. NAPACK (f2c): elapsed time t=1.71526 s, 1024 iters, t-(init.)=1.70819 s t(norm)=0.368689, mflops=13.5616 (err=4.2e-14) 8. Nielsen: elapsed time t=1.54258 s, 2048 iters, t-(init.)=1.52858 s t(norm)=0.164961, mflops=30.3102 (err=5.8e-15) 9. Singleton (f2c): elapsed time t=1.00171 s, 4096 iters, t-(init.)=0.973776 s t(norm)=0.0525441, mflops=95.1582 (err=1.3e-15) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.24352 s, 512 iters, t-(init.)=1.24004 s t(norm)=0.535292, mflops=9.3407 (err=1.0e-15) Top mflops for N=504 = 240.212 Normalized results and averages for N=504: fft 0: mflops = 200.838 (norm. = 0.836087), norm. avg. (of 11) = 0.51393 fft 1: mflops = 200.911 (norm. = 0.83639), norm. avg. (of 11) = 0.517325 fft 2: mflops = 31.227 (norm. = 0.129998), norm. avg. (of 11) = 0.172288 fft 3: mflops = 240.212 (norm. = 1), norm. avg. (of 11) = 0.998472 fft 4: mflops = 206.435 (norm. = 0.859387), norm. avg. (of 11) = 0.981936 fft 5: mflops = 38.2507 (norm. = 0.159237), norm. avg. (of 11) = 0.177727 fft 6: mflops = 87.3431 (norm. = 0.363609), norm. avg. (of 11) = 0.325849 fft 7: mflops = 13.5616 (norm. = 0.0564566), norm. avg. (of 11) = 0.0694297 fft 8: mflops = 30.3102 (norm. 
= 0.126181), norm. avg. (of 11) = 0.103464 fft 9: mflops = 95.1582 (norm. = 0.396143), norm. avg. (of 11) = 0.329778 fft 10: mflops = -1 (norm. = -0.00416299), norm. avg. (of 9) = 0.102581 fft 11: mflops = 9.3407 (norm. = 0.0388852), norm. avg. (of 11) = 0.0508778 Benchmarking for array size = 1000: 0. CWP (min N) (N=1001): elapsed time t=1.6222 s, 4096 iters, t-(init.)=1.56689 s t(norm)=0.0383854, mflops=130.258 1. CWP (best N) (N=1008): elapsed time t=1.128 s, 4096 iters, t-(init.)=1.07231 s t(norm)=0.0262693, mflops=190.336 2. FFTPACK (f2c): elapsed time t=1.80497 s, 1024 iters, t-(init.)=1.79111 s t(norm)=0.175514, mflops=28.4878 (err=3.1e-15) FFTW_MEASURE plan: (cost = 4.209375e-04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 3. FFTW: elapsed time t=1.70089 s, 4096 iters, t-(init.)=1.64561 s t(norm)=0.040314, mflops=124.026 (err=3.1e-15) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 4. FFTW_ESTIMATE: elapsed time t=1.70503 s, 4096 iters, t-(init.)=1.64975 s t(norm)=0.0404155, mflops=123.715 (err=3.1e-15) 5. Frigo-old: elapsed time t=1.02804 s, 512 iters, t-(init.)=1.02092 s t(norm)=0.200082, mflops=24.9897 (err=3.1e-15) 6. GSL: elapsed time t=1.41077 s, 1024 iters, t-(init.)=1.39661 s t(norm)=0.136856, mflops=36.5347 (err=3.1e-15) 7. NAPACK (f2c): elapsed time t=1.16919 s, 256 iters, t-(init.)=1.16551 s t(norm)=0.45684, mflops=10.9447 (err=1.8e-14) 8. Nielsen: elapsed time t=1.22507 s, 1024 iters, t-(init.)=1.21109 s t(norm)=0.118676, mflops=42.1314 (err=1.5e-14) 9. Singleton (f2c): elapsed time t=1.5696 s, 4096 iters, t-(init.)=1.51439 s t(norm)=0.0370993, mflops=134.774 (err=4.7e-15) 10. Temperton (f2c): elapsed time t=1.87226 s, 1024 iters, t-(init.)=1.85839 s t(norm)=0.182107, mflops=27.4564 (err=3.0e-15) 11. 
Valkenburg: elapsed time t=1.53339 s, 256 iters, t-(init.)=1.52985 s t(norm)=0.599648, mflops=8.33822 (err=3.0e-15) Top mflops for N=1000 = 190.336 Normalized results and averages for N=1000: fft 0: mflops = 130.258 (norm. = 0.684356), norm. avg. (of 12) = 0.528133 fft 1: mflops = 190.336 (norm. = 1), norm. avg. (of 12) = 0.557548 fft 2: mflops = 28.4878 (norm. = 0.149671), norm. avg. (of 12) = 0.170404 fft 3: mflops = 124.026 (norm. = 0.651617), norm. avg. (of 12) = 0.969567 fft 4: mflops = 123.715 (norm. = 0.649981), norm. avg. (of 12) = 0.954273 fft 5: mflops = 24.9897 (norm. = 0.131292), norm. avg. (of 12) = 0.173858 fft 6: mflops = 36.5347 (norm. = 0.191948), norm. avg. (of 12) = 0.314691 fft 7: mflops = 10.9447 (norm. = 0.0575022), norm. avg. (of 12) = 0.0684357 fft 8: mflops = 42.1314 (norm. = 0.221353), norm. avg. (of 12) = 0.113288 fft 9: mflops = 134.774 (norm. = 0.708081), norm. avg. (of 12) = 0.361303 fft 10: mflops = 27.4564 (norm. = 0.144252), norm. avg. (of 10) = 0.106748 fft 11: mflops = 8.33822 (norm. = 0.0438079), norm. avg. (of 12) = 0.0502887 Benchmarking for array size = 1960: 0. CWP (min N) (N=1980): elapsed time t=1.417 s, 2048 iters, t-(init.)=1.36015 s t(norm)=0.0309825, mflops=161.381 1. CWP (best N) (N=1980): elapsed time t=1.41847 s, 2048 iters, t-(init.)=1.36128 s t(norm)=0.0310082, mflops=161.248 2. FFTPACK (f2c): elapsed time t=1.45516 s, 128 iters, t-(init.)=1.4512 s t(norm)=0.528904, mflops=9.45352 (err=1.5e-15) FFTW_MEASURE plan: (cost = 1.075031e-03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 8 3. FFTW: elapsed time t=1.15936 s, 512 iters, t-(init.)=1.14468 s t(norm)=0.104298, mflops=47.9398 (err=1.5e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.12935 s, 512 iters, t-(init.)=1.11468 s t(norm)=0.101564, mflops=49.2298 (err=1.5e-15) 5. 
Frigo-old: elapsed time t=1.45318 s, 256 iters, t-(init.)=1.44604 s t(norm)=0.263512, mflops=18.9745 (err=1.5e-15) 6. GSL: elapsed time t=1.6783 s, 256 iters, t-(init.)=1.67072 s t(norm)=0.304455, mflops=16.4228 (err=1.6e-15) 7. NAPACK (f2c): elapsed time t=1.10964 s, 64 iters, t-(init.)=1.10752 s t(norm)=0.807296, mflops=6.19351 (err=1.3e-13) 8. Nielsen: elapsed time t=1.23884 s, 256 iters, t-(init.)=1.23112 s t(norm)=0.224346, mflops=22.287 (err=1.7e-14) 9. Singleton (f2c): elapsed time t=1.18132 s, 1024 iters, t-(init.)=1.15283 s t(norm)=0.0525202, mflops=95.2014 (err=2.3e-15) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.11311 s, 64 iters, t-(init.)=1.11093 s t(norm)=0.809779, mflops=6.17453 (err=1.4e-15) Top mflops for N=1960 = 161.381 Normalized results and averages for N=1960: fft 0: mflops = 161.381 (norm. = 1), norm. avg. (of 13) = 0.56443 fft 1: mflops = 161.248 (norm. = 0.99917), norm. avg. (of 13) = 0.591518 fft 2: mflops = 9.45352 (norm. = 0.0585787), norm. avg. (of 13) = 0.161802 fft 3: mflops = 47.9398 (norm. = 0.297059), norm. avg. (of 13) = 0.917836 fft 4: mflops = 49.2298 (norm. = 0.305052), norm. avg. (of 13) = 0.904333 fft 5: mflops = 18.9745 (norm. = 0.117575), norm. avg. (of 13) = 0.169528 fft 6: mflops = 16.4228 (norm. = 0.101764), norm. avg. (of 13) = 0.298312 fft 7: mflops = 6.19351 (norm. = 0.0383781), norm. avg. (of 13) = 0.0661236 fft 8: mflops = 22.287 (norm. = 0.138101), norm. avg. (of 13) = 0.115197 fft 9: mflops = 95.2014 (norm. = 0.589915), norm. avg. (of 13) = 0.378889 fft 10: mflops = -1 (norm. = -0.0061965), norm. avg. (of 10) = 0.106748 fft 11: mflops = 6.17453 (norm. = 0.0382604), norm. avg. (of 13) = 0.0493634 Benchmarking for array size = 4725: 0. CWP (min N) (N=5005): elapsed time t=1.92362 s, 512 iters, t-(init.)=1.65736 s t(norm)=0.0561266, mflops=89.0843 1. CWP (best N) (N=5040): elapsed time t=1.77132 s, 512 iters, t-(init.)=1.50358 s t(norm)=0.0509188, mflops=98.1955 2. 
FFTPACK (f2c): elapsed time t=1.68202 s, 64 iters, t-(init.)=1.6498 s t(norm)=0.446964, mflops=11.1866 (err=2.4e-15) FFTW_MEASURE plan: (cost = 3.035312e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.55389 s, 256 iters, t-(init.)=1.42736 s t(norm)=0.0966749, mflops=51.7197 (err=2.4e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.74585 s, 256 iters, t-(init.)=1.61917 s t(norm)=0.109667, mflops=45.5927 (err=2.3e-15) 5. Frigo-old: elapsed time t=1.45125 s, 64 iters, t-(init.)=1.4189 s t(norm)=0.384409, mflops=13.007 (err=2.3e-15) 6. GSL: elapsed time t=1.01054 s, 64 iters, t-(init.)=0.978229 s t(norm)=0.265022, mflops=18.8664 (err=2.4e-15) 7. NAPACK (f2c): elapsed time t=1.44452 s, 32 iters, t-(init.)=1.42781 s t(norm)=0.773645, mflops=6.46291 (err=3.5e-13) 8. Nielsen: elapsed time t=1.07979 s, 64 iters, t-(init.)=1.04708 s t(norm)=0.283676, mflops=17.6258 (err=4.4e-14) 9. Singleton (f2c): elapsed time t=1.85429 s, 256 iters, t-(init.)=1.7289 s t(norm)=0.117098, mflops=42.6991 (err=3.3e-15) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.4547 s, 32 iters, t-(init.)=1.43795 s t(norm)=0.779139, mflops=6.41734 (err=2.3e-15) Top mflops for N=4725 = 98.1955 Normalized results and averages for N=4725: fft 0: mflops = 89.0843 (norm. = 0.907213), norm. avg. (of 14) = 0.588915 fft 1: mflops = 98.1955 (norm. = 1), norm. avg. (of 14) = 0.620696 fft 2: mflops = 11.1866 (norm. = 0.113921), norm. avg. (of 14) = 0.158382 fft 3: mflops = 51.7197 (norm. = 0.526702), norm. avg. (of 14) = 0.889898 fft 4: mflops = 45.5927 (norm. = 0.464306), norm. avg. (of 14) = 0.872903 fft 5: mflops = 13.007 (norm. = 0.13246), norm. avg. (of 14) = 0.166881 fft 6: mflops = 18.8664 (norm. = 0.192131), norm. avg. (of 14) = 0.290727 fft 7: mflops = 6.46291 (norm. = 0.0658168), norm. avg. 
(of 14) = 0.0661017 fft 8: mflops = 17.6258 (norm. = 0.179497), norm. avg. (of 14) = 0.11979 fft 9: mflops = 42.6991 (norm. = 0.434838), norm. avg. (of 14) = 0.382885 fft 10: mflops = -1 (norm. = -0.0101838), norm. avg. (of 10) = 0.106748 fft 11: mflops = 6.41734 (norm. = 0.0653527), norm. avg. (of 14) = 0.0505055 Benchmarking for array size = 10368: 0. CWP (min N) (N=10920): elapsed time t=1.18352 s, 128 iters, t-(init.)=1.03827 s t(norm)=0.0586483, mflops=85.254 1. CWP (best N) (N=11088): elapsed time t=1.13463 s, 128 iters, t-(init.)=0.987191 s t(norm)=0.0557629, mflops=89.6654 2. FFTPACK (f2c): elapsed time t=1.42547 s, 32 iters, t-(init.)=1.38948 s t(norm)=0.313947, mflops=15.9263 (err=4.7e-15) FFTW_MEASURE plan: (cost = 7.500750e-03) FFTW_TWIDDLE 64 FFTW_TWIDDLE 3 FFTW_TWIDDLE 6 FFTW_NOTW 9 3. FFTW: elapsed time t=1.83722 s, 128 iters, t-(init.)=1.6965 s t(norm)=0.0958291, mflops=52.1762 (err=4.7e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 4. FFTW_ESTIMATE: elapsed time t=1.94085 s, 128 iters, t-(init.)=1.80013 s t(norm)=0.101683, mflops=49.1724 (err=4.7e-15) 5. Frigo-old: elapsed time t=1.26221 s, 32 iters, t-(init.)=1.22598 s t(norm)=0.277005, mflops=18.0502 (err=4.8e-15) 6. GSL: elapsed time t=1.04635 s, 32 iters, t-(init.)=1.00927 s t(norm)=0.228041, mflops=21.9259 (err=4.7e-15) 7. NAPACK (f2c): elapsed time t=1.74355 s, 16 iters, t-(init.)=1.72363 s t(norm)=0.778892, mflops=6.41938 (err=7.8e-14) 8. Nielsen: elapsed time t=1.34677 s, 32 iters, t-(init.)=1.30948 s t(norm)=0.295871, mflops=16.8992 (err=1.1e-14) 9. Singleton (f2c): elapsed time t=1.34292 s, 64 iters, t-(init.)=1.27375 s t(norm)=0.1439, mflops=34.7464 (err=6.7e-15) 10. Temperton (f2c): elapsed time t=1.11571 s, 32 iters, t-(init.)=1.08118 s t(norm)=0.244289, mflops=20.4676 (err=4.7e-15) 11. 
Valkenburg: elapsed time t=1.91636 s, 16 iters, t-(init.)=1.89652 s t(norm)=0.857021, mflops=5.83417 (err=4.7e-15) Top mflops for N=10368 = 89.6654 Normalized results and averages for N=10368: fft 0: mflops = 85.254 (norm. = 0.950802), norm. avg. (of 15) = 0.61304 fft 1: mflops = 89.6654 (norm. = 1), norm. avg. (of 15) = 0.645983 fft 2: mflops = 15.9263 (norm. = 0.177619), norm. avg. (of 15) = 0.159664 fft 3: mflops = 52.1762 (norm. = 0.581899), norm. avg. (of 15) = 0.869364 fft 4: mflops = 49.1724 (norm. = 0.548399), norm. avg. (of 15) = 0.851269 fft 5: mflops = 18.0502 (norm. = 0.201307), norm. avg. (of 15) = 0.169176 fft 6: mflops = 21.9259 (norm. = 0.24453), norm. avg. (of 15) = 0.287648 fft 7: mflops = 6.41938 (norm. = 0.0715926), norm. avg. (of 15) = 0.0664677 fft 8: mflops = 16.8992 (norm. = 0.18847), norm. avg. (of 15) = 0.124368 fft 9: mflops = 34.7464 (norm. = 0.387512), norm. avg. (of 15) = 0.383194 fft 10: mflops = 20.4676 (norm. = 0.228266), norm. avg. (of 11) = 0.117795 fft 11: mflops = 5.83417 (norm. = 0.065066), norm. avg. (of 15) = 0.0514762 Benchmarking for array size = 27000: 0. CWP (min N) (N=27720): elapsed time t=1.59108 s, 64 iters, t-(init.)=1.40171 s t(norm)=0.0551046, mflops=90.7366 1. CWP (best N) (N=27720): elapsed time t=1.59115 s, 64 iters, t-(init.)=1.40227 s t(norm)=0.0551266, mflops=90.7004 2. FFTPACK (f2c): elapsed time t=1.98281 s, 16 iters, t-(init.)=1.93265 s t(norm)=0.303908, mflops=16.4523 (err=7.3e-15) FFTW_MEASURE plan: (cost = 3.443300e-02) FFTW_TWIDDLE 5 FFTW_TWIDDLE 10 FFTW_TWIDDLE 5 FFTW_TWIDDLE 9 FFTW_NOTW 12 3. FFTW: elapsed time t=1.60662 s, 32 iters, t-(init.)=1.50694 s t(norm)=0.118483, mflops=42.2002 (err=7.3e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.43935 s, 32 iters, t-(init.)=1.34024 s t(norm)=0.105376, mflops=47.4493 (err=7.3e-15) 5. 
Frigo-old: elapsed time t=1.26034 s, 8 iters, t-(init.)=1.23058 s t(norm)=0.387015, mflops=12.9194 (err=7.3e-15) 6. GSL: elapsed time t=1.54265 s, 16 iters, t-(init.)=1.48951 s t(norm)=0.234224, mflops=21.3471 (err=7.3e-15) 7. NAPACK (f2c): elapsed time t=1.30235 s, 4 iters, t-(init.)=1.28361 s t(norm)=0.807389, mflops=6.1928 (err=1.0e-12) 8. Nielsen: elapsed time t=1.73003 s, 16 iters, t-(init.)=1.67667 s t(norm)=0.263656, mflops=18.9641 (err=2.0e-13) 9. Singleton (f2c): elapsed time t=1.92778 s, 32 iters, t-(init.)=1.83599 s t(norm)=0.144354, mflops=34.6371 (err=1.1e-14) 10. Temperton (f2c): elapsed time t=1.62031 s, 16 iters, t-(init.)=1.57434 s t(norm)=0.247563, mflops=20.1969 (err=7.3e-15) 11. Valkenburg: elapsed time t=1.50861 s, 4 iters, t-(init.)=1.48985 s t(norm)=0.937111, mflops=5.33555 (err=7.3e-15) Top mflops for N=27000 = 90.7366 Normalized results and averages for N=27000: fft 0: mflops = 90.7366 (norm. = 1), norm. avg. (of 16) = 0.637225 fft 1: mflops = 90.7004 (norm. = 0.999601), norm. avg. (of 16) = 0.668084 fft 2: mflops = 16.4523 (norm. = 0.18132), norm. avg. (of 16) = 0.161018 fft 3: mflops = 42.2002 (norm. = 0.465085), norm. avg. (of 16) = 0.844097 fft 4: mflops = 47.4493 (norm. = 0.522934), norm. avg. (of 16) = 0.830748 fft 5: mflops = 12.9194 (norm. = 0.142383), norm. avg. (of 16) = 0.167501 fft 6: mflops = 21.3471 (norm. = 0.235265), norm. avg. (of 16) = 0.284374 fft 7: mflops = 6.1928 (norm. = 0.0682503), norm. avg. (of 16) = 0.0665791 fft 8: mflops = 18.9641 (norm. = 0.209002), norm. avg. (of 16) = 0.129658 fft 9: mflops = 34.6371 (norm. = 0.381732), norm. avg. (of 16) = 0.383102 fft 10: mflops = 20.1969 (norm. = 0.222588), norm. avg. (of 12) = 0.126528 fft 11: mflops = 5.33555 (norm. = 0.0588026), norm. avg. (of 16) = 0.0519341 Benchmarking for array size = 75600: 0. CWP (min N) (N=80080): elapsed time t=1.95674 s, 16 iters, t-(init.)=1.68523 s t(norm)=0.0859686, mflops=58.1608 1. 
CWP (best N) (N=80080): elapsed time t=1.95646 s, 16 iters, t-(init.)=1.68464 s t(norm)=0.0859382, mflops=58.1813 2. FFTPACK (f2c): elapsed time t=1.20094 s, 2 iters, t-(init.)=1.15829 s t(norm)=0.472703, mflops=10.5775 (err=9.4e-15) FFTW_MEASURE plan: (cost = 1.598900e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.27106 s, 8 iters, t-(init.)=1.14401 s t(norm)=0.116719, mflops=42.838 (err=9.4e-15) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.32405 s, 8 iters, t-(init.)=1.19727 s t(norm)=0.122152, mflops=40.9325 (err=9.4e-15) 5. Frigo-old: elapsed time t=1.08003 s, 2 iters, t-(init.)=1.03772 s t(norm)=0.423495, mflops=11.8065 (err=9.4e-15) 6. GSL: elapsed time t=1.57062 s, 4 iters, t-(init.)=1.49994 s t(norm)=0.306066, mflops=16.3364 (err=9.4e-15) 7. NAPACK (f2c): elapsed time t=1.05234 s, 1 iters, t-(init.)=1.02371 s t(norm)=0.835561, mflops=5.98401 (err=5.1e-12) 8. Nielsen: elapsed time t=1.66178 s, 4 iters, t-(init.)=1.59085 s t(norm)=0.324615, mflops=15.4029 (err=4.7e-13) 9. Singleton (f2c): elapsed time t=1.12948 s, 4 iters, t-(init.)=1.07508 s t(norm)=0.219372, mflops=22.7923 (err=1.3e-14) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.34723 s, 1 iters, t-(init.)=1.31864 s t(norm)=1.07628, mflops=4.64562 (err=9.5e-15) Top mflops for N=75600 = 58.1813 Normalized results and averages for N=75600: fft 0: mflops = 58.1608 (norm. = 0.999646), norm. avg. (of 17) = 0.658544 fft 1: mflops = 58.1813 (norm. = 1), norm. avg. (of 17) = 0.687608 fft 2: mflops = 10.5775 (norm. = 0.181802), norm. avg. (of 17) = 0.16224 fft 3: mflops = 42.838 (norm. = 0.736285), norm. avg. (of 17) = 0.837755 fft 4: mflops = 40.9325 (norm. = 0.703534), norm. avg. (of 17) = 0.823265 fft 5: mflops = 11.8065 (norm. = 0.202926), norm. avg. (of 17) = 0.169585 fft 6: mflops = 16.3364 (norm. 
= 0.280783), norm. avg. (of 17) = 0.284163 fft 7: mflops = 5.98401 (norm. = 0.102851), norm. avg. (of 17) = 0.0687128 fft 8: mflops = 15.4029 (norm. = 0.264739), norm. avg. (of 17) = 0.137604 fft 9: mflops = 22.7923 (norm. = 0.391747), norm. avg. (of 17) = 0.383611 fft 10: mflops = -1 (norm. = -0.0171876), norm. avg. (of 12) = 0.126528 fft 11: mflops = 4.64562 (norm. = 0.0798473), norm. avg. (of 17) = 0.0535761 Benchmarking for array size = 165375: 0. CWP (min N) (N=180180): elapsed time t=1.00758 s, 2 iters, t-(init.)=0.875499 s t(norm)=0.152694, mflops=32.7452 1. CWP (best N) (N=180180): elapsed time t=1.00797 s, 2 iters, t-(init.)=0.875882 s t(norm)=0.152761, mflops=32.7309 2. FFTPACK (f2c): elapsed time t=2.30624 s, 1 iters, t-(init.)=2.24337 s t(norm)=0.782523, mflops=6.38959 (err=3.7e-14) FFTW_MEASURE plan: (cost = 4.448880e-01) FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.7811 s, 4 iters, t-(init.)=1.52938 s t(norm)=0.133368, mflops=37.4901 (err=3.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.88298 s, 4 iters, t-(init.)=1.63099 s t(norm)=0.142229, mflops=35.1546 (err=3.7e-14) 5. Frigo-old: elapsed time t=1.90203 s, 1 iters, t-(init.)=1.83914 s t(norm)=0.641523, mflops=7.79395 (err=3.7e-14) 6. GSL: elapsed time t=1.83825 s, 2 iters, t-(init.)=1.71237 s t(norm)=0.298651, mflops=16.742 (err=3.7e-14) 7. NAPACK (f2c): elapsed time t=2.90592 s, 1 iters, t-(init.)=2.84304 s t(norm)=0.991697, mflops=5.04186 (err=1.6e-11) 8. Nielsen: elapsed time t=1.35297 s, 1 iters, t-(init.)=1.28968 s t(norm)=0.449861, mflops=11.1145 (err=1.6e-12) 9. Singleton (f2c): elapsed time t=1.14204 s, 1 iters, t-(init.)=1.09258 s t(norm)=0.381108, mflops=13.1196 (err=5.6e-14) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. 
Valkenburg: elapsed time t=3.10914 s, 1 iters, t-(init.)=3.04627 s t(norm)=1.06259, mflops=4.70549 (err=3.6e-14) Top mflops for N=165375 = 37.4901 Normalized results and averages for N=165375: fft 0: mflops = 32.7452 (norm. = 0.873435), norm. avg. (of 18) = 0.670483 fft 1: mflops = 32.7309 (norm. = 0.873053), norm. avg. (of 18) = 0.697911 fft 2: mflops = 6.38959 (norm. = 0.170434), norm. avg. (of 18) = 0.162695 fft 3: mflops = 37.4901 (norm. = 1), norm. avg. (of 18) = 0.846769 fft 4: mflops = 35.1546 (norm. = 0.937703), norm. avg. (of 18) = 0.829623 fft 5: mflops = 7.79395 (norm. = 0.207893), norm. avg. (of 18) = 0.171713 fft 6: mflops = 16.742 (norm. = 0.44657), norm. avg. (of 18) = 0.293185 fft 7: mflops = 5.04186 (norm. = 0.134485), norm. avg. (of 18) = 0.0723668 fft 8: mflops = 11.1145 (norm. = 0.296466), norm. avg. (of 18) = 0.14643 fft 9: mflops = 13.1196 (norm. = 0.349949), norm. avg. (of 18) = 0.381741 fft 10: mflops = -1 (norm. = -0.0266737), norm. avg. (of 12) = 0.126528 fft 11: mflops = 4.70549 (norm. = 0.125513), norm. avg. (of 18) = 0.0575725 Benchmarking for array size = 362880: 0. CWP (min N) (N=720720): elapsed time t=2.25589 s, 1 iters, t-(init.)=1.9812 s t(norm)=0.29561, mflops=16.9142 1. CWP (best N) (N=720720): elapsed time t=2.25548 s, 1 iters, t-(init.)=1.9806 s t(norm)=0.29552, mflops=16.9193 2. FFTPACK (f2c): elapsed time t=4.00532 s, 1 iters, t-(init.)=3.86718 s t(norm)=0.577011, mflops=8.66534 (err=7.1e-14) FFTW_MEASURE plan: (cost = 9.322640e-01) FFTW_TWIDDLE 64 FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 9 3. FFTW: elapsed time t=1.78709 s, 2 iters, t-(init.)=1.5108 s t(norm)=0.112712, mflops=44.361 (err=7.1e-14) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 4. FFTW_ESTIMATE: elapsed time t=1.06237 s, 1 iters, t-(init.)=0.924149 s t(norm)=0.13789, mflops=36.2608 (err=7.1e-14) 5. 
Frigo-old: elapsed time t=3.60055 s, 1 iters, t-(init.)=3.46256 s t(norm)=0.516639, mflops=9.67794 (err=7.1e-14) 6. GSL: elapsed time t=2.00905 s, 1 iters, t-(init.)=1.87101 s t(norm)=0.279168, mflops=17.9104 (err=7.1e-14) 7. NAPACK (f2c): elapsed time t=6.44338 s, 1 iters, t-(init.)=6.30524 s t(norm)=0.940788, mflops=5.31469 (err=3.4e-11) 8. Nielsen: elapsed time t=3.95461 s, 1 iters, t-(init.)=3.81643 s t(norm)=0.56944, mflops=8.78055 (err=3.5e-12) 9. Singleton (f2c): elapsed time t=3.38614 s, 1 iters, t-(init.)=3.26431 s t(norm)=0.487059, mflops=10.2657 (err=1.0e-13) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=8.96936 s, 1 iters, t-(init.)=8.83132 s t(norm)=1.3177, mflops=3.7945 (err=7.1e-14) Top mflops for N=362880 = 44.361 Normalized results and averages for N=362880: fft 0: mflops = 16.9142 (norm. = 0.381284), norm. avg. (of 19) = 0.655262 fft 1: mflops = 16.9193 (norm. = 0.381401), norm. avg. (of 19) = 0.681252 fft 2: mflops = 8.66534 (norm. = 0.195337), norm. avg. (of 19) = 0.164413 fft 3: mflops = 44.361 (norm. = 1), norm. avg. (of 19) = 0.854834 fft 4: mflops = 36.2608 (norm. = 0.817403), norm. avg. (of 19) = 0.828979 fft 5: mflops = 9.67794 (norm. = 0.218163), norm. avg. (of 19) = 0.174158 fft 6: mflops = 17.9104 (norm. = 0.403741), norm. avg. (of 19) = 0.299004 fft 7: mflops = 5.31469 (norm. = 0.119806), norm. avg. (of 19) = 0.0748636 fft 8: mflops = 8.78055 (norm. = 0.197934), norm. avg. (of 19) = 0.14914 fft 9: mflops = 10.2657 (norm. = 0.231413), norm. avg. (of 19) = 0.373829 fft 10: mflops = -1 (norm. = -0.0225423), norm. avg. (of 12) = 0.126528 fft 11: mflops = 3.7945 (norm. = 0.0855367), norm. avg. 
(of 19) = 0.0590443
------------------------------------------------------
@@@@ bench.3d.p2.log
Benchmarking for sizes:
4x4x4 (0.00128174 MB)
8x8x8 (0.00830078 MB)
16x16x16 (0.0633545 MB)
32x32x32 (0.501587 MB)
64x64x64 (4.00305 MB)
256x64x32 (8.01184 MB)
16x1024x64 (16.047 MB)
128x128x128 (32.006 MB)
512x128x64 (64.0236 MB)
Maximum array size N = 4194304

Benchmarking FFTs:
0. FFTW
1. HARM (f2c)
2. NR (C)
3. PDA (f2c)
4. Singleton (f2c)
5. Temperton (f2c)
Computing normalized averages (6 transforms).

Benchmarking for array size = 4x4x4 (power of 2):
0. FFTW: elapsed time t=1.20118 s, 131072 iters, t-(init.)=1.08355 s t(norm)=0.0215282, mflops=232.253 (err=1.9e-16)
1. Skipping fft (all dimensions must be > 4 for HARM).
2. NR (C): elapsed time t=1.62035 s, 65536 iters, t-(init.)=1.56161 s t(norm)=0.062053, mflops=80.5763 (err=2.3e-16)
3. PDA (f2c): elapsed time t=1.31995 s, 8192 iters, t-(init.)=1.31254 s t(norm)=0.417245, mflops=11.9834 (err=2.8e-16)
4. Singleton (f2c): elapsed time t=1.76467 s, 131072 iters, t-(init.)=1.64722 s t(norm)=0.0327273, mflops=152.778 (err=1.9e-16)
5. Temperton (f2c): elapsed time t=1.82877 s, 32768 iters, t-(init.)=1.79937 s t(norm)=0.143001, mflops=34.9648 (err=1.9e-16)
Top mflops for N=64 = 232.253
Normalized results and averages for N=64:
fft 0: mflops = 232.253 (norm. = 1), norm. avg. (of 1) = 1
fft 1: mflops = -1 (norm. = -0.00430564), norm. avg. (of 0) = -1
fft 2: mflops = 80.5763 (norm. = 0.346933), norm. avg. (of 1) = 0.346933
fft 3: mflops = 11.9834 (norm. = 0.0515962), norm. avg. (of 1) = 0.0515962
fft 4: mflops = 152.778 (norm. = 0.657806), norm. avg. (of 1) = 0.657806
fft 5: mflops = 34.9648 (norm. = 0.150546), norm. avg. (of 1) = 0.150546

Benchmarking for array size = 8x8x8 (power of 2):
0. FFTW: elapsed time t=1.48137 s, 16384 iters, t-(init.)=1.36785 s t(norm)=0.0181179, mflops=275.971 (err=3.8e-16)
1. HARM (f2c): elapsed time t=1.2079 s, 2048 iters, t-(init.)=1.19368 s t(norm)=0.126487, mflops=39.5296 (err=3.6e-16)
2.
NR (C): elapsed time t=1.77755 s, 8192 iters, t-(init.)=1.7209 s t(norm)=0.0455883, mflops=109.677 (err=2.9e-16) 3. PDA (f2c): elapsed time t=1.10929 s, 1024 iters, t-(init.)=1.10216 s t(norm)=0.233578, mflops=21.4061 (err=3.1e-16) 4. Singleton (f2c): elapsed time t=1.45924 s, 8192 iters, t-(init.)=1.40258 s t(norm)=0.0371557, mflops=134.569 (err=3.1e-16) 5. Temperton (f2c): elapsed time t=1.27192 s, 2048 iters, t-(init.)=1.2577 s t(norm)=0.133271, mflops=37.5176 (err=3.7e-16) Top mflops for N=512 = 275.971 Normalized results and averages for N=512: fft 0: mflops = 275.971 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 39.5296 (norm. = 0.143239), norm. avg. (of 1) = 0.143239 fft 2: mflops = 109.677 (norm. = 0.397424), norm. avg. (of 2) = 0.372179 fft 3: mflops = 21.4061 (norm. = 0.0775667), norm. avg. (of 2) = 0.0645814 fft 4: mflops = 134.569 (norm. = 0.487621), norm. avg. (of 2) = 0.572713 fft 5: mflops = 37.5176 (norm. = 0.135948), norm. avg. (of 2) = 0.143247 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.52411 s, 512 iters, t-(init.)=1.30656 s t(norm)=0.051918, mflops=96.3057 (err=4.1e-16) 1. HARM (f2c): elapsed time t=1.92861 s, 256 iters, t-(init.)=1.81981 s t(norm)=0.144625, mflops=34.5721 (err=4.0e-16) 2. NR (C): elapsed time t=1.51121 s, 128 iters, t-(init.)=1.45666 s t(norm)=0.231529, mflops=21.5956 (err=4.7e-16) 3. PDA (f2c): elapsed time t=1.38194 s, 128 iters, t-(init.)=1.32754 s t(norm)=0.211007, mflops=23.6959 (err=3.8e-16) 4. Singleton (f2c): elapsed time t=1.73482 s, 256 iters, t-(init.)=1.62601 s t(norm)=0.129224, mflops=38.6925 (err=4.7e-16) 5. Temperton (f2c): elapsed time t=1.91859 s, 256 iters, t-(init.)=1.80912 s t(norm)=0.143776, mflops=34.7764 (err=4.1e-16) Top mflops for N=4096 = 96.3057 Normalized results and averages for N=4096: fft 0: mflops = 96.3057 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 34.5721 (norm. = 0.358983), norm. avg. (of 2) = 0.251111 fft 2: mflops = 21.5956 (norm. 
= 0.22424), norm. avg. (of 3) = 0.322866 fft 3: mflops = 23.6959 (norm. = 0.246049), norm. avg. (of 3) = 0.125071 fft 4: mflops = 38.6925 (norm. = 0.401768), norm. avg. (of 3) = 0.515731 fft 5: mflops = 34.7764 (norm. = 0.361104), norm. avg. (of 3) = 0.215866 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.64453 s, 64 iters, t-(init.)=1.42001 s t(norm)=0.0451409, mflops=110.764 (err=4.8e-16) 1. HARM (f2c): elapsed time t=1.21382 s, 16 iters, t-(init.)=1.15726 s t(norm)=0.147153, mflops=33.9782 (err=4.8e-16) 2. NR (C): elapsed time t=1.9382 s, 16 iters, t-(init.)=1.88151 s t(norm)=0.239246, mflops=20.899 (err=6.0e-16) 3. PDA (f2c): elapsed time t=1.76856 s, 16 iters, t-(init.)=1.71238 s t(norm)=0.21774, mflops=22.9631 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.24806 s, 16 iters, t-(init.)=1.19178 s t(norm)=0.151543, mflops=32.9939 (err=4.9e-16) 5. Temperton (f2c): elapsed time t=1.40678 s, 16 iters, t-(init.)=1.3503 s t(norm)=0.171699, mflops=29.1207 (err=5.1e-16) Top mflops for N=32768 = 110.764 Normalized results and averages for N=32768: fft 0: mflops = 110.764 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 33.9782 (norm. = 0.306761), norm. avg. (of 3) = 0.269661 fft 2: mflops = 20.899 (norm. = 0.18868), norm. avg. (of 4) = 0.289319 fft 3: mflops = 22.9631 (norm. = 0.207315), norm. avg. (of 4) = 0.145632 fft 4: mflops = 32.9939 (norm. = 0.297875), norm. avg. (of 4) = 0.461267 fft 5: mflops = 29.1207 (norm. = 0.262907), norm. avg. (of 4) = 0.227626 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.43963 s, 2 iters, t-(init.)=1.2401 s t(norm)=0.131406, mflops=38.0499 (err=1.0e-15) 1. HARM (f2c): elapsed time t=1.27301 s, 1 iters, t-(init.)=1.1727 s t(norm)=0.248528, mflops=20.1184 (err=1.0e-15) 2. NR (C): elapsed time t=4.03336 s, 1 iters, t-(init.)=3.93345 s t(norm)=0.833607, mflops=5.99803 (err=1.0e-15) 3. 
PDA (f2c): elapsed time t=2.14715 s, 1 iters, t-(init.)=2.04745 s t(norm)=0.433911, mflops=11.5231 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=2.14032 s, 1 iters, t-(init.)=2.04046 s t(norm)=0.432429, mflops=11.5626 (err=1.4e-15) 5. Temperton (f2c): elapsed time t=1.33105 s, 1 iters, t-(init.)=1.23123 s t(norm)=0.260932, mflops=19.162 (err=9.9e-16) Top mflops for N=262144 = 38.0499 Normalized results and averages for N=262144: fft 0: mflops = 38.0499 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 20.1184 (norm. = 0.528738), norm. avg. (of 4) = 0.33443 fft 2: mflops = 5.99803 (norm. = 0.157636), norm. avg. (of 5) = 0.262982 fft 3: mflops = 11.5231 (norm. = 0.302841), norm. avg. (of 5) = 0.177074 fft 4: mflops = 11.5626 (norm. = 0.303879), norm. avg. (of 5) = 0.42979 fft 5: mflops = 19.162 (norm. = 0.503602), norm. avg. (of 5) = 0.282821 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.41894 s, 1 iters, t-(init.)=1.21945 s t(norm)=0.122417, mflops=40.844 (err=9.2e-16) 1. HARM (f2c): elapsed time t=2.77314 s, 1 iters, t-(init.)=2.57354 s t(norm)=0.25835, mflops=19.3536 (err=9.4e-16) 2. NR (C): elapsed time t=8.45347 s, 1 iters, t-(init.)=8.25355 s t(norm)=0.828547, mflops=6.03466 (err=9.6e-16) 3. PDA (f2c): elapsed time t=3.66232 s, 1 iters, t-(init.)=3.46258 s t(norm)=0.347597, mflops=14.3845 (err=8.8e-16) 4. Singleton (f2c): elapsed time t=4.71506 s, 1 iters, t-(init.)=4.51512 s t(norm)=0.453259, mflops=11.0312 (err=1.3e-15) 5. Temperton (f2c): elapsed time t=3.24126 s, 1 iters, t-(init.)=3.04164 s t(norm)=0.305341, mflops=16.3752 (err=9.2e-16) Top mflops for N=524288 = 40.844 Normalized results and averages for N=524288: fft 0: mflops = 40.844 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 19.3536 (norm. = 0.473843), norm. avg. (of 5) = 0.362313 fft 2: mflops = 6.03466 (norm. = 0.147749), norm. avg. (of 6) = 0.243777 fft 3: mflops = 14.3845 (norm. = 0.352181), norm. avg. (of 6) = 0.206258 fft 4: mflops = 11.0312 (norm. 
= 0.270082), norm. avg. (of 6) = 0.403172 fft 5: mflops = 16.3752 (norm. = 0.40092), norm. avg. (of 6) = 0.302504 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=3.13047 s, 1 iters, t-(init.)=2.73107 s t(norm)=0.130227, mflops=38.3944 (err=1.2e-15) 1. HARM (f2c): elapsed time t=5.69221 s, 1 iters, t-(init.)=5.2924 s t(norm)=0.252361, mflops=19.8129 (err=1.2e-15) 2. NR (C): elapsed time t=17.9102 s, 1 iters, t-(init.)=17.5106 s t(norm)=0.834971, mflops=5.98823 (err=1.3e-15) 3. PDA (f2c): elapsed time t=9.15976 s, 1 iters, t-(init.)=8.76031 s t(norm)=0.417724, mflops=11.9696 (err=1.2e-15) 4. Singleton (f2c): elapsed time t=9.55023 s, 1 iters, t-(init.)=9.1508 s t(norm)=0.436344, mflops=11.4588 (err=1.7e-15) 5. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 38.3944 Normalized results and averages for N=1048576: fft 0: mflops = 38.3944 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 19.8129 (norm. = 0.516035), norm. avg. (of 6) = 0.387933 fft 2: mflops = 5.98823 (norm. = 0.155966), norm. avg. (of 7) = 0.231233 fft 3: mflops = 11.9696 (norm. = 0.311755), norm. avg. (of 7) = 0.221329 fft 4: mflops = 11.4588 (norm. = 0.298451), norm. avg. (of 7) = 0.388212 fft 5: mflops = -1 (norm. = -0.0260455), norm. avg. (of 6) = 0.302504 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=6.09945 s, 1 iters, t-(init.)=5.30109 s t(norm)=0.120369, mflops=41.5388 (err=8.1e-16) 1. HARM (f2c): elapsed time t=12.7097 s, 1 iters, t-(init.)=11.9107 s t(norm)=0.27045, mflops=18.4877 (err=8.0e-16) 2. NR (C): elapsed time t=37.5111 s, 1 iters, t-(init.)=36.7118 s t(norm)=0.833597, mflops=5.9981 (err=8.7e-16) 3. PDA (f2c): elapsed time t=17.3148 s, 1 iters, t-(init.)=16.5157 s t(norm)=0.375014, mflops=13.3328 (err=7.8e-16) 4. Singleton (f2c): elapsed time t=27.7678 s, 1 iters, t-(init.)=26.9688 s t(norm)=0.612368, mflops=8.16502 (err=1.1e-15) 5. 
Temperton (f2c): elapsed time t=22.049 s, 1 iters, t-(init.)=21.2499 s t(norm)=0.482512, mflops=10.3624 (err=8.1e-16) Top mflops for N=2097152 = 41.5388 Normalized results and averages for N=2097152: fft 0: mflops = 41.5388 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 18.4877 (norm. = 0.44507), norm. avg. (of 7) = 0.396095 fft 2: mflops = 5.9981 (norm. = 0.144397), norm. avg. (of 8) = 0.220378 fft 3: mflops = 13.3328 (norm. = 0.320973), norm. avg. (of 8) = 0.233785 fft 4: mflops = 8.16502 (norm. = 0.196564), norm. avg. (of 8) = 0.364256 fft 5: mflops = 10.3624 (norm. = 0.249464), norm. avg. (of 7) = 0.294927 Benchmarking for array size = 512x128x64 (power of 2): 0. FFTW: elapsed time t=12.2254 s, 1 iters, t-(init.)=10.6279 s t(norm)=0.115177, mflops=43.4114 (err=1.0e-15) 1. HARM (f2c): elapsed time t=25.971 s, 1 iters, t-(init.)=24.3729 s t(norm)=0.264134, mflops=18.9298 (err=1.0e-15) 2. NR (C): elapsed time t=78.6124 s, 1 iters, t-(init.)=77.0142 s t(norm)=0.834619, mflops=5.99076 (err=1.0e-15) 3. PDA (f2c): elapsed time t=35.9879 s, 1 iters, t-(init.)=34.3897 s t(norm)=0.372688, mflops=13.416 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=48.8507 s, 1 iters, t-(init.)=47.2529 s t(norm)=0.51209, mflops=9.76391 (err=1.4e-15) 5. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=4194304 = 43.4114 Normalized results and averages for N=4194304: fft 0: mflops = 43.4114 (norm. = 1), norm. avg. (of 9) = 1 fft 1: mflops = 18.9298 (norm. = 0.436055), norm. avg. (of 8) = 0.40109 fft 2: mflops = 5.99076 (norm. = 0.138), norm. avg. (of 9) = 0.211225 fft 3: mflops = 13.416 (norm. = 0.309044), norm. avg. (of 9) = 0.242147 fft 4: mflops = 9.76391 (norm. = 0.224916), norm. avg. (of 9) = 0.348773 fft 5: mflops = -1 (norm. = -0.0230354), norm. avg. 
(of 7) = 0.294927
------------------------------------------------------
@@@@ bench.3d.np2.log
Benchmarking for sizes:
5x5x5 (0.0022583 MB)
6x6x6 (0.00369263 MB)
7x7x7 (0.00567627 MB)
9x9x9 (0.0116577 MB)
10x10x10 (0.0158386 MB)
11x11x11 (0.0209351 MB)
12x12x12 (0.0270386 MB)
13x13x13 (0.0342407 MB)
14x14x14 (0.0426331 MB)
15x15x15 (0.0523071 MB)
24x25x28 (0.257751 MB)
48x48x48 (1.68982 MB)
49x49x49 (1.79755 MB)
60x60x60 (3.29877 MB)
72x60x56 (3.69482 MB)
75x75x75 (6.44086 MB)
80x80x80 (7.81628 MB)
84x84x84 (9.04791 MB)
96x96x96 (13.5045 MB)
105x105x105 (17.6689 MB)
112x112x112 (21.4427 MB)
120x120x120 (26.3728 MB)
144x144x144 (45.5692 MB)
180x180x180 (88.9976 MB)
Maximum array size N = 5832000

Benchmarking FFTs:
0. FFTW
1. PDA (f2c)
2. Singleton (f2c)
3. Temperton (f2c)
Computing normalized averages (4 transforms).

Benchmarking for array size = 5x5x5:
0. FFTW: elapsed time t=1.84817 s, 65536 iters, t-(init.)=1.73578 s t(norm)=0.0304184, mflops=164.374 (err=2.4e-16)
1. PDA (f2c): elapsed time t=1.23694 s, 4096 iters, t-(init.)=1.2298 s t(norm)=0.344823, mflops=14.5002 (err=2.1e-16)
2. Singleton (f2c): elapsed time t=1.46623 s, 65536 iters, t-(init.)=1.35376 s t(norm)=0.0237237, mflops=210.76 (err=3.1e-16)
3. Temperton (f2c): elapsed time t=1.05447 s, 8192 iters, t-(init.)=1.04036 s t(norm)=0.145852, mflops=34.2814 (err=2.4e-16)
Top mflops for N=125 = 210.76
Normalized results and averages for N=125:
fft 0: mflops = 164.374 (norm. = 0.779913), norm. avg. (of 1) = 0.779913
fft 1: mflops = 14.5002 (norm. = 0.0687996), norm. avg. (of 1) = 0.0687996
fft 2: mflops = 210.76 (norm. = 1), norm. avg. (of 1) = 1
fft 3: mflops = 34.2814 (norm. = 0.162656), norm. avg. (of 1) = 0.162656

Benchmarking for array size = 6x6x6:
0. FFTW: elapsed time t=1.33657 s, 32768 iters, t-(init.)=1.24032 s t(norm)=0.0225972, mflops=221.267 (err=3.0e-16)
1. PDA (f2c): elapsed time t=1.07061 s, 2048 iters, t-(init.)=1.06453 s t(norm)=0.310312, mflops=16.1128 (err=3.7e-16)
2.
Singleton (f2c): elapsed time t=1.25053 s, 16384 iters, t-(init.)=1.20239 s t(norm)=0.0438123, mflops=114.123 (err=3.1e-16) 3. Temperton (f2c): elapsed time t=1.12846 s, 4096 iters, t-(init.)=1.11641 s t(norm)=0.162717, mflops=30.7282 (err=3.2e-16) Top mflops for N=216 = 221.267 Normalized results and averages for N=216: fft 0: mflops = 221.267 (norm. = 1), norm. avg. (of 2) = 0.889956 fft 1: mflops = 16.1128 (norm. = 0.0728208), norm. avg. (of 2) = 0.0708102 fft 2: mflops = 114.123 (norm. = 0.515772), norm. avg. (of 2) = 0.757886 fft 3: mflops = 30.7282 (norm. = 0.138874), norm. avg. (of 2) = 0.150765 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.58314 s, 16384 iters, t-(init.)=1.50708 s t(norm)=0.0318422, mflops=157.024 (err=4.0e-16) 1. PDA (f2c): elapsed time t=1.6965 s, 1024 iters, t-(init.)=1.69169 s t(norm)=0.571886, mflops=8.74301 (err=4.0e-16) 2. Singleton (f2c): elapsed time t=1.2144 s, 8192 iters, t-(init.)=1.1763 s t(norm)=0.0497068, mflops=100.59 (err=4.9e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 157.024 Normalized results and averages for N=343: fft 0: mflops = 157.024 (norm. = 1), norm. avg. (of 3) = 0.926638 fft 1: mflops = 8.74301 (norm. = 0.0556794), norm. avg. (of 3) = 0.0657666 fft 2: mflops = 100.59 (norm. = 0.640602), norm. avg. (of 3) = 0.718791 fft 3: mflops = -1 (norm. = -0.00636845), norm. avg. (of 2) = 0.150765 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.45857 s, 8192 iters, t-(init.)=1.37793 s t(norm)=0.0242626, mflops=206.078 (err=5.4e-16) 1. PDA (f2c): elapsed time t=1.70932 s, 1024 iters, t-(init.)=1.69923 s t(norm)=0.239362, mflops=20.8889 (err=5.2e-16) 2. Singleton (f2c): elapsed time t=1.00894 s, 4096 iters, t-(init.)=0.968633 s t(norm)=0.0341116, mflops=146.578 (err=4.9e-16) 3. 
Temperton (f2c): elapsed time t=1.90619 s, 2048 iters, t-(init.)=1.88594 s t(norm)=0.132831, mflops=37.6417 (err=5.8e-16) Top mflops for N=729 = 206.078 Normalized results and averages for N=729: fft 0: mflops = 206.078 (norm. = 1), norm. avg. (of 4) = 0.944978 fft 1: mflops = 20.8889 (norm. = 0.101364), norm. avg. (of 4) = 0.0746659 fft 2: mflops = 146.578 (norm. = 0.711274), norm. avg. (of 4) = 0.716912 fft 3: mflops = 37.6417 (norm. = 0.182658), norm. avg. (of 3) = 0.161396 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.9121 s, 8192 iters, t-(init.)=1.80173 s t(norm)=0.0220693, mflops=226.559 (err=3.8e-16) 1. PDA (f2c): elapsed time t=1.08365 s, 512 iters, t-(init.)=1.07672 s t(norm)=0.211018, mflops=23.6947 (err=4.2e-16) 2. Singleton (f2c): elapsed time t=1.47516 s, 4096 iters, t-(init.)=1.41994 s t(norm)=0.0347855, mflops=143.738 (err=4.4e-16) 3. Temperton (f2c): elapsed time t=1.48177 s, 1024 iters, t-(init.)=1.46791 s t(norm)=0.143843, mflops=34.7601 (err=3.6e-16) Top mflops for N=1000 = 226.559 Normalized results and averages for N=1000: fft 0: mflops = 226.559 (norm. = 1), norm. avg. (of 5) = 0.955983 fft 1: mflops = 23.6947 (norm. = 0.104585), norm. avg. (of 5) = 0.0806497 fft 2: mflops = 143.738 (norm. = 0.634438), norm. avg. (of 5) = 0.700417 fft 3: mflops = 34.7601 (norm. = 0.153426), norm. avg. (of 4) = 0.159403 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.21191 s, 2048 iters, t-(init.)=1.17515 s t(norm)=0.0415392, mflops=120.368 (err=4.0e-16) 1. PDA (f2c): elapsed time t=1.01466 s, 128 iters, t-(init.)=1.01235 s t(norm)=0.572556, mflops=8.73277 (err=4.8e-16) 2. Singleton (f2c): elapsed time t=1.52948 s, 2048 iters, t-(init.)=1.49258 s t(norm)=0.0527599, mflops=94.7689 (err=6.4e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 120.368 Normalized results and averages for N=1331: fft 0: mflops = 120.368 (norm. = 1), norm. avg. (of 6) = 0.963319 fft 1: mflops = 8.73277 (norm. 
= 0.0725505), norm. avg. (of 6) = 0.0792998 fft 2: mflops = 94.7689 (norm. = 0.787326), norm. avg. (of 6) = 0.714902 fft 3: mflops = -1 (norm. = -0.00830785), norm. avg. (of 4) = 0.159403 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.37785 s, 4096 iters, t-(init.)=1.28158 s t(norm)=0.0168359, mflops=296.984 (err=3.8e-16) 1. PDA (f2c): elapsed time t=1.86592 s, 512 iters, t-(init.)=1.85377 s t(norm)=0.194821, mflops=25.6645 (err=3.8e-16) 2. Singleton (f2c): elapsed time t=1.38761 s, 2048 iters, t-(init.)=1.33958 s t(norm)=0.0351957, mflops=142.063 (err=4.0e-16) 3. Temperton (f2c): elapsed time t=1.03476 s, 512 iters, t-(init.)=1.02275 s t(norm)=0.107486, mflops=46.5178 (err=3.8e-16) Top mflops for N=1728 = 296.984 Normalized results and averages for N=1728: fft 0: mflops = 296.984 (norm. = 1), norm. avg. (of 7) = 0.968559 fft 1: mflops = 25.6645 (norm. = 0.0864173), norm. avg. (of 7) = 0.0803166 fft 2: mflops = 142.063 (norm. = 0.478353), norm. avg. (of 7) = 0.681109 fft 3: mflops = 46.5178 (norm. = 0.156634), norm. avg. (of 5) = 0.158849 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.65219 s, 1024 iters, t-(init.)=1.55136 s t(norm)=0.0621167, mflops=80.4936 (err=4.1e-16) 1. PDA (f2c): elapsed time t=1.89602 s, 128 iters, t-(init.)=1.88306 s t(norm)=0.603184, mflops=8.28935 (err=7.2e-16) 2. Singleton (f2c): elapsed time t=1.63722 s, 1024 iters, t-(init.)=1.53645 s t(norm)=0.0615195, mflops=81.2751 (err=4.3e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 81.2751 Normalized results and averages for N=2197: fft 0: mflops = 80.4936 (norm. = 0.990385), norm. avg. (of 8) = 0.971287 fft 1: mflops = 8.28935 (norm. = 0.101991), norm. avg. (of 8) = 0.083026 fft 2: mflops = 81.2751 (norm. = 1), norm. avg. (of 8) = 0.72097 fft 3: mflops = -1 (norm. = -0.0123039), norm. avg. (of 5) = 0.158849 Benchmarking for array size = 14x14x14: 0. 
FFTW: elapsed time t=1.63207 s, 1024 iters, t-(init.)=1.34091 s t(norm)=0.0417801, mflops=119.674 (err=3.9e-16) 1. PDA (f2c): elapsed time t=1.46825 s, 128 iters, t-(init.)=1.43185 s t(norm)=0.35691, mflops=14.0091 (err=3.8e-16) 2. Singleton (f2c): elapsed time t=1.3915 s, 512 iters, t-(init.)=1.24601 s t(norm)=0.0776464, mflops=64.3945 (err=4.6e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 119.674 Normalized results and averages for N=2744: fft 0: mflops = 119.674 (norm. = 1), norm. avg. (of 9) = 0.974478 fft 1: mflops = 14.0091 (norm. = 0.117061), norm. avg. (of 9) = 0.0868076 fft 2: mflops = 64.3945 (norm. = 0.538082), norm. avg. (of 9) = 0.700649 fft 3: mflops = -1 (norm. = -0.00835602), norm. avg. (of 5) = 0.158849 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.96654 s, 1024 iters, t-(init.)=1.60787 s t(norm)=0.0396941, mflops=125.963 (err=4.6e-16) 1. PDA (f2c): elapsed time t=1.0028 s, 128 iters, t-(init.)=0.957923 s t(norm)=0.189188, mflops=26.4287 (err=4.5e-16) 2. Singleton (f2c): elapsed time t=1.7541 s, 512 iters, t-(init.)=1.57445 s t(norm)=0.0777377, mflops=64.3188 (err=4.8e-16) 3. Temperton (f2c): elapsed time t=1.47883 s, 256 iters, t-(init.)=1.38929 s t(norm)=0.137191, mflops=36.4455 (err=4.6e-16) Top mflops for N=3375 = 125.963 Normalized results and averages for N=3375: fft 0: mflops = 125.963 (norm. = 1), norm. avg. (of 10) = 0.97703 fft 1: mflops = 26.4287 (norm. = 0.209813), norm. avg. (of 10) = 0.0991081 fft 2: mflops = 64.3188 (norm. = 0.510616), norm. avg. (of 10) = 0.681646 fft 3: mflops = 36.4455 (norm. = 0.289334), norm. avg. (of 6) = 0.180597 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.68663 s, 128 iters, t-(init.)=1.46201 s t(norm)=0.0484376, mflops=103.226 (err=4.7e-16) 1. PDA (f2c): elapsed time t=1.65342 s, 32 iters, t-(init.)=1.59738 s t(norm)=0.21169, mflops=23.6195 (err=4.4e-16) 2. 
Singleton (f2c): elapsed time t=1.11087 s, 32 iters, t-(init.)=1.05429 s t(norm)=0.139718, mflops=35.7864 (err=5.6e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 103.226 Normalized results and averages for N=16800: fft 0: mflops = 103.226 (norm. = 1), norm. avg. (of 11) = 0.979118 fft 1: mflops = 23.6195 (norm. = 0.228814), norm. avg. (of 11) = 0.1109 fft 2: mflops = 35.7864 (norm. = 0.346681), norm. avg. (of 11) = 0.651195 fft 3: mflops = -1 (norm. = -0.00968753), norm. avg. (of 6) = 0.180597 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.50222 s, 8 iters, t-(init.)=1.21315 s t(norm)=0.081839, mflops=61.0956 (err=7.1e-16) 1. PDA (f2c): elapsed time t=1.78801 s, 4 iters, t-(init.)=1.65107 s t(norm)=0.222762, mflops=22.4455 (err=7.1e-16) 2. Singleton (f2c): elapsed time t=1.21064 s, 2 iters, t-(init.)=1.14094 s t(norm)=0.30787, mflops=16.2406 (err=8.2e-16) 3. Temperton (f2c): elapsed time t=1.59164 s, 4 iters, t-(init.)=1.44982 s t(norm)=0.195609, mflops=25.5612 (err=7.6e-16) Top mflops for N=110592 = 61.0956 Normalized results and averages for N=110592: fft 0: mflops = 61.0956 (norm. = 1), norm. avg. (of 12) = 0.980858 fft 1: mflops = 22.4455 (norm. = 0.367384), norm. avg. (of 12) = 0.132273 fft 2: mflops = 16.2406 (norm. = 0.265824), norm. avg. (of 12) = 0.61908 fft 3: mflops = 25.5612 (norm. = 0.41838), norm. avg. (of 7) = 0.214566 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.53297 s, 8 iters, t-(init.)=1.20624 s t(norm)=0.0760863, mflops=65.7148 (err=8.7e-16) 1. PDA (f2c): elapsed time t=1.54294 s, 2 iters, t-(init.)=1.46779 s t(norm)=0.370336, mflops=13.5013 (err=8.8e-16) 2. Singleton (f2c): elapsed time t=1.09285 s, 2 iters, t-(init.)=1.01253 s t(norm)=0.25547, mflops=19.5717 (err=1.1e-15) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 65.7148 Normalized results and averages for N=117649: fft 0: mflops = 65.7148 (norm. = 1), norm. avg. 
(of 13) = 0.982331
fft 1: mflops = 13.5013 (norm. = 0.205452), norm. avg. (of 13) = 0.137902
fft 2: mflops = 19.5717 (norm. = 0.297828), norm. avg. (of 13) = 0.594369
fft 3: mflops = -1 (norm. = -0.0152173), norm. avg. (of 7) = 0.214566

Benchmarking for array size = 60x60x60:
0. FFTW: elapsed time t=1.56793 s, 4 iters, t-(init.)=1.24199 s t(norm)=0.0811194, mflops=61.6375 (err=4.9e-16)
1. PDA (f2c): elapsed time t=1.74672 s, 2 iters, t-(init.)=1.58816 s t(norm)=0.207458, mflops=24.1012 (err=5.0e-16)
2. Singleton (f2c): elapsed time t=1.94218 s, 1 iters, t-(init.)=1.8692 s t(norm)=0.488338, mflops=10.2388 (err=6.0e-16)
3. Temperton (f2c): elapsed time t=1.87576 s, 2 iters, t-(init.)=1.71476 s t(norm)=0.223996, mflops=22.3218 (err=4.7e-16)
Top mflops for N=216000 = 61.6375
Normalized results and averages for N=216000:
fft 0: mflops = 61.6375 (norm. = 1), norm. avg. (of 14) = 0.983593
fft 1: mflops = 24.1012 (norm. = 0.391016), norm. avg. (of 14) = 0.155982
fft 2: mflops = 10.2388 (norm. = 0.166113), norm. avg. (of 14) = 0.563779
fft 3: mflops = 22.3218 (norm. = 0.362146), norm. avg. (of 8) = 0.233014

Benchmarking for array size = 72x60x56:
0. FFTW: elapsed time t=1.79432 s, 4 iters, t-(init.)=1.42537 s t(norm)=0.082362, mflops=60.7076 (err=5.7e-16)
1. PDA (f2c): elapsed time t=1.18465 s, 1 iters, t-(init.)=1.0925 s t(norm)=0.252512, mflops=19.801 (err=6.1e-16)
2. Singleton (f2c): elapsed time t=2.21348 s, 1 iters, t-(init.)=2.1218 s t(norm)=0.490414, mflops=10.1955 (err=7.0e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=241920 = 60.7076
Normalized results and averages for N=241920:
fft 0: mflops = 60.7076 (norm. = 1), norm. avg. (of 15) = 0.984687
fft 1: mflops = 19.801 (norm. = 0.32617), norm. avg. (of 15) = 0.167328
fft 2: mflops = 10.1955 (norm. = 0.167944), norm. avg. (of 15) = 0.53739
fft 3: mflops = -1 (norm. = -0.0164724), norm. avg. (of 8) = 0.233014

Benchmarking for array size = 75x75x75:
0. FFTW: elapsed time t=1.57252 s, 2 iters, t-(init.)=1.25379 s t(norm)=0.0795216, mflops=62.876 (err=9.0e-16)
1. PDA (f2c): elapsed time t=1.76215 s, 1 iters, t-(init.)=1.60476 s t(norm)=0.203564, mflops=24.5623 (err=9.5e-16)
2. Singleton (f2c): elapsed time t=3.38318 s, 1 iters, t-(init.)=3.22224 s t(norm)=0.40874, mflops=12.2327 (err=1.3e-15)
3. Temperton (f2c): elapsed time t=1.78271 s, 1 iters, t-(init.)=1.62599 s t(norm)=0.206256, mflops=24.2417 (err=1.1e-15)
Top mflops for N=421875 = 62.876
Normalized results and averages for N=421875:
fft 0: mflops = 62.876 (norm. = 1), norm. avg. (of 16) = 0.985644
fft 1: mflops = 24.5623 (norm. = 0.390647), norm. avg. (of 16) = 0.181285
fft 2: mflops = 12.2327 (norm. = 0.194553), norm. avg. (of 16) = 0.515963
fft 3: mflops = 24.2417 (norm. = 0.385547), norm. avg. (of 9) = 0.249962

Benchmarking for array size = 80x80x80:
0. FFTW: elapsed time t=1.0046 s, 1 iters, t-(init.)=0.812647 s t(norm)=0.0836876, mflops=59.746 (err=1.5e-15)
1. PDA (f2c): elapsed time t=2.25555 s, 1 iters, t-(init.)=2.06627 s t(norm)=0.212787, mflops=23.4977 (err=1.5e-15)
2. Singleton (f2c): elapsed time t=4.14283 s, 1 iters, t-(init.)=3.94761 s t(norm)=0.40653, mflops=12.2992 (err=2.3e-15)
3. Temperton (f2c): elapsed time t=2.4348 s, 1 iters, t-(init.)=2.24285 s t(norm)=0.230972, mflops=21.6476 (err=1.5e-15)
Top mflops for N=512000 = 59.746
Normalized results and averages for N=512000:
fft 0: mflops = 59.746 (norm. = 1), norm. avg. (of 17) = 0.986488
fft 1: mflops = 23.4977 (norm. = 0.393293), norm. avg. (of 17) = 0.193756
fft 2: mflops = 12.2992 (norm. = 0.205858), norm. avg. (of 17) = 0.497721
fft 3: mflops = 21.6476 (norm. = 0.362327), norm. avg. (of 10) = 0.261198

Benchmarking for array size = 84x84x84:
0. FFTW: elapsed time t=1.10578 s, 1 iters, t-(init.)=0.880378 s t(norm)=0.0774554, mflops=64.5533 (err=7.6e-16)
1. PDA (f2c): elapsed time t=3.27158 s, 1 iters, t-(init.)=3.0467 s t(norm)=0.268048, mflops=18.6534 (err=6.9e-16)
2. Singleton (f2c): elapsed time t=5.97242 s, 1 iters, t-(init.)=5.75512 s t(norm)=0.506334, mflops=9.87491 (err=8.6e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=592704 = 64.5533
Normalized results and averages for N=592704:
fft 0: mflops = 64.5533 (norm. = 1), norm. avg. (of 18) = 0.987239
fft 1: mflops = 18.6534 (norm. = 0.288961), norm. avg. (of 18) = 0.199045
fft 2: mflops = 9.87491 (norm. = 0.152973), norm. avg. (of 18) = 0.478569
fft 3: mflops = -1 (norm. = -0.0154911), norm. avg. (of 10) = 0.261198

Benchmarking for array size = 96x96x96:
0. FFTW: elapsed time t=2.22139 s, 1 iters, t-(init.)=1.88426 s t(norm)=0.107809, mflops=46.3785 (err=8.0e-16)
1. PDA (f2c): elapsed time t=5.16033 s, 1 iters, t-(init.)=4.82313 s t(norm)=0.275957, mflops=18.1188 (err=7.7e-16)
2. Singleton (f2c): elapsed time t=10.2679 s, 1 iters, t-(init.)=9.93297 s t(norm)=0.568317, mflops=8.7979 (err=8.2e-16)
3. Temperton (f2c): elapsed time t=5.00074 s, 1 iters, t-(init.)=4.66475 s t(norm)=0.266895, mflops=18.734 (err=8.9e-16)
Top mflops for N=884736 = 46.3785
Normalized results and averages for N=884736:
fft 0: mflops = 46.3785 (norm. = 1), norm. avg. (of 19) = 0.98791
fft 1: mflops = 18.1188 (norm. = 0.390672), norm. avg. (of 19) = 0.209131
fft 2: mflops = 8.7979 (norm. = 0.189698), norm. avg. (of 19) = 0.463365
fft 3: mflops = 18.734 (norm. = 0.403936), norm. avg. (of 11) = 0.274175

Benchmarking for array size = 105x105x105:
0. FFTW: elapsed time t=2.1872 s, 1 iters, t-(init.)=1.74665 s t(norm)=0.0749065, mflops=66.7498 (err=7.9e-16)
1. PDA (f2c): elapsed time t=6.45299 s, 1 iters, t-(init.)=6.01399 s t(norm)=0.257915, mflops=19.3862 (err=8.1e-16)
2. Singleton (f2c): elapsed time t=9.71099 s, 1 iters, t-(init.)=9.27277 s t(norm)=0.39767, mflops=12.5732 (err=9.7e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=1157625 = 66.7498
Normalized results and averages for N=1157625:
fft 0: mflops = 66.7498 (norm. = 1), norm. avg. (of 20) = 0.988515
fft 1: mflops = 19.3862 (norm. = 0.290431), norm. avg. (of 20) = 0.213196
fft 2: mflops = 12.5732 (norm. = 0.188363), norm. avg. (of 20) = 0.449615
fft 3: mflops = -1 (norm. = -0.0149813), norm. avg. (of 11) = 0.274175

Benchmarking for array size = 112x112x112:
0. FFTW: elapsed time t=2.82465 s, 1 iters, t-(init.)=2.29017 s t(norm)=0.0798203, mflops=62.6407 (err=7.2e-16)
1. PDA (f2c): elapsed time t=8.2143 s, 1 iters, t-(init.)=7.68015 s t(norm)=0.26768, mflops=18.679 (err=7.0e-16)
2. Singleton (f2c): elapsed time t=12.1305 s, 1 iters, t-(init.)=11.5956 s t(norm)=0.404146, mflops=12.3718 (err=6.8e-16)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=1404928 = 62.6407
Normalized results and averages for N=1404928:
fft 0: mflops = 62.6407 (norm. = 1), norm. avg. (of 21) = 0.989062
fft 1: mflops = 18.679 (norm. = 0.298193), norm. avg. (of 21) = 0.217244
fft 2: mflops = 12.3718 (norm. = 0.197504), norm. avg. (of 21) = 0.43761
fft 3: mflops = -1 (norm. = -0.0159641), norm. avg. (of 11) = 0.274175

Benchmarking for array size = 120x120x120:
0. FFTW: elapsed time t=3.5407 s, 1 iters, t-(init.)=2.88279 s t(norm)=0.080513, mflops=62.1018 (err=5.9e-16)
1. PDA (f2c): elapsed time t=7.86945 s, 1 iters, t-(init.)=7.21175 s t(norm)=0.201416, mflops=24.8243 (err=6.2e-16)
2. Singleton (f2c): elapsed time t=24.1062 s, 1 iters, t-(init.)=23.4486 s t(norm)=0.654891, mflops=7.63486 (err=7.4e-16)
3. Temperton (f2c): elapsed time t=10.1463 s, 1 iters, t-(init.)=9.48804 s t(norm)=0.26499, mflops=18.8687 (err=5.7e-16)
Top mflops for N=1728000 = 62.1018
Normalized results and averages for N=1728000:
fft 0: mflops = 62.1018 (norm. = 1), norm. avg. (of 22) = 0.989559
fft 1: mflops = 24.8243 (norm. = 0.399735), norm. avg. (of 22) = 0.225539
fft 2: mflops = 7.63486 (norm. = 0.122941), norm. avg. (of 22) = 0.423306
fft 3: mflops = 18.8687 (norm. = 0.303834), norm. avg. (of 12) = 0.276646

Benchmarking for array size = 144x144x144:
0. FFTW: elapsed time t=6.2503 s, 1 iters, t-(init.)=5.11311 s t(norm)=0.0796089, mflops=62.807 (err=1.2e-15)
1. PDA (f2c): elapsed time t=14.6646 s, 1 iters, t-(init.)=13.5278 s t(norm)=0.210621, mflops=23.7393 (err=1.1e-15)
2. Singleton (f2c): elapsed time t=35.8101 s, 1 iters, t-(init.)=34.6727 s t(norm)=0.539839, mflops=9.26202 (err=1.5e-15)
3. Temperton (f2c): elapsed time t=16.8194 s, 1 iters, t-(init.)=15.6812 s t(norm)=0.244149, mflops=20.4793 (err=1.2e-15)
Top mflops for N=2985984 = 62.807
Normalized results and averages for N=2985984:
fft 0: mflops = 62.807 (norm. = 1), norm. avg. (of 23) = 0.990013
fft 1: mflops = 23.7393 (norm. = 0.377972), norm. avg. (of 23) = 0.232166
fft 2: mflops = 9.26202 (norm. = 0.147468), norm. avg. (of 23) = 0.411313
fft 3: mflops = 20.4793 (norm. = 0.326066), norm. avg. (of 13) = 0.280448

Benchmarking for array size = 180x180x180:
0. FFTW: elapsed time t=12.4558 s, 1 iters, t-(init.)=10.2349 s t(norm)=0.0780829, mflops=64.0345 (err=1.2e-15)
1. PDA (f2c): elapsed time t=29.4931 s, 1 iters, t-(init.)=27.2712 s t(norm)=0.208054, mflops=24.0322 (err=1.2e-15)
2. Singleton (f2c): elapsed time t=82.6982 s, 1 iters, t-(init.)=80.4767 s t(norm)=0.613963, mflops=8.14381 (err=1.7e-15)
3. Temperton (f2c): elapsed time t=32.965 s, 1 iters, t-(init.)=30.7437 s t(norm)=0.234546, mflops=21.3178 (err=1.2e-15)
Top mflops for N=5832000 = 64.0345
Normalized results and averages for N=5832000:
fft 0: mflops = 64.0345 (norm. = 1), norm. avg. (of 24) = 0.990429
fft 1: mflops = 24.0322 (norm. = 0.375301), norm. avg. (of 24) = 0.23813
fft 2: mflops = 8.14381 (norm. = 0.127178), norm. avg. (of 24) = 0.399474
fft 3: mflops = 21.3178 (norm. = 0.33291), norm. avg.
(of 14) = 0.284195
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Beauregard, Bergland, CWP (min N), CWP (best N), Edelblute, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), NAPACK (f2c), Nielsen, NR (C), Ooura (C), QFT, Ransom, Singleton (f2c), Temperton (f2c), Valkenburg
2, 35.0048, 33.4616, 20.4508, 1.41402, 4.98546, 5.77009, 4.98205, 4.69105, , 7.30521, 35.4795, 35.8815, 54.1645, , 13.2954, 8.94605, 8.60198, 29.7946, , , , 3.34966, 2.54113, 10.3406, 33.1019, , , 7.22977, 3.32244, 8.68707
4, 70.927, 71.6894, 31.9132, 6.63476, 11.5609, 20.5654, 19.0974, 9.51454, 29.6512, 18.8249, 115.647, 116.797, 172.712, , 34.8927, 18.1969, 17.6247, 83.2961, 35.6568, 37.1371, 34.1304, 7.41812, 9.20186, 19.317, 64.0517, , 5.40589, 25.4573, 10.4294, 9.25039
8, 112.746, 111.678, 40.8305, 8.80644, 16.5094, 36.3819, 46.5982, 28.6138, 32.7784, 31.4032, 194.26, 195.258, 258.972, 83.3199, 49.5661, 31.7934, 30.9192, 114.969, 60.2293, 60.786, 58.3118, 11.1838, 20.4496, 32.9214, 118.988, , 6.33094, 28.7479, 14.7905, 9.72813
16, 73.3165, 73.6808, 49.9493, 17.0173, 18.6888, 60.9583, 82.9989, 60.338, 37.2467, 46.4543, 262.185, 262.884, 277.125, 131.801, 74.3686, 46.4161, 45.8163, 133.127, 59.0658, 70.7138, 70.0099, 15.8119, 19.5054, 47.9013, 155.098, 90.7386, 20.5962, 75.7057, 21.1144, 10.1135
32, 88.4438, 89.0531, 58.9008, 20.4947, 19.7272, 90.8112, 88.2332, 112.825, 42.3844, 38.5506, 313.227, 313.817, 316.59, 181.498, 69.2934, 60.8126, 61.5172, 136.989, 69.2987, 86.4155, 88.0903, 18.0519, 28.0274, 63.6799, 175.592, 82.8826, 21.021, 99.2962, 19.7043, 10.4261
64, 91.3184, 91.7221, 67.1613, 30.5014, 20.0457, 110.511, 101.377, 132.266, 47.7957, 45.8398, 339.087, 267.39, 201.526, 254.22, 83.916, 71.1103, 73.6883, 137.4, 73.0345, 94.6716, 97.9969, 20.9571, 36.0423, 76.2027, 197.027, 79.2647, 42.1745, 140.312, 25.8791, 10.657
128, 101.511, 102.117, 74.8274, 30.7634, 20.1666, 124.909, 120.037, 186.323, 53.2418, 51.6539, 295.559, 295.615, 223.838, 279.537, 86.233, 78.7442, 83.1661, 66.7194, 80.0763, 106.349, 110.109, 21.4883, 32.7958, 86.4559, 201.156, 77.1547, 39.7573, 131.46, 24.7044, 10.8175
256, 107.785, 107.865, 81.7323, 35.3994, 20.1903, 142.078, 142.074, 193.337, 58.4739, 56.0059, 316.54, 312.094, 250.473, 299.523, 96.4228, 83.367, 89.4842, 67.5072, 84.5355, 112.364, 116.845, 23.1576, 37.2695, 93.5557, 215.358, 75.5259, 59.9469, 176.607, 27.2357, 10.9421
512, 113.305, 113.4, 80.4524, 37.5447, 20.1782, 154.426, 146.317, 182.794, 59.2218, 46.4096, 298.242, 297.444, 241.397, 323.795, 85.6062, 84.0174, 93.4673, 71.0492, 91.0488, 121.635, 125.894, 22.98, 41.6148, 95.1424, 214.103, 70.8516, 55.4373, 178.169, 24.3886, 11.0255
1024, 119.351, 119.002, 85.3756, 43.9079, 20.1666, 158.186, 144.75, 144.727, 63.4541, 28.6448, 137.664, 63.8461, 63.2101, 332.314, 43.9463, 83.2212, 95.5752, 56.4634, 95.5707, 125.983, 130.475, 20.5697, 33.892, 94.7038, 223.494, 26.4138, 73.8049, 193.765, 27.6043, 9.30881
2048, 115.371, 113.261, 79.5486, 39.707, 19.6552, 152.55, 111.205, 127.095, 61.2072, 16.082, 46.0635, 44.5323, 27.2962, 149.688, 20.0671, 76.9333, 89.8742, 48.4162, 96.5518, 126.097, 96.2165, 5.51015, 22.9424, 87.0191, 110.791, 23.9636, 58.5773, 157.189, 20.4337, 6.4375
4096, 23.3368, 22.0753, 18.4966, 29.7064, 16.582, 54.1218, 75.8731, 91.2048, 17.5426, 17.3943, 46.5798, 43.2728, 28.0851, 55.7397, 23.1613, 23.0388, 23.3319, 44.0369, 87.0133, 111.011, 57.9494, 5.77729, 17.9213, 23.5041, 60.4163, 22.2574, 56.0002, 46.5813, 19.4299, 5.86374
8192, 24.7372, 23.3588, 17.2929, 24.6818, 16.2921, 48.1967, 75.5846, 82.1991, 16.5262, 14.6176, 46.1163, 45.6338, 29.5763, 54.2435, 17.8946, 22.6994, 22.9442, , 26.1554, 27.5606, 23.1758, 5.58982, 16.6506, 23.1169, 57.9867, 19.599, 45.7544, 42.2004, 18.0463, 5.5023
16384, 22.0908, 20.9099, 17.0632, 31.2835, 16.3851, 51.3878, 80.1045, 80.1141, 16.3956, 15.1392, 49.5699, 46.5761, 27.4711, 52.5641, 19.4947, 22.656, 22.8683, , 25.2218, 26.4109, 22.7843, 5.78478, 17.7644, 22.9473, 61.356, 17.5734, 62.688, 45.5534, 19.4089, 5.25987
32768, 23.1891, 21.9977, 16.6509, 25.257, 16.0158, 49.6123, 77.9546, 77.8883, 16.0509, 15.4852, 45.4885, 45.3243, 21.5333, 50.951, 20.9338, 22.1363, 22.3515, , 24.8079, 25.8943, 22.5861, 5.53685, 17.967, 22.2934, 58.3272, 16.0168, 49.5727, 37.0487, 18.04, 4.92754
65536, 20.3727, 19.4107, 16.0657, 32.9029, 15.6558, 44.6395, 62.5021, 62.4041, 15.4987, 12.9971, 45.075, 36.6406, 19.1918, 49.1659, 20.2645, 21.093, 21.3467, , 24.221, 25.2197, 21.8852, 5.6777, 15.98, 21.7216, 55.335, 11.3359, 42.0449, 40.6991, 18.1063, 4.49361
131072, 6.60528, 6.20901, 4.59802, 14.7278, 10.327, 17.0541, 39.4421, 39.4393, 4.59751, 9.10963, 35.2584, 26.6878, 15.8531, 17.8466, 12.7313, 6.87661, 7.09956, , 20.0416, 20.7317, 18.2443, 5.40126, 8.34175, 6.8482, 21.4549, 10.2591, 20.6194, 13.37, 10.9313, 3.55353
262144, 5.21479, 5.18146, 4.26804, 16.5138, 10.297, 16.2323, 24.7594, 24.7014, 4.25214, 9.04732, 33.7343, 26.5307, 14.2254, 16.719, 12.9809, 6.66158, 6.51589, , 7.47729, 7.55606, 7.2386, 5.59644, 6.93067, 6.44113, 22.4639, 8.26872, 29.5699, 13.0818, 10.3486, 3.40761
524288, 5.44758, 5.40842, 4.18519, 11.2362, 10.3121, 14.9396, 25.1536, 25.1498, 4.16418, 9.49159, 31.1579, 26.7322, 12.6667, 16.4827, 13.7968, 6.63694, 6.48237, , 6.91721, 6.98767, 6.64519, 5.44376, 6.56851, 6.41831, 21.7278, 5.83267, 22.8152, 11.068, 9.71201, 3.33744
1048576, 5.01357, 4.97262, 4.12922, 17.3023, 10.3372, 14.6499, , , 4.10638, 9.33856, 32.2264, 24.9784, 12.3798, , 14.0294, 6.60918, 6.44345, , , 6.90072, 6.57804, 5.61711, 6.81624, 6.39041, 22.7821, 5.43339, 32.4766, 12.8399, 10.0511, 3.25907
Norm. Avg., 0.324703, 0.318566, 0.218616, 0.229815, 0.1508, 0.445185, 0.571242, 0.611817, 0.176624, 0.174731, 0.771435, 0.699263, 0.570126, 0.688231, 0.266951, 0.233396, 0.242594, 0.360633, 0.326766, 0.374513, 0.326263, 0.0821132, 0.147952, 0.246463, 0.639701, 0.216363, 0.40497, 0.431692, 0.160732, 0.0622817
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, CWP (min N), CWP (best N), FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Nielsen, Singleton (f2c), Temperton (f2c), Valkenburg
6, 27.4367, 18.3832, 23.4432, 147.107, 147.985, 26.4307, 39.0212, 10.014, 7.75684, 20.4873, 10.1547, 9.82994
9, 47.6561, 34.0389, 29.4132, 160.875, 161.381, 22.9293, 36.7828, 12.5655, 11.9376, 38.4553, 14.6988, 10.0099
12, 61.4302, 51.1681, 37.3715, 238.89, 239.694, 39.9254, 57.5976, 13.1246, 15.458, 40.7963, 19.4481, 10.152
15, 70.0488, 69.8601, 39.3248, 181.793, 182.145, 27.7946, 40.0592, 9.11582, 18.0112, 45.9256, 19.9852, 9.13964
18, 74.3597, 70.5858, 28.2028, 151.503, 149.675, 26.4502, 64.0846, 15.0164, 13.6939, 57.3742, 17.009, 10.3137
24, 102.464, 103.475, 33.1381, 201.216, 201.463, 52.0156, 84.7066, 16.8656, 23.3335, 54.2825, 22.0121, 10.4323
36, 122.886, 122.752, 35.9059, 215.17, 215.247, 33.1943, 96.4455, 18.2632, 20.6852, 88.0181, 25.1599, 10.5594
80, 152.752, 174.178, 45.9328, 243.913, 243.83, 66.6158, 63.8998, 12.6672, 38.4533, 135.387, 29.4023, 10.06
108, 130.072, 167.714, 43.5276, 236.398, 236.603, 30.5497, 90.7186, 19.9758, 23.5628, 97.5935, 26.9034, 10.7151
210, 185.556, 185.559, 31.8523, 195.475, 186.546, 32.5841, 64.894, 10.3398, 31.7874, 79.7047, , 8.66726
504, 200.838, 200.911, 31.227, 240.212, 206.435, 38.2507, 87.3431, 13.5616, 30.3102, 95.1582, , 9.3407
1000, 130.258, 190.336, 28.4878, 124.026, 123.715, 24.9897, 36.5347, 10.9447, 42.1314, 134.774, 27.4564, 8.33822
1960, 161.381, 161.248, 9.45352, 47.9398, 49.2298, 18.9745, 16.4228, 6.19351, 22.287, 95.2014, , 6.17453
4725, 89.0843, 98.1955, 11.1866, 51.7197, 45.5927, 13.007, 18.8664, 6.46291, 17.6258, 42.6991, , 6.41734
10368, 85.254, 89.6654, 15.9263, 52.1762, 49.1724, 18.0502, 21.9259, 6.41938, 16.8992, 34.7464, 20.4676, 5.83417
27000, 90.7366, 90.7004, 16.4523, 42.2002, 47.4493, 12.9194, 21.3471, 6.1928, 18.9641, 34.6371, 20.1969, 5.33555
75600, 58.1608, 58.1813, 10.5775, 42.838, 40.9325, 11.8065, 16.3364, 5.98401, 15.4029, 22.7923, , 4.64562
165375, 32.7452, 32.7309, 6.38959, 37.4901, 35.1546, 7.79395, 16.742, 5.04186, 11.1145, 13.1196, , 4.70549
362880, 16.9142, 16.9193, 8.66534, 44.361, 36.2608, 9.67794, 17.9104, 5.31469, 8.78055, 10.2657, , 3.7945
Norm. Avg., 0.655262, 0.681252, 0.164413, 0.854834, 0.828979, 0.174158, 0.299004, 0.0748636, 0.14914, 0.373829, 0.126528, 0.0590443
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM (f2c), NR (C), PDA (f2c), Singleton (f2c), Temperton (f2c)
4x4x4, 232.253, , 80.5763, 11.9834, 152.778, 34.9648
8x8x8, 275.971, 39.5296, 109.677, 21.4061, 134.569, 37.5176
16x16x16, 96.3057, 34.5721, 21.5956, 23.6959, 38.6925, 34.7764
32x32x32, 110.764, 33.9782, 20.899, 22.9631, 32.9939, 29.1207
64x64x64, 38.0499, 20.1184, 5.99803, 11.5231, 11.5626, 19.162
256x64x32, 40.844, 19.3536, 6.03466, 14.3845, 11.0312, 16.3752
16x1024x64, 38.3944, 19.8129, 5.98823, 11.9696, 11.4588,
128x128x128, 41.5388, 18.4877, 5.9981, 13.3328, 8.16502, 10.3624
512x128x64, 43.4114, 18.9298, 5.99076, 13.416, 9.76391,
Norm. Avg., 1, 0.40109, 0.211225, 0.242147, 0.348773, 0.294927
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA (f2c), Singleton (f2c), Temperton (f2c)
5x5x5, 164.374, 14.5002, 210.76, 34.2814
6x6x6, 221.267, 16.1128, 114.123, 30.7282
7x7x7, 157.024, 8.74301, 100.59,
9x9x9, 206.078, 20.8889, 146.578, 37.6417
10x10x10, 226.559, 23.6947, 143.738, 34.7601
11x11x11, 120.368, 8.73277, 94.7689,
12x12x12, 296.984, 25.6645, 142.063, 46.5178
13x13x13, 80.4936, 8.28935, 81.2751,
14x14x14, 119.674, 14.0091, 64.3945,
15x15x15, 125.963, 26.4287, 64.3188, 36.4455
24x25x28, 103.226, 23.6195, 35.7864,
48x48x48, 61.0956, 22.4455, 16.2406, 25.5612
49x49x49, 65.7148, 13.5013, 19.5717,
60x60x60, 61.6375, 24.1012, 10.2388, 22.3218
72x60x56, 60.7076, 19.801, 10.1955,
75x75x75, 62.876, 24.5623, 12.2327, 24.2417
80x80x80, 59.746, 23.4977, 12.2992, 21.6476
84x84x84, 64.5533, 18.6534, 9.87491,
96x96x96, 46.3785, 18.1188, 8.7979, 18.734
105x105x105, 66.7498, 19.3862, 12.5732,
112x112x112, 62.6407, 18.679, 12.3718,
120x120x120, 62.1018, 24.8243, 7.63486, 18.8687
144x144x144, 62.807, 23.7393, 9.26202, 20.4793
180x180x180, 64.0345, 24.0322, 8.14381, 21.3178
Norm. Avg., 0.990429, 0.23813, 0.399474, 0.284195
@@@@ end