To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Eric Frey
@ submitter email = frey@bme.unc.edu
@ submitter organization = University of North Carolina
@ computer manufacturer = DEC
@ computer model = Personal Workstation 500au
@ CPU manufacturer = DEC
@ CPU model = Alpha EV56/21164a
@ CPU speed = 500 MHz
@ RAM = 256 MB
@ L2 cache size = 2 MB
@ operating system = Digital Unix 4.0c
@ C compiler = DEC C V5.2-033
@ C compiler flags = -newc -w0 -O5 -ansi_alias -ansi_args -fp_reorder -tune host -std1 -DUSE_DXML
@ Fortran compiler = DEC Fortran v3.8
@ Fortran compiler flags = -w0 -O5 -ansi_alias -ansi_args -fp_reorder -tune host -std1
@ remarks = Using DXML version 3.3, optimized for the EV5 (this is the correct version).
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
  2 (0.000228882 MB)
  4 (0.000534058 MB)
  8 (0.000839233 MB)
  16 (0.00164795 MB)
  32 (0.00297546 MB)
  64 (0.00616455 MB)
  128 (0.0119019 MB)
  256 (0.0238037 MB)
  512 (0.0476074 MB)
  1024 (0.0939941 MB)
  2048 (0.189575 MB)
  4096 (0.37915 MB)
  8192 (0.765991 MB)
  16384 (1.51184 MB)
  32768 (3.02368 MB)
  65536 (6.09973 MB)
  131072 (12.1995 MB)
  262144 (25.4987 MB)
Maximum array size = 360360
Benchmarking FFTs:
  0. Arndt DIF
  1. Arndt DIT
  2. Arndt Split-Radix
  3. Arndt 4-step
  4. Bailey
  5. Beauregard
  6. Bergland
  7. Brenner
  8. Burrus
  9. CWP (min N)
  10. CWP (best N)
  11. Edelblute
  12. FFTPACK
  13. FFTPACK (f2c)
  14. FFTW
  15. FFTW_ESTIMATE
  16. Frigo-old
  17. Green
  18. GSL
  19. GSL DIT
  20. GSL DIF
  21. Krukar
  22. Mayer (Buneman)
  23. Mayer (simple)
  24. Mayer (lookup)
  25. Monro
  26. NAPACK (f2c)
  27. Ooura (C)
  28. Ooura (F)
  29. Ransom
  30. SCIPORT
  31. Singleton
  32. Singleton (f2c)
  33. Sorensen
  34. Sorensen DIT
  35. Temperton
  36. Temperton (f2c)
  37. Valkenburg
  38. DXML
Computing normalized averages (39 transforms).
Benchmarking for array size = 2 (power of 2): 0. Arndt DIF: elapsed time t=1.08329 s, 4194304 iters, t-(init.)=0.649974 s t(norm)=0.0774829, mflops=64.5303 (err=1.7e-17) 1. Arndt DIT: elapsed time t=1.11662 s, 4194304 iters, t-(init.)=0.899964 s t(norm)=0.107284, mflops=46.6052 (err=1.7e-17) 2. Arndt Split-Radix: elapsed time t=1.19995 s, 4194304 iters, t-(init.)=0.966628 s t(norm)=0.115231, mflops=43.3911 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.98325 s, 524288 iters, t-(init.)=1.94992 s t(norm)=1.85959, mflops=2.68876 (err=1.7e-17) 4. Bailey: elapsed time t=1.41661 s, 2097152 iters, t-(init.)=1.29995 s t(norm)=0.309932, mflops=16.1326 (err=1.7e-17) 5. Beauregard: elapsed time t=1.48327 s, 1048576 iters, t-(init.)=1.41661 s t(norm)=0.675492, mflops=7.40201 (err=1.7e-17) 6. Bergland: elapsed time t=1.79993 s, 2097152 iters, t-(init.)=1.68327 s t(norm)=0.401322, mflops=12.4588 (err=1.7e-17) 7. Brenner: elapsed time t=1.99992 s, 2097152 iters, t-(init.)=1.86659 s t(norm)=0.44503, mflops=11.2352 (err=1.7e-17) 8. Burrus: elapsed time t=1.31661 s, 4194304 iters, t-(init.)=1.08329 s t(norm)=0.129138, mflops=38.7182 (err=1.7e-17) 9. CWP (min N): elapsed time t=1.31661 s, 1048576 iters, t-(init.)=1.24995 s t(norm)=0.596023, mflops=8.38894 10. CWP (best N) (N=3): elapsed time t=1.43328 s, 1048576 iters, t-(init.)=1.34995 s t(norm)=0.643704, mflops=7.76754 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.11662 s, 2097152 iters, t-(init.)=0.99996 s t(norm)=0.238409, mflops=20.9724 (err=1.7e-17) 13. FFTPACK (f2c): elapsed time t=1.14995 s, 2097152 iters, t-(init.)=1.03329 s t(norm)=0.246356, mflops=20.2958 (err=1.7e-17) FFTW_MEASURE plan: (cost = 1.986742e-07) FFTW_NOTW 2 14. FFTW: elapsed time t=1.7666 s, 8388608 iters, t-(init.)=1.24995 s t(norm)=0.0745028, mflops=67.1115 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 15. 
FFTW_ESTIMATE: elapsed time t=1.74993 s, 8388608 iters, t-(init.)=1.09996 s t(norm)=0.0655625, mflops=76.2631 (err=1.7e-17) 16. Frigo-old: elapsed time t=1.28328 s, 8388608 iters, t-(init.)=0.8333 s t(norm)=0.0496686, mflops=100.667 (err=1.7e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.78326 s, 4194304 iters, t-(init.)=1.54994 s t(norm)=0.184767, mflops=27.0611 (err=1.7e-17) 19. GSL DIT: elapsed time t=1.03329 s, 1048576 iters, t-(init.)=0.983294 s t(norm)=0.468871, mflops=10.6639 (err=1.7e-17) 20. GSL DIF: elapsed time t=1.94992 s, 2097152 iters, t-(init.)=1.83326 s t(norm)=0.437083, mflops=11.4395 (err=1.7e-17) 21. Krukar: elapsed time t=1.19995 s, 4194304 iters, t-(init.)=0.983294 s t(norm)=0.117218, mflops=42.6556 (err=1.7e-17) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.53327 s, 1048576 iters, t-(init.)=1.48327 s t(norm)=0.70728, mflops=7.06933 (err=1.7e-17) 27. Ooura (C): elapsed time t=1.41661 s, 8388608 iters, t-(init.)=0.949962 s t(norm)=0.0566221, mflops=88.3047 (err=1.7e-17) 28. Ooura (F): elapsed time t=1.73326 s, 8388608 iters, t-(init.)=1.26662 s t(norm)=0.0754962, mflops=66.2285 (err=1.7e-17) 29. Skipping fft (Ransom doesn't work for N=2). 30. Skipping fft (SCIPORT can't handle N < 4). 31. Singleton: elapsed time t=1.26662 s, 1048576 iters, t-(init.)=1.18329 s t(norm)=0.564235, mflops=8.86156 (err=1.7e-17) 32. Singleton (f2c): elapsed time t=1.23328 s, 1048576 iters, t-(init.)=1.16662 s t(norm)=0.556288, mflops=8.98815 (err=1.7e-17) 33. Sorensen: elapsed time t=1.08329 s, 4194304 iters, t-(init.)=0.799968 s t(norm)=0.0953636, mflops=52.4309 (err=1.7e-17) 34. Sorensen DIT: elapsed time t=1.44994 s, 4194304 iters, t-(init.)=1.23328 s t(norm)=0.147019, mflops=34.0092 (err=1.7e-17) 35. 
Temperton: elapsed time t=1.26662 s, 1048576 iters, t-(init.)=1.21662 s t(norm)=0.580129, mflops=8.61878 (err=1.7e-17) 36. Temperton (f2c): elapsed time t=1.51661 s, 1048576 iters, t-(init.)=1.46661 s t(norm)=0.699333, mflops=7.14967 (err=1.7e-17) 37. Valkenburg: elapsed time t=1.43328 s, 2097152 iters, t-(init.)=1.31661 s t(norm)=0.313905, mflops=15.9284 (err=1.7e-17) 38. DXML: elapsed time t=1.06662 s, 1048576 iters, t-(init.)=0.99996 s t(norm)=0.476818, mflops=10.4862 (err=1.7e-17) Top mflops for N=2 = 100.667 Normalized results and averages for N=2: fft 0: mflops = 64.5303 (norm. = 0.641026), norm. avg. (of 1) = 0.641026 fft 1: mflops = 46.6052 (norm. = 0.462963), norm. avg. (of 1) = 0.462963 fft 2: mflops = 43.3911 (norm. = 0.431034), norm. avg. (of 1) = 0.431034 fft 3: mflops = 2.68876 (norm. = 0.0267094), norm. avg. (of 1) = 0.0267094 fft 4: mflops = 16.1326 (norm. = 0.160256), norm. avg. (of 1) = 0.160256 fft 5: mflops = 7.40201 (norm. = 0.0735294), norm. avg. (of 1) = 0.0735294 fft 6: mflops = 12.4588 (norm. = 0.123762), norm. avg. (of 1) = 0.123762 fft 7: mflops = 11.2352 (norm. = 0.111607), norm. avg. (of 1) = 0.111607 fft 8: mflops = 38.7182 (norm. = 0.384615), norm. avg. (of 1) = 0.384615 fft 9: mflops = 8.38894 (norm. = 0.0833333), norm. avg. (of 1) = 0.0833333 fft 10: mflops = 7.76754 (norm. = 0.0771605), norm. avg. (of 1) = 0.0771605 fft 11: mflops = -1 (norm. = -0.00993371), norm. avg. (of 0) = -1 fft 12: mflops = 20.9724 (norm. = 0.208333), norm. avg. (of 1) = 0.208333 fft 13: mflops = 20.2958 (norm. = 0.201613), norm. avg. (of 1) = 0.201613 fft 14: mflops = 67.1115 (norm. = 0.666667), norm. avg. (of 1) = 0.666667 fft 15: mflops = 76.2631 (norm. = 0.757576), norm. avg. (of 1) = 0.757576 fft 16: mflops = 100.667 (norm. = 1), norm. avg. (of 1) = 1 fft 17: mflops = -1 (norm. = -0.00993371), norm. avg. (of 0) = -1 fft 18: mflops = 27.0611 (norm. = 0.268817), norm. avg. (of 1) = 0.268817 fft 19: mflops = 10.6639 (norm. = 0.105932), norm. avg. 
(of 1) = 0.105932 fft 20: mflops = 11.4395 (norm. = 0.113636), norm. avg. (of 1) = 0.113636 fft 21: mflops = 42.6556 (norm. = 0.423729), norm. avg. (of 1) = 0.423729 fft 22: mflops = -1 (norm. = -0.00993371), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.00993371), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.00993371), norm. avg. (of 0) = -1 fft 25: mflops = -1 (norm. = -0.00993371), norm. avg. (of 0) = -1 fft 26: mflops = 7.06933 (norm. = 0.0702247), norm. avg. (of 1) = 0.0702247 fft 27: mflops = 88.3047 (norm. = 0.877193), norm. avg. (of 1) = 0.877193 fft 28: mflops = 66.2285 (norm. = 0.657895), norm. avg. (of 1) = 0.657895 fft 29: mflops = -1 (norm. = -0.00993371), norm. avg. (of 0) = -1 fft 30: mflops = -1 (norm. = -0.00993371), norm. avg. (of 0) = -1 fft 31: mflops = 8.86156 (norm. = 0.0880282), norm. avg. (of 1) = 0.0880282 fft 32: mflops = 8.98815 (norm. = 0.0892857), norm. avg. (of 1) = 0.0892857 fft 33: mflops = 52.4309 (norm. = 0.520833), norm. avg. (of 1) = 0.520833 fft 34: mflops = 34.0092 (norm. = 0.337838), norm. avg. (of 1) = 0.337838 fft 35: mflops = 8.61878 (norm. = 0.0856164), norm. avg. (of 1) = 0.0856164 fft 36: mflops = 7.14967 (norm. = 0.0710227), norm. avg. (of 1) = 0.0710227 fft 37: mflops = 15.9284 (norm. = 0.158228), norm. avg. (of 1) = 0.158228 fft 38: mflops = 10.4862 (norm. = 0.104167), norm. avg. (of 1) = 0.104167 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.96659 s, 4194304 iters, t-(init.)=1.44994 s t(norm)=0.0432116, mflops=115.71 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.64993 s, 4194304 iters, t-(init.)=1.44994 s t(norm)=0.0432116, mflops=115.71 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.69993 s, 2097152 iters, t-(init.)=1.59994 s t(norm)=0.0953636, mflops=52.4309 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.03329 s, 262144 iters, t-(init.)=1.01663 s t(norm)=0.484765, mflops=10.3143 (err=1.3e-16) 4. 
Bailey: elapsed time t=1.28328 s, 1048576 iters, t-(init.)=1.24995 s t(norm)=0.149006, mflops=33.5558 (err=1.3e-16) 5. Beauregard: elapsed time t=1.38328 s, 524288 iters, t-(init.)=1.34995 s t(norm)=0.321852, mflops=15.5351 (err=6.5e-17) 6. Bergland: elapsed time t=1.03329 s, 1048576 iters, t-(init.)=0.983294 s t(norm)=0.117218, mflops=42.6556 (err=5.3e-17) 7. Brenner: elapsed time t=1.53327 s, 1048576 iters, t-(init.)=1.48327 s t(norm)=0.17682, mflops=28.2773 (err=5.3e-17) 8. Burrus: elapsed time t=2.01659 s, 2097152 iters, t-(init.)=1.93326 s t(norm)=0.115231, mflops=43.3911 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.34995 s, 1048576 iters, t-(init.)=1.29995 s t(norm)=0.154966, mflops=32.2652 10. CWP (best N) (N=15): elapsed time t=1.31661 s, 524288 iters, t-(init.)=1.24995 s t(norm)=0.298011, mflops=16.7779 11. Edelblute: elapsed time t=1.81659 s, 2097152 iters, t-(init.)=1.73326 s t(norm)=0.103311, mflops=48.3978 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.04996 s, 2097152 iters, t-(init.)=0.949962 s t(norm)=0.0566221, mflops=88.3047 (err=5.3e-17) 13. FFTPACK (f2c): elapsed time t=1.44994 s, 2097152 iters, t-(init.)=1.34995 s t(norm)=0.0804631, mflops=62.1403 (err=5.3e-17) FFTW_MEASURE plan: (cost = 2.543030e-07) FFTW_NOTW 4 14. FFTW: elapsed time t=1.06662 s, 4194304 iters, t-(init.)=0.883298 s t(norm)=0.0263243, mflops=189.938 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 15. FFTW_ESTIMATE: elapsed time t=1.06662 s, 4194304 iters, t-(init.)=0.533312 s t(norm)=0.0158939, mflops=314.585 (err=5.3e-17) 16. Frigo-old: elapsed time t=1.43328 s, 8388608 iters, t-(init.)=1.03329 s t(norm)=0.0153973, mflops=324.733 (err=5.3e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.21662 s, 2097152 iters, t-(init.)=1.11662 s t(norm)=0.0665559, mflops=75.1249 (err=5.3e-17) 19. GSL DIT: elapsed time t=1.98325 s, 1048576 iters, t-(init.)=1.93326 s t(norm)=0.230462, mflops=21.6955 (err=6.5e-17) 20. 
GSL DIF: elapsed time t=1.89992 s, 1048576 iters, t-(init.)=1.84993 s t(norm)=0.220528, mflops=22.6728 (err=6.5e-17) 21. Krukar: elapsed time t=1.73326 s, 4194304 iters, t-(init.)=1.54994 s t(norm)=0.0461918, mflops=108.244 (err=5.3e-17) 22. Mayer (Buneman): elapsed time t=1.13329 s, 2097152 iters, t-(init.)=1.03329 s t(norm)=0.061589, mflops=81.1833 (err=1.3e-16) 23. Mayer (simple): elapsed time t=1.06662 s, 2097152 iters, t-(init.)=0.966628 s t(norm)=0.0576155, mflops=86.7822 24. Mayer (lookup): elapsed time t=1.16662 s, 2097152 iters, t-(init.)=1.08329 s t(norm)=0.0645691, mflops=77.4364 (err=1.3e-16) 25. Monro: elapsed time t=1.78326 s, 524288 iters, t-(init.)=1.74993 s t(norm)=0.417216, mflops=11.9842 (err=1.3e-16) 26. NAPACK (f2c): elapsed time t=1.53327 s, 524288 iters, t-(init.)=1.49994 s t(norm)=0.357614, mflops=13.9816 (err=1.6e-16) 27. Ooura (C): elapsed time t=1.84993 s, 4194304 iters, t-(init.)=1.6666 s t(norm)=0.0496686, mflops=100.667 (err=5.3e-17) 28. Ooura (F): elapsed time t=1.49994 s, 4194304 iters, t-(init.)=1.29995 s t(norm)=0.0387415, mflops=129.061 (err=5.3e-17) 29. Ransom: elapsed time t=1.04996 s, 262144 iters, t-(init.)=1.04996 s t(norm)=0.500659, mflops=9.98684 (err=1.6e-16) 30. SCIPORT: elapsed time t=1.11662 s, 2097152 iters, t-(init.)=1.03329 s t(norm)=0.061589, mflops=81.1833 (err=6.5e-17) 31. Singleton: elapsed time t=1.46661 s, 1048576 iters, t-(init.)=1.36661 s t(norm)=0.162913, mflops=30.6913 (err=5.3e-17) 32. Singleton (f2c): elapsed time t=1.36661 s, 1048576 iters, t-(init.)=1.31661 s t(norm)=0.156953, mflops=31.8567 (err=5.3e-17) 33. Sorensen: elapsed time t=1.41661 s, 2097152 iters, t-(init.)=1.33328 s t(norm)=0.0794697, mflops=62.9171 (err=1.3e-16) 34. Sorensen DIT: elapsed time t=1.93326 s, 2097152 iters, t-(init.)=1.83326 s t(norm)=0.109271, mflops=45.7579 (err=1.3e-16) 35. Temperton: elapsed time t=1.38328 s, 1048576 iters, t-(init.)=1.33328 s t(norm)=0.158939, mflops=31.4585 (err=5.3e-17) 36. 
Temperton (f2c): elapsed time t=1.81659 s, 1048576 iters, t-(init.)=1.7666 s t(norm)=0.210595, mflops=23.7423 (err=5.3e-17) 37. Valkenburg: elapsed time t=1.21662 s, 524288 iters, t-(init.)=1.19995 s t(norm)=0.286091, mflops=17.477 (err=1.6e-16) 38. DXML: elapsed time t=1.08329 s, 1048576 iters, t-(init.)=1.03329 s t(norm)=0.123178, mflops=40.5917 (err=5.3e-17) Top mflops for N=4 = 324.733 Normalized results and averages for N=4: fft 0: mflops = 115.71 (norm. = 0.356322), norm. avg. (of 2) = 0.498674 fft 1: mflops = 115.71 (norm. = 0.356322), norm. avg. (of 2) = 0.409642 fft 2: mflops = 52.4309 (norm. = 0.161458), norm. avg. (of 2) = 0.296246 fft 3: mflops = 10.3143 (norm. = 0.0317623), norm. avg. (of 2) = 0.0292358 fft 4: mflops = 33.5558 (norm. = 0.103333), norm. avg. (of 2) = 0.131795 fft 5: mflops = 15.5351 (norm. = 0.0478395), norm. avg. (of 2) = 0.0606845 fft 6: mflops = 42.6556 (norm. = 0.131356), norm. avg. (of 2) = 0.127559 fft 7: mflops = 28.2773 (norm. = 0.0870787), norm. avg. (of 2) = 0.0993429 fft 8: mflops = 43.3911 (norm. = 0.133621), norm. avg. (of 2) = 0.259118 fft 9: mflops = 32.2652 (norm. = 0.099359), norm. avg. (of 2) = 0.0913462 fft 10: mflops = 16.7779 (norm. = 0.0516667), norm. avg. (of 2) = 0.0644136 fft 11: mflops = 48.3978 (norm. = 0.149038), norm. avg. (of 1) = 0.149038 fft 12: mflops = 88.3047 (norm. = 0.27193), norm. avg. (of 2) = 0.240132 fft 13: mflops = 62.1403 (norm. = 0.191358), norm. avg. (of 2) = 0.196485 fft 14: mflops = 189.938 (norm. = 0.584906), norm. avg. (of 2) = 0.625786 fft 15: mflops = 314.585 (norm. = 0.96875), norm. avg. (of 2) = 0.863163 fft 16: mflops = 324.733 (norm. = 1), norm. avg. (of 2) = 1 fft 17: mflops = -1 (norm. = -0.00307945), norm. avg. (of 0) = -1 fft 18: mflops = 75.1249 (norm. = 0.231343), norm. avg. (of 2) = 0.25008 fft 19: mflops = 21.6955 (norm. = 0.0668103), norm. avg. (of 2) = 0.0863713 fft 20: mflops = 22.6728 (norm. = 0.0698198), norm. avg. (of 2) = 0.0917281 fft 21: mflops = 108.244 (norm. 
= 0.333333), norm. avg. (of 2) = 0.378531 fft 22: mflops = 81.1833 (norm. = 0.25), norm. avg. (of 1) = 0.25 fft 23: mflops = 86.7822 (norm. = 0.267241), norm. avg. (of 1) = 0.267241 fft 24: mflops = 77.4364 (norm. = 0.238462), norm. avg. (of 1) = 0.238462 fft 25: mflops = 11.9842 (norm. = 0.0369048), norm. avg. (of 1) = 0.0369048 fft 26: mflops = 13.9816 (norm. = 0.0430556), norm. avg. (of 2) = 0.0566401 fft 27: mflops = 100.667 (norm. = 0.31), norm. avg. (of 2) = 0.593596 fft 28: mflops = 129.061 (norm. = 0.397436), norm. avg. (of 2) = 0.527665 fft 29: mflops = 9.98684 (norm. = 0.030754), norm. avg. (of 1) = 0.030754 fft 30: mflops = 81.1833 (norm. = 0.25), norm. avg. (of 1) = 0.25 fft 31: mflops = 30.6913 (norm. = 0.0945122), norm. avg. (of 2) = 0.0912702 fft 32: mflops = 31.8567 (norm. = 0.0981013), norm. avg. (of 2) = 0.0936935 fft 33: mflops = 62.9171 (norm. = 0.19375), norm. avg. (of 2) = 0.357292 fft 34: mflops = 45.7579 (norm. = 0.140909), norm. avg. (of 2) = 0.239373 fft 35: mflops = 31.4585 (norm. = 0.096875), norm. avg. (of 2) = 0.0912457 fft 36: mflops = 23.7423 (norm. = 0.0731132), norm. avg. (of 2) = 0.072068 fft 37: mflops = 17.477 (norm. = 0.0538194), norm. avg. (of 2) = 0.106024 fft 38: mflops = 40.5917 (norm. = 0.125), norm. avg. (of 2) = 0.114583 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.53327 s, 2097152 iters, t-(init.)=1.19995 s t(norm)=0.0238409, mflops=209.724 (err=1.2e-16) 1. Arndt DIT: elapsed time t=1.54994 s, 2097152 iters, t-(init.)=1.36661 s t(norm)=0.0271521, mflops=184.148 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.13329 s, 524288 iters, t-(init.)=1.09996 s t(norm)=0.0874166, mflops=57.1973 (err=1.1e-16) 3. Arndt 4-step: elapsed time t=1.11662 s, 131072 iters, t-(init.)=1.09996 s t(norm)=0.349667, mflops=14.2993 (err=1.3e-16) 4. Bailey: elapsed time t=1.86659 s, 1048576 iters, t-(init.)=1.7666 s t(norm)=0.0701982, mflops=71.2269 (err=9.8e-17) 5. 
Beauregard: elapsed time t=1.53327 s, 262144 iters, t-(init.)=1.49994 s t(norm)=0.238409, mflops=20.9724 (err=1.2e-16) 6. Bergland: elapsed time t=1.89992 s, 1048576 iters, t-(init.)=1.79993 s t(norm)=0.0715227, mflops=69.9079 (err=1.3e-16) 7. Brenner: elapsed time t=1.6666 s, 524288 iters, t-(init.)=1.6166 s t(norm)=0.128476, mflops=38.9178 (err=1.2e-16) 8. Burrus: elapsed time t=1.43328 s, 524288 iters, t-(init.)=1.38328 s t(norm)=0.109933, mflops=45.4822 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.5666 s, 1048576 iters, t-(init.)=1.48327 s t(norm)=0.05894, mflops=84.832 10. CWP (best N) (N=15): elapsed time t=1.31661 s, 524288 iters, t-(init.)=1.24995 s t(norm)=0.0993371, mflops=50.3337 11. Edelblute: elapsed time t=1.48327 s, 524288 iters, t-(init.)=1.43328 s t(norm)=0.113907, mflops=43.8956 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.18329 s, 1048576 iters, t-(init.)=1.09996 s t(norm)=0.0437083, mflops=114.395 (err=1.2e-16) 13. FFTPACK (f2c): elapsed time t=1.43328 s, 1048576 iters, t-(init.)=1.34995 s t(norm)=0.053642, mflops=93.2105 (err=1.2e-16) FFTW_MEASURE plan: (cost = 3.973484e-07) FFTW_NOTW 8 14. FFTW: elapsed time t=1.83326 s, 4194304 iters, t-(init.)=1.48327 s t(norm)=0.014735, mflops=339.328 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.04996 s, 2097152 iters, t-(init.)=0.699972 s t(norm)=0.0139072, mflops=359.526 (err=1.2e-16) 16. Frigo-old: elapsed time t=1.34995 s, 4194304 iters, t-(init.)=0.983294 s t(norm)=0.00976815, mflops=511.868 (err=1.4e-16) 17. Green: elapsed time t=1.63327 s, 2097152 iters, t-(init.)=1.44994 s t(norm)=0.0288078, mflops=173.564 (err=1.4e-16) 18. GSL: elapsed time t=1.31661 s, 1048576 iters, t-(init.)=1.23328 s t(norm)=0.0490063, mflops=102.028 (err=1.4e-16) 19. GSL DIT: elapsed time t=1.78326 s, 524288 iters, t-(init.)=1.73326 s t(norm)=0.137747, mflops=36.2983 (err=1.2e-16) 20. 
GSL DIF: elapsed time t=1.69993 s, 524288 iters, t-(init.)=1.6666 s t(norm)=0.132449, mflops=37.7502 (err=1.4e-16) 21. Krukar: elapsed time t=1.38328 s, 2097152 iters, t-(init.)=1.19995 s t(norm)=0.0238409, mflops=209.724 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.21662 s, 1048576 iters, t-(init.)=1.11662 s t(norm)=0.0443706, mflops=112.687 (err=1.2e-16) 23. Mayer (simple): elapsed time t=1.18329 s, 1048576 iters, t-(init.)=1.09996 s t(norm)=0.0437083, mflops=114.395 24. Mayer (lookup): elapsed time t=1.24995 s, 1048576 iters, t-(init.)=1.16662 s t(norm)=0.0463573, mflops=107.858 (err=1.2e-16) 25. Monro: elapsed time t=1.23328 s, 262144 iters, t-(init.)=1.19995 s t(norm)=0.190727, mflops=26.2154 (err=1.1e-08) 26. NAPACK (f2c): elapsed time t=1.54994 s, 262144 iters, t-(init.)=1.53327 s t(norm)=0.243707, mflops=20.5164 (err=1.7e-16) 27. Ooura (C): elapsed time t=1.43328 s, 2097152 iters, t-(init.)=1.24995 s t(norm)=0.0248343, mflops=201.335 (err=1.3e-16) 28. Ooura (F): elapsed time t=1.46661 s, 2097152 iters, t-(init.)=1.28328 s t(norm)=0.0254965, mflops=196.105 (err=1.3e-16) 29. Ransom: elapsed time t=1.11662 s, 131072 iters, t-(init.)=1.09996 s t(norm)=0.349667, mflops=14.2993 (err=3.4e-16) 30. SCIPORT: elapsed time t=1.09996 s, 1048576 iters, t-(init.)=1.01663 s t(norm)=0.0403971, mflops=123.771 (err=1.4e-16) 31. Singleton: elapsed time t=1.94992 s, 524288 iters, t-(init.)=1.88326 s t(norm)=0.149668, mflops=33.4073 (err=1.4e-16) 32. Singleton (f2c): elapsed time t=1.96659 s, 524288 iters, t-(init.)=1.91659 s t(norm)=0.152317, mflops=32.8263 (err=1.4e-16) 33. Sorensen: elapsed time t=1.36661 s, 1048576 iters, t-(init.)=1.26662 s t(norm)=0.0503308, mflops=99.3428 (err=1.5e-16) 34. Sorensen DIT: elapsed time t=1.44994 s, 524288 iters, t-(init.)=1.39994 s t(norm)=0.111258, mflops=44.9408 (err=1.1e-16) 35. Temperton: elapsed time t=1.08329 s, 524288 iters, t-(init.)=1.03329 s t(norm)=0.0821187, mflops=60.8875 (err=4.6e-09) 36. 
Temperton (f2c): elapsed time t=1.54994 s, 524288 iters, t-(init.)=1.51661 s t(norm)=0.120529, mflops=41.4838 (err=1.4e-16) 37. Valkenburg: elapsed time t=1.68327 s, 262144 iters, t-(init.)=1.6666 s t(norm)=0.264899, mflops=18.8751 (err=1.5e-16) 38. DXML: elapsed time t=1.29995 s, 1048576 iters, t-(init.)=1.19995 s t(norm)=0.0476818, mflops=104.862 (err=1.0e-15) Top mflops for N=8 = 511.868 Normalized results and averages for N=8: fft 0: mflops = 209.724 (norm. = 0.409722), norm. avg. (of 3) = 0.469023 fft 1: mflops = 184.148 (norm. = 0.359756), norm. avg. (of 3) = 0.393014 fft 2: mflops = 57.1973 (norm. = 0.111742), norm. avg. (of 3) = 0.234745 fft 3: mflops = 14.2993 (norm. = 0.0279356), norm. avg. (of 3) = 0.0288024 fft 4: mflops = 71.2269 (norm. = 0.139151), norm. avg. (of 3) = 0.134247 fft 5: mflops = 20.9724 (norm. = 0.0409722), norm. avg. (of 3) = 0.0541137 fft 6: mflops = 69.9079 (norm. = 0.136574), norm. avg. (of 3) = 0.130564 fft 7: mflops = 38.9178 (norm. = 0.0760309), norm. avg. (of 3) = 0.0915722 fft 8: mflops = 45.4822 (norm. = 0.0888554), norm. avg. (of 3) = 0.202364 fft 9: mflops = 84.832 (norm. = 0.16573), norm. avg. (of 3) = 0.116141 fft 10: mflops = 50.3337 (norm. = 0.0983333), norm. avg. (of 3) = 0.0757202 fft 11: mflops = 43.8956 (norm. = 0.0857558), norm. avg. (of 2) = 0.117397 fft 12: mflops = 114.395 (norm. = 0.223485), norm. avg. (of 3) = 0.234583 fft 13: mflops = 93.2105 (norm. = 0.182099), norm. avg. (of 3) = 0.19169 fft 14: mflops = 339.328 (norm. = 0.662921), norm. avg. (of 3) = 0.638165 fft 15: mflops = 359.526 (norm. = 0.702381), norm. avg. (of 3) = 0.809569 fft 16: mflops = 511.868 (norm. = 1), norm. avg. (of 3) = 1 fft 17: mflops = 173.564 (norm. = 0.33908), norm. avg. (of 1) = 0.33908 fft 18: mflops = 102.028 (norm. = 0.199324), norm. avg. (of 3) = 0.233162 fft 19: mflops = 36.2983 (norm. = 0.0709135), norm. avg. (of 3) = 0.0812187 fft 20: mflops = 37.7502 (norm. = 0.07375), norm. avg. 
(of 3) = 0.0857354 fft 21: mflops = 209.724 (norm. = 0.409722), norm. avg. (of 3) = 0.388928 fft 22: mflops = 112.687 (norm. = 0.220149), norm. avg. (of 2) = 0.235075 fft 23: mflops = 114.395 (norm. = 0.223485), norm. avg. (of 2) = 0.245363 fft 24: mflops = 107.858 (norm. = 0.210714), norm. avg. (of 2) = 0.224588 fft 25: mflops = 26.2154 (norm. = 0.0512153), norm. avg. (of 2) = 0.04406 fft 26: mflops = 20.5164 (norm. = 0.0400815), norm. avg. (of 3) = 0.0511206 fft 27: mflops = 201.335 (norm. = 0.393333), norm. avg. (of 3) = 0.526842 fft 28: mflops = 196.105 (norm. = 0.383117), norm. avg. (of 3) = 0.479483 fft 29: mflops = 14.2993 (norm. = 0.0279356), norm. avg. (of 2) = 0.0293448 fft 30: mflops = 123.771 (norm. = 0.241803), norm. avg. (of 2) = 0.245902 fft 31: mflops = 33.4073 (norm. = 0.0652655), norm. avg. (of 3) = 0.082602 fft 32: mflops = 32.8263 (norm. = 0.0641304), norm. avg. (of 3) = 0.0838391 fft 33: mflops = 99.3428 (norm. = 0.194079), norm. avg. (of 3) = 0.302887 fft 34: mflops = 44.9408 (norm. = 0.0877976), norm. avg. (of 3) = 0.188848 fft 35: mflops = 60.8875 (norm. = 0.118952), norm. avg. (of 3) = 0.100481 fft 36: mflops = 41.4838 (norm. = 0.081044), norm. avg. (of 3) = 0.07506 fft 37: mflops = 18.8751 (norm. = 0.036875), norm. avg. (of 3) = 0.0829741 fft 38: mflops = 104.862 (norm. = 0.204861), norm. avg. (of 3) = 0.144676 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.63327 s, 524288 iters, t-(init.)=1.53327 s t(norm)=0.0456951, mflops=109.421 (err=1.5e-16) 1. Arndt DIT: elapsed time t=1.81659 s, 524288 iters, t-(init.)=1.7666 s t(norm)=0.0526487, mflops=94.9692 (err=2.2e-16) 2. Arndt Split-Radix: elapsed time t=1.29995 s, 262144 iters, t-(init.)=1.26662 s t(norm)=0.0754962, mflops=66.2285 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.64993 s, 131072 iters, t-(init.)=1.63327 s t(norm)=0.194701, mflops=25.6804 (err=2.0e-16) 4. 
Bailey: elapsed time t=1.64993 s, 524288 iters, t-(init.)=1.58327 s t(norm)=0.0471851, mflops=105.966 (err=2.0e-16) 5. Beauregard: elapsed time t=1.79993 s, 131072 iters, t-(init.)=1.78326 s t(norm)=0.212581, mflops=23.5204 (err=2.7e-16) 6. Bergland: elapsed time t=1.83326 s, 524288 iters, t-(init.)=1.78326 s t(norm)=0.0531453, mflops=94.0816 (err=2.6e-16) 7. Brenner: elapsed time t=1.41661 s, 262144 iters, t-(init.)=1.39994 s t(norm)=0.0834432, mflops=59.921 (err=2.1e-16) 8. Burrus: elapsed time t=1.73326 s, 262144 iters, t-(init.)=1.69993 s t(norm)=0.101324, mflops=49.3467 (err=1.4e-16) 9. CWP (min N): elapsed time t=1.28328 s, 524288 iters, t-(init.)=1.23328 s t(norm)=0.0367547, mflops=136.037 10. CWP (best N) (N=28): elapsed time t=1.6166 s, 524288 iters, t-(init.)=1.53327 s t(norm)=0.0456951, mflops=109.421 11. Edelblute: elapsed time t=1.89992 s, 262144 iters, t-(init.)=1.86659 s t(norm)=0.111258, mflops=44.9408 (err=1.4e-16) 12. FFTPACK: elapsed time t=1.51661 s, 1048576 iters, t-(init.)=1.39994 s t(norm)=0.0208608, mflops=239.684 (err=1.8e-16) 13. FFTPACK (f2c): elapsed time t=1.09996 s, 524288 iters, t-(init.)=1.04996 s t(norm)=0.0312912, mflops=159.789 (err=1.8e-16) FFTW_MEASURE plan: (cost = 8.264847e-07) FFTW_NOTW 16 14. FFTW: elapsed time t=1.64993 s, 2097152 iters, t-(init.)=1.41661 s t(norm)=0.0105546, mflops=473.729 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.84993 s, 2097152 iters, t-(init.)=1.44994 s t(norm)=0.0108029, mflops=462.838 (err=1.8e-16) 16. Frigo-old: elapsed time t=1.36661 s, 2097152 iters, t-(init.)=1.13329 s t(norm)=0.00844365, mflops=592.161 (err=1.8e-16) 17. Green: elapsed time t=1.59994 s, 1048576 iters, t-(init.)=1.48327 s t(norm)=0.0221025, mflops=226.219 (err=1.9e-16) 18. GSL: elapsed time t=1.08329 s, 524288 iters, t-(init.)=1.03329 s t(norm)=0.0307945, mflops=162.367 (err=1.8e-16) 19. 
GSL DIT: elapsed time t=1.54994 s, 262144 iters, t-(init.)=1.51661 s t(norm)=0.0903968, mflops=55.3117 (err=2.1e-16) 20. GSL DIF: elapsed time t=1.41661 s, 262144 iters, t-(init.)=1.38328 s t(norm)=0.0824498, mflops=60.643 (err=2.8e-16) 21. Krukar: elapsed time t=1.33328 s, 1048576 iters, t-(init.)=1.23328 s t(norm)=0.0183774, mflops=272.074 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.78326 s, 524288 iters, t-(init.)=1.73326 s t(norm)=0.0516553, mflops=96.7955 (err=1.7e-16) 23. Mayer (simple): elapsed time t=1.43328 s, 524288 iters, t-(init.)=1.36661 s t(norm)=0.0407282, mflops=122.765 24. Mayer (lookup): elapsed time t=1.49994 s, 524288 iters, t-(init.)=1.44994 s t(norm)=0.0432116, mflops=115.71 (err=1.8e-16) 25. Monro: elapsed time t=1.84993 s, 262144 iters, t-(init.)=1.83326 s t(norm)=0.109271, mflops=45.7579 (err=2.1e-08) 26. NAPACK (f2c): elapsed time t=1.46661 s, 131072 iters, t-(init.)=1.46661 s t(norm)=0.174833, mflops=28.5987 (err=3.3e-16) 27. Ooura (C): elapsed time t=1.59994 s, 1048576 iters, t-(init.)=1.48327 s t(norm)=0.0221025, mflops=226.219 (err=2.0e-16) 28. Ooura (F): elapsed time t=1.68327 s, 1048576 iters, t-(init.)=1.5666 s t(norm)=0.0233442, mflops=214.186 (err=2.0e-16) 29. Ransom: elapsed time t=1.09996 s, 131072 iters, t-(init.)=1.08329 s t(norm)=0.129138, mflops=38.7182 (err=3.4e-16) 30. SCIPORT: elapsed time t=1.03329 s, 524288 iters, t-(init.)=0.983294 s t(norm)=0.0293044, mflops=170.623 (err=2.8e-16) 31. Singleton: elapsed time t=1.01663 s, 262144 iters, t-(init.)=0.983294 s t(norm)=0.0586089, mflops=85.3113 (err=1.7e-16) 32. Singleton (f2c): elapsed time t=1.94992 s, 524288 iters, t-(init.)=1.88326 s t(norm)=0.0561255, mflops=89.0861 (err=1.7e-16) 33. Sorensen: elapsed time t=1.34995 s, 524288 iters, t-(init.)=1.29995 s t(norm)=0.0387415, mflops=129.061 (err=1.5e-16) 34. Sorensen DIT: elapsed time t=1.68327 s, 262144 iters, t-(init.)=1.64993 s t(norm)=0.0983437, mflops=50.8421 (err=1.6e-16) 35. 
Temperton: elapsed time t=1.88326 s, 524288 iters, t-(init.)=1.83326 s t(norm)=0.0546354, mflops=91.5157 (err=1.7e-08) 36. Temperton (f2c): elapsed time t=1.18329 s, 262144 iters, t-(init.)=1.14995 s t(norm)=0.0685426, mflops=72.9473 (err=1.8e-16) 37. Valkenburg: elapsed time t=1.06662 s, 65536 iters, t-(init.)=1.06662 s t(norm)=0.254303, mflops=19.6616 (err=2.9e-16) 38. DXML: elapsed time t=2.01659 s, 1048576 iters, t-(init.)=1.89992 s t(norm)=0.0283111, mflops=176.609 (err=2.0e-16) Top mflops for N=16 = 592.161 Normalized results and averages for N=16: fft 0: mflops = 109.421 (norm. = 0.184783), norm. avg. (of 4) = 0.397963 fft 1: mflops = 94.9692 (norm. = 0.160377), norm. avg. (of 4) = 0.334855 fft 2: mflops = 66.2285 (norm. = 0.111842), norm. avg. (of 4) = 0.204019 fft 3: mflops = 25.6804 (norm. = 0.0433673), norm. avg. (of 4) = 0.0324437 fft 4: mflops = 105.966 (norm. = 0.178947), norm. avg. (of 4) = 0.145422 fft 5: mflops = 23.5204 (norm. = 0.0397196), norm. avg. (of 4) = 0.0505152 fft 6: mflops = 94.0816 (norm. = 0.158879), norm. avg. (of 4) = 0.137643 fft 7: mflops = 59.921 (norm. = 0.10119), norm. avg. (of 4) = 0.0939768 fft 8: mflops = 49.3467 (norm. = 0.0833333), norm. avg. (of 4) = 0.172606 fft 9: mflops = 136.037 (norm. = 0.22973), norm. avg. (of 4) = 0.144538 fft 10: mflops = 109.421 (norm. = 0.184783), norm. avg. (of 4) = 0.102986 fft 11: mflops = 44.9408 (norm. = 0.0758929), norm. avg. (of 3) = 0.103562 fft 12: mflops = 239.684 (norm. = 0.404762), norm. avg. (of 4) = 0.277127 fft 13: mflops = 159.789 (norm. = 0.269841), norm. avg. (of 4) = 0.211228 fft 14: mflops = 473.729 (norm. = 0.8), norm. avg. (of 4) = 0.678623 fft 15: mflops = 462.838 (norm. = 0.781609), norm. avg. (of 4) = 0.802579 fft 16: mflops = 592.161 (norm. = 1), norm. avg. (of 4) = 1 fft 17: mflops = 226.219 (norm. = 0.382022), norm. avg. (of 2) = 0.360551 fft 18: mflops = 162.367 (norm. = 0.274194), norm. avg. (of 4) = 0.24342 fft 19: mflops = 55.3117 (norm. = 0.0934066), norm. avg. 
(of 4) = 0.0842657 fft 20: mflops = 60.643 (norm. = 0.10241), norm. avg. (of 4) = 0.089904 fft 21: mflops = 272.074 (norm. = 0.459459), norm. avg. (of 4) = 0.406561 fft 22: mflops = 96.7955 (norm. = 0.163462), norm. avg. (of 3) = 0.211204 fft 23: mflops = 122.765 (norm. = 0.207317), norm. avg. (of 3) = 0.232681 fft 24: mflops = 115.71 (norm. = 0.195402), norm. avg. (of 3) = 0.214859 fft 25: mflops = 45.7579 (norm. = 0.0772727), norm. avg. (of 3) = 0.0551309 fft 26: mflops = 28.5987 (norm. = 0.0482955), norm. avg. (of 4) = 0.0504143 fft 27: mflops = 226.219 (norm. = 0.382022), norm. avg. (of 4) = 0.490637 fft 28: mflops = 214.186 (norm. = 0.361702), norm. avg. (of 4) = 0.450037 fft 29: mflops = 38.7182 (norm. = 0.0653846), norm. avg. (of 3) = 0.0413581 fft 30: mflops = 170.623 (norm. = 0.288136), norm. avg. (of 3) = 0.25998 fft 31: mflops = 85.3113 (norm. = 0.144068), norm. avg. (of 4) = 0.0979684 fft 32: mflops = 89.0861 (norm. = 0.150442), norm. avg. (of 4) = 0.10049 fft 33: mflops = 129.061 (norm. = 0.217949), norm. avg. (of 4) = 0.281653 fft 34: mflops = 50.8421 (norm. = 0.0858586), norm. avg. (of 4) = 0.163101 fft 35: mflops = 91.5157 (norm. = 0.154545), norm. avg. (of 4) = 0.113997 fft 36: mflops = 72.9473 (norm. = 0.123188), norm. avg. (of 4) = 0.0870921 fft 37: mflops = 19.6616 (norm. = 0.0332031), norm. avg. (of 4) = 0.0705314 fft 38: mflops = 176.609 (norm. = 0.298246), norm. avg. (of 4) = 0.183068 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.64993 s, 262144 iters, t-(init.)=1.58327 s t(norm)=0.0377481, mflops=132.457 (err=3.1e-16) 1. Arndt DIT: elapsed time t=1.88326 s, 262144 iters, t-(init.)=1.83326 s t(norm)=0.0437083, mflops=114.395 (err=2.5e-16) 2. Arndt Split-Radix: elapsed time t=1.44994 s, 131072 iters, t-(init.)=1.43328 s t(norm)=0.0683439, mflops=73.1594 (err=2.7e-16) 3. Arndt 4-step: elapsed time t=1.79993 s, 65536 iters, t-(init.)=1.78326 s t(norm)=0.170065, mflops=29.4005 (err=2.8e-16) 4. 
Bailey: elapsed time t=1.63327 s, 262144 iters, t-(init.)=1.58327 s t(norm)=0.0377481, mflops=132.457 (err=2.7e-16) 5. Beauregard: elapsed time t=1.08329 s, 32768 iters, t-(init.)=1.08329 s t(norm)=0.206621, mflops=24.1989 (err=1.8e-16) 6. Bergland: elapsed time t=1.5666 s, 262144 iters, t-(init.)=1.53327 s t(norm)=0.0365561, mflops=136.776 (err=2.6e-16) 7. Brenner: elapsed time t=1.39994 s, 131072 iters, t-(init.)=1.38328 s t(norm)=0.0659598, mflops=75.8037 (err=2.2e-16) 8. Burrus: elapsed time t=1.93326 s, 131072 iters, t-(init.)=1.89992 s t(norm)=0.0905954, mflops=55.1904 (err=2.9e-16) 9. CWP (min N) (N=33): elapsed time t=1.23328 s, 262144 iters, t-(init.)=1.19995 s t(norm)=0.0286091, mflops=174.77 10. CWP (best N) (N=35): elapsed time t=1.99992 s, 524288 iters, t-(init.)=1.88326 s t(norm)=0.0224502, mflops=222.715 11. Edelblute: elapsed time t=1.08329 s, 65536 iters, t-(init.)=1.06662 s t(norm)=0.101721, mflops=49.154 (err=2.9e-16) 12. FFTPACK: elapsed time t=1.13329 s, 262144 iters, t-(init.)=1.09996 s t(norm)=0.026225, mflops=190.658 (err=1.9e-16) 13. FFTPACK (f2c): elapsed time t=1.64993 s, 262144 iters, t-(init.)=1.59994 s t(norm)=0.0381454, mflops=131.077 (err=1.9e-16) FFTW_MEASURE plan: (cost = 1.398666e-06) FFTW_NOTW 32 14. FFTW: elapsed time t=1.41661 s, 1048576 iters, t-(init.)=1.23328 s t(norm)=0.00735095, mflops=680.185 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.41661 s, 1048576 iters, t-(init.)=1.14995 s t(norm)=0.00685426, mflops=729.473 (err=2.1e-16) 16. Frigo-old: elapsed time t=1.6666 s, 1048576 iters, t-(init.)=1.48327 s t(norm)=0.008841, mflops=565.547 (err=2.2e-16) 17. Green: elapsed time t=1.44994 s, 524288 iters, t-(init.)=1.34995 s t(norm)=0.0160926, mflops=310.702 (err=2.0e-16) 18. GSL: elapsed time t=1.41661 s, 262144 iters, t-(init.)=1.36661 s t(norm)=0.0325826, mflops=153.456 (err=2.0e-16) 19. 
GSL DIT: elapsed time t=1.48327 s, 131072 iters, t-(init.)=1.46661 s t(norm)=0.0699333, mflops=71.4967 (err=2.2e-16) 20. GSL DIF: elapsed time t=1.28328 s, 131072 iters, t-(init.)=1.24995 s t(norm)=0.0596023, mflops=83.8894 (err=2.5e-16) 21. Krukar: elapsed time t=1.74993 s, 524288 iters, t-(init.)=1.6666 s t(norm)=0.0198674, mflops=251.668 (err=2.2e-16) 22. Mayer (Buneman): elapsed time t=1.88326 s, 262144 iters, t-(init.)=1.84993 s t(norm)=0.0441057, mflops=113.364 (err=2.7e-16) 23. Mayer (simple): elapsed time t=1.53327 s, 262144 iters, t-(init.)=1.48327 s t(norm)=0.035364, mflops=141.387 24. Mayer (lookup): elapsed time t=1.58327 s, 262144 iters, t-(init.)=1.53327 s t(norm)=0.0365561, mflops=136.776 (err=2.5e-16) 25. Monro: elapsed time t=1.43328 s, 131072 iters, t-(init.)=1.39994 s t(norm)=0.0667545, mflops=74.9013 (err=3.7e-08) 26. NAPACK (f2c): elapsed time t=1.5666 s, 65536 iters, t-(init.)=1.54994 s t(norm)=0.147814, mflops=33.8264 (err=5.4e-16) 27. Ooura (C): elapsed time t=1.68327 s, 524288 iters, t-(init.)=1.59994 s t(norm)=0.0190727, mflops=262.154 (err=2.7e-16) 28. Ooura (F): elapsed time t=1.78326 s, 524288 iters, t-(init.)=1.69993 s t(norm)=0.0202648, mflops=246.734 (err=2.7e-16) 29. Ransom: elapsed time t=1.28328 s, 65536 iters, t-(init.)=1.26662 s t(norm)=0.120794, mflops=41.3928 (err=7.0e-16) 30. SCIPORT: elapsed time t=1.01663 s, 262144 iters, t-(init.)=0.966628 s t(norm)=0.0230462, mflops=216.955 (err=1.8e-16) 31. Singleton: elapsed time t=1.91659 s, 262144 iters, t-(init.)=1.86659 s t(norm)=0.044503, mflops=112.352 (err=2.2e-16) 32. Singleton (f2c): elapsed time t=1.89992 s, 262144 iters, t-(init.)=1.84993 s t(norm)=0.0441057, mflops=113.364 (err=2.2e-16) 33. Sorensen: elapsed time t=1.28328 s, 262144 iters, t-(init.)=1.23328 s t(norm)=0.0294038, mflops=170.046 (err=2.7e-16) 34. Sorensen DIT: elapsed time t=1.84993 s, 131072 iters, t-(init.)=1.83326 s t(norm)=0.0874166, mflops=57.1973 (err=2.6e-16) 35. 
Temperton: elapsed time t=1.96659 s, 262144 iters, t-(init.)=1.91659 s t(norm)=0.0456951, mflops=109.421 (err=3.1e-08) 36. Temperton (f2c): elapsed time t=1.31661 s, 131072 iters, t-(init.)=1.28328 s t(norm)=0.0611917, mflops=81.7105 (err=2.0e-16) 37. Valkenburg: elapsed time t=1.23328 s, 32768 iters, t-(init.)=1.21662 s t(norm)=0.232051, mflops=21.5469 (err=4.3e-16) 38. DXML: elapsed time t=1.79993 s, 524288 iters, t-(init.)=1.7166 s t(norm)=0.0204634, mflops=244.338 (err=1.1e-15) Top mflops for N=32 = 729.473 Normalized results and averages for N=32: fft 0: mflops = 132.457 (norm. = 0.181579), norm. avg. (of 5) = 0.354686 fft 1: mflops = 114.395 (norm. = 0.156818), norm. avg. (of 5) = 0.299247 fft 2: mflops = 73.1594 (norm. = 0.100291), norm. avg. (of 5) = 0.183274 fft 3: mflops = 29.4005 (norm. = 0.0403037), norm. avg. (of 5) = 0.0340157 fft 4: mflops = 132.457 (norm. = 0.181579), norm. avg. (of 5) = 0.152653 fft 5: mflops = 24.1989 (norm. = 0.0331731), norm. avg. (of 5) = 0.0470468 fft 6: mflops = 136.776 (norm. = 0.1875), norm. avg. (of 5) = 0.147614 fft 7: mflops = 75.8037 (norm. = 0.103916), norm. avg. (of 5) = 0.0959646 fft 8: mflops = 55.1904 (norm. = 0.0756579), norm. avg. (of 5) = 0.153217 fft 9: mflops = 174.77 (norm. = 0.239583), norm. avg. (of 5) = 0.163547 fft 10: mflops = 222.715 (norm. = 0.30531), norm. avg. (of 5) = 0.143451 fft 11: mflops = 49.154 (norm. = 0.0673828), norm. avg. (of 4) = 0.0945175 fft 12: mflops = 190.658 (norm. = 0.261364), norm. avg. (of 5) = 0.273975 fft 13: mflops = 131.077 (norm. = 0.179688), norm. avg. (of 5) = 0.20492 fft 14: mflops = 680.185 (norm. = 0.932432), norm. avg. (of 5) = 0.729385 fft 15: mflops = 729.473 (norm. = 1), norm. avg. (of 5) = 0.842063 fft 16: mflops = 565.547 (norm. = 0.775281), norm. avg. (of 5) = 0.955056 fft 17: mflops = 310.702 (norm. = 0.425926), norm. avg. (of 3) = 0.382343 fft 18: mflops = 153.456 (norm. = 0.210366), norm. avg. (of 5) = 0.236809 fft 19: mflops = 71.4967 (norm. 
= 0.0980114), norm. avg. (of 5) = 0.0870148 fft 20: mflops = 83.8894 (norm. = 0.115), norm. avg. (of 5) = 0.0949232 fft 21: mflops = 251.668 (norm. = 0.345), norm. avg. (of 5) = 0.394249 fft 22: mflops = 113.364 (norm. = 0.155405), norm. avg. (of 4) = 0.197254 fft 23: mflops = 141.387 (norm. = 0.19382), norm. avg. (of 4) = 0.222966 fft 24: mflops = 136.776 (norm. = 0.1875), norm. avg. (of 4) = 0.20802 fft 25: mflops = 74.9013 (norm. = 0.102679), norm. avg. (of 4) = 0.0670178 fft 26: mflops = 33.8264 (norm. = 0.046371), norm. avg. (of 5) = 0.0496056 fft 27: mflops = 262.154 (norm. = 0.359375), norm. avg. (of 5) = 0.464385 fft 28: mflops = 246.734 (norm. = 0.338235), norm. avg. (of 5) = 0.427677 fft 29: mflops = 41.3928 (norm. = 0.0567434), norm. avg. (of 4) = 0.0452044 fft 30: mflops = 216.955 (norm. = 0.297414), norm. avg. (of 4) = 0.269338 fft 31: mflops = 112.352 (norm. = 0.154018), norm. avg. (of 5) = 0.109178 fft 32: mflops = 113.364 (norm. = 0.155405), norm. avg. (of 5) = 0.111473 fft 33: mflops = 170.046 (norm. = 0.233108), norm. avg. (of 5) = 0.271944 fft 34: mflops = 57.1973 (norm. = 0.0784091), norm. avg. (of 5) = 0.146162 fft 35: mflops = 109.421 (norm. = 0.15), norm. avg. (of 5) = 0.121198 fft 36: mflops = 81.7105 (norm. = 0.112013), norm. avg. (of 5) = 0.0920763 fft 37: mflops = 21.5469 (norm. = 0.0295377), norm. avg. (of 5) = 0.0623326 fft 38: mflops = 244.338 (norm. = 0.334951), norm. avg. (of 5) = 0.213445 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=2.01659 s, 131072 iters, t-(init.)=1.96659 s t(norm)=0.0390726, mflops=127.967 (err=5.7e-16) 1. Arndt DIT: elapsed time t=1.14995 s, 65536 iters, t-(init.)=1.13329 s t(norm)=0.0450328, mflops=111.03 (err=5.7e-16) 2. Arndt Split-Radix: elapsed time t=1.5666 s, 65536 iters, t-(init.)=1.53327 s t(norm)=0.0609268, mflops=82.0658 (err=5.7e-16) 3. Arndt 4-step: elapsed time t=1.53327 s, 32768 iters, t-(init.)=1.53327 s t(norm)=0.121854, mflops=41.0329 (err=5.6e-16) 4. 
Bailey: elapsed time t=1.7166 s, 131072 iters, t-(init.)=1.68327 s t(norm)=0.0334435, mflops=149.506 (err=5.6e-16) 5. Beauregard: elapsed time t=1.33328 s, 16384 iters, t-(init.)=1.33328 s t(norm)=0.211919, mflops=23.5939 (err=6.0e-16) 6. Bergland: elapsed time t=1.59994 s, 131072 iters, t-(init.)=1.5666 s t(norm)=0.0311256, mflops=160.639 (err=6.0e-16) 7. Brenner: elapsed time t=1.31661 s, 65536 iters, t-(init.)=1.28328 s t(norm)=0.050993, mflops=98.0526 (err=5.9e-16) 8. Burrus: elapsed time t=1.03329 s, 32768 iters, t-(init.)=1.01663 s t(norm)=0.0807942, mflops=61.8856 (err=5.7e-16) 9. CWP (min N) (N=65): elapsed time t=1.09996 s, 131072 iters, t-(init.)=1.04996 s t(norm)=0.0208608, mflops=239.684 10. CWP (best N) (N=84): elapsed time t=1.94992 s, 262144 iters, t-(init.)=1.84993 s t(norm)=0.0183774, mflops=272.074 11. Edelblute: elapsed time t=1.19995 s, 32768 iters, t-(init.)=1.19995 s t(norm)=0.0953636, mflops=52.4309 (err=5.7e-16) 12. FFTPACK: elapsed time t=1.81659 s, 262144 iters, t-(init.)=1.74993 s t(norm)=0.017384, mflops=287.621 (err=5.5e-16) 13. FFTPACK (f2c): elapsed time t=1.51661 s, 131072 iters, t-(init.)=1.48327 s t(norm)=0.02947, mflops=169.664 (err=5.5e-16) FFTW_MEASURE plan: (cost = 3.305939e-06) FFTW_NOTW 64 14. FFTW: elapsed time t=1.74993 s, 524288 iters, t-(init.)=1.59994 s t(norm)=0.00794697, mflops=629.171 (err=5.3e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.21662 s, 262144 iters, t-(init.)=1.11662 s t(norm)=0.0110926, mflops=450.749 (err=5.5e-16) 16. Frigo-old: elapsed time t=1.38328 s, 262144 iters, t-(init.)=1.31661 s t(norm)=0.0130794, mflops=382.281 (err=5.6e-16) 17. Green: elapsed time t=1.36661 s, 262144 iters, t-(init.)=1.28328 s t(norm)=0.0127483, mflops=392.21 (err=5.5e-16) 18. GSL: elapsed time t=1.31661 s, 131072 iters, t-(init.)=1.28328 s t(norm)=0.0254965, mflops=196.105 (err=5.5e-16) 19. 
GSL DIT: elapsed time t=1.58327 s, 65536 iters, t-(init.)=1.5666 s t(norm)=0.0622512, mflops=80.3197 (err=5.6e-16) 20. GSL DIF: elapsed time t=1.31661 s, 65536 iters, t-(init.)=1.28328 s t(norm)=0.050993, mflops=98.0526 (err=5.4e-16) 21. Krukar: elapsed time t=1.06662 s, 131072 iters, t-(init.)=1.01663 s t(norm)=0.0201985, mflops=247.543 (err=6.0e-16) 22. Mayer (Buneman): elapsed time t=1.16662 s, 65536 iters, t-(init.)=1.14995 s t(norm)=0.0456951, mflops=109.421 (err=5.4e-16) 23. Mayer (simple): elapsed time t=1.81659 s, 131072 iters, t-(init.)=1.7666 s t(norm)=0.0350991, mflops=142.454 24. Mayer (lookup): elapsed time t=1.84993 s, 131072 iters, t-(init.)=1.81659 s t(norm)=0.0360925, mflops=138.533 (err=5.5e-16) 25. Monro: elapsed time t=1.24995 s, 65536 iters, t-(init.)=1.23328 s t(norm)=0.0490063, mflops=102.028 (err=3.4e-08) 26. NAPACK (f2c): elapsed time t=1.5666 s, 32768 iters, t-(init.)=1.54994 s t(norm)=0.123178, mflops=40.5917 (err=1.1e-15) 27. Ooura (C): elapsed time t=1.6166 s, 262144 iters, t-(init.)=1.53327 s t(norm)=0.0152317, mflops=328.263 (err=5.9e-16) 28. Ooura (F): elapsed time t=1.68327 s, 262144 iters, t-(init.)=1.59994 s t(norm)=0.0158939, mflops=314.585 (err=5.9e-16) 29. Ransom: elapsed time t=1.73326 s, 65536 iters, t-(init.)=1.7166 s t(norm)=0.0682115, mflops=73.3014 (err=8.6e-16) 30. SCIPORT: elapsed time t=1.99992 s, 262144 iters, t-(init.)=1.91659 s t(norm)=0.0190396, mflops=262.61 (err=5.9e-16) 31. Singleton: elapsed time t=1.59994 s, 131072 iters, t-(init.)=1.54994 s t(norm)=0.0307945, mflops=162.367 (err=9.2e-16) 32. Singleton (f2c): elapsed time t=1.59994 s, 131072 iters, t-(init.)=1.5666 s t(norm)=0.0311256, mflops=160.639 (err=9.2e-16) 33. Sorensen: elapsed time t=1.23328 s, 131072 iters, t-(init.)=1.18329 s t(norm)=0.0235098, mflops=212.677 (err=5.4e-16) 34. Sorensen DIT: elapsed time t=2.01659 s, 65536 iters, t-(init.)=1.99992 s t(norm)=0.0794697, mflops=62.9171 (err=5.5e-16) 35. 
Temperton: elapsed time t=1.64993 s, 131072 iters, t-(init.)=1.6166 s t(norm)=0.032119, mflops=155.671 (err=3.8e-08) 36. Temperton (f2c): elapsed time t=1.03329 s, 65536 iters, t-(init.)=1.01663 s t(norm)=0.0403971, mflops=123.771 (err=5.5e-16) 37. Valkenburg: elapsed time t=1.43328 s, 16384 iters, t-(init.)=1.41661 s t(norm)=0.225164, mflops=22.206 (err=8.0e-16) 38. DXML: elapsed time t=1.6666 s, 262144 iters, t-(init.)=1.59994 s t(norm)=0.0158939, mflops=314.585 (err=2.0e-15) Top mflops for N=64 = 629.171 Normalized results and averages for N=64: fft 0: mflops = 127.967 (norm. = 0.20339), norm. avg. (of 6) = 0.32947 fft 1: mflops = 111.03 (norm. = 0.176471), norm. avg. (of 6) = 0.278785 fft 2: mflops = 82.0658 (norm. = 0.130435), norm. avg. (of 6) = 0.174467 fft 3: mflops = 41.0329 (norm. = 0.0652174), norm. avg. (of 6) = 0.039216 fft 4: mflops = 149.506 (norm. = 0.237624), norm. avg. (of 6) = 0.166815 fft 5: mflops = 23.5939 (norm. = 0.0375), norm. avg. (of 6) = 0.0454556 fft 6: mflops = 160.639 (norm. = 0.255319), norm. avg. (of 6) = 0.165565 fft 7: mflops = 98.0526 (norm. = 0.155844), norm. avg. (of 6) = 0.105945 fft 8: mflops = 61.8856 (norm. = 0.0983607), norm. avg. (of 6) = 0.144074 fft 9: mflops = 239.684 (norm. = 0.380952), norm. avg. (of 6) = 0.199781 fft 10: mflops = 272.074 (norm. = 0.432432), norm. avg. (of 6) = 0.191614 fft 11: mflops = 52.4309 (norm. = 0.0833333), norm. avg. (of 5) = 0.0922807 fft 12: mflops = 287.621 (norm. = 0.457143), norm. avg. (of 6) = 0.304503 fft 13: mflops = 169.664 (norm. = 0.269663), norm. avg. (of 6) = 0.21571 fft 14: mflops = 629.171 (norm. = 1), norm. avg. (of 6) = 0.774488 fft 15: mflops = 450.749 (norm. = 0.716418), norm. avg. (of 6) = 0.821122 fft 16: mflops = 382.281 (norm. = 0.607595), norm. avg. (of 6) = 0.897146 fft 17: mflops = 392.21 (norm. = 0.623377), norm. avg. (of 4) = 0.442601 fft 18: mflops = 196.105 (norm. = 0.311688), norm. avg. (of 6) = 0.249289 fft 19: mflops = 80.3197 (norm. = 0.12766), norm. avg. 
(of 6) = 0.0937889 fft 20: mflops = 98.0526 (norm. = 0.155844), norm. avg. (of 6) = 0.105077 fft 21: mflops = 247.543 (norm. = 0.393443), norm. avg. (of 6) = 0.394114 fft 22: mflops = 109.421 (norm. = 0.173913), norm. avg. (of 5) = 0.192586 fft 23: mflops = 142.454 (norm. = 0.226415), norm. avg. (of 5) = 0.223656 fft 24: mflops = 138.533 (norm. = 0.220183), norm. avg. (of 5) = 0.210452 fft 25: mflops = 102.028 (norm. = 0.162162), norm. avg. (of 5) = 0.0860467 fft 26: mflops = 40.5917 (norm. = 0.0645161), norm. avg. (of 6) = 0.0520907 fft 27: mflops = 328.263 (norm. = 0.521739), norm. avg. (of 6) = 0.473944 fft 28: mflops = 314.585 (norm. = 0.5), norm. avg. (of 6) = 0.439731 fft 29: mflops = 73.3014 (norm. = 0.116505), norm. avg. (of 5) = 0.0594645 fft 30: mflops = 262.61 (norm. = 0.417391), norm. avg. (of 5) = 0.298949 fft 31: mflops = 162.367 (norm. = 0.258065), norm. avg. (of 6) = 0.133993 fft 32: mflops = 160.639 (norm. = 0.255319), norm. avg. (of 6) = 0.135447 fft 33: mflops = 212.677 (norm. = 0.338028), norm. avg. (of 6) = 0.282958 fft 34: mflops = 62.9171 (norm. = 0.1), norm. avg. (of 6) = 0.138469 fft 35: mflops = 155.671 (norm. = 0.247423), norm. avg. (of 6) = 0.142235 fft 36: mflops = 123.771 (norm. = 0.196721), norm. avg. (of 6) = 0.109517 fft 37: mflops = 22.206 (norm. = 0.0352941), norm. avg. (of 6) = 0.0578262 fft 38: mflops = 314.585 (norm. = 0.5), norm. avg. (of 6) = 0.261204 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.06662 s, 32768 iters, t-(init.)=1.04996 s t(norm)=0.0357614, mflops=139.816 (err=3.5e-16) 1. Arndt DIT: elapsed time t=1.23328 s, 32768 iters, t-(init.)=1.21662 s t(norm)=0.0414378, mflops=120.663 (err=3.3e-16) 2. Arndt Split-Radix: elapsed time t=1.7166 s, 32768 iters, t-(init.)=1.69993 s t(norm)=0.0578993, mflops=86.3568 (err=3.6e-16) 3. Arndt 4-step: elapsed time t=1.94992 s, 16384 iters, t-(init.)=1.93326 s t(norm)=0.131693, mflops=37.9672 (err=3.3e-16) 4. 
Bailey: elapsed time t=1.59994 s, 65536 iters, t-(init.)=1.5666 s t(norm)=0.0266791, mflops=187.413 (err=3.3e-16) 5. Beauregard: elapsed time t=1.5666 s, 8192 iters, t-(init.)=1.54994 s t(norm)=0.211162, mflops=23.6785 (err=3.8e-16) 6. Bergland: elapsed time t=1.74993 s, 65536 iters, t-(init.)=1.7166 s t(norm)=0.0292335, mflops=171.037 (err=3.5e-16) 7. Brenner: elapsed time t=1.41661 s, 32768 iters, t-(init.)=1.38328 s t(norm)=0.0471142, mflops=106.125 (err=4.1e-16) 8. Burrus: elapsed time t=1.13329 s, 16384 iters, t-(init.)=1.13329 s t(norm)=0.0771991, mflops=64.7676 (err=3.2e-16) 9. CWP (min N) (N=130): elapsed time t=1.06662 s, 65536 iters, t-(init.)=1.03329 s t(norm)=0.0175969, mflops=284.142 10. CWP (best N) (N=140): elapsed time t=1.54994 s, 131072 iters, t-(init.)=1.46661 s t(norm)=0.0124881, mflops=400.381 11. Edelblute: elapsed time t=1.29995 s, 16384 iters, t-(init.)=1.29995 s t(norm)=0.0885519, mflops=56.464 (err=3.2e-16) 12. FFTPACK: elapsed time t=1.89992 s, 131072 iters, t-(init.)=1.81659 s t(norm)=0.0154682, mflops=323.244 (err=3.6e-16) 13. FFTPACK (f2c): elapsed time t=1.6166 s, 65536 iters, t-(init.)=1.58327 s t(norm)=0.0269629, mflops=185.44 (err=3.6e-16) FFTW_MEASURE plan: (cost = 8.646301e-06) FFTW_TWIDDLE 4 FFTW_NOTW 32 14. FFTW: elapsed time t=1.08329 s, 131072 iters, t-(init.)=1.01663 s t(norm)=0.00865652, mflops=577.599 (err=3.4e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.09996 s, 131072 iters, t-(init.)=1.01663 s t(norm)=0.00865652, mflops=577.599 (err=3.4e-16) 16. Frigo-old: elapsed time t=1.31661 s, 131072 iters, t-(init.)=1.24995 s t(norm)=0.0106433, mflops=469.781 (err=3.4e-16) 17. Green: elapsed time t=1.59994 s, 131072 iters, t-(init.)=1.51661 s t(norm)=0.0129138, mflops=387.182 (err=4.2e-16) 18. GSL: elapsed time t=1.43328 s, 65536 iters, t-(init.)=1.39994 s t(norm)=0.0238409, mflops=209.724 (err=3.4e-16) 19. 
GSL DIT: elapsed time t=1.68327 s, 32768 iters, t-(init.)=1.6666 s t(norm)=0.0567641, mflops=88.0839 (err=3.5e-16) 20. GSL DIF: elapsed time t=1.38328 s, 32768 iters, t-(init.)=1.36661 s t(norm)=0.0465465, mflops=107.419 (err=3.7e-16) 21. Krukar: elapsed time t=1.33328 s, 65536 iters, t-(init.)=1.29995 s t(norm)=0.022138, mflops=225.856 (err=3.6e-16) 22. Mayer (Buneman): elapsed time t=1.19995 s, 32768 iters, t-(init.)=1.18329 s t(norm)=0.0403025, mflops=124.062 (err=3.2e-16) 23. Mayer (simple): elapsed time t=1.86659 s, 65536 iters, t-(init.)=1.83326 s t(norm)=0.0312202, mflops=160.153 24. Mayer (lookup): elapsed time t=1.91659 s, 65536 iters, t-(init.)=1.88326 s t(norm)=0.0320717, mflops=155.901 (err=3.3e-16) 25. Monro: elapsed time t=1.21662 s, 32768 iters, t-(init.)=1.19995 s t(norm)=0.0408701, mflops=122.339 (err=5.2e-08) 26. NAPACK (f2c): elapsed time t=1.79993 s, 16384 iters, t-(init.)=1.79993 s t(norm)=0.12261, mflops=40.7796 (err=1.2e-15) 27. Ooura (C): elapsed time t=1.88326 s, 131072 iters, t-(init.)=1.79993 s t(norm)=0.0153263, mflops=326.237 (err=3.3e-16) 28. Ooura (F): elapsed time t=1.93326 s, 131072 iters, t-(init.)=1.86659 s t(norm)=0.0158939, mflops=314.585 (err=3.3e-16) 29. Ransom: elapsed time t=1.01663 s, 16384 iters, t-(init.)=1.01663 s t(norm)=0.0692522, mflops=72.1999 (err=1.0e-15) 30. SCIPORT: elapsed time t=1.06662 s, 65536 iters, t-(init.)=1.03329 s t(norm)=0.0175969, mflops=284.142 (err=3.8e-16) 31. Singleton: elapsed time t=2.01659 s, 65536 iters, t-(init.)=1.96659 s t(norm)=0.0334908, mflops=149.295 (err=4.2e-16) 32. Singleton (f2c): elapsed time t=1.01663 s, 32768 iters, t-(init.)=0.99996 s t(norm)=0.0340584, mflops=146.807 (err=4.2e-16) 33. Sorensen: elapsed time t=1.19995 s, 65536 iters, t-(init.)=1.14995 s t(norm)=0.0195836, mflops=255.316 (err=3.3e-16) 34. Sorensen DIT: elapsed time t=1.09996 s, 16384 iters, t-(init.)=1.08329 s t(norm)=0.0737933, mflops=67.7569 (err=3.2e-16) 35. 
Temperton: elapsed time t=1.84993 s, 65536 iters, t-(init.)=1.81659 s t(norm)=0.0309364, mflops=161.622 (err=4.7e-08) 36. Temperton (f2c): elapsed time t=1.21662 s, 32768 iters, t-(init.)=1.19995 s t(norm)=0.0408701, mflops=122.339 (err=3.6e-16) 37. Valkenburg: elapsed time t=1.64993 s, 8192 iters, t-(init.)=1.64993 s t(norm)=0.224786, mflops=22.2434 (err=5.8e-16) 38. DXML: elapsed time t=1.59994 s, 131072 iters, t-(init.)=1.53327 s t(norm)=0.0130557, mflops=382.974 (err=1.0e-15) Top mflops for N=128 = 577.599 Normalized results and averages for N=128: fft 0: mflops = 139.816 (norm. = 0.242063), norm. avg. (of 7) = 0.316984 fft 1: mflops = 120.663 (norm. = 0.208904), norm. avg. (of 7) = 0.268802 fft 2: mflops = 86.3568 (norm. = 0.14951), norm. avg. (of 7) = 0.170902 fft 3: mflops = 37.9672 (norm. = 0.0657328), norm. avg. (of 7) = 0.0430041 fft 4: mflops = 187.413 (norm. = 0.324468), norm. avg. (of 7) = 0.189337 fft 5: mflops = 23.6785 (norm. = 0.0409946), norm. avg. (of 7) = 0.0448184 fft 6: mflops = 171.037 (norm. = 0.296117), norm. avg. (of 7) = 0.184215 fft 7: mflops = 106.125 (norm. = 0.183735), norm. avg. (of 7) = 0.117057 fft 8: mflops = 64.7676 (norm. = 0.112132), norm. avg. (of 7) = 0.139511 fft 9: mflops = 284.142 (norm. = 0.491935), norm. avg. (of 7) = 0.241518 fft 10: mflops = 400.381 (norm. = 0.693182), norm. avg. (of 7) = 0.263267 fft 11: mflops = 56.464 (norm. = 0.0977564), norm. avg. (of 6) = 0.0931933 fft 12: mflops = 323.244 (norm. = 0.559633), norm. avg. (of 7) = 0.34095 fft 13: mflops = 185.44 (norm. = 0.321053), norm. avg. (of 7) = 0.230759 fft 14: mflops = 577.599 (norm. = 1), norm. avg. (of 7) = 0.806704 fft 15: mflops = 577.599 (norm. = 1), norm. avg. (of 7) = 0.846676 fft 16: mflops = 469.781 (norm. = 0.813333), norm. avg. (of 7) = 0.885173 fft 17: mflops = 387.182 (norm. = 0.67033), norm. avg. (of 5) = 0.488147 fft 18: mflops = 209.724 (norm. = 0.363095), norm. avg. (of 7) = 0.265547 fft 19: mflops = 88.0839 (norm. = 0.1525), norm. avg. 
(of 7) = 0.102176 fft 20: mflops = 107.419 (norm. = 0.185976), norm. avg. (of 7) = 0.116634 fft 21: mflops = 225.856 (norm. = 0.391026), norm. avg. (of 7) = 0.393673 fft 22: mflops = 124.062 (norm. = 0.214789), norm. avg. (of 6) = 0.196286 fft 23: mflops = 160.153 (norm. = 0.277273), norm. avg. (of 6) = 0.232592 fft 24: mflops = 155.901 (norm. = 0.269912), norm. avg. (of 6) = 0.220362 fft 25: mflops = 122.339 (norm. = 0.211806), norm. avg. (of 6) = 0.107007 fft 26: mflops = 40.7796 (norm. = 0.0706019), norm. avg. (of 7) = 0.0547352 fft 27: mflops = 326.237 (norm. = 0.564815), norm. avg. (of 7) = 0.486925 fft 28: mflops = 314.585 (norm. = 0.544643), norm. avg. (of 7) = 0.454718 fft 29: mflops = 72.1999 (norm. = 0.125), norm. avg. (of 6) = 0.0703871 fft 30: mflops = 284.142 (norm. = 0.491935), norm. avg. (of 6) = 0.331113 fft 31: mflops = 149.295 (norm. = 0.258475), norm. avg. (of 7) = 0.151776 fft 32: mflops = 146.807 (norm. = 0.254167), norm. avg. (of 7) = 0.152407 fft 33: mflops = 255.316 (norm. = 0.442029), norm. avg. (of 7) = 0.305682 fft 34: mflops = 67.7569 (norm. = 0.117308), norm. avg. (of 7) = 0.135446 fft 35: mflops = 161.622 (norm. = 0.279817), norm. avg. (of 7) = 0.16189 fft 36: mflops = 122.339 (norm. = 0.211806), norm. avg. (of 7) = 0.12413 fft 37: mflops = 22.2434 (norm. = 0.0385101), norm. avg. (of 7) = 0.0550668 fft 38: mflops = 382.974 (norm. = 0.663043), norm. avg. (of 7) = 0.31861 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.18329 s, 16384 iters, t-(init.)=1.16662 s t(norm)=0.034768, mflops=143.81 (err=9.7e-16) 1. Arndt DIT: elapsed time t=1.39994 s, 16384 iters, t-(init.)=1.38328 s t(norm)=0.0412249, mflops=121.286 (err=9.9e-16) 2. Arndt Split-Radix: elapsed time t=1.84993 s, 16384 iters, t-(init.)=1.83326 s t(norm)=0.0546354, mflops=91.5157 (err=9.8e-16) 3. Arndt 4-step: elapsed time t=1.96659 s, 8192 iters, t-(init.)=1.94992 s t(norm)=0.116224, mflops=43.0202 (err=1.0e-15) 4. 
Bailey: elapsed time t=1.98325 s, 32768 iters, t-(init.)=1.94992 s t(norm)=0.0290561, mflops=172.081 (err=9.8e-16) 5. Beauregard: elapsed time t=1.79993 s, 4096 iters, t-(init.)=1.79993 s t(norm)=0.214568, mflops=23.3026 (err=1.1e-15) 6. Bergland: elapsed time t=1.68327 s, 32768 iters, t-(init.)=1.64993 s t(norm)=0.0245859, mflops=203.368 (err=1.0e-15) 7. Brenner: elapsed time t=1.41661 s, 16384 iters, t-(init.)=1.39994 s t(norm)=0.0417216, mflops=119.842 (err=1.1e-15) 8. Burrus: elapsed time t=1.19995 s, 8192 iters, t-(init.)=1.19995 s t(norm)=0.0715227, mflops=69.9079 (err=9.9e-16) 9. CWP (min N) (N=260): elapsed time t=1.89992 s, 65536 iters, t-(init.)=1.83326 s t(norm)=0.0136589, mflops=366.063 10. CWP (best N) (N=280): elapsed time t=1.39994 s, 65536 iters, t-(init.)=1.31661 s t(norm)=0.00980954, mflops=509.708 11. Edelblute: elapsed time t=1.36661 s, 8192 iters, t-(init.)=1.34995 s t(norm)=0.0804631, mflops=62.1403 (err=9.9e-16) 12. FFTPACK: elapsed time t=1.81659 s, 65536 iters, t-(init.)=1.73326 s t(norm)=0.0129138, mflops=387.182 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.58327 s, 32768 iters, t-(init.)=1.54994 s t(norm)=0.0230959, mflops=216.489 (err=1.0e-15) FFTW_MEASURE plan: (cost = 2.441309e-05) FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.16662 s, 65536 iters, t-(init.)=1.09996 s t(norm)=0.00819531, mflops=610.105 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.16662 s, 65536 iters, t-(init.)=1.08329 s t(norm)=0.00807114, mflops=619.491 (err=1.1e-15) 16. Frigo-old: elapsed time t=1.5666 s, 65536 iters, t-(init.)=1.49994 s t(norm)=0.0111754, mflops=447.41 (err=1.1e-15) 17. Green: elapsed time t=1.63327 s, 65536 iters, t-(init.)=1.5666 s t(norm)=0.0116721, mflops=428.372 (err=1.1e-15) 18. GSL: elapsed time t=1.38328 s, 32768 iters, t-(init.)=1.34995 s t(norm)=0.0201158, mflops=248.561 (err=1.0e-15) 19. 
GSL DIT: elapsed time t=1.78326 s, 16384 iters, t-(init.)=1.7666 s t(norm)=0.0526487, mflops=94.9692 (err=1.0e-15) 20. GSL DIF: elapsed time t=1.46661 s, 16384 iters, t-(init.)=1.44994 s t(norm)=0.0432116, mflops=115.71 (err=1.1e-15) 21. Krukar: elapsed time t=1.7666 s, 32768 iters, t-(init.)=1.73326 s t(norm)=0.0258276, mflops=193.591 (err=1.1e-15) 22. Mayer (Buneman): elapsed time t=1.34995 s, 16384 iters, t-(init.)=1.33328 s t(norm)=0.0397348, mflops=125.834 (err=9.7e-16) 23. Mayer (simple): elapsed time t=1.03329 s, 16384 iters, t-(init.)=1.01663 s t(norm)=0.0302978, mflops=165.028 24. Mayer (lookup): elapsed time t=1.06662 s, 16384 iters, t-(init.)=1.04996 s t(norm)=0.0312912, mflops=159.789 (err=9.4e-16) 25. Monro: elapsed time t=1.21662 s, 16384 iters, t-(init.)=1.19995 s t(norm)=0.0357614, mflops=139.816 (err=8.5e-08) 26. NAPACK (f2c): elapsed time t=1.89992 s, 8192 iters, t-(init.)=1.88326 s t(norm)=0.112251, mflops=44.5431 (err=3.8e-15) 27. Ooura (C): elapsed time t=1.88326 s, 65536 iters, t-(init.)=1.81659 s t(norm)=0.0135347, mflops=369.421 (err=9.9e-16) 28. Ooura (F): elapsed time t=1.93326 s, 65536 iters, t-(init.)=1.86659 s t(norm)=0.0139072, mflops=359.526 (err=9.9e-16) 29. Ransom: elapsed time t=1.7166 s, 16384 iters, t-(init.)=1.69993 s t(norm)=0.0506619, mflops=98.6935 (err=1.9e-15) 30. SCIPORT: elapsed time t=1.19995 s, 32768 iters, t-(init.)=1.16662 s t(norm)=0.017384, mflops=287.621 (err=1.1e-15) 31. Singleton: elapsed time t=1.68327 s, 32768 iters, t-(init.)=1.64993 s t(norm)=0.0245859, mflops=203.368 (err=1.7e-15) 32. Singleton (f2c): elapsed time t=1.68327 s, 32768 iters, t-(init.)=1.64993 s t(norm)=0.0245859, mflops=203.368 (err=1.7e-15) 33. Sorensen: elapsed time t=1.26662 s, 32768 iters, t-(init.)=1.23328 s t(norm)=0.0183774, mflops=272.074 (err=9.9e-16) 34. Sorensen DIT: elapsed time t=1.18329 s, 8192 iters, t-(init.)=1.16662 s t(norm)=0.069536, mflops=71.9052 (err=9.9e-16) 35. 
Temperton: elapsed time t=1.88326 s, 32768 iters, t-(init.)=1.84993 s t(norm)=0.027566, mflops=181.383 (err=9.5e-08) 36. Temperton (f2c): elapsed time t=1.18329 s, 16384 iters, t-(init.)=1.16662 s t(norm)=0.034768, mflops=143.81 (err=1.0e-15) 37. Valkenburg: elapsed time t=1.86659 s, 4096 iters, t-(init.)=1.86659 s t(norm)=0.222515, mflops=22.4704 (err=1.2e-15) 38. DXML: elapsed time t=1.48327 s, 65536 iters, t-(init.)=1.39994 s t(norm)=0.0104304, mflops=479.368 (err=2.4e-15) Top mflops for N=256 = 619.491 Normalized results and averages for N=256: fft 0: mflops = 143.81 (norm. = 0.232143), norm. avg. (of 8) = 0.306378 fft 1: mflops = 121.286 (norm. = 0.195783), norm. avg. (of 8) = 0.259674 fft 2: mflops = 91.5157 (norm. = 0.147727), norm. avg. (of 8) = 0.168005 fft 3: mflops = 43.0202 (norm. = 0.0694444), norm. avg. (of 8) = 0.0463091 fft 4: mflops = 172.081 (norm. = 0.277778), norm. avg. (of 8) = 0.200392 fft 5: mflops = 23.3026 (norm. = 0.0376157), norm. avg. (of 8) = 0.043918 fft 6: mflops = 203.368 (norm. = 0.328283), norm. avg. (of 8) = 0.202224 fft 7: mflops = 119.842 (norm. = 0.193452), norm. avg. (of 8) = 0.126607 fft 8: mflops = 69.9079 (norm. = 0.112847), norm. avg. (of 8) = 0.136178 fft 9: mflops = 366.063 (norm. = 0.590909), norm. avg. (of 8) = 0.285192 fft 10: mflops = 509.708 (norm. = 0.822785), norm. avg. (of 8) = 0.333206 fft 11: mflops = 62.1403 (norm. = 0.100309), norm. avg. (of 7) = 0.0942098 fft 12: mflops = 387.182 (norm. = 0.625), norm. avg. (of 8) = 0.376456 fft 13: mflops = 216.489 (norm. = 0.349462), norm. avg. (of 8) = 0.245597 fft 14: mflops = 610.105 (norm. = 0.984848), norm. avg. (of 8) = 0.828972 fft 15: mflops = 619.491 (norm. = 1), norm. avg. (of 8) = 0.865842 fft 16: mflops = 447.41 (norm. = 0.722222), norm. avg. (of 8) = 0.864804 fft 17: mflops = 428.372 (norm. = 0.691489), norm. avg. (of 6) = 0.522037 fft 18: mflops = 248.561 (norm. = 0.401235), norm. avg. (of 8) = 0.282508 fft 19: mflops = 94.9692 (norm. = 0.153302), norm. avg. 
(of 8) = 0.108567 fft 20: mflops = 115.71 (norm. = 0.186782), norm. avg. (of 8) = 0.125402 fft 21: mflops = 193.591 (norm. = 0.3125), norm. avg. (of 8) = 0.383527 fft 22: mflops = 125.834 (norm. = 0.203125), norm. avg. (of 7) = 0.197263 fft 23: mflops = 165.028 (norm. = 0.266393), norm. avg. (of 7) = 0.237421 fft 24: mflops = 159.789 (norm. = 0.257937), norm. avg. (of 7) = 0.22573 fft 25: mflops = 139.816 (norm. = 0.225694), norm. avg. (of 7) = 0.123962 fft 26: mflops = 44.5431 (norm. = 0.0719027), norm. avg. (of 8) = 0.0568811 fft 27: mflops = 369.421 (norm. = 0.59633), norm. avg. (of 8) = 0.500601 fft 28: mflops = 359.526 (norm. = 0.580357), norm. avg. (of 8) = 0.470423 fft 29: mflops = 98.6935 (norm. = 0.159314), norm. avg. (of 7) = 0.0830909 fft 30: mflops = 287.621 (norm. = 0.464286), norm. avg. (of 7) = 0.350138 fft 31: mflops = 203.368 (norm. = 0.328283), norm. avg. (of 8) = 0.173839 fft 32: mflops = 203.368 (norm. = 0.328283), norm. avg. (of 8) = 0.174392 fft 33: mflops = 272.074 (norm. = 0.439189), norm. avg. (of 8) = 0.322371 fft 34: mflops = 71.9052 (norm. = 0.116071), norm. avg. (of 8) = 0.133024 fft 35: mflops = 181.383 (norm. = 0.292793), norm. avg. (of 8) = 0.178253 fft 36: mflops = 143.81 (norm. = 0.232143), norm. avg. (of 8) = 0.137631 fft 37: mflops = 22.4704 (norm. = 0.0362723), norm. avg. (of 8) = 0.0527175 fft 38: mflops = 479.368 (norm. = 0.77381), norm. avg. (of 8) = 0.37551 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.23328 s, 8192 iters, t-(init.)=1.21662 s t(norm)=0.0322294, mflops=155.138 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.39994 s, 8192 iters, t-(init.)=1.38328 s t(norm)=0.0366444, mflops=136.447 (err=1.1e-15) 2. Arndt Split-Radix: elapsed time t=1.94992 s, 8192 iters, t-(init.)=1.93326 s t(norm)=0.0512138, mflops=97.6299 (err=1.1e-15) 3. Arndt 4-step: elapsed time t=1.08329 s, 2048 iters, t-(init.)=1.08329 s t(norm)=0.11479, mflops=43.558 (err=9.9e-16) 4. 
Bailey: elapsed time t=2.01659 s, 16384 iters, t-(init.)=1.98325 s t(norm)=0.0262691, mflops=190.337 (err=1.1e-15) 5. Beauregard: elapsed time t=1.01663 s, 1024 iters, t-(init.)=0.99996 s t(norm)=0.211919, mflops=23.5939 (err=1.0e-15) 6. Bergland: elapsed time t=1.7666 s, 16384 iters, t-(init.)=1.73326 s t(norm)=0.0229579, mflops=217.79 (err=1.0e-15) 7. Brenner: elapsed time t=1.86659 s, 8192 iters, t-(init.)=1.84993 s t(norm)=0.0490063, mflops=102.028 (err=1.0e-15) 8. Burrus: elapsed time t=1.31661 s, 4096 iters, t-(init.)=1.31661 s t(norm)=0.0697567, mflops=71.6777 (err=1.1e-15) 9. CWP (min N) (N=520): elapsed time t=1.86659 s, 32768 iters, t-(init.)=1.79993 s t(norm)=0.0119205, mflops=419.447 10. CWP (best N) (N=560): elapsed time t=1.96659 s, 32768 iters, t-(init.)=1.89992 s t(norm)=0.0125827, mflops=397.371 11. Edelblute: elapsed time t=1.46661 s, 4096 iters, t-(init.)=1.46661 s t(norm)=0.0777037, mflops=64.347 (err=1.1e-15) 12. FFTPACK: elapsed time t=1.31661 s, 16384 iters, t-(init.)=1.28328 s t(norm)=0.0169977, mflops=294.158 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.18329 s, 8192 iters, t-(init.)=1.16662 s t(norm)=0.0309049, mflops=161.787 (err=1.0e-15) FFTW_MEASURE plan: (cost = 4.272290e-05) FFTW_TWIDDLE 8 FFTW_NOTW 64 14. FFTW: elapsed time t=1.6666 s, 32768 iters, t-(init.)=1.59994 s t(norm)=0.010596, mflops=471.878 (err=9.9e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.73326 s, 32768 iters, t-(init.)=1.6666 s t(norm)=0.0110375, mflops=453.003 (err=9.6e-16) 16. Frigo-old: elapsed time t=1.04996 s, 16384 iters, t-(init.)=1.01663 s t(norm)=0.0134657, mflops=371.314 (err=9.4e-16) 17. Green: elapsed time t=1.73326 s, 32768 iters, t-(init.)=1.6666 s t(norm)=0.0110375, mflops=453.003 (err=9.6e-16) 18. GSL: elapsed time t=1.91659 s, 16384 iters, t-(init.)=1.88326 s t(norm)=0.0249446, mflops=200.444 (err=1.0e-15) 19. 
GSL DIT: elapsed time t=1.89992 s, 8192 iters, t-(init.)=1.88326 s t(norm)=0.0498893, mflops=100.222 (err=1.2e-15) 20. GSL DIF: elapsed time t=1.51661 s, 8192 iters, t-(init.)=1.49994 s t(norm)=0.0397348, mflops=125.834 (err=1.1e-15) 21. Krukar: elapsed time t=1.06662 s, 8192 iters, t-(init.)=1.04996 s t(norm)=0.0278144, mflops=179.763 (err=1.0e-15) 22. Mayer (Buneman): elapsed time t=1.39994 s, 8192 iters, t-(init.)=1.36661 s t(norm)=0.0362029, mflops=138.111 (err=1.0e-15) 23. Mayer (simple): elapsed time t=1.08329 s, 8192 iters, t-(init.)=1.06662 s t(norm)=0.0282559, mflops=176.954 24. Mayer (lookup): elapsed time t=1.11662 s, 8192 iters, t-(init.)=1.09996 s t(norm)=0.0291389, mflops=171.592 (err=1.0e-15) 25. Monro: elapsed time t=1.26662 s, 8192 iters, t-(init.)=1.24995 s t(norm)=0.0331124, mflops=151.001 (err=8.4e-08) 26. NAPACK (f2c): elapsed time t=1.19995 s, 2048 iters, t-(init.)=1.19995 s t(norm)=0.127151, mflops=39.3232 (err=7.1e-15) 27. Ooura (C): elapsed time t=1.09996 s, 16384 iters, t-(init.)=1.06662 s t(norm)=0.0141279, mflops=353.909 (err=9.7e-16) 28. Ooura (F): elapsed time t=1.13329 s, 16384 iters, t-(init.)=1.09996 s t(norm)=0.0145694, mflops=343.184 (err=9.7e-16) 29. Ransom: elapsed time t=1.96659 s, 8192 iters, t-(init.)=1.94992 s t(norm)=0.0516553, mflops=96.7955 (err=1.4e-15) 30. SCIPORT: elapsed time t=1.48327 s, 16384 iters, t-(init.)=1.44994 s t(norm)=0.0192052, mflops=260.347 (err=1.0e-15) 31. Singleton: elapsed time t=1.88326 s, 16384 iters, t-(init.)=1.84993 s t(norm)=0.0245032, mflops=204.055 (err=1.2e-15) 32. Singleton (f2c): elapsed time t=1.89992 s, 16384 iters, t-(init.)=1.86659 s t(norm)=0.0247239, mflops=202.233 (err=1.2e-15) 33. Sorensen: elapsed time t=1.34995 s, 16384 iters, t-(init.)=1.29995 s t(norm)=0.0172184, mflops=290.387 (err=1.0e-15) 34. Sorensen DIT: elapsed time t=1.29995 s, 4096 iters, t-(init.)=1.28328 s t(norm)=0.0679907, mflops=73.5394 (err=1.1e-15) 35. 
Temperton: elapsed time t=1.29995 s, 8192 iters, t-(init.)=1.28328 s t(norm)=0.0339954, mflops=147.079 (err=1.0e-07) 36. Temperton (f2c): elapsed time t=1.5666 s, 8192 iters, t-(init.)=1.54994 s t(norm)=0.0410593, mflops=121.775 (err=1.0e-15) 37. Valkenburg: elapsed time t=1.06662 s, 1024 iters, t-(init.)=1.06662 s t(norm)=0.226047, mflops=22.1193 (err=1.3e-15) 38. DXML: elapsed time t=1.5666 s, 32768 iters, t-(init.)=1.49994 s t(norm)=0.00993371, mflops=503.337 (err=2.9e-15) Top mflops for N=512 = 503.337 Normalized results and averages for N=512: fft 0: mflops = 155.138 (norm. = 0.308219), norm. avg. (of 9) = 0.306583 fft 1: mflops = 136.447 (norm. = 0.271084), norm. avg. (of 9) = 0.260942 fft 2: mflops = 97.6299 (norm. = 0.193966), norm. avg. (of 9) = 0.170889 fft 3: mflops = 43.558 (norm. = 0.0865385), norm. avg. (of 9) = 0.050779 fft 4: mflops = 190.337 (norm. = 0.378151), norm. avg. (of 9) = 0.220143 fft 5: mflops = 23.5939 (norm. = 0.046875), norm. avg. (of 9) = 0.0442466 fft 6: mflops = 217.79 (norm. = 0.432692), norm. avg. (of 9) = 0.227831 fft 7: mflops = 102.028 (norm. = 0.202703), norm. avg. (of 9) = 0.135062 fft 8: mflops = 71.6777 (norm. = 0.142405), norm. avg. (of 9) = 0.13687 fft 9: mflops = 419.447 (norm. = 0.833333), norm. avg. (of 9) = 0.346096 fft 10: mflops = 397.371 (norm. = 0.789474), norm. avg. (of 9) = 0.383903 fft 11: mflops = 64.347 (norm. = 0.127841), norm. avg. (of 8) = 0.0984137 fft 12: mflops = 294.158 (norm. = 0.584416), norm. avg. (of 9) = 0.399563 fft 13: mflops = 161.787 (norm. = 0.321429), norm. avg. (of 9) = 0.254023 fft 14: mflops = 471.878 (norm. = 0.9375), norm. avg. (of 9) = 0.841031 fft 15: mflops = 453.003 (norm. = 0.9), norm. avg. (of 9) = 0.869637 fft 16: mflops = 371.314 (norm. = 0.737705), norm. avg. (of 9) = 0.850682 fft 17: mflops = 453.003 (norm. = 0.9), norm. avg. (of 7) = 0.576032 fft 18: mflops = 200.444 (norm. = 0.39823), norm. avg. (of 9) = 0.295366 fft 19: mflops = 100.222 (norm. = 0.199115), norm. avg. 
(of 9) = 0.118628 fft 20: mflops = 125.834 (norm. = 0.25), norm. avg. (of 9) = 0.139246 fft 21: mflops = 179.763 (norm. = 0.357143), norm. avg. (of 9) = 0.380595 fft 22: mflops = 138.111 (norm. = 0.27439), norm. avg. (of 8) = 0.206904 fft 23: mflops = 176.954 (norm. = 0.351562), norm. avg. (of 8) = 0.251688 fft 24: mflops = 171.592 (norm. = 0.340909), norm. avg. (of 8) = 0.240127 fft 25: mflops = 151.001 (norm. = 0.3), norm. avg. (of 8) = 0.145967 fft 26: mflops = 39.3232 (norm. = 0.078125), norm. avg. (of 9) = 0.0592415 fft 27: mflops = 353.909 (norm. = 0.703125), norm. avg. (of 9) = 0.523104 fft 28: mflops = 343.184 (norm. = 0.681818), norm. avg. (of 9) = 0.493911 fft 29: mflops = 96.7955 (norm. = 0.192308), norm. avg. (of 8) = 0.096743 fft 30: mflops = 260.347 (norm. = 0.517241), norm. avg. (of 8) = 0.371026 fft 31: mflops = 204.055 (norm. = 0.405405), norm. avg. (of 9) = 0.199569 fft 32: mflops = 202.233 (norm. = 0.401786), norm. avg. (of 9) = 0.199658 fft 33: mflops = 290.387 (norm. = 0.576923), norm. avg. (of 9) = 0.350654 fft 34: mflops = 73.5394 (norm. = 0.146104), norm. avg. (of 9) = 0.134477 fft 35: mflops = 147.079 (norm. = 0.292208), norm. avg. (of 9) = 0.190914 fft 36: mflops = 121.775 (norm. = 0.241935), norm. avg. (of 9) = 0.149221 fft 37: mflops = 22.1193 (norm. = 0.0439453), norm. avg. (of 9) = 0.0517428 fft 38: mflops = 503.337 (norm. = 1), norm. avg. (of 9) = 0.444898 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.48327 s, 4096 iters, t-(init.)=1.46661 s t(norm)=0.0349667, mflops=142.993 (err=1.8e-15) 1. Arndt DIT: elapsed time t=1.58327 s, 4096 iters, t-(init.)=1.5666 s t(norm)=0.0373507, mflops=133.866 (err=1.8e-15) 2. Arndt Split-Radix: elapsed time t=1.19995 s, 2048 iters, t-(init.)=1.19995 s t(norm)=0.0572182, mflops=87.3848 (err=1.8e-15) 3. Arndt 4-step: elapsed time t=1.08329 s, 1024 iters, t-(init.)=1.08329 s t(norm)=0.103311, mflops=48.3978 (err=1.8e-15) 4. 
Bailey: elapsed time t=1.39994 s, 4096 iters, t-(init.)=1.38328 s t(norm)=0.0329799, mflops=151.607 (err=1.9e-15) 5. Beauregard: elapsed time t=1.18329 s, 512 iters, t-(init.)=1.18329 s t(norm)=0.225694, mflops=22.1539 (err=2.0e-15) 6. Bergland: elapsed time t=1.09996 s, 4096 iters, t-(init.)=1.06662 s t(norm)=0.0254303, mflops=196.616 (err=2.2e-15) 7. Brenner: elapsed time t=1.99992 s, 4096 iters, t-(init.)=1.98325 s t(norm)=0.0472845, mflops=105.743 (err=2.0e-15) 8. Burrus: elapsed time t=1.49994 s, 2048 iters, t-(init.)=1.49994 s t(norm)=0.0715227, mflops=69.9079 (err=1.8e-15) 9. CWP (min N) (N=1040): elapsed time t=1.26662 s, 8192 iters, t-(init.)=1.23328 s t(norm)=0.0147019, mflops=340.092 10. CWP (best N) (N=1040): elapsed time t=1.26662 s, 8192 iters, t-(init.)=1.23328 s t(norm)=0.0147019, mflops=340.092 11. Edelblute: elapsed time t=1.64993 s, 2048 iters, t-(init.)=1.63327 s t(norm)=0.0778803, mflops=64.2011 (err=1.8e-15) 12. FFTPACK: elapsed time t=1.29995 s, 8192 iters, t-(init.)=1.26662 s t(norm)=0.0150992, mflops=331.143 (err=1.9e-15) 13. FFTPACK (f2c): elapsed time t=1.19995 s, 4096 iters, t-(init.)=1.18329 s t(norm)=0.0282117, mflops=177.231 (err=1.9e-15) FFTW_MEASURE plan: (cost = 9.358350e-05) FFTW_TWIDDLE 32 FFTW_NOTW 32 14. FFTW: elapsed time t=1.63327 s, 16384 iters, t-(init.)=1.5666 s t(norm)=0.00933769, mflops=535.464 (err=2.0e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.81659 s, 16384 iters, t-(init.)=1.74993 s t(norm)=0.0104304, mflops=479.368 (err=2.0e-15) 16. Frigo-old: elapsed time t=1.14995 s, 8192 iters, t-(init.)=1.11662 s t(norm)=0.0133112, mflops=375.624 (err=1.9e-15) 17. Green: elapsed time t=1.03329 s, 8192 iters, t-(init.)=0.99996 s t(norm)=0.0119205, mflops=419.447 (err=2.0e-15) 18. GSL: elapsed time t=1.01663 s, 4096 iters, t-(init.)=0.99996 s t(norm)=0.0238409, mflops=209.724 (err=1.9e-15) 19. 
GSL DIT: elapsed time t=1.19995 s, 2048 iters, t-(init.)=1.18329 s t(norm)=0.0564235, mflops=88.6156 (err=2.1e-15) 20. GSL DIF: elapsed time t=1.01663 s, 2048 iters, t-(init.)=1.01663 s t(norm)=0.0484765, mflops=103.143 (err=2.2e-15) 21. Krukar: elapsed time t=1.81659 s, 4096 iters, t-(init.)=1.79993 s t(norm)=0.0429136, mflops=116.513 (err=1.9e-15) 22. Mayer (Buneman): elapsed time t=1.64993 s, 4096 iters, t-(init.)=1.63327 s t(norm)=0.0389401, mflops=128.402 (err=1.8e-15) 23. Mayer (simple): elapsed time t=1.31661 s, 4096 iters, t-(init.)=1.29995 s t(norm)=0.0309932, mflops=161.326 24. Mayer (lookup): elapsed time t=1.34995 s, 4096 iters, t-(init.)=1.33328 s t(norm)=0.0317879, mflops=157.293 (err=1.8e-15) 25. Monro: elapsed time t=1.54994 s, 4096 iters, t-(init.)=1.53327 s t(norm)=0.0365561, mflops=136.776 (err=1.0e-07) 26. NAPACK (f2c): elapsed time t=1.39994 s, 1024 iters, t-(init.)=1.38328 s t(norm)=0.13192, mflops=37.9019 (err=1.7e-14) 27. Ooura (C): elapsed time t=1.24995 s, 8192 iters, t-(init.)=1.21662 s t(norm)=0.0145032, mflops=344.751 (err=2.2e-15) 28. Ooura (F): elapsed time t=1.39994 s, 8192 iters, t-(init.)=1.36661 s t(norm)=0.0162913, mflops=306.913 (err=2.2e-15) 29. Ransom: elapsed time t=1.83326 s, 4096 iters, t-(init.)=1.81659 s t(norm)=0.043311, mflops=115.444 (err=2.3e-15) 30. SCIPORT: elapsed time t=1.68327 s, 8192 iters, t-(init.)=1.63327 s t(norm)=0.0194701, mflops=256.804 (err=2.0e-15) 31. Singleton: elapsed time t=1.03329 s, 4096 iters, t-(init.)=1.01663 s t(norm)=0.0242383, mflops=206.285 (err=2.8e-15) 32. Singleton (f2c): elapsed time t=1.14995 s, 4096 iters, t-(init.)=1.13329 s t(norm)=0.0270197, mflops=185.05 (err=2.8e-15) 33. Sorensen: elapsed time t=1.7666 s, 8192 iters, t-(init.)=1.73326 s t(norm)=0.0206621, mflops=241.989 (err=1.8e-15) 34. Sorensen DIT: elapsed time t=1.6166 s, 2048 iters, t-(init.)=1.59994 s t(norm)=0.0762909, mflops=65.5386 (err=1.9e-15) 35. 
Temperton: elapsed time t=1.34995 s, 4096 iters, t-(init.)=1.33328 s t(norm)=0.0317879, mflops=157.293 (err=1.1e-07) 36. Temperton (f2c): elapsed time t=1.44994 s, 4096 iters, t-(init.)=1.43328 s t(norm)=0.034172, mflops=146.319 (err=1.9e-15) 37. Valkenburg: elapsed time t=1.29995 s, 512 iters, t-(init.)=1.29995 s t(norm)=0.247945, mflops=20.1657 (err=2.4e-15) 38. DXML: elapsed time t=1.99992 s, 16384 iters, t-(init.)=1.91659 s t(norm)=0.0114238, mflops=437.684 (err=2.8e-15) Top mflops for N=1024 = 535.464 Normalized results and averages for N=1024: fft 0: mflops = 142.993 (norm. = 0.267045), norm. avg. (of 10) = 0.302629 fft 1: mflops = 133.866 (norm. = 0.25), norm. avg. (of 10) = 0.259848 fft 2: mflops = 87.3848 (norm. = 0.163194), norm. avg. (of 10) = 0.17012 fft 3: mflops = 48.3978 (norm. = 0.0903846), norm. avg. (of 10) = 0.0547396 fft 4: mflops = 151.607 (norm. = 0.283133), norm. avg. (of 10) = 0.226442 fft 5: mflops = 22.1539 (norm. = 0.0413732), norm. avg. (of 10) = 0.0439592 fft 6: mflops = 196.616 (norm. = 0.367187), norm. avg. (of 10) = 0.241767 fft 7: mflops = 105.743 (norm. = 0.197479), norm. avg. (of 10) = 0.141304 fft 8: mflops = 69.9079 (norm. = 0.130556), norm. avg. (of 10) = 0.136238 fft 9: mflops = 340.092 (norm. = 0.635135), norm. avg. (of 10) = 0.375 fft 10: mflops = 340.092 (norm. = 0.635135), norm. avg. (of 10) = 0.409026 fft 11: mflops = 64.2011 (norm. = 0.119898), norm. avg. (of 9) = 0.100801 fft 12: mflops = 331.143 (norm. = 0.618421), norm. avg. (of 10) = 0.421449 fft 13: mflops = 177.231 (norm. = 0.330986), norm. avg. (of 10) = 0.261719 fft 14: mflops = 535.464 (norm. = 1), norm. avg. (of 10) = 0.856927 fft 15: mflops = 479.368 (norm. = 0.895238), norm. avg. (of 10) = 0.872197 fft 16: mflops = 375.624 (norm. = 0.701493), norm. avg. (of 10) = 0.835763 fft 17: mflops = 419.447 (norm. = 0.783333), norm. avg. (of 8) = 0.601945 fft 18: mflops = 209.724 (norm. = 0.391667), norm. avg. (of 10) = 0.304996 fft 19: mflops = 88.6156 (norm. 
= 0.165493), norm. avg. (of 10) = 0.123314 fft 20: mflops = 103.143 (norm. = 0.192623), norm. avg. (of 10) = 0.144584 fft 21: mflops = 116.513 (norm. = 0.217593), norm. avg. (of 10) = 0.364295 fft 22: mflops = 128.402 (norm. = 0.239796), norm. avg. (of 9) = 0.210559 fft 23: mflops = 161.326 (norm. = 0.301282), norm. avg. (of 9) = 0.257199 fft 24: mflops = 157.293 (norm. = 0.29375), norm. avg. (of 9) = 0.246085 fft 25: mflops = 136.776 (norm. = 0.255435), norm. avg. (of 9) = 0.15813 fft 26: mflops = 37.9019 (norm. = 0.0707831), norm. avg. (of 10) = 0.0603957 fft 27: mflops = 344.751 (norm. = 0.643836), norm. avg. (of 10) = 0.535177 fft 28: mflops = 306.913 (norm. = 0.573171), norm. avg. (of 10) = 0.501837 fft 29: mflops = 115.444 (norm. = 0.215596), norm. avg. (of 9) = 0.109949 fft 30: mflops = 256.804 (norm. = 0.479592), norm. avg. (of 9) = 0.383089 fft 31: mflops = 206.285 (norm. = 0.385246), norm. avg. (of 10) = 0.218136 fft 32: mflops = 185.05 (norm. = 0.345588), norm. avg. (of 10) = 0.214251 fft 33: mflops = 241.989 (norm. = 0.451923), norm. avg. (of 10) = 0.360781 fft 34: mflops = 65.5386 (norm. = 0.122396), norm. avg. (of 10) = 0.133269 fft 35: mflops = 157.293 (norm. = 0.29375), norm. avg. (of 10) = 0.201198 fft 36: mflops = 146.319 (norm. = 0.273256), norm. avg. (of 10) = 0.161624 fft 37: mflops = 20.1657 (norm. = 0.0376603), norm. avg. (of 10) = 0.0503345 fft 38: mflops = 437.684 (norm. = 0.817391), norm. avg. (of 10) = 0.482147 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.54994 s, 2048 iters, t-(init.)=1.53327 s t(norm)=0.0332328, mflops=150.454 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.6666 s, 2048 iters, t-(init.)=1.64993 s t(norm)=0.0357614, mflops=139.816 (err=1.4e-15) 2. Arndt Split-Radix: elapsed time t=1.24995 s, 1024 iters, t-(init.)=1.23328 s t(norm)=0.0534614, mflops=93.5254 (err=1.5e-15) 3. 
Arndt 4-step: elapsed time t=1.23328 s, 512 iters, t-(init.)=1.23328 s t(norm)=0.106923, mflops=46.7627 (err=1.4e-15) 4. Bailey: elapsed time t=1.5666 s, 2048 iters, t-(init.)=1.54994 s t(norm)=0.033594, mflops=148.836 (err=1.4e-15) 5. Beauregard: elapsed time t=1.29995 s, 256 iters, t-(init.)=1.29995 s t(norm)=0.225405, mflops=22.1823 (err=1.4e-15) 6. Bergland: elapsed time t=1.08329 s, 2048 iters, t-(init.)=1.06662 s t(norm)=0.0231185, mflops=216.277 (err=1.5e-15) 7. Brenner: elapsed time t=1.11662 s, 1024 iters, t-(init.)=1.09996 s t(norm)=0.0476818, mflops=104.862 (err=1.4e-15) 8. Burrus: elapsed time t=1.6166 s, 1024 iters, t-(init.)=1.6166 s t(norm)=0.0700778, mflops=71.3493 (err=1.4e-15) 9. CWP (min N) (N=2145): elapsed time t=1.5666 s, 4096 iters, t-(init.)=1.53327 s t(norm)=0.0166164, mflops=300.908 10. CWP (best N) (N=2184): elapsed time t=1.29995 s, 4096 iters, t-(init.)=1.26662 s t(norm)=0.0137266, mflops=364.257 11. Edelblute: elapsed time t=1.73326 s, 1024 iters, t-(init.)=1.7166 s t(norm)=0.0744125, mflops=67.193 (err=1.4e-15) 12. FFTPACK: elapsed time t=1.03329 s, 2048 iters, t-(init.)=1.01663 s t(norm)=0.0220348, mflops=226.914 (err=1.4e-15) 13. FFTPACK (f2c): elapsed time t=1.51661 s, 2048 iters, t-(init.)=1.49994 s t(norm)=0.0325103, mflops=153.797 (err=1.4e-15) FFTW_MEASURE plan: (cost = 2.604063e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.19995 s, 4096 iters, t-(init.)=1.14995 s t(norm)=0.0124623, mflops=401.21 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.18329 s, 4096 iters, t-(init.)=1.14995 s t(norm)=0.0124623, mflops=401.21 (err=1.4e-15) 16. Frigo-old: elapsed time t=1.63327 s, 4096 iters, t-(init.)=1.59994 s t(norm)=0.0173388, mflops=288.37 (err=1.3e-15) 17. Green: elapsed time t=1.09996 s, 4096 iters, t-(init.)=1.06662 s t(norm)=0.0115592, mflops=432.555 (err=1.4e-15) 18. 
GSL: elapsed time t=1.21662 s, 2048 iters, t-(init.)=1.19995 s t(norm)=0.0260083, mflops=192.247 (err=1.4e-15) 19. GSL DIT: elapsed time t=1.29995 s, 1024 iters, t-(init.)=1.28328 s t(norm)=0.0556288, mflops=89.8815 (err=2.0e-15) 20. GSL DIF: elapsed time t=1.06662 s, 1024 iters, t-(init.)=1.06662 s t(norm)=0.0462369, mflops=108.139 (err=2.3e-15) 21. Krukar: elapsed time t=1.04996 s, 1024 iters, t-(init.)=1.04996 s t(norm)=0.0455145, mflops=109.855 (err=1.4e-15) 22. Mayer (Buneman): elapsed time t=1.84993 s, 2048 iters, t-(init.)=1.83326 s t(norm)=0.0397348, mflops=125.834 (err=1.4e-15) 23. Mayer (simple): elapsed time t=1.51661 s, 2048 iters, t-(init.)=1.49994 s t(norm)=0.0325103, mflops=153.797 24. Mayer (lookup): elapsed time t=1.53327 s, 2048 iters, t-(init.)=1.51661 s t(norm)=0.0328715, mflops=152.107 (err=1.4e-15) 25. Monro: elapsed time t=1.89992 s, 2048 iters, t-(init.)=1.88326 s t(norm)=0.0408185, mflops=122.493 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.64993 s, 512 iters, t-(init.)=1.64993 s t(norm)=0.143045, mflops=34.9539 (err=1.5e-14) 27. Ooura (C): elapsed time t=1.58327 s, 4096 iters, t-(init.)=1.54994 s t(norm)=0.016797, mflops=297.672 (err=1.4e-15) 28. Ooura (F): elapsed time t=1.73326 s, 4096 iters, t-(init.)=1.69993 s t(norm)=0.0184225, mflops=271.407 (err=1.4e-15) 29. Ransom: elapsed time t=1.09996 s, 1024 iters, t-(init.)=1.08329 s t(norm)=0.0469594, mflops=106.475 (err=2.1e-15) 30. SCIPORT: elapsed time t=1.23328 s, 2048 iters, t-(init.)=1.21662 s t(norm)=0.0263695, mflops=189.613 (err=1.4e-15) 31. Singleton: elapsed time t=1.31661 s, 2048 iters, t-(init.)=1.29995 s t(norm)=0.0281756, mflops=177.458 (err=1.9e-15) 32. Singleton (f2c): elapsed time t=1.44994 s, 2048 iters, t-(init.)=1.43328 s t(norm)=0.0310654, mflops=160.951 (err=1.9e-15) 33. Sorensen: elapsed time t=1.94992 s, 4096 iters, t-(init.)=1.91659 s t(norm)=0.0207705, mflops=240.726 (err=1.4e-15) 34. 
Sorensen DIT: elapsed time t=1.73326 s, 1024 iters, t-(init.)=1.7166 s t(norm)=0.0744125, mflops=67.193 (err=1.4e-15) 35. Temperton: elapsed time t=1.6166 s, 2048 iters, t-(init.)=1.59994 s t(norm)=0.0346777, mflops=144.185 (err=1.1e-07) 36. Temperton (f2c): elapsed time t=1.73326 s, 2048 iters, t-(init.)=1.7166 s t(norm)=0.0372063, mflops=134.386 (err=1.4e-15) 37. Valkenburg: elapsed time t=1.49994 s, 256 iters, t-(init.)=1.49994 s t(norm)=0.260083, mflops=19.2247 (err=1.7e-15) 38. DXML: elapsed time t=1.48327 s, 4096 iters, t-(init.)=1.44994 s t(norm)=0.0157133, mflops=318.201 (err=2.9e-15) Top mflops for N=2048 = 432.555 Normalized results and averages for N=2048: fft 0: mflops = 150.454 (norm. = 0.347826), norm. avg. (of 11) = 0.306738 fft 1: mflops = 139.816 (norm. = 0.323232), norm. avg. (of 11) = 0.26561 fft 2: mflops = 93.5254 (norm. = 0.216216), norm. avg. (of 11) = 0.174311 fft 3: mflops = 46.7627 (norm. = 0.108108), norm. avg. (of 11) = 0.0595913 fft 4: mflops = 148.836 (norm. = 0.344086), norm. avg. (of 11) = 0.237137 fft 5: mflops = 22.1823 (norm. = 0.0512821), norm. avg. (of 11) = 0.044625 fft 6: mflops = 216.277 (norm. = 0.5), norm. avg. (of 11) = 0.265243 fft 7: mflops = 104.862 (norm. = 0.242424), norm. avg. (of 11) = 0.150496 fft 8: mflops = 71.3493 (norm. = 0.164948), norm. avg. (of 11) = 0.138848 fft 9: mflops = 300.908 (norm. = 0.695652), norm. avg. (of 11) = 0.40415 fft 10: mflops = 364.257 (norm. = 0.842105), norm. avg. (of 11) = 0.448397 fft 11: mflops = 67.193 (norm. = 0.15534), norm. avg. (of 10) = 0.106255 fft 12: mflops = 226.914 (norm. = 0.52459), norm. avg. (of 11) = 0.430825 fft 13: mflops = 153.797 (norm. = 0.355556), norm. avg. (of 11) = 0.27025 fft 14: mflops = 401.21 (norm. = 0.927536), norm. avg. (of 11) = 0.863346 fft 15: mflops = 401.21 (norm. = 0.927536), norm. avg. (of 11) = 0.877228 fft 16: mflops = 288.37 (norm. = 0.666667), norm. avg. (of 11) = 0.820391 fft 17: mflops = 432.555 (norm. = 1), norm. avg. 
(of 9) = 0.646173 fft 18: mflops = 192.247 (norm. = 0.444444), norm. avg. (of 11) = 0.317673 fft 19: mflops = 89.8815 (norm. = 0.207792), norm. avg. (of 11) = 0.130994 fft 20: mflops = 108.139 (norm. = 0.25), norm. avg. (of 11) = 0.154167 fft 21: mflops = 109.855 (norm. = 0.253968), norm. avg. (of 11) = 0.354265 fft 22: mflops = 125.834 (norm. = 0.290909), norm. avg. (of 10) = 0.218594 fft 23: mflops = 153.797 (norm. = 0.355556), norm. avg. (of 10) = 0.267034 fft 24: mflops = 152.107 (norm. = 0.351648), norm. avg. (of 10) = 0.256642 fft 25: mflops = 122.493 (norm. = 0.283186), norm. avg. (of 10) = 0.170635 fft 26: mflops = 34.9539 (norm. = 0.0808081), norm. avg. (of 11) = 0.0622514 fft 27: mflops = 297.672 (norm. = 0.688172), norm. avg. (of 11) = 0.549086 fft 28: mflops = 271.407 (norm. = 0.627451), norm. avg. (of 11) = 0.513257 fft 29: mflops = 106.475 (norm. = 0.246154), norm. avg. (of 10) = 0.123569 fft 30: mflops = 189.613 (norm. = 0.438356), norm. avg. (of 10) = 0.388615 fft 31: mflops = 177.458 (norm. = 0.410256), norm. avg. (of 11) = 0.235602 fft 32: mflops = 160.951 (norm. = 0.372093), norm. avg. (of 11) = 0.2286 fft 33: mflops = 240.726 (norm. = 0.556522), norm. avg. (of 11) = 0.378576 fft 34: mflops = 67.193 (norm. = 0.15534), norm. avg. (of 11) = 0.135276 fft 35: mflops = 144.185 (norm. = 0.333333), norm. avg. (of 11) = 0.21321 fft 36: mflops = 134.386 (norm. = 0.31068), norm. avg. (of 11) = 0.175175 fft 37: mflops = 19.2247 (norm. = 0.0444444), norm. avg. (of 11) = 0.0497991 fft 38: mflops = 318.201 (norm. = 0.735632), norm. avg. (of 11) = 0.505191 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.86659 s, 1024 iters, t-(init.)=1.84993 s t(norm)=0.0367547, mflops=136.037 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.01663 s, 512 iters, t-(init.)=1.01663 s t(norm)=0.0403971, mflops=123.771 (err=3.7e-15) 2. 
Arndt Split-Radix: elapsed time t=1.39994 s, 512 iters, t-(init.)=1.39994 s t(norm)=0.0556288, mflops=89.8815 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.26662 s, 256 iters, t-(init.)=1.26662 s t(norm)=0.100662, mflops=49.6714 (err=3.7e-15) 4. Bailey: elapsed time t=1.86659 s, 512 iters, t-(init.)=1.86659 s t(norm)=0.0741717, mflops=67.4112 (err=3.7e-15) 5. Beauregard: elapsed time t=1.44994 s, 128 iters, t-(init.)=1.44994 s t(norm)=0.230462, mflops=21.6955 (err=3.8e-15) 6. Bergland: elapsed time t=1.21662 s, 1024 iters, t-(init.)=1.19995 s t(norm)=0.0238409, mflops=209.724 (err=3.9e-15) 7. Brenner: elapsed time t=1.18329 s, 512 iters, t-(init.)=1.18329 s t(norm)=0.0470196, mflops=106.339 (err=3.8e-15) 8. Burrus: elapsed time t=1.89992 s, 512 iters, t-(init.)=1.89992 s t(norm)=0.0754962, mflops=66.2285 (err=3.7e-15) 9. CWP (min N) (N=4290): elapsed time t=1.81659 s, 2048 iters, t-(init.)=1.78326 s t(norm)=0.0177151, mflops=282.245 10. CWP (best N) (N=4368): elapsed time t=1.63327 s, 2048 iters, t-(init.)=1.59994 s t(norm)=0.0158939, mflops=314.585 11. Edelblute: elapsed time t=1.91659 s, 512 iters, t-(init.)=1.89992 s t(norm)=0.0754962, mflops=66.2285 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.69993 s, 1024 iters, t-(init.)=1.68327 s t(norm)=0.0334435, mflops=149.506 (err=3.8e-15) 13. FFTPACK (f2c): elapsed time t=1.01663 s, 512 iters, t-(init.)=0.99996 s t(norm)=0.0397348, mflops=125.834 (err=3.8e-15) FFTW_MEASURE plan: (cost = 8.463203e-04) FFTW_TWIDDLE 32 FFTW_TWIDDLE 4 FFTW_NOTW 32 14. FFTW: elapsed time t=1.7666 s, 2048 iters, t-(init.)=1.73326 s t(norm)=0.0172184, mflops=290.387 (err=3.8e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.89992 s, 2048 iters, t-(init.)=1.86659 s t(norm)=0.0185429, mflops=269.645 (err=3.8e-15) 16. Frigo-old: elapsed time t=1.49994 s, 1024 iters, t-(init.)=1.48327 s t(norm)=0.02947, mflops=169.664 (err=3.8e-15) 17. 
Green: elapsed time t=1.24995 s, 2048 iters, t-(init.)=1.21662 s t(norm)=0.012086, mflops=413.701 (err=3.8e-15) 18. GSL: elapsed time t=1.78326 s, 1024 iters, t-(init.)=1.7666 s t(norm)=0.0350991, mflops=142.454 (err=3.8e-15) 19. GSL DIT: elapsed time t=1.41661 s, 512 iters, t-(init.)=1.39994 s t(norm)=0.0556288, mflops=89.8815 (err=4.1e-15) 20. GSL DIF: elapsed time t=1.13329 s, 512 iters, t-(init.)=1.11662 s t(norm)=0.0443706, mflops=112.687 (err=4.3e-15) 21. Krukar: elapsed time t=1.26662 s, 512 iters, t-(init.)=1.24995 s t(norm)=0.0496686, mflops=100.667 (err=3.8e-15) 22. Mayer (Buneman): elapsed time t=1.03329 s, 512 iters, t-(init.)=1.03329 s t(norm)=0.0410593, mflops=121.775 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.69993 s, 1024 iters, t-(init.)=1.68327 s t(norm)=0.0334435, mflops=149.506 24. Mayer (lookup): elapsed time t=1.73326 s, 1024 iters, t-(init.)=1.7166 s t(norm)=0.0341057, mflops=146.603 (err=3.7e-15) 25. Monro: elapsed time t=1.08329 s, 512 iters, t-(init.)=1.08329 s t(norm)=0.0430461, mflops=116.155 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=2.03325 s, 256 iters, t-(init.)=2.03325 s t(norm)=0.161588, mflops=30.9428 (err=4.9e-14) 27. Ooura (C): elapsed time t=1.51661 s, 2048 iters, t-(init.)=1.48327 s t(norm)=0.014735, mflops=339.328 (err=3.9e-15) 28. Ooura (F): elapsed time t=1.7666 s, 2048 iters, t-(init.)=1.73326 s t(norm)=0.0172184, mflops=290.387 (err=3.9e-15) 29. Ransom: elapsed time t=1.01663 s, 512 iters, t-(init.)=0.99996 s t(norm)=0.0397348, mflops=125.834 (err=4.4e-15) 30. SCIPORT: elapsed time t=1.88326 s, 512 iters, t-(init.)=1.88326 s t(norm)=0.0748339, mflops=66.8146 (err=3.8e-15) 31. Singleton: elapsed time t=1.24995 s, 1024 iters, t-(init.)=1.23328 s t(norm)=0.0245032, mflops=204.055 (err=5.8e-15) 32. Singleton (f2c): elapsed time t=1.43328 s, 1024 iters, t-(init.)=1.41661 s t(norm)=0.0281455, mflops=177.648 (err=5.8e-15) 33. 
Sorensen: elapsed time t=1.13329 s, 1024 iters, t-(init.)=1.11662 s t(norm)=0.0221853, mflops=225.375 (err=3.7e-15) 34. Sorensen DIT: elapsed time t=1.01663 s, 256 iters, t-(init.)=1.01663 s t(norm)=0.0807942, mflops=61.8856 (err=3.7e-15) 35. Temperton: elapsed time t=1.86659 s, 1024 iters, t-(init.)=1.84993 s t(norm)=0.0367547, mflops=136.037 (err=1.2e-07) 36. Temperton (f2c): elapsed time t=1.99992 s, 1024 iters, t-(init.)=1.98325 s t(norm)=0.0394037, mflops=126.892 (err=3.8e-15) 37. Valkenburg: elapsed time t=1.74993 s, 128 iters, t-(init.)=1.73326 s t(norm)=0.275495, mflops=18.1492 (err=4.0e-15) 38. DXML: elapsed time t=1.29995 s, 1024 iters, t-(init.)=1.26662 s t(norm)=0.0251654, mflops=198.686 (err=4.7e-15) Top mflops for N=4096 = 413.701 Normalized results and averages for N=4096: fft 0: mflops = 136.037 (norm. = 0.328829), norm. avg. (of 12) = 0.308579 fft 1: mflops = 123.771 (norm. = 0.29918), norm. avg. (of 12) = 0.268408 fft 2: mflops = 89.8815 (norm. = 0.217262), norm. avg. (of 12) = 0.17789 fft 3: mflops = 49.6714 (norm. = 0.120066), norm. avg. (of 12) = 0.0646308 fft 4: mflops = 67.4112 (norm. = 0.162946), norm. avg. (of 12) = 0.230954 fft 5: mflops = 21.6955 (norm. = 0.0524425), norm. avg. (of 12) = 0.0452764 fft 6: mflops = 209.724 (norm. = 0.506944), norm. avg. (of 12) = 0.285384 fft 7: mflops = 106.339 (norm. = 0.257042), norm. avg. (of 12) = 0.159375 fft 8: mflops = 66.2285 (norm. = 0.160088), norm. avg. (of 12) = 0.140618 fft 9: mflops = 282.245 (norm. = 0.682243), norm. avg. (of 12) = 0.427325 fft 10: mflops = 314.585 (norm. = 0.760417), norm. avg. (of 12) = 0.474399 fft 11: mflops = 66.2285 (norm. = 0.160088), norm. avg. (of 11) = 0.111149 fft 12: mflops = 149.506 (norm. = 0.361386), norm. avg. (of 12) = 0.425039 fft 13: mflops = 125.834 (norm. = 0.304167), norm. avg. (of 12) = 0.273076 fft 14: mflops = 290.387 (norm. = 0.701923), norm. avg. (of 12) = 0.849894 fft 15: mflops = 269.645 (norm. = 0.651786), norm. avg. 
(of 12) = 0.858441 fft 16: mflops = 169.664 (norm. = 0.410112), norm. avg. (of 12) = 0.786201 fft 17: mflops = 413.701 (norm. = 1), norm. avg. (of 10) = 0.681556 fft 18: mflops = 142.454 (norm. = 0.34434), norm. avg. (of 12) = 0.319895 fft 19: mflops = 89.8815 (norm. = 0.217262), norm. avg. (of 12) = 0.138183 fft 20: mflops = 112.687 (norm. = 0.272388), norm. avg. (of 12) = 0.164019 fft 21: mflops = 100.667 (norm. = 0.243333), norm. avg. (of 12) = 0.345021 fft 22: mflops = 121.775 (norm. = 0.294355), norm. avg. (of 11) = 0.225481 fft 23: mflops = 149.506 (norm. = 0.361386), norm. avg. (of 11) = 0.275612 fft 24: mflops = 146.603 (norm. = 0.354369), norm. avg. (of 11) = 0.265526 fft 25: mflops = 116.155 (norm. = 0.280769), norm. avg. (of 11) = 0.180648 fft 26: mflops = 30.9428 (norm. = 0.0747951), norm. avg. (of 12) = 0.0632967 fft 27: mflops = 339.328 (norm. = 0.820225), norm. avg. (of 12) = 0.57168 fft 28: mflops = 290.387 (norm. = 0.701923), norm. avg. (of 12) = 0.528979 fft 29: mflops = 125.834 (norm. = 0.304167), norm. avg. (of 11) = 0.139987 fft 30: mflops = 66.8146 (norm. = 0.161504), norm. avg. (of 11) = 0.367969 fft 31: mflops = 204.055 (norm. = 0.493243), norm. avg. (of 12) = 0.257072 fft 32: mflops = 177.648 (norm. = 0.429412), norm. avg. (of 12) = 0.245334 fft 33: mflops = 225.375 (norm. = 0.544776), norm. avg. (of 12) = 0.392426 fft 34: mflops = 61.8856 (norm. = 0.14959), norm. avg. (of 12) = 0.136468 fft 35: mflops = 136.037 (norm. = 0.328829), norm. avg. (of 12) = 0.222845 fft 36: mflops = 126.892 (norm. = 0.306723), norm. avg. (of 12) = 0.186137 fft 37: mflops = 18.1492 (norm. = 0.0438702), norm. avg. (of 12) = 0.049305 fft 38: mflops = 198.686 (norm. = 0.480263), norm. avg. (of 12) = 0.503114 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.5666 s, 256 iters, t-(init.)=1.53327 s t(norm)=0.0562401, mflops=88.9046 (err=3.7e-15) 1. 
Arndt DIT: elapsed time t=1.69993 s, 256 iters, t-(init.)=1.6666 s t(norm)=0.0611305, mflops=81.7922 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.04996 s, 128 iters, t-(init.)=1.03329 s t(norm)=0.0758018, mflops=65.9615 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.49994 s, 128 iters, t-(init.)=1.48327 s t(norm)=0.108812, mflops=45.9507 (err=3.7e-15) 4. Bailey: elapsed time t=1.26662 s, 128 iters, t-(init.)=1.24995 s t(norm)=0.0916958, mflops=54.5281 (err=3.7e-15) 5. Beauregard: elapsed time t=1.6166 s, 64 iters, t-(init.)=1.59994 s t(norm)=0.234741, mflops=21.3001 (err=3.7e-15) 6. Bergland: elapsed time t=1.86659 s, 512 iters, t-(init.)=1.78326 s t(norm)=0.0327048, mflops=152.883 (err=3.7e-15) 7. Brenner: elapsed time t=1.6166 s, 256 iters, t-(init.)=1.58327 s t(norm)=0.058074, mflops=86.0971 (err=3.7e-15) 8. Burrus: elapsed time t=1.31661 s, 128 iters, t-(init.)=1.29995 s t(norm)=0.0953636, mflops=52.4309 (err=3.7e-15) 9. CWP (min N) (N=8580): elapsed time t=1.11662 s, 512 iters, t-(init.)=1.03329 s t(norm)=0.0189505, mflops=263.846 10. CWP (best N) (N=9240): elapsed time t=1.06662 s, 512 iters, t-(init.)=0.983294 s t(norm)=0.0180335, mflops=277.262 11. Edelblute: elapsed time t=1.31661 s, 128 iters, t-(init.)=1.29995 s t(norm)=0.0953636, mflops=52.4309 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.38328 s, 256 iters, t-(init.)=1.34995 s t(norm)=0.0495157, mflops=100.978 (err=3.7e-15) 13. FFTPACK (f2c): elapsed time t=1.54994 s, 256 iters, t-(init.)=1.51661 s t(norm)=0.0556288, mflops=89.8815 (err=3.7e-15) FFTW_MEASURE plan: (cost = 2.473859e-03) FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.29995 s, 512 iters, t-(init.)=1.23328 s t(norm)=0.0226183, mflops=221.06 (err=3.7e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.48327 s, 512 iters, t-(init.)=1.41661 s t(norm)=0.0259805, mflops=192.452 (err=3.7e-15) 16. 
Frigo-old: elapsed time t=2.01659 s, 512 iters, t-(init.)=1.93326 s t(norm)=0.0354557, mflops=141.021 (err=3.7e-15) 17. Green: elapsed time t=1.26662 s, 512 iters, t-(init.)=1.18329 s t(norm)=0.0217013, mflops=230.401 (err=3.7e-15) 18. GSL: elapsed time t=1.31661 s, 256 iters, t-(init.)=1.28328 s t(norm)=0.0470705, mflops=106.224 (err=3.7e-15) 19. GSL DIT: elapsed time t=1.06662 s, 128 iters, t-(init.)=1.04996 s t(norm)=0.0770245, mflops=64.9144 (err=4.3e-15) 20. GSL DIF: elapsed time t=1.81659 s, 256 iters, t-(init.)=1.78326 s t(norm)=0.0654097, mflops=76.4413 (err=4.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.11662 s, 256 iters, t-(init.)=1.08329 s t(norm)=0.0397348, mflops=125.834 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.88326 s, 512 iters, t-(init.)=1.79993 s t(norm)=0.0330105, mflops=151.467 24. Mayer (lookup): elapsed time t=1.04996 s, 256 iters, t-(init.)=1.01663 s t(norm)=0.0372896, mflops=134.086 (err=3.7e-15) 25. Monro: elapsed time t=1.91659 s, 256 iters, t-(init.)=1.88326 s t(norm)=0.0690775, mflops=72.3825 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.21662 s, 64 iters, t-(init.)=1.21662 s t(norm)=0.178501, mflops=28.011 (err=4.5e-14) 27. Ooura (C): elapsed time t=1.39994 s, 512 iters, t-(init.)=1.33328 s t(norm)=0.0244522, mflops=204.48 (err=3.7e-15) 28. Ooura (F): elapsed time t=1.46661 s, 512 iters, t-(init.)=1.38328 s t(norm)=0.0253692, mflops=197.09 (err=3.7e-15) 29. Ransom: elapsed time t=1.39994 s, 256 iters, t-(init.)=1.36661 s t(norm)=0.050127, mflops=99.7466 (err=4.9e-15) 30. SCIPORT: elapsed time t=1.36661 s, 128 iters, t-(init.)=1.34995 s t(norm)=0.0990314, mflops=50.489 (err=3.7e-15) 31. Singleton: elapsed time t=1.08329 s, 256 iters, t-(init.)=1.04996 s t(norm)=0.0385122, mflops=129.829 (err=5.6e-15) 32. Singleton (f2c): elapsed time t=1.19995 s, 256 iters, t-(init.)=1.16662 s t(norm)=0.0427914, mflops=116.846 (err=5.6e-15) 33. 
Sorensen: elapsed time t=1.14995 s, 256 iters, t-(init.)=1.11662 s t(norm)=0.0409575, mflops=122.078 (err=3.7e-15) 34. Sorensen DIT: elapsed time t=1.41661 s, 128 iters, t-(init.)=1.39994 s t(norm)=0.102699, mflops=48.6858 (err=3.7e-15) 35. Temperton: elapsed time t=1.41661 s, 256 iters, t-(init.)=1.38328 s t(norm)=0.0507383, mflops=98.5448 (err=1.4e-07) 36. Temperton (f2c): elapsed time t=1.63327 s, 256 iters, t-(init.)=1.58327 s t(norm)=0.058074, mflops=86.0971 (err=3.7e-15) 37. Valkenburg: elapsed time t=1.01663 s, 32 iters, t-(init.)=0.99996 s t(norm)=0.293427, mflops=17.04 (err=3.8e-15) 38. DXML: elapsed time t=1.04996 s, 256 iters, t-(init.)=1.01663 s t(norm)=0.0372896, mflops=134.086 (err=4.6e-15) Top mflops for N=8192 = 277.262 Normalized results and averages for N=8192: fft 0: mflops = 88.9046 (norm. = 0.320652), norm. avg. (of 13) = 0.309508 fft 1: mflops = 81.7922 (norm. = 0.295), norm. avg. (of 13) = 0.270453 fft 2: mflops = 65.9615 (norm. = 0.237903), norm. avg. (of 13) = 0.182506 fft 3: mflops = 45.9507 (norm. = 0.16573), norm. avg. (of 13) = 0.0724077 fft 4: mflops = 54.5281 (norm. = 0.196667), norm. avg. (of 13) = 0.228317 fft 5: mflops = 21.3001 (norm. = 0.0768229), norm. avg. (of 13) = 0.0477031 fft 6: mflops = 152.883 (norm. = 0.551402), norm. avg. (of 13) = 0.305847 fft 7: mflops = 86.0971 (norm. = 0.310526), norm. avg. (of 13) = 0.171002 fft 8: mflops = 52.4309 (norm. = 0.189103), norm. avg. (of 13) = 0.144348 fft 9: mflops = 263.846 (norm. = 0.951613), norm. avg. (of 13) = 0.467655 fft 10: mflops = 277.262 (norm. = 1), norm. avg. (of 13) = 0.514829 fft 11: mflops = 52.4309 (norm. = 0.189103), norm. avg. (of 12) = 0.117645 fft 12: mflops = 100.978 (norm. = 0.364198), norm. avg. (of 13) = 0.420358 fft 13: mflops = 89.8815 (norm. = 0.324176), norm. avg. (of 13) = 0.277007 fft 14: mflops = 221.06 (norm. = 0.797297), norm. avg. (of 13) = 0.845849 fft 15: mflops = 192.452 (norm. = 0.694118), norm. avg. 
(of 13) = 0.845801 fft 16: mflops = 141.021 (norm. = 0.508621), norm. avg. (of 13) = 0.764848 fft 17: mflops = 230.401 (norm. = 0.830986), norm. avg. (of 11) = 0.69514 fft 18: mflops = 106.224 (norm. = 0.383117), norm. avg. (of 13) = 0.324758 fft 19: mflops = 64.9144 (norm. = 0.234127), norm. avg. (of 13) = 0.145563 fft 20: mflops = 76.4413 (norm. = 0.275701), norm. avg. (of 13) = 0.17261 fft 21: mflops = -1 (norm. = -0.0036067), norm. avg. (of 12) = 0.345021 fft 22: mflops = 125.834 (norm. = 0.453846), norm. avg. (of 12) = 0.244512 fft 23: mflops = 151.467 (norm. = 0.546296), norm. avg. (of 12) = 0.298169 fft 24: mflops = 134.086 (norm. = 0.483607), norm. avg. (of 12) = 0.283699 fft 25: mflops = 72.3825 (norm. = 0.261062), norm. avg. (of 12) = 0.187349 fft 26: mflops = 28.011 (norm. = 0.101027), norm. avg. (of 13) = 0.066199 fft 27: mflops = 204.48 (norm. = 0.7375), norm. avg. (of 13) = 0.584436 fft 28: mflops = 197.09 (norm. = 0.710843), norm. avg. (of 13) = 0.542969 fft 29: mflops = 99.7466 (norm. = 0.359756), norm. avg. (of 12) = 0.158301 fft 30: mflops = 50.489 (norm. = 0.182099), norm. avg. (of 12) = 0.35248 fft 31: mflops = 129.829 (norm. = 0.468254), norm. avg. (of 13) = 0.273317 fft 32: mflops = 116.846 (norm. = 0.421429), norm. avg. (of 13) = 0.25888 fft 33: mflops = 122.078 (norm. = 0.440299), norm. avg. (of 13) = 0.396108 fft 34: mflops = 48.6858 (norm. = 0.175595), norm. avg. (of 13) = 0.139478 fft 35: mflops = 98.5448 (norm. = 0.355422), norm. avg. (of 13) = 0.233043 fft 36: mflops = 86.0971 (norm. = 0.310526), norm. avg. (of 13) = 0.195705 fft 37: mflops = 17.04 (norm. = 0.0614583), norm. avg. (of 13) = 0.0502399 fft 38: mflops = 134.086 (norm. = 0.483607), norm. avg. (of 13) = 0.501613 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.48327 s, 64 iters, t-(init.)=1.44994 s t(norm)=0.0987695, mflops=50.6229 (err=6.8e-15) 1. 
Arndt DIT: elapsed time t=1.59994 s, 64 iters, t-(init.)=1.5666 s t(norm)=0.106716, mflops=46.8531 (err=6.8e-15) 2. Arndt Split-Radix: elapsed time t=1.83326 s, 64 iters, t-(init.)=1.79993 s t(norm)=0.12261, mflops=40.7796 (err=6.8e-15) 3. Arndt 4-step: elapsed time t=1.59994 s, 64 iters, t-(init.)=1.5666 s t(norm)=0.106716, mflops=46.8531 (err=6.8e-15) 4. Bailey: elapsed time t=1.6166 s, 64 iters, t-(init.)=1.58327 s t(norm)=0.107852, mflops=46.36 (err=6.8e-15) 5. Beauregard: elapsed time t=1.81659 s, 32 iters, t-(init.)=1.79993 s t(norm)=0.245221, mflops=20.3898 (err=6.8e-15) 6. Bergland: elapsed time t=1.39994 s, 128 iters, t-(init.)=1.33328 s t(norm)=0.0454112, mflops=110.105 (err=6.8e-15) 7. Brenner: elapsed time t=1.06662 s, 64 iters, t-(init.)=1.03329 s t(norm)=0.0703874, mflops=71.0354 (err=6.8e-15) 8. Burrus: elapsed time t=1.08329 s, 32 iters, t-(init.)=1.06662 s t(norm)=0.145316, mflops=34.4078 (err=6.8e-15) 9. CWP (min N) (N=17160): elapsed time t=1.41661 s, 256 iters, t-(init.)=1.28328 s t(norm)=0.0218542, mflops=228.789 10. CWP (best N) (N=17160): elapsed time t=1.43328 s, 256 iters, t-(init.)=1.29995 s t(norm)=0.022138, mflops=225.856 11. Edelblute: elapsed time t=1.04996 s, 32 iters, t-(init.)=1.03329 s t(norm)=0.140775, mflops=35.5177 (err=6.8e-15) 12. FFTPACK: elapsed time t=1.63327 s, 128 iters, t-(init.)=1.5666 s t(norm)=0.0533582, mflops=93.7063 (err=6.8e-15) 13. FFTPACK (f2c): elapsed time t=1.69993 s, 128 iters, t-(init.)=1.63327 s t(norm)=0.0556288, mflops=89.8815 (err=6.8e-15) FFTW_MEASURE plan: (cost = 6.770562e-03) FFTW_TWIDDLE 64 FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.78326 s, 256 iters, t-(init.)=1.6666 s t(norm)=0.028382, mflops=176.168 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.41661 s, 256 iters, t-(init.)=1.28328 s t(norm)=0.0218542, mflops=228.789 (err=6.8e-15) 16. 
Frigo-old: elapsed time t=1.19995 s, 128 iters, t-(init.)=1.13329 s t(norm)=0.0385996, mflops=129.535 (err=6.8e-15) 17. Green: elapsed time t=1.19995 s, 128 iters, t-(init.)=1.13329 s t(norm)=0.0385996, mflops=129.535 (err=6.8e-15) 18. GSL: elapsed time t=1.51661 s, 128 iters, t-(init.)=1.46661 s t(norm)=0.0499524, mflops=100.095 (err=6.8e-15) 19. GSL DIT: elapsed time t=1.59994 s, 64 iters, t-(init.)=1.5666 s t(norm)=0.106716, mflops=46.8531 (err=7.2e-15) 20. GSL DIF: elapsed time t=1.39994 s, 64 iters, t-(init.)=1.36661 s t(norm)=0.0930931, mflops=53.7097 (err=7.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.93326 s, 128 iters, t-(init.)=1.86659 s t(norm)=0.0635757, mflops=78.6463 (err=6.8e-15) 23. Mayer (simple): elapsed time t=1.78326 s, 128 iters, t-(init.)=1.7166 s t(norm)=0.058467, mflops=85.5184 24. Mayer (lookup): elapsed time t=1.96659 s, 128 iters, t-(init.)=1.89992 s t(norm)=0.064711, mflops=77.2666 (err=6.8e-15) 25. Monro: elapsed time t=1.7166 s, 64 iters, t-(init.)=1.68327 s t(norm)=0.114663, mflops=43.6059 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.29995 s, 32 iters, t-(init.)=1.28328 s t(norm)=0.174833, mflops=28.5987 (err=2.3e-13) 27. Ooura (C): elapsed time t=1.01663 s, 128 iters, t-(init.)=0.966628 s t(norm)=0.0329232, mflops=151.869 (err=6.8e-15) 28. Ooura (F): elapsed time t=1.04996 s, 128 iters, t-(init.)=0.99996 s t(norm)=0.0340584, mflops=146.807 (err=6.8e-15) 29. Ransom: elapsed time t=1.36661 s, 128 iters, t-(init.)=1.29995 s t(norm)=0.044276, mflops=112.928 (err=7.4e-15) 30. SCIPORT: elapsed time t=1.6666 s, 64 iters, t-(init.)=1.63327 s t(norm)=0.111258, mflops=44.9408 (err=6.8e-15) 31. Singleton: elapsed time t=1.5666 s, 128 iters, t-(init.)=1.49994 s t(norm)=0.0510877, mflops=97.871 (err=1.0e-14) 32. Singleton (f2c): elapsed time t=1.63327 s, 128 iters, t-(init.)=1.5666 s t(norm)=0.0533582, mflops=93.7063 (err=1.0e-14) 33. 
Sorensen: elapsed time t=1.08329 s, 64 iters, t-(init.)=1.04996 s t(norm)=0.0715227, mflops=69.9079 (err=6.8e-15) 34. Sorensen DIT: elapsed time t=1.19995 s, 32 iters, t-(init.)=1.18329 s t(norm)=0.16121, mflops=31.0155 (err=6.8e-15) 35. Temperton: elapsed time t=1.86659 s, 128 iters, t-(init.)=1.81659 s t(norm)=0.0618728, mflops=80.8109 (err=1.5e-07) 36. Temperton (f2c): elapsed time t=1.94992 s, 128 iters, t-(init.)=1.88326 s t(norm)=0.0641434, mflops=77.9504 (err=6.8e-15) 37. Valkenburg: elapsed time t=1.16662 s, 16 iters, t-(init.)=1.14995 s t(norm)=0.313338, mflops=15.9572 (err=6.9e-15) 38. DXML: elapsed time t=1.81659 s, 256 iters, t-(init.)=1.68327 s t(norm)=0.0286658, mflops=174.424 (err=7.6e-15) Top mflops for N=16384 = 228.789 Normalized results and averages for N=16384: fft 0: mflops = 50.6229 (norm. = 0.221264), norm. avg. (of 14) = 0.303205 fft 1: mflops = 46.8531 (norm. = 0.204787), norm. avg. (of 14) = 0.265763 fft 2: mflops = 40.7796 (norm. = 0.178241), norm. avg. (of 14) = 0.182202 fft 3: mflops = 46.8531 (norm. = 0.204787), norm. avg. (of 14) = 0.0818634 fft 4: mflops = 46.36 (norm. = 0.202632), norm. avg. (of 14) = 0.226482 fft 5: mflops = 20.3898 (norm. = 0.0891204), norm. avg. (of 14) = 0.0506615 fft 6: mflops = 110.105 (norm. = 0.48125), norm. avg. (of 14) = 0.318376 fft 7: mflops = 71.0354 (norm. = 0.310484), norm. avg. (of 14) = 0.180965 fft 8: mflops = 34.4078 (norm. = 0.150391), norm. avg. (of 14) = 0.144779 fft 9: mflops = 228.789 (norm. = 1), norm. avg. (of 14) = 0.505679 fft 10: mflops = 225.856 (norm. = 0.987179), norm. avg. (of 14) = 0.548569 fft 11: mflops = 35.5177 (norm. = 0.155242), norm. avg. (of 13) = 0.120537 fft 12: mflops = 93.7063 (norm. = 0.409574), norm. avg. (of 14) = 0.419588 fft 13: mflops = 89.8815 (norm. = 0.392857), norm. avg. (of 14) = 0.285282 fft 14: mflops = 176.168 (norm. = 0.77), norm. avg. (of 14) = 0.840431 fft 15: mflops = 228.789 (norm. = 1), norm. avg. (of 14) = 0.856815 fft 16: mflops = 129.535 (norm. 
= 0.566176), norm. avg. (of 14) = 0.750658 fft 17: mflops = 129.535 (norm. = 0.566176), norm. avg. (of 12) = 0.684393 fft 18: mflops = 100.095 (norm. = 0.4375), norm. avg. (of 14) = 0.332811 fft 19: mflops = 46.8531 (norm. = 0.204787), norm. avg. (of 14) = 0.149794 fft 20: mflops = 53.7097 (norm. = 0.234756), norm. avg. (of 14) = 0.177049 fft 21: mflops = -1 (norm. = -0.00437083), norm. avg. (of 12) = 0.345021 fft 22: mflops = 78.6463 (norm. = 0.34375), norm. avg. (of 13) = 0.252145 fft 23: mflops = 85.5184 (norm. = 0.373786), norm. avg. (of 13) = 0.303986 fft 24: mflops = 77.2666 (norm. = 0.337719), norm. avg. (of 13) = 0.287855 fft 25: mflops = 43.6059 (norm. = 0.190594), norm. avg. (of 13) = 0.187598 fft 26: mflops = 28.5987 (norm. = 0.125), norm. avg. (of 14) = 0.0703991 fft 27: mflops = 151.869 (norm. = 0.663793), norm. avg. (of 14) = 0.590104 fft 28: mflops = 146.807 (norm. = 0.641667), norm. avg. (of 14) = 0.550018 fft 29: mflops = 112.928 (norm. = 0.49359), norm. avg. (of 13) = 0.184093 fft 30: mflops = 44.9408 (norm. = 0.196429), norm. avg. (of 13) = 0.340476 fft 31: mflops = 97.871 (norm. = 0.427778), norm. avg. (of 14) = 0.28435 fft 32: mflops = 93.7063 (norm. = 0.409574), norm. avg. (of 14) = 0.269644 fft 33: mflops = 69.9079 (norm. = 0.305556), norm. avg. (of 14) = 0.38964 fft 34: mflops = 31.0155 (norm. = 0.135563), norm. avg. (of 14) = 0.139199 fft 35: mflops = 80.8109 (norm. = 0.353211), norm. avg. (of 14) = 0.241627 fft 36: mflops = 77.9504 (norm. = 0.340708), norm. avg. (of 14) = 0.206063 fft 37: mflops = 15.9572 (norm. = 0.0697464), norm. avg. (of 14) = 0.0516332 fft 38: mflops = 174.424 (norm. = 0.762376), norm. avg. (of 14) = 0.520239 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.74993 s, 32 iters, t-(init.)=1.7166 s t(norm)=0.109138, mflops=45.8134 (err=1.4e-14) 1. Arndt DIT: elapsed time t=1.81659 s, 32 iters, t-(init.)=1.78326 s t(norm)=0.113377, mflops=44.1008 (err=1.4e-14) 2. 
Arndt Split-Radix: elapsed time t=1.13329 s, 16 iters, t-(init.)=1.09996 s t(norm)=0.139867, mflops=35.7483 (err=1.4e-14) 3. Arndt 4-step: elapsed time t=1.01663 s, 16 iters, t-(init.)=0.99996 s t(norm)=0.127151, mflops=39.3232 (err=1.4e-14) 4. Bailey: elapsed time t=1.79993 s, 32 iters, t-(init.)=1.7666 s t(norm)=0.112317, mflops=44.5168 (err=1.4e-14) 5. Beauregard: elapsed time t=2.01659 s, 16 iters, t-(init.)=1.99992 s t(norm)=0.254303, mflops=19.6616 (err=1.4e-14) 6. Bergland: elapsed time t=1.63327 s, 64 iters, t-(init.)=1.5666 s t(norm)=0.049801, mflops=100.4 (err=1.4e-14) 7. Brenner: elapsed time t=1.26662 s, 32 iters, t-(init.)=1.23328 s t(norm)=0.0784101, mflops=63.7673 (err=1.4e-14) 8. Burrus: elapsed time t=1.33328 s, 16 iters, t-(init.)=1.31661 s t(norm)=0.167416, mflops=29.8657 (err=1.4e-14) 9. CWP (min N) (N=34320): elapsed time t=1.73326 s, 128 iters, t-(init.)=1.5666 s t(norm)=0.0249005, mflops=200.799 10. CWP (best N) (N=34320): elapsed time t=1.73326 s, 128 iters, t-(init.)=1.58327 s t(norm)=0.0251654, mflops=198.686 11. Edelblute: elapsed time t=1.28328 s, 16 iters, t-(init.)=1.26662 s t(norm)=0.161059, mflops=31.0446 (err=1.4e-14) 12. FFTPACK: elapsed time t=1.93326 s, 64 iters, t-(init.)=1.84993 s t(norm)=0.0588076, mflops=85.0231 (err=1.4e-14) 13. FFTPACK (f2c): elapsed time t=1.03329 s, 32 iters, t-(init.)=0.99996 s t(norm)=0.0635757, mflops=78.6463 (err=1.4e-14) FFTW_MEASURE plan: (cost = 1.458275e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.89992 s, 128 iters, t-(init.)=1.74993 s t(norm)=0.0278144, mflops=179.763 (err=1.4e-14) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.89992 s, 128 iters, t-(init.)=1.74993 s t(norm)=0.0278144, mflops=179.763 (err=1.4e-14) 16. Frigo-old: elapsed time t=1.43328 s, 64 iters, t-(init.)=1.34995 s t(norm)=0.0429136, mflops=116.513 (err=1.4e-14) 17. 
Green: elapsed time t=1.38328 s, 64 iters, t-(init.)=1.29995 s t(norm)=0.0413242, mflops=120.994 (err=1.4e-14) 18. GSL: elapsed time t=1.64993 s, 64 iters, t-(init.)=1.58327 s t(norm)=0.0503308, mflops=99.3428 (err=1.4e-14) 19. GSL DIT: elapsed time t=1.89992 s, 32 iters, t-(init.)=1.86659 s t(norm)=0.118675, mflops=42.132 (err=1.4e-14) 20. GSL DIF: elapsed time t=1.68327 s, 32 iters, t-(init.)=1.63327 s t(norm)=0.10384, mflops=48.1508 (err=1.4e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.29995 s, 32 iters, t-(init.)=1.26662 s t(norm)=0.0805293, mflops=62.0892 (err=1.4e-14) 23. Mayer (simple): elapsed time t=1.23328 s, 32 iters, t-(init.)=1.18329 s t(norm)=0.0752313, mflops=66.4617 24. Mayer (lookup): elapsed time t=1.28328 s, 32 iters, t-(init.)=1.24995 s t(norm)=0.0794697, mflops=62.9171 (err=1.4e-14) 25. Monro: elapsed time t=1.03329 s, 16 iters, t-(init.)=1.01663 s t(norm)=0.129271, mflops=38.6785 (err=1.5e-07) 26. NAPACK (f2c): elapsed time t=1.46661 s, 16 iters, t-(init.)=1.44994 s t(norm)=0.18437, mflops=27.1194 (err=5.6e-13) 27. Ooura (C): elapsed time t=1.18329 s, 64 iters, t-(init.)=1.11662 s t(norm)=0.0354965, mflops=140.859 (err=1.4e-14) 28. Ooura (F): elapsed time t=1.24995 s, 64 iters, t-(init.)=1.18329 s t(norm)=0.0376156, mflops=132.923 (err=1.4e-14) 29. Ransom: elapsed time t=1.7666 s, 64 iters, t-(init.)=1.68327 s t(norm)=0.0535096, mflops=93.4412 (err=1.5e-14) 30. SCIPORT: elapsed time t=1.29995 s, 16 iters, t-(init.)=1.26662 s t(norm)=0.161059, mflops=31.0446 (err=1.4e-14) 31. Singleton: elapsed time t=1.11662 s, 32 iters, t-(init.)=1.08329 s t(norm)=0.0688737, mflops=72.5966 (err=2.1e-14) 32. Singleton (f2c): elapsed time t=1.16662 s, 32 iters, t-(init.)=1.11662 s t(norm)=0.0709929, mflops=70.4296 (err=2.1e-14) 33. Sorensen: elapsed time t=1.36661 s, 32 iters, t-(init.)=1.33328 s t(norm)=0.0847677, mflops=58.9848 (err=1.4e-14) 34. 
Sorensen DIT: elapsed time t=1.41661 s, 16 iters, t-(init.)=1.39994 s t(norm)=0.178012, mflops=28.088 (err=1.4e-14) 35. Temperton: elapsed time t=1.19995 s, 32 iters, t-(init.)=1.16662 s t(norm)=0.0741717, mflops=67.4112 (err=1.5e-07) 36. Temperton (f2c): elapsed time t=1.24995 s, 32 iters, t-(init.)=1.21662 s t(norm)=0.0773505, mflops=64.6408 (err=1.4e-14) 37. Valkenburg: elapsed time t=1.33328 s, 8 iters, t-(init.)=1.33328 s t(norm)=0.339071, mflops=14.7462 (err=1.4e-14) 38. DXML: elapsed time t=1.03329 s, 64 iters, t-(init.)=0.966628 s t(norm)=0.0307283, mflops=162.717 (err=1.4e-14) Top mflops for N=32768 = 200.799 Normalized results and averages for N=32768: fft 0: mflops = 45.8134 (norm. = 0.228155), norm. avg. (of 15) = 0.298201 fft 1: mflops = 44.1008 (norm. = 0.219626), norm. avg. (of 15) = 0.262687 fft 2: mflops = 35.7483 (norm. = 0.17803), norm. avg. (of 15) = 0.181923 fft 3: mflops = 39.3232 (norm. = 0.195833), norm. avg. (of 15) = 0.0894614 fft 4: mflops = 44.5168 (norm. = 0.221698), norm. avg. (of 15) = 0.226163 fft 5: mflops = 19.6616 (norm. = 0.0979167), norm. avg. (of 15) = 0.0538118 fft 6: mflops = 100.4 (norm. = 0.5), norm. avg. (of 15) = 0.330484 fft 7: mflops = 63.7673 (norm. = 0.317568), norm. avg. (of 15) = 0.190072 fft 8: mflops = 29.8657 (norm. = 0.148734), norm. avg. (of 15) = 0.145043 fft 9: mflops = 200.799 (norm. = 1), norm. avg. (of 15) = 0.538634 fft 10: mflops = 198.686 (norm. = 0.989474), norm. avg. (of 15) = 0.577962 fft 11: mflops = 31.0446 (norm. = 0.154605), norm. avg. (of 14) = 0.12297 fft 12: mflops = 85.0231 (norm. = 0.423423), norm. avg. (of 15) = 0.419844 fft 13: mflops = 78.6463 (norm. = 0.391667), norm. avg. (of 15) = 0.292374 fft 14: mflops = 179.763 (norm. = 0.895238), norm. avg. (of 15) = 0.844085 fft 15: mflops = 179.763 (norm. = 0.895238), norm. avg. (of 15) = 0.859377 fft 16: mflops = 116.513 (norm. = 0.580247), norm. avg. (of 15) = 0.739297 fft 17: mflops = 120.994 (norm. = 0.602564), norm. avg. 
(of 13) = 0.678099 fft 18: mflops = 99.3428 (norm. = 0.494737), norm. avg. (of 15) = 0.343606 fft 19: mflops = 42.132 (norm. = 0.209821), norm. avg. (of 15) = 0.153796 fft 20: mflops = 48.1508 (norm. = 0.239796), norm. avg. (of 15) = 0.181232 fft 21: mflops = -1 (norm. = -0.0049801), norm. avg. (of 12) = 0.345021 fft 22: mflops = 62.0892 (norm. = 0.309211), norm. avg. (of 14) = 0.256221 fft 23: mflops = 66.4617 (norm. = 0.330986), norm. avg. (of 14) = 0.305914 fft 24: mflops = 62.9171 (norm. = 0.313333), norm. avg. (of 14) = 0.289675 fft 25: mflops = 38.6785 (norm. = 0.192623), norm. avg. (of 14) = 0.187957 fft 26: mflops = 27.1194 (norm. = 0.135057), norm. avg. (of 15) = 0.0747097 fft 27: mflops = 140.859 (norm. = 0.701493), norm. avg. (of 15) = 0.59753 fft 28: mflops = 132.923 (norm. = 0.661972), norm. avg. (of 15) = 0.557482 fft 29: mflops = 93.4412 (norm. = 0.465347), norm. avg. (of 14) = 0.204182 fft 30: mflops = 31.0446 (norm. = 0.154605), norm. avg. (of 14) = 0.327199 fft 31: mflops = 72.5966 (norm. = 0.361538), norm. avg. (of 15) = 0.289496 fft 32: mflops = 70.4296 (norm. = 0.350746), norm. avg. (of 15) = 0.275051 fft 33: mflops = 58.9848 (norm. = 0.29375), norm. avg. (of 15) = 0.383248 fft 34: mflops = 28.088 (norm. = 0.139881), norm. avg. (of 15) = 0.139244 fft 35: mflops = 67.4112 (norm. = 0.335714), norm. avg. (of 15) = 0.247899 fft 36: mflops = 64.6408 (norm. = 0.321918), norm. avg. (of 15) = 0.213786 fft 37: mflops = 14.7462 (norm. = 0.0734375), norm. avg. (of 15) = 0.0530868 fft 38: mflops = 162.717 (norm. = 0.810345), norm. avg. (of 15) = 0.539579 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.23328 s, 8 iters, t-(init.)=1.21662 s t(norm)=0.145032, mflops=34.4751 (err=1.7e-14) 1. Arndt DIT: elapsed time t=1.26662 s, 8 iters, t-(init.)=1.24995 s t(norm)=0.149006, mflops=33.5558 (err=1.7e-14) 2. Arndt Split-Radix: elapsed time t=1.58327 s, 8 iters, t-(init.)=1.5666 s t(norm)=0.186754, mflops=26.7732 (err=1.7e-14) 3. 
Arndt 4-step: elapsed time t=1.93326 s, 16 iters, t-(init.)=1.88326 s t(norm)=0.112251, mflops=44.5431 (err=1.7e-14) 4. Bailey: elapsed time t=1.48327 s, 8 iters, t-(init.)=1.46661 s t(norm)=0.174833, mflops=28.5987 (err=1.7e-14) 5. Beauregard: elapsed time t=1.09996 s, 4 iters, t-(init.)=1.08329 s t(norm)=0.258276, mflops=19.3591 (err=1.7e-14) 6. Bergland: elapsed time t=1.08329 s, 16 iters, t-(init.)=1.03329 s t(norm)=0.061589, mflops=81.1833 (err=1.7e-14) 7. Brenner: elapsed time t=1.58327 s, 16 iters, t-(init.)=1.53327 s t(norm)=0.0913901, mflops=54.7105 (err=1.7e-14) 8. Burrus: elapsed time t=1.73326 s, 8 iters, t-(init.)=1.69993 s t(norm)=0.202648, mflops=24.6734 (err=1.7e-14) 9. CWP (min N) (N=72072): elapsed time t=1.03329 s, 32 iters, t-(init.)=0.91663 s t(norm)=0.0273177, mflops=183.031 10. CWP (best N) (N=72072): elapsed time t=1.03329 s, 32 iters, t-(init.)=0.899964 s t(norm)=0.026821, mflops=186.421 11. Edelblute: elapsed time t=1.68327 s, 8 iters, t-(init.)=1.64993 s t(norm)=0.196687, mflops=25.421 (err=1.7e-14) 12. FFTPACK: elapsed time t=1.38328 s, 16 iters, t-(init.)=1.33328 s t(norm)=0.0794697, mflops=62.9171 (err=1.7e-14) 13. FFTPACK (f2c): elapsed time t=1.38328 s, 16 iters, t-(init.)=1.33328 s t(norm)=0.0794697, mflops=62.9171 (err=1.7e-14) FFTW_MEASURE plan: (cost = 3.541525e-02) FFTW_TWIDDLE 64 FFTW_TWIDDLE 32 FFTW_NOTW 32 14. FFTW: elapsed time t=1.38328 s, 32 iters, t-(init.)=1.29995 s t(norm)=0.0387415, mflops=129.061 (err=1.7e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.39994 s, 32 iters, t-(init.)=1.29995 s t(norm)=0.0387415, mflops=129.061 (err=1.7e-14) 16. Frigo-old: elapsed time t=1.04996 s, 16 iters, t-(init.)=0.99996 s t(norm)=0.0596023, mflops=83.8894 (err=1.7e-14) 17. Green: elapsed time t=1.84993 s, 32 iters, t-(init.)=1.7666 s t(norm)=0.0526487, mflops=94.9692 (err=1.7e-14) 18. 
GSL: elapsed time t=1.21662 s, 16 iters, t-(init.)=1.16662 s t(norm)=0.069536, mflops=71.9052 (err=1.7e-14) 19. GSL DIT: elapsed time t=1.33328 s, 8 iters, t-(init.)=1.31661 s t(norm)=0.156953, mflops=31.8567 (err=1.7e-14) 20. GSL DIF: elapsed time t=1.16662 s, 8 iters, t-(init.)=1.13329 s t(norm)=0.135098, mflops=37.01 (err=1.8e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.68327 s, 16 iters, t-(init.)=1.63327 s t(norm)=0.0973504, mflops=51.3609 (err=1.7e-14) 23. Mayer (simple): elapsed time t=1.63327 s, 16 iters, t-(init.)=1.59994 s t(norm)=0.0953636, mflops=52.4309 24. Mayer (lookup): elapsed time t=1.73326 s, 16 iters, t-(init.)=1.68327 s t(norm)=0.10033, mflops=49.8353 (err=1.7e-14) 25. Monro: elapsed time t=1.36661 s, 8 iters, t-(init.)=1.33328 s t(norm)=0.158939, mflops=31.4585 (err=1.6e-07) 26. NAPACK (f2c): elapsed time t=1.81659 s, 8 iters, t-(init.)=1.79993 s t(norm)=0.214568, mflops=23.3026 (err=8.6e-13) 27. Ooura (C): elapsed time t=1.51661 s, 32 iters, t-(init.)=1.41661 s t(norm)=0.0422183, mflops=118.432 (err=1.7e-14) 28. Ooura (F): elapsed time t=1.5666 s, 32 iters, t-(init.)=1.46661 s t(norm)=0.0437083, mflops=114.395 (err=1.7e-14) 29. Ransom: elapsed time t=1.93326 s, 32 iters, t-(init.)=1.83326 s t(norm)=0.0546354, mflops=91.5157 (err=1.7e-14) 30. SCIPORT: elapsed time t=1.11662 s, 4 iters, t-(init.)=1.09996 s t(norm)=0.26225, mflops=19.0658 (err=1.7e-14) 31. Singleton: elapsed time t=1.29995 s, 16 iters, t-(init.)=1.24995 s t(norm)=0.0745028, mflops=67.1115 (err=2.3e-14) 32. Singleton (f2c): elapsed time t=1.38328 s, 16 iters, t-(init.)=1.33328 s t(norm)=0.0794697, mflops=62.9171 (err=2.3e-14) 33. Sorensen: elapsed time t=1.93326 s, 16 iters, t-(init.)=1.89992 s t(norm)=0.113244, mflops=44.1523 (err=1.7e-14) 34. Sorensen DIT: elapsed time t=1.86659 s, 8 iters, t-(init.)=1.84993 s t(norm)=0.220528, mflops=22.6728 (err=1.7e-14) 35. 
Temperton: elapsed time t=1.54994 s, 16 iters, t-(init.)=1.51661 s t(norm)=0.0903968, mflops=55.3117 (err=1.7e-07) 36. Temperton (f2c): elapsed time t=1.63327 s, 16 iters, t-(init.)=1.58327 s t(norm)=0.0943702, mflops=52.9828 (err=1.7e-14) 37. Valkenburg: elapsed time t=1.54994 s, 4 iters, t-(init.)=1.53327 s t(norm)=0.365561, mflops=13.6776 (err=1.7e-14) 38. DXML: elapsed time t=1.51661 s, 32 iters, t-(init.)=1.41661 s t(norm)=0.0422183, mflops=118.432 (err=1.7e-14) Top mflops for N=65536 = 186.421 Normalized results and averages for N=65536: fft 0: mflops = 34.4751 (norm. = 0.184932), norm. avg. (of 16) = 0.291122 fft 1: mflops = 33.5558 (norm. = 0.18), norm. avg. (of 16) = 0.257519 fft 2: mflops = 26.7732 (norm. = 0.143617), norm. avg. (of 16) = 0.179529 fft 3: mflops = 44.5431 (norm. = 0.238938), norm. avg. (of 16) = 0.0988037 fft 4: mflops = 28.5987 (norm. = 0.153409), norm. avg. (of 16) = 0.221616 fft 5: mflops = 19.3591 (norm. = 0.103846), norm. avg. (of 16) = 0.0569389 fft 6: mflops = 81.1833 (norm. = 0.435484), norm. avg. (of 16) = 0.337047 fft 7: mflops = 54.7105 (norm. = 0.293478), norm. avg. (of 16) = 0.196535 fft 8: mflops = 24.6734 (norm. = 0.132353), norm. avg. (of 16) = 0.14425 fft 9: mflops = 183.031 (norm. = 0.981818), norm. avg. (of 16) = 0.566333 fft 10: mflops = 186.421 (norm. = 1), norm. avg. (of 16) = 0.60434 fft 11: mflops = 25.421 (norm. = 0.136364), norm. avg. (of 15) = 0.123863 fft 12: mflops = 62.9171 (norm. = 0.3375), norm. avg. (of 16) = 0.414697 fft 13: mflops = 62.9171 (norm. = 0.3375), norm. avg. (of 16) = 0.295195 fft 14: mflops = 129.061 (norm. = 0.692308), norm. avg. (of 16) = 0.834599 fft 15: mflops = 129.061 (norm. = 0.692308), norm. avg. (of 16) = 0.848935 fft 16: mflops = 83.8894 (norm. = 0.45), norm. avg. (of 16) = 0.721216 fft 17: mflops = 94.9692 (norm. = 0.509434), norm. avg. (of 14) = 0.666051 fft 18: mflops = 71.9052 (norm. = 0.385714), norm. avg. (of 16) = 0.346238 fft 19: mflops = 31.8567 (norm. = 0.170886), norm. 
avg. (of 16) = 0.154864 fft 20: mflops = 37.01 (norm. = 0.198529), norm. avg. (of 16) = 0.182313 fft 21: mflops = -1 (norm. = -0.0053642), norm. avg. (of 12) = 0.345021 fft 22: mflops = 51.3609 (norm. = 0.27551), norm. avg. (of 15) = 0.257507 fft 23: mflops = 52.4309 (norm. = 0.28125), norm. avg. (of 15) = 0.30427 fft 24: mflops = 49.8353 (norm. = 0.267327), norm. avg. (of 15) = 0.288185 fft 25: mflops = 31.4585 (norm. = 0.16875), norm. avg. (of 15) = 0.186677 fft 26: mflops = 23.3026 (norm. = 0.125), norm. avg. (of 16) = 0.0778528 fft 27: mflops = 118.432 (norm. = 0.635294), norm. avg. (of 16) = 0.59989 fft 28: mflops = 114.395 (norm. = 0.613636), norm. avg. (of 16) = 0.560992 fft 29: mflops = 91.5157 (norm. = 0.490909), norm. avg. (of 15) = 0.223297 fft 30: mflops = 19.0658 (norm. = 0.102273), norm. avg. (of 15) = 0.312204 fft 31: mflops = 67.1115 (norm. = 0.36), norm. avg. (of 16) = 0.293902 fft 32: mflops = 62.9171 (norm. = 0.3375), norm. avg. (of 16) = 0.278954 fft 33: mflops = 44.1523 (norm. = 0.236842), norm. avg. (of 16) = 0.374097 fft 34: mflops = 22.6728 (norm. = 0.121622), norm. avg. (of 16) = 0.138143 fft 35: mflops = 55.3117 (norm. = 0.296703), norm. avg. (of 16) = 0.250949 fft 36: mflops = 52.9828 (norm. = 0.284211), norm. avg. (of 16) = 0.218188 fft 37: mflops = 13.6776 (norm. = 0.0733696), norm. avg. (of 16) = 0.0543545 fft 38: mflops = 118.432 (norm. = 0.635294), norm. avg. (of 16) = 0.545562 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.63327 s, 4 iters, t-(init.)=1.59994 s t(norm)=0.179508, mflops=27.8539 (err=3.3e-14) 1. Arndt DIT: elapsed time t=1.63327 s, 4 iters, t-(init.)=1.59994 s t(norm)=0.179508, mflops=27.8539 (err=3.3e-14) 2. Arndt Split-Radix: elapsed time t=1.09996 s, 2 iters, t-(init.)=1.08329 s t(norm)=0.243084, mflops=20.569 (err=3.3e-14) 3. Arndt 4-step: elapsed time t=1.41661 s, 4 iters, t-(init.)=1.36661 s t(norm)=0.15333, mflops=32.6095 (err=3.3e-14) 4. 
Bailey: elapsed time t=1.88326 s, 4 iters, t-(init.)=1.84993 s t(norm)=0.207556, mflops=24.0899 (err=3.3e-14) 5. Beauregard: elapsed time t=1.28328 s, 2 iters, t-(init.)=1.26662 s t(norm)=0.284221, mflops=17.5919 (err=3.3e-14) 6. Bergland: elapsed time t=1.54994 s, 8 iters, t-(init.)=1.48327 s t(norm)=0.0832094, mflops=60.0893 (err=3.4e-14) 7. Brenner: elapsed time t=1.16662 s, 4 iters, t-(init.)=1.13329 s t(norm)=0.127151, mflops=39.3232 (err=3.3e-14) 8. Burrus: elapsed time t=1.13329 s, 2 iters, t-(init.)=1.11662 s t(norm)=0.250563, mflops=19.955 (err=3.3e-14) 9. CWP (min N) (N=144144): elapsed time t=1.34995 s, 16 iters, t-(init.)=1.18329 s t(norm)=0.0331903, mflops=150.647 10. CWP (best N) (N=144144): elapsed time t=1.36661 s, 16 iters, t-(init.)=1.19995 s t(norm)=0.0336577, mflops=148.554 11. Edelblute: elapsed time t=1.11662 s, 2 iters, t-(init.)=1.09996 s t(norm)=0.246823, mflops=20.2574 (err=3.3e-14) 12. FFTPACK: elapsed time t=1.06662 s, 4 iters, t-(init.)=1.01663 s t(norm)=0.114062, mflops=43.8357 (err=3.3e-14) 13. FFTPACK (f2c): elapsed time t=1.06662 s, 4 iters, t-(init.)=1.03329 s t(norm)=0.115932, mflops=43.1286 (err=3.3e-14) FFTW_MEASURE plan: (cost = 9.166300e-02) FFTW_TWIDDLE 64 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.69993 s, 16 iters, t-(init.)=1.54994 s t(norm)=0.0434746, mflops=115.01 (err=3.3e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.83326 s, 16 iters, t-(init.)=1.69993 s t(norm)=0.0476818, mflops=104.862 (err=3.3e-14) 16. Frigo-old: elapsed time t=1.38328 s, 8 iters, t-(init.)=1.31661 s t(norm)=0.0738601, mflops=67.6956 (err=3.3e-14) 17. Green: elapsed time t=1.38328 s, 8 iters, t-(init.)=1.31661 s t(norm)=0.0738601, mflops=67.6956 (err=3.3e-14) 18. GSL: elapsed time t=1.81659 s, 8 iters, t-(init.)=1.73326 s t(norm)=0.0972335, mflops=51.4226 (err=3.3e-14) 19. 
GSL DIT: elapsed time t=1.84993 s, 4 iters, t-(init.)=1.79993 s t(norm)=0.201946, mflops=24.759 (err=3.5e-14) 20. GSL DIF: elapsed time t=1.69993 s, 4 iters, t-(init.)=1.64993 s t(norm)=0.185118, mflops=27.0099 (err=3.5e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.13329 s, 4 iters, t-(init.)=1.09996 s t(norm)=0.123412, mflops=40.5148 (err=3.3e-14) 23. Mayer (simple): elapsed time t=1.11662 s, 4 iters, t-(init.)=1.08329 s t(norm)=0.121542, mflops=41.1381 24. Mayer (lookup): elapsed time t=1.16662 s, 4 iters, t-(init.)=1.13329 s t(norm)=0.127151, mflops=39.3232 (err=3.3e-14) 25. Monro: elapsed time t=1.81659 s, 4 iters, t-(init.)=1.78326 s t(norm)=0.200077, mflops=24.9904 (err=1.7e-07) 26. NAPACK (f2c): elapsed time t=1.24995 s, 2 iters, t-(init.)=1.23328 s t(norm)=0.276741, mflops=18.0674 (err=2.0e-12) 27. Ooura (C): elapsed time t=1.03329 s, 8 iters, t-(init.)=0.949962 s t(norm)=0.0532914, mflops=93.8237 (err=3.4e-14) 28. Ooura (F): elapsed time t=1.06662 s, 8 iters, t-(init.)=0.99996 s t(norm)=0.0560962, mflops=89.1325 (err=3.4e-14) 29. Ransom: elapsed time t=1.34995 s, 8 iters, t-(init.)=1.28328 s t(norm)=0.0719902, mflops=69.4539 (err=3.3e-14) 30. SCIPORT: elapsed time t=1.46661 s, 2 iters, t-(init.)=1.44994 s t(norm)=0.325358, mflops=15.3677 (err=3.3e-14) 31. Singleton: elapsed time t=1.01663 s, 4 iters, t-(init.)=0.983294 s t(norm)=0.110323, mflops=45.3216 (err=4.8e-14) 32. Singleton (f2c): elapsed time t=1.04996 s, 4 iters, t-(init.)=1.01663 s t(norm)=0.114062, mflops=43.8357 (err=4.8e-14) 33. Sorensen: elapsed time t=1.09996 s, 4 iters, t-(init.)=1.06662 s t(norm)=0.119672, mflops=41.7809 (err=3.3e-14) 34. Sorensen DIT: elapsed time t=1.14995 s, 2 iters, t-(init.)=1.13329 s t(norm)=0.254303, mflops=19.6616 (err=3.3e-14) 35. Temperton: elapsed time t=1.11662 s, 4 iters, t-(init.)=1.08329 s t(norm)=0.121542, mflops=41.1381 (err=1.9e-07) 36. 
Temperton (f2c): elapsed time t=1.18329 s, 4 iters, t-(init.)=1.14995 s t(norm)=0.129021, mflops=38.7533 (err=3.3e-14) 37. Valkenburg: elapsed time t=1.98325 s, 2 iters, t-(init.)=1.96659 s t(norm)=0.44129, mflops=11.3304 (err=3.4e-14) 38. DXML: elapsed time t=1.81659 s, 16 iters, t-(init.)=1.68327 s t(norm)=0.0472143, mflops=105.9 (err=3.4e-14) Top mflops for N=131072 = 150.647 Normalized results and averages for N=131072: fft 0: mflops = 27.8539 (norm. = 0.184896), norm. avg. (of 17) = 0.284873 fft 1: mflops = 27.8539 (norm. = 0.184896), norm. avg. (of 17) = 0.253247 fft 2: mflops = 20.569 (norm. = 0.136538), norm. avg. (of 17) = 0.177 fft 3: mflops = 32.6095 (norm. = 0.216463), norm. avg. (of 17) = 0.105725 fft 4: mflops = 24.0899 (norm. = 0.15991), norm. avg. (of 17) = 0.217986 fft 5: mflops = 17.5919 (norm. = 0.116776), norm. avg. (of 17) = 0.0604588 fft 6: mflops = 60.0893 (norm. = 0.398876), norm. avg. (of 17) = 0.340684 fft 7: mflops = 39.3232 (norm. = 0.261029), norm. avg. (of 17) = 0.200329 fft 8: mflops = 19.955 (norm. = 0.132463), norm. avg. (of 17) = 0.143557 fft 9: mflops = 150.647 (norm. = 1), norm. avg. (of 17) = 0.591843 fft 10: mflops = 148.554 (norm. = 0.986111), norm. avg. (of 17) = 0.626797 fft 11: mflops = 20.2574 (norm. = 0.13447), norm. avg. (of 16) = 0.124526 fft 12: mflops = 43.8357 (norm. = 0.290984), norm. avg. (of 17) = 0.40742 fft 13: mflops = 43.1286 (norm. = 0.28629), norm. avg. (of 17) = 0.294671 fft 14: mflops = 115.01 (norm. = 0.763441), norm. avg. (of 17) = 0.830413 fft 15: mflops = 104.862 (norm. = 0.696078), norm. avg. (of 17) = 0.839943 fft 16: mflops = 67.6956 (norm. = 0.449367), norm. avg. (of 17) = 0.705225 fft 17: mflops = 67.6956 (norm. = 0.449367), norm. avg. (of 15) = 0.651606 fft 18: mflops = 51.4226 (norm. = 0.341346), norm. avg. (of 17) = 0.34595 fft 19: mflops = 24.759 (norm. = 0.164352), norm. avg. (of 17) = 0.155422 fft 20: mflops = 27.0099 (norm. = 0.179293), norm. avg. 
(of 17) = 0.182135 fft 21: mflops = -1 (norm. = -0.00663806), norm. avg. (of 12) = 0.345021 fft 22: mflops = 40.5148 (norm. = 0.268939), norm. avg. (of 16) = 0.258222 fft 23: mflops = 41.1381 (norm. = 0.273077), norm. avg. (of 16) = 0.30232 fft 24: mflops = 39.3232 (norm. = 0.261029), norm. avg. (of 16) = 0.286488 fft 25: mflops = 24.9904 (norm. = 0.165888), norm. avg. (of 16) = 0.185378 fft 26: mflops = 18.0674 (norm. = 0.119932), norm. avg. (of 17) = 0.0803281 fft 27: mflops = 93.8237 (norm. = 0.622807), norm. avg. (of 17) = 0.601238 fft 28: mflops = 89.1325 (norm. = 0.591667), norm. avg. (of 17) = 0.562796 fft 29: mflops = 69.4539 (norm. = 0.461039), norm. avg. (of 16) = 0.238156 fft 30: mflops = 15.3677 (norm. = 0.102011), norm. avg. (of 16) = 0.299067 fft 31: mflops = 45.3216 (norm. = 0.300847), norm. avg. (of 17) = 0.294311 fft 32: mflops = 43.8357 (norm. = 0.290984), norm. avg. (of 17) = 0.279662 fft 33: mflops = 41.7809 (norm. = 0.277344), norm. avg. (of 17) = 0.368406 fft 34: mflops = 19.6616 (norm. = 0.130515), norm. avg. (of 17) = 0.137694 fft 35: mflops = 41.1381 (norm. = 0.273077), norm. avg. (of 17) = 0.252251 fft 36: mflops = 38.7533 (norm. = 0.257246), norm. avg. (of 17) = 0.220486 fft 37: mflops = 11.3304 (norm. = 0.0752119), norm. avg. (of 17) = 0.0555814 fft 38: mflops = 105.9 (norm. = 0.70297), norm. avg. (of 17) = 0.554821 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=1.23328 s, 1 iters, t-(init.)=1.21662 s t(norm)=0.257835, mflops=19.3922 (err=4.3e-14) 1. Arndt DIT: elapsed time t=1.26662 s, 1 iters, t-(init.)=1.24995 s t(norm)=0.264899, mflops=18.8751 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=1.53327 s, 1 iters, t-(init.)=1.51661 s t(norm)=0.321411, mflops=15.5564 (err=4.3e-14) 3. Arndt 4-step: elapsed time t=1.43328 s, 2 iters, t-(init.)=1.39994 s t(norm)=0.148343, mflops=33.7056 (err=4.3e-14) 4.
Bailey: elapsed time t=1.08329 s, 1 iters, t-(init.)=1.04996 s t(norm)=0.222515, mflops=22.4704 (err=4.3e-14) 5. Beauregard: elapsed time t=1.44994 s, 1 iters, t-(init.)=1.41661 s t(norm)=0.300219, mflops=16.6545 (err=4.4e-14) 6. Bergland: elapsed time t=1.98325 s, 4 iters, t-(init.)=1.88326 s t(norm)=0.0997786, mflops=50.1109 (err=4.4e-14) 7. Brenner: elapsed time t=1.6166 s, 2 iters, t-(init.)=1.5666 s t(norm)=0.166003, mflops=30.1199 (err=4.4e-14) 8. Burrus: elapsed time t=1.59994 s, 1 iters, t-(init.)=1.58327 s t(norm)=0.335539, mflops=14.9014 (err=4.3e-14) 9. CWP (min N) (N=360360): elapsed time t=1.14995 s, 4 iters, t-(init.)=1.01663 s t(norm)=0.0538628, mflops=92.8285 10. CWP (best N) (N=360360): elapsed time t=1.13329 s, 4 iters, t-(init.)=0.99996 s t(norm)=0.0529798, mflops=94.3756 11. Edelblute: elapsed time t=1.58327 s, 1 iters, t-(init.)=1.5666 s t(norm)=0.332007, mflops=15.0599 (err=4.3e-14) 12. FFTPACK: elapsed time t=1.13329 s, 2 iters, t-(init.)=1.08329 s t(norm)=0.11479, mflops=43.558 (err=4.4e-14) 13. FFTPACK (f2c): elapsed time t=1.14995 s, 2 iters, t-(init.)=1.11662 s t(norm)=0.118322, mflops=42.2577 (err=4.4e-14) FFTW_MEASURE plan: (cost = 2.499900e-01) FFTW_TWIDDLE 64 FFTW_TWIDDLE 32 FFTW_TWIDDLE 4 FFTW_NOTW 32 14. FFTW: elapsed time t=1.06662 s, 4 iters, t-(init.)=0.966628 s t(norm)=0.0512138, mflops=97.6299 (err=4.4e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.01663 s, 4 iters, t-(init.)=0.91663 s t(norm)=0.0485648, mflops=102.955 (err=4.4e-14) 16. Frigo-old: elapsed time t=1.68327 s, 4 iters, t-(init.)=1.58327 s t(norm)=0.0838847, mflops=59.6057 (err=4.4e-14) 17. Green: elapsed time t=1.84993 s, 4 iters, t-(init.)=1.7666 s t(norm)=0.0935976, mflops=53.4202 (err=4.4e-14) 18. GSL: elapsed time t=1.99992 s, 4 iters, t-(init.)=1.89992 s t(norm)=0.100662, mflops=49.6714 (err=4.4e-14) 19. 
GSL DIT: elapsed time t=1.24995 s, 1 iters, t-(init.)=1.23328 s t(norm)=0.261367, mflops=19.1302 (err=4.6e-14) 20. GSL DIF: elapsed time t=1.19995 s, 1 iters, t-(init.)=1.16662 s t(norm)=0.247239, mflops=20.2233 (err=4.6e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.49994 s, 2 iters, t-(init.)=1.44994 s t(norm)=0.153641, mflops=32.5433 (err=4.3e-14) 23. Mayer (simple): elapsed time t=1.46661 s, 2 iters, t-(init.)=1.41661 s t(norm)=0.150109, mflops=33.309 24. Mayer (lookup): elapsed time t=1.54994 s, 2 iters, t-(init.)=1.49994 s t(norm)=0.158939, mflops=31.4585 (err=4.3e-14) 25. Monro: elapsed time t=1.33328 s, 1 iters, t-(init.)=1.31661 s t(norm)=0.279027, mflops=17.9194 (err=1.8e-07) 26. NAPACK (f2c): elapsed time t=1.38328 s, 1 iters, t-(init.)=1.36661 s t(norm)=0.289623, mflops=17.2638 (err=3.7e-12) 27. Ooura (C): elapsed time t=1.31661 s, 4 iters, t-(init.)=1.23328 s t(norm)=0.0653417, mflops=76.5208 (err=4.4e-14) 28. Ooura (F): elapsed time t=1.33328 s, 4 iters, t-(init.)=1.23328 s t(norm)=0.0653417, mflops=76.5208 (err=4.4e-14) 29. Ransom: elapsed time t=1.31661 s, 4 iters, t-(init.)=1.21662 s t(norm)=0.0644587, mflops=77.569 (err=4.3e-14) 30. SCIPORT: elapsed time t=1.83326 s, 1 iters, t-(init.)=1.81659 s t(norm)=0.384986, mflops=12.9875 (err=4.4e-14) 31. Singleton: elapsed time t=1.28328 s, 2 iters, t-(init.)=1.23328 s t(norm)=0.130683, mflops=38.2604 (err=6.0e-14) 32. Singleton (f2c): elapsed time t=1.33328 s, 2 iters, t-(init.)=1.29995 s t(norm)=0.137747, mflops=36.2983 (err=6.0e-14) 33. Sorensen: elapsed time t=1.5666 s, 2 iters, t-(init.)=1.53327 s t(norm)=0.162471, mflops=30.7747 (err=4.3e-14) 34. Sorensen DIT: elapsed time t=1.59994 s, 1 iters, t-(init.)=1.5666 s t(norm)=0.332007, mflops=15.0599 (err=4.3e-14) 35. Temperton: elapsed time t=1.38328 s, 2 iters, t-(init.)=1.34995 s t(norm)=0.143045, mflops=34.9539 (err=2.0e-07) 36.
Temperton (f2c): elapsed time t=1.43328 s, 2 iters, t-(init.)=1.38328 s t(norm)=0.146577, mflops=34.1117 (err=4.4e-14) 37. Valkenburg: elapsed time t=2.33324 s, 1 iters, t-(init.)=2.31657 s t(norm)=0.490946, mflops=10.1844 (err=4.4e-14) 38. DXML: elapsed time t=1.86659 s, 8 iters, t-(init.)=1.68327 s t(norm)=0.0445913, mflops=112.129 (err=4.4e-14) Top mflops for N=262144 = 112.129 Normalized results and averages for N=262144: fft 0: mflops = 19.3922 (norm. = 0.172945), norm. avg. (of 18) = 0.278655 fft 1: mflops = 18.8751 (norm. = 0.168333), norm. avg. (of 18) = 0.24853 fft 2: mflops = 15.5564 (norm. = 0.138736), norm. avg. (of 18) = 0.174875 fft 3: mflops = 33.7056 (norm. = 0.300595), norm. avg. (of 18) = 0.116551 fft 4: mflops = 22.4704 (norm. = 0.200397), norm. avg. (of 18) = 0.217009 fft 5: mflops = 16.6545 (norm. = 0.148529), norm. avg. (of 18) = 0.0653516 fft 6: mflops = 50.1109 (norm. = 0.446903), norm. avg. (of 18) = 0.346585 fft 7: mflops = 30.1199 (norm. = 0.268617), norm. avg. (of 18) = 0.204122 fft 8: mflops = 14.9014 (norm. = 0.132895), norm. avg. (of 18) = 0.142964 fft 9: mflops = 92.8285 (norm. = 0.827869), norm. avg. (of 18) = 0.604955 fft 10: mflops = 94.3756 (norm. = 0.841667), norm. avg. (of 18) = 0.638734 fft 11: mflops = 15.0599 (norm. = 0.134309), norm. avg. (of 17) = 0.125102 fft 12: mflops = 43.558 (norm. = 0.388462), norm. avg. (of 18) = 0.406367 fft 13: mflops = 42.2577 (norm. = 0.376866), norm. avg. (of 18) = 0.299237 fft 14: mflops = 97.6299 (norm. = 0.87069), norm. avg. (of 18) = 0.83265 fft 15: mflops = 102.955 (norm. = 0.918182), norm. avg. (of 18) = 0.84429 fft 16: mflops = 59.6057 (norm. = 0.531579), norm. avg. (of 18) = 0.695578 fft 17: mflops = 53.4202 (norm. = 0.476415), norm. avg. (of 16) = 0.640656 fft 18: mflops = 49.6714 (norm. = 0.442982), norm. avg. (of 18) = 0.351341 fft 19: mflops = 19.1302 (norm. = 0.170608), norm. avg. (of 18) = 0.156266 fft 20: mflops = 20.2233 (norm. = 0.180357), norm. avg. 
(of 18) = 0.182037 fft 21: mflops = -1 (norm. = -0.00891826), norm. avg. (of 12) = 0.345021 fft 22: mflops = 32.5433 (norm. = 0.29023), norm. avg. (of 17) = 0.260105 fft 23: mflops = 33.309 (norm. = 0.297059), norm. avg. (of 17) = 0.302011 fft 24: mflops = 31.4585 (norm. = 0.280556), norm. avg. (of 17) = 0.286139 fft 25: mflops = 17.9194 (norm. = 0.15981), norm. avg. (of 17) = 0.183874 fft 26: mflops = 17.2638 (norm. = 0.153963), norm. avg. (of 18) = 0.0844189 fft 27: mflops = 76.5208 (norm. = 0.682432), norm. avg. (of 18) = 0.605749 fft 28: mflops = 76.5208 (norm. = 0.682432), norm. avg. (of 18) = 0.569443 fft 29: mflops = 77.569 (norm. = 0.691781), norm. avg. (of 17) = 0.26484 fft 30: mflops = 12.9875 (norm. = 0.115826), norm. avg. (of 17) = 0.288288 fft 31: mflops = 38.2604 (norm. = 0.341216), norm. avg. (of 18) = 0.296917 fft 32: mflops = 36.2983 (norm. = 0.323718), norm. avg. (of 18) = 0.282109 fft 33: mflops = 30.7747 (norm. = 0.274457), norm. avg. (of 18) = 0.363186 fft 34: mflops = 15.0599 (norm. = 0.134309), norm. avg. (of 18) = 0.137506 fft 35: mflops = 34.9539 (norm. = 0.311728), norm. avg. (of 18) = 0.255555 fft 36: mflops = 34.1117 (norm. = 0.304217), norm. avg. (of 18) = 0.225137 fft 37: mflops = 10.1844 (norm. = 0.0908273), norm. avg. (of 18) = 0.0575395 fft 38: mflops = 112.129 (norm. = 1), norm. avg. (of 18) = 0.579553 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. Brenner 1. CWP (min N) 2. CWP (best N) 3. FFTPACK 4. FFTPACK (f2c) 5. FFTW 6. FFTW_ESTIMATE 7. Frigo-old 8. GSL 9.
NAPACK (f2c) 10. Singleton 11. Singleton (f2c) 12. Temperton 13. Temperton (f2c) 14. Valkenburg 15. DXML Computing normalized averages (16 transforms). Benchmarking for array size = 6: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.94992 s, 1048576 iters, t-(init.)=1.86659 s t(norm)=0.114774, mflops=43.5638 2. CWP (best N) (N=15): elapsed time t=1.33328 s, 524288 iters, t-(init.)=1.26662 s t(norm)=0.155765, mflops=32.0997 3. FFTPACK: elapsed time t=1.16662 s, 1048576 iters, t-(init.)=1.08329 s t(norm)=0.06661, mflops=75.0638 (err=1.0e-16) 4. FFTPACK (f2c): elapsed time t=1.34995 s, 1048576 iters, t-(init.)=1.26662 s t(norm)=0.0778824, mflops=64.1993 (err=1.8e-16) FFTW_MEASURE plan: (cost = 3.178787e-07) FFTW_NOTW 6 5. FFTW: elapsed time t=1.34995 s, 4194304 iters, t-(init.)=1.04996 s t(norm)=0.0161401, mflops=309.787 (err=1.1e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 6. FFTW_ESTIMATE: elapsed time t=1.33328 s, 4194304 iters, t-(init.)=0.699972 s t(norm)=0.0107601, mflops=464.681 (err=1.1e-16) 7. Frigo-old: elapsed time t=1.21662 s, 524288 iters, t-(init.)=1.18329 s t(norm)=0.145517, mflops=34.3602 (err=3.1e-16) 8. GSL: elapsed time t=1.08329 s, 1048576 iters, t-(init.)=0.99996 s t(norm)=0.0614861, mflops=81.3191 (err=1.2e-16) 9. NAPACK (f2c): elapsed time t=1.98325 s, 524288 iters, t-(init.)=1.93326 s t(norm)=0.237746, mflops=21.0308 (err=4.7e-16) 10. Singleton: elapsed time t=1.86659 s, 524288 iters, t-(init.)=1.79993 s t(norm)=0.22135, mflops=22.5887 (err=1.0e-16) 11. Singleton (f2c): elapsed time t=1.69993 s, 524288 iters, t-(init.)=1.64993 s t(norm)=0.202904, mflops=24.6422 (err=1.0e-16) 12. Temperton: elapsed time t=1.99992 s, 524288 iters, t-(init.)=1.94992 s t(norm)=0.239796, mflops=20.8511 (err=3.7e-16) 13. Temperton (f2c): elapsed time t=1.59994 s, 524288 iters, t-(init.)=1.5666 s t(norm)=0.192657, mflops=25.9529 (err=1.0e-16) 14. 
Valkenburg: elapsed time t=1.14995 s, 262144 iters, t-(init.)=1.13329 s t(norm)=0.278737, mflops=17.938 (err=3.2e-16) 15. DXML: elapsed time t=1.14995 s, 524288 iters, t-(init.)=1.09996 s t(norm)=0.13527, mflops=36.9632 (err=1.9e-16) Top mflops for N=6 = 464.681 Normalized results and averages for N=6: fft 0: mflops = -1 (norm. = -0.00215201), norm. avg. (of 0) = -1 fft 1: mflops = 43.5638 (norm. = 0.09375), norm. avg. (of 1) = 0.09375 fft 2: mflops = 32.0997 (norm. = 0.0690789), norm. avg. (of 1) = 0.0690789 fft 3: mflops = 75.0638 (norm. = 0.161538), norm. avg. (of 1) = 0.161538 fft 4: mflops = 64.1993 (norm. = 0.138158), norm. avg. (of 1) = 0.138158 fft 5: mflops = 309.787 (norm. = 0.666667), norm. avg. (of 1) = 0.666667 fft 6: mflops = 464.681 (norm. = 1), norm. avg. (of 1) = 1 fft 7: mflops = 34.3602 (norm. = 0.0739437), norm. avg. (of 1) = 0.0739437 fft 8: mflops = 81.3191 (norm. = 0.175), norm. avg. (of 1) = 0.175 fft 9: mflops = 21.0308 (norm. = 0.0452586), norm. avg. (of 1) = 0.0452586 fft 10: mflops = 22.5887 (norm. = 0.0486111), norm. avg. (of 1) = 0.0486111 fft 11: mflops = 24.6422 (norm. = 0.0530303), norm. avg. (of 1) = 0.0530303 fft 12: mflops = 20.8511 (norm. = 0.0448718), norm. avg. (of 1) = 0.0448718 fft 13: mflops = 25.9529 (norm. = 0.0558511), norm. avg. (of 1) = 0.0558511 fft 14: mflops = 17.938 (norm. = 0.0386029), norm. avg. (of 1) = 0.0386029 fft 15: mflops = 36.9632 (norm. = 0.0795455), norm. avg. (of 1) = 0.0795455 Benchmarking for array size = 9: 0. Brenner: elapsed time t=1.31661 s, 131072 iters, t-(init.)=1.29995 s t(norm)=0.347636, mflops=14.3829 (err=3.6e-16) 1. CWP (min N): elapsed time t=1.7666 s, 1048576 iters, t-(init.)=1.6666 s t(norm)=0.0557109, mflops=89.7491 2. CWP (best N) (N=15): elapsed time t=1.33328 s, 524288 iters, t-(init.)=1.26662 s t(norm)=0.0846805, mflops=59.0455 3. FFTPACK: elapsed time t=1.36661 s, 1048576 iters, t-(init.)=1.28328 s t(norm)=0.0428974, mflops=116.557 (err=1.4e-16) 4. 
FFTPACK (f2c): elapsed time t=1.7166 s, 1048576 iters, t-(init.)=1.6166 s t(norm)=0.0540395, mflops=92.5248 (err=2.4e-16) FFTW_MEASURE plan: (cost = 6.039696e-07) FFTW_NOTW 9 5. FFTW: elapsed time t=1.16662 s, 2097152 iters, t-(init.)=0.983294 s t(norm)=0.0164347, mflops=304.234 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.13329 s, 2097152 iters, t-(init.)=0.783302 s t(norm)=0.0130921, mflops=381.911 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.18329 s, 262144 iters, t-(init.)=1.16662 s t(norm)=0.15599, mflops=32.0532 (err=3.3e-16) 8. GSL: elapsed time t=1.03329 s, 524288 iters, t-(init.)=0.983294 s t(norm)=0.0657388, mflops=76.0585 (err=1.4e-16) 9. NAPACK (f2c): elapsed time t=1.23328 s, 262144 iters, t-(init.)=1.21662 s t(norm)=0.162676, mflops=30.736 (err=4.3e-16) 10. Singleton: elapsed time t=1.84993 s, 524288 iters, t-(init.)=1.78326 s t(norm)=0.119221, mflops=41.9388 (err=1.5e-16) 11. Singleton (f2c): elapsed time t=1.7666 s, 524288 iters, t-(init.)=1.7166 s t(norm)=0.114764, mflops=43.5675 (err=1.5e-16) 12. Temperton: elapsed time t=1.7666 s, 262144 iters, t-(init.)=1.74993 s t(norm)=0.233986, mflops=21.3688 (err=1.1e-08) 13. Temperton (f2c): elapsed time t=1.58327 s, 524288 iters, t-(init.)=1.53327 s t(norm)=0.102508, mflops=48.7767 (err=1.4e-16) 14. Valkenburg: elapsed time t=1.99992 s, 262144 iters, t-(init.)=1.98325 s t(norm)=0.265184, mflops=18.8549 (err=3.7e-16) 15. DXML: elapsed time t=1.28328 s, 524288 iters, t-(init.)=1.23328 s t(norm)=0.0824521, mflops=60.6413 (err=2.0e-16) Top mflops for N=9 = 381.911 Normalized results and averages for N=9: fft 0: mflops = 14.3829 (norm. = 0.0376603), norm. avg. (of 1) = 0.0376603 fft 1: mflops = 89.7491 (norm. = 0.235), norm. avg. (of 2) = 0.164375 fft 2: mflops = 59.0455 (norm. = 0.154605), norm. avg. (of 2) = 0.111842 fft 3: mflops = 116.557 (norm. = 0.305195), norm. avg. (of 2) = 0.233367 fft 4: mflops = 92.5248 (norm. = 0.242268), norm. avg.
(of 2) = 0.190213 fft 5: mflops = 304.234 (norm. = 0.79661), norm. avg. (of 2) = 0.731638 fft 6: mflops = 381.911 (norm. = 1), norm. avg. (of 2) = 1 fft 7: mflops = 32.0532 (norm. = 0.0839286), norm. avg. (of 2) = 0.0789361 fft 8: mflops = 76.0585 (norm. = 0.199153), norm. avg. (of 2) = 0.187076 fft 9: mflops = 30.736 (norm. = 0.0804795), norm. avg. (of 2) = 0.062869 fft 10: mflops = 41.9388 (norm. = 0.109813), norm. avg. (of 2) = 0.0792121 fft 11: mflops = 43.5675 (norm. = 0.114078), norm. avg. (of 2) = 0.083554 fft 12: mflops = 21.3688 (norm. = 0.0559524), norm. avg. (of 2) = 0.0504121 fft 13: mflops = 48.7767 (norm. = 0.127717), norm. avg. (of 2) = 0.0917842 fft 14: mflops = 18.8549 (norm. = 0.0493697), norm. avg. (of 2) = 0.0439863 fft 15: mflops = 60.6413 (norm. = 0.158784), norm. avg. (of 2) = 0.119165 Benchmarking for array size = 12: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.08329 s, 524288 iters, t-(init.)=1.03329 s t(norm)=0.0458128, mflops=109.14 2. CWP (best N) (N=15): elapsed time t=1.33328 s, 524288 iters, t-(init.)=1.26662 s t(norm)=0.0561577, mflops=89.035 3. FFTPACK: elapsed time t=1.48327 s, 1048576 iters, t-(init.)=1.38328 s t(norm)=0.0306651, mflops=163.052 (err=1.6e-16) 4. FFTPACK (f2c): elapsed time t=1.93326 s, 1048576 iters, t-(init.)=1.83326 s t(norm)=0.0406404, mflops=123.03 (err=1.9e-16) FFTW_MEASURE plan: (cost = 6.993332e-07) FFTW_NOTW 12 5. FFTW: elapsed time t=1.51661 s, 2097152 iters, t-(init.)=1.31661 s t(norm)=0.0145936, mflops=342.616 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.51661 s, 2097152 iters, t-(init.)=1.14995 s t(norm)=0.0127463, mflops=392.27 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.01663 s, 262144 iters, t-(init.)=0.99996 s t(norm)=0.08867, mflops=56.3888 (err=2.9e-16) 8. GSL: elapsed time t=1.01663 s, 524288 iters, t-(init.)=0.966628 s t(norm)=0.0428572, mflops=116.667 (err=1.6e-16) 9. 
NAPACK (f2c): elapsed time t=1.01663 s, 131072 iters, t-(init.)=0.99996 s t(norm)=0.17734, mflops=28.1944 (err=5.5e-16) 10. Singleton: elapsed time t=1.24995 s, 262144 iters, t-(init.)=1.19995 s t(norm)=0.106404, mflops=46.9907 (err=1.5e-16) 11. Singleton (f2c): elapsed time t=1.26662 s, 262144 iters, t-(init.)=1.23328 s t(norm)=0.10936, mflops=45.7207 (err=1.5e-16) 12. Temperton: elapsed time t=1.69993 s, 524288 iters, t-(init.)=1.64993 s t(norm)=0.0731528, mflops=68.3501 (err=5.4e-16) 13. Temperton (f2c): elapsed time t=1.99992 s, 524288 iters, t-(init.)=1.94992 s t(norm)=0.0864533, mflops=57.8347 (err=1.4e-16) 14. Valkenburg: elapsed time t=1.53327 s, 131072 iters, t-(init.)=1.51661 s t(norm)=0.268966, mflops=18.5897 (err=3.9e-16) 15. DXML: elapsed time t=1.28328 s, 524288 iters, t-(init.)=1.23328 s t(norm)=0.0546799, mflops=91.4414 (err=1.3e-16) Top mflops for N=12 = 392.27 Normalized results and averages for N=12: fft 0: mflops = -1 (norm. = -0.00254926), norm. avg. (of 1) = 0.0376603 fft 1: mflops = 109.14 (norm. = 0.278226), norm. avg. (of 3) = 0.202325 fft 2: mflops = 89.035 (norm. = 0.226974), norm. avg. (of 3) = 0.150219 fft 3: mflops = 163.052 (norm. = 0.415663), norm. avg. (of 3) = 0.294132 fft 4: mflops = 123.03 (norm. = 0.313636), norm. avg. (of 3) = 0.231354 fft 5: mflops = 342.616 (norm. = 0.873418), norm. avg. (of 3) = 0.778898 fft 6: mflops = 392.27 (norm. = 1), norm. avg. (of 3) = 1 fft 7: mflops = 56.3888 (norm. = 0.14375), norm. avg. (of 3) = 0.100541 fft 8: mflops = 116.667 (norm. = 0.297414), norm. avg. (of 3) = 0.223855 fft 9: mflops = 28.1944 (norm. = 0.071875), norm. avg. (of 3) = 0.065871 fft 10: mflops = 46.9907 (norm. = 0.119792), norm. avg. (of 3) = 0.0927386 fft 11: mflops = 45.7207 (norm. = 0.116554), norm. avg. (of 3) = 0.094554 fft 12: mflops = 68.3501 (norm. = 0.174242), norm. avg. (of 3) = 0.0916889 fft 13: mflops = 57.8347 (norm. = 0.147436), norm. avg. (of 3) = 0.110335 fft 14: mflops = 18.5897 (norm. = 0.0473901), norm. avg. 
(of 3) = 0.0451209 fft 15: mflops = 91.4414 (norm. = 0.233108), norm. avg. (of 3) = 0.157146 Benchmarking for array size = 15: 0. Brenner: elapsed time t=1.84993 s, 131072 iters, t-(init.)=1.83326 s t(norm)=0.238667, mflops=20.9497 (err=3.3e-16) 1. CWP (min N): elapsed time t=1.33328 s, 524288 iters, t-(init.)=1.24995 s t(norm)=0.0406818, mflops=122.905 2. CWP (best N): elapsed time t=1.33328 s, 524288 iters, t-(init.)=1.26662 s t(norm)=0.0412242, mflops=121.288 3. FFTPACK: elapsed time t=1.93326 s, 1048576 iters, t-(init.)=1.79993 s t(norm)=0.0292909, mflops=170.701 (err=2.6e-16) 4. FFTPACK (f2c): elapsed time t=1.21662 s, 524288 iters, t-(init.)=1.14995 s t(norm)=0.0374273, mflops=133.592 (err=4.1e-16) FFTW_MEASURE plan: (cost = 1.017212e-06) FFTW_NOTW 15 5. FFTW: elapsed time t=1.21662 s, 1048576 iters, t-(init.)=1.08329 s t(norm)=0.0176288, mflops=283.627 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.16662 s, 1048576 iters, t-(init.)=0.949962 s t(norm)=0.0154591, mflops=323.434 (err=1.8e-16) 7. Frigo-old: elapsed time t=1.98325 s, 262144 iters, t-(init.)=1.94992 s t(norm)=0.126927, mflops=39.3926 (err=4.2e-16) 8. GSL: elapsed time t=1.04996 s, 262144 iters, t-(init.)=1.01663 s t(norm)=0.0661757, mflops=75.5564 (err=2.0e-16) 9. NAPACK (f2c): elapsed time t=1.83326 s, 131072 iters, t-(init.)=1.81659 s t(norm)=0.236497, mflops=21.1419 (err=1.0e-15) 10. Singleton: elapsed time t=1.59994 s, 262144 iters, t-(init.)=1.54994 s t(norm)=0.100891, mflops=49.5585 (err=2.9e-16) 11. Singleton (f2c): elapsed time t=1.53327 s, 262144 iters, t-(init.)=1.49994 s t(norm)=0.0976363, mflops=51.2104 (err=2.9e-16) 12. Temperton: elapsed time t=1.91659 s, 524288 iters, t-(init.)=1.84993 s t(norm)=0.0602091, mflops=83.044 (err=7.9e-16) 13. Temperton (f2c): elapsed time t=1.49994 s, 262144 iters, t-(init.)=1.46661 s t(norm)=0.0954666, mflops=52.3743 (err=2.1e-16) 14. 
Valkenburg: elapsed time t=1.06662 s, 65536 iters, t-(init.)=1.06662 s t(norm)=0.277721, mflops=18.0037 (err=4.0e-16) 15. DXML: elapsed time t=1.51661 s, 524288 iters, t-(init.)=1.43328 s t(norm)=0.0466485, mflops=107.185 (err=2.1e-16) Top mflops for N=15 = 323.434 Normalized results and averages for N=15: fft 0: mflops = 20.9497 (norm. = 0.0647727), norm. avg. (of 2) = 0.0512165 fft 1: mflops = 122.905 (norm. = 0.38), norm. avg. (of 4) = 0.246744 fft 2: mflops = 121.288 (norm. = 0.375), norm. avg. (of 4) = 0.206414 fft 3: mflops = 170.701 (norm. = 0.527778), norm. avg. (of 4) = 0.352543 fft 4: mflops = 133.592 (norm. = 0.413043), norm. avg. (of 4) = 0.276776 fft 5: mflops = 283.627 (norm. = 0.876923), norm. avg. (of 4) = 0.803404 fft 6: mflops = 323.434 (norm. = 1), norm. avg. (of 4) = 1 fft 7: mflops = 39.3926 (norm. = 0.121795), norm. avg. (of 4) = 0.105854 fft 8: mflops = 75.5564 (norm. = 0.233607), norm. avg. (of 4) = 0.226293 fft 9: mflops = 21.1419 (norm. = 0.065367), norm. avg. (of 4) = 0.065745 fft 10: mflops = 49.5585 (norm. = 0.153226), norm. avg. (of 4) = 0.10786 fft 11: mflops = 51.2104 (norm. = 0.158333), norm. avg. (of 4) = 0.110499 fft 12: mflops = 83.044 (norm. = 0.256757), norm. avg. (of 4) = 0.132956 fft 13: mflops = 52.3743 (norm. = 0.161932), norm. avg. (of 4) = 0.123234 fft 14: mflops = 18.0037 (norm. = 0.0556641), norm. avg. (of 4) = 0.0477567 fft 15: mflops = 107.185 (norm. = 0.331395), norm. avg. (of 4) = 0.200708 Benchmarking for array size = 18: 0. Brenner: elapsed time t=1.29995 s, 65536 iters, t-(init.)=1.29995 s t(norm)=0.264268, mflops=18.9202 (err=4.5e-16) 1. CWP (min N): elapsed time t=1.58327 s, 524288 iters, t-(init.)=1.51661 s t(norm)=0.0385391, mflops=129.738 2. CWP (best N) (N=28): elapsed time t=1.6666 s, 524288 iters, t-(init.)=1.58327 s t(norm)=0.0402332, mflops=124.276 3. FFTPACK: elapsed time t=1.51661 s, 524288 iters, t-(init.)=1.44994 s t(norm)=0.0368451, mflops=135.703 (err=2.6e-16) 4.
FFTPACK (f2c): elapsed time t=1.06662 s, 262144 iters, t-(init.)=1.03329 s t(norm)=0.0525149, mflops=95.2111 (err=2.6e-16) FFTW_MEASURE plan: (cost = 1.652969e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6 5. FFTW: elapsed time t=1.63327 s, 1048576 iters, t-(init.)=1.48327 s t(norm)=0.0188461, mflops=265.307 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.86659 s, 1048576 iters, t-(init.)=1.63327 s t(norm)=0.0207518, mflops=240.942 (err=1.9e-16) 7. Frigo-old: elapsed time t=1.34995 s, 131072 iters, t-(init.)=1.31661 s t(norm)=0.133828, mflops=37.3613 (err=4.5e-16) 8. GSL: elapsed time t=1.44994 s, 524288 iters, t-(init.)=1.38328 s t(norm)=0.0351511, mflops=142.243 (err=2.2e-16) 9. NAPACK (f2c): elapsed time t=1.36661 s, 131072 iters, t-(init.)=1.34995 s t(norm)=0.137216, mflops=36.4388 (err=8.7e-16) 10. Singleton: elapsed time t=1.59994 s, 262144 iters, t-(init.)=1.54994 s t(norm)=0.0787723, mflops=63.4741 (err=2.1e-16) 11. Singleton (f2c): elapsed time t=1.63327 s, 262144 iters, t-(init.)=1.58327 s t(norm)=0.0804664, mflops=62.1378 (err=2.1e-16) 12. Temperton: elapsed time t=1.78326 s, 131072 iters, t-(init.)=1.7666 s t(norm)=0.179567, mflops=27.8448 (err=2.7e-08) 13. Temperton (f2c): elapsed time t=1.58327 s, 262144 iters, t-(init.)=1.54994 s t(norm)=0.0787723, mflops=63.4741 (err=2.9e-16) 14. Valkenburg: elapsed time t=1.39994 s, 65536 iters, t-(init.)=1.38328 s t(norm)=0.281209, mflops=17.7804 (err=4.1e-16) 15. DXML: elapsed time t=1.89992 s, 524288 iters, t-(init.)=1.83326 s t(norm)=0.0465858, mflops=107.329 (err=2.4e-16) Top mflops for N=18 = 265.307 Normalized results and averages for N=18: fft 0: mflops = 18.9202 (norm. = 0.0713141), norm. avg. (of 3) = 0.0579157 fft 1: mflops = 129.738 (norm. = 0.489011), norm. avg. (of 5) = 0.295197 fft 2: mflops = 124.276 (norm. = 0.468421), norm. avg. (of 5) = 0.258816 fft 3: mflops = 135.703 (norm. = 0.511494), norm. avg. 
(of 5) = 0.384334 fft 4: mflops = 95.2111 (norm. = 0.358871), norm. avg. (of 5) = 0.293195 fft 5: mflops = 265.307 (norm. = 1), norm. avg. (of 5) = 0.842724 fft 6: mflops = 240.942 (norm. = 0.908163), norm. avg. (of 5) = 0.981633 fft 7: mflops = 37.3613 (norm. = 0.140823), norm. avg. (of 5) = 0.112848 fft 8: mflops = 142.243 (norm. = 0.536145), norm. avg. (of 5) = 0.288263 fft 9: mflops = 36.4388 (norm. = 0.137346), norm. avg. (of 5) = 0.0800651 fft 10: mflops = 63.4741 (norm. = 0.239247), norm. avg. (of 5) = 0.134138 fft 11: mflops = 62.1378 (norm. = 0.234211), norm. avg. (of 5) = 0.135241 fft 12: mflops = 27.8448 (norm. = 0.104953), norm. avg. (of 5) = 0.127355 fft 13: mflops = 63.4741 (norm. = 0.239247), norm. avg. (of 5) = 0.146437 fft 14: mflops = 17.7804 (norm. = 0.0670181), norm. avg. (of 5) = 0.051609 fft 15: mflops = 107.329 (norm. = 0.404545), norm. avg. (of 5) = 0.241476 Benchmarking for array size = 24: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.49994 s, 524288 iters, t-(init.)=1.41661 s t(norm)=0.0245546, mflops=203.628 2. CWP (best N) (N=28): elapsed time t=1.68327 s, 524288 iters, t-(init.)=1.59994 s t(norm)=0.0277323, mflops=180.295 3. FFTPACK: elapsed time t=1.81659 s, 524288 iters, t-(init.)=1.74993 s t(norm)=0.0303322, mflops=164.841 (err=2.2e-16) 4. FFTPACK (f2c): elapsed time t=1.23328 s, 262144 iters, t-(init.)=1.18329 s t(norm)=0.0410207, mflops=121.89 (err=2.4e-16) FFTW_MEASURE plan: (cost = 1.780121e-06) FFTW_TWIDDLE 4 FFTW_NOTW 6 5. FFTW: elapsed time t=1.91659 s, 1048576 iters, t-(init.)=1.78326 s t(norm)=0.015455, mflops=323.52 (err=2.3e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.09996 s, 524288 iters, t-(init.)=0.983294 s t(norm)=0.0170438, mflops=293.362 (err=2.1e-16) 7. Frigo-old: elapsed time t=1.94992 s, 262144 iters, t-(init.)=1.91659 s t(norm)=0.0664419, mflops=75.2537 (err=3.6e-16) 8. 
GSL: elapsed time t=1.54994 s, 524288 iters, t-(init.)=1.46661 s t(norm)=0.0254213, mflops=196.686 (err=2.1e-16) 9. NAPACK (f2c): elapsed time t=1.99992 s, 131072 iters, t-(init.)=1.98325 s t(norm)=0.137506, mflops=36.3621 (err=8.0e-16) 10. Singleton: elapsed time t=1.28328 s, 131072 iters, t-(init.)=1.26662 s t(norm)=0.0878189, mflops=56.9353 (err=2.3e-16) 11. Singleton (f2c): elapsed time t=1.29995 s, 131072 iters, t-(init.)=1.28328 s t(norm)=0.0889744, mflops=56.1959 (err=2.3e-16) 12. Temperton: elapsed time t=1.36661 s, 262144 iters, t-(init.)=1.33328 s t(norm)=0.0462205, mflops=108.177 (err=4.5e-09) 13. Temperton (f2c): elapsed time t=1.7666 s, 262144 iters, t-(init.)=1.7166 s t(norm)=0.0595089, mflops=84.0211 (err=2.8e-16) 14. Valkenburg: elapsed time t=1.91659 s, 65536 iters, t-(init.)=1.89992 s t(norm)=0.263457, mflops=18.9784 (err=5.6e-16) 15. DXML: elapsed time t=1.81659 s, 524288 iters, t-(init.)=1.73326 s t(norm)=0.0300433, mflops=166.426 (err=8.3e-16) Top mflops for N=24 = 323.52 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.00309099), norm. avg. (of 3) = 0.0579157 fft 1: mflops = 203.628 (norm. = 0.629412), norm. avg. (of 6) = 0.3509 fft 2: mflops = 180.295 (norm. = 0.557292), norm. avg. (of 6) = 0.308562 fft 3: mflops = 164.841 (norm. = 0.509524), norm. avg. (of 6) = 0.405199 fft 4: mflops = 121.89 (norm. = 0.376761), norm. avg. (of 6) = 0.307123 fft 5: mflops = 323.52 (norm. = 1), norm. avg. (of 6) = 0.868936 fft 6: mflops = 293.362 (norm. = 0.90678), norm. avg. (of 6) = 0.969157 fft 7: mflops = 75.2537 (norm. = 0.232609), norm. avg. (of 6) = 0.132808 fft 8: mflops = 196.686 (norm. = 0.607955), norm. avg. (of 6) = 0.341545 fft 9: mflops = 36.3621 (norm. = 0.112395), norm. avg. (of 6) = 0.0854534 fft 10: mflops = 56.9353 (norm. = 0.175987), norm. avg. (of 6) = 0.141113 fft 11: mflops = 56.1959 (norm. = 0.173701), norm. avg. (of 6) = 0.141651 fft 12: mflops = 108.177 (norm. = 0.334375), norm. avg. 
(of 6) = 0.161859 fft 13: mflops = 84.0211 (norm. = 0.259709), norm. avg. (of 6) = 0.165315 fft 14: mflops = 18.9784 (norm. = 0.0586623), norm. avg. (of 6) = 0.0527845 fft 15: mflops = 166.426 (norm. = 0.514423), norm. avg. (of 6) = 0.286967 Benchmarking for array size = 36: 0. Brenner: elapsed time t=1.19995 s, 32768 iters, t-(init.)=1.18329 s t(norm)=0.194023, mflops=25.7702 (err=5.5e-16) 1. CWP (min N): elapsed time t=1.03329 s, 262144 iters, t-(init.)=0.983294 s t(norm)=0.0201538, mflops=248.092 2. CWP (best N): elapsed time t=1.03329 s, 262144 iters, t-(init.)=0.983294 s t(norm)=0.0201538, mflops=248.092 3. FFTPACK: elapsed time t=1.11662 s, 262144 iters, t-(init.)=1.06662 s t(norm)=0.0218617, mflops=228.71 (err=3.9e-16) 4. FFTPACK (f2c): elapsed time t=1.84993 s, 262144 iters, t-(init.)=1.79993 s t(norm)=0.0368917, mflops=135.532 (err=4.5e-16) FFTW_MEASURE plan: (cost = 2.670181e-06) FFTW_TWIDDLE 6 FFTW_NOTW 6 5. FFTW: elapsed time t=1.39994 s, 524288 iters, t-(init.)=1.29995 s t(norm)=0.013322, mflops=375.319 (err=4.6e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.69993 s, 524288 iters, t-(init.)=1.54994 s t(norm)=0.0158839, mflops=314.784 (err=4.4e-16) 7. Frigo-old: elapsed time t=1.29995 s, 65536 iters, t-(init.)=1.28328 s t(norm)=0.10521, mflops=47.5242 (err=5.4e-16) 8. GSL: elapsed time t=1.11662 s, 262144 iters, t-(init.)=1.06662 s t(norm)=0.0218617, mflops=228.71 (err=4.3e-16) 9. NAPACK (f2c): elapsed time t=1.39994 s, 65536 iters, t-(init.)=1.39994 s t(norm)=0.114774, mflops=43.5638 (err=1.4e-15) 10. Singleton: elapsed time t=1.33328 s, 131072 iters, t-(init.)=1.31661 s t(norm)=0.0539712, mflops=92.6421 (err=4.7e-16) 11. Singleton (f2c): elapsed time t=1.31661 s, 131072 iters, t-(init.)=1.29995 s t(norm)=0.053288, mflops=93.8298 (err=4.7e-16) 12. Temperton: elapsed time t=1.89992 s, 131072 iters, t-(init.)=1.86659 s t(norm)=0.0765161, mflops=65.3457 (err=5.1e-08) 13.
Temperton (f2c): elapsed time t=1.08329 s, 131072 iters, t-(init.)=1.06662 s t(norm)=0.0437235, mflops=114.355 (err=3.7e-16) 14. Valkenburg: elapsed time t=1.58327 s, 32768 iters, t-(init.)=1.58327 s t(norm)=0.259608, mflops=19.2598 (err=6.2e-16) 15. DXML: elapsed time t=1.21662 s, 262144 iters, t-(init.)=1.16662 s t(norm)=0.0239113, mflops=209.106 (err=4.0e-16) Top mflops for N=36 = 375.319 Normalized results and averages for N=36: fft 0: mflops = 25.7702 (norm. = 0.068662), norm. avg. (of 4) = 0.0606023 fft 1: mflops = 248.092 (norm. = 0.661017), norm. avg. (of 7) = 0.395202 fft 2: mflops = 248.092 (norm. = 0.661017), norm. avg. (of 7) = 0.358913 fft 3: mflops = 228.71 (norm. = 0.609375), norm. avg. (of 7) = 0.434367 fft 4: mflops = 135.532 (norm. = 0.361111), norm. avg. (of 7) = 0.314835 fft 5: mflops = 375.319 (norm. = 1), norm. avg. (of 7) = 0.88766 fft 6: mflops = 314.784 (norm. = 0.83871), norm. avg. (of 7) = 0.950522 fft 7: mflops = 47.5242 (norm. = 0.126623), norm. avg. (of 7) = 0.131925 fft 8: mflops = 228.71 (norm. = 0.609375), norm. avg. (of 7) = 0.379807 fft 9: mflops = 43.5638 (norm. = 0.116071), norm. avg. (of 7) = 0.0898274 fft 10: mflops = 92.6421 (norm. = 0.246835), norm. avg. (of 7) = 0.156216 fft 11: mflops = 93.8298 (norm. = 0.25), norm. avg. (of 7) = 0.15713 fft 12: mflops = 65.3457 (norm. = 0.174107), norm. avg. (of 7) = 0.163608 fft 13: mflops = 114.355 (norm. = 0.304688), norm. avg. (of 7) = 0.185226 fft 14: mflops = 19.2598 (norm. = 0.0513158), norm. avg. (of 7) = 0.0525747 fft 15: mflops = 209.106 (norm. = 0.557143), norm. avg. (of 7) = 0.325563 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.86659 s, 32768 iters, t-(init.)=1.84993 s t(norm)=0.111626, mflops=44.7925 (err=4.3e-16) 1. CWP (min N): elapsed time t=1.09996 s, 131072 iters, t-(init.)=1.04996 s t(norm)=0.0158388, mflops=315.68 2. CWP (best N) (N=84): elapsed time t=1.99992 s, 262144 iters, t-(init.)=1.89992 s t(norm)=0.0143303, mflops=348.91 3. 
FFTPACK: elapsed time t=1.09996 s, 131072 iters, t-(init.)=1.04996 s t(norm)=0.0158388, mflops=315.68 (err=3.2e-16) 4. FFTPACK (f2c): elapsed time t=1.81659 s, 131072 iters, t-(init.)=1.7666 s t(norm)=0.0266494, mflops=187.621 (err=3.8e-16) FFTW_MEASURE plan: (cost = 6.611877e-06) FFTW_TWIDDLE 10 FFTW_NOTW 8 5. FFTW: elapsed time t=1.69993 s, 262144 iters, t-(init.)=1.59994 s t(norm)=0.0120677, mflops=414.33 (err=3.5e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 6. FFTW_ESTIMATE: elapsed time t=1.6666 s, 262144 iters, t-(init.)=1.54994 s t(norm)=0.0116905, mflops=427.696 (err=3.5e-16) 7. Frigo-old: elapsed time t=1.73326 s, 65536 iters, t-(init.)=1.7166 s t(norm)=0.0517904, mflops=96.543 (err=3.6e-16) 8. GSL: elapsed time t=1.26662 s, 65536 iters, t-(init.)=1.23328 s t(norm)=0.0372086, mflops=134.377 (err=3.3e-16) 9. NAPACK (f2c): elapsed time t=1.46661 s, 16384 iters, t-(init.)=1.46661 s t(norm)=0.176992, mflops=28.2498 (err=5.0e-16) 10. Singleton: elapsed time t=1.11662 s, 65536 iters, t-(init.)=1.08329 s t(norm)=0.0326833, mflops=152.984 (err=4.4e-16) 11. Singleton (f2c): elapsed time t=1.13329 s, 65536 iters, t-(init.)=1.09996 s t(norm)=0.0331861, mflops=150.666 (err=4.4e-16) 12. Temperton: elapsed time t=1.69993 s, 131072 iters, t-(init.)=1.64993 s t(norm)=0.0248896, mflops=200.887 (err=5.3e-08) 13. Temperton (f2c): elapsed time t=1.08329 s, 65536 iters, t-(init.)=1.06662 s t(norm)=0.0321804, mflops=155.374 (err=3.4e-16) 14. Valkenburg: elapsed time t=1.08329 s, 8192 iters, t-(init.)=1.08329 s t(norm)=0.261466, mflops=19.1229 (err=4.6e-16) 15. DXML: elapsed time t=1.08329 s, 131072 iters, t-(init.)=1.03329 s t(norm)=0.0155874, mflops=320.772 (err=3.5e-16) Top mflops for N=80 = 427.696 Normalized results and averages for N=80: fft 0: mflops = 44.7925 (norm. = 0.10473), norm. avg. (of 5) = 0.0694278 fft 1: mflops = 315.68 (norm. = 0.738095), norm. avg. (of 8) = 0.438064 fft 2: mflops = 348.91 (norm. = 0.815789), norm. avg. 
(of 8) = 0.416022 fft 3: mflops = 315.68 (norm. = 0.738095), norm. avg. (of 8) = 0.472333 fft 4: mflops = 187.621 (norm. = 0.438679), norm. avg. (of 8) = 0.330316 fft 5: mflops = 414.33 (norm. = 0.96875), norm. avg. (of 8) = 0.897796 fft 6: mflops = 427.696 (norm. = 1), norm. avg. (of 8) = 0.956707 fft 7: mflops = 96.543 (norm. = 0.225728), norm. avg. (of 8) = 0.14365 fft 8: mflops = 134.377 (norm. = 0.314189), norm. avg. (of 8) = 0.371605 fft 9: mflops = 28.2498 (norm. = 0.0660511), norm. avg. (of 8) = 0.0868554 fft 10: mflops = 152.984 (norm. = 0.357692), norm. avg. (of 8) = 0.1814 fft 11: mflops = 150.666 (norm. = 0.352273), norm. avg. (of 8) = 0.181522 fft 12: mflops = 200.887 (norm. = 0.469697), norm. avg. (of 8) = 0.201869 fft 13: mflops = 155.374 (norm. = 0.363281), norm. avg. (of 8) = 0.207483 fft 14: mflops = 19.1229 (norm. = 0.0447115), norm. avg. (of 8) = 0.0515918 fft 15: mflops = 320.772 (norm. = 0.75), norm. avg. (of 8) = 0.378618 Benchmarking for array size = 108: 0. Brenner: elapsed time t=1.04996 s, 8192 iters, t-(init.)=1.04996 s t(norm)=0.175687, mflops=28.4597 (err=6.5e-16) 1. CWP (min N) (N=110): elapsed time t=1.81659 s, 131072 iters, t-(init.)=1.74993 s t(norm)=0.0183007, mflops=273.213 2. CWP (best N) (N=112): elapsed time t=1.44994 s, 131072 iters, t-(init.)=1.38328 s t(norm)=0.0144663, mflops=345.631 3. FFTPACK: elapsed time t=1.64993 s, 131072 iters, t-(init.)=1.58327 s t(norm)=0.0165578, mflops=301.972 (err=4.1e-16) 4. FFTPACK (f2c): elapsed time t=1.58327 s, 65536 iters, t-(init.)=1.54994 s t(norm)=0.0324185, mflops=154.233 (err=4.1e-16) FFTW_MEASURE plan: (cost = 9.663513e-06) FFTW_TWIDDLE 9 FFTW_NOTW 12 5. FFTW: elapsed time t=1.29995 s, 131072 iters, t-(init.)=1.23328 s t(norm)=0.0128977, mflops=387.667 (err=3.6e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.28328 s, 131072 iters, t-(init.)=1.21662 s t(norm)=0.0127234, mflops=392.977 (err=3.6e-16) 7. 
Frigo-old: elapsed time t=1.36661 s, 16384 iters, t-(init.)=1.34995 s t(norm)=0.112942, mflops=44.2706 (err=5.5e-16) 8. GSL: elapsed time t=1.09996 s, 65536 iters, t-(init.)=1.06662 s t(norm)=0.0223095, mflops=224.12 (err=3.9e-16) 9. NAPACK (f2c): elapsed time t=1.13329 s, 16384 iters, t-(init.)=1.13329 s t(norm)=0.0948153, mflops=52.7341 (err=3.1e-15) 10. Singleton: elapsed time t=1.14995 s, 32768 iters, t-(init.)=1.13329 s t(norm)=0.0474077, mflops=105.468 (err=4.5e-16) 11. Singleton (f2c): elapsed time t=1.23328 s, 32768 iters, t-(init.)=1.21662 s t(norm)=0.0508935, mflops=98.2443 (err=4.5e-16) 12. Temperton: elapsed time t=1.88326 s, 65536 iters, t-(init.)=1.84993 s t(norm)=0.038693, mflops=129.222 (err=7.4e-08) 13. Temperton (f2c): elapsed time t=1.5666 s, 65536 iters, t-(init.)=1.53327 s t(norm)=0.0320699, mflops=155.91 (err=3.5e-16) 14. Valkenburg: elapsed time t=1.46661 s, 8192 iters, t-(init.)=1.46661 s t(norm)=0.245404, mflops=20.3745 (err=7.5e-16) 15. DXML: elapsed time t=1.54994 s, 131072 iters, t-(init.)=1.49994 s t(norm)=0.0156864, mflops=318.748 (err=3.6e-16) Top mflops for N=108 = 392.977 Normalized results and averages for N=108: fft 0: mflops = 28.4597 (norm. = 0.0724206), norm. avg. (of 6) = 0.0699266 fft 1: mflops = 273.213 (norm. = 0.695238), norm. avg. (of 9) = 0.466639 fft 2: mflops = 345.631 (norm. = 0.879518), norm. avg. (of 9) = 0.467522 fft 3: mflops = 301.972 (norm. = 0.768421), norm. avg. (of 9) = 0.505231 fft 4: mflops = 154.233 (norm. = 0.392473), norm. avg. (of 9) = 0.337222 fft 5: mflops = 387.667 (norm. = 0.986486), norm. avg. (of 9) = 0.90765 fft 6: mflops = 392.977 (norm. = 1), norm. avg. (of 9) = 0.961517 fft 7: mflops = 44.2706 (norm. = 0.112654), norm. avg. (of 9) = 0.140206 fft 8: mflops = 224.12 (norm. = 0.570312), norm. avg. (of 9) = 0.393683 fft 9: mflops = 52.7341 (norm. = 0.134191), norm. avg. (of 9) = 0.0921149 fft 10: mflops = 105.468 (norm. = 0.268382), norm. avg. (of 9) = 0.191065 fft 11: mflops = 98.2443 (norm. 
= 0.25), norm. avg. (of 9) = 0.189131 fft 12: mflops = 129.222 (norm. = 0.328829), norm. avg. (of 9) = 0.215976 fft 13: mflops = 155.91 (norm. = 0.396739), norm. avg. (of 9) = 0.228511 fft 14: mflops = 20.3745 (norm. = 0.0518466), norm. avg. (of 9) = 0.0516201 fft 15: mflops = 318.748 (norm. = 0.811111), norm. avg. (of 9) = 0.426673 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1.81659 s, 8192 iters, t-(init.)=1.81659 s t(norm)=0.136885, mflops=36.5271 (err=6.8e-16) 1. CWP (min N): elapsed time t=1.39994 s, 65536 iters, t-(init.)=1.33328 s t(norm)=0.0125582, mflops=398.145 2. CWP (best N): elapsed time t=1.39994 s, 65536 iters, t-(init.)=1.34995 s t(norm)=0.0127152, mflops=393.23 3. FFTPACK: elapsed time t=1.31661 s, 32768 iters, t-(init.)=1.28328 s t(norm)=0.0241746, mflops=206.829 (err=4.7e-16) 4. FFTPACK (f2c): elapsed time t=1.44994 s, 16384 iters, t-(init.)=1.43328 s t(norm)=0.0540004, mflops=92.5919 (err=6.2e-16) FFTW_MEASURE plan: (cost = 2.441309e-05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 6 5. FFTW: elapsed time t=1.63327 s, 65536 iters, t-(init.)=1.5666 s t(norm)=0.0147559, mflops=338.847 (err=5.2e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=2.03325 s, 65536 iters, t-(init.)=1.96659 s t(norm)=0.0185234, mflops=269.929 (err=4.9e-16) 7. Frigo-old: elapsed time t=1.46661 s, 8192 iters, t-(init.)=1.46661 s t(norm)=0.110512, mflops=45.2438 (err=6.3e-16) 8. GSL: elapsed time t=1.79993 s, 32768 iters, t-(init.)=1.7666 s t(norm)=0.0332793, mflops=150.243 (err=6.4e-16) 9. NAPACK (f2c): elapsed time t=1.41661 s, 4096 iters, t-(init.)=1.41661 s t(norm)=0.21349, mflops=23.4203 (err=1.5e-14) 10. Singleton: elapsed time t=1.43328 s, 16384 iters, t-(init.)=1.41661 s t(norm)=0.0533725, mflops=93.6812 (err=6.4e-16) 11. Singleton (f2c): elapsed time t=1.48327 s, 16384 iters, t-(init.)=1.46661 s t(norm)=0.0552562, mflops=90.4875 (err=6.4e-16) 12. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.86659 s, 4096 iters, t-(init.)=1.86659 s t(norm)=0.281305, mflops=17.7743 (err=7.5e-16) 15. DXML: elapsed time t=1.74993 s, 16384 iters, t-(init.)=1.74993 s t(norm)=0.0659307, mflops=75.8372 (err=7.3e-16) Top mflops for N=210 = 398.145 Normalized results and averages for N=210: fft 0: mflops = 36.5271 (norm. = 0.0917431), norm. avg. (of 7) = 0.0730432 fft 1: mflops = 398.145 (norm. = 1), norm. avg. (of 10) = 0.519975 fft 2: mflops = 393.23 (norm. = 0.987654), norm. avg. (of 10) = 0.519535 fft 3: mflops = 206.829 (norm. = 0.519481), norm. avg. (of 10) = 0.506656 fft 4: mflops = 92.5919 (norm. = 0.232558), norm. avg. (of 10) = 0.326756 fft 5: mflops = 338.847 (norm. = 0.851064), norm. avg. (of 10) = 0.901992 fft 6: mflops = 269.929 (norm. = 0.677966), norm. avg. (of 10) = 0.933162 fft 7: mflops = 45.2438 (norm. = 0.113636), norm. avg. (of 10) = 0.137549 fft 8: mflops = 150.243 (norm. = 0.377358), norm. avg. (of 10) = 0.392051 fft 9: mflops = 23.4203 (norm. = 0.0588235), norm. avg. (of 10) = 0.0887858 fft 10: mflops = 93.6812 (norm. = 0.235294), norm. avg. (of 10) = 0.195488 fft 11: mflops = 90.4875 (norm. = 0.227273), norm. avg. (of 10) = 0.192945 fft 12: mflops = -1 (norm. = -0.00251165), norm. avg. (of 9) = 0.215976 fft 13: mflops = -1 (norm. = -0.00251165), norm. avg. (of 9) = 0.228511 fft 14: mflops = 17.7743 (norm. = 0.0446429), norm. avg. (of 10) = 0.0509224 fft 15: mflops = 75.8372 (norm. = 0.190476), norm. avg. (of 10) = 0.403053 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.26662 s, 2048 iters, t-(init.)=1.26662 s t(norm)=0.136691, mflops=36.5789 (err=1.5e-15) 1. CWP (min N): elapsed time t=1.26662 s, 32768 iters, t-(init.)=1.19995 s t(norm)=0.00809354, mflops=617.776 2. CWP (best N): elapsed time t=1.26662 s, 32768 iters, t-(init.)=1.19995 s t(norm)=0.00809354, mflops=617.776 3. 
FFTPACK: elapsed time t=1.96659 s, 16384 iters, t-(init.)=1.91659 s t(norm)=0.0258544, mflops=193.391 (err=1.3e-15) 4. FFTPACK (f2c): elapsed time t=1.09996 s, 4096 iters, t-(init.)=1.09996 s t(norm)=0.0593526, mflops=84.2422 (err=1.3e-15) FFTW_MEASURE plan: (cost = 6.103271e-05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 8 FFTW_NOTW 7 5. FFTW: elapsed time t=1.04996 s, 16384 iters, t-(init.)=1.01663 s t(norm)=0.0137141, mflops=364.589 (err=1.3e-15) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.93326 s, 32768 iters, t-(init.)=1.86659 s t(norm)=0.01259, mflops=397.142 (err=1.2e-15) 7. Frigo-old: elapsed time t=1.7666 s, 4096 iters, t-(init.)=1.7666 s t(norm)=0.0953239, mflops=52.4527 (err=1.3e-15) 8. GSL: elapsed time t=1.59994 s, 16384 iters, t-(init.)=1.5666 s t(norm)=0.0211331, mflops=236.595 (err=1.3e-15) 9. NAPACK (f2c): elapsed time t=1.58327 s, 2048 iters, t-(init.)=1.5666 s t(norm)=0.169065, mflops=29.5744 (err=4.1e-14) 10. Singleton: elapsed time t=1.73326 s, 8192 iters, t-(init.)=1.7166 s t(norm)=0.046313, mflops=107.961 (err=1.9e-15) 11. Singleton (f2c): elapsed time t=1.84993 s, 8192 iters, t-(init.)=1.83326 s t(norm)=0.0494605, mflops=101.091 (err=1.9e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.23328 s, 1024 iters, t-(init.)=1.21662 s t(norm)=0.26259, mflops=19.0411 (err=1.7e-15) 15. DXML: elapsed time t=1.78326 s, 8192 iters, t-(init.)=1.7666 s t(norm)=0.047662, mflops=104.905 (err=2.1e-15) Top mflops for N=504 = 617.776 Normalized results and averages for N=504: fft 0: mflops = 36.5789 (norm. = 0.0592105), norm. avg. (of 8) = 0.0713141 fft 1: mflops = 617.776 (norm. = 1), norm. avg. (of 11) = 0.563614 fft 2: mflops = 617.776 (norm. = 1), norm. avg. (of 11) = 0.563214 fft 3: mflops = 193.391 (norm. = 0.313043), norm. avg. (of 11) = 0.489055 fft 4: mflops = 84.2422 (norm. 
= 0.136364), norm. avg. (of 11) = 0.309448 fft 5: mflops = 364.589 (norm. = 0.590164), norm. avg. (of 11) = 0.873644 fft 6: mflops = 397.142 (norm. = 0.642857), norm. avg. (of 11) = 0.906771 fft 7: mflops = 52.4527 (norm. = 0.0849057), norm. avg. (of 11) = 0.132763 fft 8: mflops = 236.595 (norm. = 0.382979), norm. avg. (of 11) = 0.391226 fft 9: mflops = 29.5744 (norm. = 0.0478723), norm. avg. (of 11) = 0.0850664 fft 10: mflops = 107.961 (norm. = 0.174757), norm. avg. (of 11) = 0.193603 fft 11: mflops = 101.091 (norm. = 0.163636), norm. avg. (of 11) = 0.190281 fft 12: mflops = -1 (norm. = -0.00161871), norm. avg. (of 9) = 0.215976 fft 13: mflops = -1 (norm. = -0.00161871), norm. avg. (of 9) = 0.228511 fft 14: mflops = 19.0411 (norm. = 0.0308219), norm. avg. (of 11) = 0.0490951 fft 15: mflops = 104.905 (norm. = 0.169811), norm. avg. (of 11) = 0.381849 Benchmarking for array size = 1000: 0. Brenner: elapsed time t=1.34995 s, 1024 iters, t-(init.)=1.34995 s t(norm)=0.132283, mflops=37.7977 (err=1.1e-15) 1. CWP (min N) (N=1001): elapsed time t=1.31661 s, 8192 iters, t-(init.)=1.28328 s t(norm)=0.0157188, mflops=318.089 2. CWP (best N) (N=1008): elapsed time t=1.86659 s, 16384 iters, t-(init.)=1.79993 s t(norm)=0.0110236, mflops=453.572 3. FFTPACK: elapsed time t=1.53327 s, 8192 iters, t-(init.)=1.49994 s t(norm)=0.0183727, mflops=272.143 (err=9.9e-16) 4. FFTPACK (f2c): elapsed time t=1.26662 s, 4096 iters, t-(init.)=1.24995 s t(norm)=0.0306211, mflops=163.286 (err=1.1e-15) FFTW_MEASURE plan: (cost = 1.383408e-04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 5. FFTW: elapsed time t=1.19995 s, 8192 iters, t-(init.)=1.16662 s t(norm)=0.0142899, mflops=349.898 (err=9.8e-16) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.18329 s, 8192 iters, t-(init.)=1.14995 s t(norm)=0.0140857, mflops=354.969 (err=9.8e-16) 7. 
Frigo-old: elapsed time t=1.68327 s, 2048 iters, t-(init.)=1.6666 s t(norm)=0.0816563, mflops=61.2322 (err=1.0e-15) 8. GSL: elapsed time t=1.03329 s, 2048 iters, t-(init.)=1.03329 s t(norm)=0.0506269, mflops=98.7617 (err=1.0e-15) 9. NAPACK (f2c): elapsed time t=1.29995 s, 512 iters, t-(init.)=1.28328 s t(norm)=0.251502, mflops=19.8806 (err=1.7e-14) 10. Singleton: elapsed time t=1.44994 s, 4096 iters, t-(init.)=1.43328 s t(norm)=0.0351122, mflops=142.401 (err=1.5e-15) 11. Singleton (f2c): elapsed time t=1.54994 s, 4096 iters, t-(init.)=1.53327 s t(norm)=0.0375619, mflops=133.114 (err=1.5e-15) 12. Temperton: elapsed time t=1.08329 s, 4096 iters, t-(init.)=1.06662 s t(norm)=0.02613, mflops=191.351 (err=1.3e-07) 13. Temperton (f2c): elapsed time t=1.51661 s, 4096 iters, t-(init.)=1.49994 s t(norm)=0.0367454, mflops=136.072 (err=9.9e-16) 14. Valkenburg: elapsed time t=1.59994 s, 512 iters, t-(init.)=1.59994 s t(norm)=0.31356, mflops=15.9459 (err=1.1e-15) 15. DXML: elapsed time t=1.01663 s, 8192 iters, t-(init.)=0.983294 s t(norm)=0.0120443, mflops=415.134 (err=1.8e-15) Top mflops for N=1000 = 453.572 Normalized results and averages for N=1000: fft 0: mflops = 37.7977 (norm. = 0.0833333), norm. avg. (of 9) = 0.0726496 fft 1: mflops = 318.089 (norm. = 0.701299), norm. avg. (of 12) = 0.575087 fft 2: mflops = 453.572 (norm. = 1), norm. avg. (of 12) = 0.599612 fft 3: mflops = 272.143 (norm. = 0.6), norm. avg. (of 12) = 0.498301 fft 4: mflops = 163.286 (norm. = 0.36), norm. avg. (of 12) = 0.31366 fft 5: mflops = 349.898 (norm. = 0.771429), norm. avg. (of 12) = 0.865126 fft 6: mflops = 354.969 (norm. = 0.782609), norm. avg. (of 12) = 0.896424 fft 7: mflops = 61.2322 (norm. = 0.135), norm. avg. (of 12) = 0.13295 fft 8: mflops = 98.7617 (norm. = 0.217742), norm. avg. (of 12) = 0.376769 fft 9: mflops = 19.8806 (norm. = 0.0438312), norm. avg. (of 12) = 0.0816301 fft 10: mflops = 142.401 (norm. = 0.313953), norm. avg. (of 12) = 0.203633 fft 11: mflops = 133.114 (norm. 
= 0.293478), norm. avg. (of 12) = 0.198881 fft 12: mflops = 191.351 (norm. = 0.421875), norm. avg. (of 10) = 0.236566 fft 13: mflops = 136.072 (norm. = 0.3), norm. avg. (of 10) = 0.23566 fft 14: mflops = 15.9459 (norm. = 0.0351562), norm. avg. (of 12) = 0.0479335 fft 15: mflops = 415.134 (norm. = 0.915254), norm. avg. (of 12) = 0.4263 Benchmarking for array size = 1960: 0. Brenner: elapsed time t=1.36661 s, 512 iters, t-(init.)=1.36661 s t(norm)=0.124519, mflops=40.1545 (err=2.9e-15) 1. CWP (min N) (N=1980): elapsed time t=1.06662 s, 4096 iters, t-(init.)=1.03329 s t(norm)=0.0117686, mflops=424.861 2. CWP (best N) (N=1980): elapsed time t=1.04996 s, 4096 iters, t-(init.)=1.01663 s t(norm)=0.0115787, mflops=431.826 3. FFTPACK: elapsed time t=1.43328 s, 2048 iters, t-(init.)=1.41661 s t(norm)=0.0322686, mflops=154.949 (err=2.8e-15) 4. FFTPACK (f2c): elapsed time t=1.73326 s, 1024 iters, t-(init.)=1.7166 s t(norm)=0.078204, mflops=63.9354 (err=2.8e-15) FFTW_MEASURE plan: (cost = 3.417832e-04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_TWIDDLE 5 FFTW_NOTW 7 5. FFTW: elapsed time t=1.43328 s, 4096 iters, t-(init.)=1.39994 s t(norm)=0.0159445, mflops=313.588 (err=2.8e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.69993 s, 4096 iters, t-(init.)=1.6666 s t(norm)=0.0189815, mflops=263.414 (err=2.8e-15) 7. Frigo-old: elapsed time t=1.01663 s, 512 iters, t-(init.)=1.01663 s t(norm)=0.0926299, mflops=53.9782 (err=2.8e-15) 8. GSL: elapsed time t=1.49994 s, 2048 iters, t-(init.)=1.48327 s t(norm)=0.0337871, mflops=147.985 (err=2.8e-15) 9. NAPACK (f2c): elapsed time t=1.58327 s, 256 iters, t-(init.)=1.58327 s t(norm)=0.288519, mflops=17.3299 (err=1.3e-13) 10. Singleton: elapsed time t=1.93326 s, 2048 iters, t-(init.)=1.91659 s t(norm)=0.0436575, mflops=114.528 (err=4.3e-15) 11. 
Singleton (f2c): elapsed time t=1.03329 s, 1024 iters, t-(init.)=1.01663 s t(norm)=0.046315, mflops=107.956 (err=4.3e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.96659 s, 256 iters, t-(init.)=1.96659 s t(norm)=0.358372, mflops=13.952 (err=2.8e-15) 15. DXML: elapsed time t=1.94992 s, 1024 iters, t-(init.)=1.93326 s t(norm)=0.0880744, mflops=56.7702 (err=3.6e-15) Top mflops for N=1960 = 431.826 Normalized results and averages for N=1960: fft 0: mflops = 40.1545 (norm. = 0.0929878), norm. avg. (of 10) = 0.0746834 fft 1: mflops = 424.861 (norm. = 0.983871), norm. avg. (of 13) = 0.606532 fft 2: mflops = 431.826 (norm. = 1), norm. avg. (of 13) = 0.630411 fft 3: mflops = 154.949 (norm. = 0.358824), norm. avg. (of 13) = 0.487572 fft 4: mflops = 63.9354 (norm. = 0.148058), norm. avg. (of 13) = 0.300922 fft 5: mflops = 313.588 (norm. = 0.72619), norm. avg. (of 13) = 0.854439 fft 6: mflops = 263.414 (norm. = 0.61), norm. avg. (of 13) = 0.874391 fft 7: mflops = 53.9782 (norm. = 0.125), norm. avg. (of 13) = 0.132338 fft 8: mflops = 147.985 (norm. = 0.342697), norm. avg. (of 13) = 0.374148 fft 9: mflops = 17.3299 (norm. = 0.0401316), norm. avg. (of 13) = 0.0784379 fft 10: mflops = 114.528 (norm. = 0.265217), norm. avg. (of 13) = 0.20837 fft 11: mflops = 107.956 (norm. = 0.25), norm. avg. (of 13) = 0.202813 fft 12: mflops = -1 (norm. = -0.00231575), norm. avg. (of 10) = 0.236566 fft 13: mflops = -1 (norm. = -0.00231575), norm. avg. (of 10) = 0.23566 fft 14: mflops = 13.952 (norm. = 0.0323093), norm. avg. (of 13) = 0.0467317 fft 15: mflops = 56.7702 (norm. = 0.131466), norm. avg. (of 13) = 0.40362 Benchmarking for array size = 4725: 0. Brenner: elapsed time t=1.24995 s, 128 iters, t-(init.)=1.24995 s t(norm)=0.169318, mflops=29.5302 (err=1.9e-15) 1. CWP (min N) (N=5005): elapsed time t=1.94992 s, 2048 iters, t-(init.)=1.91659 s t(norm)=0.0162263, mflops=308.141 2. 
CWP (best N) (N=5040): elapsed time t=1.43328 s, 2048 iters, t-(init.)=1.39994 s t(norm)=0.0118523, mflops=421.86 3. FFTPACK: elapsed time t=1.16662 s, 512 iters, t-(init.)=1.14995 s t(norm)=0.0389432, mflops=128.392 (err=1.8e-15) 4. FFTPACK (f2c): elapsed time t=1.89992 s, 512 iters, t-(init.)=1.88326 s t(norm)=0.0637766, mflops=78.3987 (err=1.9e-15) FFTW_MEASURE plan: (cost = 1.302031e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.26662 s, 1024 iters, t-(init.)=1.24995 s t(norm)=0.0211648, mflops=236.241 (err=1.8e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.49994 s, 1024 iters, t-(init.)=1.48327 s t(norm)=0.0251156, mflops=199.08 (err=1.8e-15) 7. Frigo-old: elapsed time t=1.04996 s, 128 iters, t-(init.)=1.03329 s t(norm)=0.13997, mflops=35.722 (err=1.9e-15) 8. GSL: elapsed time t=1.31661 s, 512 iters, t-(init.)=1.29995 s t(norm)=0.0440228, mflops=113.578 (err=1.9e-15) 9. NAPACK (f2c): elapsed time t=1.86659 s, 128 iters, t-(init.)=1.84993 s t(norm)=0.250591, mflops=19.9528 (err=3.5e-13) 10. Singleton: elapsed time t=1.34995 s, 512 iters, t-(init.)=1.33328 s t(norm)=0.0451516, mflops=110.738 (err=2.4e-15) 11. Singleton (f2c): elapsed time t=1.48327 s, 512 iters, t-(init.)=1.46661 s t(norm)=0.0496667, mflops=100.671 (err=2.4e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.29995 s, 64 iters, t-(init.)=1.29995 s t(norm)=0.352182, mflops=14.1972 (err=1.9e-15) 15. DXML: elapsed time t=1.54994 s, 512 iters, t-(init.)=1.53327 s t(norm)=0.0519243, mflops=96.294 (err=1.9e-15) Top mflops for N=4725 = 421.86 Normalized results and averages for N=4725: fft 0: mflops = 29.5302 (norm. = 0.07), norm. avg. (of 11) = 0.0742577 fft 1: mflops = 308.141 (norm. = 0.730435), norm. avg. (of 14) = 0.615382 fft 2: mflops = 421.86 (norm. = 1), norm. 
avg. (of 14) = 0.656811 fft 3: mflops = 128.392 (norm. = 0.304348), norm. avg. (of 14) = 0.474484 fft 4: mflops = 78.3987 (norm. = 0.185841), norm. avg. (of 14) = 0.292702 fft 5: mflops = 236.241 (norm. = 0.56), norm. avg. (of 14) = 0.833407 fft 6: mflops = 199.08 (norm. = 0.47191), norm. avg. (of 14) = 0.845642 fft 7: mflops = 35.722 (norm. = 0.0846774), norm. avg. (of 14) = 0.128934 fft 8: mflops = 113.578 (norm. = 0.269231), norm. avg. (of 14) = 0.366654 fft 9: mflops = 19.9528 (norm. = 0.0472973), norm. avg. (of 14) = 0.0762136 fft 10: mflops = 110.738 (norm. = 0.2625), norm. avg. (of 14) = 0.212236 fft 11: mflops = 100.671 (norm. = 0.238636), norm. avg. (of 14) = 0.205372 fft 12: mflops = -1 (norm. = -0.00237046), norm. avg. (of 10) = 0.236566 fft 13: mflops = -1 (norm. = -0.00237046), norm. avg. (of 10) = 0.23566 fft 14: mflops = 14.1972 (norm. = 0.0336538), norm. avg. (of 14) = 0.0457975 fft 15: mflops = 96.294 (norm. = 0.228261), norm. avg. (of 14) = 0.391095 Benchmarking for array size = 10368: 0. Brenner: elapsed time t=1.59994 s, 64 iters, t-(init.)=1.58327 s t(norm)=0.178867, mflops=27.9538 (err=3.1e-15) 1. CWP (min N) (N=10920): elapsed time t=1.44994 s, 512 iters, t-(init.)=1.29995 s t(norm)=0.0183574, mflops=272.37 2. CWP (best N) (N=11088): elapsed time t=1.48327 s, 512 iters, t-(init.)=1.34995 s t(norm)=0.0190634, mflops=262.283 3. FFTPACK: elapsed time t=1.34995 s, 256 iters, t-(init.)=1.28328 s t(norm)=0.036244, mflops=137.954 (err=3.0e-15) 4. FFTPACK (f2c): elapsed time t=1.89992 s, 256 iters, t-(init.)=1.83326 s t(norm)=0.0517772, mflops=96.5677 (err=3.0e-15) FFTW_MEASURE plan: (cost = 2.994672e-03) FFTW_TWIDDLE 32 FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_NOTW 6 5. FFTW: elapsed time t=1.58327 s, 512 iters, t-(init.)=1.46661 s t(norm)=0.0207109, mflops=241.419 (err=3.0e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 6. 
FFTW_ESTIMATE: elapsed time t=1.48327 s, 512 iters, t-(init.)=1.34995 s t(norm)=0.0190634, mflops=262.283 (err=3.0e-15) 7. Frigo-old: elapsed time t=1.83326 s, 128 iters, t-(init.)=1.79993 s t(norm)=0.101671, mflops=49.178 (err=3.1e-15) 8. GSL: elapsed time t=1.18329 s, 256 iters, t-(init.)=1.11662 s t(norm)=0.031537, mflops=158.544 (err=3.0e-15) 9. NAPACK (f2c): elapsed time t=1.34995 s, 64 iters, t-(init.)=1.33328 s t(norm)=0.150624, mflops=33.1951 (err=8.1e-14) 10. Singleton: elapsed time t=1.98325 s, 256 iters, t-(init.)=1.91659 s t(norm)=0.0541307, mflops=92.3691 (err=4.4e-15) 11. Singleton (f2c): elapsed time t=1.06662 s, 128 iters, t-(init.)=1.03329 s t(norm)=0.058367, mflops=85.6649 (err=4.4e-15) 12. Temperton: elapsed time t=1.73326 s, 256 iters, t-(init.)=1.6666 s t(norm)=0.0470701, mflops=106.224 (err=2.1e-07) 13. Temperton (f2c): elapsed time t=1.84993 s, 256 iters, t-(init.)=1.78326 s t(norm)=0.050365, mflops=99.2752 (err=3.0e-15) 14. Valkenburg: elapsed time t=1.39994 s, 32 iters, t-(init.)=1.39994 s t(norm)=0.316311, mflops=15.8072 (err=3.0e-15) 15. DXML: elapsed time t=1.26662 s, 256 iters, t-(init.)=1.19995 s t(norm)=0.0338905, mflops=147.534 (err=3.2e-15) Top mflops for N=10368 = 272.37 Normalized results and averages for N=10368: fft 0: mflops = 27.9538 (norm. = 0.102632), norm. avg. (of 12) = 0.0766221 fft 1: mflops = 272.37 (norm. = 1), norm. avg. (of 15) = 0.641024 fft 2: mflops = 262.283 (norm. = 0.962963), norm. avg. (of 15) = 0.677221 fft 3: mflops = 137.954 (norm. = 0.506494), norm. avg. (of 15) = 0.476618 fft 4: mflops = 96.5677 (norm. = 0.354545), norm. avg. (of 15) = 0.296824 fft 5: mflops = 241.419 (norm. = 0.886364), norm. avg. (of 15) = 0.836938 fft 6: mflops = 262.283 (norm. = 0.962963), norm. avg. (of 15) = 0.853464 fft 7: mflops = 49.178 (norm. = 0.180556), norm. avg. (of 15) = 0.132375 fft 8: mflops = 158.544 (norm. = 0.58209), norm. avg. (of 15) = 0.381016 fft 9: mflops = 33.1951 (norm. = 0.121875), norm. avg. 
(of 15) = 0.0792577 fft 10: mflops = 92.3691 (norm. = 0.33913), norm. avg. (of 15) = 0.220696 fft 11: mflops = 85.6649 (norm. = 0.314516), norm. avg. (of 15) = 0.212648 fft 12: mflops = 106.224 (norm. = 0.39), norm. avg. (of 11) = 0.250514 fft 13: mflops = 99.2752 (norm. = 0.364486), norm. avg. (of 11) = 0.247371 fft 14: mflops = 15.8072 (norm. = 0.0580357), norm. avg. (of 15) = 0.0466134 fft 15: mflops = 147.534 (norm. = 0.541667), norm. avg. (of 15) = 0.401133 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.23328 s, 16 iters, t-(init.)=1.21662 s t(norm)=0.191312, mflops=26.1353 (err=5.6e-15) 1. CWP (min N) (N=27720): elapsed time t=1.09996 s, 128 iters, t-(init.)=0.983294 s t(norm)=0.0193278, mflops=258.695 2. CWP (best N) (N=27720): elapsed time t=1.09996 s, 128 iters, t-(init.)=0.983294 s t(norm)=0.0193278, mflops=258.695 3. FFTPACK: elapsed time t=1.96659 s, 128 iters, t-(init.)=1.84993 s t(norm)=0.0363624, mflops=137.505 (err=5.5e-15) 4. FFTPACK (f2c): elapsed time t=1.24995 s, 64 iters, t-(init.)=1.19995 s t(norm)=0.0471729, mflops=105.993 (err=5.5e-15) FFTW_MEASURE plan: (cost = 1.041625e-02) FFTW_TWIDDLE 6 FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_NOTW 10 5. FFTW: elapsed time t=1.59994 s, 128 iters, t-(init.)=1.48327 s t(norm)=0.0291555, mflops=171.494 (err=5.6e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.6166 s, 128 iters, t-(init.)=1.48327 s t(norm)=0.0291555, mflops=171.494 (err=5.6e-15) 7. Frigo-old: elapsed time t=1.69993 s, 32 iters, t-(init.)=1.68327 s t(norm)=0.132346, mflops=37.7797 (err=5.7e-15) 8. GSL: elapsed time t=1.31661 s, 64 iters, t-(init.)=1.26662 s t(norm)=0.0497936, mflops=100.414 (err=5.5e-15) 9. NAPACK (f2c): elapsed time t=1.49994 s, 16 iters, t-(init.)=1.48327 s t(norm)=0.233244, mflops=21.4368 (err=1.1e-12) 10. 
Singleton: elapsed time t=1.68327 s, 64 iters, t-(init.)=1.6166 s t(norm)=0.0635524, mflops=78.6753 (err=7.6e-15) 11. Singleton (f2c): elapsed time t=1.7666 s, 64 iters, t-(init.)=1.69993 s t(norm)=0.0668283, mflops=74.8186 (err=7.6e-15) 12. Temperton: elapsed time t=1.46661 s, 64 iters, t-(init.)=1.39994 s t(norm)=0.055035, mflops=90.8512 (err=1.4e-07) 13. Temperton (f2c): elapsed time t=1.73326 s, 64 iters, t-(init.)=1.68327 s t(norm)=0.0661731, mflops=75.5594 (err=5.7e-15) 14. Valkenburg: elapsed time t=1.11662 s, 8 iters, t-(init.)=1.09996 s t(norm)=0.345935, mflops=14.4536 (err=5.4e-15) 15. DXML: elapsed time t=1.54994 s, 64 iters, t-(init.)=1.48327 s t(norm)=0.0583109, mflops=85.7472 (err=5.8e-15) Top mflops for N=27000 = 258.695 Normalized results and averages for N=27000: fft 0: mflops = 26.1353 (norm. = 0.101027), norm. avg. (of 13) = 0.0784995 fft 1: mflops = 258.695 (norm. = 1), norm. avg. (of 16) = 0.66346 fft 2: mflops = 258.695 (norm. = 1), norm. avg. (of 16) = 0.697395 fft 3: mflops = 137.505 (norm. = 0.531532), norm. avg. (of 16) = 0.48005 fft 4: mflops = 105.993 (norm. = 0.409722), norm. avg. (of 16) = 0.303881 fft 5: mflops = 171.494 (norm. = 0.662921), norm. avg. (of 16) = 0.826062 fft 6: mflops = 171.494 (norm. = 0.662921), norm. avg. (of 16) = 0.841555 fft 7: mflops = 37.7797 (norm. = 0.14604), norm. avg. (of 16) = 0.133229 fft 8: mflops = 100.414 (norm. = 0.388158), norm. avg. (of 16) = 0.381463 fft 9: mflops = 21.4368 (norm. = 0.0828652), norm. avg. (of 16) = 0.0794832 fft 10: mflops = 78.6753 (norm. = 0.304124), norm. avg. (of 16) = 0.22591 fft 11: mflops = 74.8186 (norm. = 0.289216), norm. avg. (of 16) = 0.217433 fft 12: mflops = 90.8512 (norm. = 0.35119), norm. avg. (of 12) = 0.258904 fft 13: mflops = 75.5594 (norm. = 0.292079), norm. avg. (of 12) = 0.251097 fft 14: mflops = 14.4536 (norm. = 0.0558712), norm. avg. (of 16) = 0.047192 fft 15: mflops = 85.7472 (norm. = 0.331461), norm. avg. 
(of 16) = 0.396778 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.28328 s, 4 iters, t-(init.)=1.26662 s t(norm)=0.258455, mflops=19.3457 (err=1.1e-14) 1. CWP (min N) (N=80080): elapsed time t=1.24995 s, 32 iters, t-(init.)=1.09996 s t(norm)=0.028056, mflops=178.215 2. CWP (best N) (N=80080): elapsed time t=1.26662 s, 32 iters, t-(init.)=1.11662 s t(norm)=0.028481, mflops=175.555 3. FFTPACK: elapsed time t=1.7666 s, 16 iters, t-(init.)=1.69993 s t(norm)=0.0867184, mflops=57.6579 (err=1.0e-14) 4. FFTPACK (f2c): elapsed time t=1.04996 s, 8 iters, t-(init.)=1.01663 s t(norm)=0.103722, mflops=48.2058 (err=1.1e-14) FFTW_MEASURE plan: (cost = 4.374825e-02) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_NOTW 14 5. FFTW: elapsed time t=1.53327 s, 32 iters, t-(init.)=1.39994 s t(norm)=0.0357076, mflops=140.026 (err=1.1e-14) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.53327 s, 32 iters, t-(init.)=1.39994 s t(norm)=0.0357076, mflops=140.026 (err=1.1e-14) 7. Frigo-old: elapsed time t=1.53327 s, 8 iters, t-(init.)=1.49994 s t(norm)=0.153032, mflops=32.6728 (err=1.1e-14) 8. GSL: elapsed time t=1.44994 s, 16 iters, t-(init.)=1.39994 s t(norm)=0.0714152, mflops=70.0131 (err=1.1e-14) 9. NAPACK (f2c): elapsed time t=1.39994 s, 4 iters, t-(init.)=1.38328 s t(norm)=0.28226, mflops=17.7142 (err=5.1e-12) 10. Singleton: elapsed time t=1.81659 s, 16 iters, t-(init.)=1.7666 s t(norm)=0.0901191, mflops=55.4821 (err=1.5e-14) 11. Singleton (f2c): elapsed time t=1.88326 s, 16 iters, t-(init.)=1.83326 s t(norm)=0.0935199, mflops=53.4646 (err=1.5e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.04996 s, 2 iters, t-(init.)=1.04996 s t(norm)=0.428491, mflops=11.6689 (err=1.1e-14) 15. 
DXML: elapsed time t=1.26662 s, 4 iters, t-(init.)=1.24995 s t(norm)=0.255054, mflops=19.6037 (err=1.1e-14) Top mflops for N=75600 = 178.215 Normalized results and averages for N=75600: fft 0: mflops = 19.3457 (norm. = 0.108553), norm. avg. (of 14) = 0.0806461 fft 1: mflops = 178.215 (norm. = 1), norm. avg. (of 17) = 0.683256 fft 2: mflops = 175.555 (norm. = 0.985075), norm. avg. (of 17) = 0.714317 fft 3: mflops = 57.6579 (norm. = 0.323529), norm. avg. (of 17) = 0.470843 fft 4: mflops = 48.2058 (norm. = 0.270492), norm. avg. (of 17) = 0.301917 fft 5: mflops = 140.026 (norm. = 0.785714), norm. avg. (of 17) = 0.823688 fft 6: mflops = 140.026 (norm. = 0.785714), norm. avg. (of 17) = 0.83827 fft 7: mflops = 32.6728 (norm. = 0.183333), norm. avg. (of 17) = 0.136177 fft 8: mflops = 70.0131 (norm. = 0.392857), norm. avg. (of 17) = 0.382133 fft 9: mflops = 17.7142 (norm. = 0.0993976), norm. avg. (of 17) = 0.0806546 fft 10: mflops = 55.4821 (norm. = 0.311321), norm. avg. (of 17) = 0.230934 fft 11: mflops = 53.4646 (norm. = 0.3), norm. avg. (of 17) = 0.22229 fft 12: mflops = -1 (norm. = -0.00561119), norm. avg. (of 12) = 0.258904 fft 13: mflops = -1 (norm. = -0.00561119), norm. avg. (of 12) = 0.251097 fft 14: mflops = 11.6689 (norm. = 0.0654762), norm. avg. (of 17) = 0.0482676 fft 15: mflops = 19.6037 (norm. = 0.11), norm. avg. (of 17) = 0.379909 Benchmarking for array size = 165375: 0. Brenner: elapsed time t=1.16662 s, 1 iters, t-(init.)=1.14995 s t(norm)=0.401123, mflops=12.465 (err=2.7e-14) 1. CWP (min N) (N=180180): elapsed time t=1.91659 s, 16 iters, t-(init.)=1.68327 s t(norm)=0.0366969, mflops=136.251 2. CWP (best N) (N=180180): elapsed time t=1.91659 s, 16 iters, t-(init.)=1.68327 s t(norm)=0.0366969, mflops=136.251 3. FFTPACK: elapsed time t=1.16662 s, 2 iters, t-(init.)=1.13329 s t(norm)=0.197655, mflops=25.2966 (err=2.7e-14) 4. 
FFTPACK (f2c): elapsed time t=1.24995 s, 2 iters, t-(init.)=1.21662 s t(norm)=0.212188, mflops=23.564 (err=2.7e-14) FFTW_MEASURE plan: (cost = 1.416610e-01) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 15 5. FFTW: elapsed time t=1.09996 s, 8 iters, t-(init.)=0.983294 s t(norm)=0.0428736, mflops=116.622 (err=2.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.16662 s, 8 iters, t-(init.)=1.04996 s t(norm)=0.0457803, mflops=109.217 (err=2.7e-14) 7. Frigo-old: elapsed time t=1.34995 s, 2 iters, t-(init.)=1.33328 s t(norm)=0.232535, mflops=21.5022 (err=2.7e-14) 8. GSL: elapsed time t=1.08329 s, 4 iters, t-(init.)=1.03329 s t(norm)=0.0901073, mflops=55.4894 (err=2.7e-14) 9. NAPACK (f2c): elapsed time t=1.13329 s, 1 iters, t-(init.)=1.13329 s t(norm)=0.395309, mflops=12.6483 (err=1.6e-11) 10. Singleton: elapsed time t=1.39994 s, 4 iters, t-(init.)=1.34995 s t(norm)=0.117721, mflops=42.4734 (err=4.0e-14) 11. Singleton (f2c): elapsed time t=1.46661 s, 4 iters, t-(init.)=1.39994 s t(norm)=0.122081, mflops=40.9565 (err=4.0e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.34995 s, 1 iters, t-(init.)=1.33328 s t(norm)=0.46507, mflops=10.7511 (err=2.7e-14) 15. DXML: elapsed time t=1.53327 s, 2 iters, t-(init.)=1.49994 s t(norm)=0.261602, mflops=19.113 (err=2.7e-14) Top mflops for N=165375 = 136.251 Normalized results and averages for N=165375: fft 0: mflops = 12.465 (norm. = 0.0914855), norm. avg. (of 15) = 0.0813688 fft 1: mflops = 136.251 (norm. = 1), norm. avg. (of 18) = 0.700853 fft 2: mflops = 136.251 (norm. = 1), norm. avg. (of 18) = 0.730188 fft 3: mflops = 25.2966 (norm. = 0.185662), norm. avg. (of 18) = 0.455 fft 4: mflops = 23.564 (norm. = 0.172945), norm. avg. 
(of 18) = 0.294751 fft 5: mflops = 116.622 (norm. = 0.855932), norm. avg. (of 18) = 0.82548 fft 6: mflops = 109.217 (norm. = 0.801587), norm. avg. (of 18) = 0.836232 fft 7: mflops = 21.5022 (norm. = 0.157813), norm. avg. (of 18) = 0.137379 fft 8: mflops = 55.4894 (norm. = 0.407258), norm. avg. (of 18) = 0.383529 fft 9: mflops = 12.6483 (norm. = 0.0928309), norm. avg. (of 18) = 0.0813311 fft 10: mflops = 42.4734 (norm. = 0.311728), norm. avg. (of 18) = 0.235423 fft 11: mflops = 40.9565 (norm. = 0.300595), norm. avg. (of 18) = 0.226641 fft 12: mflops = -1 (norm. = -0.00733938), norm. avg. (of 12) = 0.258904 fft 13: mflops = -1 (norm. = -0.00733938), norm. avg. (of 12) = 0.251097 fft 14: mflops = 10.7511 (norm. = 0.0789063), norm. avg. (of 18) = 0.0499697 fft 15: mflops = 19.113 (norm. = 0.140278), norm. avg. (of 18) = 0.366596 Benchmarking for array size = 362880: 0. Brenner: elapsed time t=2.78322 s, 1 iters, t-(init.)=2.74989 s t(norm)=0.410304, mflops=12.1861 (err=1.1e-13) 1. CWP (min N) (N=720720): elapsed time t=1.26662 s, 2 iters, t-(init.)=1.11662 s t(norm)=0.0833042, mflops=60.021 2. CWP (best N) (N=720720): elapsed time t=1.26662 s, 2 iters, t-(init.)=1.13329 s t(norm)=0.0845475, mflops=59.1384 3. FFTPACK: elapsed time t=1.03329 s, 1 iters, t-(init.)=0.99996 s t(norm)=0.149201, mflops=33.5117 (err=1.1e-13) 4. FFTPACK (f2c): elapsed time t=1.13329 s, 1 iters, t-(init.)=1.09996 s t(norm)=0.164122, mflops=30.4652 (err=1.1e-13) FFTW_MEASURE plan: (cost = 3.166540e-01) FFTW_TWIDDLE 32 FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_NOTW 14 5. FFTW: elapsed time t=1.19995 s, 4 iters, t-(init.)=1.06662 s t(norm)=0.0397871, mflops=125.669 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.31661 s, 4 iters, t-(init.)=1.18329 s t(norm)=0.0441388, mflops=113.279 (err=1.1e-13) 7. 
Frigo-old: elapsed time t=1.33328 s, 1 iters, t-(init.)=1.29995 s t(norm)=0.193962, mflops=25.7783 (err=1.1e-13) 8. GSL: elapsed time t=1.16662 s, 2 iters, t-(init.)=1.09996 s t(norm)=0.0820608, mflops=60.9304 (err=1.1e-13) 9. NAPACK (f2c): elapsed time t=2.24991 s, 1 iters, t-(init.)=2.21658 s t(norm)=0.33073, mflops=15.1181 (err=3.4e-11) 10. Singleton: elapsed time t=1.19995 s, 1 iters, t-(init.)=1.18329 s t(norm)=0.176555, mflops=28.3198 (err=1.6e-13) 11. Singleton (f2c): elapsed time t=1.21662 s, 1 iters, t-(init.)=1.19995 s t(norm)=0.179042, mflops=27.9264 (err=1.6e-13) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=3.54986 s, 1 iters, t-(init.)=3.53319 s t(norm)=0.527179, mflops=9.48445 (err=1.1e-13) 15. DXML: elapsed time t=1.18329 s, 2 iters, t-(init.)=1.11662 s t(norm)=0.0833042, mflops=60.021 (err=1.1e-13) Top mflops for N=362880 = 125.669 Normalized results and averages for N=362880: fft 0: mflops = 12.1861 (norm. = 0.0969697), norm. avg. (of 16) = 0.0823438 fft 1: mflops = 60.021 (norm. = 0.477612), norm. avg. (of 19) = 0.689103 fft 2: mflops = 59.1384 (norm. = 0.470588), norm. avg. (of 19) = 0.716525 fft 3: mflops = 33.5117 (norm. = 0.266667), norm. avg. (of 19) = 0.445087 fft 4: mflops = 30.4652 (norm. = 0.242424), norm. avg. (of 19) = 0.291997 fft 5: mflops = 125.669 (norm. = 1), norm. avg. (of 19) = 0.834665 fft 6: mflops = 113.279 (norm. = 0.901408), norm. avg. (of 19) = 0.839663 fft 7: mflops = 25.7783 (norm. = 0.205128), norm. avg. (of 19) = 0.140944 fft 8: mflops = 60.9304 (norm. = 0.484848), norm. avg. (of 19) = 0.388861 fft 9: mflops = 15.1181 (norm. = 0.120301), norm. avg. (of 19) = 0.0833821 fft 10: mflops = 28.3198 (norm. = 0.225352), norm. avg. (of 19) = 0.234893 fft 11: mflops = 27.9264 (norm. = 0.222222), norm. avg. (of 19) = 0.226408 fft 12: mflops = -1 (norm. = -0.00795741), norm. avg. (of 12) = 0.258904 fft 13: mflops = -1 (norm. 
= -0.00795741), norm. avg. (of 12) = 0.251097 fft 14: mflops = 9.48445 (norm. = 0.0754717), norm. avg. (of 19) = 0.0513119 fft 15: mflops = 60.021 (norm. = 0.477612), norm. avg. (of 19) = 0.372439 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) Maximum array size N = 2097152 Benchmarking FFTs: 0. FFTW 1. HARM 2. HARM (f2c) 3. PDA 4. PDA (f2c) 5. Singleton 6. Singleton (f2c) 7. Temperton 8. Temperton (f2c) Computing normalized averages (9 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.58327 s, 262144 iters, t-(init.)=1.49994 s t(norm)=0.0149006, mflops=335.558 (err=1.9e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. Skipping fft (all dimensions must be > 4 for HARM). 3. PDA: elapsed time t=1.21662 s, 32768 iters, t-(init.)=1.21662 s t(norm)=0.0966881, mflops=51.7127 (err=2.8e-16) 4. PDA (f2c): elapsed time t=1.6166 s, 32768 iters, t-(init.)=1.59994 s t(norm)=0.127151, mflops=39.3232 (err=2.8e-16) 5. Singleton: elapsed time t=1.89992 s, 131072 iters, t-(init.)=1.86659 s t(norm)=0.0370859, mflops=134.822 (err=1.9e-16) 6. Singleton (f2c): elapsed time t=1.39994 s, 131072 iters, t-(init.)=1.36661 s t(norm)=0.0271521, mflops=184.148 (err=1.9e-16) 7. Temperton: elapsed time t=1.38328 s, 131072 iters, t-(init.)=1.33328 s t(norm)=0.0264899, mflops=188.751 (err=1.9e-16) 8. Temperton (f2c): elapsed time t=1.03329 s, 65536 iters, t-(init.)=1.01663 s t(norm)=0.0403971, mflops=123.771 (err=1.9e-16) Top mflops for N=64 = 335.558 Normalized results and averages for N=64: fft 0: mflops = 335.558 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.00298011), norm. avg. (of 0) = -1 fft 2: mflops = -1 (norm. = -0.00298011), norm. avg. (of 0) = -1 fft 3: mflops = 51.7127 (norm. 
= 0.15411), norm. avg. (of 1) = 0.15411 fft 4: mflops = 39.3232 (norm. = 0.117188), norm. avg. (of 1) = 0.117188 fft 5: mflops = 134.822 (norm. = 0.401786), norm. avg. (of 1) = 0.401786 fft 6: mflops = 184.148 (norm. = 0.54878), norm. avg. (of 1) = 0.54878 fft 7: mflops = 188.751 (norm. = 0.5625), norm. avg. (of 1) = 0.5625 fft 8: mflops = 123.771 (norm. = 0.368852), norm. avg. (of 1) = 0.368852 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.34995 s, 32768 iters, t-(init.)=1.28328 s t(norm)=0.00849884, mflops=588.316 (err=3.6e-16) 1. HARM: elapsed time t=1.63327 s, 16384 iters, t-(init.)=1.59994 s t(norm)=0.0211919, mflops=235.939 (err=3.8e-16) 2. HARM (f2c): elapsed time t=1.09996 s, 8192 iters, t-(init.)=1.08329 s t(norm)=0.0286974, mflops=174.232 (err=3.8e-16) 3. PDA: elapsed time t=1.19995 s, 4096 iters, t-(init.)=1.18329 s t(norm)=0.0626927, mflops=79.754 (err=3.0e-16) 4. PDA (f2c): elapsed time t=1.43328 s, 4096 iters, t-(init.)=1.41661 s t(norm)=0.0750547, mflops=66.6181 (err=3.0e-16) 5. Singleton: elapsed time t=1.36661 s, 8192 iters, t-(init.)=1.34995 s t(norm)=0.0357614, mflops=139.816 (err=3.5e-16) 6. Singleton (f2c): elapsed time t=1.59994 s, 8192 iters, t-(init.)=1.58327 s t(norm)=0.0419423, mflops=119.211 (err=3.5e-16) 7. Temperton: elapsed time t=1.11662 s, 16384 iters, t-(init.)=1.08329 s t(norm)=0.0143487, mflops=348.464 (err=1.3e-08) 8. Temperton (f2c): elapsed time t=1.03329 s, 8192 iters, t-(init.)=1.01663 s t(norm)=0.0269314, mflops=185.657 (err=3.3e-16) Top mflops for N=512 = 588.316 Normalized results and averages for N=512: fft 0: mflops = 588.316 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 235.939 (norm. = 0.401042), norm. avg. (of 1) = 0.401042 fft 2: mflops = 174.232 (norm. = 0.296154), norm. avg. (of 1) = 0.296154 fft 3: mflops = 79.754 (norm. = 0.135563), norm. avg. (of 2) = 0.144836 fft 4: mflops = 66.6181 (norm. = 0.113235), norm. avg. (of 2) = 0.115211 fft 5: mflops = 139.816 (norm.
= 0.237654), norm. avg. (of 2) = 0.31972 fft 6: mflops = 119.211 (norm. = 0.202632), norm. avg. (of 2) = 0.375706 fft 7: mflops = 348.464 (norm. = 0.592308), norm. avg. (of 2) = 0.577404 fft 8: mflops = 185.657 (norm. = 0.315574), norm. avg. (of 2) = 0.342213 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.01663 s, 2048 iters, t-(init.)=0.983294 s t(norm)=0.00976815, mflops=511.868 (err=4.2e-16) 1. HARM: elapsed time t=1.06662 s, 1024 iters, t-(init.)=1.04996 s t(norm)=0.0208608, mflops=239.684 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.36661 s, 1024 iters, t-(init.)=1.34995 s t(norm)=0.026821, mflops=186.421 (err=4.0e-16) 3. PDA: elapsed time t=1.64993 s, 1024 iters, t-(init.)=1.63327 s t(norm)=0.0324501, mflops=154.083 (err=4.0e-16) 4. PDA (f2c): elapsed time t=1.23328 s, 512 iters, t-(init.)=1.21662 s t(norm)=0.0483441, mflops=103.425 (err=4.0e-16) 5. Singleton: elapsed time t=1.44994 s, 1024 iters, t-(init.)=1.43328 s t(norm)=0.0284766, mflops=175.583 (err=4.1e-16) 6. Singleton (f2c): elapsed time t=1.51661 s, 1024 iters, t-(init.)=1.49994 s t(norm)=0.0298011, mflops=167.779 (err=4.1e-16) 7. Temperton: elapsed time t=1.51661 s, 2048 iters, t-(init.)=1.48327 s t(norm)=0.014735, mflops=339.328 (err=6.3e-08) 8. Temperton (f2c): elapsed time t=1.06662 s, 1024 iters, t-(init.)=1.04996 s t(norm)=0.0208608, mflops=239.684 (err=4.6e-16) Top mflops for N=4096 = 511.868 Normalized results and averages for N=4096: fft 0: mflops = 511.868 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 239.684 (norm. = 0.468254), norm. avg. (of 2) = 0.434648 fft 2: mflops = 186.421 (norm. = 0.364198), norm. avg. (of 2) = 0.330176 fft 3: mflops = 154.083 (norm. = 0.30102), norm. avg. (of 3) = 0.196898 fft 4: mflops = 103.425 (norm. = 0.202055), norm. avg. (of 3) = 0.144159 fft 5: mflops = 175.583 (norm. = 0.343023), norm. avg. (of 3) = 0.327488 fft 6: mflops = 167.779 (norm. = 0.327778), norm. avg. (of 3) = 0.35973 fft 7: mflops = 339.328 (norm.
= 0.662921), norm. avg. (of 3) = 0.60591 fft 8: mflops = 239.684 (norm. = 0.468254), norm. avg. (of 3) = 0.384227 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.43328 s, 128 iters, t-(init.)=1.29995 s t(norm)=0.0206621, mflops=241.989 (err=5.2e-16) 1. HARM: elapsed time t=1.21662 s, 64 iters, t-(init.)=1.14995 s t(norm)=0.0365561, mflops=136.776 (err=5.3e-16) 2. HARM (f2c): elapsed time t=1.34995 s, 64 iters, t-(init.)=1.28328 s t(norm)=0.0407944, mflops=122.566 (err=5.3e-16) 3. PDA: elapsed time t=1.94992 s, 64 iters, t-(init.)=1.88326 s t(norm)=0.0598672, mflops=83.5182 (err=4.2e-16) 4. PDA (f2c): elapsed time t=1.19995 s, 32 iters, t-(init.)=1.16662 s t(norm)=0.0741717, mflops=67.4112 (err=4.2e-16) 5. Singleton: elapsed time t=1.13329 s, 32 iters, t-(init.)=1.09996 s t(norm)=0.0699333, mflops=71.4967 (err=5.3e-16) 6. Singleton (f2c): elapsed time t=1.21662 s, 32 iters, t-(init.)=1.16662 s t(norm)=0.0741717, mflops=67.4112 (err=5.3e-16) 7. Temperton: elapsed time t=1.33328 s, 64 iters, t-(init.)=1.26662 s t(norm)=0.0402646, mflops=124.178 (err=9.6e-08) 8. Temperton (f2c): elapsed time t=1.49994 s, 64 iters, t-(init.)=1.43328 s t(norm)=0.0455626, mflops=109.739 (err=4.7e-16) Top mflops for N=32768 = 241.989 Normalized results and averages for N=32768: fft 0: mflops = 241.989 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 136.776 (norm. = 0.565217), norm. avg. (of 3) = 0.478171 fft 2: mflops = 122.566 (norm. = 0.506494), norm. avg. (of 3) = 0.388948 fft 3: mflops = 83.5182 (norm. = 0.345133), norm. avg. (of 4) = 0.233957 fft 4: mflops = 67.4112 (norm. = 0.278571), norm. avg. (of 4) = 0.177762 fft 5: mflops = 71.4967 (norm. = 0.295455), norm. avg. (of 4) = 0.319479 fft 6: mflops = 67.4112 (norm. = 0.278571), norm. avg. (of 4) = 0.33944 fft 7: mflops = 124.178 (norm. = 0.513158), norm. avg. (of 4) = 0.582722 fft 8: mflops = 109.739 (norm. = 0.453488), norm. avg. 
(of 4) = 0.401542 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.36661 s, 8 iters, t-(init.)=1.18329 s t(norm)=0.0313464, mflops=159.508 (err=1.2e-15) 1. HARM: elapsed time t=1.24995 s, 4 iters, t-(init.)=1.16662 s t(norm)=0.0618098, mflops=80.8934 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.33328 s, 4 iters, t-(init.)=1.24995 s t(norm)=0.0662247, mflops=75.5005 (err=1.2e-15) 3. PDA: elapsed time t=1.41661 s, 4 iters, t-(init.)=1.33328 s t(norm)=0.0706397, mflops=70.7817 (err=1.3e-15) 4. PDA (f2c): elapsed time t=1.6666 s, 4 iters, t-(init.)=1.58327 s t(norm)=0.0838847, mflops=59.6057 (err=1.3e-15) 5. Singleton: elapsed time t=1.51661 s, 2 iters, t-(init.)=1.48327 s t(norm)=0.157173, mflops=31.812 (err=1.7e-15) 6. Singleton (f2c): elapsed time t=1.54994 s, 2 iters, t-(init.)=1.49994 s t(norm)=0.158939, mflops=31.4585 (err=1.7e-15) 7. Temperton: elapsed time t=1.08329 s, 4 iters, t-(init.)=0.99996 s t(norm)=0.0529798, mflops=94.3756 (err=1.3e-07) 8. Temperton (f2c): elapsed time t=1.11662 s, 4 iters, t-(init.)=1.03329 s t(norm)=0.0547458, mflops=91.3312 (err=1.3e-15) Top mflops for N=262144 = 159.508 Normalized results and averages for N=262144: fft 0: mflops = 159.508 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 80.8934 (norm. = 0.507143), norm. avg. (of 4) = 0.485414 fft 2: mflops = 75.5005 (norm. = 0.473333), norm. avg. (of 4) = 0.410045 fft 3: mflops = 70.7817 (norm. = 0.44375), norm. avg. (of 5) = 0.275915 fft 4: mflops = 59.6057 (norm. = 0.373684), norm. avg. (of 5) = 0.216947 fft 5: mflops = 31.812 (norm. = 0.199438), norm. avg. (of 5) = 0.295471 fft 6: mflops = 31.4585 (norm. = 0.197222), norm. avg. (of 5) = 0.310997 fft 7: mflops = 94.3756 (norm. = 0.591667), norm. avg. (of 5) = 0.584511 fft 8: mflops = 91.3312 (norm. = 0.572581), norm. avg. (of 5) = 0.43575 Benchmarking for array size = 256x64x32 (power of 2): 0. 
FFTW: elapsed time t=1.78326 s, 4 iters, t-(init.)=1.58327 s t(norm)=0.0397348, mflops=125.834 (err=1.2e-15) 1. HARM: elapsed time t=1.44994 s, 2 iters, t-(init.)=1.34995 s t(norm)=0.0677584, mflops=73.7916 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.51661 s, 2 iters, t-(init.)=1.41661 s t(norm)=0.0711045, mflops=70.3191 (err=1.2e-15) 3. PDA: elapsed time t=1.89992 s, 2 iters, t-(init.)=1.79993 s t(norm)=0.0903445, mflops=55.3437 (err=1.2e-15) 4. PDA (f2c): elapsed time t=1.08329 s, 1 iters, t-(init.)=1.03329 s t(norm)=0.103729, mflops=48.2026 (err=1.2e-15) 5. Singleton: elapsed time t=1.89992 s, 1 iters, t-(init.)=1.84993 s t(norm)=0.185708, mflops=26.924 (err=1.7e-15) 6. Singleton (f2c): elapsed time t=1.96659 s, 1 iters, t-(init.)=1.91659 s t(norm)=0.1924, mflops=25.9875 (err=1.7e-15) 7. Temperton: elapsed time t=1.29995 s, 2 iters, t-(init.)=1.19995 s t(norm)=0.0602297, mflops=83.0156 (err=1.5e-07) 8. Temperton (f2c): elapsed time t=1.31661 s, 2 iters, t-(init.)=1.21662 s t(norm)=0.0610662, mflops=81.8784 (err=1.3e-15) Top mflops for N=524288 = 125.834 Normalized results and averages for N=524288: fft 0: mflops = 125.834 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 73.7916 (norm. = 0.58642), norm. avg. (of 5) = 0.505615 fft 2: mflops = 70.3191 (norm. = 0.558824), norm. avg. (of 5) = 0.4398 fft 3: mflops = 55.3437 (norm. = 0.439815), norm. avg. (of 6) = 0.303232 fft 4: mflops = 48.2026 (norm. = 0.383065), norm. avg. (of 6) = 0.244633 fft 5: mflops = 26.924 (norm. = 0.213964), norm. avg. (of 6) = 0.281887 fft 6: mflops = 25.9875 (norm. = 0.206522), norm. avg. (of 6) = 0.293584 fft 7: mflops = 83.0156 (norm. = 0.659722), norm. avg. (of 6) = 0.597046 fft 8: mflops = 81.8784 (norm. = 0.650685), norm. avg. (of 6) = 0.471572 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=1.7166 s, 2 iters, t-(init.)=1.51661 s t(norm)=0.0361587, mflops=138.279 (err=2.0e-15) 1. 
HARM: elapsed time t=1.49994 s, 1 iters, t-(init.)=1.39994 s t(norm)=0.0667545, mflops=74.9013 (err=1.9e-15) 2. HARM (f2c): elapsed time t=1.5666 s, 1 iters, t-(init.)=1.46661 s t(norm)=0.0699333, mflops=71.4967 (err=1.9e-15) 3. PDA: elapsed time t=1.89992 s, 1 iters, t-(init.)=1.79993 s t(norm)=0.0858273, mflops=58.2566 (err=2.0e-15) 4. PDA (f2c): elapsed time t=2.16658 s, 1 iters, t-(init.)=2.06658 s t(norm)=0.0985424, mflops=50.7396 (err=2.0e-15) 5. Singleton: elapsed time t=3.76652 s, 1 iters, t-(init.)=3.66652 s t(norm)=0.174833, mflops=28.5987 (err=2.8e-15) 6. Singleton (f2c): elapsed time t=3.84985 s, 1 iters, t-(init.)=3.74985 s t(norm)=0.178807, mflops=27.9631 (err=2.8e-15) 7. Skipping fft (Temperton can't handle dimensions > 256). 8. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 138.279 Normalized results and averages for N=1048576: fft 0: mflops = 138.279 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 74.9013 (norm. = 0.541667), norm. avg. (of 6) = 0.511624 fft 2: mflops = 71.4967 (norm. = 0.517045), norm. avg. (of 6) = 0.452675 fft 3: mflops = 58.2566 (norm. = 0.421296), norm. avg. (of 7) = 0.320098 fft 4: mflops = 50.7396 (norm. = 0.366935), norm. avg. (of 7) = 0.262105 fft 5: mflops = 28.5987 (norm. = 0.206818), norm. avg. (of 7) = 0.271163 fft 6: mflops = 27.9631 (norm. = 0.202222), norm. avg. (of 7) = 0.280532 fft 7: mflops = -1 (norm. = -0.00723174), norm. avg. (of 6) = 0.597046 fft 8: mflops = -1 (norm. = -0.00723174), norm. avg. (of 6) = 0.471572 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=1.94992 s, 1 iters, t-(init.)=1.73326 s t(norm)=0.0393564, mflops=127.044 (err=7.2e-16) 1. HARM: elapsed time t=3.36653 s, 1 iters, t-(init.)=3.16654 s t(norm)=0.0719011, mflops=69.5399 (err=7.0e-16) 2. HARM (f2c): elapsed time t=3.51653 s, 1 iters, t-(init.)=3.31653 s t(norm)=0.075307, mflops=66.3949 (err=7.0e-16) 3. 
PDA: elapsed time t=3.79985 s, 1 iters, t-(init.)=3.59986 s t(norm)=0.0817402, mflops=61.1694 (err=7.1e-16) 4. PDA (f2c): elapsed time t=4.36649 s, 1 iters, t-(init.)=4.1665 s t(norm)=0.0946068, mflops=52.8503 (err=7.1e-16) 5. Singleton: elapsed time t=10.8162 s, 1 iters, t-(init.)=10.6162 s t(norm)=0.241058, mflops=20.7419 (err=8.4e-16) 6. Singleton (f2c): elapsed time t=11.2162 s, 1 iters, t-(init.)=11.0162 s t(norm)=0.25014, mflops=19.9888 (err=8.4e-16) 7. Temperton: elapsed time t=3.21654 s, 1 iters, t-(init.)=3.01655 s t(norm)=0.0684953, mflops=72.9977 (err=1.5e-07) 8. Temperton (f2c): elapsed time t=3.51653 s, 1 iters, t-(init.)=3.31653 s t(norm)=0.075307, mflops=66.3949 (err=7.4e-16) Top mflops for N=2097152 = 127.044 Normalized results and averages for N=2097152: fft 0: mflops = 127.044 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 69.5399 (norm. = 0.547368), norm. avg. (of 7) = 0.51673 fft 2: mflops = 66.3949 (norm. = 0.522613), norm. avg. (of 7) = 0.462666 fft 3: mflops = 61.1694 (norm. = 0.481481), norm. avg. (of 8) = 0.340271 fft 4: mflops = 52.8503 (norm. = 0.416), norm. avg. (of 8) = 0.281342 fft 5: mflops = 20.7419 (norm. = 0.163265), norm. avg. (of 8) = 0.257675 fft 6: mflops = 19.9888 (norm. = 0.157337), norm. avg. (of 8) = 0.265133 fft 7: mflops = 72.9977 (norm. = 0.574586), norm. avg. (of 7) = 0.593837 fft 8: mflops = 66.3949 (norm. = 0.522613), norm. avg. 
(of 7) = 0.478864 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) 120x120x120 (26.3728 MB) 144x144x144 (45.5692 MB) Maximum array size N = 2985984 Benchmarking FFTs: 0. FFTW 1. PDA 2. PDA (f2c) 3. Singleton 4. Singleton (f2c) 5. Temperton 6. Temperton (f2c) Computing normalized averages (7 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1.81659 s, 131072 iters, t-(init.)=1.74993 s t(norm)=0.0153331, mflops=326.091 (err=3.0e-16) 1. PDA: elapsed time t=1.19995 s, 16384 iters, t-(init.)=1.18329 s t(norm)=0.0829449, mflops=60.281 (err=2.3e-16) 2. PDA (f2c): elapsed time t=1.53327 s, 16384 iters, t-(init.)=1.53327 s t(norm)=0.107478, mflops=46.5212 (err=2.3e-16) 3. Singleton: elapsed time t=1.23328 s, 65536 iters, t-(init.)=1.19995 s t(norm)=0.0210283, mflops=237.775 (err=3.1e-16) 4. Singleton (f2c): elapsed time t=1.81659 s, 131072 iters, t-(init.)=1.74993 s t(norm)=0.0153331, mflops=326.091 (err=3.1e-16) 5. Temperton: elapsed time t=1.21662 s, 65536 iters, t-(init.)=1.18329 s t(norm)=0.0207362, mflops=241.124 (err=5.3e-16) 6. Temperton (f2c): elapsed time t=1.69993 s, 65536 iters, t-(init.)=1.6666 s t(norm)=0.029206, mflops=171.198 (err=2.4e-16) Top mflops for N=125 = 326.091 Normalized results and averages for N=125: fft 0: mflops = 326.091 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = 60.281 (norm. = 0.184859), norm. avg. (of 1) = 0.184859 fft 2: mflops = 46.5212 (norm. = 0.142663), norm. avg.
(of 1) = 0.142663 fft 3: mflops = 237.775 (norm. = 0.729167), norm. avg. (of 1) = 0.729167 fft 4: mflops = 326.091 (norm. = 1), norm. avg. (of 1) = 1 fft 5: mflops = 241.124 (norm. = 0.739437), norm. avg. (of 1) = 0.739437 fft 6: mflops = 171.198 (norm. = 0.525), norm. avg. (of 1) = 0.525 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.21662 s, 65536 iters, t-(init.)=1.14995 s t(norm)=0.0104754, mflops=477.308 (err=2.9e-16) 1. PDA: elapsed time t=1.28328 s, 8192 iters, t-(init.)=1.28328 s t(norm)=0.0935197, mflops=53.4647 (err=3.6e-16) 2. PDA (f2c): elapsed time t=1.51661 s, 8192 iters, t-(init.)=1.51661 s t(norm)=0.110523, mflops=45.2394 (err=3.6e-16) 3. Singleton: elapsed time t=1.19995 s, 16384 iters, t-(init.)=1.18329 s t(norm)=0.0431162, mflops=115.966 (err=2.9e-16) 4. Singleton (f2c): elapsed time t=1.16662 s, 16384 iters, t-(init.)=1.14995 s t(norm)=0.0419017, mflops=119.327 (err=2.9e-16) 5. Temperton: elapsed time t=1.73326 s, 32768 iters, t-(init.)=1.69993 s t(norm)=0.0309708, mflops=161.442 (err=2.1e-15) 6. Temperton (f2c): elapsed time t=1.79993 s, 32768 iters, t-(init.)=1.7666 s t(norm)=0.0321853, mflops=155.35 (err=3.1e-16) Top mflops for N=216 = 477.308 Normalized results and averages for N=216: fft 0: mflops = 477.308 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 53.4647 (norm. = 0.112013), norm. avg. (of 2) = 0.148436 fft 2: mflops = 45.2394 (norm. = 0.0947802), norm. avg. (of 2) = 0.118722 fft 3: mflops = 115.966 (norm. = 0.242958), norm. avg. (of 2) = 0.486062 fft 4: mflops = 119.327 (norm. = 0.25), norm. avg. (of 2) = 0.625 fft 5: mflops = 161.442 (norm. = 0.338235), norm. avg. (of 2) = 0.538836 fft 6: mflops = 155.35 (norm. = 0.325472), norm. avg. (of 2) = 0.425236 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.36661 s, 32768 iters, t-(init.)=1.31661 s t(norm)=0.013909, mflops=359.48 (err=3.8e-16) 1. 
PDA: elapsed time t=1.19995 s, 2048 iters, t-(init.)=1.19995 s t(norm)=0.202825, mflops=24.6518 (err=4.8e-16) 2. PDA (f2c): elapsed time t=1.21662 s, 2048 iters, t-(init.)=1.21662 s t(norm)=0.205642, mflops=24.3141 (err=4.3e-16) 3. Singleton: elapsed time t=1.74993 s, 16384 iters, t-(init.)=1.7166 s t(norm)=0.036269, mflops=137.859 (err=5.8e-16) 4. Singleton (f2c): elapsed time t=1.6166 s, 16384 iters, t-(init.)=1.59994 s t(norm)=0.0338041, mflops=147.911 (err=5.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 359.48 Normalized results and averages for N=343: fft 0: mflops = 359.48 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 24.6518 (norm. = 0.0685764), norm. avg. (of 3) = 0.121816 fft 2: mflops = 24.3141 (norm. = 0.067637), norm. avg. (of 3) = 0.101693 fft 3: mflops = 137.859 (norm. = 0.383495), norm. avg. (of 3) = 0.451873 fft 4: mflops = 147.911 (norm. = 0.411458), norm. avg. (of 3) = 0.553819 fft 5: mflops = -1 (norm. = -0.0027818), norm. avg. (of 2) = 0.538836 fft 6: mflops = -1 (norm. = -0.0027818), norm. avg. (of 2) = 0.425236 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.48327 s, 16384 iters, t-(init.)=1.43328 s t(norm)=0.0126186, mflops=396.24 (err=5.3e-16) 1. PDA: elapsed time t=1.73326 s, 4096 iters, t-(init.)=1.73326 s t(norm)=0.0610389, mflops=81.9149 (err=4.9e-16) 2. PDA (f2c): elapsed time t=1.14995 s, 2048 iters, t-(init.)=1.13329 s t(norm)=0.0798202, mflops=62.6408 (err=4.9e-16) 3. Singleton: elapsed time t=1.03329 s, 4096 iters, t-(init.)=1.03329 s t(norm)=0.0363886, mflops=137.406 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.06662 s, 4096 iters, t-(init.)=1.04996 s t(norm)=0.0369755, mflops=135.225 (err=4.5e-16) 5. Temperton: elapsed time t=1.46661 s, 8192 iters, t-(init.)=1.44994 s t(norm)=0.0255307, mflops=195.843 (err=6.0e-08) 6. 
Temperton (f2c): elapsed time t=1.26662 s, 8192 iters, t-(init.)=1.24995 s t(norm)=0.0220092, mflops=227.177 (err=5.1e-16) Top mflops for N=729 = 396.24 Normalized results and averages for N=729: fft 0: mflops = 396.24 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 81.9149 (norm. = 0.206731), norm. avg. (of 4) = 0.143045 fft 2: mflops = 62.6408 (norm. = 0.158088), norm. avg. (of 4) = 0.115792 fft 3: mflops = 137.406 (norm. = 0.346774), norm. avg. (of 4) = 0.425598 fft 4: mflops = 135.225 (norm. = 0.34127), norm. avg. (of 4) = 0.500682 fft 5: mflops = 195.843 (norm. = 0.494253), norm. avg. (of 3) = 0.523975 fft 6: mflops = 227.177 (norm. = 0.573333), norm. avg. (of 3) = 0.474602 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.89992 s, 16384 iters, t-(init.)=1.83326 s t(norm)=0.0112277, mflops=445.325 (err=4.0e-16) 1. PDA: elapsed time t=1.09996 s, 2048 iters, t-(init.)=1.08329 s t(norm)=0.0530766, mflops=94.2034 (err=4.3e-16) 2. PDA (f2c): elapsed time t=1.34995 s, 2048 iters, t-(init.)=1.34995 s t(norm)=0.0661416, mflops=75.5953 (err=4.3e-16) 3. Singleton: elapsed time t=1.41661 s, 4096 iters, t-(init.)=1.39994 s t(norm)=0.0342957, mflops=145.791 (err=4.6e-16) 4. Singleton (f2c): elapsed time t=1.39994 s, 4096 iters, t-(init.)=1.38328 s t(norm)=0.0338874, mflops=147.548 (err=4.6e-16) 5. Temperton: elapsed time t=1.7166 s, 8192 iters, t-(init.)=1.68327 s t(norm)=0.0206182, mflops=242.504 (err=6.3e-16) 6. Temperton (f2c): elapsed time t=1.04996 s, 4096 iters, t-(init.)=1.03329 s t(norm)=0.0253135, mflops=197.523 (err=3.4e-16) Top mflops for N=1000 = 445.325 Normalized results and averages for N=1000: fft 0: mflops = 445.325 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 94.2034 (norm. = 0.211538), norm. avg. (of 5) = 0.156744 fft 2: mflops = 75.5953 (norm. = 0.169753), norm. avg. (of 5) = 0.126584 fft 3: mflops = 145.791 (norm. = 0.327381), norm. avg. (of 5) = 0.405955 fft 4: mflops = 147.548 (norm. = 0.331325), norm. avg. 
(of 5) = 0.466811 fft 5: mflops = 242.504 (norm. = 0.544554), norm. avg. (of 4) = 0.52912 fft 6: mflops = 197.523 (norm. = 0.443548), norm. avg. (of 4) = 0.466838 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.98325 s, 8192 iters, t-(init.)=1.94992 s t(norm)=0.0172315, mflops=290.166 (err=4.3e-16) 1. PDA: elapsed time t=1.46661 s, 512 iters, t-(init.)=1.46661 s t(norm)=0.207367, mflops=24.1118 (err=5.6e-16) 2. PDA (f2c): elapsed time t=1.46661 s, 512 iters, t-(init.)=1.46661 s t(norm)=0.207367, mflops=24.1118 (err=5.6e-16) 3. Singleton: elapsed time t=1.79993 s, 4096 iters, t-(init.)=1.78326 s t(norm)=0.0315175, mflops=158.642 (err=6.6e-16) 4. Singleton (f2c): elapsed time t=1.63327 s, 4096 iters, t-(init.)=1.6166 s t(norm)=0.0285719, mflops=174.997 (err=6.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 290.166 Normalized results and averages for N=1331: fft 0: mflops = 290.166 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 24.1118 (norm. = 0.0830966), norm. avg. (of 6) = 0.144469 fft 2: mflops = 24.1118 (norm. = 0.0830966), norm. avg. (of 6) = 0.119336 fft 3: mflops = 158.642 (norm. = 0.546729), norm. avg. (of 6) = 0.429417 fft 4: mflops = 174.997 (norm. = 0.603093), norm. avg. (of 6) = 0.489524 fft 5: mflops = -1 (norm. = -0.0034463), norm. avg. (of 4) = 0.52912 fft 6: mflops = -1 (norm. = -0.0034463), norm. avg. (of 4) = 0.466838 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.98325 s, 8192 iters, t-(init.)=1.91659 s t(norm)=0.012589, mflops=397.174 (err=3.9e-16) 1. PDA: elapsed time t=1.6666 s, 2048 iters, t-(init.)=1.64993 s t(norm)=0.0433498, mflops=115.341 (err=3.9e-16) 2. PDA (f2c): elapsed time t=1.18329 s, 1024 iters, t-(init.)=1.18329 s t(norm)=0.0621785, mflops=80.4137 (err=3.8e-16) 3. Singleton: elapsed time t=1.44994 s, 2048 iters, t-(init.)=1.43328 s t(norm)=0.0376574, mflops=132.776 (err=4.0e-16) 4. 
Singleton (f2c): elapsed time t=1.64993 s, 2048 iters, t-(init.)=1.63327 s t(norm)=0.0429119, mflops=116.518 (err=4.0e-16) 5. Temperton: elapsed time t=1.09996 s, 4096 iters, t-(init.)=1.08329 s t(norm)=0.014231, mflops=351.346 (err=1.8e-15) 6. Temperton (f2c): elapsed time t=1.46661 s, 4096 iters, t-(init.)=1.44994 s t(norm)=0.0190476, mflops=262.5 (err=3.9e-16) Top mflops for N=1728 = 397.174 Normalized results and averages for N=1728: fft 0: mflops = 397.174 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 115.341 (norm. = 0.290404), norm. avg. (of 7) = 0.165317 fft 2: mflops = 80.4137 (norm. = 0.202465), norm. avg. (of 7) = 0.131212 fft 3: mflops = 132.776 (norm. = 0.334302), norm. avg. (of 7) = 0.415829 fft 4: mflops = 116.518 (norm. = 0.293367), norm. avg. (of 7) = 0.461502 fft 5: mflops = 351.346 (norm. = 0.884615), norm. avg. (of 5) = 0.600219 fft 6: mflops = 262.5 (norm. = 0.66092), norm. avg. (of 5) = 0.505655 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.84993 s, 4096 iters, t-(init.)=1.81659 s t(norm)=0.0181842, mflops=274.965 (err=4.5e-16) 1. PDA: elapsed time t=1.36661 s, 256 iters, t-(init.)=1.36661 s t(norm)=0.218877, mflops=22.8439 (err=9.2e-16) 2. PDA (f2c): elapsed time t=1.33328 s, 256 iters, t-(init.)=1.33328 s t(norm)=0.213539, mflops=23.415 (err=9.2e-16) 3. Singleton: elapsed time t=1.7666 s, 2048 iters, t-(init.)=1.73326 s t(norm)=0.0347001, mflops=144.092 (err=7.7e-16) 4. Singleton (f2c): elapsed time t=1.63327 s, 2048 iters, t-(init.)=1.6166 s t(norm)=0.0323645, mflops=154.49 (err=7.7e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 274.965 Normalized results and averages for N=2197: fft 0: mflops = 274.965 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 22.8439 (norm. = 0.0830793), norm. avg. (of 8) = 0.155037 fft 2: mflops = 23.415 (norm. = 0.0851563), norm. avg. (of 8) = 0.125455 fft 3: mflops = 144.092 (norm. 
= 0.524038), norm. avg. (of 8) = 0.429356 fft 4: mflops = 154.49 (norm. = 0.561856), norm. avg. (of 8) = 0.474046 fft 5: mflops = -1 (norm. = -0.00363683), norm. avg. (of 5) = 0.600219 fft 6: mflops = -1 (norm. = -0.00363683), norm. avg. (of 5) = 0.505655 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.6666 s, 4096 iters, t-(init.)=1.6166 s t(norm)=0.0125926, mflops=397.059 (err=4.1e-16) 1. PDA: elapsed time t=1.6666 s, 512 iters, t-(init.)=1.64993 s t(norm)=0.102818, mflops=48.6298 (err=4.5e-16) 2. PDA (f2c): elapsed time t=1.01663 s, 256 iters, t-(init.)=0.99996 s t(norm)=0.124628, mflops=40.1196 (err=4.5e-16) 3. Singleton: elapsed time t=1.49994 s, 1024 iters, t-(init.)=1.49994 s t(norm)=0.0467353, mflops=106.985 (err=5.6e-16) 4. Singleton (f2c): elapsed time t=1.5666 s, 1024 iters, t-(init.)=1.54994 s t(norm)=0.0482932, mflops=103.534 (err=5.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 397.059 Normalized results and averages for N=2744: fft 0: mflops = 397.059 (norm. = 1), norm. avg. (of 9) = 1 fft 1: mflops = 48.6298 (norm. = 0.122475), norm. avg. (of 9) = 0.151419 fft 2: mflops = 40.1196 (norm. = 0.101042), norm. avg. (of 9) = 0.122742 fft 3: mflops = 106.985 (norm. = 0.269444), norm. avg. (of 9) = 0.411588 fft 4: mflops = 103.534 (norm. = 0.260753), norm. avg. (of 9) = 0.450347 fft 5: mflops = -1 (norm. = -0.00251851), norm. avg. (of 5) = 0.600219 fft 6: mflops = -1 (norm. = -0.00251851), norm. avg. (of 5) = 0.505655 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.16662 s, 2048 iters, t-(init.)=1.13329 s t(norm)=0.0139889, mflops=357.426 (err=4.7e-16) 1. PDA: elapsed time t=1.6666 s, 1024 iters, t-(init.)=1.64993 s t(norm)=0.0407324, mflops=122.752 (err=4.8e-16) 2. PDA (f2c): elapsed time t=1.13329 s, 512 iters, t-(init.)=1.11662 s t(norm)=0.0551328, mflops=90.6901 (err=4.8e-16) 3. 
Singleton: elapsed time t=1.51661 s, 1024 iters, t-(init.)=1.49994 s t(norm)=0.0370295, mflops=135.028 (err=6.1e-16) 4. Singleton (f2c): elapsed time t=1.59994 s, 1024 iters, t-(init.)=1.58327 s t(norm)=0.0390867, mflops=127.921 (err=6.1e-16) 5. Temperton: elapsed time t=1.19995 s, 2048 iters, t-(init.)=1.16662 s t(norm)=0.0144004, mflops=347.214 (err=2.0e-15) 6. Temperton (f2c): elapsed time t=1.7166 s, 2048 iters, t-(init.)=1.68327 s t(norm)=0.0207777, mflops=240.643 (err=4.6e-16) Top mflops for N=3375 = 357.426 Normalized results and averages for N=3375: fft 0: mflops = 357.426 (norm. = 1), norm. avg. (of 10) = 1 fft 1: mflops = 122.752 (norm. = 0.343434), norm. avg. (of 10) = 0.170621 fft 2: mflops = 90.6901 (norm. = 0.253731), norm. avg. (of 10) = 0.135841 fft 3: mflops = 135.028 (norm. = 0.377778), norm. avg. (of 10) = 0.408207 fft 4: mflops = 127.921 (norm. = 0.357895), norm. avg. (of 10) = 0.441102 fft 5: mflops = 347.214 (norm. = 0.971429), norm. avg. (of 6) = 0.662087 fft 6: mflops = 240.643 (norm. = 0.673267), norm. avg. (of 6) = 0.53359 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.68327 s, 256 iters, t-(init.)=1.54994 s t(norm)=0.0256753, mflops=194.739 (err=4.6e-16) 1. PDA: elapsed time t=1.41661 s, 128 iters, t-(init.)=1.34995 s t(norm)=0.0447248, mflops=111.795 (err=5.2e-16) 2. PDA (f2c): elapsed time t=1.01663 s, 64 iters, t-(init.)=0.983294 s t(norm)=0.0651546, mflops=76.7405 (err=4.8e-16) 3. Singleton: elapsed time t=1.81659 s, 128 iters, t-(init.)=1.74993 s t(norm)=0.0579766, mflops=86.2417 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.94992 s, 128 iters, t-(init.)=1.88326 s t(norm)=0.0623939, mflops=80.1361 (err=5.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 194.739 Normalized results and averages for N=16800: fft 0: mflops = 194.739 (norm. = 1), norm. avg. (of 11) = 1 fft 1: mflops = 111.795 (norm. = 0.574074), norm. 
avg. (of 11) = 0.207298 fft 2: mflops = 76.7405 (norm. = 0.394068), norm. avg. (of 11) = 0.159316 fft 3: mflops = 86.2417 (norm. = 0.442857), norm. avg. (of 11) = 0.411357 fft 4: mflops = 80.1361 (norm. = 0.411504), norm. avg. (of 11) = 0.438411 fft 5: mflops = -1 (norm. = -0.00513507), norm. avg. (of 6) = 0.662087 fft 6: mflops = -1 (norm. = -0.00513507), norm. avg. (of 6) = 0.53359 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.94992 s, 32 iters, t-(init.)=1.73326 s t(norm)=0.0292314, mflops=171.049 (err=6.8e-16) 1. PDA: elapsed time t=1.54994 s, 16 iters, t-(init.)=1.43328 s t(norm)=0.0483442, mflops=103.425 (err=6.3e-16) 2. PDA (f2c): elapsed time t=1.06662 s, 8 iters, t-(init.)=1.01663 s t(norm)=0.0685813, mflops=72.9061 (err=6.2e-16) 3. Singleton: elapsed time t=1.64993 s, 8 iters, t-(init.)=1.59994 s t(norm)=0.107931, mflops=46.3258 (err=6.5e-16) 4. Singleton (f2c): elapsed time t=1.64993 s, 8 iters, t-(init.)=1.58327 s t(norm)=0.106807, mflops=46.8134 (err=6.5e-16) 5. Temperton: elapsed time t=1.31661 s, 16 iters, t-(init.)=1.21662 s t(norm)=0.0410364, mflops=121.843 (err=1.0e-07) 6. Temperton (f2c): elapsed time t=1.38328 s, 16 iters, t-(init.)=1.28328 s t(norm)=0.0432849, mflops=115.514 (err=7.0e-16) Top mflops for N=110592 = 171.049 Normalized results and averages for N=110592: fft 0: mflops = 171.049 (norm. = 1), norm. avg. (of 12) = 1 fft 1: mflops = 103.425 (norm. = 0.604651), norm. avg. (of 12) = 0.240411 fft 2: mflops = 72.9061 (norm. = 0.42623), norm. avg. (of 12) = 0.181559 fft 3: mflops = 46.3258 (norm. = 0.270833), norm. avg. (of 12) = 0.399646 fft 4: mflops = 46.8134 (norm. = 0.273684), norm. avg. (of 12) = 0.424684 fft 5: mflops = 121.843 (norm. = 0.712329), norm. avg. (of 7) = 0.669265 fft 6: mflops = 115.514 (norm. = 0.675325), norm. avg. (of 7) = 0.553838 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.96659 s, 32 iters, t-(init.)=1.7166 s t(norm)=0.0270696, mflops=184.709 (err=6.4e-16) 1. 
PDA: elapsed time t=1.36661 s, 8 iters, t-(init.)=1.31661 s t(norm)=0.0830485, mflops=60.2058 (err=7.1e-16) 2. PDA (f2c): elapsed time t=1.04996 s, 4 iters, t-(init.)=1.01663 s t(norm)=0.128252, mflops=38.9857 (err=7.1e-16) 3. Singleton: elapsed time t=1.48327 s, 8 iters, t-(init.)=1.41661 s t(norm)=0.089356, mflops=55.956 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.48327 s, 8 iters, t-(init.)=1.41661 s t(norm)=0.089356, mflops=55.956 (err=1.0e-15) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 184.709 Normalized results and averages for N=117649: fft 0: mflops = 184.709 (norm. = 1), norm. avg. (of 13) = 1 fft 1: mflops = 60.2058 (norm. = 0.325949), norm. avg. (of 13) = 0.246991 fft 2: mflops = 38.9857 (norm. = 0.211066), norm. avg. (of 13) = 0.183829 fft 3: mflops = 55.956 (norm. = 0.302941), norm. avg. (of 13) = 0.392208 fft 4: mflops = 55.956 (norm. = 0.302941), norm. avg. (of 13) = 0.415319 fft 5: mflops = -1 (norm. = -0.00541392), norm. avg. (of 7) = 0.669265 fft 6: mflops = -1 (norm. = -0.00541392), norm. avg. (of 7) = 0.553838 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.14995 s, 8 iters, t-(init.)=1.01663 s t(norm)=0.0331999, mflops=150.603 (err=7.4e-16) 1. PDA: elapsed time t=1.53327 s, 8 iters, t-(init.)=1.38328 s t(norm)=0.0451737, mflops=110.684 (err=7.4e-16) 2. PDA (f2c): elapsed time t=1.01663 s, 4 iters, t-(init.)=0.949962 s t(norm)=0.0620458, mflops=80.5856 (err=7.4e-16) 3. Singleton: elapsed time t=1.16662 s, 2 iters, t-(init.)=1.13329 s t(norm)=0.148039, mflops=33.7749 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.23328 s, 2 iters, t-(init.)=1.19995 s t(norm)=0.156747, mflops=31.8985 (err=1.0e-15) 5. Temperton: elapsed time t=1.6166 s, 8 iters, t-(init.)=1.46661 s t(norm)=0.047895, mflops=104.395 (err=2.0e-15) 6. 
Temperton (f2c): elapsed time t=1.78326 s, 8 iters, t-(init.)=1.63327 s t(norm)=0.0533376, mflops=93.7425 (err=7.1e-16) Top mflops for N=216000 = 150.603 Normalized results and averages for N=216000: fft 0: mflops = 150.603 (norm. = 1), norm. avg. (of 14) = 1 fft 1: mflops = 110.684 (norm. = 0.73494), norm. avg. (of 14) = 0.281844 fft 2: mflops = 80.5856 (norm. = 0.535088), norm. avg. (of 14) = 0.208919 fft 3: mflops = 33.7749 (norm. = 0.224265), norm. avg. (of 14) = 0.380212 fft 4: mflops = 31.8985 (norm. = 0.211806), norm. avg. (of 14) = 0.400782 fft 5: mflops = 104.395 (norm. = 0.693182), norm. avg. (of 8) = 0.672254 fft 6: mflops = 93.7425 (norm. = 0.622449), norm. avg. (of 8) = 0.562414 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.26662 s, 8 iters, t-(init.)=1.09996 s t(norm)=0.0317793, mflops=157.335 (err=7.4e-16) 1. PDA: elapsed time t=1.88326 s, 8 iters, t-(init.)=1.7166 s t(norm)=0.049595, mflops=100.817 (err=7.8e-16) 2. PDA (f2c): elapsed time t=1.34995 s, 4 iters, t-(init.)=1.26662 s t(norm)=0.0731888, mflops=68.3165 (err=7.8e-16) 3. Singleton: elapsed time t=1.34995 s, 2 iters, t-(init.)=1.29995 s t(norm)=0.15023, mflops=33.2824 (err=9.4e-16) 4. Singleton (f2c): elapsed time t=1.41661 s, 2 iters, t-(init.)=1.38328 s t(norm)=0.15986, mflops=31.2774 (err=9.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 157.335 Normalized results and averages for N=241920: fft 0: mflops = 157.335 (norm. = 1), norm. avg. (of 15) = 1 fft 1: mflops = 100.817 (norm. = 0.640777), norm. avg. (of 15) = 0.305773 fft 2: mflops = 68.3165 (norm. = 0.434211), norm. avg. (of 15) = 0.223938 fft 3: mflops = 33.2824 (norm. = 0.211538), norm. avg. (of 15) = 0.368967 fft 4: mflops = 31.2774 (norm. = 0.198795), norm. avg. (of 15) = 0.387316 fft 5: mflops = -1 (norm. = -0.00635587), norm. avg. (of 8) = 0.672254 fft 6: mflops = -1 (norm. = -0.00635587), norm. avg. 
(of 8) = 0.562414 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.18329 s, 4 iters, t-(init.)=1.03329 s t(norm)=0.0327682, mflops=152.587 (err=7.0e-16) 1. PDA: elapsed time t=1.5666 s, 4 iters, t-(init.)=1.41661 s t(norm)=0.0449241, mflops=111.299 (err=7.3e-16) 2. PDA (f2c): elapsed time t=1.03329 s, 2 iters, t-(init.)=0.949962 s t(norm)=0.0602512, mflops=82.9859 (err=7.3e-16) 3. Singleton: elapsed time t=1.13329 s, 1 iters, t-(init.)=1.09996 s t(norm)=0.139529, mflops=35.8348 (err=9.3e-16) 4. Singleton (f2c): elapsed time t=1.14995 s, 1 iters, t-(init.)=1.11662 s t(norm)=0.141643, mflops=35.3 (err=9.3e-16) 5. Temperton: elapsed time t=1.46661 s, 4 iters, t-(init.)=1.31661 s t(norm)=0.041753, mflops=119.752 (err=1.4e-07) 6. Temperton (f2c): elapsed time t=1.73326 s, 4 iters, t-(init.)=1.5666 s t(norm)=0.0496808, mflops=100.643 (err=9.6e-16) Top mflops for N=421875 = 152.587 Normalized results and averages for N=421875: fft 0: mflops = 152.587 (norm. = 1), norm. avg. (of 16) = 1 fft 1: mflops = 111.299 (norm. = 0.729412), norm. avg. (of 16) = 0.332251 fft 2: mflops = 82.9859 (norm. = 0.54386), norm. avg. (of 16) = 0.243933 fft 3: mflops = 35.8348 (norm. = 0.234848), norm. avg. (of 16) = 0.360584 fft 4: mflops = 35.3 (norm. = 0.231343), norm. avg. (of 16) = 0.377568 fft 5: mflops = 119.752 (norm. = 0.78481), norm. avg. (of 9) = 0.68476 fft 6: mflops = 100.643 (norm. = 0.659574), norm. avg. (of 9) = 0.57321 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.59994 s, 4 iters, t-(init.)=1.39994 s t(norm)=0.0360421, mflops=138.727 (err=6.6e-16) 1. PDA: elapsed time t=1.11662 s, 2 iters, t-(init.)=1.01663 s t(norm)=0.0523468, mflops=95.5168 (err=6.2e-16) 2. PDA (f2c): elapsed time t=1.41661 s, 2 iters, t-(init.)=1.31661 s t(norm)=0.0677934, mflops=73.7534 (err=6.2e-16) 3. Singleton: elapsed time t=1.48327 s, 1 iters, t-(init.)=1.43328 s t(norm)=0.147601, mflops=33.8751 (err=8.2e-16) 4. 
Singleton (f2c): elapsed time t=1.51661 s, 1 iters, t-(init.)=1.46661 s t(norm)=0.151033, mflops=33.1052 (err=8.2e-16) 5. Temperton: elapsed time t=1.11662 s, 2 iters, t-(init.)=1.03329 s t(norm)=0.053205, mflops=93.9762 (err=1.7e-07) 6. Temperton (f2c): elapsed time t=1.19995 s, 2 iters, t-(init.)=1.09996 s t(norm)=0.0566376, mflops=88.2806 (err=6.6e-16) Top mflops for N=512000 = 138.727 Normalized results and averages for N=512000: fft 0: mflops = 138.727 (norm. = 1), norm. avg. (of 17) = 1 fft 1: mflops = 95.5168 (norm. = 0.688525), norm. avg. (of 17) = 0.353208 fft 2: mflops = 73.7534 (norm. = 0.531646), norm. avg. (of 17) = 0.260858 fft 3: mflops = 33.8751 (norm. = 0.244186), norm. avg. (of 17) = 0.353737 fft 4: mflops = 33.1052 (norm. = 0.238636), norm. avg. (of 17) = 0.369396 fft 5: mflops = 93.9762 (norm. = 0.677419), norm. avg. (of 10) = 0.684026 fft 6: mflops = 88.2806 (norm. = 0.636364), norm. avg. (of 10) = 0.579525 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.6666 s, 4 iters, t-(init.)=1.44994 s t(norm)=0.0318914, mflops=156.782 (err=7.0e-16) 1. PDA: elapsed time t=1.41661 s, 2 iters, t-(init.)=1.29995 s t(norm)=0.0571845, mflops=87.4362 (err=6.6e-16) 2. PDA (f2c): elapsed time t=1.08329 s, 1 iters, t-(init.)=1.03329 s t(norm)=0.0909087, mflops=55.0002 (err=6.6e-16) 3. Singleton: elapsed time t=2.06658 s, 1 iters, t-(init.)=2.01659 s t(norm)=0.177419, mflops=28.1819 (err=8.9e-16) 4. Singleton (f2c): elapsed time t=2.13325 s, 1 iters, t-(init.)=2.08325 s t(norm)=0.183284, mflops=27.2801 (err=8.9e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 156.782 Normalized results and averages for N=592704: fft 0: mflops = 156.782 (norm. = 1), norm. avg. (of 18) = 1 fft 1: mflops = 87.4362 (norm. = 0.557692), norm. avg. (of 18) = 0.364568 fft 2: mflops = 55.0002 (norm. = 0.350806), norm. avg. (of 18) = 0.265855 fft 3: mflops = 28.1819 (norm. 
= 0.179752), norm. avg. (of 18) = 0.344072 fft 4: mflops = 27.2801 (norm. = 0.174), norm. avg. (of 18) = 0.35854 fft 5: mflops = -1 (norm. = -0.00637827), norm. avg. (of 10) = 0.684026 fft 6: mflops = -1 (norm. = -0.00637827), norm. avg. (of 10) = 0.579525 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=1.64993 s, 2 iters, t-(init.)=1.48327 s t(norm)=0.0424329, mflops=117.833 (err=7.7e-16) 1. PDA: elapsed time t=1.46661 s, 1 iters, t-(init.)=1.39994 s t(norm)=0.0800981, mflops=62.4234 (err=6.4e-16) 2. PDA (f2c): elapsed time t=1.74993 s, 1 iters, t-(init.)=1.6666 s t(norm)=0.0953549, mflops=52.4357 (err=6.4e-16) 3. Singleton: elapsed time t=4.0665 s, 1 iters, t-(init.)=3.98317 s t(norm)=0.227898, mflops=21.9396 (err=7.0e-16) 4. Singleton (f2c): elapsed time t=4.1165 s, 1 iters, t-(init.)=4.03317 s t(norm)=0.230759, mflops=21.6676 (err=7.0e-16) 5. Temperton: elapsed time t=1.18329 s, 1 iters, t-(init.)=1.09996 s t(norm)=0.0629342, mflops=79.448 (err=1.6e-07) 6. Temperton (f2c): elapsed time t=1.26662 s, 1 iters, t-(init.)=1.18329 s t(norm)=0.067702, mflops=73.8531 (err=7.5e-16) Top mflops for N=884736 = 117.833 Normalized results and averages for N=884736: fft 0: mflops = 117.833 (norm. = 1), norm. avg. (of 19) = 1 fft 1: mflops = 62.4234 (norm. = 0.529762), norm. avg. (of 19) = 0.373263 fft 2: mflops = 52.4357 (norm. = 0.445), norm. avg. (of 19) = 0.275283 fft 3: mflops = 21.9396 (norm. = 0.186192), norm. avg. (of 19) = 0.335762 fft 4: mflops = 21.6676 (norm. = 0.183884), norm. avg. (of 19) = 0.349348 fft 5: mflops = 79.448 (norm. = 0.674242), norm. avg. (of 11) = 0.683137 fft 6: mflops = 73.8531 (norm. = 0.626761), norm. avg. (of 11) = 0.583819 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=1.81659 s, 2 iters, t-(init.)=1.59994 s t(norm)=0.0343073, mflops=145.742 (err=7.4e-16) 1. PDA: elapsed time t=1.46661 s, 1 iters, t-(init.)=1.34995 s t(norm)=0.0578935, mflops=86.3654 (err=7.8e-16) 2. 
PDA (f2c): elapsed time t=2.18325 s, 1 iters, t-(init.)=2.06658 s t(norm)=0.0886271, mflops=56.4161 (err=7.2e-16) 3. Singleton: elapsed time t=3.76652 s, 1 iters, t-(init.)=3.64985 s t(norm)=0.156527, mflops=31.9434 (err=8.0e-16) 4. Singleton (f2c): elapsed time t=3.81651 s, 1 iters, t-(init.)=3.71652 s t(norm)=0.159386, mflops=31.3704 (err=8.0e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 145.742 Normalized results and averages for N=1157625: fft 0: mflops = 145.742 (norm. = 1), norm. avg. (of 20) = 1 fft 1: mflops = 86.3654 (norm. = 0.592593), norm. avg. (of 20) = 0.384229 fft 2: mflops = 56.4161 (norm. = 0.387097), norm. avg. (of 20) = 0.280874 fft 3: mflops = 31.9434 (norm. = 0.219178), norm. avg. (of 20) = 0.329933 fft 4: mflops = 31.3704 (norm. = 0.215247), norm. avg. (of 20) = 0.342643 fft 5: mflops = -1 (norm. = -0.00686146), norm. avg. (of 11) = 0.683137 fft 6: mflops = -1 (norm. = -0.00686146), norm. avg. (of 11) = 0.583819 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=1.23328 s, 1 iters, t-(init.)=1.09996 s t(norm)=0.0383373, mflops=130.421 (err=5.9e-16) 1. PDA: elapsed time t=1.94992 s, 1 iters, t-(init.)=1.81659 s t(norm)=0.0633146, mflops=78.9707 (err=6.1e-16) 2. PDA (f2c): elapsed time t=2.79989 s, 1 iters, t-(init.)=2.66656 s t(norm)=0.0929389, mflops=53.7988 (err=5.6e-16) 3. Singleton: elapsed time t=5.01647 s, 1 iters, t-(init.)=4.88314 s t(norm)=0.170194, mflops=29.3782 (err=6.1e-16) 4. Singleton (f2c): elapsed time t=5.0498 s, 1 iters, t-(init.)=4.91647 s t(norm)=0.171356, mflops=29.179 (err=6.1e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 130.421 Normalized results and averages for N=1404928: fft 0: mflops = 130.421 (norm. = 1), norm. avg. (of 21) = 1 fft 1: mflops = 78.9707 (norm. = 0.605505), norm. avg. 
(of 21) = 0.394766 fft 2: mflops = 53.7988 (norm. = 0.4125), norm. avg. (of 21) = 0.287142 fft 3: mflops = 29.3782 (norm. = 0.225256), norm. avg. (of 21) = 0.324948 fft 4: mflops = 29.179 (norm. = 0.223729), norm. avg. (of 21) = 0.33698 fft 5: mflops = -1 (norm. = -0.00766746), norm. avg. (of 11) = 0.683137 fft 6: mflops = -1 (norm. = -0.00766746), norm. avg. (of 11) = 0.583819 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=1.36661 s, 1 iters, t-(init.)=1.19995 s t(norm)=0.0335132, mflops=149.195 (err=7.3e-16) 1. PDA: elapsed time t=1.91659 s, 1 iters, t-(init.)=1.74993 s t(norm)=0.0488735, mflops=102.305 (err=7.9e-16) 2. PDA (f2c): elapsed time t=2.53323 s, 1 iters, t-(init.)=2.36657 s t(norm)=0.0660955, mflops=75.6481 (err=7.8e-16) 3. Singleton: elapsed time t=8.61632 s, 1 iters, t-(init.)=8.44966 s t(norm)=0.235989, mflops=21.1874 (err=9.4e-16) 4. Singleton (f2c): elapsed time t=8.86631 s, 1 iters, t-(init.)=8.71632 s t(norm)=0.243436, mflops=20.5392 (err=9.4e-16) 5. Temperton: elapsed time t=2.34991 s, 1 iters, t-(init.)=2.18325 s t(norm)=0.0609755, mflops=82.0002 (err=1.1e-08) 6. Temperton (f2c): elapsed time t=2.64989 s, 1 iters, t-(init.)=2.48323 s t(norm)=0.0693538, mflops=72.0941 (err=6.9e-16) Top mflops for N=1728000 = 149.195 Normalized results and averages for N=1728000: fft 0: mflops = 149.195 (norm. = 1), norm. avg. (of 22) = 1 fft 1: mflops = 102.305 (norm. = 0.685714), norm. avg. (of 22) = 0.407991 fft 2: mflops = 75.6481 (norm. = 0.507042), norm. avg. (of 22) = 0.297137 fft 3: mflops = 21.1874 (norm. = 0.142012), norm. avg. (of 22) = 0.316633 fft 4: mflops = 20.5392 (norm. = 0.137667), norm. avg. (of 22) = 0.327921 fft 5: mflops = 82.0002 (norm. = 0.549618), norm. avg. (of 12) = 0.67201 fft 6: mflops = 72.0941 (norm. = 0.483221), norm. avg. (of 12) = 0.575436 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=2.91655 s, 1 iters, t-(init.)=2.63323 s t(norm)=0.0409982, mflops=121.956 (err=1.2e-15) 1. 
PDA: elapsed time t=4.0665 s, 1 iters, t-(init.)=3.78318 s t(norm)=0.0589025, mflops=84.886 (err=1.2e-15) 2. PDA (f2c): elapsed time t=5.26646 s, 1 iters, t-(init.)=4.98313 s t(norm)=0.0775853, mflops=64.4452 (err=1.2e-15) 3. Singleton: elapsed time t=15.6327 s, 1 iters, t-(init.)=15.3494 s t(norm)=0.238983, mflops=20.922 (err=1.6e-15) 4. Singleton (f2c): elapsed time t=15.6827 s, 1 iters, t-(init.)=15.3994 s t(norm)=0.239762, mflops=20.854 (err=1.6e-15) 5. Temperton: elapsed time t=4.36649 s, 1 iters, t-(init.)=4.08317 s t(norm)=0.0635732, mflops=78.6495 (err=1.8e-07) 6. Temperton (f2c): elapsed time t=4.53315 s, 1 iters, t-(init.)=4.24983 s t(norm)=0.066168, mflops=75.5652 (err=1.2e-15) Top mflops for N=2985984 = 121.956 Normalized results and averages for N=2985984: fft 0: mflops = 121.956 (norm. = 1), norm. avg. (of 23) = 1 fft 1: mflops = 84.886 (norm. = 0.696035), norm. avg. (of 23) = 0.420515 fft 2: mflops = 64.4452 (norm. = 0.528428), norm. avg. (of 23) = 0.307194 fft 3: mflops = 20.922 (norm. = 0.171553), norm. avg. (of 23) = 0.310325 fft 4: mflops = 20.854 (norm. = 0.170996), norm. avg. (of 23) = 0.321098 fft 5: mflops = 78.6495 (norm. = 0.644898), norm. avg. (of 13) = 0.669925 fft 6: mflops = 75.5652 (norm. = 0.619608), norm. avg. 
(of 13) = 0.578834 ------------------------------------------------------ @@@@ bench.1d.p2.dat N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Ooura (C), Ooura (F), Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg, DXML 2, 64.5303, 46.6052, 43.3911, 2.68876, 16.1326, 7.40201, 12.4588, 11.2352, 38.7182, 8.38894, 7.76754, , 20.9724, 20.2958, 67.1115, 76.2631, 100.667, , 27.0611, 10.6639, 11.4395, 42.6556, , , , , 7.06933, 88.3047, 66.2285, , , 8.86156, 8.98815, 52.4309, 34.0092, 8.61878, 7.14967, 15.9284, 10.4862 4, 115.71, 115.71, 52.4309, 10.3143, 33.5558, 15.5351, 42.6556, 28.2773, 43.3911, 32.2652, 16.7779, 48.3978, 88.3047, 62.1403, 189.938, 314.585, 324.733, , 75.1249, 21.6955, 22.6728, 108.244, 81.1833, 86.7822, 77.4364, 11.9842, 13.9816, 100.667, 129.061, 9.98684, 81.1833, 30.6913, 31.8567, 62.9171, 45.7579, 31.4585, 23.7423, 17.477, 40.5917 8, 209.724, 184.148, 57.1973, 14.2993, 71.2269, 20.9724, 69.9079, 38.9178, 45.4822, 84.832, 50.3337, 43.8956, 114.395, 93.2105, 339.328, 359.526, 511.868, 173.564, 102.028, 36.2983, 37.7502, 209.724, 112.687, 114.395, 107.858, 26.2154, 20.5164, 201.335, 196.105, 14.2993, 123.771, 33.4073, 32.8263, 99.3428, 44.9408, 60.8875, 41.4838, 18.8751, 104.862 16, 109.421, 94.9692, 66.2285, 25.6804, 105.966, 23.5204, 94.0816, 59.921, 49.3467, 136.037, 109.421, 44.9408, 239.684, 159.789, 473.729, 462.838, 592.161, 226.219, 162.367, 55.3117, 60.643, 272.074, 96.7955, 122.765, 115.71, 45.7579, 28.5987, 226.219, 214.186, 38.7182, 170.623, 85.3113, 89.0861, 129.061, 50.8421, 91.5157, 72.9473, 19.6616, 176.609 32, 132.457, 114.395, 73.1594, 29.4005, 132.457, 24.1989, 136.776, 75.8037, 55.1904, 174.77, 222.715, 49.154, 
190.658, 131.077, 680.185, 729.473, 565.547, 310.702, 153.456, 71.4967, 83.8894, 251.668, 113.364, 141.387, 136.776, 74.9013, 33.8264, 262.154, 246.734, 41.3928, 216.955, 112.352, 113.364, 170.046, 57.1973, 109.421, 81.7105, 21.5469, 244.338 64, 127.967, 111.03, 82.0658, 41.0329, 149.506, 23.5939, 160.639, 98.0526, 61.8856, 239.684, 272.074, 52.4309, 287.621, 169.664, 629.171, 450.749, 382.281, 392.21, 196.105, 80.3197, 98.0526, 247.543, 109.421, 142.454, 138.533, 102.028, 40.5917, 328.263, 314.585, 73.3014, 262.61, 162.367, 160.639, 212.677, 62.9171, 155.671, 123.771, 22.206, 314.585 128, 139.816, 120.663, 86.3568, 37.9672, 187.413, 23.6785, 171.037, 106.125, 64.7676, 284.142, 400.381, 56.464, 323.244, 185.44, 577.599, 577.599, 469.781, 387.182, 209.724, 88.0839, 107.419, 225.856, 124.062, 160.153, 155.901, 122.339, 40.7796, 326.237, 314.585, 72.1999, 284.142, 149.295, 146.807, 255.316, 67.7569, 161.622, 122.339, 22.2434, 382.974 256, 143.81, 121.286, 91.5157, 43.0202, 172.081, 23.3026, 203.368, 119.842, 69.9079, 366.063, 509.708, 62.1403, 387.182, 216.489, 610.105, 619.491, 447.41, 428.372, 248.561, 94.9692, 115.71, 193.591, 125.834, 165.028, 159.789, 139.816, 44.5431, 369.421, 359.526, 98.6935, 287.621, 203.368, 203.368, 272.074, 71.9052, 181.383, 143.81, 22.4704, 479.368 512, 155.138, 136.447, 97.6299, 43.558, 190.337, 23.5939, 217.79, 102.028, 71.6777, 419.447, 397.371, 64.347, 294.158, 161.787, 471.878, 453.003, 371.314, 453.003, 200.444, 100.222, 125.834, 179.763, 138.111, 176.954, 171.592, 151.001, 39.3232, 353.909, 343.184, 96.7955, 260.347, 204.055, 202.233, 290.387, 73.5394, 147.079, 121.775, 22.1193, 503.337 1024, 142.993, 133.866, 87.3848, 48.3978, 151.607, 22.1539, 196.616, 105.743, 69.9079, 340.092, 340.092, 64.2011, 331.143, 177.231, 535.464, 479.368, 375.624, 419.447, 209.724, 88.6156, 103.143, 116.513, 128.402, 161.326, 157.293, 136.776, 37.9019, 344.751, 306.913, 115.444, 256.804, 206.285, 185.05, 241.989, 65.5386, 157.293, 146.319, 20.1657, 
437.684 2048, 150.454, 139.816, 93.5254, 46.7627, 148.836, 22.1823, 216.277, 104.862, 71.3493, 300.908, 364.257, 67.193, 226.914, 153.797, 401.21, 401.21, 288.37, 432.555, 192.247, 89.8815, 108.139, 109.855, 125.834, 153.797, 152.107, 122.493, 34.9539, 297.672, 271.407, 106.475, 189.613, 177.458, 160.951, 240.726, 67.193, 144.185, 134.386, 19.2247, 318.201 4096, 136.037, 123.771, 89.8815, 49.6714, 67.4112, 21.6955, 209.724, 106.339, 66.2285, 282.245, 314.585, 66.2285, 149.506, 125.834, 290.387, 269.645, 169.664, 413.701, 142.454, 89.8815, 112.687, 100.667, 121.775, 149.506, 146.603, 116.155, 30.9428, 339.328, 290.387, 125.834, 66.8146, 204.055, 177.648, 225.375, 61.8856, 136.037, 126.892, 18.1492, 198.686 8192, 88.9046, 81.7922, 65.9615, 45.9507, 54.5281, 21.3001, 152.883, 86.0971, 52.4309, 263.846, 277.262, 52.4309, 100.978, 89.8815, 221.06, 192.452, 141.021, 230.401, 106.224, 64.9144, 76.4413, , 125.834, 151.467, 134.086, 72.3825, 28.011, 204.48, 197.09, 99.7466, 50.489, 129.829, 116.846, 122.078, 48.6858, 98.5448, 86.0971, 17.04, 134.086 16384, 50.6229, 46.8531, 40.7796, 46.8531, 46.36, 20.3898, 110.105, 71.0354, 34.4078, 228.789, 225.856, 35.5177, 93.7063, 89.8815, 176.168, 228.789, 129.535, 129.535, 100.095, 46.8531, 53.7097, , 78.6463, 85.5184, 77.2666, 43.6059, 28.5987, 151.869, 146.807, 112.928, 44.9408, 97.871, 93.7063, 69.9079, 31.0155, 80.8109, 77.9504, 15.9572, 174.424 32768, 45.8134, 44.1008, 35.7483, 39.3232, 44.5168, 19.6616, 100.4, 63.7673, 29.8657, 200.799, 198.686, 31.0446, 85.0231, 78.6463, 179.763, 179.763, 116.513, 120.994, 99.3428, 42.132, 48.1508, , 62.0892, 66.4617, 62.9171, 38.6785, 27.1194, 140.859, 132.923, 93.4412, 31.0446, 72.5966, 70.4296, 58.9848, 28.088, 67.4112, 64.6408, 14.7462, 162.717 65536, 34.4751, 33.5558, 26.7732, 44.5431, 28.5987, 19.3591, 81.1833, 54.7105, 24.6734, 183.031, 186.421, 25.421, 62.9171, 62.9171, 129.061, 129.061, 83.8894, 94.9692, 71.9052, 31.8567, 37.01, , 51.3609, 52.4309, 49.8353, 31.4585, 23.3026, 118.432, 
114.395, 91.5157, 19.0658, 67.1115, 62.9171, 44.1523, 22.6728, 55.3117, 52.9828, 13.6776, 118.432 131072, 27.8539, 27.8539, 20.569, 32.6095, 24.0899, 17.5919, 60.0893, 39.3232, 19.955, 150.647, 148.554, 20.2574, 43.8357, 43.1286, 115.01, 104.862, 67.6956, 67.6956, 51.4226, 24.759, 27.0099, , 40.5148, 41.1381, 39.3232, 24.9904, 18.0674, 93.8237, 89.1325, 69.4539, 15.3677, 45.3216, 43.8357, 41.7809, 19.6616, 41.1381, 38.7533, 11.3304, 105.9 262144, 19.3922, 18.8751, 15.5564, 33.7056, 22.4704, 16.6545, 50.1109, 30.1199, 14.9014, 92.8285, 94.3756, 15.0599, 43.558, 42.2577, 97.6299, 102.955, 59.6057, 53.4202, 49.6714, 19.1302, 20.2233, , 32.5433, 33.309, 31.4585, 17.9194, 17.2638, 76.5208, 76.5208, 77.569, 12.9875, 38.2604, 36.2983, 30.7747, 15.0599, 34.9539, 34.1117, 10.1844, 112.129 Norm. Avg., 0.278655, 0.24853, 0.174875, 0.116551, 0.217009, 0.0653516, 0.346585, 0.204122, 0.142964, 0.604955, 0.638734, 0.125102, 0.406367, 0.299237, 0.83265, 0.84429, 0.695578, 0.640656, 0.351341, 0.156266, 0.182037, 0.345021, 0.260105, 0.302011, 0.286139, 0.183874, 0.0844189, 0.605749, 0.569443, 0.26484, 0.288288, 0.296917, 0.282109, 0.363186, 0.137506, 0.255555, 0.225137, 0.0575395, 0.579553 ------------------------------------------------------ @@@@ bench.1d.np2.dat N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg, DXML 6, , 43.5638, 32.0997, 75.0638, 64.1993, 309.787, 464.681, 34.3602, 81.3191, 21.0308, 22.5887, 24.6422, 20.8511, 25.9529, 17.938, 36.9632 9, 14.3829, 89.7491, 59.0455, 116.557, 92.5248, 304.234, 381.911, 32.0532, 76.0585, 30.736, 41.9388, 43.5675, 21.3688, 48.7767, 18.8549, 60.6413 12, , 109.14, 89.035, 163.052, 123.03, 342.616, 392.27, 56.3888, 116.667, 28.1944, 46.9907, 45.7207, 68.3501, 57.8347, 18.5897, 91.4414 15, 20.9497, 122.905, 121.288, 170.701, 133.592, 283.627, 323.434, 39.3926, 75.5564, 21.1419, 49.5585, 51.2104, 83.044, 
52.3743, 18.0037, 107.185 18, 18.9202, 129.738, 124.276, 135.703, 95.2111, 265.307, 240.942, 37.3613, 142.243, 36.4388, 63.4741, 62.1378, 27.8448, 63.4741, 17.7804, 107.329 24, , 203.628, 180.295, 164.841, 121.89, 323.52, 293.362, 75.2537, 196.686, 36.3621, 56.9353, 56.1959, 108.177, 84.0211, 18.9784, 166.426 36, 25.7702, 248.092, 248.092, 228.71, 135.532, 375.319, 314.784, 47.5242, 228.71, 43.5638, 92.6421, 93.8298, 65.3457, 114.355, 19.2598, 209.106 80, 44.7925, 315.68, 348.91, 315.68, 187.621, 414.33, 427.696, 96.543, 134.377, 28.2498, 152.984, 150.666, 200.887, 155.374, 19.1229, 320.772 108, 28.4597, 273.213, 345.631, 301.972, 154.233, 387.667, 392.977, 44.2706, 224.12, 52.7341, 105.468, 98.2443, 129.222, 155.91, 20.3745, 318.748 210, 36.5271, 398.145, 393.23, 206.829, 92.5919, 338.847, 269.929, 45.2438, 150.243, 23.4203, 93.6812, 90.4875, , , 17.7743, 75.8372 504, 36.5789, 617.776, 617.776, 193.391, 84.2422, 364.589, 397.142, 52.4527, 236.595, 29.5744, 107.961, 101.091, , , 19.0411, 104.905 1000, 37.7977, 318.089, 453.572, 272.143, 163.286, 349.898, 354.969, 61.2322, 98.7617, 19.8806, 142.401, 133.114, 191.351, 136.072, 15.9459, 415.134 1960, 40.1545, 424.861, 431.826, 154.949, 63.9354, 313.588, 263.414, 53.9782, 147.985, 17.3299, 114.528, 107.956, , , 13.952, 56.7702 4725, 29.5302, 308.141, 421.86, 128.392, 78.3987, 236.241, 199.08, 35.722, 113.578, 19.9528, 110.738, 100.671, , , 14.1972, 96.294 10368, 27.9538, 272.37, 262.283, 137.954, 96.5677, 241.419, 262.283, 49.178, 158.544, 33.1951, 92.3691, 85.6649, 106.224, 99.2752, 15.8072, 147.534 27000, 26.1353, 258.695, 258.695, 137.505, 105.993, 171.494, 171.494, 37.7797, 100.414, 21.4368, 78.6753, 74.8186, 90.8512, 75.5594, 14.4536, 85.7472 75600, 19.3457, 178.215, 175.555, 57.6579, 48.2058, 140.026, 140.026, 32.6728, 70.0131, 17.7142, 55.4821, 53.4646, , , 11.6689, 19.6037 165375, 12.465, 136.251, 136.251, 25.2966, 23.564, 116.622, 109.217, 21.5022, 55.4894, 12.6483, 42.4734, 40.9565, , , 10.7511, 19.113 
362880, 12.1861, 60.021, 59.1384, 33.5117, 30.4652, 125.669, 113.279, 25.7783, 60.9304, 15.1181, 28.3198, 27.9264, , , 9.48445, 60.021 Norm. Avg., 0.0823438, 0.689103, 0.716525, 0.445087, 0.291997, 0.834665, 0.839663, 0.140944, 0.388861, 0.0833821, 0.234893, 0.226408, 0.258904, 0.251097, 0.0513119, 0.372439 ------------------------------------------------------ @@@@ bench.3d.p2.dat Array Dimensions, FFTW, HARM, HARM (f2c), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c) 4x4x4, 335.558, , , 51.7127, 39.3232, 134.822, 184.148, 188.751, 123.771 8x8x8, 588.316, 235.939, 174.232, 79.754, 66.6181, 139.816, 119.211, 348.464, 185.657 16x16x16, 511.868, 239.684, 186.421, 154.083, 103.425, 175.583, 167.779, 339.328, 239.684 32x32x32, 241.989, 136.776, 122.566, 83.5182, 67.4112, 71.4967, 67.4112, 124.178, 109.739 64x64x64, 159.508, 80.8934, 75.5005, 70.7817, 59.6057, 31.812, 31.4585, 94.3756, 91.3312 256x64x32, 125.834, 73.7916, 70.3191, 55.3437, 48.2026, 26.924, 25.9875, 83.0156, 81.8784 16x1024x64, 138.279, 74.9013, 71.4967, 58.2566, 50.7396, 28.5987, 27.9631, , 128x128x128, 127.044, 69.5399, 66.3949, 61.1694, 52.8503, 20.7419, 19.9888, 72.9977, 66.3949 Norm. 
Avg., 1, 0.51673, 0.462666, 0.340271, 0.281342, 0.257675, 0.265133, 0.593837, 0.478864 ------------------------------------------------------ @@@@ bench.3d.np2.dat Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c) 5x5x5, 326.091, 60.281, 46.5212, 237.775, 326.091, 241.124, 171.198 6x6x6, 477.308, 53.4647, 45.2394, 115.966, 119.327, 161.442, 155.35 7x7x7, 359.48, 24.6518, 24.3141, 137.859, 147.911, , 9x9x9, 396.24, 81.9149, 62.6408, 137.406, 135.225, 195.843, 227.177 10x10x10, 445.325, 94.2034, 75.5953, 145.791, 147.548, 242.504, 197.523 11x11x11, 290.166, 24.1118, 24.1118, 158.642, 174.997, , 12x12x12, 397.174, 115.341, 80.4137, 132.776, 116.518, 351.346, 262.5 13x13x13, 274.965, 22.8439, 23.415, 144.092, 154.49, , 14x14x14, 397.059, 48.6298, 40.1196, 106.985, 103.534, , 15x15x15, 357.426, 122.752, 90.6901, 135.028, 127.921, 347.214, 240.643 24x25x28, 194.739, 111.795, 76.7405, 86.2417, 80.1361, , 48x48x48, 171.049, 103.425, 72.9061, 46.3258, 46.8134, 121.843, 115.514 49x49x49, 184.709, 60.2058, 38.9857, 55.956, 55.956, , 60x60x60, 150.603, 110.684, 80.5856, 33.7749, 31.8985, 104.395, 93.7425 72x60x56, 157.335, 100.817, 68.3165, 33.2824, 31.2774, , 75x75x75, 152.587, 111.299, 82.9859, 35.8348, 35.3, 119.752, 100.643 80x80x80, 138.727, 95.5168, 73.7534, 33.8751, 33.1052, 93.9762, 88.2806 84x84x84, 156.782, 87.4362, 55.0002, 28.1819, 27.2801, , 96x96x96, 117.833, 62.4234, 52.4357, 21.9396, 21.6676, 79.448, 73.8531 105x105x105, 145.742, 86.3654, 56.4161, 31.9434, 31.3704, , 112x112x112, 130.421, 78.9707, 53.7988, 29.3782, 29.179, , 120x120x120, 149.195, 102.305, 75.6481, 21.1874, 20.5392, 82.0002, 72.0941 144x144x144, 121.956, 84.886, 64.4452, 20.922, 20.854, 78.6495, 75.5652 Norm. Avg., 1, 0.420515, 0.307194, 0.310325, 0.321098, 0.669925, 0.578834 @@@@ end