To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Steven G. Johnson
@ submitter email = stevenj@alum.mit.edu
@ submitter organization = MIT
@ computer manufacturer = Apple
@ computer model = Power Macintosh 9500/120
@ CPU manufacturer = Motorola
@ CPU model = PowerPC 604
@ CPU speed = 120 MHz
@ RAM = 64 MB
@ L2 cache size = 256 kB
@ operating system = MacOS 8
@ C compiler = Metrowerks Codewarrior Pro 2
@ C compiler flags = all
@ Fortran compiler = NONE
@ Fortran compiler flags = NONE
@ remarks = Linked with Motorola LIBMOTO library.
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
     2 (0.000228882 MB)
     4 (0.000534058 MB)
     8 (0.000839233 MB)
     16 (0.00164795 MB)
     32 (0.00297546 MB)
     64 (0.00616455 MB)
     128 (0.0119019 MB)
     256 (0.0238037 MB)
     512 (0.0476074 MB)
     1024 (0.0939941 MB)
     2048 (0.189575 MB)
     4096 (0.37915 MB)
     8192 (0.765991 MB)
     16384 (1.51184 MB)
     32768 (3.02368 MB)
     65536 (6.09973 MB)
     131072 (12.1995 MB)
     262144 (25.4987 MB)
Maximum array size = 360360

Benchmarking FFTs:
     0. Arndt DIF
     1. Arndt DIT
     2. Arndt Split-Radix
     3. Arndt 4-step
     4. Beauregard
     5. Bergland
     6. CWP (min N)
     7. CWP (best N)
     8. Edelblute
     9. FFTPACK (f2c)
     10. FFTW
     11. FFTW_ESTIMATE
     12. Frigo-old
     13. Green
     14. GSL
     15. GSL DIT
     16. GSL DIF
     17. Krukar
     18. Mayer (Buneman)
     19. Mayer (simple)
     20. Mayer (lookup)
     21. NAPACK (f2c)
     22. Nielsen
     23. NR (C)
     24. Ooura (C)
     25. QFT
     26. Ransom
     27. Singleton (f2c)
     28. Temperton (f2c)
     29. Valkenburg

Computing normalized averages (30 transforms).

Benchmarking for array size = 2 (power of 2):
0. Arndt DIF: elapsed time t=1.77809 s, 2097152 iters, t-(init.)=1.49397 s t(norm)=0.356191, mflops=14.0374 (err=1.7e-17)
1. Arndt DIT: elapsed time t=1.77774 s, 2097152 iters, t-(init.)=1.49342 s t(norm)=0.356059, mflops=14.0426 (err=1.7e-17)
2. 
Arndt Split-Radix: elapsed time t=1.43154 s, 1048576 iters, t-(init.)=1.28929 s t(norm)=0.614784, mflops=8.13294 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.15709 s, 65536 iters, t-(init.)=1.14799 s t(norm)=8.75845, mflops=0.570877 (err=1.7e-17) 4. Beauregard: elapsed time t=1.37642 s, 262144 iters, t-(init.)=1.34082 s t(norm)=2.55742, mflops=1.95509 (err=1.7e-17) 5. Bergland: elapsed time t=1.89353 s, 524288 iters, t-(init.)=1.82238 s t(norm)=1.73795, mflops=2.87695 (err=1.7e-17) 6. CWP (min N): elapsed time t=1.36917 s, 262144 iters, t-(init.)=1.33359 s t(norm)=2.54362, mflops=1.9657 7. CWP (best N) (N=3): elapsed time t=1.44884 s, 262144 iters, t-(init.)=1.40445 s t(norm)=2.67878, mflops=1.86652 8. Skipping fft (Edelblute can't handle N <= 2). 9. FFTPACK (f2c): elapsed time t=1.8636 s, 524288 iters, t-(init.)=1.79239 s t(norm)=1.70936, mflops=2.92507 (err=1.7e-17) FFTW_MEASURE plan: (cost = 8.133087e-07) FFTW_NOTW 2 10. FFTW: elapsed time t=1.81409 s, 2097152 iters, t-(init.)=1.52989 s t(norm)=0.364755, mflops=13.7078 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 11. FFTW_ESTIMATE: elapsed time t=1.79634 s, 2097152 iters, t-(init.)=1.51196 s t(norm)=0.360479, mflops=13.8704 (err=1.7e-17) 12. Frigo-old: elapsed time t=1.12003 s, 2097152 iters, t-(init.)=0.835554 s t(norm)=0.199212, mflops=25.0989 (err=1.7e-17) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=1.07635 s, 524288 iters, t-(init.)=1.00534 s t(norm)=0.958771, mflops=5.21501 (err=1.7e-17) 15. GSL DIT: elapsed time t=1.49196 s, 524288 iters, t-(init.)=1.42084 s t(norm)=1.35502, mflops=3.68999 (err=1.7e-17) 16. GSL DIF: elapsed time t=1.57335 s, 524288 iters, t-(init.)=1.50224 s t(norm)=1.43265, mflops=3.49003 (err=1.7e-17) 17. Krukar: elapsed time t=1.01368 s, 1048576 iters, t-(init.)=0.862657 s t(norm)=0.411347, mflops=12.1552 (err=1.7e-17) 18. Skipping fft (Mayer can't handle N <= 2). 19. Skipping fft (Mayer can't handle N <= 2). 20. 
Skipping fft (Mayer can't handle N <= 2). 21. NAPACK (f2c): elapsed time t=1.01677 s, 131072 iters, t-(init.)=0.998777 s t(norm)=3.81003, mflops=1.31232 (err=1.7e-17) 22. Nielsen: elapsed time t=1.32483 s, 131072 iters, t-(init.)=1.30595 s t(norm)=4.98182, mflops=1.00365 (err=1.7e-17) 23. NR (C): elapsed time t=1.31881 s, 524288 iters, t-(init.)=1.24768 s t(norm)=1.18988, mflops=4.20211 (err=1.7e-17) 24. Ooura (C): elapsed time t=1.83095 s, 2097152 iters, t-(init.)=1.54645 s t(norm)=0.368703, mflops=13.5611 (err=1.7e-17) 25. Skipping fft (QFT requires N >= 16). 26. Skipping fft (Ransom doesn't work for N=2). 27. Singleton (f2c): elapsed time t=1.92041 s, 524288 iters, t-(init.)=1.84908 s t(norm)=1.76342, mflops=2.8354 (err=1.7e-17) 28. Temperton (f2c): elapsed time t=1.01812 s, 131072 iters, t-(init.)=1.00023 s t(norm)=3.81558, mflops=1.31042 (err=1.7e-17) 29. Valkenburg: elapsed time t=1.5524 s, 524288 iters, t-(init.)=1.48136 s t(norm)=1.41274, mflops=3.53923 (err=1.7e-17) Top mflops for N=2 = 25.0989 Normalized results and averages for N=2: fft 0: mflops = 14.0374 (norm. = 0.559282), norm. avg. (of 1) = 0.559282 fft 1: mflops = 14.0426 (norm. = 0.559491), norm. avg. (of 1) = 0.559491 fft 2: mflops = 8.13294 (norm. = 0.324035), norm. avg. (of 1) = 0.324035 fft 3: mflops = 0.570877 (norm. = 0.0227451), norm. avg. (of 1) = 0.0227451 fft 4: mflops = 1.95509 (norm. = 0.0778955), norm. avg. (of 1) = 0.0778955 fft 5: mflops = 2.87695 (norm. = 0.114624), norm. avg. (of 1) = 0.114624 fft 6: mflops = 1.9657 (norm. = 0.0783181), norm. avg. (of 1) = 0.0783181 fft 7: mflops = 1.86652 (norm. = 0.0743664), norm. avg. (of 1) = 0.0743664 fft 8: mflops = -1 (norm. = -0.0398423), norm. avg. (of 0) = -1 fft 9: mflops = 2.92507 (norm. = 0.116542), norm. avg. (of 1) = 0.116542 fft 10: mflops = 13.7078 (norm. = 0.546152), norm. avg. (of 1) = 0.546152 fft 11: mflops = 13.8704 (norm. = 0.552631), norm. avg. (of 1) = 0.552631 fft 12: mflops = 25.0989 (norm. = 1), norm. avg. 
(of 1) = 1 fft 13: mflops = -1 (norm. = -0.0398423), norm. avg. (of 0) = -1 fft 14: mflops = 5.21501 (norm. = 0.207778), norm. avg. (of 1) = 0.207778 fft 15: mflops = 3.68999 (norm. = 0.147018), norm. avg. (of 1) = 0.147018 fft 16: mflops = 3.49003 (norm. = 0.139051), norm. avg. (of 1) = 0.139051 fft 17: mflops = 12.1552 (norm. = 0.484291), norm. avg. (of 1) = 0.484291 fft 18: mflops = -1 (norm. = -0.0398423), norm. avg. (of 0) = -1 fft 19: mflops = -1 (norm. = -0.0398423), norm. avg. (of 0) = -1 fft 20: mflops = -1 (norm. = -0.0398423), norm. avg. (of 0) = -1 fft 21: mflops = 1.31232 (norm. = 0.0522861), norm. avg. (of 1) = 0.0522861 fft 22: mflops = 1.00365 (norm. = 0.0399877), norm. avg. (of 1) = 0.0399877 fft 23: mflops = 4.20211 (norm. = 0.167422), norm. avg. (of 1) = 0.167422 fft 24: mflops = 13.5611 (norm. = 0.540304), norm. avg. (of 1) = 0.540304 fft 25: mflops = -1 (norm. = -0.0398423), norm. avg. (of 0) = -1 fft 26: mflops = -1 (norm. = -0.0398423), norm. avg. (of 0) = -1 fft 27: mflops = 2.8354 (norm. = 0.112969), norm. avg. (of 1) = 0.112969 fft 28: mflops = 1.31042 (norm. = 0.05221), norm. avg. (of 1) = 0.05221 fft 29: mflops = 3.53923 (norm. = 0.141011), norm. avg. (of 1) = 0.141011 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.61855 s, 1048576 iters, t-(init.)=1.43162 s t(norm)=0.170663, mflops=29.2975 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.64529 s, 1048576 iters, t-(init.)=1.45864 s t(norm)=0.173884, mflops=28.7548 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.73419 s, 524288 iters, t-(init.)=1.64097 s t(norm)=0.391237, mflops=12.78 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.02073 s, 65536 iters, t-(init.)=1.00909 s t(norm)=1.92469, mflops=2.59783 (err=1.3e-16) 4. Beauregard: elapsed time t=1.16937 s, 131072 iters, t-(init.)=1.14595 s t(norm)=1.09286, mflops=4.57513 (err=5.3e-17) 5. Bergland: elapsed time t=1.10148 s, 262144 iters, t-(init.)=1.05483 s t(norm)=0.502984, mflops=9.94067 (err=5.3e-17) 6. 
CWP (min N): elapsed time t=1.44211 s, 262144 iters, t-(init.)=1.39535 s t(norm)=0.665357, mflops=7.51477 7. CWP (best N) (N=15): elapsed time t=1.51759 s, 131072 iters, t-(init.)=1.45745 s t(norm)=1.38993, mflops=3.59729 8. Edelblute: elapsed time t=1.87106 s, 524288 iters, t-(init.)=1.77782 s t(norm)=0.423865, mflops=11.7962 (err=1.3e-16) 9. FFTPACK (f2c): elapsed time t=1.48522 s, 262144 iters, t-(init.)=1.43853 s t(norm)=0.685943, mflops=7.28923 (err=5.3e-17) FFTW_MEASURE plan: (cost = 1.083771e-06) FFTW_NOTW 4 10. FFTW: elapsed time t=1.1738 s, 1048576 iters, t-(init.)=0.987561 s t(norm)=0.117726, mflops=42.4713 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 11. FFTW_ESTIMATE: elapsed time t=1.16491 s, 1048576 iters, t-(init.)=0.978083 s t(norm)=0.116597, mflops=42.8829 (err=5.3e-17) 12. Frigo-old: elapsed time t=1.65354 s, 2097152 iters, t-(init.)=1.28001 s t(norm)=0.0762948, mflops=65.5353 (err=5.3e-17) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=1.61524 s, 524288 iters, t-(init.)=1.52184 s t(norm)=0.362834, mflops=13.7804 (err=5.3e-17) 15. GSL DIT: elapsed time t=1.44331 s, 262144 iters, t-(init.)=1.39668 s t(norm)=0.66599, mflops=7.50762 (err=6.4e-17) 16. GSL DIF: elapsed time t=1.49821 s, 262144 iters, t-(init.)=1.45157 s t(norm)=0.692163, mflops=7.22374 (err=6.4e-17) 17. Krukar: elapsed time t=1.45924 s, 1048576 iters, t-(init.)=1.2635 s t(norm)=0.150621, mflops=33.1958 (err=5.3e-17) 18. Mayer (Buneman): elapsed time t=1.52166 s, 524288 iters, t-(init.)=1.42824 s t(norm)=0.34052, mflops=14.6834 (err=1.3e-16) 19. Mayer (simple): elapsed time t=1.49112 s, 524288 iters, t-(init.)=1.39778 s t(norm)=0.333256, mflops=15.0035 20. Mayer (lookup): elapsed time t=1.62764 s, 524288 iters, t-(init.)=1.53429 s t(norm)=0.365804, mflops=13.6685 (err=1.3e-16) 21. NAPACK (f2c): elapsed time t=1.82115 s, 131072 iters, t-(init.)=1.79779 s t(norm)=1.71451, mflops=2.91629 (err=5.3e-17) 22. 
Nielsen: elapsed time t=1.47092 s, 131072 iters, t-(init.)=1.44647 s t(norm)=1.37946, mflops=3.6246 (err=1.3e-16) 23. NR (C): elapsed time t=1.39913 s, 262144 iters, t-(init.)=1.35244 s t(norm)=0.644892, mflops=7.75324 (err=6.4e-17) 24. Ooura (C): elapsed time t=1.95018 s, 1048576 iters, t-(init.)=1.76372 s t(norm)=0.210252, mflops=23.781 (err=5.3e-17) 25. Skipping fft (QFT requires N >= 16). 26. Ransom: elapsed time t=1.23208 s, 65536 iters, t-(init.)=1.22057 s t(norm)=2.32806, mflops=2.14771 (err=2.4e-16) 27. Singleton (f2c): elapsed time t=1.10979 s, 262144 iters, t-(init.)=1.06327 s t(norm)=0.507006, mflops=9.86182 (err=5.3e-17) 28. Temperton (f2c): elapsed time t=1.31488 s, 131072 iters, t-(init.)=1.29164 s t(norm)=1.2318, mflops=4.0591 (err=5.3e-17) 29. Valkenburg: elapsed time t=1.40547 s, 131072 iters, t-(init.)=1.38213 s t(norm)=1.3181, mflops=3.79334 (err=5.3e-17) Top mflops for N=4 = 65.5353 Normalized results and averages for N=4: fft 0: mflops = 29.2975 (norm. = 0.447049), norm. avg. (of 2) = 0.503166 fft 1: mflops = 28.7548 (norm. = 0.438768), norm. avg. (of 2) = 0.49913 fft 2: mflops = 12.78 (norm. = 0.195009), norm. avg. (of 2) = 0.259522 fft 3: mflops = 2.59783 (norm. = 0.0396401), norm. avg. (of 2) = 0.0311926 fft 4: mflops = 4.57513 (norm. = 0.0698118), norm. avg. (of 2) = 0.0738536 fft 5: mflops = 9.94067 (norm. = 0.151684), norm. avg. (of 2) = 0.133154 fft 6: mflops = 7.51477 (norm. = 0.114667), norm. avg. (of 2) = 0.0964928 fft 7: mflops = 3.59729 (norm. = 0.0548909), norm. avg. (of 2) = 0.0646287 fft 8: mflops = 11.7962 (norm. = 0.179998), norm. avg. (of 1) = 0.179998 fft 9: mflops = 7.28923 (norm. = 0.111226), norm. avg. (of 2) = 0.113884 fft 10: mflops = 42.4713 (norm. = 0.648068), norm. avg. (of 2) = 0.59711 fft 11: mflops = 42.8829 (norm. = 0.654348), norm. avg. (of 2) = 0.60349 fft 12: mflops = 65.5353 (norm. = 1), norm. avg. (of 2) = 1 fft 13: mflops = -1 (norm. = -0.015259), norm. avg. (of 0) = -1 fft 14: mflops = 13.7804 (norm. 
= 0.210274), norm. avg. (of 2) = 0.209026 fft 15: mflops = 7.50762 (norm. = 0.114558), norm. avg. (of 2) = 0.130788 fft 16: mflops = 7.22374 (norm. = 0.110227), norm. avg. (of 2) = 0.124639 fft 17: mflops = 33.1958 (norm. = 0.506534), norm. avg. (of 2) = 0.495412 fft 18: mflops = 14.6834 (norm. = 0.224054), norm. avg. (of 1) = 0.224054 fft 19: mflops = 15.0035 (norm. = 0.228937), norm. avg. (of 1) = 0.228937 fft 20: mflops = 13.6685 (norm. = 0.208567), norm. avg. (of 1) = 0.208567 fft 21: mflops = 2.91629 (norm. = 0.0444995), norm. avg. (of 2) = 0.0483928 fft 22: mflops = 3.6246 (norm. = 0.0553076), norm. avg. (of 2) = 0.0476476 fft 23: mflops = 7.75324 (norm. = 0.118306), norm. avg. (of 2) = 0.142864 fft 24: mflops = 23.781 (norm. = 0.362873), norm. avg. (of 2) = 0.451588 fft 25: mflops = -1 (norm. = -0.015259), norm. avg. (of 0) = -1 fft 26: mflops = 2.14771 (norm. = 0.0327719), norm. avg. (of 1) = 0.0327719 fft 27: mflops = 9.86182 (norm. = 0.150481), norm. avg. (of 2) = 0.131725 fft 28: mflops = 4.0591 (norm. = 0.0619376), norm. avg. (of 2) = 0.0570738 fft 29: mflops = 3.79334 (norm. = 0.0578824), norm. avg. (of 2) = 0.0994467 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.62644 s, 524288 iters, t-(init.)=1.48877 s t(norm)=0.118317, mflops=42.2593 (err=1.1e-16) 1. Arndt DIT: elapsed time t=1.61591 s, 524288 iters, t-(init.)=1.47793 s t(norm)=0.117455, mflops=42.5694 (err=1.1e-16) 2. Arndt Split-Radix: elapsed time t=1.00775 s, 131072 iters, t-(init.)=0.973141 s t(norm)=0.309353, mflops=16.1628 (err=7.7e-17) 3. Arndt 4-step: elapsed time t=1.13321 s, 32768 iters, t-(init.)=1.12458 s t(norm)=1.42997, mflops=3.49657 (err=9.0e-17) 4. Beauregard: elapsed time t=1.23619 s, 65536 iters, t-(init.)=1.21896 s t(norm)=0.774992, mflops=6.45168 (err=1.5e-16) 5. Bergland: elapsed time t=1.04937 s, 131072 iters, t-(init.)=1.01472 s t(norm)=0.322572, mflops=15.5004 (err=1.6e-16) 6. 
CWP (min N): elapsed time t=1.78833 s, 262144 iters, t-(init.)=1.71928 s t(norm)=0.273273, mflops=18.2968 7. CWP (best N) (N=15): elapsed time t=1.51717 s, 131072 iters, t-(init.)=1.45723 s t(norm)=0.463241, mflops=10.7935 8. Edelblute: elapsed time t=1.24733 s, 131072 iters, t-(init.)=1.21283 s t(norm)=0.385548, mflops=12.9686 (err=8.3e-17) 9. FFTPACK (f2c): elapsed time t=1.36691 s, 131072 iters, t-(init.)=1.33224 s t(norm)=0.423509, mflops=11.8061 (err=1.5e-16) FFTW_MEASURE plan: (cost = 1.976288e-06) FFTW_NOTW 8 10. FFTW: elapsed time t=1.05421 s, 524288 iters, t-(init.)=0.916156 s t(norm)=0.0728095, mflops=68.6723 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 11. FFTW_ESTIMATE: elapsed time t=1.05024 s, 524288 iters, t-(init.)=0.912426 s t(norm)=0.0725131, mflops=68.9531 (err=1.4e-16) 12. Frigo-old: elapsed time t=1.68039 s, 1048576 iters, t-(init.)=1.40487 s t(norm)=0.0558246, mflops=89.5662 (err=1.4e-16) 13. Green: elapsed time t=1.04345 s, 262144 iters, t-(init.)=0.972392 s t(norm)=0.154558, mflops=32.3504 (err=1.4e-16) 14. GSL: elapsed time t=1.66431 s, 262144 iters, t-(init.)=1.5955 s t(norm)=0.253597, mflops=19.7163 (err=1.4e-16) 15. GSL DIT: elapsed time t=1.2363 s, 131072 iters, t-(init.)=1.2018 s t(norm)=0.382041, mflops=13.0876 (err=1.5e-16) 16. GSL DIF: elapsed time t=1.27045 s, 131072 iters, t-(init.)=1.23594 s t(norm)=0.392895, mflops=12.726 (err=1.6e-16) 17. Krukar: elapsed time t=1.53504 s, 524288 iters, t-(init.)=1.393 s t(norm)=0.110705, mflops=45.1649 (err=1.5e-16) 18. Mayer (Buneman): elapsed time t=1.38699 s, 262144 iters, t-(init.)=1.3175 s t(norm)=0.209412, mflops=23.8764 (err=1.1e-16) 19. Mayer (simple): elapsed time t=1.37254 s, 262144 iters, t-(init.)=1.30367 s t(norm)=0.207212, mflops=24.1299 20. Mayer (lookup): elapsed time t=1.43554 s, 262144 iters, t-(init.)=1.36663 s t(norm)=0.21722, mflops=23.0182 (err=1.1e-16) 21. 
NAPACK (f2c): elapsed time t=1.82089 s, 65536 iters, t-(init.)=1.80344 s t(norm)=1.1466, mflops=4.36073 (err=1.7e-16) 22. Nielsen: elapsed time t=1.01157 s, 65536 iters, t-(init.)=0.993732 s t(norm)=0.631798, mflops=7.91392 (err=7.5e-16) 23. NR (C): elapsed time t=1.22482 s, 131072 iters, t-(init.)=1.19028 s t(norm)=0.378381, mflops=13.2142 (err=1.6e-16) 24. Ooura (C): elapsed time t=1.57173 s, 524288 iters, t-(init.)=1.43391 s t(norm)=0.113957, mflops=43.8761 (err=1.5e-16) 25. Skipping fft (QFT requires N >= 16). 26. Ransom: elapsed time t=1.55742 s, 32768 iters, t-(init.)=1.54891 s t(norm)=1.96954, mflops=2.53866 (err=3.1e-16) 27. Singleton (f2c): elapsed time t=1.4611 s, 131072 iters, t-(init.)=1.42632 s t(norm)=0.453416, mflops=11.0274 (err=1.4e-16) 28. Temperton (f2c): elapsed time t=1.37185 s, 65536 iters, t-(init.)=1.3546 s t(norm)=0.861231, mflops=5.80564 (err=1.4e-16) 29. Valkenburg: elapsed time t=1.99766 s, 65536 iters, t-(init.)=1.98032 s t(norm)=1.25906, mflops=3.97123 (err=1.4e-16) Top mflops for N=8 = 89.5662 Normalized results and averages for N=8: fft 0: mflops = 42.2593 (norm. = 0.471822), norm. avg. (of 3) = 0.492718 fft 1: mflops = 42.5694 (norm. = 0.475284), norm. avg. (of 3) = 0.491181 fft 2: mflops = 16.1628 (norm. = 0.180456), norm. avg. (of 3) = 0.233167 fft 3: mflops = 3.49657 (norm. = 0.039039), norm. avg. (of 3) = 0.0338081 fft 4: mflops = 6.45168 (norm. = 0.0720325), norm. avg. (of 3) = 0.0732466 fft 5: mflops = 15.5004 (norm. = 0.173061), norm. avg. (of 3) = 0.146457 fft 6: mflops = 18.2968 (norm. = 0.204282), norm. avg. (of 3) = 0.132423 fft 7: mflops = 10.7935 (norm. = 0.120509), norm. avg. (of 3) = 0.0832554 fft 8: mflops = 12.9686 (norm. = 0.144793), norm. avg. (of 2) = 0.162395 fft 9: mflops = 11.8061 (norm. = 0.131815), norm. avg. (of 3) = 0.119861 fft 10: mflops = 68.6723 (norm. = 0.766721), norm. avg. (of 3) = 0.653647 fft 11: mflops = 68.9531 (norm. = 0.769856), norm. avg. (of 3) = 0.658945 fft 12: mflops = 89.5662 (norm. 
= 1), norm. avg. (of 3) = 1 fft 13: mflops = 32.3504 (norm. = 0.36119), norm. avg. (of 1) = 0.36119 fft 14: mflops = 19.7163 (norm. = 0.220131), norm. avg. (of 3) = 0.212728 fft 15: mflops = 13.0876 (norm. = 0.146122), norm. avg. (of 3) = 0.135899 fft 16: mflops = 12.726 (norm. = 0.142085), norm. avg. (of 3) = 0.130454 fft 17: mflops = 45.1649 (norm. = 0.504263), norm. avg. (of 3) = 0.498363 fft 18: mflops = 23.8764 (norm. = 0.266579), norm. avg. (of 2) = 0.245316 fft 19: mflops = 24.1299 (norm. = 0.269408), norm. avg. (of 2) = 0.249173 fft 20: mflops = 23.0182 (norm. = 0.256996), norm. avg. (of 2) = 0.232782 fft 21: mflops = 4.36073 (norm. = 0.0486872), norm. avg. (of 3) = 0.0484909 fft 22: mflops = 7.91392 (norm. = 0.0883584), norm. avg. (of 3) = 0.0612179 fft 23: mflops = 13.2142 (norm. = 0.147535), norm. avg. (of 3) = 0.144421 fft 24: mflops = 43.8761 (norm. = 0.489874), norm. avg. (of 3) = 0.46435 fft 25: mflops = -1 (norm. = -0.0111649), norm. avg. (of 0) = -1 fft 26: mflops = 2.53866 (norm. = 0.0283439), norm. avg. (of 2) = 0.0305579 fft 27: mflops = 11.0274 (norm. = 0.12312), norm. avg. (of 3) = 0.128857 fft 28: mflops = 5.80564 (norm. = 0.0648196), norm. avg. (of 3) = 0.0596557 fft 29: mflops = 3.97123 (norm. = 0.0443385), norm. avg. (of 3) = 0.0810773 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.55253 s, 131072 iters, t-(init.)=1.49136 s t(norm)=0.177784, mflops=28.124 (err=1.9e-16) 1. Arndt DIT: elapsed time t=1.52739 s, 131072 iters, t-(init.)=1.46632 s t(norm)=0.174799, mflops=28.6043 (err=1.9e-16) 2. Arndt Split-Radix: elapsed time t=1.09834 s, 65536 iters, t-(init.)=1.06768 s t(norm)=0.254555, mflops=19.6421 (err=1.5e-16) 3. Arndt 4-step: elapsed time t=1.58557 s, 32768 iters, t-(init.)=1.57024 s t(norm)=0.748746, mflops=6.67783 (err=2.0e-16) 4. Beauregard: elapsed time t=1.43037 s, 32768 iters, t-(init.)=1.41483 s t(norm)=0.674644, mflops=7.41131 (err=2.3e-16) 5. 
Bergland: elapsed time t=1.73595 s, 131072 iters, t-(init.)=1.6749 s t(norm)=0.199664, mflops=25.0421 (err=2.6e-16) 6. CWP (min N): elapsed time t=1.31736 s, 131072 iters, t-(init.)=1.25618 s t(norm)=0.149749, mflops=33.3893 7. CWP (best N) (N=28): elapsed time t=1.9788 s, 131072 iters, t-(init.)=1.87672 s t(norm)=0.223722, mflops=22.3492 8. Edelblute: elapsed time t=1.46259 s, 65536 iters, t-(init.)=1.432 s t(norm)=0.341416, mflops=14.6449 (err=1.6e-16) 9. FFTPACK (f2c): elapsed time t=1.23896 s, 65536 iters, t-(init.)=1.20822 s t(norm)=0.288063, mflops=17.3573 (err=2.1e-16) FFTW_MEASURE plan: (cost = 3.850464e-06) FFTW_NOTW 16 10. FFTW: elapsed time t=1.0175 s, 262144 iters, t-(init.)=0.894952 s t(norm)=0.0533433, mflops=93.7325 (err=2.2e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 11. FFTW_ESTIMATE: elapsed time t=1.0155 s, 262144 iters, t-(init.)=0.893217 s t(norm)=0.0532399, mflops=93.9146 (err=2.2e-16) 12. Frigo-old: elapsed time t=1.99689 s, 524288 iters, t-(init.)=1.7522 s t(norm)=0.0522196, mflops=95.7495 (err=2.2e-16) 13. Green: elapsed time t=1.8077 s, 262144 iters, t-(init.)=1.68345 s t(norm)=0.100341, mflops=49.8299 (err=2.6e-16) 14. GSL: elapsed time t=1.52085 s, 131072 iters, t-(init.)=1.45966 s t(norm)=0.174005, mflops=28.7349 (err=2.1e-16) 15. GSL DIT: elapsed time t=1.12464 s, 65536 iters, t-(init.)=1.09395 s t(norm)=0.260817, mflops=19.1705 (err=3.1e-16) 16. GSL DIF: elapsed time t=1.14085 s, 65536 iters, t-(init.)=1.11014 s t(norm)=0.264679, mflops=18.8908 (err=2.5e-16) 17. Krukar: elapsed time t=1.74819 s, 262144 iters, t-(init.)=1.62391 s t(norm)=0.0967923, mflops=51.657 (err=1.7e-16) 18. Mayer (Buneman): elapsed time t=1.80864 s, 131072 iters, t-(init.)=1.74744 s t(norm)=0.208311, mflops=24.0025 (err=2.3e-16) 19. Mayer (simple): elapsed time t=1.52871 s, 131072 iters, t-(init.)=1.46735 s t(norm)=0.174922, mflops=28.5842 20. 
Mayer (lookup): elapsed time t=1.56325 s, 131072 iters, t-(init.)=1.50181 s t(norm)=0.179029, mflops=27.9284 (err=2.1e-16) 21. NAPACK (f2c): elapsed time t=1.70324 s, 32768 iters, t-(init.)=1.68791 s t(norm)=0.804857, mflops=6.21229 (err=2.7e-16) 22. Nielsen: elapsed time t=1.3928 s, 32768 iters, t-(init.)=1.377 s t(norm)=0.656607, mflops=7.61491 (err=1.8e-16) 23. NR (C): elapsed time t=1.12512 s, 65536 iters, t-(init.)=1.09454 s t(norm)=0.26096, mflops=19.16 (err=2.9e-16) 24. Ooura (C): elapsed time t=1.5875 s, 262144 iters, t-(init.)=1.46501 s t(norm)=0.0873216, mflops=57.2596 (err=2.5e-16) 25. QFT: elapsed time t=1.2523 s, 131072 iters, t-(init.)=1.19112 s t(norm)=0.141993, mflops=35.213 (err=1.4e-16) 26. Ransom: elapsed time t=1.27526 s, 32768 iters, t-(init.)=1.25992 s t(norm)=0.600775, mflops=8.32258 (err=5.0e-16) 27. Singleton (f2c): elapsed time t=1.50795 s, 131072 iters, t-(init.)=1.4469 s t(norm)=0.172485, mflops=28.9881 (err=2.0e-16) 28. Temperton (f2c): elapsed time t=1.29125 s, 32768 iters, t-(init.)=1.27592 s t(norm)=0.608407, mflops=8.21818 (err=2.1e-16) 29. Valkenburg: elapsed time t=1.28224 s, 16384 iters, t-(init.)=1.27451 s t(norm)=1.21546, mflops=4.11366 (err=2.5e-16) Top mflops for N=16 = 95.7495 Normalized results and averages for N=16: fft 0: mflops = 28.124 (norm. = 0.293725), norm. avg. (of 4) = 0.44297 fft 1: mflops = 28.6043 (norm. = 0.298742), norm. avg. (of 4) = 0.443071 fft 2: mflops = 19.6421 (norm. = 0.205141), norm. avg. (of 4) = 0.22616 fft 3: mflops = 6.67783 (norm. = 0.0697427), norm. avg. (of 4) = 0.0427917 fft 4: mflops = 7.41131 (norm. = 0.0774031), norm. avg. (of 4) = 0.0742857 fft 5: mflops = 25.0421 (norm. = 0.261538), norm. avg. (of 4) = 0.175227 fft 6: mflops = 33.3893 (norm. = 0.348715), norm. avg. (of 4) = 0.186496 fft 7: mflops = 22.3492 (norm. = 0.233413), norm. avg. (of 4) = 0.120795 fft 8: mflops = 14.6449 (norm. = 0.15295), norm. avg. (of 3) = 0.159247 fft 9: mflops = 17.3573 (norm. = 0.181279), norm. avg. 
(of 4) = 0.135215 fft 10: mflops = 93.7325 (norm. = 0.978935), norm. avg. (of 4) = 0.734969 fft 11: mflops = 93.9146 (norm. = 0.980836), norm. avg. (of 4) = 0.739418 fft 12: mflops = 95.7495 (norm. = 1), norm. avg. (of 4) = 1 fft 13: mflops = 49.8299 (norm. = 0.52042), norm. avg. (of 2) = 0.440805 fft 14: mflops = 28.7349 (norm. = 0.300105), norm. avg. (of 4) = 0.234572 fft 15: mflops = 19.1705 (norm. = 0.200215), norm. avg. (of 4) = 0.151978 fft 16: mflops = 18.8908 (norm. = 0.197294), norm. avg. (of 4) = 0.147164 fft 17: mflops = 51.657 (norm. = 0.539501), norm. avg. (of 4) = 0.508647 fft 18: mflops = 24.0025 (norm. = 0.250681), norm. avg. (of 3) = 0.247104 fft 19: mflops = 28.5842 (norm. = 0.298531), norm. avg. (of 3) = 0.265625 fft 20: mflops = 27.9284 (norm. = 0.291682), norm. avg. (of 3) = 0.252415 fft 21: mflops = 6.21229 (norm. = 0.0648806), norm. avg. (of 4) = 0.0525883 fft 22: mflops = 7.61491 (norm. = 0.0795295), norm. avg. (of 4) = 0.0657958 fft 23: mflops = 19.16 (norm. = 0.200106), norm. avg. (of 4) = 0.158342 fft 24: mflops = 57.2596 (norm. = 0.598014), norm. avg. (of 4) = 0.497766 fft 25: mflops = 35.213 (norm. = 0.367761), norm. avg. (of 1) = 0.367761 fft 26: mflops = 8.32258 (norm. = 0.0869204), norm. avg. (of 3) = 0.0493454 fft 27: mflops = 28.9881 (norm. = 0.302749), norm. avg. (of 4) = 0.17233 fft 28: mflops = 8.21818 (norm. = 0.08583), norm. avg. (of 4) = 0.0661993 fft 29: mflops = 4.11366 (norm. = 0.0429627), norm. avg. (of 4) = 0.0715487 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.60136 s, 65536 iters, t-(init.)=1.54422 s t(norm)=0.147268, mflops=33.9516 (err=2.4e-16) 1. Arndt DIT: elapsed time t=1.59697 s, 65536 iters, t-(init.)=1.53931 s t(norm)=0.1468, mflops=34.0599 (err=2.7e-16) 2. Arndt Split-Radix: elapsed time t=1.15414 s, 32768 iters, t-(init.)=1.12536 s t(norm)=0.214645, mflops=23.2942 (err=3.0e-16) 3. 
Arndt 4-step: elapsed time t=1.63348 s, 16384 iters, t-(init.)=1.6191 s t(norm)=0.617638, mflops=8.09535 (err=2.4e-16) 4. Beauregard: elapsed time t=1.70594 s, 16384 iters, t-(init.)=1.69151 s t(norm)=0.645261, mflops=7.74881 (err=2.5e-16) 5. Bergland: elapsed time t=1.48467 s, 65536 iters, t-(init.)=1.42735 s t(norm)=0.136122, mflops=36.7317 (err=2.6e-16) 6. CWP (min N) (N=33): elapsed time t=1.58902 s, 65536 iters, t-(init.)=1.53015 s t(norm)=0.145926, mflops=34.2638 7. CWP (best N) (N=35): elapsed time t=1.30838 s, 65536 iters, t-(init.)=1.24499 s t(norm)=0.118731, mflops=42.1119 8. Edelblute: elapsed time t=1.59261 s, 32768 iters, t-(init.)=1.56389 s t(norm)=0.298288, mflops=16.7623 (err=2.9e-16) 9. FFTPACK (f2c): elapsed time t=1.84702 s, 32768 iters, t-(init.)=1.81835 s t(norm)=0.346822, mflops=14.4166 (err=2.3e-16) FFTW_MEASURE plan: (cost = 7.858765e-06) FFTW_NOTW 32 10. FFTW: elapsed time t=1.03421 s, 131072 iters, t-(init.)=0.919788 s t(norm)=0.0438589, mflops=114.002 (err=2.4e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.03302 s, 131072 iters, t-(init.)=0.918632 s t(norm)=0.0438038, mflops=114.145 (err=2.4e-16) 12. Frigo-old: elapsed time t=1.04151 s, 131072 iters, t-(init.)=0.927137 s t(norm)=0.0442093, mflops=113.098 (err=2.1e-16) 13. Green: elapsed time t=1.63899 s, 131072 iters, t-(init.)=1.52326 s t(norm)=0.0726347, mflops=68.8376 (err=2.4e-16) 14. GSL: elapsed time t=1.00621 s, 32768 iters, t-(init.)=0.977527 s t(norm)=0.186448, mflops=26.8171 (err=2.3e-16) 15. GSL DIT: elapsed time t=1.0617 s, 32768 iters, t-(init.)=1.03302 s t(norm)=0.197033, mflops=25.3764 (err=3.1e-16) 16. GSL DIF: elapsed time t=1.04976 s, 32768 iters, t-(init.)=1.02108 s t(norm)=0.194756, mflops=25.6731 (err=3.2e-16) 17. Krukar: elapsed time t=1.06942 s, 65536 iters, t-(init.)=1.01173 s t(norm)=0.0964859, mflops=51.821 (err=2.7e-16) 18. 
Mayer (Buneman): elapsed time t=1.96506 s, 65536 iters, t-(init.)=1.90727 s t(norm)=0.181891, mflops=27.4889 (err=2.8e-16) 19. Mayer (simple): elapsed time t=1.57547 s, 65536 iters, t-(init.)=1.51834 s t(norm)=0.144801, mflops=34.5303 20. Mayer (lookup): elapsed time t=1.5777 s, 65536 iters, t-(init.)=1.52032 s t(norm)=0.144989, mflops=34.4853 (err=2.6e-16) 21. NAPACK (f2c): elapsed time t=1.86538 s, 16384 iters, t-(init.)=1.851 s t(norm)=0.706101, mflops=7.08114 (err=6.4e-16) 22. Nielsen: elapsed time t=1.21626 s, 16384 iters, t-(init.)=1.20174 s t(norm)=0.458428, mflops=10.9068 (err=1.1e-15) 23. NR (C): elapsed time t=1.05161 s, 32768 iters, t-(init.)=1.02275 s t(norm)=0.195074, mflops=25.6313 (err=2.9e-16) 24. Ooura (C): elapsed time t=1.723 s, 131072 iters, t-(init.)=1.60836 s t(norm)=0.0766925, mflops=65.1954 (err=2.5e-16) 25. QFT: elapsed time t=1.68412 s, 65536 iters, t-(init.)=1.62679 s t(norm)=0.155143, mflops=32.2284 (err=2.8e-16) 26. Ransom: elapsed time t=1.54292 s, 16384 iters, t-(init.)=1.52821 s t(norm)=0.582966, mflops=8.57683 (err=7.4e-16) 27. Singleton (f2c): elapsed time t=1.45629 s, 65536 iters, t-(init.)=1.39858 s t(norm)=0.133379, mflops=37.4873 (err=2.3e-16) 28. Temperton (f2c): elapsed time t=1.7142 s, 16384 iters, t-(init.)=1.69986 s t(norm)=0.648447, mflops=7.71073 (err=2.6e-16) 29. Valkenburg: elapsed time t=1.5584 s, 8192 iters, t-(init.)=1.55117 s t(norm)=1.18345, mflops=4.22493 (err=2.8e-16) Top mflops for N=32 = 114.145 Normalized results and averages for N=32: fft 0: mflops = 33.9516 (norm. = 0.297442), norm. avg. (of 5) = 0.413864 fft 1: mflops = 34.0599 (norm. = 0.298391), norm. avg. (of 5) = 0.414135 fft 2: mflops = 23.2942 (norm. = 0.204075), norm. avg. (of 5) = 0.221743 fft 3: mflops = 8.09535 (norm. = 0.0709214), norm. avg. (of 5) = 0.0484177 fft 4: mflops = 7.74881 (norm. = 0.0678854), norm. avg. (of 5) = 0.0730057 fft 5: mflops = 36.7317 (norm. = 0.321797), norm. avg. (of 5) = 0.204541 fft 6: mflops = 34.2638 (norm. 
= 0.300177), norm. avg. (of 5) = 0.209232 fft 7: mflops = 42.1119 (norm. = 0.368932), norm. avg. (of 5) = 0.170422 fft 8: mflops = 16.7623 (norm. = 0.146851), norm. avg. (of 4) = 0.156148 fft 9: mflops = 14.4166 (norm. = 0.1263), norm. avg. (of 5) = 0.133432 fft 10: mflops = 114.002 (norm. = 0.998743), norm. avg. (of 5) = 0.787724 fft 11: mflops = 114.145 (norm. = 1), norm. avg. (of 5) = 0.791534 fft 12: mflops = 113.098 (norm. = 0.990827), norm. avg. (of 5) = 0.998165 fft 13: mflops = 68.8376 (norm. = 0.60307), norm. avg. (of 3) = 0.494893 fft 14: mflops = 26.8171 (norm. = 0.234938), norm. avg. (of 5) = 0.234645 fft 15: mflops = 25.3764 (norm. = 0.222317), norm. avg. (of 5) = 0.166046 fft 16: mflops = 25.6731 (norm. = 0.224916), norm. avg. (of 5) = 0.162715 fft 17: mflops = 51.821 (norm. = 0.453992), norm. avg. (of 5) = 0.497716 fft 18: mflops = 27.4889 (norm. = 0.240824), norm. avg. (of 4) = 0.245534 fft 19: mflops = 34.5303 (norm. = 0.302511), norm. avg. (of 4) = 0.274847 fft 20: mflops = 34.4853 (norm. = 0.302117), norm. avg. (of 4) = 0.264841 fft 21: mflops = 7.08114 (norm. = 0.0620362), norm. avg. (of 5) = 0.0544779 fft 22: mflops = 10.9068 (norm. = 0.0955521), norm. avg. (of 5) = 0.0717471 fft 23: mflops = 25.6313 (norm. = 0.224549), norm. avg. (of 5) = 0.171584 fft 24: mflops = 65.1954 (norm. = 0.571161), norm. avg. (of 5) = 0.512445 fft 25: mflops = 32.2284 (norm. = 0.282345), norm. avg. (of 2) = 0.325053 fft 26: mflops = 8.57683 (norm. = 0.0751395), norm. avg. (of 4) = 0.0557939 fft 27: mflops = 37.4873 (norm. = 0.328417), norm. avg. (of 5) = 0.203547 fft 28: mflops = 7.71073 (norm. = 0.0675519), norm. avg. (of 5) = 0.0664698 fft 29: mflops = 4.22493 (norm. = 0.0370136), norm. avg. (of 5) = 0.0646416 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.83827 s, 32768 iters, t-(init.)=1.7829 s t(norm)=0.141693, mflops=35.2877 (err=5.0e-16) 1. 
Arndt DIT: elapsed time t=1.82453 s, 32768 iters, t-(init.)=1.76929 s t(norm)=0.14061, mflops=35.5593 (err=4.9e-16) 2. Arndt Split-Radix: elapsed time t=1.21404 s, 16384 iters, t-(init.)=1.18639 s t(norm)=0.188572, mflops=26.515 (err=4.5e-16) 3. Arndt 4-step: elapsed time t=1.33505 s, 8192 iters, t-(init.)=1.32107 s t(norm)=0.419957, mflops=11.906 (err=4.9e-16) 4. Beauregard: elapsed time t=1.99503 s, 8192 iters, t-(init.)=1.98113 s t(norm)=0.629784, mflops=7.93923 (err=4.5e-16) 5. Bergland: elapsed time t=1.46637 s, 32768 iters, t-(init.)=1.41092 s t(norm)=0.11213, mflops=44.5911 (err=5.5e-16) 6. CWP (min N) (N=65): elapsed time t=1.6531 s, 32768 iters, t-(init.)=1.59695 s t(norm)=0.126914, mflops=39.3968 7. CWP (best N) (N=84): elapsed time t=1.35263 s, 32768 iters, t-(init.)=1.2804 s t(norm)=0.101757, mflops=49.1366 8. Edelblute: elapsed time t=1.6953 s, 16384 iters, t-(init.)=1.66736 s t(norm)=0.26502, mflops=18.8665 (err=4.6e-16) 9. FFTPACK (f2c): elapsed time t=1.86128 s, 16384 iters, t-(init.)=1.83363 s t(norm)=0.291448, mflops=17.1557 (err=4.4e-16) FFTW_MEASURE plan: (cost = 1.811328e-05) FFTW_NOTW 64 10. FFTW: elapsed time t=1.19507 s, 65536 iters, t-(init.)=1.08455 s t(norm)=0.0430961, mflops=116.02 (err=4.4e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.37483 s, 65536 iters, t-(init.)=1.26392 s t(norm)=0.0502237, mflops=99.5547 (err=4.7e-16) 12. Frigo-old: elapsed time t=1.76762 s, 65536 iters, t-(init.)=1.65661 s t(norm)=0.0658277, mflops=75.9558 (err=4.5e-16) 13. Green: elapsed time t=1.44979 s, 65536 iters, t-(init.)=1.33832 s t(norm)=0.0531801, mflops=94.0201 (err=4.6e-16) 14. GSL: elapsed time t=1.99881 s, 32768 iters, t-(init.)=1.94356 s t(norm)=0.15446, mflops=32.3708 (err=4.4e-16) 15. GSL DIT: elapsed time t=1.08124 s, 16384 iters, t-(init.)=1.05341 s t(norm)=0.167435, mflops=29.8624 (err=4.6e-16) 16. 
GSL DIF: elapsed time t=1.04463 s, 16384 iters, t-(init.)=1.01686 s t(norm)=0.161625, mflops=30.9359 (err=4.9e-16) 17. Krukar: elapsed time t=1.06345 s, 16384 iters, t-(init.)=1.03571 s t(norm)=0.164622, mflops=30.3727 (err=5.2e-16) 18. Mayer (Buneman): elapsed time t=1.1125 s, 16384 iters, t-(init.)=1.08471 s t(norm)=0.172411, mflops=29.0005 (err=4.8e-16) 19. Mayer (simple): elapsed time t=1.69478 s, 32768 iters, t-(init.)=1.63954 s t(norm)=0.130299, mflops=38.3733 20. Mayer (lookup): elapsed time t=1.67328 s, 32768 iters, t-(init.)=1.61772 s t(norm)=0.128564, mflops=38.891 (err=4.5e-16) 21. NAPACK (f2c): elapsed time t=1.92077 s, 8192 iters, t-(init.)=1.90687 s t(norm)=0.606176, mflops=8.24842 (err=1.1e-15) 22. Nielsen: elapsed time t=1.12913 s, 8192 iters, t-(init.)=1.11516 s t(norm)=0.354499, mflops=14.1044 (err=1.9e-15) 23. NR (C): elapsed time t=1.05366 s, 16384 iters, t-(init.)=1.02594 s t(norm)=0.163068, mflops=30.662 (err=4.4e-16) 24. Ooura (C): elapsed time t=1.82697 s, 65536 iters, t-(init.)=1.71587 s t(norm)=0.0681825, mflops=73.3326 (err=5.4e-16) 25. QFT: elapsed time t=1.12429 s, 16384 iters, t-(init.)=1.09654 s t(norm)=0.174291, mflops=28.6877 (err=4.9e-16) 26. Ransom: elapsed time t=1.86312 s, 16384 iters, t-(init.)=1.83534 s t(norm)=0.29172, mflops=17.1397 (err=9.1e-16) 27. Singleton (f2c): elapsed time t=1.24458 s, 32768 iters, t-(init.)=1.18913 s t(norm)=0.0945033, mflops=52.9082 (err=6.5e-16) 28. Temperton (f2c): elapsed time t=1.56323 s, 8192 iters, t-(init.)=1.54937 s t(norm)=0.492533, mflops=10.1516 (err=4.7e-16) 29. Valkenburg: elapsed time t=1.83392 s, 4096 iters, t-(init.)=1.82681 s t(norm)=1.16146, mflops=4.30494 (err=6.0e-16) Top mflops for N=64 = 116.02 Normalized results and averages for N=64: fft 0: mflops = 35.2877 (norm. = 0.304152), norm. avg. (of 6) = 0.395579 fft 1: mflops = 35.5593 (norm. = 0.306494), norm. avg. (of 6) = 0.396195 fft 2: mflops = 26.515 (norm. = 0.228539), norm. avg. 
(of 6) = 0.222876 fft 3: mflops = 11.906 (norm. = 0.10262), norm. avg. (of 6) = 0.0574515 fft 4: mflops = 7.93923 (norm. = 0.06843), norm. avg. (of 6) = 0.0722431 fft 5: mflops = 44.5911 (norm. = 0.384341), norm. avg. (of 6) = 0.234508 fft 6: mflops = 39.3968 (norm. = 0.33957), norm. avg. (of 6) = 0.230955 fft 7: mflops = 49.1366 (norm. = 0.42352), norm. avg. (of 6) = 0.212605 fft 8: mflops = 18.8665 (norm. = 0.162615), norm. avg. (of 5) = 0.157441 fft 9: mflops = 17.1557 (norm. = 0.147869), norm. avg. (of 6) = 0.135838 fft 10: mflops = 116.02 (norm. = 1), norm. avg. (of 6) = 0.823103 fft 11: mflops = 99.5547 (norm. = 0.858084), norm. avg. (of 6) = 0.802626 fft 12: mflops = 75.9558 (norm. = 0.654681), norm. avg. (of 6) = 0.940918 fft 13: mflops = 94.0201 (norm. = 0.810381), norm. avg. (of 4) = 0.573765 fft 14: mflops = 32.3708 (norm. = 0.279011), norm. avg. (of 6) = 0.24204 fft 15: mflops = 29.8624 (norm. = 0.257391), norm. avg. (of 6) = 0.18127 fft 16: mflops = 30.9359 (norm. = 0.266643), norm. avg. (of 6) = 0.180036 fft 17: mflops = 30.3727 (norm. = 0.261789), norm. avg. (of 6) = 0.458395 fft 18: mflops = 29.0005 (norm. = 0.249962), norm. avg. (of 5) = 0.24642 fft 19: mflops = 38.3733 (norm. = 0.330748), norm. avg. (of 5) = 0.286027 fft 20: mflops = 38.891 (norm. = 0.33521), norm. avg. (of 5) = 0.278915 fft 21: mflops = 8.24842 (norm. = 0.0710951), norm. avg. (of 6) = 0.0572474 fft 22: mflops = 14.1044 (norm. = 0.121569), norm. avg. (of 6) = 0.0800507 fft 23: mflops = 30.662 (norm. = 0.264283), norm. avg. (of 6) = 0.187034 fft 24: mflops = 73.3326 (norm. = 0.632071), norm. avg. (of 6) = 0.532383 fft 25: mflops = 28.6877 (norm. = 0.247266), norm. avg. (of 3) = 0.299124 fft 26: mflops = 17.1397 (norm. = 0.147731), norm. avg. (of 5) = 0.0741814 fft 27: mflops = 52.9082 (norm. = 0.456028), norm. avg. (of 6) = 0.245627 fft 28: mflops = 10.1516 (norm. = 0.087499), norm. avg. (of 6) = 0.0699747 fft 29: mflops = 4.30494 (norm. = 0.0371053), norm. avg. 
(of 6) = 0.0600523 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.92721 s, 16384 iters, t-(init.)=1.87275 s t(norm)=0.127571, mflops=39.194 (err=4.0e-16) 1. Arndt DIT: elapsed time t=1.92453 s, 16384 iters, t-(init.)=1.86988 s t(norm)=0.127375, mflops=39.2541 (err=4.1e-16) 2. Arndt Split-Radix: elapsed time t=1.26577 s, 8192 iters, t-(init.)=1.23864 s t(norm)=0.168751, mflops=29.6294 (err=4.4e-16) 3. Arndt 4-step: elapsed time t=1.53129 s, 4096 iters, t-(init.)=1.51762 s t(norm)=0.413518, mflops=12.0914 (err=4.0e-16) 4. Beauregard: elapsed time t=1.15472 s, 2048 iters, t-(init.)=1.148 s t(norm)=0.625612, mflops=7.99217 (err=4.1e-16) 5. Bergland: elapsed time t=1.52198 s, 16384 iters, t-(init.)=1.46766 s t(norm)=0.0999762, mflops=50.0119 (err=4.3e-16) 6. CWP (min N) (N=130): elapsed time t=1.62461 s, 16384 iters, t-(init.)=1.56917 s t(norm)=0.106891, mflops=46.7765 7. CWP (best N) (N=140): elapsed time t=1.12443 s, 16384 iters, t-(init.)=1.06497 s t(norm)=0.0725451, mflops=68.9226 8. Edelblute: elapsed time t=1.7745 s, 8192 iters, t-(init.)=1.74726 s t(norm)=0.238045, mflops=21.0044 (err=4.1e-16) 9. FFTPACK (f2c): elapsed time t=1.95713 s, 8192 iters, t-(init.)=1.92998 s t(norm)=0.262939, mflops=19.0158 (err=4.1e-16) FFTW_MEASURE plan: (cost = 4.396533e-05) FFTW_TWIDDLE 4 FFTW_NOTW 32 10. FFTW: elapsed time t=1.44626 s, 32768 iters, t-(init.)=1.33722 s t(norm)=0.0455453, mflops=109.781 (err=4.2e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.44565 s, 32768 iters, t-(init.)=1.33675 s t(norm)=0.0455295, mflops=109.819 (err=4.2e-16) 12. Frigo-old: elapsed time t=1.88319 s, 32768 iters, t-(init.)=1.7743 s t(norm)=0.0604324, mflops=82.7371 (err=4.4e-16) 13. Green: elapsed time t=1.53846 s, 32768 iters, t-(init.)=1.42918 s t(norm)=0.0486774, mflops=102.717 (err=4.4e-16) 14. 
GSL: elapsed time t=1.13107 s, 8192 iters, t-(init.)=1.10351 s t(norm)=0.150341, mflops=33.2576 (err=4.2e-16) 15. GSL DIT: elapsed time t=1.1322 s, 8192 iters, t-(init.)=1.10494 s t(norm)=0.150536, mflops=33.2146 (err=4.3e-16) 16. GSL DIF: elapsed time t=1.07196 s, 8192 iters, t-(init.)=1.04462 s t(norm)=0.142318, mflops=35.1326 (err=4.6e-16) 17. Krukar: elapsed time t=1.07806 s, 4096 iters, t-(init.)=1.06434 s t(norm)=0.29001, mflops=17.2408 (err=4.6e-16) 18. Mayer (Buneman): elapsed time t=1.19183 s, 8192 iters, t-(init.)=1.16483 s t(norm)=0.158696, mflops=31.5068 (err=4.0e-16) 19. Mayer (simple): elapsed time t=1.78122 s, 16384 iters, t-(init.)=1.72658 s t(norm)=0.117614, mflops=42.512 20. Mayer (lookup): elapsed time t=1.75031 s, 16384 iters, t-(init.)=1.69595 s t(norm)=0.115527, mflops=43.2799 (err=4.3e-16) 21. NAPACK (f2c): elapsed time t=1.09512 s, 2048 iters, t-(init.)=1.08824 s t(norm)=0.593042, mflops=8.4311 (err=1.2e-15) 22. Nielsen: elapsed time t=1.44786 s, 4096 iters, t-(init.)=1.43394 s t(norm)=0.390717, mflops=12.797 (err=1.3e-15) 23. NR (C): elapsed time t=1.08321 s, 8192 iters, t-(init.)=1.05608 s t(norm)=0.14388, mflops=34.7513 (err=4.4e-16) 24. Ooura (C): elapsed time t=1.03179 s, 16384 iters, t-(init.)=0.9775 s t(norm)=0.0665869, mflops=75.0898 (err=4.1e-16) 25. QFT: elapsed time t=1.25683 s, 8192 iters, t-(init.)=1.22956 s t(norm)=0.167515, mflops=29.8481 (err=4.6e-16) 26. Ransom: elapsed time t=1.14711 s, 4096 iters, t-(init.)=1.13343 s t(norm)=0.308836, mflops=16.1898 (err=1.1e-15) 27. Singleton (f2c): elapsed time t=1.52094 s, 16384 iters, t-(init.)=1.46657 s t(norm)=0.0999019, mflops=50.0491 (err=5.3e-16) 28. Temperton (f2c): elapsed time t=1.0291 s, 2048 iters, t-(init.)=1.02239 s t(norm)=0.557158, mflops=8.97412 (err=4.4e-16) 29. 
Valkenburg: elapsed time t=1.0569 s, 1024 iters, t-(init.)=1.05335 s t(norm)=1.14806, mflops=4.35518 (err=4.8e-16) Top mflops for N=128 = 109.819 Normalized results and averages for N=128: fft 0: mflops = 39.194 (norm. = 0.356896), norm. avg. (of 7) = 0.390053 fft 1: mflops = 39.2541 (norm. = 0.357444), norm. avg. (of 7) = 0.390659 fft 2: mflops = 29.6294 (norm. = 0.269802), norm. avg. (of 7) = 0.22958 fft 3: mflops = 12.0914 (norm. = 0.110103), norm. avg. (of 7) = 0.0649731 fft 4: mflops = 7.99217 (norm. = 0.0727759), norm. avg. (of 7) = 0.0723192 fft 5: mflops = 50.0119 (norm. = 0.455403), norm. avg. (of 7) = 0.266064 fft 6: mflops = 46.7765 (norm. = 0.425942), norm. avg. (of 7) = 0.25881 fft 7: mflops = 68.9226 (norm. = 0.627603), norm. avg. (of 7) = 0.27189 fft 8: mflops = 21.0044 (norm. = 0.191264), norm. avg. (of 6) = 0.163078 fft 9: mflops = 19.0158 (norm. = 0.173156), norm. avg. (of 7) = 0.141169 fft 10: mflops = 109.781 (norm. = 0.999652), norm. avg. (of 7) = 0.848325 fft 11: mflops = 109.819 (norm. = 1), norm. avg. (of 7) = 0.830822 fft 12: mflops = 82.7371 (norm. = 0.753396), norm. avg. (of 7) = 0.914129 fft 13: mflops = 102.717 (norm. = 0.935331), norm. avg. (of 5) = 0.646078 fft 14: mflops = 33.2576 (norm. = 0.302841), norm. avg. (of 7) = 0.250725 fft 15: mflops = 33.2146 (norm. = 0.302448), norm. avg. (of 7) = 0.198581 fft 16: mflops = 35.1326 (norm. = 0.319914), norm. avg. (of 7) = 0.200019 fft 17: mflops = 17.2408 (norm. = 0.156993), norm. avg. (of 7) = 0.415337 fft 18: mflops = 31.5068 (norm. = 0.286898), norm. avg. (of 6) = 0.253166 fft 19: mflops = 42.512 (norm. = 0.38711), norm. avg. (of 6) = 0.302874 fft 20: mflops = 43.2799 (norm. = 0.394102), norm. avg. (of 6) = 0.298113 fft 21: mflops = 8.4311 (norm. = 0.0767728), norm. avg. (of 7) = 0.0600368 fft 22: mflops = 12.797 (norm. = 0.116528), norm. avg. (of 7) = 0.0852618 fft 23: mflops = 34.7513 (norm. = 0.316442), norm. avg. (of 7) = 0.20552 fft 24: mflops = 75.0898 (norm. = 0.683761), norm. 
avg. (of 7) = 0.554008 fft 25: mflops = 29.8481 (norm. = 0.271794), norm. avg. (of 4) = 0.292292 fft 26: mflops = 16.1898 (norm. = 0.147423), norm. avg. (of 6) = 0.0863883 fft 27: mflops = 50.0491 (norm. = 0.455742), norm. avg. (of 7) = 0.275644 fft 28: mflops = 8.97412 (norm. = 0.0817174), norm. avg. (of 7) = 0.0716522 fft 29: mflops = 4.35518 (norm. = 0.0396579), norm. avg. (of 7) = 0.0571388 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.03229 s, 4096 iters, t-(init.)=1.00521 s t(norm)=0.11983, mflops=41.7259 (err=6.7e-16) 1. Arndt DIT: elapsed time t=1.02736 s, 4096 iters, t-(init.)=1.00034 s t(norm)=0.11925, mflops=41.9289 (err=7.1e-16) 2. Arndt Split-Radix: elapsed time t=1.32163 s, 4096 iters, t-(init.)=1.29468 s t(norm)=0.154338, mflops=32.3965 (err=7.4e-16) 3. Arndt 4-step: elapsed time t=1.53208 s, 2048 iters, t-(init.)=1.51852 s t(norm)=0.362043, mflops=13.8105 (err=7.2e-16) 4. Beauregard: elapsed time t=1.31043 s, 1024 iters, t-(init.)=1.30362 s t(norm)=0.621616, mflops=8.04355 (err=7.8e-16) 5. Bergland: elapsed time t=1.5297 s, 8192 iters, t-(init.)=1.47576 s t(norm)=0.0879624, mflops=56.8425 (err=8.3e-16) 6. CWP (min N) (N=260): elapsed time t=1.57268 s, 8192 iters, t-(init.)=1.51798 s t(norm)=0.090479, mflops=55.2615 7. CWP (best N) (N=280): elapsed time t=1.20674 s, 8192 iters, t-(init.)=1.14773 s t(norm)=0.0684098, mflops=73.0889 8. Edelblute: elapsed time t=1.84526 s, 4096 iters, t-(init.)=1.81823 s t(norm)=0.21675, mflops=23.068 (err=7.0e-16) 9. FFTPACK (f2c): elapsed time t=1.03277 s, 2048 iters, t-(init.)=1.01917 s t(norm)=0.24299, mflops=20.577 (err=7.8e-16) FFTW_MEASURE plan: (cost = 9.659863e-05) FFTW_TWIDDLE 8 FFTW_NOTW 32 10. FFTW: elapsed time t=1.5904 s, 16384 iters, t-(init.)=1.48256 s t(norm)=0.0441837, mflops=113.164 (err=8.1e-16) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 11. 
FFTW_ESTIMATE: elapsed time t=1.59096 s, 16384 iters, t-(init.)=1.48304 s t(norm)=0.0441981, mflops=113.127 (err=8.1e-16) 12. Frigo-old: elapsed time t=1.90984 s, 16384 iters, t-(init.)=1.80198 s t(norm)=0.0537032, mflops=93.1043 (err=8.0e-16) 13. Green: elapsed time t=1.62567 s, 16384 iters, t-(init.)=1.5176 s t(norm)=0.0452281, mflops=110.551 (err=7.6e-16) 14. GSL: elapsed time t=1.16147 s, 4096 iters, t-(init.)=1.1344 s t(norm)=0.135231, mflops=36.9739 (err=7.8e-16) 15. GSL DIT: elapsed time t=1.21742 s, 4096 iters, t-(init.)=1.19034 s t(norm)=0.141899, mflops=35.2362 (err=7.7e-16) 16. GSL DIF: elapsed time t=1.13443 s, 4096 iters, t-(init.)=1.1074 s t(norm)=0.132012, mflops=37.8753 (err=8.3e-16) 17. Krukar: elapsed time t=1.85535 s, 4096 iters, t-(init.)=1.82841 s t(norm)=0.217964, mflops=22.9396 (err=7.7e-16) 18. Mayer (Buneman): elapsed time t=1.27718 s, 4096 iters, t-(init.)=1.25011 s t(norm)=0.149024, mflops=33.5516 (err=7.0e-16) 19. Mayer (simple): elapsed time t=1.89656 s, 8192 iters, t-(init.)=1.84274 s t(norm)=0.109836, mflops=45.5224 20. Mayer (lookup): elapsed time t=1.86426 s, 8192 iters, t-(init.)=1.81031 s t(norm)=0.107903, mflops=46.3381 (err=7.1e-16) 21. NAPACK (f2c): elapsed time t=1.16533 s, 1024 iters, t-(init.)=1.15835 s t(norm)=0.552343, mflops=9.05235 (err=3.6e-15) 22. Nielsen: elapsed time t=1.46206 s, 2048 iters, t-(init.)=1.44859 s t(norm)=0.345371, mflops=14.4772 (err=3.4e-15) 23. NR (C): elapsed time t=1.14439 s, 4096 iters, t-(init.)=1.11751 s t(norm)=0.133217, mflops=37.5326 (err=8.6e-16) 24. Ooura (C): elapsed time t=1.09465 s, 8192 iters, t-(init.)=1.04076 s t(norm)=0.0620341, mflops=80.6008 (err=7.9e-16) 25. QFT: elapsed time t=1.48775 s, 4096 iters, t-(init.)=1.46078 s t(norm)=0.174138, mflops=28.7128 (err=9.5e-16) 26. Ransom: elapsed time t=1.73895 s, 4096 iters, t-(init.)=1.71183 s t(norm)=0.204066, mflops=24.5019 (err=1.7e-15) 27. 
Singleton (f2c): elapsed time t=1.31119 s, 8192 iters, t-(init.)=1.25722 s t(norm)=0.0749363, mflops=66.7233 (err=1.3e-15) 28. Temperton (f2c): elapsed time t=1.00093 s, 1024 iters, t-(init.)=0.99412 s t(norm)=0.474033, mflops=10.5478 (err=7.5e-16) 29. Valkenburg: elapsed time t=1.20375 s, 512 iters, t-(init.)=1.20036 s t(norm)=1.14475, mflops=4.36776 (err=7.4e-16) Top mflops for N=256 = 113.164 Normalized results and averages for N=256: fft 0: mflops = 41.7259 (norm. = 0.36872), norm. avg. (of 8) = 0.387386 fft 1: mflops = 41.9289 (norm. = 0.370514), norm. avg. (of 8) = 0.388141 fft 2: mflops = 32.3965 (norm. = 0.286279), norm. avg. (of 8) = 0.236667 fft 3: mflops = 13.8105 (norm. = 0.12204), norm. avg. (of 8) = 0.0721064 fft 4: mflops = 8.04355 (norm. = 0.0710787), norm. avg. (of 8) = 0.0721641 fft 5: mflops = 56.8425 (norm. = 0.502302), norm. avg. (of 8) = 0.295594 fft 6: mflops = 55.2615 (norm. = 0.488331), norm. avg. (of 8) = 0.2875 fft 7: mflops = 73.0889 (norm. = 0.645868), norm. avg. (of 8) = 0.318638 fft 8: mflops = 23.068 (norm. = 0.203846), norm. avg. (of 7) = 0.168902 fft 9: mflops = 20.577 (norm. = 0.181833), norm. avg. (of 8) = 0.146252 fft 10: mflops = 113.164 (norm. = 1), norm. avg. (of 8) = 0.867284 fft 11: mflops = 113.127 (norm. = 0.999674), norm. avg. (of 8) = 0.851929 fft 12: mflops = 93.1043 (norm. = 0.822738), norm. avg. (of 8) = 0.902705 fft 13: mflops = 110.551 (norm. = 0.976908), norm. avg. (of 6) = 0.701216 fft 14: mflops = 36.9739 (norm. = 0.326728), norm. avg. (of 8) = 0.260226 fft 15: mflops = 35.2362 (norm. = 0.311373), norm. avg. (of 8) = 0.21268 fft 16: mflops = 37.8753 (norm. = 0.334694), norm. avg. (of 8) = 0.216853 fft 17: mflops = 22.9396 (norm. = 0.202711), norm. avg. (of 8) = 0.388759 fft 18: mflops = 33.5516 (norm. = 0.296486), norm. avg. (of 7) = 0.259355 fft 19: mflops = 45.5224 (norm. = 0.402269), norm. avg. (of 7) = 0.317074 fft 20: mflops = 46.3381 (norm. = 0.409477), norm. avg. 
(of 7) = 0.314022 fft 21: mflops = 9.05235 (norm. = 0.0799932), norm. avg. (of 8) = 0.0625313 fft 22: mflops = 14.4772 (norm. = 0.127931), norm. avg. (of 8) = 0.0905954 fft 23: mflops = 37.5326 (norm. = 0.331666), norm. avg. (of 8) = 0.221289 fft 24: mflops = 80.6008 (norm. = 0.712248), norm. avg. (of 8) = 0.573788 fft 25: mflops = 28.7128 (norm. = 0.253727), norm. avg. (of 5) = 0.284579 fft 26: mflops = 24.5019 (norm. = 0.216517), norm. avg. (of 7) = 0.104978 fft 27: mflops = 66.7233 (norm. = 0.589616), norm. avg. (of 8) = 0.31489 fft 28: mflops = 10.5478 (norm. = 0.0932079), norm. avg. (of 8) = 0.0743467 fft 29: mflops = 4.36776 (norm. = 0.0385968), norm. avg. (of 8) = 0.054821 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.10801 s, 2048 iters, t-(init.)=1.08101 s t(norm)=0.114548, mflops=43.6497 (err=6.7e-16) 1. Arndt DIT: elapsed time t=1.11106 s, 2048 iters, t-(init.)=1.08405 s t(norm)=0.11487, mflops=43.5274 (err=6.2e-16) 2. Arndt Split-Radix: elapsed time t=1.51304 s, 2048 iters, t-(init.)=1.48578 s t(norm)=0.157439, mflops=31.7584 (err=6.5e-16) 3. Arndt 4-step: elapsed time t=1.62382 s, 1024 iters, t-(init.)=1.61042 s t(norm)=0.341293, mflops=14.6502 (err=6.3e-16) 4. Beauregard: elapsed time t=1.49354 s, 512 iters, t-(init.)=1.48673 s t(norm)=0.630157, mflops=7.93454 (err=6.8e-16) 5. Bergland: elapsed time t=1.58424 s, 4096 iters, t-(init.)=1.52998 s t(norm)=0.0810612, mflops=61.6818 (err=7.2e-16) 6. CWP (min N) (N=520): elapsed time t=1.72298 s, 4096 iters, t-(init.)=1.66837 s t(norm)=0.0883934, mflops=56.5653 7. CWP (best N) (N=560): elapsed time t=1.40023 s, 4096 iters, t-(init.)=1.341 s t(norm)=0.0710489, mflops=70.374 8. Edelblute: elapsed time t=1.02501 s, 1024 iters, t-(init.)=1.01146 s t(norm)=0.214356, mflops=23.3257 (err=6.2e-16) 9. 
FFTPACK (f2c): elapsed time t=1.92209 s, 1024 iters, t-(init.)=1.90842 s t(norm)=0.404448, mflops=12.3625 (err=6.4e-16) FFTW_MEASURE plan: (cost = 4.179687e-04) FFTW_TWIDDLE 8 FFTW_NOTW 64 10. FFTW: elapsed time t=1.02232 s, 2048 iters, t-(init.)=0.995282 s t(norm)=0.105464, mflops=47.4096 (err=6.4e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.73932 s, 4096 iters, t-(init.)=1.68548 s t(norm)=0.0893002, mflops=55.9909 (err=6.5e-16) 12. Frigo-old: elapsed time t=1.25112 s, 2048 iters, t-(init.)=1.22402 s t(norm)=0.129701, mflops=38.5501 (err=6.3e-16) 13. Green: elapsed time t=1.6844 s, 8192 iters, t-(init.)=1.5769 s t(norm)=0.0417736, mflops=119.693 (err=6.2e-16) 14. GSL: elapsed time t=1.1977 s, 1024 iters, t-(init.)=1.18403 s t(norm)=0.250928, mflops=19.926 (err=6.4e-16) 15. GSL DIT: elapsed time t=1.36207 s, 2048 iters, t-(init.)=1.33521 s t(norm)=0.141484, mflops=35.3398 (err=9.0e-16) 16. GSL DIF: elapsed time t=1.22138 s, 2048 iters, t-(init.)=1.19436 s t(norm)=0.126559, mflops=39.5073 (err=7.8e-16) 17. Krukar: elapsed time t=1.22491 s, 1024 iters, t-(init.)=1.21092 s t(norm)=0.256628, mflops=19.4835 (err=6.9e-16) 18. Mayer (Buneman): elapsed time t=1.3439 s, 2048 iters, t-(init.)=1.31705 s t(norm)=0.139559, mflops=35.8271 (err=6.5e-16) 19. Mayer (simple): elapsed time t=1.00009 s, 2048 iters, t-(init.)=0.973233 s t(norm)=0.103127, mflops=48.4837 20. Mayer (lookup): elapsed time t=1.96887 s, 4096 iters, t-(init.)=1.91506 s t(norm)=0.101464, mflops=49.2787 (err=6.5e-16) 21. NAPACK (f2c): elapsed time t=1.50547 s, 512 iters, t-(init.)=1.49838 s t(norm)=0.635097, mflops=7.87282 (err=6.7e-15) 22. Nielsen: elapsed time t=1.60821 s, 1024 iters, t-(init.)=1.59449 s t(norm)=0.337916, mflops=14.7966 (err=3.2e-15) 23. NR (C): elapsed time t=1.26844 s, 2048 iters, t-(init.)=1.24145 s t(norm)=0.131549, mflops=38.0087 (err=7.1e-16) 24. 
Ooura (C): elapsed time t=1.23093 s, 4096 iters, t-(init.)=1.17717 s t(norm)=0.0623687, mflops=80.1684 (err=6.9e-16) 25. QFT: elapsed time t=1.66947 s, 1024 iters, t-(init.)=1.65568 s t(norm)=0.350885, mflops=14.2497 (err=9.5e-16) 26. Ransom: elapsed time t=1.07393 s, 1024 iters, t-(init.)=1.06035 s t(norm)=0.224718, mflops=22.2501 (err=1.5e-15) 27. Singleton (f2c): elapsed time t=1.45128 s, 4096 iters, t-(init.)=1.3975 s t(norm)=0.0740424, mflops=67.5289 (err=8.4e-16) 28. Temperton (f2c): elapsed time t=1.26655 s, 512 iters, t-(init.)=1.25955 s t(norm)=0.533869, mflops=9.3656 (err=6.4e-16) 29. Valkenburg: elapsed time t=1.49168 s, 256 iters, t-(init.)=1.48822 s t(norm)=1.26158, mflops=3.96327 (err=7.4e-16) Top mflops for N=512 = 119.693 Normalized results and averages for N=512: fft 0: mflops = 43.6497 (norm. = 0.364681), norm. avg. (of 9) = 0.384863 fft 1: mflops = 43.5274 (norm. = 0.363659), norm. avg. (of 9) = 0.385421 fft 2: mflops = 31.7584 (norm. = 0.265333), norm. avg. (of 9) = 0.239852 fft 3: mflops = 14.6502 (norm. = 0.122398), norm. avg. (of 9) = 0.0776944 fft 4: mflops = 7.93454 (norm. = 0.0662908), norm. avg. (of 9) = 0.0715115 fft 5: mflops = 61.6818 (norm. = 0.515335), norm. avg. (of 9) = 0.320009 fft 6: mflops = 56.5653 (norm. = 0.472588), norm. avg. (of 9) = 0.308066 fft 7: mflops = 70.374 (norm. = 0.587955), norm. avg. (of 9) = 0.348562 fft 8: mflops = 23.3257 (norm. = 0.19488), norm. avg. (of 8) = 0.17215 fft 9: mflops = 12.3625 (norm. = 0.103286), norm. avg. (of 9) = 0.141478 fft 10: mflops = 47.4096 (norm. = 0.396094), norm. avg. (of 9) = 0.81493 fft 11: mflops = 55.9909 (norm. = 0.467789), norm. avg. (of 9) = 0.809246 fft 12: mflops = 38.5501 (norm. = 0.322075), norm. avg. (of 9) = 0.838191 fft 13: mflops = 119.693 (norm. = 1), norm. avg. (of 7) = 0.7439 fft 14: mflops = 19.926 (norm. = 0.166476), norm. avg. (of 9) = 0.249809 fft 15: mflops = 35.3398 (norm. = 0.295254), norm. avg. (of 9) = 0.221855 fft 16: mflops = 39.5073 (norm. 
= 0.330072), norm. avg. (of 9) = 0.229433 fft 17: mflops = 19.4835 (norm. = 0.162779), norm. avg. (of 9) = 0.36365 fft 18: mflops = 35.8271 (norm. = 0.299326), norm. avg. (of 8) = 0.264351 fft 19: mflops = 48.4837 (norm. = 0.405068), norm. avg. (of 8) = 0.328073 fft 20: mflops = 49.2787 (norm. = 0.41171), norm. avg. (of 8) = 0.326233 fft 21: mflops = 7.87282 (norm. = 0.0657752), norm. avg. (of 9) = 0.0628918 fft 22: mflops = 14.7966 (norm. = 0.123621), norm. avg. (of 9) = 0.094265 fft 23: mflops = 38.0087 (norm. = 0.317552), norm. avg. (of 9) = 0.231985 fft 24: mflops = 80.1684 (norm. = 0.669785), norm. avg. (of 9) = 0.584454 fft 25: mflops = 14.2497 (norm. = 0.119052), norm. avg. (of 6) = 0.256991 fft 26: mflops = 22.2501 (norm. = 0.185893), norm. avg. (of 8) = 0.115093 fft 27: mflops = 67.5289 (norm. = 0.564185), norm. avg. (of 9) = 0.34259 fft 28: mflops = 9.3656 (norm. = 0.078247), norm. avg. (of 9) = 0.0747801 fft 29: mflops = 3.96327 (norm. = 0.033112), norm. avg. (of 9) = 0.0524089 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.28368 s, 1024 iters, t-(init.)=1.2535 s t(norm)=0.119543, mflops=41.8259 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.34576 s, 1024 iters, t-(init.)=1.31614 s t(norm)=0.125517, mflops=39.8353 (err=1.0e-15) 2. Arndt Split-Radix: elapsed time t=1.83747 s, 1024 iters, t-(init.)=1.80775 s t(norm)=0.172401, mflops=29.0022 (err=1.0e-15) 3. Arndt 4-step: elapsed time t=1.73223 s, 512 iters, t-(init.)=1.71718 s t(norm)=0.327526, mflops=15.266 (err=1.0e-15) 4. Beauregard: elapsed time t=1.72141 s, 256 iters, t-(init.)=1.71379 s t(norm)=0.653761, mflops=7.64806 (err=1.1e-15) 5. Bergland: elapsed time t=1.02163 s, 1024 iters, t-(init.)=0.991767 s t(norm)=0.0945823, mflops=52.864 (err=1.1e-15) 6. CWP (min N) (N=1040): elapsed time t=1.10278 s, 1024 iters, t-(init.)=1.0587 s t(norm)=0.100966, mflops=49.5218 7. 
CWP (best N) (N=1040): elapsed time t=1.10355 s, 1024 iters, t-(init.)=1.05947 s t(norm)=0.101039, mflops=49.4859 8. Edelblute: elapsed time t=1.18575 s, 512 iters, t-(init.)=1.17073 s t(norm)=0.2233, mflops=22.3914 (err=1.0e-15) 9. FFTPACK (f2c): elapsed time t=1.70188 s, 256 iters, t-(init.)=1.69455 s t(norm)=0.646419, mflops=7.73492 (err=1.1e-15) FFTW_MEASURE plan: (cost = 1.131047e-03) FFTW_TWIDDLE 16 FFTW_NOTW 64 10. FFTW: elapsed time t=1.17418 s, 1024 iters, t-(init.)=1.14454 s t(norm)=0.109152, mflops=45.8079 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.62469 s, 1024 iters, t-(init.)=1.59404 s t(norm)=0.15202, mflops=32.8905 (err=1.1e-15) 12. Frigo-old: elapsed time t=1.67778 s, 512 iters, t-(init.)=1.66255 s t(norm)=0.317106, mflops=15.7676 (err=1.1e-15) 13. Green: elapsed time t=1.44088 s, 2048 iters, t-(init.)=1.38129 s t(norm)=0.0658648, mflops=75.913 (err=1.1e-15) 14. GSL: elapsed time t=1.35471 s, 256 iters, t-(init.)=1.34738 s t(norm)=0.513985, mflops=9.72791 (err=1.1e-15) 15. GSL DIT: elapsed time t=1.70031 s, 1024 iters, t-(init.)=1.67018 s t(norm)=0.15928, mflops=31.3912 (err=1.3e-15) 16. GSL DIF: elapsed time t=1.49942 s, 1024 iters, t-(init.)=1.46952 s t(norm)=0.140145, mflops=35.6775 (err=1.4e-15) 17. Krukar: elapsed time t=1.71456 s, 512 iters, t-(init.)=1.69949 s t(norm)=0.324152, mflops=15.4249 (err=1.1e-15) 18. Mayer (Buneman): elapsed time t=1.4508 s, 1024 iters, t-(init.)=1.42096 s t(norm)=0.135513, mflops=36.8967 (err=1.0e-15) 19. Mayer (simple): elapsed time t=1.08696 s, 1024 iters, t-(init.)=1.05712 s t(norm)=0.100814, mflops=49.5961 20. Mayer (lookup): elapsed time t=1.23876 s, 1024 iters, t-(init.)=1.20892 s t(norm)=0.115291, mflops=43.3684 (err=1.0e-15) 21. NAPACK (f2c): elapsed time t=1.70282 s, 128 iters, t-(init.)=1.69902 s t(norm)=1.29625, mflops=3.85728 (err=1.6e-14) 22. 
Nielsen: elapsed time t=1.295 s, 256 iters, t-(init.)=1.28702 s t(norm)=0.490961, mflops=10.1841 (err=7.2e-15) 23. NR (C): elapsed time t=1.58824 s, 1024 iters, t-(init.)=1.5584 s t(norm)=0.148621, mflops=33.6427 (err=1.2e-15) 24. Ooura (C): elapsed time t=1.94547 s, 2048 iters, t-(init.)=1.88584 s t(norm)=0.089924, mflops=55.6025 (err=1.1e-15) 25. QFT: elapsed time t=1.05896 s, 256 iters, t-(init.)=1.05066 s t(norm)=0.400793, mflops=12.4753 (err=1.4e-15) 26. Ransom: elapsed time t=1.02098 s, 512 iters, t-(init.)=1.00588 s t(norm)=0.191856, mflops=26.0612 (err=2.1e-15) 27. Singleton (f2c): elapsed time t=1.71011 s, 2048 iters, t-(init.)=1.65058 s t(norm)=0.0787058, mflops=63.5277 (err=1.6e-15) 28. Temperton (f2c): elapsed time t=1.43133 s, 256 iters, t-(init.)=1.42402 s t(norm)=0.543219, mflops=9.20439 (err=1.1e-15) 29. Valkenburg: elapsed time t=1.01084 s, 64 iters, t-(init.)=1.00861 s t(norm)=1.53902, mflops=3.24882 (err=1.1e-15) Top mflops for N=1024 = 75.913 Normalized results and averages for N=1024: fft 0: mflops = 41.8259 (norm. = 0.550971), norm. avg. (of 10) = 0.401474 fft 1: mflops = 39.8353 (norm. = 0.52475), norm. avg. (of 10) = 0.399354 fft 2: mflops = 29.0022 (norm. = 0.382045), norm. avg. (of 10) = 0.254071 fft 3: mflops = 15.266 (norm. = 0.201098), norm. avg. (of 10) = 0.0900348 fft 4: mflops = 7.64806 (norm. = 0.100748), norm. avg. (of 10) = 0.0744352 fft 5: mflops = 52.864 (norm. = 0.696376), norm. avg. (of 10) = 0.357646 fft 6: mflops = 49.5218 (norm. = 0.652349), norm. avg. (of 10) = 0.342494 fft 7: mflops = 49.4859 (norm. = 0.651877), norm. avg. (of 10) = 0.378893 fft 8: mflops = 22.3914 (norm. = 0.294962), norm. avg. (of 9) = 0.185795 fft 9: mflops = 7.73492 (norm. = 0.101892), norm. avg. (of 10) = 0.13752 fft 10: mflops = 45.8079 (norm. = 0.603426), norm. avg. (of 10) = 0.793779 fft 11: mflops = 32.8905 (norm. = 0.433266), norm. avg. (of 10) = 0.771648 fft 12: mflops = 15.7676 (norm. = 0.207706), norm. avg. 
(of 10) = 0.775142 fft 13: mflops = 75.913 (norm. = 1), norm. avg. (of 8) = 0.775912 fft 14: mflops = 9.72791 (norm. = 0.128145), norm. avg. (of 10) = 0.237643 fft 15: mflops = 31.3912 (norm. = 0.413515), norm. avg. (of 10) = 0.241021 fft 16: mflops = 35.6775 (norm. = 0.469978), norm. avg. (of 10) = 0.253487 fft 17: mflops = 15.4249 (norm. = 0.203191), norm. avg. (of 10) = 0.347604 fft 18: mflops = 36.8967 (norm. = 0.486039), norm. avg. (of 9) = 0.288983 fft 19: mflops = 49.5961 (norm. = 0.653328), norm. avg. (of 9) = 0.364212 fft 20: mflops = 43.3684 (norm. = 0.57129), norm. avg. (of 9) = 0.353461 fft 21: mflops = 3.85728 (norm. = 0.0508118), norm. avg. (of 10) = 0.0616838 fft 22: mflops = 10.1841 (norm. = 0.134155), norm. avg. (of 10) = 0.098254 fft 23: mflops = 33.6427 (norm. = 0.443174), norm. avg. (of 10) = 0.253104 fft 24: mflops = 55.6025 (norm. = 0.73245), norm. avg. (of 10) = 0.599254 fft 25: mflops = 12.4753 (norm. = 0.164336), norm. avg. (of 7) = 0.243755 fft 26: mflops = 26.0612 (norm. = 0.343304), norm. avg. (of 9) = 0.140449 fft 27: mflops = 63.5277 (norm. = 0.836848), norm. avg. (of 10) = 0.392016 fft 28: mflops = 9.20439 (norm. = 0.121249), norm. avg. (of 10) = 0.079427 fft 29: mflops = 3.24882 (norm. = 0.0427966), norm. avg. (of 10) = 0.0514477 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.17039 s, 128 iters, t-(init.)=1.11997 s t(norm)=0.388395, mflops=12.8735 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.17693 s, 128 iters, t-(init.)=1.12658 s t(norm)=0.390688, mflops=12.7979 (err=1.4e-15) 2. Arndt Split-Radix: elapsed time t=1.515 s, 128 iters, t-(init.)=1.46447 s t(norm)=0.507865, mflops=9.84513 (err=1.4e-15) 3. Arndt 4-step: elapsed time t=1.27567 s, 128 iters, t-(init.)=1.22474 s t(norm)=0.424729, mflops=11.7722 (err=1.4e-15) 4. Beauregard: elapsed time t=1.15467 s, 64 iters, t-(init.)=1.12944 s t(norm)=0.783358, mflops=6.38277 (err=1.5e-15) 5. 
Bergland: elapsed time t=1.282 s, 256 iters, t-(init.)=1.18113 s t(norm)=0.204802, mflops=24.4139 (err=1.5e-15) 6. CWP (min N) (N=2145): elapsed time t=1.92054 s, 512 iters, t-(init.)=1.70924 s t(norm)=0.148187, mflops=33.7411 7. CWP (best N) (N=2184): elapsed time t=1.81065 s, 512 iters, t-(init.)=1.5942 s t(norm)=0.138214, mflops=36.1758 8. Edelblute: elapsed time t=1.65299 s, 128 iters, t-(init.)=1.60245 s t(norm)=0.555716, mflops=8.99741 (err=1.4e-15) 9. FFTPACK (f2c): elapsed time t=1.809 s, 128 iters, t-(init.)=1.7586 s t(norm)=0.609866, mflops=8.19852 (err=1.5e-15) FFTW_MEASURE plan: (cost = 2.743719e-03) FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_NOTW 32 10. FFTW: elapsed time t=1.22274 s, 256 iters, t-(init.)=1.12113 s t(norm)=0.194399, mflops=25.7203 (err=1.5e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.29397 s, 256 iters, t-(init.)=1.19257 s t(norm)=0.206786, mflops=24.1796 (err=1.5e-15) 12. Frigo-old: elapsed time t=1.01421 s, 128 iters, t-(init.)=0.963484 s t(norm)=0.334127, mflops=14.9644 (err=1.5e-15) 13. Green: elapsed time t=1.12985 s, 256 iters, t-(init.)=1.0289 s t(norm)=0.178406, mflops=28.0259 (err=1.5e-15) 14. GSL: elapsed time t=1.44844 s, 128 iters, t-(init.)=1.39788 s t(norm)=0.484771, mflops=10.3141 (err=1.5e-15) 15. GSL DIT: elapsed time t=1.24388 s, 128 iters, t-(init.)=1.19338 s t(norm)=0.413854, mflops=12.0815 (err=2.1e-15) 16. GSL DIF: elapsed time t=1.16396 s, 128 iters, t-(init.)=1.11345 s t(norm)=0.386134, mflops=12.9489 (err=2.2e-15) 17. Krukar: elapsed time t=1.06386 s, 128 iters, t-(init.)=1.01314 s t(norm)=0.351349, mflops=14.2309 (err=1.5e-15) 18. Mayer (Buneman): elapsed time t=1.93711 s, 512 iters, t-(init.)=1.7353 s t(norm)=0.150447, mflops=33.2344 (err=1.4e-15) 19. Mayer (simple): elapsed time t=1.52079 s, 512 iters, t-(init.)=1.31906 s t(norm)=0.11436, mflops=43.7218 20. 
Mayer (lookup): elapsed time t=1.12911 s, 256 iters, t-(init.)=1.02815 s t(norm)=0.178277, mflops=28.0463 (err=1.4e-15) 21. NAPACK (f2c): elapsed time t=1.0185 s, 32 iters, t-(init.)=1.00572 s t(norm)=1.39509, mflops=3.58399 (err=1.5e-14) 22. Nielsen: elapsed time t=1.84881 s, 128 iters, t-(init.)=1.79745 s t(norm)=0.623337, mflops=8.02134 (err=1.2e-14) 23. NR (C): elapsed time t=1.19012 s, 128 iters, t-(init.)=1.13888 s t(norm)=0.394953, mflops=12.6597 (err=1.6e-15) 24. Ooura (C): elapsed time t=1.07628 s, 256 iters, t-(init.)=0.975526 s t(norm)=0.169152, mflops=29.5593 (err=1.4e-15) 25. QFT: elapsed time t=1.32187 s, 128 iters, t-(init.)=1.26972 s t(norm)=0.440327, mflops=11.3552 (err=1.9e-15) 26. Ransom: elapsed time t=1.90461 s, 256 iters, t-(init.)=1.80282 s t(norm)=0.312601, mflops=15.9948 (err=2.6e-15) 27. Singleton (f2c): elapsed time t=1.56201 s, 256 iters, t-(init.)=1.46123 s t(norm)=0.25337, mflops=19.734 (err=2.0e-15) 28. Temperton (f2c): elapsed time t=1.873 s, 128 iters, t-(init.)=1.82244 s t(norm)=0.632005, mflops=7.91133 (err=1.5e-15) 29. Valkenburg: elapsed time t=1.34166 s, 32 iters, t-(init.)=1.32832 s t(norm)=1.8426, mflops=2.71356 (err=1.5e-15) Top mflops for N=2048 = 43.7218 Normalized results and averages for N=2048: fft 0: mflops = 12.8735 (norm. = 0.294441), norm. avg. (of 11) = 0.391744 fft 1: mflops = 12.7979 (norm. = 0.292713), norm. avg. (of 11) = 0.389659 fft 2: mflops = 9.84513 (norm. = 0.225177), norm. avg. (of 11) = 0.251445 fft 3: mflops = 11.7722 (norm. = 0.269253), norm. avg. (of 11) = 0.106327 fft 4: mflops = 6.38277 (norm. = 0.145986), norm. avg. (of 11) = 0.0809398 fft 5: mflops = 24.4139 (norm. = 0.558391), norm. avg. (of 11) = 0.375896 fft 6: mflops = 33.7411 (norm. = 0.771724), norm. avg. (of 11) = 0.381515 fft 7: mflops = 36.1758 (norm. = 0.82741), norm. avg. (of 11) = 0.419668 fft 8: mflops = 8.99741 (norm. = 0.205788), norm. avg. (of 10) = 0.187795 fft 9: mflops = 8.19852 (norm. = 0.187516), norm. avg. 
(of 11) = 0.142065 fft 10: mflops = 25.7203 (norm. = 0.588271), norm. avg. (of 11) = 0.775097 fft 11: mflops = 24.1796 (norm. = 0.553033), norm. avg. (of 11) = 0.751774 fft 12: mflops = 14.9644 (norm. = 0.342263), norm. avg. (of 11) = 0.73579 fft 13: mflops = 28.0259 (norm. = 0.641007), norm. avg. (of 9) = 0.760923 fft 14: mflops = 10.3141 (norm. = 0.235904), norm. avg. (of 11) = 0.237485 fft 15: mflops = 12.0815 (norm. = 0.276328), norm. avg. (of 11) = 0.244231 fft 16: mflops = 12.9489 (norm. = 0.296165), norm. avg. (of 11) = 0.257367 fft 17: mflops = 14.2309 (norm. = 0.325487), norm. avg. (of 11) = 0.345594 fft 18: mflops = 33.2344 (norm. = 0.760133), norm. avg. (of 10) = 0.336098 fft 19: mflops = 43.7218 (norm. = 1), norm. avg. (of 10) = 0.427791 fft 20: mflops = 28.0463 (norm. = 0.641472), norm. avg. (of 10) = 0.382262 fft 21: mflops = 3.58399 (norm. = 0.0819728), norm. avg. (of 11) = 0.0635282 fft 22: mflops = 8.02134 (norm. = 0.183463), norm. avg. (of 11) = 0.106 fft 23: mflops = 12.6597 (norm. = 0.289552), norm. avg. (of 11) = 0.256417 fft 24: mflops = 29.5593 (norm. = 0.676077), norm. avg. (of 11) = 0.606238 fft 25: mflops = 11.3552 (norm. = 0.259715), norm. avg. (of 8) = 0.24575 fft 26: mflops = 15.9948 (norm. = 0.365832), norm. avg. (of 10) = 0.162988 fft 27: mflops = 19.734 (norm. = 0.451354), norm. avg. (of 11) = 0.39741 fft 28: mflops = 7.91133 (norm. = 0.180947), norm. avg. (of 11) = 0.0886561 fft 29: mflops = 2.71356 (norm. = 0.0620643), norm. avg. (of 11) = 0.0524128 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.37351 s, 64 iters, t-(init.)=1.32314 s t(norm)=0.420615, mflops=11.8873 (err=2.5e-15) 1. Arndt DIT: elapsed time t=1.38177 s, 64 iters, t-(init.)=1.3312 s t(norm)=0.423177, mflops=11.8154 (err=2.5e-15) 2. Arndt Split-Radix: elapsed time t=1.75519 s, 64 iters, t-(init.)=1.70461 s t(norm)=0.54188, mflops=9.22713 (err=2.5e-15) 3. 
Arndt 4-step: elapsed time t=1.29515 s, 64 iters, t-(init.)=1.24451 s t(norm)=0.39562, mflops=12.6384 (err=2.5e-15) 4. Beauregard: elapsed time t=1.27291 s, 32 iters, t-(init.)=1.24747 s t(norm)=0.793118, mflops=6.30423 (err=2.6e-15) 5. Bergland: elapsed time t=1.32474 s, 128 iters, t-(init.)=1.22329 s t(norm)=0.194438, mflops=25.7152 (err=2.5e-15) 6. CWP (min N) (N=4290): elapsed time t=1.14576 s, 128 iters, t-(init.)=1.03994 s t(norm)=0.165294, mflops=30.2491 7. CWP (best N) (N=4368): elapsed time t=1.00908 s, 128 iters, t-(init.)=0.901375 s t(norm)=0.14327, mflops=34.8992 8. Edelblute: elapsed time t=1.89901 s, 64 iters, t-(init.)=1.84846 s t(norm)=0.58761, mflops=8.50904 (err=2.5e-15) 9. FFTPACK (f2c): elapsed time t=1.88515 s, 64 iters, t-(init.)=1.83444 s t(norm)=0.583151, mflops=8.5741 (err=2.6e-15) FFTW_MEASURE plan: (cost = 6.283125e-03) FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_NOTW 64 10. FFTW: elapsed time t=1.36797 s, 128 iters, t-(init.)=1.2657 s t(norm)=0.201177, mflops=24.8537 (err=2.6e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.34586 s, 128 iters, t-(init.)=1.24347 s t(norm)=0.197644, mflops=25.298 (err=2.6e-15) 12. Frigo-old: elapsed time t=1.11407 s, 64 iters, t-(init.)=1.06242 s t(norm)=0.337733, mflops=14.8046 (err=2.6e-15) 13. Green: elapsed time t=1.19021 s, 128 iters, t-(init.)=1.08914 s t(norm)=0.173114, mflops=28.8827 (err=2.6e-15) 14. GSL: elapsed time t=1.45754 s, 64 iters, t-(init.)=1.40687 s t(norm)=0.447231, mflops=11.1799 (err=2.6e-15) 15. GSL DIT: elapsed time t=1.38769 s, 64 iters, t-(init.)=1.33678 s t(norm)=0.424952, mflops=11.766 (err=3.0e-15) 16. GSL DIF: elapsed time t=1.28803 s, 64 iters, t-(init.)=1.23751 s t(norm)=0.393393, mflops=12.7099 (err=3.1e-15) 17. Krukar: elapsed time t=1.16947 s, 64 iters, t-(init.)=1.11902 s t(norm)=0.355728, mflops=14.0557 (err=2.6e-15) 18. 
Mayer (Buneman): elapsed time t=1.13377 s, 64 iters, t-(init.)=1.08294 s t(norm)=0.344258, mflops=14.524 (err=2.5e-15) 19. Mayer (simple): elapsed time t=1.04076 s, 64 iters, t-(init.)=0.990136 s t(norm)=0.314756, mflops=15.8853 20. Mayer (lookup): elapsed time t=1.18232 s, 64 iters, t-(init.)=1.13173 s t(norm)=0.359768, mflops=13.8978 (err=2.5e-15) 21. NAPACK (f2c): elapsed time t=1.09983 s, 16 iters, t-(init.)=1.08664 s t(norm)=1.38174, mflops=3.61864 (err=4.7e-14) 22. Nielsen: elapsed time t=1.87242 s, 64 iters, t-(init.)=1.82053 s t(norm)=0.57873, mflops=8.63961 (err=2.2e-14) 23. NR (C): elapsed time t=1.32741 s, 64 iters, t-(init.)=1.27681 s t(norm)=0.405889, mflops=12.3187 (err=2.6e-15) 24. Ooura (C): elapsed time t=1.13654 s, 128 iters, t-(init.)=1.03541 s t(norm)=0.164575, mflops=30.3814 (err=2.5e-15) 25. QFT: elapsed time t=1.67884 s, 64 iters, t-(init.)=1.62686 s t(norm)=0.517166, mflops=9.66808 (err=3.1e-15) 26. Ransom: elapsed time t=1.52714 s, 128 iters, t-(init.)=1.42633 s t(norm)=0.226708, mflops=22.0548 (err=3.1e-15) 27. Singleton (f2c): elapsed time t=1.43677 s, 128 iters, t-(init.)=1.33598 s t(norm)=0.212349, mflops=23.5462 (err=3.8e-15) 28. Temperton (f2c): elapsed time t=1.91869 s, 64 iters, t-(init.)=1.86809 s t(norm)=0.59385, mflops=8.41964 (err=2.6e-15) 29. Valkenburg: elapsed time t=1.49653 s, 16 iters, t-(init.)=1.48253 s t(norm)=1.88513, mflops=2.65233 (err=2.5e-15) Top mflops for N=4096 = 34.8992 Normalized results and averages for N=4096: fft 0: mflops = 11.8873 (norm. = 0.340619), norm. avg. (of 12) = 0.387484 fft 1: mflops = 11.8154 (norm. = 0.338557), norm. avg. (of 12) = 0.385401 fft 2: mflops = 9.22713 (norm. = 0.264394), norm. avg. (of 12) = 0.252524 fft 3: mflops = 12.6384 (norm. = 0.36214), norm. avg. (of 12) = 0.127645 fft 4: mflops = 6.30423 (norm. = 0.180641), norm. avg. (of 12) = 0.0892482 fft 5: mflops = 25.7152 (norm. = 0.736842), norm. avg. (of 12) = 0.405975 fft 6: mflops = 30.2491 (norm. = 0.866757), norm. avg. 
(of 12) = 0.421952 fft 7: mflops = 34.8992 (norm. = 1), norm. avg. (of 12) = 0.468029 fft 8: mflops = 8.50904 (norm. = 0.243818), norm. avg. (of 11) = 0.192888 fft 9: mflops = 8.5741 (norm. = 0.245682), norm. avg. (of 12) = 0.1507 fft 10: mflops = 24.8537 (norm. = 0.712157), norm. avg. (of 12) = 0.769852 fft 11: mflops = 25.298 (norm. = 0.724888), norm. avg. (of 12) = 0.749534 fft 12: mflops = 14.8046 (norm. = 0.42421), norm. avg. (of 12) = 0.709825 fft 13: mflops = 28.8827 (norm. = 0.827602), norm. avg. (of 10) = 0.767591 fft 14: mflops = 11.1799 (norm. = 0.320349), norm. avg. (of 12) = 0.24439 fft 15: mflops = 11.766 (norm. = 0.337143), norm. avg. (of 12) = 0.251974 fft 16: mflops = 12.7099 (norm. = 0.36419), norm. avg. (of 12) = 0.266269 fft 17: mflops = 14.0557 (norm. = 0.402751), norm. avg. (of 12) = 0.350357 fft 18: mflops = 14.524 (norm. = 0.416169), norm. avg. (of 11) = 0.343377 fft 19: mflops = 15.8853 (norm. = 0.455177), norm. avg. (of 11) = 0.430281 fft 20: mflops = 13.8978 (norm. = 0.398228), norm. avg. (of 11) = 0.383714 fft 21: mflops = 3.61864 (norm. = 0.103688), norm. avg. (of 12) = 0.0668749 fft 22: mflops = 8.63961 (norm. = 0.247559), norm. avg. (of 12) = 0.117797 fft 23: mflops = 12.3187 (norm. = 0.352978), norm. avg. (of 12) = 0.264464 fft 24: mflops = 30.3814 (norm. = 0.870546), norm. avg. (of 12) = 0.628264 fft 25: mflops = 9.66808 (norm. = 0.277029), norm. avg. (of 9) = 0.249225 fft 26: mflops = 22.0548 (norm. = 0.631956), norm. avg. (of 11) = 0.205621 fft 27: mflops = 23.5462 (norm. = 0.67469), norm. avg. (of 12) = 0.420517 fft 28: mflops = 8.41964 (norm. = 0.241256), norm. avg. (of 12) = 0.101373 fft 29: mflops = 2.65233 (norm. = 0.0759997), norm. avg. (of 12) = 0.0543784 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.47232 s, 32 iters, t-(init.)=1.42159 s t(norm)=0.41715, mflops=11.9861 (err=3.0e-15) 1. 
Arndt DIT: elapsed time t=1.48411 s, 32 iters, t-(init.)=1.43307 s t(norm)=0.420516, mflops=11.8902 (err=3.0e-15) 2. Arndt Split-Radix: elapsed time t=1.92336 s, 32 iters, t-(init.)=1.87237 s t(norm)=0.549424, mflops=9.10044 (err=3.0e-15) 3. Arndt 4-step: elapsed time t=1.60201 s, 32 iters, t-(init.)=1.55077 s t(norm)=0.455054, mflops=10.9877 (err=2.9e-15) 4. Beauregard: elapsed time t=1.38471 s, 16 iters, t-(init.)=1.35899 s t(norm)=0.797561, mflops=6.26911 (err=2.9e-15) 5. Bergland: elapsed time t=1.53647 s, 64 iters, t-(init.)=1.43467 s t(norm)=0.210493, mflops=23.7538 (err=2.9e-15) 6. CWP (min N) (N=8580): elapsed time t=1.17564 s, 64 iters, t-(init.)=1.06914 s t(norm)=0.156863, mflops=31.875 7. CWP (best N) (N=9240): elapsed time t=1.14312 s, 64 iters, t-(init.)=1.02819 s t(norm)=0.150855, mflops=33.1443 8. Edelblute: elapsed time t=1.03477 s, 16 iters, t-(init.)=1.00912 s t(norm)=0.592228, mflops=8.4427 (err=3.0e-15) 9. FFTPACK (f2c): elapsed time t=1.15114 s, 16 iters, t-(init.)=1.12547 s t(norm)=0.660512, mflops=7.56988 (err=2.9e-15) FFTW_MEASURE plan: (cost = 1.425975e-02) FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_NOTW 32 10. FFTW: elapsed time t=1.46963 s, 64 iters, t-(init.)=1.36565 s t(norm)=0.200367, mflops=24.9542 (err=2.9e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.41744 s, 64 iters, t-(init.)=1.31308 s t(norm)=0.192654, mflops=25.9532 (err=2.9e-15) 12. Frigo-old: elapsed time t=1.13122 s, 32 iters, t-(init.)=1.07816 s t(norm)=0.316375, mflops=15.804 (err=2.9e-15) 13. Green: elapsed time t=1.36693 s, 64 iters, t-(init.)=1.26518 s t(norm)=0.185626, mflops=26.9358 (err=2.9e-15) 14. GSL: elapsed time t=1.85492 s, 32 iters, t-(init.)=1.80394 s t(norm)=0.529346, mflops=9.44561 (err=2.9e-15) 15. GSL DIT: elapsed time t=1.52018 s, 32 iters, t-(init.)=1.46919 s t(norm)=0.431117, mflops=11.5978 (err=3.6e-15) 16. 
GSL DIF: elapsed time t=1.40442 s, 32 iters, t-(init.)=1.35365 s t(norm)=0.397214, mflops=12.5877 (err=3.6e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.25416 s, 32 iters, t-(init.)=1.20316 s t(norm)=0.353053, mflops=14.1622 (err=2.9e-15) 19. Mayer (simple): elapsed time t=1.16187 s, 32 iters, t-(init.)=1.11066 s t(norm)=0.325909, mflops=15.3417 20. Mayer (lookup): elapsed time t=1.30088 s, 32 iters, t-(init.)=1.25002 s t(norm)=0.366805, mflops=13.6312 (err=3.0e-15) 21. NAPACK (f2c): elapsed time t=1.23202 s, 8 iters, t-(init.)=1.2191 s t(norm)=1.43092, mflops=3.49426 (err=4.3e-14) 22. Nielsen: elapsed time t=1.09526 s, 16 iters, t-(init.)=1.06713 s t(norm)=0.626272, mflops=7.98375 (err=1.1e-14) 23. NR (C): elapsed time t=1.44866 s, 32 iters, t-(init.)=1.39768 s t(norm)=0.410134, mflops=12.1911 (err=3.0e-15) 24. Ooura (C): elapsed time t=1.27216 s, 64 iters, t-(init.)=1.17048 s t(norm)=0.171731, mflops=29.1152 (err=2.9e-15) 25. QFT: elapsed time t=1.99519 s, 32 iters, t-(init.)=1.9417 s t(norm)=0.56977, mflops=8.77547 (err=4.0e-15) 26. Ransom: elapsed time t=1.96662 s, 64 iters, t-(init.)=1.86504 s t(norm)=0.273637, mflops=18.2724 (err=4.1e-15) 27. Singleton (f2c): elapsed time t=1.65527 s, 64 iters, t-(init.)=1.55319 s t(norm)=0.227883, mflops=21.9411 (err=4.4e-15) 28. Temperton (f2c): elapsed time t=1.12219 s, 16 iters, t-(init.)=1.09669 s t(norm)=0.643623, mflops=7.76853 (err=2.9e-15) 29. Valkenburg: elapsed time t=1.66127 s, 8 iters, t-(init.)=1.64574 s t(norm)=1.93169, mflops=2.58841 (err=2.9e-15) Top mflops for N=8192 = 33.1443 Normalized results and averages for N=8192: fft 0: mflops = 11.9861 (norm. = 0.361633), norm. avg. (of 13) = 0.385495 fft 1: mflops = 11.8902 (norm. = 0.358738), norm. avg. (of 13) = 0.38335 fft 2: mflops = 9.10044 (norm. = 0.27457), norm. avg. (of 13) = 0.25422 fft 3: mflops = 10.9877 (norm. = 0.331511), norm. avg. (of 13) = 0.143327 fft 4: mflops = 6.26911 (norm. = 0.189146), norm. avg. 
(of 13) = 0.0969327 fft 5: mflops = 23.7538 (norm. = 0.716676), norm. avg. (of 13) = 0.429875 fft 6: mflops = 31.875 (norm. = 0.961702), norm. avg. (of 13) = 0.463471 fft 7: mflops = 33.1443 (norm. = 1), norm. avg. (of 13) = 0.508949 fft 8: mflops = 8.4427 (norm. = 0.254725), norm. avg. (of 12) = 0.198041 fft 9: mflops = 7.56988 (norm. = 0.228391), norm. avg. (of 13) = 0.156676 fft 10: mflops = 24.9542 (norm. = 0.752894), norm. avg. (of 13) = 0.768547 fft 11: mflops = 25.9532 (norm. = 0.783036), norm. avg. (of 13) = 0.752111 fft 12: mflops = 15.804 (norm. = 0.476825), norm. avg. (of 13) = 0.691902 fft 13: mflops = 26.9358 (norm. = 0.812682), norm. avg. (of 11) = 0.77169 fft 14: mflops = 9.44561 (norm. = 0.284984), norm. avg. (of 13) = 0.247513 fft 15: mflops = 11.5978 (norm. = 0.349917), norm. avg. (of 13) = 0.259508 fft 16: mflops = 12.5877 (norm. = 0.379783), norm. avg. (of 13) = 0.275001 fft 17: mflops = -1 (norm. = -0.0301711), norm. avg. (of 12) = 0.350357 fft 18: mflops = 14.1622 (norm. = 0.427288), norm. avg. (of 12) = 0.35037 fft 19: mflops = 15.3417 (norm. = 0.462875), norm. avg. (of 12) = 0.432997 fft 20: mflops = 13.6312 (norm. = 0.411268), norm. avg. (of 12) = 0.38601 fft 21: mflops = 3.49426 (norm. = 0.105426), norm. avg. (of 13) = 0.0698403 fft 22: mflops = 7.98375 (norm. = 0.240878), norm. avg. (of 13) = 0.127265 fft 23: mflops = 12.1911 (norm. = 0.36782), norm. avg. (of 13) = 0.272414 fft 24: mflops = 29.1152 (norm. = 0.878438), norm. avg. (of 13) = 0.647508 fft 25: mflops = 8.77547 (norm. = 0.264765), norm. avg. (of 10) = 0.250779 fft 26: mflops = 18.2724 (norm. = 0.551297), norm. avg. (of 12) = 0.234427 fft 27: mflops = 21.9411 (norm. = 0.661985), norm. avg. (of 13) = 0.439091 fft 28: mflops = 7.76853 (norm. = 0.234385), norm. avg. (of 13) = 0.111604 fft 29: mflops = 2.58841 (norm. = 0.078095), norm. avg. (of 13) = 0.0562028 Benchmarking for array size = 16384 (power of 2): 0. 
Arndt DIF: elapsed time t=1.67053 s, 16 iters, t-(init.)=1.61883 s t(norm)=0.441095, mflops=11.3354 (err=5.6e-15) 1. Arndt DIT: elapsed time t=1.68444 s, 16 iters, t-(init.)=1.63263 s t(norm)=0.444856, mflops=11.2396 (err=5.6e-15) 2. Arndt Split-Radix: elapsed time t=1.04551 s, 8 iters, t-(init.)=1.01942 s t(norm)=0.555541, mflops=9.00024 (err=5.6e-15) 3. Arndt 4-step: elapsed time t=1.34889 s, 16 iters, t-(init.)=1.29725 s t(norm)=0.353472, mflops=14.1454 (err=5.6e-15) 4. Beauregard: elapsed time t=1.50987 s, 8 iters, t-(init.)=1.48375 s t(norm)=0.808578, mflops=6.18369 (err=5.7e-15) 5. Bergland: elapsed time t=1.65736 s, 32 iters, t-(init.)=1.55412 s t(norm)=0.211732, mflops=23.6147 (err=5.7e-15) 6. CWP (min N) (N=17160): elapsed time t=1.23956 s, 32 iters, t-(init.)=1.13173 s t(norm)=0.154186, mflops=32.4283 7. CWP (best N) (N=17160): elapsed time t=1.23945 s, 32 iters, t-(init.)=1.13045 s t(norm)=0.154012, mflops=32.4651 8. Edelblute: elapsed time t=1.11881 s, 8 iters, t-(init.)=1.09268 s t(norm)=0.595462, mflops=8.39685 (err=5.6e-15) 9. FFTPACK (f2c): elapsed time t=1.21141 s, 8 iters, t-(init.)=1.18487 s t(norm)=0.645702, mflops=7.74351 (err=5.7e-15) FFTW_MEASURE plan: (cost = 3.758850e-02) FFTW_TWIDDLE 64 FFTW_TWIDDLE 8 FFTW_NOTW 32 10. FFTW: elapsed time t=1.46815 s, 32 iters, t-(init.)=1.35997 s t(norm)=0.185281, mflops=26.986 (err=5.7e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.67328 s, 32 iters, t-(init.)=1.56471 s t(norm)=0.213175, mflops=23.4549 (err=5.7e-15) 12. Frigo-old: elapsed time t=1.4974 s, 16 iters, t-(init.)=1.44063 s t(norm)=0.392541, mflops=12.7375 (err=5.7e-15) 13. Green: elapsed time t=1.57808 s, 32 iters, t-(init.)=1.47464 s t(norm)=0.200903, mflops=24.8876 (err=5.7e-15) 14. GSL: elapsed time t=1.85708 s, 16 iters, t-(init.)=1.80517 s t(norm)=0.491869, mflops=10.1653 (err=5.7e-15) 15. 
GSL DIT: elapsed time t=1.65842 s, 16 iters, t-(init.)=1.60707 s t(norm)=0.437891, mflops=11.4184 (err=6.3e-15) 16. GSL DIF: elapsed time t=1.52305 s, 16 iters, t-(init.)=1.47138 s t(norm)=0.40092, mflops=12.4713 (err=6.4e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.39051 s, 16 iters, t-(init.)=1.33889 s t(norm)=0.364819, mflops=13.7054 (err=5.6e-15) 19. Mayer (simple): elapsed time t=1.29798 s, 16 iters, t-(init.)=1.2463 s t(norm)=0.33959, mflops=14.7237 20. Mayer (lookup): elapsed time t=1.44075 s, 16 iters, t-(init.)=1.38906 s t(norm)=0.378489, mflops=13.2104 (err=5.6e-15) 21. NAPACK (f2c): elapsed time t=1.30728 s, 4 iters, t-(init.)=1.29227 s t(norm)=1.40847, mflops=3.54996 (err=2.3e-13) 22. Nielsen: elapsed time t=1.1334 s, 8 iters, t-(init.)=1.10218 s t(norm)=0.600639, mflops=8.32447 (err=1.3e-13) 23. NR (C): elapsed time t=1.57988 s, 16 iters, t-(init.)=1.52848 s t(norm)=0.416477, mflops=12.0055 (err=5.6e-15) 24. Ooura (C): elapsed time t=1.30896 s, 32 iters, t-(init.)=1.20577 s t(norm)=0.164273, mflops=30.4372 (err=5.7e-15) 25. QFT: elapsed time t=1.18404 s, 8 iters, t-(init.)=1.15275 s t(norm)=0.628197, mflops=7.95929 (err=7.0e-15) 26. Ransom: elapsed time t=1.62163 s, 32 iters, t-(init.)=1.51888 s t(norm)=0.206931, mflops=24.1627 (err=6.0e-15) 27. Singleton (f2c): elapsed time t=1.69811 s, 32 iters, t-(init.)=1.59496 s t(norm)=0.217297, mflops=23.01 (err=8.5e-15) 28. Temperton (f2c): elapsed time t=1.12077 s, 8 iters, t-(init.)=1.09434 s t(norm)=0.596366, mflops=8.38412 (err=5.7e-15) 29. Valkenburg: elapsed time t=1.85436 s, 4 iters, t-(init.)=1.83617 s t(norm)=2.00127, mflops=2.49842 (err=5.7e-15) Top mflops for N=16384 = 32.4651 Normalized results and averages for N=16384: fft 0: mflops = 11.3354 (norm. = 0.349157), norm. avg. (of 14) = 0.382899 fft 1: mflops = 11.2396 (norm. = 0.346206), norm. avg. (of 14) = 0.380697 fft 2: mflops = 9.00024 (norm. = 0.277228), norm. avg. 
(of 14) = 0.255863 fft 3: mflops = 14.1454 (norm. = 0.435711), norm. avg. (of 14) = 0.164212 fft 4: mflops = 6.18369 (norm. = 0.190472), norm. avg. (of 14) = 0.103614 fft 5: mflops = 23.6147 (norm. = 0.727388), norm. avg. (of 14) = 0.451126 fft 6: mflops = 32.4283 (norm. = 0.998868), norm. avg. (of 14) = 0.501714 fft 7: mflops = 32.4651 (norm. = 1), norm. avg. (of 14) = 0.544025 fft 8: mflops = 8.39685 (norm. = 0.258642), norm. avg. (of 13) = 0.202702 fft 9: mflops = 7.74351 (norm. = 0.238518), norm. avg. (of 14) = 0.162522 fft 10: mflops = 26.986 (norm. = 0.831233), norm. avg. (of 14) = 0.773025 fft 11: mflops = 23.4549 (norm. = 0.722464), norm. avg. (of 14) = 0.749993 fft 12: mflops = 12.7375 (norm. = 0.392345), norm. avg. (of 14) = 0.670505 fft 13: mflops = 24.8876 (norm. = 0.766595), norm. avg. (of 12) = 0.771265 fft 14: mflops = 10.1653 (norm. = 0.313115), norm. avg. (of 14) = 0.252199 fft 15: mflops = 11.4184 (norm. = 0.351712), norm. avg. (of 14) = 0.266094 fft 16: mflops = 12.4713 (norm. = 0.384145), norm. avg. (of 14) = 0.282797 fft 17: mflops = -1 (norm. = -0.0308023), norm. avg. (of 12) = 0.350357 fft 18: mflops = 13.7054 (norm. = 0.422158), norm. avg. (of 13) = 0.355892 fft 19: mflops = 14.7237 (norm. = 0.453523), norm. avg. (of 13) = 0.434576 fft 20: mflops = 13.2104 (norm. = 0.406912), norm. avg. (of 13) = 0.387618 fft 21: mflops = 3.54996 (norm. = 0.109347), norm. avg. (of 14) = 0.0726622 fft 22: mflops = 8.32447 (norm. = 0.256413), norm. avg. (of 14) = 0.136489 fft 23: mflops = 12.0055 (norm. = 0.369796), norm. avg. (of 14) = 0.27937 fft 24: mflops = 30.4372 (norm. = 0.937537), norm. avg. (of 14) = 0.668224 fft 25: mflops = 7.95929 (norm. = 0.245165), norm. avg. (of 11) = 0.250269 fft 26: mflops = 24.1627 (norm. = 0.744266), norm. avg. (of 13) = 0.273646 fft 27: mflops = 23.01 (norm. = 0.708762), norm. avg. (of 14) = 0.458353 fft 28: mflops = 8.38412 (norm. = 0.25825), norm. avg. (of 14) = 0.122079 fft 29: mflops = 2.49842 (norm. = 0.076957), norm. 
avg. (of 14) = 0.0576852 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.7875 s, 8 iters, t-(init.)=1.73347 s t(norm)=0.440845, mflops=11.3418 (err=5.2e-15) 1. Arndt DIT: elapsed time t=1.80696 s, 8 iters, t-(init.)=1.75302 s t(norm)=0.445816, mflops=11.2154 (err=5.2e-15) 2. Arndt Split-Radix: elapsed time t=1.15023 s, 4 iters, t-(init.)=1.12285 s t(norm)=0.57111, mflops=8.75489 (err=5.2e-15) 3. Arndt 4-step: elapsed time t=1.68294 s, 8 iters, t-(init.)=1.62846 s t(norm)=0.414139, mflops=12.0732 (err=5.2e-15) 4. Beauregard: elapsed time t=1.63292 s, 4 iters, t-(init.)=1.60558 s t(norm)=0.816639, mflops=6.12266 (err=5.2e-15) 5. Bergland: elapsed time t=1.7385 s, 16 iters, t-(init.)=1.631 s t(norm)=0.207393, mflops=24.1088 (err=5.2e-15) 6. CWP (min N) (N=34320): elapsed time t=1.42752 s, 16 iters, t-(init.)=1.29729 s t(norm)=0.164959, mflops=30.3106 7. CWP (best N) (N=34320): elapsed time t=1.427 s, 16 iters, t-(init.)=1.29722 s t(norm)=0.16495, mflops=30.3122 8. Edelblute: elapsed time t=1.22343 s, 4 iters, t-(init.)=1.19609 s t(norm)=0.608363, mflops=8.21877 (err=5.2e-15) 9. FFTPACK (f2c): elapsed time t=1.42041 s, 4 iters, t-(init.)=1.39118 s t(norm)=0.707592, mflops=7.06622 (err=5.2e-15) FFTW_MEASURE plan: (cost = 1.044900e-01) FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_NOTW 32 10. FFTW: elapsed time t=1.6134 s, 16 iters, t-(init.)=1.49804 s t(norm)=0.190486, mflops=26.2486 (err=5.2e-15) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.85035 s, 16 iters, t-(init.)=1.73552 s t(norm)=0.220683, mflops=22.6569 (err=5.2e-15) 12. Frigo-old: elapsed time t=1.77772 s, 8 iters, t-(init.)=1.71751 s t(norm)=0.436785, mflops=11.4473 (err=5.2e-15) 13. Green: elapsed time t=1.66363 s, 16 iters, t-(init.)=1.55533 s t(norm)=0.197771, mflops=25.2818 (err=5.2e-15) 14. 
GSL: elapsed time t=1.02209 s, 4 iters, t-(init.)=0.99524 s t(norm)=0.506205, mflops=9.87742 (err=5.2e-15) 15. GSL DIT: elapsed time t=1.83914 s, 8 iters, t-(init.)=1.78521 s t(norm)=0.454002, mflops=11.0132 (err=5.9e-15) 16. GSL DIF: elapsed time t=1.68403 s, 8 iters, t-(init.)=1.63007 s t(norm)=0.414549, mflops=12.0613 (err=6.0e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.49913 s, 8 iters, t-(init.)=1.44546 s t(norm)=0.3676, mflops=13.6017 (err=5.2e-15) 19. Mayer (simple): elapsed time t=1.40312 s, 8 iters, t-(init.)=1.34941 s t(norm)=0.343173, mflops=14.5699 20. Mayer (lookup): elapsed time t=1.55327 s, 8 iters, t-(init.)=1.49938 s t(norm)=0.381313, mflops=13.1126 (err=5.2e-15) 21. NAPACK (f2c): elapsed time t=1.45397 s, 2 iters, t-(init.)=1.44008 s t(norm)=1.46492, mflops=3.41315 (err=5.6e-13) 22. Nielsen: elapsed time t=1.17697 s, 4 iters, t-(init.)=1.13931 s t(norm)=0.579482, mflops=8.6284 (err=2.3e-13) 23. NR (C): elapsed time t=1.75064 s, 8 iters, t-(init.)=1.69614 s t(norm)=0.431352, mflops=11.5915 (err=5.3e-15) 24. Ooura (C): elapsed time t=1.48701 s, 16 iters, t-(init.)=1.38 s t(norm)=0.175477, mflops=28.4938 (err=5.2e-15) 25. QFT: elapsed time t=1.55487 s, 4 iters, t-(init.)=1.51728 s t(norm)=0.771726, mflops=6.47898 (err=7.5e-15) 26. Ransom: elapsed time t=1.05998 s, 8 iters, t-(init.)=1.00616 s t(norm)=0.255881, mflops=19.5403 (err=6.4e-15) 27. Singleton (f2c): elapsed time t=1.07895 s, 8 iters, t-(init.)=1.02447 s t(norm)=0.260536, mflops=19.1912 (err=7.2e-15) 28. Temperton (f2c): elapsed time t=1.27914 s, 4 iters, t-(init.)=1.25172 s t(norm)=0.636658, mflops=7.85351 (err=5.2e-15) 29. Valkenburg: elapsed time t=1.04895 s, 1 iters, t-(init.)=1.03157 s t(norm)=2.09874, mflops=2.38238 (err=5.2e-15) Top mflops for N=32768 = 30.3122 Normalized results and averages for N=32768: fft 0: mflops = 11.3418 (norm. = 0.374168), norm. avg. (of 15) = 0.382317 fft 1: mflops = 11.2154 (norm. = 0.369997), norm. avg. 
(of 15) = 0.379983 fft 2: mflops = 8.75489 (norm. = 0.288824), norm. avg. (of 15) = 0.25806 fft 3: mflops = 12.0732 (norm. = 0.398297), norm. avg. (of 15) = 0.179817 fft 4: mflops = 6.12266 (norm. = 0.201987), norm. avg. (of 15) = 0.110172 fft 5: mflops = 24.1088 (norm. = 0.795352), norm. avg. (of 15) = 0.474074 fft 6: mflops = 30.3106 (norm. = 0.999948), norm. avg. (of 15) = 0.534929 fft 7: mflops = 30.3122 (norm. = 1), norm. avg. (of 15) = 0.574423 fft 8: mflops = 8.21877 (norm. = 0.271138), norm. avg. (of 14) = 0.207591 fft 9: mflops = 7.06622 (norm. = 0.233115), norm. avg. (of 15) = 0.167228 fft 10: mflops = 26.2486 (norm. = 0.865944), norm. avg. (of 15) = 0.779219 fft 11: mflops = 22.6569 (norm. = 0.747453), norm. avg. (of 15) = 0.749824 fft 12: mflops = 11.4473 (norm. = 0.377647), norm. avg. (of 15) = 0.650981 fft 13: mflops = 25.2818 (norm. = 0.834048), norm. avg. (of 13) = 0.776095 fft 14: mflops = 9.87742 (norm. = 0.325857), norm. avg. (of 15) = 0.257109 fft 15: mflops = 11.0132 (norm. = 0.363325), norm. avg. (of 15) = 0.272576 fft 16: mflops = 12.0613 (norm. = 0.397903), norm. avg. (of 15) = 0.290471 fft 17: mflops = -1 (norm. = -0.0329901), norm. avg. (of 12) = 0.350357 fft 18: mflops = 13.6017 (norm. = 0.448722), norm. avg. (of 14) = 0.362523 fft 19: mflops = 14.5699 (norm. = 0.480662), norm. avg. (of 14) = 0.437868 fft 20: mflops = 13.1126 (norm. = 0.432585), norm. avg. (of 14) = 0.39083 fft 21: mflops = 3.41315 (norm. = 0.1126), norm. avg. (of 15) = 0.0753247 fft 22: mflops = 8.6284 (norm. = 0.284652), norm. avg. (of 15) = 0.146367 fft 23: mflops = 11.5915 (norm. = 0.382403), norm. avg. (of 15) = 0.286239 fft 24: mflops = 28.4938 (norm. = 0.940013), norm. avg. (of 15) = 0.686343 fft 25: mflops = 6.47898 (norm. = 0.213742), norm. avg. (of 12) = 0.247225 fft 26: mflops = 19.5403 (norm. = 0.644637), norm. avg. (of 14) = 0.300145 fft 27: mflops = 19.1912 (norm. = 0.633119), norm. avg. (of 15) = 0.470004 fft 28: mflops = 7.85351 (norm. = 0.259088), norm. 
avg. (of 15) = 0.131213 fft 29: mflops = 2.38238 (norm. = 0.078595), norm. avg. (of 15) = 0.0590792 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.1971 s, 1 iters, t-(init.)=1.16263 s t(norm)=1.10877, mflops=4.50952 (err=1.6e-14) 1. Arndt DIT: elapsed time t=1.25408 s, 1 iters, t-(init.)=1.21909 s t(norm)=1.16262, mflops=4.30063 (err=1.6e-14) 2. Arndt Split-Radix: elapsed time t=1.60543 s, 1 iters, t-(init.)=1.57071 s t(norm)=1.49795, mflops=3.3379 (err=1.6e-14) 3. Arndt 4-step: elapsed time t=1.1882 s, 2 iters, t-(init.)=1.11724 s t(norm)=0.532744, mflops=9.38537 (err=1.6e-14) 4. Beauregard: elapsed time t=1.27151 s, 1 iters, t-(init.)=1.23595 s t(norm)=1.17869, mflops=4.24198 (err=1.6e-14) 5. Bergland: elapsed time t=1.15412 s, 2 iters, t-(init.)=1.08363 s t(norm)=0.516715, mflops=9.67651 (err=1.6e-14) 6. CWP (min N) (N=72072): elapsed time t=1.18007 s, 4 iters, t-(init.)=1.02616 s t(norm)=0.244655, mflops=20.437 7. CWP (best N) (N=72072): elapsed time t=1.17993 s, 4 iters, t-(init.)=1.02528 s t(norm)=0.244446, mflops=20.4544 8. Edelblute: elapsed time t=1.63796 s, 1 iters, t-(init.)=1.60348 s t(norm)=1.5292, mflops=3.26969 (err=1.6e-14) 9. FFTPACK (f2c): elapsed time t=1.77024 s, 2 iters, t-(init.)=1.70415 s t(norm)=0.812601, mflops=6.15308 (err=1.6e-14) FFTW_MEASURE plan: (cost = 2.711440e-01) FFTW_TWIDDLE 2 FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_NOTW 32 10. FFTW: elapsed time t=1.09406 s, 4 iters, t-(init.)=0.953458 s t(norm)=0.227322, mflops=21.9952 (err=1.6e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.28816 s, 4 iters, t-(init.)=1.1512 s t(norm)=0.274468, mflops=18.217 (err=1.6e-14) 12. Frigo-old: elapsed time t=1.11071 s, 2 iters, t-(init.)=1.04492 s t(norm)=0.498255, mflops=10.035 (err=1.6e-14) 13. Green: elapsed time t=1.04541 s, 2 iters, t-(init.)=0.981657 s t(norm)=0.468091, mflops=10.6817 (err=1.6e-14) 14. 
GSL: elapsed time t=1.32672 s, 2 iters, t-(init.)=1.25606 s t(norm)=0.598938, mflops=8.3481 (err=1.6e-14) 15. GSL DIT: elapsed time t=1.21756 s, 1 iters, t-(init.)=1.19298 s t(norm)=1.13772, mflops=4.39477 (err=1.7e-14) 16. GSL DIF: elapsed time t=1.17096 s, 1 iters, t-(init.)=1.13604 s t(norm)=1.08341, mflops=4.61504 (err=1.8e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.84599 s, 4 iters, t-(init.)=1.71487 s t(norm)=0.408858, mflops=12.2292 (err=1.6e-14) 19. Mayer (simple): elapsed time t=1.74431 s, 4 iters, t-(init.)=1.6127 s t(norm)=0.384498, mflops=13.004 20. Mayer (lookup): elapsed time t=1.97444 s, 4 iters, t-(init.)=1.84302 s t(norm)=0.439411, mflops=11.3789 (err=1.6e-14) 21. NAPACK (f2c): elapsed time t=1.55628 s, 1 iters, t-(init.)=1.52691 s t(norm)=1.45618, mflops=3.43365 (err=8.7e-13) 22. Nielsen: elapsed time t=1.08726 s, 1 iters, t-(init.)=1.05187 s t(norm)=1.00315, mflops=4.98432 (err=2.6e-13) 23. NR (C): elapsed time t=1.19739 s, 1 iters, t-(init.)=1.17216 s t(norm)=1.11786, mflops=4.47283 (err=1.6e-14) 24. Ooura (C): elapsed time t=1.76865 s, 4 iters, t-(init.)=1.62683 s t(norm)=0.387866, mflops=12.8911 (err=1.6e-14) 25. QFT: elapsed time t=1.79326 s, 2 iters, t-(init.)=1.72268 s t(norm)=0.821436, mflops=6.0869 (err=1.9e-14) 26. Ransom: elapsed time t=1.86666 s, 4 iters, t-(init.)=1.72496 s t(norm)=0.411263, mflops=12.1577 (err=1.7e-14) 27. Singleton (f2c): elapsed time t=1.31123 s, 2 iters, t-(init.)=1.24022 s t(norm)=0.591382, mflops=8.45478 (err=2.4e-14) 28. Temperton (f2c): elapsed time t=1.0063 s, 1 iters, t-(init.)=0.971174 s t(norm)=0.926184, mflops=5.3985 (err=1.6e-14) 29. Valkenburg: elapsed time t=2.66126 s, 1 iters, t-(init.)=2.6261 s t(norm)=2.50444, mflops=1.99645 (err=1.6e-14) Top mflops for N=65536 = 21.9952 Normalized results and averages for N=65536: fft 0: mflops = 4.50952 (norm. = 0.205023), norm. avg. (of 16) = 0.371236 fft 1: mflops = 4.30063 (norm. = 0.195526), norm. avg. 
(of 16) = 0.368455 fft 2: mflops = 3.3379 (norm. = 0.151756), norm. avg. (of 16) = 0.251416 fft 3: mflops = 9.38537 (norm. = 0.4267), norm. avg. (of 16) = 0.195247 fft 4: mflops = 4.24198 (norm. = 0.192859), norm. avg. (of 16) = 0.11534 fft 5: mflops = 9.67651 (norm. = 0.439937), norm. avg. (of 16) = 0.47194 fft 6: mflops = 20.437 (norm. = 0.929154), norm. avg. (of 16) = 0.559568 fft 7: mflops = 20.4544 (norm. = 0.929947), norm. avg. (of 16) = 0.596643 fft 8: mflops = 3.26969 (norm. = 0.148655), norm. avg. (of 15) = 0.203662 fft 9: mflops = 6.15308 (norm. = 0.279746), norm. avg. (of 16) = 0.17426 fft 10: mflops = 21.9952 (norm. = 1), norm. avg. (of 16) = 0.793018 fft 11: mflops = 18.217 (norm. = 0.828227), norm. avg. (of 16) = 0.754724 fft 12: mflops = 10.035 (norm. = 0.456236), norm. avg. (of 16) = 0.638809 fft 13: mflops = 10.6817 (norm. = 0.485637), norm. avg. (of 14) = 0.755348 fft 14: mflops = 8.3481 (norm. = 0.379542), norm. avg. (of 16) = 0.264761 fft 15: mflops = 4.39477 (norm. = 0.199806), norm. avg. (of 16) = 0.268028 fft 16: mflops = 4.61504 (norm. = 0.20982), norm. avg. (of 16) = 0.28543 fft 17: mflops = -1 (norm. = -0.0454644), norm. avg. (of 12) = 0.350357 fft 18: mflops = 12.2292 (norm. = 0.555993), norm. avg. (of 15) = 0.375421 fft 19: mflops = 13.004 (norm. = 0.591218), norm. avg. (of 15) = 0.448091 fft 20: mflops = 11.3789 (norm. = 0.517334), norm. avg. (of 15) = 0.399263 fft 21: mflops = 3.43365 (norm. = 0.156109), norm. avg. (of 16) = 0.0803737 fft 22: mflops = 4.98432 (norm. = 0.226609), norm. avg. (of 16) = 0.151382 fft 23: mflops = 4.47283 (norm. = 0.203355), norm. avg. (of 16) = 0.281059 fft 24: mflops = 12.8911 (norm. = 0.586085), norm. avg. (of 16) = 0.680077 fft 25: mflops = 6.0869 (norm. = 0.276737), norm. avg. (of 13) = 0.249495 fft 26: mflops = 12.1577 (norm. = 0.552741), norm. avg. (of 15) = 0.316985 fft 27: mflops = 8.45478 (norm. = 0.384392), norm. avg. (of 16) = 0.464654 fft 28: mflops = 5.3985 (norm. = 0.24544), norm. avg. 
(of 16) = 0.138352 fft 29: mflops = 1.99645 (norm. = 0.0907676), norm. avg. (of 16) = 0.0610597 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=2.98634 s, 1 iters, t-(init.)=2.91545 s t(norm)=1.30842, mflops=3.82141 (err=3.9e-14) 1. Arndt DIT: elapsed time t=3.00942 s, 1 iters, t-(init.)=2.93908 s t(norm)=1.31902, mflops=3.79068 (err=3.9e-14) 2. Arndt Split-Radix: elapsed time t=3.79817 s, 1 iters, t-(init.)=3.72756 s t(norm)=1.67288, mflops=2.98885 (err=3.9e-14) 3. Arndt 4-step: elapsed time t=1.68563 s, 1 iters, t-(init.)=1.61533 s t(norm)=0.724941, mflops=6.89712 (err=3.9e-14) 4. Beauregard: elapsed time t=2.692 s, 1 iters, t-(init.)=2.62149 s t(norm)=1.17649, mflops=4.24991 (err=3.8e-14) 5. Bergland: elapsed time t=1.32853 s, 1 iters, t-(init.)=1.25789 s t(norm)=0.564526, mflops=8.85699 (err=3.9e-14) 6. CWP (min N) (N=144144): elapsed time t=1.29673 s, 2 iters, t-(init.)=1.1423 s t(norm)=0.256326, mflops=19.5064 7. CWP (best N) (N=144144): elapsed time t=1.29687 s, 2 iters, t-(init.)=1.14184 s t(norm)=0.256222, mflops=19.5143 8. Edelblute: elapsed time t=3.86581 s, 1 iters, t-(init.)=3.79465 s t(norm)=1.70299, mflops=2.93601 (err=3.9e-14) 9. FFTPACK (f2c): elapsed time t=2.13915 s, 1 iters, t-(init.)=2.06859 s t(norm)=0.92836, mflops=5.38584 (err=3.8e-14) FFTW_MEASURE plan: (cost = 5.934330e-01) FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_NOTW 32 10. FFTW: elapsed time t=1.15981 s, 2 iters, t-(init.)=1.02368 s t(norm)=0.229709, mflops=21.7667 (err=3.8e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.41019 s, 2 iters, t-(init.)=1.27401 s t(norm)=0.285881, mflops=17.4898 (err=3.8e-14) 12. Frigo-old: elapsed time t=1.20377 s, 1 iters, t-(init.)=1.13854 s t(norm)=0.510964, mflops=9.78542 (err=3.8e-14) 13. Green: elapsed time t=1.27485 s, 1 iters, t-(init.)=1.20714 s t(norm)=0.541751, mflops=9.22933 (err=3.8e-14) 14. 
GSL: elapsed time t=1.57751 s, 1 iters, t-(init.)=1.50682 s t(norm)=0.676244, mflops=7.39379 (err=3.8e-14) 15. GSL DIT: elapsed time t=2.8666 s, 1 iters, t-(init.)=2.79608 s t(norm)=1.25485, mflops=3.98454 (err=4.0e-14) 16. GSL DIF: elapsed time t=2.77208 s, 1 iters, t-(init.)=2.70119 s t(norm)=1.21226, mflops=4.12453 (err=4.2e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=2.23755 s, 1 iters, t-(init.)=2.1719 s t(norm)=0.974722, mflops=5.12967 (err=3.9e-14) 19. Mayer (simple): elapsed time t=2.17649 s, 1 iters, t-(init.)=2.1105 s t(norm)=0.947166, mflops=5.27891 20. Mayer (lookup): elapsed time t=2.2832 s, 1 iters, t-(init.)=2.21749 s t(norm)=0.995182, mflops=5.02421 (err=3.9e-14) 21. NAPACK (f2c): elapsed time t=3.44584 s, 1 iters, t-(init.)=3.3753 s t(norm)=1.51479, mflops=3.30078 (err=2.0e-12) 22. Nielsen: elapsed time t=2.63038 s, 1 iters, t-(init.)=2.55977 s t(norm)=1.1488, mflops=4.35238 (err=9.2e-13) 23. NR (C): elapsed time t=2.8399 s, 1 iters, t-(init.)=2.76799 s t(norm)=1.24224, mflops=4.02498 (err=3.9e-14) 24. Ooura (C): elapsed time t=1.92385 s, 2 iters, t-(init.)=1.78164 s t(norm)=0.399788, mflops=12.5066 (err=3.9e-14) 25. QFT: elapsed time t=2.17026 s, 1 iters, t-(init.)=2.09939 s t(norm)=0.94218, mflops=5.30684 (err=4.1e-14) 26. Ransom: elapsed time t=1.18148 s, 1 iters, t-(init.)=1.11074 s t(norm)=0.498487, mflops=10.0304 (err=3.9e-14) 27. Singleton (f2c): elapsed time t=1.61809 s, 1 iters, t-(init.)=1.5472 s t(norm)=0.694364, mflops=7.20083 (err=5.7e-14) 28. Temperton (f2c): elapsed time t=2.37062 s, 1 iters, t-(init.)=2.29989 s t(norm)=1.03217, mflops=4.84419 (err=3.8e-14) 29. Valkenburg: elapsed time t=5.90301 s, 1 iters, t-(init.)=5.83275 s t(norm)=2.61767, mflops=1.9101 (err=3.9e-14) Top mflops for N=131072 = 21.7667 Normalized results and averages for N=131072: fft 0: mflops = 3.82141 (norm. = 0.175562), norm. avg. (of 17) = 0.359726 fft 1: mflops = 3.79068 (norm. = 0.174151), norm. avg. 
(of 17) = 0.357025 fft 2: mflops = 2.98885 (norm. = 0.137313), norm. avg. (of 17) = 0.244704 fft 3: mflops = 6.89712 (norm. = 0.316866), norm. avg. (of 17) = 0.202401 fft 4: mflops = 4.24991 (norm. = 0.195248), norm. avg. (of 17) = 0.120041 fft 5: mflops = 8.85699 (norm. = 0.406906), norm. avg. (of 17) = 0.468115 fft 6: mflops = 19.5064 (norm. = 0.89616), norm. avg. (of 17) = 0.579368 fft 7: mflops = 19.5143 (norm. = 0.896521), norm. avg. (of 17) = 0.614283 fft 8: mflops = 2.93601 (norm. = 0.134885), norm. avg. (of 16) = 0.199363 fft 9: mflops = 5.38584 (norm. = 0.247435), norm. avg. (of 17) = 0.178565 fft 10: mflops = 21.7667 (norm. = 1), norm. avg. (of 17) = 0.805194 fft 11: mflops = 17.4898 (norm. = 0.803512), norm. avg. (of 17) = 0.757594 fft 12: mflops = 9.78542 (norm. = 0.449559), norm. avg. (of 17) = 0.627677 fft 13: mflops = 9.22933 (norm. = 0.424011), norm. avg. (of 15) = 0.733259 fft 14: mflops = 7.39379 (norm. = 0.339683), norm. avg. (of 17) = 0.269168 fft 15: mflops = 3.98454 (norm. = 0.183057), norm. avg. (of 17) = 0.263029 fft 16: mflops = 4.12453 (norm. = 0.189488), norm. avg. (of 17) = 0.279786 fft 17: mflops = -1 (norm. = -0.0459417), norm. avg. (of 12) = 0.350357 fft 18: mflops = 5.12967 (norm. = 0.235666), norm. avg. (of 16) = 0.366686 fft 19: mflops = 5.27891 (norm. = 0.242522), norm. avg. (of 16) = 0.435243 fft 20: mflops = 5.02421 (norm. = 0.230821), norm. avg. (of 16) = 0.388736 fft 21: mflops = 3.30078 (norm. = 0.151644), norm. avg. (of 17) = 0.0845661 fft 22: mflops = 4.35238 (norm. = 0.199956), norm. avg. (of 17) = 0.154239 fft 23: mflops = 4.02498 (norm. = 0.184915), norm. avg. (of 17) = 0.275403 fft 24: mflops = 12.5066 (norm. = 0.574576), norm. avg. (of 17) = 0.673871 fft 25: mflops = 5.30684 (norm. = 0.243806), norm. avg. (of 14) = 0.249089 fft 26: mflops = 10.0304 (norm. = 0.460812), norm. avg. (of 16) = 0.325974 fft 27: mflops = 7.20083 (norm. = 0.330819), norm. avg. (of 17) = 0.456781 fft 28: mflops = 4.84419 (norm. 
= 0.22255), norm. avg. (of 17) = 0.143305 fft 29: mflops = 1.9101 (norm. = 0.0877532), norm. avg. (of 17) = 0.0626299 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=6.64843 s, 1 iters, t-(init.)=6.50687 s t(norm)=1.37898, mflops=3.62586 (err=4.3e-14) 1. Arndt DIT: elapsed time t=6.71341 s, 1 iters, t-(init.)=6.57218 s t(norm)=1.39283, mflops=3.58982 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=8.31146 s, 1 iters, t-(init.)=8.17005 s t(norm)=1.73146, mflops=2.88774 (err=4.3e-14) 3. Arndt 4-step: elapsed time t=2.82498 s, 1 iters, t-(init.)=2.68329 s t(norm)=0.568663, mflops=8.79256 (err=4.3e-14) 4. Beauregard: elapsed time t=5.67556 s, 1 iters, t-(init.)=5.53397 s t(norm)=1.1728, mflops=4.2633 (err=4.3e-14) 5. Bergland: elapsed time t=2.69648 s, 1 iters, t-(init.)=2.55504 s t(norm)=0.541484, mflops=9.23389 (err=4.3e-14) 6. CWP (min N) (N=360360): elapsed time t=2.05021 s, 1 iters, t-(init.)=1.85512 s t(norm)=0.393151, mflops=12.7178 7. CWP (best N) (N=360360): elapsed time t=2.04925 s, 1 iters, t-(init.)=1.85445 s t(norm)=0.393009, mflops=12.7224 8. Edelblute: elapsed time t=8.45318 s, 1 iters, t-(init.)=8.31185 s t(norm)=1.76151, mflops=2.83847 (err=4.3e-14) 9. FFTPACK (f2c): elapsed time t=4.60538 s, 1 iters, t-(init.)=4.46406 s t(norm)=0.946059, mflops=5.28508 (err=4.3e-14) FFTW_MEASURE plan: (cost = 1.348163e+00) FFTW_TWIDDLE 2 FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_NOTW 32 10. FFTW: elapsed time t=1.33848 s, 1 iters, t-(init.)=1.2026 s t(norm)=0.254865, mflops=19.6182 (err=4.3e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.57222 s, 1 iters, t-(init.)=1.43603 s t(norm)=0.304335, mflops=16.4292 (err=4.3e-14) 12. Frigo-old: elapsed time t=2.93169 s, 1 iters, t-(init.)=2.79615 s t(norm)=0.592581, mflops=8.43767 (err=4.3e-14) 13. 
Green: elapsed time t=2.63772 s, 1 iters, t-(init.)=2.50231 s t(norm)=0.530308, mflops=9.42848 (err=4.3e-14) 14. GSL: elapsed time t=3.26531 s, 1 iters, t-(init.)=3.1236 s t(norm)=0.661978, mflops=7.55312 (err=4.3e-14) 15. GSL DIT: elapsed time t=6.13559 s, 1 iters, t-(init.)=5.99373 s t(norm)=1.27024, mflops=3.93627 (err=4.5e-14) 16. GSL DIF: elapsed time t=5.94536 s, 1 iters, t-(init.)=5.80386 s t(norm)=1.23, mflops=4.06505 (err=4.7e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=5.36978 s, 1 iters, t-(init.)=5.22796 s t(norm)=1.10795, mflops=4.51284 (err=4.3e-14) 19. Mayer (simple): elapsed time t=5.2655 s, 1 iters, t-(init.)=5.1239 s t(norm)=1.0859, mflops=4.60449 20. Mayer (lookup): elapsed time t=5.48316 s, 1 iters, t-(init.)=5.34113 s t(norm)=1.13193, mflops=4.41722 (err=4.3e-14) 21. NAPACK (f2c): elapsed time t=7.11812 s, 1 iters, t-(init.)=6.9824 s t(norm)=1.47976, mflops=3.37892 (err=3.7e-12) 22. Nielsen: elapsed time t=5.46558 s, 1 iters, t-(init.)=5.32367 s t(norm)=1.12823, mflops=4.43171 (err=2.1e-12) 23. NR (C): elapsed time t=6.08231 s, 1 iters, t-(init.)=5.93973 s t(norm)=1.25879, mflops=3.97206 (err=4.3e-14) 24. Ooura (C): elapsed time t=2.02208 s, 1 iters, t-(init.)=1.88026 s t(norm)=0.398479, mflops=12.5477 (err=4.3e-14) 25. QFT: elapsed time t=5.90432 s, 1 iters, t-(init.)=5.76274 s t(norm)=1.22128, mflops=4.09405 (err=4.7e-14) 26. Ransom: elapsed time t=1.9176 s, 1 iters, t-(init.)=1.77582 s t(norm)=0.376346, mflops=13.2857 (err=4.3e-14) 27. Singleton (f2c): elapsed time t=3.3491 s, 1 iters, t-(init.)=3.20777 s t(norm)=0.679816, mflops=7.35493 (err=5.9e-14) 28. Temperton (f2c): elapsed time t=4.89222 s, 1 iters, t-(init.)=4.75022 s t(norm)=1.0067, mflops=4.96671 (err=4.3e-14) 29. Valkenburg: elapsed time t=12.8034 s, 1 iters, t-(init.)=12.6617 s t(norm)=2.68337, mflops=1.86333 (err=4.3e-14) Top mflops for N=262144 = 19.6182 Normalized results and averages for N=262144: fft 0: mflops = 3.62586 (norm. 
= 0.184821), norm. avg. (of 18) = 0.350009 fft 1: mflops = 3.58982 (norm. = 0.182984), norm. avg. (of 18) = 0.347356 fft 2: mflops = 2.88774 (norm. = 0.147197), norm. avg. (of 18) = 0.239287 fft 3: mflops = 8.79256 (norm. = 0.448184), norm. avg. (of 18) = 0.216056 fft 4: mflops = 4.2633 (norm. = 0.217313), norm. avg. (of 18) = 0.125445 fft 5: mflops = 9.23389 (norm. = 0.47068), norm. avg. (of 18) = 0.468257 fft 6: mflops = 12.7178 (norm. = 0.648263), norm. avg. (of 18) = 0.583195 fft 7: mflops = 12.7224 (norm. = 0.648498), norm. avg. (of 18) = 0.616184 fft 8: mflops = 2.83847 (norm. = 0.144686), norm. avg. (of 17) = 0.196147 fft 9: mflops = 5.28508 (norm. = 0.269397), norm. avg. (of 18) = 0.183611 fft 10: mflops = 19.6182 (norm. = 1), norm. avg. (of 18) = 0.816016 fft 11: mflops = 16.4292 (norm. = 0.837448), norm. avg. (of 18) = 0.76203 fft 12: mflops = 8.43767 (norm. = 0.430094), norm. avg. (of 18) = 0.6167 fft 13: mflops = 9.42848 (norm. = 0.480599), norm. avg. (of 16) = 0.717467 fft 14: mflops = 7.55312 (norm. = 0.385006), norm. avg. (of 18) = 0.275604 fft 15: mflops = 3.93627 (norm. = 0.200644), norm. avg. (of 18) = 0.259564 fft 16: mflops = 4.06505 (norm. = 0.207208), norm. avg. (of 18) = 0.275754 fft 17: mflops = -1 (norm. = -0.050973), norm. avg. (of 12) = 0.350357 fft 18: mflops = 4.51284 (norm. = 0.230033), norm. avg. (of 17) = 0.358648 fft 19: mflops = 4.60449 (norm. = 0.234705), norm. avg. (of 17) = 0.423447 fft 20: mflops = 4.41722 (norm. = 0.225159), norm. avg. (of 17) = 0.379114 fft 21: mflops = 3.37892 (norm. = 0.172234), norm. avg. (of 18) = 0.0894365 fft 22: mflops = 4.43171 (norm. = 0.225898), norm. avg. (of 18) = 0.15822 fft 23: mflops = 3.97206 (norm. = 0.202468), norm. avg. (of 18) = 0.271351 fft 24: mflops = 12.5477 (norm. = 0.639595), norm. avg. (of 18) = 0.671967 fft 25: mflops = 4.09405 (norm. = 0.208686), norm. avg. (of 15) = 0.246395 fft 26: mflops = 13.2857 (norm. = 0.677211), norm. avg. (of 17) = 0.346635 fft 27: mflops = 7.35493 (norm. 
= 0.374903), norm. avg. (of 18) = 0.452232 fft 28: mflops = 4.96671 (norm. = 0.253168), norm. avg. (of 18) = 0.149409 fft 29: mflops = 1.86333 (norm. = 0.0949795), norm. avg. (of 18) = 0.0644271 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) Maximum array size = 180180 Benchmarking FFTs: 0. CWP (min N) 1. CWP (best N) 2. FFTPACK (f2c) 3. FFTW 4. FFTW_ESTIMATE 5. Frigo-old 6. GSL 7. NAPACK (f2c) 8. Nielsen 9. Singleton (f2c) 10. Temperton (f2c) 11. Valkenburg Computing normalized averages (12 transforms). Benchmarking for array size = 6: 0. CWP (min N): elapsed time t=1.97575 s, 262144 iters, t-(init.)=1.91804 s t(norm)=0.47175, mflops=10.5988 1. CWP (best N) (N=15): elapsed time t=1.51286 s, 131072 iters, t-(init.)=1.4526 s t(norm)=0.714547, mflops=6.99744 2. FFTPACK (f2c): elapsed time t=1.18049 s, 131072 iters, t-(init.)=1.15155 s t(norm)=0.566459, mflops=8.82676 (err=1.7e-16) FFTW_MEASURE plan: (cost = 1.661194e-06) FFTW_NOTW 6 3. FFTW: elapsed time t=1.7786 s, 1048576 iters, t-(init.)=1.54774 s t(norm)=0.0951682, mflops=52.5386 (err=1.3e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 4. FFTW_ESTIMATE: elapsed time t=1.76901 s, 1048576 iters, t-(init.)=1.53744 s t(norm)=0.0945348, mflops=52.8906 (err=1.3e-16) 5. Frigo-old: elapsed time t=1.92372 s, 262144 iters, t-(init.)=1.86572 s t(norm)=0.458881, mflops=10.8961 (err=3.2e-16) 6. GSL: elapsed time t=1.37543 s, 262144 iters, t-(init.)=1.31753 s t(norm)=0.324053, mflops=15.4296 (err=1.3e-16) 7. 
NAPACK (f2c): elapsed time t=1.30696 s, 65536 iters, t-(init.)=1.29259 s t(norm)=1.27167, mflops=3.93182 (err=2.3e-16) 8. Nielsen: elapsed time t=1.69056 s, 65536 iters, t-(init.)=1.67545 s t(norm)=1.64834, mflops=3.03336 (err=2.7e-16) 9. Singleton (f2c): elapsed time t=1.29938 s, 131072 iters, t-(init.)=1.27043 s t(norm)=0.624938, mflops=8.00079 (err=1.3e-16) 10. Temperton (f2c): elapsed time t=1.28334 s, 65536 iters, t-(init.)=1.26882 s t(norm)=1.24829, mflops=4.00547 (err=1.2e-16) 11. Valkenburg: elapsed time t=1.30139 s, 65536 iters, t-(init.)=1.28689 s t(norm)=1.26606, mflops=3.94925 (err=2.1e-16) Top mflops for N=6 = 52.8906 Normalized results and averages for N=6: fft 0: mflops = 10.5988 (norm. = 0.200392), norm. avg. (of 1) = 0.200392 fft 1: mflops = 6.99744 (norm. = 0.1323), norm. avg. (of 1) = 0.1323 fft 2: mflops = 8.82676 (norm. = 0.166887), norm. avg. (of 1) = 0.166887 fft 3: mflops = 52.5386 (norm. = 0.993345), norm. avg. (of 1) = 0.993345 fft 4: mflops = 52.8906 (norm. = 1), norm. avg. (of 1) = 1 fft 5: mflops = 10.8961 (norm. = 0.206012), norm. avg. (of 1) = 0.206012 fft 6: mflops = 15.4296 (norm. = 0.291726), norm. avg. (of 1) = 0.291726 fft 7: mflops = 3.93182 (norm. = 0.0743389), norm. avg. (of 1) = 0.0743389 fft 8: mflops = 3.03336 (norm. = 0.0573516), norm. avg. (of 1) = 0.0573516 fft 9: mflops = 8.00079 (norm. = 0.151271), norm. avg. (of 1) = 0.151271 fft 10: mflops = 4.00547 (norm. = 0.0757313), norm. avg. (of 1) = 0.0757313 fft 11: mflops = 3.94925 (norm. = 0.0746684), norm. avg. (of 1) = 0.0746684 Benchmarking for array size = 9: 0. CWP (min N): elapsed time t=1.03176 s, 131072 iters, t-(init.)=0.993817 s t(norm)=0.265769, mflops=18.8133 1. CWP (best N) (N=15): elapsed time t=1.51083 s, 131072 iters, t-(init.)=1.45037 s t(norm)=0.387863, mflops=12.8912 2. FFTPACK (f2c): elapsed time t=1.7557 s, 131072 iters, t-(init.)=1.71794 s t(norm)=0.459416, mflops=10.8834 (err=2.8e-16) FFTW_MEASURE plan: (cost = 2.733246e-06) FFTW_NOTW 9 3. 
FFTW: elapsed time t=1.45035 s, 524288 iters, t-(init.)=1.29895 s t(norm)=0.0868425, mflops=57.5755 (err=1.4e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.44591 s, 524288 iters, t-(init.)=1.29474 s t(norm)=0.0865611, mflops=57.7627 (err=1.4e-16) 5. Frigo-old: elapsed time t=1.0075 s, 65536 iters, t-(init.)=0.988516 s t(norm)=0.528704, mflops=9.45709 (err=3.1e-16) 6. GSL: elapsed time t=1.34337 s, 131072 iters, t-(init.)=1.30539 s t(norm)=0.349091, mflops=14.3229 (err=1.4e-16) 7. NAPACK (f2c): elapsed time t=1.91222 s, 65536 iters, t-(init.)=1.89334 s t(norm)=1.01265, mflops=4.93756 (err=5.8e-16) 8. Nielsen: elapsed time t=1.0126 s, 32768 iters, t-(init.)=1.00282 s t(norm)=1.07271, mflops=4.66108 (err=4.5e-16) 9. Singleton (f2c): elapsed time t=1.3058 s, 131072 iters, t-(init.)=1.26812 s t(norm)=0.339125, mflops=14.7438 (err=1.7e-16) 10. Temperton (f2c): elapsed time t=1.64252 s, 65536 iters, t-(init.)=1.62353 s t(norm)=0.868339, mflops=5.75812 (err=1.7e-16) 11. Valkenburg: elapsed time t=1.16827 s, 32768 iters, t-(init.)=1.15868 s t(norm)=1.23943, mflops=4.0341 (err=2.6e-16) Top mflops for N=9 = 57.7627 Normalized results and averages for N=9: fft 0: mflops = 18.8133 (norm. = 0.3257), norm. avg. (of 2) = 0.263046 fft 1: mflops = 12.8912 (norm. = 0.223175), norm. avg. (of 2) = 0.177737 fft 2: mflops = 10.8834 (norm. = 0.188416), norm. avg. (of 2) = 0.177652 fft 3: mflops = 57.5755 (norm. = 0.99676), norm. avg. (of 2) = 0.995052 fft 4: mflops = 57.7627 (norm. = 1), norm. avg. (of 2) = 1 fft 5: mflops = 9.45709 (norm. = 0.163723), norm. avg. (of 2) = 0.184867 fft 6: mflops = 14.3229 (norm. = 0.247962), norm. avg. (of 2) = 0.269844 fft 7: mflops = 4.93756 (norm. = 0.0854802), norm. avg. (of 2) = 0.0799095 fft 8: mflops = 4.66108 (norm. = 0.0806937), norm. avg. (of 2) = 0.0690226 fft 9: mflops = 14.7438 (norm. = 0.255248), norm. avg. (of 2) = 0.20326 fft 10: mflops = 5.75812 (norm. = 0.0996859), norm. avg. 
(of 2) = 0.0877086 fft 11: mflops = 4.0341 (norm. = 0.0698392), norm. avg. (of 2) = 0.0722538 Benchmarking for array size = 12: 0. CWP (min N): elapsed time t=1.25719 s, 131072 iters, t-(init.)=1.20934 s t(norm)=0.214474, mflops=23.3129 1. CWP (best N) (N=15): elapsed time t=1.51245 s, 131072 iters, t-(init.)=1.45215 s t(norm)=0.257534, mflops=19.4149 2. FFTPACK (f2c): elapsed time t=1.04232 s, 65536 iters, t-(init.)=1.01826 s t(norm)=0.361172, mflops=13.8438 (err=1.9e-16) FFTW_MEASURE plan: (cost = 2.922119e-06) FFTW_NOTW 12 3. FFTW: elapsed time t=1.54416 s, 524288 iters, t-(init.)=1.35335 s t(norm)=0.060003, mflops=83.3291 (err=1.3e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.5397 s, 524288 iters, t-(init.)=1.34835 s t(norm)=0.0597814, mflops=83.638 (err=1.3e-16) 5. Frigo-old: elapsed time t=1.78517 s, 131072 iters, t-(init.)=1.73727 s t(norm)=0.308101, mflops=16.2285 (err=2.3e-16) 6. GSL: elapsed time t=1.30146 s, 131072 iters, t-(init.)=1.25375 s t(norm)=0.222349, mflops=22.4871 (err=1.5e-16) 7. NAPACK (f2c): elapsed time t=1.37695 s, 32768 iters, t-(init.)=1.36503 s t(norm)=0.96834, mflops=5.16348 (err=4.2e-16) 8. Nielsen: elapsed time t=1.17944 s, 32768 iters, t-(init.)=1.16713 s t(norm)=0.827946, mflops=6.03904 (err=4.8e-16) 9. Singleton (f2c): elapsed time t=1.85587 s, 131072 iters, t-(init.)=1.80802 s t(norm)=0.320647, mflops=15.5935 (err=1.9e-16) 10. Temperton (f2c): elapsed time t=1.88027 s, 65536 iters, t-(init.)=1.85624 s t(norm)=0.658397, mflops=7.5942 (err=1.2e-16) 11. Valkenburg: elapsed time t=1.74573 s, 32768 iters, t-(init.)=1.73364 s t(norm)=1.22982, mflops=4.06563 (err=1.9e-16) Top mflops for N=12 = 83.638 Normalized results and averages for N=12: fft 0: mflops = 23.3129 (norm. = 0.278736), norm. avg. (of 3) = 0.268276 fft 1: mflops = 19.4149 (norm. = 0.23213), norm. avg. (of 3) = 0.195868 fft 2: mflops = 13.8438 (norm. = 0.165521), norm. avg. (of 3) = 0.173608 fft 3: mflops = 83.3291 (norm. 
= 0.996307), norm. avg. (of 3) = 0.995471 fft 4: mflops = 83.638 (norm. = 1), norm. avg. (of 3) = 1 fft 5: mflops = 16.2285 (norm. = 0.194032), norm. avg. (of 3) = 0.187922 fft 6: mflops = 22.4871 (norm. = 0.268863), norm. avg. (of 3) = 0.269517 fft 7: mflops = 5.16348 (norm. = 0.061736), norm. avg. (of 3) = 0.0738517 fft 8: mflops = 6.03904 (norm. = 0.0722045), norm. avg. (of 3) = 0.0700833 fft 9: mflops = 15.5935 (norm. = 0.18644), norm. avg. (of 3) = 0.197653 fft 10: mflops = 7.5942 (norm. = 0.0907984), norm. avg. (of 3) = 0.0887385 fft 11: mflops = 4.06563 (norm. = 0.0486098), norm. avg. (of 3) = 0.0643725 Benchmarking for array size = 15: 0. CWP (min N): elapsed time t=1.51901 s, 131072 iters, t-(init.)=1.46012 s t(norm)=0.190089, mflops=26.3034 1. CWP (best N): elapsed time t=1.51305 s, 131072 iters, t-(init.)=1.45293 s t(norm)=0.189153, mflops=26.4337 2. FFTPACK (f2c): elapsed time t=1.37159 s, 65536 iters, t-(init.)=1.3421 s t(norm)=0.349448, mflops=14.3083 (err=3.6e-16) FFTW_MEASURE plan: (cost = 4.754822e-06) FFTW_NOTW 15 3. FFTW: elapsed time t=1.25754 s, 262144 iters, t-(init.)=1.13981 s t(norm)=0.0741945, mflops=67.3905 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.25549 s, 262144 iters, t-(init.)=1.13782 s t(norm)=0.0740644, mflops=67.5089 (err=1.7e-16) 5. Frigo-old: elapsed time t=1.72572 s, 65536 iters, t-(init.)=1.69627 s t(norm)=0.441664, mflops=11.3208 (err=2.7e-16) 6. GSL: elapsed time t=1.24692 s, 65536 iters, t-(init.)=1.21744 s t(norm)=0.316989, mflops=15.7734 (err=1.9e-16) 7. NAPACK (f2c): elapsed time t=1.345 s, 16384 iters, t-(init.)=1.33739 s t(norm)=1.39289, mflops=3.58966 (err=9.4e-16) 8. Nielsen: elapsed time t=1.38673 s, 32768 iters, t-(init.)=1.37167 s t(norm)=0.714294, mflops=6.99992 (err=4.5e-15) 9. Singleton (f2c): elapsed time t=1.11235 s, 65536 iters, t-(init.)=1.08253 s t(norm)=0.281864, mflops=17.7391 (err=2.0e-16) 10. 
Temperton (f2c): elapsed time t=1.21942 s, 32768 iters, t-(init.)=1.20464 s t(norm)=0.627313, mflops=7.97051 (err=2.5e-16) 11. Valkenburg: elapsed time t=1.32624 s, 16384 iters, t-(init.)=1.31883 s t(norm)=1.37356, mflops=3.64018 (err=2.5e-16) Top mflops for N=15 = 67.5089 Normalized results and averages for N=15: fft 0: mflops = 26.3034 (norm. = 0.389629), norm. avg. (of 4) = 0.298614 fft 1: mflops = 26.4337 (norm. = 0.391558), norm. avg. (of 4) = 0.244791 fft 2: mflops = 14.3083 (norm. = 0.211946), norm. avg. (of 4) = 0.183193 fft 3: mflops = 67.3905 (norm. = 0.998246), norm. avg. (of 4) = 0.996164 fft 4: mflops = 67.5089 (norm. = 1), norm. avg. (of 4) = 1 fft 5: mflops = 11.3208 (norm. = 0.167694), norm. avg. (of 4) = 0.182865 fft 6: mflops = 15.7734 (norm. = 0.23365), norm. avg. (of 4) = 0.26055 fft 7: mflops = 3.58966 (norm. = 0.0531731), norm. avg. (of 4) = 0.068682 fft 8: mflops = 6.99992 (norm. = 0.103689), norm. avg. (of 4) = 0.0784847 fft 9: mflops = 17.7391 (norm. = 0.262767), norm. avg. (of 4) = 0.213931 fft 10: mflops = 7.97051 (norm. = 0.118066), norm. avg. (of 4) = 0.0960704 fft 11: mflops = 3.64018 (norm. = 0.0539215), norm. avg. (of 4) = 0.0617597 Benchmarking for array size = 18: 0. CWP (min N): elapsed time t=1.78081 s, 131072 iters, t-(init.)=1.71302 s t(norm)=0.174121, mflops=28.7157 1. CWP (best N) (N=28): elapsed time t=1.97802 s, 131072 iters, t-(init.)=1.87572 s t(norm)=0.190659, mflops=26.2248 2. FFTPACK (f2c): elapsed time t=1.19019 s, 32768 iters, t-(init.)=1.17322 s t(norm)=0.477011, mflops=10.4819 (err=2.6e-16) FFTW_MEASURE plan: (cost = 7.253784e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6 3. FFTW: elapsed time t=1.90015 s, 262144 iters, t-(init.)=1.76478 s t(norm)=0.0896915, mflops=55.7467 (err=1.9e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.93386 s, 262144 iters, t-(init.)=1.79849 s t(norm)=0.0914043, mflops=54.7021 (err=2.3e-16) 5. 
Frigo-old: elapsed time t=1.14948 s, 32768 iters, t-(init.)=1.13251 s t(norm)=0.460459, mflops=10.8587 (err=3.8e-16) 6. GSL: elapsed time t=1.02429 s, 65536 iters, t-(init.)=0.990429 s t(norm)=0.201346, mflops=24.8329 (err=2.4e-16) 7. NAPACK (f2c): elapsed time t=1.04718 s, 16384 iters, t-(init.)=1.03854 s t(norm)=0.844504, mflops=5.92064 (err=6.0e-16) 8. Nielsen: elapsed time t=1.15803 s, 16384 iters, t-(init.)=1.14953 s t(norm)=0.934758, mflops=5.34898 (err=7.7e-16) 9. Singleton (f2c): elapsed time t=1.15169 s, 65536 iters, t-(init.)=1.11783 s t(norm)=0.227246, mflops=22.0026 (err=1.7e-16) 10. Temperton (f2c): elapsed time t=1.84461 s, 32768 iters, t-(init.)=1.82749 s t(norm)=0.743025, mflops=6.72925 (err=2.8e-16) 11. Valkenburg: elapsed time t=1.4929 s, 16384 iters, t-(init.)=1.4844 s t(norm)=1.20706, mflops=4.14228 (err=2.8e-16) Top mflops for N=18 = 55.7467 Normalized results and averages for N=18: fft 0: mflops = 28.7157 (norm. = 0.51511), norm. avg. (of 5) = 0.341913 fft 1: mflops = 26.2248 (norm. = 0.470428), norm. avg. (of 5) = 0.289918 fft 2: mflops = 10.4819 (norm. = 0.188028), norm. avg. (of 5) = 0.18416 fft 3: mflops = 55.7467 (norm. = 1), norm. avg. (of 5) = 0.996932 fft 4: mflops = 54.7021 (norm. = 0.981261), norm. avg. (of 5) = 0.996252 fft 5: mflops = 10.8587 (norm. = 0.194787), norm. avg. (of 5) = 0.18525 fft 6: mflops = 24.8329 (norm. = 0.44546), norm. avg. (of 5) = 0.297532 fft 7: mflops = 5.92064 (norm. = 0.106206), norm. avg. (of 5) = 0.0761868 fft 8: mflops = 5.34898 (norm. = 0.0959515), norm. avg. (of 5) = 0.081978 fft 9: mflops = 22.0026 (norm. = 0.394689), norm. avg. (of 5) = 0.250083 fft 10: mflops = 6.72925 (norm. = 0.120711), norm. avg. (of 5) = 0.100999 fft 11: mflops = 4.14228 (norm. = 0.0743054), norm. avg. (of 5) = 0.0642689 Benchmarking for array size = 24: 0. CWP (min N): elapsed time t=1.93019 s, 131072 iters, t-(init.)=1.84226 s t(norm)=0.12773, mflops=39.145 1. 
CWP (best N) (N=28): elapsed time t=1.97845 s, 131072 iters, t-(init.)=1.87614 s t(norm)=0.130079, mflops=38.4381 2. FFTPACK (f2c): elapsed time t=1.47803 s, 32768 iters, t-(init.)=1.45609 s t(norm)=0.403825, mflops=12.3816 (err=2.4e-16) FFTW_MEASURE plan: (cost = 8.134277e-06) FFTW_TWIDDLE 2 FFTW_NOTW 12 3. FFTW: elapsed time t=1.07193 s, 131072 iters, t-(init.)=0.984019 s t(norm)=0.0682255, mflops=73.2864 (err=2.0e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.07035 s, 131072 iters, t-(init.)=0.982597 s t(norm)=0.0681269, mflops=73.3925 (err=2.0e-16) 5. Frigo-old: elapsed time t=1.75623 s, 65536 iters, t-(init.)=1.71244 s t(norm)=0.237459, mflops=21.0562 (err=2.7e-16) 6. GSL: elapsed time t=1.1481 s, 65536 iters, t-(init.)=1.10417 s t(norm)=0.153113, mflops=32.6557 (err=2.2e-16) 7. NAPACK (f2c): elapsed time t=1.36594 s, 16384 iters, t-(init.)=1.35484 s t(norm)=0.751488, mflops=6.65346 (err=8.2e-16) 8. Nielsen: elapsed time t=1.00086 s, 16384 iters, t-(init.)=0.98962 s t(norm)=0.54891, mflops=9.10895 (err=1.4e-15) 9. Singleton (f2c): elapsed time t=1.79728 s, 65536 iters, t-(init.)=1.75331 s t(norm)=0.243126, mflops=20.5655 (err=2.2e-16) 10. Temperton (f2c): elapsed time t=1.05311 s, 16384 iters, t-(init.)=1.04201 s t(norm)=0.577967, mflops=8.65101 (err=2.7e-16) 11. Valkenburg: elapsed time t=1.08456 s, 8192 iters, t-(init.)=1.07898 s t(norm)=1.19695, mflops=4.17729 (err=2.9e-16) Top mflops for N=24 = 73.3925 Normalized results and averages for N=24: fft 0: mflops = 39.145 (norm. = 0.533365), norm. avg. (of 6) = 0.373822 fft 1: mflops = 38.4381 (norm. = 0.523734), norm. avg. (of 6) = 0.328888 fft 2: mflops = 12.3816 (norm. = 0.168704), norm. avg. (of 6) = 0.181584 fft 3: mflops = 73.2864 (norm. = 0.998555), norm. avg. (of 6) = 0.997202 fft 4: mflops = 73.3925 (norm. = 1), norm. avg. (of 6) = 0.996877 fft 5: mflops = 21.0562 (norm. = 0.286899), norm. avg. 
(of 6) = 0.202191 fft 6: mflops = 32.6557 (norm. = 0.444946), norm. avg. (of 6) = 0.322101 fft 7: mflops = 6.65346 (norm. = 0.0906559), norm. avg. (of 6) = 0.0785984 fft 8: mflops = 9.10895 (norm. = 0.124113), norm. avg. (of 6) = 0.0890005 fft 9: mflops = 20.5655 (norm. = 0.280213), norm. avg. (of 6) = 0.255105 fft 10: mflops = 8.65101 (norm. = 0.117873), norm. avg. (of 6) = 0.103811 fft 11: mflops = 4.17729 (norm. = 0.0569172), norm. avg. (of 6) = 0.0630436 Benchmarking for array size = 36: 0. CWP (min N): elapsed time t=1.36585 s, 65536 iters, t-(init.)=1.30206 s t(norm)=0.106749, mflops=46.8389 1. CWP (best N): elapsed time t=1.3651 s, 65536 iters, t-(init.)=1.30074 s t(norm)=0.106641, mflops=46.8864 2. FFTPACK (f2c): elapsed time t=1.16243 s, 16384 iters, t-(init.)=1.14641 s t(norm)=0.375953, mflops=13.2995 (err=3.7e-16) FFTW_MEASURE plan: (cost = 1.283447e-05) FFTW_TWIDDLE 3 FFTW_NOTW 12 3. FFTW: elapsed time t=1.68229 s, 131072 iters, t-(init.)=1.55428 s t(norm)=0.0637136, mflops=78.4762 (err=3.5e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.68136 s, 131072 iters, t-(init.)=1.55327 s t(norm)=0.0636721, mflops=78.5273 (err=3.5e-16) 5. Frigo-old: elapsed time t=1.13826 s, 16384 iters, t-(init.)=1.12224 s t(norm)=0.368027, mflops=13.586 (err=4.8e-16) 6. GSL: elapsed time t=1.70459 s, 65536 iters, t-(init.)=1.64066 s t(norm)=0.13451, mflops=37.1721 (err=2.8e-16) 7. NAPACK (f2c): elapsed time t=1.06775 s, 8192 iters, t-(init.)=1.05953 s t(norm)=0.69492, mflops=7.19508 (err=1.0e-15) 8. Nielsen: elapsed time t=1.90456 s, 16384 iters, t-(init.)=1.8884 s t(norm)=0.619282, mflops=8.07387 (err=9.7e-16) 9. Singleton (f2c): elapsed time t=1.86975 s, 65536 iters, t-(init.)=1.80582 s t(norm)=0.14805, mflops=33.7723 (err=2.7e-16) 10. Temperton (f2c): elapsed time t=1.56055 s, 16384 iters, t-(init.)=1.54456 s t(norm)=0.506521, mflops=9.87125 (err=3.9e-16) 11. 
Valkenburg: elapsed time t=1.8102 s, 8192 iters, t-(init.)=1.80231 s t(norm)=1.18209, mflops=4.22978 (err=4.0e-16) Top mflops for N=36 = 78.5273 Normalized results and averages for N=36: fft 0: mflops = 46.8389 (norm. = 0.596467), norm. avg. (of 7) = 0.405628 fft 1: mflops = 46.8864 (norm. = 0.597071), norm. avg. (of 7) = 0.3672 fft 2: mflops = 13.2995 (norm. = 0.169362), norm. avg. (of 7) = 0.179838 fft 3: mflops = 78.4762 (norm. = 0.999349), norm. avg. (of 7) = 0.997509 fft 4: mflops = 78.5273 (norm. = 1), norm. avg. (of 7) = 0.997323 fft 5: mflops = 13.586 (norm. = 0.17301), norm. avg. (of 7) = 0.198022 fft 6: mflops = 37.1721 (norm. = 0.473365), norm. avg. (of 7) = 0.34371 fft 7: mflops = 7.19508 (norm. = 0.0916252), norm. avg. (of 7) = 0.0804593 fft 8: mflops = 8.07387 (norm. = 0.102816), norm. avg. (of 7) = 0.0909742 fft 9: mflops = 33.7723 (norm. = 0.430072), norm. avg. (of 7) = 0.2801 fft 10: mflops = 9.87125 (norm. = 0.125705), norm. avg. (of 7) = 0.106939 fft 11: mflops = 4.22978 (norm. = 0.0538638), norm. avg. (of 7) = 0.0617322 Benchmarking for array size = 80: 0. CWP (min N): elapsed time t=1.46233 s, 32768 iters, t-(init.)=1.39373 s t(norm)=0.0840988, mflops=59.4539 1. CWP (best N) (N=84): elapsed time t=1.353 s, 32768 iters, t-(init.)=1.28075 s t(norm)=0.0772814, mflops=64.6986 2. FFTPACK (f2c): elapsed time t=1.23575 s, 8192 iters, t-(init.)=1.21846 s t(norm)=0.294091, mflops=17.0016 (err=7.7e-16) FFTW_MEASURE plan: (cost = 2.995947e-05) FFTW_TWIDDLE 5 FFTW_NOTW 16 3. FFTW: elapsed time t=1.97052 s, 65536 iters, t-(init.)=1.83328 s t(norm)=0.0553106, mflops=90.3986 (err=7.3e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 4. FFTW_ESTIMATE: elapsed time t=1.97106 s, 65536 iters, t-(init.)=1.83355 s t(norm)=0.055319, mflops=90.3849 (err=7.3e-16) 5. Frigo-old: elapsed time t=1.60171 s, 16384 iters, t-(init.)=1.56742 s t(norm)=0.189159, mflops=26.4328 (err=7.1e-16) 6. 
GSL: elapsed time t=1.70029 s, 16384 iters, t-(init.)=1.66541 s t(norm)=0.200984, mflops=24.8776 (err=6.9e-16) 7. NAPACK (f2c): elapsed time t=1.04052 s, 2048 iters, t-(init.)=1.03611 s t(norm)=1.00031, mflops=4.99845 (err=1.1e-15) 8. Nielsen: elapsed time t=1.39783 s, 8192 iters, t-(init.)=1.38047 s t(norm)=0.333194, mflops=15.0063 (err=5.4e-15) 9. Singleton (f2c): elapsed time t=1.68344 s, 32768 iters, t-(init.)=1.61482 s t(norm)=0.0974393, mflops=51.314 (err=1.3e-15) 10. Temperton (f2c): elapsed time t=1.83786 s, 8192 iters, t-(init.)=1.82067 s t(norm)=0.439442, mflops=11.3781 (err=7.0e-16) 11. Valkenburg: elapsed time t=1.29875 s, 2048 iters, t-(init.)=1.29449 s t(norm)=1.24977, mflops=4.00073 (err=8.4e-16) Top mflops for N=80 = 90.3986 Normalized results and averages for N=80: fft 0: mflops = 59.4539 (norm. = 0.657686), norm. avg. (of 8) = 0.437136 fft 1: mflops = 64.6986 (norm. = 0.715704), norm. avg. (of 8) = 0.410763 fft 2: mflops = 17.0016 (norm. = 0.188073), norm. avg. (of 8) = 0.180867 fft 3: mflops = 90.3986 (norm. = 1), norm. avg. (of 8) = 0.99782 fft 4: mflops = 90.3849 (norm. = 0.999848), norm. avg. (of 8) = 0.997639 fft 5: mflops = 26.4328 (norm. = 0.292403), norm. avg. (of 8) = 0.20982 fft 6: mflops = 24.8776 (norm. = 0.275199), norm. avg. (of 8) = 0.335146 fft 7: mflops = 4.99845 (norm. = 0.0552934), norm. avg. (of 8) = 0.0773136 fft 8: mflops = 15.0063 (norm. = 0.166001), norm. avg. (of 8) = 0.100353 fft 9: mflops = 51.314 (norm. = 0.567642), norm. avg. (of 8) = 0.316043 fft 10: mflops = 11.3781 (norm. = 0.125865), norm. avg. (of 8) = 0.109305 fft 11: mflops = 4.00073 (norm. = 0.0442566), norm. avg. (of 8) = 0.0595477 Benchmarking for array size = 108: 0. CWP (min N) (N=110): elapsed time t=1.22896 s, 16384 iters, t-(init.)=1.18193 s t(norm)=0.0988851, mflops=50.5637 1. CWP (best N) (N=112): elapsed time t=1.93139 s, 32768 iters, t-(init.)=1.83583 s t(norm)=0.0767963, mflops=65.1073 2. 
FFTPACK (f2c): elapsed time t=1.91187 s, 8192 iters, t-(init.)=1.88882 s t(norm)=0.316052, mflops=15.8202 (err=4.7e-16) FFTW_MEASURE plan: (cost = 4.487158e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 3. FFTW: elapsed time t=1.46995 s, 32768 iters, t-(init.)=1.37757 s t(norm)=0.0576265, mflops=86.7656 (err=3.7e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.46904 s, 32768 iters, t-(init.)=1.37711 s t(norm)=0.0576072, mflops=86.7947 (err=3.7e-16) 5. Frigo-old: elapsed time t=1.2074 s, 4096 iters, t-(init.)=1.19571 s t(norm)=0.400151, mflops=12.4953 (err=5.5e-16) 6. GSL: elapsed time t=1.76159 s, 16384 iters, t-(init.)=1.71552 s t(norm)=0.143527, mflops=34.8366 (err=4.7e-16) 7. NAPACK (f2c): elapsed time t=1.91699 s, 4096 iters, t-(init.)=1.90536 s t(norm)=0.637641, mflops=7.84141 (err=2.7e-15) 8. Nielsen: elapsed time t=1.63338 s, 4096 iters, t-(init.)=1.62184 s t(norm)=0.542759, mflops=9.21219 (err=1.1e-15) 9. Singleton (f2c): elapsed time t=1.63071 s, 16384 iters, t-(init.)=1.58444 s t(norm)=0.132561, mflops=37.7185 (err=5.1e-16) 10. Temperton (f2c): elapsed time t=1.42648 s, 4096 iters, t-(init.)=1.41482 s t(norm)=0.473478, mflops=10.5602 (err=3.8e-16) 11. Valkenburg: elapsed time t=1.75438 s, 2048 iters, t-(init.)=1.74869 s t(norm)=1.17042, mflops=4.27198 (err=5.2e-16) Top mflops for N=108 = 86.7947 Normalized results and averages for N=108: fft 0: mflops = 50.5637 (norm. = 0.582567), norm. avg. (of 9) = 0.453295 fft 1: mflops = 65.1073 (norm. = 0.75013), norm. avg. (of 9) = 0.44847 fft 2: mflops = 15.8202 (norm. = 0.182271), norm. avg. (of 9) = 0.181023 fft 3: mflops = 86.7656 (norm. = 0.999665), norm. avg. (of 9) = 0.998025 fft 4: mflops = 86.7947 (norm. = 1), norm. avg. (of 9) = 0.997901 fft 5: mflops = 12.4953 (norm. = 0.143964), norm. avg. (of 9) = 0.202503 fft 6: mflops = 34.8366 (norm. = 0.401368), norm. avg. (of 9) = 0.342504 fft 7: mflops = 7.84141 (norm. = 0.0903443), norm. avg. 
(of 9) = 0.0787615 fft 8: mflops = 9.21219 (norm. = 0.106138), norm. avg. (of 9) = 0.100995 fft 9: mflops = 37.7185 (norm. = 0.434572), norm. avg. (of 9) = 0.329212 fft 10: mflops = 10.5602 (norm. = 0.121668), norm. avg. (of 9) = 0.110678 fft 11: mflops = 4.27198 (norm. = 0.0492194), norm. avg. (of 9) = 0.0584002 Benchmarking for array size = 210: 0. CWP (min N): elapsed time t=1.99484 s, 16384 iters, t-(init.)=1.90636 s t(norm)=0.0718244, mflops=69.6142 1. CWP (best N): elapsed time t=1.99461 s, 16384 iters, t-(init.)=1.90599 s t(norm)=0.0718106, mflops=69.6276 2. FFTPACK (f2c): elapsed time t=1.42248 s, 2048 iters, t-(init.)=1.41141 s t(norm)=0.425412, mflops=11.7533 (err=5.7e-16) FFTW_MEASURE plan: (cost = 1.161973e-04) FFTW_TWIDDLE 2 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.909 s, 16384 iters, t-(init.)=1.82034 s t(norm)=0.0685837, mflops=72.9036 (err=4.5e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.9922 s, 16384 iters, t-(init.)=1.90349 s t(norm)=0.0717165, mflops=69.719 (err=4.6e-16) 5. Frigo-old: elapsed time t=1.28772 s, 2048 iters, t-(init.)=1.27651 s t(norm)=0.384752, mflops=12.9954 (err=5.8e-16) 6. GSL: elapsed time t=1.356 s, 4096 iters, t-(init.)=1.33379 s t(norm)=0.201008, mflops=24.8746 (err=5.3e-16) 7. NAPACK (f2c): elapsed time t=1.02645 s, 512 iters, t-(init.)=1.02368 s t(norm)=1.23419, mflops=4.05124 (err=1.4e-14) 8. Nielsen: elapsed time t=1.34603 s, 2048 iters, t-(init.)=1.33494 s t(norm)=0.402363, mflops=12.4266 (err=7.6e-15) 9. Singleton (f2c): elapsed time t=1.10311 s, 4096 iters, t-(init.)=1.08102 s t(norm)=0.162915, mflops=30.6909 (err=6.7e-16) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.21802 s, 512 iters, t-(init.)=1.21526 s t(norm)=1.46516, mflops=3.41259 (err=6.5e-16) Top mflops for N=210 = 72.9036 Normalized results and averages for N=210: fft 0: mflops = 69.6142 (norm. = 0.95488), norm. avg. 
(of 10) = 0.503453 fft 1: mflops = 69.6276 (norm. = 0.955064), norm. avg. (of 10) = 0.499129 fft 2: mflops = 11.7533 (norm. = 0.161217), norm. avg. (of 10) = 0.179043 fft 3: mflops = 72.9036 (norm. = 1), norm. avg. (of 10) = 0.998223 fft 4: mflops = 69.719 (norm. = 0.956317), norm. avg. (of 10) = 0.993743 fft 5: mflops = 12.9954 (norm. = 0.178255), norm. avg. (of 10) = 0.200078 fft 6: mflops = 24.8746 (norm. = 0.341199), norm. avg. (of 10) = 0.342374 fft 7: mflops = 4.05124 (norm. = 0.0555698), norm. avg. (of 10) = 0.0764423 fft 8: mflops = 12.4266 (norm. = 0.170452), norm. avg. (of 10) = 0.107941 fft 9: mflops = 30.6909 (norm. = 0.420979), norm. avg. (of 10) = 0.338389 fft 10: mflops = -1 (norm. = -0.0137167), norm. avg. (of 9) = 0.110678 fft 11: mflops = 3.41259 (norm. = 0.0468096), norm. avg. (of 10) = 0.0572411 Benchmarking for array size = 504: 0. CWP (min N): elapsed time t=1.26511 s, 4096 iters, t-(init.)=1.21211 s t(norm)=0.0654045, mflops=76.4473 1. CWP (best N): elapsed time t=1.26498 s, 4096 iters, t-(init.)=1.21201 s t(norm)=0.065399, mflops=76.4537 2. FFTPACK (f2c): elapsed time t=1.10758 s, 512 iters, t-(init.)=1.10082 s t(norm)=0.475194, mflops=10.522 (err=9.8e-16) FFTW_MEASURE plan: (cost = 4.646641e-04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 6 FFTW_NOTW 12 3. FFTW: elapsed time t=1.88754 s, 4096 iters, t-(init.)=1.83441 s t(norm)=0.0989831, mflops=50.5137 (err=9.2e-16) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.02747 s, 2048 iters, t-(init.)=1.00082 s t(norm)=0.108007, mflops=46.2933 (err=8.8e-16) 5. Frigo-old: elapsed time t=1.76638 s, 1024 iters, t-(init.)=1.75262 s t(norm)=0.37828, mflops=13.2177 (err=1.0e-15) 6. GSL: elapsed time t=1.9087 s, 2048 iters, t-(init.)=1.88221 s t(norm)=0.203125, mflops=24.6154 (err=8.9e-16) 7. NAPACK (f2c): elapsed time t=1.13743 s, 256 iters, t-(init.)=1.13409 s t(norm)=0.979116, mflops=5.10665 (err=4.2e-14) 8. 
Nielsen: elapsed time t=1.02587 s, 512 iters, t-(init.)=1.01928 s t(norm)=0.439996, mflops=11.3637 (err=5.8e-15) 9. Singleton (f2c): elapsed time t=1.27743 s, 2048 iters, t-(init.)=1.25098 s t(norm)=0.135003, mflops=37.0361 (err=1.3e-15) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.6205 s, 256 iters, t-(init.)=1.61717 s t(norm)=1.39617, mflops=3.58122 (err=1.0e-15) Top mflops for N=504 = 76.4537 Normalized results and averages for N=504: fft 0: mflops = 76.4473 (norm. = 0.999916), norm. avg. (of 11) = 0.548586 fft 1: mflops = 76.4537 (norm. = 1), norm. avg. (of 11) = 0.544663 fft 2: mflops = 10.522 (norm. = 0.137626), norm. avg. (of 11) = 0.175277 fft 3: mflops = 50.5137 (norm. = 0.660709), norm. avg. (of 11) = 0.96754 fft 4: mflops = 46.2933 (norm. = 0.605508), norm. avg. (of 11) = 0.958449 fft 5: mflops = 13.2177 (norm. = 0.172885), norm. avg. (of 11) = 0.197606 fft 6: mflops = 24.6154 (norm. = 0.321964), norm. avg. (of 11) = 0.340518 fft 7: mflops = 5.10665 (norm. = 0.066794), norm. avg. (of 11) = 0.0755652 fft 8: mflops = 11.3637 (norm. = 0.148635), norm. avg. (of 11) = 0.111641 fft 9: mflops = 37.0361 (norm. = 0.484425), norm. avg. (of 11) = 0.351665 fft 10: mflops = -1 (norm. = -0.0130798), norm. avg. (of 9) = 0.110678 fft 11: mflops = 3.58122 (norm. = 0.0468416), norm. avg. (of 11) = 0.0562957 Benchmarking for array size = 1000: 0. CWP (min N) (N=1001): elapsed time t=1.14775 s, 1024 iters, t-(init.)=1.11981 s t(norm)=0.109732, mflops=45.5656 1. CWP (best N) (N=1008): elapsed time t=1.61847 s, 2048 iters, t-(init.)=1.56239 s t(norm)=0.0765505, mflops=65.3163 2. FFTPACK (f2c): elapsed time t=1.05288 s, 256 iters, t-(init.)=1.04545 s t(norm)=0.409779, mflops=12.2017 (err=3.1e-15) FFTW_MEASURE plan: (cost = 1.188234e-03) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 3. 
FFTW: elapsed time t=1.22862 s, 1024 iters, t-(init.)=1.20041 s t(norm)=0.11763, mflops=42.5061 (err=3.1e-15) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 4. FFTW_ESTIMATE: elapsed time t=1.22831 s, 1024 iters, t-(init.)=1.20057 s t(norm)=0.117646, mflops=42.5005 (err=3.1e-15) 5. Frigo-old: elapsed time t=1.06908 s, 256 iters, t-(init.)=1.06223 s t(norm)=0.416357, mflops=12.0089 (err=3.1e-15) 6. GSL: elapsed time t=1.94868 s, 512 iters, t-(init.)=1.93456 s t(norm)=0.379141, mflops=13.1877 (err=3.1e-15) 7. NAPACK (f2c): elapsed time t=1.75791 s, 128 iters, t-(init.)=1.7544 s t(norm)=1.37533, mflops=3.63548 (err=1.8e-14) 8. Nielsen: elapsed time t=1.74723 s, 512 iters, t-(init.)=1.73308 s t(norm)=0.339654, mflops=14.7209 (err=1.5e-14) 9. Singleton (f2c): elapsed time t=1.12517 s, 1024 iters, t-(init.)=1.09752 s t(norm)=0.107548, mflops=46.4911 (err=4.7e-15) 10. Temperton (f2c): elapsed time t=1.26461 s, 256 iters, t-(init.)=1.25743 s t(norm)=0.49287, mflops=10.1447 (err=3.0e-15) 11. Valkenburg: elapsed time t=1.98566 s, 128 iters, t-(init.)=1.9819 s t(norm)=1.55367, mflops=3.21818 (err=3.0e-15) Top mflops for N=1000 = 65.3163 Normalized results and averages for N=1000: fft 0: mflops = 45.5656 (norm. = 0.697614), norm. avg. (of 12) = 0.561005 fft 1: mflops = 65.3163 (norm. = 1), norm. avg. (of 12) = 0.582608 fft 2: mflops = 12.2017 (norm. = 0.186809), norm. avg. (of 12) = 0.176238 fft 3: mflops = 42.5061 (norm. = 0.650773), norm. avg. (of 12) = 0.941142 fft 4: mflops = 42.5005 (norm. = 0.650687), norm. avg. (of 12) = 0.932802 fft 5: mflops = 12.0089 (norm. = 0.183858), norm. avg. (of 12) = 0.19646 fft 6: mflops = 13.1877 (norm. = 0.201905), norm. avg. (of 12) = 0.328967 fft 7: mflops = 3.63548 (norm. = 0.0556596), norm. avg. (of 12) = 0.0739064 fft 8: mflops = 14.7209 (norm. = 0.225378), norm. avg. (of 12) = 0.121119 fft 9: mflops = 46.4911 (norm. = 0.711783), norm. avg. (of 12) = 0.381675 fft 10: mflops = 10.1447 (norm. 
= 0.155316), norm. avg. (of 10) = 0.115142 fft 11: mflops = 3.21818 (norm. = 0.0492707), norm. avg. (of 12) = 0.0557103 Benchmarking for array size = 1960: 0. CWP (min N) (N=1980): elapsed time t=1.53361 s, 512 iters, t-(init.)=1.3388 s t(norm)=0.121985, mflops=40.9886 1. CWP (best N) (N=1980): elapsed time t=1.53437 s, 512 iters, t-(init.)=1.33952 s t(norm)=0.122051, mflops=40.9666 2. FFTPACK (f2c): elapsed time t=1.96342 s, 128 iters, t-(init.)=1.91502 s t(norm)=0.697947, mflops=7.16387 (err=1.5e-15) FFTW_MEASURE plan: (cost = 2.843000e-03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 4 FFTW_TWIDDLE 7 FFTW_NOTW 10 3. FFTW: elapsed time t=1.42811 s, 512 iters, t-(init.)=1.23498 s t(norm)=0.112525, mflops=44.4345 (err=1.5e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.4332 s, 512 iters, t-(init.)=1.24017 s t(norm)=0.112998, mflops=44.2486 (err=1.5e-15) 5. Frigo-old: elapsed time t=1.26517 s, 128 iters, t-(init.)=1.2168 s t(norm)=0.443474, mflops=11.2746 (err=1.5e-15) 6. GSL: elapsed time t=1.91096 s, 256 iters, t-(init.)=1.81349 s t(norm)=0.330472, mflops=15.1299 (err=1.6e-15) 7. NAPACK (f2c): elapsed time t=1.04902 s, 32 iters, t-(init.)=1.03682 s t(norm)=1.51152, mflops=3.30792 (err=1.3e-13) 8. Nielsen: elapsed time t=1.32372 s, 128 iters, t-(init.)=1.27541 s t(norm)=0.464837, mflops=10.7565 (err=1.7e-14) 9. Singleton (f2c): elapsed time t=1.36521 s, 256 iters, t-(init.)=1.26875 s t(norm)=0.231205, mflops=21.6258 (err=2.3e-15) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.22308 s, 32 iters, t-(init.)=1.21086 s t(norm)=1.76524, mflops=2.83248 (err=1.4e-15) Top mflops for N=1960 = 44.4345 Normalized results and averages for N=1960: fft 0: mflops = 40.9886 (norm. = 0.922451), norm. avg. (of 13) = 0.588809 fft 1: mflops = 40.9666 (norm. = 0.921956), norm. avg. (of 13) = 0.608712 fft 2: mflops = 7.16387 (norm. = 0.161223), norm. avg. 
(of 13) = 0.175083 fft 3: mflops = 44.4345 (norm. = 1), norm. avg. (of 13) = 0.94567 fft 4: mflops = 44.2486 (norm. = 0.995817), norm. avg. (of 13) = 0.937649 fft 5: mflops = 11.2746 (norm. = 0.253736), norm. avg. (of 13) = 0.200866 fft 6: mflops = 15.1299 (norm. = 0.340498), norm. avg. (of 13) = 0.329854 fft 7: mflops = 3.30792 (norm. = 0.074445), norm. avg. (of 13) = 0.0739478 fft 8: mflops = 10.7565 (norm. = 0.242075), norm. avg. (of 13) = 0.130423 fft 9: mflops = 21.6258 (norm. = 0.48669), norm. avg. (of 13) = 0.389753 fft 10: mflops = -1 (norm. = -0.0225051), norm. avg. (of 10) = 0.115142 fft 11: mflops = 2.83248 (norm. = 0.063745), norm. avg. (of 13) = 0.0563283 Benchmarking for array size = 4725: 0. CWP (min N) (N=5005): elapsed time t=1.21075 s, 128 iters, t-(init.)=1.08664 s t(norm)=0.147196, mflops=33.9684 1. CWP (best N) (N=5040): elapsed time t=1.05529 s, 128 iters, t-(init.)=0.931041 s t(norm)=0.126119, mflops=39.6451 2. FFTPACK (f2c): elapsed time t=1.00111 s, 32 iters, t-(init.)=0.971824 s t(norm)=0.526573, mflops=9.49535 (err=2.4e-15) FFTW_MEASURE plan: (cost = 8.059125e-03) FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.03344 s, 128 iters, t-(init.)=0.916683 s t(norm)=0.124174, mflops=40.2661 (err=2.4e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.0427 s, 128 iters, t-(init.)=0.926309 s t(norm)=0.125478, mflops=39.8477 (err=2.3e-15) 5. Frigo-old: elapsed time t=1.19472 s, 32 iters, t-(init.)=1.16545 s t(norm)=0.631486, mflops=7.91783 (err=2.3e-15) 6. GSL: elapsed time t=1.23494 s, 64 iters, t-(init.)=1.17657 s t(norm)=0.318756, mflops=15.686 (err=2.4e-15) 7. NAPACK (f2c): elapsed time t=1.2684 s, 16 iters, t-(init.)=1.25389 s t(norm)=1.35882, mflops=3.67968 (err=3.5e-13) 8. Nielsen: elapsed time t=1.04055 s, 32 iters, t-(init.)=1.01124 s t(norm)=0.547932, mflops=9.12522 (err=4.4e-14) 9. 
Singleton (f2c): elapsed time t=1.07257 s, 64 iters, t-(init.)=1.01448 s t(norm)=0.274844, mflops=18.1921 (err=3.3e-15) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=1.53303 s, 16 iters, t-(init.)=1.51836 s t(norm)=1.64542, mflops=3.03874 (err=2.3e-15) Top mflops for N=4725 = 40.2661 Normalized results and averages for N=4725: fft 0: mflops = 33.9684 (norm. = 0.843597), norm. avg. (of 14) = 0.607008 fft 1: mflops = 39.6451 (norm. = 0.984579), norm. avg. (of 14) = 0.635559 fft 2: mflops = 9.49535 (norm. = 0.235815), norm. avg. (of 14) = 0.179421 fft 3: mflops = 40.2661 (norm. = 1), norm. avg. (of 14) = 0.949551 fft 4: mflops = 39.8477 (norm. = 0.989608), norm. avg. (of 14) = 0.94136 fft 5: mflops = 7.91783 (norm. = 0.196638), norm. avg. (of 14) = 0.200564 fft 6: mflops = 15.686 (norm. = 0.389558), norm. avg. (of 14) = 0.334119 fft 7: mflops = 3.67968 (norm. = 0.091384), norm. avg. (of 14) = 0.0751932 fft 8: mflops = 9.12522 (norm. = 0.226623), norm. avg. (of 14) = 0.137294 fft 9: mflops = 18.1921 (norm. = 0.451798), norm. avg. (of 14) = 0.394185 fft 10: mflops = -1 (norm. = -0.0248348), norm. avg. (of 10) = 0.115142 fft 11: mflops = 3.03874 (norm. = 0.0754665), norm. avg. (of 14) = 0.0576953 Benchmarking for array size = 10368: 0. CWP (min N) (N=10920): elapsed time t=1.42711 s, 64 iters, t-(init.)=1.29129 s t(norm)=0.14588, mflops=34.2747 1. CWP (best N) (N=11088): elapsed time t=1.31789 s, 64 iters, t-(init.)=1.17937 s t(norm)=0.133237, mflops=37.5272 2. FFTPACK (f2c): elapsed time t=1.01582 s, 16 iters, t-(init.)=0.983196 s t(norm)=0.444298, mflops=11.2537 (err=4.7e-15) FFTW_MEASURE plan: (cost = 1.808800e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 6 FFTW_TWIDDLE 9 FFTW_NOTW 12 3. FFTW: elapsed time t=1.28921 s, 64 iters, t-(init.)=1.15992 s t(norm)=0.131039, mflops=38.1566 (err=4.7e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 4. 
FFTW_ESTIMATE: elapsed time t=1.41883 s, 64 iters, t-(init.)=1.28875 s t(norm)=0.145593, mflops=34.3422 (err=4.7e-15) 5. Frigo-old: elapsed time t=1.13022 s, 16 iters, t-(init.)=1.09736 s t(norm)=0.495888, mflops=10.0829 (err=4.8e-15) 6. GSL: elapsed time t=1.15338 s, 32 iters, t-(init.)=1.08854 s t(norm)=0.245951, mflops=20.3293 (err=4.7e-15) 7. NAPACK (f2c): elapsed time t=1.94036 s, 16 iters, t-(init.)=1.908 s t(norm)=0.862208, mflops=5.79906 (err=7.8e-14) 8. Nielsen: elapsed time t=1.36323 s, 16 iters, t-(init.)=1.33053 s t(norm)=0.601254, mflops=8.31595 (err=1.1e-14) 9. Singleton (f2c): elapsed time t=1.41107 s, 32 iters, t-(init.)=1.34642 s t(norm)=0.304217, mflops=16.4356 (err=6.7e-15) 10. Temperton (f2c): elapsed time t=1.30963 s, 16 iters, t-(init.)=1.27714 s t(norm)=0.577127, mflops=8.6636 (err=4.7e-15) 11. Valkenburg: elapsed time t=1.77706 s, 8 iters, t-(init.)=1.7605 s t(norm)=1.59111, mflops=3.14245 (err=4.7e-15) Top mflops for N=10368 = 38.1566 Normalized results and averages for N=10368: fft 0: mflops = 34.2747 (norm. = 0.898264), norm. avg. (of 15) = 0.626425 fft 1: mflops = 37.5272 (norm. = 0.983506), norm. avg. (of 15) = 0.658756 fft 2: mflops = 11.2537 (norm. = 0.294935), norm. avg. (of 15) = 0.187122 fft 3: mflops = 38.1566 (norm. = 1), norm. avg. (of 15) = 0.952914 fft 4: mflops = 34.3422 (norm. = 0.900034), norm. avg. (of 15) = 0.938605 fft 5: mflops = 10.0829 (norm. = 0.264251), norm. avg. (of 15) = 0.20481 fft 6: mflops = 20.3293 (norm. = 0.532785), norm. avg. (of 15) = 0.347363 fft 7: mflops = 5.79906 (norm. = 0.151981), norm. avg. (of 15) = 0.0803124 fft 8: mflops = 8.31595 (norm. = 0.217943), norm. avg. (of 15) = 0.142671 fft 9: mflops = 16.4356 (norm. = 0.430742), norm. avg. (of 15) = 0.396622 fft 10: mflops = 8.6636 (norm. = 0.227054), norm. avg. (of 11) = 0.125316 fft 11: mflops = 3.14245 (norm. = 0.0823568), norm. avg. (of 15) = 0.0593394 Benchmarking for array size = 27000: 0. 
CWP (min N) (N=27720): elapsed time t=1.88438 s, 32 iters, t-(init.)=1.70516 s t(norm)=0.134067, mflops=37.2947 1. CWP (best N) (N=27720): elapsed time t=1.88329 s, 32 iters, t-(init.)=1.70349 s t(norm)=0.133936, mflops=37.3312 2. FFTPACK (f2c): elapsed time t=1.99609 s, 8 iters, t-(init.)=1.951 s t(norm)=0.613586, mflops=8.14882 (err=7.3e-15) FFTW_MEASURE plan: (cost = 8.755100e-02) FFTW_TWIDDLE 9 FFTW_TWIDDLE 10 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 12 3. FFTW: elapsed time t=1.34813 s, 16 iters, t-(init.)=1.25596 s t(norm)=0.197498, mflops=25.3167 (err=7.2e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.37887 s, 16 iters, t-(init.)=1.28611 s t(norm)=0.20224, mflops=24.7231 (err=7.3e-15) 5. Frigo-old: elapsed time t=1.22491 s, 4 iters, t-(init.)=1.20039 s t(norm)=0.755042, mflops=6.62214 (err=7.3e-15) 6. GSL: elapsed time t=1.47868 s, 8 iters, t-(init.)=1.43377 s t(norm)=0.45092, mflops=11.0884 (err=7.3e-15) 7. NAPACK (f2c): elapsed time t=1.19992 s, 2 iters, t-(init.)=1.18707 s t(norm)=1.49333, mflops=3.34823 (err=1.0e-12) 8. Nielsen: elapsed time t=1.85406 s, 8 iters, t-(init.)=1.80315 s t(norm)=0.567088, mflops=8.81697 (err=2.0e-13) 9. Singleton (f2c): elapsed time t=1.03669 s, 8 iters, t-(init.)=0.993287 s t(norm)=0.312387, mflops=16.0058 (err=1.1e-14) 10. Temperton (f2c): elapsed time t=1.89427 s, 8 iters, t-(init.)=1.85047 s t(norm)=0.581969, mflops=8.59153 (err=7.3e-15) 11. Valkenburg: elapsed time t=1.51223 s, 2 iters, t-(init.)=1.49779 s t(norm)=1.88421, mflops=2.65363 (err=7.3e-15) Top mflops for N=27000 = 37.3312 Normalized results and averages for N=27000: fft 0: mflops = 37.2947 (norm. = 0.999021), norm. avg. (of 16) = 0.649712 fft 1: mflops = 37.3312 (norm. = 1), norm. avg. (of 16) = 0.680083 fft 2: mflops = 8.14882 (norm. = 0.218284), norm. avg. (of 16) = 0.18907 fft 3: mflops = 25.3167 (norm. = 0.678163), norm. avg. 
(of 16) = 0.935742 fft 4: mflops = 24.7231 (norm. = 0.662262), norm. avg. (of 16) = 0.921334 fft 5: mflops = 6.62214 (norm. = 0.177389), norm. avg. (of 16) = 0.203096 fft 6: mflops = 11.0884 (norm. = 0.297028), norm. avg. (of 16) = 0.344217 fft 7: mflops = 3.34823 (norm. = 0.0896897), norm. avg. (of 16) = 0.0808985 fft 8: mflops = 8.81697 (norm. = 0.236182), norm. avg. (of 16) = 0.148515 fft 9: mflops = 16.0058 (norm. = 0.42875), norm. avg. (of 16) = 0.39863 fft 10: mflops = 8.59153 (norm. = 0.230143), norm. avg. (of 12) = 0.134051 fft 11: mflops = 2.65363 (norm. = 0.0710834), norm. avg. (of 16) = 0.0600734 Benchmarking for array size = 75600: 0. CWP (min N) (N=80080): elapsed time t=1.33191 s, 4 iters, t-(init.)=1.15902 s t(norm)=0.236499, mflops=21.1417 1. CWP (best N) (N=80080): elapsed time t=1.3319 s, 4 iters, t-(init.)=1.15907 s t(norm)=0.236509, mflops=21.1408 2. FFTPACK (f2c): elapsed time t=1.09867 s, 1 iters, t-(init.)=1.06103 s t(norm)=0.866022, mflops=5.77352 (err=9.4e-15) FFTW_MEASURE plan: (cost = 2.966200e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 5 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.17077 s, 4 iters, t-(init.)=1.01107 s t(norm)=0.20631, mflops=24.2354 (err=9.4e-15) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.24757 s, 4 iters, t-(init.)=1.0864 s t(norm)=0.221681, mflops=22.5549 (err=9.4e-15) 5. Frigo-old: elapsed time t=1.08249 s, 1 iters, t-(init.)=1.0474 s t(norm)=0.854897, mflops=5.84866 (err=9.4e-15) 6. GSL: elapsed time t=1.39829 s, 2 iters, t-(init.)=1.31979 s t(norm)=0.538611, mflops=9.28313 (err=9.4e-15) 7. NAPACK (f2c): elapsed time t=2.1297 s, 1 iters, t-(init.)=2.09503 s t(norm)=1.70998, mflops=2.92402 (err=5.1e-12) 8. Nielsen: elapsed time t=1.1311 s, 1 iters, t-(init.)=1.08985 s t(norm)=0.889543, mflops=5.62086 (err=4.7e-13) 9. 
Singleton (f2c): elapsed time t=1.72868 s, 2 iters, t-(init.)=1.65612 s t(norm)=0.675867, mflops=7.3979 (err=1.3e-14) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=2.68437 s, 1 iters, t-(init.)=2.64681 s t(norm)=2.16034, mflops=2.31445 (err=9.5e-15) Top mflops for N=75600 = 24.2354 Normalized results and averages for N=75600: fft 0: mflops = 21.1417 (norm. = 0.872349), norm. avg. (of 17) = 0.662808 fft 1: mflops = 21.1408 (norm. = 0.872311), norm. avg. (of 17) = 0.691391 fft 2: mflops = 5.77352 (norm. = 0.238227), norm. avg. (of 17) = 0.191962 fft 3: mflops = 24.2354 (norm. = 1), norm. avg. (of 17) = 0.939522 fft 4: mflops = 22.5549 (norm. = 0.930661), norm. avg. (of 17) = 0.921883 fft 5: mflops = 5.84866 (norm. = 0.241327), norm. avg. (of 17) = 0.205345 fft 6: mflops = 9.28313 (norm. = 0.38304), norm. avg. (of 17) = 0.346501 fft 7: mflops = 2.92402 (norm. = 0.120651), norm. avg. (of 17) = 0.0832368 fft 8: mflops = 5.62086 (norm. = 0.231928), norm. avg. (of 17) = 0.153422 fft 9: mflops = 7.3979 (norm. = 0.305252), norm. avg. (of 17) = 0.393137 fft 10: mflops = -1 (norm. = -0.0412619), norm. avg. (of 12) = 0.134051 fft 11: mflops = 2.31445 (norm. = 0.0954986), norm. avg. (of 17) = 0.0621573 Benchmarking for array size = 165375: 0. CWP (min N) (N=180180): elapsed time t=1.00868 s, 1 iters, t-(init.)=0.911383 s t(norm)=0.317905, mflops=15.728 1. CWP (best N) (N=180180): elapsed time t=1.00868 s, 1 iters, t-(init.)=0.911194 s t(norm)=0.317839, mflops=15.7312 2. FFTPACK (f2c): elapsed time t=3.57774 s, 1 iters, t-(init.)=3.49145 s t(norm)=1.21788, mflops=4.10551 (err=3.7e-14) FFTW_MEASURE plan: (cost = 7.711030e-01) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 15 3. 
FFTW: elapsed time t=1.54737 s, 2 iters, t-(init.)=1.37107 s t(norm)=0.239125, mflops=20.9095 (err=3.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.60455 s, 2 iters, t-(init.)=1.4285 s t(norm)=0.249141, mflops=20.0689 (err=3.7e-14) 5. Frigo-old: elapsed time t=3.42612 s, 1 iters, t-(init.)=3.34242 s t(norm)=1.16589, mflops=4.28857 (err=3.7e-14) 6. GSL: elapsed time t=1.62534 s, 1 iters, t-(init.)=1.5388 s t(norm)=0.536757, mflops=9.3152 (err=3.7e-14) 7. NAPACK (f2c): elapsed time t=5.46927 s, 1 iters, t-(init.)=5.38582 s t(norm)=1.87866, mflops=2.66147 (err=1.6e-11) 8. Nielsen: elapsed time t=2.93266 s, 1 iters, t-(init.)=2.8436 s t(norm)=0.991894, mflops=5.04086 (err=1.6e-12) 9. Singleton (f2c): elapsed time t=2.1157 s, 1 iters, t-(init.)=2.03444 s t(norm)=0.709646, mflops=7.04577 (err=5.6e-14) 10. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 11. Valkenburg: elapsed time t=6.72135 s, 1 iters, t-(init.)=6.63678 s t(norm)=2.31502, mflops=2.15981 (err=3.6e-14) Top mflops for N=165375 = 20.9095 Normalized results and averages for N=165375: fft 0: mflops = 15.728 (norm. = 0.752191), norm. avg. (of 18) = 0.667774 fft 1: mflops = 15.7312 (norm. = 0.752347), norm. avg. (of 18) = 0.694777 fft 2: mflops = 4.10551 (norm. = 0.196346), norm. avg. (of 18) = 0.192205 fft 3: mflops = 20.9095 (norm. = 1), norm. avg. (of 18) = 0.942882 fft 4: mflops = 20.0689 (norm. = 0.959798), norm. avg. (of 18) = 0.923989 fft 5: mflops = 4.28857 (norm. = 0.205101), norm. avg. (of 18) = 0.205331 fft 6: mflops = 9.3152 (norm. = 0.4455), norm. avg. (of 18) = 0.352001 fft 7: mflops = 2.66147 (norm. = 0.127285), norm. avg. (of 18) = 0.085684 fft 8: mflops = 5.04086 (norm. = 0.24108), norm. avg. (of 18) = 0.158292 fft 9: mflops = 7.04577 (norm. = 0.336964), norm. avg. (of 18) = 0.390016 fft 10: mflops = -1 (norm. = -0.0478251), norm. avg. 
(of 12) = 0.134051 fft 11: mflops = 2.15981 (norm. = 0.103293), norm. avg. (of 18) = 0.0644426
------------------------------------------------------
@@@@ bench.3d.p2.log
Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) Maximum array size N = 2097152 Benchmarking FFTs: 0. FFTW 1. HARM (f2c) 2. NR (C) 3. PDA (f2c) 4. Singleton (f2c) 5. Temperton (f2c) Computing normalized averages (6 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.59112 s, 65536 iters, t-(init.)=1.48063 s t(norm)=0.0588348, mflops=84.9837 (err=1.9e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. NR (C): elapsed time t=1.98827 s, 32768 iters, t-(init.)=1.93291 s t(norm)=0.153614, mflops=32.5491 (err=2.3e-16) 3. PDA (f2c): elapsed time t=1.70106 s, 4096 iters, t-(init.)=1.69409 s t(norm)=1.07707, mflops=4.64221 (err=2.8e-16) 4. Singleton (f2c): elapsed time t=1.16139 s, 32768 iters, t-(init.)=1.10614 s t(norm)=0.0879085, mflops=56.8773 (err=1.9e-16) 5. Temperton (f2c): elapsed time t=1.15882 s, 8192 iters, t-(init.)=1.1448 s t(norm)=0.363921, mflops=13.7392 (err=1.9e-16) Top mflops for N=64 = 84.9837 Normalized results and averages for N=64: fft 0: mflops = 84.9837 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.011767), norm. avg. (of 0) = -1 fft 2: mflops = 32.5491 (norm. = 0.383003), norm. avg. (of 1) = 0.383003 fft 3: mflops = 4.64221 (norm. = 0.0546247), norm. avg. (of 1) = 0.0546247 fft 4: mflops = 56.8773 (norm. = 0.669273), norm. avg. (of 1) = 0.669273 fft 5: mflops = 13.7392 (norm. = 0.161669), norm. avg. (of 1) = 0.161669 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.92692 s, 8192 iters, t-(init.)=1.81943 s t(norm)=0.0481985, mflops=103.738 (err=3.8e-16) 1.
HARM (f2c): elapsed time t=1.54435 s, 1024 iters, t-(init.)=1.53095 s t(norm)=0.32445, mflops=15.4107 (err=3.6e-16) 2. NR (C): elapsed time t=1.11187 s, 2048 iters, t-(init.)=1.08487 s t(norm)=0.114957, mflops=43.4945 (err=2.9e-16) 3. PDA (f2c): elapsed time t=1.45222 s, 512 iters, t-(init.)=1.4454 s t(norm)=0.61264, mflops=8.1614 (err=3.1e-16) 4. Singleton (f2c): elapsed time t=1.8933 s, 4096 iters, t-(init.)=1.83939 s t(norm)=0.0974544, mflops=51.3061 (err=3.1e-16) 5. Temperton (f2c): elapsed time t=1.61494 s, 1024 iters, t-(init.)=1.60136 s t(norm)=0.339373, mflops=14.733 (err=3.7e-16) Top mflops for N=512 = 103.738 Normalized results and averages for N=512: fft 0: mflops = 103.738 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 15.4107 (norm. = 0.148555), norm. avg. (of 1) = 0.148555 fft 2: mflops = 43.4945 (norm. = 0.419274), norm. avg. (of 2) = 0.401139 fft 3: mflops = 8.1614 (norm. = 0.0786735), norm. avg. (of 2) = 0.0666491 fft 4: mflops = 51.3061 (norm. = 0.494575), norm. avg. (of 2) = 0.581924 fft 5: mflops = 14.733 (norm. = 0.142022), norm. avg. (of 2) = 0.151846 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.5728 s, 256 iters, t-(init.)=1.37102 s t(norm)=0.108959, mflops=45.889 (err=4.1e-16) 1. HARM (f2c): elapsed time t=1.17355 s, 64 iters, t-(init.)=1.12237 s t(norm)=0.356791, mflops=14.0138 (err=4.0e-16) 2. NR (C): elapsed time t=1.38961 s, 64 iters, t-(init.)=1.33921 s t(norm)=0.425724, mflops=11.7447 (err=4.7e-16) 3. PDA (f2c): elapsed time t=1.65063 s, 64 iters, t-(init.)=1.59997 s t(norm)=0.508617, mflops=9.83058 (err=3.8e-16) 4. Singleton (f2c): elapsed time t=1.62618 s, 128 iters, t-(init.)=1.52522 s t(norm)=0.242428, mflops=20.6247 (err=4.7e-16) 5. Temperton (f2c): elapsed time t=1.14503 s, 64 iters, t-(init.)=1.09446 s t(norm)=0.347918, mflops=14.3712 (err=4.1e-16) Top mflops for N=4096 = 45.889 Normalized results and averages for N=4096: fft 0: mflops = 45.889 (norm. = 1), norm. avg. 
(of 3) = 1 fft 1: mflops = 14.0138 (norm. = 0.305385), norm. avg. (of 2) = 0.22697 fft 2: mflops = 11.7447 (norm. = 0.255937), norm. avg. (of 3) = 0.352738 fft 3: mflops = 9.83058 (norm. = 0.214225), norm. avg. (of 3) = 0.115841 fft 4: mflops = 20.6247 (norm. = 0.449448), norm. avg. (of 3) = 0.537766 fft 5: mflops = 14.3712 (norm. = 0.313173), norm. avg. (of 3) = 0.205622 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.77041 s, 32 iters, t-(init.)=1.55699 s t(norm)=0.0989909, mflops=50.5097 (err=4.8e-16) 1. HARM (f2c): elapsed time t=1.51822 s, 8 iters, t-(init.)=1.46421 s t(norm)=0.372368, mflops=13.4276 (err=4.8e-16) 2. NR (C): elapsed time t=1.81787 s, 8 iters, t-(init.)=1.76419 s t(norm)=0.448658, mflops=11.1444 (err=6.0e-16) 3. PDA (f2c): elapsed time t=1.09601 s, 4 iters, t-(init.)=1.06867 s t(norm)=0.543554, mflops=9.19872 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.1824 s, 8 iters, t-(init.)=1.12856 s t(norm)=0.287008, mflops=17.4211 (err=4.9e-16) 5. Temperton (f2c): elapsed time t=1.71609 s, 8 iters, t-(init.)=1.66226 s t(norm)=0.422735, mflops=11.8277 (err=5.1e-16) Top mflops for N=32768 = 50.5097 Normalized results and averages for N=32768: fft 0: mflops = 50.5097 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 13.4276 (norm. = 0.265842), norm. avg. (of 3) = 0.239927 fft 2: mflops = 11.1444 (norm. = 0.220638), norm. avg. (of 4) = 0.319713 fft 3: mflops = 9.19872 (norm. = 0.182118), norm. avg. (of 4) = 0.13241 fft 4: mflops = 17.4211 (norm. = 0.344906), norm. avg. (of 4) = 0.489551 fft 5: mflops = 11.8277 (norm. = 0.234168), norm. avg. (of 4) = 0.212758 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.21269 s, 1 iters, t-(init.)=1.07087 s t(norm)=0.226947, mflops=22.0316 (err=1.0e-15) 1. HARM (f2c): elapsed time t=2.81457 s, 1 iters, t-(init.)=2.67294 s t(norm)=0.56647, mflops=8.82659 (err=1.0e-15) 2. 
NR (C): elapsed time t=6.46932 s, 1 iters, t-(init.)=6.32763 s t(norm)=1.341, mflops=3.72856 (err=1.0e-15) 3. PDA (f2c): elapsed time t=3.65038 s, 1 iters, t-(init.)=3.5087 s t(norm)=0.74359, mflops=6.72413 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=3.62907 s, 1 iters, t-(init.)=3.487 s t(norm)=0.738991, mflops=6.76598 (err=1.4e-15) 5. Temperton (f2c): elapsed time t=3.11786 s, 1 iters, t-(init.)=2.97671 s t(norm)=0.630846, mflops=7.92586 (err=9.9e-16) Top mflops for N=262144 = 22.0316 Normalized results and averages for N=262144: fft 0: mflops = 22.0316 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 8.82659 (norm. = 0.400633), norm. avg. (of 4) = 0.280104 fft 2: mflops = 3.72856 (norm. = 0.169237), norm. avg. (of 5) = 0.289618 fft 3: mflops = 6.72413 (norm. = 0.305204), norm. avg. (of 5) = 0.166969 fft 4: mflops = 6.76598 (norm. = 0.307104), norm. avg. (of 5) = 0.453061 fft 5: mflops = 7.92586 (norm. = 0.35975), norm. avg. (of 5) = 0.242156 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=2.48549 s, 1 iters, t-(init.)=2.20238 s t(norm)=0.22109, mflops=22.6152 (err=9.2e-16) 1. HARM (f2c): elapsed time t=6.01012 s, 1 iters, t-(init.)=5.72714 s t(norm)=0.574929, mflops=8.69672 (err=9.4e-16) 2. NR (C): elapsed time t=13.3626 s, 1 iters, t-(init.)=13.0789 s t(norm)=1.31295, mflops=3.80823 (err=9.6e-16) 3. PDA (f2c): elapsed time t=7.37318 s, 1 iters, t-(init.)=7.09036 s t(norm)=0.711778, mflops=7.02466 (err=8.8e-16) 4. Singleton (f2c): elapsed time t=7.72 s, 1 iters, t-(init.)=7.43621 s t(norm)=0.746497, mflops=6.69795 (err=1.3e-15) 5. Temperton (f2c): elapsed time t=6.56072 s, 1 iters, t-(init.)=6.27759 s t(norm)=0.630187, mflops=7.93415 (err=9.2e-16) Top mflops for N=524288 = 22.6152 Normalized results and averages for N=524288: fft 0: mflops = 22.6152 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 8.69672 (norm. = 0.384552), norm. avg. (of 5) = 0.300993 fft 2: mflops = 3.80823 (norm. = 0.168393), norm. avg. 
(of 6) = 0.269414 fft 3: mflops = 7.02466 (norm. = 0.310617), norm. avg. (of 6) = 0.19091 fft 4: mflops = 6.69795 (norm. = 0.29617), norm. avg. (of 6) = 0.426913 fft 5: mflops = 7.93415 (norm. = 0.350833), norm. avg. (of 6) = 0.260269 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=6.58192 s, 1 iters, t-(init.)=6.01475 s t(norm)=0.286805, mflops=17.4334 (err=1.2e-15) 1. HARM (f2c): elapsed time t=12.3264 s, 1 iters, t-(init.)=11.7586 s t(norm)=0.560696, mflops=8.91749 (err=1.2e-15) 2. NR (C): elapsed time t=28.2466 s, 1 iters, t-(init.)=27.6791 s t(norm)=1.31984, mflops=3.78833 (err=1.3e-15) 3. PDA (f2c): elapsed time t=16.5922 s, 1 iters, t-(init.)=16.0251 s t(norm)=0.764136, mflops=6.54334 (err=1.2e-15) 4. Singleton (f2c): elapsed time t=15.9879 s, 1 iters, t-(init.)=15.4201 s t(norm)=0.735287, mflops=6.80007 (err=1.7e-15) 5. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 17.4334 Normalized results and averages for N=1048576: fft 0: mflops = 17.4334 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 8.91749 (norm. = 0.511517), norm. avg. (of 6) = 0.336081 fft 2: mflops = 3.78833 (norm. = 0.217303), norm. avg. (of 7) = 0.261969 fft 3: mflops = 6.54334 (norm. = 0.375333), norm. avg. (of 7) = 0.217256 fft 4: mflops = 6.80007 (norm. = 0.390059), norm. avg. (of 7) = 0.421648 fft 5: mflops = -1 (norm. = -0.0573611), norm. avg. (of 6) = 0.260269 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=10.3963 s, 1 iters, t-(init.)=9.26166 s t(norm)=0.2103, mflops=23.7755 (err=8.1e-16) 1. HARM (f2c): elapsed time t=27.0897 s, 1 iters, t-(init.)=25.9544 s t(norm)=0.589333, mflops=8.48416 (err=8.0e-16) 2. NR (C): elapsed time t=59.224 s, 1 iters, t-(init.)=58.0876 s t(norm)=1.31897, mflops=3.79085 (err=8.7e-16) 3. PDA (f2c): elapsed time t=28.237 s, 1 iters, t-(init.)=27.1024 s t(norm)=0.615401, mflops=8.12478 (err=7.8e-16) 4. 
Singleton (f2c): elapsed time t=44.2082 s, 1 iters, t-(init.)=43.0723 s t(norm)=0.978022, mflops=5.11236 (err=1.1e-15) 5. Temperton (f2c): elapsed time t=40.4999 s, 1 iters, t-(init.)=39.3642 s t(norm)=0.893824, mflops=5.59394 (err=8.1e-16) Top mflops for N=2097152 = 23.7755 Normalized results and averages for N=2097152: fft 0: mflops = 23.7755 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 8.48416 (norm. = 0.356844), norm. avg. (of 7) = 0.339047 fft 2: mflops = 3.79085 (norm. = 0.159443), norm. avg. (of 8) = 0.249153 fft 3: mflops = 8.12478 (norm. = 0.341729), norm. avg. (of 8) = 0.232816 fft 4: mflops = 5.11236 (norm. = 0.215026), norm. avg. (of 8) = 0.39582 fft 5: mflops = 5.59394 (norm. = 0.235282), norm. avg. (of 7) = 0.2567 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) 120x120x120 (26.3728 MB) Maximum array size N = 1728000 Benchmarking FFTs: 0. FFTW 1. PDA (f2c) 2. Singleton (f2c) 3. Temperton (f2c) Computing normalized averages (4 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1.24686 s, 16384 iters, t-(init.)=1.19325 s t(norm)=0.0836436, mflops=59.7774 (err=2.4e-16) 1. PDA (f2c): elapsed time t=1.61371 s, 2048 iters, t-(init.)=1.60715 s t(norm)=0.901254, mflops=5.54783 (err=2.1e-16) 2. Singleton (f2c): elapsed time t=1.01205 s, 16384 iters, t-(init.)=0.958819 s t(norm)=0.0672104, mflops=74.3932 (err=3.1e-16) 3. 
Temperton (f2c): elapsed time t=1.33001 s, 4096 iters, t-(init.)=1.31636 s t(norm)=0.369091, mflops=13.5468 (err=2.4e-16) Top mflops for N=125 = 74.3932 Normalized results and averages for N=125: fft 0: mflops = 59.7774 (norm. = 0.803533), norm. avg. (of 1) = 0.803533 fft 1: mflops = 5.54783 (norm. = 0.0745743), norm. avg. (of 1) = 0.0745743 fft 2: mflops = 74.3932 (norm. = 1), norm. avg. (of 1) = 1 fft 3: mflops = 13.5468 (norm. = 0.182097), norm. avg. (of 1) = 0.182097 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.7402 s, 16384 iters, t-(init.)=1.64932 s t(norm)=0.0600973, mflops=83.1985 (err=3.0e-16) 1. PDA (f2c): elapsed time t=1.39613 s, 1024 iters, t-(init.)=1.3905 s t(norm)=0.810666, mflops=6.16777 (err=3.7e-16) 2. Singleton (f2c): elapsed time t=1.61 s, 8192 iters, t-(init.)=1.56435 s t(norm)=0.114003, mflops=43.8586 (err=3.1e-16) 3. Temperton (f2c): elapsed time t=1.43675 s, 2048 iters, t-(init.)=1.42522 s t(norm)=0.415453, mflops=12.0351 (err=3.2e-16) Top mflops for N=216 = 83.1985 Normalized results and averages for N=216: fft 0: mflops = 83.1985 (norm. = 1), norm. avg. (of 2) = 0.901767 fft 1: mflops = 6.16777 (norm. = 0.0741332), norm. avg. (of 2) = 0.0743538 fft 2: mflops = 43.8586 (norm. = 0.527156), norm. avg. (of 2) = 0.763578 fft 3: mflops = 12.0351 (norm. = 0.144655), norm. avg. (of 2) = 0.163376 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.062 s, 4096 iters, t-(init.)=1.02584 s t(norm)=0.0866972, mflops=57.672 (err=4.0e-16) 1. PDA (f2c): elapsed time t=1.09095 s, 256 iters, t-(init.)=1.08868 s t(norm)=1.47214, mflops=3.39642 (err=4.0e-16) 2. Singleton (f2c): elapsed time t=1.53315 s, 4096 iters, t-(init.)=1.49712 s t(norm)=0.126527, mflops=39.5171 (err=4.9e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 57.672 Normalized results and averages for N=343: fft 0: mflops = 57.672 (norm. = 1), norm. avg. (of 3) = 0.934511 fft 1: mflops = 3.39642 (norm. = 0.058892), norm. avg. 
(of 3) = 0.0691998 fft 2: mflops = 39.5171 (norm. = 0.685205), norm. avg. (of 3) = 0.737454 fft 3: mflops = -1 (norm. = -0.0173394), norm. avg. (of 2) = 0.163376 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.9213 s, 4096 iters, t-(init.)=1.84456 s t(norm)=0.0649584, mflops=76.9723 (err=5.4e-16) 1. PDA (f2c): elapsed time t=1.14392 s, 256 iters, t-(init.)=1.13915 s t(norm)=0.641863, mflops=7.78982 (err=5.2e-16) 2. Singleton (f2c): elapsed time t=1.31627 s, 2048 iters, t-(init.)=1.27776 s t(norm)=0.0899957, mflops=55.5582 (err=4.9e-16) 3. Temperton (f2c): elapsed time t=1.22228 s, 512 iters, t-(init.)=1.2128 s t(norm)=0.341681, mflops=14.6335 (err=5.8e-16) Top mflops for N=729 = 76.9723 Normalized results and averages for N=729: fft 0: mflops = 76.9723 (norm. = 1), norm. avg. (of 4) = 0.950883 fft 1: mflops = 7.78982 (norm. = 0.101203), norm. avg. (of 4) = 0.0772006 fft 2: mflops = 55.5582 (norm. = 0.721795), norm. avg. (of 4) = 0.733539 fft 3: mflops = 14.6335 (norm. = 0.190114), norm. avg. (of 3) = 0.172289 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.43223 s, 2048 iters, t-(init.)=1.37669 s t(norm)=0.0674518, mflops=74.127 (err=3.8e-16) 1. PDA (f2c): elapsed time t=1.49896 s, 256 iters, t-(init.)=1.49212 s t(norm)=0.584862, mflops=8.54903 (err=4.2e-16) 2. Singleton (f2c): elapsed time t=1.13402 s, 1024 iters, t-(init.)=1.10608 s t(norm)=0.108386, mflops=46.1313 (err=4.4e-16) 3. Temperton (f2c): elapsed time t=1.00924 s, 256 iters, t-(init.)=1.00185 s t(norm)=0.39269, mflops=12.7327 (err=3.6e-16) Top mflops for N=1000 = 74.127 Normalized results and averages for N=1000: fft 0: mflops = 74.127 (norm. = 1), norm. avg. (of 5) = 0.960707 fft 1: mflops = 8.54903 (norm. = 0.115329), norm. avg. (of 5) = 0.0848264 fft 2: mflops = 46.1313 (norm. = 0.622327), norm. avg. (of 5) = 0.711297 fft 3: mflops = 12.7327 (norm. = 0.171768), norm. avg. (of 4) = 0.172159 Benchmarking for array size = 11x11x11: 0. 
FFTW: elapsed time t=1.17054 s, 512 iters, t-(init.)=1.03942 s t(norm)=0.146966, mflops=34.0215 (err=4.0e-16) 1. PDA (f2c): elapsed time t=1.34253 s, 64 iters, t-(init.)=1.32587 s t(norm)=1.49975, mflops=3.33389 (err=4.8e-16) 2. Singleton (f2c): elapsed time t=1.24999 s, 512 iters, t-(init.)=1.11907 s t(norm)=0.158229, mflops=31.5999 (err=6.4e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 34.0215 Normalized results and averages for N=1331: fft 0: mflops = 34.0215 (norm. = 1), norm. avg. (of 6) = 0.967256 fft 1: mflops = 3.33389 (norm. = 0.0979938), norm. avg. (of 6) = 0.0870209 fft 2: mflops = 31.5999 (norm. = 0.928822), norm. avg. (of 6) = 0.747551 fft 3: mflops = -1 (norm. = -0.0293932), norm. avg. (of 4) = 0.172159 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.87135 s, 1024 iters, t-(init.)=1.53093 s t(norm)=0.0804465, mflops=62.1531 (err=3.8e-16) 1. PDA (f2c): elapsed time t=1.32748 s, 128 iters, t-(init.)=1.28473 s t(norm)=0.540072, mflops=9.25803 (err=3.8e-16) 2. Singleton (f2c): elapsed time t=1.24722 s, 256 iters, t-(init.)=1.16151 s t(norm)=0.244138, mflops=20.4802 (err=4.0e-16) 3. Temperton (f2c): elapsed time t=1.60763 s, 256 iters, t-(init.)=1.52241 s t(norm)=0.319994, mflops=15.6253 (err=3.8e-16) Top mflops for N=1728 = 62.1531 Normalized results and averages for N=1728: fft 0: mflops = 62.1531 (norm. = 1), norm. avg. (of 7) = 0.971933 fft 1: mflops = 9.25803 (norm. = 0.148955), norm. avg. (of 7) = 0.0958687 fft 2: mflops = 20.4802 (norm. = 0.329513), norm. avg. (of 7) = 0.687831 fft 3: mflops = 15.6253 (norm. = 0.2514), norm. avg. (of 5) = 0.188007 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.18613 s, 256 iters, t-(init.)=1.0774 s t(norm)=0.172557, mflops=28.9759 (err=4.1e-16) 1. PDA (f2c): elapsed time t=1.23657 s, 32 iters, t-(init.)=1.22293 s t(norm)=1.56692, mflops=3.19098 (err=7.2e-16) 2. 
Singleton (f2c): elapsed time t=1.16654 s, 256 iters, t-(init.)=1.0576 s t(norm)=0.169387, mflops=29.5183 (err=4.3e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 29.5183 Normalized results and averages for N=2197: fft 0: mflops = 28.9759 (norm. = 0.981626), norm. avg. (of 8) = 0.973145 fft 1: mflops = 3.19098 (norm. = 0.108102), norm. avg. (of 8) = 0.0973979 fft 2: mflops = 29.5183 (norm. = 1), norm. avg. (of 8) = 0.726852 fft 3: mflops = -1 (norm. = -0.0338773), norm. avg. (of 5) = 0.188007 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.03292 s, 256 iters, t-(init.)=0.897833 s t(norm)=0.111899, mflops=44.6831 (err=3.9e-16) 1. PDA (f2c): elapsed time t=1.92478 s, 64 iters, t-(init.)=1.89096 s t(norm)=0.9427, mflops=5.30391 (err=3.8e-16) 2. Singleton (f2c): elapsed time t=1.0251 s, 128 iters, t-(init.)=0.956945 s t(norm)=0.238533, mflops=20.9615 (err=4.6e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 44.6831 Normalized results and averages for N=2744: fft 0: mflops = 44.6831 (norm. = 1), norm. avg. (of 9) = 0.976129 fft 1: mflops = 5.30391 (norm. = 0.118701), norm. avg. (of 9) = 0.0997648 fft 2: mflops = 20.9615 (norm. = 0.469114), norm. avg. (of 9) = 0.698215 fft 3: mflops = -1 (norm. = -0.0223798), norm. avg. (of 5) = 0.188007 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.18753 s, 256 iters, t-(init.)=1.02077 s t(norm)=0.1008, mflops=49.6029 (err=4.6e-16) 1. PDA (f2c): elapsed time t=1.33675 s, 64 iters, t-(init.)=1.29503 s t(norm)=0.511533, mflops=9.77454 (err=4.5e-16) 2. Singleton (f2c): elapsed time t=1.18685 s, 128 iters, t-(init.)=1.10363 s t(norm)=0.217965, mflops=22.9395 (err=4.8e-16) 3. Temperton (f2c): elapsed time t=1.89032 s, 128 iters, t-(init.)=1.80721 s t(norm)=0.356922, mflops=14.0087 (err=4.6e-16) Top mflops for N=3375 = 49.6029 Normalized results and averages for N=3375: fft 0: mflops = 49.6029 (norm. = 1), norm. avg. 
(of 10) = 0.978516 fft 1: mflops = 9.77454 (norm. = 0.197056), norm. avg. (of 10) = 0.109494 fft 2: mflops = 22.9395 (norm. = 0.462462), norm. avg. (of 10) = 0.674639 fft 3: mflops = 14.0087 (norm. = 0.282416), norm. avg. (of 6) = 0.203742 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.05336 s, 32 iters, t-(init.)=0.947937 s t(norm)=0.125624, mflops=39.8014 (err=4.7e-16) 1. PDA (f2c): elapsed time t=1.07303 s, 8 iters, t-(init.)=1.04622 s t(norm)=0.554593, mflops=9.01562 (err=4.4e-16) 2. Singleton (f2c): elapsed time t=1.16278 s, 16 iters, t-(init.)=1.10989 s t(norm)=0.294172, mflops=16.9968 (err=5.6e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 39.8014 Normalized results and averages for N=16800: fft 0: mflops = 39.8014 (norm. = 1), norm. avg. (of 11) = 0.980469 fft 1: mflops = 9.01562 (norm. = 0.226515), norm. avg. (of 11) = 0.120132 fft 2: mflops = 16.9968 (norm. = 0.427041), norm. avg. (of 11) = 0.65213 fft 3: mflops = -1 (norm. = -0.0251247), norm. avg. (of 6) = 0.203742 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.53785 s, 4 iters, t-(init.)=1.30103 s t(norm)=0.175534, mflops=28.4845 (err=7.1e-16) 1. PDA (f2c): elapsed time t=1.08289 s, 1 iters, t-(init.)=1.02736 s t(norm)=0.554441, mflops=9.01809 (err=7.1e-16) 2. Singleton (f2c): elapsed time t=1.25452 s, 1 iters, t-(init.)=1.19542 s t(norm)=0.64514, mflops=7.75026 (err=8.2e-16) 3. Temperton (f2c): elapsed time t=1.93459 s, 2 iters, t-(init.)=1.81738 s t(norm)=0.490401, mflops=10.1957 (err=7.6e-16) Top mflops for N=110592 = 28.4845 Normalized results and averages for N=110592: fft 0: mflops = 28.4845 (norm. = 1), norm. avg. (of 12) = 0.982097 fft 1: mflops = 9.01809 (norm. = 0.316597), norm. avg. (of 12) = 0.136504 fft 2: mflops = 7.75026 (norm. = 0.272087), norm. avg. (of 12) = 0.62046 fft 3: mflops = 10.1957 (norm. = 0.35794), norm. avg. (of 7) = 0.22577 Benchmarking for array size = 49x49x49: 0. 
FFTW: elapsed time t=1.71057 s, 4 iters, t-(init.)=1.45947 s t(norm)=0.184119, mflops=27.1564 (err=8.7e-16) 1. PDA (f2c): elapsed time t=1.94356 s, 1 iters, t-(init.)=1.88289 s t(norm)=0.950143, mflops=5.26237 (err=8.8e-16) 2. Singleton (f2c): elapsed time t=1.17527 s, 1 iters, t-(init.)=1.11162 s t(norm)=0.560946, mflops=8.91352 (err=1.1e-15) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 27.1564 Normalized results and averages for N=117649: fft 0: mflops = 27.1564 (norm. = 1), norm. avg. (of 13) = 0.983474 fft 1: mflops = 5.26237 (norm. = 0.19378), norm. avg. (of 13) = 0.14091 fft 2: mflops = 8.91352 (norm. = 0.328229), norm. avg. (of 13) = 0.597981 fft 3: mflops = -1 (norm. = -0.0368237), norm. avg. (of 7) = 0.22577 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.45531 s, 2 iters, t-(init.)=1.22289 s t(norm)=0.159744, mflops=31.3001 (err=4.9e-16) 1. PDA (f2c): elapsed time t=2.1692 s, 1 iters, t-(init.)=2.05299 s t(norm)=0.536355, mflops=9.32218 (err=5.0e-16) 2. Singleton (f2c): elapsed time t=3.38714 s, 1 iters, t-(init.)=3.27421 s t(norm)=0.855407, mflops=5.84517 (err=6.0e-16) 3. Temperton (f2c): elapsed time t=2.02369 s, 1 iters, t-(init.)=1.90824 s t(norm)=0.49854, mflops=10.0293 (err=4.7e-16) Top mflops for N=216000 = 31.3001 Normalized results and averages for N=216000: fft 0: mflops = 31.3001 (norm. = 1), norm. avg. (of 14) = 0.984654 fft 1: mflops = 9.32218 (norm. = 0.297832), norm. avg. (of 14) = 0.152119 fft 2: mflops = 5.84517 (norm. = 0.186746), norm. avg. (of 14) = 0.568607 fft 3: mflops = 10.0293 (norm. = 0.320423), norm. avg. (of 8) = 0.237602 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.75847 s, 2 iters, t-(init.)=1.49695 s t(norm)=0.172996, mflops=28.9024 (err=5.7e-16) 1. PDA (f2c): elapsed time t=2.88247 s, 1 iters, t-(init.)=2.7521 s t(norm)=0.636098, mflops=7.86043 (err=6.1e-16) 2. 
Singleton (f2c): elapsed time t=3.92942 s, 1 iters, t-(init.)=3.7986 s t(norm)=0.877976, mflops=5.69492 (err=7.0e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 28.9024 Normalized results and averages for N=241920: fft 0: mflops = 28.9024 (norm. = 1), norm. avg. (of 15) = 0.985677 fft 1: mflops = 7.86043 (norm. = 0.271964), norm. avg. (of 15) = 0.160108 fft 2: mflops = 5.69492 (norm. = 0.197039), norm. avg. (of 15) = 0.543836 fft 3: mflops = -1 (norm. = -0.0345992), norm. avg. (of 8) = 0.237602 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.46839 s, 1 iters, t-(init.)=1.24102 s t(norm)=0.157423, mflops=31.7616 (err=9.0e-16) 1. PDA (f2c): elapsed time t=4.40539 s, 1 iters, t-(init.)=4.17903 s t(norm)=0.530109, mflops=9.43203 (err=9.5e-16) 2. Singleton (f2c): elapsed time t=5.73316 s, 1 iters, t-(init.)=5.50477 s t(norm)=0.698278, mflops=7.16048 (err=1.3e-15) 3. Temperton (f2c): elapsed time t=3.88526 s, 1 iters, t-(init.)=3.65749 s t(norm)=0.463952, mflops=10.777 (err=1.1e-15) Top mflops for N=421875 = 31.7616 Normalized results and averages for N=421875: fft 0: mflops = 31.7616 (norm. = 1), norm. avg. (of 16) = 0.986572 fft 1: mflops = 9.43203 (norm. = 0.296964), norm. avg. (of 16) = 0.168662 fft 2: mflops = 7.16048 (norm. = 0.225445), norm. avg. (of 16) = 0.523936 fft 3: mflops = 10.777 (norm. = 0.339309), norm. avg. (of 9) = 0.248903 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.99598 s, 1 iters, t-(init.)=1.72012 s t(norm)=0.177141, mflops=28.2262 (err=1.5e-15) 1. PDA (f2c): elapsed time t=5.43636 s, 1 iters, t-(init.)=5.15965 s t(norm)=0.531348, mflops=9.41003 (err=1.5e-15) 2. Singleton (f2c): elapsed time t=7.0032 s, 1 iters, t-(init.)=6.72581 s t(norm)=0.692634, mflops=7.21882 (err=2.3e-15) 3. 
Temperton (f2c): elapsed time t=5.40185 s, 1 iters, t-(init.)=5.12527 s t(norm)=0.527808, mflops=9.47313 (err=1.5e-15) Top mflops for N=512000 = 28.2262 Normalized results and averages for N=512000: fft 0: mflops = 28.2262 (norm. = 1), norm. avg. (of 17) = 0.987362 fft 1: mflops = 9.41003 (norm. = 0.33338), norm. avg. (of 17) = 0.178351 fft 2: mflops = 7.21882 (norm. = 0.255749), norm. avg. (of 17) = 0.508161 fft 3: mflops = 9.47313 (norm. = 0.335615), norm. avg. (of 10) = 0.257574 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=2.1427 s, 1 iters, t-(init.)=1.82249 s t(norm)=0.160342, mflops=31.1833 (err=7.6e-16) 1. PDA (f2c): elapsed time t=8.2441 s, 1 iters, t-(init.)=7.92388 s t(norm)=0.697141, mflops=7.17215 (err=6.9e-16) 2. Singleton (f2c): elapsed time t=10.5516 s, 1 iters, t-(init.)=10.2353 s t(norm)=0.900499, mflops=5.55248 (err=8.6e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 31.1833 Normalized results and averages for N=592704: fft 0: mflops = 31.1833 (norm. = 1), norm. avg. (of 18) = 0.988064 fft 1: mflops = 7.17215 (norm. = 0.23), norm. avg. (of 18) = 0.181221 fft 2: mflops = 5.55248 (norm. = 0.178059), norm. avg. (of 18) = 0.489822 fft 3: mflops = -1 (norm. = -0.0320684), norm. avg. (of 10) = 0.257574 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=4.43 s, 1 iters, t-(init.)=3.95129 s t(norm)=0.226074, mflops=22.1166 (err=8.0e-16) 1. PDA (f2c): elapsed time t=11.6715 s, 1 iters, t-(init.)=11.1934 s t(norm)=0.640432, mflops=7.80723 (err=7.7e-16) 2. Singleton (f2c): elapsed time t=17.6747 s, 1 iters, t-(init.)=17.1958 s t(norm)=0.983859, mflops=5.08203 (err=8.2e-16) 3. Temperton (f2c): elapsed time t=11.2603 s, 1 iters, t-(init.)=10.7817 s t(norm)=0.616879, mflops=8.10532 (err=8.9e-16) Top mflops for N=884736 = 22.1166 Normalized results and averages for N=884736: fft 0: mflops = 22.1166 (norm. = 1), norm. avg. (of 19) = 0.988693 fft 1: mflops = 7.80723 (norm. = 0.353003), norm. 
avg. (of 19) = 0.190262 fft 2: mflops = 5.08203 (norm. = 0.229783), norm. avg. (of 19) = 0.476135 fft 3: mflops = 8.10532 (norm. = 0.36648), norm. avg. (of 11) = 0.267474 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=4.19688 s, 1 iters, t-(init.)=3.57025 s t(norm)=0.153113, mflops=32.6556 (err=7.9e-16) 1. PDA (f2c): elapsed time t=16.9978 s, 1 iters, t-(init.)=16.3718 s t(norm)=0.702117, mflops=7.12132 (err=8.1e-16) 2. Singleton (f2c): elapsed time t=17.3361 s, 1 iters, t-(init.)=16.7109 s t(norm)=0.716662, mflops=6.97679 (err=9.7e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 32.6556 Normalized results and averages for N=1157625: fft 0: mflops = 32.6556 (norm. = 1), norm. avg. (of 20) = 0.989258 fft 1: mflops = 7.12132 (norm. = 0.218074), norm. avg. (of 20) = 0.191652 fft 2: mflops = 6.97679 (norm. = 0.213648), norm. avg. (of 20) = 0.463011 fft 3: mflops = -1 (norm. = -0.0306226), norm. avg. (of 11) = 0.267474 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=5.68865 s, 1 iters, t-(init.)=4.92819 s t(norm)=0.171764, mflops=29.1096 (err=7.2e-16) 1. PDA (f2c): elapsed time t=20.5381 s, 1 iters, t-(init.)=19.7781 s t(norm)=0.689338, mflops=7.25334 (err=7.0e-16) 2. Singleton (f2c): elapsed time t=20.0334 s, 1 iters, t-(init.)=19.2727 s t(norm)=0.671722, mflops=7.44355 (err=6.8e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 29.1096 Normalized results and averages for N=1404928: fft 0: mflops = 29.1096 (norm. = 1), norm. avg. (of 21) = 0.989769 fft 1: mflops = 7.25334 (norm. = 0.249173), norm. avg. (of 21) = 0.194391 fft 2: mflops = 7.44355 (norm. = 0.255708), norm. avg. (of 21) = 0.453139 fft 3: mflops = -1 (norm. = -0.0343529), norm. avg. (of 11) = 0.267474 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=7.14026 s, 1 iters, t-(init.)=6.20494 s t(norm)=0.173297, mflops=28.8523 (err=5.9e-16) 1. 
PDA (f2c): elapsed time t=20.6967 s, 1 iters, t-(init.)=19.7607 s t(norm)=0.551892, mflops=9.05975 (err=6.2e-16)
2. Singleton (f2c): elapsed time t=38.452 s, 1 iters, t-(init.)=37.516 s t(norm)=1.04778, mflops=4.772 (err=7.4e-16)
3. Temperton (f2c): elapsed time t=21.8272 s, 1 iters, t-(init.)=20.8909 s t(norm)=0.583458, mflops=8.56959 (err=5.7e-16)
Top mflops for N=1728000 = 28.8523
Normalized results and averages for N=1728000:
fft 0: mflops = 28.8523 (norm. = 1), norm. avg. (of 22) = 0.990235
fft 1: mflops = 9.05975 (norm. = 0.314005), norm. avg. (of 22) = 0.199828
fft 2: mflops = 4.772 (norm. = 0.165394), norm. avg. (of 22) = 0.44006
fft 3: mflops = 8.56959 (norm. = 0.297016), norm. avg. (of 12) = 0.269936
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Beauregard, Bergland, CWP (min N), CWP (best N), Edelblute, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), NAPACK (f2c), Nielsen, NR (C), Ooura (C), QFT, Ransom, Singleton (f2c), Temperton (f2c), Valkenburg
2, 14.0374, 14.0426, 8.13294, 0.570877, 1.95509, 2.87695, 1.9657, 1.86652, , 2.92507, 13.7078, 13.8704, 25.0989, , 5.21501, 3.68999, 3.49003, 12.1552, , , , 1.31232, 1.00365, 4.20211, 13.5611, , , 2.8354, 1.31042, 3.53923
4, 29.2975, 28.7548, 12.78, 2.59783, 4.57513, 9.94067, 7.51477, 3.59729, 11.7962, 7.28923, 42.4713, 42.8829, 65.5353, , 13.7804, 7.50762, 7.22374, 33.1958, 14.6834, 15.0035, 13.6685, 2.91629, 3.6246, 7.75324, 23.781, , 2.14771, 9.86182, 4.0591, 3.79334
8, 42.2593, 42.5694, 16.1628, 3.49657, 6.45168, 15.5004, 18.2968, 10.7935, 12.9686, 11.8061, 68.6723, 68.9531, 89.5662, 32.3504, 19.7163, 13.0876, 12.726, 45.1649, 23.8764, 24.1299, 23.0182, 4.36073, 7.91392, 13.2142, 43.8761, , 2.53866, 11.0274, 5.80564, 3.97123
16, 28.124, 28.6043, 19.6421, 6.67783, 7.41131, 25.0421, 33.3893, 22.3492, 14.6449, 17.3573, 93.7325, 93.9146, 95.7495, 49.8299, 28.7349, 19.1705, 18.8908, 51.657, 24.0025, 28.5842, 27.9284, 6.21229, 7.61491, 19.16, 57.2596, 35.213, 8.32258, 28.9881, 8.21818, 4.11366
32, 33.9516, 34.0599, 23.2942, 8.09535, 7.74881, 36.7317, 34.2638, 42.1119, 16.7623, 14.4166, 114.002, 114.145, 113.098, 68.8376, 26.8171, 25.3764, 25.6731, 51.821, 27.4889, 34.5303, 34.4853, 7.08114, 10.9068, 25.6313, 65.1954, 32.2284, 8.57683, 37.4873, 7.71073, 4.22493
64, 35.2877, 35.5593, 26.515, 11.906, 7.93923, 44.5911, 39.3968, 49.1366, 18.8665, 17.1557, 116.02, 99.5547, 75.9558, 94.0201, 32.3708, 29.8624, 30.9359, 30.3727, 29.0005, 38.3733, 38.891, 8.24842, 14.1044, 30.662, 73.3326, 28.6877, 17.1397, 52.9082, 10.1516, 4.30494
128, 39.194, 39.2541, 29.6294, 12.0914, 7.99217, 50.0119, 46.7765, 68.9226, 21.0044, 19.0158, 109.781, 109.819, 82.7371, 102.717, 33.2576, 33.2146, 35.1326, 17.2408, 31.5068, 42.512, 43.2799, 8.4311, 12.797, 34.7513, 75.0898, 29.8481, 16.1898, 50.0491, 8.97412, 4.35518
256, 41.7259, 41.9289, 32.3965, 13.8105, 8.04355, 56.8425, 55.2615, 73.0889, 23.068, 20.577, 113.164, 113.127, 93.1043, 110.551, 36.9739, 35.2362, 37.8753, 22.9396, 33.5516, 45.5224, 46.3381, 9.05235, 14.4772, 37.5326, 80.6008, 28.7128, 24.5019, 66.7233, 10.5478, 4.36776
512, 43.6497, 43.5274, 31.7584, 14.6502, 7.93454, 61.6818, 56.5653, 70.374, 23.3257, 12.3625, 47.4096, 55.9909, 38.5501, 119.693, 19.926, 35.3398, 39.5073, 19.4835, 35.8271, 48.4837, 49.2787, 7.87282, 14.7966, 38.0087, 80.1684, 14.2497, 22.2501, 67.5289, 9.3656, 3.96327
1024, 41.8259, 39.8353, 29.0022, 15.266, 7.64806, 52.864, 49.5218, 49.4859, 22.3914, 7.73492, 45.8079, 32.8905, 15.7676, 75.913, 9.72791, 31.3912, 35.6775, 15.4249, 36.8967, 49.5961, 43.3684, 3.85728, 10.1841, 33.6427, 55.6025, 12.4753, 26.0612, 63.5277, 9.20439, 3.24882
2048, 12.8735, 12.7979, 9.84513, 11.7722, 6.38277, 24.4139, 33.7411, 36.1758, 8.99741, 8.19852, 25.7203, 24.1796, 14.9644, 28.0259, 10.3141, 12.0815, 12.9489, 14.2309, 33.2344, 43.7218, 28.0463, 3.58399, 8.02134, 12.6597, 29.5593, 11.3552, 15.9948, 19.734, 7.91133, 2.71356
4096, 11.8873, 11.8154, 9.22713, 12.6384, 6.30423, 25.7152, 30.2491, 34.8992, 8.50904, 8.5741, 24.8537, 25.298, 14.8046, 28.8827, 11.1799, 11.766, 12.7099, 14.0557, 14.524, 15.8853, 13.8978, 3.61864, 8.63961, 12.3187, 30.3814, 9.66808, 22.0548, 23.5462, 8.41964, 2.65233
8192, 11.9861, 11.8902, 9.10044, 10.9877, 6.26911, 23.7538, 31.875, 33.1443, 8.4427, 7.56988, 24.9542, 25.9532, 15.804, 26.9358, 9.44561, 11.5978, 12.5877, , 14.1622, 15.3417, 13.6312, 3.49426, 7.98375, 12.1911, 29.1152, 8.77547, 18.2724, 21.9411, 7.76853, 2.58841
16384, 11.3354, 11.2396, 9.00024, 14.1454, 6.18369, 23.6147, 32.4283, 32.4651, 8.39685, 7.74351, 26.986, 23.4549, 12.7375, 24.8876, 10.1653, 11.4184, 12.4713, , 13.7054, 14.7237, 13.2104, 3.54996, 8.32447, 12.0055, 30.4372, 7.95929, 24.1627, 23.01, 8.38412, 2.49842
32768, 11.3418, 11.2154, 8.75489, 12.0732, 6.12266, 24.1088, 30.3106, 30.3122, 8.21877, 7.06622, 26.2486, 22.6569, 11.4473, 25.2818, 9.87742, 11.0132, 12.0613, , 13.6017, 14.5699, 13.1126, 3.41315, 8.6284, 11.5915, 28.4938, 6.47898, 19.5403, 19.1912, 7.85351, 2.38238
65536, 4.50952, 4.30063, 3.3379, 9.38537, 4.24198, 9.67651, 20.437, 20.4544, 3.26969, 6.15308, 21.9952, 18.217, 10.035, 10.6817, 8.3481, 4.39477, 4.61504, , 12.2292, 13.004, 11.3789, 3.43365, 4.98432, 4.47283, 12.8911, 6.0869, 12.1577, 8.45478, 5.3985, 1.99645
131072, 3.82141, 3.79068, 2.98885, 6.89712, 4.24991, 8.85699, 19.5064, 19.5143, 2.93601, 5.38584, 21.7667, 17.4898, 9.78542, 9.22933, 7.39379, 3.98454, 4.12453, , 5.12967, 5.27891, 5.02421, 3.30078, 4.35238, 4.02498, 12.5066, 5.30684, 10.0304, 7.20083, 4.84419, 1.9101
262144, 3.62586, 3.58982, 2.88774, 8.79256, 4.2633, 9.23389, 12.7178, 12.7224, 2.83847, 5.28508, 19.6182, 16.4292, 8.43767, 9.42848, 7.55312, 3.93627, 4.06505, , 4.51284, 4.60449, 4.41722, 3.37892, 4.43171, 3.97206, 12.5477, 4.09405, 13.2857, 7.35493, 4.96671, 1.86333
Norm. Avg., 0.350009, 0.347356, 0.239287, 0.216056, 0.125445, 0.468257, 0.583195, 0.616184, 0.196147, 0.183611, 0.816016, 0.76203, 0.6167, 0.717467, 0.275604, 0.259564, 0.275754, 0.350357, 0.358648, 0.423447, 0.379114, 0.0894365, 0.15822, 0.271351, 0.671967, 0.246395, 0.346635, 0.452232, 0.149409, 0.0644271
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, CWP (min N), CWP (best N), FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Nielsen, Singleton (f2c), Temperton (f2c), Valkenburg
6, 10.5988, 6.99744, 8.82676, 52.5386, 52.8906, 10.8961, 15.4296, 3.93182, 3.03336, 8.00079, 4.00547, 3.94925
9, 18.8133, 12.8912, 10.8834, 57.5755, 57.7627, 9.45709, 14.3229, 4.93756, 4.66108, 14.7438, 5.75812, 4.0341
12, 23.3129, 19.4149, 13.8438, 83.3291, 83.638, 16.2285, 22.4871, 5.16348, 6.03904, 15.5935, 7.5942, 4.06563
15, 26.3034, 26.4337, 14.3083, 67.3905, 67.5089, 11.3208, 15.7734, 3.58966, 6.99992, 17.7391, 7.97051, 3.64018
18, 28.7157, 26.2248, 10.4819, 55.7467, 54.7021, 10.8587, 24.8329, 5.92064, 5.34898, 22.0026, 6.72925, 4.14228
24, 39.145, 38.4381, 12.3816, 73.2864, 73.3925, 21.0562, 32.6557, 6.65346, 9.10895, 20.5655, 8.65101, 4.17729
36, 46.8389, 46.8864, 13.2995, 78.4762, 78.5273, 13.586, 37.1721, 7.19508, 8.07387, 33.7723, 9.87125, 4.22978
80, 59.4539, 64.6986, 17.0016, 90.3986, 90.3849, 26.4328, 24.8776, 4.99845, 15.0063, 51.314, 11.3781, 4.00073
108, 50.5637, 65.1073, 15.8202, 86.7656, 86.7947, 12.4953, 34.8366, 7.84141, 9.21219, 37.7185, 10.5602, 4.27198
210, 69.6142, 69.6276, 11.7533, 72.9036, 69.719, 12.9954, 24.8746, 4.05124, 12.4266, 30.6909, , 3.41259
504, 76.4473, 76.4537, 10.522, 50.5137, 46.2933, 13.2177, 24.6154, 5.10665, 11.3637, 37.0361, , 3.58122
1000, 45.5656, 65.3163, 12.2017, 42.5061, 42.5005, 12.0089, 13.1877, 3.63548, 14.7209, 46.4911, 10.1447, 3.21818
1960, 40.9886, 40.9666, 7.16387, 44.4345, 44.2486, 11.2746, 15.1299, 3.30792, 10.7565, 21.6258, , 2.83248
4725, 33.9684, 39.6451, 9.49535, 40.2661, 39.8477, 7.91783, 15.686, 3.67968, 9.12522, 18.1921, , 3.03874
10368, 34.2747, 37.5272, 11.2537, 38.1566, 34.3422, 10.0829, 20.3293, 5.79906, 8.31595, 16.4356, 8.6636, 3.14245
27000, 37.2947, 37.3312, 8.14882, 25.3167, 24.7231, 6.62214, 11.0884, 3.34823, 8.81697, 16.0058, 8.59153, 2.65363
75600, 21.1417, 21.1408, 5.77352, 24.2354, 22.5549, 5.84866, 9.28313, 2.92402, 5.62086, 7.3979, , 2.31445
165375, 15.728, 15.7312, 4.10551, 20.9095, 20.0689, 4.28857, 9.3152, 2.66147, 5.04086, 7.04577, , 2.15981
Norm. Avg., 0.667774, 0.694777, 0.192205, 0.942882, 0.923989, 0.205331, 0.352001, 0.085684, 0.158292, 0.390016, 0.134051, 0.0644426
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM (f2c), NR (C), PDA (f2c), Singleton (f2c), Temperton (f2c)
4x4x4, 84.9837, , 32.5491, 4.64221, 56.8773, 13.7392
8x8x8, 103.738, 15.4107, 43.4945, 8.1614, 51.3061, 14.733
16x16x16, 45.889, 14.0138, 11.7447, 9.83058, 20.6247, 14.3712
32x32x32, 50.5097, 13.4276, 11.1444, 9.19872, 17.4211, 11.8277
64x64x64, 22.0316, 8.82659, 3.72856, 6.72413, 6.76598, 7.92586
256x64x32, 22.6152, 8.69672, 3.80823, 7.02466, 6.69795, 7.93415
16x1024x64, 17.4334, 8.91749, 3.78833, 6.54334, 6.80007, 
128x128x128, 23.7755, 8.48416, 3.79085, 8.12478, 5.11236, 5.59394
Norm. Avg., 1, 0.339047, 0.249153, 0.232816, 0.39582, 0.2567
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA (f2c), Singleton (f2c), Temperton (f2c)
5x5x5, 59.7774, 5.54783, 74.3932, 13.5468
6x6x6, 83.1985, 6.16777, 43.8586, 12.0351
7x7x7, 57.672, 3.39642, 39.5171, 
9x9x9, 76.9723, 7.78982, 55.5582, 14.6335
10x10x10, 74.127, 8.54903, 46.1313, 12.7327
11x11x11, 34.0215, 3.33389, 31.5999, 
12x12x12, 62.1531, 9.25803, 20.4802, 15.6253
13x13x13, 28.9759, 3.19098, 29.5183, 
14x14x14, 44.6831, 5.30391, 20.9615, 
15x15x15, 49.6029, 9.77454, 22.9395, 14.0087
24x25x28, 39.8014, 9.01562, 16.9968, 
48x48x48, 28.4845, 9.01809, 7.75026, 10.1957
49x49x49, 27.1564, 5.26237, 8.91352, 
60x60x60, 31.3001, 9.32218, 5.84517, 10.0293
72x60x56, 28.9024, 7.86043, 5.69492, 
75x75x75, 31.7616, 9.43203, 7.16048, 10.777
80x80x80, 28.2262, 9.41003, 7.21882, 9.47313
84x84x84, 31.1833, 7.17215, 5.55248, 
96x96x96, 22.1166, 7.80723, 5.08203, 8.10532
105x105x105, 32.6556, 7.12132, 6.97679, 
112x112x112, 29.1096, 7.25334, 7.44355, 
120x120x120, 28.8523, 9.05975, 4.772, 8.56959
Norm. Avg., 0.990235, 0.199828, 0.44006, 0.269936
@@@@ end