@@SUBMIT@@
@ submitter = Eric Frey
@ submitter email = frey@bme.unc.edu
@ submitter organization = University of North Carolina
@ computer manufacturer = DCG
@ computer model = SX Series
@ CPU manufacturer = Digital
@ CPU model = 21164PC
@ CPU speed = 533 MHz
@ RAM = 64 MB
@ L2 cache size = 2 MB
@ operating system = Red Hat Linux 5.0
@ C compiler = DEC C V5.2-033
@ C compiler flags = -newc -w0 -O5 -ansi_alias -ansi_args -fp_reorder -tune host -std1 -DUSE_DXML
@ Fortran compiler = DEC Fortran 3.8
@ Fortran compiler flags = -w0 -O5 -ansi_alias -ansi_args -fp_reorder -tune host -std1
@ remarks = Memory is SD Ram. 21164PC has no on chip L2 cache.
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
2 (0.000228882 MB)
4 (0.000534058 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)
262144 (25.4987 MB)
Maximum array size = 360360
Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Bailey
5. Beauregard
6. Bergland
7. Brenner
8. Burrus
9. CWP (min N)
10. CWP (best N)
11. Edelblute
12. FFTPACK
13. FFTPACK (f2c)
14. FFTW
15. FFTW_ESTIMATE
16. Frigo-old
17. Green
18. GSL
19. GSL DIT
20. GSL DIF
21. Krukar
22. Mayer (Buneman)
23. Mayer (simple)
24. Mayer (lookup)
25. Monro
26. NAPACK (f2c)
27. Ooura (C)
28. Ooura (F)
29. Ransom
30. SCIPORT
31. Singleton
32. Singleton (f2c)
33. Sorensen
34. Sorensen DIT
35. Temperton
36. Temperton (f2c)
37. Valkenburg
Computing normalized averages (38 transforms).
Benchmarking for array size = 2 (power of 2):
0. Arndt DIF: elapsed time t=1.01663 s, 4194304 iters, t-(init.)=0.799968 s t(norm)=0.0953636, mflops=52.4309 (err=1.7e-17)
1. 
Arndt DIT: elapsed time t=1.91659 s, 8388608 iters, t-(init.)=1.49994 s t(norm)=0.0894034, mflops=55.9263 (err=1.7e-17) 2. Arndt Split-Radix: elapsed time t=1.11662 s, 4194304 iters, t-(init.)=0.899964 s t(norm)=0.107284, mflops=46.6052 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.33328 s, 262144 iters, t-(init.)=1.33328 s t(norm)=2.54303, mflops=1.96616 (err=1.7e-17) 4. Bailey: elapsed time t=1.39994 s, 2097152 iters, t-(init.)=1.28328 s t(norm)=0.305958, mflops=16.3421 (err=1.7e-17) 5. Beauregard: elapsed time t=1.73326 s, 1048576 iters, t-(init.)=1.68327 s t(norm)=0.802644, mflops=6.22941 (err=1.7e-17) 6. Bergland: elapsed time t=1.09996 s, 1048576 iters, t-(init.)=1.04996 s t(norm)=0.500659, mflops=9.98684 (err=1.7e-17) 7. Brenner: elapsed time t=1.81659 s, 2097152 iters, t-(init.)=1.69993 s t(norm)=0.405295, mflops=12.3367 (err=1.7e-17) 8. Burrus: elapsed time t=1.06662 s, 2097152 iters, t-(init.)=0.933296 s t(norm)=0.222515, mflops=22.4704 (err=1.7e-17) 9. CWP (min N): elapsed time t=1.44994 s, 1048576 iters, t-(init.)=1.38328 s t(norm)=0.659598, mflops=7.58037 10. CWP (best N) (N=3): elapsed time t=1.38328 s, 1048576 iters, t-(init.)=1.31661 s t(norm)=0.62781, mflops=7.96419 11. Skipping fft (Edelblute can't handle N <= 2). 12. FFTPACK: elapsed time t=1.46661 s, 2097152 iters, t-(init.)=1.33328 s t(norm)=0.317879, mflops=15.7293 (err=1.7e-17) 13. FFTPACK (f2c): elapsed time t=1.04996 s, 2097152 iters, t-(init.)=0.933296 s t(norm)=0.222515, mflops=22.4704 (err=1.7e-17) FFTW_MEASURE plan: (cost = 1.748333e-07) FFTW_NOTW 2 14. FFTW: elapsed time t=1.59994 s, 8388608 iters, t-(init.)=1.13329 s t(norm)=0.0675492, mflops=74.0201 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 15. FFTW_ESTIMATE: elapsed time t=1.6666 s, 8388608 iters, t-(init.)=-0.449982 s t(norm)=-0.026821, mflops=-186.421 (err=1.7e-17) 16. Frigo-old: elapsed time t=1.04996 s, 8388608 iters, t-(init.)=0.616642 s t(norm)=0.0367547, mflops=136.037 (err=1.7e-17) 17. 
Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.14995 s, 2097152 iters, t-(init.)=1.04996 s t(norm)=0.250329, mflops=19.9737 (err=1.7e-17) 19. GSL DIT: elapsed time t=1.51661 s, 1048576 iters, t-(init.)=1.46661 s t(norm)=0.699333, mflops=7.14967 (err=1.7e-17) 20. GSL DIF: elapsed time t=1.79993 s, 2097152 iters, t-(init.)=1.68327 s t(norm)=0.401322, mflops=12.4588 (err=1.7e-17) 21. Krukar: elapsed time t=1.24995 s, 4194304 iters, t-(init.)=1.01663 s t(norm)=0.121191, mflops=41.2571 (err=1.7e-17) 22. Skipping fft (Mayer can't handle N <= 2). 23. Skipping fft (Mayer can't handle N <= 2). 24. Skipping fft (Mayer can't handle N <= 2). 25. Skipping fft (Monro can't handle N <= 2). 26. NAPACK (f2c): elapsed time t=1.08329 s, 524288 iters, t-(init.)=1.04996 s t(norm)=1.00132, mflops=4.99342 (err=1.7e-17) 27. Ooura (C): elapsed time t=1.28328 s, 8388608 iters, t-(init.)=0.849966 s t(norm)=0.0506619, mflops=98.6935 (err=1.7e-17) 28. Ooura (F): elapsed time t=1.53327 s, 8388608 iters, t-(init.)=1.06662 s t(norm)=0.0635757, mflops=78.6463 (err=1.7e-17) 29. Skipping fft (Ransom doesn't work for N=2). 30. Skipping fft (SCIPORT can't handle N < 4). 31. Singleton: elapsed time t=1.14995 s, 1048576 iters, t-(init.)=0.949962 s t(norm)=0.452977, mflops=11.0381 (err=1.7e-17) 32. Singleton (f2c): elapsed time t=1.54994 s, 1048576 iters, t-(init.)=1.49994 s t(norm)=0.715227, mflops=6.99079 (err=1.7e-17) 33. Sorensen: elapsed time t=1.18329 s, 4194304 iters, t-(init.)=0.949962 s t(norm)=0.113244, mflops=44.1523 (err=1.7e-17) 34. Sorensen DIT: elapsed time t=1.43328 s, 4194304 iters, t-(init.)=1.18329 s t(norm)=0.141059, mflops=35.4462 (err=1.7e-17) 35. Temperton: elapsed time t=1.08329 s, 1048576 iters, t-(init.)=1.03329 s t(norm)=0.492712, mflops=10.1479 (err=1.7e-17) 36. Temperton (f2c): elapsed time t=1.34995 s, 1048576 iters, t-(init.)=1.29995 s t(norm)=0.619864, mflops=8.06629 (err=1.7e-17) 37. 
Valkenburg: elapsed time t=1.09996 s, 1048576 iters, t-(init.)=1.04996 s t(norm)=0.500659, mflops=9.98684 (err=1.7e-17) Top mflops for N=2 = 136.037 Normalized results and averages for N=2: fft 0: mflops = 52.4309 (norm. = 0.385417), norm. avg. (of 1) = 0.385417 fft 1: mflops = 55.9263 (norm. = 0.411111), norm. avg. (of 1) = 0.411111 fft 2: mflops = 46.6052 (norm. = 0.342593), norm. avg. (of 1) = 0.342593 fft 3: mflops = 1.96616 (norm. = 0.0144531), norm. avg. (of 1) = 0.0144531 fft 4: mflops = 16.3421 (norm. = 0.12013), norm. avg. (of 1) = 0.12013 fft 5: mflops = 6.22941 (norm. = 0.0457921), norm. avg. (of 1) = 0.0457921 fft 6: mflops = 9.98684 (norm. = 0.0734127), norm. avg. (of 1) = 0.0734127 fft 7: mflops = 12.3367 (norm. = 0.0906863), norm. avg. (of 1) = 0.0906863 fft 8: mflops = 22.4704 (norm. = 0.165179), norm. avg. (of 1) = 0.165179 fft 9: mflops = 7.58037 (norm. = 0.0557229), norm. avg. (of 1) = 0.0557229 fft 10: mflops = 7.96419 (norm. = 0.0585443), norm. avg. (of 1) = 0.0585443 fft 11: mflops = -1 (norm. = -0.00735095), norm. avg. (of 0) = -1 fft 12: mflops = 15.7293 (norm. = 0.115625), norm. avg. (of 1) = 0.115625 fft 13: mflops = 22.4704 (norm. = 0.165179), norm. avg. (of 1) = 0.165179 fft 14: mflops = 74.0201 (norm. = 0.544118), norm. avg. (of 1) = 0.544118 fft 15: mflops = -186.421 (norm. = -1.37037), norm. avg. (of 1) = 0 fft 16: mflops = 136.037 (norm. = 1), norm. avg. (of 1) = 1 fft 17: mflops = -1 (norm. = -0.00735095), norm. avg. (of 0) = -1 fft 18: mflops = 19.9737 (norm. = 0.146825), norm. avg. (of 1) = 0.146825 fft 19: mflops = 7.14967 (norm. = 0.0525568), norm. avg. (of 1) = 0.0525568 fft 20: mflops = 12.4588 (norm. = 0.0915842), norm. avg. (of 1) = 0.0915842 fft 21: mflops = 41.2571 (norm. = 0.303279), norm. avg. (of 1) = 0.303279 fft 22: mflops = -1 (norm. = -0.00735095), norm. avg. (of 0) = -1 fft 23: mflops = -1 (norm. = -0.00735095), norm. avg. (of 0) = -1 fft 24: mflops = -1 (norm. = -0.00735095), norm. avg. 
(of 0) = -1 fft 25: mflops = -1 (norm. = -0.00735095), norm. avg. (of 0) = -1 fft 26: mflops = 4.99342 (norm. = 0.0367063), norm. avg. (of 1) = 0.0367063 fft 27: mflops = 98.6935 (norm. = 0.72549), norm. avg. (of 1) = 0.72549 fft 28: mflops = 78.6463 (norm. = 0.578125), norm. avg. (of 1) = 0.578125 fft 29: mflops = -1 (norm. = -0.00735095), norm. avg. (of 0) = -1 fft 30: mflops = -1 (norm. = -0.00735095), norm. avg. (of 0) = -1 fft 31: mflops = 11.0381 (norm. = 0.0811404), norm. avg. (of 1) = 0.0811404 fft 32: mflops = 6.99079 (norm. = 0.0513889), norm. avg. (of 1) = 0.0513889 fft 33: mflops = 44.1523 (norm. = 0.324561), norm. avg. (of 1) = 0.324561 fft 34: mflops = 35.4462 (norm. = 0.260563), norm. avg. (of 1) = 0.260563 fft 35: mflops = 10.1479 (norm. = 0.0745968), norm. avg. (of 1) = 0.0745968 fft 36: mflops = 8.06629 (norm. = 0.0592949), norm. avg. (of 1) = 0.0592949 fft 37: mflops = 9.98684 (norm. = 0.0734127), norm. avg. (of 1) = 0.0734127 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.5666 s, 4194304 iters, t-(init.)=1.38328 s t(norm)=0.0412249, mflops=121.286 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.58327 s, 4194304 iters, t-(init.)=1.39994 s t(norm)=0.0417216, mflops=119.842 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.7666 s, 2097152 iters, t-(init.)=1.6666 s t(norm)=0.0993371, mflops=50.3337 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.51661 s, 262144 iters, t-(init.)=1.51661 s t(norm)=0.723174, mflops=6.91396 (err=1.3e-16) 4. Bailey: elapsed time t=1.48327 s, 1048576 iters, t-(init.)=1.43328 s t(norm)=0.17086, mflops=29.2638 (err=1.3e-16) 5. Beauregard: elapsed time t=1.49994 s, 524288 iters, t-(init.)=1.46661 s t(norm)=0.349667, mflops=14.2993 (err=6.5e-17) 6. Bergland: elapsed time t=1.23328 s, 1048576 iters, t-(init.)=1.19995 s t(norm)=0.143045, mflops=34.9539 (err=5.3e-17) 7. Brenner: elapsed time t=1.28328 s, 1048576 iters, t-(init.)=1.24995 s t(norm)=0.149006, mflops=33.5558 (err=5.3e-17) 8. 
Burrus: elapsed time t=1.99992 s, 2097152 iters, t-(init.)=1.89992 s t(norm)=0.113244, mflops=44.1523 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.46661 s, 1048576 iters, t-(init.)=1.41661 s t(norm)=0.168873, mflops=29.608 10. CWP (best N) (N=15): elapsed time t=1.29995 s, 524288 iters, t-(init.)=1.21662 s t(norm)=0.290064, mflops=17.2376 11. Edelblute: elapsed time t=1.79993 s, 2097152 iters, t-(init.)=1.7166 s t(norm)=0.102317, mflops=48.8676 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.11662 s, 2097152 iters, t-(init.)=1.01663 s t(norm)=0.0605956, mflops=82.5142 (err=5.3e-17) 13. FFTPACK (f2c): elapsed time t=1.16662 s, 1048576 iters, t-(init.)=1.11662 s t(norm)=0.133112, mflops=37.5624 (err=5.3e-17) FFTW_MEASURE plan: (cost = 5.086060e-07) FFTW_NOTW 4 14. FFTW: elapsed time t=2.01659 s, 4194304 iters, t-(init.)=1.83326 s t(norm)=0.0546354, mflops=91.5157 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 15. FFTW_ESTIMATE: elapsed time t=1.99992 s, 4194304 iters, t-(init.)=0.799968 s t(norm)=0.0238409, mflops=209.724 (err=5.3e-17) 16. Frigo-old: elapsed time t=1.74993 s, 4194304 iters, t-(init.)=1.5666 s t(norm)=0.0466884, mflops=107.093 (err=5.3e-17) 17. Skipping fft (Green can't handle this size.). 18. GSL: elapsed time t=1.39994 s, 2097152 iters, t-(init.)=1.29995 s t(norm)=0.0774829, mflops=64.5303 (err=5.3e-17) 19. GSL DIT: elapsed time t=1.26662 s, 524288 iters, t-(init.)=1.24995 s t(norm)=0.298011, mflops=16.7779 (err=6.5e-17) 20. GSL DIF: elapsed time t=1.94992 s, 1048576 iters, t-(init.)=1.91659 s t(norm)=0.228475, mflops=21.8842 (err=6.5e-17) 21. Krukar: elapsed time t=1.41661 s, 2097152 iters, t-(init.)=1.31661 s t(norm)=0.0784763, mflops=63.7135 (err=5.3e-17) 22. Mayer (Buneman): elapsed time t=1.06662 s, 2097152 iters, t-(init.)=0.983294 s t(norm)=0.0586089, mflops=85.3113 (err=1.3e-16) 23. Mayer (simple): elapsed time t=1.03329 s, 2097152 iters, t-(init.)=0.949962 s t(norm)=0.0566221, mflops=88.3047 24. 
Mayer (lookup): elapsed time t=1.16662 s, 2097152 iters, t-(init.)=1.06662 s t(norm)=0.0635757, mflops=78.6463 (err=1.3e-16) 25. Monro: elapsed time t=1.88326 s, 524288 iters, t-(init.)=1.84993 s t(norm)=0.441057, mflops=11.3364 (err=1.3e-16) 26. NAPACK (f2c): elapsed time t=1.81659 s, 524288 iters, t-(init.)=1.78326 s t(norm)=0.425163, mflops=11.7602 (err=1.6e-16) 27. Ooura (C): elapsed time t=1.18329 s, 4194304 iters, t-(init.)=0.99996 s t(norm)=0.0298011, mflops=167.779 (err=5.3e-17) 28. Ooura (F): elapsed time t=1.29995 s, 4194304 iters, t-(init.)=1.11662 s t(norm)=0.0332779, mflops=150.25 (err=5.3e-17) 29. Ransom: elapsed time t=1.06662 s, 262144 iters, t-(init.)=1.04996 s t(norm)=0.500659, mflops=9.98684 (err=1.6e-16) 30. SCIPORT: elapsed time t=1.01663 s, 2097152 iters, t-(init.)=0.933296 s t(norm)=0.0556288, mflops=89.8815 (err=6.5e-17) 31. Singleton: elapsed time t=1.41661 s, 1048576 iters, t-(init.)=1.21662 s t(norm)=0.145032, mflops=34.4751 (err=5.3e-17) 32. Singleton (f2c): elapsed time t=1.86659 s, 1048576 iters, t-(init.)=1.81659 s t(norm)=0.216555, mflops=23.0888 (err=5.3e-17) 33. Sorensen: elapsed time t=1.69993 s, 2097152 iters, t-(init.)=1.6166 s t(norm)=0.096357, mflops=51.8904 (err=1.3e-16) 34. Sorensen DIT: elapsed time t=1.6666 s, 2097152 iters, t-(init.)=1.58327 s t(norm)=0.0943702, mflops=52.9828 (err=1.3e-16) 35. Temperton: elapsed time t=1.24995 s, 1048576 iters, t-(init.)=1.19995 s t(norm)=0.143045, mflops=34.9539 (err=5.3e-17) 36. Temperton (f2c): elapsed time t=1.68327 s, 1048576 iters, t-(init.)=1.63327 s t(norm)=0.194701, mflops=25.6804 (err=5.3e-17) 37. Valkenburg: elapsed time t=1.38328 s, 524288 iters, t-(init.)=1.36661 s t(norm)=0.325826, mflops=15.3456 (err=1.6e-16) Top mflops for N=4 = 209.724 Normalized results and averages for N=4: fft 0: mflops = 121.286 (norm. = 0.578313), norm. avg. (of 2) = 0.481865 fft 1: mflops = 119.842 (norm. = 0.571429), norm. avg. (of 2) = 0.49127 fft 2: mflops = 50.3337 (norm. = 0.24), norm. avg. 
(of 2) = 0.291296 fft 3: mflops = 6.91396 (norm. = 0.032967), norm. avg. (of 2) = 0.0237101 fft 4: mflops = 29.2638 (norm. = 0.139535), norm. avg. (of 2) = 0.129832 fft 5: mflops = 14.2993 (norm. = 0.0681818), norm. avg. (of 2) = 0.0569869 fft 6: mflops = 34.9539 (norm. = 0.166667), norm. avg. (of 2) = 0.12004 fft 7: mflops = 33.5558 (norm. = 0.16), norm. avg. (of 2) = 0.125343 fft 8: mflops = 44.1523 (norm. = 0.210526), norm. avg. (of 2) = 0.187852 fft 9: mflops = 29.608 (norm. = 0.141176), norm. avg. (of 2) = 0.0984497 fft 10: mflops = 17.2376 (norm. = 0.0821918), norm. avg. (of 2) = 0.070368 fft 11: mflops = 48.8676 (norm. = 0.23301), norm. avg. (of 1) = 0.23301 fft 12: mflops = 82.5142 (norm. = 0.393443), norm. avg. (of 2) = 0.254534 fft 13: mflops = 37.5624 (norm. = 0.179104), norm. avg. (of 2) = 0.172142 fft 14: mflops = 91.5157 (norm. = 0.436364), norm. avg. (of 2) = 0.490241 fft 15: mflops = 209.724 (norm. = 1), norm. avg. (of 2) = 0.5 fft 16: mflops = 107.093 (norm. = 0.510638), norm. avg. (of 2) = 0.755319 fft 17: mflops = -1 (norm. = -0.00476818), norm. avg. (of 0) = -1 fft 18: mflops = 64.5303 (norm. = 0.307692), norm. avg. (of 2) = 0.227259 fft 19: mflops = 16.7779 (norm. = 0.08), norm. avg. (of 2) = 0.0662784 fft 20: mflops = 21.8842 (norm. = 0.104348), norm. avg. (of 2) = 0.097966 fft 21: mflops = 63.7135 (norm. = 0.303797), norm. avg. (of 2) = 0.303538 fft 22: mflops = 85.3113 (norm. = 0.40678), norm. avg. (of 1) = 0.40678 fft 23: mflops = 88.3047 (norm. = 0.421053), norm. avg. (of 1) = 0.421053 fft 24: mflops = 78.6463 (norm. = 0.375), norm. avg. (of 1) = 0.375 fft 25: mflops = 11.3364 (norm. = 0.0540541), norm. avg. (of 1) = 0.0540541 fft 26: mflops = 11.7602 (norm. = 0.0560748), norm. avg. (of 2) = 0.0463906 fft 27: mflops = 167.779 (norm. = 0.8), norm. avg. (of 2) = 0.762745 fft 28: mflops = 150.25 (norm. = 0.716418), norm. avg. (of 2) = 0.647271 fft 29: mflops = 9.98684 (norm. = 0.047619), norm. avg. 
(of 1) = 0.047619 fft 30: mflops = 89.8815 (norm. = 0.428571), norm. avg. (of 1) = 0.428571 fft 31: mflops = 34.4751 (norm. = 0.164384), norm. avg. (of 2) = 0.122762 fft 32: mflops = 23.0888 (norm. = 0.110092), norm. avg. (of 2) = 0.0807403 fft 33: mflops = 51.8904 (norm. = 0.247423), norm. avg. (of 2) = 0.285992 fft 34: mflops = 52.9828 (norm. = 0.252632), norm. avg. (of 2) = 0.256597 fft 35: mflops = 34.9539 (norm. = 0.166667), norm. avg. (of 2) = 0.120632 fft 36: mflops = 25.6804 (norm. = 0.122449), norm. avg. (of 2) = 0.0908719 fft 37: mflops = 15.3456 (norm. = 0.0731707), norm. avg. (of 2) = 0.0732917 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.79993 s, 2097152 iters, t-(init.)=1.6166 s t(norm)=0.032119, mflops=155.671 (err=1.2e-16) 1. Arndt DIT: elapsed time t=1.44994 s, 2097152 iters, t-(init.)=1.26662 s t(norm)=0.0251654, mflops=198.686 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.09996 s, 524288 iters, t-(init.)=1.04996 s t(norm)=0.0834432, mflops=59.921 (err=1.1e-16) 3. Arndt 4-step: elapsed time t=1.53327 s, 131072 iters, t-(init.)=1.51661 s t(norm)=0.482116, mflops=10.3709 (err=1.3e-16) 4. Bailey: elapsed time t=1.13329 s, 524288 iters, t-(init.)=1.08329 s t(norm)=0.0860922, mflops=58.0773 (err=9.8e-17) 5. Beauregard: elapsed time t=1.59994 s, 262144 iters, t-(init.)=1.5666 s t(norm)=0.249005, mflops=20.0799 (err=1.2e-16) 6. Bergland: elapsed time t=1.16662 s, 524288 iters, t-(init.)=1.11662 s t(norm)=0.0887411, mflops=56.3437 (err=1.3e-16) 7. Brenner: elapsed time t=1.51661 s, 524288 iters, t-(init.)=1.48327 s t(norm)=0.11788, mflops=42.416 (err=1.2e-16) 8. Burrus: elapsed time t=1.26662 s, 524288 iters, t-(init.)=1.23328 s t(norm)=0.0980126, mflops=51.0138 (err=1.3e-16) 9. CWP (min N): elapsed time t=1.04996 s, 524288 iters, t-(init.)=0.99996 s t(norm)=0.0794697, mflops=62.9171 10. CWP (best N) (N=15): elapsed time t=1.29995 s, 524288 iters, t-(init.)=1.21662 s t(norm)=0.0966881, mflops=51.7127 11. 
Edelblute: elapsed time t=1.43328 s, 524288 iters, t-(init.)=1.38328 s t(norm)=0.109933, mflops=45.4822 (err=1.3e-16) 12. FFTPACK: elapsed time t=1.6666 s, 1048576 iters, t-(init.)=1.5666 s t(norm)=0.0622512, mflops=80.3197 (err=1.2e-16) 13. FFTPACK (f2c): elapsed time t=1.86659 s, 1048576 iters, t-(init.)=1.7666 s t(norm)=0.0701982, mflops=71.2269 (err=1.2e-16) FFTW_MEASURE plan: (cost = 3.655605e-07) FFTW_NOTW 8 14. FFTW: elapsed time t=1.68327 s, 4194304 iters, t-(init.)=1.31661 s t(norm)=0.0130794, mflops=382.281 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 15. FFTW_ESTIMATE: elapsed time t=1.69993 s, 4194304 iters, t-(init.)=0.199992 s t(norm)=0.00198674, mflops=2516.68 (err=1.2e-16) 16. Frigo-old: elapsed time t=1.28328 s, 4194304 iters, t-(init.)=0.91663 s t(norm)=0.0091059, mflops=549.094 (err=1.4e-16) 17. Green: elapsed time t=1.64993 s, 2097152 iters, t-(init.)=1.46661 s t(norm)=0.0291389, mflops=171.592 (err=1.4e-16) 18. GSL: elapsed time t=1.36661 s, 1048576 iters, t-(init.)=1.28328 s t(norm)=0.050993, mflops=98.0526 (err=1.4e-16) 19. GSL DIT: elapsed time t=1.08329 s, 262144 iters, t-(init.)=1.06662 s t(norm)=0.169535, mflops=29.4924 (err=1.2e-16) 20. GSL DIF: elapsed time t=1.79993 s, 524288 iters, t-(init.)=1.7666 s t(norm)=0.140396, mflops=35.6134 (err=1.4e-16) 21. Krukar: elapsed time t=1.29995 s, 1048576 iters, t-(init.)=1.19995 s t(norm)=0.0476818, mflops=104.862 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.23328 s, 1048576 iters, t-(init.)=1.14995 s t(norm)=0.0456951, mflops=109.421 (err=1.2e-16) 23. Mayer (simple): elapsed time t=1.21662 s, 1048576 iters, t-(init.)=1.11662 s t(norm)=0.0443706, mflops=112.687 24. Mayer (lookup): elapsed time t=1.31661 s, 1048576 iters, t-(init.)=1.23328 s t(norm)=0.0490063, mflops=102.028 (err=1.2e-16) 25. Monro: elapsed time t=1.23328 s, 262144 iters, t-(init.)=1.21662 s t(norm)=0.193376, mflops=25.8563 (err=1.1e-08) 26. 
NAPACK (f2c): elapsed time t=1.7666 s, 262144 iters, t-(init.)=1.74993 s t(norm)=0.278144, mflops=17.9763 (err=1.7e-16) 27. Ooura (C): elapsed time t=1.34995 s, 2097152 iters, t-(init.)=1.16662 s t(norm)=0.0231787, mflops=215.716 (err=1.3e-16) 28. Ooura (F): elapsed time t=1.38328 s, 2097152 iters, t-(init.)=1.18329 s t(norm)=0.0235098, mflops=212.677 (err=1.3e-16) 29. Ransom: elapsed time t=1.26662 s, 131072 iters, t-(init.)=1.24995 s t(norm)=0.397348, mflops=12.5834 (err=3.4e-16) 30. SCIPORT: elapsed time t=1.08329 s, 1048576 iters, t-(init.)=0.99996 s t(norm)=0.0397348, mflops=125.834 (err=1.4e-16) 31. Singleton: elapsed time t=1.06662 s, 262144 iters, t-(init.)=0.99996 s t(norm)=0.158939, mflops=31.4585 (err=1.4e-16) 32. Singleton (f2c): elapsed time t=1.29995 s, 262144 iters, t-(init.)=1.28328 s t(norm)=0.203972, mflops=24.5131 (err=1.4e-16) 33. Sorensen: elapsed time t=1.91659 s, 1048576 iters, t-(init.)=1.83326 s t(norm)=0.0728472, mflops=68.6368 (err=1.5e-16) 34. Sorensen DIT: elapsed time t=1.33328 s, 524288 iters, t-(init.)=1.29995 s t(norm)=0.103311, mflops=48.3978 (err=1.1e-16) 35. Temperton: elapsed time t=2.01659 s, 1048576 iters, t-(init.)=1.93326 s t(norm)=0.0768207, mflops=65.0866 (err=4.6e-09) 36. Temperton (f2c): elapsed time t=1.16662 s, 262144 iters, t-(init.)=1.13329 s t(norm)=0.180131, mflops=27.7575 (err=1.4e-16) 37. Valkenburg: elapsed time t=1.74993 s, 262144 iters, t-(init.)=1.7166 s t(norm)=0.272846, mflops=18.3254 (err=1.5e-16) Top mflops for N=8 = 2516.68 Normalized results and averages for N=8: fft 0: mflops = 155.671 (norm. = 0.0618557), norm. avg. (of 3) = 0.341862 fft 1: mflops = 198.686 (norm. = 0.0789474), norm. avg. (of 3) = 0.353829 fft 2: mflops = 59.921 (norm. = 0.0238095), norm. avg. (of 3) = 0.202134 fft 3: mflops = 10.3709 (norm. = 0.00412088), norm. avg. (of 3) = 0.0171803 fft 4: mflops = 58.0773 (norm. = 0.0230769), norm. avg. (of 3) = 0.0942472 fft 5: mflops = 20.0799 (norm. = 0.00797872), norm. avg. 
(of 3) = 0.0406509 fft 6: mflops = 56.3437 (norm. = 0.0223881), norm. avg. (of 3) = 0.0874891 fft 7: mflops = 42.416 (norm. = 0.0168539), norm. avg. (of 3) = 0.0891801 fft 8: mflops = 51.0138 (norm. = 0.0202703), norm. avg. (of 3) = 0.131992 fft 9: mflops = 62.9171 (norm. = 0.025), norm. avg. (of 3) = 0.0739665 fft 10: mflops = 51.7127 (norm. = 0.0205479), norm. avg. (of 3) = 0.0537613 fft 11: mflops = 45.4822 (norm. = 0.0180723), norm. avg. (of 2) = 0.125541 fft 12: mflops = 80.3197 (norm. = 0.0319149), norm. avg. (of 3) = 0.180328 fft 13: mflops = 71.2269 (norm. = 0.0283019), norm. avg. (of 3) = 0.124195 fft 14: mflops = 382.281 (norm. = 0.151899), norm. avg. (of 3) = 0.37746 fft 15: mflops = 2516.68 (norm. = 1), norm. avg. (of 3) = 0.666667 fft 16: mflops = 549.094 (norm. = 0.218182), norm. avg. (of 3) = 0.576273 fft 17: mflops = 171.592 (norm. = 0.0681818), norm. avg. (of 1) = 0.0681818 fft 18: mflops = 98.0526 (norm. = 0.038961), norm. avg. (of 3) = 0.164493 fft 19: mflops = 29.4924 (norm. = 0.0117188), norm. avg. (of 3) = 0.0480919 fft 20: mflops = 35.6134 (norm. = 0.0141509), norm. avg. (of 3) = 0.0700276 fft 21: mflops = 104.862 (norm. = 0.0416667), norm. avg. (of 3) = 0.216248 fft 22: mflops = 109.421 (norm. = 0.0434783), norm. avg. (of 2) = 0.225129 fft 23: mflops = 112.687 (norm. = 0.0447761), norm. avg. (of 2) = 0.232914 fft 24: mflops = 102.028 (norm. = 0.0405405), norm. avg. (of 2) = 0.20777 fft 25: mflops = 25.8563 (norm. = 0.010274), norm. avg. (of 2) = 0.032164 fft 26: mflops = 17.9763 (norm. = 0.00714286), norm. avg. (of 3) = 0.033308 fft 27: mflops = 215.716 (norm. = 0.0857143), norm. avg. (of 3) = 0.537068 fft 28: mflops = 212.677 (norm. = 0.084507), norm. avg. (of 3) = 0.459683 fft 29: mflops = 12.5834 (norm. = 0.005), norm. avg. (of 2) = 0.0263095 fft 30: mflops = 125.834 (norm. = 0.05), norm. avg. (of 2) = 0.239286 fft 31: mflops = 31.4585 (norm. = 0.0125), norm. avg. (of 3) = 0.086008 fft 32: mflops = 24.5131 (norm. = 0.00974026), norm. avg. 
(of 3) = 0.0570736 fft 33: mflops = 68.6368 (norm. = 0.0272727), norm. avg. (of 3) = 0.199752 fft 34: mflops = 48.3978 (norm. = 0.0192308), norm. avg. (of 3) = 0.177475 fft 35: mflops = 65.0866 (norm. = 0.0258621), norm. avg. (of 3) = 0.0890418 fft 36: mflops = 27.7575 (norm. = 0.0110294), norm. avg. (of 3) = 0.0642578 fft 37: mflops = 18.3254 (norm. = 0.00728155), norm. avg. (of 3) = 0.0512883 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.01663 s, 262144 iters, t-(init.)=0.966628 s t(norm)=0.0576155, mflops=86.7822 (err=1.5e-16) 1. Arndt DIT: elapsed time t=1.01663 s, 262144 iters, t-(init.)=0.983294 s t(norm)=0.0586089, mflops=85.3113 (err=2.2e-16) 2. Arndt Split-Radix: elapsed time t=1.53327 s, 262144 iters, t-(init.)=1.49994 s t(norm)=0.0894034, mflops=55.9263 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.11662 s, 65536 iters, t-(init.)=1.11662 s t(norm)=0.266223, mflops=18.7812 (err=2.0e-16) 4. Bailey: elapsed time t=1.29995 s, 262144 iters, t-(init.)=1.26662 s t(norm)=0.0754962, mflops=66.2285 (err=2.0e-16) 5. Beauregard: elapsed time t=1.88326 s, 131072 iters, t-(init.)=1.84993 s t(norm)=0.220528, mflops=22.6728 (err=2.7e-16) 6. Bergland: elapsed time t=1.06662 s, 262144 iters, t-(init.)=1.01663 s t(norm)=0.0605956, mflops=82.5142 (err=2.6e-16) 7. Brenner: elapsed time t=1.38328 s, 262144 iters, t-(init.)=1.33328 s t(norm)=0.0794697, mflops=62.9171 (err=2.1e-16) 8. Burrus: elapsed time t=1.69993 s, 262144 iters, t-(init.)=1.6666 s t(norm)=0.0993371, mflops=50.3337 (err=1.4e-16) 9. CWP (min N): elapsed time t=1.36661 s, 524288 iters, t-(init.)=1.28328 s t(norm)=0.0382448, mflops=130.737 10. CWP (best N) (N=28): elapsed time t=1.01663 s, 262144 iters, t-(init.)=0.949962 s t(norm)=0.0566221, mflops=88.3047 11. Edelblute: elapsed time t=1.01663 s, 131072 iters, t-(init.)=0.99996 s t(norm)=0.119205, mflops=41.9447 (err=1.4e-16) 12. 
FFTPACK: elapsed time t=1.09996 s, 524288 iters, t-(init.)=1.01663 s t(norm)=0.0302978, mflops=165.028 (err=1.8e-16) 13. FFTPACK (f2c): elapsed time t=1.49994 s, 524288 iters, t-(init.)=1.39994 s t(norm)=0.0417216, mflops=119.842 (err=1.8e-16) FFTW_MEASURE plan: (cost = 1.017212e-06) FFTW_NOTW 16 14. FFTW: elapsed time t=1.08329 s, 1048576 iters, t-(init.)=0.91663 s t(norm)=0.0136589, mflops=366.063 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 15. FFTW_ESTIMATE: elapsed time t=1.08329 s, 1048576 iters, t-(init.)=0.649974 s t(norm)=0.00968537, mflops=516.243 (err=1.8e-16) 16. Frigo-old: elapsed time t=1.41661 s, 2097152 iters, t-(init.)=1.08329 s t(norm)=0.00807114, mflops=619.491 (err=1.8e-16) 17. Green: elapsed time t=1.16662 s, 524288 iters, t-(init.)=1.08329 s t(norm)=0.0322846, mflops=154.873 (err=1.9e-16) 18. GSL: elapsed time t=1.08329 s, 524288 iters, t-(init.)=0.99996 s t(norm)=0.0298011, mflops=167.779 (err=1.8e-16) 19. GSL DIT: elapsed time t=1.84993 s, 262144 iters, t-(init.)=1.79993 s t(norm)=0.107284, mflops=46.6052 (err=2.1e-16) 20. GSL DIF: elapsed time t=1.58327 s, 262144 iters, t-(init.)=1.53327 s t(norm)=0.0913901, mflops=54.7105 (err=2.8e-16) 21. Krukar: elapsed time t=1.28328 s, 524288 iters, t-(init.)=1.19995 s t(norm)=0.0357614, mflops=139.816 (err=1.9e-16) 22. Mayer (Buneman): elapsed time t=1.94992 s, 524288 iters, t-(init.)=1.86659 s t(norm)=0.0556288, mflops=89.8815 (err=1.7e-16) 23. Mayer (simple): elapsed time t=1.54994 s, 524288 iters, t-(init.)=1.46661 s t(norm)=0.0437083, mflops=114.395 24. Mayer (lookup): elapsed time t=1.68327 s, 524288 iters, t-(init.)=1.59994 s t(norm)=0.0476818, mflops=104.862 (err=1.8e-16) 25. Monro: elapsed time t=1.96659 s, 262144 iters, t-(init.)=1.93326 s t(norm)=0.115231, mflops=43.3911 (err=2.1e-08) 26. NAPACK (f2c): elapsed time t=1.51661 s, 131072 iters, t-(init.)=1.49994 s t(norm)=0.178807, mflops=27.9631 (err=3.3e-16) 27. 
Ooura (C): elapsed time t=1.6166 s, 1048576 iters, t-(init.)=1.44994 s t(norm)=0.0216058, mflops=231.419 (err=2.0e-16) 28. Ooura (F): elapsed time t=1.74993 s, 1048576 iters, t-(init.)=1.5666 s t(norm)=0.0233442, mflops=214.186 (err=2.0e-16) 29. Ransom: elapsed time t=1.41661 s, 131072 iters, t-(init.)=1.39994 s t(norm)=0.166886, mflops=29.9605 (err=3.4e-16) 30. SCIPORT: elapsed time t=1.84993 s, 1048576 iters, t-(init.)=1.6666 s t(norm)=0.0248343, mflops=201.335 (err=2.8e-16) 31. Singleton: elapsed time t=1.13329 s, 262144 iters, t-(init.)=1.04996 s t(norm)=0.0625824, mflops=79.8947 (err=1.7e-16) 32. Singleton (f2c): elapsed time t=1.24995 s, 262144 iters, t-(init.)=1.21662 s t(norm)=0.0725161, mflops=68.9502 (err=1.7e-16) 33. Sorensen: elapsed time t=1.08329 s, 262144 iters, t-(init.)=1.03329 s t(norm)=0.061589, mflops=81.1833 (err=1.5e-16) 34. Sorensen DIT: elapsed time t=1.7666 s, 262144 iters, t-(init.)=1.7166 s t(norm)=0.102317, mflops=48.8676 (err=1.6e-16) 35. Temperton: elapsed time t=1.23328 s, 262144 iters, t-(init.)=1.18329 s t(norm)=0.0705293, mflops=70.8925 (err=1.7e-08) 36. Temperton (f2c): elapsed time t=1.31661 s, 262144 iters, t-(init.)=1.26662 s t(norm)=0.0754962, mflops=66.2285 (err=1.8e-16) 37. Valkenburg: elapsed time t=1.13329 s, 65536 iters, t-(init.)=1.11662 s t(norm)=0.266223, mflops=18.7812 (err=2.9e-16) Top mflops for N=16 = 619.491 Normalized results and averages for N=16: fft 0: mflops = 86.7822 (norm. = 0.140086), norm. avg. (of 4) = 0.291418 fft 1: mflops = 85.3113 (norm. = 0.137712), norm. avg. (of 4) = 0.2998 fft 2: mflops = 55.9263 (norm. = 0.0902778), norm. avg. (of 4) = 0.17417 fft 3: mflops = 18.7812 (norm. = 0.0303172), norm. avg. (of 4) = 0.0204646 fft 4: mflops = 66.2285 (norm. = 0.106908), norm. avg. (of 4) = 0.0974124 fft 5: mflops = 22.6728 (norm. = 0.0365991), norm. avg. (of 4) = 0.0396379 fft 6: mflops = 82.5142 (norm. = 0.133197), norm. avg. (of 4) = 0.098916 fft 7: mflops = 62.9171 (norm. = 0.101562), norm. avg. 
(of 4) = 0.0922757 fft 8: mflops = 50.3337 (norm. = 0.08125), norm. avg. (of 4) = 0.119306 fft 9: mflops = 130.737 (norm. = 0.211039), norm. avg. (of 4) = 0.108235 fft 10: mflops = 88.3047 (norm. = 0.142544), norm. avg. (of 4) = 0.075957 fft 11: mflops = 41.9447 (norm. = 0.0677083), norm. avg. (of 3) = 0.106263 fft 12: mflops = 165.028 (norm. = 0.266393), norm. avg. (of 4) = 0.201844 fft 13: mflops = 119.842 (norm. = 0.193452), norm. avg. (of 4) = 0.141509 fft 14: mflops = 366.063 (norm. = 0.590909), norm. avg. (of 4) = 0.430822 fft 15: mflops = 516.243 (norm. = 0.833333), norm. avg. (of 4) = 0.708333 fft 16: mflops = 619.491 (norm. = 1), norm. avg. (of 4) = 0.682205 fft 17: mflops = 154.873 (norm. = 0.25), norm. avg. (of 2) = 0.159091 fft 18: mflops = 167.779 (norm. = 0.270833), norm. avg. (of 4) = 0.191078 fft 19: mflops = 46.6052 (norm. = 0.0752315), norm. avg. (of 4) = 0.0548768 fft 20: mflops = 54.7105 (norm. = 0.0883152), norm. avg. (of 4) = 0.0745995 fft 21: mflops = 139.816 (norm. = 0.225694), norm. avg. (of 4) = 0.218609 fft 22: mflops = 89.8815 (norm. = 0.145089), norm. avg. (of 3) = 0.198449 fft 23: mflops = 114.395 (norm. = 0.184659), norm. avg. (of 3) = 0.216829 fft 24: mflops = 104.862 (norm. = 0.169271), norm. avg. (of 3) = 0.194937 fft 25: mflops = 43.3911 (norm. = 0.0700431), norm. avg. (of 3) = 0.0447904 fft 26: mflops = 27.9631 (norm. = 0.0451389), norm. avg. (of 4) = 0.0362657 fft 27: mflops = 231.419 (norm. = 0.373563), norm. avg. (of 4) = 0.496192 fft 28: mflops = 214.186 (norm. = 0.345745), norm. avg. (of 4) = 0.431199 fft 29: mflops = 29.9605 (norm. = 0.0483631), norm. avg. (of 3) = 0.0336607 fft 30: mflops = 201.335 (norm. = 0.325), norm. avg. (of 3) = 0.267857 fft 31: mflops = 79.8947 (norm. = 0.128968), norm. avg. (of 4) = 0.096748 fft 32: mflops = 68.9502 (norm. = 0.111301), norm. avg. (of 4) = 0.0706306 fft 33: mflops = 81.1833 (norm. = 0.131048), norm. avg. (of 4) = 0.182576 fft 34: mflops = 48.8676 (norm. = 0.0788835), norm. avg. 
(of 4) = 0.152827 fft 35: mflops = 70.8925 (norm. = 0.114437), norm. avg. (of 4) = 0.0953905 fft 36: mflops = 66.2285 (norm. = 0.106908), norm. avg. (of 4) = 0.0749203 fft 37: mflops = 18.7812 (norm. = 0.0303172), norm. avg. (of 4) = 0.0460455 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.03329 s, 131072 iters, t-(init.)=0.983294 s t(norm)=0.0468871, mflops=106.639 (err=3.1e-16) 1. Arndt DIT: elapsed time t=1.03329 s, 131072 iters, t-(init.)=0.99996 s t(norm)=0.0476818, mflops=104.862 (err=2.5e-16) 2. Arndt Split-Radix: elapsed time t=1.6166 s, 131072 iters, t-(init.)=1.5666 s t(norm)=0.0747015, mflops=66.9331 (err=2.7e-16) 3. Arndt 4-step: elapsed time t=1.16662 s, 32768 iters, t-(init.)=1.14995 s t(norm)=0.219336, mflops=22.796 (err=2.8e-16) 4. Bailey: elapsed time t=1.36661 s, 131072 iters, t-(init.)=1.33328 s t(norm)=0.0635757, mflops=78.6463 (err=2.7e-16) 5. Beauregard: elapsed time t=1.06662 s, 32768 iters, t-(init.)=1.04996 s t(norm)=0.200264, mflops=24.9671 (err=1.8e-16) 6. Bergland: elapsed time t=1.79993 s, 262144 iters, t-(init.)=1.7166 s t(norm)=0.0409269, mflops=122.169 (err=2.6e-16) 7. Brenner: elapsed time t=1.46661 s, 131072 iters, t-(init.)=1.43328 s t(norm)=0.0683439, mflops=73.1594 (err=2.2e-16) 8. Burrus: elapsed time t=1.99992 s, 131072 iters, t-(init.)=1.96659 s t(norm)=0.0937742, mflops=53.3196 (err=2.9e-16) 9. CWP (min N) (N=33): elapsed time t=1.41661 s, 262144 iters, t-(init.)=1.33328 s t(norm)=0.0317879, mflops=157.293 10. CWP (best N) (N=35): elapsed time t=1.21662 s, 262144 iters, t-(init.)=1.13329 s t(norm)=0.0270197, mflops=185.05 11. Edelblute: elapsed time t=1.13329 s, 65536 iters, t-(init.)=1.11662 s t(norm)=0.106489, mflops=46.953 (err=2.9e-16) 12. FFTPACK: elapsed time t=1.6666 s, 262144 iters, t-(init.)=1.59994 s t(norm)=0.0381454, mflops=131.077 (err=1.9e-16) 13. 
FFTPACK (f2c): elapsed time t=1.83326 s, 262144 iters, t-(init.)=1.7666 s t(norm)=0.0421189, mflops=118.711 (err=1.9e-16) FFTW_MEASURE plan: (cost = 1.589394e-06) FFTW_NOTW 32 14. FFTW: elapsed time t=1.63327 s, 1048576 iters, t-(init.)=1.33328 s t(norm)=0.00794697, mflops=629.171 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.64993 s, 1048576 iters, t-(init.)=1.09996 s t(norm)=0.00655625, mflops=762.631 (err=2.1e-16) 16. Frigo-old: elapsed time t=1.83326 s, 1048576 iters, t-(init.)=1.53327 s t(norm)=0.00913901, mflops=547.105 (err=2.2e-16) 17. Green: elapsed time t=1.04996 s, 262144 iters, t-(init.)=0.966628 s t(norm)=0.0230462, mflops=216.955 (err=2.0e-16) 18. GSL: elapsed time t=1.38328 s, 262144 iters, t-(init.)=1.29995 s t(norm)=0.0309932, mflops=161.326 (err=2.0e-16) 19. GSL DIT: elapsed time t=1.59994 s, 131072 iters, t-(init.)=1.54994 s t(norm)=0.0739068, mflops=67.6528 (err=2.2e-16) 20. GSL DIF: elapsed time t=1.41661 s, 131072 iters, t-(init.)=1.38328 s t(norm)=0.0659598, mflops=75.8037 (err=2.5e-16) 21. Krukar: elapsed time t=1.53327 s, 262144 iters, t-(init.)=1.44994 s t(norm)=0.0345693, mflops=144.637 (err=2.2e-16) 22. Mayer (Buneman): elapsed time t=1.03329 s, 131072 iters, t-(init.)=0.983294 s t(norm)=0.0468871, mflops=106.639 (err=2.7e-16) 23. Mayer (simple): elapsed time t=1.6666 s, 262144 iters, t-(init.)=1.58327 s t(norm)=0.0377481, mflops=132.457 24. Mayer (lookup): elapsed time t=1.69993 s, 262144 iters, t-(init.)=1.6166 s t(norm)=0.0385428, mflops=129.726 (err=2.5e-16) 25. Monro: elapsed time t=1.53327 s, 131072 iters, t-(init.)=1.48327 s t(norm)=0.070728, mflops=70.6933 (err=3.7e-08) 26. NAPACK (f2c): elapsed time t=1.5666 s, 65536 iters, t-(init.)=1.54994 s t(norm)=0.147814, mflops=33.8264 (err=5.4e-16) 27. Ooura (C): elapsed time t=1.81659 s, 524288 iters, t-(init.)=1.6666 s t(norm)=0.0198674, mflops=251.668 (err=2.7e-16) 28. 
Ooura (F): elapsed time t=1.91659 s, 524288 iters, t-(init.)=1.78326 s t(norm)=0.0212581, mflops=235.204 (err=2.7e-16) 29. Ransom: elapsed time t=1.48327 s, 65536 iters, t-(init.)=1.46661 s t(norm)=0.139867, mflops=35.7483 (err=7.0e-16) 30. SCIPORT: elapsed time t=1.89992 s, 524288 iters, t-(init.)=1.74993 s t(norm)=0.0208608, mflops=239.684 (err=1.8e-16) 31. Singleton: elapsed time t=1.01663 s, 131072 iters, t-(init.)=0.949962 s t(norm)=0.0452977, mflops=110.381 (err=2.2e-16) 32. Singleton (f2c): elapsed time t=1.13329 s, 131072 iters, t-(init.)=1.09996 s t(norm)=0.05245, mflops=95.3289 (err=2.2e-16) 33. Sorensen: elapsed time t=1.03329 s, 131072 iters, t-(init.)=0.99996 s t(norm)=0.0476818, mflops=104.862 (err=2.7e-16) 34. Sorensen DIT: elapsed time t=1.04996 s, 65536 iters, t-(init.)=1.03329 s t(norm)=0.0985424, mflops=50.7396 (err=2.6e-16) 35. Temperton: elapsed time t=1.29995 s, 131072 iters, t-(init.)=1.24995 s t(norm)=0.0596023, mflops=83.8894 (err=3.1e-08) 36. Temperton (f2c): elapsed time t=1.73326 s, 131072 iters, t-(init.)=1.69993 s t(norm)=0.0810591, mflops=61.6834 (err=2.0e-16) 37. Valkenburg: elapsed time t=1.39994 s, 32768 iters, t-(init.)=1.38328 s t(norm)=0.263839, mflops=18.9509 (err=4.3e-16) Top mflops for N=32 = 762.631 Normalized results and averages for N=32: fft 0: mflops = 106.639 (norm. = 0.139831), norm. avg. (of 5) = 0.2611 fft 1: mflops = 104.862 (norm. = 0.1375), norm. avg. (of 5) = 0.26734 fft 2: mflops = 66.9331 (norm. = 0.087766), norm. avg. (of 5) = 0.156889 fft 3: mflops = 22.796 (norm. = 0.0298913), norm. avg. (of 5) = 0.0223499 fft 4: mflops = 78.6463 (norm. = 0.103125), norm. avg. (of 5) = 0.0985549 fft 5: mflops = 24.9671 (norm. = 0.0327381), norm. avg. (of 5) = 0.038258 fft 6: mflops = 122.169 (norm. = 0.160194), norm. avg. (of 5) = 0.111172 fft 7: mflops = 73.1594 (norm. = 0.0959302), norm. avg. (of 5) = 0.0930066 fft 8: mflops = 53.3196 (norm. = 0.0699153), norm. avg. (of 5) = 0.109428 fft 9: mflops = 157.293 (norm. 
= 0.20625), norm. avg. (of 5) = 0.127838 fft 10: mflops = 185.05 (norm. = 0.242647), norm. avg. (of 5) = 0.109295 fft 11: mflops = 46.953 (norm. = 0.0615672), norm. avg. (of 4) = 0.0950894 fft 12: mflops = 131.077 (norm. = 0.171875), norm. avg. (of 5) = 0.19585 fft 13: mflops = 118.711 (norm. = 0.15566), norm. avg. (of 5) = 0.14434 fft 14: mflops = 629.171 (norm. = 0.825), norm. avg. (of 5) = 0.509658 fft 15: mflops = 762.631 (norm. = 1), norm. avg. (of 5) = 0.766667 fft 16: mflops = 547.105 (norm. = 0.717391), norm. avg. (of 5) = 0.689242 fft 17: mflops = 216.955 (norm. = 0.284483), norm. avg. (of 3) = 0.200888 fft 18: mflops = 161.326 (norm. = 0.211538), norm. avg. (of 5) = 0.19517 fft 19: mflops = 67.6528 (norm. = 0.0887097), norm. avg. (of 5) = 0.0616433 fft 20: mflops = 75.8037 (norm. = 0.0993976), norm. avg. (of 5) = 0.0795591 fft 21: mflops = 144.637 (norm. = 0.189655), norm. avg. (of 5) = 0.212818 fft 22: mflops = 106.639 (norm. = 0.139831), norm. avg. (of 4) = 0.183794 fft 23: mflops = 132.457 (norm. = 0.173684), norm. avg. (of 4) = 0.206043 fft 24: mflops = 129.726 (norm. = 0.170103), norm. avg. (of 4) = 0.188729 fft 25: mflops = 70.6933 (norm. = 0.0926966), norm. avg. (of 4) = 0.0567669 fft 26: mflops = 33.8264 (norm. = 0.0443548), norm. avg. (of 5) = 0.0378835 fft 27: mflops = 251.668 (norm. = 0.33), norm. avg. (of 5) = 0.462954 fft 28: mflops = 235.204 (norm. = 0.308411), norm. avg. (of 5) = 0.406641 fft 29: mflops = 35.7483 (norm. = 0.046875), norm. avg. (of 4) = 0.0369643 fft 30: mflops = 239.684 (norm. = 0.314286), norm. avg. (of 4) = 0.279464 fft 31: mflops = 110.381 (norm. = 0.144737), norm. avg. (of 5) = 0.106346 fft 32: mflops = 95.3289 (norm. = 0.125), norm. avg. (of 5) = 0.0815045 fft 33: mflops = 104.862 (norm. = 0.1375), norm. avg. (of 5) = 0.173561 fft 34: mflops = 50.7396 (norm. = 0.0665323), norm. avg. (of 5) = 0.135568 fft 35: mflops = 83.8894 (norm. = 0.11), norm. avg. (of 5) = 0.0983124 fft 36: mflops = 61.6834 (norm. 
= 0.0808824), norm. avg. (of 5) = 0.0761127 fft 37: mflops = 18.9509 (norm. = 0.0248494), norm. avg. (of 5) = 0.0418063 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.31661 s, 65536 iters, t-(init.)=1.28328 s t(norm)=0.050993, mflops=98.0526 (err=5.7e-16) 1. Arndt DIT: elapsed time t=1.33328 s, 65536 iters, t-(init.)=1.29995 s t(norm)=0.0516553, mflops=96.7955 (err=5.7e-16) 2. Arndt Split-Radix: elapsed time t=1.79993 s, 65536 iters, t-(init.)=1.7666 s t(norm)=0.0701982, mflops=71.2269 (err=5.7e-16) 3. Arndt 4-step: elapsed time t=1.83326 s, 32768 iters, t-(init.)=1.83326 s t(norm)=0.145694, mflops=34.3184 (err=5.6e-16) 4. Bailey: elapsed time t=1.48327 s, 65536 iters, t-(init.)=1.44994 s t(norm)=0.0576155, mflops=86.7822 (err=5.6e-16) 5. Beauregard: elapsed time t=1.28328 s, 16384 iters, t-(init.)=1.28328 s t(norm)=0.203972, mflops=24.5131 (err=6.0e-16) 6. Bergland: elapsed time t=1.74993 s, 131072 iters, t-(init.)=1.68327 s t(norm)=0.0334435, mflops=149.506 (err=6.0e-16) 7. Brenner: elapsed time t=1.28328 s, 65536 iters, t-(init.)=1.24995 s t(norm)=0.0496686, mflops=100.667 (err=5.9e-16) 8. Burrus: elapsed time t=1.08329 s, 32768 iters, t-(init.)=1.06662 s t(norm)=0.0847677, mflops=58.9848 (err=5.7e-16) 9. CWP (min N) (N=65): elapsed time t=1.23328 s, 131072 iters, t-(init.)=1.16662 s t(norm)=0.0231787, mflops=215.716 10. CWP (best N) (N=84): elapsed time t=1.33328 s, 131072 iters, t-(init.)=1.23328 s t(norm)=0.0245032, mflops=204.055 11. Edelblute: elapsed time t=1.24995 s, 32768 iters, t-(init.)=1.23328 s t(norm)=0.0980126, mflops=51.0138 (err=5.7e-16) 12. FFTPACK: elapsed time t=1.31661 s, 131072 iters, t-(init.)=1.24995 s t(norm)=0.0248343, mflops=201.335 (err=5.5e-16) 13. FFTPACK (f2c): elapsed time t=1.84993 s, 131072 iters, t-(init.)=1.78326 s t(norm)=0.0354302, mflops=141.122 (err=5.5e-16) FFTW_MEASURE plan: (cost = 4.577454e-06) FFTW_NOTW 64 14. 
FFTW: elapsed time t=1.28328 s, 262144 iters, t-(init.)=1.14995 s t(norm)=0.0114238, mflops=437.684 (err=5.3e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.5666 s, 262144 iters, t-(init.)=1.36661 s t(norm)=0.0135761, mflops=368.295 (err=5.5e-16) 16. Frigo-old: elapsed time t=1.91659 s, 262144 iters, t-(init.)=1.78326 s t(norm)=0.0177151, mflops=282.245 (err=5.6e-16) 17. Green: elapsed time t=1.69993 s, 262144 iters, t-(init.)=1.5666 s t(norm)=0.0155628, mflops=321.279 (err=5.5e-16) 18. GSL: elapsed time t=1.46661 s, 131072 iters, t-(init.)=1.39994 s t(norm)=0.0278144, mflops=179.763 (err=5.5e-16) 19. GSL DIT: elapsed time t=1.63327 s, 65536 iters, t-(init.)=1.59994 s t(norm)=0.0635757, mflops=78.6463 (err=5.6e-16) 20. GSL DIF: elapsed time t=1.41661 s, 65536 iters, t-(init.)=1.38328 s t(norm)=0.0549665, mflops=90.9644 (err=5.4e-16) 21. Krukar: elapsed time t=1.04996 s, 65536 iters, t-(init.)=1.01663 s t(norm)=0.0403971, mflops=123.771 (err=6.0e-16) 22. Mayer (Buneman): elapsed time t=1.26662 s, 65536 iters, t-(init.)=1.23328 s t(norm)=0.0490063, mflops=102.028 (err=5.4e-16) 23. Mayer (simple): elapsed time t=2.01659 s, 131072 iters, t-(init.)=1.94992 s t(norm)=0.0387415, mflops=129.061 24. Mayer (lookup): elapsed time t=1.03329 s, 65536 iters, t-(init.)=0.99996 s t(norm)=0.0397348, mflops=125.834 (err=5.5e-16) 25. Monro: elapsed time t=1.43328 s, 65536 iters, t-(init.)=1.38328 s t(norm)=0.0549665, mflops=90.9644 (err=3.4e-08) 26. NAPACK (f2c): elapsed time t=1.53327 s, 32768 iters, t-(init.)=1.51661 s t(norm)=0.120529, mflops=41.4838 (err=1.1e-15) 27. Ooura (C): elapsed time t=1.03329 s, 131072 iters, t-(init.)=0.966628 s t(norm)=0.0192052, mflops=260.347 (err=5.9e-16) 28. Ooura (F): elapsed time t=1.08329 s, 131072 iters, t-(init.)=1.01663 s t(norm)=0.0201985, mflops=247.543 (err=5.9e-16) 29. 
Ransom: elapsed time t=1.03329 s, 32768 iters, t-(init.)=1.01663 s t(norm)=0.0807942, mflops=61.8856 (err=8.6e-16) 30. SCIPORT: elapsed time t=1.36661 s, 131072 iters, t-(init.)=1.29995 s t(norm)=0.0258276, mflops=193.591 (err=5.9e-16) 31. Singleton: elapsed time t=1.69993 s, 131072 iters, t-(init.)=1.6166 s t(norm)=0.032119, mflops=155.671 (err=9.2e-16) 32. Singleton (f2c): elapsed time t=1.7666 s, 131072 iters, t-(init.)=1.69993 s t(norm)=0.0337746, mflops=148.04 (err=9.2e-16) 33. Sorensen: elapsed time t=1.96659 s, 131072 iters, t-(init.)=1.89992 s t(norm)=0.0377481, mflops=132.457 (err=5.4e-16) 34. Sorensen DIT: elapsed time t=1.13329 s, 32768 iters, t-(init.)=1.11662 s t(norm)=0.0887411, mflops=56.3437 (err=5.5e-16) 35. Temperton: elapsed time t=1.13329 s, 65536 iters, t-(init.)=1.09996 s t(norm)=0.0437083, mflops=114.395 (err=3.8e-08) 36. Temperton (f2c): elapsed time t=1.58327 s, 65536 iters, t-(init.)=1.53327 s t(norm)=0.0609268, mflops=82.0658 (err=5.5e-16) 37. Valkenburg: elapsed time t=1.7166 s, 16384 iters, t-(init.)=1.69993 s t(norm)=0.270197, mflops=18.505 (err=8.0e-16) Top mflops for N=64 = 437.684 Normalized results and averages for N=64: fft 0: mflops = 98.0526 (norm. = 0.224026), norm. avg. (of 6) = 0.254921 fft 1: mflops = 96.7955 (norm. = 0.221154), norm. avg. (of 6) = 0.259642 fft 2: mflops = 71.2269 (norm. = 0.162736), norm. avg. (of 6) = 0.157864 fft 3: mflops = 34.3184 (norm. = 0.0784091), norm. avg. (of 6) = 0.0316931 fft 4: mflops = 86.7822 (norm. = 0.198276), norm. avg. (of 6) = 0.115175 fft 5: mflops = 24.5131 (norm. = 0.0560065), norm. avg. (of 6) = 0.0412161 fft 6: mflops = 149.506 (norm. = 0.341584), norm. avg. (of 6) = 0.149574 fft 7: mflops = 100.667 (norm. = 0.23), norm. avg. (of 6) = 0.115839 fft 8: mflops = 58.9848 (norm. = 0.134766), norm. avg. (of 6) = 0.113651 fft 9: mflops = 215.716 (norm. = 0.492857), norm. avg. (of 6) = 0.188674 fft 10: mflops = 204.055 (norm. = 0.466216), norm. avg. 
(of 6) = 0.168782 fft 11: mflops = 51.0138 (norm. = 0.116554), norm. avg. (of 5) = 0.0993823 fft 12: mflops = 201.335 (norm. = 0.46), norm. avg. (of 6) = 0.239875 fft 13: mflops = 141.122 (norm. = 0.32243), norm. avg. (of 6) = 0.174021 fft 14: mflops = 437.684 (norm. = 1), norm. avg. (of 6) = 0.591382 fft 15: mflops = 368.295 (norm. = 0.841463), norm. avg. (of 6) = 0.779133 fft 16: mflops = 282.245 (norm. = 0.64486), norm. avg. (of 6) = 0.681845 fft 17: mflops = 321.279 (norm. = 0.734043), norm. avg. (of 4) = 0.334177 fft 18: mflops = 179.763 (norm. = 0.410714), norm. avg. (of 6) = 0.231094 fft 19: mflops = 78.6463 (norm. = 0.179688), norm. avg. (of 6) = 0.0813174 fft 20: mflops = 90.9644 (norm. = 0.207831), norm. avg. (of 6) = 0.100938 fft 21: mflops = 123.771 (norm. = 0.282787), norm. avg. (of 6) = 0.22448 fft 22: mflops = 102.028 (norm. = 0.233108), norm. avg. (of 5) = 0.193657 fft 23: mflops = 129.061 (norm. = 0.294872), norm. avg. (of 5) = 0.223809 fft 24: mflops = 125.834 (norm. = 0.2875), norm. avg. (of 5) = 0.208483 fft 25: mflops = 90.9644 (norm. = 0.207831), norm. avg. (of 5) = 0.0869798 fft 26: mflops = 41.4838 (norm. = 0.0947802), norm. avg. (of 6) = 0.0473663 fft 27: mflops = 260.347 (norm. = 0.594828), norm. avg. (of 6) = 0.484933 fft 28: mflops = 247.543 (norm. = 0.565574), norm. avg. (of 6) = 0.43313 fft 29: mflops = 61.8856 (norm. = 0.141393), norm. avg. (of 5) = 0.0578501 fft 30: mflops = 193.591 (norm. = 0.442308), norm. avg. (of 5) = 0.312033 fft 31: mflops = 155.671 (norm. = 0.35567), norm. avg. (of 6) = 0.1479 fft 32: mflops = 148.04 (norm. = 0.338235), norm. avg. (of 6) = 0.124293 fft 33: mflops = 132.457 (norm. = 0.302632), norm. avg. (of 6) = 0.195073 fft 34: mflops = 56.3437 (norm. = 0.128731), norm. avg. (of 6) = 0.134429 fft 35: mflops = 114.395 (norm. = 0.261364), norm. avg. (of 6) = 0.125488 fft 36: mflops = 82.0658 (norm. = 0.1875), norm. avg. (of 6) = 0.0946773 fft 37: mflops = 18.505 (norm. = 0.0422794), norm. avg. 
(of 6) = 0.0418852 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.34995 s, 32768 iters, t-(init.)=1.31661 s t(norm)=0.0448436, mflops=111.499 (err=3.5e-16) 1. Arndt DIT: elapsed time t=1.41661 s, 32768 iters, t-(init.)=1.38328 s t(norm)=0.0471142, mflops=106.125 (err=3.3e-16) 2. Arndt Split-Radix: elapsed time t=1.93326 s, 32768 iters, t-(init.)=1.89992 s t(norm)=0.064711, mflops=77.2666 (err=3.6e-16) 3. Arndt 4-step: elapsed time t=1.16662 s, 8192 iters, t-(init.)=1.14995 s t(norm)=0.156669, mflops=31.9145 (err=3.3e-16) 4. Bailey: elapsed time t=1.7166 s, 32768 iters, t-(init.)=1.68327 s t(norm)=0.0573317, mflops=87.2118 (err=3.3e-16) 5. Beauregard: elapsed time t=1.49994 s, 8192 iters, t-(init.)=1.48327 s t(norm)=0.20208, mflops=24.7427 (err=3.8e-16) 6. Bergland: elapsed time t=1.96659 s, 65536 iters, t-(init.)=1.89992 s t(norm)=0.0323555, mflops=154.533 (err=3.5e-16) 7. Brenner: elapsed time t=1.46661 s, 32768 iters, t-(init.)=1.43328 s t(norm)=0.0488171, mflops=102.423 (err=4.1e-16) 8. Burrus: elapsed time t=1.14995 s, 16384 iters, t-(init.)=1.13329 s t(norm)=0.0771991, mflops=64.7676 (err=3.2e-16) 9. CWP (min N) (N=130): elapsed time t=1.14995 s, 65536 iters, t-(init.)=1.08329 s t(norm)=0.0184483, mflops=271.027 10. CWP (best N) (N=140): elapsed time t=1.04996 s, 65536 iters, t-(init.)=0.966628 s t(norm)=0.0164616, mflops=303.738 11. Edelblute: elapsed time t=1.34995 s, 16384 iters, t-(init.)=1.33328 s t(norm)=0.0908225, mflops=55.0524 (err=3.2e-16) 12. FFTPACK: elapsed time t=1.6166 s, 65536 iters, t-(init.)=1.54994 s t(norm)=0.0263953, mflops=189.428 (err=3.6e-16) 13. FFTPACK (f2c): elapsed time t=1.06662 s, 32768 iters, t-(init.)=1.03329 s t(norm)=0.0351937, mflops=142.071 (err=3.6e-16) FFTW_MEASURE plan: (cost = 1.322375e-05) FFTW_TWIDDLE 4 FFTW_NOTW 32 14. 
FFTW: elapsed time t=1.59994 s, 131072 iters, t-(init.)=1.46661 s t(norm)=0.0124881, mflops=400.381 (err=3.4e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.6166 s, 131072 iters, t-(init.)=1.44994 s t(norm)=0.0123462, mflops=404.983 (err=3.4e-16) 16. Frigo-old: elapsed time t=1.11662 s, 65536 iters, t-(init.)=1.04996 s t(norm)=0.0178807, mflops=279.631 (err=3.4e-16) 17. Green: elapsed time t=1.91659 s, 131072 iters, t-(init.)=1.78326 s t(norm)=0.0151844, mflops=329.286 (err=4.2e-16) 18. GSL: elapsed time t=1.74993 s, 65536 iters, t-(init.)=1.68327 s t(norm)=0.0286658, mflops=174.424 (err=3.4e-16) 19. GSL DIT: elapsed time t=1.6666 s, 32768 iters, t-(init.)=1.63327 s t(norm)=0.0556288, mflops=89.8815 (err=3.5e-16) 20. GSL DIF: elapsed time t=1.39994 s, 32768 iters, t-(init.)=1.36661 s t(norm)=0.0465465, mflops=107.419 (err=3.7e-16) 21. Krukar: elapsed time t=1.48327 s, 32768 iters, t-(init.)=1.44994 s t(norm)=0.0493847, mflops=101.246 (err=3.6e-16) 22. Mayer (Buneman): elapsed time t=1.26662 s, 32768 iters, t-(init.)=1.23328 s t(norm)=0.0420054, mflops=119.032 (err=3.2e-16) 23. Mayer (simple): elapsed time t=1.99992 s, 65536 iters, t-(init.)=1.93326 s t(norm)=0.0329232, mflops=151.869 24. Mayer (lookup): elapsed time t=1.99992 s, 65536 iters, t-(init.)=1.93326 s t(norm)=0.0329232, mflops=151.869 (err=3.3e-16) 25. Monro: elapsed time t=1.43328 s, 32768 iters, t-(init.)=1.39994 s t(norm)=0.0476818, mflops=104.862 (err=5.2e-08) 26. NAPACK (f2c): elapsed time t=1.79993 s, 16384 iters, t-(init.)=1.78326 s t(norm)=0.121475, mflops=41.1607 (err=1.2e-15) 27. Ooura (C): elapsed time t=1.23328 s, 65536 iters, t-(init.)=1.16662 s t(norm)=0.0198674, mflops=251.668 (err=3.3e-16) 28. Ooura (F): elapsed time t=1.21662 s, 65536 iters, t-(init.)=1.14995 s t(norm)=0.0195836, mflops=255.316 (err=3.3e-16) 29. 
Ransom: elapsed time t=1.26662 s, 16384 iters, t-(init.)=1.24995 s t(norm)=0.0851461, mflops=58.7226 (err=1.0e-15) 30. SCIPORT: elapsed time t=1.48327 s, 65536 iters, t-(init.)=1.41661 s t(norm)=0.0241247, mflops=207.256 (err=3.8e-16) 31. Singleton: elapsed time t=1.06662 s, 32768 iters, t-(init.)=1.03329 s t(norm)=0.0351937, mflops=142.071 (err=4.2e-16) 32. Singleton (f2c): elapsed time t=1.11662 s, 32768 iters, t-(init.)=1.08329 s t(norm)=0.0368966, mflops=135.514 (err=4.2e-16) 33. Sorensen: elapsed time t=1.94992 s, 65536 iters, t-(init.)=1.88326 s t(norm)=0.0320717, mflops=155.901 (err=3.3e-16) 34. Sorensen DIT: elapsed time t=1.23328 s, 16384 iters, t-(init.)=1.21662 s t(norm)=0.0828755, mflops=60.3314 (err=3.2e-16) 35. Temperton: elapsed time t=1.21662 s, 32768 iters, t-(init.)=1.18329 s t(norm)=0.0403025, mflops=124.062 (err=4.7e-08) 36. Temperton (f2c): elapsed time t=1.94992 s, 32768 iters, t-(init.)=1.91659 s t(norm)=0.0652787, mflops=76.5947 (err=3.6e-16) 37. Valkenburg: elapsed time t=1.89992 s, 8192 iters, t-(init.)=1.88326 s t(norm)=0.256574, mflops=19.4876 (err=5.8e-16) Top mflops for N=128 = 404.983 Normalized results and averages for N=128: fft 0: mflops = 111.499 (norm. = 0.275316), norm. avg. (of 7) = 0.257835 fft 1: mflops = 106.125 (norm. = 0.262048), norm. avg. (of 7) = 0.259986 fft 2: mflops = 77.2666 (norm. = 0.190789), norm. avg. (of 7) = 0.162567 fft 3: mflops = 31.9145 (norm. = 0.0788043), norm. avg. (of 7) = 0.0384233 fft 4: mflops = 87.2118 (norm. = 0.215347), norm. avg. (of 7) = 0.129485 fft 5: mflops = 24.7427 (norm. = 0.0610955), norm. avg. (of 7) = 0.044056 fft 6: mflops = 154.533 (norm. = 0.381579), norm. avg. (of 7) = 0.182717 fft 7: mflops = 102.423 (norm. = 0.252907), norm. avg. (of 7) = 0.13542 fft 8: mflops = 64.7676 (norm. = 0.159926), norm. avg. (of 7) = 0.120262 fft 9: mflops = 271.027 (norm. = 0.669231), norm. avg. (of 7) = 0.257325 fft 10: mflops = 303.738 (norm. = 0.75), norm. avg. 
(of 7) = 0.251813 fft 11: mflops = 55.0524 (norm. = 0.135937), norm. avg. (of 6) = 0.105475 fft 12: mflops = 189.428 (norm. = 0.467742), norm. avg. (of 7) = 0.272428 fft 13: mflops = 142.071 (norm. = 0.350806), norm. avg. (of 7) = 0.199276 fft 14: mflops = 400.381 (norm. = 0.988636), norm. avg. (of 7) = 0.648132 fft 15: mflops = 404.983 (norm. = 1), norm. avg. (of 7) = 0.810685 fft 16: mflops = 279.631 (norm. = 0.690476), norm. avg. (of 7) = 0.683078 fft 17: mflops = 329.286 (norm. = 0.813084), norm. avg. (of 5) = 0.429958 fft 18: mflops = 174.424 (norm. = 0.430693), norm. avg. (of 7) = 0.259608 fft 19: mflops = 89.8815 (norm. = 0.221939), norm. avg. (of 7) = 0.101406 fft 20: mflops = 107.419 (norm. = 0.265244), norm. avg. (of 7) = 0.12441 fft 21: mflops = 101.246 (norm. = 0.25), norm. avg. (of 7) = 0.228126 fft 22: mflops = 119.032 (norm. = 0.293919), norm. avg. (of 6) = 0.210367 fft 23: mflops = 151.869 (norm. = 0.375), norm. avg. (of 6) = 0.249007 fft 24: mflops = 151.869 (norm. = 0.375), norm. avg. (of 6) = 0.236236 fft 25: mflops = 104.862 (norm. = 0.258929), norm. avg. (of 6) = 0.115638 fft 26: mflops = 41.1607 (norm. = 0.101636), norm. avg. (of 7) = 0.0551191 fft 27: mflops = 251.668 (norm. = 0.621429), norm. avg. (of 7) = 0.504432 fft 28: mflops = 255.316 (norm. = 0.630435), norm. avg. (of 7) = 0.461316 fft 29: mflops = 58.7226 (norm. = 0.145), norm. avg. (of 6) = 0.0723751 fft 30: mflops = 207.256 (norm. = 0.511765), norm. avg. (of 6) = 0.345322 fft 31: mflops = 142.071 (norm. = 0.350806), norm. avg. (of 7) = 0.176887 fft 32: mflops = 135.514 (norm. = 0.334615), norm. avg. (of 7) = 0.154339 fft 33: mflops = 155.901 (norm. = 0.384956), norm. avg. (of 7) = 0.222199 fft 34: mflops = 60.3314 (norm. = 0.148973), norm. avg. (of 7) = 0.136506 fft 35: mflops = 124.062 (norm. = 0.306338), norm. avg. (of 7) = 0.151323 fft 36: mflops = 76.5947 (norm. = 0.18913), norm. avg. (of 7) = 0.108171 fft 37: mflops = 19.4876 (norm. = 0.0481195), norm. avg. 
(of 7) = 0.0427758 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.63327 s, 16384 iters, t-(init.)=1.59994 s t(norm)=0.0476818, mflops=104.862 (err=9.7e-16) 1. Arndt DIT: elapsed time t=1.68327 s, 16384 iters, t-(init.)=1.64993 s t(norm)=0.0491719, mflops=101.684 (err=9.9e-16) 2. Arndt Split-Radix: elapsed time t=1.11662 s, 8192 iters, t-(init.)=1.09996 s t(norm)=0.0655625, mflops=76.2631 (err=9.8e-16) 3. Arndt 4-step: elapsed time t=1.23328 s, 4096 iters, t-(init.)=1.21662 s t(norm)=0.145032, mflops=34.4751 (err=1.0e-15) 4. Bailey: elapsed time t=1.09996 s, 8192 iters, t-(init.)=1.08329 s t(norm)=0.0645691, mflops=77.4364 (err=9.8e-16) 5. Beauregard: elapsed time t=1.78326 s, 4096 iters, t-(init.)=1.7666 s t(norm)=0.210595, mflops=23.7423 (err=1.1e-15) 6. Bergland: elapsed time t=1.04996 s, 16384 iters, t-(init.)=1.01663 s t(norm)=0.0302978, mflops=165.028 (err=1.0e-15) 7. Brenner: elapsed time t=1.6166 s, 16384 iters, t-(init.)=1.58327 s t(norm)=0.0471851, mflops=105.966 (err=1.1e-15) 8. Burrus: elapsed time t=1.33328 s, 8192 iters, t-(init.)=1.31661 s t(norm)=0.0784763, mflops=63.7135 (err=9.9e-16) 9. CWP (min N) (N=260): elapsed time t=1.33328 s, 32768 iters, t-(init.)=1.26662 s t(norm)=0.018874, mflops=264.914 10. CWP (best N) (N=280): elapsed time t=1.98325 s, 65536 iters, t-(init.)=1.83326 s t(norm)=0.0136589, mflops=366.063 11. Edelblute: elapsed time t=1.53327 s, 8192 iters, t-(init.)=1.51661 s t(norm)=0.0903968, mflops=55.3117 (err=9.9e-16) 12. FFTPACK: elapsed time t=1.6166 s, 32768 iters, t-(init.)=1.54994 s t(norm)=0.0230959, mflops=216.489 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.01663 s, 16384 iters, t-(init.)=0.983294 s t(norm)=0.0293044, mflops=170.623 (err=1.0e-15) FFTW_MEASURE plan: (cost = 2.848193e-05) FFTW_TWIDDLE 8 FFTW_NOTW 32 14. 
FFTW: elapsed time t=1.04996 s, 32768 iters, t-(init.)=0.983294 s t(norm)=0.0146522, mflops=341.245 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.04996 s, 32768 iters, t-(init.)=0.966628 s t(norm)=0.0144039, mflops=347.129 (err=1.1e-15) 16. Frigo-old: elapsed time t=1.19995 s, 32768 iters, t-(init.)=1.13329 s t(norm)=0.0168873, mflops=296.08 (err=1.1e-15) 17. Green: elapsed time t=1.09996 s, 32768 iters, t-(init.)=1.03329 s t(norm)=0.0153973, mflops=324.733 (err=1.1e-15) 18. GSL: elapsed time t=1.91659 s, 32768 iters, t-(init.)=1.84993 s t(norm)=0.027566, mflops=181.383 (err=1.0e-15) 19. GSL DIT: elapsed time t=1.78326 s, 16384 iters, t-(init.)=1.74993 s t(norm)=0.052152, mflops=95.8736 (err=1.0e-15) 20. GSL DIF: elapsed time t=1.48327 s, 16384 iters, t-(init.)=1.44994 s t(norm)=0.0432116, mflops=115.71 (err=1.1e-15) 21. Krukar: elapsed time t=1.68327 s, 16384 iters, t-(init.)=1.64993 s t(norm)=0.0491719, mflops=101.684 (err=1.1e-15) 22. Mayer (Buneman): elapsed time t=1.59994 s, 16384 iters, t-(init.)=1.5666 s t(norm)=0.0466884, mflops=107.093 (err=9.7e-16) 23. Mayer (simple): elapsed time t=1.21662 s, 16384 iters, t-(init.)=1.18329 s t(norm)=0.0352647, mflops=141.785 24. Mayer (lookup): elapsed time t=1.28328 s, 16384 iters, t-(init.)=1.24995 s t(norm)=0.0372514, mflops=134.223 (err=9.4e-16) 25. Monro: elapsed time t=1.53327 s, 16384 iters, t-(init.)=1.49994 s t(norm)=0.0447017, mflops=111.853 (err=8.5e-08) 26. NAPACK (f2c): elapsed time t=1.03329 s, 4096 iters, t-(init.)=1.03329 s t(norm)=0.123178, mflops=40.5917 (err=3.8e-15) 27. Ooura (C): elapsed time t=1.38328 s, 32768 iters, t-(init.)=1.31661 s t(norm)=0.0196191, mflops=254.854 (err=9.9e-16) 28. Ooura (F): elapsed time t=1.39994 s, 32768 iters, t-(init.)=1.33328 s t(norm)=0.0198674, mflops=251.668 (err=9.9e-16) 29. Ransom: elapsed time t=2.01659 s, 16384 iters, t-(init.)=1.98325 s t(norm)=0.0591056, mflops=84.5944 (err=1.9e-15) 30. 
SCIPORT: elapsed time t=1.03329 s, 16384 iters, t-(init.)=0.99996 s t(norm)=0.0298011, mflops=167.779 (err=1.1e-15) 31. Singleton: elapsed time t=1.86659 s, 32768 iters, t-(init.)=1.79993 s t(norm)=0.026821, mflops=186.421 (err=1.7e-15) 32. Singleton (f2c): elapsed time t=1.93326 s, 32768 iters, t-(init.)=1.88326 s t(norm)=0.0280627, mflops=178.172 (err=1.7e-15) 33. Sorensen: elapsed time t=1.11662 s, 16384 iters, t-(init.)=1.09996 s t(norm)=0.0327812, mflops=152.526 (err=9.9e-16) 34. Sorensen DIT: elapsed time t=1.39994 s, 8192 iters, t-(init.)=1.38328 s t(norm)=0.0824498, mflops=60.643 (err=9.9e-16) 35. Temperton: elapsed time t=1.41661 s, 16384 iters, t-(init.)=1.38328 s t(norm)=0.0412249, mflops=121.286 (err=9.5e-08) 36. Temperton (f2c): elapsed time t=1.01663 s, 8192 iters, t-(init.)=0.99996 s t(norm)=0.0596023, mflops=83.8894 (err=1.0e-15) 37. Valkenburg: elapsed time t=1.09996 s, 2048 iters, t-(init.)=1.09996 s t(norm)=0.26225, mflops=19.0658 (err=1.2e-15) Top mflops for N=256 = 366.063 Normalized results and averages for N=256: fft 0: mflops = 104.862 (norm. = 0.286458), norm. avg. (of 8) = 0.261413 fft 1: mflops = 101.684 (norm. = 0.277778), norm. avg. (of 8) = 0.26221 fft 2: mflops = 76.2631 (norm. = 0.208333), norm. avg. (of 8) = 0.168288 fft 3: mflops = 34.4751 (norm. = 0.0941781), norm. avg. (of 8) = 0.0453926 fft 4: mflops = 77.4364 (norm. = 0.211538), norm. avg. (of 8) = 0.139742 fft 5: mflops = 23.7423 (norm. = 0.0648585), norm. avg. (of 8) = 0.0466563 fft 6: mflops = 165.028 (norm. = 0.45082), norm. avg. (of 8) = 0.21623 fft 7: mflops = 105.966 (norm. = 0.289474), norm. avg. (of 8) = 0.154677 fft 8: mflops = 63.7135 (norm. = 0.174051), norm. avg. (of 8) = 0.126985 fft 9: mflops = 264.914 (norm. = 0.723684), norm. avg. (of 8) = 0.31562 fft 10: mflops = 366.063 (norm. = 1), norm. avg. (of 8) = 0.345336 fft 11: mflops = 55.3117 (norm. = 0.151099), norm. avg. (of 7) = 0.111993 fft 12: mflops = 216.489 (norm. = 0.591398), norm. avg. 
(of 8) = 0.312299 fft 13: mflops = 170.623 (norm. = 0.466102), norm. avg. (of 8) = 0.232629 fft 14: mflops = 341.245 (norm. = 0.932203), norm. avg. (of 8) = 0.683641 fft 15: mflops = 347.129 (norm. = 0.948276), norm. avg. (of 8) = 0.827884 fft 16: mflops = 296.08 (norm. = 0.808824), norm. avg. (of 8) = 0.698796 fft 17: mflops = 324.733 (norm. = 0.887097), norm. avg. (of 6) = 0.506148 fft 18: mflops = 181.383 (norm. = 0.495495), norm. avg. (of 8) = 0.289094 fft 19: mflops = 95.8736 (norm. = 0.261905), norm. avg. (of 8) = 0.121468 fft 20: mflops = 115.71 (norm. = 0.316092), norm. avg. (of 8) = 0.14837 fft 21: mflops = 101.684 (norm. = 0.277778), norm. avg. (of 8) = 0.234332 fft 22: mflops = 107.093 (norm. = 0.292553), norm. avg. (of 7) = 0.222108 fft 23: mflops = 141.785 (norm. = 0.387324), norm. avg. (of 7) = 0.268767 fft 24: mflops = 134.223 (norm. = 0.366667), norm. avg. (of 7) = 0.254869 fft 25: mflops = 111.853 (norm. = 0.305556), norm. avg. (of 7) = 0.142769 fft 26: mflops = 40.5917 (norm. = 0.110887), norm. avg. (of 8) = 0.0620901 fft 27: mflops = 254.854 (norm. = 0.696203), norm. avg. (of 8) = 0.528403 fft 28: mflops = 251.668 (norm. = 0.6875), norm. avg. (of 8) = 0.489589 fft 29: mflops = 84.5944 (norm. = 0.231092), norm. avg. (of 7) = 0.095049 fft 30: mflops = 167.779 (norm. = 0.458333), norm. avg. (of 7) = 0.361466 fft 31: mflops = 186.421 (norm. = 0.509259), norm. avg. (of 8) = 0.218433 fft 32: mflops = 178.172 (norm. = 0.486726), norm. avg. (of 8) = 0.195887 fft 33: mflops = 152.526 (norm. = 0.416667), norm. avg. (of 8) = 0.246507 fft 34: mflops = 60.643 (norm. = 0.165663), norm. avg. (of 8) = 0.140151 fft 35: mflops = 121.286 (norm. = 0.331325), norm. avg. (of 8) = 0.173824 fft 36: mflops = 83.8894 (norm. = 0.229167), norm. avg. (of 8) = 0.123295 fft 37: mflops = 19.0658 (norm. = 0.0520833), norm. avg. (of 8) = 0.0439392 Benchmarking for array size = 512 (power of 2): 0. 
Arndt DIF: elapsed time t=1.6666 s, 8192 iters, t-(init.)=1.64993 s t(norm)=0.0437083, mflops=114.395 (err=1.0e-15) 1. Arndt DIT: elapsed time t=1.69993 s, 8192 iters, t-(init.)=1.6666 s t(norm)=0.0441498, mflops=113.251 (err=1.1e-15) 2. Arndt Split-Radix: elapsed time t=1.13329 s, 4096 iters, t-(init.)=1.11662 s t(norm)=0.0591608, mflops=84.5155 (err=1.1e-15) 3. Arndt 4-step: elapsed time t=1.33328 s, 2048 iters, t-(init.)=1.33328 s t(norm)=0.141279, mflops=35.3909 (err=9.9e-16) 4. Bailey: elapsed time t=1.18329 s, 4096 iters, t-(init.)=1.16662 s t(norm)=0.0618098, mflops=80.8934 (err=1.1e-15) 5. Beauregard: elapsed time t=1.98325 s, 2048 iters, t-(init.)=1.96659 s t(norm)=0.208387, mflops=23.9938 (err=1.0e-15) 6. Bergland: elapsed time t=1.01663 s, 8192 iters, t-(init.)=0.983294 s t(norm)=0.0260484, mflops=191.95 (err=1.0e-15) 7. Brenner: elapsed time t=1.96659 s, 8192 iters, t-(init.)=1.93326 s t(norm)=0.0512138, mflops=97.6299 (err=1.0e-15) 8. Burrus: elapsed time t=1.38328 s, 4096 iters, t-(init.)=1.36661 s t(norm)=0.0724057, mflops=69.0553 (err=1.1e-15) 9. CWP (min N) (N=520): elapsed time t=1.19995 s, 16384 iters, t-(init.)=1.13329 s t(norm)=0.0150109, mflops=333.09 10. CWP (best N) (N=560): elapsed time t=1.41661 s, 16384 iters, t-(init.)=1.33328 s t(norm)=0.0176599, mflops=283.127 11. Edelblute: elapsed time t=1.54994 s, 4096 iters, t-(init.)=1.53327 s t(norm)=0.0812357, mflops=61.5493 (err=1.1e-15) 12. FFTPACK: elapsed time t=1.19995 s, 8192 iters, t-(init.)=1.16662 s t(norm)=0.0309049, mflops=161.787 (err=1.0e-15) 13. FFTPACK (f2c): elapsed time t=1.68327 s, 8192 iters, t-(init.)=1.64993 s t(norm)=0.0437083, mflops=114.395 (err=1.0e-15) FFTW_MEASURE plan: (cost = 7.730811e-05) FFTW_TWIDDLE 64 FFTW_NOTW 8 14. FFTW: elapsed time t=1.24995 s, 16384 iters, t-(init.)=1.18329 s t(norm)=0.0156732, mflops=319.016 (err=9.9e-16) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 15. 
FFTW_ESTIMATE: elapsed time t=1.24995 s, 16384 iters, t-(init.)=1.18329 s t(norm)=0.0156732, mflops=319.016 (err=9.6e-16) 16. Frigo-old: elapsed time t=1.63327 s, 16384 iters, t-(init.)=1.5666 s t(norm)=0.0207504, mflops=240.959 (err=9.4e-16) 17. Green: elapsed time t=1.14995 s, 16384 iters, t-(init.)=1.08329 s t(norm)=0.0143487, mflops=348.464 (err=9.6e-16) 18. GSL: elapsed time t=1.36661 s, 8192 iters, t-(init.)=1.33328 s t(norm)=0.0353199, mflops=141.563 (err=1.0e-15) 19. GSL DIT: elapsed time t=1.91659 s, 8192 iters, t-(init.)=1.88326 s t(norm)=0.0498893, mflops=100.222 (err=1.2e-15) 20. GSL DIF: elapsed time t=1.51661 s, 8192 iters, t-(init.)=1.48327 s t(norm)=0.0392933, mflops=127.248 (err=1.1e-15) 21. Krukar: elapsed time t=1.98325 s, 8192 iters, t-(init.)=1.94992 s t(norm)=0.0516553, mflops=96.7955 (err=1.0e-15) 22. Mayer (Buneman): elapsed time t=1.54994 s, 8192 iters, t-(init.)=1.51661 s t(norm)=0.0401763, mflops=124.451 (err=1.0e-15) 23. Mayer (simple): elapsed time t=1.18329 s, 8192 iters, t-(init.)=1.14995 s t(norm)=0.0304634, mflops=164.132 24. Mayer (lookup): elapsed time t=1.34995 s, 8192 iters, t-(init.)=1.31661 s t(norm)=0.0348784, mflops=143.355 (err=1.0e-15) 25. Monro: elapsed time t=1.63327 s, 8192 iters, t-(init.)=1.59994 s t(norm)=0.0423838, mflops=117.97 (err=8.4e-08) 26. NAPACK (f2c): elapsed time t=1.24995 s, 2048 iters, t-(init.)=1.24995 s t(norm)=0.132449, mflops=37.7502 (err=7.1e-15) 27. Ooura (C): elapsed time t=1.64993 s, 16384 iters, t-(init.)=1.58327 s t(norm)=0.0209712, mflops=238.423 (err=9.7e-16) 28. Ooura (F): elapsed time t=1.63327 s, 16384 iters, t-(init.)=1.5666 s t(norm)=0.0207504, mflops=240.959 (err=9.7e-16) 29. Ransom: elapsed time t=1.16662 s, 4096 iters, t-(init.)=1.14995 s t(norm)=0.0609268, mflops=82.0658 (err=1.4e-15) 30. SCIPORT: elapsed time t=1.54994 s, 8192 iters, t-(init.)=1.51661 s t(norm)=0.0401763, mflops=124.451 (err=1.0e-15) 31. 
Singleton: elapsed time t=2.01659 s, 16384 iters, t-(init.)=1.94992 s t(norm)=0.0258276, mflops=193.591 (err=1.2e-15) 32. Singleton (f2c): elapsed time t=1.01663 s, 8192 iters, t-(init.)=0.983294 s t(norm)=0.0260484, mflops=191.95 (err=1.2e-15) 33. Sorensen: elapsed time t=1.24995 s, 8192 iters, t-(init.)=1.21662 s t(norm)=0.0322294, mflops=155.138 (err=1.0e-15) 34. Sorensen DIT: elapsed time t=1.48327 s, 4096 iters, t-(init.)=1.46661 s t(norm)=0.0777037, mflops=64.347 (err=1.1e-15) 35. Temperton: elapsed time t=1.86659 s, 8192 iters, t-(init.)=1.83326 s t(norm)=0.0485648, mflops=102.955 (err=1.0e-07) 36. Temperton (f2c): elapsed time t=1.29995 s, 4096 iters, t-(init.)=1.28328 s t(norm)=0.0679907, mflops=73.5394 (err=1.0e-15) 37. Valkenburg: elapsed time t=1.36661 s, 1024 iters, t-(init.)=1.36661 s t(norm)=0.289623, mflops=17.2638 (err=1.3e-15) Top mflops for N=512 = 348.464 Normalized results and averages for N=512: fft 0: mflops = 114.395 (norm. = 0.328283), norm. avg. (of 9) = 0.268843 fft 1: mflops = 113.251 (norm. = 0.325), norm. avg. (of 9) = 0.269187 fft 2: mflops = 84.5155 (norm. = 0.242537), norm. avg. (of 9) = 0.176538 fft 3: mflops = 35.3909 (norm. = 0.101562), norm. avg. (of 9) = 0.0516337 fft 4: mflops = 80.8934 (norm. = 0.232143), norm. avg. (of 9) = 0.150009 fft 5: mflops = 23.9938 (norm. = 0.0688559), norm. avg. (of 9) = 0.0491229 fft 6: mflops = 191.95 (norm. = 0.550847), norm. avg. (of 9) = 0.25341 fft 7: mflops = 97.6299 (norm. = 0.280172), norm. avg. (of 9) = 0.168621 fft 8: mflops = 69.0553 (norm. = 0.198171), norm. avg. (of 9) = 0.134895 fft 9: mflops = 333.09 (norm. = 0.955882), norm. avg. (of 9) = 0.38676 fft 10: mflops = 283.127 (norm. = 0.8125), norm. avg. (of 9) = 0.397243 fft 11: mflops = 61.5493 (norm. = 0.17663), norm. avg. (of 8) = 0.120072 fft 12: mflops = 161.787 (norm. = 0.464286), norm. avg. (of 9) = 0.329186 fft 13: mflops = 114.395 (norm. = 0.328283), norm. avg. (of 9) = 0.243258 fft 14: mflops = 319.016 (norm. 
= 0.915493), norm. avg. (of 9) = 0.709402 fft 15: mflops = 319.016 (norm. = 0.915493), norm. avg. (of 9) = 0.837618 fft 16: mflops = 240.959 (norm. = 0.691489), norm. avg. (of 9) = 0.697984 fft 17: mflops = 348.464 (norm. = 1), norm. avg. (of 7) = 0.576698 fft 18: mflops = 141.563 (norm. = 0.40625), norm. avg. (of 9) = 0.302111 fft 19: mflops = 100.222 (norm. = 0.287611), norm. avg. (of 9) = 0.139929 fft 20: mflops = 127.248 (norm. = 0.365169), norm. avg. (of 9) = 0.172459 fft 21: mflops = 96.7955 (norm. = 0.277778), norm. avg. (of 9) = 0.239159 fft 22: mflops = 124.451 (norm. = 0.357143), norm. avg. (of 8) = 0.238988 fft 23: mflops = 164.132 (norm. = 0.471014), norm. avg. (of 8) = 0.294048 fft 24: mflops = 143.355 (norm. = 0.411392), norm. avg. (of 8) = 0.274434 fft 25: mflops = 117.97 (norm. = 0.338542), norm. avg. (of 8) = 0.167241 fft 26: mflops = 37.7502 (norm. = 0.108333), norm. avg. (of 9) = 0.0672282 fft 27: mflops = 238.423 (norm. = 0.684211), norm. avg. (of 9) = 0.545715 fft 28: mflops = 240.959 (norm. = 0.691489), norm. avg. (of 9) = 0.512023 fft 29: mflops = 82.0658 (norm. = 0.235507), norm. avg. (of 8) = 0.112606 fft 30: mflops = 124.451 (norm. = 0.357143), norm. avg. (of 8) = 0.360926 fft 31: mflops = 193.591 (norm. = 0.555556), norm. avg. (of 9) = 0.255891 fft 32: mflops = 191.95 (norm. = 0.550847), norm. avg. (of 9) = 0.235327 fft 33: mflops = 155.138 (norm. = 0.445205), norm. avg. (of 9) = 0.268585 fft 34: mflops = 64.347 (norm. = 0.184659), norm. avg. (of 9) = 0.145096 fft 35: mflops = 102.955 (norm. = 0.295455), norm. avg. (of 9) = 0.187338 fft 36: mflops = 73.5394 (norm. = 0.211039), norm. avg. (of 9) = 0.133044 fft 37: mflops = 17.2638 (norm. = 0.0495427), norm. avg. (of 9) = 0.0445618 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.29995 s, 2048 iters, t-(init.)=1.28328 s t(norm)=0.0611917, mflops=81.7105 (err=1.8e-15) 1. 
Arndt DIT: elapsed time t=1.23328 s, 2048 iters, t-(init.)=1.21662 s t(norm)=0.0580129, mflops=86.1878 (err=1.8e-15) 2. Arndt Split-Radix: elapsed time t=1.99992 s, 2048 iters, t-(init.)=1.98325 s t(norm)=0.0945689, mflops=52.8715 (err=1.8e-15) 3. Arndt 4-step: elapsed time t=1.43328 s, 1024 iters, t-(init.)=1.43328 s t(norm)=0.136688, mflops=36.5797 (err=1.8e-15) 4. Bailey: elapsed time t=1.91659 s, 2048 iters, t-(init.)=1.89992 s t(norm)=0.0905954, mflops=55.1904 (err=1.9e-15) 5. Beauregard: elapsed time t=1.16662 s, 512 iters, t-(init.)=1.16662 s t(norm)=0.222515, mflops=22.4704 (err=2.0e-15) 6. Bergland: elapsed time t=1.69993 s, 4096 iters, t-(init.)=1.6666 s t(norm)=0.0397348, mflops=125.834 (err=2.2e-15) 7. Brenner: elapsed time t=1.31661 s, 2048 iters, t-(init.)=1.29995 s t(norm)=0.0619864, mflops=80.6629 (err=2.0e-15) 8. Burrus: elapsed time t=1.01663 s, 1024 iters, t-(init.)=0.99996 s t(norm)=0.0953636, mflops=52.4309 (err=1.8e-15) 9. CWP (min N) (N=1040): elapsed time t=1.69993 s, 8192 iters, t-(init.)=1.63327 s t(norm)=0.0194701, mflops=256.804 10. CWP (best N) (N=1040): elapsed time t=1.69993 s, 8192 iters, t-(init.)=1.63327 s t(norm)=0.0194701, mflops=256.804 11. Edelblute: elapsed time t=1.18329 s, 1024 iters, t-(init.)=1.18329 s t(norm)=0.112847, mflops=44.3078 (err=1.8e-15) 12. FFTPACK: elapsed time t=1.23328 s, 4096 iters, t-(init.)=1.19995 s t(norm)=0.0286091, mflops=174.77 (err=1.9e-15) 13. FFTPACK (f2c): elapsed time t=1.73326 s, 4096 iters, t-(init.)=1.69993 s t(norm)=0.0405295, mflops=123.367 (err=1.9e-15) FFTW_MEASURE plan: (cost = 1.790293e-04) FFTW_TWIDDLE 32 FFTW_NOTW 32 14. FFTW: elapsed time t=1.28328 s, 8192 iters, t-(init.)=1.23328 s t(norm)=0.0147019, mflops=340.092 (err=2.0e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.53327 s, 8192 iters, t-(init.)=1.46661 s t(norm)=0.0174833, mflops=285.987 (err=2.0e-15) 16. 
Frigo-old: elapsed time t=1.13329 s, 4096 iters, t-(init.)=1.09996 s t(norm)=0.026225, mflops=190.658 (err=1.9e-15) 17. Green: elapsed time t=1.79993 s, 8192 iters, t-(init.)=1.73326 s t(norm)=0.0206621, mflops=241.989 (err=2.0e-15) 18. GSL: elapsed time t=1.64993 s, 4096 iters, t-(init.)=1.6166 s t(norm)=0.0385428, mflops=129.726 (err=1.9e-15) 19. GSL DIT: elapsed time t=1.68327 s, 2048 iters, t-(init.)=1.6666 s t(norm)=0.0794697, mflops=62.9171 (err=2.1e-15) 20. GSL DIF: elapsed time t=1.63327 s, 2048 iters, t-(init.)=1.6166 s t(norm)=0.0770856, mflops=64.863 (err=2.2e-15) 21. Krukar: elapsed time t=1.38328 s, 2048 iters, t-(init.)=1.36661 s t(norm)=0.0651651, mflops=76.7281 (err=1.9e-15) 22. Mayer (Buneman): elapsed time t=1.13329 s, 2048 iters, t-(init.)=1.11662 s t(norm)=0.0532447, mflops=93.9061 (err=1.8e-15) 23. Mayer (simple): elapsed time t=1.89992 s, 4096 iters, t-(init.)=1.86659 s t(norm)=0.044503, mflops=112.352 24. Mayer (lookup): elapsed time t=1.03329 s, 2048 iters, t-(init.)=1.01663 s t(norm)=0.0484765, mflops=103.143 (err=1.8e-15) 25. Monro: elapsed time t=1.26662 s, 2048 iters, t-(init.)=1.24995 s t(norm)=0.0596023, mflops=83.8894 (err=1.0e-07) 26. NAPACK (f2c): elapsed time t=1.58327 s, 1024 iters, t-(init.)=1.58327 s t(norm)=0.150992, mflops=33.1143 (err=1.7e-14) 27. Ooura (C): elapsed time t=1.23328 s, 4096 iters, t-(init.)=1.19995 s t(norm)=0.0286091, mflops=174.77 (err=2.2e-15) 28. Ooura (F): elapsed time t=1.31661 s, 4096 iters, t-(init.)=1.28328 s t(norm)=0.0305958, mflops=163.421 (err=2.2e-15) 29. Ransom: elapsed time t=1.31661 s, 2048 iters, t-(init.)=1.29995 s t(norm)=0.0619864, mflops=80.6629 (err=2.3e-15) 30. SCIPORT: elapsed time t=1.86659 s, 4096 iters, t-(init.)=1.84993 s t(norm)=0.0441057, mflops=113.364 (err=2.0e-15) 31. Singleton: elapsed time t=1.54994 s, 4096 iters, t-(init.)=1.51661 s t(norm)=0.0361587, mflops=138.279 (err=2.8e-15) 32. 
Singleton (f2c): elapsed time t=1.78326 s, 4096 iters, t-(init.)=1.74993 s t(norm)=0.0417216, mflops=119.842 (err=2.8e-15) 33. Sorensen: elapsed time t=1.14995 s, 2048 iters, t-(init.)=1.13329 s t(norm)=0.0540394, mflops=92.5251 (err=1.8e-15) 34. Sorensen DIT: elapsed time t=1.31661 s, 1024 iters, t-(init.)=1.29995 s t(norm)=0.123973, mflops=40.3315 (err=1.9e-15) 35. Temperton: elapsed time t=1.09996 s, 2048 iters, t-(init.)=1.08329 s t(norm)=0.0516553, mflops=96.7955 (err=1.1e-07) 36. Temperton (f2c): elapsed time t=1.41661 s, 2048 iters, t-(init.)=1.39994 s t(norm)=0.0667545, mflops=74.9013 (err=1.9e-15) 37. Valkenburg: elapsed time t=1.7166 s, 512 iters, t-(init.)=1.7166 s t(norm)=0.327415, mflops=15.2711 (err=2.4e-15) Top mflops for N=1024 = 340.092 Normalized results and averages for N=1024: fft 0: mflops = 81.7105 (norm. = 0.24026), norm. avg. (of 10) = 0.265985 fft 1: mflops = 86.1878 (norm. = 0.253425), norm. avg. (of 10) = 0.26761 fft 2: mflops = 52.8715 (norm. = 0.155462), norm. avg. (of 10) = 0.17443 fft 3: mflops = 36.5797 (norm. = 0.107558), norm. avg. (of 10) = 0.0572262 fft 4: mflops = 55.1904 (norm. = 0.162281), norm. avg. (of 10) = 0.151236 fft 5: mflops = 22.4704 (norm. = 0.0660714), norm. avg. (of 10) = 0.0508178 fft 6: mflops = 125.834 (norm. = 0.37), norm. avg. (of 10) = 0.265069 fft 7: mflops = 80.6629 (norm. = 0.237179), norm. avg. (of 10) = 0.175477 fft 8: mflops = 52.4309 (norm. = 0.154167), norm. avg. (of 10) = 0.136822 fft 9: mflops = 256.804 (norm. = 0.755102), norm. avg. (of 10) = 0.423594 fft 10: mflops = 256.804 (norm. = 0.755102), norm. avg. (of 10) = 0.433029 fft 11: mflops = 44.3078 (norm. = 0.130282), norm. avg. (of 9) = 0.121207 fft 12: mflops = 174.77 (norm. = 0.513889), norm. avg. (of 10) = 0.347657 fft 13: mflops = 123.367 (norm. = 0.362745), norm. avg. (of 10) = 0.255206 fft 14: mflops = 340.092 (norm. = 1), norm. avg. (of 10) = 0.738462 fft 15: mflops = 285.987 (norm. = 0.840909), norm. avg. 
(of 10) = 0.837947 fft 16: mflops = 190.658 (norm. = 0.560606), norm. avg. (of 10) = 0.684247 fft 17: mflops = 241.989 (norm. = 0.711538), norm. avg. (of 8) = 0.593553 fft 18: mflops = 129.726 (norm. = 0.381443), norm. avg. (of 10) = 0.310045 fft 19: mflops = 62.9171 (norm. = 0.185), norm. avg. (of 10) = 0.144436 fft 20: mflops = 64.863 (norm. = 0.190722), norm. avg. (of 10) = 0.174285 fft 21: mflops = 76.7281 (norm. = 0.22561), norm. avg. (of 10) = 0.237804 fft 22: mflops = 93.9061 (norm. = 0.276119), norm. avg. (of 9) = 0.243113 fft 23: mflops = 112.352 (norm. = 0.330357), norm. avg. (of 9) = 0.298082 fft 24: mflops = 103.143 (norm. = 0.303279), norm. avg. (of 9) = 0.277639 fft 25: mflops = 83.8894 (norm. = 0.246667), norm. avg. (of 9) = 0.176066 fft 26: mflops = 33.1143 (norm. = 0.0973684), norm. avg. (of 10) = 0.0702422 fft 27: mflops = 174.77 (norm. = 0.513889), norm. avg. (of 10) = 0.542533 fft 28: mflops = 163.421 (norm. = 0.480519), norm. avg. (of 10) = 0.508872 fft 29: mflops = 80.6629 (norm. = 0.237179), norm. avg. (of 9) = 0.126448 fft 30: mflops = 113.364 (norm. = 0.333333), norm. avg. (of 9) = 0.35786 fft 31: mflops = 138.279 (norm. = 0.406593), norm. avg. (of 10) = 0.270961 fft 32: mflops = 119.842 (norm. = 0.352381), norm. avg. (of 10) = 0.247033 fft 33: mflops = 92.5251 (norm. = 0.272059), norm. avg. (of 10) = 0.268932 fft 34: mflops = 40.3315 (norm. = 0.11859), norm. avg. (of 10) = 0.142446 fft 35: mflops = 96.7955 (norm. = 0.284615), norm. avg. (of 10) = 0.197066 fft 36: mflops = 74.9013 (norm. = 0.220238), norm. avg. (of 10) = 0.141764 fft 37: mflops = 15.2711 (norm. = 0.0449029), norm. avg. (of 10) = 0.0445959 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.34995 s, 1024 iters, t-(init.)=1.33328 s t(norm)=0.0577961, mflops=86.511 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.29995 s, 1024 iters, t-(init.)=1.28328 s t(norm)=0.0556288, mflops=89.8815 (err=1.4e-15) 2. 
Arndt Split-Radix: elapsed time t=1.04996 s, 512 iters, t-(init.)=1.03329 s t(norm)=0.089584, mflops=55.8135 (err=1.5e-15) 3. Arndt 4-step: elapsed time t=1.63327 s, 512 iters, t-(init.)=1.63327 s t(norm)=0.141601, mflops=35.3106 (err=1.4e-15) 4. Bailey: elapsed time t=1.94992 s, 1024 iters, t-(init.)=1.93326 s t(norm)=0.0838044, mflops=59.6627 (err=1.4e-15) 5. Beauregard: elapsed time t=1.29995 s, 256 iters, t-(init.)=1.29995 s t(norm)=0.225405, mflops=22.1823 (err=1.4e-15) 6. Bergland: elapsed time t=1.6666 s, 2048 iters, t-(init.)=1.64993 s t(norm)=0.0357614, mflops=139.816 (err=1.5e-15) 7. Brenner: elapsed time t=1.46661 s, 1024 iters, t-(init.)=1.44994 s t(norm)=0.0628533, mflops=79.5503 (err=1.4e-15) 8. Burrus: elapsed time t=1.13329 s, 512 iters, t-(init.)=1.11662 s t(norm)=0.0968085, mflops=51.6483 (err=1.4e-15) 9. CWP (min N) (N=2145): elapsed time t=1.09996 s, 2048 iters, t-(init.)=1.06662 s t(norm)=0.0231185, mflops=216.277 10. CWP (best N) (N=2184): elapsed time t=1.81659 s, 4096 iters, t-(init.)=1.74993 s t(norm)=0.0189644, mflops=263.653 11. Edelblute: elapsed time t=1.24995 s, 512 iters, t-(init.)=1.24995 s t(norm)=0.108368, mflops=46.1392 (err=1.4e-15) 12. FFTPACK: elapsed time t=1.48327 s, 2048 iters, t-(init.)=1.44994 s t(norm)=0.0314266, mflops=159.101 (err=1.4e-15) 13. FFTPACK (f2c): elapsed time t=1.01663 s, 1024 iters, t-(init.)=0.99996 s t(norm)=0.0433471, mflops=115.348 (err=1.4e-15) FFTW_MEASURE plan: (cost = 3.417832e-04) FFTW_TWIDDLE 64 FFTW_NOTW 32 14. FFTW: elapsed time t=1.34995 s, 4096 iters, t-(init.)=1.29995 s t(norm)=0.0140878, mflops=354.917 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.58327 s, 4096 iters, t-(init.)=1.51661 s t(norm)=0.0164358, mflops=304.214 (err=1.4e-15) 16. Frigo-old: elapsed time t=1.23328 s, 2048 iters, t-(init.)=1.19995 s t(norm)=0.0260083, mflops=192.247 (err=1.3e-15) 17. 
Green: elapsed time t=1.01663 s, 2048 iters, t-(init.)=0.983294 s t(norm)=0.0213123, mflops=234.606 (err=1.4e-15) 18. GSL: elapsed time t=1.84993 s, 2048 iters, t-(init.)=1.81659 s t(norm)=0.0393736, mflops=126.989 (err=1.4e-15) 19. GSL DIT: elapsed time t=1.81659 s, 1024 iters, t-(init.)=1.79993 s t(norm)=0.0780248, mflops=64.0822 (err=2.0e-15) 20. GSL DIF: elapsed time t=1.78326 s, 1024 iters, t-(init.)=1.7666 s t(norm)=0.0765799, mflops=65.2913 (err=2.3e-15) 21. Krukar: elapsed time t=1.63327 s, 1024 iters, t-(init.)=1.6166 s t(norm)=0.0700778, mflops=71.3493 (err=1.4e-15) 22. Mayer (Buneman): elapsed time t=1.53327 s, 1024 iters, t-(init.)=1.53327 s t(norm)=0.0664656, mflops=75.2269 (err=1.4e-15) 23. Mayer (simple): elapsed time t=1.39994 s, 1024 iters, t-(init.)=1.39994 s t(norm)=0.0606859, mflops=82.3914 24. Mayer (lookup): elapsed time t=1.43328 s, 1024 iters, t-(init.)=1.41661 s t(norm)=0.0614084, mflops=81.4221 (err=1.4e-15) 25. Monro: elapsed time t=1.58327 s, 1024 iters, t-(init.)=1.5666 s t(norm)=0.0679105, mflops=73.6264 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.79993 s, 512 iters, t-(init.)=1.78326 s t(norm)=0.154605, mflops=32.3406 (err=1.5e-14) 27. Ooura (C): elapsed time t=1.53327 s, 2048 iters, t-(init.)=1.49994 s t(norm)=0.0325103, mflops=153.797 (err=1.4e-15) 28. Ooura (F): elapsed time t=1.59994 s, 2048 iters, t-(init.)=1.5666 s t(norm)=0.0339552, mflops=147.253 (err=1.4e-15) 29. Ransom: elapsed time t=1.68327 s, 1024 iters, t-(init.)=1.6666 s t(norm)=0.0722452, mflops=69.2088 (err=2.1e-15) 30. SCIPORT: elapsed time t=1.04996 s, 1024 iters, t-(init.)=1.03329 s t(norm)=0.044792, mflops=111.627 (err=1.4e-15) 31. Singleton: elapsed time t=1.04996 s, 1024 iters, t-(init.)=1.03329 s t(norm)=0.044792, mflops=111.627 (err=1.9e-15) 32. Singleton (f2c): elapsed time t=1.23328 s, 1024 iters, t-(init.)=1.21662 s t(norm)=0.052739, mflops=94.8066 (err=1.9e-15) 33. 
Sorensen: elapsed time t=1.28328 s, 1024 iters, t-(init.)=1.26662 s t(norm)=0.0549063, mflops=91.0642 (err=1.4e-15) 34. Sorensen DIT: elapsed time t=1.46661 s, 512 iters, t-(init.)=1.44994 s t(norm)=0.125707, mflops=39.7752 (err=1.4e-15) 35. Temperton: elapsed time t=1.28328 s, 1024 iters, t-(init.)=1.26662 s t(norm)=0.0549063, mflops=91.0642 (err=1.1e-07) 36. Temperton (f2c): elapsed time t=1.69993 s, 1024 iters, t-(init.)=1.68327 s t(norm)=0.0729676, mflops=68.5235 (err=1.4e-15) 37. Valkenburg: elapsed time t=1.99992 s, 256 iters, t-(init.)=1.99992 s t(norm)=0.346777, mflops=14.4185 (err=1.7e-15) Top mflops for N=2048 = 354.917 Normalized results and averages for N=2048: fft 0: mflops = 86.511 (norm. = 0.24375), norm. avg. (of 11) = 0.263963 fft 1: mflops = 89.8815 (norm. = 0.253247), norm. avg. (of 11) = 0.266305 fft 2: mflops = 55.8135 (norm. = 0.157258), norm. avg. (of 11) = 0.172869 fft 3: mflops = 35.3106 (norm. = 0.0994898), norm. avg. (of 11) = 0.0610683 fft 4: mflops = 59.6627 (norm. = 0.168103), norm. avg. (of 11) = 0.152769 fft 5: mflops = 22.1823 (norm. = 0.0625), norm. avg. (of 11) = 0.0518798 fft 6: mflops = 139.816 (norm. = 0.393939), norm. avg. (of 11) = 0.276784 fft 7: mflops = 79.5503 (norm. = 0.224138), norm. avg. (of 11) = 0.1799 fft 8: mflops = 51.6483 (norm. = 0.145522), norm. avg. (of 11) = 0.137613 fft 9: mflops = 216.277 (norm. = 0.609375), norm. avg. (of 11) = 0.440484 fft 10: mflops = 263.653 (norm. = 0.742857), norm. avg. (of 11) = 0.461195 fft 11: mflops = 46.1392 (norm. = 0.13), norm. avg. (of 10) = 0.122086 fft 12: mflops = 159.101 (norm. = 0.448276), norm. avg. (of 11) = 0.356804 fft 13: mflops = 115.348 (norm. = 0.325), norm. avg. (of 11) = 0.261551 fft 14: mflops = 354.917 (norm. = 1), norm. avg. (of 11) = 0.762238 fft 15: mflops = 304.214 (norm. = 0.857143), norm. avg. (of 11) = 0.839693 fft 16: mflops = 192.247 (norm. = 0.541667), norm. avg. (of 11) = 0.671285 fft 17: mflops = 234.606 (norm. = 0.661017), norm. avg. 
(of 9) = 0.601049 fft 18: mflops = 126.989 (norm. = 0.357798), norm. avg. (of 11) = 0.314386 fft 19: mflops = 64.0822 (norm. = 0.180556), norm. avg. (of 11) = 0.147719 fft 20: mflops = 65.2913 (norm. = 0.183962), norm. avg. (of 11) = 0.175165 fft 21: mflops = 71.3493 (norm. = 0.201031), norm. avg. (of 11) = 0.234461 fft 22: mflops = 75.2269 (norm. = 0.211957), norm. avg. (of 10) = 0.239998 fft 23: mflops = 82.3914 (norm. = 0.232143), norm. avg. (of 10) = 0.291488 fft 24: mflops = 81.4221 (norm. = 0.229412), norm. avg. (of 10) = 0.272816 fft 25: mflops = 73.6264 (norm. = 0.207447), norm. avg. (of 10) = 0.179204 fft 26: mflops = 32.3406 (norm. = 0.0911215), norm. avg. (of 11) = 0.0721403 fft 27: mflops = 153.797 (norm. = 0.433333), norm. avg. (of 11) = 0.532605 fft 28: mflops = 147.253 (norm. = 0.414894), norm. avg. (of 11) = 0.500329 fft 29: mflops = 69.2088 (norm. = 0.195), norm. avg. (of 10) = 0.133303 fft 30: mflops = 111.627 (norm. = 0.314516), norm. avg. (of 10) = 0.353526 fft 31: mflops = 111.627 (norm. = 0.314516), norm. avg. (of 11) = 0.274921 fft 32: mflops = 94.8066 (norm. = 0.267123), norm. avg. (of 11) = 0.248859 fft 33: mflops = 91.0642 (norm. = 0.256579), norm. avg. (of 11) = 0.267809 fft 34: mflops = 39.7752 (norm. = 0.112069), norm. avg. (of 11) = 0.139684 fft 35: mflops = 91.0642 (norm. = 0.256579), norm. avg. (of 11) = 0.202476 fft 36: mflops = 68.5235 (norm. = 0.193069), norm. avg. (of 11) = 0.146428 fft 37: mflops = 14.4185 (norm. = 0.040625), norm. avg. (of 11) = 0.0442349 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.5666 s, 512 iters, t-(init.)=1.54994 s t(norm)=0.061589, mflops=81.1833 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.49994 s, 512 iters, t-(init.)=1.48327 s t(norm)=0.05894, mflops=84.832 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.09996 s, 256 iters, t-(init.)=1.08329 s t(norm)=0.0860922, mflops=58.0773 (err=3.7e-15) 3. 
Arndt 4-step: elapsed time t=1.53327 s, 256 iters, t-(init.)=1.51661 s t(norm)=0.120529, mflops=41.4838 (err=3.7e-15) 4. Bailey: elapsed time t=1.13329 s, 256 iters, t-(init.)=1.11662 s t(norm)=0.0887411, mflops=56.3437 (err=3.7e-15) 5. Beauregard: elapsed time t=1.43328 s, 128 iters, t-(init.)=1.43328 s t(norm)=0.227813, mflops=21.9478 (err=3.8e-15) 6. Bergland: elapsed time t=1.68327 s, 1024 iters, t-(init.)=1.64993 s t(norm)=0.0327812, mflops=152.526 (err=3.9e-15) 7. Brenner: elapsed time t=1.44994 s, 512 iters, t-(init.)=1.43328 s t(norm)=0.0569533, mflops=87.7913 (err=3.8e-15) 8. Burrus: elapsed time t=1.21662 s, 256 iters, t-(init.)=1.21662 s t(norm)=0.0966881, mflops=51.7127 (err=3.7e-15) 9. CWP (min N) (N=4290): elapsed time t=1.23328 s, 1024 iters, t-(init.)=1.19995 s t(norm)=0.0238409, mflops=209.724 10. CWP (best N) (N=4368): elapsed time t=1.08329 s, 1024 iters, t-(init.)=1.04996 s t(norm)=0.0208608, mflops=239.684 11. Edelblute: elapsed time t=1.31661 s, 256 iters, t-(init.)=1.31661 s t(norm)=0.104635, mflops=47.7851 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.48327 s, 1024 iters, t-(init.)=1.44994 s t(norm)=0.0288078, mflops=173.564 (err=3.8e-15) 13. FFTPACK (f2c): elapsed time t=1.98325 s, 1024 iters, t-(init.)=1.94992 s t(norm)=0.0387415, mflops=129.061 (err=3.8e-15) FFTW_MEASURE plan: (cost = 9.765234e-04) FFTW_TWIDDLE 64 FFTW_NOTW 64 14. FFTW: elapsed time t=1.7166 s, 2048 iters, t-(init.)=1.6666 s t(norm)=0.0165562, mflops=302.002 (err=3.8e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.04996 s, 1024 iters, t-(init.)=1.01663 s t(norm)=0.0201985, mflops=247.543 (err=3.8e-15) 16. Frigo-old: elapsed time t=1.63327 s, 1024 iters, t-(init.)=1.6166 s t(norm)=0.032119, mflops=155.671 (err=3.8e-15) 17. Green: elapsed time t=1.04996 s, 1024 iters, t-(init.)=1.01663 s t(norm)=0.0201985, mflops=247.543 (err=3.8e-15) 18. 
GSL: elapsed time t=1.81659 s, 1024 iters, t-(init.)=1.78326 s t(norm)=0.0354302, mflops=141.122 (err=3.8e-15) 19. GSL DIT: elapsed time t=1.91659 s, 512 iters, t-(init.)=1.89992 s t(norm)=0.0754962, mflops=66.2285 (err=4.1e-15) 20. GSL DIF: elapsed time t=1.89992 s, 512 iters, t-(init.)=1.88326 s t(norm)=0.0748339, mflops=66.8146 (err=4.3e-15) 21. Krukar: elapsed time t=1.84993 s, 512 iters, t-(init.)=1.83326 s t(norm)=0.0728472, mflops=68.6368 (err=3.8e-15) 22. Mayer (Buneman): elapsed time t=1.73326 s, 512 iters, t-(init.)=1.7166 s t(norm)=0.0682115, mflops=73.3014 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.59994 s, 512 iters, t-(init.)=1.58327 s t(norm)=0.0629135, mflops=79.4742 24. Mayer (lookup): elapsed time t=1.59994 s, 512 iters, t-(init.)=1.58327 s t(norm)=0.0629135, mflops=79.4742 (err=3.7e-15) 25. Monro: elapsed time t=1.7166 s, 512 iters, t-(init.)=1.69993 s t(norm)=0.0675492, mflops=74.0201 (err=1.1e-07) 26. NAPACK (f2c): elapsed time t=1.93326 s, 256 iters, t-(init.)=1.91659 s t(norm)=0.152317, mflops=32.8263 (err=4.9e-14) 27. Ooura (C): elapsed time t=1.48327 s, 1024 iters, t-(init.)=1.44994 s t(norm)=0.0288078, mflops=173.564 (err=3.9e-15) 28. Ooura (F): elapsed time t=1.58327 s, 1024 iters, t-(init.)=1.54994 s t(norm)=0.0307945, mflops=162.367 (err=3.9e-15) 29. Ransom: elapsed time t=1.43328 s, 512 iters, t-(init.)=1.41661 s t(norm)=0.056291, mflops=88.8241 (err=4.4e-15) 30. SCIPORT: elapsed time t=1.18329 s, 256 iters, t-(init.)=1.16662 s t(norm)=0.0927146, mflops=53.9289 (err=3.8e-15) 31. Singleton: elapsed time t=1.86659 s, 1024 iters, t-(init.)=1.83326 s t(norm)=0.0364236, mflops=137.274 (err=5.8e-15) 32. Singleton (f2c): elapsed time t=1.18329 s, 512 iters, t-(init.)=1.16662 s t(norm)=0.0463573, mflops=107.858 (err=5.8e-15) 33. Sorensen: elapsed time t=1.41661 s, 512 iters, t-(init.)=1.39994 s t(norm)=0.0556288, mflops=89.8815 (err=3.7e-15) 34. 
Sorensen DIT: elapsed time t=1.5666 s, 256 iters, t-(init.)=1.5666 s t(norm)=0.124502, mflops=40.1598 (err=3.7e-15) 35. Temperton: elapsed time t=1.31661 s, 512 iters, t-(init.)=1.29995 s t(norm)=0.0516553, mflops=96.7955 (err=1.2e-07) 36. Temperton (f2c): elapsed time t=1.6666 s, 512 iters, t-(init.)=1.64993 s t(norm)=0.0655625, mflops=76.2631 (err=3.8e-15) 37. Valkenburg: elapsed time t=1.14995 s, 64 iters, t-(init.)=1.14995 s t(norm)=0.365561, mflops=13.6776 (err=4.0e-15) Top mflops for N=4096 = 302.002 Normalized results and averages for N=4096: fft 0: mflops = 81.1833 (norm. = 0.268817), norm. avg. (of 12) = 0.264368 fft 1: mflops = 84.832 (norm. = 0.280899), norm. avg. (of 12) = 0.267521 fft 2: mflops = 58.0773 (norm. = 0.192308), norm. avg. (of 12) = 0.174489 fft 3: mflops = 41.4838 (norm. = 0.137363), norm. avg. (of 12) = 0.0674262 fft 4: mflops = 56.3437 (norm. = 0.186567), norm. avg. (of 12) = 0.155586 fft 5: mflops = 21.9478 (norm. = 0.0726744), norm. avg. (of 12) = 0.0536127 fft 6: mflops = 152.526 (norm. = 0.505051), norm. avg. (of 12) = 0.295807 fft 7: mflops = 87.7913 (norm. = 0.290698), norm. avg. (of 12) = 0.189133 fft 8: mflops = 51.7127 (norm. = 0.171233), norm. avg. (of 12) = 0.140415 fft 9: mflops = 209.724 (norm. = 0.694444), norm. avg. (of 12) = 0.461647 fft 10: mflops = 239.684 (norm. = 0.793651), norm. avg. (of 12) = 0.4889 fft 11: mflops = 47.7851 (norm. = 0.158228), norm. avg. (of 11) = 0.125372 fft 12: mflops = 173.564 (norm. = 0.574713), norm. avg. (of 12) = 0.374963 fft 13: mflops = 129.061 (norm. = 0.42735), norm. avg. (of 12) = 0.275368 fft 14: mflops = 302.002 (norm. = 1), norm. avg. (of 12) = 0.782052 fft 15: mflops = 247.543 (norm. = 0.819672), norm. avg. (of 12) = 0.838024 fft 16: mflops = 155.671 (norm. = 0.515464), norm. avg. (of 12) = 0.6583 fft 17: mflops = 247.543 (norm. = 0.819672), norm. avg. (of 10) = 0.622912 fft 18: mflops = 141.122 (norm. = 0.46729), norm. avg. (of 12) = 0.327128 fft 19: mflops = 66.2285 (norm. 
= 0.219298), norm. avg. (of 12) = 0.153684 fft 20: mflops = 66.8146 (norm. = 0.221239), norm. avg. (of 12) = 0.179005 fft 21: mflops = 68.6368 (norm. = 0.227273), norm. avg. (of 12) = 0.233862 fft 22: mflops = 73.3014 (norm. = 0.242718), norm. avg. (of 11) = 0.240245 fft 23: mflops = 79.4742 (norm. = 0.263158), norm. avg. (of 11) = 0.288913 fft 24: mflops = 79.4742 (norm. = 0.263158), norm. avg. (of 11) = 0.271938 fft 25: mflops = 74.0201 (norm. = 0.245098), norm. avg. (of 11) = 0.185194 fft 26: mflops = 32.8263 (norm. = 0.108696), norm. avg. (of 12) = 0.0751866 fft 27: mflops = 173.564 (norm. = 0.574713), norm. avg. (of 12) = 0.536114 fft 28: mflops = 162.367 (norm. = 0.537634), norm. avg. (of 12) = 0.503438 fft 29: mflops = 88.8241 (norm. = 0.294118), norm. avg. (of 11) = 0.147922 fft 30: mflops = 53.9289 (norm. = 0.178571), norm. avg. (of 11) = 0.337621 fft 31: mflops = 137.274 (norm. = 0.454545), norm. avg. (of 12) = 0.28989 fft 32: mflops = 107.858 (norm. = 0.357143), norm. avg. (of 12) = 0.257883 fft 33: mflops = 89.8815 (norm. = 0.297619), norm. avg. (of 12) = 0.270293 fft 34: mflops = 40.1598 (norm. = 0.132979), norm. avg. (of 12) = 0.139125 fft 35: mflops = 96.7955 (norm. = 0.320513), norm. avg. (of 12) = 0.212313 fft 36: mflops = 76.2631 (norm. = 0.252525), norm. avg. (of 12) = 0.155269 fft 37: mflops = 13.6776 (norm. = 0.0452899), norm. avg. (of 12) = 0.0443229 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.54994 s, 256 iters, t-(init.)=1.53327 s t(norm)=0.0562401, mflops=88.9046 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.48327 s, 256 iters, t-(init.)=1.46661 s t(norm)=0.0537949, mflops=92.9457 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.18329 s, 128 iters, t-(init.)=1.18329 s t(norm)=0.0868053, mflops=57.6001 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.7166 s, 128 iters, t-(init.)=1.69993 s t(norm)=0.124706, mflops=40.0942 (err=3.7e-15) 4. 
Bailey: elapsed time t=1.5666 s, 128 iters, t-(init.)=1.5666 s t(norm)=0.114925, mflops=43.5065 (err=3.7e-15) 5. Beauregard: elapsed time t=1.54994 s, 64 iters, t-(init.)=1.53327 s t(norm)=0.22496, mflops=22.2261 (err=3.7e-15) 6. Bergland: elapsed time t=1.96659 s, 512 iters, t-(init.)=1.93326 s t(norm)=0.0354557, mflops=141.021 (err=3.7e-15) 7. Brenner: elapsed time t=1.6166 s, 256 iters, t-(init.)=1.59994 s t(norm)=0.0586853, mflops=85.2002 (err=3.7e-15) 8. Burrus: elapsed time t=1.28328 s, 128 iters, t-(init.)=1.28328 s t(norm)=0.094141, mflops=53.1118 (err=3.7e-15) 9. CWP (min N) (N=8580): elapsed time t=1.21662 s, 512 iters, t-(init.)=1.18329 s t(norm)=0.0217013, mflops=230.401 10. CWP (best N) (N=9240): elapsed time t=1.09996 s, 512 iters, t-(init.)=1.06662 s t(norm)=0.0195618, mflops=255.601 11. Edelblute: elapsed time t=1.38328 s, 128 iters, t-(init.)=1.38328 s t(norm)=0.101477, mflops=49.2724 (err=3.7e-15) 12. FFTPACK: elapsed time t=1.41661 s, 256 iters, t-(init.)=1.39994 s t(norm)=0.0513496, mflops=97.3717 (err=3.7e-15) 13. FFTPACK (f2c): elapsed time t=1.7666 s, 256 iters, t-(init.)=1.74993 s t(norm)=0.064187, mflops=77.8973 (err=3.7e-15) FFTW_MEASURE plan: (cost = 2.213453e-03) FFTW_TWIDDLE 32 FFTW_TWIDDLE 8 FFTW_NOTW 32 14. FFTW: elapsed time t=1.16662 s, 512 iters, t-(init.)=1.13329 s t(norm)=0.0207844, mflops=240.565 (err=3.7e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.36661 s, 512 iters, t-(init.)=1.33328 s t(norm)=0.0244522, mflops=204.48 (err=3.7e-15) 16. Frigo-old: elapsed time t=1.91659 s, 512 iters, t-(init.)=1.88326 s t(norm)=0.0345387, mflops=144.765 (err=3.7e-15) 17. Green: elapsed time t=1.21662 s, 512 iters, t-(init.)=1.18329 s t(norm)=0.0217013, mflops=230.401 (err=3.7e-15) 18. GSL: elapsed time t=1.36661 s, 256 iters, t-(init.)=1.34995 s t(norm)=0.0495157, mflops=100.978 (err=3.7e-15) 19. 
GSL DIT: elapsed time t=1.03329 s, 128 iters, t-(init.)=1.01663 s t(norm)=0.0745792, mflops=67.0428 (err=4.3e-15) 20. GSL DIF: elapsed time t=1.03329 s, 128 iters, t-(init.)=1.03329 s t(norm)=0.0758018, mflops=65.9615 (err=4.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.7666 s, 256 iters, t-(init.)=1.74993 s t(norm)=0.064187, mflops=77.8973 (err=3.7e-15) 23. Mayer (simple): elapsed time t=1.6166 s, 256 iters, t-(init.)=1.59994 s t(norm)=0.0586853, mflops=85.2002 24. Mayer (lookup): elapsed time t=1.68327 s, 256 iters, t-(init.)=1.6666 s t(norm)=0.0611305, mflops=81.7922 (err=3.7e-15) 25. Monro: elapsed time t=1.88326 s, 256 iters, t-(init.)=1.86659 s t(norm)=0.0684662, mflops=73.0287 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.18329 s, 64 iters, t-(init.)=1.18329 s t(norm)=0.173611, mflops=28.8001 (err=4.5e-14) 27. Ooura (C): elapsed time t=1.81659 s, 512 iters, t-(init.)=1.78326 s t(norm)=0.0327048, mflops=152.883 (err=3.7e-15) 28. Ooura (F): elapsed time t=1.91659 s, 512 iters, t-(init.)=1.88326 s t(norm)=0.0345387, mflops=144.765 (err=3.7e-15) 29. Ransom: elapsed time t=1.78326 s, 256 iters, t-(init.)=1.7666 s t(norm)=0.0647984, mflops=77.1625 (err=4.9e-15) 30. SCIPORT: elapsed time t=1.51661 s, 128 iters, t-(init.)=1.51661 s t(norm)=0.111258, mflops=44.9408 (err=3.7e-15) 31. Singleton: elapsed time t=1.08329 s, 256 iters, t-(init.)=1.06662 s t(norm)=0.0391235, mflops=127.8 (err=5.6e-15) 32. Singleton (f2c): elapsed time t=1.33328 s, 256 iters, t-(init.)=1.31661 s t(norm)=0.0482931, mflops=103.534 (err=5.6e-15) 33. Sorensen: elapsed time t=1.64993 s, 256 iters, t-(init.)=1.63327 s t(norm)=0.0599079, mflops=83.4614 (err=3.7e-15) 34. Sorensen DIT: elapsed time t=1.6666 s, 128 iters, t-(init.)=1.6666 s t(norm)=0.122261, mflops=40.8961 (err=3.7e-15) 35. Temperton: elapsed time t=1.53327 s, 256 iters, t-(init.)=1.51661 s t(norm)=0.0556288, mflops=89.8815 (err=1.4e-07) 36. 
Temperton (f2c): elapsed time t=1.03329 s, 128 iters, t-(init.)=1.03329 s t(norm)=0.0758018, mflops=65.9615 (err=3.7e-15) 37. Valkenburg: elapsed time t=1.33328 s, 32 iters, t-(init.)=1.33328 s t(norm)=0.391235, mflops=12.78 (err=3.8e-15) Top mflops for N=8192 = 255.601 Normalized results and averages for N=8192: fft 0: mflops = 88.9046 (norm. = 0.347826), norm. avg. (of 13) = 0.270788 fft 1: mflops = 92.9457 (norm. = 0.363636), norm. avg. (of 13) = 0.274914 fft 2: mflops = 57.6001 (norm. = 0.225352), norm. avg. (of 13) = 0.178402 fft 3: mflops = 40.0942 (norm. = 0.156863), norm. avg. (of 13) = 0.0743059 fft 4: mflops = 43.5065 (norm. = 0.170213), norm. avg. (of 13) = 0.156711 fft 5: mflops = 22.2261 (norm. = 0.0869565), norm. avg. (of 13) = 0.0561776 fft 6: mflops = 141.021 (norm. = 0.551724), norm. avg. (of 13) = 0.315493 fft 7: mflops = 85.2002 (norm. = 0.333333), norm. avg. (of 13) = 0.200226 fft 8: mflops = 53.1118 (norm. = 0.207792), norm. avg. (of 13) = 0.145598 fft 9: mflops = 230.401 (norm. = 0.901408), norm. avg. (of 13) = 0.495475 fft 10: mflops = 255.601 (norm. = 1), norm. avg. (of 13) = 0.528215 fft 11: mflops = 49.2724 (norm. = 0.192771), norm. avg. (of 12) = 0.130988 fft 12: mflops = 97.3717 (norm. = 0.380952), norm. avg. (of 13) = 0.375424 fft 13: mflops = 77.8973 (norm. = 0.304762), norm. avg. (of 13) = 0.277629 fft 14: mflops = 240.565 (norm. = 0.941176), norm. avg. (of 13) = 0.794292 fft 15: mflops = 204.48 (norm. = 0.8), norm. avg. (of 13) = 0.835099 fft 16: mflops = 144.765 (norm. = 0.566372), norm. avg. (of 13) = 0.651228 fft 17: mflops = 230.401 (norm. = 0.901408), norm. avg. (of 11) = 0.648229 fft 18: mflops = 100.978 (norm. = 0.395062), norm. avg. (of 13) = 0.332354 fft 19: mflops = 67.0428 (norm. = 0.262295), norm. avg. (of 13) = 0.162039 fft 20: mflops = 65.9615 (norm. = 0.258065), norm. avg. (of 13) = 0.185086 fft 21: mflops = -1 (norm. = -0.00391235), norm. avg. (of 12) = 0.233862 fft 22: mflops = 77.8973 (norm. = 0.304762), norm. avg. 
(of 12) = 0.245621 fft 23: mflops = 85.2002 (norm. = 0.333333), norm. avg. (of 12) = 0.292614 fft 24: mflops = 81.7922 (norm. = 0.32), norm. avg. (of 12) = 0.275943 fft 25: mflops = 73.0287 (norm. = 0.285714), norm. avg. (of 12) = 0.193571 fft 26: mflops = 28.8001 (norm. = 0.112676), norm. avg. (of 13) = 0.0780704 fft 27: mflops = 152.883 (norm. = 0.598131), norm. avg. (of 13) = 0.540885 fft 28: mflops = 144.765 (norm. = 0.566372), norm. avg. (of 13) = 0.508279 fft 29: mflops = 77.1625 (norm. = 0.301887), norm. avg. (of 12) = 0.160753 fft 30: mflops = 44.9408 (norm. = 0.175824), norm. avg. (of 12) = 0.324138 fft 31: mflops = 127.8 (norm. = 0.5), norm. avg. (of 13) = 0.306052 fft 32: mflops = 103.534 (norm. = 0.405063), norm. avg. (of 13) = 0.269204 fft 33: mflops = 83.4614 (norm. = 0.326531), norm. avg. (of 13) = 0.274619 fft 34: mflops = 40.8961 (norm. = 0.16), norm. avg. (of 13) = 0.140731 fft 35: mflops = 89.8815 (norm. = 0.351648), norm. avg. (of 13) = 0.223031 fft 36: mflops = 65.9615 (norm. = 0.258065), norm. avg. (of 13) = 0.163177 fft 37: mflops = 12.78 (norm. = 0.05), norm. avg. (of 13) = 0.0447596 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.59994 s, 64 iters, t-(init.)=1.58327 s t(norm)=0.107852, mflops=46.36 (err=6.8e-15) 1. Arndt DIT: elapsed time t=1.5666 s, 64 iters, t-(init.)=1.54994 s t(norm)=0.105581, mflops=47.3569 (err=6.8e-15) 2. Arndt Split-Radix: elapsed time t=1.06662 s, 32 iters, t-(init.)=1.04996 s t(norm)=0.143045, mflops=34.9539 (err=6.8e-15) 3. Arndt 4-step: elapsed time t=1.88326 s, 64 iters, t-(init.)=1.86659 s t(norm)=0.127151, mflops=39.3232 (err=6.8e-15) 4. Bailey: elapsed time t=1.93326 s, 64 iters, t-(init.)=1.91659 s t(norm)=0.130557, mflops=38.2974 (err=6.8e-15) 5. Beauregard: elapsed time t=1.79993 s, 32 iters, t-(init.)=1.79993 s t(norm)=0.245221, mflops=20.3898 (err=6.8e-15) 6. 
Bergland: elapsed time t=1.48327 s, 128 iters, t-(init.)=1.43328 s t(norm)=0.0488171, mflops=102.423 (err=6.8e-15) 7. Brenner: elapsed time t=1.16662 s, 64 iters, t-(init.)=1.13329 s t(norm)=0.0771991, mflops=64.7676 (err=6.8e-15) 8. Burrus: elapsed time t=1.09996 s, 32 iters, t-(init.)=1.08329 s t(norm)=0.147587, mflops=33.8784 (err=6.8e-15) 9. CWP (min N) (N=17160): elapsed time t=1.53327 s, 256 iters, t-(init.)=1.43328 s t(norm)=0.0244085, mflops=204.846 10. CWP (best N) (N=17160): elapsed time t=1.53327 s, 256 iters, t-(init.)=1.41661 s t(norm)=0.0241247, mflops=207.256 11. Edelblute: elapsed time t=1.16662 s, 32 iters, t-(init.)=1.14995 s t(norm)=0.156669, mflops=31.9145 (err=6.8e-15) 12. FFTPACK: elapsed time t=1.64993 s, 128 iters, t-(init.)=1.59994 s t(norm)=0.0544935, mflops=91.7541 (err=6.8e-15) 13. FFTPACK (f2c): elapsed time t=1.94992 s, 128 iters, t-(init.)=1.89992 s t(norm)=0.064711, mflops=77.2666 (err=6.8e-15) FFTW_MEASURE plan: (cost = 6.510156e-03) FFTW_TWIDDLE 8 FFTW_TWIDDLE 64 FFTW_NOTW 32 14. FFTW: elapsed time t=1.64993 s, 256 iters, t-(init.)=1.54994 s t(norm)=0.0263953, mflops=189.428 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.7166 s, 256 iters, t-(init.)=1.6166 s t(norm)=0.0275306, mflops=181.616 (err=6.8e-15) 16. Frigo-old: elapsed time t=1.74993 s, 128 iters, t-(init.)=1.69993 s t(norm)=0.0578993, mflops=86.3568 (err=6.8e-15) 17. Green: elapsed time t=1.21662 s, 128 iters, t-(init.)=1.16662 s t(norm)=0.0397348, mflops=125.834 (err=6.8e-15) 18. GSL: elapsed time t=1.04996 s, 64 iters, t-(init.)=1.03329 s t(norm)=0.0703874, mflops=71.0354 (err=6.8e-15) 19. GSL DIT: elapsed time t=1.7666 s, 64 iters, t-(init.)=1.73326 s t(norm)=0.118069, mflops=42.348 (err=7.2e-15) 20. GSL DIF: elapsed time t=1.74993 s, 64 iters, t-(init.)=1.73326 s t(norm)=0.118069, mflops=42.348 (err=7.3e-15) 21. Skipping fft (Krukar can't handle N > 4096). 22. 
Mayer (Buneman): elapsed time t=1.04996 s, 64 iters, t-(init.)=1.03329 s t(norm)=0.0703874, mflops=71.0354 (err=6.8e-15) 23. Mayer (simple): elapsed time t=1.93326 s, 128 iters, t-(init.)=1.88326 s t(norm)=0.0641434, mflops=77.9504 24. Mayer (lookup): elapsed time t=1.96659 s, 128 iters, t-(init.)=1.91659 s t(norm)=0.0652787, mflops=76.5947 (err=6.8e-15) 25. Monro: elapsed time t=1.73326 s, 64 iters, t-(init.)=1.7166 s t(norm)=0.116934, mflops=42.7592 (err=1.3e-07) 26. NAPACK (f2c): elapsed time t=1.46661 s, 32 iters, t-(init.)=1.44994 s t(norm)=0.197539, mflops=25.3115 (err=2.3e-13) 27. Ooura (C): elapsed time t=1.16662 s, 128 iters, t-(init.)=1.11662 s t(norm)=0.0380319, mflops=131.469 (err=6.8e-15) 28. Ooura (F): elapsed time t=1.23328 s, 128 iters, t-(init.)=1.18329 s t(norm)=0.0403025, mflops=124.062 (err=6.8e-15) 29. Ransom: elapsed time t=1.7166 s, 128 iters, t-(init.)=1.68327 s t(norm)=0.0573317, mflops=87.2118 (err=7.4e-15) 30. SCIPORT: elapsed time t=1.34995 s, 32 iters, t-(init.)=1.33328 s t(norm)=0.181645, mflops=27.5262 (err=6.8e-15) 31. Singleton: elapsed time t=1.78326 s, 128 iters, t-(init.)=1.73326 s t(norm)=0.0590346, mflops=84.6961 (err=1.0e-14) 32. Singleton (f2c): elapsed time t=1.01663 s, 64 iters, t-(init.)=0.983294 s t(norm)=0.0669816, mflops=74.6474 (err=1.0e-14) 33. Sorensen: elapsed time t=1.24995 s, 64 iters, t-(init.)=1.23328 s t(norm)=0.0840108, mflops=59.5162 (err=6.8e-15) 34. Sorensen DIT: elapsed time t=1.29995 s, 32 iters, t-(init.)=1.28328 s t(norm)=0.174833, mflops=28.5987 (err=6.8e-15) 35. Temperton: elapsed time t=1.06662 s, 64 iters, t-(init.)=1.04996 s t(norm)=0.0715227, mflops=69.9079 (err=1.5e-07) 36. Temperton (f2c): elapsed time t=1.33328 s, 64 iters, t-(init.)=1.31661 s t(norm)=0.0896872, mflops=55.7493 (err=6.8e-15) 37. 
Valkenburg: elapsed time t=1.54994 s, 16 iters, t-(init.)=1.53327 s t(norm)=0.417783, mflops=11.9679 (err=6.9e-15) Top mflops for N=16384 = 207.256 Normalized results and averages for N=16384: fft 0: mflops = 46.36 (norm. = 0.223684), norm. avg. (of 14) = 0.267423 fft 1: mflops = 47.3569 (norm. = 0.228495), norm. avg. (of 14) = 0.271599 fft 2: mflops = 34.9539 (norm. = 0.168651), norm. avg. (of 14) = 0.177705 fft 3: mflops = 39.3232 (norm. = 0.189732), norm. avg. (of 14) = 0.0825506 fft 4: mflops = 38.2974 (norm. = 0.184783), norm. avg. (of 14) = 0.158716 fft 5: mflops = 20.3898 (norm. = 0.0983796), norm. avg. (of 14) = 0.059192 fft 6: mflops = 102.423 (norm. = 0.494186), norm. avg. (of 14) = 0.328256 fft 7: mflops = 64.7676 (norm. = 0.3125), norm. avg. (of 14) = 0.208245 fft 8: mflops = 33.8784 (norm. = 0.163462), norm. avg. (of 14) = 0.146874 fft 9: mflops = 204.846 (norm. = 0.988372), norm. avg. (of 14) = 0.530682 fft 10: mflops = 207.256 (norm. = 1), norm. avg. (of 14) = 0.561914 fft 11: mflops = 31.9145 (norm. = 0.153986), norm. avg. (of 13) = 0.132757 fft 12: mflops = 91.7541 (norm. = 0.442708), norm. avg. (of 14) = 0.38023 fft 13: mflops = 77.2666 (norm. = 0.372807), norm. avg. (of 14) = 0.284427 fft 14: mflops = 189.428 (norm. = 0.913978), norm. avg. (of 14) = 0.802841 fft 15: mflops = 181.616 (norm. = 0.876289), norm. avg. (of 14) = 0.838041 fft 16: mflops = 86.3568 (norm. = 0.416667), norm. avg. (of 14) = 0.634474 fft 17: mflops = 125.834 (norm. = 0.607143), norm. avg. (of 12) = 0.644806 fft 18: mflops = 71.0354 (norm. = 0.342742), norm. avg. (of 14) = 0.333096 fft 19: mflops = 42.348 (norm. = 0.204327), norm. avg. (of 14) = 0.16506 fft 20: mflops = 42.348 (norm. = 0.204327), norm. avg. (of 14) = 0.18646 fft 21: mflops = -1 (norm. = -0.00482494), norm. avg. (of 12) = 0.233862 fft 22: mflops = 71.0354 (norm. = 0.342742), norm. avg. (of 13) = 0.253092 fft 23: mflops = 77.9504 (norm. = 0.376106), norm. avg. (of 13) = 0.299037 fft 24: mflops = 76.5947 (norm. 
= 0.369565), norm. avg. (of 13) = 0.283145 fft 25: mflops = 42.7592 (norm. = 0.206311), norm. avg. (of 13) = 0.194551 fft 26: mflops = 25.3115 (norm. = 0.122126), norm. avg. (of 14) = 0.0812173 fft 27: mflops = 131.469 (norm. = 0.634328), norm. avg. (of 14) = 0.547559 fft 28: mflops = 124.062 (norm. = 0.598592), norm. avg. (of 14) = 0.51473 fft 29: mflops = 87.2118 (norm. = 0.420792), norm. avg. (of 13) = 0.180756 fft 30: mflops = 27.5262 (norm. = 0.132813), norm. avg. (of 13) = 0.30942 fft 31: mflops = 84.6961 (norm. = 0.408654), norm. avg. (of 14) = 0.313381 fft 32: mflops = 74.6474 (norm. = 0.360169), norm. avg. (of 14) = 0.275702 fft 33: mflops = 59.5162 (norm. = 0.287162), norm. avg. (of 14) = 0.275515 fft 34: mflops = 28.5987 (norm. = 0.137987), norm. avg. (of 14) = 0.140535 fft 35: mflops = 69.9079 (norm. = 0.337302), norm. avg. (of 14) = 0.231193 fft 36: mflops = 55.7493 (norm. = 0.268987), norm. avg. (of 14) = 0.170735 fft 37: mflops = 11.9679 (norm. = 0.0577446), norm. avg. (of 14) = 0.0456871 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.03329 s, 16 iters, t-(init.)=0.99996 s t(norm)=0.127151, mflops=39.3232 (err=1.4e-14) 1. Arndt DIT: elapsed time t=1.03329 s, 16 iters, t-(init.)=1.01663 s t(norm)=0.129271, mflops=38.6785 (err=1.4e-14) 2. Arndt Split-Radix: elapsed time t=1.38328 s, 16 iters, t-(init.)=1.34995 s t(norm)=0.171655, mflops=29.1283 (err=1.4e-14) 3. Arndt 4-step: elapsed time t=1.23328 s, 16 iters, t-(init.)=1.21662 s t(norm)=0.154701, mflops=32.3204 (err=1.4e-14) 4. Bailey: elapsed time t=1.31661 s, 16 iters, t-(init.)=1.28328 s t(norm)=0.163178, mflops=30.6414 (err=1.4e-14) 5. Beauregard: elapsed time t=1.03329 s, 8 iters, t-(init.)=1.01663 s t(norm)=0.258541, mflops=19.3393 (err=1.4e-14) 6. Bergland: elapsed time t=1.91659 s, 64 iters, t-(init.)=1.84993 s t(norm)=0.0588076, mflops=85.0231 (err=1.4e-14) 7. 
Brenner: elapsed time t=1.64993 s, 32 iters, t-(init.)=1.6166 s t(norm)=0.102781, mflops=48.6472 (err=1.4e-14) 8. Burrus: elapsed time t=1.41661 s, 16 iters, t-(init.)=1.39994 s t(norm)=0.178012, mflops=28.088 (err=1.4e-14) 9. CWP (min N) (N=34320): elapsed time t=1.98325 s, 128 iters, t-(init.)=1.81659 s t(norm)=0.028874, mflops=173.166 10. CWP (best N) (N=34320): elapsed time t=1.94992 s, 128 iters, t-(init.)=1.78326 s t(norm)=0.0283442, mflops=176.403 11. Edelblute: elapsed time t=1.49994 s, 16 iters, t-(init.)=1.48327 s t(norm)=0.188608, mflops=26.51 (err=1.4e-14) 12. FFTPACK: elapsed time t=1.23328 s, 32 iters, t-(init.)=1.19995 s t(norm)=0.0762909, mflops=65.5386 (err=1.4e-14) 13. FFTPACK (f2c): elapsed time t=1.38328 s, 32 iters, t-(init.)=1.34995 s t(norm)=0.0858273, mflops=58.2566 (err=1.4e-14) FFTW_MEASURE plan: (cost = 2.187413e-02) FFTW_TWIDDLE 8 FFTW_TWIDDLE 64 FFTW_NOTW 64 14. FFTW: elapsed time t=1.11662 s, 64 iters, t-(init.)=1.03329 s t(norm)=0.0328475, mflops=152.219 (err=1.4e-14) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.19995 s, 64 iters, t-(init.)=1.11662 s t(norm)=0.0354965, mflops=140.859 (err=1.4e-14) 16. Frigo-old: elapsed time t=1.93326 s, 64 iters, t-(init.)=1.84993 s t(norm)=0.0588076, mflops=85.0231 (err=1.4e-14) 17. Green: elapsed time t=1.6666 s, 64 iters, t-(init.)=1.59994 s t(norm)=0.0508606, mflops=98.3079 (err=1.4e-14) 18. GSL: elapsed time t=1.28328 s, 32 iters, t-(init.)=1.24995 s t(norm)=0.0794697, mflops=62.9171 (err=1.4e-14) 19. GSL DIT: elapsed time t=1.19995 s, 16 iters, t-(init.)=1.18329 s t(norm)=0.150463, mflops=33.2309 (err=1.4e-14) 20. GSL DIF: elapsed time t=1.18329 s, 16 iters, t-(init.)=1.16662 s t(norm)=0.148343, mflops=33.7056 (err=1.4e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.43328 s, 32 iters, t-(init.)=1.38328 s t(norm)=0.0879464, mflops=56.8528 (err=1.4e-14) 23. 
Mayer (simple): elapsed time t=1.38328 s, 32 iters, t-(init.)=1.34995 s t(norm)=0.0858273, mflops=58.2566 24. Mayer (lookup): elapsed time t=1.49994 s, 32 iters, t-(init.)=1.46661 s t(norm)=0.0932444, mflops=53.6225 (err=1.4e-14) 25. Monro: elapsed time t=1.18329 s, 16 iters, t-(init.)=1.16662 s t(norm)=0.148343, mflops=33.7056 (err=1.5e-07) 26. NAPACK (f2c): elapsed time t=1.94992 s, 16 iters, t-(init.)=1.93326 s t(norm)=0.245826, mflops=20.3396 (err=5.6e-13) 27. Ooura (C): elapsed time t=1.68327 s, 64 iters, t-(init.)=1.59994 s t(norm)=0.0508606, mflops=98.3079 (err=1.4e-14) 28. Ooura (F): elapsed time t=1.73326 s, 64 iters, t-(init.)=1.6666 s t(norm)=0.0529798, mflops=94.3756 (err=1.4e-14) 29. Ransom: elapsed time t=1.16662 s, 32 iters, t-(init.)=1.13329 s t(norm)=0.0720525, mflops=69.3938 (err=1.5e-14) 30. SCIPORT: elapsed time t=1.58327 s, 16 iters, t-(init.)=1.5666 s t(norm)=0.199204, mflops=25.0999 (err=1.4e-14) 31. Singleton: elapsed time t=1.51661 s, 32 iters, t-(init.)=1.48327 s t(norm)=0.094304, mflops=53.02 (err=2.1e-14) 32. Singleton (f2c): elapsed time t=1.64993 s, 32 iters, t-(init.)=1.59994 s t(norm)=0.101721, mflops=49.154 (err=2.1e-14) 33. Sorensen: elapsed time t=1.69993 s, 32 iters, t-(init.)=1.6666 s t(norm)=0.10596, mflops=47.1878 (err=1.4e-14) 34. Sorensen DIT: elapsed time t=1.63327 s, 16 iters, t-(init.)=1.6166 s t(norm)=0.205562, mflops=24.3236 (err=1.4e-14) 35. Temperton: elapsed time t=1.59994 s, 32 iters, t-(init.)=1.5666 s t(norm)=0.099602, mflops=50.1998 (err=1.5e-07) 36. Temperton (f2c): elapsed time t=1.93326 s, 32 iters, t-(init.)=1.88326 s t(norm)=0.119734, mflops=41.7591 (err=1.4e-14) 37. Valkenburg: elapsed time t=1.88326 s, 8 iters, t-(init.)=1.88326 s t(norm)=0.478937, mflops=10.4398 (err=1.4e-14) Top mflops for N=32768 = 176.403 Normalized results and averages for N=32768: fft 0: mflops = 39.3232 (norm. = 0.222917), norm. avg. (of 15) = 0.264456 fft 1: mflops = 38.6785 (norm. = 0.219262), norm. avg. 
(of 15) = 0.268109 fft 2: mflops = 29.1283 (norm. = 0.165123), norm. avg. (of 15) = 0.176866 fft 3: mflops = 32.3204 (norm. = 0.183219), norm. avg. (of 15) = 0.0892619 fft 4: mflops = 30.6414 (norm. = 0.173701), norm. avg. (of 15) = 0.159715 fft 5: mflops = 19.3393 (norm. = 0.109631), norm. avg. (of 15) = 0.0625546 fft 6: mflops = 85.0231 (norm. = 0.481982), norm. avg. (of 15) = 0.338505 fft 7: mflops = 48.6472 (norm. = 0.275773), norm. avg. (of 15) = 0.212747 fft 8: mflops = 28.088 (norm. = 0.159226), norm. avg. (of 15) = 0.147697 fft 9: mflops = 173.166 (norm. = 0.981651), norm. avg. (of 15) = 0.560746 fft 10: mflops = 176.403 (norm. = 1), norm. avg. (of 15) = 0.59112 fft 11: mflops = 26.51 (norm. = 0.150281), norm. avg. (of 14) = 0.134009 fft 12: mflops = 65.5386 (norm. = 0.371528), norm. avg. (of 15) = 0.379649 fft 13: mflops = 58.2566 (norm. = 0.330247), norm. avg. (of 15) = 0.287482 fft 14: mflops = 152.219 (norm. = 0.862903), norm. avg. (of 15) = 0.806845 fft 15: mflops = 140.859 (norm. = 0.798507), norm. avg. (of 15) = 0.835406 fft 16: mflops = 85.0231 (norm. = 0.481982), norm. avg. (of 15) = 0.624308 fft 17: mflops = 98.3079 (norm. = 0.557292), norm. avg. (of 13) = 0.638074 fft 18: mflops = 62.9171 (norm. = 0.356667), norm. avg. (of 15) = 0.334667 fft 19: mflops = 33.2309 (norm. = 0.18838), norm. avg. (of 15) = 0.166614 fft 20: mflops = 33.7056 (norm. = 0.191071), norm. avg. (of 15) = 0.186768 fft 21: mflops = -1 (norm. = -0.00566884), norm. avg. (of 12) = 0.233862 fft 22: mflops = 56.8528 (norm. = 0.322289), norm. avg. (of 14) = 0.258035 fft 23: mflops = 58.2566 (norm. = 0.330247), norm. avg. (of 14) = 0.301266 fft 24: mflops = 53.6225 (norm. = 0.303977), norm. avg. (of 14) = 0.284633 fft 25: mflops = 33.7056 (norm. = 0.191071), norm. avg. (of 14) = 0.194302 fft 26: mflops = 20.3396 (norm. = 0.115302), norm. avg. (of 15) = 0.0834896 fft 27: mflops = 98.3079 (norm. = 0.557292), norm. avg. (of 15) = 0.548208 fft 28: mflops = 94.3756 (norm. = 0.535), norm. 
avg. (of 15) = 0.516081 fft 29: mflops = 69.3938 (norm. = 0.393382), norm. avg. (of 14) = 0.195943 fft 30: mflops = 25.0999 (norm. = 0.142287), norm. avg. (of 14) = 0.297482 fft 31: mflops = 53.02 (norm. = 0.300562), norm. avg. (of 15) = 0.312526 fft 32: mflops = 49.154 (norm. = 0.278646), norm. avg. (of 15) = 0.275898 fft 33: mflops = 47.1878 (norm. = 0.2675), norm. avg. (of 15) = 0.274981 fft 34: mflops = 24.3236 (norm. = 0.137887), norm. avg. (of 15) = 0.140359 fft 35: mflops = 50.1998 (norm. = 0.284574), norm. avg. (of 15) = 0.234752 fft 36: mflops = 41.7591 (norm. = 0.236726), norm. avg. (of 15) = 0.175134 fft 37: mflops = 10.4398 (norm. = 0.0591814), norm. avg. (of 15) = 0.0465867 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.59994 s, 8 iters, t-(init.)=1.58327 s t(norm)=0.18874, mflops=26.4914 (err=1.7e-14) 1. Arndt DIT: elapsed time t=1.58327 s, 8 iters, t-(init.)=1.54994 s t(norm)=0.184767, mflops=27.0611 (err=1.7e-14) 2. Arndt Split-Radix: elapsed time t=1.94992 s, 8 iters, t-(init.)=1.91659 s t(norm)=0.228475, mflops=21.8842 (err=1.7e-14) 3. Arndt 4-step: elapsed time t=1.29995 s, 8 iters, t-(init.)=1.28328 s t(norm)=0.152979, mflops=32.6842 (err=1.7e-14) 4. Bailey: elapsed time t=1.69993 s, 8 iters, t-(init.)=1.6666 s t(norm)=0.198674, mflops=25.1668 (err=1.7e-14) 5. Beauregard: elapsed time t=1.16662 s, 4 iters, t-(init.)=1.16662 s t(norm)=0.278144, mflops=17.9763 (err=1.7e-14) 6. Bergland: elapsed time t=1.34995 s, 16 iters, t-(init.)=1.28328 s t(norm)=0.0764896, mflops=65.3684 (err=1.7e-14) 7. Brenner: elapsed time t=1.11662 s, 8 iters, t-(init.)=1.08329 s t(norm)=0.129138, mflops=38.7182 (err=1.7e-14) 8. Burrus: elapsed time t=2.01659 s, 8 iters, t-(init.)=1.98325 s t(norm)=0.236422, mflops=21.1486 (err=1.7e-14) 9. CWP (min N) (N=72072): elapsed time t=1.18329 s, 32 iters, t-(init.)=1.04996 s t(norm)=0.0312912, mflops=159.789 10. 
CWP (best N) (N=72072): elapsed time t=1.19995 s, 32 iters, t-(init.)=1.06662 s t(norm)=0.0317879, mflops=157.293 11. Edelblute: elapsed time t=2.03325 s, 8 iters, t-(init.)=1.99992 s t(norm)=0.238409, mflops=20.9724 (err=1.7e-14) 12. FFTPACK: elapsed time t=1.44994 s, 16 iters, t-(init.)=1.39994 s t(norm)=0.0834432, mflops=59.921 (err=1.7e-14) 13. FFTPACK (f2c): elapsed time t=1.59994 s, 16 iters, t-(init.)=1.54994 s t(norm)=0.0923835, mflops=54.1222 (err=1.7e-14) FFTW_MEASURE plan: (cost = 4.999800e-02) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 32 FFTW_NOTW 32 14. FFTW: elapsed time t=1.44994 s, 32 iters, t-(init.)=1.33328 s t(norm)=0.0397348, mflops=125.834 (err=1.7e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.51661 s, 32 iters, t-(init.)=1.39994 s t(norm)=0.0417216, mflops=119.842 (err=1.7e-14) 16. Frigo-old: elapsed time t=1.13329 s, 16 iters, t-(init.)=1.08329 s t(norm)=0.0645691, mflops=77.4364 (err=1.7e-14) 17. Green: elapsed time t=1.13329 s, 16 iters, t-(init.)=1.08329 s t(norm)=0.0645691, mflops=77.4364 (err=1.7e-14) 18. GSL: elapsed time t=1.48327 s, 16 iters, t-(init.)=1.43328 s t(norm)=0.0854299, mflops=58.5275 (err=1.7e-14) 19. GSL DIT: elapsed time t=1.6666 s, 8 iters, t-(init.)=1.64993 s t(norm)=0.196687, mflops=25.421 (err=1.7e-14) 20. GSL DIF: elapsed time t=1.63327 s, 8 iters, t-(init.)=1.59994 s t(norm)=0.190727, mflops=26.2154 (err=1.8e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.44994 s, 8 iters, t-(init.)=1.43328 s t(norm)=0.17086, mflops=29.2638 (err=1.7e-14) 23. Mayer (simple): elapsed time t=1.39994 s, 8 iters, t-(init.)=1.36661 s t(norm)=0.162913, mflops=30.6913 24. Mayer (lookup): elapsed time t=1.43328 s, 8 iters, t-(init.)=1.38328 s t(norm)=0.1649, mflops=30.3215 (err=1.7e-14) 25. Monro: elapsed time t=1.69993 s, 8 iters, t-(init.)=1.6666 s t(norm)=0.198674, mflops=25.1668 (err=1.6e-07) 26. 
NAPACK (f2c): elapsed time t=1.08329 s, 4 iters, t-(init.)=1.06662 s t(norm)=0.254303, mflops=19.6616 (err=8.6e-13) 27. Ooura (C): elapsed time t=1.01663 s, 16 iters, t-(init.)=0.966628 s t(norm)=0.0576155, mflops=86.7822 (err=1.7e-14) 28. Ooura (F): elapsed time t=1.04996 s, 16 iters, t-(init.)=0.983294 s t(norm)=0.0586089, mflops=85.3113 (err=1.7e-14) 29. Ransom: elapsed time t=1.39994 s, 16 iters, t-(init.)=1.34995 s t(norm)=0.0804631, mflops=62.1403 (err=1.7e-14) 30. SCIPORT: elapsed time t=1.19995 s, 4 iters, t-(init.)=1.18329 s t(norm)=0.282117, mflops=17.7231 (err=1.7e-14) 31. Singleton: elapsed time t=1.79993 s, 16 iters, t-(init.)=1.74993 s t(norm)=0.104304, mflops=47.9368 (err=2.3e-14) 32. Singleton (f2c): elapsed time t=1.91659 s, 16 iters, t-(init.)=1.84993 s t(norm)=0.110264, mflops=45.3456 (err=2.3e-14) 33. Sorensen: elapsed time t=1.04996 s, 8 iters, t-(init.)=1.03329 s t(norm)=0.123178, mflops=40.5917 (err=1.7e-14) 34. Sorensen DIT: elapsed time t=1.11662 s, 4 iters, t-(init.)=1.09996 s t(norm)=0.26225, mflops=19.0658 (err=1.7e-14) 35. Temperton: elapsed time t=1.01663 s, 8 iters, t-(init.)=0.99996 s t(norm)=0.119205, mflops=41.9447 (err=1.7e-07) 36. Temperton (f2c): elapsed time t=1.16662 s, 8 iters, t-(init.)=1.13329 s t(norm)=0.135098, mflops=37.01 (err=1.7e-14) 37. Valkenburg: elapsed time t=1.08329 s, 2 iters, t-(init.)=1.08329 s t(norm)=0.516553, mflops=9.67955 (err=1.7e-14) Top mflops for N=65536 = 159.789 Normalized results and averages for N=65536: fft 0: mflops = 26.4914 (norm. = 0.165789), norm. avg. (of 16) = 0.258289 fft 1: mflops = 27.0611 (norm. = 0.169355), norm. avg. (of 16) = 0.261937 fft 2: mflops = 21.8842 (norm. = 0.136957), norm. avg. (of 16) = 0.174372 fft 3: mflops = 32.6842 (norm. = 0.204545), norm. avg. (of 16) = 0.0964671 fft 4: mflops = 25.1668 (norm. = 0.1575), norm. avg. (of 16) = 0.159577 fft 5: mflops = 17.9763 (norm. = 0.1125), norm. avg. (of 16) = 0.0656762 fft 6: mflops = 65.3684 (norm. = 0.409091), norm. avg. 
(of 16) = 0.342916 fft 7: mflops = 38.7182 (norm. = 0.242308), norm. avg. (of 16) = 0.214595 fft 8: mflops = 21.1486 (norm. = 0.132353), norm. avg. (of 16) = 0.146738 fft 9: mflops = 159.789 (norm. = 1), norm. avg. (of 16) = 0.5882 fft 10: mflops = 157.293 (norm. = 0.984375), norm. avg. (of 16) = 0.615699 fft 11: mflops = 20.9724 (norm. = 0.13125), norm. avg. (of 15) = 0.133825 fft 12: mflops = 59.921 (norm. = 0.375), norm. avg. (of 16) = 0.379359 fft 13: mflops = 54.1222 (norm. = 0.33871), norm. avg. (of 16) = 0.290684 fft 14: mflops = 125.834 (norm. = 0.7875), norm. avg. (of 16) = 0.805636 fft 15: mflops = 119.842 (norm. = 0.75), norm. avg. (of 16) = 0.830068 fft 16: mflops = 77.4364 (norm. = 0.484615), norm. avg. (of 16) = 0.615577 fft 17: mflops = 77.4364 (norm. = 0.484615), norm. avg. (of 14) = 0.627112 fft 18: mflops = 58.5275 (norm. = 0.366279), norm. avg. (of 16) = 0.336643 fft 19: mflops = 25.421 (norm. = 0.159091), norm. avg. (of 16) = 0.166144 fft 20: mflops = 26.2154 (norm. = 0.164062), norm. avg. (of 16) = 0.185349 fft 21: mflops = -1 (norm. = -0.00625824), norm. avg. (of 12) = 0.233862 fft 22: mflops = 29.2638 (norm. = 0.18314), norm. avg. (of 15) = 0.253042 fft 23: mflops = 30.6913 (norm. = 0.192073), norm. avg. (of 15) = 0.293987 fft 24: mflops = 30.3215 (norm. = 0.189759), norm. avg. (of 15) = 0.278308 fft 25: mflops = 25.1668 (norm. = 0.1575), norm. avg. (of 15) = 0.191849 fft 26: mflops = 19.6616 (norm. = 0.123047), norm. avg. (of 16) = 0.0859619 fft 27: mflops = 86.7822 (norm. = 0.543103), norm. avg. (of 16) = 0.547889 fft 28: mflops = 85.3113 (norm. = 0.533898), norm. avg. (of 16) = 0.517195 fft 29: mflops = 62.1403 (norm. = 0.388889), norm. avg. (of 15) = 0.208807 fft 30: mflops = 17.7231 (norm. = 0.110915), norm. avg. (of 15) = 0.285044 fft 31: mflops = 47.9368 (norm. = 0.3), norm. avg. (of 16) = 0.311743 fft 32: mflops = 45.3456 (norm. = 0.283784), norm. avg. (of 16) = 0.276391 fft 33: mflops = 40.5917 (norm. = 0.254032), norm. avg. 
(of 16) = 0.273672 fft 34: mflops = 19.0658 (norm. = 0.119318), norm. avg. (of 16) = 0.139044 fft 35: mflops = 41.9447 (norm. = 0.2625), norm. avg. (of 16) = 0.236486 fft 36: mflops = 37.01 (norm. = 0.231618), norm. avg. (of 16) = 0.178664 fft 37: mflops = 9.67955 (norm. = 0.0605769), norm. avg. (of 16) = 0.0474611 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.06662 s, 2 iters, t-(init.)=1.03329 s t(norm)=0.231864, mflops=21.5643 (err=3.3e-14) 1. Arndt DIT: elapsed time t=1.11662 s, 2 iters, t-(init.)=1.09996 s t(norm)=0.246823, mflops=20.2574 (err=3.3e-14) 2. Arndt Split-Radix: elapsed time t=1.41661 s, 2 iters, t-(init.)=1.39994 s t(norm)=0.314139, mflops=15.9165 (err=3.3e-14) 3. Arndt 4-step: elapsed time t=1.81659 s, 4 iters, t-(init.)=1.7666 s t(norm)=0.198207, mflops=25.2262 (err=3.3e-14) 4. Bailey: elapsed time t=1.96659 s, 4 iters, t-(init.)=1.93326 s t(norm)=0.216905, mflops=23.0515 (err=3.3e-14) 5. Beauregard: elapsed time t=1.34995 s, 2 iters, t-(init.)=1.33328 s t(norm)=0.29918, mflops=16.7123 (err=3.3e-14) 6. Bergland: elapsed time t=1.93326 s, 8 iters, t-(init.)=1.84993 s t(norm)=0.103778, mflops=48.1797 (err=3.4e-14) 7. Brenner: elapsed time t=1.51661 s, 4 iters, t-(init.)=1.48327 s t(norm)=0.166419, mflops=30.0447 (err=3.3e-14) 8. Burrus: elapsed time t=1.44994 s, 2 iters, t-(init.)=1.43328 s t(norm)=0.321618, mflops=15.5464 (err=3.3e-14) 9. CWP (min N) (N=144144): elapsed time t=1.59994 s, 16 iters, t-(init.)=1.41661 s t(norm)=0.0397348, mflops=125.834 10. CWP (best N) (N=144144): elapsed time t=1.59994 s, 16 iters, t-(init.)=1.41661 s t(norm)=0.0397348, mflops=125.834 11. Edelblute: elapsed time t=1.41661 s, 2 iters, t-(init.)=1.39994 s t(norm)=0.314139, mflops=15.9165 (err=3.3e-14) 12. FFTPACK: elapsed time t=1.01663 s, 4 iters, t-(init.)=0.983294 s t(norm)=0.110323, mflops=45.3216 (err=3.3e-14) 13. 
FFTPACK (f2c): elapsed time t=1.09996 s, 4 iters, t-(init.)=1.06662 s t(norm)=0.119672, mflops=41.7809 (err=3.3e-14) FFTW_MEASURE plan: (cost = 1.083290e-01) FFTW_TWIDDLE 32 FFTW_TWIDDLE 64 FFTW_NOTW 64 14. FFTW: elapsed time t=1.6166 s, 16 iters, t-(init.)=1.46661 s t(norm)=0.0411372, mflops=121.544 (err=3.3e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.7666 s, 16 iters, t-(init.)=1.6166 s t(norm)=0.0453445, mflops=110.267 (err=3.3e-14) 16. Frigo-old: elapsed time t=1.43328 s, 8 iters, t-(init.)=1.36661 s t(norm)=0.0766649, mflops=65.2189 (err=3.3e-14) 17. Green: elapsed time t=1.74993 s, 8 iters, t-(init.)=1.6666 s t(norm)=0.0934937, mflops=53.4795 (err=3.3e-14) 18. GSL: elapsed time t=2.01659 s, 8 iters, t-(init.)=1.93326 s t(norm)=0.108453, mflops=46.103 (err=3.3e-14) 19. GSL DIT: elapsed time t=1.13329 s, 2 iters, t-(init.)=1.11662 s t(norm)=0.250563, mflops=19.955 (err=3.5e-14) 20. GSL DIF: elapsed time t=1.13329 s, 2 iters, t-(init.)=1.11662 s t(norm)=0.250563, mflops=19.955 (err=3.5e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.51661 s, 4 iters, t-(init.)=1.48327 s t(norm)=0.166419, mflops=30.0447 (err=3.3e-14) 23. Mayer (simple): elapsed time t=1.48327 s, 4 iters, t-(init.)=1.44994 s t(norm)=0.162679, mflops=30.7354 24. Mayer (lookup): elapsed time t=1.54994 s, 4 iters, t-(init.)=1.51661 s t(norm)=0.170159, mflops=29.3843 (err=3.3e-14) 25. Monro: elapsed time t=1.24995 s, 2 iters, t-(init.)=1.23328 s t(norm)=0.276741, mflops=18.0674 (err=1.7e-07) 26. NAPACK (f2c): elapsed time t=1.31661 s, 2 iters, t-(init.)=1.29995 s t(norm)=0.2917, mflops=17.1409 (err=2.0e-12) 27. Ooura (C): elapsed time t=1.38328 s, 8 iters, t-(init.)=1.31661 s t(norm)=0.0738601, mflops=67.6956 (err=3.4e-14) 28. Ooura (F): elapsed time t=1.43328 s, 8 iters, t-(init.)=1.36661 s t(norm)=0.0766649, mflops=65.2189 (err=3.4e-14) 29. 
Ransom: elapsed time t=1.86659 s, 8 iters, t-(init.)=1.78326 s t(norm)=0.100038, mflops=49.9809 (err=3.3e-14) 30. SCIPORT: elapsed time t=1.41661 s, 2 iters, t-(init.)=1.39994 s t(norm)=0.314139, mflops=15.9165 (err=3.3e-14) 31. Singleton: elapsed time t=1.34995 s, 4 iters, t-(init.)=1.29995 s t(norm)=0.14585, mflops=34.2817 (err=4.8e-14) 32. Singleton (f2c): elapsed time t=1.36661 s, 4 iters, t-(init.)=1.33328 s t(norm)=0.14959, mflops=33.4247 (err=4.8e-14) 33. Sorensen: elapsed time t=1.49994 s, 4 iters, t-(init.)=1.46661 s t(norm)=0.164549, mflops=30.3861 (err=3.3e-14) 34. Sorensen DIT: elapsed time t=1.58327 s, 2 iters, t-(init.)=1.5666 s t(norm)=0.351536, mflops=14.2233 (err=3.3e-14) 35. Temperton: elapsed time t=1.39994 s, 4 iters, t-(init.)=1.36661 s t(norm)=0.15333, mflops=32.6095 (err=1.9e-07) 36. Temperton (f2c): elapsed time t=1.64993 s, 4 iters, t-(init.)=1.6166 s t(norm)=0.181378, mflops=27.5668 (err=3.3e-14) 37. Valkenburg: elapsed time t=1.23328 s, 1 iters, t-(init.)=1.21662 s t(norm)=0.546003, mflops=9.15745 (err=3.4e-14) Top mflops for N=131072 = 125.834 Normalized results and averages for N=131072: fft 0: mflops = 21.5643 (norm. = 0.171371), norm. avg. (of 17) = 0.253176 fft 1: mflops = 20.2574 (norm. = 0.160985), norm. avg. (of 17) = 0.255999 fft 2: mflops = 15.9165 (norm. = 0.126488), norm. avg. (of 17) = 0.171555 fft 3: mflops = 25.2262 (norm. = 0.200472), norm. avg. (of 17) = 0.102585 fft 4: mflops = 23.0515 (norm. = 0.18319), norm. avg. (of 17) = 0.160966 fft 5: mflops = 16.7123 (norm. = 0.132813), norm. avg. (of 17) = 0.0696254 fft 6: mflops = 48.1797 (norm. = 0.382883), norm. avg. (of 17) = 0.345267 fft 7: mflops = 30.0447 (norm. = 0.238764), norm. avg. (of 17) = 0.216016 fft 8: mflops = 15.5464 (norm. = 0.123547), norm. avg. (of 17) = 0.145374 fft 9: mflops = 125.834 (norm. = 1), norm. avg. (of 17) = 0.612423 fft 10: mflops = 125.834 (norm. = 1), norm. avg. (of 17) = 0.638304 fft 11: mflops = 15.9165 (norm. = 0.126488), norm. avg. 
(of 16) = 0.133366 fft 12: mflops = 45.3216 (norm. = 0.360169), norm. avg. (of 17) = 0.37823 fft 13: mflops = 41.7809 (norm. = 0.332031), norm. avg. (of 17) = 0.293116 fft 14: mflops = 121.544 (norm. = 0.965909), norm. avg. (of 17) = 0.815064 fft 15: mflops = 110.267 (norm. = 0.876289), norm. avg. (of 17) = 0.832787 fft 16: mflops = 65.2189 (norm. = 0.518293), norm. avg. (of 17) = 0.609854 fft 17: mflops = 53.4795 (norm. = 0.425), norm. avg. (of 15) = 0.613638 fft 18: mflops = 46.103 (norm. = 0.366379), norm. avg. (of 17) = 0.338392 fft 19: mflops = 19.955 (norm. = 0.158582), norm. avg. (of 17) = 0.165699 fft 20: mflops = 19.955 (norm. = 0.158582), norm. avg. (of 17) = 0.183774 fft 21: mflops = -1 (norm. = -0.00794697), norm. avg. (of 12) = 0.233862 fft 22: mflops = 30.0447 (norm. = 0.238764), norm. avg. (of 16) = 0.252149 fft 23: mflops = 30.7354 (norm. = 0.244253), norm. avg. (of 16) = 0.290878 fft 24: mflops = 29.3843 (norm. = 0.233516), norm. avg. (of 16) = 0.275509 fft 25: mflops = 18.0674 (norm. = 0.143581), norm. avg. (of 16) = 0.188832 fft 26: mflops = 17.1409 (norm. = 0.136218), norm. avg. (of 17) = 0.0889181 fft 27: mflops = 67.6956 (norm. = 0.537975), norm. avg. (of 17) = 0.547306 fft 28: mflops = 65.2189 (norm. = 0.518293), norm. avg. (of 17) = 0.517259 fft 29: mflops = 49.9809 (norm. = 0.397196), norm. avg. (of 16) = 0.220581 fft 30: mflops = 15.9165 (norm. = 0.126488), norm. avg. (of 16) = 0.275135 fft 31: mflops = 34.2817 (norm. = 0.272436), norm. avg. (of 17) = 0.309431 fft 32: mflops = 33.4247 (norm. = 0.265625), norm. avg. (of 17) = 0.275758 fft 33: mflops = 30.3861 (norm. = 0.241477), norm. avg. (of 17) = 0.271778 fft 34: mflops = 14.2233 (norm. = 0.113032), norm. avg. (of 17) = 0.137513 fft 35: mflops = 32.6095 (norm. = 0.259146), norm. avg. (of 17) = 0.237819 fft 36: mflops = 27.5668 (norm. = 0.219072), norm. avg. (of 17) = 0.181041 fft 37: mflops = 9.15745 (norm. = 0.072774), norm. avg. 
(of 17) = 0.0489501 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=1.31661 s, 1 iters, t-(init.)=1.29995 s t(norm)=0.275495, mflops=18.1492 (err=4.3e-14) 1. Arndt DIT: elapsed time t=1.34995 s, 1 iters, t-(init.)=1.33328 s t(norm)=0.282559, mflops=17.6954 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=1.6666 s, 1 iters, t-(init.)=1.64993 s t(norm)=0.349667, mflops=14.2993 (err=4.3e-14) 3. Arndt 4-step: elapsed time t=1.63327 s, 2 iters, t-(init.)=1.58327 s t(norm)=0.167769, mflops=29.8028 (err=4.3e-14) 4. Bailey: elapsed time t=1.09996 s, 1 iters, t-(init.)=1.06662 s t(norm)=0.226047, mflops=22.1193 (err=4.3e-14) 5. Beauregard: elapsed time t=1.46661 s, 1 iters, t-(init.)=1.43328 s t(norm)=0.303751, mflops=16.4609 (err=4.4e-14) 6. Bergland: elapsed time t=1.11662 s, 2 iters, t-(init.)=1.08329 s t(norm)=0.11479, mflops=43.558 (err=4.4e-14) 7. Brenner: elapsed time t=1.79993 s, 2 iters, t-(init.)=1.7666 s t(norm)=0.187195, mflops=26.7101 (err=4.4e-14) 8. Burrus: elapsed time t=1.74993 s, 1 iters, t-(init.)=1.73326 s t(norm)=0.367327, mflops=13.6119 (err=4.3e-14) 9. CWP (min N) (N=360360): elapsed time t=1.21662 s, 4 iters, t-(init.)=1.09996 s t(norm)=0.0582778, mflops=85.796 10. CWP (best N) (N=360360): elapsed time t=1.21662 s, 4 iters, t-(init.)=1.08329 s t(norm)=0.0573948, mflops=87.116 11. Edelblute: elapsed time t=1.7166 s, 1 iters, t-(init.)=1.69993 s t(norm)=0.360263, mflops=13.8788 (err=4.3e-14) 12. FFTPACK: elapsed time t=1.06662 s, 2 iters, t-(init.)=1.03329 s t(norm)=0.109492, mflops=45.6656 (err=4.4e-14) 13. FFTPACK (f2c): elapsed time t=1.13329 s, 2 iters, t-(init.)=1.09996 s t(norm)=0.116556, mflops=42.898 (err=4.4e-14) FFTW_MEASURE plan: (cost = 2.833220e-01) FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_TWIDDLE 64 FFTW_NOTW 32 14. 
FFTW: elapsed time t=1.03329 s, 4 iters, t-(init.)=0.949962 s t(norm)=0.0503308, mflops=99.3428 (err=4.4e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 15. FFTW_ESTIMATE: elapsed time t=1.91659 s, 8 iters, t-(init.)=1.74993 s t(norm)=0.0463573, mflops=107.858 (err=4.4e-14) 16. Frigo-old: elapsed time t=1.79993 s, 4 iters, t-(init.)=1.7166 s t(norm)=0.0909486, mflops=54.9761 (err=4.4e-14) 17. Green: elapsed time t=1.96659 s, 4 iters, t-(init.)=1.88326 s t(norm)=0.0997786, mflops=50.1109 (err=4.4e-14) 18. GSL: elapsed time t=1.04996 s, 2 iters, t-(init.)=0.99996 s t(norm)=0.10596, mflops=47.1878 (err=4.4e-14) 19. GSL DIT: elapsed time t=1.34995 s, 1 iters, t-(init.)=1.33328 s t(norm)=0.282559, mflops=17.6954 (err=4.6e-14) 20. GSL DIF: elapsed time t=1.33328 s, 1 iters, t-(init.)=1.31661 s t(norm)=0.279027, mflops=17.9194 (err=4.6e-14) 21. Skipping fft (Krukar can't handle N > 4096). 22. Mayer (Buneman): elapsed time t=1.11662 s, 1 iters, t-(init.)=1.08329 s t(norm)=0.229579, mflops=21.779 (err=4.3e-14) 23. Mayer (simple): elapsed time t=1.11662 s, 1 iters, t-(init.)=1.09996 s t(norm)=0.233111, mflops=21.449 24. Mayer (lookup): elapsed time t=1.11662 s, 1 iters, t-(init.)=1.08329 s t(norm)=0.229579, mflops=21.779 (err=4.3e-14) 25. Monro: elapsed time t=1.39994 s, 1 iters, t-(init.)=1.38328 s t(norm)=0.293155, mflops=17.0558 (err=1.8e-07) 26. NAPACK (f2c): elapsed time t=1.38328 s, 1 iters, t-(init.)=1.36661 s t(norm)=0.289623, mflops=17.2638 (err=3.7e-12) 27. Ooura (C): elapsed time t=1.51661 s, 4 iters, t-(init.)=1.43328 s t(norm)=0.0759377, mflops=65.8435 (err=4.4e-14) 28. Ooura (F): elapsed time t=1.5666 s, 4 iters, t-(init.)=1.48327 s t(norm)=0.0785867, mflops=63.624 (err=4.4e-14) 29. Ransom: elapsed time t=1.6666 s, 4 iters, t-(init.)=1.5666 s t(norm)=0.0830017, mflops=60.2398 (err=4.3e-14) 30. 
SCIPORT: elapsed time t=1.6666 s, 1 iters, t-(init.)=1.64993 s t(norm)=0.349667, mflops=14.2993 (err=4.4e-14) 31. Singleton: elapsed time t=1.44994 s, 2 iters, t-(init.)=1.39994 s t(norm)=0.148343, mflops=33.7056 (err=6.0e-14) 32. Singleton (f2c): elapsed time t=1.51661 s, 2 iters, t-(init.)=1.46661 s t(norm)=0.155407, mflops=32.1735 (err=6.0e-14) 33. Sorensen: elapsed time t=1.81659 s, 2 iters, t-(init.)=1.7666 s t(norm)=0.187195, mflops=26.7101 (err=4.3e-14) 34. Sorensen DIT: elapsed time t=1.88326 s, 1 iters, t-(init.)=1.86659 s t(norm)=0.395582, mflops=12.6396 (err=4.3e-14) 35. Temperton: elapsed time t=1.51661 s, 2 iters, t-(init.)=1.46661 s t(norm)=0.155407, mflops=32.1735 (err=2.0e-07) 36. Temperton (f2c): elapsed time t=1.73326 s, 2 iters, t-(init.)=1.68327 s t(norm)=0.178365, mflops=28.0324 (err=4.4e-14) 37. Valkenburg: elapsed time t=2.81655 s, 1 iters, t-(init.)=2.79989 s t(norm)=0.593374, mflops=8.42639 (err=4.4e-14) Top mflops for N=262144 = 107.858 Normalized results and averages for N=262144: fft 0: mflops = 18.1492 (norm. = 0.168269), norm. avg. (of 18) = 0.248459 fft 1: mflops = 17.6954 (norm. = 0.164063), norm. avg. (of 18) = 0.250891 fft 2: mflops = 14.2993 (norm. = 0.132576), norm. avg. (of 18) = 0.16939 fft 3: mflops = 29.8028 (norm. = 0.276316), norm. avg. (of 18) = 0.112237 fft 4: mflops = 22.1193 (norm. = 0.205078), norm. avg. (of 18) = 0.163416 fft 5: mflops = 16.4609 (norm. = 0.152616), norm. avg. (of 18) = 0.074236 fft 6: mflops = 43.558 (norm. = 0.403846), norm. avg. (of 18) = 0.348522 fft 7: mflops = 26.7101 (norm. = 0.247642), norm. avg. (of 18) = 0.217773 fft 8: mflops = 13.6119 (norm. = 0.126202), norm. avg. (of 18) = 0.144309 fft 9: mflops = 85.796 (norm. = 0.795455), norm. avg. (of 18) = 0.622592 fft 10: mflops = 87.116 (norm. = 0.807692), norm. avg. (of 18) = 0.647715 fft 11: mflops = 13.8788 (norm. = 0.128676), norm. avg. (of 17) = 0.133091 fft 12: mflops = 45.6656 (norm. = 0.423387), norm. avg. 
(of 18) = 0.380739 fft 13: mflops = 42.898 (norm. = 0.397727), norm. avg. (of 18) = 0.298928 fft 14: mflops = 99.3428 (norm. = 0.921053), norm. avg. (of 18) = 0.820952 fft 15: mflops = 107.858 (norm. = 1), norm. avg. (of 18) = 0.842076 fft 16: mflops = 54.9761 (norm. = 0.509709), norm. avg. (of 18) = 0.604291 fft 17: mflops = 50.1109 (norm. = 0.464602), norm. avg. (of 16) = 0.604323 fft 18: mflops = 47.1878 (norm. = 0.4375), norm. avg. (of 18) = 0.343898 fft 19: mflops = 17.6954 (norm. = 0.164063), norm. avg. (of 18) = 0.165608 fft 20: mflops = 17.9194 (norm. = 0.166139), norm. avg. (of 18) = 0.182795 fft 21: mflops = -1 (norm. = -0.00927146), norm. avg. (of 12) = 0.233862 fft 22: mflops = 21.779 (norm. = 0.201923), norm. avg. (of 17) = 0.249195 fft 23: mflops = 21.449 (norm. = 0.198864), norm. avg. (of 17) = 0.285466 fft 24: mflops = 21.779 (norm. = 0.201923), norm. avg. (of 17) = 0.27118 fft 25: mflops = 17.0558 (norm. = 0.158133), norm. avg. (of 17) = 0.187026 fft 26: mflops = 17.2638 (norm. = 0.160061), norm. avg. (of 18) = 0.0928705 fft 27: mflops = 65.8435 (norm. = 0.610465), norm. avg. (of 18) = 0.550815 fft 28: mflops = 63.624 (norm. = 0.589888), norm. avg. (of 18) = 0.521294 fft 29: mflops = 60.2398 (norm. = 0.558511), norm. avg. (of 17) = 0.240459 fft 30: mflops = 14.2993 (norm. = 0.132576), norm. avg. (of 17) = 0.266749 fft 31: mflops = 33.7056 (norm. = 0.3125), norm. avg. (of 18) = 0.309601 fft 32: mflops = 32.1735 (norm. = 0.298295), norm. avg. (of 18) = 0.27701 fft 33: mflops = 26.7101 (norm. = 0.247642), norm. avg. (of 18) = 0.270437 fft 34: mflops = 12.6396 (norm. = 0.117188), norm. avg. (of 18) = 0.136384 fft 35: mflops = 32.1735 (norm. = 0.298295), norm. avg. (of 18) = 0.241179 fft 36: mflops = 28.0324 (norm. = 0.259901), norm. avg. (of 18) = 0.185422 fft 37: mflops = 8.42639 (norm. = 0.078125), norm. avg. 
(of 18) = 0.0505709 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. Brenner 1. CWP (min N) 2. CWP (best N) 3. FFTPACK 4. FFTPACK (f2c) 5. FFTW 6. FFTW_ESTIMATE 7. Frigo-old 8. GSL 9. NAPACK (f2c) 10. Singleton 11. Singleton (f2c) 12. Temperton 13. Temperton (f2c) 14. Valkenburg Computing normalized averages (15 transforms). Benchmarking for array size = 6: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.08329 s, 524288 iters, t-(init.)=1.03329 s t(norm)=0.127071, mflops=39.348 2. CWP (best N) (N=15): elapsed time t=1.29995 s, 524288 iters, t-(init.)=1.21662 s t(norm)=0.149616, mflops=33.4188 3. FFTPACK: elapsed time t=1.24995 s, 1048576 iters, t-(init.)=1.18329 s t(norm)=0.0727586, mflops=68.7204 (err=1.0e-16) 4. FFTPACK (f2c): elapsed time t=1.24995 s, 1048576 iters, t-(init.)=1.18329 s t(norm)=0.0727586, mflops=68.7204 (err=1.8e-16) FFTW_MEASURE plan: (cost = 3.178787e-07) FFTW_NOTW 6 5. FFTW: elapsed time t=1.33328 s, 4194304 iters, t-(init.)=1.04996 s t(norm)=0.0161401, mflops=309.787 (err=1.1e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 6. FFTW_ESTIMATE: elapsed time t=1.31661 s, 4194304 iters, t-(init.)=-0.099996 s t(norm)=-0.00153715, mflops=-3252.77 (err=1.1e-16) 7. Frigo-old: elapsed time t=1.16662 s, 524288 iters, t-(init.)=1.13329 s t(norm)=0.139369, mflops=35.8761 (err=3.1e-16) 8. GSL: elapsed time t=1.86659 s, 1048576 iters, t-(init.)=1.78326 s t(norm)=0.10965, mflops=45.5995 (err=1.2e-16) 9. 
NAPACK (f2c): elapsed time t=1.23328 s, 262144 iters, t-(init.)=1.21662 s t(norm)=0.299233, mflops=16.7094 (err=4.7e-16) 10. Singleton: elapsed time t=1.68327 s, 524288 iters, t-(init.)=1.54994 s t(norm)=0.190607, mflops=26.232 (err=1.0e-16) 11. Singleton (f2c): elapsed time t=1.06662 s, 262144 iters, t-(init.)=1.04996 s t(norm)=0.258242, mflops=19.3617 (err=1.0e-16) 12. Temperton: elapsed time t=1.48327 s, 524288 iters, t-(init.)=1.44994 s t(norm)=0.17831, mflops=28.0411 (err=3.7e-16) 13. Temperton (f2c): elapsed time t=1.24995 s, 262144 iters, t-(init.)=1.23328 s t(norm)=0.303332, mflops=16.4836 (err=1.0e-16) 14. Valkenburg: elapsed time t=1.13329 s, 262144 iters, t-(init.)=1.11662 s t(norm)=0.274638, mflops=18.2058 (err=3.2e-16) Top mflops for N=6 = 309.787 Normalized results and averages for N=6: fft 0: mflops = -1 (norm. = -0.00322802), norm. avg. (of 0) = -1 fft 1: mflops = 39.348 (norm. = 0.127016), norm. avg. (of 1) = 0.127016 fft 2: mflops = 33.4188 (norm. = 0.107877), norm. avg. (of 1) = 0.107877 fft 3: mflops = 68.7204 (norm. = 0.221831), norm. avg. (of 1) = 0.221831 fft 4: mflops = 68.7204 (norm. = 0.221831), norm. avg. (of 1) = 0.221831 fft 5: mflops = 309.787 (norm. = 1), norm. avg. (of 1) = 1 fft 6: mflops = -3252.77 (norm. = -10.5), norm. avg. (of 1) = 0 fft 7: mflops = 35.8761 (norm. = 0.115809), norm. avg. (of 1) = 0.115809 fft 8: mflops = 45.5995 (norm. = 0.147196), norm. avg. (of 1) = 0.147196 fft 9: mflops = 16.7094 (norm. = 0.0539384), norm. avg. (of 1) = 0.0539384 fft 10: mflops = 26.232 (norm. = 0.0846774), norm. avg. (of 1) = 0.0846774 fft 11: mflops = 19.3617 (norm. = 0.0625), norm. avg. (of 1) = 0.0625 fft 12: mflops = 28.0411 (norm. = 0.0905172), norm. avg. (of 1) = 0.0905172 fft 13: mflops = 16.4836 (norm. = 0.0532095), norm. avg. (of 1) = 0.0532095 fft 14: mflops = 18.2058 (norm. = 0.0587687), norm. avg. (of 1) = 0.0587687 Benchmarking for array size = 9: 0. 
Brenner: elapsed time t=1.19995 s, 131072 iters, t-(init.)=1.18329 s t(norm)=0.316438, mflops=15.8009 (err=3.6e-16) 1. CWP (min N): elapsed time t=1.14995 s, 524288 iters, t-(init.)=1.08329 s t(norm)=0.0724241, mflops=69.0378 2. CWP (best N) (N=15): elapsed time t=1.28328 s, 524288 iters, t-(init.)=1.18329 s t(norm)=0.0791094, mflops=63.2036 3. FFTPACK: elapsed time t=1.28328 s, 1048576 iters, t-(init.)=1.14995 s t(norm)=0.0384405, mflops=130.071 (err=1.4e-16) 4. FFTPACK (f2c): elapsed time t=1.63327 s, 1048576 iters, t-(init.)=1.49994 s t(norm)=0.0501398, mflops=99.7212 (err=2.4e-16) FFTW_MEASURE plan: (cost = 6.675453e-07) FFTW_NOTW 9 5. FFTW: elapsed time t=1.34995 s, 2097152 iters, t-(init.)=1.08329 s t(norm)=0.018106, mflops=276.151 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.34995 s, 2097152 iters, t-(init.)=0.58331 s t(norm)=0.0097494, mflops=512.852 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.19995 s, 262144 iters, t-(init.)=1.14995 s t(norm)=0.153762, mflops=32.5178 (err=3.3e-16) 8. GSL: elapsed time t=1.01663 s, 524288 iters, t-(init.)=0.949962 s t(norm)=0.0635104, mflops=78.7273 (err=1.4e-16) 9. NAPACK (f2c): elapsed time t=1.7166 s, 262144 iters, t-(init.)=1.68327 s t(norm)=0.225072, mflops=22.2151 (err=4.3e-16) 10. Singleton: elapsed time t=1.78326 s, 524288 iters, t-(init.)=1.63327 s t(norm)=0.109193, mflops=45.7904 (err=1.5e-16) 11. Singleton (f2c): elapsed time t=1.09996 s, 262144 iters, t-(init.)=1.06662 s t(norm)=0.14262, mflops=35.0582 (err=1.5e-16) 12. Temperton: elapsed time t=1.01663 s, 262144 iters, t-(init.)=0.983294 s t(norm)=0.131478, mflops=38.0293 (err=1.1e-08) 13. Temperton (f2c): elapsed time t=1.36661 s, 262144 iters, t-(init.)=1.33328 s t(norm)=0.178275, mflops=28.0466 (err=1.4e-16) 14. 
Valkenburg: elapsed time t=1.14995 s, 131072 iters, t-(init.)=1.13329 s t(norm)=0.303067, mflops=16.498 (err=3.7e-16) Top mflops for N=9 = 512.852 Normalized results and averages for N=9: fft 0: mflops = 15.8009 (norm. = 0.0308099), norm. avg. (of 1) = 0.0308099 fft 1: mflops = 69.0378 (norm. = 0.134615), norm. avg. (of 2) = 0.130816 fft 2: mflops = 63.2036 (norm. = 0.123239), norm. avg. (of 2) = 0.115558 fft 3: mflops = 130.071 (norm. = 0.253623), norm. avg. (of 2) = 0.237727 fft 4: mflops = 99.7212 (norm. = 0.194444), norm. avg. (of 2) = 0.208138 fft 5: mflops = 276.151 (norm. = 0.538462), norm. avg. (of 2) = 0.769231 fft 6: mflops = 512.852 (norm. = 1), norm. avg. (of 2) = 0.5 fft 7: mflops = 32.5178 (norm. = 0.0634058), norm. avg. (of 2) = 0.0896073 fft 8: mflops = 78.7273 (norm. = 0.153509), norm. avg. (of 2) = 0.150353 fft 9: mflops = 22.2151 (norm. = 0.0433168), norm. avg. (of 2) = 0.0486276 fft 10: mflops = 45.7904 (norm. = 0.0892857), norm. avg. (of 2) = 0.0869816 fft 11: mflops = 35.0582 (norm. = 0.0683594), norm. avg. (of 2) = 0.0654297 fft 12: mflops = 38.0293 (norm. = 0.0741525), norm. avg. (of 2) = 0.0823349 fft 13: mflops = 28.0466 (norm. = 0.0546875), norm. avg. (of 2) = 0.0539485 fft 14: mflops = 16.498 (norm. = 0.0321691), norm. avg. (of 2) = 0.0454689 Benchmarking for array size = 12: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.18329 s, 524288 iters, t-(init.)=1.11662 s t(norm)=0.0495074, mflops=100.995 2. CWP (best N) (N=15): elapsed time t=1.29995 s, 524288 iters, t-(init.)=1.21662 s t(norm)=0.0539409, mflops=92.694 3. FFTPACK: elapsed time t=1.41661 s, 1048576 iters, t-(init.)=1.28328 s t(norm)=0.0284483, mflops=175.757 (err=1.6e-16) 4. FFTPACK (f2c): elapsed time t=1.21662 s, 524288 iters, t-(init.)=1.14995 s t(norm)=0.0509853, mflops=98.0675 (err=1.9e-16) FFTW_MEASURE plan: (cost = 8.900604e-07) FFTW_NOTW 12 5. 
FFTW: elapsed time t=1.79993 s, 2097152 iters, t-(init.)=1.53327 s t(norm)=0.0169951, mflops=294.203 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.79993 s, 2097152 iters, t-(init.)=0.99996 s t(norm)=0.0110838, mflops=451.111 (err=1.2e-16) 7. Frigo-old: elapsed time t=1.99992 s, 524288 iters, t-(init.)=1.93326 s t(norm)=0.0857144, mflops=58.3333 (err=2.9e-16) 8. GSL: elapsed time t=1.46661 s, 524288 iters, t-(init.)=1.39994 s t(norm)=0.062069, mflops=80.5555 (err=1.6e-16) 9. NAPACK (f2c): elapsed time t=1.24995 s, 131072 iters, t-(init.)=1.23328 s t(norm)=0.218719, mflops=22.8603 (err=5.5e-16) 10. Singleton: elapsed time t=1.23328 s, 262144 iters, t-(init.)=1.16662 s t(norm)=0.103448, mflops=48.3333 (err=1.5e-16) 11. Singleton (f2c): elapsed time t=1.5666 s, 262144 iters, t-(init.)=1.53327 s t(norm)=0.135961, mflops=36.7753 (err=1.5e-16) 12. Temperton: elapsed time t=1.63327 s, 524288 iters, t-(init.)=1.5666 s t(norm)=0.0694582, mflops=71.9858 (err=5.4e-16) 13. Temperton (f2c): elapsed time t=1.51661 s, 262144 iters, t-(init.)=1.48327 s t(norm)=0.131527, mflops=38.0149 (err=1.4e-16) 14. Valkenburg: elapsed time t=1.6166 s, 131072 iters, t-(init.)=1.59994 s t(norm)=0.283744, mflops=17.6215 (err=3.9e-16) Top mflops for N=12 = 451.111 Normalized results and averages for N=12: fft 0: mflops = -1 (norm. = -0.00221675), norm. avg. (of 1) = 0.0308099 fft 1: mflops = 100.995 (norm. = 0.223881), norm. avg. (of 3) = 0.161837 fft 2: mflops = 92.694 (norm. = 0.205479), norm. avg. (of 3) = 0.145532 fft 3: mflops = 175.757 (norm. = 0.38961), norm. avg. (of 3) = 0.288355 fft 4: mflops = 98.0675 (norm. = 0.217391), norm. avg. (of 3) = 0.211222 fft 5: mflops = 294.203 (norm. = 0.652174), norm. avg. (of 3) = 0.730212 fft 6: mflops = 451.111 (norm. = 1), norm. avg. (of 3) = 0.666667 fft 7: mflops = 58.3333 (norm. = 0.12931), norm. avg. (of 3) = 0.102842 fft 8: mflops = 80.5555 (norm. = 0.178571), norm. avg. 
(of 3) = 0.159759 fft 9: mflops = 22.8603 (norm. = 0.0506757), norm. avg. (of 3) = 0.0493103 fft 10: mflops = 48.3333 (norm. = 0.107143), norm. avg. (of 3) = 0.093702 fft 11: mflops = 36.7753 (norm. = 0.0815217), norm. avg. (of 3) = 0.0707937 fft 12: mflops = 71.9858 (norm. = 0.159574), norm. avg. (of 3) = 0.108081 fft 13: mflops = 38.0149 (norm. = 0.0842697), norm. avg. (of 3) = 0.0640555 fft 14: mflops = 17.6215 (norm. = 0.0390625), norm. avg. (of 3) = 0.0433334 Benchmarking for array size = 15: 0. Brenner: elapsed time t=1.64993 s, 131072 iters, t-(init.)=1.6166 s t(norm)=0.210461, mflops=23.7574 (err=3.3e-16) 1. CWP (min N): elapsed time t=1.36661 s, 524288 iters, t-(init.)=1.28328 s t(norm)=0.0417667, mflops=119.713 2. CWP (best N): elapsed time t=1.29995 s, 524288 iters, t-(init.)=1.21662 s t(norm)=0.039597, mflops=126.272 3. FFTPACK: elapsed time t=1.73326 s, 1048576 iters, t-(init.)=1.5666 s t(norm)=0.0254939, mflops=196.125 (err=2.6e-16) 4. FFTPACK (f2c): elapsed time t=1.33328 s, 524288 iters, t-(init.)=1.24995 s t(norm)=0.0406818, mflops=122.905 (err=4.1e-16) FFTW_MEASURE plan: (cost = 1.907272e-06) FFTW_TWIDDLE 5 FFTW_NOTW 3 5. FFTW: elapsed time t=1.6166 s, 1048576 iters, t-(init.)=1.44994 s t(norm)=0.0235954, mflops=211.905 (err=2.5e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.06662 s, 524288 iters, t-(init.)=0.849966 s t(norm)=0.0276636, mflops=180.743 (err=1.8e-16) 7. Frigo-old: elapsed time t=1.04996 s, 131072 iters, t-(init.)=1.01663 s t(norm)=0.132351, mflops=37.7782 (err=4.2e-16) 8. GSL: elapsed time t=1.26662 s, 262144 iters, t-(init.)=1.23328 s t(norm)=0.0802788, mflops=62.283 (err=2.0e-16) 9. NAPACK (f2c): elapsed time t=1.13329 s, 65536 iters, t-(init.)=1.11662 s t(norm)=0.290739, mflops=17.1975 (err=1.0e-15) 10. Singleton: elapsed time t=1.51661 s, 262144 iters, t-(init.)=1.43328 s t(norm)=0.0932969, mflops=53.5923 (err=2.9e-16) 11. 
Singleton (f2c): elapsed time t=1.74993 s, 262144 iters, t-(init.)=1.7166 s t(norm)=0.111739, mflops=44.747 (err=2.9e-16) 12. Temperton: elapsed time t=1.99992 s, 524288 iters, t-(init.)=1.91659 s t(norm)=0.0623788, mflops=80.1555 (err=7.9e-16) 13. Temperton (f2c): elapsed time t=1.89992 s, 262144 iters, t-(init.)=1.86659 s t(norm)=0.121503, mflops=41.1513 (err=2.1e-16) 14. Valkenburg: elapsed time t=1.11662 s, 65536 iters, t-(init.)=1.11662 s t(norm)=0.290739, mflops=17.1975 (err=4.0e-16) Top mflops for N=15 = 211.905 Normalized results and averages for N=15: fft 0: mflops = 23.7574 (norm. = 0.112113), norm. avg. (of 2) = 0.0714616 fft 1: mflops = 119.713 (norm. = 0.564935), norm. avg. (of 4) = 0.262612 fft 2: mflops = 126.272 (norm. = 0.59589), norm. avg. (of 4) = 0.258122 fft 3: mflops = 196.125 (norm. = 0.925532), norm. avg. (of 4) = 0.447649 fft 4: mflops = 122.905 (norm. = 0.58), norm. avg. (of 4) = 0.303417 fft 5: mflops = 211.905 (norm. = 1), norm. avg. (of 4) = 0.797659 fft 6: mflops = 180.743 (norm. = 0.852941), norm. avg. (of 4) = 0.713235 fft 7: mflops = 37.7782 (norm. = 0.178279), norm. avg. (of 4) = 0.121701 fft 8: mflops = 62.283 (norm. = 0.293919), norm. avg. (of 4) = 0.193299 fft 9: mflops = 17.1975 (norm. = 0.0811567), norm. avg. (of 4) = 0.0572719 fft 10: mflops = 53.5923 (norm. = 0.252907), norm. avg. (of 4) = 0.133503 fft 11: mflops = 44.747 (norm. = 0.211165), norm. avg. (of 4) = 0.105887 fft 12: mflops = 80.1555 (norm. = 0.378261), norm. avg. (of 4) = 0.175626 fft 13: mflops = 41.1513 (norm. = 0.194196), norm. avg. (of 4) = 0.0965908 fft 14: mflops = 17.1975 (norm. = 0.0811567), norm. avg. (of 4) = 0.0527892 Benchmarking for array size = 18: 0. Brenner: elapsed time t=1.28328 s, 65536 iters, t-(init.)=1.28328 s t(norm)=0.26088, mflops=19.1659 (err=4.5e-16) 1. CWP (min N): elapsed time t=1.84993 s, 524288 iters, t-(init.)=1.73326 s t(norm)=0.0440447, mflops=113.521 2. 
CWP (best N) (N=28): elapsed time t=1.94992 s, 524288 iters, t-(init.)=1.83326 s t(norm)=0.0465858, mflops=107.329 3. FFTPACK: elapsed time t=1.83326 s, 524288 iters, t-(init.)=1.74993 s t(norm)=0.0444682, mflops=112.44 (err=2.6e-16) 4. FFTPACK (f2c): elapsed time t=1.03329 s, 262144 iters, t-(init.)=0.983294 s t(norm)=0.0499738, mflops=100.052 (err=2.6e-16) FFTW_MEASURE plan: (cost = 2.161575e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6 5. FFTW: elapsed time t=1.68327 s, 1048576 iters, t-(init.)=1.48327 s t(norm)=0.0188461, mflops=265.307 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 6. FFTW_ESTIMATE: elapsed time t=1.89992 s, 1048576 iters, t-(init.)=1.44994 s t(norm)=0.0184226, mflops=271.406 (err=1.9e-16) 7. Frigo-old: elapsed time t=1.34995 s, 131072 iters, t-(init.)=1.33328 s t(norm)=0.135522, mflops=36.8943 (err=4.5e-16) 8. GSL: elapsed time t=1.86659 s, 524288 iters, t-(init.)=1.7666 s t(norm)=0.0448918, mflops=111.379 (err=2.2e-16) 9. NAPACK (f2c): elapsed time t=1.7166 s, 131072 iters, t-(init.)=1.69993 s t(norm)=0.172791, mflops=28.9367 (err=8.7e-16) 10. Singleton: elapsed time t=1.58327 s, 262144 iters, t-(init.)=1.49994 s t(norm)=0.0762313, mflops=65.5899 (err=2.1e-16) 11. Singleton (f2c): elapsed time t=1.93326 s, 262144 iters, t-(init.)=1.88326 s t(norm)=0.0957126, mflops=52.2397 (err=2.1e-16) 12. Temperton: elapsed time t=1.91659 s, 262144 iters, t-(init.)=1.86659 s t(norm)=0.0948656, mflops=52.7061 (err=2.7e-08) 13. Temperton (f2c): elapsed time t=1.39994 s, 131072 iters, t-(init.)=1.38328 s t(norm)=0.140604, mflops=35.5608 (err=2.9e-16) 14. Valkenburg: elapsed time t=1.18329 s, 65536 iters, t-(init.)=1.16662 s t(norm)=0.237164, mflops=21.0825 (err=4.1e-16) Top mflops for N=18 = 271.406 Normalized results and averages for N=18: fft 0: mflops = 19.1659 (norm. = 0.0706169), norm. avg. (of 3) = 0.07118 fft 1: mflops = 113.521 (norm. = 0.418269), norm. avg. (of 5) = 0.293743 fft 2: mflops = 107.329 (norm. = 0.395455), norm. avg. 
(of 5) = 0.285588 fft 3: mflops = 112.44 (norm. = 0.414286), norm. avg. (of 5) = 0.440976 fft 4: mflops = 100.052 (norm. = 0.368644), norm. avg. (of 5) = 0.316462 fft 5: mflops = 265.307 (norm. = 0.977528), norm. avg. (of 5) = 0.833633 fft 6: mflops = 271.406 (norm. = 1), norm. avg. (of 5) = 0.770588 fft 7: mflops = 36.8943 (norm. = 0.135938), norm. avg. (of 5) = 0.124548 fft 8: mflops = 111.379 (norm. = 0.410377), norm. avg. (of 5) = 0.236715 fft 9: mflops = 28.9367 (norm. = 0.106618), norm. avg. (of 5) = 0.067141 fft 10: mflops = 65.5899 (norm. = 0.241667), norm. avg. (of 5) = 0.155136 fft 11: mflops = 52.2397 (norm. = 0.192478), norm. avg. (of 5) = 0.123205 fft 12: mflops = 52.7061 (norm. = 0.194196), norm. avg. (of 5) = 0.17934 fft 13: mflops = 35.5608 (norm. = 0.131024), norm. avg. (of 5) = 0.103477 fft 14: mflops = 21.0825 (norm. = 0.0776786), norm. avg. (of 5) = 0.0577671 Benchmarking for array size = 24: 0. Skipping fft (Brenner has a bug for N=3*2^m). 1. CWP (min N): elapsed time t=1.91659 s, 524288 iters, t-(init.)=1.81659 s t(norm)=0.0314877, mflops=158.792 2. CWP (best N) (N=28): elapsed time t=1.96659 s, 524288 iters, t-(init.)=1.83326 s t(norm)=0.0317766, mflops=157.349 3. FFTPACK: elapsed time t=1.26662 s, 262144 iters, t-(init.)=1.19995 s t(norm)=0.0415984, mflops=120.197 (err=2.2e-16) 4. FFTPACK (f2c): elapsed time t=1.41661 s, 262144 iters, t-(init.)=1.36661 s t(norm)=0.047376, mflops=105.539 (err=2.4e-16) FFTW_MEASURE plan: (cost = 2.543030e-06) FFTW_TWIDDLE 4 FFTW_NOTW 6 5. FFTW: elapsed time t=1.16662 s, 524288 iters, t-(init.)=1.04996 s t(norm)=0.0181993, mflops=274.736 (err=2.3e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.26662 s, 524288 iters, t-(init.)=1.03329 s t(norm)=0.0179104, mflops=279.167 (err=2.1e-16) 7. Frigo-old: elapsed time t=1.91659 s, 262144 iters, t-(init.)=1.86659 s t(norm)=0.0647087, mflops=77.2694 (err=3.6e-16) 8. 
GSL: elapsed time t=1.94992 s, 524288 iters, t-(init.)=1.83326 s t(norm)=0.0317766, mflops=157.349 (err=2.1e-16) 9. NAPACK (f2c): elapsed time t=1.24995 s, 65536 iters, t-(init.)=1.23328 s t(norm)=0.171016, mflops=29.2371 (err=8.0e-16) 10. Singleton: elapsed time t=1.21662 s, 131072 iters, t-(init.)=1.16662 s t(norm)=0.0808858, mflops=61.8155 (err=2.3e-16) 11. Singleton (f2c): elapsed time t=1.48327 s, 131072 iters, t-(init.)=1.44994 s t(norm)=0.10053, mflops=49.7366 (err=2.3e-16) 12. Temperton: elapsed time t=1.43328 s, 262144 iters, t-(init.)=1.38328 s t(norm)=0.0479537, mflops=104.267 (err=4.5e-09) 13. Temperton (f2c): elapsed time t=1.68327 s, 131072 iters, t-(init.)=1.64993 s t(norm)=0.114396, mflops=43.7079 (err=2.8e-16) 14. Valkenburg: elapsed time t=1.79993 s, 65536 iters, t-(init.)=1.78326 s t(norm)=0.24728, mflops=20.22 (err=5.6e-16) Top mflops for N=24 = 279.167 Normalized results and averages for N=24: fft 0: mflops = -1 (norm. = -0.00358209), norm. avg. (of 3) = 0.07118 fft 1: mflops = 158.792 (norm. = 0.568807), norm. avg. (of 6) = 0.339587 fft 2: mflops = 157.349 (norm. = 0.563636), norm. avg. (of 6) = 0.331929 fft 3: mflops = 120.197 (norm. = 0.430556), norm. avg. (of 6) = 0.43924 fft 4: mflops = 105.539 (norm. = 0.378049), norm. avg. (of 6) = 0.326727 fft 5: mflops = 274.736 (norm. = 0.984127), norm. avg. (of 6) = 0.858715 fft 6: mflops = 279.167 (norm. = 1), norm. avg. (of 6) = 0.808824 fft 7: mflops = 77.2694 (norm. = 0.276786), norm. avg. (of 6) = 0.149921 fft 8: mflops = 157.349 (norm. = 0.563636), norm. avg. (of 6) = 0.291202 fft 9: mflops = 29.2371 (norm. = 0.10473), norm. avg. (of 6) = 0.0734058 fft 10: mflops = 61.8155 (norm. = 0.221429), norm. avg. (of 6) = 0.166185 fft 11: mflops = 49.7366 (norm. = 0.178161), norm. avg. (of 6) = 0.132364 fft 12: mflops = 104.267 (norm. = 0.373494), norm. avg. (of 6) = 0.211699 fft 13: mflops = 43.7079 (norm. = 0.156566), norm. avg. (of 6) = 0.112325 fft 14: mflops = 20.22 (norm. = 0.0724299), norm. avg. 
(of 6) = 0.0602109 Benchmarking for array size = 36: 0. Brenner: elapsed time t=1.08329 s, 32768 iters, t-(init.)=1.08329 s t(norm)=0.177627, mflops=28.1489 (err=5.5e-16) 1. CWP (min N): elapsed time t=1.29995 s, 262144 iters, t-(init.)=1.21662 s t(norm)=0.024936, mflops=200.513 2. CWP (best N): elapsed time t=1.21662 s, 262144 iters, t-(init.)=1.13329 s t(norm)=0.0232281, mflops=215.257 3. FFTPACK: elapsed time t=1.16662 s, 262144 iters, t-(init.)=1.08329 s t(norm)=0.0222033, mflops=225.191 (err=3.9e-16) 4. FFTPACK (f2c): elapsed time t=1.99992 s, 262144 iters, t-(init.)=1.93326 s t(norm)=0.0396244, mflops=126.185 (err=4.5e-16) FFTW_MEASURE plan: (cost = 3.560242e-06) FFTW_TWIDDLE 6 FFTW_NOTW 6 5. FFTW: elapsed time t=1.04996 s, 262144 iters, t-(init.)=0.966628 s t(norm)=0.0198122, mflops=252.37 (err=4.6e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.19995 s, 262144 iters, t-(init.)=1.04996 s t(norm)=0.0215201, mflops=232.34 (err=4.4e-16) 7. Frigo-old: elapsed time t=1.31661 s, 65536 iters, t-(init.)=1.29995 s t(norm)=0.106576, mflops=46.9149 (err=5.4e-16) 8. GSL: elapsed time t=1.49994 s, 262144 iters, t-(init.)=1.41661 s t(norm)=0.0290351, mflops=172.205 (err=4.3e-16) 9. NAPACK (f2c): elapsed time t=1.73326 s, 65536 iters, t-(init.)=1.7166 s t(norm)=0.140735, mflops=35.5278 (err=1.4e-15) 10. Singleton: elapsed time t=1.28328 s, 131072 iters, t-(init.)=1.23328 s t(norm)=0.0505553, mflops=98.9017 (err=4.7e-16) 11. Singleton (f2c): elapsed time t=1.49994 s, 131072 iters, t-(init.)=1.44994 s t(norm)=0.0594366, mflops=84.1233 (err=4.7e-16) 12. Temperton: elapsed time t=1.18329 s, 131072 iters, t-(init.)=1.14995 s t(norm)=0.0471394, mflops=106.068 (err=5.1e-08) 13. Temperton (f2c): elapsed time t=1.84993 s, 131072 iters, t-(init.)=1.81659 s t(norm)=0.0744665, mflops=67.1442 (err=3.7e-16) 14. 
Valkenburg: elapsed time t=1.13329 s, 16384 iters, t-(init.)=1.13329 s t(norm)=0.37165, mflops=13.4535 (err=6.2e-16) Top mflops for N=36 = 252.37 Normalized results and averages for N=36: fft 0: mflops = 28.1489 (norm. = 0.111538), norm. avg. (of 4) = 0.0812697 fft 1: mflops = 200.513 (norm. = 0.794521), norm. avg. (of 7) = 0.404578 fft 2: mflops = 215.257 (norm. = 0.852941), norm. avg. (of 7) = 0.40636 fft 3: mflops = 225.191 (norm. = 0.892308), norm. avg. (of 7) = 0.503964 fft 4: mflops = 126.185 (norm. = 0.5), norm. avg. (of 7) = 0.35148 fft 5: mflops = 252.37 (norm. = 1), norm. avg. (of 7) = 0.878899 fft 6: mflops = 232.34 (norm. = 0.920635), norm. avg. (of 7) = 0.824797 fft 7: mflops = 46.9149 (norm. = 0.185897), norm. avg. (of 7) = 0.155061 fft 8: mflops = 172.205 (norm. = 0.682353), norm. avg. (of 7) = 0.34708 fft 9: mflops = 35.5278 (norm. = 0.140777), norm. avg. (of 7) = 0.0830302 fft 10: mflops = 98.9017 (norm. = 0.391892), norm. avg. (of 7) = 0.198429 fft 11: mflops = 84.1233 (norm. = 0.333333), norm. avg. (of 7) = 0.161074 fft 12: mflops = 106.068 (norm. = 0.42029), norm. avg. (of 7) = 0.241498 fft 13: mflops = 67.1442 (norm. = 0.266055), norm. avg. (of 7) = 0.134287 fft 14: mflops = 13.4535 (norm. = 0.0533088), norm. avg. (of 7) = 0.0592249 Benchmarking for array size = 80: 0. Brenner: elapsed time t=1.68327 s, 32768 iters, t-(init.)=1.6666 s t(norm)=0.100564, mflops=49.7197 (err=4.3e-16) 1. CWP (min N): elapsed time t=1.26662 s, 131072 iters, t-(init.)=1.18329 s t(norm)=0.0178501, mflops=280.111 2. CWP (best N) (N=84): elapsed time t=1.29995 s, 131072 iters, t-(init.)=1.21662 s t(norm)=0.0183529, mflops=272.436 3. FFTPACK: elapsed time t=1.51661 s, 131072 iters, t-(init.)=1.43328 s t(norm)=0.0216212, mflops=231.254 (err=3.2e-16) 4. FFTPACK (f2c): elapsed time t=1.04996 s, 65536 iters, t-(init.)=0.99996 s t(norm)=0.0301692, mflops=165.732 (err=3.8e-16) FFTW_MEASURE plan: (cost = 8.137695e-06) FFTW_TWIDDLE 10 FFTW_NOTW 8 5. 
FFTW: elapsed time t=1.98325 s, 262144 iters, t-(init.)=1.81659 s t(norm)=0.0137018, mflops=364.915 (err=3.5e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 6. FFTW_ESTIMATE: elapsed time t=1.11662 s, 131072 iters, t-(init.)=0.99996 s t(norm)=0.0150846, mflops=331.464 (err=3.5e-16) 7. Frigo-old: elapsed time t=1.04996 s, 32768 iters, t-(init.)=1.03329 s t(norm)=0.0623496, mflops=80.193 (err=3.6e-16) 8. GSL: elapsed time t=1.74993 s, 65536 iters, t-(init.)=1.7166 s t(norm)=0.0517904, mflops=96.543 (err=3.3e-16) 9. NAPACK (f2c): elapsed time t=1.58327 s, 16384 iters, t-(init.)=1.5666 s t(norm)=0.18906, mflops=26.4466 (err=5.0e-16) 10. Singleton: elapsed time t=1.13329 s, 65536 iters, t-(init.)=1.08329 s t(norm)=0.0326833, mflops=152.984 (err=4.4e-16) 11. Singleton (f2c): elapsed time t=1.28328 s, 65536 iters, t-(init.)=1.24995 s t(norm)=0.0377114, mflops=132.586 (err=4.4e-16) 12. Temperton: elapsed time t=1.96659 s, 131072 iters, t-(init.)=1.88326 s t(norm)=0.0284093, mflops=175.999 (err=5.3e-08) 13. Temperton (f2c): elapsed time t=1.96659 s, 65536 iters, t-(init.)=1.93326 s t(norm)=0.058327, mflops=85.7235 (err=3.4e-16) 14. Valkenburg: elapsed time t=1.08329 s, 8192 iters, t-(init.)=1.08329 s t(norm)=0.261466, mflops=19.1229 (err=4.6e-16) Top mflops for N=80 = 364.915 Normalized results and averages for N=80: fft 0: mflops = 49.7197 (norm. = 0.13625), norm. avg. (of 5) = 0.0922657 fft 1: mflops = 280.111 (norm. = 0.767606), norm. avg. (of 8) = 0.449956 fft 2: mflops = 272.436 (norm. = 0.746575), norm. avg. (of 8) = 0.448887 fft 3: mflops = 231.254 (norm. = 0.633721), norm. avg. (of 8) = 0.520183 fft 4: mflops = 165.732 (norm. = 0.454167), norm. avg. (of 8) = 0.364316 fft 5: mflops = 364.915 (norm. = 1), norm. avg. (of 8) = 0.894036 fft 6: mflops = 331.464 (norm. = 0.908333), norm. avg. (of 8) = 0.835239 fft 7: mflops = 80.193 (norm. = 0.219758), norm. avg. (of 8) = 0.163148 fft 8: mflops = 96.543 (norm. = 0.264563), norm. avg. 
(of 8) = 0.336766 fft 9: mflops = 26.4466 (norm. = 0.0724734), norm. avg. (of 8) = 0.0817106 fft 10: mflops = 152.984 (norm. = 0.419231), norm. avg. (of 8) = 0.226029 fft 11: mflops = 132.586 (norm. = 0.363333), norm. avg. (of 8) = 0.186356 fft 12: mflops = 175.999 (norm. = 0.482301), norm. avg. (of 8) = 0.271598 fft 13: mflops = 85.7235 (norm. = 0.234914), norm. avg. (of 8) = 0.146865 fft 14: mflops = 19.1229 (norm. = 0.0524038), norm. avg. (of 8) = 0.0583723 Benchmarking for array size = 108: 0. Brenner: elapsed time t=1.94992 s, 16384 iters, t-(init.)=1.93326 s t(norm)=0.161744, mflops=30.9131 (err=6.5e-16) 1. CWP (min N) (N=110): elapsed time t=1.09996 s, 65536 iters, t-(init.)=1.03329 s t(norm)=0.0216123, mflops=231.35 2. CWP (best N) (N=112): elapsed time t=1.03329 s, 65536 iters, t-(init.)=0.983294 s t(norm)=0.0205666, mflops=243.113 3. FFTPACK: elapsed time t=1.04996 s, 65536 iters, t-(init.)=0.99996 s t(norm)=0.0209151, mflops=239.061 (err=4.1e-16) 4. FFTPACK (f2c): elapsed time t=1.79993 s, 65536 iters, t-(init.)=1.74993 s t(norm)=0.0366015, mflops=136.606 (err=4.1e-16) FFTW_MEASURE plan: (cost = 1.322375e-05) FFTW_TWIDDLE 9 FFTW_NOTW 12 5. FFTW: elapsed time t=1.79993 s, 131072 iters, t-(init.)=1.68327 s t(norm)=0.0176036, mflops=284.033 (err=3.6e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 6. FFTW_ESTIMATE: elapsed time t=1.78326 s, 131072 iters, t-(init.)=1.63327 s t(norm)=0.0170807, mflops=292.728 (err=3.6e-16) 7. Frigo-old: elapsed time t=1.68327 s, 16384 iters, t-(init.)=1.68327 s t(norm)=0.140829, mflops=35.5041 (err=5.5e-16) 8. GSL: elapsed time t=1.6666 s, 65536 iters, t-(init.)=1.6166 s t(norm)=0.0338128, mflops=147.873 (err=3.9e-16) 9. NAPACK (f2c): elapsed time t=1.53327 s, 16384 iters, t-(init.)=1.51661 s t(norm)=0.126885, mflops=39.4057 (err=3.1e-15) 10. Singleton: elapsed time t=1.14995 s, 32768 iters, t-(init.)=1.11662 s t(norm)=0.0467105, mflops=107.042 (err=4.5e-16) 11. 
Singleton (f2c): elapsed time t=1.31661 s, 32768 iters, t-(init.)=1.28328 s t(norm)=0.0536822, mflops=93.1407 (err=4.5e-16) 12. Temperton: elapsed time t=1.86659 s, 65536 iters, t-(init.)=1.81659 s t(norm)=0.0379958, mflops=131.593 (err=7.4e-08) 13. Temperton (f2c): elapsed time t=1.28328 s, 32768 iters, t-(init.)=1.24995 s t(norm)=0.0522879, mflops=95.6245 (err=3.5e-16) 14. Valkenburg: elapsed time t=1.53327 s, 8192 iters, t-(init.)=1.51661 s t(norm)=0.25377, mflops=19.7029 (err=7.5e-16) Top mflops for N=108 = 292.728 Normalized results and averages for N=108: fft 0: mflops = 30.9131 (norm. = 0.105603), norm. avg. (of 6) = 0.0944887 fft 1: mflops = 231.35 (norm. = 0.790323), norm. avg. (of 9) = 0.487775 fft 2: mflops = 243.113 (norm. = 0.830508), norm. avg. (of 9) = 0.491289 fft 3: mflops = 239.061 (norm. = 0.816667), norm. avg. (of 9) = 0.553126 fft 4: mflops = 136.606 (norm. = 0.466667), norm. avg. (of 9) = 0.375688 fft 5: mflops = 284.033 (norm. = 0.970297), norm. avg. (of 9) = 0.90251 fft 6: mflops = 292.728 (norm. = 1), norm. avg. (of 9) = 0.853545 fft 7: mflops = 35.5041 (norm. = 0.121287), norm. avg. (of 9) = 0.158497 fft 8: mflops = 147.873 (norm. = 0.505155), norm. avg. (of 9) = 0.355476 fft 9: mflops = 39.4057 (norm. = 0.134615), norm. avg. (of 9) = 0.0875889 fft 10: mflops = 107.042 (norm. = 0.365672), norm. avg. (of 9) = 0.241545 fft 11: mflops = 93.1407 (norm. = 0.318182), norm. avg. (of 9) = 0.201004 fft 12: mflops = 131.593 (norm. = 0.449541), norm. avg. (of 9) = 0.29137 fft 13: mflops = 95.6245 (norm. = 0.326667), norm. avg. (of 9) = 0.166843 fft 14: mflops = 19.7029 (norm. = 0.0673077), norm. avg. (of 9) = 0.0593651 Benchmarking for array size = 210: 0. Brenner: elapsed time t=1.84993 s, 8192 iters, t-(init.)=1.83326 s t(norm)=0.138141, mflops=36.195 (err=6.8e-16) 1. CWP (min N): elapsed time t=1.74993 s, 65536 iters, t-(init.)=1.64993 s t(norm)=0.0155408, mflops=321.733 2. 
CWP (best N): elapsed time t=1.7166 s, 65536 iters, t-(init.)=1.6166 s t(norm)=0.0152269, mflops=328.367 3. FFTPACK: elapsed time t=1.03329 s, 16384 iters, t-(init.)=1.01663 s t(norm)=0.0383026, mflops=130.539 (err=4.7e-16) 4. FFTPACK (f2c): elapsed time t=1.5666 s, 16384 iters, t-(init.)=1.53327 s t(norm)=0.0577679, mflops=86.5533 (err=6.2e-16) FFTW_MEASURE plan: (cost = 3.458521e-05) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 6 5. FFTW: elapsed time t=1.23328 s, 32768 iters, t-(init.)=1.18329 s t(norm)=0.0222909, mflops=224.307 (err=4.8e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.73326 s, 32768 iters, t-(init.)=1.6666 s t(norm)=0.0313956, mflops=159.258 (err=4.9e-16) 7. Frigo-old: elapsed time t=1.81659 s, 8192 iters, t-(init.)=1.81659 s t(norm)=0.136885, mflops=36.5271 (err=6.3e-16) 8. GSL: elapsed time t=1.18329 s, 16384 iters, t-(init.)=1.14995 s t(norm)=0.0433259, mflops=115.404 (err=6.4e-16) 9. NAPACK (f2c): elapsed time t=1.6666 s, 4096 iters, t-(init.)=1.64993 s t(norm)=0.248653, mflops=20.1083 (err=1.5e-14) 10. Singleton: elapsed time t=1.58327 s, 16384 iters, t-(init.)=1.54994 s t(norm)=0.0583958, mflops=85.6226 (err=6.4e-16) 11. Singleton (f2c): elapsed time t=1.74993 s, 16384 iters, t-(init.)=1.73326 s t(norm)=0.0653028, mflops=76.5664 (err=6.4e-16) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.86659 s, 4096 iters, t-(init.)=1.86659 s t(norm)=0.281305, mflops=17.7743 (err=7.5e-16) Top mflops for N=210 = 328.367 Normalized results and averages for N=210: fft 0: mflops = 36.195 (norm. = 0.110227), norm. avg. (of 7) = 0.096737 fft 1: mflops = 321.733 (norm. = 0.979798), norm. avg. (of 10) = 0.536977 fft 2: mflops = 328.367 (norm. = 1), norm. avg. (of 10) = 0.54216 fft 3: mflops = 130.539 (norm. = 0.397541), norm. avg. (of 10) = 0.537567 fft 4: mflops = 86.5533 (norm. 
= 0.263587), norm. avg. (of 10) = 0.364478 fft 5: mflops = 224.307 (norm. = 0.683099), norm. avg. (of 10) = 0.880569 fft 6: mflops = 159.258 (norm. = 0.485), norm. avg. (of 10) = 0.816691 fft 7: mflops = 36.5271 (norm. = 0.111239), norm. avg. (of 10) = 0.153771 fft 8: mflops = 115.404 (norm. = 0.351449), norm. avg. (of 10) = 0.355073 fft 9: mflops = 20.1083 (norm. = 0.0612374), norm. avg. (of 10) = 0.0849538 fft 10: mflops = 85.6226 (norm. = 0.260753), norm. avg. (of 10) = 0.243466 fft 11: mflops = 76.5664 (norm. = 0.233173), norm. avg. (of 10) = 0.204221 fft 12: mflops = -1 (norm. = -0.00304537), norm. avg. (of 9) = 0.29137 fft 13: mflops = -1 (norm. = -0.00304537), norm. avg. (of 9) = 0.166843 fft 14: mflops = 17.7743 (norm. = 0.0541295), norm. avg. (of 10) = 0.0588415 Benchmarking for array size = 504: 0. Brenner: elapsed time t=1.31661 s, 2048 iters, t-(init.)=1.29995 s t(norm)=0.140288, mflops=35.641 (err=1.5e-15) 1. CWP (min N): elapsed time t=1.86659 s, 32768 iters, t-(init.)=1.74993 s t(norm)=0.0118031, mflops=423.618 2. CWP (best N): elapsed time t=1.84993 s, 32768 iters, t-(init.)=1.73326 s t(norm)=0.0116907, mflops=427.691 3. FFTPACK: elapsed time t=1.69993 s, 8192 iters, t-(init.)=1.6666 s t(norm)=0.0449641, mflops=111.2 (err=1.3e-15) 4. FFTPACK (f2c): elapsed time t=1.24995 s, 4096 iters, t-(init.)=1.24995 s t(norm)=0.0674462, mflops=74.1332 (err=1.3e-15) FFTW_MEASURE plan: (cost = 9.358350e-05) FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 8 5. FFTW: elapsed time t=1.53327 s, 16384 iters, t-(init.)=1.46661 s t(norm)=0.0197842, mflops=252.727 (err=1.2e-15) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.64993 s, 16384 iters, t-(init.)=1.58327 s t(norm)=0.021358, mflops=234.105 (err=1.2e-15) 7. Frigo-old: elapsed time t=1.06662 s, 2048 iters, t-(init.)=1.06662 s t(norm)=0.115108, mflops=43.4374 (err=1.3e-15) 8. 
GSL: elapsed time t=1.11662 s, 8192 iters, t-(init.)=1.08329 s t(norm)=0.0292267, mflops=171.077 (err=1.3e-15) 9. NAPACK (f2c): elapsed time t=1.96659 s, 2048 iters, t-(init.)=1.96659 s t(norm)=0.212231, mflops=23.5593 (err=4.1e-14) 10. Singleton: elapsed time t=1.79993 s, 8192 iters, t-(init.)=1.7666 s t(norm)=0.047662, mflops=104.905 (err=1.9e-15) 11. Singleton (f2c): elapsed time t=1.91659 s, 8192 iters, t-(init.)=1.88326 s t(norm)=0.0508095, mflops=98.4069 (err=1.9e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.38328 s, 1024 iters, t-(init.)=1.38328 s t(norm)=0.298562, mflops=16.747 (err=1.7e-15) Top mflops for N=504 = 427.691 Normalized results and averages for N=504: fft 0: mflops = 35.641 (norm. = 0.0833333), norm. avg. (of 8) = 0.0950616 fft 1: mflops = 423.618 (norm. = 0.990476), norm. avg. (of 11) = 0.578204 fft 2: mflops = 427.691 (norm. = 1), norm. avg. (of 11) = 0.583782 fft 3: mflops = 111.2 (norm. = 0.26), norm. avg. (of 11) = 0.512334 fft 4: mflops = 74.1332 (norm. = 0.173333), norm. avg. (of 11) = 0.347101 fft 5: mflops = 252.727 (norm. = 0.590909), norm. avg. (of 11) = 0.854236 fft 6: mflops = 234.105 (norm. = 0.547368), norm. avg. (of 11) = 0.792207 fft 7: mflops = 43.4374 (norm. = 0.101562), norm. avg. (of 11) = 0.149025 fft 8: mflops = 171.077 (norm. = 0.4), norm. avg. (of 11) = 0.359157 fft 9: mflops = 23.5593 (norm. = 0.0550847), norm. avg. (of 11) = 0.0822384 fft 10: mflops = 104.905 (norm. = 0.245283), norm. avg. (of 11) = 0.243631 fft 11: mflops = 98.4069 (norm. = 0.230088), norm. avg. (of 11) = 0.206572 fft 12: mflops = -1 (norm. = -0.00233813), norm. avg. (of 9) = 0.29137 fft 13: mflops = -1 (norm. = -0.00233813), norm. avg. (of 9) = 0.166843 fft 14: mflops = 16.747 (norm. = 0.0391566), norm. avg. (of 11) = 0.057052 Benchmarking for array size = 1000: 0. 
Brenner: elapsed time t=1.48327 s, 1024 iters, t-(init.)=1.48327 s t(norm)=0.145348, mflops=34.4001 (err=1.1e-15) 1. CWP (min N) (N=1001): elapsed time t=1.73326 s, 8192 iters, t-(init.)=1.68327 s t(norm)=0.0206182, mflops=242.504 2. CWP (best N) (N=1008): elapsed time t=1.38328 s, 8192 iters, t-(init.)=1.31661 s t(norm)=0.0161271, mflops=310.037 3. FFTPACK: elapsed time t=1.31661 s, 4096 iters, t-(init.)=1.28328 s t(norm)=0.0314377, mflops=159.045 (err=9.9e-16) 4. FFTPACK (f2c): elapsed time t=1.74993 s, 4096 iters, t-(init.)=1.7166 s t(norm)=0.042053, mflops=118.898 (err=1.1e-15) FFTW_MEASURE plan: (cost = 2.278555e-04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 5. FFTW: elapsed time t=1.81659 s, 8192 iters, t-(init.)=1.7666 s t(norm)=0.0216389, mflops=231.065 (err=9.8e-16) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 6. FFTW_ESTIMATE: elapsed time t=1.78326 s, 8192 iters, t-(init.)=1.7166 s t(norm)=0.0210265, mflops=237.795 (err=9.8e-16) 7. Frigo-old: elapsed time t=1.08329 s, 1024 iters, t-(init.)=1.06662 s t(norm)=0.10452, mflops=47.8377 (err=1.0e-15) 8. GSL: elapsed time t=1.21662 s, 2048 iters, t-(init.)=1.19995 s t(norm)=0.0587926, mflops=85.0448 (err=1.0e-15) 9. NAPACK (f2c): elapsed time t=1.41661 s, 512 iters, t-(init.)=1.41661 s t(norm)=0.277632, mflops=18.0095 (err=1.7e-14) 10. Singleton: elapsed time t=1.04996 s, 2048 iters, t-(init.)=1.03329 s t(norm)=0.0506269, mflops=98.7617 (err=1.5e-15) 11. Singleton (f2c): elapsed time t=1.11662 s, 2048 iters, t-(init.)=1.09996 s t(norm)=0.0538932, mflops=92.7761 (err=1.5e-15) 12. Temperton: elapsed time t=1.41661 s, 4096 iters, t-(init.)=1.38328 s t(norm)=0.0338874, mflops=147.548 (err=1.3e-07) 13. Temperton (f2c): elapsed time t=1.59994 s, 2048 iters, t-(init.)=1.58327 s t(norm)=0.0775735, mflops=64.455 (err=9.9e-16) 14. 
Valkenburg: elapsed time t=1.64993 s, 512 iters, t-(init.)=1.63327 s t(norm)=0.320093, mflops=15.6205 (err=1.1e-15) Top mflops for N=1000 = 310.037 Normalized results and averages for N=1000: fft 0: mflops = 34.4001 (norm. = 0.110955), norm. avg. (of 9) = 0.0968275 fft 1: mflops = 242.504 (norm. = 0.782178), norm. avg. (of 12) = 0.595202 fft 2: mflops = 310.037 (norm. = 1), norm. avg. (of 12) = 0.618467 fft 3: mflops = 159.045 (norm. = 0.512987), norm. avg. (of 12) = 0.512388 fft 4: mflops = 118.898 (norm. = 0.383495), norm. avg. (of 12) = 0.350134 fft 5: mflops = 231.065 (norm. = 0.745283), norm. avg. (of 12) = 0.845157 fft 6: mflops = 237.795 (norm. = 0.76699), norm. avg. (of 12) = 0.790106 fft 7: mflops = 47.8377 (norm. = 0.154297), norm. avg. (of 12) = 0.149464 fft 8: mflops = 85.0448 (norm. = 0.274306), norm. avg. (of 12) = 0.352086 fft 9: mflops = 18.0095 (norm. = 0.0580882), norm. avg. (of 12) = 0.0802259 fft 10: mflops = 98.7617 (norm. = 0.318548), norm. avg. (of 12) = 0.249874 fft 11: mflops = 92.7761 (norm. = 0.299242), norm. avg. (of 12) = 0.214295 fft 12: mflops = 147.548 (norm. = 0.475904), norm. avg. (of 10) = 0.309823 fft 13: mflops = 64.455 (norm. = 0.207895), norm. avg. (of 10) = 0.170948 fft 14: mflops = 15.6205 (norm. = 0.0503827), norm. avg. (of 12) = 0.0564962 Benchmarking for array size = 1960: 0. Brenner: elapsed time t=1.53327 s, 512 iters, t-(init.)=1.53327 s t(norm)=0.139704, mflops=35.7899 (err=2.9e-15) 1. CWP (min N) (N=1980): elapsed time t=1.6666 s, 4096 iters, t-(init.)=1.59994 s t(norm)=0.0182223, mflops=274.389 2. CWP (best N) (N=1980): elapsed time t=1.6666 s, 4096 iters, t-(init.)=1.59994 s t(norm)=0.0182223, mflops=274.389 3. FFTPACK: elapsed time t=1.38328 s, 1024 iters, t-(init.)=1.36661 s t(norm)=0.0622595, mflops=80.3091 (err=2.8e-15) 4. 
FFTPACK (f2c): elapsed time t=2.01659 s, 1024 iters, t-(init.)=1.99992 s t(norm)=0.0911114, mflops=54.8779 (err=2.8e-15) FFTW_MEASURE plan: (cost = 5.208125e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 7 5. FFTW: elapsed time t=1.23328 s, 2048 iters, t-(init.)=1.19995 s t(norm)=0.0273334, mflops=182.926 (err=2.8e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 6. FFTW_ESTIMATE: elapsed time t=1.24995 s, 2048 iters, t-(init.)=1.21662 s t(norm)=0.0277131, mflops=180.42 (err=2.8e-15) 7. Frigo-old: elapsed time t=1.23328 s, 512 iters, t-(init.)=1.21662 s t(norm)=0.110852, mflops=45.1051 (err=2.8e-15) 8. GSL: elapsed time t=1.81659 s, 2048 iters, t-(init.)=1.79993 s t(norm)=0.0410001, mflops=121.951 (err=2.8e-15) 9. NAPACK (f2c): elapsed time t=1.59994 s, 256 iters, t-(init.)=1.59994 s t(norm)=0.291557, mflops=17.1493 (err=1.3e-13) 10. Singleton: elapsed time t=1.34995 s, 1024 iters, t-(init.)=1.34995 s t(norm)=0.0615002, mflops=81.3005 (err=4.3e-15) 11. Singleton (f2c): elapsed time t=1.43328 s, 1024 iters, t-(init.)=1.41661 s t(norm)=0.0645372, mflops=77.4746 (err=4.3e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.01663 s, 128 iters, t-(init.)=1.01663 s t(norm)=0.37052, mflops=13.4946 (err=2.8e-15) Top mflops for N=1960 = 274.389 Normalized results and averages for N=1960: fft 0: mflops = 35.7899 (norm. = 0.130435), norm. avg. (of 10) = 0.100188 fft 1: mflops = 274.389 (norm. = 1), norm. avg. (of 13) = 0.62634 fft 2: mflops = 274.389 (norm. = 1), norm. avg. (of 13) = 0.647816 fft 3: mflops = 80.3091 (norm. = 0.292683), norm. avg. (of 13) = 0.495488 fft 4: mflops = 54.8779 (norm. = 0.2), norm. avg. (of 13) = 0.338585 fft 5: mflops = 182.926 (norm. = 0.666667), norm. avg. (of 13) = 0.831427 fft 6: mflops = 180.42 (norm. = 0.657534), norm. avg. (of 13) = 0.779908 fft 7: mflops = 45.1051 (norm. 
= 0.164384), norm. avg. (of 13) = 0.150612 fft 8: mflops = 121.951 (norm. = 0.444444), norm. avg. (of 13) = 0.359191 fft 9: mflops = 17.1493 (norm. = 0.0625), norm. avg. (of 13) = 0.0788624 fft 10: mflops = 81.3005 (norm. = 0.296296), norm. avg. (of 13) = 0.253445 fft 11: mflops = 77.4746 (norm. = 0.282353), norm. avg. (of 13) = 0.21953 fft 12: mflops = -1 (norm. = -0.00364446), norm. avg. (of 10) = 0.309823 fft 13: mflops = -1 (norm. = -0.00364446), norm. avg. (of 10) = 0.170948 fft 14: mflops = 13.4946 (norm. = 0.0491803), norm. avg. (of 13) = 0.0559335 Benchmarking for array size = 4725: 0. Brenner: elapsed time t=1.51661 s, 128 iters, t-(init.)=1.51661 s t(norm)=0.20544, mflops=24.3381 (err=1.9e-15) 1. CWP (min N) (N=5005): elapsed time t=1.38328 s, 1024 iters, t-(init.)=1.34995 s t(norm)=0.022858, mflops=218.742 2. CWP (best N) (N=5040): elapsed time t=1.13329 s, 1024 iters, t-(init.)=1.08329 s t(norm)=0.0183428, mflops=272.586 3. FFTPACK: elapsed time t=1.6666 s, 512 iters, t-(init.)=1.63327 s t(norm)=0.0553107, mflops=90.3985 (err=1.8e-15) 4. FFTPACK (f2c): elapsed time t=1.19995 s, 256 iters, t-(init.)=1.18329 s t(norm)=0.080144, mflops=62.3877 (err=1.9e-15) FFTW_MEASURE plan: (cost = 1.627539e-03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 3 FFTW_TWIDDLE 9 FFTW_NOTW 5 5. FFTW: elapsed time t=1.84993 s, 1024 iters, t-(init.)=1.81659 s t(norm)=0.0307595, mflops=162.551 (err=1.9e-15) FFTW_ESTIMATE plan: (cost = 1.946700e+05) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.83326 s, 1024 iters, t-(init.)=1.79993 s t(norm)=0.0304773, mflops=164.057 (err=1.8e-15) 7. Frigo-old: elapsed time t=1.33328 s, 128 iters, t-(init.)=1.31661 s t(norm)=0.178349, mflops=28.035 (err=1.9e-15) 8. GSL: elapsed time t=1.6666 s, 512 iters, t-(init.)=1.64993 s t(norm)=0.055875, mflops=89.4854 (err=1.9e-15) 9. NAPACK (f2c): elapsed time t=2.01659 s, 128 iters, t-(init.)=2.01659 s t(norm)=0.273167, mflops=18.3038 (err=3.5e-13) 10. 
Singleton: elapsed time t=2.01659 s, 512 iters, t-(init.)=1.99992 s t(norm)=0.0677273, mflops=73.8254 (err=2.4e-15) 11. Singleton (f2c): elapsed time t=1.08329 s, 256 iters, t-(init.)=1.06662 s t(norm)=0.0722425, mflops=69.2113 (err=2.4e-15) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.33328 s, 64 iters, t-(init.)=1.33328 s t(norm)=0.361212, mflops=13.8423 (err=1.9e-15) Top mflops for N=4725 = 272.586 Normalized results and averages for N=4725: fft 0: mflops = 24.3381 (norm. = 0.0892857), norm. avg. (of 11) = 0.0991971 fft 1: mflops = 218.742 (norm. = 0.802469), norm. avg. (of 14) = 0.638921 fft 2: mflops = 272.586 (norm. = 1), norm. avg. (of 14) = 0.672972 fft 3: mflops = 90.3985 (norm. = 0.331633), norm. avg. (of 14) = 0.483784 fft 4: mflops = 62.3877 (norm. = 0.228873), norm. avg. (of 14) = 0.330749 fft 5: mflops = 162.551 (norm. = 0.59633), norm. avg. (of 14) = 0.814634 fft 6: mflops = 164.057 (norm. = 0.601852), norm. avg. (of 14) = 0.76719 fft 7: mflops = 28.035 (norm. = 0.102848), norm. avg. (of 14) = 0.1472 fft 8: mflops = 89.4854 (norm. = 0.328283), norm. avg. (of 14) = 0.356983 fft 9: mflops = 18.3038 (norm. = 0.0671488), norm. avg. (of 14) = 0.0780257 fft 10: mflops = 73.8254 (norm. = 0.270833), norm. avg. (of 14) = 0.254687 fft 11: mflops = 69.2113 (norm. = 0.253906), norm. avg. (of 14) = 0.221985 fft 12: mflops = -1 (norm. = -0.00366856), norm. avg. (of 10) = 0.309823 fft 13: mflops = -1 (norm. = -0.00366856), norm. avg. (of 10) = 0.170948 fft 14: mflops = 13.8423 (norm. = 0.0507812), norm. avg. (of 14) = 0.0555654 Benchmarking for array size = 10368: 0. Brenner: elapsed time t=1.63327 s, 64 iters, t-(init.)=1.63327 s t(norm)=0.184515, mflops=27.0981 (err=3.1e-15) 1. CWP (min N) (N=10920): elapsed time t=1.38328 s, 512 iters, t-(init.)=1.33328 s t(norm)=0.0188281, mflops=265.561 2. 
CWP (best N) (N=11088): elapsed time t=1.39994 s, 512 iters, t-(init.)=1.36661 s t(norm)=0.0192988, mflops=259.084 3. FFTPACK: elapsed time t=1.78326 s, 256 iters, t-(init.)=1.7666 s t(norm)=0.0498943, mflops=100.212 (err=3.0e-15) 4. FFTPACK (f2c): elapsed time t=1.09996 s, 128 iters, t-(init.)=1.08329 s t(norm)=0.0611912, mflops=81.7111 (err=3.0e-15) FFTW_MEASURE plan: (cost = 3.385281e-03) FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_NOTW 64 5. FFTW: elapsed time t=1.98325 s, 512 iters, t-(init.)=1.93326 s t(norm)=0.0273007, mflops=183.146 (err=3.0e-15) FFTW_ESTIMATE plan: (cost = 1.254528e+05) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.89992 s, 512 iters, t-(init.)=1.86659 s t(norm)=0.0263593, mflops=189.687 (err=3.0e-15) 7. Frigo-old: elapsed time t=1.09996 s, 64 iters, t-(init.)=1.09996 s t(norm)=0.124265, mflops=40.2365 (err=3.1e-15) 8. GSL: elapsed time t=1.34995 s, 256 iters, t-(init.)=1.31661 s t(norm)=0.0371854, mflops=134.461 (err=3.0e-15) 9. NAPACK (f2c): elapsed time t=1.51661 s, 64 iters, t-(init.)=1.49994 s t(norm)=0.169452, mflops=29.5068 (err=8.1e-14) 10. Singleton: elapsed time t=1.09996 s, 128 iters, t-(init.)=1.09996 s t(norm)=0.0621326, mflops=80.4731 (err=4.4e-15) 11. Singleton (f2c): elapsed time t=1.18329 s, 128 iters, t-(init.)=1.16662 s t(norm)=0.0658982, mflops=75.8746 (err=4.4e-15) 12. Temperton: elapsed time t=1.51661 s, 256 iters, t-(init.)=1.49994 s t(norm)=0.0423631, mflops=118.027 (err=2.1e-07) 13. Temperton (f2c): elapsed time t=1.14995 s, 128 iters, t-(init.)=1.13329 s t(norm)=0.0640154, mflops=78.1062 (err=3.0e-15) 14. Valkenburg: elapsed time t=1.59994 s, 32 iters, t-(init.)=1.59994 s t(norm)=0.361499, mflops=13.8313 (err=3.0e-15) Top mflops for N=10368 = 265.561 Normalized results and averages for N=10368: fft 0: mflops = 27.0981 (norm. = 0.102041), norm. avg. (of 12) = 0.0994341 fft 1: mflops = 265.561 (norm. = 1), norm. avg. 
(of 15) = 0.662993 fft 2: mflops = 259.084 (norm. = 0.97561), norm. avg. (of 15) = 0.693147 fft 3: mflops = 100.212 (norm. = 0.377358), norm. avg. (of 15) = 0.476689 fft 4: mflops = 81.7111 (norm. = 0.307692), norm. avg. (of 15) = 0.329212 fft 5: mflops = 183.146 (norm. = 0.689655), norm. avg. (of 15) = 0.806302 fft 6: mflops = 189.687 (norm. = 0.714286), norm. avg. (of 15) = 0.763663 fft 7: mflops = 40.2365 (norm. = 0.151515), norm. avg. (of 15) = 0.147488 fft 8: mflops = 134.461 (norm. = 0.506329), norm. avg. (of 15) = 0.366939 fft 9: mflops = 29.5068 (norm. = 0.111111), norm. avg. (of 15) = 0.0802314 fft 10: mflops = 80.4731 (norm. = 0.30303), norm. avg. (of 15) = 0.25791 fft 11: mflops = 75.8746 (norm. = 0.285714), norm. avg. (of 15) = 0.226234 fft 12: mflops = 118.027 (norm. = 0.444444), norm. avg. (of 11) = 0.322061 fft 13: mflops = 78.1062 (norm. = 0.294118), norm. avg. (of 11) = 0.182146 fft 14: mflops = 13.8313 (norm. = 0.0520833), norm. avg. (of 15) = 0.0553333 Benchmarking for array size = 27000: 0. Brenner: elapsed time t=1.63327 s, 16 iters, t-(init.)=1.6166 s t(norm)=0.254209, mflops=19.6688 (err=5.6e-15) 1. CWP (min N) (N=27720): elapsed time t=1.21662 s, 128 iters, t-(init.)=1.09996 s t(norm)=0.0216209, mflops=231.258 2. CWP (best N) (N=27720): elapsed time t=1.23328 s, 128 iters, t-(init.)=1.13329 s t(norm)=0.0222761, mflops=224.456 3. FFTPACK: elapsed time t=1.64993 s, 64 iters, t-(init.)=1.59994 s t(norm)=0.0628972, mflops=79.4948 (err=5.5e-15) 4. FFTPACK (f2c): elapsed time t=1.84993 s, 64 iters, t-(init.)=1.79993 s t(norm)=0.0707593, mflops=70.6621 (err=5.5e-15) FFTW_MEASURE plan: (cost = 1.354112e-02) FFTW_TWIDDLE 5 FFTW_TWIDDLE 6 FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_NOTW 10 5. FFTW: elapsed time t=1.98325 s, 128 iters, t-(init.)=1.88326 s t(norm)=0.0370176, mflops=135.071 (err=5.6e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. 
FFTW_ESTIMATE: elapsed time t=1.04996 s, 64 iters, t-(init.)=0.99996 s t(norm)=0.0393107, mflops=127.192 (err=5.6e-15) 7. Frigo-old: elapsed time t=1.14995 s, 16 iters, t-(init.)=1.13329 s t(norm)=0.178209, mflops=28.057 (err=5.7e-15) 8. GSL: elapsed time t=1.63327 s, 64 iters, t-(init.)=1.58327 s t(norm)=0.062242, mflops=80.3316 (err=5.5e-15) 9. NAPACK (f2c): elapsed time t=1.79993 s, 16 iters, t-(init.)=1.78326 s t(norm)=0.280417, mflops=17.8306 (err=1.1e-12) 10. Singleton: elapsed time t=1.06662 s, 32 iters, t-(init.)=1.03329 s t(norm)=0.0812422, mflops=61.5444 (err=7.6e-15) 11. Singleton (f2c): elapsed time t=1.13329 s, 32 iters, t-(init.)=1.11662 s t(norm)=0.087794, mflops=56.9515 (err=7.6e-15) 12. Temperton: elapsed time t=1.64993 s, 64 iters, t-(init.)=1.59994 s t(norm)=0.0628972, mflops=79.4948 (err=1.4e-07) 13. Temperton (f2c): elapsed time t=1.28328 s, 32 iters, t-(init.)=1.24995 s t(norm)=0.0982769, mflops=50.8767 (err=5.7e-15) 14. Valkenburg: elapsed time t=1.29995 s, 8 iters, t-(init.)=1.28328 s t(norm)=0.40359, mflops=12.3888 (err=5.4e-15) Top mflops for N=27000 = 231.258 Normalized results and averages for N=27000: fft 0: mflops = 19.6688 (norm. = 0.0850515), norm. avg. (of 13) = 0.0983277 fft 1: mflops = 231.258 (norm. = 1), norm. avg. (of 16) = 0.684056 fft 2: mflops = 224.456 (norm. = 0.970588), norm. avg. (of 16) = 0.710487 fft 3: mflops = 79.4948 (norm. = 0.34375), norm. avg. (of 16) = 0.46838 fft 4: mflops = 70.6621 (norm. = 0.305556), norm. avg. (of 16) = 0.327733 fft 5: mflops = 135.071 (norm. = 0.584071), norm. avg. (of 16) = 0.792413 fft 6: mflops = 127.192 (norm. = 0.55), norm. avg. (of 16) = 0.750309 fft 7: mflops = 28.057 (norm. = 0.121324), norm. avg. (of 16) = 0.145852 fft 8: mflops = 80.3316 (norm. = 0.347368), norm. avg. (of 16) = 0.365716 fft 9: mflops = 17.8306 (norm. = 0.0771028), norm. avg. (of 16) = 0.0800358 fft 10: mflops = 61.5444 (norm. = 0.266129), norm. avg. (of 16) = 0.258423 fft 11: mflops = 56.9515 (norm. 
= 0.246269), norm. avg. (of 16) = 0.227486 fft 12: mflops = 79.4948 (norm. = 0.34375), norm. avg. (of 12) = 0.323869 fft 13: mflops = 50.8767 (norm. = 0.22), norm. avg. (of 12) = 0.1853 fft 14: mflops = 12.3888 (norm. = 0.0535714), norm. avg. (of 16) = 0.0552232 Benchmarking for array size = 75600: 0. Brenner: elapsed time t=1.63327 s, 4 iters, t-(init.)=1.6166 s t(norm)=0.32987, mflops=15.1575 (err=1.1e-14) 1. CWP (min N) (N=80080): elapsed time t=1.49994 s, 32 iters, t-(init.)=1.33328 s t(norm)=0.0340072, mflops=147.028 2. CWP (best N) (N=80080): elapsed time t=1.51661 s, 32 iters, t-(init.)=1.34995 s t(norm)=0.0344323, mflops=145.212 3. FFTPACK: elapsed time t=1.16662 s, 8 iters, t-(init.)=1.13329 s t(norm)=0.115625, mflops=43.2434 (err=1.0e-14) 4. FFTPACK (f2c): elapsed time t=1.34995 s, 8 iters, t-(init.)=1.31661 s t(norm)=0.134329, mflops=37.2222 (err=1.1e-14) FFTW_MEASURE plan: (cost = 5.833100e-02) FFTW_TWIDDLE 10 FFTW_TWIDDLE 4 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_TWIDDLE 5 FFTW_NOTW 6 5. FFTW: elapsed time t=1.89992 s, 32 iters, t-(init.)=1.74993 s t(norm)=0.0446345, mflops=112.021 (err=1.1e-14) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.96659 s, 32 iters, t-(init.)=1.81659 s t(norm)=0.0463348, mflops=107.91 (err=1.1e-14) 7. Frigo-old: elapsed time t=1.94992 s, 8 iters, t-(init.)=1.91659 s t(norm)=0.195542, mflops=25.57 (err=1.1e-14) 8. GSL: elapsed time t=1.79993 s, 16 iters, t-(init.)=1.7166 s t(norm)=0.0875686, mflops=57.0981 (err=1.1e-14) 9. NAPACK (f2c): elapsed time t=1.5666 s, 4 iters, t-(init.)=1.54994 s t(norm)=0.316267, mflops=15.8094 (err=5.1e-12) 10. Singleton: elapsed time t=1.21662 s, 8 iters, t-(init.)=1.18329 s t(norm)=0.120726, mflops=41.4162 (err=1.5e-14) 11. Singleton (f2c): elapsed time t=1.24995 s, 8 iters, t-(init.)=1.21662 s t(norm)=0.124126, mflops=40.2815 (err=1.5e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 
13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.16662 s, 2 iters, t-(init.)=1.16662 s t(norm)=0.476101, mflops=10.502 (err=1.1e-14) Top mflops for N=75600 = 147.028 Normalized results and averages for N=75600: fft 0: mflops = 15.1575 (norm. = 0.103093), norm. avg. (of 14) = 0.0986681 fft 1: mflops = 147.028 (norm. = 1), norm. avg. (of 17) = 0.702641 fft 2: mflops = 145.212 (norm. = 0.987654), norm. avg. (of 17) = 0.726791 fft 3: mflops = 43.2434 (norm. = 0.294118), norm. avg. (of 17) = 0.45813 fft 4: mflops = 37.2222 (norm. = 0.253165), norm. avg. (of 17) = 0.323347 fft 5: mflops = 112.021 (norm. = 0.761905), norm. avg. (of 17) = 0.790618 fft 6: mflops = 107.91 (norm. = 0.733945), norm. avg. (of 17) = 0.749346 fft 7: mflops = 25.57 (norm. = 0.173913), norm. avg. (of 17) = 0.147503 fft 8: mflops = 57.0981 (norm. = 0.38835), norm. avg. (of 17) = 0.367048 fft 9: mflops = 15.8094 (norm. = 0.107527), norm. avg. (of 17) = 0.081653 fft 10: mflops = 41.4162 (norm. = 0.28169), norm. avg. (of 17) = 0.259792 fft 11: mflops = 40.2815 (norm. = 0.273973), norm. avg. (of 17) = 0.230221 fft 12: mflops = -1 (norm. = -0.00680144), norm. avg. (of 12) = 0.323869 fft 13: mflops = -1 (norm. = -0.00680144), norm. avg. (of 12) = 0.1853 fft 14: mflops = 10.502 (norm. = 0.0714286), norm. avg. (of 17) = 0.0561764 Benchmarking for array size = 165375: 0. Brenner: elapsed time t=1.18329 s, 1 iters, t-(init.)=1.16662 s t(norm)=0.406936, mflops=12.2869 (err=2.7e-14) 1. CWP (min N) (N=180180): elapsed time t=1.09996 s, 8 iters, t-(init.)=0.983294 s t(norm)=0.0428736, mflops=116.622 2. CWP (best N) (N=180180): elapsed time t=1.08329 s, 8 iters, t-(init.)=0.966628 s t(norm)=0.0421469, mflops=118.633 3. FFTPACK: elapsed time t=1.23328 s, 2 iters, t-(init.)=1.21662 s t(norm)=0.212188, mflops=23.564 (err=2.7e-14) 4. 
FFTPACK (f2c): elapsed time t=1.39994 s, 2 iters, t-(init.)=1.36661 s t(norm)=0.238348, mflops=20.9777 (err=2.7e-14) FFTW_MEASURE plan: (cost = 1.666600e-01) FFTW_TWIDDLE 9 FFTW_TWIDDLE 3 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 5 5. FFTW: elapsed time t=1.29995 s, 8 iters, t-(init.)=1.19995 s t(norm)=0.0523203, mflops=95.5651 (err=2.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 6. FFTW_ESTIMATE: elapsed time t=1.28328 s, 8 iters, t-(init.)=1.18329 s t(norm)=0.0515937, mflops=96.9111 (err=2.7e-14) 7. Frigo-old: elapsed time t=1.5666 s, 2 iters, t-(init.)=1.53327 s t(norm)=0.267415, mflops=18.6975 (err=2.7e-14) 8. GSL: elapsed time t=1.11662 s, 4 iters, t-(init.)=1.06662 s t(norm)=0.093014, mflops=53.7554 (err=2.7e-14) 9. NAPACK (f2c): elapsed time t=1.09996 s, 1 iters, t-(init.)=1.08329 s t(norm)=0.377869, mflops=13.2321 (err=1.6e-11) 10. Singleton: elapsed time t=1.73326 s, 4 iters, t-(init.)=1.68327 s t(norm)=0.146788, mflops=34.0628 (err=4.0e-14) 11. Singleton (f2c): elapsed time t=1.78326 s, 4 iters, t-(init.)=1.73326 s t(norm)=0.151148, mflops=33.0802 (err=4.0e-14) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=1.41661 s, 1 iters, t-(init.)=1.39994 s t(norm)=0.488323, mflops=10.2391 (err=2.7e-14) Top mflops for N=165375 = 118.633 Normalized results and averages for N=165375: fft 0: mflops = 12.2869 (norm. = 0.103571), norm. avg. (of 15) = 0.098995 fft 1: mflops = 116.622 (norm. = 0.983051), norm. avg. (of 18) = 0.718219 fft 2: mflops = 118.633 (norm. = 1), norm. avg. (of 18) = 0.74197 fft 3: mflops = 23.564 (norm. = 0.19863), norm. avg. (of 18) = 0.443713 fft 4: mflops = 20.9777 (norm. = 0.176829), norm. avg. (of 18) = 0.315207 fft 5: mflops = 95.5651 (norm. = 0.805556), norm. avg. (of 18) = 0.791448 fft 6: mflops = 96.9111 (norm. 
= 0.816901), norm. avg. (of 18) = 0.753099 fft 7: mflops = 18.6975 (norm. = 0.157609), norm. avg. (of 18) = 0.148064 fft 8: mflops = 53.7554 (norm. = 0.453125), norm. avg. (of 18) = 0.37183 fft 9: mflops = 13.2321 (norm. = 0.111538), norm. avg. (of 18) = 0.0833133 fft 10: mflops = 34.0628 (norm. = 0.287129), norm. avg. (of 18) = 0.261311 fft 11: mflops = 33.0802 (norm. = 0.278846), norm. avg. (of 18) = 0.232922 fft 12: mflops = -1 (norm. = -0.00842939), norm. avg. (of 12) = 0.323869 fft 13: mflops = -1 (norm. = -0.00842939), norm. avg. (of 12) = 0.1853 fft 14: mflops = 10.2391 (norm. = 0.0863095), norm. avg. (of 18) = 0.0578505 Benchmarking for array size = 362880: 0. Brenner: elapsed time t=2.78322 s, 1 iters, t-(init.)=2.74989 s t(norm)=0.410304, mflops=12.1861 (err=1.1e-13) 1. CWP (min N) (N=720720): elapsed time t=1.31661 s, 2 iters, t-(init.)=1.19995 s t(norm)=0.0895209, mflops=55.8529 2. CWP (best N) (N=720720): elapsed time t=1.29995 s, 2 iters, t-(init.)=1.18329 s t(norm)=0.0882775, mflops=56.6396 3. FFTPACK: elapsed time t=1.94992 s, 2 iters, t-(init.)=1.89992 s t(norm)=0.141741, mflops=35.2755 (err=1.1e-13) 4. FFTPACK (f2c): elapsed time t=1.13329 s, 1 iters, t-(init.)=1.09996 s t(norm)=0.164122, mflops=30.4652 (err=1.1e-13) FFTW_MEASURE plan: (cost = 3.499860e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 7 FFTW_TWIDDLE 4 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_NOTW 10 5. FFTW: elapsed time t=1.39994 s, 4 iters, t-(init.)=1.28328 s t(norm)=0.0478688, mflops=104.452 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 6. FFTW_ESTIMATE: elapsed time t=1.43328 s, 4 iters, t-(init.)=1.31661 s t(norm)=0.0491121, mflops=101.808 (err=1.1e-13) 7. Frigo-old: elapsed time t=1.54994 s, 1 iters, t-(init.)=1.51661 s t(norm)=0.226289, mflops=22.0956 (err=1.1e-13) 8. GSL: elapsed time t=1.14995 s, 2 iters, t-(init.)=1.09996 s t(norm)=0.0820608, mflops=60.9304 (err=1.1e-13) 9. 
NAPACK (f2c): elapsed time t=2.19991 s, 1 iters, t-(init.)=2.18325 s t(norm)=0.325757, mflops=15.3489 (err=3.4e-11) 10. Singleton: elapsed time t=1.28328 s, 1 iters, t-(init.)=1.23328 s t(norm)=0.184015, mflops=27.1717 (err=1.6e-13) 11. Singleton (f2c): elapsed time t=1.31661 s, 1 iters, t-(init.)=1.29995 s t(norm)=0.193962, mflops=25.7783 (err=1.6e-13) 12. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 13. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 14. Valkenburg: elapsed time t=3.73318 s, 1 iters, t-(init.)=3.69985 s t(norm)=0.552045, mflops=9.05723 (err=1.1e-13) Top mflops for N=362880 = 104.452 Normalized results and averages for N=362880: fft 0: mflops = 12.1861 (norm. = 0.116667), norm. avg. (of 16) = 0.100099 fft 1: mflops = 55.8529 (norm. = 0.534722), norm. avg. (of 19) = 0.708561 fft 2: mflops = 56.6396 (norm. = 0.542254), norm. avg. (of 19) = 0.731458 fft 3: mflops = 35.2755 (norm. = 0.337719), norm. avg. (of 19) = 0.438134 fft 4: mflops = 30.4652 (norm. = 0.291667), norm. avg. (of 19) = 0.313968 fft 5: mflops = 104.452 (norm. = 1), norm. avg. (of 19) = 0.802424 fft 6: mflops = 101.808 (norm. = 0.974684), norm. avg. (of 19) = 0.764762 fft 7: mflops = 22.0956 (norm. = 0.211538), norm. avg. (of 19) = 0.151405 fft 8: mflops = 60.9304 (norm. = 0.583333), norm. avg. (of 19) = 0.382961 fft 9: mflops = 15.3489 (norm. = 0.146947), norm. avg. (of 19) = 0.0866624 fft 10: mflops = 27.1717 (norm. = 0.260135), norm. avg. (of 19) = 0.261249 fft 11: mflops = 25.7783 (norm. = 0.246795), norm. avg. (of 19) = 0.233652 fft 12: mflops = -1 (norm. = -0.00957376), norm. avg. (of 12) = 0.323869 fft 13: mflops = -1 (norm. = -0.00957376), norm. avg. (of 12) = 0.1853 fft 14: mflops = 9.05723 (norm. = 0.0867117), norm. avg. 
(of 19) = 0.0593695 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) Maximum array size N = 2097152 Benchmarking FFTs: 0. FFTW 1. HARM 2. HARM (f2c) 3. PDA 4. PDA (f2c) 5. Singleton 6. Singleton (f2c) 7. Temperton 8. Temperton (f2c) Computing normalized averages (9 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.19995 s, 131072 iters, t-(init.)=1.13329 s t(norm)=0.0225164, mflops=222.06 (err=1.9e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. Skipping fft (all dimensions must be > 4 for HARM). 3. PDA: elapsed time t=1.43328 s, 32768 iters, t-(init.)=1.41661 s t(norm)=0.112582, mflops=44.4121 (err=2.8e-16) 4. PDA (f2c): elapsed time t=1.64993 s, 32768 iters, t-(init.)=1.63327 s t(norm)=0.1298, mflops=38.5207 (err=2.8e-16) 5. Singleton: elapsed time t=1.54994 s, 131072 iters, t-(init.)=1.48327 s t(norm)=0.02947, mflops=169.664 (err=1.9e-16) 6. Singleton (f2c): elapsed time t=1.54994 s, 131072 iters, t-(init.)=1.48327 s t(norm)=0.02947, mflops=169.664 (err=1.9e-16) 7. Temperton: elapsed time t=1.7166 s, 131072 iters, t-(init.)=1.64993 s t(norm)=0.0327812, mflops=152.526 (err=1.9e-16) 8. Temperton (f2c): elapsed time t=1.46661 s, 65536 iters, t-(init.)=1.43328 s t(norm)=0.0569533, mflops=87.7913 (err=1.9e-16) Top mflops for N=64 = 222.06 Normalized results and averages for N=64: fft 0: mflops = 222.06 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.00450328), norm. avg. (of 0) = -1 fft 2: mflops = -1 (norm. = -0.00450328), norm. avg. (of 0) = -1 fft 3: mflops = 44.4121 (norm. = 0.2), norm. avg. (of 1) = 0.2 fft 4: mflops = 38.5207 (norm. = 0.173469), norm. avg. (of 1) = 0.173469 fft 5: mflops = 169.664 (norm. = 0.764045), norm. avg. 
(of 1) = 0.764045 fft 6: mflops = 169.664 (norm. = 0.764045), norm. avg. (of 1) = 0.764045 fft 7: mflops = 152.526 (norm. = 0.686869), norm. avg. (of 1) = 0.686869 fft 8: mflops = 87.7913 (norm. = 0.395349), norm. avg. (of 1) = 0.395349 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.99992 s, 32768 iters, t-(init.)=1.86659 s t(norm)=0.012362, mflops=404.467 (err=3.6e-16) 1. HARM: elapsed time t=1.94992 s, 16384 iters, t-(init.)=1.88326 s t(norm)=0.0249446, mflops=200.444 (err=3.8e-16) 2. HARM (f2c): elapsed time t=1.11662 s, 4096 iters, t-(init.)=1.09996 s t(norm)=0.0582778, mflops=85.796 (err=3.8e-16) 3. PDA: elapsed time t=1.68327 s, 4096 iters, t-(init.)=1.6666 s t(norm)=0.0882996, mflops=56.6254 (err=3.0e-16) 4. PDA (f2c): elapsed time t=1.63327 s, 4096 iters, t-(init.)=1.6166 s t(norm)=0.0856507, mflops=58.3767 (err=3.0e-16) 5. Singleton: elapsed time t=1.48327 s, 8192 iters, t-(init.)=1.44994 s t(norm)=0.0384103, mflops=130.173 (err=3.5e-16) 6. Singleton (f2c): elapsed time t=1.7166 s, 8192 iters, t-(init.)=1.68327 s t(norm)=0.0445913, mflops=112.129 (err=3.5e-16) 7. Temperton: elapsed time t=1.69993 s, 16384 iters, t-(init.)=1.63327 s t(norm)=0.0216334, mflops=231.124 (err=1.3e-08) 8. Temperton (f2c): elapsed time t=1.29995 s, 4096 iters, t-(init.)=1.28328 s t(norm)=0.0679907, mflops=73.5394 (err=3.3e-16) Top mflops for N=512 = 404.467 Normalized results and averages for N=512: fft 0: mflops = 404.467 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 200.444 (norm. = 0.495575), norm. avg. (of 1) = 0.495575 fft 2: mflops = 85.796 (norm. = 0.212121), norm. avg. (of 1) = 0.212121 fft 3: mflops = 56.6254 (norm. = 0.14), norm. avg. (of 2) = 0.17 fft 4: mflops = 58.3767 (norm. = 0.14433), norm. avg. (of 2) = 0.1589 fft 5: mflops = 130.173 (norm. = 0.321839), norm. avg. (of 2) = 0.542942 fft 6: mflops = 112.129 (norm. = 0.277228), norm. avg. (of 2) = 0.520636 fft 7: mflops = 231.124 (norm. = 0.571429), norm. avg. 
(of 2) = 0.629149 fft 8: mflops = 73.5394 (norm. = 0.181818), norm. avg. (of 2) = 0.288584 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.68327 s, 2048 iters, t-(init.)=1.63327 s t(norm)=0.0162251, mflops=308.165 (err=4.2e-16) 1. HARM: elapsed time t=1.88326 s, 1024 iters, t-(init.)=1.84993 s t(norm)=0.0367547, mflops=136.037 (err=4.0e-16) 2. HARM (f2c): elapsed time t=1.64993 s, 512 iters, t-(init.)=1.63327 s t(norm)=0.0649002, mflops=77.0413 (err=4.0e-16) 3. PDA: elapsed time t=1.24995 s, 512 iters, t-(init.)=1.23328 s t(norm)=0.0490063, mflops=102.028 (err=4.0e-16) 4. PDA (f2c): elapsed time t=1.68327 s, 512 iters, t-(init.)=1.6666 s t(norm)=0.0662247, mflops=75.5005 (err=4.0e-16) 5. Singleton: elapsed time t=1.08329 s, 512 iters, t-(init.)=1.06662 s t(norm)=0.0423838, mflops=117.97 (err=4.1e-16) 6. Singleton (f2c): elapsed time t=1.43328 s, 512 iters, t-(init.)=1.41661 s t(norm)=0.056291, mflops=88.8241 (err=4.1e-16) 7. Temperton: elapsed time t=1.53327 s, 1024 iters, t-(init.)=1.49994 s t(norm)=0.0298011, mflops=167.779 (err=6.3e-08) 8. Temperton (f2c): elapsed time t=1.09996 s, 512 iters, t-(init.)=1.09996 s t(norm)=0.0437083, mflops=114.395 (err=4.6e-16) Top mflops for N=4096 = 308.165 Normalized results and averages for N=4096: fft 0: mflops = 308.165 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 136.037 (norm. = 0.441441), norm. avg. (of 2) = 0.468508 fft 2: mflops = 77.0413 (norm. = 0.25), norm. avg. (of 2) = 0.231061 fft 3: mflops = 102.028 (norm. = 0.331081), norm. avg. (of 3) = 0.223694 fft 4: mflops = 75.5005 (norm. = 0.245), norm. avg. (of 3) = 0.1876 fft 5: mflops = 117.97 (norm. = 0.382812), norm. avg. (of 3) = 0.489566 fft 6: mflops = 88.8241 (norm. = 0.288235), norm. avg. (of 3) = 0.443169 fft 7: mflops = 167.779 (norm. = 0.544444), norm. avg. (of 3) = 0.600914 fft 8: mflops = 114.395 (norm. = 0.371212), norm. avg. (of 3) = 0.316126 Benchmarking for array size = 32x32x32 (power of 2): 0. 
FFTW: elapsed time t=1.36661 s, 128 iters, t-(init.)=1.23328 s t(norm)=0.0196025, mflops=255.069 (err=5.2e-16) 1. HARM: elapsed time t=1.69993 s, 64 iters, t-(init.)=1.63327 s t(norm)=0.0519202, mflops=96.3016 (err=5.3e-16) 2. HARM (f2c): elapsed time t=1.23328 s, 32 iters, t-(init.)=1.19995 s t(norm)=0.0762909, mflops=65.5386 (err=5.3e-16) 3. PDA: elapsed time t=1.29995 s, 32 iters, t-(init.)=1.26662 s t(norm)=0.0805293, mflops=62.0892 (err=4.2e-16) 4. PDA (f2c): elapsed time t=1.44994 s, 32 iters, t-(init.)=1.41661 s t(norm)=0.0900656, mflops=55.5151 (err=4.2e-16) 5. Singleton: elapsed time t=1.51661 s, 32 iters, t-(init.)=1.48327 s t(norm)=0.094304, mflops=53.02 (err=5.3e-16) 6. Singleton (f2c): elapsed time t=1.74993 s, 32 iters, t-(init.)=1.7166 s t(norm)=0.109138, mflops=45.8134 (err=5.3e-16) 7. Temperton: elapsed time t=1.89992 s, 64 iters, t-(init.)=1.83326 s t(norm)=0.0582778, mflops=85.796 (err=9.6e-08) 8. Temperton (f2c): elapsed time t=1.29995 s, 32 iters, t-(init.)=1.26662 s t(norm)=0.0805293, mflops=62.0892 (err=4.7e-16) Top mflops for N=32768 = 255.069 Normalized results and averages for N=32768: fft 0: mflops = 255.069 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 96.3016 (norm. = 0.377551), norm. avg. (of 3) = 0.438189 fft 2: mflops = 65.5386 (norm. = 0.256944), norm. avg. (of 3) = 0.239689 fft 3: mflops = 62.0892 (norm. = 0.243421), norm. avg. (of 4) = 0.228626 fft 4: mflops = 55.5151 (norm. = 0.217647), norm. avg. (of 4) = 0.195112 fft 5: mflops = 53.02 (norm. = 0.207865), norm. avg. (of 4) = 0.41914 fft 6: mflops = 45.8134 (norm. = 0.179612), norm. avg. (of 4) = 0.37728 fft 7: mflops = 85.796 (norm. = 0.336364), norm. avg. (of 4) = 0.534776 fft 8: mflops = 62.0892 (norm. = 0.243421), norm. avg. (of 4) = 0.29795 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.54994 s, 8 iters, t-(init.)=1.36661 s t(norm)=0.0362029, mflops=138.111 (err=1.2e-15) 1. 
HARM: elapsed time t=1.53327 s, 4 iters, t-(init.)=1.44994 s t(norm)=0.0768207, mflops=65.0866 (err=1.2e-15) 2. HARM (f2c): elapsed time t=1.94992 s, 4 iters, t-(init.)=1.86659 s t(norm)=0.0988956, mflops=50.5584 (err=1.2e-15) 3. PDA: elapsed time t=1.69993 s, 4 iters, t-(init.)=1.6166 s t(norm)=0.0856507, mflops=58.3767 (err=1.3e-15) 4. PDA (f2c): elapsed time t=2.01659 s, 4 iters, t-(init.)=1.93326 s t(norm)=0.102428, mflops=48.815 (err=1.3e-15) 5. Singleton: elapsed time t=1.6666 s, 2 iters, t-(init.)=1.63327 s t(norm)=0.173067, mflops=28.8905 (err=1.7e-15) 6. Singleton (f2c): elapsed time t=1.79993 s, 2 iters, t-(init.)=1.74993 s t(norm)=0.185429, mflops=26.9645 (err=1.7e-15) 7. Temperton: elapsed time t=1.49994 s, 4 iters, t-(init.)=1.41661 s t(norm)=0.0750547, mflops=66.6181 (err=1.3e-07) 8. Temperton (f2c): elapsed time t=1.84993 s, 4 iters, t-(init.)=1.7666 s t(norm)=0.0935976, mflops=53.4202 (err=1.3e-15) Top mflops for N=262144 = 138.111 Normalized results and averages for N=262144: fft 0: mflops = 138.111 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 65.0866 (norm. = 0.471264), norm. avg. (of 4) = 0.446458 fft 2: mflops = 50.5584 (norm. = 0.366071), norm. avg. (of 4) = 0.271284 fft 3: mflops = 58.3767 (norm. = 0.42268), norm. avg. (of 5) = 0.267437 fft 4: mflops = 48.815 (norm. = 0.353448), norm. avg. (of 5) = 0.226779 fft 5: mflops = 28.8905 (norm. = 0.209184), norm. avg. (of 5) = 0.377149 fft 6: mflops = 26.9645 (norm. = 0.195238), norm. avg. (of 5) = 0.340872 fft 7: mflops = 66.6181 (norm. = 0.482353), norm. avg. (of 5) = 0.524292 fft 8: mflops = 53.4202 (norm. = 0.386792), norm. avg. (of 5) = 0.315719 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.06662 s, 2 iters, t-(init.)=0.966628 s t(norm)=0.0485183, mflops=103.054 (err=1.2e-15) 1. HARM: elapsed time t=1.64993 s, 2 iters, t-(init.)=1.54994 s t(norm)=0.0777966, mflops=64.2701 (err=1.2e-15) 2. 
HARM (f2c): elapsed time t=1.06662 s, 1 iters, t-(init.)=1.01663 s t(norm)=0.102056, mflops=48.9928 (err=1.2e-15) 3. PDA: elapsed time t=1.06662 s, 1 iters, t-(init.)=1.01663 s t(norm)=0.102056, mflops=48.9928 (err=1.2e-15) 4. PDA (f2c): elapsed time t=1.21662 s, 1 iters, t-(init.)=1.18329 s t(norm)=0.118786, mflops=42.0924 (err=1.2e-15) 5. Singleton: elapsed time t=2.06658 s, 1 iters, t-(init.)=2.01659 s t(norm)=0.202439, mflops=24.6989 (err=1.7e-15) 6. Singleton (f2c): elapsed time t=2.19991 s, 1 iters, t-(init.)=2.14991 s t(norm)=0.215823, mflops=23.1671 (err=1.7e-15) 7. Temperton: elapsed time t=1.6166 s, 2 iters, t-(init.)=1.53327 s t(norm)=0.0769601, mflops=64.9687 (err=1.5e-07) 8. Temperton (f2c): elapsed time t=1.96659 s, 2 iters, t-(init.)=1.88326 s t(norm)=0.0945271, mflops=52.8949 (err=1.3e-15) Top mflops for N=524288 = 103.054 Normalized results and averages for N=524288: fft 0: mflops = 103.054 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 64.2701 (norm. = 0.623656), norm. avg. (of 5) = 0.481898 fft 2: mflops = 48.9928 (norm. = 0.47541), norm. avg. (of 5) = 0.312109 fft 3: mflops = 48.9928 (norm. = 0.47541), norm. avg. (of 6) = 0.302099 fft 4: mflops = 42.0924 (norm. = 0.408451), norm. avg. (of 6) = 0.257058 fft 5: mflops = 24.6989 (norm. = 0.239669), norm. avg. (of 6) = 0.354236 fft 6: mflops = 23.1671 (norm. = 0.224806), norm. avg. (of 6) = 0.321527 fft 7: mflops = 64.9687 (norm. = 0.630435), norm. avg. (of 6) = 0.541982 fft 8: mflops = 52.8949 (norm. = 0.513274), norm. avg. (of 6) = 0.348644 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=1.08329 s, 1 iters, t-(init.)=0.99996 s t(norm)=0.0476818, mflops=104.862 (err=2.0e-15) 1. HARM: elapsed time t=1.74993 s, 1 iters, t-(init.)=1.64993 s t(norm)=0.078675, mflops=63.5526 (err=1.9e-15) 2. HARM (f2c): elapsed time t=2.23324 s, 1 iters, t-(init.)=2.14991 s t(norm)=0.102516, mflops=48.7729 (err=1.9e-15) 3. 
PDA: elapsed time t=2.14991 s, 1 iters, t-(init.)=2.06658 s t(norm)=0.0985424, mflops=50.7396 (err=2.0e-15) 4. PDA (f2c): elapsed time t=2.46657 s, 1 iters, t-(init.)=2.36657 s t(norm)=0.112847, mflops=44.3078 (err=2.0e-15) 5. Singleton: elapsed time t=3.88318 s, 1 iters, t-(init.)=3.76652 s t(norm)=0.179601, mflops=27.8394 (err=2.8e-15) 6. Singleton (f2c): elapsed time t=4.2165 s, 1 iters, t-(init.)=4.1165 s t(norm)=0.19629, mflops=25.4725 (err=2.8e-15) 7. Skipping fft (Temperton can't handle dimensions > 256). 8. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 104.862 Normalized results and averages for N=1048576: fft 0: mflops = 104.862 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 63.5526 (norm. = 0.606061), norm. avg. (of 6) = 0.502591 fft 2: mflops = 48.7729 (norm. = 0.465116), norm. avg. (of 6) = 0.337611 fft 3: mflops = 50.7396 (norm. = 0.483871), norm. avg. (of 7) = 0.328066 fft 4: mflops = 44.3078 (norm. = 0.422535), norm. avg. (of 7) = 0.280697 fft 5: mflops = 27.8394 (norm. = 0.265487), norm. avg. (of 7) = 0.341557 fft 6: mflops = 25.4725 (norm. = 0.242915), norm. avg. (of 7) = 0.310297 fft 7: mflops = -1 (norm. = -0.00953636), norm. avg. (of 6) = 0.541982 fft 8: mflops = -1 (norm. = -0.00953636), norm. avg. (of 6) = 0.348644 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=2.78322 s, 1 iters, t-(init.)=2.5999 s t(norm)=0.0590346, mflops=84.6961 (err=7.3e-16) 1. HARM: elapsed time t=4.09984 s, 1 iters, t-(init.)=3.93318 s t(norm)=0.0893088, mflops=55.9855 (err=7.0e-16) 2. HARM (f2c): elapsed time t=5.0498 s, 1 iters, t-(init.)=4.86647 s t(norm)=0.110501, mflops=45.2486 (err=7.0e-16) 3. PDA: elapsed time t=4.48315 s, 1 iters, t-(init.)=4.29983 s t(norm)=0.0976342, mflops=51.2116 (err=7.1e-16) 4. PDA (f2c): elapsed time t=5.24979 s, 1 iters, t-(init.)=5.06646 s t(norm)=0.115042, mflops=43.4625 (err=7.1e-16) 5. 
Singleton: elapsed time t=12.1162 s, 1 iters, t-(init.)=11.9329 s t(norm)=0.270954, mflops=18.4533 (err=8.4e-16) 6. Singleton (f2c): elapsed time t=13.1828 s, 1 iters, t-(init.)=12.9995 s t(norm)=0.295173, mflops=16.9392 (err=8.4e-16) 7. Temperton: elapsed time t=5.19979 s, 1 iters, t-(init.)=5.01647 s t(norm)=0.113907, mflops=43.8956 (err=1.5e-07) 8. Temperton (f2c): elapsed time t=6.68307 s, 1 iters, t-(init.)=6.49974 s t(norm)=0.147587, mflops=33.8784 (err=7.4e-16) Top mflops for N=2097152 = 84.6961 Normalized results and averages for N=2097152: fft 0: mflops = 84.6961 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 55.9855 (norm. = 0.661017), norm. avg. (of 7) = 0.525224 fft 2: mflops = 45.2486 (norm. = 0.534247), norm. avg. (of 7) = 0.365701 fft 3: mflops = 51.2116 (norm. = 0.604651), norm. avg. (of 8) = 0.362639 fft 4: mflops = 43.4625 (norm. = 0.513158), norm. avg. (of 8) = 0.309755 fft 5: mflops = 18.4533 (norm. = 0.217877), norm. avg. (of 8) = 0.326097 fft 6: mflops = 16.9392 (norm. = 0.2), norm. avg. (of 8) = 0.29651 fft 7: mflops = 43.8956 (norm. = 0.518272), norm. avg. (of 7) = 0.538595 fft 8: mflops = 33.8784 (norm. = 0.4), norm. avg. (of 7) = 0.355981 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) 120x120x120 (26.3728 MB) Maximum array size N = 1728000 Benchmarking FFTs: 0. FFTW 1. PDA 2. PDA (f2c) 3. Singleton 4. Singleton (f2c) 5. Temperton 6. Temperton (f2c) Computing normalized averages (7 transforms). 
Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1.01663 s, 65536 iters, t-(init.)=0.949962 s t(norm)=0.0166474, mflops=300.347 (err=3.0e-16) 1. PDA: elapsed time t=1.46661 s, 16384 iters, t-(init.)=1.44994 s t(norm)=0.101637, mflops=49.1948 (err=2.3e-16) 2. PDA (f2c): elapsed time t=1.7666 s, 16384 iters, t-(init.)=1.74993 s t(norm)=0.122665, mflops=40.7614 (err=2.3e-16) 3. Singleton: elapsed time t=1.21662 s, 65536 iters, t-(init.)=1.14995 s t(norm)=0.0201521, mflops=248.113 (err=3.1e-16) 4. Singleton (f2c): elapsed time t=1.14995 s, 65536 iters, t-(init.)=1.09996 s t(norm)=0.0192759, mflops=259.391 (err=3.1e-16) 5. Temperton: elapsed time t=1.59994 s, 65536 iters, t-(init.)=1.53327 s t(norm)=0.0268695, mflops=186.085 (err=5.3e-16) 6. Temperton (f2c): elapsed time t=1.73326 s, 32768 iters, t-(init.)=1.69993 s t(norm)=0.0595801, mflops=83.9206 (err=2.4e-16) Top mflops for N=125 = 300.347 Normalized results and averages for N=125: fft 0: mflops = 300.347 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = 49.1948 (norm. = 0.163793), norm. avg. (of 1) = 0.163793 fft 2: mflops = 40.7614 (norm. = 0.135714), norm. avg. (of 1) = 0.135714 fft 3: mflops = 248.113 (norm. = 0.826087), norm. avg. (of 1) = 0.826087 fft 4: mflops = 259.391 (norm. = 0.863636), norm. avg. (of 1) = 0.863636 fft 5: mflops = 186.085 (norm. = 0.619565), norm. avg. (of 1) = 0.619565 fft 6: mflops = 83.9206 (norm. = 0.279412), norm. avg. (of 1) = 0.279412 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.59994 s, 65536 iters, t-(init.)=1.48327 s t(norm)=0.0135118, mflops=370.048 (err=2.9e-16) 1. PDA: elapsed time t=1.64993 s, 8192 iters, t-(init.)=1.6166 s t(norm)=0.11781, mflops=42.441 (err=3.6e-16) 2. PDA (f2c): elapsed time t=1.7666 s, 8192 iters, t-(init.)=1.74993 s t(norm)=0.127527, mflops=39.2074 (err=3.6e-16) 3. Singleton: elapsed time t=1.13329 s, 16384 iters, t-(init.)=1.09996 s t(norm)=0.0400799, mflops=124.751 (err=2.9e-16) 4. 
Singleton (f2c): elapsed time t=1.26662 s, 16384 iters, t-(init.)=1.23328 s t(norm)=0.044938, mflops=111.264 (err=2.9e-16) 5. Temperton: elapsed time t=1.01663 s, 16384 iters, t-(init.)=0.99996 s t(norm)=0.0364362, mflops=137.226 (err=2.1e-15) 6. Temperton (f2c): elapsed time t=1.03329 s, 8192 iters, t-(init.)=1.01663 s t(norm)=0.074087, mflops=67.4882 (err=3.1e-16) Top mflops for N=216 = 370.048 Normalized results and averages for N=216: fft 0: mflops = 370.048 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 42.441 (norm. = 0.114691), norm. avg. (of 2) = 0.139242 fft 2: mflops = 39.2074 (norm. = 0.105952), norm. avg. (of 2) = 0.120833 fft 3: mflops = 124.751 (norm. = 0.337121), norm. avg. (of 2) = 0.581604 fft 4: mflops = 111.264 (norm. = 0.300676), norm. avg. (of 2) = 0.582156 fft 5: mflops = 137.226 (norm. = 0.370833), norm. avg. (of 2) = 0.495199 fft 6: mflops = 67.4882 (norm. = 0.182377), norm. avg. (of 2) = 0.230894 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.91659 s, 32768 iters, t-(init.)=1.83326 s t(norm)=0.019367, mflops=258.172 (err=3.8e-16) 1. PDA: elapsed time t=1.24995 s, 2048 iters, t-(init.)=1.23328 s t(norm)=0.208459, mflops=23.9855 (err=4.8e-16) 2. PDA (f2c): elapsed time t=1.38328 s, 2048 iters, t-(init.)=1.36661 s t(norm)=0.230995, mflops=21.6455 (err=4.3e-16) 3. Singleton: elapsed time t=1.86659 s, 16384 iters, t-(init.)=1.81659 s t(norm)=0.0383818, mflops=130.27 (err=5.8e-16) 4. Singleton (f2c): elapsed time t=1.79993 s, 16384 iters, t-(init.)=1.74993 s t(norm)=0.0369733, mflops=135.233 (err=5.8e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 258.172 Normalized results and averages for N=343: fft 0: mflops = 258.172 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 23.9855 (norm. = 0.0929054), norm. avg. (of 3) = 0.123796 fft 2: mflops = 21.6455 (norm. = 0.0838415), norm. avg. (of 3) = 0.108503 fft 3: mflops = 130.27 (norm. 
= 0.504587), norm. avg. (of 3) = 0.555932 fft 4: mflops = 135.233 (norm. = 0.52381), norm. avg. (of 3) = 0.562707 fft 5: mflops = -1 (norm. = -0.00387339), norm. avg. (of 2) = 0.495199 fft 6: mflops = -1 (norm. = -0.00387339), norm. avg. (of 2) = 0.230894 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.01663 s, 8192 iters, t-(init.)=0.966628 s t(norm)=0.0170205, mflops=293.764 (err=5.3e-16) 1. PDA: elapsed time t=1.11662 s, 2048 iters, t-(init.)=1.11662 s t(norm)=0.0786463, mflops=63.5758 (err=4.9e-16) 2. PDA (f2c): elapsed time t=1.29995 s, 2048 iters, t-(init.)=1.28328 s t(norm)=0.0903846, mflops=55.3192 (err=4.9e-16) 3. Singleton: elapsed time t=1.41661 s, 4096 iters, t-(init.)=1.39994 s t(norm)=0.0493007, mflops=101.418 (err=4.5e-16) 4. Singleton (f2c): elapsed time t=1.6166 s, 4096 iters, t-(init.)=1.58327 s t(norm)=0.0557567, mflops=89.6753 (err=4.5e-16) 5. Temperton: elapsed time t=1.91659 s, 8192 iters, t-(init.)=1.86659 s t(norm)=0.0328671, mflops=152.128 (err=6.0e-08) 6. Temperton (f2c): elapsed time t=1.53327 s, 4096 iters, t-(init.)=1.51661 s t(norm)=0.0534091, mflops=93.6171 (err=5.1e-16) Top mflops for N=729 = 293.764 Normalized results and averages for N=729: fft 0: mflops = 293.764 (norm. = 1), norm. avg. (of 4) = 1 fft 1: mflops = 63.5758 (norm. = 0.216418), norm. avg. (of 4) = 0.146952 fft 2: mflops = 55.3192 (norm. = 0.188312), norm. avg. (of 4) = 0.128455 fft 3: mflops = 101.418 (norm. = 0.345238), norm. avg. (of 4) = 0.503258 fft 4: mflops = 89.6753 (norm. = 0.305263), norm. avg. (of 4) = 0.498346 fft 5: mflops = 152.128 (norm. = 0.517857), norm. avg. (of 3) = 0.502752 fft 6: mflops = 93.6171 (norm. = 0.318681), norm. avg. (of 3) = 0.260157 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.34995 s, 8192 iters, t-(init.)=1.28328 s t(norm)=0.0157188, mflops=318.089 (err=4.0e-16) 1. PDA: elapsed time t=1.48327 s, 2048 iters, t-(init.)=1.46661 s t(norm)=0.0718576, mflops=69.5821 (err=4.3e-16) 2. 
PDA (f2c): elapsed time t=1.73326 s, 2048 iters, t-(init.)=1.7166 s t(norm)=0.084106, mflops=59.4488 (err=4.3e-16) 3. Singleton: elapsed time t=1.84993 s, 4096 iters, t-(init.)=1.81659 s t(norm)=0.0445027, mflops=112.353 (err=4.6e-16) 4. Singleton (f2c): elapsed time t=1.79993 s, 4096 iters, t-(init.)=1.7666 s t(norm)=0.0432779, mflops=115.533 (err=4.6e-16) 5. Temperton: elapsed time t=1.21662 s, 4096 iters, t-(init.)=1.18329 s t(norm)=0.028988, mflops=172.485 (err=6.3e-16) 6. Temperton (f2c): elapsed time t=1.26662 s, 2048 iters, t-(init.)=1.24995 s t(norm)=0.0612423, mflops=81.643 (err=3.4e-16) Top mflops for N=1000 = 318.089 Normalized results and averages for N=1000: fft 0: mflops = 318.089 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 69.5821 (norm. = 0.21875), norm. avg. (of 5) = 0.161311 fft 2: mflops = 59.4488 (norm. = 0.186893), norm. avg. (of 5) = 0.140143 fft 3: mflops = 112.353 (norm. = 0.353211), norm. avg. (of 5) = 0.473249 fft 4: mflops = 115.533 (norm. = 0.363208), norm. avg. (of 5) = 0.471318 fft 5: mflops = 172.485 (norm. = 0.542254), norm. avg. (of 4) = 0.512627 fft 6: mflops = 81.643 (norm. = 0.256667), norm. avg. (of 4) = 0.259284 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.26662 s, 4096 iters, t-(init.)=1.23328 s t(norm)=0.0217971, mflops=229.388 (err=4.3e-16) 1. PDA: elapsed time t=1.48327 s, 512 iters, t-(init.)=1.46661 s t(norm)=0.207367, mflops=24.1118 (err=5.6e-16) 2. PDA (f2c): elapsed time t=1.5666 s, 512 iters, t-(init.)=1.54994 s t(norm)=0.219149, mflops=22.8155 (err=5.6e-16) 3. Singleton: elapsed time t=1.08329 s, 2048 iters, t-(init.)=1.06662 s t(norm)=0.0377031, mflops=132.615 (err=6.6e-16) 4. Singleton (f2c): elapsed time t=1.98325 s, 4096 iters, t-(init.)=1.93326 s t(norm)=0.0341685, mflops=146.334 (err=6.6e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 
Top mflops for N=1331 = 229.388 Normalized results and averages for N=1331: fft 0: mflops = 229.388 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 24.1118 (norm. = 0.105114), norm. avg. (of 6) = 0.151945 fft 2: mflops = 22.8155 (norm. = 0.0994624), norm. avg. (of 6) = 0.133363 fft 3: mflops = 132.615 (norm. = 0.578125), norm. avg. (of 6) = 0.490728 fft 4: mflops = 146.334 (norm. = 0.637931), norm. avg. (of 6) = 0.499087 fft 5: mflops = -1 (norm. = -0.00435942), norm. avg. (of 4) = 0.512627 fft 6: mflops = -1 (norm. = -0.00435942), norm. avg. (of 4) = 0.259284 Benchmarking for array size = 12x12x12: 0. FFTW: elapsed time t=1.26662 s, 4096 iters, t-(init.)=1.21662 s t(norm)=0.0159825, mflops=312.842 (err=3.9e-16) 1. PDA: elapsed time t=1.11662 s, 1024 iters, t-(init.)=1.09996 s t(norm)=0.0577997, mflops=86.5056 (err=3.9e-16) 2. PDA (f2c): elapsed time t=1.34995 s, 1024 iters, t-(init.)=1.33328 s t(norm)=0.0700603, mflops=71.3671 (err=3.8e-16) 3. Singleton: elapsed time t=1.13329 s, 1024 iters, t-(init.)=1.11662 s t(norm)=0.0586755, mflops=85.2145 (err=4.0e-16) 4. Singleton (f2c): elapsed time t=1.31661 s, 1024 iters, t-(init.)=1.29995 s t(norm)=0.0683088, mflops=73.1971 (err=4.0e-16) 5. Temperton: elapsed time t=1.08329 s, 2048 iters, t-(init.)=1.06662 s t(norm)=0.0280241, mflops=178.418 (err=1.8e-15) 6. Temperton (f2c): elapsed time t=1.88326 s, 2048 iters, t-(init.)=1.84993 s t(norm)=0.0486043, mflops=102.872 (err=3.9e-16) Top mflops for N=1728 = 312.842 Normalized results and averages for N=1728: fft 0: mflops = 312.842 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 86.5056 (norm. = 0.276515), norm. avg. (of 7) = 0.169741 fft 2: mflops = 71.3671 (norm. = 0.228125), norm. avg. (of 7) = 0.1469 fft 3: mflops = 85.2145 (norm. = 0.272388), norm. avg. (of 7) = 0.459537 fft 4: mflops = 73.1971 (norm. = 0.233974), norm. avg. (of 7) = 0.461214 fft 5: mflops = 178.418 (norm. = 0.570313), norm. avg. (of 5) = 0.524164 fft 6: mflops = 102.872 (norm. = 0.328829), norm. 
avg. (of 5) = 0.273193 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.19995 s, 2048 iters, t-(init.)=1.14995 s t(norm)=0.0230221, mflops=217.182 (err=4.5e-16) 1. PDA: elapsed time t=1.36661 s, 256 iters, t-(init.)=1.36661 s t(norm)=0.218877, mflops=22.8439 (err=9.2e-16) 2. PDA (f2c): elapsed time t=1.38328 s, 256 iters, t-(init.)=1.38328 s t(norm)=0.221546, mflops=22.5686 (err=9.2e-16) 3. Singleton: elapsed time t=1.04996 s, 1024 iters, t-(init.)=1.03329 s t(norm)=0.0413731, mflops=120.851 (err=7.7e-16) 4. Singleton (f2c): elapsed time t=1.96659 s, 2048 iters, t-(init.)=1.93326 s t(norm)=0.0387039, mflops=129.186 (err=7.7e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 217.182 Normalized results and averages for N=2197: fft 0: mflops = 217.182 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 22.8439 (norm. = 0.105183), norm. avg. (of 8) = 0.161671 fft 2: mflops = 22.5686 (norm. = 0.103916), norm. avg. (of 8) = 0.141527 fft 3: mflops = 120.851 (norm. = 0.556452), norm. avg. (of 8) = 0.471651 fft 4: mflops = 129.186 (norm. = 0.594828), norm. avg. (of 8) = 0.477916 fft 5: mflops = -1 (norm. = -0.00460443), norm. avg. (of 5) = 0.524164 fft 6: mflops = -1 (norm. = -0.00460443), norm. avg. (of 5) = 0.273193 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.03329 s, 2048 iters, t-(init.)=0.99996 s t(norm)=0.0155784, mflops=320.956 (err=4.1e-16) 1. PDA: elapsed time t=1.83326 s, 512 iters, t-(init.)=1.83326 s t(norm)=0.114242, mflops=43.7668 (err=4.5e-16) 2. PDA (f2c): elapsed time t=1.11662 s, 256 iters, t-(init.)=1.09996 s t(norm)=0.13709, mflops=36.4723 (err=4.5e-16) 3. Singleton: elapsed time t=1.91659 s, 1024 iters, t-(init.)=1.89992 s t(norm)=0.0591981, mflops=84.4622 (err=5.6e-16) 4. Singleton (f2c): elapsed time t=1.79993 s, 1024 iters, t-(init.)=1.78326 s t(norm)=0.0555631, mflops=89.9878 (err=5.6e-16) 5. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 320.956 Normalized results and averages for N=2744: fft 0: mflops = 320.956 (norm. = 1), norm. avg. (of 9) = 1 fft 1: mflops = 43.7668 (norm. = 0.136364), norm. avg. (of 9) = 0.158859 fft 2: mflops = 36.4723 (norm. = 0.113636), norm. avg. (of 9) = 0.138428 fft 3: mflops = 84.4622 (norm. = 0.263158), norm. avg. (of 9) = 0.448485 fft 4: mflops = 89.9878 (norm. = 0.280374), norm. avg. (of 9) = 0.455967 fft 5: mflops = -1 (norm. = -0.00311569), norm. avg. (of 5) = 0.524164 fft 6: mflops = -1 (norm. = -0.00311569), norm. avg. (of 5) = 0.273193 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.94992 s, 2048 iters, t-(init.)=1.89992 s t(norm)=0.023452, mflops=213.201 (err=4.7e-16) 1. PDA: elapsed time t=1.11662 s, 512 iters, t-(init.)=1.09996 s t(norm)=0.0543099, mflops=92.0642 (err=4.8e-16) 2. PDA (f2c): elapsed time t=1.43328 s, 512 iters, t-(init.)=1.43328 s t(norm)=0.0707675, mflops=70.6539 (err=4.8e-16) 3. Singleton: elapsed time t=1.13329 s, 512 iters, t-(init.)=1.11662 s t(norm)=0.0551328, mflops=90.6901 (err=6.1e-16) 4. Singleton (f2c): elapsed time t=1.13329 s, 512 iters, t-(init.)=1.11662 s t(norm)=0.0551328, mflops=90.6901 (err=6.1e-16) 5. Temperton: elapsed time t=1.21662 s, 1024 iters, t-(init.)=1.19995 s t(norm)=0.0296236, mflops=168.784 (err=2.0e-15) 6. Temperton (f2c): elapsed time t=1.13329 s, 512 iters, t-(init.)=1.11662 s t(norm)=0.0551328, mflops=90.6901 (err=4.6e-16) Top mflops for N=3375 = 213.201 Normalized results and averages for N=3375: fft 0: mflops = 213.201 (norm. = 1), norm. avg. (of 10) = 1 fft 1: mflops = 92.0642 (norm. = 0.431818), norm. avg. (of 10) = 0.186155 fft 2: mflops = 70.6539 (norm. = 0.331395), norm. avg. (of 10) = 0.157725 fft 3: mflops = 90.6901 (norm. = 0.425373), norm. avg. (of 10) = 0.446174 fft 4: mflops = 90.6901 (norm. = 0.425373), norm. avg. 
(of 10) = 0.452907 fft 5: mflops = 168.784 (norm. = 0.791667), norm. avg. (of 6) = 0.568748 fft 6: mflops = 90.6901 (norm. = 0.425373), norm. avg. (of 6) = 0.298556 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.01663 s, 128 iters, t-(init.)=0.966628 s t(norm)=0.0320252, mflops=156.127 (err=4.5e-16) 1. PDA: elapsed time t=1.84993 s, 128 iters, t-(init.)=1.78326 s t(norm)=0.0590809, mflops=84.6297 (err=5.2e-16) 2. PDA (f2c): elapsed time t=1.13329 s, 64 iters, t-(init.)=1.09996 s t(norm)=0.0728849, mflops=68.6014 (err=4.8e-16) 3. Singleton: elapsed time t=1.21662 s, 64 iters, t-(init.)=1.19995 s t(norm)=0.0795108, mflops=62.8846 (err=5.4e-16) 4. Singleton (f2c): elapsed time t=1.31661 s, 64 iters, t-(init.)=1.28328 s t(norm)=0.0850323, mflops=58.8012 (err=5.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 156.127 Normalized results and averages for N=16800: fft 0: mflops = 156.127 (norm. = 1), norm. avg. (of 11) = 1 fft 1: mflops = 84.6297 (norm. = 0.542056), norm. avg. (of 11) = 0.21851 fft 2: mflops = 68.6014 (norm. = 0.439394), norm. avg. (of 11) = 0.183331 fft 3: mflops = 62.8846 (norm. = 0.402778), norm. avg. (of 11) = 0.442229 fft 4: mflops = 58.8012 (norm. = 0.376623), norm. avg. (of 11) = 0.445972 fft 5: mflops = -1 (norm. = -0.00640503), norm. avg. (of 6) = 0.568748 fft 6: mflops = -1 (norm. = -0.00640503), norm. avg. (of 6) = 0.298556 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.24995 s, 16 iters, t-(init.)=1.13329 s t(norm)=0.0382257, mflops=130.802 (err=6.7e-16) 1. PDA: elapsed time t=1.04996 s, 8 iters, t-(init.)=0.983294 s t(norm)=0.0663328, mflops=75.3775 (err=6.3e-16) 2. PDA (f2c): elapsed time t=1.43328 s, 8 iters, t-(init.)=1.36661 s t(norm)=0.0921913, mflops=54.235 (err=6.2e-16) 3. Singleton: elapsed time t=1.13329 s, 4 iters, t-(init.)=1.09996 s t(norm)=0.148406, mflops=33.6915 (err=6.5e-16) 4. 
Singleton (f2c): elapsed time t=1.19995 s, 4 iters, t-(init.)=1.16662 s t(norm)=0.1574, mflops=31.7662 (err=6.5e-16) 5. Temperton: elapsed time t=1.06662 s, 8 iters, t-(init.)=1.01663 s t(norm)=0.0685813, mflops=72.9061 (err=1.0e-07) 6. Temperton (f2c): elapsed time t=1.28328 s, 8 iters, t-(init.)=1.23328 s t(norm)=0.083197, mflops=60.0983 (err=7.0e-16) Top mflops for N=110592 = 130.802 Normalized results and averages for N=110592: fft 0: mflops = 130.802 (norm. = 1), norm. avg. (of 12) = 1 fft 1: mflops = 75.3775 (norm. = 0.576271), norm. avg. (of 12) = 0.248323 fft 2: mflops = 54.235 (norm. = 0.414634), norm. avg. (of 12) = 0.202606 fft 3: mflops = 33.6915 (norm. = 0.257576), norm. avg. (of 12) = 0.426841 fft 4: mflops = 31.7662 (norm. = 0.242857), norm. avg. (of 12) = 0.429046 fft 5: mflops = 72.9061 (norm. = 0.557377), norm. avg. (of 7) = 0.567124 fft 6: mflops = 60.0983 (norm. = 0.459459), norm. avg. (of 7) = 0.321543 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.29995 s, 16 iters, t-(init.)=1.16662 s t(norm)=0.0367936, mflops=135.893 (err=6.4e-16) 1. PDA: elapsed time t=1.64993 s, 8 iters, t-(init.)=1.58327 s t(norm)=0.0998684, mflops=50.0659 (err=7.1e-16) 2. PDA (f2c): elapsed time t=1.18329 s, 4 iters, t-(init.)=1.14995 s t(norm)=0.145072, mflops=34.4656 (err=7.1e-16) 3. Singleton: elapsed time t=1.94992 s, 8 iters, t-(init.)=1.86659 s t(norm)=0.11774, mflops=42.4666 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.99992 s, 8 iters, t-(init.)=1.93326 s t(norm)=0.121945, mflops=41.0022 (err=1.0e-15) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 135.893 Normalized results and averages for N=117649: fft 0: mflops = 135.893 (norm. = 1), norm. avg. (of 13) = 1 fft 1: mflops = 50.0659 (norm. = 0.368421), norm. avg. (of 13) = 0.257561 fft 2: mflops = 34.4656 (norm. = 0.253623), norm. avg. (of 13) = 0.206531 fft 3: mflops = 42.4666 (norm. 
= 0.3125), norm. avg. (of 13) = 0.418046 fft 4: mflops = 41.0022 (norm. = 0.301724), norm. avg. (of 13) = 0.419252 fft 5: mflops = -1 (norm. = -0.00735873), norm. avg. (of 7) = 0.567124 fft 6: mflops = -1 (norm. = -0.00735873), norm. avg. (of 7) = 0.321543 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.31661 s, 8 iters, t-(init.)=1.18329 s t(norm)=0.0386426, mflops=129.391 (err=7.4e-16) 1. PDA: elapsed time t=2.01659 s, 8 iters, t-(init.)=1.88326 s t(norm)=0.0615015, mflops=81.2988 (err=7.4e-16) 2. PDA (f2c): elapsed time t=1.29995 s, 4 iters, t-(init.)=1.21662 s t(norm)=0.0794622, mflops=62.923 (err=7.4e-16) 3. Singleton: elapsed time t=1.41661 s, 2 iters, t-(init.)=1.38328 s t(norm)=0.180695, mflops=27.671 (err=1.0e-15) 4. Singleton (f2c): elapsed time t=1.46661 s, 2 iters, t-(init.)=1.43328 s t(norm)=0.187226, mflops=26.7057 (err=1.0e-15) 5. Temperton: elapsed time t=1.01663 s, 4 iters, t-(init.)=0.949962 s t(norm)=0.0620458, mflops=80.5856 (err=2.0e-15) 6. Temperton (f2c): elapsed time t=1.39994 s, 4 iters, t-(init.)=1.33328 s t(norm)=0.0870818, mflops=57.4173 (err=7.1e-16) Top mflops for N=216000 = 129.391 Normalized results and averages for N=216000: fft 0: mflops = 129.391 (norm. = 1), norm. avg. (of 14) = 1 fft 1: mflops = 81.2988 (norm. = 0.628319), norm. avg. (of 14) = 0.284044 fft 2: mflops = 62.923 (norm. = 0.486301), norm. avg. (of 14) = 0.226514 fft 3: mflops = 27.671 (norm. = 0.213855), norm. avg. (of 14) = 0.403461 fft 4: mflops = 26.7057 (norm. = 0.206395), norm. avg. (of 14) = 0.404048 fft 5: mflops = 80.5856 (norm. = 0.622807), norm. avg. (of 8) = 0.574084 fft 6: mflops = 57.4173 (norm. = 0.44375), norm. avg. (of 8) = 0.336819 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.54994 s, 8 iters, t-(init.)=1.38328 s t(norm)=0.0399649, mflops=125.11 (err=7.4e-16) 1. PDA: elapsed time t=1.33328 s, 4 iters, t-(init.)=1.26662 s t(norm)=0.0731888, mflops=68.3165 (err=7.8e-16) 2. 
PDA (f2c): elapsed time t=1.68327 s, 4 iters, t-(init.)=1.59994 s t(norm)=0.092449, mflops=54.0839 (err=7.8e-16) 3. Singleton: elapsed time t=1.69993 s, 2 iters, t-(init.)=1.64993 s t(norm)=0.190676, mflops=26.2225 (err=9.4e-16) 4. Singleton (f2c): elapsed time t=1.7666 s, 2 iters, t-(init.)=1.73326 s t(norm)=0.200306, mflops=24.9618 (err=9.4e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 125.11 Normalized results and averages for N=241920: fft 0: mflops = 125.11 (norm. = 1), norm. avg. (of 15) = 1 fft 1: mflops = 68.3165 (norm. = 0.546053), norm. avg. (of 15) = 0.301511 fft 2: mflops = 54.0839 (norm. = 0.432292), norm. avg. (of 15) = 0.240233 fft 3: mflops = 26.2225 (norm. = 0.209596), norm. avg. (of 15) = 0.390536 fft 4: mflops = 24.9618 (norm. = 0.199519), norm. avg. (of 15) = 0.390413 fft 5: mflops = -1 (norm. = -0.00799298), norm. avg. (of 8) = 0.574084 fft 6: mflops = -1 (norm. = -0.00799298), norm. avg. (of 8) = 0.336819 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.58327 s, 4 iters, t-(init.)=1.43328 s t(norm)=0.0454526, mflops=110.005 (err=8.0e-16) 1. PDA: elapsed time t=1.03329 s, 2 iters, t-(init.)=0.966628 s t(norm)=0.0613082, mflops=81.5551 (err=7.3e-16) 2. PDA (f2c): elapsed time t=1.29995 s, 2 iters, t-(init.)=1.21662 s t(norm)=0.0771638, mflops=64.7972 (err=7.3e-16) 3. Singleton: elapsed time t=1.28328 s, 1 iters, t-(init.)=1.23328 s t(norm)=0.156442, mflops=31.9608 (err=9.3e-16) 4. Singleton (f2c): elapsed time t=1.38328 s, 1 iters, t-(init.)=1.34995 s t(norm)=0.17124, mflops=29.1988 (err=9.3e-16) 5. Temperton: elapsed time t=1.81659 s, 4 iters, t-(init.)=1.68327 s t(norm)=0.0533804, mflops=93.6673 (err=1.4e-07) 6. 
Temperton (f2c): elapsed time t=1.34995 s, 2 iters, t-(init.)=1.28328 s t(norm)=0.0813919, mflops=61.4311 (err=9.6e-16) Top mflops for N=421875 = 110.005 Normalized results and averages for N=421875: fft 0: mflops = 110.005 (norm. = 1), norm. avg. (of 16) = 1 fft 1: mflops = 81.5551 (norm. = 0.741379), norm. avg. (of 16) = 0.329003 fft 2: mflops = 64.7972 (norm. = 0.589041), norm. avg. (of 16) = 0.262033 fft 3: mflops = 31.9608 (norm. = 0.290541), norm. avg. (of 16) = 0.384287 fft 4: mflops = 29.1988 (norm. = 0.265432), norm. avg. (of 16) = 0.382601 fft 5: mflops = 93.6673 (norm. = 0.851485), norm. avg. (of 9) = 0.604906 fft 6: mflops = 61.4311 (norm. = 0.558442), norm. avg. (of 9) = 0.361443 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.86659 s, 4 iters, t-(init.)=1.69993 s t(norm)=0.0437654, mflops=114.246 (err=6.3e-16) 1. PDA: elapsed time t=1.39994 s, 2 iters, t-(init.)=1.29995 s t(norm)=0.0669353, mflops=74.699 (err=6.2e-16) 2. PDA (f2c): elapsed time t=1.73326 s, 2 iters, t-(init.)=1.64993 s t(norm)=0.0849563, mflops=58.8538 (err=6.2e-16) 3. Singleton: elapsed time t=1.74993 s, 1 iters, t-(init.)=1.7166 s t(norm)=0.176778, mflops=28.2841 (err=8.2e-16) 4. Singleton (f2c): elapsed time t=1.83326 s, 1 iters, t-(init.)=1.79993 s t(norm)=0.185359, mflops=26.9746 (err=8.2e-16) 5. Temperton: elapsed time t=1.46661 s, 2 iters, t-(init.)=1.36661 s t(norm)=0.0703679, mflops=71.0551 (err=1.7e-07) 6. Temperton (f2c): elapsed time t=1.88326 s, 2 iters, t-(init.)=1.79993 s t(norm)=0.0926796, mflops=53.9493 (err=6.6e-16) Top mflops for N=512000 = 114.246 Normalized results and averages for N=512000: fft 0: mflops = 114.246 (norm. = 1), norm. avg. (of 17) = 1 fft 1: mflops = 74.699 (norm. = 0.653846), norm. avg. (of 17) = 0.348112 fft 2: mflops = 58.8538 (norm. = 0.515152), norm. avg. (of 17) = 0.276923 fft 3: mflops = 28.2841 (norm. = 0.247573), norm. avg. (of 17) = 0.376245 fft 4: mflops = 26.9746 (norm. = 0.236111), norm. avg. 
(of 17) = 0.373984 fft 5: mflops = 71.0551 (norm. = 0.621951), norm. avg. (of 10) = 0.606611 fft 6: mflops = 53.9493 (norm. = 0.472222), norm. avg. (of 10) = 0.372521 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.93326 s, 4 iters, t-(init.)=1.7166 s t(norm)=0.0377565, mflops=132.428 (err=7.0e-16) 1. PDA: elapsed time t=1.84993 s, 2 iters, t-(init.)=1.74993 s t(norm)=0.0769792, mflops=64.9526 (err=6.6e-16) 2. PDA (f2c): elapsed time t=1.23328 s, 1 iters, t-(init.)=1.18329 s t(norm)=0.104105, mflops=48.0284 (err=6.6e-16) 3. Singleton: elapsed time t=2.36657 s, 1 iters, t-(init.)=2.31657 s t(norm)=0.203812, mflops=24.5325 (err=8.9e-16) 4. Singleton (f2c): elapsed time t=2.41657 s, 1 iters, t-(init.)=2.36657 s t(norm)=0.20821, mflops=24.0142 (err=8.9e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 132.428 Normalized results and averages for N=592704: fft 0: mflops = 132.428 (norm. = 1), norm. avg. (of 18) = 1 fft 1: mflops = 64.9526 (norm. = 0.490476), norm. avg. (of 18) = 0.356021 fft 2: mflops = 48.0284 (norm. = 0.362676), norm. avg. (of 18) = 0.281687 fft 3: mflops = 24.5325 (norm. = 0.185252), norm. avg. (of 18) = 0.365634 fft 4: mflops = 24.0142 (norm. = 0.181338), norm. avg. (of 18) = 0.363282 fft 5: mflops = -1 (norm. = -0.00755129), norm. avg. (of 10) = 0.606611 fft 6: mflops = -1 (norm. = -0.00755129), norm. avg. (of 10) = 0.372521 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=1.73326 s, 2 iters, t-(init.)=1.58327 s t(norm)=0.0452936, mflops=110.391 (err=7.9e-16) 1. PDA: elapsed time t=1.74993 s, 1 iters, t-(init.)=1.68327 s t(norm)=0.0963085, mflops=51.9165 (err=6.4e-16) 2. PDA (f2c): elapsed time t=1.99992 s, 1 iters, t-(init.)=1.91659 s t(norm)=0.109658, mflops=45.5962 (err=6.4e-16) 3. Singleton: elapsed time t=4.41649 s, 1 iters, t-(init.)=4.34983 s t(norm)=0.248876, mflops=20.0903 (err=7.0e-16) 4. 
Singleton (f2c): elapsed time t=4.63315 s, 1 iters, t-(init.)=4.54982 s t(norm)=0.260319, mflops=19.2072 (err=7.0e-16) 5. Temperton: elapsed time t=1.6666 s, 1 iters, t-(init.)=1.58327 s t(norm)=0.0905872, mflops=55.1955 (err=1.6e-07) 6. Temperton (f2c): elapsed time t=2.04992 s, 1 iters, t-(init.)=1.96659 s t(norm)=0.112519, mflops=44.437 (err=7.5e-16) Top mflops for N=884736 = 110.391 Normalized results and averages for N=884736: fft 0: mflops = 110.391 (norm. = 1), norm. avg. (of 19) = 1 fft 1: mflops = 51.9165 (norm. = 0.470297), norm. avg. (of 19) = 0.362035 fft 2: mflops = 45.5962 (norm. = 0.413043), norm. avg. (of 19) = 0.2886 fft 3: mflops = 20.0903 (norm. = 0.181992), norm. avg. (of 19) = 0.355969 fft 4: mflops = 19.2072 (norm. = 0.173993), norm. avg. (of 19) = 0.353319 fft 5: mflops = 55.1955 (norm. = 0.5), norm. avg. (of 11) = 0.596919 fft 6: mflops = 44.437 (norm. = 0.402542), norm. avg. (of 11) = 0.37525 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=1.18329 s, 1 iters, t-(init.)=1.08329 s t(norm)=0.0464578, mflops=107.625 (err=7.4e-16) 1. PDA: elapsed time t=1.83326 s, 1 iters, t-(init.)=1.73326 s t(norm)=0.0743324, mflops=67.2654 (err=7.8e-16) 2. PDA (f2c): elapsed time t=2.48323 s, 1 iters, t-(init.)=2.38324 s t(norm)=0.102207, mflops=48.9203 (err=7.2e-16) 3. Singleton: elapsed time t=4.01651 s, 1 iters, t-(init.)=3.91651 s t(norm)=0.167963, mflops=29.7685 (err=8.0e-16) 4. Singleton (f2c): elapsed time t=4.04984 s, 1 iters, t-(init.)=3.93318 s t(norm)=0.168677, mflops=29.6424 (err=8.0e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 107.625 Normalized results and averages for N=1157625: fft 0: mflops = 107.625 (norm. = 1), norm. avg. (of 20) = 1 fft 1: mflops = 67.2654 (norm. = 0.625), norm. avg. (of 20) = 0.375183 fft 2: mflops = 48.9203 (norm. = 0.454545), norm. avg. (of 20) = 0.296897 fft 3: mflops = 29.7685 (norm. 
= 0.276596), norm. avg. (of 20) = 0.352 fft 4: mflops = 29.6424 (norm. = 0.275424), norm. avg. (of 20) = 0.349424 fft 5: mflops = -1 (norm. = -0.00929155), norm. avg. (of 11) = 0.596919 fft 6: mflops = -1 (norm. = -0.00929155), norm. avg. (of 11) = 0.37525 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=1.46661 s, 1 iters, t-(init.)=1.34995 s t(norm)=0.0470503, mflops=106.269 (err=5.8e-16) 1. PDA: elapsed time t=2.3999 s, 1 iters, t-(init.)=2.28324 s t(norm)=0.079579, mflops=62.8307 (err=6.1e-16) 2. PDA (f2c): elapsed time t=3.24987 s, 1 iters, t-(init.)=3.13321 s t(norm)=0.109203, mflops=45.7862 (err=5.6e-16) 3. Singleton: elapsed time t=5.51645 s, 1 iters, t-(init.)=5.39978 s t(norm)=0.188201, mflops=26.5673 (err=6.1e-16) 4. Singleton (f2c): elapsed time t=5.69977 s, 1 iters, t-(init.)=5.58311 s t(norm)=0.194591, mflops=25.6949 (err=6.1e-16) 5. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 6. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 106.269 Normalized results and averages for N=1404928: fft 0: mflops = 106.269 (norm. = 1), norm. avg. (of 21) = 1 fft 1: mflops = 62.8307 (norm. = 0.591241), norm. avg. (of 21) = 0.385472 fft 2: mflops = 45.7862 (norm. = 0.430851), norm. avg. (of 21) = 0.303276 fft 3: mflops = 26.5673 (norm. = 0.25), norm. avg. (of 21) = 0.347143 fft 4: mflops = 25.6949 (norm. = 0.241791), norm. avg. (of 21) = 0.344299 fft 5: mflops = -1 (norm. = -0.00941007), norm. avg. (of 11) = 0.596919 fft 6: mflops = -1 (norm. = -0.00941007), norm. avg. (of 11) = 0.37525 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=1.6166 s, 1 iters, t-(init.)=1.46661 s t(norm)=0.0409606, mflops=122.068 (err=7.3e-16) 1. PDA: elapsed time t=2.4499 s, 1 iters, t-(init.)=2.29991 s t(norm)=0.0642337, mflops=77.8408 (err=7.9e-16) 2. PDA (f2c): elapsed time t=2.96655 s, 1 iters, t-(init.)=2.81655 s t(norm)=0.078663, mflops=63.5623 (err=7.8e-16) 3. 
Singleton: elapsed time t=9.2163 s, 1 iters, t-(init.)=9.0663 s t(norm)=0.253211, mflops=19.7464 (err=9.4e-16) 4. Singleton (f2c): elapsed time t=9.58295 s, 1 iters, t-(init.)=9.43296 s t(norm)=0.263451, mflops=18.9788 (err=9.4e-16) 5. Temperton: elapsed time t=2.96655 s, 1 iters, t-(init.)=2.81655 s t(norm)=0.078663, mflops=63.5623 (err=1.1e-08) 6. Temperton (f2c): elapsed time t=4.24983 s, 1 iters, t-(init.)=4.09984 s t(norm)=0.114504, mflops=43.6668 (err=6.9e-16) Top mflops for N=1728000 = 122.068 Normalized results and averages for N=1728000: fft 0: mflops = 122.068 (norm. = 1), norm. avg. (of 22) = 1 fft 1: mflops = 77.8408 (norm. = 0.637681), norm. avg. (of 22) = 0.396936 fft 2: mflops = 63.5623 (norm. = 0.52071), norm. avg. (of 22) = 0.31316 fft 3: mflops = 19.7464 (norm. = 0.161765), norm. avg. (of 22) = 0.338716 fft 4: mflops = 18.9788 (norm. = 0.155477), norm. avg. (of 22) = 0.335716 fft 5: mflops = 63.5623 (norm. = 0.52071), norm. avg. (of 12) = 0.590568 fft 6: mflops = 43.6668 (norm. = 0.357724), norm. avg. 
(of 12) = 0.37379
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Bailey, Beauregard, Bergland, Brenner, Burrus, CWP (min N), CWP (best N), Edelblute, FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Monro, NAPACK (f2c), Ooura (C), Ooura (F), Ransom, SCIPORT, Singleton, Singleton (f2c), Sorensen, Sorensen DIT, Temperton, Temperton (f2c), Valkenburg
2, 52.4309, 55.9263, 46.6052, 1.96616, 16.3421, 6.22941, 9.98684, 12.3367, 22.4704, 7.58037, 7.96419, , 15.7293, 22.4704, 74.0201, -186.421, 136.037, , 19.9737, 7.14967, 12.4588, 41.2571, , , , , 4.99342, 98.6935, 78.6463, , , 11.0381, 6.99079, 44.1523, 35.4462, 10.1479, 8.06629, 9.98684
4, 121.286, 119.842, 50.3337, 6.91396, 29.2638, 14.2993, 34.9539, 33.5558, 44.1523, 29.608, 17.2376, 48.8676, 82.5142, 37.5624, 91.5157, 209.724, 107.093, , 64.5303, 16.7779, 21.8842, 63.7135, 85.3113, 88.3047, 78.6463, 11.3364, 11.7602, 167.779, 150.25, 9.98684, 89.8815, 34.4751, 23.0888, 51.8904, 52.9828, 34.9539, 25.6804, 15.3456
8, 155.671, 198.686, 59.921, 10.3709, 58.0773, 20.0799, 56.3437, 42.416, 51.0138, 62.9171, 51.7127, 45.4822, 80.3197, 71.2269, 382.281, 2516.68, 549.094, 171.592, 98.0526, 29.4924, 35.6134, 104.862, 109.421, 112.687, 102.028, 25.8563, 17.9763, 215.716, 212.677, 12.5834, 125.834, 31.4585, 24.5131, 68.6368, 48.3978, 65.0866, 27.7575, 18.3254
16, 86.7822, 85.3113, 55.9263, 18.7812, 66.2285, 22.6728, 82.5142, 62.9171, 50.3337, 130.737, 88.3047, 41.9447, 165.028, 119.842, 366.063, 516.243, 619.491, 154.873, 167.779, 46.6052, 54.7105, 139.816, 89.8815, 114.395, 104.862, 43.3911, 27.9631, 231.419, 214.186, 29.9605, 201.335, 79.8947, 68.9502, 81.1833, 48.8676, 70.8925, 66.2285, 18.7812
32, 106.639, 104.862, 66.9331, 22.796, 78.6463, 24.9671, 122.169, 73.1594, 53.3196, 157.293, 185.05, 46.953, 131.077, 118.711, 629.171, 762.631, 547.105, 216.955, 161.326, 67.6528, 75.8037, 144.637, 106.639, 132.457, 129.726, 70.6933, 33.8264, 251.668, 235.204, 35.7483, 239.684, 110.381, 95.3289, 104.862, 50.7396, 83.8894, 61.6834, 18.9509
64, 98.0526, 96.7955, 71.2269, 34.3184, 86.7822, 24.5131, 149.506, 100.667, 58.9848, 215.716, 204.055, 51.0138, 201.335, 141.122, 437.684, 368.295, 282.245, 321.279, 179.763, 78.6463, 90.9644, 123.771, 102.028, 129.061, 125.834, 90.9644, 41.4838, 260.347, 247.543, 61.8856, 193.591, 155.671, 148.04, 132.457, 56.3437, 114.395, 82.0658, 18.505
128, 111.499, 106.125, 77.2666, 31.9145, 87.2118, 24.7427, 154.533, 102.423, 64.7676, 271.027, 303.738, 55.0524, 189.428, 142.071, 400.381, 404.983, 279.631, 329.286, 174.424, 89.8815, 107.419, 101.246, 119.032, 151.869, 151.869, 104.862, 41.1607, 251.668, 255.316, 58.7226, 207.256, 142.071, 135.514, 155.901, 60.3314, 124.062, 76.5947, 19.4876
256, 104.862, 101.684, 76.2631, 34.4751, 77.4364, 23.7423, 165.028, 105.966, 63.7135, 264.914, 366.063, 55.3117, 216.489, 170.623, 341.245, 347.129, 296.08, 324.733, 181.383, 95.8736, 115.71, 101.684, 107.093, 141.785, 134.223, 111.853, 40.5917, 254.854, 251.668, 84.5944, 167.779, 186.421, 178.172, 152.526, 60.643, 121.286, 83.8894, 19.0658
512, 114.395, 113.251, 84.5155, 35.3909, 80.8934, 23.9938, 191.95, 97.6299, 69.0553, 333.09, 283.127, 61.5493, 161.787, 114.395, 319.016, 319.016, 240.959, 348.464, 141.563, 100.222, 127.248, 96.7955, 124.451, 164.132, 143.355, 117.97, 37.7502, 238.423, 240.959, 82.0658, 124.451, 193.591, 191.95, 155.138, 64.347, 102.955, 73.5394, 17.2638
1024, 81.7105, 86.1878, 52.8715, 36.5797, 55.1904, 22.4704, 125.834, 80.6629, 52.4309, 256.804, 256.804, 44.3078, 174.77, 123.367, 340.092, 285.987, 190.658, 241.989, 129.726, 62.9171, 64.863, 76.7281, 93.9061, 112.352, 103.143, 83.8894, 33.1143, 174.77, 163.421, 80.6629, 113.364, 138.279, 119.842, 92.5251, 40.3315, 96.7955, 74.9013, 15.2711
2048, 86.511, 89.8815, 55.8135, 35.3106, 59.6627, 22.1823, 139.816, 79.5503, 51.6483, 216.277, 263.653, 46.1392, 159.101, 115.348, 354.917, 304.214, 192.247, 234.606, 126.989, 64.0822, 65.2913, 71.3493, 75.2269, 82.3914, 81.4221, 73.6264, 32.3406, 153.797, 147.253, 69.2088, 111.627, 111.627, 94.8066, 91.0642, 39.7752, 91.0642, 68.5235, 14.4185
4096, 81.1833, 84.832, 58.0773, 41.4838, 56.3437, 21.9478, 152.526, 87.7913, 51.7127, 209.724, 239.684, 47.7851, 173.564, 129.061, 302.002, 247.543, 155.671, 247.543, 141.122, 66.2285, 66.8146, 68.6368, 73.3014, 79.4742, 79.4742, 74.0201, 32.8263, 173.564, 162.367, 88.8241, 53.9289, 137.274, 107.858, 89.8815, 40.1598, 96.7955, 76.2631, 13.6776
8192, 88.9046, 92.9457, 57.6001, 40.0942, 43.5065, 22.2261, 141.021, 85.2002, 53.1118, 230.401, 255.601, 49.2724, 97.3717, 77.8973, 240.565, 204.48, 144.765, 230.401, 100.978, 67.0428, 65.9615, , 77.8973, 85.2002, 81.7922, 73.0287, 28.8001, 152.883, 144.765, 77.1625, 44.9408, 127.8, 103.534, 83.4614, 40.8961, 89.8815, 65.9615, 12.78
16384, 46.36, 47.3569, 34.9539, 39.3232, 38.2974, 20.3898, 102.423, 64.7676, 33.8784, 204.846, 207.256, 31.9145, 91.7541, 77.2666, 189.428, 181.616, 86.3568, 125.834, 71.0354, 42.348, 42.348, , 71.0354, 77.9504, 76.5947, 42.7592, 25.3115, 131.469, 124.062, 87.2118, 27.5262, 84.6961, 74.6474, 59.5162, 28.5987, 69.9079, 55.7493, 11.9679
32768, 39.3232, 38.6785, 29.1283, 32.3204, 30.6414, 19.3393, 85.0231, 48.6472, 28.088, 173.166, 176.403, 26.51, 65.5386, 58.2566, 152.219, 140.859, 85.0231, 98.3079, 62.9171, 33.2309, 33.7056, , 56.8528, 58.2566, 53.6225, 33.7056, 20.3396, 98.3079, 94.3756, 69.3938, 25.0999, 53.02, 49.154, 47.1878, 24.3236, 50.1998, 41.7591, 10.4398
65536, 26.4914, 27.0611, 21.8842, 32.6842, 25.1668, 17.9763, 65.3684, 38.7182, 21.1486, 159.789, 157.293, 20.9724, 59.921, 54.1222, 125.834, 119.842, 77.4364, 77.4364, 58.5275, 25.421, 26.2154, , 29.2638, 30.6913, 30.3215, 25.1668, 19.6616, 86.7822, 85.3113, 62.1403, 17.7231, 47.9368, 45.3456, 40.5917, 19.0658, 41.9447, 37.01, 9.67955
131072, 21.5643, 20.2574, 15.9165, 25.2262, 23.0515, 16.7123, 48.1797, 30.0447, 15.5464, 125.834, 125.834, 15.9165, 45.3216, 41.7809, 121.544, 110.267, 65.2189, 53.4795, 46.103, 19.955, 19.955, , 30.0447, 30.7354, 29.3843, 18.0674, 17.1409, 67.6956, 65.2189, 49.9809, 15.9165, 34.2817, 33.4247, 30.3861, 14.2233, 32.6095, 27.5668, 9.15745
262144, 18.1492, 17.6954, 14.2993, 29.8028, 22.1193, 16.4609, 43.558, 26.7101, 13.6119, 85.796, 87.116, 13.8788, 45.6656, 42.898, 99.3428, 107.858, 54.9761, 50.1109, 47.1878, 17.6954, 17.9194, , 21.779, 21.449, 21.779, 17.0558, 17.2638, 65.8435, 63.624, 60.2398, 14.2993, 33.7056, 32.1735, 26.7101, 12.6396, 32.1735, 28.0324, 8.42639
Norm. Avg., 0.248459, 0.250891, 0.16939, 0.112237, 0.163416, 0.074236, 0.348522, 0.217773, 0.144309, 0.622592, 0.647715, 0.133091, 0.380739, 0.298928, 0.820952, 0.842076, 0.604291, 0.604323, 0.343898, 0.165608, 0.182795, 0.233862, 0.249195, 0.285466, 0.27118, 0.187026, 0.0928705, 0.550815, 0.521294, 0.240459, 0.266749, 0.309601, 0.27701, 0.270437, 0.136384, 0.241179, 0.185422, 0.0505709
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, Brenner, CWP (min N), CWP (best N), FFTPACK, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c), Valkenburg
6, , 39.348, 33.4188, 68.7204, 68.7204, 309.787, -3252.77, 35.8761, 45.5995, 16.7094, 26.232, 19.3617, 28.0411, 16.4836, 18.2058
9, 15.8009, 69.0378, 63.2036, 130.071, 99.7212, 276.151, 512.852, 32.5178, 78.7273, 22.2151, 45.7904, 35.0582, 38.0293, 28.0466, 16.498
12, , 100.995, 92.694, 175.757, 98.0675, 294.203, 451.111, 58.3333, 80.5555, 22.8603, 48.3333, 36.7753, 71.9858, 38.0149, 17.6215
15, 23.7574, 119.713, 126.272, 196.125, 122.905, 211.905, 180.743, 37.7782, 62.283, 17.1975, 53.5923, 44.747, 80.1555, 41.1513, 17.1975
18, 19.1659, 113.521, 107.329, 112.44, 100.052, 265.307, 271.406, 36.8943, 111.379, 28.9367, 65.5899, 52.2397, 52.7061, 35.5608, 21.0825
24, , 158.792, 157.349, 120.197, 105.539, 274.736, 279.167, 77.2694, 157.349, 29.2371, 61.8155, 49.7366, 104.267, 43.7079, 20.22
36, 28.1489, 200.513, 215.257, 225.191, 126.185, 252.37, 232.34, 46.9149, 172.205, 35.5278, 98.9017, 84.1233, 106.068, 67.1442, 13.4535
80, 49.7197, 280.111, 272.436, 231.254, 165.732, 364.915, 331.464, 80.193, 96.543, 26.4466, 152.984, 132.586, 175.999, 85.7235, 19.1229
108, 30.9131, 231.35, 243.113, 239.061, 136.606, 284.033, 292.728, 35.5041, 147.873, 39.4057, 107.042, 93.1407, 131.593, 95.6245, 19.7029
210, 36.195, 321.733, 328.367, 130.539, 86.5533, 224.307, 159.258, 36.5271, 115.404, 20.1083, 85.6226, 76.5664, , , 17.7743
504, 35.641, 423.618, 427.691, 111.2, 74.1332, 252.727, 234.105, 43.4374, 171.077, 23.5593, 104.905, 98.4069, , , 16.747
1000, 34.4001, 242.504, 310.037, 159.045, 118.898, 231.065, 237.795, 47.8377, 85.0448, 18.0095, 98.7617, 92.7761, 147.548, 64.455, 15.6205
1960, 35.7899, 274.389, 274.389, 80.3091, 54.8779, 182.926, 180.42, 45.1051, 121.951, 17.1493, 81.3005, 77.4746, , , 13.4946
4725, 24.3381, 218.742, 272.586, 90.3985, 62.3877, 162.551, 164.057, 28.035, 89.4854, 18.3038, 73.8254, 69.2113, , , 13.8423
10368, 27.0981, 265.561, 259.084, 100.212, 81.7111, 183.146, 189.687, 40.2365, 134.461, 29.5068, 80.4731, 75.8746, 118.027, 78.1062, 13.8313
27000, 19.6688, 231.258, 224.456, 79.4948, 70.6621, 135.071, 127.192, 28.057, 80.3316, 17.8306, 61.5444, 56.9515, 79.4948, 50.8767, 12.3888
75600, 15.1575, 147.028, 145.212, 43.2434, 37.2222, 112.021, 107.91, 25.57, 57.0981, 15.8094, 41.4162, 40.2815, , , 10.502
165375, 12.2869, 116.622, 118.633, 23.564, 20.9777, 95.5651, 96.9111, 18.6975, 53.7554, 13.2321, 34.0628, 33.0802, , , 10.2391
362880, 12.1861, 55.8529, 56.6396, 35.2755, 30.4652, 104.452, 101.808, 22.0956, 60.9304, 15.3489, 27.1717, 25.7783, , , 9.05723
Norm. Avg., 0.100099, 0.708561, 0.731458, 0.438134, 0.313968, 0.802424, 0.764762, 0.151405, 0.382961, 0.0866624, 0.261249, 0.233652, 0.323869, 0.1853, 0.0593695
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM, HARM (f2c), PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
4x4x4, 222.06, , , 44.4121, 38.5207, 169.664, 169.664, 152.526, 87.7913
8x8x8, 404.467, 200.444, 85.796, 56.6254, 58.3767, 130.173, 112.129, 231.124, 73.5394
16x16x16, 308.165, 136.037, 77.0413, 102.028, 75.5005, 117.97, 88.8241, 167.779, 114.395
32x32x32, 255.069, 96.3016, 65.5386, 62.0892, 55.5151, 53.02, 45.8134, 85.796, 62.0892
64x64x64, 138.111, 65.0866, 50.5584, 58.3767, 48.815, 28.8905, 26.9645, 66.6181, 53.4202
256x64x32, 103.054, 64.2701, 48.9928, 48.9928, 42.0924, 24.6989, 23.1671, 64.9687, 52.8949
16x1024x64, 104.862, 63.5526, 48.7729, 50.7396, 44.3078, 27.8394, 25.4725, ,
128x128x128, 84.6961, 55.9855, 45.2486, 51.2116, 43.4625, 18.4533, 16.9392, 43.8956, 33.8784
Norm. Avg., 1, 0.525224, 0.365701, 0.362639, 0.309755, 0.326097, 0.29651, 0.538595, 0.355981
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA, PDA (f2c), Singleton, Singleton (f2c), Temperton, Temperton (f2c)
5x5x5, 300.347, 49.1948, 40.7614, 248.113, 259.391, 186.085, 83.9206
6x6x6, 370.048, 42.441, 39.2074, 124.751, 111.264, 137.226, 67.4882
7x7x7, 258.172, 23.9855, 21.6455, 130.27, 135.233, ,
9x9x9, 293.764, 63.5758, 55.3192, 101.418, 89.6753, 152.128, 93.6171
10x10x10, 318.089, 69.5821, 59.4488, 112.353, 115.533, 172.485, 81.643
11x11x11, 229.388, 24.1118, 22.8155, 132.615, 146.334, ,
12x12x12, 312.842, 86.5056, 71.3671, 85.2145, 73.1971, 178.418, 102.872
13x13x13, 217.182, 22.8439, 22.5686, 120.851, 129.186, ,
14x14x14, 320.956, 43.7668, 36.4723, 84.4622, 89.9878, ,
15x15x15, 213.201, 92.0642, 70.6539, 90.6901, 90.6901, 168.784, 90.6901
24x25x28, 156.127, 84.6297, 68.6014, 62.8846, 58.8012, ,
48x48x48, 130.802, 75.3775, 54.235, 33.6915, 31.7662, 72.9061, 60.0983
49x49x49, 135.893, 50.0659, 34.4656, 42.4666, 41.0022, ,
60x60x60, 129.391, 81.2988, 62.923, 27.671, 26.7057, 80.5856, 57.4173
72x60x56, 125.11, 68.3165, 54.0839, 26.2225, 24.9618, ,
75x75x75, 110.005, 81.5551, 64.7972, 31.9608, 29.1988, 93.6673, 61.4311
80x80x80, 114.246, 74.699, 58.8538, 28.2841, 26.9746, 71.0551, 53.9493
84x84x84, 132.428, 64.9526, 48.0284, 24.5325, 24.0142, ,
96x96x96, 110.391, 51.9165, 45.5962, 20.0903, 19.2072, 55.1955, 44.437
105x105x105, 107.625, 67.2654, 48.9203, 29.7685, 29.6424, ,
112x112x112, 106.269, 62.8307, 45.7862, 26.5673, 25.6949, ,
120x120x120, 122.068, 77.8408, 63.5623, 19.7464, 18.9788, 63.5623, 43.6668
Norm. Avg., 1, 0.396936, 0.31316, 0.338716, 0.335716, 0.590568, 0.37379
@@@@ end
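The "Norm. Avg." row of each .dat table appears to be the mean, over array sizes, of each transform's mflops divided by the best mflops measured for that size. The sketch below recomputes that row from the text of a .dat table. It is only an illustration: the function name normalized_averages is hypothetical, and it assumes that blank fields mark skipped transforms and are left out of that column's average. It does not treat negative mflops entries specially, so columns that contain them may not reproduce the logged value.

```python
import csv
import io


def normalized_averages(dat_text):
    """Return {column name: mean of (mflops / row maximum)} for one .dat table.

    Assumption: a blank field means the transform was skipped for that size,
    so it is excluded from that column's average.
    """
    rows = [r for r in csv.reader(io.StringIO(dat_text), skipinitialspace=True) if r]
    names = [name.strip() for name in rows[0][1:]]
    totals = [0.0] * len(names)
    counts = [0] * len(names)

    for row in rows[1:]:
        if row[0].strip().startswith("Norm."):
            continue  # ignore the summary row already present in the table
        fields = [f.strip() for f in row[1:len(names) + 1]]
        values = [float(f) if f else None for f in fields]
        present = [v for v in values if v is not None]
        if not present:
            continue
        top = max(present)  # best mflops for this array size
        for i, v in enumerate(values):
            if v is not None:
                totals[i] += v / top
                counts[i] += 1

    return {n: totals[i] / counts[i] for i, n in enumerate(names) if counts[i]}


if __name__ == "__main__":
    # Two columns taken from bench.3d.p2.dat, first two sizes only.
    sample = (
        "Array Dimensions, FFTW, HARM\n"
        "4x4x4, 222.06, \n"
        "8x8x8, 404.467, 200.444\n"
    )
    for name, avg in normalized_averages(sample).items():
        print(name, round(avg, 6))
```

Run on the FFTW and HARM columns of the full bench.3d.p2.dat table, this should come out close to the logged HARM value of 0.525224, which suggests skipped sizes are indeed excluded from the average.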