To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Alexandre de Oliveira Penna
@ submitter email = apenna@fee.unicamp.br
@ submitter organization = State University of Campinas - UNICAMP
@ computer manufacturer = Sun
@ computer model = Ultra 1
@ CPU manufacturer = Sun
@ CPU model = SparcULTRA I
@ CPU speed = 167 MHz
@ RAM = 128 MB
@ L2 cache size =
@ operating system = SunOS 5.5
@ C compiler = gcc 2.7.2
@ C compiler flags = -pedantic -ansi -O6 -fomit-frame-pointer -Wall
@ Fortran compiler = NONE
@ Fortran compiler flags = NONE
@ remarks =
@ FFTW version = FFTW V1.2.1
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes: 2 (0.000228882 MB) 4 (0.000534058 MB) 8 (0.000839233 MB) 16 (0.00164795 MB) 32 (0.00297546 MB) 64 (0.00616455 MB) 128 (0.0119019 MB) 256 (0.0238037 MB) 512 (0.0476074 MB) 1024 (0.0939941 MB) 2048 (0.189575 MB) 4096 (0.37915 MB) 8192 (0.765991 MB) 16384 (1.51184 MB) 32768 (3.02368 MB) 65536 (6.09973 MB) 131072 (12.1995 MB) 262144 (25.4987 MB) 524288 (50.9973 MB) Maximum array size = 720720 Benchmarking FFTs: 0. Arndt DIF 1. Arndt DIT 2. Arndt Split-Radix 3. Arndt 4-step 4. Beauregard 5. Bergland 6. CWP (min N) 7. CWP (best N) 8. Edelblute 9. FFTPACK (f2c) 10. FFTW 11. FFTW_ESTIMATE 12. Frigo-old 13. Green 14. GSL 15. GSL DIT 16. GSL DIF 17. Krukar 18. Mayer (Buneman) 19. Mayer (simple) 20. Mayer (lookup) 21. NAPACK (f2c) 22. Ooura (C) 23. Ransom 24. Singleton (f2c) 25. Temperton (f2c) 26. Valkenburg Computing normalized averages (27 transforms). Benchmarking for array size = 2 (power of 2): 0. Arndt DIF: elapsed time t=1.23 s, 2097152 iters, t-(init.)=0.68 s t(norm)=0.162125, mflops=30.8405 (err=1.7e-17) 1. Arndt DIT: elapsed time t=1.32 s, 2097152 iters, t-(init.)=0.9 s t(norm)=0.214577, mflops=23.3017 (err=1.7e-17) 2.
Arndt Split-Radix: elapsed time t=1.91 s, 2097152 iters, t-(init.)=1.43 s t(norm)=0.340939, mflops=14.6654 (err=1.7e-17) 3. Arndt 4-step: elapsed time t=1.24 s, 65536 iters, t-(init.)=1.22 s t(norm)=9.30786, mflops=0.53718 (err=1.7e-17) 4. Beauregard: elapsed time t=1.47 s, 262144 iters, t-(init.)=1.4 s t(norm)=2.67029, mflops=1.87246 (err=1.7e-17) 5. Bergland: elapsed time t=1.58 s, 524288 iters, t-(init.)=1.44 s t(norm)=1.37329, mflops=3.64089 (err=1.7e-17) 6. CWP (min N): elapsed time t=1.17 s, 131072 iters, t-(init.)=1.13 s t(norm)=4.31061, mflops=1.15993 7. CWP (best N) (N=3): elapsed time t=1.21 s, 131072 iters, t-(init.)=1.17 s t(norm)=4.4632, mflops=1.12027 8. Skipping fft (Edelblute can't handle N <= 2). 9. FFTPACK (f2c): elapsed time t=1.78 s, 524288 iters, t-(init.)=1.64 s t(norm)=1.56403, mflops=3.19688 (err=1.7e-17) FFTW_MEASURE plan: (cost = 6.484985e-07) FFTW_NOTW 2 10. FFTW: elapsed time t=1.54 s, 2097152 iters, t-(init.)=0.99 s t(norm)=0.236034, mflops=21.1834 (err=1.7e-17) FFTW_ESTIMATE plan: (cost = 1.820000e+02) FFTW_NOTW 2 11. FFTW_ESTIMATE: elapsed time t=1.61 s, 2097152 iters, t-(init.)=1.09 s t(norm)=0.259876, mflops=19.2399 (err=1.7e-17) 12. Frigo-old: elapsed time t=1.98 s, 4194304 iters, t-(init.)=0.92 s t(norm)=0.109673, mflops=45.5903 (err=1.7e-17) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=1.02 s, 262144 iters, t-(init.)=0.96 s t(norm)=1.83105, mflops=2.73067 (err=1.7e-17) 15. GSL DIT: elapsed time t=1.8 s, 131072 iters, t-(init.)=1.77 s t(norm)=6.75201, mflops=0.74052 (err=1.7e-17) 16. GSL DIF: elapsed time t=1.8 s, 131072 iters, t-(init.)=1.77 s t(norm)=6.75201, mflops=0.74052 (err=1.7e-17) 17. Krukar: elapsed time t=1.04 s, 1048576 iters, t-(init.)=0.77 s t(norm)=0.367165, mflops=13.6179 (err=1.7e-17) 18. Skipping fft (Mayer can't handle N <= 2). 19. Skipping fft (Mayer can't handle N <= 2). 20. Skipping fft (Mayer can't handle N <= 2). 21. 
NAPACK (f2c): elapsed time t=1.81 s, 131072 iters, t-(init.)=1.78 s t(norm)=6.79016, mflops=0.73636 (err=1.7e-17) 22. Ooura (C): elapsed time t=1.18 s, 2097152 iters, t-(init.)=0.65 s t(norm)=0.154972, mflops=32.2639 (err=1.7e-17) 23. Skipping fft (Ransom doesn't work for N=2). 24. Singleton (f2c): elapsed time t=1 s, 131072 iters, t-(init.)=0.97 s t(norm)=3.70026, mflops=1.35126 (err=1.7e-17) 25. Temperton (f2c): elapsed time t=1 s, 131072 iters, t-(init.)=0.96 s t(norm)=3.66211, mflops=1.36533 (err=1.7e-17) 26. Valkenburg: elapsed time t=1.27 s, 262144 iters, t-(init.)=1.21 s t(norm)=2.30789, mflops=2.16648 (err=1.7e-17) Top mflops for N=2 = 45.5903 Normalized results and averages for N=2: fft 0: mflops = 30.8405 (norm. = 0.676471), norm. avg. (of 1) = 0.676471 fft 1: mflops = 23.3017 (norm. = 0.511111), norm. avg. (of 1) = 0.511111 fft 2: mflops = 14.6654 (norm. = 0.321678), norm. avg. (of 1) = 0.321678 fft 3: mflops = 0.53718 (norm. = 0.0117828), norm. avg. (of 1) = 0.0117828 fft 4: mflops = 1.87246 (norm. = 0.0410714), norm. avg. (of 1) = 0.0410714 fft 5: mflops = 3.64089 (norm. = 0.0798611), norm. avg. (of 1) = 0.0798611 fft 6: mflops = 1.15993 (norm. = 0.0254425), norm. avg. (of 1) = 0.0254425 fft 7: mflops = 1.12027 (norm. = 0.0245726), norm. avg. (of 1) = 0.0245726 fft 8: mflops = -1 (norm. = -0.0219345), norm. avg. (of 0) = -1 fft 9: mflops = 3.19688 (norm. = 0.070122), norm. avg. (of 1) = 0.070122 fft 10: mflops = 21.1834 (norm. = 0.464646), norm. avg. (of 1) = 0.464646 fft 11: mflops = 19.2399 (norm. = 0.422018), norm. avg. (of 1) = 0.422018 fft 12: mflops = 45.5903 (norm. = 1), norm. avg. (of 1) = 1 fft 13: mflops = -1 (norm. = -0.0219345), norm. avg. (of 0) = -1 fft 14: mflops = 2.73067 (norm. = 0.0598958), norm. avg. (of 1) = 0.0598958 fft 15: mflops = 0.74052 (norm. = 0.0162429), norm. avg. (of 1) = 0.0162429 fft 16: mflops = 0.74052 (norm. = 0.0162429), norm. avg. (of 1) = 0.0162429 fft 17: mflops = 13.6179 (norm. = 0.298701), norm. avg. 
(of 1) = 0.298701 fft 18: mflops = -1 (norm. = -0.0219345), norm. avg. (of 0) = -1 fft 19: mflops = -1 (norm. = -0.0219345), norm. avg. (of 0) = -1 fft 20: mflops = -1 (norm. = -0.0219345), norm. avg. (of 0) = -1 fft 21: mflops = 0.73636 (norm. = 0.0161517), norm. avg. (of 1) = 0.0161517 fft 22: mflops = 32.2639 (norm. = 0.707692), norm. avg. (of 1) = 0.707692 fft 23: mflops = -1 (norm. = -0.0219345), norm. avg. (of 0) = -1 fft 24: mflops = 1.35126 (norm. = 0.0296392), norm. avg. (of 1) = 0.0296392 fft 25: mflops = 1.36533 (norm. = 0.0299479), norm. avg. (of 1) = 0.0299479 fft 26: mflops = 2.16648 (norm. = 0.0475207), norm. avg. (of 1) = 0.0475207 Benchmarking for array size = 4 (power of 2): 0. Arndt DIF: elapsed time t=1.37 s, 1048576 iters, t-(init.)=0.99 s t(norm)=0.118017, mflops=42.3667 (err=1.3e-16) 1. Arndt DIT: elapsed time t=1.44 s, 1048576 iters, t-(init.)=1.11 s t(norm)=0.132322, mflops=37.7865 (err=1.3e-16) 2. Arndt Split-Radix: elapsed time t=1.05 s, 262144 iters, t-(init.)=0.97 s t(norm)=0.462532, mflops=10.8101 (err=1.3e-16) 3. Arndt 4-step: elapsed time t=1.56 s, 65536 iters, t-(init.)=1.53 s t(norm)=2.91824, mflops=1.71336 (err=1.3e-16) 4. Beauregard: elapsed time t=1.17 s, 65536 iters, t-(init.)=1.15 s t(norm)=2.19345, mflops=2.27951 (err=6.5e-17) 5. Bergland: elapsed time t=2.02 s, 524288 iters, t-(init.)=1.84 s t(norm)=0.43869, mflops=11.3976 (err=5.3e-17) 6. CWP (min N): elapsed time t=1.24 s, 131072 iters, t-(init.)=1.19 s t(norm)=1.13487, mflops=4.40578 7. CWP (best N) (N=15): elapsed time t=1.22 s, 65536 iters, t-(init.)=1.17 s t(norm)=2.2316, mflops=2.24055 8. Edelblute: elapsed time t=1.51 s, 262144 iters, t-(init.)=1.43 s t(norm)=0.681877, mflops=7.3327 (err=1.3e-16) 9. FFTPACK (f2c): elapsed time t=1.3 s, 262144 iters, t-(init.)=1.21 s t(norm)=0.576973, mflops=8.66592 (err=5.3e-17) FFTW_MEASURE plan: (cost = 8.773804e-07) FFTW_NOTW 4 10. 
FFTW: elapsed time t=1 s, 1048576 iters, t-(init.)=0.64 s t(norm)=0.0762939, mflops=65.536 (err=5.3e-17) FFTW_ESTIMATE plan: (cost = 3.176000e+02) FFTW_NOTW 4 11. FFTW_ESTIMATE: elapsed time t=1.06 s, 1048576 iters, t-(init.)=0.66 s t(norm)=0.0786781, mflops=63.5501 (err=5.3e-17) 12. Frigo-old: elapsed time t=1.45 s, 2097152 iters, t-(init.)=0.69 s t(norm)=0.0411272, mflops=121.574 (err=5.3e-17) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=1.21 s, 262144 iters, t-(init.)=1.11 s t(norm)=0.529289, mflops=9.44663 (err=5.3e-17) 15. GSL DIT: elapsed time t=1.61 s, 65536 iters, t-(init.)=1.59 s t(norm)=3.03268, mflops=1.6487 (err=6.5e-17) 16. GSL DIF: elapsed time t=1.62 s, 65536 iters, t-(init.)=1.6 s t(norm)=3.05176, mflops=1.6384 (err=6.5e-17) 17. Krukar: elapsed time t=1.67 s, 1048576 iters, t-(init.)=1.29 s t(norm)=0.15378, mflops=32.514 (err=5.3e-17) 18. Mayer (Buneman): elapsed time t=1.36 s, 524288 iters, t-(init.)=1.18 s t(norm)=0.281334, mflops=17.7725 (err=1.3e-16) 19. Mayer (simple): elapsed time t=1.26 s, 524288 iters, t-(init.)=1.06 s t(norm)=0.252724, mflops=19.7845 20. Mayer (lookup): elapsed time t=1.2 s, 524288 iters, t-(init.)=1.02 s t(norm)=0.243187, mflops=20.5603 (err=1.3e-16) 21. NAPACK (f2c): elapsed time t=1.57 s, 65536 iters, t-(init.)=1.55 s t(norm)=2.95639, mflops=1.69125 (err=1.6e-16) 22. Ooura (C): elapsed time t=1.39 s, 1048576 iters, t-(init.)=1.02 s t(norm)=0.121593, mflops=41.1206 (err=5.3e-17) 23. Ransom: elapsed time t=1.15 s, 32768 iters, t-(init.)=1.14 s t(norm)=4.34875, mflops=1.14975 (err=1.6e-16) 24. Singleton (f2c): elapsed time t=1.61 s, 262144 iters, t-(init.)=1.52 s t(norm)=0.724792, mflops=6.89853 (err=5.3e-17) 25. Temperton (f2c): elapsed time t=1.32 s, 131072 iters, t-(init.)=1.27 s t(norm)=1.21117, mflops=4.12825 (err=5.3e-17) 26. 
Valkenburg: elapsed time t=1.23 s, 65536 iters, t-(init.)=1.21 s t(norm)=2.30789, mflops=2.16648 (err=1.6e-16) Top mflops for N=4 = 121.574 Normalized results and averages for N=4: fft 0: mflops = 42.3667 (norm. = 0.348485), norm. avg. (of 2) = 0.512478 fft 1: mflops = 37.7865 (norm. = 0.310811), norm. avg. (of 2) = 0.410961 fft 2: mflops = 10.8101 (norm. = 0.0889175), norm. avg. (of 2) = 0.205298 fft 3: mflops = 1.71336 (norm. = 0.0140931), norm. avg. (of 2) = 0.012938 fft 4: mflops = 2.27951 (norm. = 0.01875), norm. avg. (of 2) = 0.0299107 fft 5: mflops = 11.3976 (norm. = 0.09375), norm. avg. (of 2) = 0.0868056 fft 6: mflops = 4.40578 (norm. = 0.0362395), norm. avg. (of 2) = 0.030841 fft 7: mflops = 2.24055 (norm. = 0.0184295), norm. avg. (of 2) = 0.0215011 fft 8: mflops = 7.3327 (norm. = 0.0603147), norm. avg. (of 1) = 0.0603147 fft 9: mflops = 8.66592 (norm. = 0.071281), norm. avg. (of 2) = 0.0707015 fft 10: mflops = 65.536 (norm. = 0.539062), norm. avg. (of 2) = 0.501854 fft 11: mflops = 63.5501 (norm. = 0.522727), norm. avg. (of 2) = 0.472373 fft 12: mflops = 121.574 (norm. = 1), norm. avg. (of 2) = 1 fft 13: mflops = -1 (norm. = -0.00822544), norm. avg. (of 0) = -1 fft 14: mflops = 9.44663 (norm. = 0.0777027), norm. avg. (of 2) = 0.0687993 fft 15: mflops = 1.6487 (norm. = 0.0135613), norm. avg. (of 2) = 0.0149021 fft 16: mflops = 1.6384 (norm. = 0.0134766), norm. avg. (of 2) = 0.0148598 fft 17: mflops = 32.514 (norm. = 0.267442), norm. avg. (of 2) = 0.283072 fft 18: mflops = 17.7725 (norm. = 0.146186), norm. avg. (of 1) = 0.146186 fft 19: mflops = 19.7845 (norm. = 0.162736), norm. avg. (of 1) = 0.162736 fft 20: mflops = 20.5603 (norm. = 0.169118), norm. avg. (of 1) = 0.169118 fft 21: mflops = 1.69125 (norm. = 0.0139113), norm. avg. (of 2) = 0.0150315 fft 22: mflops = 41.1206 (norm. = 0.338235), norm. avg. (of 2) = 0.522964 fft 23: mflops = 1.14975 (norm. = 0.00945724), norm. avg. (of 1) = 0.00945724 fft 24: mflops = 6.89853 (norm. = 0.0567434), norm. avg. 
(of 2) = 0.0431913 fft 25: mflops = 4.12825 (norm. = 0.0339567), norm. avg. (of 2) = 0.0319523 fft 26: mflops = 2.16648 (norm. = 0.0178202), norm. avg. (of 2) = 0.0326705 Benchmarking for array size = 8 (power of 2): 0. Arndt DIF: elapsed time t=1.65 s, 524288 iters, t-(init.)=1.38 s t(norm)=0.109673, mflops=45.5903 (err=1.2e-16) 1. Arndt DIT: elapsed time t=1.66 s, 524288 iters, t-(init.)=1.41 s t(norm)=0.112057, mflops=44.6203 (err=1.2e-16) 2. Arndt Split-Radix: elapsed time t=1.4 s, 131072 iters, t-(init.)=1.34 s t(norm)=0.425975, mflops=11.7378 (err=1.1e-16) 3. Arndt 4-step: elapsed time t=1.01 s, 16384 iters, t-(init.)=1 s t(norm)=2.54313, mflops=1.96608 (err=1.3e-16) 4. Beauregard: elapsed time t=1.96 s, 32768 iters, t-(init.)=1.94 s t(norm)=2.46684, mflops=2.02689 (err=1.2e-16) 5. Bergland: elapsed time t=1.4 s, 131072 iters, t-(init.)=1.33 s t(norm)=0.422796, mflops=11.826 (err=1.3e-16) 6. CWP (min N): elapsed time t=1.48 s, 131072 iters, t-(init.)=1.41 s t(norm)=0.448227, mflops=11.1551 7. CWP (best N) (N=15): elapsed time t=1.22 s, 65536 iters, t-(init.)=1.16 s t(norm)=0.737508, mflops=6.77959 8. Edelblute: elapsed time t=1.5 s, 65536 iters, t-(init.)=1.47 s t(norm)=0.934601, mflops=5.34988 (err=1.3e-16) 9. FFTPACK (f2c): elapsed time t=1.57 s, 131072 iters, t-(init.)=1.5 s t(norm)=0.476837, mflops=10.4858 (err=1.2e-16) FFTW_MEASURE plan: (cost = 1.602173e-06) FFTW_NOTW 8 10. FFTW: elapsed time t=1.77 s, 1048576 iters, t-(init.)=1.21 s t(norm)=0.0480811, mflops=103.991 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.688000e+02) FFTW_NOTW 8 11. FFTW_ESTIMATE: elapsed time t=1.8 s, 1048576 iters, t-(init.)=1.24 s t(norm)=0.0492732, mflops=101.475 (err=1.2e-16) 12. Frigo-old: elapsed time t=1.77 s, 1048576 iters, t-(init.)=1.23 s t(norm)=0.0488758, mflops=102.3 (err=1.4e-16) 13. Green: elapsed time t=1.23 s, 262144 iters, t-(init.)=1.09 s t(norm)=0.173251, mflops=28.8599 (err=1.4e-16) 14. 
GSL: elapsed time t=1.27 s, 131072 iters, t-(init.)=1.2 s t(norm)=0.38147, mflops=13.1072 (err=1.4e-16) 15. GSL DIT: elapsed time t=1.1 s, 32768 iters, t-(init.)=1.08 s t(norm)=1.37329, mflops=3.64089 (err=1.2e-16) 16. GSL DIF: elapsed time t=1.13 s, 32768 iters, t-(init.)=1.11 s t(norm)=1.41144, mflops=3.54249 (err=1.4e-16) 17. Krukar: elapsed time t=1.76 s, 524288 iters, t-(init.)=1.48 s t(norm)=0.11762, mflops=42.5098 (err=1.9e-16) 18. Mayer (Buneman): elapsed time t=1.22 s, 262144 iters, t-(init.)=1.08 s t(norm)=0.171661, mflops=29.1271 (err=1.2e-16) 19. Mayer (simple): elapsed time t=1.24 s, 262144 iters, t-(init.)=1.11 s t(norm)=0.17643, mflops=28.3399 20. Mayer (lookup): elapsed time t=1.18 s, 262144 iters, t-(init.)=1.04 s t(norm)=0.165304, mflops=30.2474 (err=1.2e-16) 21. NAPACK (f2c): elapsed time t=1.23 s, 32768 iters, t-(init.)=1.21 s t(norm)=1.53859, mflops=3.24972 (err=1.7e-16) 22. Ooura (C): elapsed time t=1.35 s, 524288 iters, t-(init.)=1.07 s t(norm)=0.085036, mflops=58.7987 (err=1.3e-16) 23. Ransom: elapsed time t=1.19 s, 16384 iters, t-(init.)=1.18 s t(norm)=3.0009, mflops=1.66617 (err=3.4e-16) 24. Singleton (f2c): elapsed time t=1.23 s, 65536 iters, t-(init.)=1.2 s t(norm)=0.762939, mflops=6.5536 (err=1.4e-16) 25. Temperton (f2c): elapsed time t=1.01 s, 65536 iters, t-(init.)=0.98 s t(norm)=0.623067, mflops=8.02482 (err=1.4e-16) 26. Valkenburg: elapsed time t=1.79 s, 32768 iters, t-(init.)=1.77 s t(norm)=2.25067, mflops=2.22156 (err=1.5e-16) Top mflops for N=8 = 103.991 Normalized results and averages for N=8: fft 0: mflops = 45.5903 (norm. = 0.438406), norm. avg. (of 3) = 0.487787 fft 1: mflops = 44.6203 (norm. = 0.429078), norm. avg. (of 3) = 0.417 fft 2: mflops = 11.7378 (norm. = 0.112873), norm. avg. (of 3) = 0.17449 fft 3: mflops = 1.96608 (norm. = 0.0189062), norm. avg. (of 3) = 0.0149274 fft 4: mflops = 2.02689 (norm. = 0.019491), norm. avg. (of 3) = 0.0264375 fft 5: mflops = 11.826 (norm. = 0.113722), norm. avg. 
(of 3) = 0.0957776 fft 6: mflops = 11.1551 (norm. = 0.10727), norm. avg. (of 3) = 0.0563172 fft 7: mflops = 6.77959 (norm. = 0.065194), norm. avg. (of 3) = 0.0360654 fft 8: mflops = 5.34988 (norm. = 0.0514456), norm. avg. (of 2) = 0.0558801 fft 9: mflops = 10.4858 (norm. = 0.100833), norm. avg. (of 3) = 0.0807454 fft 10: mflops = 103.991 (norm. = 1), norm. avg. (of 3) = 0.667903 fft 11: mflops = 101.475 (norm. = 0.975806), norm. avg. (of 3) = 0.640184 fft 12: mflops = 102.3 (norm. = 0.98374), norm. avg. (of 3) = 0.99458 fft 13: mflops = 28.8599 (norm. = 0.277523), norm. avg. (of 1) = 0.277523 fft 14: mflops = 13.1072 (norm. = 0.126042), norm. avg. (of 3) = 0.0878801 fft 15: mflops = 3.64089 (norm. = 0.0350116), norm. avg. (of 3) = 0.0216053 fft 16: mflops = 3.54249 (norm. = 0.0340653), norm. avg. (of 3) = 0.0212616 fft 17: mflops = 42.5098 (norm. = 0.408784), norm. avg. (of 3) = 0.324976 fft 18: mflops = 29.1271 (norm. = 0.280093), norm. avg. (of 2) = 0.21314 fft 19: mflops = 28.3399 (norm. = 0.272523), norm. avg. (of 2) = 0.217629 fft 20: mflops = 30.2474 (norm. = 0.290865), norm. avg. (of 2) = 0.229992 fft 21: mflops = 3.24972 (norm. = 0.03125), norm. avg. (of 3) = 0.0204377 fft 22: mflops = 58.7987 (norm. = 0.565421), norm. avg. (of 3) = 0.537116 fft 23: mflops = 1.66617 (norm. = 0.0160222), norm. avg. (of 2) = 0.0127397 fft 24: mflops = 6.5536 (norm. = 0.0630208), norm. avg. (of 3) = 0.0498011 fft 25: mflops = 8.02482 (norm. = 0.0771684), norm. avg. (of 3) = 0.0470243 fft 26: mflops = 2.22156 (norm. = 0.021363), norm. avg. (of 3) = 0.0289013 Benchmarking for array size = 16 (power of 2): 0. Arndt DIF: elapsed time t=1.61 s, 65536 iters, t-(init.)=1.55 s t(norm)=0.369549, mflops=13.53 (err=1.5e-16) 1. Arndt DIT: elapsed time t=1.63 s, 65536 iters, t-(init.)=1.58 s t(norm)=0.376701, mflops=13.2731 (err=2.2e-16) 2. Arndt Split-Radix: elapsed time t=1.86 s, 65536 iters, t-(init.)=1.81 s t(norm)=0.431538, mflops=11.5865 (err=1.3e-16) 3. 
Arndt 4-step: elapsed time t=1.7 s, 16384 iters, t-(init.)=1.68 s t(norm)=1.60217, mflops=3.12076 (err=2.0e-16) 4. Beauregard: elapsed time t=1.27 s, 8192 iters, t-(init.)=1.27 s t(norm)=2.42233, mflops=2.06413 (err=2.7e-16) 5. Bergland: elapsed time t=1.29 s, 65536 iters, t-(init.)=1.23 s t(norm)=0.293255, mflops=17.05 (err=2.6e-16) 6. CWP (min N): elapsed time t=1.96 s, 131072 iters, t-(init.)=1.84 s t(norm)=0.219345, mflops=22.7951 7. CWP (best N) (N=28): elapsed time t=1.73 s, 65536 iters, t-(init.)=1.64 s t(norm)=0.391006, mflops=12.7875 8. Edelblute: elapsed time t=1.05 s, 16384 iters, t-(init.)=1.04 s t(norm)=0.991821, mflops=5.04123 (err=1.4e-16) 9. FFTPACK (f2c): elapsed time t=1.4 s, 65536 iters, t-(init.)=1.35 s t(norm)=0.321865, mflops=15.5345 (err=1.8e-16) FFTW_MEASURE plan: (cost = 3.662109e-06) FFTW_NOTW 16 10. FFTW: elapsed time t=1 s, 262144 iters, t-(init.)=0.77 s t(norm)=0.0458956, mflops=108.943 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 4.256000e+02) FFTW_NOTW 16 11. FFTW_ESTIMATE: elapsed time t=1.02 s, 262144 iters, t-(init.)=0.79 s t(norm)=0.0470877, mflops=106.185 (err=1.8e-16) 12. Frigo-old: elapsed time t=1.23 s, 262144 iters, t-(init.)=1 s t(norm)=0.0596046, mflops=83.8861 (err=1.8e-16) 13. Green: elapsed time t=1.73 s, 131072 iters, t-(init.)=1.62 s t(norm)=0.193119, mflops=25.8908 (err=1.9e-16) 14. GSL: elapsed time t=1 s, 65536 iters, t-(init.)=0.95 s t(norm)=0.226498, mflops=22.0753 (err=1.8e-16) 15. GSL DIT: elapsed time t=1.53 s, 32768 iters, t-(init.)=1.5 s t(norm)=0.715256, mflops=6.99051 (err=2.1e-16) 16. GSL DIF: elapsed time t=1.55 s, 32768 iters, t-(init.)=1.53 s t(norm)=0.729561, mflops=6.85344 (err=2.8e-16) 17. Krukar: elapsed time t=1 s, 131072 iters, t-(init.)=0.89 s t(norm)=0.106096, mflops=47.127 (err=1.9e-16) 18. Mayer (Buneman): elapsed time t=1.65 s, 131072 iters, t-(init.)=1.54 s t(norm)=0.183582, mflops=27.2357 (err=1.7e-16) 19. 
Mayer (simple): elapsed time t=1.48 s, 131072 iters, t-(init.)=1.37 s t(norm)=0.163317, mflops=30.6154 20. Mayer (lookup): elapsed time t=1.52 s, 131072 iters, t-(init.)=1.41 s t(norm)=0.168085, mflops=29.7468 (err=1.8e-16) 21. NAPACK (f2c): elapsed time t=1.91 s, 32768 iters, t-(init.)=1.88 s t(norm)=0.896454, mflops=5.57753 (err=3.3e-16) 22. Ooura (C): elapsed time t=1.72 s, 262144 iters, t-(init.)=1.49 s t(norm)=0.0888109, mflops=56.2994 (err=2.0e-16) 23. Ransom: elapsed time t=1.73 s, 32768 iters, t-(init.)=1.7 s t(norm)=0.810623, mflops=6.16809 (err=3.4e-16) 24. Singleton (f2c): elapsed time t=1.14 s, 65536 iters, t-(init.)=1.09 s t(norm)=0.259876, mflops=19.2399 (err=1.7e-16) 25. Temperton (f2c): elapsed time t=1.73 s, 65536 iters, t-(init.)=1.67 s t(norm)=0.398159, mflops=12.5578 (err=1.8e-16) 26. Valkenburg: elapsed time t=1.16 s, 8192 iters, t-(init.)=1.16 s t(norm)=2.21252, mflops=2.25986 (err=2.9e-16) Top mflops for N=16 = 108.943 Normalized results and averages for N=16: fft 0: mflops = 13.53 (norm. = 0.124194), norm. avg. (of 4) = 0.396889 fft 1: mflops = 13.2731 (norm. = 0.121835), norm. avg. (of 4) = 0.343209 fft 2: mflops = 11.5865 (norm. = 0.106354), norm. avg. (of 4) = 0.157456 fft 3: mflops = 3.12076 (norm. = 0.0286458), norm. avg. (of 4) = 0.018357 fft 4: mflops = 2.06413 (norm. = 0.0189469), norm. avg. (of 4) = 0.0245648 fft 5: mflops = 17.05 (norm. = 0.156504), norm. avg. (of 4) = 0.110959 fft 6: mflops = 22.7951 (norm. = 0.209239), norm. avg. (of 4) = 0.0945477 fft 7: mflops = 12.7875 (norm. = 0.117378), norm. avg. (of 4) = 0.0563935 fft 8: mflops = 5.04123 (norm. = 0.046274), norm. avg. (of 3) = 0.0526781 fft 9: mflops = 15.5345 (norm. = 0.142593), norm. avg. (of 4) = 0.0962072 fft 10: mflops = 108.943 (norm. = 1), norm. avg. (of 4) = 0.750927 fft 11: mflops = 106.185 (norm. = 0.974684), norm. avg. (of 4) = 0.723809 fft 12: mflops = 83.8861 (norm. = 0.77), norm. avg. (of 4) = 0.938435 fft 13: mflops = 25.8908 (norm. = 0.237654), norm. avg. 
(of 2) = 0.257589 fft 14: mflops = 22.0753 (norm. = 0.202632), norm. avg. (of 4) = 0.116568 fft 15: mflops = 6.99051 (norm. = 0.0641667), norm. avg. (of 4) = 0.0322456 fft 16: mflops = 6.85344 (norm. = 0.0629085), norm. avg. (of 4) = 0.0316733 fft 17: mflops = 47.127 (norm. = 0.432584), norm. avg. (of 4) = 0.351878 fft 18: mflops = 27.2357 (norm. = 0.25), norm. avg. (of 3) = 0.225426 fft 19: mflops = 30.6154 (norm. = 0.281022), norm. avg. (of 3) = 0.23876 fft 20: mflops = 29.7468 (norm. = 0.27305), norm. avg. (of 3) = 0.244344 fft 21: mflops = 5.57753 (norm. = 0.0511968), norm. avg. (of 4) = 0.0281274 fft 22: mflops = 56.2994 (norm. = 0.516779), norm. avg. (of 4) = 0.532032 fft 23: mflops = 6.16809 (norm. = 0.0566176), norm. avg. (of 3) = 0.0273657 fft 24: mflops = 19.2399 (norm. = 0.176606), norm. avg. (of 4) = 0.0815022 fft 25: mflops = 12.5578 (norm. = 0.115269), norm. avg. (of 4) = 0.0640856 fft 26: mflops = 2.25986 (norm. = 0.0207435), norm. avg. (of 4) = 0.0268619 Benchmarking for array size = 32 (power of 2): 0. Arndt DIF: elapsed time t=1.88 s, 32768 iters, t-(init.)=1.84 s t(norm)=0.350952, mflops=14.247 (err=3.1e-16) 1. Arndt DIT: elapsed time t=1.88 s, 32768 iters, t-(init.)=1.83 s t(norm)=0.349045, mflops=14.3248 (err=2.5e-16) 2. Arndt Split-Radix: elapsed time t=1.07 s, 16384 iters, t-(init.)=1.04 s t(norm)=0.396729, mflops=12.6031 (err=2.7e-16) 3. Arndt 4-step: elapsed time t=1.94 s, 8192 iters, t-(init.)=1.93 s t(norm)=1.47247, mflops=3.39565 (err=2.8e-16) 4. Beauregard: elapsed time t=1.58 s, 4096 iters, t-(init.)=1.57 s t(norm)=2.39563, mflops=2.08713 (err=1.8e-16) 5. Bergland: elapsed time t=1.16 s, 32768 iters, t-(init.)=1.1 s t(norm)=0.209808, mflops=23.8313 (err=2.6e-16) 6. CWP (min N) (N=33): elapsed time t=1.17 s, 32768 iters, t-(init.)=1.11 s t(norm)=0.211716, mflops=23.6166 7. CWP (best N) (N=35): elapsed time t=1.13 s, 32768 iters, t-(init.)=1.08 s t(norm)=0.205994, mflops=24.2726 8. 
Edelblute: elapsed time t=1.22 s, 8192 iters, t-(init.)=1.21 s t(norm)=0.923157, mflops=5.4162 (err=2.9e-16) 9. FFTPACK (f2c): elapsed time t=1.04 s, 16384 iters, t-(init.)=1.01 s t(norm)=0.385284, mflops=12.9774 (err=1.9e-16) FFTW_MEASURE plan: (cost = 9.460449e-06) FFTW_NOTW 32 10. FFTW: elapsed time t=1.25 s, 131072 iters, t-(init.)=1.05 s t(norm)=0.0500679, mflops=99.8644 (err=2.1e-16) FFTW_ESTIMATE plan: (cost = 3.200000e+01) FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.25 s, 131072 iters, t-(init.)=1.04 s t(norm)=0.0495911, mflops=100.825 (err=2.1e-16) 12. Frigo-old: elapsed time t=1.54 s, 131072 iters, t-(init.)=1.34 s t(norm)=0.0638962, mflops=78.2519 (err=2.2e-16) 13. Green: elapsed time t=1.83 s, 65536 iters, t-(init.)=1.73 s t(norm)=0.164986, mflops=30.3057 (err=2.0e-16) 14. GSL: elapsed time t=1.27 s, 32768 iters, t-(init.)=1.22 s t(norm)=0.232697, mflops=21.4872 (err=2.0e-16) 15. GSL DIT: elapsed time t=1.11 s, 16384 iters, t-(init.)=1.08 s t(norm)=0.411987, mflops=12.1363 (err=2.2e-16) 16. GSL DIF: elapsed time t=1.13 s, 16384 iters, t-(init.)=1.11 s t(norm)=0.423431, mflops=11.8083 (err=2.5e-16) 17. Krukar: elapsed time t=1.25 s, 65536 iters, t-(init.)=1.15 s t(norm)=0.109673, mflops=45.5903 (err=2.2e-16) 18. Mayer (Buneman): elapsed time t=1.73 s, 65536 iters, t-(init.)=1.63 s t(norm)=0.155449, mflops=32.1649 (err=2.7e-16) 19. Mayer (simple): elapsed time t=1.52 s, 65536 iters, t-(init.)=1.42 s t(norm)=0.135422, mflops=36.9217 20. Mayer (lookup): elapsed time t=1.53 s, 65536 iters, t-(init.)=1.43 s t(norm)=0.136375, mflops=36.6635 (err=2.6e-16) 21. NAPACK (f2c): elapsed time t=1.71 s, 16384 iters, t-(init.)=1.68 s t(norm)=0.640869, mflops=7.8019 (err=5.4e-16) 22. Ooura (C): elapsed time t=1.94 s, 131072 iters, t-(init.)=1.74 s t(norm)=0.0829697, mflops=60.263 (err=2.7e-16) 23. Ransom: elapsed time t=1.14 s, 8192 iters, t-(init.)=1.13 s t(norm)=0.862122, mflops=5.79965 (err=7.0e-16) 24. 
Singleton (f2c): elapsed time t=1.05 s, 32768 iters, t-(init.)=1 s t(norm)=0.190735, mflops=26.2144 (err=2.2e-16) 25. Temperton (f2c): elapsed time t=1.81 s, 32768 iters, t-(init.)=1.76 s t(norm)=0.335693, mflops=14.8945 (err=2.0e-16) 26. Valkenburg: elapsed time t=1.44 s, 4096 iters, t-(init.)=1.44 s t(norm)=2.19727, mflops=2.27556 (err=4.3e-16) Top mflops for N=32 = 100.825 Normalized results and averages for N=32: fft 0: mflops = 14.247 (norm. = 0.141304), norm. avg. (of 5) = 0.345772 fft 1: mflops = 14.3248 (norm. = 0.142077), norm. avg. (of 5) = 0.302982 fft 2: mflops = 12.6031 (norm. = 0.125), norm. avg. (of 5) = 0.150965 fft 3: mflops = 3.39565 (norm. = 0.0336788), norm. avg. (of 5) = 0.0214214 fft 4: mflops = 2.08713 (norm. = 0.0207006), norm. avg. (of 5) = 0.023792 fft 5: mflops = 23.8313 (norm. = 0.236364), norm. avg. (of 5) = 0.13604 fft 6: mflops = 23.6166 (norm. = 0.234234), norm. avg. (of 5) = 0.122485 fft 7: mflops = 24.2726 (norm. = 0.240741), norm. avg. (of 5) = 0.093263 fft 8: mflops = 5.4162 (norm. = 0.053719), norm. avg. (of 4) = 0.0529383 fft 9: mflops = 12.9774 (norm. = 0.128713), norm. avg. (of 5) = 0.102708 fft 10: mflops = 99.8644 (norm. = 0.990476), norm. avg. (of 5) = 0.798837 fft 11: mflops = 100.825 (norm. = 1), norm. avg. (of 5) = 0.779047 fft 12: mflops = 78.2519 (norm. = 0.776119), norm. avg. (of 5) = 0.905972 fft 13: mflops = 30.3057 (norm. = 0.300578), norm. avg. (of 3) = 0.271918 fft 14: mflops = 21.4872 (norm. = 0.213115), norm. avg. (of 5) = 0.135877 fft 15: mflops = 12.1363 (norm. = 0.12037), norm. avg. (of 5) = 0.0498706 fft 16: mflops = 11.8083 (norm. = 0.117117), norm. avg. (of 5) = 0.0487621 fft 17: mflops = 45.5903 (norm. = 0.452174), norm. avg. (of 5) = 0.371937 fft 18: mflops = 32.1649 (norm. = 0.319018), norm. avg. (of 4) = 0.248824 fft 19: mflops = 36.9217 (norm. = 0.366197), norm. avg. (of 4) = 0.270619 fft 20: mflops = 36.6635 (norm. = 0.363636), norm. avg. (of 4) = 0.274167 fft 21: mflops = 7.8019 (norm. 
= 0.077381), norm. avg. (of 5) = 0.0379781 fft 22: mflops = 60.263 (norm. = 0.597701), norm. avg. (of 5) = 0.545166 fft 23: mflops = 5.79965 (norm. = 0.0575221), norm. avg. (of 4) = 0.0349048 fft 24: mflops = 26.2144 (norm. = 0.26), norm. avg. (of 5) = 0.117202 fft 25: mflops = 14.8945 (norm. = 0.147727), norm. avg. (of 5) = 0.0808139 fft 26: mflops = 2.27556 (norm. = 0.0225694), norm. avg. (of 5) = 0.0260034 Benchmarking for array size = 64 (power of 2): 0. Arndt DIF: elapsed time t=1.14 s, 8192 iters, t-(init.)=1.12 s t(norm)=0.356038, mflops=14.0434 (err=5.7e-16) 1. Arndt DIT: elapsed time t=1.13 s, 8192 iters, t-(init.)=1.1 s t(norm)=0.349681, mflops=14.2988 (err=5.7e-16) 2. Arndt Split-Radix: elapsed time t=1.2 s, 8192 iters, t-(init.)=1.18 s t(norm)=0.375112, mflops=13.3294 (err=5.8e-16) 3. Arndt 4-step: elapsed time t=1.78 s, 4096 iters, t-(init.)=1.77 s t(norm)=1.12534, mflops=4.44312 (err=5.6e-16) 4. Beauregard: elapsed time t=1.94 s, 2048 iters, t-(init.)=1.94 s t(norm)=2.46684, mflops=2.02689 (err=6.0e-16) 5. Bergland: elapsed time t=1.29 s, 16384 iters, t-(init.)=1.25 s t(norm)=0.198682, mflops=25.1658 (err=6.0e-16) 6. CWP (min N) (N=65): elapsed time t=1.2 s, 16384 iters, t-(init.)=1.15 s t(norm)=0.182788, mflops=27.3542 7. CWP (best N) (N=84): elapsed time t=1.05 s, 16384 iters, t-(init.)=0.99 s t(norm)=0.157356, mflops=31.775 8. Edelblute: elapsed time t=1.33 s, 4096 iters, t-(init.)=1.32 s t(norm)=0.839233, mflops=5.95782 (err=5.7e-16) 9. FFTPACK (f2c): elapsed time t=1.02 s, 8192 iters, t-(init.)=0.99 s t(norm)=0.314713, mflops=15.8875 (err=5.5e-16) FFTW_MEASURE plan: (cost = 2.136230e-05) FFTW_TWIDDLE 4 FFTW_NOTW 16 10. FFTW: elapsed time t=1.58 s, 65536 iters, t-(init.)=1.39 s t(norm)=0.0552336, mflops=90.5245 (err=5.5e-16) FFTW_ESTIMATE plan: (cost = 7.680000e+02) FFTW_TWIDDLE 2 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.84 s, 65536 iters, t-(init.)=1.65 s t(norm)=0.0655651, mflops=76.2601 (err=5.4e-16) 12. 
Frigo-old: elapsed time t=1.21 s, 32768 iters, t-(init.)=1.12 s t(norm)=0.0890096, mflops=56.1737 (err=5.5e-16) 13. Green: elapsed time t=1.92 s, 32768 iters, t-(init.)=1.82 s t(norm)=0.144641, mflops=34.5684 (err=5.5e-16) 14. GSL: elapsed time t=1.16 s, 16384 iters, t-(init.)=1.11 s t(norm)=0.17643, mflops=28.3399 (err=5.5e-16) 15. GSL DIT: elapsed time t=1.82 s, 16384 iters, t-(init.)=1.77 s t(norm)=0.281334, mflops=17.7725 (err=5.6e-16) 16. GSL DIF: elapsed time t=1.79 s, 16384 iters, t-(init.)=1.75 s t(norm)=0.278155, mflops=17.9756 (err=5.4e-16) 17. Krukar: elapsed time t=1.52 s, 32768 iters, t-(init.)=1.42 s t(norm)=0.112851, mflops=44.306 (err=6.0e-16) 18. Mayer (Buneman): elapsed time t=1.99 s, 32768 iters, t-(init.)=1.89 s t(norm)=0.150204, mflops=33.2881 (err=5.4e-16) 19. Mayer (simple): elapsed time t=1.7 s, 32768 iters, t-(init.)=1.61 s t(norm)=0.127951, mflops=39.0774 20. Mayer (lookup): elapsed time t=1.72 s, 32768 iters, t-(init.)=1.62 s t(norm)=0.128746, mflops=38.8361 (err=5.6e-16) 21. NAPACK (f2c): elapsed time t=1.53 s, 8192 iters, t-(init.)=1.5 s t(norm)=0.476837, mflops=10.4858 (err=1.1e-15) 22. Ooura (C): elapsed time t=1.05 s, 32768 iters, t-(init.)=0.95 s t(norm)=0.0754992, mflops=66.2259 (err=5.9e-16) 23. Ransom: elapsed time t=1.17 s, 8192 iters, t-(init.)=1.15 s t(norm)=0.365575, mflops=13.6771 (err=8.6e-16) 24. Singleton (f2c): elapsed time t=1.89 s, 32768 iters, t-(init.)=1.8 s t(norm)=0.143051, mflops=34.9525 (err=9.2e-16) 25. Temperton (f2c): elapsed time t=1.51 s, 16384 iters, t-(init.)=1.46 s t(norm)=0.232061, mflops=21.5461 (err=5.5e-16) 26. Valkenburg: elapsed time t=1.69 s, 2048 iters, t-(init.)=1.68 s t(norm)=2.13623, mflops=2.34057 (err=8.2e-16) Top mflops for N=64 = 90.5245 Normalized results and averages for N=64: fft 0: mflops = 14.0434 (norm. = 0.155134), norm. avg. (of 6) = 0.313999 fft 1: mflops = 14.2988 (norm. = 0.157955), norm. avg. (of 6) = 0.278811 fft 2: mflops = 13.3294 (norm. = 0.147246), norm. avg. 
(of 6) = 0.150345 fft 3: mflops = 4.44312 (norm. = 0.0490819), norm. avg. (of 6) = 0.0260314 fft 4: mflops = 2.02689 (norm. = 0.0223905), norm. avg. (of 6) = 0.0235584 fft 5: mflops = 25.1658 (norm. = 0.278), norm. avg. (of 6) = 0.1597 fft 6: mflops = 27.3542 (norm. = 0.302174), norm. avg. (of 6) = 0.152433 fft 7: mflops = 31.775 (norm. = 0.35101), norm. avg. (of 6) = 0.136221 fft 8: mflops = 5.95782 (norm. = 0.0658144), norm. avg. (of 5) = 0.0555135 fft 9: mflops = 15.8875 (norm. = 0.175505), norm. avg. (of 6) = 0.114841 fft 10: mflops = 90.5245 (norm. = 1), norm. avg. (of 6) = 0.832364 fft 11: mflops = 76.2601 (norm. = 0.842424), norm. avg. (of 6) = 0.78961 fft 12: mflops = 56.1737 (norm. = 0.620536), norm. avg. (of 6) = 0.858399 fft 13: mflops = 34.5684 (norm. = 0.381868), norm. avg. (of 4) = 0.299406 fft 14: mflops = 28.3399 (norm. = 0.313063), norm. avg. (of 6) = 0.165408 fft 15: mflops = 17.7725 (norm. = 0.196328), norm. avg. (of 6) = 0.0742801 fft 16: mflops = 17.9756 (norm. = 0.198571), norm. avg. (of 6) = 0.0737303 fft 17: mflops = 44.306 (norm. = 0.489437), norm. avg. (of 6) = 0.39152 fft 18: mflops = 33.2881 (norm. = 0.367725), norm. avg. (of 5) = 0.272604 fft 19: mflops = 39.0774 (norm. = 0.431677), norm. avg. (of 5) = 0.302831 fft 20: mflops = 38.8361 (norm. = 0.429012), norm. avg. (of 5) = 0.305136 fft 21: mflops = 10.4858 (norm. = 0.115833), norm. avg. (of 6) = 0.050954 fft 22: mflops = 66.2259 (norm. = 0.731579), norm. avg. (of 6) = 0.576234 fft 23: mflops = 13.6771 (norm. = 0.151087), norm. avg. (of 5) = 0.0581412 fft 24: mflops = 34.9525 (norm. = 0.386111), norm. avg. (of 6) = 0.16202 fft 25: mflops = 21.5461 (norm. = 0.238014), norm. avg. (of 6) = 0.107014 fft 26: mflops = 2.34057 (norm. = 0.0258557), norm. avg. (of 6) = 0.0259788 Benchmarking for array size = 128 (power of 2): 0. Arndt DIF: elapsed time t=1.22 s, 4096 iters, t-(init.)=1.19 s t(norm)=0.324249, mflops=15.4202 (err=3.5e-16) 1. 
Arndt DIT: elapsed time t=1.22 s, 4096 iters, t-(init.)=1.2 s t(norm)=0.326974, mflops=15.2917 (err=3.3e-16) 2. Arndt Split-Radix: elapsed time t=1.26 s, 4096 iters, t-(init.)=1.24 s t(norm)=0.337873, mflops=14.7985 (err=3.7e-16) 3. Arndt 4-step: elapsed time t=1.02 s, 1024 iters, t-(init.)=1.01 s t(norm)=1.10081, mflops=4.5421 (err=3.2e-16) 4. Beauregard: elapsed time t=1.15 s, 512 iters, t-(init.)=1.14 s t(norm)=2.485, mflops=2.01207 (err=3.8e-16) 5. Bergland: elapsed time t=1.34 s, 8192 iters, t-(init.)=1.29 s t(norm)=0.175749, mflops=28.4497 (err=3.4e-16) 6. CWP (min N) (N=130): elapsed time t=1.14 s, 8192 iters, t-(init.)=1.1 s t(norm)=0.149863, mflops=33.3638 7. CWP (best N) (N=140): elapsed time t=1.88 s, 16384 iters, t-(init.)=1.78 s t(norm)=0.121253, mflops=41.2361 8. Edelblute: elapsed time t=1.38 s, 2048 iters, t-(init.)=1.37 s t(norm)=0.746591, mflops=6.69711 (err=3.2e-16) 9. FFTPACK (f2c): elapsed time t=1.19 s, 4096 iters, t-(init.)=1.17 s t(norm)=0.3188, mflops=15.6838 (err=3.5e-16) FFTW_MEASURE plan: (cost = 5.126953e-05) FFTW_TWIDDLE 4 FFTW_NOTW 32 10. FFTW: elapsed time t=1.97 s, 32768 iters, t-(init.)=1.79 s t(norm)=0.060967, mflops=82.0115 (err=3.4e-16) FFTW_ESTIMATE plan: (cost = 1.075200e+03) FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.97 s, 32768 iters, t-(init.)=1.79 s t(norm)=0.060967, mflops=82.0115 (err=3.4e-16) 12. Frigo-old: elapsed time t=1.41 s, 16384 iters, t-(init.)=1.32 s t(norm)=0.0899179, mflops=55.6063 (err=3.4e-16) 13. Green: elapsed time t=1.32 s, 8192 iters, t-(init.)=1.27 s t(norm)=0.173024, mflops=28.8978 (err=4.1e-16) 14. GSL: elapsed time t=1.28 s, 8192 iters, t-(init.)=1.23 s t(norm)=0.167574, mflops=29.8375 (err=3.5e-16) 15. GSL DIT: elapsed time t=1.6 s, 8192 iters, t-(init.)=1.56 s t(norm)=0.212533, mflops=23.5257 (err=3.5e-16) 16. GSL DIF: elapsed time t=1.55 s, 8192 iters, t-(init.)=1.51 s t(norm)=0.205721, mflops=24.3047 (err=3.7e-16) 17. 
Krukar: elapsed time t=1.73 s, 16384 iters, t-(init.)=1.64 s t(norm)=0.111716, mflops=44.7563 (err=3.6e-16) 18. Mayer (Buneman): elapsed time t=1.04 s, 8192 iters, t-(init.)=1 s t(norm)=0.136239, mflops=36.7002 (err=3.2e-16) 19. Mayer (simple): elapsed time t=1.75 s, 16384 iters, t-(init.)=1.66 s t(norm)=0.113079, mflops=44.2171 20. Mayer (lookup): elapsed time t=1.77 s, 16384 iters, t-(init.)=1.68 s t(norm)=0.114441, mflops=43.6907 (err=3.3e-16) 21. NAPACK (f2c): elapsed time t=1.63 s, 4096 iters, t-(init.)=1.6 s t(norm)=0.435965, mflops=11.4688 (err=1.2e-15) 22. Ooura (C): elapsed time t=1.2 s, 16384 iters, t-(init.)=1.11 s t(norm)=0.0756127, mflops=66.1264 (err=3.4e-16) 23. Ransom: elapsed time t=1.59 s, 4096 iters, t-(init.)=1.56 s t(norm)=0.425066, mflops=11.7629 (err=1.0e-15) 24. Singleton (f2c): elapsed time t=1.01 s, 8192 iters, t-(init.)=0.96 s t(norm)=0.13079, mflops=38.2293 (err=4.2e-16) 25. Temperton (f2c): elapsed time t=1.61 s, 8192 iters, t-(init.)=1.57 s t(norm)=0.213896, mflops=23.3759 (err=3.7e-16) 26. Valkenburg: elapsed time t=1.97 s, 1024 iters, t-(init.)=1.96 s t(norm)=2.13623, mflops=2.34057 (err=5.8e-16) Top mflops for N=128 = 82.0115 Normalized results and averages for N=128: fft 0: mflops = 15.4202 (norm. = 0.188025), norm. avg. (of 7) = 0.296003 fft 1: mflops = 15.2917 (norm. = 0.186458), norm. avg. (of 7) = 0.265618 fft 2: mflops = 14.7985 (norm. = 0.180444), norm. avg. (of 7) = 0.154645 fft 3: mflops = 4.5421 (norm. = 0.0553837), norm. avg. (of 7) = 0.0302246 fft 4: mflops = 2.01207 (norm. = 0.024534), norm. avg. (of 7) = 0.0236978 fft 5: mflops = 28.4497 (norm. = 0.346899), norm. avg. (of 7) = 0.186443 fft 6: mflops = 33.3638 (norm. = 0.406818), norm. avg. (of 7) = 0.188774 fft 7: mflops = 41.2361 (norm. = 0.502809), norm. avg. (of 7) = 0.188591 fft 8: mflops = 6.69711 (norm. = 0.0816606), norm. avg. (of 6) = 0.0598714 fft 9: mflops = 15.6838 (norm. = 0.191239), norm. avg. (of 7) = 0.125755 fft 10: mflops = 82.0115 (norm. = 1), norm. 
avg. (of 7) = 0.856312 fft 11: mflops = 82.0115 (norm. = 1), norm. avg. (of 7) = 0.819666 fft 12: mflops = 55.6063 (norm. = 0.67803), norm. avg. (of 7) = 0.832632 fft 13: mflops = 28.8978 (norm. = 0.352362), norm. avg. (of 5) = 0.309997 fft 14: mflops = 29.8375 (norm. = 0.363821), norm. avg. (of 7) = 0.193753 fft 15: mflops = 23.5257 (norm. = 0.286859), norm. avg. (of 7) = 0.104649 fft 16: mflops = 24.3047 (norm. = 0.296358), norm. avg. (of 7) = 0.105534 fft 17: mflops = 44.7563 (norm. = 0.545732), norm. avg. (of 7) = 0.41355 fft 18: mflops = 36.7002 (norm. = 0.4475), norm. avg. (of 6) = 0.301754 fft 19: mflops = 44.2171 (norm. = 0.539157), norm. avg. (of 6) = 0.342219 fft 20: mflops = 43.6907 (norm. = 0.532738), norm. avg. (of 6) = 0.34307 fft 21: mflops = 11.4688 (norm. = 0.139844), norm. avg. (of 7) = 0.0636525 fft 22: mflops = 66.1264 (norm. = 0.806306), norm. avg. (of 7) = 0.609102 fft 23: mflops = 11.7629 (norm. = 0.143429), norm. avg. (of 6) = 0.0723559 fft 24: mflops = 38.2293 (norm. = 0.466146), norm. avg. (of 7) = 0.205467 fft 25: mflops = 23.3759 (norm. = 0.285032), norm. avg. (of 7) = 0.132445 fft 26: mflops = 2.34057 (norm. = 0.0285395), norm. avg. (of 7) = 0.0263446 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.32 s, 2048 iters, t-(init.)=1.3 s t(norm)=0.309944, mflops=16.1319 (err=9.9e-16) 1. Arndt DIT: elapsed time t=1.29 s, 2048 iters, t-(init.)=1.27 s t(norm)=0.302792, mflops=16.513 (err=9.8e-16) 2. Arndt Split-Radix: elapsed time t=1.33 s, 2048 iters, t-(init.)=1.31 s t(norm)=0.312328, mflops=16.0088 (err=1.0e-15) 3. Arndt 4-step: elapsed time t=1.04 s, 512 iters, t-(init.)=1.03 s t(norm)=0.982285, mflops=5.09017 (err=1.0e-15) 4. Beauregard: elapsed time t=1.33 s, 256 iters, t-(init.)=1.32 s t(norm)=2.5177, mflops=1.98594 (err=1.1e-15) 5. Bergland: elapsed time t=1.36 s, 4096 iters, t-(init.)=1.32 s t(norm)=0.157356, mflops=31.775 (err=1.0e-15) 6. 
CWP (min N) (N=260): elapsed time t=1.09 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.123978, mflops=40.3298 7. CWP (best N) (N=280): elapsed time t=1.95 s, 8192 iters, t-(init.)=1.85 s t(norm)=0.110269, mflops=45.3438 8. Edelblute: elapsed time t=1.42 s, 1024 iters, t-(init.)=1.41 s t(norm)=0.67234, mflops=7.43671 (err=9.9e-16) 9. FFTPACK (f2c): elapsed time t=1.27 s, 2048 iters, t-(init.)=1.25 s t(norm)=0.298023, mflops=16.7772 (err=1.0e-15) FFTW_MEASURE plan: (cost = 1.171875e-04) FFTW_TWIDDLE 8 FFTW_NOTW 32 10. FFTW: elapsed time t=1.08 s, 8192 iters, t-(init.)=0.99 s t(norm)=0.0590086, mflops=84.7334 (err=1.1e-15) FFTW_ESTIMATE plan: (cost = 9.216000e+02) FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.1 s, 8192 iters, t-(init.)=1.01 s t(norm)=0.0602007, mflops=83.0555 (err=1.1e-15) 12. Frigo-old: elapsed time t=1.64 s, 8192 iters, t-(init.)=1.55 s t(norm)=0.0923872, mflops=54.1201 (err=1.1e-15) 13. Green: elapsed time t=1.43 s, 4096 iters, t-(init.)=1.39 s t(norm)=0.165701, mflops=30.1748 (err=1.1e-15) 14. GSL: elapsed time t=1.33 s, 4096 iters, t-(init.)=1.29 s t(norm)=0.15378, mflops=32.514 (err=1.0e-15) 15. GSL DIT: elapsed time t=1.54 s, 4096 iters, t-(init.)=1.5 s t(norm)=0.178814, mflops=27.962 (err=1.0e-15) 16. GSL DIF: elapsed time t=1.47 s, 4096 iters, t-(init.)=1.42 s t(norm)=0.169277, mflops=29.5374 (err=1.1e-15) 17. Krukar: elapsed time t=1.06 s, 4096 iters, t-(init.)=1.02 s t(norm)=0.121593, mflops=41.1206 (err=1.1e-15) 18. Mayer (Buneman): elapsed time t=1.13 s, 4096 iters, t-(init.)=1.08 s t(norm)=0.128746, mflops=38.8361 (err=9.7e-16) 19. Mayer (simple): elapsed time t=1.89 s, 8192 iters, t-(init.)=1.8 s t(norm)=0.107288, mflops=46.6034 20. Mayer (lookup): elapsed time t=1.05 s, 4096 iters, t-(init.)=1.01 s t(norm)=0.120401, mflops=41.5278 (err=9.3e-16) 21. NAPACK (f2c): elapsed time t=1.72 s, 2048 iters, t-(init.)=1.7 s t(norm)=0.405312, mflops=12.3362 (err=3.8e-15) 22. 
Ooura (C): elapsed time t=1.33 s, 8192 iters, t-(init.)=1.24 s t(norm)=0.0739098, mflops=67.6501 (err=1.0e-15) 23. Ransom: elapsed time t=1.11 s, 2048 iters, t-(init.)=1.09 s t(norm)=0.259876, mflops=19.2399 (err=1.9e-15) 24. Singleton (f2c): elapsed time t=1.03 s, 4096 iters, t-(init.)=0.99 s t(norm)=0.118017, mflops=42.3667 (err=1.7e-15) 25. Temperton (f2c): elapsed time t=1.76 s, 4096 iters, t-(init.)=1.72 s t(norm)=0.20504, mflops=24.3855 (err=1.0e-15) 26. Valkenburg: elapsed time t=1.11 s, 256 iters, t-(init.)=1.1 s t(norm)=2.09808, mflops=2.38313 (err=1.2e-15) Top mflops for N=256 = 84.7334 Normalized results and averages for N=256: fft 0: mflops = 16.1319 (norm. = 0.190385), norm. avg. (of 8) = 0.2828 fft 1: mflops = 16.513 (norm. = 0.194882), norm. avg. (of 8) = 0.256776 fft 2: mflops = 16.0088 (norm. = 0.188931), norm. avg. (of 8) = 0.15893 fft 3: mflops = 5.09017 (norm. = 0.0600728), norm. avg. (of 8) = 0.0339556 fft 4: mflops = 1.98594 (norm. = 0.0234375), norm. avg. (of 8) = 0.0236652 fft 5: mflops = 31.775 (norm. = 0.375), norm. avg. (of 8) = 0.210012 fft 6: mflops = 40.3298 (norm. = 0.475962), norm. avg. (of 8) = 0.224672 fft 7: mflops = 45.3438 (norm. = 0.535135), norm. avg. (of 8) = 0.231909 fft 8: mflops = 7.43671 (norm. = 0.087766), norm. avg. (of 7) = 0.0638563 fft 9: mflops = 16.7772 (norm. = 0.198), norm. avg. (of 8) = 0.134786 fft 10: mflops = 84.7334 (norm. = 1), norm. avg. (of 8) = 0.874273 fft 11: mflops = 83.0555 (norm. = 0.980198), norm. avg. (of 8) = 0.839732 fft 12: mflops = 54.1201 (norm. = 0.63871), norm. avg. (of 8) = 0.808392 fft 13: mflops = 30.1748 (norm. = 0.356115), norm. avg. (of 6) = 0.317683 fft 14: mflops = 32.514 (norm. = 0.383721), norm. avg. (of 8) = 0.217499 fft 15: mflops = 27.962 (norm. = 0.33), norm. avg. (of 8) = 0.132817 fft 16: mflops = 29.5374 (norm. = 0.348592), norm. avg. (of 8) = 0.135916 fft 17: mflops = 41.1206 (norm. = 0.485294), norm. avg. (of 8) = 0.422518 fft 18: mflops = 38.8361 (norm. = 0.458333), norm. 
avg. (of 7) = 0.324122 fft 19: mflops = 46.6034 (norm. = 0.55), norm. avg. (of 7) = 0.371902 fft 20: mflops = 41.5278 (norm. = 0.490099), norm. avg. (of 7) = 0.364074 fft 21: mflops = 12.3362 (norm. = 0.145588), norm. avg. (of 8) = 0.0738945 fft 22: mflops = 67.6501 (norm. = 0.798387), norm. avg. (of 8) = 0.632763 fft 23: mflops = 19.2399 (norm. = 0.227064), norm. avg. (of 7) = 0.0944571 fft 24: mflops = 42.3667 (norm. = 0.5), norm. avg. (of 8) = 0.242283 fft 25: mflops = 24.3855 (norm. = 0.287791), norm. avg. (of 8) = 0.151863 fft 26: mflops = 2.38313 (norm. = 0.028125), norm. avg. (of 8) = 0.0265671 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.37 s, 1024 iters, t-(init.)=1.35 s t(norm)=0.286102, mflops=17.4763 (err=1.1e-15) 1. Arndt DIT: elapsed time t=1.34 s, 1024 iters, t-(init.)=1.32 s t(norm)=0.279744, mflops=17.8735 (err=1.1e-15) 2. Arndt Split-Radix: elapsed time t=1.39 s, 1024 iters, t-(init.)=1.37 s t(norm)=0.290341, mflops=17.2211 (err=1.1e-15) 3. Arndt 4-step: elapsed time t=1.13 s, 256 iters, t-(init.)=1.13 s t(norm)=0.957913, mflops=5.21968 (err=9.8e-16) 4. Beauregard: elapsed time t=1.52 s, 128 iters, t-(init.)=1.51 s t(norm)=2.56009, mflops=1.95306 (err=1.0e-15) 5. Bergland: elapsed time t=1.45 s, 2048 iters, t-(init.)=1.4 s t(norm)=0.148349, mflops=33.7042 (err=1.0e-15) 6. CWP (min N) (N=520): elapsed time t=1.19 s, 2048 iters, t-(init.)=1.14 s t(norm)=0.120799, mflops=41.3912 7. CWP (best N) (N=560): elapsed time t=1.02 s, 2048 iters, t-(init.)=0.97 s t(norm)=0.102785, mflops=48.6453 8. Edelblute: elapsed time t=1.45 s, 512 iters, t-(init.)=1.44 s t(norm)=0.610352, mflops=8.192 (err=1.1e-15) 9. FFTPACK (f2c): elapsed time t=1.64 s, 1024 iters, t-(init.)=1.62 s t(norm)=0.343323, mflops=14.5636 (err=1.0e-15) FFTW_MEASURE plan: (cost = 3.027344e-04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 16 10. 
FFTW: elapsed time t=1.49 s, 4096 iters, t-(init.)=1.4 s t(norm)=0.0741747, mflops=67.4085 (err=1.0e-15) FFTW_ESTIMATE plan: (cost = 1.843200e+03) FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.49 s, 4096 iters, t-(init.)=1.4 s t(norm)=0.0741747, mflops=67.4085 (err=9.8e-16) 12. Frigo-old: elapsed time t=1.02 s, 2048 iters, t-(init.)=0.98 s t(norm)=0.103845, mflops=48.1489 (err=9.5e-16) 13. Green: elapsed time t=1.51 s, 2048 iters, t-(init.)=1.47 s t(norm)=0.155767, mflops=32.0993 (err=9.6e-16) 14. GSL: elapsed time t=1.71 s, 2048 iters, t-(init.)=1.67 s t(norm)=0.17696, mflops=28.255 (err=1.0e-15) 15. GSL DIT: elapsed time t=1.55 s, 2048 iters, t-(init.)=1.51 s t(norm)=0.160005, mflops=31.249 (err=1.2e-15) 16. GSL DIF: elapsed time t=1.46 s, 2048 iters, t-(init.)=1.42 s t(norm)=0.150469, mflops=33.2295 (err=1.1e-15) 17. Krukar: elapsed time t=1.17 s, 2048 iters, t-(init.)=1.12 s t(norm)=0.118679, mflops=42.1303 (err=1.0e-15) 18. Mayer (Buneman): elapsed time t=1.17 s, 2048 iters, t-(init.)=1.13 s t(norm)=0.119739, mflops=41.7575 (err=1.0e-15) 19. Mayer (simple): elapsed time t=1.98 s, 4096 iters, t-(init.)=1.89 s t(norm)=0.100136, mflops=49.9322 20. Mayer (lookup): elapsed time t=1.3 s, 2048 iters, t-(init.)=1.26 s t(norm)=0.133514, mflops=37.4491 (err=1.0e-15) 21. NAPACK (f2c): elapsed time t=1 s, 512 iters, t-(init.)=0.99 s t(norm)=0.419617, mflops=11.9156 (err=7.1e-15) 22. Ooura (C): elapsed time t=1.46 s, 4096 iters, t-(init.)=1.37 s t(norm)=0.0725852, mflops=68.8846 (err=9.8e-16) 23. Ransom: elapsed time t=1.43 s, 1024 iters, t-(init.)=1.41 s t(norm)=0.298818, mflops=16.7326 (err=1.4e-15) 24. Singleton (f2c): elapsed time t=1.09 s, 2048 iters, t-(init.)=1.04 s t(norm)=0.110202, mflops=45.3711 (err=1.2e-15) 25. Temperton (f2c): elapsed time t=1.06 s, 1024 iters, t-(init.)=1.04 s t(norm)=0.220405, mflops=22.6855 (err=1.0e-15) 26. 
Valkenburg: elapsed time t=1.27 s, 128 iters, t-(init.)=1.27 s t(norm)=2.15318, mflops=2.32214 (err=1.3e-15) Top mflops for N=512 = 68.8846 Normalized results and averages for N=512: fft 0: mflops = 17.4763 (norm. = 0.253704), norm. avg. (of 9) = 0.279567 fft 1: mflops = 17.8735 (norm. = 0.25947), norm. avg. (of 9) = 0.257075 fft 2: mflops = 17.2211 (norm. = 0.25), norm. avg. (of 9) = 0.169049 fft 3: mflops = 5.21968 (norm. = 0.0757743), norm. avg. (of 9) = 0.0386022 fft 4: mflops = 1.95306 (norm. = 0.0283526), norm. avg. (of 9) = 0.0241861 fft 5: mflops = 33.7042 (norm. = 0.489286), norm. avg. (of 9) = 0.241043 fft 6: mflops = 41.3912 (norm. = 0.600877), norm. avg. (of 9) = 0.266473 fft 7: mflops = 48.6453 (norm. = 0.706186), norm. avg. (of 9) = 0.284606 fft 8: mflops = 8.192 (norm. = 0.118924), norm. avg. (of 8) = 0.0707397 fft 9: mflops = 14.5636 (norm. = 0.21142), norm. avg. (of 9) = 0.143301 fft 10: mflops = 67.4085 (norm. = 0.978571), norm. avg. (of 9) = 0.885862 fft 11: mflops = 67.4085 (norm. = 0.978571), norm. avg. (of 9) = 0.855159 fft 12: mflops = 48.1489 (norm. = 0.69898), norm. avg. (of 9) = 0.796235 fft 13: mflops = 32.0993 (norm. = 0.465986), norm. avg. (of 7) = 0.33887 fft 14: mflops = 28.255 (norm. = 0.41018), norm. avg. (of 9) = 0.238908 fft 15: mflops = 31.249 (norm. = 0.453642), norm. avg. (of 9) = 0.168465 fft 16: mflops = 33.2295 (norm. = 0.482394), norm. avg. (of 9) = 0.174414 fft 17: mflops = 42.1303 (norm. = 0.611607), norm. avg. (of 9) = 0.443528 fft 18: mflops = 41.7575 (norm. = 0.606195), norm. avg. (of 8) = 0.359381 fft 19: mflops = 49.9322 (norm. = 0.724868), norm. avg. (of 8) = 0.416022 fft 20: mflops = 37.4491 (norm. = 0.543651), norm. avg. (of 8) = 0.386521 fft 21: mflops = 11.9156 (norm. = 0.17298), norm. avg. (of 9) = 0.084904 fft 22: mflops = 68.8846 (norm. = 1), norm. avg. (of 9) = 0.673567 fft 23: mflops = 16.7326 (norm. = 0.242908), norm. avg. (of 8) = 0.113013 fft 24: mflops = 45.3711 (norm. = 0.658654), norm. avg. 
(of 9) = 0.288547 fft 25: mflops = 22.6855 (norm. = 0.329327), norm. avg. (of 9) = 0.171581 fft 26: mflops = 2.32214 (norm. = 0.0337106), norm. avg. (of 9) = 0.0273609 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.43 s, 512 iters, t-(init.)=1.41 s t(norm)=0.268936, mflops=18.5918 (err=1.8e-15) 1. Arndt DIT: elapsed time t=1.39 s, 512 iters, t-(init.)=1.37 s t(norm)=0.261307, mflops=19.1346 (err=1.8e-15) 2. Arndt Split-Radix: elapsed time t=1.45 s, 512 iters, t-(init.)=1.43 s t(norm)=0.272751, mflops=18.3317 (err=1.9e-15) 3. Arndt 4-step: elapsed time t=1.11 s, 128 iters, t-(init.)=1.11 s t(norm)=0.846863, mflops=5.90414 (err=1.8e-15) 4. Beauregard: elapsed time t=1.72 s, 64 iters, t-(init.)=1.72 s t(norm)=2.62451, mflops=1.90512 (err=2.0e-15) 5. Bergland: elapsed time t=1.56 s, 1024 iters, t-(init.)=1.51 s t(norm)=0.144005, mflops=34.7211 (err=2.2e-15) 6. CWP (min N) (N=1040): elapsed time t=1.25 s, 1024 iters, t-(init.)=1.2 s t(norm)=0.114441, mflops=43.6907 7. CWP (best N) (N=1040): elapsed time t=1.25 s, 1024 iters, t-(init.)=1.21 s t(norm)=0.115395, mflops=43.3296 8. Edelblute: elapsed time t=1.48 s, 256 iters, t-(init.)=1.47 s t(norm)=0.56076, mflops=8.91646 (err=1.8e-15) 9. FFTPACK (f2c): elapsed time t=1.73 s, 512 iters, t-(init.)=1.71 s t(norm)=0.326157, mflops=15.3301 (err=1.9e-15) FFTW_MEASURE plan: (cost = 7.031250e-04) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_NOTW 16 10. FFTW: elapsed time t=1.57 s, 2048 iters, t-(init.)=1.48 s t(norm)=0.0705719, mflops=70.8497 (err=2.0e-15) FFTW_ESTIMATE plan: (cost = 1.126400e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.72 s, 2048 iters, t-(init.)=1.63 s t(norm)=0.0777245, mflops=64.3298 (err=1.9e-15) 12. Frigo-old: elapsed time t=1.26 s, 1024 iters, t-(init.)=1.21 s t(norm)=0.115395, mflops=43.3296 (err=1.9e-15) 13. Green: elapsed time t=1.86 s, 1024 iters, t-(init.)=1.82 s t(norm)=0.173569, mflops=28.807 (err=2.0e-15) 14. 
GSL: elapsed time t=1.75 s, 1024 iters, t-(init.)=1.71 s t(norm)=0.163078, mflops=30.6601 (err=1.9e-15) 15. GSL DIT: elapsed time t=1.62 s, 1024 iters, t-(init.)=1.57 s t(norm)=0.149727, mflops=33.3941 (err=2.1e-15) 16. GSL DIF: elapsed time t=1.52 s, 1024 iters, t-(init.)=1.47 s t(norm)=0.14019, mflops=35.6659 (err=2.2e-15) 17. Krukar: elapsed time t=1.94 s, 1024 iters, t-(init.)=1.9 s t(norm)=0.181198, mflops=27.5941 (err=1.9e-15) 18. Mayer (Buneman): elapsed time t=1.25 s, 1024 iters, t-(init.)=1.21 s t(norm)=0.115395, mflops=43.3296 (err=1.8e-15) 19. Mayer (simple): elapsed time t=1.07 s, 1024 iters, t-(init.)=1.03 s t(norm)=0.0982285, mflops=50.9017 20. Mayer (lookup): elapsed time t=1.1 s, 1024 iters, t-(init.)=1.06 s t(norm)=0.101089, mflops=49.4611 (err=1.8e-15) 21. NAPACK (f2c): elapsed time t=1.05 s, 256 iters, t-(init.)=1.04 s t(norm)=0.396729, mflops=12.6031 (err=1.7e-14) 22. Ooura (C): elapsed time t=1.53 s, 2048 iters, t-(init.)=1.44 s t(norm)=0.0686646, mflops=72.8178 (err=2.2e-15) 23. Ransom: elapsed time t=1.12 s, 512 iters, t-(init.)=1.1 s t(norm)=0.209808, mflops=23.8313 (err=2.3e-15) 24. Singleton (f2c): elapsed time t=1.18 s, 1024 iters, t-(init.)=1.14 s t(norm)=0.108719, mflops=45.9902 (err=2.8e-15) 25. Temperton (f2c): elapsed time t=1.09 s, 512 iters, t-(init.)=1.07 s t(norm)=0.204086, mflops=24.4994 (err=1.9e-15) 26. Valkenburg: elapsed time t=1.42 s, 64 iters, t-(init.)=1.42 s t(norm)=2.16675, mflops=2.30761 (err=2.4e-15) Top mflops for N=1024 = 72.8178 Normalized results and averages for N=1024: fft 0: mflops = 18.5918 (norm. = 0.255319), norm. avg. (of 10) = 0.277143 fft 1: mflops = 19.1346 (norm. = 0.262774), norm. avg. (of 10) = 0.257645 fft 2: mflops = 18.3317 (norm. = 0.251748), norm. avg. (of 10) = 0.177319 fft 3: mflops = 5.90414 (norm. = 0.0810811), norm. avg. (of 10) = 0.0428501 fft 4: mflops = 1.90512 (norm. = 0.0261628), norm. avg. (of 10) = 0.0243837 fft 5: mflops = 34.7211 (norm. = 0.476821), norm. avg. 
(of 10) = 0.264621 fft 6: mflops = 43.6907 (norm. = 0.6), norm. avg. (of 10) = 0.299826 fft 7: mflops = 43.3296 (norm. = 0.595041), norm. avg. (of 10) = 0.31565 fft 8: mflops = 8.91646 (norm. = 0.122449), norm. avg. (of 9) = 0.0764852 fft 9: mflops = 15.3301 (norm. = 0.210526), norm. avg. (of 10) = 0.150023 fft 10: mflops = 70.8497 (norm. = 0.972973), norm. avg. (of 10) = 0.894573 fft 11: mflops = 64.3298 (norm. = 0.883436), norm. avg. (of 10) = 0.857986 fft 12: mflops = 43.3296 (norm. = 0.595041), norm. avg. (of 10) = 0.776116 fft 13: mflops = 28.807 (norm. = 0.395604), norm. avg. (of 8) = 0.345961 fft 14: mflops = 30.6601 (norm. = 0.421053), norm. avg. (of 10) = 0.257122 fft 15: mflops = 33.3941 (norm. = 0.458599), norm. avg. (of 10) = 0.197478 fft 16: mflops = 35.6659 (norm. = 0.489796), norm. avg. (of 10) = 0.205952 fft 17: mflops = 27.5941 (norm. = 0.378947), norm. avg. (of 10) = 0.43707 fft 18: mflops = 43.3296 (norm. = 0.595041), norm. avg. (of 9) = 0.385566 fft 19: mflops = 50.9017 (norm. = 0.699029), norm. avg. (of 9) = 0.447468 fft 20: mflops = 49.4611 (norm. = 0.679245), norm. avg. (of 9) = 0.419046 fft 21: mflops = 12.6031 (norm. = 0.173077), norm. avg. (of 10) = 0.0937213 fft 22: mflops = 72.8178 (norm. = 1), norm. avg. (of 10) = 0.70621 fft 23: mflops = 23.8313 (norm. = 0.327273), norm. avg. (of 9) = 0.13682 fft 24: mflops = 45.9902 (norm. = 0.631579), norm. avg. (of 10) = 0.32285 fft 25: mflops = 24.4994 (norm. = 0.336449), norm. avg. (of 10) = 0.188068 fft 26: mflops = 2.30761 (norm. = 0.0316901), norm. avg. (of 10) = 0.0277938 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.65 s, 256 iters, t-(init.)=1.63 s t(norm)=0.282634, mflops=17.6907 (err=1.4e-15) 1. Arndt DIT: elapsed time t=1.64 s, 256 iters, t-(init.)=1.62 s t(norm)=0.2809, mflops=17.7999 (err=1.4e-15) 2. Arndt Split-Radix: elapsed time t=1.74 s, 256 iters, t-(init.)=1.73 s t(norm)=0.299974, mflops=16.6681 (err=1.4e-15) 3. 
Arndt 4-step: elapsed time t=1.22 s, 64 iters, t-(init.)=1.22 s t(norm)=0.846169, mflops=5.90898 (err=1.4e-15) 4. Beauregard: elapsed time t=1.93 s, 32 iters, t-(init.)=1.93 s t(norm)=2.67722, mflops=1.86761 (err=1.4e-15) 5. Bergland: elapsed time t=1.77 s, 512 iters, t-(init.)=1.72 s t(norm)=0.14912, mflops=33.53 (err=1.5e-15) 6. CWP (min N) (N=2145): elapsed time t=1.73 s, 512 iters, t-(init.)=1.68 s t(norm)=0.145652, mflops=34.3284 7. CWP (best N) (N=2184): elapsed time t=1.56 s, 512 iters, t-(init.)=1.51 s t(norm)=0.130913, mflops=38.1932 8. Edelblute: elapsed time t=1.62 s, 128 iters, t-(init.)=1.6 s t(norm)=0.554865, mflops=9.0112 (err=1.4e-15) 9. FFTPACK (f2c): elapsed time t=2 s, 256 iters, t-(init.)=1.98 s t(norm)=0.343323, mflops=14.5636 (err=1.4e-15) FFTW_MEASURE plan: (cost = 1.640625e-03) FFTW_TWIDDLE 32 FFTW_TWIDDLE 4 FFTW_NOTW 16 10. FFTW: elapsed time t=1.82 s, 1024 iters, t-(init.)=1.73 s t(norm)=0.0749935, mflops=66.6725 (err=1.4e-15) FFTW_ESTIMATE plan: (cost = 1.269760e+04) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.85 s, 1024 iters, t-(init.)=1.76 s t(norm)=0.0762939, mflops=65.536 (err=1.4e-15) 12. Frigo-old: elapsed time t=1.44 s, 512 iters, t-(init.)=1.39 s t(norm)=0.12051, mflops=41.4904 (err=1.4e-15) 13. Green: elapsed time t=1.09 s, 256 iters, t-(init.)=1.06 s t(norm)=0.183799, mflops=27.2036 (err=1.4e-15) 14. GSL: elapsed time t=1.06 s, 256 iters, t-(init.)=1.03 s t(norm)=0.178597, mflops=27.996 (err=1.4e-15) 15. GSL DIT: elapsed time t=1.1 s, 256 iters, t-(init.)=1.08 s t(norm)=0.187267, mflops=26.6999 (err=2.0e-15) 16. GSL DIF: elapsed time t=1.88 s, 512 iters, t-(init.)=1.84 s t(norm)=0.159524, mflops=31.3433 (err=2.3e-15) 17. Krukar: elapsed time t=1.16 s, 256 iters, t-(init.)=1.14 s t(norm)=0.197671, mflops=25.2946 (err=1.4e-15) 18. Mayer (Buneman): elapsed time t=1.4 s, 512 iters, t-(init.)=1.35 s t(norm)=0.117042, mflops=42.7198 (err=1.4e-15) 19. 
Mayer (simple): elapsed time t=1.21 s, 512 iters, t-(init.)=1.16 s t(norm)=0.100569, mflops=49.717 20. Mayer (lookup): elapsed time t=1.27 s, 512 iters, t-(init.)=1.22 s t(norm)=0.105771, mflops=47.2719 (err=1.4e-15) 21. NAPACK (f2c): elapsed time t=1.31 s, 128 iters, t-(init.)=1.3 s t(norm)=0.450828, mflops=11.0907 (err=1.5e-14) 22. Ooura (C): elapsed time t=1.13 s, 512 iters, t-(init.)=1.08 s t(norm)=0.0936335, mflops=53.3997 (err=1.4e-15) 23. Ransom: elapsed time t=1.55 s, 256 iters, t-(init.)=1.53 s t(norm)=0.265295, mflops=18.847 (err=2.1e-15) 24. Singleton (f2c): elapsed time t=1.51 s, 512 iters, t-(init.)=1.47 s t(norm)=0.127446, mflops=39.2324 (err=1.9e-15) 25. Temperton (f2c): elapsed time t=1.22 s, 256 iters, t-(init.)=1.19 s t(norm)=0.20634, mflops=24.2318 (err=1.4e-15) 26. Valkenburg: elapsed time t=1.55 s, 32 iters, t-(init.)=1.55 s t(norm)=2.1501, mflops=2.32547 (err=1.7e-15) Top mflops for N=2048 = 66.6725 Normalized results and averages for N=2048: fft 0: mflops = 17.6907 (norm. = 0.265337), norm. avg. (of 11) = 0.276069 fft 1: mflops = 17.7999 (norm. = 0.266975), norm. avg. (of 11) = 0.258493 fft 2: mflops = 16.6681 (norm. = 0.25), norm. avg. (of 11) = 0.183926 fft 3: mflops = 5.90898 (norm. = 0.088627), norm. avg. (of 11) = 0.0470116 fft 4: mflops = 1.86761 (norm. = 0.0280117), norm. avg. (of 11) = 0.0247135 fft 5: mflops = 33.53 (norm. = 0.502907), norm. avg. (of 11) = 0.286283 fft 6: mflops = 34.3284 (norm. = 0.514881), norm. avg. (of 11) = 0.319376 fft 7: mflops = 38.1932 (norm. = 0.572848), norm. avg. (of 11) = 0.339031 fft 8: mflops = 9.0112 (norm. = 0.135156), norm. avg. (of 10) = 0.0823523 fft 9: mflops = 14.5636 (norm. = 0.218434), norm. avg. (of 11) = 0.156242 fft 10: mflops = 66.6725 (norm. = 1), norm. avg. (of 11) = 0.904157 fft 11: mflops = 65.536 (norm. = 0.982955), norm. avg. (of 11) = 0.869347 fft 12: mflops = 41.4904 (norm. = 0.622302), norm. avg. (of 11) = 0.762133 fft 13: mflops = 27.2036 (norm. = 0.408019), norm. avg. 
(of 9) = 0.352857 fft 14: mflops = 27.996 (norm. = 0.419903), norm. avg. (of 11) = 0.271921 fft 15: mflops = 26.6999 (norm. = 0.400463), norm. avg. (of 11) = 0.215931 fft 16: mflops = 31.3433 (norm. = 0.470109), norm. avg. (of 11) = 0.229966 fft 17: mflops = 25.2946 (norm. = 0.379386), norm. avg. (of 11) = 0.431826 fft 18: mflops = 42.7198 (norm. = 0.640741), norm. avg. (of 10) = 0.411083 fft 19: mflops = 49.717 (norm. = 0.74569), norm. avg. (of 10) = 0.47729 fft 20: mflops = 47.2719 (norm. = 0.709016), norm. avg. (of 10) = 0.448043 fft 21: mflops = 11.0907 (norm. = 0.166346), norm. avg. (of 11) = 0.100324 fft 22: mflops = 53.3997 (norm. = 0.800926), norm. avg. (of 11) = 0.714821 fft 23: mflops = 18.847 (norm. = 0.28268), norm. avg. (of 10) = 0.151406 fft 24: mflops = 39.2324 (norm. = 0.588435), norm. avg. (of 11) = 0.346994 fft 25: mflops = 24.2318 (norm. = 0.363445), norm. avg. (of 11) = 0.204012 fft 26: mflops = 2.32547 (norm. = 0.034879), norm. avg. (of 11) = 0.0284379 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.71 s, 128 iters, t-(init.)=1.69 s t(norm)=0.268618, mflops=18.6138 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.68 s, 128 iters, t-(init.)=1.65 s t(norm)=0.26226, mflops=19.065 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.79 s, 128 iters, t-(init.)=1.77 s t(norm)=0.281334, mflops=17.7725 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.21 s, 32 iters, t-(init.)=1.21 s t(norm)=0.769297, mflops=6.49944 (err=3.7e-15) 4. Beauregard: elapsed time t=1.04 s, 8 iters, t-(init.)=1.04 s t(norm)=2.64486, mflops=1.89046 (err=3.8e-15) 5. Bergland: elapsed time t=1.87 s, 256 iters, t-(init.)=1.82 s t(norm)=0.144641, mflops=34.5684 (err=3.9e-15) 6. CWP (min N) (N=4290): elapsed time t=1.87 s, 256 iters, t-(init.)=1.82 s t(norm)=0.144641, mflops=34.5684 7. CWP (best N) (N=4368): elapsed time t=1.62 s, 256 iters, t-(init.)=1.57 s t(norm)=0.124772, mflops=40.073 8. 
Edelblute: elapsed time t=1.64 s, 64 iters, t-(init.)=1.62 s t(norm)=0.514984, mflops=9.70904 (err=3.7e-15) 9. FFTPACK (f2c): elapsed time t=1.09 s, 64 iters, t-(init.)=1.08 s t(norm)=0.343323, mflops=14.5636 (err=3.8e-15) FFTW_MEASURE plan: (cost = 3.593750e-03) FFTW_TWIDDLE 32 FFTW_TWIDDLE 4 FFTW_NOTW 32 10. FFTW: elapsed time t=1.96 s, 512 iters, t-(init.)=1.86 s t(norm)=0.0739098, mflops=67.6501 (err=3.8e-15) FFTW_ESTIMATE plan: (cost = 2.539520e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.95 s, 512 iters, t-(init.)=1.86 s t(norm)=0.0739098, mflops=67.6501 (err=3.8e-15) 12. Frigo-old: elapsed time t=1.72 s, 256 iters, t-(init.)=1.67 s t(norm)=0.13272, mflops=37.6734 (err=3.8e-15) 13. Green: elapsed time t=1.14 s, 128 iters, t-(init.)=1.11 s t(norm)=0.17643, mflops=28.3399 (err=3.8e-15) 14. GSL: elapsed time t=1.09 s, 128 iters, t-(init.)=1.07 s t(norm)=0.170072, mflops=29.3993 (err=3.8e-15) 15. GSL DIT: elapsed time t=1.21 s, 128 iters, t-(init.)=1.18 s t(norm)=0.187556, mflops=26.6587 (err=4.1e-15) 16. GSL DIF: elapsed time t=1.05 s, 128 iters, t-(init.)=1.03 s t(norm)=0.163714, mflops=30.541 (err=4.3e-15) 17. Krukar: elapsed time t=1.35 s, 128 iters, t-(init.)=1.32 s t(norm)=0.209808, mflops=23.8313 (err=3.8e-15) 18. Mayer (Buneman): elapsed time t=1.98 s, 256 iters, t-(init.)=1.93 s t(norm)=0.153383, mflops=32.5982 (err=3.7e-15) 19. Mayer (simple): elapsed time t=1.78 s, 256 iters, t-(init.)=1.73 s t(norm)=0.137488, mflops=36.3668 20. Mayer (lookup): elapsed time t=1.78 s, 256 iters, t-(init.)=1.74 s t(norm)=0.138283, mflops=36.1578 (err=3.7e-15) 21. NAPACK (f2c): elapsed time t=1.39 s, 64 iters, t-(init.)=1.38 s t(norm)=0.43869, mflops=11.3976 (err=4.9e-14) 22. Ooura (C): elapsed time t=1.13 s, 256 iters, t-(init.)=1.08 s t(norm)=0.0858307, mflops=58.2542 (err=3.9e-15) 23. Ransom: elapsed time t=1.26 s, 128 iters, t-(init.)=1.24 s t(norm)=0.197093, mflops=25.3688 (err=4.4e-15) 24. 
Singleton (f2c): elapsed time t=1.62 s, 256 iters, t-(init.)=1.57 s t(norm)=0.124772, mflops=40.073 (err=5.8e-15) 25. Temperton (f2c): elapsed time t=1.3 s, 128 iters, t-(init.)=1.28 s t(norm)=0.203451, mflops=24.576 (err=3.8e-15) 26. Valkenburg: elapsed time t=1.71 s, 16 iters, t-(init.)=1.71 s t(norm)=2.17438, mflops=2.29951 (err=4.0e-15) Top mflops for N=4096 = 67.6501 Normalized results and averages for N=4096: fft 0: mflops = 18.6138 (norm. = 0.275148), norm. avg. (of 12) = 0.275993 fft 1: mflops = 19.065 (norm. = 0.281818), norm. avg. (of 12) = 0.260437 fft 2: mflops = 17.7725 (norm. = 0.262712), norm. avg. (of 12) = 0.190492 fft 3: mflops = 6.49944 (norm. = 0.0960744), norm. avg. (of 12) = 0.0511002 fft 4: mflops = 1.89046 (norm. = 0.0279447), norm. avg. (of 12) = 0.0249828 fft 5: mflops = 34.5684 (norm. = 0.510989), norm. avg. (of 12) = 0.305009 fft 6: mflops = 34.5684 (norm. = 0.510989), norm. avg. (of 12) = 0.335344 fft 7: mflops = 40.073 (norm. = 0.592357), norm. avg. (of 12) = 0.360142 fft 8: mflops = 9.70904 (norm. = 0.143519), norm. avg. (of 11) = 0.0879129 fft 9: mflops = 14.5636 (norm. = 0.215278), norm. avg. (of 12) = 0.161162 fft 10: mflops = 67.6501 (norm. = 1), norm. avg. (of 12) = 0.912144 fft 11: mflops = 67.6501 (norm. = 1), norm. avg. (of 12) = 0.880235 fft 12: mflops = 37.6734 (norm. = 0.556886), norm. avg. (of 12) = 0.745029 fft 13: mflops = 28.3399 (norm. = 0.418919), norm. avg. (of 10) = 0.359463 fft 14: mflops = 29.3993 (norm. = 0.434579), norm. avg. (of 12) = 0.285476 fft 15: mflops = 26.6587 (norm. = 0.394068), norm. avg. (of 12) = 0.230776 fft 16: mflops = 30.541 (norm. = 0.451456), norm. avg. (of 12) = 0.248424 fft 17: mflops = 23.8313 (norm. = 0.352273), norm. avg. (of 12) = 0.425197 fft 18: mflops = 32.5982 (norm. = 0.481865), norm. avg. (of 11) = 0.417518 fft 19: mflops = 36.3668 (norm. = 0.537572), norm. avg. (of 11) = 0.48277 fft 20: mflops = 36.1578 (norm. = 0.534483), norm. avg. 
(of 11) = 0.455901 fft 21: mflops = 11.3976 (norm. = 0.168478), norm. avg. (of 12) = 0.106003 fft 22: mflops = 58.2542 (norm. = 0.861111), norm. avg. (of 12) = 0.727011 fft 23: mflops = 25.3688 (norm. = 0.375), norm. avg. (of 11) = 0.171733 fft 24: mflops = 40.073 (norm. = 0.592357), norm. avg. (of 12) = 0.367441 fft 25: mflops = 24.576 (norm. = 0.363281), norm. avg. (of 12) = 0.217284 fft 26: mflops = 2.29951 (norm. = 0.0339912), norm. avg. (of 12) = 0.0289007 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.77 s, 64 iters, t-(init.)=1.75 s t(norm)=0.256758, mflops=19.4736 (err=3.7e-15) 1. Arndt DIT: elapsed time t=1.76 s, 64 iters, t-(init.)=1.74 s t(norm)=0.255291, mflops=19.5855 (err=3.7e-15) 2. Arndt Split-Radix: elapsed time t=1.88 s, 64 iters, t-(init.)=1.86 s t(norm)=0.272898, mflops=18.3219 (err=3.7e-15) 3. Arndt 4-step: elapsed time t=1.28 s, 16 iters, t-(init.)=1.27 s t(norm)=0.745333, mflops=6.70841 (err=3.7e-15) 4. Beauregard: elapsed time t=1.15 s, 4 iters, t-(init.)=1.15 s t(norm)=2.69963, mflops=1.8521 (err=3.7e-15) 5. Bergland: elapsed time t=1 s, 64 iters, t-(init.)=0.98 s t(norm)=0.143785, mflops=34.7742 (err=3.7e-15) 6. CWP (min N) (N=8580): elapsed time t=1.93 s, 128 iters, t-(init.)=1.88 s t(norm)=0.137916, mflops=36.254 7. CWP (best N) (N=9240): elapsed time t=1.91 s, 128 iters, t-(init.)=1.85 s t(norm)=0.135715, mflops=36.8419 8. Edelblute: elapsed time t=1.7 s, 32 iters, t-(init.)=1.68 s t(norm)=0.492976, mflops=10.1425 (err=3.7e-15) 9. FFTPACK (f2c): elapsed time t=1.3 s, 32 iters, t-(init.)=1.29 s t(norm)=0.378535, mflops=13.2088 (err=3.7e-15) FFTW_MEASURE plan: (cost = 9.375000e-03) FFTW_TWIDDLE 32 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_NOTW 16 10. FFTW: elapsed time t=1.21 s, 128 iters, t-(init.)=1.16 s t(norm)=0.0850971, mflops=58.7564 (err=3.7e-15) FFTW_ESTIMATE plan: (cost = 5.079040e+04) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. 
FFTW_ESTIMATE: elapsed time t=1.24 s, 128 iters, t-(init.)=1.2 s t(norm)=0.0880315, mflops=56.7979 (err=3.7e-15) 12. Frigo-old: elapsed time t=1.05 s, 64 iters, t-(init.)=1.03 s t(norm)=0.151121, mflops=33.0861 (err=3.7e-15) 13. Green: elapsed time t=1.33 s, 64 iters, t-(init.)=1.31 s t(norm)=0.192202, mflops=26.0143 (err=3.7e-15) 14. GSL: elapsed time t=1.34 s, 64 iters, t-(init.)=1.32 s t(norm)=0.193669, mflops=25.8172 (err=3.7e-15) 15. GSL DIT: elapsed time t=1.31 s, 64 iters, t-(init.)=1.29 s t(norm)=0.189268, mflops=26.4176 (err=4.3e-15) 16. GSL DIF: elapsed time t=1.11 s, 64 iters, t-(init.)=1.08 s t(norm)=0.158457, mflops=31.5544 (err=4.3e-15) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.06 s, 64 iters, t-(init.)=1.04 s t(norm)=0.152588, mflops=32.768 (err=3.7e-15) 19. Mayer (simple): elapsed time t=1.92 s, 128 iters, t-(init.)=1.88 s t(norm)=0.137916, mflops=36.254 20. Mayer (lookup): elapsed time t=1.89 s, 128 iters, t-(init.)=1.85 s t(norm)=0.135715, mflops=36.8419 (err=3.7e-15) 21. NAPACK (f2c): elapsed time t=1.58 s, 32 iters, t-(init.)=1.57 s t(norm)=0.460698, mflops=10.8531 (err=4.5e-14) 22. Ooura (C): elapsed time t=1.3 s, 128 iters, t-(init.)=1.25 s t(norm)=0.0916995, mflops=54.526 (err=3.7e-15) 23. Ransom: elapsed time t=1.6 s, 64 iters, t-(init.)=1.58 s t(norm)=0.231816, mflops=21.5688 (err=4.9e-15) 24. Singleton (f2c): elapsed time t=1.75 s, 128 iters, t-(init.)=1.7 s t(norm)=0.124711, mflops=40.0926 (err=5.6e-15) 25. Temperton (f2c): elapsed time t=1.53 s, 64 iters, t-(init.)=1.51 s t(norm)=0.221546, mflops=22.5687 (err=3.7e-15) 26. Valkenburg: elapsed time t=1.85 s, 8 iters, t-(init.)=1.85 s t(norm)=2.17144, mflops=2.30262 (err=3.8e-15) Top mflops for N=8192 = 58.7564 Normalized results and averages for N=8192: fft 0: mflops = 19.4736 (norm. = 0.331429), norm. avg. (of 13) = 0.280257 fft 1: mflops = 19.5855 (norm. = 0.333333), norm. avg. (of 13) = 0.266044 fft 2: mflops = 18.3219 (norm. 
= 0.311828), norm. avg. (of 13) = 0.199825 fft 3: mflops = 6.70841 (norm. = 0.114173), norm. avg. (of 13) = 0.0559519 fft 4: mflops = 1.8521 (norm. = 0.0315217), norm. avg. (of 13) = 0.0254858 fft 5: mflops = 34.7742 (norm. = 0.591837), norm. avg. (of 13) = 0.327072 fft 6: mflops = 36.254 (norm. = 0.617021), norm. avg. (of 13) = 0.357011 fft 7: mflops = 36.8419 (norm. = 0.627027), norm. avg. (of 13) = 0.380671 fft 8: mflops = 10.1425 (norm. = 0.172619), norm. avg. (of 12) = 0.0949717 fft 9: mflops = 13.2088 (norm. = 0.224806), norm. avg. (of 13) = 0.166058 fft 10: mflops = 58.7564 (norm. = 1), norm. avg. (of 13) = 0.918902 fft 11: mflops = 56.7979 (norm. = 0.966667), norm. avg. (of 13) = 0.886884 fft 12: mflops = 33.0861 (norm. = 0.563107), norm. avg. (of 13) = 0.731035 fft 13: mflops = 26.0143 (norm. = 0.442748), norm. avg. (of 11) = 0.367034 fft 14: mflops = 25.8172 (norm. = 0.439394), norm. avg. (of 13) = 0.297315 fft 15: mflops = 26.4176 (norm. = 0.449612), norm. avg. (of 13) = 0.24761 fft 16: mflops = 31.5544 (norm. = 0.537037), norm. avg. (of 13) = 0.270625 fft 17: mflops = -1 (norm. = -0.0170194), norm. avg. (of 12) = 0.425197 fft 18: mflops = 32.768 (norm. = 0.557692), norm. avg. (of 12) = 0.429199 fft 19: mflops = 36.254 (norm. = 0.617021), norm. avg. (of 12) = 0.493958 fft 20: mflops = 36.8419 (norm. = 0.627027), norm. avg. (of 12) = 0.470162 fft 21: mflops = 10.8531 (norm. = 0.184713), norm. avg. (of 13) = 0.112058 fft 22: mflops = 54.526 (norm. = 0.928), norm. avg. (of 13) = 0.742472 fft 23: mflops = 21.5688 (norm. = 0.367089), norm. avg. (of 12) = 0.188012 fft 24: mflops = 40.0926 (norm. = 0.682353), norm. avg. (of 13) = 0.391665 fft 25: mflops = 22.5687 (norm. = 0.384106), norm. avg. (of 13) = 0.230116 fft 26: mflops = 2.30262 (norm. = 0.0391892), norm. avg. (of 13) = 0.0296921 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.09 s, 16 iters, t-(init.)=1.07 s t(norm)=0.291552, mflops=17.1496 (err=6.8e-15) 1. 
Arndt DIT: elapsed time t=1.07 s, 16 iters, t-(init.)=1.06 s t(norm)=0.288827, mflops=17.3114 (err=6.8e-15) 2. Arndt Split-Radix: elapsed time t=1.17 s, 16 iters, t-(init.)=1.15 s t(norm)=0.31335, mflops=15.9566 (err=6.8e-15) 3. Arndt 4-step: elapsed time t=1.28 s, 8 iters, t-(init.)=1.27 s t(norm)=0.692095, mflops=7.22444 (err=6.8e-15) 4. Beauregard: elapsed time t=1.27 s, 2 iters, t-(init.)=1.27 s t(norm)=2.76838, mflops=1.80611 (err=6.8e-15) 5. Bergland: elapsed time t=1.19 s, 32 iters, t-(init.)=1.16 s t(norm)=0.158037, mflops=31.6381 (err=6.8e-15) 6. CWP (min N) (N=17160): elapsed time t=1.07 s, 32 iters, t-(init.)=1.05 s t(norm)=0.143051, mflops=34.9525 7. CWP (best N) (N=17160): elapsed time t=1.06 s, 32 iters, t-(init.)=1.04 s t(norm)=0.141689, mflops=35.2886 8. Edelblute: elapsed time t=1.92 s, 16 iters, t-(init.)=1.91 s t(norm)=0.520434, mflops=9.60737 (err=6.8e-15) 9. FFTPACK (f2c): elapsed time t=1.47 s, 16 iters, t-(init.)=1.45 s t(norm)=0.395094, mflops=12.6552 (err=6.8e-15) FFTW_MEASURE plan: (cost = 2.625000e-02) FFTW_TWIDDLE 16 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_NOTW 16 10. FFTW: elapsed time t=1.67 s, 64 iters, t-(init.)=1.62 s t(norm)=0.110354, mflops=45.3088 (err=6.8e-15) FFTW_ESTIMATE plan: (cost = 1.441792e+05) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.51 s, 64 iters, t-(init.)=1.45 s t(norm)=0.0987734, mflops=50.6209 (err=6.8e-15) 12. Frigo-old: elapsed time t=1.4 s, 32 iters, t-(init.)=1.37 s t(norm)=0.186648, mflops=26.7884 (err=6.8e-15) 13. Green: elapsed time t=1.5 s, 32 iters, t-(init.)=1.47 s t(norm)=0.200272, mflops=24.9661 (err=6.8e-15) 14. GSL: elapsed time t=1.51 s, 32 iters, t-(init.)=1.48 s t(norm)=0.201634, mflops=24.7974 (err=6.8e-15) 15. GSL DIT: elapsed time t=1.66 s, 32 iters, t-(init.)=1.64 s t(norm)=0.223432, mflops=22.3781 (err=7.2e-15) 16. GSL DIF: elapsed time t=1.5 s, 32 iters, t-(init.)=1.47 s t(norm)=0.200272, mflops=24.9661 (err=7.3e-15) 17. 
Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.25 s, 32 iters, t-(init.)=1.22 s t(norm)=0.166212, mflops=30.0821 (err=6.8e-15) 19. Mayer (simple): elapsed time t=1.16 s, 32 iters, t-(init.)=1.13 s t(norm)=0.15395, mflops=32.478 20. Mayer (lookup): elapsed time t=1.15 s, 32 iters, t-(init.)=1.12 s t(norm)=0.152588, mflops=32.768 (err=6.8e-15) 21. NAPACK (f2c): elapsed time t=1.82 s, 16 iters, t-(init.)=1.81 s t(norm)=0.493186, mflops=10.1382 (err=2.3e-13) 22. Ooura (C): elapsed time t=1.44 s, 64 iters, t-(init.)=1.39 s t(norm)=0.0946862, mflops=52.806 (err=6.8e-15) 23. Ransom: elapsed time t=1.35 s, 32 iters, t-(init.)=1.32 s t(norm)=0.179836, mflops=27.8032 (err=7.4e-15) 24. Singleton (f2c): elapsed time t=1.08 s, 32 iters, t-(init.)=1.05 s t(norm)=0.143051, mflops=34.9525 (err=1.0e-14) 25. Temperton (f2c): elapsed time t=1.68 s, 32 iters, t-(init.)=1.66 s t(norm)=0.226157, mflops=22.1085 (err=6.8e-15) 26. Valkenburg: elapsed time t=1.01 s, 2 iters, t-(init.)=1.01 s t(norm)=2.20163, mflops=2.27105 (err=6.9e-15) Top mflops for N=16384 = 52.806 Normalized results and averages for N=16384: fft 0: mflops = 17.1496 (norm. = 0.324766), norm. avg. (of 14) = 0.283436 fft 1: mflops = 17.3114 (norm. = 0.32783), norm. avg. (of 14) = 0.270458 fft 2: mflops = 15.9566 (norm. = 0.302174), norm. avg. (of 14) = 0.207136 fft 3: mflops = 7.22444 (norm. = 0.136811), norm. avg. (of 14) = 0.0617276 fft 4: mflops = 1.80611 (norm. = 0.0342028), norm. avg. (of 14) = 0.0261084 fft 5: mflops = 31.6381 (norm. = 0.599138), norm. avg. (of 14) = 0.346506 fft 6: mflops = 34.9525 (norm. = 0.661905), norm. avg. (of 14) = 0.378789 fft 7: mflops = 35.2886 (norm. = 0.668269), norm. avg. (of 14) = 0.401214 fft 8: mflops = 9.60737 (norm. = 0.181937), norm. avg. (of 13) = 0.101661 fft 9: mflops = 12.6552 (norm. = 0.239655), norm. avg. (of 14) = 0.171315 fft 10: mflops = 45.3088 (norm. = 0.858025), norm. avg. (of 14) = 0.914554 fft 11: mflops = 50.6209 (norm. 
= 0.958621), norm. avg. (of 14) = 0.892008 fft 12: mflops = 26.7884 (norm. = 0.507299), norm. avg. (of 14) = 0.715054 fft 13: mflops = 24.9661 (norm. = 0.472789), norm. avg. (of 12) = 0.375847 fft 14: mflops = 24.7974 (norm. = 0.469595), norm. avg. (of 14) = 0.309621 fft 15: mflops = 22.3781 (norm. = 0.42378), norm. avg. (of 14) = 0.260193 fft 16: mflops = 24.9661 (norm. = 0.472789), norm. avg. (of 14) = 0.285065 fft 17: mflops = -1 (norm. = -0.0189372), norm. avg. (of 12) = 0.425197 fft 18: mflops = 30.0821 (norm. = 0.569672), norm. avg. (of 13) = 0.440005 fft 19: mflops = 32.478 (norm. = 0.615044), norm. avg. (of 13) = 0.503272 fft 20: mflops = 32.768 (norm. = 0.620536), norm. avg. (of 13) = 0.481729 fft 21: mflops = 10.1382 (norm. = 0.191989), norm. avg. (of 14) = 0.117767 fft 22: mflops = 52.806 (norm. = 1), norm. avg. (of 14) = 0.760867 fft 23: mflops = 27.8032 (norm. = 0.526515), norm. avg. (of 13) = 0.214051 fft 24: mflops = 34.9525 (norm. = 0.661905), norm. avg. (of 14) = 0.410968 fft 25: mflops = 22.1085 (norm. = 0.418675), norm. avg. (of 14) = 0.243585 fft 26: mflops = 2.27105 (norm. = 0.0430074), norm. avg. (of 14) = 0.0306432 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.61 s, 8 iters, t-(init.)=1.59 s t(norm)=0.404358, mflops=12.3653 (err=1.4e-14) 1. Arndt DIT: elapsed time t=1.62 s, 8 iters, t-(init.)=1.6 s t(norm)=0.406901, mflops=12.288 (err=1.4e-14) 2. Arndt Split-Radix: elapsed time t=1.79 s, 8 iters, t-(init.)=1.77 s t(norm)=0.450134, mflops=11.1078 (err=1.4e-14) 3. Arndt 4-step: elapsed time t=1.45 s, 4 iters, t-(init.)=1.44 s t(norm)=0.732422, mflops=6.82667 (err=1.4e-14) 4. Beauregard: elapsed time t=1.38 s, 1 iters, t-(init.)=1.37 s t(norm)=2.78727, mflops=1.79387 (err=1.4e-14) 5. Bergland: elapsed time t=1.66 s, 16 iters, t-(init.)=1.62 s t(norm)=0.205994, mflops=24.2726 (err=1.4e-14) 6. CWP (min N) (N=34320): elapsed time t=1.16 s, 16 iters, t-(init.)=1.11 s t(norm)=0.141144, mflops=35.4249 7. 
CWP (best N) (N=34320): elapsed time t=1.15 s, 16 iters, t-(init.)=1.11 s t(norm)=0.141144, mflops=35.4249 8. Edelblute: elapsed time t=1.28 s, 4 iters, t-(init.)=1.27 s t(norm)=0.645955, mflops=7.74047 (err=1.4e-14) 9. FFTPACK (f2c): elapsed time t=1.7 s, 8 iters, t-(init.)=1.68 s t(norm)=0.427246, mflops=11.7029 (err=1.4e-14) FFTW_MEASURE plan: (cost = 6.000000e-02) FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_NOTW 16 10. FFTW: elapsed time t=1.01 s, 16 iters, t-(init.)=0.97 s t(norm)=0.123342, mflops=40.5377 (err=1.4e-14) FFTW_ESTIMATE plan: (cost = 2.883584e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.92 s, 32 iters, t-(init.)=1.84 s t(norm)=0.116984, mflops=42.7409 (err=1.4e-14) 12. Frigo-old: elapsed time t=1.5 s, 16 iters, t-(init.)=1.46 s t(norm)=0.185649, mflops=26.9326 (err=1.4e-14) 13. Green: elapsed time t=1.95 s, 16 iters, t-(init.)=1.91 s t(norm)=0.242869, mflops=20.5872 (err=1.4e-14) 14. GSL: elapsed time t=1.83 s, 16 iters, t-(init.)=1.79 s t(norm)=0.22761, mflops=21.9674 (err=1.4e-14) 15. GSL DIT: elapsed time t=1.37 s, 8 iters, t-(init.)=1.35 s t(norm)=0.343323, mflops=14.5636 (err=1.4e-14) 16. GSL DIF: elapsed time t=1.32 s, 8 iters, t-(init.)=1.3 s t(norm)=0.330607, mflops=15.1237 (err=1.4e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.63 s, 16 iters, t-(init.)=1.59 s t(norm)=0.202179, mflops=24.7306 (err=1.4e-14) 19. Mayer (simple): elapsed time t=1.52 s, 16 iters, t-(init.)=1.48 s t(norm)=0.188192, mflops=26.5686 20. Mayer (lookup): elapsed time t=1.57 s, 16 iters, t-(init.)=1.53 s t(norm)=0.19455, mflops=25.7004 (err=1.4e-14) 21. NAPACK (f2c): elapsed time t=1.07 s, 4 iters, t-(init.)=1.06 s t(norm)=0.539144, mflops=9.27396 (err=5.6e-13) 22. Ooura (C): elapsed time t=1.06 s, 16 iters, t-(init.)=1.02 s t(norm)=0.1297, mflops=38.5506 (err=1.4e-14) 23. 
Ransom: elapsed time t=1.84 s, 16 iters, t-(init.)=1.8 s t(norm)=0.228882, mflops=21.8453 (err=1.5e-14) 24. Singleton (f2c): elapsed time t=1.98 s, 16 iters, t-(init.)=1.93 s t(norm)=0.245412, mflops=20.3739 (err=2.1e-14) 25. Temperton (f2c): elapsed time t=1.27 s, 8 iters, t-(init.)=1.25 s t(norm)=0.317891, mflops=15.7286 (err=1.4e-14) 26. Valkenburg: elapsed time t=1.15 s, 1 iters, t-(init.)=1.15 s t(norm)=2.33968, mflops=2.13704 (err=1.4e-14) Top mflops for N=32768 = 42.7409 Normalized results and averages for N=32768: fft 0: mflops = 12.3653 (norm. = 0.289308), norm. avg. (of 15) = 0.283828 fft 1: mflops = 12.288 (norm. = 0.2875), norm. avg. (of 15) = 0.271594 fft 2: mflops = 11.1078 (norm. = 0.259887), norm. avg. (of 15) = 0.210653 fft 3: mflops = 6.82667 (norm. = 0.159722), norm. avg. (of 15) = 0.0682606 fft 4: mflops = 1.79387 (norm. = 0.0419708), norm. avg. (of 15) = 0.0271659 fft 5: mflops = 24.2726 (norm. = 0.567901), norm. avg. (of 15) = 0.361265 fft 6: mflops = 35.4249 (norm. = 0.828829), norm. avg. (of 15) = 0.408792 fft 7: mflops = 35.4249 (norm. = 0.828829), norm. avg. (of 15) = 0.429722 fft 8: mflops = 7.74047 (norm. = 0.181102), norm. avg. (of 14) = 0.107336 fft 9: mflops = 11.7029 (norm. = 0.27381), norm. avg. (of 15) = 0.178148 fft 10: mflops = 40.5377 (norm. = 0.948454), norm. avg. (of 15) = 0.916814 fft 11: mflops = 42.7409 (norm. = 1), norm. avg. (of 15) = 0.899207 fft 12: mflops = 26.9326 (norm. = 0.630137), norm. avg. (of 15) = 0.709392 fft 13: mflops = 20.5872 (norm. = 0.481675), norm. avg. (of 13) = 0.383988 fft 14: mflops = 21.9674 (norm. = 0.513966), norm. avg. (of 15) = 0.323244 fft 15: mflops = 14.5636 (norm. = 0.340741), norm. avg. (of 15) = 0.265563 fft 16: mflops = 15.1237 (norm. = 0.353846), norm. avg. (of 15) = 0.289651 fft 17: mflops = -1 (norm. = -0.0233968), norm. avg. (of 12) = 0.425197 fft 18: mflops = 24.7306 (norm. = 0.578616), norm. avg. (of 14) = 0.449906 fft 19: mflops = 26.5686 (norm. = 0.621622), norm. avg. 
(of 14) = 0.511726 fft 20: mflops = 25.7004 (norm. = 0.601307), norm. avg. (of 14) = 0.49027 fft 21: mflops = 9.27396 (norm. = 0.216981), norm. avg. (of 15) = 0.124381 fft 22: mflops = 38.5506 (norm. = 0.901961), norm. avg. (of 15) = 0.770273 fft 23: mflops = 21.8453 (norm. = 0.511111), norm. avg. (of 14) = 0.23527 fft 24: mflops = 20.3739 (norm. = 0.476684), norm. avg. (of 15) = 0.415349 fft 25: mflops = 15.7286 (norm. = 0.368), norm. avg. (of 15) = 0.251879 fft 26: mflops = 2.13704 (norm. = 0.05), norm. avg. (of 15) = 0.0319336 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.06 s, 2 iters, t-(init.)=1.05 s t(norm)=0.500679, mflops=9.98644 (err=1.7e-14) 1. Arndt DIT: elapsed time t=1.06 s, 2 iters, t-(init.)=1.05 s t(norm)=0.500679, mflops=9.98644 (err=1.7e-14) 2. Arndt Split-Radix: elapsed time t=1.23 s, 2 iters, t-(init.)=1.21 s t(norm)=0.576973, mflops=8.66592 (err=1.7e-14) 3. Arndt 4-step: elapsed time t=1.48 s, 2 iters, t-(init.)=1.47 s t(norm)=0.700951, mflops=7.13317 (err=1.7e-14) 4. Beauregard: elapsed time t=2.97 s, 1 iters, t-(init.)=2.97 s t(norm)=2.83241, mflops=1.76528 (err=1.7e-14) 5. Bergland: elapsed time t=1.03 s, 4 iters, t-(init.)=1.01 s t(norm)=0.240803, mflops=20.7639 (err=1.7e-14) 6. CWP (min N) (N=72072): elapsed time t=1.42 s, 8 iters, t-(init.)=1.36 s t(norm)=0.162125, mflops=30.8405 7. CWP (best N) (N=72072): elapsed time t=1.43 s, 8 iters, t-(init.)=1.38 s t(norm)=0.164509, mflops=30.3935 8. Edelblute: elapsed time t=1.61 s, 2 iters, t-(init.)=1.6 s t(norm)=0.762939, mflops=6.5536 (err=1.7e-14) 9. FFTPACK (f2c): elapsed time t=1.79 s, 4 iters, t-(init.)=1.77 s t(norm)=0.422001, mflops=11.8483 (err=1.7e-14) FFTW_MEASURE plan: (cost = 1.400000e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 16 10. 
FFTW: elapsed time t=1.19 s, 8 iters, t-(init.)=1.14 s t(norm)=0.135899, mflops=36.7921 (err=1.7e-14) FFTW_ESTIMATE plan: (cost = 5.767168e+05) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.15 s, 8 iters, t-(init.)=1.11 s t(norm)=0.132322, mflops=37.7865 (err=1.7e-14) 12. Frigo-old: elapsed time t=1.8 s, 8 iters, t-(init.)=1.75 s t(norm)=0.208616, mflops=23.9675 (err=1.7e-14) 13. Green: elapsed time t=1.15 s, 4 iters, t-(init.)=1.13 s t(norm)=0.269413, mflops=18.5589 (err=1.7e-14) 14. GSL: elapsed time t=1.92 s, 8 iters, t-(init.)=1.88 s t(norm)=0.224113, mflops=22.3101 (err=1.7e-14) 15. GSL DIT: elapsed time t=1.77 s, 4 iters, t-(init.)=1.75 s t(norm)=0.417233, mflops=11.9837 (err=1.7e-14) 16. GSL DIF: elapsed time t=1.69 s, 4 iters, t-(init.)=1.67 s t(norm)=0.398159, mflops=12.5578 (err=1.8e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.2 s, 4 iters, t-(init.)=1.17 s t(norm)=0.27895, mflops=17.9244 (err=1.7e-14) 19. Mayer (simple): elapsed time t=1.14 s, 4 iters, t-(init.)=1.12 s t(norm)=0.267029, mflops=18.7246 20. Mayer (lookup): elapsed time t=1.22 s, 4 iters, t-(init.)=1.2 s t(norm)=0.286102, mflops=17.4763 (err=1.7e-14) 21. NAPACK (f2c): elapsed time t=1.15 s, 2 iters, t-(init.)=1.14 s t(norm)=0.543594, mflops=9.19804 (err=8.6e-13) 22. Ooura (C): elapsed time t=1.24 s, 8 iters, t-(init.)=1.2 s t(norm)=0.143051, mflops=34.9525 (err=1.7e-14) 23. Ransom: elapsed time t=1.03 s, 4 iters, t-(init.)=1.01 s t(norm)=0.240803, mflops=20.7639 (err=1.7e-14) 24. Singleton (f2c): elapsed time t=1.07 s, 4 iters, t-(init.)=1.04 s t(norm)=0.247955, mflops=20.1649 (err=2.3e-14) 25. Temperton (f2c): elapsed time t=1.41 s, 4 iters, t-(init.)=1.38 s t(norm)=0.329018, mflops=15.1968 (err=1.7e-14) 26. 
Valkenburg: elapsed time t=2.53 s, 1 iters, t-(init.)=2.52 s t(norm)=2.40326, mflops=2.08051 (err=1.7e-14) Top mflops for N=65536 = 40.3298 Normalized results and averages for N=65536: fft 0: mflops = 11.4598 (norm. = 0.284153), norm. avg. (of 16) = 0.325015 fft 1: mflops = 11.4598 (norm. = 0.284153), norm. avg. (of 16) = 0.321825 fft 2: mflops = 9.11805 (norm. = 0.226087), norm. avg. (of 16) = 0.216773 fft 3: mflops = 9.44663 (norm. = 0.234234), norm. avg. (of 16) = 0.10305 fft 4: mflops = 2.50856 (norm. = 0.062201), norm. avg. (of 16) = 0.0389529 fft 5: mflops = 21.7321 (norm. = 0.53886), norm. avg. (of 16) = 0.383941 fft 6: mflops = 30.8405 (norm. = 0.764706), norm. avg. (of 16) = 0.438272 fft 7: mflops = 30.6154 (norm. = 0.759124), norm. avg. (of 16) = 0.457666 fft 8: mflops = 7.3327 (norm. = 0.181818), norm. avg. (of 15) = 0.127723 fft 9: mflops = 11.8483 (norm. = 0.293785), norm. avg. (of 16) = 0.186355 fft 10: mflops = 40.3298 (norm. = 1), norm. avg. (of 16) = 0.906855 fft 11: mflops = 38.8361 (norm. = 0.962963), norm. avg. (of 16) = 0.883657 fft 12: mflops = 23.9675 (norm. = 0.594286), norm. avg. (of 16) = 0.707538 fft 13: mflops = 18.5589 (norm. = 0.460177), norm. avg. (of 14) = 0.391244 fft 14: mflops = 22.3101 (norm. = 0.553191), norm. avg. (of 16) = 0.333514 fft 15: mflops = 11.9837 (norm. = 0.297143), norm. avg. (of 16) = 0.28105 fft 16: mflops = 12.5578 (norm. = 0.311377), norm. avg. (of 16) = 0.295668 fft 17: mflops = -1 (norm. = -0.0247955), norm. avg. (of 12) = 0.416992 fft 18: mflops = 17.9244 (norm. = 0.444444), norm. avg. (of 15) = 0.426536 fft 19: mflops = 18.7246 (norm. = 0.464286), norm. avg. (of 15) = 0.480521 fft 20: mflops = 17.4763 (norm. = 0.433333), norm. avg. (of 15) = 0.460763 fft 21: mflops = 9.19804 (norm. = 0.22807), norm. avg. (of 16) = 0.135552 fft 22: mflops = 34.9525 (norm. = 0.866667), norm. avg. (of 16) = 0.743295 fft 23: mflops = 20.7639 (norm. = 0.514851), norm. avg. (of 15) = 0.259483 fft 24: mflops = 20.1649 (norm. 
= 0.5), norm. avg. (of 16) = 0.408486 fft 25: mflops = 15.1968 (norm. = 0.376812), norm. avg. (of 16) = 0.254115 fft 26: mflops = 2.08051 (norm. = 0.0515873), norm. avg. (of 16) = 0.0339187 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.08 s, 1 iters, t-(init.)=1.07 s t(norm)=0.480203, mflops=10.4123 (err=3.3e-14) 1. Arndt DIT: elapsed time t=1.06 s, 1 iters, t-(init.)=1.05 s t(norm)=0.471227, mflops=10.6106 (err=3.3e-14) 2. Arndt Split-Radix: elapsed time t=1.37 s, 1 iters, t-(init.)=1.36 s t(norm)=0.610352, mflops=8.192 (err=3.3e-14) 3. Arndt 4-step: elapsed time t=1.31 s, 1 iters, t-(init.)=1.29 s t(norm)=0.578936, mflops=8.63653 4. Beauregard: elapsed time t=4.54 s, 1 iters, t-(init.)=4.53 s t(norm)=2.03301, mflops=2.45941 (err=3.3e-14) 5. Bergland: elapsed time t=1.13 s, 2 iters, t-(init.)=1.11 s t(norm)=0.249077, mflops=20.0741 (err=3.4e-14) 6. CWP (min N) (N=144144): elapsed time t=1.48 s, 4 iters, t-(init.)=1.43 s t(norm)=0.160442, mflops=31.164 7. CWP (best N) (N=144144): elapsed time t=1.47 s, 4 iters, t-(init.)=1.42 s t(norm)=0.15932, mflops=31.3834 8. Edelblute: elapsed time t=1.63 s, 1 iters, t-(init.)=1.62 s t(norm)=0.727036, mflops=6.87723 (err=3.3e-14) 9. FFTPACK (f2c): elapsed time t=1.01 s, 1 iters, t-(init.)=0.99 s t(norm)=0.4443, mflops=11.2537 FFTW_MEASURE plan: (cost = 3.000000e-01) FFTW_TWIDDLE 16 FFTW_TWIDDLE 32 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_NOTW 16 10. FFTW: elapsed time t=1.28 s, 4 iters, t-(init.)=1.23 s t(norm)=0.138002, mflops=36.2313 (err=3.3e-14) FFTW_ESTIMATE plan: (cost = 1.153434e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.23 s, 4 iters, t-(init.)=1.18 s t(norm)=0.132392, mflops=37.7665 (err=3.3e-14) 12. Frigo-old: elapsed time t=1.04 s, 2 iters, t-(init.)=1.02 s t(norm)=0.228882, mflops=21.8453 (err=3.3e-14) 13. 
Green: elapsed time t=1.32 s, 2 iters, t-(init.)=1.29 s t(norm)=0.289468, mflops=17.2731 (err=3.3e-14) 14. GSL: elapsed time t=1.08 s, 2 iters, t-(init.)=1.06 s t(norm)=0.237858, mflops=21.021 (err=3.3e-14) 15. GSL DIT: elapsed time t=1.03 s, 1 iters, t-(init.)=1.02 s t(norm)=0.457764, mflops=10.9227 (err=3.5e-14) 16. GSL DIF: elapsed time t=1 s, 1 iters, t-(init.)=0.99 s t(norm)=0.4443, mflops=11.2537 (err=3.5e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.56 s, 2 iters, t-(init.)=1.54 s t(norm)=0.345567, mflops=14.469 (err=3.3e-14) 19. Mayer (simple): elapsed time t=1.49 s, 2 iters, t-(init.)=1.47 s t(norm)=0.329859, mflops=15.158 20. Mayer (lookup): elapsed time t=1.55 s, 2 iters, t-(init.)=1.52 s t(norm)=0.341079, mflops=14.6594 (err=3.3e-14) 21. NAPACK (f2c): elapsed time t=1.27 s, 1 iters, t-(init.)=1.26 s t(norm)=0.565473, mflops=8.84216 (err=2.0e-12) 22. Ooura (C): elapsed time t=1.42 s, 4 iters, t-(init.)=1.37 s t(norm)=0.15371, mflops=32.5288 (err=3.4e-14) 23. Ransom: elapsed time t=1.26 s, 2 iters, t-(init.)=1.23 s t(norm)=0.276005, mflops=18.1156 (err=3.3e-14) 24. Singleton (f2c): elapsed time t=1.33 s, 2 iters, t-(init.)=1.3 s t(norm)=0.291712, mflops=17.1402 (err=4.8e-14) 25. Temperton (f2c): elapsed time t=1.71 s, 2 iters, t-(init.)=1.68 s t(norm)=0.376982, mflops=13.2632 (err=3.3e-14) 26. Valkenburg: elapsed time t=5.44 s, 1 iters, t-(init.)=5.42 s t(norm)=2.43243, mflops=2.05556 (err=3.4e-14) Top mflops for N=131072 = 37.7665 Normalized results and averages for N=131072: fft 0: mflops = 10.4123 (norm. = 0.275701), norm. avg. (of 17) = 0.322114 fft 1: mflops = 10.6106 (norm. = 0.280952), norm. avg. (of 17) = 0.319421 fft 2: mflops = 8.192 (norm. = 0.216912), norm. avg. (of 17) = 0.216781 fft 3: mflops = 8.63653 (norm. = 0.228682), norm. avg. (of 17) = 0.11044 fft 4: mflops = 2.45941 (norm. = 0.0651214), norm. avg. (of 17) = 0.0404923 fft 5: mflops = 20.0741 (norm. = 0.531532), norm. avg. 
(of 17) = 0.392622 fft 6: mflops = 31.164 (norm. = 0.825175), norm. avg. (of 17) = 0.461031 fft 7: mflops = 31.3834 (norm. = 0.830986), norm. avg. (of 17) = 0.479626 fft 8: mflops = 6.87723 (norm. = 0.182099), norm. avg. (of 16) = 0.131122 fft 9: mflops = 11.2537 (norm. = 0.29798), norm. avg. (of 17) = 0.192921 fft 10: mflops = 36.2313 (norm. = 0.95935), norm. avg. (of 17) = 0.909943 fft 11: mflops = 37.7665 (norm. = 1), norm. avg. (of 17) = 0.890501 fft 12: mflops = 21.8453 (norm. = 0.578431), norm. avg. (of 17) = 0.699944 fft 13: mflops = 17.2731 (norm. = 0.457364), norm. avg. (of 15) = 0.395652 fft 14: mflops = 21.021 (norm. = 0.556604), norm. avg. (of 17) = 0.346637 fft 15: mflops = 10.9227 (norm. = 0.289216), norm. avg. (of 17) = 0.28153 fft 16: mflops = 11.2537 (norm. = 0.29798), norm. avg. (of 17) = 0.295804 fft 17: mflops = -1 (norm. = -0.0264785), norm. avg. (of 12) = 0.416992 fft 18: mflops = 14.469 (norm. = 0.383117), norm. avg. (of 16) = 0.423823 fft 19: mflops = 15.158 (norm. = 0.401361), norm. avg. (of 16) = 0.475574 fft 20: mflops = 14.6594 (norm. = 0.388158), norm. avg. (of 16) = 0.456225 fft 21: mflops = 8.84216 (norm. = 0.234127), norm. avg. (of 17) = 0.141351 fft 22: mflops = 32.5288 (norm. = 0.861314), norm. avg. (of 17) = 0.750237 fft 23: mflops = 18.1156 (norm. = 0.479675), norm. avg. (of 16) = 0.273245 fft 24: mflops = 17.1402 (norm. = 0.453846), norm. avg. (of 17) = 0.411154 fft 25: mflops = 13.2632 (norm. = 0.35119), norm. avg. (of 17) = 0.259826 fft 26: mflops = 2.05556 (norm. = 0.054428), norm. avg. (of 17) = 0.0351252 Benchmarking for array size = 262144 (power of 2): 0. Arndt DIF: elapsed time t=2.36 s, 1 iters, t-(init.)=2.33 s t(norm)=0.493791, mflops=10.1257 (err=4.3e-14) 1. Arndt DIT: elapsed time t=2.33 s, 1 iters, t-(init.)=2.3 s t(norm)=0.487434, mflops=10.2578 (err=4.3e-14) 2. Arndt Split-Radix: elapsed time t=2.97 s, 1 iters, t-(init.)=2.94 s t(norm)=0.623067, mflops=8.02482 (err=4.3e-14) 3. 
Arndt 4-step: elapsed time t=2.41 s, 1 iters, t-(init.)=2.38 s t(norm)=0.504388, mflops=9.91301 (err=4.3e-14) 4. Beauregard: elapsed time t=9.58 s, 1 iters, t-(init.)=9.56 s t(norm)=2.02603, mflops=2.46788 (err=4.4e-14) 5. Bergland: elapsed time t=1.22 s, 1 iters, t-(init.)=1.2 s t(norm)=0.254313, mflops=19.6608 (err=4.4e-14) 6. CWP (min N) (N=360360): elapsed time t=1.04 s, 1 iters, t-(init.)=1 s t(norm)=0.211928, mflops=23.593 7. CWP (best N) (N=360360): elapsed time t=1.04 s, 1 iters, t-(init.)=1.01 s t(norm)=0.214047, mflops=23.3594 8. Edelblute: elapsed time t=3.52 s, 1 iters, t-(init.)=3.49 s t(norm)=0.739627, mflops=6.76016 (err=4.3e-14) 9. FFTPACK (f2c): elapsed time t=2.12 s, 1 iters, t-(init.)=2.1 s t(norm)=0.445048, mflops=11.2347 (err=4.4e-14) FFTW_MEASURE plan: (cost = 6.200000e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_NOTW 16 10. FFTW: elapsed time t=1.31 s, 2 iters, t-(init.)=1.26 s t(norm)=0.133514, mflops=37.4491 (err=4.4e-14) FFTW_ESTIMATE plan: (cost = 2.988442e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.39 s, 2 iters, t-(init.)=1.34 s t(norm)=0.141992, mflops=35.2134 (err=4.4e-14) 12. Frigo-old: elapsed time t=1.21 s, 1 iters, t-(init.)=1.18 s t(norm)=0.250075, mflops=19.994 (err=4.4e-14) 13. Green: elapsed time t=1.38 s, 1 iters, t-(init.)=1.35 s t(norm)=0.286102, mflops=17.4763 (err=4.4e-14) 14. GSL: elapsed time t=1.1 s, 1 iters, t-(init.)=1.08 s t(norm)=0.228882, mflops=21.8453 (err=4.4e-14) 15. GSL DIT: elapsed time t=2.27 s, 1 iters, t-(init.)=2.25 s t(norm)=0.476837, mflops=10.4858 (err=4.6e-14) 16. GSL DIF: elapsed time t=2.2 s, 1 iters, t-(init.)=2.18 s t(norm)=0.462002, mflops=10.8225 (err=4.6e-14) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.93 s, 1 iters, t-(init.)=1.9 s t(norm)=0.402662, mflops=12.4173 (err=4.3e-14) 19. 
Mayer (simple): elapsed time t=1.88 s, 1 iters, t-(init.)=1.85 s t(norm)=0.392066, mflops=12.753 20. Mayer (lookup): elapsed time t=1.93 s, 1 iters, t-(init.)=1.91 s t(norm)=0.404782, mflops=12.3523 (err=4.3e-14) 21. NAPACK (f2c): elapsed time t=2.62 s, 1 iters, t-(init.)=2.59 s t(norm)=0.548893, mflops=9.10925 (err=3.7e-12) 22. Ooura (C): elapsed time t=1.44 s, 2 iters, t-(init.)=1.39 s t(norm)=0.14729, mflops=33.9467 (err=4.4e-14) 23. Ransom: elapsed time t=1.1 s, 1 iters, t-(init.)=1.08 s t(norm)=0.228882, mflops=21.8453 (err=4.3e-14) 24. Singleton (f2c): elapsed time t=1.37 s, 1 iters, t-(init.)=1.35 s t(norm)=0.286102, mflops=17.4763 (err=6.0e-14) 25. Temperton (f2c): elapsed time t=1.72 s, 1 iters, t-(init.)=1.7 s t(norm)=0.360277, mflops=13.8782 (err=4.4e-14) 26. Valkenburg: elapsed time t=11.69 s, 1 iters, t-(init.)=11.66 s t(norm)=2.47108, mflops=2.02341 (err=4.4e-14) Top mflops for N=262144 = 37.4491 Normalized results and averages for N=262144: fft 0: mflops = 10.1257 (norm. = 0.270386), norm. avg. (of 18) = 0.31924 fft 1: mflops = 10.2578 (norm. = 0.273913), norm. avg. (of 18) = 0.316892 fft 2: mflops = 8.02482 (norm. = 0.214286), norm. avg. (of 18) = 0.216643 fft 3: mflops = 9.91301 (norm. = 0.264706), norm. avg. (of 18) = 0.11901 fft 4: mflops = 2.46788 (norm. = 0.0658996), norm. avg. (of 18) = 0.0419038 fft 5: mflops = 19.6608 (norm. = 0.525), norm. avg. (of 18) = 0.399977 fft 6: mflops = 23.593 (norm. = 0.63), norm. avg. (of 18) = 0.470419 fft 7: mflops = 23.3594 (norm. = 0.623762), norm. avg. (of 18) = 0.487633 fft 8: mflops = 6.76016 (norm. = 0.180516), norm. avg. (of 17) = 0.134027 fft 9: mflops = 11.2347 (norm. = 0.3), norm. avg. (of 18) = 0.19887 fft 10: mflops = 37.4491 (norm. = 1), norm. avg. (of 18) = 0.914946 fft 11: mflops = 35.2134 (norm. = 0.940299), norm. avg. (of 18) = 0.893267 fft 12: mflops = 19.994 (norm. = 0.533898), norm. avg. (of 18) = 0.690719 fft 13: mflops = 17.4763 (norm. = 0.466667), norm. avg. 
(of 16) = 0.400091 fft 14: mflops = 21.8453 (norm. = 0.583333), norm. avg. (of 18) = 0.359787 fft 15: mflops = 10.4858 (norm. = 0.28), norm. avg. (of 18) = 0.281445 fft 16: mflops = 10.8225 (norm. = 0.288991), norm. avg. (of 18) = 0.295426 fft 17: mflops = -1 (norm. = -0.0267029), norm. avg. (of 12) = 0.416992 fft 18: mflops = 12.4173 (norm. = 0.331579), norm. avg. (of 17) = 0.418396 fft 19: mflops = 12.753 (norm. = 0.340541), norm. avg. (of 17) = 0.467631 fft 20: mflops = 12.3523 (norm. = 0.329843), norm. avg. (of 17) = 0.448791 fft 21: mflops = 9.10925 (norm. = 0.243243), norm. avg. (of 18) = 0.147012 fft 22: mflops = 33.9467 (norm. = 0.906475), norm. avg. (of 18) = 0.758917 fft 23: mflops = 21.8453 (norm. = 0.583333), norm. avg. (of 17) = 0.291486 fft 24: mflops = 17.4763 (norm. = 0.466667), norm. avg. (of 18) = 0.414238 fft 25: mflops = 13.8782 (norm. = 0.370588), norm. avg. (of 18) = 0.265979 fft 26: mflops = 2.02341 (norm. = 0.0540309), norm. avg. (of 18) = 0.0361755 Benchmarking for array size = 524288 (power of 2): 3. Arndt 4-step: elapsed time t=7.19 s, 1 iters, t-(init.)=7.14 s t(norm)=0.716762, mflops=6.97582 (err=1.1e-13) 4. Beauregard: elapsed time t=28.23 s, 1 iters, t-(init.)=28.18 s t(norm)=2.8289, mflops=1.76747 (err=1.1e-13) 5. Bergland: elapsed time t=2.63 s, 1 iters, t-(init.)=2.58 s t(norm)=0.258998, mflops=19.3052 (err=1.1e-13) 6. CWP (min N) (N=720720): elapsed time t=2.1 s, 1 iters, t-(init.)=2.03 s t(norm)=0.203785, mflops=24.5356 7. CWP (best N) (N=720720): elapsed time t=2.1 s, 1 iters, t-(init.)=2.03 s t(norm)=0.203785, mflops=24.5356 8. 
Edelblute: elapsed time t=8.06 s, 1 iters, t-(init.)=8.01 s t(norm)=0.804098, mflops=6.21815 (err=1.1e-13) 9. FFTPACK (f2c): elapsed time t=4.55 s, 1 iters, t-(init.)=4.49 s t(norm)=0.450737, mflops=11.093 (err=1.1e-13) FFTW_MEASURE plan: (cost = 1.440000e+00) FFTW_TWIDDLE 8 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 10. FFTW: elapsed time t=1.46 s, 1 iters, t-(init.)=1.41 s t(norm)=0.141545, mflops=35.3244 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 5.976883e+06) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.5 s, 1 iters, t-(init.)=1.45 s t(norm)=0.145561, mflops=34.3499 (err=1.1e-13) 12. Frigo-old: elapsed time t=2.66 s, 1 iters, t-(init.)=2.61 s t(norm)=0.262009, mflops=19.0833 (err=1.1e-13) 13. Green: elapsed time t=3.09 s, 1 iters, t-(init.)=3.03 s t(norm)=0.304172, mflops=16.4381 (err=1.1e-13) 14. GSL: elapsed time t=2.37 s, 1 iters, t-(init.)=2.32 s t(norm)=0.232897, mflops=21.4687 (err=1.1e-13) 15. GSL DIT: elapsed time t=4.87 s, 1 iters, t-(init.)=4.82 s t(norm)=0.483864, mflops=10.3335 (err=1.1e-13) 16. GSL DIF: elapsed time t=4.73 s, 1 iters, t-(init.)=4.68 s t(norm)=0.46981, mflops=10.6426 (err=1.1e-13) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=4.08 s, 1 iters, t-(init.)=4.03 s t(norm)=0.404559, mflops=12.3591 (err=1.1e-13) 19. Mayer (simple): elapsed time t=4.01 s, 1 iters, t-(init.)=3.96 s t(norm)=0.397532, mflops=12.5776 20. Mayer (lookup): elapsed time t=4.09 s, 1 iters, t-(init.)=4.04 s t(norm)=0.405563, mflops=12.3286 (err=1.1e-13) 21. NAPACK (f2c): elapsed time t=5.58 s, 1 iters, t-(init.)=5.52 s t(norm)=0.554135, mflops=9.02307 (err=7.9e-12) 22. Ooura (C): elapsed time t=1.58 s, 1 iters, t-(init.)=1.53 s t(norm)=0.153592, mflops=32.5538 (err=1.1e-13) 23. Ransom: elapsed time t=2.6 s, 1 iters, t-(init.)=2.55 s t(norm)=0.255986, mflops=19.5323 (err=1.1e-13) 24. 
Singleton (f2c): elapsed time t=3.42 s, 1 iters, t-(init.)=3.37 s t(norm)=0.338303, mflops=14.7796 (err=1.6e-13) 25. Temperton (f2c): elapsed time t=3.85 s, 1 iters, t-(init.)=3.8 s t(norm)=0.38147, mflops=13.1072 (err=1.1e-13) 26. Valkenburg: elapsed time t=24.71 s, 1 iters, t-(init.)=24.66 s t(norm)=2.47554, mflops=2.01976 (err=1.1e-13) Top mflops for N=524288 = 35.3244 Normalized results and averages for N=524288: fft 0: mflops = 9.27511 (norm. = 0.26257), norm. avg. (of 19) = 0.27846 fft 1: mflops = 9.29242 (norm. = 0.26306), norm. avg. (of 19) = 0.268935 fft 2: mflops = 7.63917 (norm. = 0.216258), norm. avg. (of 19) = 0.211992 fft 3: mflops = 6.97582 (norm. = 0.197479), norm. avg. (of 19) = 0.0949934 fft 4: mflops = 1.76747 (norm. = 0.0500355), norm. avg. (of 19) = 0.0316011 fft 5: mflops = 19.3052 (norm. = 0.546512), norm. avg. (of 19) = 0.398825 fft 6: mflops = 24.5356 (norm. = 0.694581), norm. avg. (of 19) = 0.480458 fft 7: mflops = 24.5356 (norm. = 0.694581), norm. avg. (of 19) = 0.496458 fft 8: mflops = 6.21815 (norm. = 0.17603), norm. avg. (of 18) = 0.121617 fft 9: mflops = 11.093 (norm. = 0.314031), norm. avg. (of 19) = 0.205285 fft 10: mflops = 35.3244 (norm. = 1), norm. avg. (of 19) = 0.930353 fft 11: mflops = 34.3499 (norm. = 0.972414), norm. avg. (of 19) = 0.917449 fft 12: mflops = 19.0833 (norm. = 0.54023), norm. avg. (of 19) = 0.681867 fft 13: mflops = 16.4381 (norm. = 0.465347), norm. avg. (of 17) = 0.403758 fft 14: mflops = 21.4687 (norm. = 0.607759), norm. avg. (of 19) = 0.378515 fft 15: mflops = 10.3335 (norm. = 0.292531), norm. avg. (of 19) = 0.272234 fft 16: mflops = 10.6426 (norm. = 0.301282), norm. avg. (of 19) = 0.293459 fft 17: mflops = -1 (norm. = -0.0283091), norm. avg. (of 12) = 0.425197 fft 18: mflops = 12.3591 (norm. = 0.349876), norm. avg. (of 18) = 0.435017 fft 19: mflops = 12.5776 (norm. = 0.356061), norm. avg. (of 18) = 0.485348 fft 20: mflops = 12.3286 (norm. = 0.34901), norm. avg. 
(of 18) = 0.46579 fft 21: mflops = 9.02307 (norm. = 0.255435), norm. avg. (of 19) = 0.150481 fft 22: mflops = 32.5538 (norm. = 0.921569), norm. avg. (of 19) = 0.797787 fft 23: mflops = 19.5323 (norm. = 0.552941), norm. avg. (of 18) = 0.306401 fft 24: mflops = 14.7796 (norm. = 0.418398), norm. avg. (of 19) = 0.425418 fft 25: mflops = 13.1072 (norm. = 0.371053), norm. avg. (of 19) = 0.277292 fft 26: mflops = 2.01976 (norm. = 0.0571776), norm. avg. (of 19) = 0.0369678 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. CWP (min N) 1. CWP (best N) 2. FFTPACK (f2c) 3. FFTW 4. FFTW_ESTIMATE 5. Frigo-old 6. GSL 7. NAPACK (f2c) 8. Singleton (f2c) 9. Temperton (f2c) 10. Valkenburg Computing normalized averages (11 transforms). Benchmarking for array size = 6: 0. CWP (min N): elapsed time t=1.64 s, 131072 iters, t-(init.)=1.59 s t(norm)=0.782135, mflops=6.39276 1. CWP (best N) (N=15): elapsed time t=1.16 s, 65536 iters, t-(init.)=1.1 s t(norm)=1.0822, mflops=4.62022 2. FFTPACK (f2c): elapsed time t=1.32 s, 131072 iters, t-(init.)=1.26 s t(norm)=0.619805, mflops=8.06705 (err=1.8e-16) FFTW_MEASURE plan: (cost = 1.296997e-06) FFTW_NOTW 6 3. FFTW: elapsed time t=1.49 s, 1048576 iters, t-(init.)=1.03 s t(norm)=0.0633333, mflops=78.9475 (err=1.1e-16) FFTW_ESTIMATE plan: (cost = 4.116000e+02) FFTW_NOTW 6 4. FFTW_ESTIMATE: elapsed time t=1.52 s, 1048576 iters, t-(init.)=1.06 s t(norm)=0.0651779, mflops=76.7131 (err=1.1e-16) 5. 
Frigo-old: elapsed time t=1.16 s, 131072 iters, t-(init.)=1.1 s t(norm)=0.5411, mflops=9.24044 (err=3.1e-16) 6. GSL: elapsed time t=1.73 s, 262144 iters, t-(init.)=1.62 s t(norm)=0.398446, mflops=12.5487 (err=1.2e-16) 7. NAPACK (f2c): elapsed time t=1.81 s, 65536 iters, t-(init.)=1.78 s t(norm)=1.7512, mflops=2.85519 (err=4.7e-16) 8. Singleton (f2c): elapsed time t=1.16 s, 65536 iters, t-(init.)=1.14 s t(norm)=1.12155, mflops=4.45811 (err=1.0e-16) 9. Temperton (f2c): elapsed time t=1.03 s, 65536 iters, t-(init.)=1 s t(norm)=0.983818, mflops=5.08224 (err=1.0e-16) 10. Valkenburg: elapsed time t=1.22 s, 32768 iters, t-(init.)=1.21 s t(norm)=2.38084, mflops=2.1001 (err=2.5e-16) Top mflops for N=6 = 78.9475 Normalized results and averages for N=6: fft 0: mflops = 6.39276 (norm. = 0.0809748), norm. avg. (of 1) = 0.0809748 fft 1: mflops = 4.62022 (norm. = 0.0585227), norm. avg. (of 1) = 0.0585227 fft 2: mflops = 8.06705 (norm. = 0.102183), norm. avg. (of 1) = 0.102183 fft 3: mflops = 78.9475 (norm. = 1), norm. avg. (of 1) = 1 fft 4: mflops = 76.7131 (norm. = 0.971698), norm. avg. (of 1) = 0.971698 fft 5: mflops = 9.24044 (norm. = 0.117045), norm. avg. (of 1) = 0.117045 fft 6: mflops = 12.5487 (norm. = 0.158951), norm. avg. (of 1) = 0.158951 fft 7: mflops = 2.85519 (norm. = 0.0361657), norm. avg. (of 1) = 0.0361657 fft 8: mflops = 4.45811 (norm. = 0.0564693), norm. avg. (of 1) = 0.0564693 fft 9: mflops = 5.08224 (norm. = 0.064375), norm. avg. (of 1) = 0.064375 fft 10: mflops = 2.1001 (norm. = 0.0266012), norm. avg. (of 1) = 0.0266012 Benchmarking for array size = 9: 0. CWP (min N): elapsed time t=1.66 s, 131072 iters, t-(init.)=1.58 s t(norm)=0.422528, mflops=11.8335 1. CWP (best N) (N=15): elapsed time t=1.17 s, 65536 iters, t-(init.)=1.12 s t(norm)=0.599027, mflops=8.34687 2. FFTPACK (f2c): elapsed time t=1.86 s, 131072 iters, t-(init.)=1.79 s t(norm)=0.478687, mflops=10.4452 (err=2.4e-16) FFTW_MEASURE plan: (cost = 2.746582e-06) FFTW_NOTW 9 3. 
FFTW: elapsed time t=1.5 s, 524288 iters, t-(init.)=1.21 s t(norm)=0.0808954, mflops=61.8082 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.851000e+02) FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.53 s, 524288 iters, t-(init.)=1.24 s t(norm)=0.0829011, mflops=60.3128 (err=1.2e-16) 5. Frigo-old: elapsed time t=1.23 s, 65536 iters, t-(init.)=1.19 s t(norm)=0.636466, mflops=7.85587 (err=3.3e-16) 6. GSL: elapsed time t=1.58 s, 131072 iters, t-(init.)=1.5 s t(norm)=0.401134, mflops=12.4647 (err=1.4e-16) 7. NAPACK (f2c): elapsed time t=1.78 s, 65536 iters, t-(init.)=1.74 s t(norm)=0.930632, mflops=5.37269 (err=4.3e-16) 8. Singleton (f2c): elapsed time t=1.06 s, 65536 iters, t-(init.)=1.02 s t(norm)=0.545543, mflops=9.16519 (err=1.5e-16) 9. Temperton (f2c): elapsed time t=1.22 s, 65536 iters, t-(init.)=1.19 s t(norm)=0.636466, mflops=7.85587 (err=1.4e-16) 10. Valkenburg: elapsed time t=1.13 s, 16384 iters, t-(init.)=1.12 s t(norm)=2.39611, mflops=2.08672 (err=2.0e-16) Top mflops for N=9 = 61.8082 Normalized results and averages for N=9: fft 0: mflops = 11.8335 (norm. = 0.191456), norm. avg. (of 2) = 0.136215 fft 1: mflops = 8.34687 (norm. = 0.135045), norm. avg. (of 2) = 0.0967837 fft 2: mflops = 10.4452 (norm. = 0.168994), norm. avg. (of 2) = 0.135588 fft 3: mflops = 61.8082 (norm. = 1), norm. avg. (of 2) = 1 fft 4: mflops = 60.3128 (norm. = 0.975806), norm. avg. (of 2) = 0.973752 fft 5: mflops = 7.85587 (norm. = 0.127101), norm. avg. (of 2) = 0.122073 fft 6: mflops = 12.4647 (norm. = 0.201667), norm. avg. (of 2) = 0.180309 fft 7: mflops = 5.37269 (norm. = 0.0869253), norm. avg. (of 2) = 0.0615455 fft 8: mflops = 9.16519 (norm. = 0.148284), norm. avg. (of 2) = 0.102377 fft 9: mflops = 7.85587 (norm. = 0.127101), norm. avg. (of 2) = 0.0957379 fft 10: mflops = 2.08672 (norm. = 0.0337612), norm. avg. (of 2) = 0.0301812 Benchmarking for array size = 12: 0. CWP (min N): elapsed time t=1.01 s, 65536 iters, t-(init.)=0.97 s t(norm)=0.344053, mflops=14.5326 1. 
CWP (best N) (N=15): elapsed time t=1.17 s, 65536 iters, t-(init.)=1.12 s t(norm)=0.397258, mflops=12.5863 2. FFTPACK (f2c): elapsed time t=1.11 s, 65536 iters, t-(init.)=1.07 s t(norm)=0.379523, mflops=13.1744 (err=1.9e-16) FFTW_MEASURE plan: (cost = 2.899170e-06) FFTW_NOTW 12 3. FFTW: elapsed time t=1.54 s, 524288 iters, t-(init.)=1.19 s t(norm)=0.0527608, mflops=94.7674 (err=1.2e-16) FFTW_ESTIMATE plan: (cost = 4.920000e+02) FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.56 s, 524288 iters, t-(init.)=1.2 s t(norm)=0.0532041, mflops=93.9776 (err=1.2e-16) 5. Frigo-old: elapsed time t=1.1 s, 65536 iters, t-(init.)=1.05 s t(norm)=0.372429, mflops=13.4254 (err=2.9e-16) 6. GSL: elapsed time t=1.81 s, 131072 iters, t-(init.)=1.72 s t(norm)=0.305037, mflops=16.3914 (err=1.6e-16) 7. NAPACK (f2c): elapsed time t=1.52 s, 32768 iters, t-(init.)=1.5 s t(norm)=1.06408, mflops=4.69888 (err=5.5e-16) 8. Singleton (f2c): elapsed time t=1.52 s, 65536 iters, t-(init.)=1.47 s t(norm)=0.521401, mflops=9.58956 (err=1.5e-16) 9. Temperton (f2c): elapsed time t=1.37 s, 65536 iters, t-(init.)=1.32 s t(norm)=0.468196, mflops=10.6793 (err=1.4e-16) 10. Valkenburg: elapsed time t=1.62 s, 16384 iters, t-(init.)=1.61 s t(norm)=2.28423, mflops=2.18892 (err=2.5e-16) Top mflops for N=12 = 94.7674 Normalized results and averages for N=12: fft 0: mflops = 14.5326 (norm. = 0.153351), norm. avg. (of 3) = 0.141927 fft 1: mflops = 12.5863 (norm. = 0.132812), norm. avg. (of 3) = 0.108793 fft 2: mflops = 13.1744 (norm. = 0.139019), norm. avg. (of 3) = 0.136732 fft 3: mflops = 94.7674 (norm. = 1), norm. avg. (of 3) = 1 fft 4: mflops = 93.9776 (norm. = 0.991667), norm. avg. (of 3) = 0.979724 fft 5: mflops = 13.4254 (norm. = 0.141667), norm. avg. (of 3) = 0.128604 fft 6: mflops = 16.3914 (norm. = 0.172965), norm. avg. (of 3) = 0.177861 fft 7: mflops = 4.69888 (norm. = 0.0495833), norm. avg. (of 3) = 0.0575581 fft 8: mflops = 9.58956 (norm. = 0.10119), norm. avg. 
(of 3) = 0.101981 fft 9: mflops = 10.6793 (norm. = 0.112689), norm. avg. (of 3) = 0.101388 fft 10: mflops = 2.18892 (norm. = 0.0230978), norm. avg. (of 3) = 0.0278201 Benchmarking for array size = 15: 0. CWP (min N): elapsed time t=1.17 s, 65536 iters, t-(init.)=1.11 s t(norm)=0.289015, mflops=17.3001 1. CWP (best N): elapsed time t=1.16 s, 65536 iters, t-(init.)=1.11 s t(norm)=0.289015, mflops=17.3001 2. FFTPACK (f2c): elapsed time t=1.44 s, 65536 iters, t-(init.)=1.38 s t(norm)=0.359316, mflops=13.9153 (err=4.1e-16) FFTW_MEASURE plan: (cost = 5.187988e-06) FFTW_NOTW 15 3. FFTW: elapsed time t=1.36 s, 262144 iters, t-(init.)=1.15 s t(norm)=0.0748575, mflops=66.7936 (err=1.8e-16) FFTW_ESTIMATE plan: (cost = 4.485000e+02) FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.36 s, 262144 iters, t-(init.)=1.14 s t(norm)=0.0742066, mflops=67.3795 (err=1.8e-16) 5. Frigo-old: elapsed time t=1.08 s, 32768 iters, t-(init.)=1.05 s t(norm)=0.546785, mflops=9.14436 (err=4.9e-16) 6. GSL: elapsed time t=1.68 s, 65536 iters, t-(init.)=1.62 s t(norm)=0.421806, mflops=11.8538 (err=2.0e-16) 7. NAPACK (f2c): elapsed time t=1.09 s, 16384 iters, t-(init.)=1.08 s t(norm)=1.12482, mflops=4.44517 (err=1.0e-15) 8. Singleton (f2c): elapsed time t=1.84 s, 65536 iters, t-(init.)=1.78 s t(norm)=0.463466, mflops=10.7883 (err=2.9e-16) 9. Temperton (f2c): elapsed time t=1.49 s, 65536 iters, t-(init.)=1.44 s t(norm)=0.374939, mflops=13.3355 (err=2.1e-16) 10. Valkenburg: elapsed time t=1.31 s, 8192 iters, t-(init.)=1.3 s t(norm)=2.70789, mflops=1.84646 (err=3.5e-16) Top mflops for N=15 = 67.3795 Normalized results and averages for N=15: fft 0: mflops = 17.3001 (norm. = 0.256757), norm. avg. (of 4) = 0.170634 fft 1: mflops = 17.3001 (norm. = 0.256757), norm. avg. (of 4) = 0.145784 fft 2: mflops = 13.9153 (norm. = 0.206522), norm. avg. (of 4) = 0.154179 fft 3: mflops = 66.7936 (norm. = 0.991304), norm. avg. (of 4) = 0.997826 fft 4: mflops = 67.3795 (norm. = 1), norm. avg. 
(of 4) = 0.984793 fft 5: mflops = 9.14436 (norm. = 0.135714), norm. avg. (of 4) = 0.130382 fft 6: mflops = 11.8538 (norm. = 0.175926), norm. avg. (of 4) = 0.177377 fft 7: mflops = 4.44517 (norm. = 0.0659722), norm. avg. (of 4) = 0.0596616 fft 8: mflops = 10.7883 (norm. = 0.160112), norm. avg. (of 4) = 0.116514 fft 9: mflops = 13.3355 (norm. = 0.197917), norm. avg. (of 4) = 0.12552 fft 10: mflops = 1.84646 (norm. = 0.0274038), norm. avg. (of 4) = 0.027716 Benchmarking for array size = 18: 0. CWP (min N): elapsed time t=1.44 s, 65536 iters, t-(init.)=1.37 s t(norm)=0.278509, mflops=17.9527 1. CWP (best N) (N=28): elapsed time t=1.66 s, 65536 iters, t-(init.)=1.57 s t(norm)=0.319168, mflops=15.6657 2. FFTPACK (f2c): elapsed time t=1.25 s, 32768 iters, t-(init.)=1.22 s t(norm)=0.496031, mflops=10.08 (err=2.6e-16) FFTW_MEASURE plan: (cost = 6.713867e-06) FFTW_TWIDDLE 3 FFTW_NOTW 6 3. FFTW: elapsed time t=1.85 s, 262144 iters, t-(init.)=1.6 s t(norm)=0.0813166, mflops=61.488 (err=1.7e-16) FFTW_ESTIMATE plan: (cost = 1.168200e+03) FFTW_TWIDDLE 2 FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.09 s, 131072 iters, t-(init.)=0.96 s t(norm)=0.0975799, mflops=51.24 (err=1.9e-16) 5. Frigo-old: elapsed time t=1.4 s, 32768 iters, t-(init.)=1.37 s t(norm)=0.557019, mflops=8.97636 (err=4.5e-16) 6. GSL: elapsed time t=1.25 s, 65536 iters, t-(init.)=1.19 s t(norm)=0.241917, mflops=20.6683 (err=2.2e-16) 7. NAPACK (f2c): elapsed time t=1.95 s, 32768 iters, t-(init.)=1.92 s t(norm)=0.78064, mflops=6.405 (err=8.7e-16) 8. Singleton (f2c): elapsed time t=1.75 s, 65536 iters, t-(init.)=1.69 s t(norm)=0.343563, mflops=14.5534 (err=2.1e-16) 9. Temperton (f2c): elapsed time t=1.09 s, 32768 iters, t-(init.)=1.06 s t(norm)=0.430978, mflops=11.6015 (err=2.9e-16) 10. Valkenburg: elapsed time t=1.43 s, 8192 iters, t-(init.)=1.42 s t(norm)=2.30939, mflops=2.16507 (err=3.7e-16) Top mflops for N=18 = 61.488 Normalized results and averages for N=18: fft 0: mflops = 17.9527 (norm. = 0.291971), norm. avg. 
(of 5) = 0.194902 fft 1: mflops = 15.6657 (norm. = 0.254777), norm. avg. (of 5) = 0.167583 fft 2: mflops = 10.08 (norm. = 0.163934), norm. avg. (of 5) = 0.15613 fft 3: mflops = 61.488 (norm. = 1), norm. avg. (of 5) = 0.998261 fft 4: mflops = 51.24 (norm. = 0.833333), norm. avg. (of 5) = 0.954501 fft 5: mflops = 8.97636 (norm. = 0.145985), norm. avg. (of 5) = 0.133503 fft 6: mflops = 20.6683 (norm. = 0.336134), norm. avg. (of 5) = 0.209129 fft 7: mflops = 6.405 (norm. = 0.104167), norm. avg. (of 5) = 0.0685626 fft 8: mflops = 14.5534 (norm. = 0.236686), norm. avg. (of 5) = 0.140549 fft 9: mflops = 11.6015 (norm. = 0.188679), norm. avg. (of 5) = 0.138152 fft 10: mflops = 2.16507 (norm. = 0.0352113), norm. avg. (of 5) = 0.0292151 Benchmarking for array size = 24: 0. CWP (min N): elapsed time t=1.46 s, 65536 iters, t-(init.)=1.38 s t(norm)=0.19136, mflops=26.1287 1. CWP (best N) (N=28): elapsed time t=1.65 s, 65536 iters, t-(init.)=1.56 s t(norm)=0.21632, mflops=23.1139 2. FFTPACK (f2c): elapsed time t=1.57 s, 32768 iters, t-(init.)=1.53 s t(norm)=0.424321, mflops=11.7835 (err=2.4e-16) FFTW_MEASURE plan: (cost = 8.239746e-06) FFTW_TWIDDLE 4 FFTW_NOTW 6 3. FFTW: elapsed time t=1.1 s, 131072 iters, t-(init.)=0.95 s t(norm)=0.0658668, mflops=75.9108 (err=2.3e-16) FFTW_ESTIMATE plan: (cost = 1.248000e+03) FFTW_TWIDDLE 2 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.18 s, 131072 iters, t-(init.)=1.03 s t(norm)=0.0714135, mflops=70.0148 (err=2.1e-16) 5. Frigo-old: elapsed time t=1.15 s, 32768 iters, t-(init.)=1.11 s t(norm)=0.307841, mflops=16.2422 (err=3.6e-16) 6. GSL: elapsed time t=1.47 s, 65536 iters, t-(init.)=1.39 s t(norm)=0.192747, mflops=25.9407 (err=2.1e-16) 7. NAPACK (f2c): elapsed time t=1.24 s, 16384 iters, t-(init.)=1.23 s t(norm)=0.682242, mflops=7.32878 (err=8.0e-16) 8. Singleton (f2c): elapsed time t=1.37 s, 32768 iters, t-(init.)=1.33 s t(norm)=0.368854, mflops=13.5555 (err=2.3e-16) 9. 
Temperton (f2c): elapsed time t=1.12 s, 32768 iters, t-(init.)=1.09 s t(norm)=0.302294, mflops=16.5402 (err=2.8e-16) 10. Valkenburg: elapsed time t=1 s, 4096 iters, t-(init.)=1 s t(norm)=2.21867, mflops=2.2536 (err=4.2e-16) Top mflops for N=24 = 75.9108 Normalized results and averages for N=24: fft 0: mflops = 26.1287 (norm. = 0.344203), norm. avg. (of 6) = 0.219785 fft 1: mflops = 23.1139 (norm. = 0.304487), norm. avg. (of 6) = 0.1904 fft 2: mflops = 11.7835 (norm. = 0.155229), norm. avg. (of 6) = 0.15598 fft 3: mflops = 75.9108 (norm. = 1), norm. avg. (of 6) = 0.998551 fft 4: mflops = 70.0148 (norm. = 0.92233), norm. avg. (of 6) = 0.949139 fft 5: mflops = 16.2422 (norm. = 0.213964), norm. avg. (of 6) = 0.146913 fft 6: mflops = 25.9407 (norm. = 0.341727), norm. avg. (of 6) = 0.231228 fft 7: mflops = 7.32878 (norm. = 0.0965447), norm. avg. (of 6) = 0.0732263 fft 8: mflops = 13.5555 (norm. = 0.178571), norm. avg. (of 6) = 0.146886 fft 9: mflops = 16.5402 (norm. = 0.21789), norm. avg. (of 6) = 0.151442 fft 10: mflops = 2.2536 (norm. = 0.0296875), norm. avg. (of 6) = 0.0292938 Benchmarking for array size = 36: 0. CWP (min N): elapsed time t=1.07 s, 32768 iters, t-(init.)=1.02 s t(norm)=0.167249, mflops=29.8955 1. CWP (best N): elapsed time t=1.06 s, 32768 iters, t-(init.)=1.01 s t(norm)=0.165609, mflops=30.1915 2. FFTPACK (f2c): elapsed time t=1.18 s, 16384 iters, t-(init.)=1.15 s t(norm)=0.37713, mflops=13.258 (err=4.5e-16) FFTW_MEASURE plan: (cost = 1.342773e-05) FFTW_TWIDDLE 3 FFTW_NOTW 12 3. FFTW: elapsed time t=1.77 s, 131072 iters, t-(init.)=1.54 s t(norm)=0.0631283, mflops=79.2038 (err=4.4e-16) FFTW_ESTIMATE plan: (cost = 1.803600e+03) FFTW_TWIDDLE 3 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.77 s, 131072 iters, t-(init.)=1.54 s t(norm)=0.0631283, mflops=79.2038 (err=4.4e-16) 5. Frigo-old: elapsed time t=1.39 s, 16384 iters, t-(init.)=1.36 s t(norm)=0.445997, mflops=11.2108 (err=5.4e-16) 6. 
GSL: elapsed time t=1.16 s, 32768 iters, t-(init.)=1.1 s t(norm)=0.180367, mflops=27.7213 (err=4.3e-16) 7. NAPACK (f2c): elapsed time t=1.69 s, 16384 iters, t-(init.)=1.66 s t(norm)=0.544379, mflops=9.18478 (err=1.4e-15) 8. Singleton (f2c): elapsed time t=1.35 s, 32768 iters, t-(init.)=1.29 s t(norm)=0.211521, mflops=23.6383 (err=4.7e-16) 9. Temperton (f2c): elapsed time t=1.66 s, 32768 iters, t-(init.)=1.61 s t(norm)=0.263991, mflops=18.94 (err=3.7e-16) 10. Valkenburg: elapsed time t=1.72 s, 4096 iters, t-(init.)=1.71 s t(norm)=2.2431, mflops=2.22905 (err=5.1e-16) Top mflops for N=36 = 79.2038 Normalized results and averages for N=36: fft 0: mflops = 29.8955 (norm. = 0.377451), norm. avg. (of 7) = 0.242309 fft 1: mflops = 30.1915 (norm. = 0.381188), norm. avg. (of 7) = 0.217656 fft 2: mflops = 13.258 (norm. = 0.167391), norm. avg. (of 7) = 0.15761 fft 3: mflops = 79.2038 (norm. = 1), norm. avg. (of 7) = 0.998758 fft 4: mflops = 79.2038 (norm. = 1), norm. avg. (of 7) = 0.956405 fft 5: mflops = 11.2108 (norm. = 0.141544), norm. avg. (of 7) = 0.146146 fft 6: mflops = 27.7213 (norm. = 0.35), norm. avg. (of 7) = 0.248196 fft 7: mflops = 9.18478 (norm. = 0.115964), norm. avg. (of 7) = 0.0793317 fft 8: mflops = 23.6383 (norm. = 0.29845), norm. avg. (of 7) = 0.168538 fft 9: mflops = 18.94 (norm. = 0.23913), norm. avg. (of 7) = 0.163969 fft 10: mflops = 2.22905 (norm. = 0.0281433), norm. avg. (of 7) = 0.0291294 Benchmarking for array size = 80: 0. CWP (min N): elapsed time t=1.03 s, 16384 iters, t-(init.)=0.97 s t(norm)=0.117061, mflops=42.7128 1. CWP (best N) (N=84): elapsed time t=1.02 s, 16384 iters, t-(init.)=0.96 s t(norm)=0.115854, mflops=43.1577 2. FFTPACK (f2c): elapsed time t=1.34 s, 8192 iters, t-(init.)=1.31 s t(norm)=0.316185, mflops=15.8135 (err=3.9e-16) FFTW_MEASURE plan: (cost = 3.662109e-05) FFTW_TWIDDLE 5 FFTW_NOTW 16 3. 
FFTW: elapsed time t=1.35 s, 32768 iters, t-(init.)=1.23 s t(norm)=0.0742191, mflops=67.3681 (err=3.3e-16) FFTW_ESTIMATE plan: (cost = 2.600000e+03) FFTW_TWIDDLE 5 FFTW_NOTW 16 4. FFTW_ESTIMATE: elapsed time t=1.35 s, 32768 iters, t-(init.)=1.23 s t(norm)=0.0742191, mflops=67.3681 (err=3.3e-16) 5. Frigo-old: elapsed time t=1.14 s, 8192 iters, t-(init.)=1.11 s t(norm)=0.267913, mflops=18.6628 (err=3.4e-16) 6. GSL: elapsed time t=1.85 s, 16384 iters, t-(init.)=1.79 s t(norm)=0.21602, mflops=23.146 (err=3.3e-16) 7. NAPACK (f2c): elapsed time t=1.54 s, 4096 iters, t-(init.)=1.52 s t(norm)=0.733743, mflops=6.81437 (err=5.0e-16) 8. Singleton (f2c): elapsed time t=1.31 s, 16384 iters, t-(init.)=1.25 s t(norm)=0.150852, mflops=33.1451 (err=4.4e-16) 9. Temperton (f2c): elapsed time t=1.51 s, 16384 iters, t-(init.)=1.45 s t(norm)=0.174988, mflops=28.5734 (err=3.4e-16) 10. Valkenburg: elapsed time t=1.24 s, 1024 iters, t-(init.)=1.23 s t(norm)=2.37501, mflops=2.10525 (err=4.6e-16) Top mflops for N=80 = 67.3681 Normalized results and averages for N=80: fft 0: mflops = 42.7128 (norm. = 0.634021), norm. avg. (of 8) = 0.291273 fft 1: mflops = 43.1577 (norm. = 0.640625), norm. avg. (of 8) = 0.270527 fft 2: mflops = 15.8135 (norm. = 0.234733), norm. avg. (of 8) = 0.167251 fft 3: mflops = 67.3681 (norm. = 1), norm. avg. (of 8) = 0.998913 fft 4: mflops = 67.3681 (norm. = 1), norm. avg. (of 8) = 0.961854 fft 5: mflops = 18.6628 (norm. = 0.277027), norm. avg. (of 8) = 0.162506 fft 6: mflops = 23.146 (norm. = 0.343575), norm. avg. (of 8) = 0.260118 fft 7: mflops = 6.81437 (norm. = 0.101151), norm. avg. (of 8) = 0.0820591 fft 8: mflops = 33.1451 (norm. = 0.492), norm. avg. (of 8) = 0.20897 fft 9: mflops = 28.5734 (norm. = 0.424138), norm. avg. (of 8) = 0.19649 fft 10: mflops = 2.10525 (norm. = 0.03125), norm. avg. (of 8) = 0.0293945 Benchmarking for array size = 108: 0. CWP (min N) (N=110): elapsed time t=1.81 s, 16384 iters, t-(init.)=1.73 s t(norm)=0.144739, mflops=34.545 1. 
CWP (best N) (N=112): elapsed time t=1.38 s, 16384 iters, t-(init.)=1.3 s t(norm)=0.108763, mflops=45.9715 2. FFTPACK (f2c): elapsed time t=1.06 s, 4096 iters, t-(init.)=1.04 s t(norm)=0.348042, mflops=14.3661 (err=4.0e-16) FFTW_MEASURE plan: (cost = 5.615234e-05) FFTW_TWIDDLE 3 FFTW_TWIDDLE 3 FFTW_NOTW 12 3. FFTW: elapsed time t=1.89 s, 32768 iters, t-(init.)=1.73 s t(norm)=0.0723693, mflops=69.0901 (err=3.7e-16) FFTW_ESTIMATE plan: (cost = 4.633200e+03) FFTW_TWIDDLE 9 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.04 s, 16384 iters, t-(init.)=0.96 s t(norm)=0.0803174, mflops=62.253 (err=3.7e-16) 5. Frigo-old: elapsed time t=1.51 s, 4096 iters, t-(init.)=1.49 s t(norm)=0.498637, mflops=10.0273 (err=5.7e-16) 6. GSL: elapsed time t=1.14 s, 8192 iters, t-(init.)=1.1 s t(norm)=0.184061, mflops=27.165 (err=4.0e-16) 7. NAPACK (f2c): elapsed time t=1.3 s, 4096 iters, t-(init.)=1.28 s t(norm)=0.428359, mflops=11.6724 (err=3.1e-15) 8. Singleton (f2c): elapsed time t=1.05 s, 8192 iters, t-(init.)=1.01 s t(norm)=0.169001, mflops=29.5856 (err=4.5e-16) 9. Temperton (f2c): elapsed time t=1.31 s, 8192 iters, t-(init.)=1.27 s t(norm)=0.212506, mflops=23.5287 (err=3.5e-16) 10. Valkenburg: elapsed time t=1.7 s, 1024 iters, t-(init.)=1.7 s t(norm)=2.27566, mflops=2.19717 (err=5.4e-16) Top mflops for N=108 = 69.0901 Normalized results and averages for N=108: fft 0: mflops = 34.545 (norm. = 0.5), norm. avg. (of 9) = 0.314465 fft 1: mflops = 45.9715 (norm. = 0.665385), norm. avg. (of 9) = 0.3144 fft 2: mflops = 14.3661 (norm. = 0.207933), norm. avg. (of 9) = 0.171771 fft 3: mflops = 69.0901 (norm. = 1), norm. avg. (of 9) = 0.999034 fft 4: mflops = 62.253 (norm. = 0.901042), norm. avg. (of 9) = 0.955097 fft 5: mflops = 10.0273 (norm. = 0.145134), norm. avg. (of 9) = 0.160576 fft 6: mflops = 27.165 (norm. = 0.393182), norm. avg. (of 9) = 0.274903 fft 7: mflops = 11.6724 (norm. = 0.168945), norm. avg. (of 9) = 0.0917132 fft 8: mflops = 29.5856 (norm. = 0.428218), norm. avg. 
(of 9) = 0.233331 fft 9: mflops = 23.5287 (norm. = 0.340551), norm. avg. (of 9) = 0.212497 fft 10: mflops = 2.19717 (norm. = 0.0318015), norm. avg. (of 9) = 0.029662 Benchmarking for array size = 210: 0. CWP (min N): elapsed time t=1.53 s, 8192 iters, t-(init.)=1.46 s t(norm)=0.110015, mflops=45.4485 1. CWP (best N): elapsed time t=1.51 s, 8192 iters, t-(init.)=1.44 s t(norm)=0.108508, mflops=46.0798 2. FFTPACK (f2c): elapsed time t=1.26 s, 1024 iters, t-(init.)=1.25 s t(norm)=0.753524, mflops=6.63549 (err=6.6e-16) FFTW_MEASURE plan: (cost = 1.611328e-04) FFTW_TWIDDLE 2 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.69 s, 8192 iters, t-(init.)=1.61 s t(norm)=0.121317, mflops=41.2142 (err=4.5e-16) FFTW_ESTIMATE plan: (cost = 9.324000e+03) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.66 s, 8192 iters, t-(init.)=1.59 s t(norm)=0.11981, mflops=41.7326 (err=4.8e-16) 5. Frigo-old: elapsed time t=1.68 s, 2048 iters, t-(init.)=1.66 s t(norm)=0.50034, mflops=9.9932 (err=6.2e-16) 6. GSL: elapsed time t=1.61 s, 4096 iters, t-(init.)=1.57 s t(norm)=0.236607, mflops=21.1321 (err=6.3e-16) 7. NAPACK (f2c): elapsed time t=1.41 s, 1024 iters, t-(init.)=1.41 s t(norm)=0.849975, mflops=5.88252 (err=1.5e-14) 8. Singleton (f2c): elapsed time t=1.64 s, 4096 iters, t-(init.)=1.61 s t(norm)=0.242635, mflops=20.6071 (err=6.4e-16) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1.22 s, 256 iters, t-(init.)=1.21 s t(norm)=2.91765, mflops=1.71371 (err=6.1e-16) Top mflops for N=210 = 46.0798 Normalized results and averages for N=210: fft 0: mflops = 45.4485 (norm. = 0.986301), norm. avg. (of 10) = 0.381648 fft 1: mflops = 46.0798 (norm. = 1), norm. avg. (of 10) = 0.38296 fft 2: mflops = 6.63549 (norm. = 0.144), norm. avg. (of 10) = 0.168994 fft 3: mflops = 41.2142 (norm. = 0.89441), norm. avg. (of 10) = 0.988571 fft 4: mflops = 41.7326 (norm. = 0.90566), norm. avg. (of 10) = 0.950154 fft 5: mflops = 9.9932 (norm. 
= 0.216867), norm. avg. (of 10) = 0.166205 fft 6: mflops = 21.1321 (norm. = 0.458599), norm. avg. (of 10) = 0.293273 fft 7: mflops = 5.88252 (norm. = 0.12766), norm. avg. (of 10) = 0.0953078 fft 8: mflops = 20.6071 (norm. = 0.447205), norm. avg. (of 10) = 0.254719 fft 9: mflops = -1 (norm. = -0.0217015), norm. avg. (of 9) = 0.212497 fft 10: mflops = 1.71371 (norm. = 0.0371901), norm. avg. (of 10) = 0.0304148 Benchmarking for array size = 504: 0. CWP (min N): elapsed time t=1.03 s, 2048 iters, t-(init.)=0.99 s t(norm)=0.106839, mflops=46.7994 1. CWP (best N): elapsed time t=1.03 s, 2048 iters, t-(init.)=0.99 s t(norm)=0.106839, mflops=46.7994 2. FFTPACK (f2c): elapsed time t=1.73 s, 512 iters, t-(init.)=1.72 s t(norm)=0.742477, mflops=6.73421 (err=1.4e-15) FFTW_MEASURE plan: (cost = 4.296875e-04) FFTW_TWIDDLE 7 FFTW_TWIDDLE 6 FFTW_NOTW 12 3. FFTW: elapsed time t=1.82 s, 4096 iters, t-(init.)=1.74 s t(norm)=0.0938888, mflops=53.2545 (err=1.2e-15) FFTW_ESTIMATE plan: (cost = 2.147040e+04) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.78 s, 4096 iters, t-(init.)=1.69 s t(norm)=0.0911909, mflops=54.83 (err=1.2e-15) 5. Frigo-old: elapsed time t=1 s, 512 iters, t-(init.)=0.99 s t(norm)=0.427356, mflops=11.6998 (err=1.4e-15) 6. GSL: elapsed time t=1.9 s, 2048 iters, t-(init.)=1.85 s t(norm)=0.199649, mflops=25.044 (err=1.3e-15) 7. NAPACK (f2c): elapsed time t=1.53 s, 512 iters, t-(init.)=1.52 s t(norm)=0.656143, mflops=7.62029 (err=4.1e-14) 8. Singleton (f2c): elapsed time t=1.72 s, 2048 iters, t-(init.)=1.68 s t(norm)=0.181303, mflops=27.5782 (err=1.9e-15) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1.55 s, 128 iters, t-(init.)=1.55 s t(norm)=2.67637, mflops=1.8682 (err=1.4e-15) Top mflops for N=504 = 54.83 Normalized results and averages for N=504: fft 0: mflops = 46.7994 (norm. = 0.853535), norm. avg. (of 11) = 0.424547 fft 1: mflops = 46.7994 (norm. = 0.853535), norm. avg. 
(of 11) = 0.425739 fft 2: mflops = 6.73421 (norm. = 0.12282), norm. avg. (of 11) = 0.164796 fft 3: mflops = 53.2545 (norm. = 0.971264), norm. avg. (of 11) = 0.986998 fft 4: mflops = 54.83 (norm. = 1), norm. avg. (of 11) = 0.954685 fft 5: mflops = 11.6998 (norm. = 0.213384), norm. avg. (of 11) = 0.170494 fft 6: mflops = 25.044 (norm. = 0.456757), norm. avg. (of 11) = 0.308135 fft 7: mflops = 7.62029 (norm. = 0.13898), norm. avg. (of 11) = 0.099278 fft 8: mflops = 27.5782 (norm. = 0.502976), norm. avg. (of 11) = 0.277288 fft 9: mflops = -1 (norm. = -0.0182382), norm. avg. (of 9) = 0.212497 fft 10: mflops = 1.8682 (norm. = 0.0340726), norm. avg. (of 11) = 0.0307473 Benchmarking for array size = 1000: 0. CWP (min N) (N=1001): elapsed time t=1.46 s, 1024 iters, t-(init.)=1.41 s t(norm)=0.138168, mflops=36.1878 1. CWP (best N) (N=1008): elapsed time t=1.04 s, 1024 iters, t-(init.)=1 s t(norm)=0.0979915, mflops=51.0248 2. FFTPACK (f2c): elapsed time t=1.8 s, 512 iters, t-(init.)=1.78 s t(norm)=0.34885, mflops=14.3328 (err=1.1e-15) FFTW_MEASURE plan: (cost = 1.015625e-03) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 3. FFTW: elapsed time t=1.05 s, 1024 iters, t-(init.)=1.01 s t(norm)=0.0989715, mflops=50.5196 (err=9.8e-16) FFTW_ESTIMATE plan: (cost = 5.220000e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 4. FFTW_ESTIMATE: elapsed time t=1.05 s, 1024 iters, t-(init.)=1.01 s t(norm)=0.0989715, mflops=50.5196 (err=9.8e-16) 5. Frigo-old: elapsed time t=1.01 s, 256 iters, t-(init.)=1 s t(norm)=0.391966, mflops=12.7562 (err=1.0e-15) 6. GSL: elapsed time t=1.26 s, 512 iters, t-(init.)=1.24 s t(norm)=0.243019, mflops=20.5745 (err=1.0e-15) 7. NAPACK (f2c): elapsed time t=1 s, 128 iters, t-(init.)=1 s t(norm)=0.783932, mflops=6.3781 (err=1.7e-14) 8. Singleton (f2c): elapsed time t=1.59 s, 1024 iters, t-(init.)=1.54 s t(norm)=0.150907, mflops=33.133 (err=1.5e-15) 9. 
Temperton (f2c): elapsed time t=1.78 s, 1024 iters, t-(init.)=1.74 s t(norm)=0.170505, mflops=29.3246 (err=9.9e-16) 10. Valkenburg: elapsed time t=1.78 s, 64 iters, t-(init.)=1.78 s t(norm)=2.7908, mflops=1.7916 (err=1.0e-15) Top mflops for N=1000 = 51.0248 Normalized results and averages for N=1000: fft 0: mflops = 36.1878 (norm. = 0.70922), norm. avg. (of 12) = 0.44827 fft 1: mflops = 51.0248 (norm. = 1), norm. avg. (of 12) = 0.473594 fft 2: mflops = 14.3328 (norm. = 0.280899), norm. avg. (of 12) = 0.174471 fft 3: mflops = 50.5196 (norm. = 0.990099), norm. avg. (of 12) = 0.987256 fft 4: mflops = 50.5196 (norm. = 0.990099), norm. avg. (of 12) = 0.957636 fft 5: mflops = 12.7562 (norm. = 0.25), norm. avg. (of 12) = 0.177119 fft 6: mflops = 20.5745 (norm. = 0.403226), norm. avg. (of 12) = 0.316059 fft 7: mflops = 6.3781 (norm. = 0.125), norm. avg. (of 12) = 0.101422 fft 8: mflops = 33.133 (norm. = 0.649351), norm. avg. (of 12) = 0.308293 fft 9: mflops = 29.3246 (norm. = 0.574713), norm. avg. (of 10) = 0.248718 fft 10: mflops = 1.7916 (norm. = 0.0351124), norm. avg. (of 12) = 0.0311111 Benchmarking for array size = 1960: 0. CWP (min N) (N=1980): elapsed time t=1.33 s, 512 iters, t-(init.)=1.28 s t(norm)=0.116627, mflops=42.8716 1. CWP (best N) (N=1980): elapsed time t=1.34 s, 512 iters, t-(init.)=1.3 s t(norm)=0.11845, mflops=42.2121 2. FFTPACK (f2c): elapsed time t=1.4 s, 64 iters, t-(init.)=1.4 s t(norm)=1.02049, mflops=4.89961 (err=2.8e-15) FFTW_MEASURE plan: (cost = 2.734375e-03) FFTW_TWIDDLE 10 FFTW_TWIDDLE 4 FFTW_TWIDDLE 7 FFTW_NOTW 7 3. FFTW: elapsed time t=1.34 s, 512 iters, t-(init.)=1.3 s t(norm)=0.11845, mflops=42.2121 (err=2.7e-15) FFTW_ESTIMATE plan: (cost = 9.662800e+04) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.37 s, 512 iters, t-(init.)=1.32 s t(norm)=0.120272, mflops=41.5725 (err=2.8e-15) 5. 
Frigo-old: elapsed time t=1.15 s, 128 iters, t-(init.)=1.14 s t(norm)=0.415485, mflops=12.0341 (err=2.8e-15) 6. GSL: elapsed time t=1.26 s, 256 iters, t-(init.)=1.24 s t(norm)=0.225965, mflops=22.1273 (err=2.8e-15) 7. NAPACK (f2c): elapsed time t=1.24 s, 64 iters, t-(init.)=1.24 s t(norm)=0.903861, mflops=5.53182 (err=1.3e-13) 8. Singleton (f2c): elapsed time t=1.14 s, 256 iters, t-(init.)=1.11 s t(norm)=0.202275, mflops=24.7188 (err=4.3e-15) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=1.07 s, 16 iters, t-(init.)=1.07 s t(norm)=3.11978, mflops=1.60268 (err=2.7e-15) Top mflops for N=1960 = 42.8716 Normalized results and averages for N=1960: fft 0: mflops = 42.8716 (norm. = 1), norm. avg. (of 13) = 0.490711 fft 1: mflops = 42.2121 (norm. = 0.984615), norm. avg. (of 13) = 0.512904 fft 2: mflops = 4.89961 (norm. = 0.114286), norm. avg. (of 13) = 0.169842 fft 3: mflops = 42.2121 (norm. = 0.984615), norm. avg. (of 13) = 0.987053 fft 4: mflops = 41.5725 (norm. = 0.969697), norm. avg. (of 13) = 0.958564 fft 5: mflops = 12.0341 (norm. = 0.280702), norm. avg. (of 13) = 0.185087 fft 6: mflops = 22.1273 (norm. = 0.516129), norm. avg. (of 13) = 0.331449 fft 7: mflops = 5.53182 (norm. = 0.129032) FFTW_MEASURE plan: (cost = 5.000000e-02) FFTW_TWIDDLE 5 FFTW_TWIDDLE 10 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 15 3. FFTW: elapsed time t=1.71 s, 32 iters, t-(init.)=1.65 s t(norm)=0.129731, mflops=38.5414 (err=5.6e-15) FFTW_ESTIMATE plan: (cost = 1.231200e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.95 s, 32 iters, t-(init.)=1.88 s t(norm)=0.147814, mflops=33.8262 (err=5.6e-15) 5. Frigo-old: elapsed time t=1.85 s, 8 iters, t-(init.)=1.84 s t(norm)=0.578677, mflops=8.64039 (err=5.7e-15) 6. GSL: elapsed time t=1.52 s, 16 iters, t-(init.)=1.48 s t(norm)=0.232729, mflops=21.4842 (err=5.5e-15) 7. 
NAPACK (f2c): elapsed time t=1.26 s, 4 iters, t-(init.)=1.25 s t(norm)=0.786246, mflops=6.35933 (err=1.1e-12) 8. Singleton (f2c): elapsed time t=1.47 s, 16 iters, t-(init.)=1.44 s t(norm)=0.226439, mflops=22.081 (err=7.6e-15) 9. Temperton (f2c): elapsed time t=1.78 s, 16 iters, t-(init.)=1.74 s t(norm)=0.273614, mflops=18.2739 (err=5.7e-15) 10. Valkenburg: elapsed time t=1.07 s, 1 iters, t-(init.)=1.07 s t(norm)=2.69211, mflops=1.85728 (err=5.5e-15) Top mflops for N=27000 = 38.5414 Normalized results and averages for N=27000: fft 0: mflops = 37.8532 (norm. = 0.982143), norm. avg. (of 16) = 0.552168 fft 1: mflops = 37.8532 (norm. = 0.982143), norm. avg. (of 16) = 0.589331 fft 2: mflops = 12.6177 (norm. = 0.327381), norm. avg. (of 16) = 0.186605 fft 3: mflops = 38.5414 (norm. = 1), norm. avg. (of 16) = 0.992629 fft 4: mflops = 33.8262 (norm. = 0.87766), norm. avg. (of 16) = 0.917271 fft 5: mflops = 8.64039 (norm. = 0.224185), norm. avg. (of 16) = 0.191637 fft 6: mflops = 21.4842 (norm. = 0.557432), norm. avg. (of 16) = 0.360666 fft 7: mflops = 6.35933 (norm. = 0.165), norm. avg. (of 16) = 0.119707 fft 8: mflops = 22.081 (norm. = 0.572917), norm. avg. (of 16) = 0.374562 fft 9: mflops = 18.2739 (norm. = 0.474138), norm. avg. (of 12) = 0.285026 fft 10: mflops = 1.85728 (norm. = 0.0481893), norm. avg. (of 16) = 0.0341492 Benchmarking for array size = 75600: 0. CWP (min N) (N=80080): elapsed time t=1.54 s, 8 iters, t-(init.)=1.48 s t(norm)=0.150998, mflops=33.113 1. CWP (best N) (N=80080): elapsed time t=1.55 s, 8 iters, t-(init.)=1.49 s t(norm)=0.152018, mflops=32.8908 2. FFTPACK (f2c): elapsed time t=1.57 s, 2 iters, t-(init.)=1.56 s t(norm)=0.636641, mflops=7.85372 (err=1.1e-14) FFTW_MEASURE plan: (cost = 1.800000e-01) FFTW_TWIDDLE 8 FFTW_TWIDDLE 10 FFTW_TWIDDLE 3 FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_NOTW 15 3. 
FFTW: elapsed time t=1.48 s, 8 iters, t-(init.)=1.42 s t(norm)=0.144877, mflops=34.5121 (err=1.1e-14) FFTW_ESTIMATE plan: (cost = 2.971080e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.73 s, 8 iters, t-(init.)=1.67 s t(norm)=0.170383, mflops=29.3457 (err=1.1e-14) 5. Frigo-old: elapsed time t=1.57 s, 2 iters, t-(init.)=1.56 s t(norm)=0.636641, mflops=7.85372 (err=1.1e-14) 6. GSL: elapsed time t=1.27 s, 4 iters, t-(init.)=1.24 s t(norm)=0.253024, mflops=19.761 (err=1.1e-14) 7. NAPACK (f2c): elapsed time t=1.99 s, 2 iters, t-(init.)=1.98 s t(norm)=0.808044, mflops=6.18778 (err=5.1e-12) 8. Singleton (f2c): elapsed time t=1.72 s, 4 iters, t-(init.)=1.69 s t(norm)=0.344847, mflops=14.4992 (err=1.5e-14) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=3.72 s, 1 iters, t-(init.)=3.72 s t(norm)=3.03629, mflops=1.64675 (err=1.1e-14) Top mflops for N=75600 = 34.5121 Normalized results and averages for N=75600: fft 0: mflops = 33.113 (norm. = 0.959459), norm. avg. (of 17) = 0.576126 fft 1: mflops = 32.8908 (norm. = 0.95302), norm. avg. (of 17) = 0.610724 fft 2: mflops = 7.85372 (norm. = 0.227564), norm. avg. (of 17) = 0.189015 fft 3: mflops = 34.5121 (norm. = 1), norm. avg. (of 17) = 0.993063 fft 4: mflops = 29.3457 (norm. = 0.850299), norm. avg. (of 17) = 0.913332 fft 5: mflops = 7.85372 (norm. = 0.227564), norm. avg. (of 17) = 0.193751 fft 6: mflops = 19.761 (norm. = 0.572581), norm. avg. (of 17) = 0.373132 fft 7: mflops = 6.18778 (norm. = 0.179293), norm. avg. (of 17) = 0.123212 fft 8: mflops = 14.4992 (norm. = 0.420118), norm. avg. (of 17) = 0.377242 fft 9: mflops = -1 (norm. = -0.0289753), norm. avg. (of 12) = 0.285026 fft 10: mflops = 1.64675 (norm. = 0.0477151), norm. avg. (of 17) = 0.0349472 Benchmarking for array size = 165375: 0. CWP (min N) (N=180180): elapsed time t=2.03 s, 4 iters, t-(init.)=1.96 s t(norm)=0.17092, mflops=29.2535 1. 
CWP (best N) (N=180180): elapsed time t=1.01 s, 2 iters, t-(init.)=0.97 s t(norm)=0.169176, mflops=29.555 2. FFTPACK (f2c): elapsed time t=2.75 s, 1 iters, t-(init.)=2.74 s t(norm)=0.955757, mflops=5.23146 (err=2.7e-14) FFTW_MEASURE plan: (cost = 5.300000e-01) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 3 FFTW_TWIDDLE 5 FFTW_TWIDDLE 3 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.1 s, 2 iters, t-(init.)=1.07 s t(norm)=0.186617, mflops=26.7929 (err=2.7e-14) FFTW_ESTIMATE plan: (cost = 8.367975e+06) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.23 s, 2 iters, t-(init.)=1.2 s t(norm)=0.20929, mflops=23.8903 (err=2.7e-14) 5. Frigo-old: elapsed time t=2.48 s, 1 iters, t-(init.)=2.46 s t(norm)=0.858088, mflops=5.82691 (err=2.7e-14) 6. GSL: elapsed time t=1.68 s, 2 iters, t-(init.)=1.65 s t(norm)=0.287773, mflops=17.3748 (err=2.7e-14) 7. NAPACK (f2c): elapsed time t=2.83 s, 1 iters, t-(init.)=2.81 s t(norm)=0.980174, mflops=5.10114 (err=1.6e-11) 8. Singleton (f2c): elapsed time t=1.14 s, 1 iters, t-(init.)=1.12 s t(norm)=0.390674, mflops=12.7984 (err=4.0e-14) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=9.66 s, 1 iters, t-(init.)=9.64 s t(norm)=3.36259, mflops=1.48695 (err=2.7e-14) Top mflops for N=165375 = 29.555 Normalized results and averages for N=165375: fft 0: mflops = 29.2535 (norm. = 0.989796), norm. avg. (of 18) = 0.599108 fft 1: mflops = 29.555 (norm. = 1), norm. avg. (of 18) = 0.632351 fft 2: mflops = 5.23146 (norm. = 0.177007), norm. avg. (of 18) = 0.188348 fft 3: mflops = 26.7929 (norm. = 0.906542), norm. avg. (of 18) = 0.988256 fft 4: mflops = 23.8903 (norm. = 0.808333), norm. avg. (of 18) = 0.907498 fft 5: mflops = 5.82691 (norm. = 0.197154), norm. avg. (of 18) = 0.19394 fft 6: mflops = 17.3748 (norm. = 0.587879), norm. avg. (of 18) = 0.385062 fft 7: mflops = 5.10114 (norm. = 0.172598), norm. avg. 
(of 18) = 0.125956 fft 8: mflops = 12.7984 (norm. = 0.433036), norm. avg. (of 18) = 0.380342 fft 9: mflops = -1 (norm. = -0.0338352), norm. avg. (of 12) = 0.285026 fft 10: mflops = 1.48695 (norm. = 0.0503112), norm. avg. (of 18) = 0.0358008 Benchmarking for array size = 362880: 0. CWP (min N) (N=720720): elapsed time t=2.12 s, 1 iters, t-(init.)=2.05 s t(norm)=0.305875, mflops=16.3465 1. CWP (best N) (N=720720): elapsed time t=2.11 s, 1 iters, t-(init.)=2.04 s t(norm)=0.304383, mflops=16.4267 2. FFTPACK (f2c): elapsed time t=4.33 s, 1 iters, t-(init.)=4.29 s t(norm)=0.6401, mflops=7.81128 (err=1.1e-13) FFTW_MEASURE plan: (cost = 9.800000e-01) FFTW_TWIDDLE 32 FFTW_TWIDDLE 7 FFTW_TWIDDLE 3 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 15 3. FFTW: elapsed time t=1.01 s, 1 iters, t-(init.)=0.97 s t(norm)=0.144731, mflops=34.5468 (err=1.1e-13) FFTW_ESTIMATE plan: (cost = 7.511616e+06) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 4. FFTW_ESTIMATE: elapsed time t=1.3 s, 1 iters, t-(init.)=1.26 s t(norm)=0.188001, mflops=26.5956 (err=1.1e-13) 5. Frigo-old: elapsed time t=4.52 s, 1 iters, t-(init.)=4.48 s t(norm)=0.668449, mflops=7.48 (err=1.1e-13) 6. GSL: elapsed time t=1.7 s, 1 iters, t-(init.)=1.66 s t(norm)=0.247684, mflops=20.187 (err=1.1e-13) 7. NAPACK (f2c): elapsed time t=5.03 s, 1 iters, t-(init.)=4.99 s t(norm)=0.744545, mflops=6.71551 (err=3.4e-11) 8. Singleton (f2c): elapsed time t=2.78 s, 1 iters, t-(init.)=2.75 s t(norm)=0.41032, mflops=12.1856 (err=1.6e-13) 9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=19.29 s, 1 iters, t-(init.)=19.26 s t(norm)=2.87374, mflops=1.7399 (err=1.1e-13) Top mflops for N=362880 = 34.5468 Normalized results and averages for N=362880: fft 0: mflops = 16.3465 (norm. = 0.473171), norm. avg. (of 19) = 0.59248 fft 1: mflops = 16.4267 (norm. = 0.47549), norm. avg. (of 19) = 0.624095 fft 2: mflops = 7.81128 (norm. = 0.226107), norm. avg. 
(of 19) = 0.190335 fft 3: mflops = 34.5468 (norm. = 1), norm. avg. (of 19) = 0.988874 fft 4: mflops = 26.5956 (norm. = 0.769841), norm. avg. (of 19) = 0.900253 fft 5: mflops = 7.48 (norm. = 0.216518), norm. avg. (of 19) = 0.195128 fft 6: mflops = 20.187 (norm. = 0.584337), norm. avg. (of 19) = 0.39555 fft 7: mflops = 6.71551 (norm. = 0.194389), norm. avg. (of 19) = 0.129558 fft 8: mflops = 12.1856 (norm. = 0.352727), norm. avg. (of 19) = 0.378888 fft 9: mflops = -1 (norm. = -0.0289462), norm. avg. (of 12) = 0.285026 fft 10: mflops = 1.7399 (norm. = 0.0503634), norm. avg. (of 19) = 0.0365672 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) Maximum array size N = 2097152 Benchmarking FFTs: 0. FFTW 1. HARM (f2c) 2. PDA (f2c) 3. Singleton (f2c) 4. Temperton (f2c) Computing normalized averages (5 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.66 s, 65536 iters, t-(init.)=1.46 s t(norm)=0.0580152, mflops=86.1843 (err=1.9e-16) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. PDA (f2c): elapsed time t=1.46 s, 4096 iters, t-(init.)=1.46 s t(norm)=0.928243, mflops=5.38652 (err=2.8e-16) 3. Singleton (f2c): elapsed time t=1.64 s, 32768 iters, t-(init.)=1.54 s t(norm)=0.122388, mflops=40.8536 (err=1.9e-16) 4. Temperton (f2c): elapsed time t=1.51 s, 16384 iters, t-(init.)=1.47 s t(norm)=0.23365, mflops=21.3995 (err=1.9e-16) Top mflops for N=64 = 86.1843 Normalized results and averages for N=64: fft 0: mflops = 86.1843 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.011603), norm. avg. (of 0) = -1 fft 2: mflops = 5.38652 (norm. = 0.0625), norm. avg. (of 1) = 0.0625 fft 3: mflops = 40.8536 (norm. = 0.474026), norm. avg. (of 1) = 0.474026 fft 4: mflops = 21.3995 (norm. 
= 0.248299), norm. avg. (of 1) = 0.248299 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.74 s, 8192 iters, t-(init.)=1.56 s t(norm)=0.0413259, mflops=120.99 (err=3.6e-16) 1. HARM (f2c): elapsed time t=1.59 s, 2048 iters, t-(init.)=1.55 s t(norm)=0.164244, mflops=30.4425 (err=4.0e-16) 2. PDA (f2c): elapsed time t=1.51 s, 512 iters, t-(init.)=1.5 s t(norm)=0.635783, mflops=7.86432 (err=3.0e-16) 3. Singleton (f2c): elapsed time t=1.12 s, 2048 iters, t-(init.)=1.08 s t(norm)=0.114441, mflops=43.6907 (err=3.5e-16) 4. Temperton (f2c): elapsed time t=1.08 s, 2048 iters, t-(init.)=1.04 s t(norm)=0.110202, mflops=45.3711 (err=3.3e-16) Top mflops for N=512 = 120.99 Normalized results and averages for N=512: fft 0: mflops = 120.99 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 30.4425 (norm. = 0.251613), norm. avg. (of 1) = 0.251613 fft 2: mflops = 7.86432 (norm. = 0.065), norm. avg. (of 2) = 0.06375 fft 3: mflops = 43.6907 (norm. = 0.361111), norm. avg. (of 2) = 0.417569 fft 4: mflops = 45.3711 (norm. = 0.375), norm. avg. (of 2) = 0.31165 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.46 s, 512 iters, t-(init.)=1.36 s t(norm)=0.0540415, mflops=92.5214 (err=4.2e-16) 1. HARM (f2c): elapsed time t=1.02 s, 128 iters, t-(init.)=1 s t(norm)=0.158946, mflops=31.4573 (err=4.0e-16) 2. PDA (f2c): elapsed time t=1.41 s, 64 iters, t-(init.)=1.4 s t(norm)=0.445048, mflops=11.2347 (err=4.0e-16) 3. Singleton (f2c): elapsed time t=1.81 s, 256 iters, t-(init.)=1.75 s t(norm)=0.139078, mflops=35.9512 (err=4.1e-16) 4. Temperton (f2c): elapsed time t=1.37 s, 256 iters, t-(init.)=1.32 s t(norm)=0.104904, mflops=47.6625 (err=4.6e-16) Top mflops for N=4096 = 92.5214 Normalized results and averages for N=4096: fft 0: mflops = 92.5214 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 31.4573 (norm. = 0.34), norm. avg. (of 2) = 0.295806 fft 2: mflops = 11.2347 (norm. = 0.121429), norm. avg. 
(of 3) = 0.0829762 fft 3: mflops = 35.9512 (norm. = 0.388571), norm. avg. (of 3) = 0.407903 fft 4: mflops = 47.6625 (norm. = 0.515152), norm. avg. (of 3) = 0.379484 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.3 s, 32 iters, t-(init.)=1.23 s t(norm)=0.0782013, mflops=63.9376 (err=5.2e-16) 1. HARM (f2c): elapsed time t=1.5 s, 16 iters, t-(init.)=1.46 s t(norm)=0.185649, mflops=26.9326 (err=5.2e-16) 2. PDA (f2c): elapsed time t=1.16 s, 4 iters, t-(init.)=1.15 s t(norm)=0.58492, mflops=8.54817 (err=4.2e-16) 3. Singleton (f2c): elapsed time t=1.02 s, 8 iters, t-(init.)=1 s t(norm)=0.254313, mflops=19.6608 (err=5.3e-16) 4. Temperton (f2c): elapsed time t=1.22 s, 16 iters, t-(init.)=1.19 s t(norm)=0.151316, mflops=33.0434 (err=4.7e-16) Top mflops for N=32768 = 63.9376 Normalized results and averages for N=32768: fft 0: mflops = 63.9376 (norm. = 1), norm9. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 10. Valkenburg: elapsed time t=19.22 s, 1 iters, t-(init.)=19.19 s t(norm)=2.86329, mflops=1.74624 (err=1.1e-13) Top mflops for N=362880 = 36.0327 Normalized results and averages for N=362880: fft 0: mflops = 16.5076 (norm. = 0.458128), norm. avg. (of 19) = 0.588121 fft 1: mflops = 16.5076 (norm. = 0.458128), norm. avg. (of 19) = 0.620486 fft 2: mflops = 7.92208 (norm. = 0.219858), norm. avg. (of 19) = 0.188878 fft 3: mflops = 36.0327 (norm. = 1), norm. avg. (of 19) = 0.985913 fft 4: mflops = 32.2215 (norm. = 0.894231), norm. avg. (of 19) = 0.953897 fft 5: mflops = 7.53043 (norm. = 0.208989), norm. avg. (of 19) = 0.194185 fft 6: mflops = 21.6196 (norm. = 0.6), norm. avg. (of 19) = 0.40107 fft 7: mflops = 6.81106 (norm. = 0.189024), norm. avg. (of 19) = 0.126404 fft 8: mflops = 12.0541 (norm. = 0.334532), norm. avg. (of 19) = 0.372594 fft 9: mflops = -1 (norm. = -0.0277526), norm. avg. (of 12) = 0.274944 fft 10: mflops = 1.74624 (norm. = 0.0484627), norm. avg. 
(of 19) = 0.0361008 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) Maximum array size N = 2097152 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.43 s, 1 iters, t-(init.)=1.38 s t(norm)=0.138534, mflops=36.0923 (err=1.2e-15) 1. HARM (f2c): elapsed time t=2.19 s, 1 iters, t-(init.)=2.14 s t(norm)=0.214828, mflops=23.2745 (err=1.2e-15) 2. PDA (f2c): elapsed time t=5.15 s, 1 iters, t-(init.)=5.1 s t(norm)=0.511973, mflops=9.76615 (err=1.2e-15) 3. Singleton (f2c): elapsed time t=4.06 s, 1 iters, t-(init.)=4.01 s t(norm)=0.402551, mflops=12.4208 (err=1.7e-15) 4. Temperton (f2c): elapsed time t=1.93 s, 1 iters, t-(init.)=1.88 s t(norm)=0.188727, mflops=26.4933 (err=1.3e-15) Top mflops for N=524288 = 36.0923 Normalized results and averages for N=524288: fft 0: mflops = 36.0923 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 23.2745 (norm. = 0.64486), norm. avg. (of 5) = 0.459802 fft 2: mflops = 9.76615 (norm. = 0.270588), norm. avg. (of 6) = 0.149502 fft 3: mflops = 12.4208 (norm. = 0.34414), norm. avg. (of 6) = 0.369729 fft 4: mflops = 26.4933 (norm. = 0.734043), norm. avg. (of 6) = 0.515981 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=2.78 s, 1 iters, t-(init.)=2.67 s t(norm)=0.127316, mflops=39.2725 (err=2.0e-15) 1. HARM (f2c): elapsed time t=4.47 s, 1 iters, t-(init.)=4.36 s t(norm)=0.207901, mflops=24.0499 (err=1.9e-15) 2. PDA (f2c): elapsed time t=10.75 s, 1 iters, t-(init.)=10.65 s t(norm)=0.507832, mflops=9.84578 (err=2.0e-15) 3. Singleton (f2c): elapsed time t=7.22 s, 1 iters, t-(init.)=7.12 s t(norm)=0.339508, mflops=14.7272 (err=2.8e-15) 4. Skipping fft (Temperton can't handle dimensions > 256).
Top mflops for N=1048576 = 39.2725 Normalized results and averages for N=1048576: fft 0: mflops = 39.2725 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 24.0499 (norm. = 0.612385), norm. avg. (of 6) = 0.485233 fft 2: mflops = 9.84578 (norm. = 0.250704), norm. avg. (of 7) = 0.16396 fft 3: mflops = 14.7272 (norm. = 0.375), norm. avg. (of 7) = 0.370482 fft 4: mflops = -1 (norm. = -0.0254631), norm. avg. (of 6) = 0.515981 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=6.84 s, 1 iters, t-(init.)=6.63 s t(norm)=0.150544, mflops=33.2128 (err=7.2e-16) 1. HARM (f2c): elapsed time t=9.56 s, 1 iters, t-(init.)=9.36 s t(norm)=0.212533, mflops=23.5257 (err=6.9e-16) 2. PDA (f2c): elapsed time t=21.94 s, 1 iters, t-(init.)=21.73 s t(norm)=0.493413, mflops=10.1335 (err=7.1e-16) 3. Singleton (f2c): elapsed time t=22.94 s, 1 iters, t-(init.)=22.73 s t(norm)=0.516119, mflops=9.68768 (err=8.4e-16) 4. Temperton (f2c): elapsed time t=10.09 s, 1 iters, t-(init.)=9.88 s t(norm)=0.224341, mflops=22.2875 (err=7.4e-16) Top mflops for N=2097152 = 33.2128 Normalized results and averages for N=2097152: fft 0: mflops = 33.2128 (norm. = 1), norm. avg. (of 8) = 1 fft 1: mflops = 23.5257 (norm. = 0.708333), norm. avg. (of 7) = 0.517104 fft 2: mflops = 10.1335 (norm. = 0.305108), norm. avg. (of 8) = 0.181603 fft 3: mflops = 9.68768 (norm. = 0.291685), norm. avg. (of 8) = 0.360632 fft 4: mflops = 22.2875 (norm. = 0.671053), norm. avg. 
(of 7) = 0.538134 ------------------------------------------------------ @@@@ bench.3d.np2.log Benchmarking for sizes: 5x5x5 (0.0022583 MB) 6x6x6 (0.00369263 MB) 7x7x7 (0.00567627 MB) 9x9x9 (0.0116577 MB) 10x10x10 (0.0158386 MB) 11x11x11 (0.0209351 MB) 12x12x12 (0.0270386 MB) 13x13x13 (0.0342407 MB) 14x14x14 (0.0426331 MB) 15x15x15 (0.0523071 MB) 24x25x28 (0.257751 MB) 48x48x48 (1.68982 MB) 49x49x49 (1.79755 MB) 60x60x60 (3.29877 MB) 72x60x56 (3.69482 MB) 75x75x75 (6.44086 MB) 80x80x80 (7.81628 MB) 84x84x84 (9.04791 MB) 96x96x96 (13.5045 MB) 105x105x105 (17.6689 MB) 112x112x112 (21.4427 MB) 120x120x120 (26.3728 MB) 144x144x144 (45.5692 MB) Maximum array size N = 2985984 Benchmarking FFTs: 0. FFTW 1. PDA (f2c) 2. Singleton (f2c) 3. Temperton (f2c) Computing normalized averages (4 transforms). Benchmarking for array size = 5x5x5: 0. FFTW: elapsed time t=1.49 s, 16384 iters, t-(init.)=1.4 s t(norm)=0.0981359, mflops=50.9497 (err=3.0e-16) 1. PDA (f2c): elapsed time t=1.38 s, 2048 iters, t-(init.)=1.37 s t(norm)=0.768264, mflops=6.50818 (err=2.3e-16) 2. Singleton (f2c): elapsed time t=1.54 s, 16384 iters, t-(init.)=1.45 s t(norm)=0.101641, mflops=49.1928 (err=3.1e-16) 3. Temperton (f2c): elapsed time t=1.22 s, 8192 iters, t-(init.)=1.17 s t(norm)=0.164027, mflops=30.4827 (err=2.4e-16) Top mflops for N=125 = 50.9497 Normalized results and averages for N=125: fft 0: mflops = 50.9497 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = 6.50818 (norm. = 0.127737), norm. avg. (of 1) = 0.127737 fft 2: mflops = 49.1928 (norm. = 0.965517), norm. avg. (of 1) = 0.965517 fft 3: mflops = 30.4827 (norm. = 0.598291), norm. avg. (of 1) = 0.598291 Benchmarking for array size = 6x6x6: 0. FFTW: elapsed time t=1.6 s, 16384 iters, t-(init.)=1.44 s t(norm)=0.0524703, mflops=95.2921 (err=2.9e-16) 1. PDA (f2c): elapsed time t=1.42 s, 1024 iters, t-(init.)=1.41 s t(norm)=0.822034, mflops=6.08247 (err=3.6e-16) 2. 
Singleton (f2c): elapsed time t=1.18 s, 4096 iters, t-(init.)=1.14 s t(norm)=0.166156, mflops=30.0922 (err=2.9e-16) 3. Temperton (f2c): elapsed time t=1.2 s, 4096 iters, t-(init.)=1.16 s t(norm)=0.169071, mflops=29.5734 (err=3.1e-16) Top mflops for N=216 = 95.2921 Normalized results and averages for N=216: fft 0: mflops = 95.2921 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 6.08247 (norm. = 0.0638298), norm. avg. (of 2) = 0.0957835 fft 2: mflops = 30.0922 (norm. = 0.315789), norm. avg. (of 2) = 0.640653 fft 3: mflops = 29.5734 (norm. = 0.310345), norm. avg. (of 2) = 0.454318 Benchmarking for array size = 7x7x7: 0. FFTW: elapsed time t=1.38 s, 4096 iters, t-(init.)=1.32 s t(norm)=0.111558, mflops=44.8197 (err=3.8e-16) 1. PDA (f2c): elapsed time t=1.51 s, 256 iters, t-(init.)=1.51 s t(norm)=2.04185, mflops=2.44876 (err=4.3e-16) 2. Singleton (f2c): elapsed time t=1.94 s, 4096 iters, t-(init.)=1.88 s t(norm)=0.158886, mflops=31.4691 (err=5.8e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=343 = 44.8197 Normalized results and averages for N=343: fft 0: mflops = 44.8197 (norm. = 1), norm. avg. (of 3) = 1 fft 1: mflops = 2.44876 (norm. = 0.0546358), norm. avg. (of 3) = 0.0820676 fft 2: mflops = 31.4691 (norm. = 0.702128), norm. avg. (of 3) = 0.661145 fft 3: mflops = -1 (norm. = -0.0223116), norm. avg. (of 2) = 0.454318 Benchmarking for array size = 9x9x9: 0. FFTW: elapsed time t=1.12 s, 2048 iters, t-(init.)=1.05 s t(norm)=0.073954, mflops=67.6096 (err=5.3e-16) 1. PDA (f2c): elapsed time t=1.11 s, 256 iters, t-(init.)=1.1 s t(norm)=0.619805, mflops=8.06706 (err=4.9e-16) 2. Singleton (f2c): elapsed time t=1.69 s, 2048 iters, t-(init.)=1.62 s t(norm)=0.1141, mflops=43.821 (err=4.5e-16) 3. Temperton (f2c): elapsed time t=1.98 s, 2048 iters, t-(init.)=1.92 s t(norm)=0.13523, mflops=36.974 (err=5.1e-16) Top mflops for N=729 = 67.6096 Normalized results and averages for N=729: fft 0: mflops = 67.6096 (norm. = 1), norm. avg. 
(of 4) = 1 fft 1: mflops = 8.06706 (norm. = 0.119318), norm. avg. (of 4) = 0.0913802 fft 2: mflops = 43.821 (norm. = 0.648148), norm. avg. (of 4) = 0.657896 fft 3: mflops = 36.974 (norm. = 0.546875), norm. avg. (of 3) = 0.48517 Benchmarking for array size = 10x10x10: 0. FFTW: elapsed time t=1.65 s, 2048 iters, t-(init.)=1.56 s t(norm)=0.0764334, mflops=65.4164 (err=4.0e-16) 1. PDA (f2c): elapsed time t=1.53 s, 256 iters, t-(init.)=1.52 s t(norm)=0.595789, mflops=8.39224 (err=4.3e-16) 2. Singleton (f2c): elapsed time t=1.62 s, 1024 iters, t-(init.)=1.58 s t(norm)=0.154827, mflops=32.2942 (err=4.6e-16) 3. Temperton (f2c): elapsed time t=1.33 s, 1024 iters, t-(init.)=1.29 s t(norm)=0.126409, mflops=39.5541 (err=3.4e-16) Top mflops for N=1000 = 65.4164 Normalized results and averages for N=1000: fft 0: mflops = 65.4164 (norm. = 1), norm. avg. (of 5) = 1 fft 1: mflops = 8.39224 (norm. = 0.128289), norm. avg. (of 5) = 0.0987621 fft 2: mflops = 32.2942 (norm. = 0.493671), norm. avg. (of 5) = 0.625051 fft 3: mflops = 39.5541 (norm. = 0.604651), norm. avg. (of 4) = 0.51504 Benchmarking for array size = 11x11x11: 0. FFTW: elapsed time t=1.19 s, 512 iters, t-(init.)=1.16 s t(norm)=0.164015, mflops=30.485 (err=4.3e-16) 1. PDA (f2c): elapsed time t=1.95 s, 64 iters, t-(init.)=1.95 s t(norm)=2.20572, mflops=2.26683 (err=5.7e-16) 2. Singleton (f2c): elapsed time t=1.25 s, 512 iters, t-(init.)=1.22 s t(norm)=0.172499, mflops=28.9857 (err=6.6e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1331 = 30.485 Normalized results and averages for N=1331: fft 0: mflops = 30.485 (norm. = 1), norm. avg. (of 6) = 1 fft 1: mflops = 2.26683 (norm. = 0.074359), norm. avg. (of 6) = 0.0946949 fft 2: mflops = 28.9857 (norm. = 0.95082), norm. avg. (of 6) = 0.679346 fft 3: mflops = -1 (norm. = -0.032803), norm. avg. (of 4) = 0.51504 Benchmarking for array size = 12x12x12: 0. 
FFTW: elapsed time t=1.15 s, 1024 iters, t-(init.)=1.08 s t(norm)=0.0567511, mflops=88.104 (err=3.9e-16) 1. PDA (f2c): elapsed time t=1.22 s, 128 iters, t-(init.)=1.21 s t(norm)=0.508658, mflops=9.82979 (err=3.8e-16) 2. Singleton (f2c): elapsed time t=1.3 s, 512 iters, t-(init.)=1.27 s t(norm)=0.13347, mflops=37.4616 (err=4.0e-16) 3. Temperton (f2c): elapsed time t=1 s, 512 iters, t-(init.)=0.97 s t(norm)=0.101942, mflops=49.0476 (err=3.9e-16) Top mflops for N=1728 = 88.104 Normalized results and averages for N=1728: fft 0: mflops = 88.104 (norm. = 1), norm. avg. (of 7) = 1 fft 1: mflops = 9.82979 (norm. = 0.11157), norm. avg. (of 7) = 0.0971057 fft 2: mflops = 37.4616 (norm. = 0.425197), norm. avg. (of 7) = 0.643039 fft 3: mflops = 49.0476 (norm. = 0.556701), norm. avg. (of 5) = 0.523373 Benchmarking for array size = 13x13x13: 0. FFTW: elapsed time t=1.23 s, 256 iters, t-(init.)=1.21 s t(norm)=0.193794, mflops=25.8006 (err=4.5e-16) 1. PDA (f2c): elapsed time t=1.81 s, 32 iters, t-(init.)=1.81 s t(norm)=2.31912, mflops=2.15599 (err=9.2e-16) 2. Singleton (f2c): elapsed time t=1.17 s, 256 iters, t-(init.)=1.15 s t(norm)=0.184185, mflops=27.1467 (err=7.7e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2197 = 27.1467 Normalized results and averages for N=2197: fft 0: mflops = 25.8006 (norm. = 0.950413), norm. avg. (of 8) = 0.993802 fft 1: mflops = 2.15599 (norm. = 0.0794199), norm. avg. (of 8) = 0.0948949 fft 2: mflops = 27.1467 (norm. = 1), norm. avg. (of 8) = 0.687659 fft 3: mflops = -1 (norm. = -0.0368369), norm. avg. (of 5) = 0.523373 Benchmarking for array size = 14x14x14: 0. FFTW: elapsed time t=1.65 s, 512 iters, t-(init.)=1.59 s t(norm)=0.0990828, mflops=50.4628 (err=4.1e-16) 1. PDA (f2c): elapsed time t=1.46 s, 32 iters, t-(init.)=1.45 s t(norm)=1.44574, mflops=3.45844 (err=4.5e-16) 2. Singleton (f2c): elapsed time t=1.66 s, 256 iters, t-(init.)=1.63 s t(norm)=0.203151, mflops=24.6122 (err=5.6e-16) 3. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=2744 = 50.4628 Normalized results and averages for N=2744: fft 0: mflops = 50.4628 (norm. = 1), norm. avg. (of 9) = 0.99449 fft 1: mflops = 3.45844 (norm. = 0.0685345), norm. avg. (of 9) = 0.091966 fft 2: mflops = 24.6122 (norm. = 0.48773), norm. avg. (of 9) = 0.665444 fft 3: mflops = -1 (norm. = -0.0198166), norm. avg. (of 5) = 0.523373 Benchmarking for array size = 15x15x15: 0. FFTW: elapsed time t=1.71 s, 512 iters, t-(init.)=1.63 s t(norm)=0.0804806, mflops=62.1268 (err=4.7e-16) 1. PDA (f2c): elapsed time t=1.25 s, 64 iters, t-(init.)=1.24 s t(norm)=0.489796, mflops=10.2083 (err=4.8e-16) 2. Singleton (f2c): elapsed time t=1.87 s, 256 iters, t-(init.)=1.83 s t(norm)=0.180711, mflops=27.6685 (err=6.1e-16) 3. Temperton (f2c): elapsed time t=1.2 s, 256 iters, t-(init.)=1.16 s t(norm)=0.114549, mflops=43.6494 (err=4.6e-16) Top mflops for N=3375 = 62.1268 Normalized results and averages for N=3375: fft 0: mflops = 62.1268 (norm. = 1), norm. avg. (of 10) = 0.995041 fft 1: mflops = 10.2083 (norm. = 0.164315), norm. avg. (of 10) = 0.0992009 fft 2: mflops = 27.6685 (norm. = 0.445355), norm. avg. (of 10) = 0.643436 fft 3: mflops = 43.6494 (norm. = 0.702586), norm. avg. (of 6) = 0.553241 Benchmarking for array size = 24x25x28: 0. FFTW: elapsed time t=1.27 s, 32 iters, t-(init.)=1.24 s t(norm)=0.164329, mflops=30.4268 (err=4.6e-16) 1. PDA (f2c): elapsed time t=1.29 s, 8 iters, t-(init.)=1.28 s t(norm)=0.678519, mflops=7.36899 (err=4.8e-16) 2. Singleton (f2c): elapsed time t=1.46 s, 32 iters, t-(init.)=1.43 s t(norm)=0.189508, mflops=26.3841 (err=5.4e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=16800 = 30.4268 Normalized results and averages for N=16800: fft 0: mflops = 30.4268 (norm. = 1), norm. avg. (of 11) = 0.995492 fft 1: mflops = 7.36899 (norm. = 0.242188), norm. avg. (of 11) = 0.1122 fft 2: mflops = 26.3841 (norm. = 0.867133), norm. avg. 
(of 11) = 0.663772 fft 3: mflops = -1 (norm. = -0.0328658), norm. avg. (of 6) = 0.553241 Benchmarking for array size = 48x48x48: 0. FFTW: elapsed time t=1.95 s, 8 iters, t-(init.)=1.86 s t(norm)=0.125475, mflops=39.8485 (err=6.7e-16) 1. PDA (f2c): elapsed time t=2.01 s, 2 iters, t-(init.)=1.98 s t(norm)=0.534281, mflops=9.35837 (err=6.3e-16) 2. Singleton (f2c): elapsed time t=1.28 s, 2 iters, t-(init.)=1.26 s t(norm)=0.339997, mflops=14.706 (err=6.5e-16) 3. Temperton (f2c): elapsed time t=1.19 s, 4 iters, t-(init.)=1.15 s t(norm)=0.155157, mflops=32.2253 (err=7.0e-16) Top mflops for N=110592 = 39.8485 Normalized results and averages for N=110592: fft 0: mflops = 39.8485 (norm. = 1), norm. avg. (of 12) = 0.995868 fft 1: mflops = 9.35837 (norm. = 0.234848), norm. avg. (of 12) = 0.12242 fft 2: mflops = 14.706 (norm. = 0.369048), norm. avg. (of 12) = 0.639211 fft 3: mflops = 32.2253 (norm. = 0.808696), norm. avg. (of 7) = 0.589735 Benchmarking for array size = 49x49x49: 0. FFTW: elapsed time t=1.5 s, 4 iters, t-(init.)=1.45 s t(norm)=0.182924, mflops=27.3337 (err=6.4e-16) 1. PDA (f2c): elapsed time t=3.5 s, 1 iters, t-(init.)=3.49 s t(norm)=1.76112, mflops=2.8391 (err=7.2e-16) 2. Singleton (f2c): elapsed time t=1.3 s, 2 iters, t-(init.)=1.28 s t(norm)=0.322956, mflops=15.482 (err=1.0e-15) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=117649 = 27.3337 Normalized results and averages for N=117649: fft 0: mflops = 27.3337 (norm. = 1), norm. avg. (of 13) = 0.996186 fft 1: mflops = 2.8391 (norm. = 0.103868), norm. avg. (of 13) = 0.120993 fft 2: mflops = 15.482 (norm. = 0.566406), norm. avg. (of 13) = 0.633611 fft 3: mflops = -1 (norm. = -0.0365848), norm. avg. (of 7) = 0.589735 Benchmarking for array size = 60x60x60: 0. FFTW: elapsed time t=1.04 s, 2 iters, t-(init.)=1 s t(norm)=0.130628, mflops=38.2767 (err=7.2e-16) 1. PDA (f2c): elapsed time t=1.97 s, 1 iters, t-(init.)=1.95 s t(norm)=0.509449, mflops=9.81453 (err=7.4e-16) 2. 
Singleton (f2c): elapsed time t=1.53 s, 1 iters, t-(init.)=1.51 s t(norm)=0.394496, mflops=12.6744 (err=1.0e-15) 3. Temperton (f2c): elapsed time t=1.19 s, 2 iters, t-(init.)=1.15 s t(norm)=0.150222, mflops=33.284 (err=7.1e-16) Top mflops for N=216000 = 38.2767 Normalized results and averages for N=216000: fft 0: mflops = 38.2767 (norm. = 1), norm. avg. (of 14) = 0.996458 fft 1: mflops = 9.81453 (norm. = 0.25641), norm. avg. (of 14) = 0.130666 fft 2: mflops = 12.6744 (norm. = 0.331126), norm. avg. (of 14) = 0.612005 fft 3: mflops = 33.284 (norm. = 0.869565), norm. avg. (of 8) = 0.624714 Benchmarking for array size = 72x60x56: 0. FFTW: elapsed time t=1.37 s, 2 iters, t-(init.)=1.32 s t(norm)=0.152547, mflops=32.7768 (err=7.3e-16) 1. PDA (f2c): elapsed time t=3.06 s, 1 iters, t-(init.)=3.04 s t(norm)=0.70264, mflops=7.11602 (err=7.8e-16) 2. Singleton (f2c): elapsed time t=1.94 s, 1 iters, t-(init.)=1.92 s t(norm)=0.443773, mflops=11.267 (err=9.4e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=241920 = 32.7768 Normalized results and averages for N=241920: fft 0: mflops = 32.7768 (norm. = 1), norm. avg. (of 15) = 0.996694 fft 1: mflops = 7.11602 (norm. = 0.217105), norm. avg. (of 15) = 0.136429 fft 2: mflops = 11.267 (norm. = 0.34375), norm. avg. (of 15) = 0.594121 fft 3: mflops = -1 (norm. = -0.0305094), norm. avg. (of 8) = 0.624714 Benchmarking for array size = 75x75x75: 0. FFTW: elapsed time t=1.36 s, 1 iters, t-(init.)=1.32 s t(norm)=0.167442, mflops=29.8612 (err=7.0e-16) 1. PDA (f2c): elapsed time t=4.03 s, 1 iters, t-(init.)=3.98 s t(norm)=0.504862, mflops=9.9037 (err=7.3e-16) 2. Singleton (f2c): elapsed time t=2.89 s, 1 iters, t-(init.)=2.85 s t(norm)=0.361521, mflops=13.8304 (err=9.3e-16) 3. Temperton (f2c): elapsed time t=1.27 s, 1 iters, t-(init.)=1.23 s t(norm)=0.156025, mflops=32.0461 (err=9.6e-16) Top mflops for N=421875 = 32.0461 Normalized results and averages for N=421875: fft 0: mflops = 29.8612 (norm. = 0.931818), norm. 
avg. (of 16) = 0.992639 fft 1: mflops = 9.9037 (norm. = 0.309045), norm. avg. (of 16) = 0.147217 fft 2: mflops = 13.8304 (norm. = 0.431579), norm. avg. (of 16) = 0.583962 fft 3: mflops = 32.0461 (norm. = 1), norm. avg. (of 9) = 0.666412 Benchmarking for array size = 80x80x80: 0. FFTW: elapsed time t=1.59 s, 1 iters, t-(init.)=1.54 s t(norm)=0.158592, mflops=31.5275 (err=6.6e-16) 1. PDA (f2c): elapsed time t=4.87 s, 1 iters, t-(init.)=4.82 s t(norm)=0.496371, mflops=10.0731 (err=6.1e-16) 2. Singleton (f2c): elapsed time t=3.73 s, 1 iters, t-(init.)=3.68 s t(norm)=0.378972, mflops=13.1936 (err=8.2e-16) 3. Temperton (f2c): elapsed time t=1.75 s, 1 iters, t-(init.)=1.69 s t(norm)=0.174039, mflops=28.7292 (err=6.6e-16) Top mflops for N=512000 = 31.5275 Normalized results and averages for N=512000: fft 0: mflops = 31.5275 (norm. = 1), norm. avg. (of 17) = 0.993072 fft 1: mflops = 10.0731 (norm. = 0.319502), norm. avg. (of 17) = 0.157352 fft 2: mflops = 13.1936 (norm. = 0.418478), norm. avg. (of 17) = 0.574228 fft 3: mflops = 28.7292 (norm. = 0.911243), norm. avg. (of 10) = 0.690895 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=2.04 s, 1 iters, t-(init.)=1.98 s t(norm)=0.1742, mflops=28.7027 (err=7.0e-16) 1. PDA (f2c): elapsed time t=11.27 s, 1 iters, t-(init.)=11.21 s t(norm)=0.986253, mflops=5.0697 (err=6.6e-16) 2. Singleton (f2c): elapsed time t=5.33 s, 1 iters, t-(init.)=5.28 s t(norm)=0.464533, mflops=10.7635 (err=8.9e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 28.7027 Normalized results and averages for N=592704: fft 0: mflops = 28.7027 (norm. = 1), norm. avg. (of 18) = 0.993457 fft 1: mflops = 5.0697 (norm. = 0.176628), norm. avg. (of 18) = 0.158422 fft 2: mflops = 10.7635 (norm. = 0.375), norm. avg. (of 18) = 0.56316 fft 3: mflops = -1 (norm. = -0.03484), norm. avg. (of 10) = 0.690895 Benchmarking for array size = 96x96x96: 0. 
FFTW: elapsed time t=2.83 s, 1 iters, t-(init.)=2.74 s t(norm)=0.15677, mflops=31.8939 (err=7.7e-16) 1. PDA (f2c): elapsed time t=9.27 s, 1 iters, t-(init.)=9.18 s t(norm)=0.525236, mflops=9.51953 (err=6.4e-16) 2. Singleton (f2c): elapsed time t=8.8 s, 1 iters, t-(init.)=8.72 s t(norm)=0.498917, mflops=10.0217 (err=7.0e-16) 3. Temperton (f2c): elapsed time t=3.68 s, 1 iters, t-(init.)=3.6 s t(norm)=0.205975, mflops=24.2748 (err=7.5e-16) Top mflops for N=884736 = 31.8939 Normalized results and averages for N=884736: fft 0: mflops = 31.8939 (norm. = 1), norm. avg. (of 19) = 0.993802 fft 1: mflops = 9.51953 (norm. = 0.298475), norm. avg. (of 19) = 0.165794 fft 2: mflops = 10.0217 (norm. = 0.31422), norm. avg. (of 19) = 0.550058 fft 3: mflops = 24.2748 (norm. = 0.761111), norm. avg. (of 11) = 0.697278 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=4.31 s, 1 iters, t-(init.)=4.2 s t(norm)=0.18012, mflops=27.7592 (err=7.4e-16) 1. PDA (f2c): elapsed time t=22.63 s, 1 iters, t-(init.)=22.52 s t(norm)=0.965788, mflops=5.17712 (err=7.2e-16) 2. Singleton (f2c): elapsed time t=10.08 s, 1 iters, t-(init.)=9.97 s t(norm)=0.427572, mflops=11.6939 (err=8.0e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 27.7592 Normalized results and averages for N=1157625: fft 0: mflops = 27.7592 (norm. = 1), norm. avg. (of 20) = 0.994112 fft 1: mflops = 5.17712 (norm. = 0.186501), norm. avg. (of 20) = 0.166829 fft 2: mflops = 11.6939 (norm. = 0.421264), norm. avg. (of 20) = 0.543618 fft 3: mflops = -1 (norm. = -0.0360241), norm. avg. (of 11) = 0.697278 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=5.15 s, 1 iters, t-(init.)=5.01 s t(norm)=0.174616, mflops=28.6343 (err=5.3e-16) 1. PDA (f2c): elapsed time t=27.08 s, 1 iters, t-(init.)=26.94 s t(norm)=0.938953, mflops=5.32508 (err=5.4e-16) 2. Singleton (f2c): elapsed time t=11.88 s, 1 iters, t-(init.)=11.74 s t(norm)=0.40918, mflops=12.2196 (err=6.4e-16) 3. 
Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 28.6343 Normalized results and averages for N=1404928: fft 0: mflops = 28.6343 (norm. = 1), norm. avg. (of 21) = 0.994392 fft 1: mflops = 5.32508 (norm. = 0.185969), norm. avg. (of 21) = 0.16774 fft 2: mflops = 12.2196 (norm. = 0.426746), norm. avg. (of 21) = 0.538053 fft 3: mflops = -1 (norm. = -0.0349232), norm. avg. (of 11) = 0.697278 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=5.8 s, 1 iters, t-(init.)=5.63 s t(norm)=0.157239, mflops=31.7987 (err=7.3e-16) 1. PDA (f2c): elapsed time t=17.75 s, 1 iters, t-(init.)=17.58 s t(norm)=0.490988, mflops=10.1835 (err=7.8e-16) 2. Singleton (f2c): elapsed time t=19.13 s, 1 iters, t-(init.)=18.97 s t(norm)=0.52981, mflops=9.43735 (err=9.4e-16) 3. Temperton (f2c): elapsed time t=6.75 s, 1 iters, t-(init.)=6.58 s t(norm)=0.183772, mflops=27.2077 (err=6.9e-16) Top mflops for N=1728000 = 31.7987 Normalized results and averages for N=1728000: fft 0: mflops = 31.7987 (norm. = 1), norm. avg. (of 22) = 0.994647 fft 1: mflops = 10.1835 (norm. = 0.32025), norm. avg. (of 22) = 0.174673 fft 2: mflops = 9.43735 (norm. = 0.296784), norm. avg. (of 22) = 0.527086 fft 3: mflops = 27.2077 (norm. = 0.855623), norm. avg. (of 12) = 0.710474 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=11.68 s, 1 iters, t-(init.)=11.06 s t(norm)=0.172199, mflops=29.0361 (err=1.2e-15) 1. PDA (f2c): elapsed time t=31.62 s, 1 iters, t-(init.)=31.33 s t(norm)=0.487795, mflops=10.2502 (err=1.2e-15) 2. Singleton (f2c): elapsed time t=31.67 s, 1 iters, t-(init.)=31.38 s t(norm)=0.488573, mflops=10.2339 (err=1.6e-15) 3. Temperton (f2c): elapsed time t=12.77 s, 1 iters, t-(init.)=12.47 s t(norm)=0.194153, mflops=25.7529 (err=1.2e-15) Top mflops for N=2985984 = 29.0361 Normalized results and averages for N=2985984: fft 0: mflops = 29.0361 (norm. = 1), norm. avg. (of 23) = 0.99488 fft 1: mflops = 10.2502 (norm. = 0.353016), norm. avg. 
(of 23) = 0.182427 fft 2: mflops = 10.2339 (norm. = 0.352454), norm. avg. (of 23) = 0.519493 fft 3: mflops = 25.7529 (norm. = 0.886929), norm. avg. (of 13) = 0.724047 . (of 17) = 0.994693 fft 1: mflops = 10.2216 (norm. = 0.334737), norm. avg. (of 17) = 0.15501 fft 2: mflops = 13.1578 (norm. = 0.430894), norm. avg. (of 17) = 0.56936 fft 3: mflops = 28.5602 (norm. = 0.935294), norm. avg. (of 10) = 0.688165 Benchmarking for array size = 84x84x84: 0. FFTW: elapsed time t=1.98 s, 1 iters, t-(init.)=1.93 s t(norm)=0.169801, mflops=29.4463 (err=7.0e-16) 1. PDA (f2c): elapsed time t=11.24 s, 1 iters, t-(init.)=11.18 s t(norm)=0.983613, mflops=5.0833 (err=6.6e-16) 2. Singleton (f2c): elapsed time t=5.31 s, 1 iters, t-(init.)=5.26 s t(norm)=0.462773, mflops=10.8044 (err=8.9e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=592704 = 29.4463 Normalized results and averages for N=592704: fft 0: mflops = 29.4463 (norm. = 1), norm. avg. (of 18) = 0.994987 fft 1: mflops = 5.0833 (norm. = 0.17263), norm. avg. (of 18) = 0.155989 fft 2: mflops = 10.8044 (norm. = 0.36692), norm. avg. (of 18) = 0.558114 fft 3: mflops = -1 (norm. = -0.0339602), norm. avg. (of 10) = 0.688165 Benchmarking for array size = 96x96x96: 0. FFTW: elapsed time t=2.74 s, 1 iters, t-(init.)=2.65 s t(norm)=0.15162, mflops=32.9771 (err=7.7e-16) 1. PDA (f2c): elapsed time t=9.16 s, 1 iters, t-(init.)=9.07 s t(norm)=0.518942, mflops=9.63498 (err=6.4e-16) 2. Singleton (f2c): elapsed time t=8.86 s, 1 iters, t-(init.)=8.77 s t(norm)=0.501778, mflops=9.96457 (err=7.0e-16) 3. Temperton (f2c): elapsed time t=3.7 s, 1 iters, t-(init.)=3.61 s t(norm)=0.206547, mflops=24.2076 (err=7.5e-16) Top mflops for N=884736 = 32.9771 Normalized results and averages for N=884736: fft 0: mflops = 32.9771 (norm. = 1), norm. avg. (of 19) = 0.995251 fft 1: mflops = 9.63498 (norm. = 0.292172), norm. avg. (of 19) = 0.163156 fft 2: mflops = 9.96457 (norm. = 0.302166), norm. avg. 
(of 19) = 0.544643 fft 3: mflops = 24.2076 (norm. = 0.734072), norm. avg. (of 11) = 0.692339 Benchmarking for array size = 105x105x105: 0. FFTW: elapsed time t=4.47 s, 1 iters, t-(init.)=4.36 s t(norm)=0.186982, mflops=26.7405 (err=7.4e-16) 1. PDA (f2c): elapsed time t=22.46 s, 1 iters, t-(init.)=22.35 s t(norm)=0.958498, mflops=5.2165 (err=7.2e-16) 2. Singleton (f2c): elapsed time t=10.03 s, 1 iters, t-(init.)=9.92 s t(norm)=0.425427, mflops=11.7529 (err=8.0e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1157625 = 26.7405 Normalized results and averages for N=1157625: fft 0: mflops = 26.7405 (norm. = 1), norm. avg. (of 20) = 0.995489 fft 1: mflops = 5.2165 (norm. = 0.195078), norm. avg. (of 20) = 0.164752 fft 2: mflops = 11.7529 (norm. = 0.439516), norm. avg. (of 20) = 0.539386 fft 3: mflops = -1 (norm. = -0.0373964), norm. avg. (of 11) = 0.692339 Benchmarking for array size = 112x112x112: 0. FFTW: elapsed time t=5.03 s, 1 iters, t-(init.)=4.89 s t(norm)=0.170434, mflops=29.3369 (err=5.3e-16) 1. PDA (f2c): elapsed time t=26.96 s, 1 iters, t-(init.)=26.83 s t(norm)=0.935119, mflops=5.34691 (err=5.4e-16) 2. Singleton (f2c): elapsed time t=11.93 s, 1 iters, t-(init.)=11.79 s t(norm)=0.410923, mflops=12.1677 (err=6.4e-16) 3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). Top mflops for N=1404928 = 29.3369 Normalized results and averages for N=1404928: fft 0: mflops = 29.3369 (norm. = 1), norm. avg. (of 21) = 0.995704 fft 1: mflops = 5.34691 (norm. = 0.182259), norm. avg. (of 21) = 0.165586 fft 2: mflops = 12.1677 (norm. = 0.414758), norm. avg. (of 21) = 0.533452 fft 3: mflops = -1 (norm. = -0.0340867), norm. avg. (of 11) = 0.692339 Benchmarking for array size = 120x120x120: 0. FFTW: elapsed time t=5.68 s, 1 iters, t-(init.)=5.51 s t(norm)=0.153888, mflops=32.4912 (err=7.3e-16) 1. PDA (f2c): elapsed time t=17.72 s, 1 iters, t-(init.)=17.55 s t(norm)=0.490151, mflops=10.2009 (err=7.8e-16) 2. 
Singleton (f2c): elapsed time t=19.16 s, 1 iters, t-(init.)=18.99 s t(norm)=0.530368, mflops=9.42741 (err=9.4e-16) 3. Temperton (f2c): elapsed time t=6.72 s, 1 iters, t-(init.)=6.55 s t(norm)=0.182934, mflops=27.3323 (err=6.9e-16) Top mflops for N=1728000 = 32.4912 Normalized results and averages for N=1728000: fft 0: mflops = 32.4912 (norm. = 1), norm. avg. (of 22) = 0.995899 fft 1: mflops = 10.2009 (norm. = 0.31396), norm. avg. (of 22) = 0.17233 fft 2: mflops = 9.42741 (norm. = 0.290153), norm. avg. (of 22) = 0.522393 fft 3: mflops = 27.3323 (norm. = 0.841221), norm. avg. (of 12) = 0.704746 Benchmarking for array size = 144x144x144: 0. FFTW: elapsed time t=10.23 s, 1 iters, t-(init.)=9.95 s t(norm)=0.154917, mflops=32.2753 (err=1.2e-15) 1. PDA (f2c): elapsed time t=31.72 s, 1 iters, t-(init.)=31.43 s t(norm)=0.489352, mflops=10.2176 (err=1.2e-15) 2. Singleton (f2c): elapsed time t=31.98 s, 1 iters, t-(init.)=31.69 s t(norm)=0.4934, mflops=10.1338 (err=1.6e-15) 3. Temperton (f2c): elapsed time t=15.04 s, 1 iters, t-(init.)=14.63 s t(norm)=0.227783, mflops=21.9507 (err=1.2e-15) Top mflops for N=2985984 = 32.2753 Normalized results and averages for N=2985984: fft 0: mflops = 32.2753 (norm. = 1), norm. avg. (of 23) = 0.996077 fft 1: mflops = 10.2176 (norm. = 0.316577), norm. avg. (of 23) = 0.178602 fft 2: mflops = 10.1338 (norm. = 0.313979), norm. avg. (of 23) = 0.513331 fft 3: mflops = 21.9507 (norm. = 0.680109), norm. avg. 
(of 13) = 0.70285
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Beauregard, Bergland, CWP (min N), CWP (best N), Edelblute, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), NAPACK (f2c), Ooura (C), Ransom, Singleton (f2c), Temperton (f2c), Valkenburg
2, 30.8405, 23.3017, 14.6654, 0.53718, 1.87246, 3.64089, 1.15993, 1.12027, , 3.19688, 21.1834, 19.2399, 45.5903, , 2.73067, 0.74052, 0.74052, 13.6179, , , , 0.73636, 32.2639, , 1.35126, 1.36533, 2.16648
4, 42.3667, 37.7865, 10.8101, 1.71336, 2.27951, 11.3976, 4.40578, 2.24055, 7.3327, 8.66592, 65.536, 63.5501, 121.574, , 9.44663, 1.6487, 1.6384, 32.514, 17.7725, 19.7845, 20.5603, 1.69125, 41.1206, 1.14975, 6.89853, 4.12825, 2.16648
8, 45.5903, 44.6203, 11.7378, 1.96608, 2.02689, 11.826, 11.1551, 6.77959, 5.34988, 10.4858, 103.991, 101.475, 102.3, 28.8599, 13.1072, 3.64089, 3.54249, 42.5098, 29.1271, 28.3399, 30.2474, 3.24972, 58.7987, 1.66617, 6.5536, 8.02482, 2.22156
16, 13.53, 13.2731, 11.5865, 3.12076, 2.06413, 17.05, 22.7951, 12.7875, 5.04123, 15.5345, 108.943, 106.185, 83.8861, 25.8908, 22.0753, 6.99051, 6.85344, 47.127, 27.2357, 30.6154, 29.7468, 5.57753, 56.2994, 6.16809, 19.2399, 12.5578, 2.25986
32, 14.247, 14.3248, 12.6031, 3.39565, 2.08713, 23.8313, 23.6166, 24.2726, 5.4162, 12.9774, 99.8644, 100.825, 78.2519, 30.3057, 21.4872, 12.1363, 11.8083, 45.5903, 32.1649, 36.9217, 36.6635, 7.8019, 60.263, 5.79965, 26.2144, 14.8945, 2.27556
64, 14.0434, 14.2988, 13.3294, 4.44312, 2.02689, 25.1658, 27.3542, 31.775, 5.95782, 15.8875, 90.5245, 76.2601, 56.1737, 34.5684, 28.3399, 17.7725, 17.9756, 44.306, 33.2881, 39.0774, 38.8361, 10.4858, 66.2259, 13.6771, 34.9525, 21.5461, 2.34057
128, 15.4202, 15.2917, 14.7985, 4.5421, 2.01207, 28.4497, 33.3638, 41.2361, 6.69711, 15.6838, 82.0115, 82.0115, 55.6063, 28.8978, 29.8375, 23.5257, 24.3047, 44.7563, 36.7002, 44.2171, 43.6907, 11.4688, 66.1264, 11.7629, 38.2293, 23.3759, 2.34057
256, 16.1319, 16.513, 16.0088, 5.09017, 1.98594, 31.775, 40.3298, 45.3438, 7.43671, 16.7772, 84.7334, 83.0555, 54.1201, 30.1748, 32.514, 27.962, 29.5374, 41.1206, 38.8361, 46.6034, 41.5278, 12.3362, 67.6501, 19.2399, 42.3667, 24.3855, 2.38313
512, 17.4763, 17.8735, 17.2211, 5.21968, 1.95306, 33.7042, 41.3912, 48.6453, 8.192, 14.5636, 67.4085, 67.4085, 48.1489, 32.0993, 28.255, 31.249, 33.2295, 42.1303, 41.7575, 49.9322, 37.4491, 11.9156, 68.8846, 16.7326, 45.3711, 22.6855, 2.32214
1024, 18.5918, 19.1346, 18.3317, 5.90414, 1.90512, 34.7211, 43.6907, 43.3296, 8.91646, 15.3301, 70.8497, 64.3298, 43.3296, 28.807, 30.6601, 33.3941, 35.6659, 27.5941, 43.3296, 50.9017, 49.4611, 12.6031, 72.8178, 23.8313, 45.9902, 24.4994, 2.30761
2048, 17.6907, 17.7999, 16.6681, 5.90898, 1.86761, 33.53, 34.3284, 38.1932, 9.0112, 14.5636, 66.6725, 65.536, 41.4904, 27.2036, 27.996, 26.6999, 31.3433, 25.2946, 42.7198, 49.717, 47.2719, 11.0907, 53.3997, 18.847, 39.2324, 24.2318, 2.32547
4096, 18.6138, 19.065, 17.7725, 6.49944, 1.89046, 34.5684, 34.5684, 40.073, 9.70904, 14.5636, 67.6501, 67.6501, 37.6734, 28.3399, 29.3993, 26.6587, 30.541, 23.8313, 32.5982, 36.3668, 36.1578, 11.3976, 58.2542, 25.3688, 40.073, 24.576, 2.29951
8192, 19.4736, 19.5855, 18.3219, 6.70841, 1.8521, 34.7742, 36.254, 36.8419, 10.1425, 13.2088, 58.7564, 56.7979, 33.0861, 26.0143, 25.8172, 26.4176, 31.5544, , 32.768, 36.254, 36.8419, 10.8531, 54.526, 21.5688, 40.0926, 22.5687, 2.30262
16384, 17.1496, 17.3114, 15.9566, 7.22444, 1.80611, 31.6381, 34.9525, 35.2886, 9.60737, 12.6552, 45.3088, 50.6209, 26.7884, 24.9661, 24.7974, 22.3781, 24.9661, , 30.0821, 32.478, 32.768, 10.1382, 52.806, 27.8032, 34.9525, 22.1085, 2.27105
32768, 12.3653, 12.288, 11.1078, 6.82667, 1.79387, 24.2726, 35.4249, 35.4249, 7.74047, 11.7029, 40.5377, 42.7409, 26.9326, 20.5872, 21.9674, 14.5636, 15.1237, , 24.7306, 26.5686, 25.7004, 9.27396, 38.5506, 21.8453, 20.3739, 15.7286, 2.13704
65536, 9.98644, 9.98644, 8.66592, 7.13317, 1.76528, 20.7639, 30.8405, 30.3935, 6.5536, 11.8483, 36.7921, 37.7865, 24.1052, 18.0789, 22.0753, 11.9837, 12.5578, , 17.1898, 17.7725, 17.05, 9.19804, 34.1, 20.97, , 19.23, , 14.76, , 2.07228
131072, 9.44163, 2.08051
131072, , 6.752, , 1.77406, 19.3759, 31., 2, 31.1, 20.0741, 31.164, 3, 36.5283, 38.4177, 21.8, 36.2313, 37.7665, 21.8453, 17.2731, 21.021, 10.9227, 11.2537, , 14.469, 15.158, 14.6594, 8.84216, 32.5288, 18.1156, 17.1402, 13.2632, 2.05556
262144, 10.1257, 1, 1.76, 8, 19.6, 9, 23.3, 2.46788, 19.6608, 23.593, 23, 35.2134, 34.1927, 19., 37.4491, 35.2134, 19.994, 17.4763, 21.8453, 10.4858, 10.8225, , 12.4173, 12.753, 12.3523, 9.10925, 33.9467, 21.8453, 17.4763, 13.8782, 2.02341
Norm. Avg., 0.31924, 1.76747, 19.3052, 24.5356, 24.5356, 6.21815, 11.093, 35.3244, 34.3499, 19.0833, 16.4381, 21.4687, 10.3335, 10.6426, , 12.3591, 12.5776, 12.3286, 9.02307, 32.5538, 19.5323, 14.7796, 13.1072, 2.01976
Norm. Avg., 0.27846, 0.268935, 0.211992, 0.0949934, 0.0316011, 0.398825, 0.480458, 0.496458, 0.121617, 0.205285, 0.930353, 0.917449, 0.681867, 0.403758, 0.378515, 0.272234, 0.293459, 0.425197, 0.435017, 0.485348, 0.46579, 0.150481, 0.797787, 0.306401, 0.425418, 0.277292, 0.0369678
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, CWP (min N), CWP (best N), FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, NAPACK (f2c), Singleton (f2c), Temperton (f2c), Valkenburg
6, 6.39276, 4.62022, 8.06705, 78.9475, 76.7131, 9.24044, 12.5487, 2.85519, 4.45811, 5.08224, 2.1001
9, 11.8335, 8.34687, 10.4452, 61.8082, 60.3128, 7.85587, 12.4647, 5.37269, 9.16519, 7.85587, 2.08672
12, 14.5326, 12.5863, 13.1744, 94.7674, 93.9776, 13.4254, 16.3914, 4.69888, 9.58956, 10.6793, 2.18892
15, 17.3001, 17.3001, 13.9153, 66.7936, 67.3795, 9.14436, 11.8538, 4.44517, 10.7883, 13.3355, 1.84646
18, 17.9527, 15.6657, 10.08, 61.488, 51.24, 8.97636, 20.6683, 6.405, 14.5534, 11.6015, 2.16507
24, 26.1287, 23.1139, 11.7835, 75.9108, 70.0148, 16.2422, 25.9407, 7.32878, 13.5555, 16.5402, 2.2536
36, 29.8955, 30.1915, 13.258, 79.2038, 79.2038, 11.2108, 27.7213, 9.18478, 23.6383, 18.94, 2.22905
80, 42.7128, 43.1577, 15.8135, 67.3681, 67.3681, 18.6628, 23.146, 6.81437, 33.1451, 28.5734, 2.10525
108, 34.545, 45.9715, 14.3661, 69.0901, 62.253, 10.0273, 27.165, 11.6724, 29.5856, 23.5287, 2.19717
210, 45.4485, 46.0798, 6.63549, 41.2142, 41.7326, 9.9932, 21.1321, 5.88252, 20.6071, , 1.71371
504, 46.7994, 46.7994, 6.73421, 53.2545, 54.83, 11.6998, 25.044, 7.62029, 27.5782, , 1.8682
1000, 36.1878, 51.0248, 14.3328, 50.5196, 50.5196, 12.7562, 20.5745, 6.3781, 33.133, 29.3246, 1.7916
1960, 42.8716, 42.2121, 4.89961, 42.2121, 41.5725, 12.0341, 22.1273, 5.53182, 24.7188, , 1.60268
4725, 32.6648, 43.42, 38.5414, 33.8262, 8.64039, 21.4842, 6.35933, 22.081, 18.2739, 1.85728
75600, 33.113, 32.8908, 7.85372, 34.5121, 29.3457, 7.85372, 19.761, 6.18778, 14.4992, , 1.64675
165375, 29.2535, 29.555, 5.23146, 26.7929, 23.8903, 5.82691, 17.3748, 5.10114, 12.7984, , 1.48695
362880, 16.3465, 16.4267, 7.81128, 34.5468, 26.5956, 7.48, 20.187, 6.71551, 12.1856, , 1.7399
Norm. Avg., 0.59248, 0.624095, 0.190335, 0.988874, 0.900253, 0.195128, 0.39555, 0.129558, 0.378888, 0.285026, 0.0365672
-----------------------------------------------, 7.53043, 21.6196, 6.81106, 12.0541, , 1.74624
Norm. Avg., 0.588121, 0.620486, 0.188878, 0.985913, 0.953897, 0.194185, 0.40107, 0.126404, 0.372594, 0.274944, 0.0361008
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM (f2c), PDA (f2c), Singleton (f2c), Temperton (f2c)
4x4x4, 84.4491, , 4.70917, 38.13, 21.2549
8x8x8, 124.99, 23.2745, 9.76615, 12.4208, 26.4933
16x1024x64, 39.2725, 24.0499, 9.84578, 14.7272,
128x128x128, 33.2128, 23.5257, 10.1335, 9.68768, 22.2875
Norm. Avg., 1, 0.517104, 0.181603, 0.360632, 0.538134
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA (f2c), Singleton (f2c), Temperton (f2c)
5x5x5, 50.9497, 6.50818, 49.1928, 30.4827
6x6x6, 95.2921, 6.08247, 30.0922, 29.5734
7x7x7, 44.8197, 2.44876, 31.4691,
9x9x9, 67.6096, 8.06706, 43.821, 36.974
10x10x10, 65.4164, 8.39224, 32.2942, 39.5541
11x11x11, 30.485, 2.26683, 28.9857,
12x12x12, 88.104, 9.82979, 37.4616, 49.0476
13x13x13, 25.8006, 2.15599, 27.1467,
14x14x14, 50.4628, 3.45844, 24.6122,
15x15x15, 62.1268, 10.2083, 27.6685, 43.6494
24x25x28, 30.4268, 7.36899, 26.3841,
48x48x48, 39.8485, 9.35837, 14.706, 32.2253
49x49x49, 27.3337, 2.8391, 15.482,
60x60x60, 38.2767, 9.81453, 12.6744, 33.284
72x60x56, 32.7768, 7.11602, 11.267,
75x75x75, 29.8612, 9.9037, 13.8304, 32.0461
80x80x80, 31.5275, 10.0731, 13.1936, 28.7292
84x84x84, 28.7027, 5.0697, 10.7635,
96x96x96, 31.8939, 9.51953, 10.0217, 24.2748
105x105x105, 27.7592, 5.17712, 11.6939,
112x112x112, 28.6343, 5.32508, 12.2196,
120x120x120, 31.7987, 10.1835, 9.43735, 27.2077
144x144x144, 29.03615.0833, 10.8044,
96x96x96, 32.9771, 9.63498, 9.96457, 24.2076
105x105x105, 26.7405, 5.2165, 11.7529,
112x112x112, 29.3369, 5.34691, 12.1677,
120x120x120, 32.4912, 10.2009, 9.42741, 27.3323
144x144x144, 32.2753, 10.2176, 10.1338, 21.9507
Norm. Avg., 0.996077, 0.178602, 0.513331, 0.70285
@@@@ end