To: benchfft@theory.lcs.mit.edu
Subject: SUBMIT
-------------------
@@SUBMIT@@
@ submitter = Roger Koch
@ submitter email = PRIVATE
@ submitter organization = IBM Research
@ computer manufacturer = IBM
@ computer model = IntelliStation M Pro
@ CPU manufacturer = Intel
@ CPU model = Pentium II
@ CPU speed = 300 MHz
@ RAM = 256 MB
@ L2 cache size = 512 kB
@ operating system = Windows NT 4.0
@ C compiler = Intel C/C++ V 2.4
@ C compiler flags = /O2 -G6 -Qxi -Qip -Qmem
@ Fortran compiler = NONE
@ Fortran compiler flags = NONE
@ remarks = NAPACK disabled (bugs under this optimization level)
@ FFTW version = FFTW V1.2
@ floating-point precision = double
@ floating-point size = 8 bytes
------------------------------------------------------
@@@@ bench.1d.p2.log
Benchmarking for sizes:
2 (0.000228882 MB)
4 (0.000534058 MB)
8 (0.000839233 MB)
16 (0.00164795 MB)
32 (0.00297546 MB)
64 (0.00616455 MB)
128 (0.0119019 MB)
256 (0.0238037 MB)
512 (0.0476074 MB)
1024 (0.0939941 MB)
2048 (0.189575 MB)
4096 (0.37915 MB)
8192 (0.765991 MB)
16384 (1.51184 MB)
32768 (3.02368 MB)
65536 (6.09973 MB)
131072 (12.1995 MB)
262144 (25.4987 MB)
524288 (50.9973 MB)
1048576 (96 MB)
2097152 (192 MB)
Maximum array size = 2097152
Benchmarking FFTs:
0. Arndt DIF
1. Arndt DIT
2. Arndt Split-Radix
3. Arndt 4-step
4. Beauregard
5. Bergland
6. CWP (min N)
7. CWP (best N)
8. Edelblute
9. FFTPACK (f2c)
10. FFTW
11. FFTW_ESTIMATE
12. Frigo-old
13. Green
14. GSL
15. GSL DIT
16. GSL DIF
17. Krukar
18. Mayer (Buneman)
19. Mayer (simple)
20. Mayer (lookup)
21. Ooura (C)
22. Ransom
23. Singleton (f2c)
24. Temperton (f2c)
25. Valkenburg
Computing normalized averages (26 transforms).
Benchmarking for array size = 2 (power of 2):
0. Arndt DIF: elapsed time t=1.252 s, 4194304 iters, t-(init.)=0.912 s t(norm)=0.108719, mflops=45.9902 (err=1.5e-017)
1. Arndt DIT: elapsed time t=1.272 s, 4194304 iters, t-(init.)=0.931 s t(norm)=0.110984, mflops=45.0516 (err=1.5e-017)
2.
Arndt Split-Radix: elapsed time t=1.532 s, 4194304 iters, t-(init.)=1.201 s t(norm)=0.14317, mflops=34.9234 (err=1.5e-017)
3. Arndt 4-step: elapsed time t=1.793 s, 131072 iters, t-(init.)=1.783 s t(norm)=6.80161, mflops=0.735121 (err=1.5e-017)
4. Beauregard: elapsed time t=1.512 s, 1048576 iters, t-(init.)=1.432 s t(norm)=0.682831, mflops=7.32246 (err=1.2e-016)
5. Bergland: elapsed time t=1.312 s, 1048576 iters, t-(init.)=1.222 s t(norm)=0.582695, mflops=8.58082 (err=1.2e-016)
6. CWP (min N): elapsed time t=1.061 s, 524288 iters, t-(init.)=1.011 s t(norm)=0.964165, mflops=5.18584
7. CWP (best N) (N=3): elapsed time t=1.232 s, 524288 iters, t-(init.)=1.182 s t(norm)=1.12724, mflops=4.4356
8. Skipping fft (Edelblute can't handle N <= 2).
9. FFTPACK (f2c): elapsed time t=1.773 s, 2097152 iters, t-(init.)=1.603 s t(norm)=0.382185, mflops=13.0827 (err=1.2e-016)
FFTW_MEASURE plan: (cost = 2.584457e-007) FFTW_NOTW 2
10. FFTW: elapsed time t=1.162 s, 4194304 iters, t-(init.)=0.821 s t(norm)=0.0978708, mflops=51.0877 (err=1.2e-016)
FFTW_ESTIMATE plan: (cost = 1.820000e+002) FFTW_NOTW 2
11. FFTW_ESTIMATE: elapsed time t=1.172 s, 4194304 iters, t-(init.)=0.842 s t(norm)=0.100374, mflops=49.8136 (err=1.2e-016)
12. Frigo-old: elapsed time t=1.052 s, 4194304 iters, t-(init.)=0.712 s t(norm)=0.084877, mflops=58.9088 (err=1.2e-016)
13. Skipping fft (Green can't handle this size.).
14. GSL: elapsed time t=1.692 s, 2097152 iters, t-(init.)=1.512 s t(norm)=0.360489, mflops=13.8701 (err=1.2e-016)
15. GSL DIT: elapsed time t=1.091 s, 1048576 iters, t-(init.)=1 s t(norm)=0.476837, mflops=10.4858 (err=1.2e-016)
16. GSL DIF: elapsed time t=1.162 s, 1048576 iters, t-(init.)=1.082 s t(norm)=0.515938, mflops=9.69109 (err=1.2e-016)
17. Krukar: elapsed time t=1.723 s, 4194304 iters, t-(init.)=1.403 s t(norm)=0.167251, mflops=29.8953 (err=1.2e-016)
18. Skipping fft (Mayer can't handle N <= 2).
19. Skipping fft (Mayer can't handle N <= 2).
20. Skipping fft (Mayer can't handle N <= 2).
21.
Ooura (C): elapsed time t=1.192 s, 4194304 iters, t-(init.)=0.842 s t(norm)=0.100374, mflops=49.8136 (err=1.2e-016)
22. Skipping fft (Ransom doesn't work for N=2).
23. Singleton (f2c): elapsed time t=1.443 s, 1048576 iters, t-(init.)=1.353 s t(norm)=0.645161, mflops=7.75001 (err=1.2e-016)
24. Temperton (f2c): elapsed time t=1.482 s, 524288 iters, t-(init.)=1.442 s t(norm)=1.3752, mflops=3.63584 (err=1.2e-016)
25. Valkenburg: elapsed time t=1.302 s, 1048576 iters, t-(init.)=1.222 s t(norm)=0.582695, mflops=8.58082 (err=2.1e-016)
Top mflops for N=2 = 58.9088
Normalized results and averages for N=2:
fft 0: mflops = 45.9902 (norm. = 0.780702), norm. avg. (of 1) = 0.780702
fft 1: mflops = 45.0516 (norm. = 0.764769), norm. avg. (of 1) = 0.764769
fft 2: mflops = 34.9234 (norm. = 0.592839), norm. avg. (of 1) = 0.592839
fft 3: mflops = 0.735121 (norm. = 0.012479), norm. avg. (of 1) = 0.012479
fft 4: mflops = 7.32246 (norm. = 0.124302), norm. avg. (of 1) = 0.124302
fft 5: mflops = 8.58082 (norm. = 0.145663), norm. avg. (of 1) = 0.145663
fft 6: mflops = 5.18584 (norm. = 0.0880317), norm. avg. (of 1) = 0.0880317
fft 7: mflops = 4.4356 (norm. = 0.0752961), norm. avg. (of 1) = 0.0752961
fft 8: mflops = -1 (norm. = -0.0169754), norm. avg. (of 0) = -1
fft 9: mflops = 13.0827 (norm. = 0.222084), norm. avg. (of 1) = 0.222084
fft 10: mflops = 51.0877 (norm. = 0.867235), norm. avg. (of 1) = 0.867235
fft 11: mflops = 49.8136 (norm. = 0.845606), norm. avg. (of 1) = 0.845606
fft 12: mflops = 58.9088 (norm. = 1), norm. avg. (of 1) = 1
fft 13: mflops = -1 (norm. = -0.0169754), norm. avg. (of 0) = -1
fft 14: mflops = 13.8701 (norm. = 0.23545), norm. avg. (of 1) = 0.23545
fft 15: mflops = 10.4858 (norm. = 0.178), norm. avg. (of 1) = 0.178
fft 16: mflops = 9.69109 (norm. = 0.16451), norm. avg. (of 1) = 0.16451
fft 17: mflops = 29.8953 (norm. = 0.507484), norm. avg. (of 1) = 0.507484
fft 18: mflops = -1 (norm. = -0.0169754), norm. avg. (of 0) = -1
fft 19: mflops = -1 (norm.
= -0.0169754), norm. avg. (of 0) = -1
fft 20: mflops = -1 (norm. = -0.0169754), norm. avg. (of 0) = -1
fft 21: mflops = 49.8136 (norm. = 0.845606), norm. avg. (of 1) = 0.845606
fft 22: mflops = -1 (norm. = -0.0169754), norm. avg. (of 0) = -1
fft 23: mflops = 7.75001 (norm. = 0.131559), norm. avg. (of 1) = 0.131559
fft 24: mflops = 3.63584 (norm. = 0.0617198), norm. avg. (of 1) = 0.0617198
fft 25: mflops = 8.58082 (norm. = 0.145663), norm. avg. (of 1) = 0.145663
Benchmarking for array size = 4 (power of 2):
0. Arndt DIF: elapsed time t=1.241 s, 2097152 iters, t-(init.)=1 s t(norm)=0.0596046, mflops=83.8861 (err=1.0e-016)
1. Arndt DIT: elapsed time t=1.192 s, 2097152 iters, t-(init.)=0.951 s t(norm)=0.056684, mflops=88.2083 (err=1.0e-016)
2. Arndt Split-Radix: elapsed time t=1.052 s, 1048576 iters, t-(init.)=0.932 s t(norm)=0.111103, mflops=45.0033 (err=1.0e-016)
3. Arndt 4-step: elapsed time t=1.032 s, 131072 iters, t-(init.)=1.022 s t(norm)=0.974655, mflops=5.13002 (err=1.0e-016)
4. Beauregard: elapsed time t=1.222 s, 262144 iters, t-(init.)=1.192 s t(norm)=0.56839, mflops=8.79678 (err=1.9e-016)
5. Bergland: elapsed time t=1.483 s, 1048576 iters, t-(init.)=1.363 s t(norm)=0.162482, mflops=30.7726 (err=1.6e-016)
6. CWP (min N): elapsed time t=1.281 s, 524288 iters, t-(init.)=1.22 s t(norm)=0.290871, mflops=17.1898
7. CWP (best N) (N=15): elapsed time t=1.883 s, 262144 iters, t-(init.)=1.793 s t(norm)=0.854969, mflops=5.84817
8. Edelblute: elapsed time t=1.192 s, 1048576 iters, t-(init.)=1.072 s t(norm)=0.127792, mflops=39.126 (err=1.0e-016)
9. FFTPACK (f2c): elapsed time t=1.242 s, 1048576 iters, t-(init.)=1.122 s t(norm)=0.133753, mflops=37.3824 (err=1.6e-016)
FFTW_MEASURE plan: (cost = 4.005432e-007) FFTW_NOTW 4
10. FFTW: elapsed time t=1.843 s, 4194304 iters, t-(init.)=1.373 s t(norm)=0.0409186, mflops=122.194 (err=1.6e-016)
FFTW_ESTIMATE plan: (cost = 3.176000e+002) FFTW_NOTW 4
11.
FFTW_ESTIMATE: elapsed time t=1.823 s, 4194304 iters, t-(init.)=1.353 s t(norm)=0.0403225, mflops=124 (err=1.6e-016)
12. Frigo-old: elapsed time t=1.402 s, 4194304 iters, t-(init.)=0.921 s t(norm)=0.0274479, mflops=182.163 (err=1.6e-016)
13. Skipping fft (Green can't handle this size.).
14. GSL: elapsed time t=1.262 s, 1048576 iters, t-(init.)=1.142 s t(norm)=0.136137, mflops=36.7277 (err=1.6e-016)
15. GSL DIT: elapsed time t=1.142 s, 524288 iters, t-(init.)=1.082 s t(norm)=0.257969, mflops=19.3822 (err=1.9e-016)
16. GSL DIF: elapsed time t=1.202 s, 524288 iters, t-(init.)=1.142 s t(norm)=0.272274, mflops=18.3639 (err=1.9e-016)
17. Krukar: elapsed time t=1.112 s, 2097152 iters, t-(init.)=0.882 s t(norm)=0.0525713, mflops=95.1089 (err=1.6e-016)
18. Mayer (Buneman): elapsed time t=1.782 s, 2097152 iters, t-(init.)=1.531 s t(norm)=0.0912547, mflops=54.7917 (err=1.0e-016)
19. Mayer (simple): elapsed time t=1.683 s, 2097152 iters, t-(init.)=1.453 s t(norm)=0.0866055, mflops=57.733
20. Mayer (lookup): elapsed time t=1.743 s, 2097152 iters, t-(init.)=1.513 s t(norm)=0.0901818, mflops=55.4435 (err=1.0e-016)
21. Ooura (C): elapsed time t=1.352 s, 2097152 iters, t-(init.)=1.111 s t(norm)=0.0662208, mflops=75.505 (err=1.6e-016)
22. Ransom: elapsed time t=1.572 s, 131072 iters, t-(init.)=1.552 s t(norm)=1.4801, mflops=3.37814 (err=2.3e-016)
23. Singleton (f2c): elapsed time t=1.872 s, 1048576 iters, t-(init.)=1.751 s t(norm)=0.208735, mflops=23.9538 (err=1.6e-016)
24. Temperton (f2c): elapsed time t=1.703 s, 524288 iters, t-(init.)=1.643 s t(norm)=0.391722, mflops=12.7642 (err=1.6e-016)
25. Valkenburg: elapsed time t=1.192 s, 262144 iters, t-(init.)=1.162 s t(norm)=0.554085, mflops=9.02389 (err=2.5e-016)
Top mflops for N=4 = 182.163
Normalized results and averages for N=4:
fft 0: mflops = 83.8861 (norm. = 0.4605), norm. avg. (of 2) = 0.620601
fft 1: mflops = 88.2083 (norm. = 0.484227), norm. avg. (of 2) = 0.624498
fft 2: mflops = 45.0033 (norm. = 0.247049), norm. avg.
(of 2) = 0.419944
fft 3: mflops = 5.13002 (norm. = 0.0281617), norm. avg. (of 2) = 0.0203203
fft 4: mflops = 8.79678 (norm. = 0.0482907), norm. avg. (of 2) = 0.0862962
fft 5: mflops = 30.7726 (norm. = 0.168929), norm. avg. (of 2) = 0.157296
fft 6: mflops = 17.1898 (norm. = 0.0943648), norm. avg. (of 2) = 0.0911982
fft 7: mflops = 5.84817 (norm. = 0.032104), norm. avg. (of 2) = 0.0537001
fft 8: mflops = 39.126 (norm. = 0.214785), norm. avg. (of 1) = 0.214785
fft 9: mflops = 37.3824 (norm. = 0.205214), norm. avg. (of 2) = 0.213649
fft 10: mflops = 122.194 (norm. = 0.670794), norm. avg. (of 2) = 0.769014
fft 11: mflops = 124 (norm. = 0.68071), norm. avg. (of 2) = 0.763158
fft 12: mflops = 182.163 (norm. = 1), norm. avg. (of 2) = 1
fft 13: mflops = -1 (norm. = -0.00548959), norm. avg. (of 0) = -1
fft 14: mflops = 36.7277 (norm. = 0.20162), norm. avg. (of 2) = 0.218535
fft 15: mflops = 19.3822 (norm. = 0.1064), norm. avg. (of 2) = 0.1422
fft 16: mflops = 18.3639 (norm. = 0.10081), norm. avg. (of 2) = 0.13266
fft 17: mflops = 95.1089 (norm. = 0.522109), norm. avg. (of 2) = 0.514796
fft 18: mflops = 54.7917 (norm. = 0.300784), norm. avg. (of 1) = 0.300784
fft 19: mflops = 57.733 (norm. = 0.31693), norm. avg. (of 1) = 0.31693
fft 20: mflops = 55.4435 (norm. = 0.304362), norm. avg. (of 1) = 0.304362
fft 21: mflops = 75.505 (norm. = 0.414491), norm. avg. (of 2) = 0.630049
fft 22: mflops = 3.37814 (norm. = 0.0185446), norm. avg. (of 1) = 0.0185446
fft 23: mflops = 23.9538 (norm. = 0.131496), norm. avg. (of 2) = 0.131528
fft 24: mflops = 12.7642 (norm. = 0.07007), norm. avg. (of 2) = 0.0658949
fft 25: mflops = 9.02389 (norm. = 0.0495374), norm. avg. (of 2) = 0.0976001
Benchmarking for array size = 8 (power of 2):
0. Arndt DIF: elapsed time t=1.111 s, 1048576 iters, t-(init.)=0.89 s t(norm)=0.0353654, mflops=141.381 (err=2.2e-016)
1. Arndt DIT: elapsed time t=1.192 s, 1048576 iters, t-(init.)=0.972 s t(norm)=0.0386238, mflops=129.454 (err=2.2e-016)
2.
Arndt Split-Radix: elapsed time t=1.522 s, 524288 iters, t-(init.)=1.412 s t(norm)=0.112216, mflops=44.5571 (err=1.8e-016)
3. Arndt 4-step: elapsed time t=1.382 s, 65536 iters, t-(init.)=1.372 s t(norm)=0.872294, mflops=5.73201 (err=2.0e-016)
4. Beauregard: elapsed time t=1.262 s, 131072 iters, t-(init.)=1.232 s t(norm)=0.391642, mflops=12.7668 (err=1.1e-016)
5. Bergland: elapsed time t=1.442 s, 524288 iters, t-(init.)=1.332 s t(norm)=0.105858, mflops=47.2332 (err=1.2e-016)
6. CWP (min N): elapsed time t=1.713 s, 524288 iters, t-(init.)=1.593 s t(norm)=0.1266, mflops=39.4944
7. CWP (best N) (N=15): elapsed time t=1.882 s, 262144 iters, t-(init.)=1.802 s t(norm)=0.28642, mflops=17.4569
8. Edelblute: elapsed time t=1.031 s, 262144 iters, t-(init.)=0.97 s t(norm)=0.154177, mflops=32.4302 (err=2.2e-016)
9. FFTPACK (f2c): elapsed time t=1.352 s, 524288 iters, t-(init.)=1.241 s t(norm)=0.0986258, mflops=50.6967 (err=1.1e-016)
FFTW_MEASURE plan: (cost = 8.392334e-007) FFTW_NOTW 8
10. FFTW: elapsed time t=1.783 s, 2097152 iters, t-(init.)=1.332 s t(norm)=0.0264645, mflops=188.933 (err=1.1e-016)
FFTW_ESTIMATE plan: (cost = 4.688000e+002) FFTW_NOTW 8
11. FFTW_ESTIMATE: elapsed time t=1.853 s, 2097152 iters, t-(init.)=1.412 s t(norm)=0.0280539, mflops=178.228 (err=1.1e-016)
12. Frigo-old: elapsed time t=1.512 s, 2097152 iters, t-(init.)=1.071 s t(norm)=0.0212789, mflops=234.975 (err=1.2e-016)
13. Green: elapsed time t=1.462 s, 1048576 iters, t-(init.)=1.242 s t(norm)=0.0493526, mflops=101.312 (err=1.5e-016)
14. GSL: elapsed time t=1.212 s, 524288 iters, t-(init.)=1.102 s t(norm)=0.0875791, mflops=57.0913 (err=1.1e-016)
15. GSL DIT: elapsed time t=1.112 s, 262144 iters, t-(init.)=1.052 s t(norm)=0.167211, mflops=29.9024 (err=1.3e-016)
16. GSL DIF: elapsed time t=1.222 s, 262144 iters, t-(init.)=1.162 s t(norm)=0.184695, mflops=27.0717 (err=1.2e-016)
17. Krukar: elapsed time t=1.042 s, 1048576 iters, t-(init.)=0.822 s t(norm)=0.0326633, mflops=153.077 (err=1.2e-016)
18.
Mayer (Buneman): elapsed time t=1.852 s, 1048576 iters, t-(init.)=1.631 s t(norm)=0.0648101, mflops=77.1484 (err=2.0e-016)
19. Mayer (simple): elapsed time t=1.753 s, 1048576 iters, t-(init.)=1.543 s t(norm)=0.0613133, mflops=81.5484
20. Mayer (lookup): elapsed time t=1.802 s, 1048576 iters, t-(init.)=1.582 s t(norm)=0.062863, mflops=79.538 (err=2.0e-016)
21. Ooura (C): elapsed time t=1.082 s, 1048576 iters, t-(init.)=0.862 s t(norm)=0.0342528, mflops=145.973 (err=1.2e-016)
22. Ransom: elapsed time t=1.302 s, 32768 iters, t-(init.)=1.292 s t(norm)=1.64286, mflops=3.04347 (err=2.7e-016)
23. Singleton (f2c): elapsed time t=1.331 s, 262144 iters, t-(init.)=1.27 s t(norm)=0.201861, mflops=24.7695 (err=1.6e-016)
24. Temperton (f2c): elapsed time t=1.282 s, 262144 iters, t-(init.)=1.222 s t(norm)=0.194232, mflops=25.7425 (err=1.1e-016)
25. Valkenburg: elapsed time t=1.722 s, 131072 iters, t-(init.)=1.692 s t(norm)=0.537872, mflops=9.29589 (err=2.0e-016)
Top mflops for N=8 = 234.975
Normalized results and averages for N=8:
fft 0: mflops = 141.381 (norm. = 0.601685), norm. avg. (of 3) = 0.614296
fft 1: mflops = 129.454 (norm. = 0.550926), norm. avg. (of 3) = 0.599974
fft 2: mflops = 44.5571 (norm. = 0.189625), norm. avg. (of 3) = 0.343171
fft 3: mflops = 5.73201 (norm. = 0.0243941), norm. avg. (of 3) = 0.0216783
fft 4: mflops = 12.7668 (norm. = 0.0543324), norm. avg. (of 3) = 0.0756416
fft 5: mflops = 47.2332 (norm. = 0.201014), norm. avg. (of 3) = 0.171868
fft 6: mflops = 39.4944 (norm. = 0.168079), norm. avg. (of 3) = 0.116825
fft 7: mflops = 17.4569 (norm. = 0.0742925), norm. avg. (of 3) = 0.0605642
fft 8: mflops = 32.4302 (norm. = 0.138015), norm. avg. (of 2) = 0.1764
fft 9: mflops = 50.6967 (norm. = 0.215753), norm. avg. (of 3) = 0.21435
fft 10: mflops = 188.933 (norm. = 0.804054), norm. avg. (of 3) = 0.780694
fft 11: mflops = 178.228 (norm. = 0.758499), norm. avg. (of 3) = 0.761605
fft 12: mflops = 234.975 (norm. = 1), norm. avg.
(of 3) = 1
fft 13: mflops = 101.312 (norm. = 0.431159), norm. avg. (of 1) = 0.431159
fft 14: mflops = 57.0913 (norm. = 0.242967), norm. avg. (of 3) = 0.226679
fft 15: mflops = 29.9024 (norm. = 0.127258), norm. avg. (of 3) = 0.137219
fft 16: mflops = 27.0717 (norm. = 0.115211), norm. avg. (of 3) = 0.126844
fft 17: mflops = 153.077 (norm. = 0.65146), norm. avg. (of 3) = 0.560351
fft 18: mflops = 77.1484 (norm. = 0.328326), norm. avg. (of 2) = 0.314555
fft 19: mflops = 81.5484 (norm. = 0.347051), norm. avg. (of 2) = 0.331991
fft 20: mflops = 79.538 (norm. = 0.338496), norm. avg. (of 2) = 0.321429
fft 21: mflops = 145.973 (norm. = 0.62123), norm. avg. (of 3) = 0.627109
fft 22: mflops = 3.04347 (norm. = 0.0129523), norm. avg. (of 2) = 0.0157485
fft 23: mflops = 24.7695 (norm. = 0.105413), norm. avg. (of 3) = 0.122823
fft 24: mflops = 25.7425 (norm. = 0.109554), norm. avg. (of 3) = 0.0804479
fft 25: mflops = 9.29589 (norm. = 0.0395612), norm. avg. (of 3) = 0.0782538
Benchmarking for array size = 16 (power of 2):
0. Arndt DIF: elapsed time t=1.572 s, 262144 iters, t-(init.)=1.482 s t(norm)=0.0883341, mflops=56.6033 (err=1.3e-016)
1. Arndt DIT: elapsed time t=1.512 s, 262144 iters, t-(init.)=1.422 s t(norm)=0.0847578, mflops=58.9916 (err=1.5e-016)
2. Arndt Split-Radix: elapsed time t=1.993 s, 262144 iters, t-(init.)=1.903 s t(norm)=0.113428, mflops=44.081 (err=1.1e-016)
3. Arndt 4-step: elapsed time t=1.382 s, 65536 iters, t-(init.)=1.362 s t(norm)=0.324726, mflops=15.3976 (err=1.3e-016)
4. Beauregard: elapsed time t=1.452 s, 65536 iters, t-(init.)=1.431 s t(norm)=0.341177, mflops=14.6552 (err=2.2e-016)
5. Bergland: elapsed time t=1.381 s, 262144 iters, t-(init.)=1.29 s t(norm)=0.07689, mflops=65.028 (err=1.9e-016)
6. CWP (min N): elapsed time t=1.452 s, 262144 iters, t-(init.)=1.362 s t(norm)=0.0811815, mflops=61.5904
7. CWP (best N) (N=28): elapsed time t=1.442 s, 131072 iters, t-(init.)=1.372 s t(norm)=0.163555, mflops=30.5707
8.
Edelblute: elapsed time t=1.432 s, 131072 iters, t-(init.)=1.392 s t(norm)=0.165939, mflops=30.1315 (err=1.4e-016)
9. FFTPACK (f2c): elapsed time t=1.112 s, 262144 iters, t-(init.)=1.022 s t(norm)=0.0609159, mflops=82.0803 (err=2.0e-016)
FFTW_MEASURE plan: (cost = 1.754761e-006) FFTW_NOTW 16
10. FFTW: elapsed time t=1.001 s, 524288 iters, t-(init.)=0.821 s t(norm)=0.0244677, mflops=204.351 (err=2.1e-016)
FFTW_ESTIMATE plan: (cost = 4.256000e+002) FFTW_NOTW 16
11. FFTW_ESTIMATE: elapsed time t=1.943 s, 1048576 iters, t-(init.)=1.583 s t(norm)=0.0235885, mflops=211.967 (err=2.1e-016)
12. Frigo-old: elapsed time t=1.662 s, 1048576 iters, t-(init.)=1.301 s t(norm)=0.0193864, mflops=257.913 (err=2.1e-016)
13. Green: elapsed time t=1.052 s, 262144 iters, t-(init.)=0.962 s t(norm)=0.0573397, mflops=87.1997 (err=2.2e-016)
14. GSL: elapsed time t=1.091 s, 262144 iters, t-(init.)=1.001 s t(norm)=0.0596642, mflops=83.8023 (err=2.0e-016)
15. GSL DIT: elapsed time t=1.061 s, 131072 iters, t-(init.)=1.021 s t(norm)=0.121713, mflops=41.0804 (err=2.4e-016)
16. GSL DIF: elapsed time t=1.142 s, 131072 iters, t-(init.)=1.102 s t(norm)=0.131369, mflops=38.0608 (err=2.4e-016)
17. Krukar: elapsed time t=1.062 s, 524288 iters, t-(init.)=0.882 s t(norm)=0.0262856, mflops=190.218 (err=2.5e-016)
18. Mayer (Buneman): elapsed time t=1.402 s, 262144 iters, t-(init.)=1.302 s t(norm)=0.0776052, mflops=64.4286 (err=1.3e-016)
19. Mayer (simple): elapsed time t=1.202 s, 262144 iters, t-(init.)=1.122 s t(norm)=0.0668764, mflops=74.7648
20. Mayer (lookup): elapsed time t=1.162 s, 262144 iters, t-(init.)=1.082 s t(norm)=0.0644922, mflops=77.5287 (err=1.4e-016)
21. Ooura (C): elapsed time t=1.332 s, 524288 iters, t-(init.)=1.152 s t(norm)=0.0343323, mflops=145.636 (err=1.6e-016)
22. Ransom: elapsed time t=1.372 s, 65536 iters, t-(init.)=1.342 s t(norm)=0.319958, mflops=15.6271 (err=4.4e-016)
23.
Singleton (f2c): elapsed time t=1.543 s, 262144 iters, t-(init.)=1.453 s t(norm)=0.0866055, mflops=57.733 (err=2.2e-016)
24. Temperton (f2c): elapsed time t=1.952 s, 262144 iters, t-(init.)=1.861 s t(norm)=0.110924, mflops=45.0758 (err=2.0e-016)
25. Valkenburg: elapsed time t=1.112 s, 32768 iters, t-(init.)=1.102 s t(norm)=0.525475, mflops=9.51521 (err=2.4e-016)
Top mflops for N=16 = 257.913
Normalized results and averages for N=16:
fft 0: mflops = 56.6033 (norm. = 0.219467), norm. avg. (of 4) = 0.515589
fft 1: mflops = 58.9916 (norm. = 0.228727), norm. avg. (of 4) = 0.507162
fft 2: mflops = 44.081 (norm. = 0.170914), norm. avg. (of 4) = 0.300107
fft 3: mflops = 15.3976 (norm. = 0.0597008), norm. avg. (of 4) = 0.0311839
fft 4: mflops = 14.6552 (norm. = 0.0568222), norm. avg. (of 4) = 0.0709367
fft 5: mflops = 65.028 (norm. = 0.252132), norm. avg. (of 4) = 0.191934
fft 6: mflops = 61.5904 (norm. = 0.238803), norm. avg. (of 4) = 0.14732
fft 7: mflops = 30.5707 (norm. = 0.118531), norm. avg. (of 4) = 0.075056
fft 8: mflops = 30.1315 (norm. = 0.116828), norm. avg. (of 3) = 0.156543
fft 9: mflops = 82.0803 (norm. = 0.318249), norm. avg. (of 4) = 0.240325
fft 10: mflops = 204.351 (norm. = 0.792326), norm. avg. (of 4) = 0.783602
fft 11: mflops = 211.967 (norm. = 0.821857), norm. avg. (of 4) = 0.776668
fft 12: mflops = 257.913 (norm. = 1), norm. avg. (of 4) = 1
fft 13: mflops = 87.1997 (norm. = 0.338098), norm. avg. (of 2) = 0.384629
fft 14: mflops = 83.8023 (norm. = 0.324925), norm. avg. (of 4) = 0.251241
fft 15: mflops = 41.0804 (norm. = 0.15928), norm. avg. (of 4) = 0.142734
fft 16: mflops = 38.0608 (norm. = 0.147573), norm. avg. (of 4) = 0.132026
fft 17: mflops = 190.218 (norm. = 0.737528), norm. avg. (of 4) = 0.604645
fft 18: mflops = 64.4286 (norm. = 0.249808), norm. avg. (of 3) = 0.292973
fft 19: mflops = 74.7648 (norm. = 0.289884), norm. avg. (of 3) = 0.317955
fft 20: mflops = 77.5287 (norm. = 0.300601), norm. avg. (of 3) = 0.314486
fft 21: mflops = 145.636 (norm.
= 0.56467), norm. avg. (of 4) = 0.611499
fft 22: mflops = 15.6271 (norm. = 0.0605905), norm. avg. (of 3) = 0.0306958
fft 23: mflops = 57.733 (norm. = 0.223847), norm. avg. (of 4) = 0.148079
fft 24: mflops = 45.0758 (norm. = 0.174772), norm. avg. (of 4) = 0.104029
fft 25: mflops = 9.51521 (norm. = 0.0368931), norm. avg. (of 4) = 0.0679137
Benchmarking for array size = 32 (power of 2):
0. Arndt DIF: elapsed time t=1.582 s, 131072 iters, t-(init.)=1.502 s t(norm)=0.0716209, mflops=69.812 (err=2.6e-016)
1. Arndt DIT: elapsed time t=1.572 s, 131072 iters, t-(init.)=1.492 s t(norm)=0.0711441, mflops=70.2799 (err=2.5e-016)
2. Arndt Split-Radix: elapsed time t=1.141 s, 65536 iters, t-(init.)=1.101 s t(norm)=0.105, mflops=47.6193 (err=2.2e-016)
3. Arndt 4-step: elapsed time t=1.662 s, 32768 iters, t-(init.)=1.642 s t(norm)=0.313187, mflops=15.9649 (err=2.5e-016)
4. Beauregard: elapsed time t=1.712 s, 32768 iters, t-(init.)=1.692 s t(norm)=0.322723, mflops=15.4931 (err=2.9e-016)
5. Bergland: elapsed time t=1.272 s, 131072 iters, t-(init.)=1.192 s t(norm)=0.056839, mflops=87.9678 (err=2.8e-016)
6. CWP (min N) (N=33): elapsed time t=1.002 s, 65536 iters, t-(init.)=0.962 s t(norm)=0.0917435, mflops=54.4998
7. CWP (best N) (N=35): elapsed time t=1.973 s, 131072 iters, t-(init.)=1.883 s t(norm)=0.0897884, mflops=55.6865
8. Edelblute: elapsed time t=1.633 s, 65536 iters, t-(init.)=1.593 s t(norm)=0.15192, mflops=32.912 (err=2.7e-016)
9. FFTPACK (f2c): elapsed time t=1.512 s, 131072 iters, t-(init.)=1.432 s t(norm)=0.0682831, mflops=73.2246 (err=3.0e-016)
FFTW_MEASURE plan: (cost = 4.287720e-006) FFTW_NOTW 32
10. FFTW: elapsed time t=1.102 s, 262144 iters, t-(init.)=0.942 s t(norm)=0.022459, mflops=222.628 (err=3.2e-016)
FFTW_ESTIMATE plan: (cost = 3.200000e+001) FFTW_NOTW 32
11. FFTW_ESTIMATE: elapsed time t=1.081 s, 262144 iters, t-(init.)=0.921 s t(norm)=0.0219584, mflops=227.704 (err=3.2e-016)
12.
Frigo-old: elapsed time t=1.021 s, 262144 iters, t-(init.)=0.86 s t(norm)=0.020504, mflops=243.855 (err=3.2e-016)
13. Green: elapsed time t=1.072 s, 131072 iters, t-(init.)=0.992 s t(norm)=0.0473022, mflops=105.703 (err=2.9e-016)
14. GSL: elapsed time t=1.362 s, 131072 iters, t-(init.)=1.282 s t(norm)=0.0611305, mflops=81.7922 (err=2.9e-016)
15. GSL DIT: elapsed time t=1.002 s, 65536 iters, t-(init.)=0.962 s t(norm)=0.0917435, mflops=54.4998 (err=3.6e-016)
16. GSL DIF: elapsed time t=1.062 s, 65536 iters, t-(init.)=1.022 s t(norm)=0.0974655, mflops=51.3002 (err=3.2e-016)
17. Krukar: elapsed time t=1.251 s, 262144 iters, t-(init.)=1.09 s t(norm)=0.0259876, mflops=192.399 (err=2.9e-016)
18. Mayer (Buneman): elapsed time t=1.502 s, 131072 iters, t-(init.)=1.422 s t(norm)=0.0678062, mflops=73.7395 (err=2.3e-016)
19. Mayer (simple): elapsed time t=1.202 s, 131072 iters, t-(init.)=1.122 s t(norm)=0.0535011, mflops=93.456
20. Mayer (lookup): elapsed time t=1.162 s, 131072 iters, t-(init.)=1.082 s t(norm)=0.0515938, mflops=96.9109 (err=2.4e-016)
21. Ooura (C): elapsed time t=1.462 s, 262144 iters, t-(init.)=1.301 s t(norm)=0.0310183, mflops=161.195 (err=2.7e-016)
22. Ransom: elapsed time t=1.001 s, 16384 iters, t-(init.)=0.991 s t(norm)=0.378036, mflops=13.2262 (err=6.6e-016)
23. Singleton (f2c): elapsed time t=1.532 s, 131072 iters, t-(init.)=1.452 s t(norm)=0.0692368, mflops=72.216 (err=3.0e-016)
24. Temperton (f2c): elapsed time t=1.041 s, 65536 iters, t-(init.)=1.001 s t(norm)=0.0954628, mflops=52.3764 (err=3.0e-016)
25. Valkenburg: elapsed time t=1.342 s, 16384 iters, t-(init.)=1.332 s t(norm)=0.508118, mflops=9.84024 (err=2.6e-016)
Top mflops for N=32 = 243.855
Normalized results and averages for N=32:
fft 0: mflops = 69.812 (norm. = 0.286285), norm. avg. (of 5) = 0.469728
fft 1: mflops = 70.2799 (norm. = 0.288204), norm. avg. (of 5) = 0.463371
fft 2: mflops = 47.6193 (norm. = 0.195277), norm. avg. (of 5) = 0.279141
fft 3: mflops = 15.9649 (norm. = 0.0654689), norm.
avg. (of 5) = 0.0380409
fft 4: mflops = 15.4931 (norm. = 0.0635343), norm. avg. (of 5) = 0.0694562
fft 5: mflops = 87.9678 (norm. = 0.360738), norm. avg. (of 5) = 0.225695
fft 6: mflops = 54.4998 (norm. = 0.223493), norm. avg. (of 5) = 0.162554
fft 7: mflops = 55.6865 (norm. = 0.228359), norm. avg. (of 5) = 0.105717
fft 8: mflops = 32.912 (norm. = 0.134965), norm. avg. (of 4) = 0.151149
fft 9: mflops = 73.2246 (norm. = 0.300279), norm. avg. (of 5) = 0.252316
fft 10: mflops = 222.628 (norm. = 0.912951), norm. avg. (of 5) = 0.809472
fft 11: mflops = 227.704 (norm. = 0.933768), norm. avg. (of 5) = 0.808088
fft 12: mflops = 243.855 (norm. = 1), norm. avg. (of 5) = 1
fft 13: mflops = 105.703 (norm. = 0.433468), norm. avg. (of 3) = 0.400908
fft 14: mflops = 81.7922 (norm. = 0.335413), norm. avg. (of 5) = 0.268075
fft 15: mflops = 54.4998 (norm. = 0.223493), norm. avg. (of 5) = 0.158886
fft 16: mflops = 51.3002 (norm. = 0.210372), norm. avg. (of 5) = 0.147695
fft 17: mflops = 192.399 (norm. = 0.788991), norm. avg. (of 5) = 0.641514
fft 18: mflops = 73.7395 (norm. = 0.302391), norm. avg. (of 4) = 0.295327
fft 19: mflops = 93.456 (norm. = 0.383244), norm. avg. (of 4) = 0.334278
fft 20: mflops = 96.9109 (norm. = 0.397412), norm. avg. (of 4) = 0.335218
fft 21: mflops = 161.195 (norm. = 0.66103), norm. avg. (of 5) = 0.621405
fft 22: mflops = 13.2262 (norm. = 0.0542381), norm. avg. (of 4) = 0.0365814
fft 23: mflops = 72.216 (norm. = 0.296143), norm. avg. (of 5) = 0.177692
fft 24: mflops = 52.3764 (norm. = 0.214785), norm. avg. (of 5) = 0.12618
fft 25: mflops = 9.84024 (norm. = 0.0403529), norm. avg. (of 5) = 0.0624015
Benchmarking for array size = 64 (power of 2):
0. Arndt DIF: elapsed time t=1.052 s, 32768 iters, t-(init.)=1.022 s t(norm)=0.0812213, mflops=61.5602 (err=5.7e-016)
1. Arndt DIT: elapsed time t=1.012 s, 32768 iters, t-(init.)=0.982 s t(norm)=0.0780423, mflops=64.0678 (err=5.4e-016)
2.
Arndt Split-Radix: elapsed time t=1.282 s, 32768 iters, t-(init.)=1.252 s t(norm)=0.0995, mflops=50.2512 (err=5.8e-016)
3. Arndt 4-step: elapsed time t=1.212 s, 16384 iters, t-(init.)=1.192 s t(norm)=0.189463, mflops=26.3903 (err=5.5e-016)
4. Beauregard: elapsed time t=1.031 s, 8192 iters, t-(init.)=1.021 s t(norm)=0.324567, mflops=15.4051 (err=5.7e-016)
5. Bergland: elapsed time t=1.452 s, 65536 iters, t-(init.)=1.382 s t(norm)=0.0549157, mflops=91.0486 (err=5.3e-016)
6. CWP (min N) (N=65): elapsed time t=1.062 s, 32768 iters, t-(init.)=1.032 s t(norm)=0.082016, mflops=60.9637
7. CWP (best N) (N=84): elapsed time t=1.232 s, 32768 iters, t-(init.)=1.182 s t(norm)=0.0939369, mflops=53.2272
8. Edelblute: elapsed time t=1.822 s, 32768 iters, t-(init.)=1.782 s t(norm)=0.141621, mflops=35.3056 (err=5.7e-016)
9. FFTPACK (f2c): elapsed time t=1.432 s, 65536 iters, t-(init.)=1.362 s t(norm)=0.054121, mflops=92.3856 (err=5.4e-016)
FFTW_MEASURE plan: (cost = 1.068115e-005) FFTW_TWIDDLE 4 FFTW_NOTW 16
10. FFTW: elapsed time t=1.442 s, 131072 iters, t-(init.)=1.292 s t(norm)=0.0256697, mflops=194.782 (err=5.4e-016)
FFTW_ESTIMATE plan: (cost = 7.680000e+002) FFTW_TWIDDLE 2 FFTW_NOTW 32
11. FFTW_ESTIMATE: elapsed time t=1.412 s, 131072 iters, t-(init.)=1.262 s t(norm)=0.0250737, mflops=199.412 (err=5.7e-016)
12. Frigo-old: elapsed time t=1.602 s, 131072 iters, t-(init.)=1.452 s t(norm)=0.0288486, mflops=173.318 (err=5.7e-016)
13. Green: elapsed time t=1.031 s, 65536 iters, t-(init.)=0.951 s t(norm)=0.0377893, mflops=132.312 (err=5.7e-016)
14. GSL: elapsed time t=1.342 s, 65536 iters, t-(init.)=1.262 s t(norm)=0.0501474, mflops=99.7061 (err=5.4e-016)
15. GSL DIT: elapsed time t=1.021 s, 32768 iters, t-(init.)=0.981 s t(norm)=0.0779629, mflops=64.1331 (err=5.8e-016)
16. GSL DIF: elapsed time t=1.062 s, 32768 iters, t-(init.)=1.022 s t(norm)=0.0812213, mflops=61.5602 (err=5.7e-016)
17.
Krukar: elapsed time t=1.953 s, 65536 iters, t-(init.)=1.873 s t(norm)=0.0744263, mflops=67.1805 (err=6.0e-016)
18. Mayer (Buneman): elapsed time t=1.833 s, 65536 iters, t-(init.)=1.753 s t(norm)=0.069658, mflops=71.7793 (err=5.1e-016)
19. Mayer (simple): elapsed time t=1.402 s, 65536 iters, t-(init.)=1.332 s t(norm)=0.0529289, mflops=94.4663
20. Mayer (lookup): elapsed time t=1.342 s, 65536 iters, t-(init.)=1.262 s t(norm)=0.0501474, mflops=99.7061 (err=5.6e-016)
21. Ooura (C): elapsed time t=1.813 s, 131072 iters, t-(init.)=1.663 s t(norm)=0.0330408, mflops=151.328 (err=5.3e-016)
22. Ransom: elapsed time t=1.812 s, 32768 iters, t-(init.)=1.781 s t(norm)=0.141541, mflops=35.3254 (err=8.5e-016)
23. Singleton (f2c): elapsed time t=1.432 s, 65536 iters, t-(init.)=1.352 s t(norm)=0.0537237, mflops=93.0689 (err=8.6e-016)
24. Temperton (f2c): elapsed time t=1.823 s, 65536 iters, t-(init.)=1.753 s t(norm)=0.069658, mflops=71.7793 (err=5.4e-016)
25. Valkenburg: elapsed time t=1.583 s, 8192 iters, t-(init.)=1.573 s t(norm)=0.500043, mflops=9.99914 (err=6.7e-016)
Top mflops for N=64 = 199.412
Normalized results and averages for N=64:
fft 0: mflops = 61.5602 (norm. = 0.308708), norm. avg. (of 6) = 0.442891
fft 1: mflops = 64.0678 (norm. = 0.321283), norm. avg. (of 6) = 0.439689
fft 2: mflops = 50.2512 (norm. = 0.251997), norm. avg. (of 6) = 0.274617
fft 3: mflops = 26.3903 (norm. = 0.132341), norm. avg. (of 6) = 0.0537575
fft 4: mflops = 15.4051 (norm. = 0.0772527), norm. avg. (of 6) = 0.0707556
fft 5: mflops = 91.0486 (norm. = 0.456585), norm. avg. (of 6) = 0.264177
fft 6: mflops = 60.9637 (norm. = 0.305717), norm. avg. (of 6) = 0.186415
fft 7: mflops = 53.2272 (norm. = 0.26692), norm. avg. (of 6) = 0.132584
fft 8: mflops = 35.3056 (norm. = 0.177048), norm. avg. (of 5) = 0.156329
fft 9: mflops = 92.3856 (norm. = 0.463289), norm. avg. (of 6) = 0.287478
fft 10: mflops = 194.782 (norm. = 0.97678), norm. avg. (of 6) = 0.837357
fft 11: mflops = 199.412 (norm. = 1), norm. avg.
(of 6) = 0.840073
fft 12: mflops = 173.318 (norm. = 0.869146), norm. avg. (of 6) = 0.978191
fft 13: mflops = 132.312 (norm. = 0.663512), norm. avg. (of 4) = 0.466559
fft 14: mflops = 99.7061 (norm. = 0.5), norm. avg. (of 6) = 0.306729
fft 15: mflops = 64.1331 (norm. = 0.321611), norm. avg. (of 6) = 0.186007
fft 16: mflops = 61.5602 (norm. = 0.308708), norm. avg. (of 6) = 0.174531
fft 17: mflops = 67.1805 (norm. = 0.336893), norm. avg. (of 6) = 0.590744
fft 18: mflops = 71.7793 (norm. = 0.359954), norm. avg. (of 5) = 0.308253
fft 19: mflops = 94.4663 (norm. = 0.473724), norm. avg. (of 5) = 0.362167
fft 20: mflops = 99.7061 (norm. = 0.5), norm. avg. (of 5) = 0.368174
fft 21: mflops = 151.328 (norm. = 0.75887), norm. avg. (of 6) = 0.644316
fft 22: mflops = 35.3254 (norm. = 0.177148), norm. avg. (of 5) = 0.0646947
fft 23: mflops = 93.0689 (norm. = 0.466716), norm. avg. (of 6) = 0.225863
fft 24: mflops = 71.7793 (norm. = 0.359954), norm. avg. (of 6) = 0.165143
fft 25: mflops = 9.99914 (norm. = 0.050143), norm. avg. (of 6) = 0.0603584
Benchmarking for array size = 128 (power of 2):
0. Arndt DIF: elapsed time t=1.062 s, 16384 iters, t-(init.)=1.032 s t(norm)=0.0702994, mflops=71.1243 (err=3.7e-016)
1. Arndt DIT: elapsed time t=1.032 s, 16384 iters, t-(init.)=0.992 s t(norm)=0.0675746, mflops=73.9923 (err=3.6e-016)
2. Arndt Split-Radix: elapsed time t=1.362 s, 16384 iters, t-(init.)=1.322 s t(norm)=0.0900541, mflops=55.5222 (err=3.8e-016)
3. Arndt 4-step: elapsed time t=1.512 s, 8192 iters, t-(init.)=1.492 s t(norm)=0.203269, mflops=24.598 (err=3.3e-016)
4. Beauregard: elapsed time t=1.212 s, 4096 iters, t-(init.)=1.202 s t(norm)=0.327519, mflops=15.2663 (err=4.0e-016)
5. Bergland: elapsed time t=1.562 s, 32768 iters, t-(init.)=1.492 s t(norm)=0.0508172, mflops=98.3918 (err=3.2e-016)
6. CWP (min N) (N=130): elapsed time t=1.192 s, 16384 iters, t-(init.)=1.162 s t(norm)=0.079155, mflops=63.1672
7.
CWP (best N) (N=140): elapsed time t=1.102 s, 16384 iters, t-(init.)=1.062 s t(norm)=0.072343, mflops=69.1152 8. Edelblute: elapsed time t=1.943 s, 16384 iters, t-(init.)=1.903 s t(norm)=0.129632, mflops=38.5708 (err=3.7e-016) 9. FFTPACK (f2c): elapsed time t=1.743 s, 32768 iters, t-(init.)=1.673 s t(norm)=0.056982, mflops=87.7469 (err=3.7e-016) FFTW_MEASURE plan: (cost = 2.386475e-005) FFTW_TWIDDLE 4 FFTW_NOTW 32 10. FFTW: elapsed time t=1.542 s, 65536 iters, t-(init.)=1.402 s t(norm)=0.0238759, mflops=209.416 (err=3.5e-016) FFTW_ESTIMATE plan: (cost = 1.075200e+003) FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.542 s, 65536 iters, t-(init.)=1.392 s t(norm)=0.0237056, mflops=210.92 (err=3.5e-016) 12. Frigo-old: elapsed time t=1.732 s, 65536 iters, t-(init.)=1.582 s t(norm)=0.0269413, mflops=185.589 (err=3.5e-016) 13. Green: elapsed time t=1.392 s, 32768 iters, t-(init.)=1.312 s t(norm)=0.0446865, mflops=111.891 (err=4.0e-016) 14. GSL: elapsed time t=1.412 s, 32768 iters, t-(init.)=1.341 s t(norm)=0.0456742, mflops=109.471 (err=3.6e-016) 15. GSL DIT: elapsed time t=1.052 s, 16384 iters, t-(init.)=1.012 s t(norm)=0.068937, mflops=72.53 (err=3.8e-016) 16. GSL DIF: elapsed time t=1.102 s, 16384 iters, t-(init.)=1.072 s t(norm)=0.0730242, mflops=68.4704 (err=3.4e-016) 17. Krukar: elapsed time t=1.813 s, 16384 iters, t-(init.)=1.783 s t(norm)=0.121457, mflops=41.1668 (err=3.5e-016) 18. Mayer (Buneman): elapsed time t=1.933 s, 32768 iters, t-(init.)=1.863 s t(norm)=0.0634534, mflops=78.798 (err=3.4e-016) 19. Mayer (simple): elapsed time t=1.452 s, 32768 iters, t-(init.)=1.372 s t(norm)=0.04673, mflops=106.998 20. Mayer (lookup): elapsed time t=1.382 s, 32768 iters, t-(init.)=1.312 s t(norm)=0.0446865, mflops=111.891 (err=3.8e-016) 21. Ooura (C): elapsed time t=1.943 s, 65536 iters, t-(init.)=1.803 s t(norm)=0.0307049, mflops=162.84 (err=3.3e-016) 22. 
Ransom: elapsed time t=1.171 s, 8192 iters, t-(init.)=1.151 s t(norm)=0.156811, mflops=31.8855 (err=9.7e-016) 23. Singleton (f2c): elapsed time t=1.652 s, 32768 iters, t-(init.)=1.572 s t(norm)=0.053542, mflops=93.3846 (err=4.1e-016) 24. Temperton (f2c): elapsed time t=1.061 s, 16384 iters, t-(init.)=1.021 s t(norm)=0.0695501, mflops=71.8906 (err=3.4e-016) 25. Valkenburg: elapsed time t=1.803 s, 4096 iters, t-(init.)=1.793 s t(norm)=0.488554, mflops=10.2343 (err=5.4e-016) Top mflops for N=128 = 210.92 Normalized results and averages for N=128: fft 0: mflops = 71.1243 (norm. = 0.337209), norm. avg. (of 7) = 0.427794 fft 1: mflops = 73.9923 (norm. = 0.350806), norm. avg. (of 7) = 0.426992 fft 2: mflops = 55.5222 (norm. = 0.263238), norm. avg. (of 7) = 0.272991 fft 3: mflops = 24.598 (norm. = 0.116622), norm. avg. (of 7) = 0.0627382 fft 4: mflops = 15.2663 (norm. = 0.0723794), norm. avg. (of 7) = 0.0709876 fft 5: mflops = 98.3918 (norm. = 0.466488), norm. avg. (of 7) = 0.293078 fft 6: mflops = 63.1672 (norm. = 0.299484), norm. avg. (of 7) = 0.202567 fft 7: mflops = 69.1152 (norm. = 0.327684), norm. avg. (of 7) = 0.160455 fft 8: mflops = 38.5708 (norm. = 0.182869), norm. avg. (of 6) = 0.160752 fft 9: mflops = 87.7469 (norm. = 0.416019), norm. avg. (of 7) = 0.305841 fft 10: mflops = 209.416 (norm. = 0.992867), norm. avg. (of 7) = 0.859573 fft 11: mflops = 210.92 (norm. = 1), norm. avg. (of 7) = 0.86292 fft 12: mflops = 185.589 (norm. = 0.879899), norm. avg. (of 7) = 0.964149 fft 13: mflops = 111.891 (norm. = 0.530488), norm. avg. (of 5) = 0.479345 fft 14: mflops = 109.471 (norm. = 0.519016), norm. avg. (of 7) = 0.337056 fft 15: mflops = 72.53 (norm. = 0.343874), norm. avg. (of 7) = 0.208559 fft 16: mflops = 68.4704 (norm. = 0.324627), norm. avg. (of 7) = 0.195973 fft 17: mflops = 41.1668 (norm. = 0.195177), norm. avg. (of 7) = 0.534234 fft 18: mflops = 78.798 (norm. = 0.373591), norm. avg. (of 6) = 0.319142 fft 19: mflops = 106.998 (norm. = 0.507289), norm. avg. 
(of 6) = 0.386354 fft 20: mflops = 111.891 (norm. = 0.530488), norm. avg. (of 6) = 0.395226 fft 21: mflops = 162.84 (norm. = 0.772047), norm. avg. (of 7) = 0.662563 fft 22: mflops = 31.8855 (norm. = 0.151173), norm. avg. (of 6) = 0.0791077 fft 23: mflops = 93.3846 (norm. = 0.442748), norm. avg. (of 7) = 0.256846 fft 24: mflops = 71.8906 (norm. = 0.340842), norm. avg. (of 7) = 0.190242 fft 25: mflops = 10.2343 (norm. = 0.048522), norm. avg. (of 7) = 0.0586675 Benchmarking for array size = 256 (power of 2): 0. Arndt DIF: elapsed time t=1.212 s, 8192 iters, t-(init.)=1.172 s t(norm)=0.0698566, mflops=71.5752 (err=7.3e-016) 1. Arndt DIT: elapsed time t=1.161 s, 8192 iters, t-(init.)=1.13 s t(norm)=0.0673532, mflops=74.2355 (err=7.7e-016) 2. Arndt Split-Radix: elapsed time t=1.412 s, 8192 iters, t-(init.)=1.372 s t(norm)=0.0817776, mflops=61.1415 (err=7.4e-016) 3. Arndt 4-step: elapsed time t=1.472 s, 4096 iters, t-(init.)=1.452 s t(norm)=0.173092, mflops=28.8864 (err=7.4e-016) 4. Beauregard: elapsed time t=1.382 s, 2048 iters, t-(init.)=1.372 s t(norm)=0.32711, mflops=15.2854 (err=8.8e-016) 5. Bergland: elapsed time t=1.593 s, 16384 iters, t-(init.)=1.523 s t(norm)=0.0453889, mflops=110.159 (err=7.7e-016) 6. CWP (min N) (N=260): elapsed time t=1.182 s, 8192 iters, t-(init.)=1.142 s t(norm)=0.0680685, mflops=73.4554 7. CWP (best N) (N=280): elapsed time t=1.121 s, 8192 iters, t-(init.)=1.081 s t(norm)=0.0644326, mflops=77.6004 8. Edelblute: elapsed time t=1.011 s, 4096 iters, t-(init.)=1.001 s t(norm)=0.119328, mflops=41.9011 (err=7.4e-016) 9. FFTPACK (f2c): elapsed time t=1.792 s, 16384 iters, t-(init.)=1.721 s t(norm)=0.0512898, mflops=97.4853 (err=8.5e-016) FFTW_MEASURE plan: (cost = 5.371094e-005) FFTW_TWIDDLE 2 FFTW_TWIDDLE 4 FFTW_NOTW 32 10. FFTW: elapsed time t=1.812 s, 32768 iters, t-(init.)=1.671 s t(norm)=0.0248998, mflops=200.805 (err=8.2e-016) FFTW_ESTIMATE plan: (cost = 9.216000e+002) FFTW_TWIDDLE 8 FFTW_NOTW 32 11. 
FFTW_ESTIMATE: elapsed time t=1.813 s, 32768 iters, t-(init.)=1.673 s t(norm)=0.0249296, mflops=200.564 (err=8.3e-016) 12. Frigo-old: elapsed time t=1.943 s, 32768 iters, t-(init.)=1.803 s t(norm)=0.0268668, mflops=186.103 (err=8.4e-016) 13. Green: elapsed time t=1.482 s, 16384 iters, t-(init.)=1.412 s t(norm)=0.0420809, mflops=118.819 (err=8.4e-016) 14. GSL: elapsed time t=1.492 s, 16384 iters, t-(init.)=1.421 s t(norm)=0.0423491, mflops=118.066 (err=8.5e-016) 15. GSL DIT: elapsed time t=1.122 s, 8192 iters, t-(init.)=1.082 s t(norm)=0.0644922, mflops=77.5287 (err=8.7e-016) 16. GSL DIF: elapsed time t=1.161 s, 8192 iters, t-(init.)=1.131 s t(norm)=0.0674129, mflops=74.1698 (err=8.4e-016) 17. Krukar: elapsed time t=1.342 s, 8192 iters, t-(init.)=1.302 s t(norm)=0.0776052, mflops=64.4286 (err=8.6e-016) 18. Mayer (Buneman): elapsed time t=1.051 s, 8192 iters, t-(init.)=1.011 s t(norm)=0.0602603, mflops=82.9734 (err=7.4e-016) 19. Mayer (simple): elapsed time t=1.573 s, 16384 iters, t-(init.)=1.503 s t(norm)=0.0447929, mflops=111.625 20. Mayer (lookup): elapsed time t=1.502 s, 16384 iters, t-(init.)=1.432 s t(norm)=0.0426769, mflops=117.159 (err=7.2e-016) 21. Ooura (C): elapsed time t=1.121 s, 16384 iters, t-(init.)=1.05 s t(norm)=0.0312924, mflops=159.783 (err=7.7e-016) 22. Ransom: elapsed time t=1.582 s, 8192 iters, t-(init.)=1.542 s t(norm)=0.0919104, mflops=54.4008 (err=1.3e-015) 23. Singleton (f2c): elapsed time t=1.543 s, 16384 iters, t-(init.)=1.463 s t(norm)=0.0436008, mflops=114.677 (err=1.2e-015) 24. Temperton (f2c): elapsed time t=1.012 s, 8192 iters, t-(init.)=0.972 s t(norm)=0.0579357, mflops=86.3026 (err=8.5e-016) 25. Valkenburg: elapsed time t=1.012 s, 1024 iters, t-(init.)=1.012 s t(norm)=0.482559, mflops=10.3614 (err=8.9e-016) Top mflops for N=256 = 200.805 Normalized results and averages for N=256: fft 0: mflops = 71.5752 (norm. = 0.356442), norm. avg. (of 8) = 0.418875 fft 1: mflops = 74.2355 (norm. = 0.36969), norm. avg. 
(of 8) = 0.419829 fft 2: mflops = 61.1415 (norm. = 0.304483), norm. avg. (of 8) = 0.276928 fft 3: mflops = 28.8864 (norm. = 0.143853), norm. avg. (of 8) = 0.0728776 fft 4: mflops = 15.2854 (norm. = 0.0761206), norm. avg. (of 8) = 0.0716292 fft 5: mflops = 110.159 (norm. = 0.548588), norm. avg. (of 8) = 0.325017 fft 6: mflops = 73.4554 (norm. = 0.365806), norm. avg. (of 8) = 0.222972 fft 7: mflops = 77.6004 (norm. = 0.386448), norm. avg. (of 8) = 0.188704 fft 8: mflops = 41.9011 (norm. = 0.208666), norm. avg. (of 7) = 0.167597 fft 9: mflops = 97.4853 (norm. = 0.485474), norm. avg. (of 8) = 0.328295 fft 10: mflops = 200.805 (norm. = 1), norm. avg. (of 8) = 0.877126 fft 11: mflops = 200.564 (norm. = 0.998805), norm. avg. (of 8) = 0.879905 fft 12: mflops = 186.103 (norm. = 0.926789), norm. avg. (of 8) = 0.959479 fft 13: mflops = 118.819 (norm. = 0.591714), norm. avg. (of 6) = 0.498073 fft 14: mflops = 118.066 (norm. = 0.587966), norm. avg. (of 8) = 0.36842 fft 15: mflops = 77.5287 (norm. = 0.386091), norm. avg. (of 8) = 0.230751 fft 16: mflops = 74.1698 (norm. = 0.369363), norm. avg. (of 8) = 0.217647 fft 17: mflops = 64.4286 (norm. = 0.320853), norm. avg. (of 8) = 0.507562 fft 18: mflops = 82.9734 (norm. = 0.413205), norm. avg. (of 7) = 0.33258 fft 19: mflops = 111.625 (norm. = 0.555888), norm. avg. (of 7) = 0.410573 fft 20: mflops = 117.159 (norm. = 0.58345), norm. avg. (of 7) = 0.422115 fft 21: mflops = 159.783 (norm. = 0.795714), norm. avg. (of 8) = 0.679207 fft 22: mflops = 54.4008 (norm. = 0.270914), norm. avg. (of 7) = 0.106509 fft 23: mflops = 114.677 (norm. = 0.571087), norm. avg. (of 8) = 0.296126 fft 24: mflops = 86.3026 (norm. = 0.429784), norm. avg. (of 8) = 0.220185 fft 25: mflops = 10.3614 (norm. = 0.0515996), norm. avg. (of 8) = 0.057784 Benchmarking for array size = 512 (power of 2): 0. Arndt DIF: elapsed time t=1.222 s, 4096 iters, t-(init.)=1.192 s t(norm)=0.0631544, mflops=79.171 (err=9.4e-016) 1. 
Arndt DIT: elapsed time t=1.182 s, 4096 iters, t-(init.)=1.142 s t(norm)=0.0605053, mflops=82.6373 (err=9.2e-016) 2. Arndt Split-Radix: elapsed time t=1.582 s, 4096 iters, t-(init.)=1.542 s t(norm)=0.0816981, mflops=61.2009 (err=9.2e-016) 3. Arndt 4-step: elapsed time t=1.612 s, 2048 iters, t-(init.)=1.592 s t(norm)=0.168694, mflops=29.6394 (err=9.0e-016) 4. Beauregard: elapsed time t=1.572 s, 1024 iters, t-(init.)=1.562 s t(norm)=0.331031, mflops=15.1043 (err=9.6e-016) 5. Bergland: elapsed time t=1.742 s, 8192 iters, t-(init.)=1.671 s t(norm)=0.0442664, mflops=112.953 (err=1.0e-015) 6. CWP (min N) (N=520): elapsed time t=1.232 s, 4096 iters, t-(init.)=1.202 s t(norm)=0.0636843, mflops=78.5123 7. CWP (best N) (N=560): elapsed time t=1.222 s, 4096 iters, t-(init.)=1.182 s t(norm)=0.0626246, mflops=79.8408 8. Edelblute: elapsed time t=1.102 s, 2048 iters, t-(init.)=1.082 s t(norm)=0.114653, mflops=43.6099 (err=9.2e-016) 9. FFTPACK (f2c): elapsed time t=1.322 s, 4096 iters, t-(init.)=1.292 s t(norm)=0.0684526, mflops=73.0432 (err=9.3e-016) FFTW_MEASURE plan: (cost = 1.562500e-004) FFTW_TWIDDLE 16 FFTW_NOTW 32 10. FFTW: elapsed time t=1.292 s, 8192 iters, t-(init.)=1.222 s t(norm)=0.0323719, mflops=154.455 (err=8.7e-016) FFTW_ESTIMATE plan: (cost = 1.843200e+003) FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.292 s, 8192 iters, t-(init.)=1.222 s t(norm)=0.0323719, mflops=154.455 (err=8.7e-016) 12. Frigo-old: elapsed time t=1.432 s, 8192 iters, t-(init.)=1.362 s t(norm)=0.0360807, mflops=138.578 (err=8.5e-016) 13. Green: elapsed time t=1.523 s, 8192 iters, t-(init.)=1.453 s t(norm)=0.0384914, mflops=129.899 (err=9.5e-016) 14. GSL: elapsed time t=1.923 s, 8192 iters, t-(init.)=1.853 s t(norm)=0.0490877, mflops=101.858 (err=9.1e-016) 15. GSL DIT: elapsed time t=1.252 s, 4096 iters, t-(init.)=1.212 s t(norm)=0.0642141, mflops=77.8646 (err=1.1e-015) 16. 
GSL DIF: elapsed time t=1.252 s, 4096 iters, t-(init.)=1.212 s t(norm)=0.0642141, mflops=77.8646 (err=1.1e-015) 17. Krukar: elapsed time t=1.763 s, 4096 iters, t-(init.)=1.733 s t(norm)=0.0918176, mflops=54.4558 (err=9.3e-016) 18. Mayer (Buneman): elapsed time t=1.092 s, 4096 iters, t-(init.)=1.052 s t(norm)=0.055737, mflops=89.7071 (err=8.7e-016) 19. Mayer (simple): elapsed time t=1.633 s, 8192 iters, t-(init.)=1.563 s t(norm)=0.0414054, mflops=120.757 20. Mayer (lookup): elapsed time t=1.542 s, 8192 iters, t-(init.)=1.472 s t(norm)=0.0389947, mflops=128.223 (err=8.8e-016) 21. Ooura (C): elapsed time t=1.172 s, 8192 iters, t-(init.)=1.092 s t(norm)=0.0289281, mflops=172.842 (err=9.6e-016) 22. Ransom: elapsed time t=1.181 s, 2048 iters, t-(init.)=1.161 s t(norm)=0.123024, mflops=40.6425 (err=1.2e-015) 23. Singleton (f2c): elapsed time t=1.672 s, 8192 iters, t-(init.)=1.602 s t(norm)=0.0424385, mflops=117.818 (err=1.1e-015) 24. Temperton (f2c): elapsed time t=1.282 s, 4096 iters, t-(init.)=1.252 s t(norm)=0.0663333, mflops=75.3769 (err=9.3e-016) 25. Valkenburg: elapsed time t=1.141 s, 512 iters, t-(init.)=1.131 s t(norm)=0.47938, mflops=10.4301 (err=1.4e-015) Top mflops for N=512 = 172.842 Normalized results and averages for N=512: fft 0: mflops = 79.171 (norm. = 0.458054), norm. avg. (of 9) = 0.423228 fft 1: mflops = 82.6373 (norm. = 0.478109), norm. avg. (of 9) = 0.426305 fft 2: mflops = 61.2009 (norm. = 0.354086), norm. avg. (of 9) = 0.285501 fft 3: mflops = 29.6394 (norm. = 0.171482), norm. avg. (of 9) = 0.0838336 fft 4: mflops = 15.1043 (norm. = 0.087388), norm. avg. (of 9) = 0.0733802 fft 5: mflops = 112.953 (norm. = 0.653501), norm. avg. (of 9) = 0.361515 fft 6: mflops = 78.5123 (norm. = 0.454243), norm. avg. (of 9) = 0.248669 fft 7: mflops = 79.8408 (norm. = 0.461929), norm. avg. (of 9) = 0.219063 fft 8: mflops = 43.6099 (norm. = 0.252311), norm. avg. (of 8) = 0.178186 fft 9: mflops = 73.0432 (norm. = 0.422601), norm. avg. 
(of 9) = 0.338773 fft 10: mflops = 154.455 (norm. = 0.893617), norm. avg. (of 9) = 0.878958 fft 11: mflops = 154.455 (norm. = 0.893617), norm. avg. (of 9) = 0.881429 fft 12: mflops = 138.578 (norm. = 0.801762), norm. avg. (of 9) = 0.941955 fft 13: mflops = 129.899 (norm. = 0.751549), norm. avg. (of 7) = 0.534284 fft 14: mflops = 101.858 (norm. = 0.589315), norm. avg. (of 9) = 0.392964 fft 15: mflops = 77.8646 (norm. = 0.450495), norm. avg. (of 9) = 0.255167 fft 16: mflops = 77.8646 (norm. = 0.450495), norm. avg. (of 9) = 0.243519 fft 17: mflops = 54.4558 (norm. = 0.315061), norm. avg. (of 9) = 0.486173 fft 18: mflops = 89.7071 (norm. = 0.519011), norm. avg. (of 8) = 0.355884 fft 19: mflops = 120.757 (norm. = 0.698656), norm. avg. (of 8) = 0.446583 fft 20: mflops = 128.223 (norm. = 0.741848), norm. avg. (of 8) = 0.462082 fft 21: mflops = 172.842 (norm. = 1), norm. avg. (of 9) = 0.714851 fft 22: mflops = 40.6425 (norm. = 0.235142), norm. avg. (of 8) = 0.122588 fft 23: mflops = 117.818 (norm. = 0.681648), norm. avg. (of 9) = 0.338962 fft 24: mflops = 75.3769 (norm. = 0.436102), norm. avg. (of 9) = 0.244176 fft 25: mflops = 10.4301 (norm. = 0.0603448), norm. avg. (of 9) = 0.0580685 Benchmarking for array size = 1024 (power of 2): 0. Arndt DIF: elapsed time t=1.372 s, 2048 iters, t-(init.)=1.332 s t(norm)=0.0635147, mflops=78.7219 (err=1.7e-015) 1. Arndt DIT: elapsed time t=1.332 s, 2048 iters, t-(init.)=1.292 s t(norm)=0.0616074, mflops=81.1591 (err=1.7e-015) 2. Arndt Split-Radix: elapsed time t=1.682 s, 2048 iters, t-(init.)=1.642 s t(norm)=0.0782967, mflops=63.8597 (err=1.6e-015) 3. Arndt 4-step: elapsed time t=1.623 s, 1024 iters, t-(init.)=1.603 s t(norm)=0.152874, mflops=32.7067 (err=1.6e-015) 4. Beauregard: elapsed time t=1.783 s, 512 iters, t-(init.)=1.773 s t(norm)=0.338173, mflops=14.7853 (err=1.8e-015) 5. Bergland: elapsed time t=1.953 s, 4096 iters, t-(init.)=1.883 s t(norm)=0.0448942, mflops=111.373 (err=1.7e-015) 6. 
CWP (min N) (N=1040): elapsed time t=1.362 s, 2048 iters, t-(init.)=1.322 s t(norm)=0.0630379, mflops=79.3174 7. CWP (best N) (N=1040): elapsed time t=1.372 s, 2048 iters, t-(init.)=1.332 s t(norm)=0.0635147, mflops=78.7219 8. Edelblute: elapsed time t=1.162 s, 1024 iters, t-(init.)=1.142 s t(norm)=0.10891, mflops=45.9096 (err=1.7e-015) 9. FFTPACK (f2c): elapsed time t=1.762 s, 2048 iters, t-(init.)=1.732 s t(norm)=0.0825882, mflops=60.5413 (err=1.7e-015) FFTW_MEASURE plan: (cost = 3.906250e-004) FFTW_TWIDDLE 4 FFTW_TWIDDLE 8 FFTW_NOTW 32 10. FFTW: elapsed time t=1.632 s, 4096 iters, t-(init.)=1.562 s t(norm)=0.037241, mflops=134.261 (err=1.7e-015) FFTW_ESTIMATE plan: (cost = 1.126400e+004) FFTW_TWIDDLE 8 FFTW_TWIDDLE 4 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.642 s, 4096 iters, t-(init.)=1.572 s t(norm)=0.0374794, mflops=133.407 (err=1.7e-015) 12. Frigo-old: elapsed time t=1.272 s, 2048 iters, t-(init.)=1.232 s t(norm)=0.0587463, mflops=85.1117 (err=1.7e-015) 13. Green: elapsed time t=1.052 s, 2048 iters, t-(init.)=1.022 s t(norm)=0.0487328, mflops=102.6 (err=1.7e-015) 14. GSL: elapsed time t=1.452 s, 2048 iters, t-(init.)=1.412 s t(norm)=0.0673294, mflops=74.2618 (err=1.7e-015) 15. GSL DIT: elapsed time t=1.523 s, 2048 iters, t-(init.)=1.483 s t(norm)=0.070715, mflops=70.7064 (err=1.9e-015) 16. GSL DIF: elapsed time t=1.492 s, 2048 iters, t-(init.)=1.452 s t(norm)=0.0692368, mflops=72.216 (err=1.9e-015) 17. Krukar: elapsed time t=1.161 s, 1024 iters, t-(init.)=1.141 s t(norm)=0.108814, mflops=45.9499 (err=1.7e-015) 18. Mayer (Buneman): elapsed time t=1.192 s, 2048 iters, t-(init.)=1.162 s t(norm)=0.0554085, mflops=90.2389 (err=1.6e-015) 19. Mayer (simple): elapsed time t=1.793 s, 4096 iters, t-(init.)=1.723 s t(norm)=0.0410795, mflops=121.715 20. Mayer (lookup): elapsed time t=1.742 s, 4096 iters, t-(init.)=1.672 s t(norm)=0.0398636, mflops=125.428 (err=1.6e-015) 21. 
Ooura (C): elapsed time t=1.523 s, 4096 iters, t-(init.)=1.453 s t(norm)=0.0346422, mflops=144.333 (err=1.6e-015) 22. Ransom: elapsed time t=1.652 s, 2048 iters, t-(init.)=1.611 s t(norm)=0.0768185, mflops=65.0885 (err=1.9e-015) 23. Singleton (f2c): elapsed time t=1.873 s, 4096 iters, t-(init.)=1.803 s t(norm)=0.0429869, mflops=116.315 (err=2.6e-015) 24. Temperton (f2c): elapsed time t=1.492 s, 2048 iters, t-(init.)=1.452 s t(norm)=0.0692368, mflops=72.216 (err=1.7e-015) 25. Valkenburg: elapsed time t=1.352 s, 256 iters, t-(init.)=1.342 s t(norm)=0.511932, mflops=9.76692 (err=1.8e-015) Top mflops for N=1024 = 144.333 Normalized results and averages for N=1024: fft 0: mflops = 78.7219 (norm. = 0.54542), norm. avg. (of 10) = 0.435447 fft 1: mflops = 81.1591 (norm. = 0.562307), norm. avg. (of 10) = 0.439905 fft 2: mflops = 63.8597 (norm. = 0.442448), norm. avg. (of 10) = 0.301196 fft 3: mflops = 32.7067 (norm. = 0.226606), norm. avg. (of 10) = 0.0981109 fft 4: mflops = 14.7853 (norm. = 0.102439), norm. avg. (of 10) = 0.0762861 fft 5: mflops = 111.373 (norm. = 0.771641), norm. avg. (of 10) = 0.402528 fft 6: mflops = 79.3174 (norm. = 0.549546), norm. avg. (of 10) = 0.278757 fft 7: mflops = 78.7219 (norm. = 0.54542), norm. avg. (of 10) = 0.251698 fft 8: mflops = 45.9096 (norm. = 0.318082), norm. avg. (of 9) = 0.19373 fft 9: mflops = 60.5413 (norm. = 0.419457), norm. avg. (of 10) = 0.346842 fft 10: mflops = 134.261 (norm. = 0.930218), norm. avg. (of 10) = 0.884084 fft 11: mflops = 133.407 (norm. = 0.9243), norm. avg. (of 10) = 0.885716 fft 12: mflops = 85.1117 (norm. = 0.589692), norm. avg. (of 10) = 0.906729 fft 13: mflops = 102.6 (norm. = 0.710861), norm. avg. (of 8) = 0.556356 fft 14: mflops = 74.2618 (norm. = 0.514518), norm. avg. (of 10) = 0.405119 fft 15: mflops = 70.7064 (norm. = 0.489885), norm. avg. (of 10) = 0.278639 fft 16: mflops = 72.216 (norm. = 0.500344), norm. avg. (of 10) = 0.269201 fft 17: mflops = 45.9499 (norm. = 0.318361), norm. avg. 
(of 10) = 0.469392 fft 18: mflops = 90.2389 (norm. = 0.625215), norm. avg. (of 9) = 0.38581 fft 19: mflops = 121.715 (norm. = 0.843297), norm. avg. (of 9) = 0.490663 fft 20: mflops = 125.428 (norm. = 0.869019), norm. avg. (of 9) = 0.507297 fft 21: mflops = 144.333 (norm. = 1), norm. avg. (of 10) = 0.743366 fft 22: mflops = 65.0885 (norm. = 0.450962), norm. avg. (of 9) = 0.159074 fft 23: mflops = 116.315 (norm. = 0.805879), norm. avg. (of 10) = 0.385654 fft 24: mflops = 72.216 (norm. = 0.500344), norm. avg. (of 10) = 0.269793 fft 25: mflops = 9.76692 (norm. = 0.0676695), norm. avg. (of 10) = 0.0590286 Benchmarking for array size = 2048 (power of 2): 0. Arndt DIF: elapsed time t=1.372 s, 512 iters, t-(init.)=1.312 s t(norm)=0.113747, mflops=43.9571 (err=1.4e-015) 1. Arndt DIT: elapsed time t=1.392 s, 512 iters, t-(init.)=1.332 s t(norm)=0.115481, mflops=43.2971 (err=1.4e-015) 2. Arndt Split-Radix: elapsed time t=1.943 s, 512 iters, t-(init.)=1.883 s t(norm)=0.163252, mflops=30.6276 (err=1.4e-015) 3. Arndt 4-step: elapsed time t=1.091 s, 256 iters, t-(init.)=1.061 s t(norm)=0.183972, mflops=27.178 (err=1.3e-015) 4. Beauregard: elapsed time t=1.071 s, 128 iters, t-(init.)=1.051 s t(norm)=0.364477, mflops=13.7183 (err=1.4e-015) 5. Bergland: elapsed time t=1.572 s, 1024 iters, t-(init.)=1.452 s t(norm)=0.0629425, mflops=79.4376 (err=1.4e-015) 6. CWP (min N) (N=2145): elapsed time t=1.963 s, 1024 iters, t-(init.)=1.843 s t(norm)=0.0798919, mflops=62.5846 7. CWP (best N) (N=2184): elapsed time t=1.842 s, 1024 iters, t-(init.)=1.711 s t(norm)=0.0741699, mflops=67.4128 8. Edelblute: elapsed time t=1.142 s, 256 iters, t-(init.)=1.112 s t(norm)=0.192816, mflops=25.9315 (err=1.4e-015) 9. FFTPACK (f2c): elapsed time t=1.002 s, 512 iters, t-(init.)=0.942 s t(norm)=0.0816692, mflops=61.2226 (err=1.4e-015) FFTW_MEASURE plan: (cost = 9.414062e-004) FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_NOTW 16 10. 
FFTW: elapsed time t=1.903 s, 2048 iters, t-(init.)=1.663 s t(norm)=0.0360446, mflops=138.717 (err=1.4e-015) FFTW_ESTIMATE plan: (cost = 1.269760e+004) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.212 s, 1024 iters, t-(init.)=1.092 s t(norm)=0.0473369, mflops=105.626 (err=1.4e-015) 12. Frigo-old: elapsed time t=1.683 s, 1024 iters, t-(init.)=1.563 s t(norm)=0.0677542, mflops=73.7961 (err=1.4e-015) 13. Green: elapsed time t=1.542 s, 1024 iters, t-(init.)=1.422 s t(norm)=0.061642, mflops=81.1135 (err=1.4e-015) 14. GSL: elapsed time t=1.512 s, 1024 iters, t-(init.)=1.392 s t(norm)=0.0603416, mflops=82.8616 (err=1.4e-015) 15. GSL DIT: elapsed time t=1.613 s, 512 iters, t-(init.)=1.553 s t(norm)=0.134641, mflops=37.1357 (err=1.9e-015) 16. GSL DIF: elapsed time t=1.622 s, 512 iters, t-(init.)=1.562 s t(norm)=0.135422, mflops=36.9217 (err=2.3e-015) 17. Krukar: elapsed time t=1.423 s, 512 iters, t-(init.)=1.363 s t(norm)=0.118169, mflops=42.3123 (err=1.4e-015) 18. Mayer (Buneman): elapsed time t=1.472 s, 1024 iters, t-(init.)=1.352 s t(norm)=0.0586076, mflops=85.3131 (err=1.3e-015) 19. Mayer (simple): elapsed time t=1.151 s, 1024 iters, t-(init.)=1.03 s t(norm)=0.0446493, mflops=111.984 20. Mayer (lookup): elapsed time t=1.352 s, 1024 iters, t-(init.)=1.231 s t(norm)=0.0533624, mflops=93.6989 (err=1.3e-015) 21. Ooura (C): elapsed time t=1.202 s, 1024 iters, t-(init.)=1.082 s t(norm)=0.0469034, mflops=106.602 (err=1.4e-015) 22. Ransom: elapsed time t=1.282 s, 512 iters, t-(init.)=1.222 s t(norm)=0.105945, mflops=47.1945 (err=2.0e-015) 23. Singleton (f2c): elapsed time t=1.993 s, 1024 iters, t-(init.)=1.873 s t(norm)=0.0811924, mflops=61.5821 (err=1.8e-015) 24. Temperton (f2c): elapsed time t=1.052 s, 512 iters, t-(init.)=0.992 s t(norm)=0.0860041, mflops=58.1368 (err=1.4e-015) 25. 
Valkenburg: elapsed time t=1.573 s, 128 iters, t-(init.)=1.553 s t(norm)=0.538566, mflops=9.28392 (err=1.6e-015) Top mflops for N=2048 = 138.717 Normalized results and averages for N=2048: fft 0: mflops = 43.9571 (norm. = 0.316883), norm. avg. (of 11) = 0.424669 fft 1: mflops = 43.2971 (norm. = 0.312125), norm. avg. (of 11) = 0.428288 fft 2: mflops = 30.6276 (norm. = 0.220791), norm. avg. (of 11) = 0.293886 fft 3: mflops = 27.178 (norm. = 0.195924), norm. avg. (of 11) = 0.107003 fft 4: mflops = 13.7183 (norm. = 0.0988939), norm. avg. (of 11) = 0.0783414 fft 5: mflops = 79.4376 (norm. = 0.572658), norm. avg. (of 11) = 0.417994 fft 6: mflops = 62.5846 (norm. = 0.451167), norm. avg. (of 11) = 0.29443 fft 7: mflops = 67.4128 (norm. = 0.485973), norm. avg. (of 11) = 0.272996 fft 8: mflops = 25.9315 (norm. = 0.186938), norm. avg. (of 10) = 0.193051 fft 9: mflops = 61.2226 (norm. = 0.441348), norm. avg. (of 11) = 0.355433 fft 10: mflops = 138.717 (norm. = 1), norm. avg. (of 11) = 0.894622 fft 11: mflops = 105.626 (norm. = 0.761447), norm. avg. (of 11) = 0.874419 fft 12: mflops = 73.7961 (norm. = 0.53199), norm. avg. (of 11) = 0.872662 fft 13: mflops = 81.1135 (norm. = 0.58474), norm. avg. (of 9) = 0.55951 fft 14: mflops = 82.8616 (norm. = 0.597342), norm. avg. (of 11) = 0.422594 fft 15: mflops = 37.1357 (norm. = 0.267708), norm. avg. (of 11) = 0.277645 fft 16: mflops = 36.9217 (norm. = 0.266165), norm. avg. (of 11) = 0.268925 fft 17: mflops = 42.3123 (norm. = 0.305026), norm. avg. (of 11) = 0.454449 fft 18: mflops = 85.3131 (norm. = 0.615015), norm. avg. (of 10) = 0.40873 fft 19: mflops = 111.984 (norm. = 0.807282), norm. avg. (of 10) = 0.522325 fft 20: mflops = 93.6989 (norm. = 0.675467), norm. avg. (of 10) = 0.524114 fft 21: mflops = 106.602 (norm. = 0.768484), norm. avg. (of 11) = 0.745649 fft 22: mflops = 47.1945 (norm. = 0.340221), norm. avg. (of 10) = 0.177189 fft 23: mflops = 61.5821 (norm. = 0.44394), norm. avg. (of 11) = 0.390953 fft 24: mflops = 58.1368 (norm. 
= 0.419103), norm. avg. (of 11) = 0.283366 fft 25: mflops = 9.28392 (norm. = 0.0669269), norm. avg. (of 11) = 0.0597467 Benchmarking for array size = 4096 (power of 2): 0. Arndt DIF: elapsed time t=1.692 s, 256 iters, t-(init.)=1.631 s t(norm)=0.12962, mflops=38.5742 (err=3.8e-015) 1. Arndt DIT: elapsed time t=1.702 s, 256 iters, t-(init.)=1.641 s t(norm)=0.130415, mflops=38.3392 (err=3.8e-015) 2. Arndt Split-Radix: elapsed time t=1.122 s, 128 iters, t-(init.)=1.092 s t(norm)=0.173569, mflops=28.807 (err=3.8e-015) 3. Arndt 4-step: elapsed time t=1.131 s, 128 iters, t-(init.)=1.101 s t(norm)=0.174999, mflops=28.5716 (err=3.8e-015) 4. Beauregard: elapsed time t=1.192 s, 64 iters, t-(init.)=1.172 s t(norm)=0.372569, mflops=13.4203 (err=3.8e-015) 5. Bergland: elapsed time t=1.673 s, 512 iters, t-(init.)=1.553 s t(norm)=0.0617107, mflops=81.0233 (err=3.9e-015) 6. CWP (min N) (N=4290): elapsed time t=1.142 s, 256 iters, t-(init.)=1.082 s t(norm)=0.0859896, mflops=58.1465 7. CWP (best N) (N=4368): elapsed time t=1.933 s, 512 iters, t-(init.)=1.803 s t(norm)=0.0716448, mflops=69.7888 8. Edelblute: elapsed time t=1.282 s, 128 iters, t-(init.)=1.252 s t(norm)=0.199, mflops=25.1256 (err=3.8e-015) 9. FFTPACK (f2c): elapsed time t=1.072 s, 256 iters, t-(init.)=1.012 s t(norm)=0.0804265, mflops=62.1685 (err=3.8e-015) FFTW_MEASURE plan: (cost = 2.039063e-003) FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_NOTW 32 10. FFTW: elapsed time t=1.072 s, 512 iters, t-(init.)=0.952 s t(norm)=0.0378291, mflops=132.173 (err=3.8e-015) FFTW_ESTIMATE plan: (cost = 2.539520e+004) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.312 s, 512 iters, t-(init.)=1.192 s t(norm)=0.0473658, mflops=105.561 (err=3.8e-015) 12. Frigo-old: elapsed time t=1.793 s, 512 iters, t-(init.)=1.673 s t(norm)=0.066479, mflops=75.2117 (err=3.8e-015) 13. Green: elapsed time t=1.583 s, 512 iters, t-(init.)=1.463 s t(norm)=0.0581344, mflops=86.0076 (err=3.9e-015) 14. 
GSL: elapsed time t=1.653 s, 512 iters, t-(init.)=1.543 s t(norm)=0.0613133, mflops=81.5484 (err=3.8e-015) 15. GSL DIT: elapsed time t=1.763 s, 256 iters, t-(init.)=1.703 s t(norm)=0.135342, mflops=36.9434 (err=4.0e-015) 16. GSL DIF: elapsed time t=1.782 s, 256 iters, t-(init.)=1.722 s t(norm)=0.136852, mflops=36.5357 (err=4.3e-015) 17. Krukar: elapsed time t=1.552 s, 256 iters, t-(init.)=1.492 s t(norm)=0.118574, mflops=42.1679 (err=3.9e-015) 18. Mayer (Buneman): elapsed time t=1.432 s, 256 iters, t-(init.)=1.372 s t(norm)=0.109037, mflops=45.8561 (err=3.8e-015) 19. Mayer (simple): elapsed time t=1.302 s, 256 iters, t-(init.)=1.242 s t(norm)=0.0987053, mflops=50.6558 20. Mayer (lookup): elapsed time t=1.392 s, 256 iters, t-(init.)=1.332 s t(norm)=0.105858, mflops=47.2332 (err=3.8e-015) 21. Ooura (C): elapsed time t=1.272 s, 512 iters, t-(init.)=1.152 s t(norm)=0.0457764, mflops=109.227 (err=3.9e-015) 22. Ransom: elapsed time t=1.042 s, 256 iters, t-(init.)=0.982 s t(norm)=0.0780423, mflops=64.0678 (err=4.5e-015) 23. Singleton (f2c): elapsed time t=1.802 s, 512 iters, t-(init.)=1.682 s t(norm)=0.0668367, mflops=74.8092 (err=6.0e-015) 24. Temperton (f2c): elapsed time t=1.081 s, 256 iters, t-(init.)=1.02 s t(norm)=0.0810623, mflops=61.6809 (err=3.8e-015) 25. Valkenburg: elapsed time t=1.712 s, 64 iters, t-(init.)=1.702 s t(norm)=0.541051, mflops=9.24127 (err=4.0e-015) Top mflops for N=4096 = 132.173 Normalized results and averages for N=4096: fft 0: mflops = 38.5742 (norm. = 0.291845), norm. avg. (of 12) = 0.4136 fft 1: mflops = 38.3392 (norm. = 0.290067), norm. avg. (of 12) = 0.41677 fft 2: mflops = 28.807 (norm. = 0.217949), norm. avg. (of 12) = 0.287558 fft 3: mflops = 28.5716 (norm. = 0.216167), norm. avg. (of 12) = 0.1161 fft 4: mflops = 13.4203 (norm. = 0.101536), norm. avg. (of 12) = 0.0802742 fft 5: mflops = 81.0233 (norm. = 0.613007), norm. avg. (of 12) = 0.434245 fft 6: mflops = 58.1465 (norm. = 0.439926), norm. avg. 
(of 12) = 0.306555 fft 7: mflops = 69.7888 (norm. = 0.528009), norm. avg. (of 12) = 0.294247 fft 8: mflops = 25.1256 (norm. = 0.190096), norm. avg. (of 11) = 0.192782 fft 9: mflops = 62.1685 (norm. = 0.470356), norm. avg. (of 12) = 0.36501 fft 10: mflops = 132.173 (norm. = 1), norm. avg. (of 12) = 0.903404 fft 11: mflops = 105.561 (norm. = 0.798658), norm. avg. (of 12) = 0.868105 fft 12: mflops = 75.2117 (norm. = 0.569038), norm. avg. (of 12) = 0.84736 fft 13: mflops = 86.0076 (norm. = 0.650718), norm. avg. (of 10) = 0.568631 fft 14: mflops = 81.5484 (norm. = 0.61698), norm. avg. (of 12) = 0.438793 fft 15: mflops = 36.9434 (norm. = 0.279507), norm. avg. (of 12) = 0.2778 fft 16: mflops = 36.5357 (norm. = 0.276423), norm. avg. (of 12) = 0.26955 fft 17: mflops = 42.1679 (norm. = 0.319035), norm. avg. (of 12) = 0.443165 fft 18: mflops = 45.8561 (norm. = 0.346939), norm. avg. (of 11) = 0.403113 fft 19: mflops = 50.6558 (norm. = 0.383253), norm. avg. (of 11) = 0.509682 fft 20: mflops = 47.2332 (norm. = 0.357357), norm. avg. (of 11) = 0.508955 fft 21: mflops = 109.227 (norm. = 0.826389), norm. avg. (of 12) = 0.752378 fft 22: mflops = 64.0678 (norm. = 0.484725), norm. avg. (of 11) = 0.205146 fft 23: mflops = 74.8092 (norm. = 0.565993), norm. avg. (of 12) = 0.405539 fft 24: mflops = 61.6809 (norm. = 0.466667), norm. avg. (of 12) = 0.298641 fft 25: mflops = 9.24127 (norm. = 0.0699177), norm. avg. (of 12) = 0.0605943 Benchmarking for array size = 8192 (power of 2): 0. Arndt DIF: elapsed time t=1.753 s, 128 iters, t-(init.)=1.693 s t(norm)=0.124198, mflops=40.2584 (err=3.6e-015) 1. Arndt DIT: elapsed time t=1.813 s, 128 iters, t-(init.)=1.743 s t(norm)=0.127866, mflops=39.1035 (err=3.6e-015) 2. Arndt Split-Radix: elapsed time t=1.232 s, 64 iters, t-(init.)=1.202 s t(norm)=0.176356, mflops=28.3517 (err=3.6e-015) 3. Arndt 4-step: elapsed time t=1.312 s, 64 iters, t-(init.)=1.282 s t(norm)=0.188094, mflops=26.5825 (err=3.6e-015) 4. 
Beauregard: elapsed time t=1.292 s, 32 iters, t-(init.)=1.282 s t(norm)=0.376188, mflops=13.2912 (err=3.7e-015) 5. Bergland: elapsed time t=1.963 s, 256 iters, t-(init.)=1.843 s t(norm)=0.0676008, mflops=73.9636 (err=3.7e-015) 6. CWP (min N) (N=8580): elapsed time t=1.142 s, 128 iters, t-(init.)=1.072 s t(norm)=0.0786415, mflops=63.5797 7. CWP (best N) (N=9240): elapsed time t=1.142 s, 128 iters, t-(init.)=1.082 s t(norm)=0.079375, mflops=62.9921 8. Edelblute: elapsed time t=1.382 s, 64 iters, t-(init.)=1.352 s t(norm)=0.198364, mflops=25.2062 (err=3.6e-015) 9. FFTPACK (f2c): elapsed time t=1.332 s, 128 iters, t-(init.)=1.272 s t(norm)=0.0933134, mflops=53.5829 (err=3.7e-015) FFTW_MEASURE plan: (cost = 4.687500e-003) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 16 FFTW_NOTW 32 10. FFTW: elapsed time t=1.152 s, 256 iters, t-(init.)=1.032 s t(norm)=0.0378535, mflops=132.088 (err=3.7e-015) FFTW_ESTIMATE plan: (cost = 5.079040e+004) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.402 s, 256 iters, t-(init.)=1.282 s t(norm)=0.0470235, mflops=106.33 (err=3.7e-015) 12. Frigo-old: elapsed time t=1.883 s, 256 iters, t-(init.)=1.763 s t(norm)=0.0646665, mflops=77.3198 (err=3.7e-015) 13. Green: elapsed time t=1.813 s, 256 iters, t-(init.)=1.693 s t(norm)=0.0620989, mflops=80.5168 (err=3.7e-015) 14. GSL: elapsed time t=1.953 s, 256 iters, t-(init.)=1.833 s t(norm)=0.067234, mflops=74.3671 (err=3.7e-015) 15. GSL DIT: elapsed time t=1.912 s, 128 iters, t-(init.)=1.851 s t(norm)=0.135789, mflops=36.822 (err=4.4e-015) 16. GSL DIF: elapsed time t=1.933 s, 128 iters, t-(init.)=1.873 s t(norm)=0.137402, mflops=36.3895 (err=4.4e-015) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.602 s, 128 iters, t-(init.)=1.542 s t(norm)=0.11312, mflops=44.2007 (err=3.6e-015) 19. Mayer (simple): elapsed time t=1.472 s, 128 iters, t-(init.)=1.412 s t(norm)=0.103584, mflops=48.2701 20. 
Mayer (lookup): elapsed time t=1.562 s, 128 iters, t-(init.)=1.502 s t(norm)=0.110186, mflops=45.3778 (err=3.6e-015) 21. Ooura (C): elapsed time t=1.432 s, 256 iters, t-(init.)=1.312 s t(norm)=0.0481239, mflops=103.899 (err=3.7e-015) 22. Ransom: elapsed time t=1.242 s, 128 iters, t-(init.)=1.182 s t(norm)=0.086711, mflops=57.6628 (err=4.8e-015) 23. Singleton (f2c): elapsed time t=1.042 s, 128 iters, t-(init.)=0.982 s t(norm)=0.0720391, mflops=69.4068 (err=5.7e-015) 24. Temperton (f2c): elapsed time t=1.262 s, 128 iters, t-(init.)=1.202 s t(norm)=0.0881782, mflops=56.7034 (err=3.7e-015) 25. Valkenburg: elapsed time t=1.872 s, 32 iters, t-(init.)=1.862 s t(norm)=0.546382, mflops=9.15111 (err=3.8e-015) Top mflops for N=8192 = 132.088 Normalized results and averages for N=8192: fft 0: mflops = 40.2584 (norm. = 0.304784), norm. avg. (of 13) = 0.40523 fft 1: mflops = 39.1035 (norm. = 0.296041), norm. avg. (of 13) = 0.407483 fft 2: mflops = 28.3517 (norm. = 0.214642), norm. avg. (of 13) = 0.281949 fft 3: mflops = 26.5825 (norm. = 0.201248), norm. avg. (of 13) = 0.12265 fft 4: mflops = 13.2912 (norm. = 0.100624), norm. avg. (of 13) = 0.0818396 fft 5: mflops = 73.9636 (norm. = 0.559957), norm. avg. (of 13) = 0.443915 fft 6: mflops = 63.5797 (norm. = 0.481343), norm. avg. (of 13) = 0.32 fft 7: mflops = 62.9921 (norm. = 0.476895), norm. avg. (of 13) = 0.308297 fft 8: mflops = 25.2062 (norm. = 0.190828), norm. avg. (of 12) = 0.192619 fft 9: mflops = 53.5829 (norm. = 0.40566), norm. avg. (of 13) = 0.368137 fft 10: mflops = 132.088 (norm. = 1), norm. avg. (of 13) = 0.910834 fft 11: mflops = 106.33 (norm. = 0.804992), norm. avg. (of 13) = 0.863251 fft 12: mflops = 77.3198 (norm. = 0.585366), norm. avg. (of 13) = 0.827206 fft 13: mflops = 80.5168 (norm. = 0.609569), norm. avg. (of 11) = 0.572352 fft 14: mflops = 74.3671 (norm. = 0.563011), norm. avg. (of 13) = 0.448348 fft 15: mflops = 36.822 (norm. = 0.278768), norm. avg. (of 13) = 0.277874 fft 16: mflops = 36.3895 (norm. 
= 0.275494), norm. avg. (of 13) = 0.270007 fft 17: mflops = -1 (norm. = -0.00757071), norm. avg. (of 12) = 0.443165 fft 18: mflops = 44.2007 (norm. = 0.33463), norm. avg. (of 12) = 0.397406 fft 19: mflops = 48.2701 (norm. = 0.365439), norm. avg. (of 12) = 0.497661 fft 20: mflops = 45.3778 (norm. = 0.343542), norm. avg. (of 12) = 0.49517 fft 21: mflops = 103.899 (norm. = 0.786585), norm. avg. (of 13) = 0.755009 fft 22: mflops = 57.6628 (norm. = 0.436548), norm. avg. (of 12) = 0.22443 fft 23: mflops = 69.4068 (norm. = 0.525458), norm. avg. (of 13) = 0.414764 fft 24: mflops = 56.7034 (norm. = 0.429285), norm. avg. (of 13) = 0.308691 fft 25: mflops = 9.15111 (norm. = 0.0692803), norm. avg. (of 13) = 0.0612624 Benchmarking for array size = 16384 (power of 2): 0. Arndt DIF: elapsed time t=1.011 s, 32 iters, t-(init.)=0.981 s t(norm)=0.133651, mflops=37.411 (err=6.9e-015) 1. Arndt DIT: elapsed time t=1.041 s, 32 iters, t-(init.)=1.011 s t(norm)=0.137738, mflops=36.3009 (err=6.9e-015) 2. Arndt Split-Radix: elapsed time t=1.332 s, 32 iters, t-(init.)=1.302 s t(norm)=0.177383, mflops=28.1875 (err=6.9e-015) 3. Arndt 4-step: elapsed time t=1.182 s, 32 iters, t-(init.)=1.152 s t(norm)=0.156948, mflops=31.8578 (err=6.9e-015) 4. Beauregard: elapsed time t=1.392 s, 16 iters, t-(init.)=1.372 s t(norm)=0.37384, mflops=13.3747 (err=6.9e-015) 5. Bergland: elapsed time t=1.041 s, 64 iters, t-(init.)=0.981 s t(norm)=0.0668253, mflops=74.8219 (err=6.8e-015) 6. CWP (min N) (N=17160): elapsed time t=1.172 s, 64 iters, t-(init.)=1.102 s t(norm)=0.0750678, mflops=66.6065 7. CWP (best N) (N=17160): elapsed time t=1.171 s, 64 iters, t-(init.)=1.111 s t(norm)=0.0756809, mflops=66.0669 8. Edelblute: elapsed time t=1.482 s, 32 iters, t-(init.)=1.452 s t(norm)=0.197819, mflops=25.2756 (err=6.9e-015) 9. 
FFTPACK (f2c): elapsed time t=1.012 s, 32 iters, t-(init.)=0.982 s t(norm)=0.133787, mflops=37.3729 (err=6.9e-015) FFTW_MEASURE plan: (cost = 1.375000e-002) FFTW_TWIDDLE 64 FFTW_TWIDDLE 8 FFTW_NOTW 32 10. FFTW: elapsed time t=1.772 s, 128 iters, t-(init.)=1.652 s t(norm)=0.0562668, mflops=88.8624 (err=6.8e-015) FFTW_ESTIMATE plan: (cost = 1.441792e+005) FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.041 s, 64 iters, t-(init.)=0.981 s t(norm)=0.0668253, mflops=74.8219 (err=6.8e-015) 12. Frigo-old: elapsed time t=1.432 s, 64 iters, t-(init.)=1.372 s t(norm)=0.0934601, mflops=53.4988 (err=6.8e-015) 13. Green: elapsed time t=1.032 s, 64 iters, t-(init.)=0.972 s t(norm)=0.0662122, mflops=75.5147 (err=6.9e-015) 14. GSL: elapsed time t=1.412 s, 64 iters, t-(init.)=1.352 s t(norm)=0.0920977, mflops=54.2902 (err=6.9e-015) 15. GSL DIT: elapsed time t=1.031 s, 32 iters, t-(init.)=1.001 s t(norm)=0.136375, mflops=36.6635 (err=7.3e-015) 16. GSL DIF: elapsed time t=1.031 s, 32 iters, t-(init.)=1.001 s t(norm)=0.136375, mflops=36.6635 (err=7.4e-015) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.732 s, 64 iters, t-(init.)=1.672 s t(norm)=0.113896, mflops=43.8997 (err=6.9e-015) 19. Mayer (simple): elapsed time t=1.612 s, 64 iters, t-(init.)=1.552 s t(norm)=0.105722, mflops=47.294 20. Mayer (lookup): elapsed time t=1.702 s, 64 iters, t-(init.)=1.642 s t(norm)=0.111852, mflops=44.7018 (err=6.9e-015) 21. Ooura (C): elapsed time t=1.473 s, 128 iters, t-(init.)=1.353 s t(norm)=0.0460829, mflops=108.5 (err=6.8e-015) 22. Ransom: elapsed time t=1.082 s, 64 iters, t-(init.)=1.022 s t(norm)=0.0696182, mflops=71.8203 (err=7.5e-015) 23. Singleton (f2c): elapsed time t=1.051 s, 64 iters, t-(init.)=0.991 s t(norm)=0.0675065, mflops=74.0669 (err=1.0e-014) 24. Temperton (f2c): elapsed time t=1.322 s, 64 iters, t-(init.)=1.262 s t(norm)=0.0859669, mflops=58.1619 (err=6.9e-015) 25. 
Valkenburg: elapsed time t=1.131 s, 8 iters, t-(init.)=1.121 s t(norm)=0.610897, mflops=8.18469 (err=6.9e-015) Top mflops for N=16384 = 108.5 Normalized results and averages for N=16384: fft 0: mflops = 37.411 (norm. = 0.344801), norm. avg. (of 14) = 0.400913 fft 1: mflops = 36.3009 (norm. = 0.33457), norm. avg. (of 14) = 0.402275 fft 2: mflops = 28.1875 (norm. = 0.259793), norm. avg. (of 14) = 0.280366 fft 3: mflops = 31.8578 (norm. = 0.29362), norm. avg. (of 14) = 0.134862 fft 4: mflops = 13.3747 (norm. = 0.123269), norm. avg. (of 14) = 0.0847989 fft 5: mflops = 74.8219 (norm. = 0.689602), norm. avg. (of 14) = 0.461464 fft 6: mflops = 66.6065 (norm. = 0.613884), norm. avg. (of 14) = 0.340992 fft 7: mflops = 66.0669 (norm. = 0.608911), norm. avg. (of 14) = 0.329769 fft 8: mflops = 25.2756 (norm. = 0.232955), norm. avg. (of 13) = 0.195722 fft 9: mflops = 37.3729 (norm. = 0.34445), norm. avg. (of 14) = 0.366445 fft 10: mflops = 88.8624 (norm. = 0.819007), norm. avg. (of 14) = 0.904275 fft 11: mflops = 74.8219 (norm. = 0.689602), norm. avg. (of 14) = 0.850847 fft 12: mflops = 53.4988 (norm. = 0.493076), norm. avg. (of 14) = 0.80334 fft 13: mflops = 75.5147 (norm. = 0.695988), norm. avg. (of 12) = 0.582655 fft 14: mflops = 54.2902 (norm. = 0.50037), norm. avg. (of 14) = 0.452064 fft 15: mflops = 36.6635 (norm. = 0.337912), norm. avg. (of 14) = 0.282163 fft 16: mflops = 36.6635 (norm. = 0.337912), norm. avg. (of 14) = 0.274858 fft 17: mflops = -1 (norm. = -0.00921658), norm. avg. (of 12) = 0.443165 fft 18: mflops = 43.8997 (norm. = 0.404605), norm. avg. (of 13) = 0.39796 fft 19: mflops = 47.294 (norm. = 0.435889), norm. avg. (of 13) = 0.49291 fft 20: mflops = 44.7018 (norm. = 0.411998), norm. avg. (of 13) = 0.488772 fft 21: mflops = 108.5 (norm. = 1), norm. avg. (of 14) = 0.772508 fft 22: mflops = 71.8203 (norm. = 0.661937), norm. avg. (of 13) = 0.258084 fft 23: mflops = 74.0669 (norm. = 0.682644), norm. avg. (of 14) = 0.433898 fft 24: mflops = 58.1619 (norm. 
= 0.536054), norm. avg. (of 14) = 0.324931 fft 25: mflops = 8.18469 (norm. = 0.0754349), norm. avg. (of 14) = 0.0622747 Benchmarking for array size = 32768 (power of 2): 0. Arndt DIF: elapsed time t=1.112 s, 16 iters, t-(init.)=1.082 s t(norm)=0.137583, mflops=36.3416 (err=1.4e-014) 1. Arndt DIT: elapsed time t=1.192 s, 16 iters, t-(init.)=1.152 s t(norm)=0.146484, mflops=34.1333 (err=1.4e-014) 2. Arndt Split-Radix: elapsed time t=1.522 s, 16 iters, t-(init.)=1.492 s t(norm)=0.189718, mflops=26.355 (err=1.4e-014) 3. Arndt 4-step: elapsed time t=1.402 s, 16 iters, t-(init.)=1.372 s t(norm)=0.174459, mflops=28.6601 (err=1.4e-014) 4. Beauregard: elapsed time t=1.552 s, 8 iters, t-(init.)=1.541 s t(norm)=0.391897, mflops=12.7585 (err=1.4e-014) 5. Bergland: elapsed time t=1.151 s, 32 iters, t-(init.)=1.081 s t(norm)=0.0687281, mflops=72.7504 (err=1.4e-014) 6. CWP (min N) (N=34320): elapsed time t=1.312 s, 32 iters, t-(init.)=1.232 s t(norm)=0.0783285, mflops=63.8338 7. CWP (best N) (N=34320): elapsed time t=1.312 s, 32 iters, t-(init.)=1.242 s t(norm)=0.0789642, mflops=63.3198 8. Edelblute: elapsed time t=1.682 s, 16 iters, t-(init.)=1.652 s t(norm)=0.210063, mflops=23.8024 (err=1.4e-014) 9. FFTPACK (f2c): elapsed time t=1.572 s, 16 iters, t-(init.)=1.542 s t(norm)=0.196075, mflops=25.5004 (err=1.4e-014) FFTW_MEASURE plan: (cost = 3.762500e-002) FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_NOTW 32 10. FFTW: elapsed time t=1.201 s, 32 iters, t-(init.)=1.131 s t(norm)=0.071907, mflops=69.5342 (err=1.4e-014) FFTW_ESTIMATE plan: (cost = 2.883584e+005) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.382 s, 32 iters, t-(init.)=1.312 s t(norm)=0.0834147, mflops=59.9415 (err=1.4e-014) 12. Frigo-old: elapsed time t=1.161 s, 16 iters, t-(init.)=1.131 s t(norm)=0.143814, mflops=34.7671 (err=1.4e-014) 13. Green: elapsed time t=1.202 s, 32 iters, t-(init.)=1.142 s t(norm)=0.0726064, mflops=68.8644 (err=1.4e-014) 14. 
GSL: elapsed time t=1.342 s, 16 iters, t-(init.)=1.312 s t(norm)=0.166829, mflops=29.9707 (err=1.4e-014) 15. GSL DIT: elapsed time t=1.192 s, 16 iters, t-(init.)=1.162 s t(norm)=0.147756, mflops=33.8396 (err=1.4e-014) 16. GSL DIF: elapsed time t=1.192 s, 16 iters, t-(init.)=1.152 s t(norm)=0.146484, mflops=34.1333 (err=1.4e-014) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.902 s, 32 iters, t-(init.)=1.842 s t(norm)=0.117111, mflops=42.6945 (err=1.4e-014) 19. Mayer (simple): elapsed time t=1.763 s, 32 iters, t-(init.)=1.693 s t(norm)=0.107638, mflops=46.452 20. Mayer (lookup): elapsed time t=1.972 s, 32 iters, t-(init.)=1.911 s t(norm)=0.121498, mflops=41.1529 (err=1.4e-014) 21. Ooura (C): elapsed time t=1.942 s, 64 iters, t-(init.)=1.821 s t(norm)=0.057888, mflops=86.3736 (err=1.4e-014) 22. Ransom: elapsed time t=1.382 s, 32 iters, t-(init.)=1.322 s t(norm)=0.0840505, mflops=59.488 (err=1.5e-014) 23. Singleton (f2c): elapsed time t=1.422 s, 32 iters, t-(init.)=1.362 s t(norm)=0.0865936, mflops=57.741 (err=2.1e-014) 24. Temperton (f2c): elapsed time t=1.863 s, 32 iters, t-(init.)=1.793 s t(norm)=0.113996, mflops=43.8612 (err=1.4e-014) 25. Valkenburg: elapsed time t=1.392 s, 4 iters, t-(init.)=1.382 s t(norm)=0.702922, mflops=7.11317 (err=1.4e-014) Top mflops for N=32768 = 86.3736 Normalized results and averages for N=32768: fft 0: mflops = 36.3416 (norm. = 0.420749), norm. avg. (of 15) = 0.402236 fft 1: mflops = 34.1333 (norm. = 0.395182), norm. avg. (of 15) = 0.401802 fft 2: mflops = 26.355 (norm. = 0.305127), norm. avg. (of 15) = 0.282017 fft 3: mflops = 28.6601 (norm. = 0.331815), norm. avg. (of 15) = 0.147992 fft 4: mflops = 12.7585 (norm. = 0.147713), norm. avg. (of 15) = 0.0889931 fft 5: mflops = 72.7504 (norm. = 0.842276), norm. avg. (of 15) = 0.486852 fft 6: mflops = 63.8338 (norm. = 0.739042), norm. avg. (of 15) = 0.367529 fft 7: mflops = 63.3198 (norm. = 0.733092), norm. avg. 
(of 15) = 0.356658 fft 8: mflops = 23.8024 (norm. = 0.275575), norm. avg. (of 14) = 0.201426 fft 9: mflops = 25.5004 (norm. = 0.295233), norm. avg. (of 15) = 0.361698 fft 10: mflops = 69.5342 (norm. = 0.80504), norm. avg. (of 15) = 0.897659 fft 11: mflops = 59.9415 (norm. = 0.693979), norm. avg. (of 15) = 0.840389 fft 12: mflops = 34.7671 (norm. = 0.40252), norm. avg. (of 15) = 0.776618 fft 13: mflops = 68.8644 (norm. = 0.797285), norm. avg. (of 13) = 0.599165 fft 14: mflops = 29.9707 (norm. = 0.346989), norm. avg. (of 15) = 0.445059 fft 15: mflops = 33.8396 (norm. = 0.391781), norm. avg. (of 15) = 0.289471 fft 16: mflops = 34.1333 (norm. = 0.395182), norm. avg. (of 15) = 0.282879 fft 17: mflops = -1 (norm. = -0.0115776), norm. avg. (of 12) = 0.443165 fft 18: mflops = 42.6945 (norm. = 0.4943), norm. avg. (of 14) = 0.404841 fft 19: mflops = 46.452 (norm. = 0.537803), norm. avg. (of 14) = 0.496116 fft 20: mflops = 41.1529 (norm. = 0.476452), norm. avg. (of 14) = 0.487892 fft 21: mflops = 86.3736 (norm. = 1), norm. avg. (of 15) = 0.787674 fft 22: mflops = 59.488 (norm. = 0.688729), norm. avg. (of 14) = 0.288845 fft 23: mflops = 57.741 (norm. = 0.668502), norm. avg. (of 15) = 0.449538 fft 24: mflops = 43.8612 (norm. = 0.507808), norm. avg. (of 15) = 0.337123 fft 25: mflops = 7.11317 (norm. = 0.0823535), norm. avg. (of 15) = 0.0636133 Benchmarking for array size = 65536 (power of 2): 0. Arndt DIF: elapsed time t=1.682 s, 4 iters, t-(init.)=1.632 s t(norm)=0.389099, mflops=12.8502 (err=1.7e-014) 1. Arndt DIT: elapsed time t=1.703 s, 4 iters, t-(init.)=1.653 s t(norm)=0.394106, mflops=12.6869 (err=1.7e-014) 2. Arndt Split-Radix: elapsed time t=1.062 s, 2 iters, t-(init.)=1.042 s t(norm)=0.496864, mflops=10.0631 (err=1.7e-014) 3. Arndt 4-step: elapsed time t=1.853 s, 8 iters, t-(init.)=1.763 s t(norm)=0.210166, mflops=23.7907 (err=1.7e-014) 4. Beauregard: elapsed time t=1.012 s, 2 iters, t-(init.)=0.992 s t(norm)=0.473022, mflops=10.5703 (err=1.7e-014) 5. 
Bergland: elapsed time t=1.472 s, 8 iters, t-(init.)=1.382 s t(norm)=0.164747, mflops=30.3495 (err=1.7e-014) 6. CWP (min N) (N=72072): elapsed time t=1.002 s, 8 iters, t-(init.)=0.902 s t(norm)=0.107527, mflops=46.5 7. CWP (best N) (N=72072): elapsed time t=1.012 s, 8 iters, t-(init.)=0.912 s t(norm)=0.108719, mflops=45.9902 8. Edelblute: elapsed time t=1.092 s, 2 iters, t-(init.)=1.072 s t(norm)=0.511169, mflops=9.78149 (err=1.7e-014) 9. FFTPACK (f2c): elapsed time t=1.873 s, 8 iters, t-(init.)=1.783 s t(norm)=0.21255, mflops=23.5239 (err=1.7e-014) FFTW_MEASURE plan: (cost = 9.525000e-002) FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_TWIDDLE 32 FFTW_NOTW 16 10. FFTW: elapsed time t=1.512 s, 16 iters, t-(init.)=1.342 s t(norm)=0.0799894, mflops=62.5083 (err=1.7e-014) FFTW_ESTIMATE plan: (cost = 5.767168e+005) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.892 s, 16 iters, t-(init.)=1.721 s t(norm)=0.10258, mflops=48.7426 (err=1.7e-014) 12. Frigo-old: elapsed time t=1.482 s, 8 iters, t-(init.)=1.401 s t(norm)=0.167012, mflops=29.9379 (err=1.7e-014) 13. Green: elapsed time t=1.342 s, 8 iters, t-(init.)=1.262 s t(norm)=0.150442, mflops=33.2354 (err=1.7e-014) 14. GSL: elapsed time t=1.532 s, 8 iters, t-(init.)=1.442 s t(norm)=0.1719, mflops=29.0867 (err=1.7e-014) 15. GSL DIT: elapsed time t=1.683 s, 4 iters, t-(init.)=1.643 s t(norm)=0.391722, mflops=12.7642 (err=1.7e-014) 16. GSL DIF: elapsed time t=1.713 s, 4 iters, t-(init.)=1.663 s t(norm)=0.39649, mflops=12.6107 (err=1.8e-014) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.202 s, 8 iters, t-(init.)=1.122 s t(norm)=0.133753, mflops=37.3824 (err=1.7e-014) 19. Mayer (simple): elapsed time t=1.131 s, 8 iters, t-(init.)=1.05 s t(norm)=0.12517, mflops=39.9458 20. Mayer (lookup): elapsed time t=1.292 s, 8 iters, t-(init.)=1.202 s t(norm)=0.14329, mflops=34.8944 (err=1.7e-014) 21. 
Ooura (C): elapsed time t=1.062 s, 8 iters, t-(init.)=0.972 s t(norm)=0.115871, mflops=43.1513 (err=1.7e-014) 22. Ransom: elapsed time t=1.192 s, 8 iters, t-(init.)=1.102 s t(norm)=0.131369, mflops=38.0608 (err=1.7e-014) 23. Singleton (f2c): elapsed time t=1.692 s, 8 iters, t-(init.)=1.602 s t(norm)=0.190973, mflops=26.1817 (err=2.3e-014) 24. Temperton (f2c): elapsed time t=1.792 s, 8 iters, t-(init.)=1.702 s t(norm)=0.202894, mflops=24.6434 (err=1.7e-014) 25. Valkenburg: elapsed time t=1.633 s, 2 iters, t-(init.)=1.613 s t(norm)=0.769138, mflops=6.50078 (err=1.7e-014) Top mflops for N=65536 = 62.5083 Normalized results and averages for N=65536: fft 0: mflops = 12.8502 (norm. = 0.205576), norm. avg. (of 16) = 0.389944 fft 1: mflops = 12.6869 (norm. = 0.202964), norm. avg. (of 16) = 0.389375 fft 2: mflops = 10.0631 (norm. = 0.160988), norm. avg. (of 16) = 0.274453 fft 3: mflops = 23.7907 (norm. = 0.380601), norm. avg. (of 16) = 0.16253 fft 4: mflops = 10.5703 (norm. = 0.169103), norm. avg. (of 16) = 0.094 fft 5: mflops = 30.3495 (norm. = 0.485528), norm. avg. (of 16) = 0.486769 fft 6: mflops = 46.5 (norm. = 0.743902), norm. avg. (of 16) = 0.391052 fft 7: mflops = 45.9902 (norm. = 0.735746), norm. avg. (of 16) = 0.380351 fft 8: mflops = 9.78149 (norm. = 0.156483), norm. avg. (of 15) = 0.19843 fft 9: mflops = 23.5239 (norm. = 0.376332), norm. avg. (of 16) = 0.362612 fft 10: mflops = 62.5083 (norm. = 1), norm. avg. (of 16) = 0.904056 fft 11: mflops = 48.7426 (norm. = 0.779779), norm. avg. (of 16) = 0.836601 fft 12: mflops = 29.9379 (norm. = 0.478944), norm. avg. (of 16) = 0.758014 fft 13: mflops = 33.2354 (norm. = 0.531696), norm. avg. (of 14) = 0.594346 fft 14: mflops = 29.0867 (norm. = 0.465326), norm. avg. (of 16) = 0.446326 fft 15: mflops = 12.7642 (norm. = 0.2042), norm. avg. (of 16) = 0.284141 fft 16: mflops = 12.6107 (norm. = 0.201744), norm. avg. (of 16) = 0.277808 fft 17: mflops = -1 (norm. = -0.0159979), norm. avg. 
(of 12) = 0.443165 fft 18: mflops = 37.3824 (norm. = 0.598039), norm. avg. (of 15) = 0.417721 fft 19: mflops = 39.9458 (norm. = 0.639048), norm. avg. (of 15) = 0.505645 fft 20: mflops = 34.8944 (norm. = 0.558236), norm. avg. (of 15) = 0.492582 fft 21: mflops = 43.1513 (norm. = 0.690329), norm. avg. (of 16) = 0.78159 fft 22: mflops = 38.0608 (norm. = 0.608893), norm. avg. (of 15) = 0.310181 fft 23: mflops = 26.1817 (norm. = 0.418851), norm. avg. (of 16) = 0.44762 fft 24: mflops = 24.6434 (norm. = 0.394242), norm. avg. (of 16) = 0.340693 fft 25: mflops = 6.50078 (norm. = 0.103999), norm. avg. (of 16) = 0.0661374 Benchmarking for array size = 131072 (power of 2): 0. Arndt DIF: elapsed time t=1.973 s, 2 iters, t-(init.)=1.913 s t(norm)=0.429266, mflops=11.6478 (err=3.4e-014) 1. Arndt DIT: elapsed time t=1.011 s, 1 iters, t-(init.)=0.981 s t(norm)=0.440261, mflops=11.3569 (err=3.4e-014) 2. Arndt Split-Radix: elapsed time t=1.322 s, 1 iters, t-(init.)=1.292 s t(norm)=0.579834, mflops=8.62316 (err=3.4e-014) 3. Arndt 4-step: elapsed time t=1.272 s, 2 iters, t-(init.)=1.212 s t(norm)=0.271965, mflops=18.3847 (err=3.4e-014) 4. Beauregard: elapsed time t=1.102 s, 1 iters, t-(init.)=1.082 s t(norm)=0.485589, mflops=10.2968 (err=3.4e-014) 5. Bergland: elapsed time t=1.652 s, 4 iters, t-(init.)=1.552 s t(norm)=0.17413, mflops=28.7142 (err=3.4e-014) 6. CWP (min N) (N=144144): elapsed time t=1.122 s, 4 iters, t-(init.)=1.012 s t(norm)=0.113543, mflops=44.036 7. CWP (best N) (N=144144): elapsed time t=1.121 s, 4 iters, t-(init.)=1.011 s t(norm)=0.113431, mflops=44.0796 8. Edelblute: elapsed time t=1.332 s, 1 iters, t-(init.)=1.302 s t(norm)=0.584322, mflops=8.55693 (err=3.4e-014) 9. FFTPACK (f2c): elapsed time t=1.161 s, 2 iters, t-(init.)=1.111 s t(norm)=0.249302, mflops=20.056 (err=3.4e-014) FFTW_MEASURE plan: (cost = 2.100000e-001) FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_TWIDDLE 32 FFTW_NOTW 16 10. 
FFTW: elapsed time t=1.712 s, 8 iters, t-(init.)=1.511 s t(norm)=0.0847648, mflops=58.9867 (err=3.4e-014) FFTW_ESTIMATE plan: (cost = 1.153434e+006) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.012 s, 4 iters, t-(init.)=0.912 s t(norm)=0.102324, mflops=48.8646 (err=3.4e-014) 12. Frigo-old: elapsed time t=1.652 s, 4 iters, t-(init.)=1.552 s t(norm)=0.17413, mflops=28.7142 (err=3.4e-014) 13. Green: elapsed time t=1.552 s, 4 iters, t-(init.)=1.452 s t(norm)=0.16291, mflops=30.6918 (err=3.4e-014) 14. GSL: elapsed time t=1.892 s, 4 iters, t-(init.)=1.782 s t(norm)=0.199935, mflops=25.0081 (err=3.4e-014) 15. GSL DIT: elapsed time t=1.012 s, 1 iters, t-(init.)=0.992 s t(norm)=0.445198, mflops=11.231 (err=3.5e-014) 16. GSL DIF: elapsed time t=1.002 s, 1 iters, t-(init.)=0.982 s t(norm)=0.44071, mflops=11.3453 (err=3.6e-014) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.563 s, 2 iters, t-(init.)=1.513 s t(norm)=0.339508, mflops=14.7272 (err=3.4e-014) 19. Mayer (simple): elapsed time t=1.532 s, 2 iters, t-(init.)=1.482 s t(norm)=0.332552, mflops=15.0352 20. Mayer (lookup): elapsed time t=1.623 s, 2 iters, t-(init.)=1.573 s t(norm)=0.352972, mflops=14.1654 (err=3.4e-014) 21. Ooura (C): elapsed time t=1.312 s, 4 iters, t-(init.)=1.212 s t(norm)=0.135983, mflops=36.7694 (err=3.4e-014) 22. Ransom: elapsed time t=1.553 s, 4 iters, t-(init.)=1.453 s t(norm)=0.163022, mflops=30.6707 (err=3.4e-014) 23. Singleton (f2c): elapsed time t=1.061 s, 2 iters, t-(init.)=1.011 s t(norm)=0.226862, mflops=22.0398 (err=4.9e-014) 24. Temperton (f2c): elapsed time t=1.101 s, 2 iters, t-(init.)=1.051 s t(norm)=0.235838, mflops=21.201 (err=3.4e-014) 25. Valkenburg: elapsed time t=1.802 s, 1 iters, t-(init.)=1.782 s t(norm)=0.79974, mflops=6.25203 (err=3.4e-014) Top mflops for N=131072 = 58.9867 Normalized results and averages for N=131072: fft 0: mflops = 11.6478 (norm. = 0.197465), norm. avg. 
(of 17) = 0.378622 fft 1: mflops = 11.3569 (norm. = 0.192533), norm. avg. (of 17) = 0.377796 fft 2: mflops = 8.62316 (norm. = 0.146188), norm. avg. (of 17) = 0.266908 fft 3: mflops = 18.3847 (norm. = 0.311675), norm. avg. (of 17) = 0.171303 fft 4: mflops = 10.2968 (norm. = 0.174561), norm. avg. (of 17) = 0.0987388 fft 5: mflops = 28.7142 (norm. = 0.486791), norm. avg. (of 17) = 0.48677 fft 6: mflops = 44.036 (norm. = 0.746542), norm. avg. (of 17) = 0.411963 fft 7: mflops = 44.0796 (norm. = 0.74728), norm. avg. (of 17) = 0.401935 fft 8: mflops = 8.55693 (norm. = 0.145065), norm. avg. (of 16) = 0.195094 fft 9: mflops = 20.056 (norm. = 0.340009), norm. avg. (of 17) = 0.361283 fft 10: mflops = 58.9867 (norm. = 1), norm. avg. (of 17) = 0.909699 fft 11: mflops = 48.8646 (norm. = 0.828399), norm. avg. (of 17) = 0.836119 fft 12: mflops = 28.7142 (norm. = 0.486791), norm. avg. (of 17) = 0.742059 fft 13: mflops = 30.6918 (norm. = 0.520317), norm. avg. (of 15) = 0.589411 fft 14: mflops = 25.0081 (norm. = 0.423962), norm. avg. (of 17) = 0.44501 fft 15: mflops = 11.231 (norm. = 0.190398), norm. avg. (of 17) = 0.278627 fft 16: mflops = 11.3453 (norm. = 0.192337), norm. avg. (of 17) = 0.272781 fft 17: mflops = -1 (norm. = -0.016953), norm. avg. (of 12) = 0.443165 fft 18: mflops = 14.7272 (norm. = 0.24967), norm. avg. (of 16) = 0.407218 fft 19: mflops = 15.0352 (norm. = 0.254892), norm. avg. (of 16) = 0.489973 fft 20: mflops = 14.1654 (norm. = 0.240146), norm. avg. (of 16) = 0.476805 fft 21: mflops = 36.7694 (norm. = 0.62335), norm. avg. (of 17) = 0.772282 fft 22: mflops = 30.6707 (norm. = 0.519959), norm. avg. (of 16) = 0.323292 fft 23: mflops = 22.0398 (norm. = 0.37364), norm. avg. (of 17) = 0.443269 fft 24: mflops = 21.201 (norm. = 0.35942), norm. avg. (of 17) = 0.341794 fft 25: mflops = 6.25203 (norm. = 0.10599), norm. avg. (of 17) = 0.0684817 Benchmarking for array size = 262144 (power of 2): 0. 
Arndt DIF: elapsed time t=2.143 s, 1 iters, t-(init.)=2.093 s t(norm)=0.443565, mflops=11.2723 (err=4.3e-014) 1. Arndt DIT: elapsed time t=2.184 s, 1 iters, t-(init.)=2.134 s t(norm)=0.452254, mflops=11.0557 (err=4.3e-014) 2. Arndt Split-Radix: elapsed time t=2.724 s, 1 iters, t-(init.)=2.674 s t(norm)=0.566694, mflops=8.8231 (err=4.3e-014) 3. Arndt 4-step: elapsed time t=1.092 s, 1 iters, t-(init.)=1.042 s t(norm)=0.220829, mflops=22.642 (err=4.3e-014) 4. Beauregard: elapsed time t=2.343 s, 1 iters, t-(init.)=2.292 s t(norm)=0.485738, mflops=10.2936 (err=4.4e-014) 5. Bergland: elapsed time t=1.682 s, 2 iters, t-(init.)=1.572 s t(norm)=0.166575, mflops=30.0165 (err=4.4e-014) 6. CWP (min N) (N=360360): elapsed time t=1.612 s, 2 iters, t-(init.)=1.462 s t(norm)=0.154919, mflops=32.2749 7. CWP (best N) (N=360360): elapsed time t=1.612 s, 2 iters, t-(init.)=1.462 s t(norm)=0.154919, mflops=32.2749 8. Edelblute: elapsed time t=2.754 s, 1 iters, t-(init.)=2.694 s t(norm)=0.570933, mflops=8.75759 (err=4.3e-014) 9. FFTPACK (f2c): elapsed time t=1.102 s, 1 iters, t-(init.)=1.052 s t(norm)=0.222948, mflops=22.4268 (err=4.4e-014) FFTW_MEASURE plan: (cost = 4.700000e-001) FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_TWIDDLE 4 FFTW_TWIDDLE 2 FFTW_NOTW 32 10. FFTW: elapsed time t=1.853 s, 4 iters, t-(init.)=1.633 s t(norm)=0.0865195, mflops=57.7905 (err=4.4e-014) FFTW_ESTIMATE plan: (cost = 2.988442e+006) FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.112 s, 2 iters, t-(init.)=1.012 s t(norm)=0.107235, mflops=46.6264 (err=4.4e-014) 12. Frigo-old: elapsed time t=1.902 s, 2 iters, t-(init.)=1.801 s t(norm)=0.190841, mflops=26.1998 (err=4.4e-014) 13. Green: elapsed time t=1.612 s, 2 iters, t-(init.)=1.512 s t(norm)=0.160217, mflops=31.2076 (err=4.4e-014) 14. GSL: elapsed time t=1.913 s, 2 iters, t-(init.)=1.813 s t(norm)=0.192112, mflops=26.0264 (err=4.4e-014) 15. 
GSL DIT: elapsed time t=2.123 s, 1 iters, t-(init.)=2.073 s t(norm)=0.439326, mflops=11.3811 (err=4.6e-014) 16. GSL DIF: elapsed time t=2.113 s, 1 iters, t-(init.)=2.063 s t(norm)=0.437207, mflops=11.4362 (err=4.6e-014) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=1.862 s, 1 iters, t-(init.)=1.812 s t(norm)=0.384013, mflops=13.0204 (err=4.3e-014) 19. Mayer (simple): elapsed time t=1.842 s, 1 iters, t-(init.)=1.792 s t(norm)=0.379774, mflops=13.1657 20. Mayer (lookup): elapsed time t=1.913 s, 1 iters, t-(init.)=1.863 s t(norm)=0.394821, mflops=12.664 (err=4.3e-014) 21. Ooura (C): elapsed time t=1.362 s, 2 iters, t-(init.)=1.262 s t(norm)=0.133726, mflops=37.3898 (err=4.4e-014) 22. Ransom: elapsed time t=1.281 s, 2 iters, t-(init.)=1.18 s t(norm)=0.125037, mflops=39.9881 (err=4.3e-014) 23. Singleton (f2c): elapsed time t=1.071 s, 1 iters, t-(init.)=1.021 s t(norm)=0.216378, mflops=23.1077 (err=6.1e-014) 24. Temperton (f2c): elapsed time t=1.112 s, 1 iters, t-(init.)=1.062 s t(norm)=0.225067, mflops=22.2156 (err=4.4e-014) 25. Valkenburg: elapsed time t=4.036 s, 1 iters, t-(init.)=3.986 s t(norm)=0.844744, mflops=5.91896 (err=4.4e-014) Top mflops for N=262144 = 57.7905 Normalized results and averages for N=262144: fft 0: mflops = 11.2723 (norm. = 0.195055), norm. avg. (of 18) = 0.368424 fft 1: mflops = 11.0557 (norm. = 0.191307), norm. avg. (of 18) = 0.367435 fft 2: mflops = 8.8231 (norm. = 0.152674), norm. avg. (of 18) = 0.260562 fft 3: mflops = 22.642 (norm. = 0.391795), norm. avg. (of 18) = 0.183553 fft 4: mflops = 10.2936 (norm. = 0.17812), norm. avg. (of 18) = 0.103149 fft 5: mflops = 30.0165 (norm. = 0.519402), norm. avg. (of 18) = 0.488583 fft 6: mflops = 32.2749 (norm. = 0.558482), norm. avg. (of 18) = 0.420103 fft 7: mflops = 32.2749 (norm. = 0.558482), norm. avg. (of 18) = 0.410632 fft 8: mflops = 8.75759 (norm. = 0.15154), norm. avg. (of 17) = 0.192532 fft 9: mflops = 22.4268 (norm. = 0.38807), norm. avg. 
(of 18) = 0.362771 fft 10: mflops = 57.7905 (norm. = 1), norm. avg. (of 18) = 0.914716 fft 11: mflops = 46.6264 (norm. = 0.806818), norm. avg. (of 18) = 0.834491 fft 12: mflops = 26.1998 (norm. = 0.453359), norm. avg. (of 18) = 0.726021 fft 13: mflops = 31.2076 (norm. = 0.540013), norm. avg. (of 16) = 0.586323 fft 14: mflops = 26.0264 (norm. = 0.450359), norm. avg. (of 18) = 0.445307 fft 15: mflops = 11.3811 (norm. = 0.196937), norm. avg. (of 18) = 0.274089 fft 16: mflops = 11.4362 (norm. = 0.197891), norm. avg. (of 18) = 0.26862 fft 17: mflops = -1 (norm. = -0.0173039), norm. avg. (of 12) = 0.443165 fft 18: mflops = 13.0204 (norm. = 0.225304), norm. avg. (of 17) = 0.396517 fft 19: mflops = 13.1657 (norm. = 0.227818), norm. avg. (of 17) = 0.474552 fft 20: mflops = 12.664 (norm. = 0.219136), norm. avg. (of 17) = 0.461648 fft 21: mflops = 37.3898 (norm. = 0.646989), norm. avg. (of 18) = 0.765321 fft 22: mflops = 39.9881 (norm. = 0.691949), norm. avg. (of 17) = 0.344978 fft 23: mflops = 23.1077 (norm. = 0.399853), norm. avg. (of 18) = 0.440857 fft 24: mflops = 22.2156 (norm. = 0.384416), norm. avg. (of 18) = 0.344162 fft 25: mflops = 5.91896 (norm. = 0.102421), norm. avg. (of 18) = 0.0703672 Benchmarking for array size = 524288 (power of 2): 0. Arndt DIF: elapsed time t=4.256 s, 1 iters, t-(init.)=4.146 s t(norm)=0.416204, mflops=12.0134 (err=1.1e-013) 1. Arndt DIT: elapsed time t=4.286 s, 1 iters, t-(init.)=4.186 s t(norm)=0.420219, mflops=11.8986 (err=1.1e-013) 2. Arndt Split-Radix: elapsed time t=5.628 s, 1 iters, t-(init.)=5.528 s t(norm)=0.554938, mflops=9.01001 (err=1.1e-013) 3. Arndt 4-step: elapsed time t=2.944 s, 1 iters, t-(init.)=2.844 s t(norm)=0.2855, mflops=17.5131 (err=1.1e-013) 4. Beauregard: elapsed time t=5.058 s, 1 iters, t-(init.)=4.948 s t(norm)=0.496714, mflops=10.0662 (err=1.1e-013) 5. Bergland: elapsed time t=1.963 s, 1 iters, t-(init.)=1.853 s t(norm)=0.186017, mflops=26.8793 (err=1.1e-013) 6. 
CWP (min N) (N=720720): elapsed time t=1.693 s, 1 iters, t-(init.)=1.543 s t(norm)=0.154897, mflops=32.2796 7. CWP (best N) (N=720720): elapsed time t=1.693 s, 1 iters, t-(init.)=1.553 s t(norm)=0.155901, mflops=32.0717 8. Edelblute: elapsed time t=5.748 s, 1 iters, t-(init.)=5.637 s t(norm)=0.56588, mflops=8.83579 (err=1.1e-013) 9. FFTPACK (f2c): elapsed time t=2.253 s, 1 iters, t-(init.)=2.143 s t(norm)=0.215129, mflops=23.2419 (err=1.1e-013) FFTW_MEASURE plan: (cost = 1.031000e+000) FFTW_TWIDDLE 64 FFTW_TWIDDLE 4 FFTW_TWIDDLE 4 FFTW_TWIDDLE 32 FFTW_NOTW 16 10. FFTW: elapsed time t=1.983 s, 2 iters, t-(init.)=1.773 s t(norm)=0.0889929, mflops=56.1843 (err=1.1e-013) FFTW_ESTIMATE plan: (cost = 5.976883e+006) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=1.181 s, 1 iters, t-(init.)=1.081 s t(norm)=0.108518, mflops=46.0753 (err=1.1e-013) 12. Frigo-old: elapsed time t=2.083 s, 1 iters, t-(init.)=1.983 s t(norm)=0.199067, mflops=25.1172 (err=1.1e-013) 13. Green: elapsed time t=1.743 s, 1 iters, t-(init.)=1.633 s t(norm)=0.163932, mflops=30.5005 (err=1.1e-013) 14. GSL: elapsed time t=1.923 s, 1 iters, t-(init.)=1.823 s t(norm)=0.183005, mflops=27.3216 (err=1.1e-013) 15. GSL DIT: elapsed time t=4.426 s, 1 iters, t-(init.)=4.315 s t(norm)=0.433169, mflops=11.5428 (err=1.1e-013) 16. GSL DIF: elapsed time t=4.437 s, 1 iters, t-(init.)=4.337 s t(norm)=0.435377, mflops=11.4843 (err=1.1e-013) 17. Skipping fft (Krukar can't handle N > 4096). 18. Mayer (Buneman): elapsed time t=3.996 s, 1 iters, t-(init.)=3.886 s t(norm)=0.390103, mflops=12.8171 (err=1.1e-013) 19. Mayer (simple): elapsed time t=3.955 s, 1 iters, t-(init.)=3.854 s t(norm)=0.386891, mflops=12.9235 20. Mayer (lookup): elapsed time t=4.106 s, 1 iters, t-(init.)=3.996 s t(norm)=0.401146, mflops=12.4643 (err=1.1e-013) 21. Ooura (C): elapsed time t=1.482 s, 1 iters, t-(init.)=1.382 s t(norm)=0.138735, mflops=36.0401 (err=1.1e-013) 22. 
Ransom: elapsed time t=1.653 s, 1 iters, t-(init.)=1.543 s t(norm)=0.154897, mflops=32.2796 (err=1.1e-013) 23. Singleton (f2c): elapsed time t=2.624 s, 1 iters, t-(init.)=2.514 s t(norm)=0.252372, mflops=19.812 (err=1.6e-013) 24. Temperton (f2c): elapsed time t=2.493 s, 1 iters, t-(init.)=2.382 s t(norm)=0.239121, mflops=20.9099 (err=1.1e-013) 25. Valkenburg: elapsed time t=8.613 s, 1 iters, t-(init.)=8.513 s t(norm)=0.854593, mflops=5.85074 (err=1.1e-013) Top mflops for N=524288 = 56.1843 Normalized results and averages for N=524288: fft 0: mflops = 12.0134 (norm. = 0.213821), norm. avg. (of 19) = 0.360287 fft 1: mflops = 11.8986 (norm. = 0.211777), norm. avg. (of 19) = 0.359243 fft 2: mflops = 9.01001 (norm. = 0.160365), norm. avg. (of 19) = 0.255288 fft 3: mflops = 17.5131 (norm. = 0.311709), norm. avg. (of 19) = 0.190298 fft 4: mflops = 10.0662 (norm. = 0.179163), norm. avg. (of 19) = 0.10715 fft 5: mflops = 26.8793 (norm. = 0.478413), norm. avg. (of 19) = 0.488048 fft 6: mflops = 32.2796 (norm. = 0.57453), norm. avg. (of 19) = 0.428231 fft 7: mflops = 32.0717 (norm. = 0.570831), norm. avg. (of 19) = 0.419063 fft 8: mflops = 8.83579 (norm. = 0.157265), norm. avg. (of 18) = 0.190573 fft 9: mflops = 23.2419 (norm. = 0.413672), norm. avg. (of 19) = 0.36545 fft 10: mflops = 56.1843 (norm. = 1), norm. avg. (of 19) = 0.919205 fft 11: mflops = 46.0753 (norm. = 0.820074), norm. avg. (of 19) = 0.833732 fft 12: mflops = 25.1172 (norm. = 0.44705), norm. avg. (of 19) = 0.711338 fft 13: mflops = 30.5005 (norm. = 0.542866), norm. avg. (of 17) = 0.583767 fft 14: mflops = 27.3216 (norm. = 0.486286), norm. avg. (of 19) = 0.447464 fft 15: mflops = 11.5428 (norm. = 0.205446), norm. avg. (of 19) = 0.270476 fft 16: mflops = 11.4843 (norm. = 0.204404), norm. avg. (of 19) = 0.26524 fft 17: mflops = -1 (norm. = -0.0177986), norm. avg. (of 12) = 0.443165 fft 18: mflops = 12.8171 (norm. = 0.228127), norm. avg. (of 18) = 0.387162 fft 19: mflops = 12.9235 (norm. = 0.230021), norm. avg. 
(of 18) = 0.460967 fft 20: mflops = 12.4643 (norm. = 0.221847), norm. avg. (of 18) = 0.448325 fft 21: mflops = 36.0401 (norm. = 0.641462), norm. avg. (of 19) = 0.758802 fft 22: mflops = 32.2796 (norm. = 0.57453), norm. avg. (of 18) = 0.357731 fft 23: mflops = 19.812 (norm. = 0.352625), norm. avg. (of 19) = 0.436213 fft 24: mflops = 20.9099 (norm. = 0.372166), norm. avg. (of 19) = 0.345636 fft 25: mflops = 5.85074 (norm. = 0.104135), norm. avg. (of 19) = 0.0721445 Benchmarking for array size = 1048576 (power of 2): 0. Arndt DIF: elapsed time t=9.413 s, 1 iters, t-(init.)=9.192 s t(norm)=0.438309, mflops=11.4075 (err=1.9e-013) 1. Arndt DIT: elapsed time t=9.513 s, 1 iters, t-(init.)=9.303 s t(norm)=0.443602, mflops=11.2714 (err=1.9e-013) 2. Arndt Split-Radix: elapsed time t=11.627 s, 1 iters, t-(init.)=11.417 s t(norm)=0.544405, mflops=9.18434 (err=1.9e-013) 3. Arndt 4-step: elapsed time t=4.517 s, 1 iters, t-(init.)=4.307 s t(norm)=0.205374, mflops=24.3459 (err=1.9e-013) 4. Beauregard: elapsed time t=10.415 s, 1 iters, t-(init.)=10.205 s t(norm)=0.486612, mflops=10.2751 (err=1.9e-013) 5. Bergland: elapsed time t=3.865 s, 1 iters, t-(init.)=3.654 s t(norm)=0.174236, mflops=28.6967 (err=1.9e-013) 6. Skipping fft (this transform size is too big for CWP). 7. Skipping fft (this transform size is too big for CWP). 8. Edelblute: elapsed time t=11.888 s, 1 iters, t-(init.)=11.678 s t(norm)=0.55685, mflops=8.97907 (err=1.9e-013) 9. FFTPACK (f2c): elapsed time t=4.557 s, 1 iters, t-(init.)=4.357 s t(norm)=0.207758, mflops=24.0665 (err=1.9e-013) FFTW_MEASURE plan: (cost = 2.062000e+000) FFTW_TWIDDLE 16 FFTW_TWIDDLE 4 FFTW_TWIDDLE 64 FFTW_TWIDDLE 2 FFTW_TWIDDLE 4 FFTW_NOTW 32 10. FFTW: elapsed time t=2.053 s, 1 iters, t-(init.)=1.843 s t(norm)=0.0878811, mflops=56.8951 (err=1.9e-013) FFTW_ESTIMATE plan: (cost = 1.195377e+007) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 8 FFTW_NOTW 32 11. 
FFTW_ESTIMATE: elapsed time t=2.433 s, 1 iters, t-(init.)=2.222 s t(norm)=0.105953, mflops=47.1906 (err=1.9e-013) 12. Frigo-old: elapsed time t=4.376 s, 1 iters, t-(init.)=4.166 s t(norm)=0.19865, mflops=25.1699 (err=1.9e-013) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=3.726 s, 1 iters, t-(init.)=3.516 s t(norm)=0.167656, mflops=29.823 (err=1.9e-013) 15. GSL DIT: elapsed time t=9.143 s, 1 iters, t-(init.)=8.933 s t(norm)=0.425959, mflops=11.7382 (err=1.9e-013) 16. GSL DIF: elapsed time t=9.134 s, 1 iters, t-(init.)=8.924 s t(norm)=0.425529, mflops=11.7501 (err=1.9e-013) 17. Skipping fft (Krukar can't handle N > 4096). 18. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 19. Mayer (simple): elapsed time t=8.202 s, 1 iters, t-(init.)=7.992 s t(norm)=0.381088, mflops=13.1203 20. Mayer (lookup): elapsed time t=8.502 s, 1 iters, t-(init.)=8.292 s t(norm)=0.395393, mflops=12.6456 (err=1.9e-013) 21. Ooura (C): elapsed time t=2.915 s, 1 iters, t-(init.)=2.705 s t(norm)=0.128984, mflops=38.7644 (err=1.9e-013) 22. Ransom: elapsed time t=2.734 s, 1 iters, t-(init.)=2.524 s t(norm)=0.120354, mflops=41.5442 (err=1.9e-013) 23. Singleton (f2c): elapsed time t=4.647 s, 1 iters, t-(init.)=4.437 s t(norm)=0.211573, mflops=23.6325 (err=2.6e-013) 24. Temperton (f2c): elapsed time t=4.827 s, 1 iters, t-(init.)=4.617 s t(norm)=0.220156, mflops=22.7112 (err=1.9e-013) 25. Valkenburg: elapsed time t=18.477 s, 1 iters, t-(init.)=18.267 s t(norm)=0.871038, mflops=5.74027 (err=1.9e-013) Top mflops for N=1048576 = 56.8951 Normalized results and averages for N=1048576: fft 0: mflops = 11.4075 (norm. = 0.2005), norm. avg. (of 20) = 0.352298 fft 1: mflops = 11.2714 (norm. = 0.198108), norm. avg. (of 20) = 0.351186 fft 2: mflops = 9.18434 (norm. = 0.161426), norm. avg. (of 20) = 0.250595 fft 3: mflops = 24.3459 (norm. = 0.427908), norm. avg. (of 20) = 0.202179 fft 4: mflops = 10.2751 (norm. = 0.180598), norm. avg. 
(of 20) = 0.110822 fft 5: mflops = 28.6967 (norm. = 0.504379), norm. avg. (of 20) = 0.488865 fft 6: mflops = -1 (norm. = -0.0175762), norm. avg. (of 19) = 0.428231 fft 7: mflops = -1 (norm. = -0.0175762), norm. avg. (of 19) = 0.419063 fft 8: mflops = 8.97907 (norm. = 0.157818), norm. avg. (of 19) = 0.188849 fft 9: mflops = 24.0665 (norm. = 0.422997), norm. avg. (of 20) = 0.368327 fft 10: mflops = 56.8951 (norm. = 1), norm. avg. (of 20) = 0.923244 fft 11: mflops = 47.1906 (norm. = 0.829433), norm. avg. (of 20) = 0.833517 fft 12: mflops = 25.1699 (norm. = 0.442391), norm. avg. (of 20) = 0.697891 fft 13: mflops = -1 (norm. = -0.0175762), norm. avg. (of 17) = 0.583767 fft 14: mflops = 29.823 (norm. = 0.524175), norm. avg. (of 20) = 0.4513 fft 15: mflops = 11.7382 (norm. = 0.206314), norm. avg. (of 20) = 0.267268 fft 16: mflops = 11.7501 (norm. = 0.206522), norm. avg. (of 20) = 0.262304 fft 17: mflops = -1 (norm. = -0.0175762), norm. avg. (of 12) = 0.443165 fft 18: mflops = -1 (norm. = -0.0175762), norm. avg. (of 18) = 0.387162 fft 19: mflops = 13.1203 (norm. = 0.230606), norm. avg. (of 19) = 0.448843 fft 20: mflops = 12.6456 (norm. = 0.222262), norm. avg. (of 19) = 0.436427 fft 21: mflops = 38.7644 (norm. = 0.681331), norm. avg. (of 20) = 0.754929 fft 22: mflops = 41.5442 (norm. = 0.73019), norm. avg. (of 19) = 0.377334 fft 23: mflops = 23.6325 (norm. = 0.415371), norm. avg. (of 20) = 0.435171 fft 24: mflops = 22.7112 (norm. = 0.399177), norm. avg. (of 20) = 0.348313 fft 25: mflops = 5.74027 (norm. = 0.100892), norm. avg. (of 20) = 0.0735819 Benchmarking for array size = 2097152 (power of 2): 0. Arndt DIF: elapsed time t=18.426 s, 1 iters, t-(init.)=18.005 s t(norm)=0.408831, mflops=12.23 (err=2.7e-013) 1. Arndt DIT: elapsed time t=18.587 s, 1 iters, t-(init.)=18.167 s t(norm)=0.41251, mflops=12.1209 (err=2.7e-013) 2. Arndt Split-Radix: elapsed time t=24.375 s, 1 iters, t-(init.)=23.954 s t(norm)=0.543912, mflops=9.19266 (err=2.7e-013) 3. 
Arndt 4-step: elapsed time t=11.997 s, 1 iters, t-(init.)=11.576 s t(norm)=0.262851, mflops=19.0222 (err=2.7e-013) 4. Beauregard: elapsed time t=21.861 s, 1 iters, t-(init.)=21.44 s t(norm)=0.486828, mflops=10.2706 (err=2.7e-013) 5. Skipping fft (Bergland doesn't work for N > 2^20). 6. Skipping fft (this transform size is too big for CWP). 7. Skipping fft (this transform size is too big for CWP). 8. Edelblute: elapsed time t=24.976 s, 1 iters, t-(init.)=24.555 s t(norm)=0.557559, mflops=8.96766 (err=2.7e-013) 9. FFTPACK (f2c): elapsed time t=10.695 s, 1 iters, t-(init.)=10.274 s t(norm)=0.233287, mflops=21.4328 (err=2.7e-013) FFTW_MEASURE plan: (cost = 4.316000e+000) FFTW_TWIDDLE 64 FFTW_TWIDDLE 64 FFTW_TWIDDLE 16 FFTW_NOTW 32 10. FFTW: elapsed time t=4.466 s, 1 iters, t-(init.)=4.035 s t(norm)=0.0916209, mflops=54.5727 (err=2.7e-013) FFTW_ESTIMATE plan: (cost = 2.390753e+007) FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_TWIDDLE 16 FFTW_NOTW 32 11. FFTW_ESTIMATE: elapsed time t=5.297 s, 1 iters, t-(init.)=4.876 s t(norm)=0.110717, mflops=45.1602 (err=2.7e-013) 12. Frigo-old: elapsed time t=9.213 s, 1 iters, t-(init.)=8.793 s t(norm)=0.199659, mflops=25.0428 (err=2.7e-013) 13. Skipping fft (Green can't handle this size.). 14. GSL: elapsed time t=8.853 s, 1 iters, t-(init.)=8.422 s t(norm)=0.191234, mflops=26.1459 (err=2.7e-013) 15. GSL DIT: elapsed time t=19.357 s, 1 iters, t-(init.)=18.936 s t(norm)=0.429971, mflops=11.6287 (err=2.7e-013) 16. GSL DIF: elapsed time t=19.668 s, 1 iters, t-(init.)=19.237 s t(norm)=0.436806, mflops=11.4467 (err=2.7e-013) 17. Skipping fft (Krukar can't handle N > 4096). 18. Skipping fft (Mayer (Buneman) can't handle N > 2^19). 19. Mayer (simple): elapsed time t=16.974 s, 1 iters, t-(init.)=16.543 s t(norm)=0.375634, mflops=13.3108 20. Mayer (lookup): elapsed time t=17.535 s, 1 iters, t-(init.)=17.114 s t(norm)=0.3886, mflops=12.8667 (err=2.7e-013) 21. 
Ooura (C): elapsed time t=6.429 s, 1 iters, t-(init.)=5.998 s t(norm)=0.136194, mflops=36.7124 (err=2.7e-013) 22. Ransom: elapsed time t=7.801 s, 1 iters, t-(init.)=7.37 s t(norm)=0.167347, mflops=29.878 (err=2.7e-013) 23. Singleton (f2c): elapsed time t=10.235 s, 1 iters, t-(init.)=9.815 s t(norm)=0.222865, mflops=22.4351 (err=3.7e-013) 24. Temperton (f2c): elapsed time t=10.796 s, 1 iters, t-(init.)=10.365 s t(norm)=0.235353, mflops=21.2447 (err=2.7e-013) 25. Valkenburg: elapsed time t=40.719 s, 1 iters, t-(init.)=40.289 s t(norm)=0.914823, mflops=5.46554 (err=2.7e-013) Top mflops for N=2097152 = 54.5727 Normalized results and averages for N=2097152: fft 0: mflops = 12.23 (norm. = 0.224104), norm. avg. (of 21) = 0.346193 fft 1: mflops = 12.1209 (norm. = 0.222106), norm. avg. (of 21) = 0.345039 fft 2: mflops = 9.19266 (norm. = 0.168448), norm. avg. (of 21) = 0.246683 fft 3: mflops = 19.0222 (norm. = 0.348566), norm. avg. (of 21) = 0.209149 fft 4: mflops = 10.2706 (norm. = 0.1882), norm. avg. (of 21) = 0.114507 fft 5: mflops = -1 (norm. = -0.0183242), norm. avg. (of 20) = 0.488865 fft 6: mflops = -1 (norm. = -0.0183242), norm. avg. (of 19) = 0.428231 fft 7: mflops = -1 (norm. = -0.0183242), norm. avg. (of 19) = 0.419063 fft 8: mflops = 8.96766 (norm. = 0.164325), norm. avg. (of 20) = 0.187623 fft 9: mflops = 21.4328 (norm. = 0.392739), norm. avg. (of 21) = 0.36949 fft 10: mflops = 54.5727 (norm. = 1), norm. avg. (of 21) = 0.9269 fft 11: mflops = 45.1602 (norm. = 0.827523), norm. avg. (of 21) = 0.833232 fft 12: mflops = 25.0428 (norm. = 0.458888), norm. avg. (of 21) = 0.686509 fft 13: mflops = -1 (norm. = -0.0183242), norm. avg. (of 17) = 0.583767 fft 14: mflops = 26.1459 (norm. = 0.479102), norm. avg. (of 21) = 0.452623 fft 15: mflops = 11.6287 (norm. = 0.213086), norm. avg. (of 21) = 0.264688 fft 16: mflops = 11.4467 (norm. = 0.209752), norm. avg. (of 21) = 0.259802 fft 17: mflops = -1 (norm. = -0.0183242), norm. avg. (of 12) = 0.443165 fft 18: mflops = -1 (norm. 
= -0.0183242), norm. avg. (of 18) = 0.387162 fft 19: mflops = 13.3108 (norm. = 0.24391), norm. avg. (of 20) = 0.438596 fft 20: mflops = 12.8667 (norm. = 0.235772), norm. avg. (of 20) = 0.426395 fft 21: mflops = 36.7124 (norm. = 0.672724), norm. avg. (of 21) = 0.751014 fft 22: mflops = 29.878 (norm. = 0.54749), norm. avg. (of 20) = 0.385842 fft 23: mflops = 22.4351 (norm. = 0.411105), norm. avg. (of 21) = 0.434025 fft 24: mflops = 21.2447 (norm. = 0.389291), norm. avg. (of 21) = 0.350265 fft 25: mflops = 5.46554 (norm. = 0.100151), norm. avg. (of 21) = 0.0748471 ------------------------------------------------------ @@@@ bench.1d.np2.log Benchmarking for sizes: 6 (0.000686646 MB) 9 (0.000915527 MB) 12 (0.00114441 MB) 15 (0.00137329 MB) 18 (0.00180054 MB) 24 (0.0022583 MB) 36 (0.0032959 MB) 80 (0.00738525 MB) 108 (0.00994873 MB) 210 (0.0192261 MB) 504 (0.0461426 MB) 1000 (0.0916748 MB) 1960 (0.179749 MB) 4725 (0.437393 MB) 10368 (0.960205 MB) 27000 (2.48291 MB) 75600 (6.98975 MB) 165375 (15.3664 MB) 362880 (38.6829 MB) Maximum array size = 720720 Benchmarking FFTs: 0. CWP (min N) 1. CWP (best N) 2. FFTPACK (f2c) 3. FFTW 4. FFTW_ESTIMATE 5. Frigo-old 6. GSL 7. Singleton (f2c) 8. Temperton (f2c) 9. Valkenburg Computing normalized averages (10 transforms). Benchmarking for array size = 6: 0. CWP (min N): elapsed time t=1.993 s, 524288 iters, t-(init.)=1.893 s t(norm)=0.232796, mflops=21.478 1. CWP (best N) (N=15): elapsed time t=1.912 s, 262144 iters, t-(init.)=1.821 s t(norm)=0.447883, mflops=11.1636 2. FFTPACK (f2c): elapsed time t=1.152 s, 524288 iters, t-(init.)=1.052 s t(norm)=0.129372, mflops=38.6482 (err=2.6e-016) FFTW_MEASURE plan: (cost = 6.504059e-007) FFTW_NOTW 6 3. FFTW: elapsed time t=1.442 s, 2097152 iters, t-(init.)=1.072 s t(norm)=0.0329579, mflops=151.709 (err=1.5e-016) FFTW_ESTIMATE plan: (cost = 4.116000e+002) FFTW_NOTW 6 4. FFTW_ESTIMATE: elapsed time t=1.452 s, 2097152 iters, t-(init.)=1.081 s t(norm)=0.0332346, mflops=150.446 (err=1.5e-016) 5. 
Frigo-old: elapsed time t=1.752 s, 524288 iters, t-(init.)=1.651 s t(norm)=0.203035, mflops=24.6263 (err=5.2e-016) 6. GSL: elapsed time t=1.082 s, 524288 iters, t-(init.)=0.992 s t(norm)=0.121993, mflops=40.9858 (err=1.4e-016) 7. Singleton (f2c): elapsed time t=1.241 s, 262144 iters, t-(init.)=1.191 s t(norm)=0.292932, mflops=17.0688 (err=1.7e-016) 8. Temperton (f2c): elapsed time t=1.422 s, 262144 iters, t-(init.)=1.381 s t(norm)=0.339663, mflops=14.7205 (err=1.6e-016) 9. Valkenburg: elapsed time t=1.132 s, 131072 iters, t-(init.)=1.102 s t(norm)=0.542083, mflops=9.22367 (err=6.4e-016) Top mflops for N=6 = 151.709 Normalized results and averages for N=6: fft 0: mflops = 21.478 (norm. = 0.141574), norm. avg. (of 1) = 0.141574 fft 1: mflops = 11.1636 (norm. = 0.0735859), norm. avg. (of 1) = 0.0735859 fft 2: mflops = 38.6482 (norm. = 0.254753), norm. avg. (of 1) = 0.254753 fft 3: mflops = 151.709 (norm. = 1), norm. avg. (of 1) = 1 fft 4: mflops = 150.446 (norm. = 0.991674), norm. avg. (of 1) = 0.991674 fft 5: mflops = 24.6263 (norm. = 0.162326), norm. avg. (of 1) = 0.162326 fft 6: mflops = 40.9858 (norm. = 0.270161), norm. avg. (of 1) = 0.270161 fft 7: mflops = 17.0688 (norm. = 0.11251), norm. avg. (of 1) = 0.11251 fft 8: mflops = 14.7205 (norm. = 0.0970311), norm. avg. (of 1) = 0.0970311 fft 9: mflops = 9.22367 (norm. = 0.0607985), norm. avg. (of 1) = 0.0607985 Benchmarking for array size = 9: 0. CWP (min N): elapsed time t=1.062 s, 262144 iters, t-(init.)=1.002 s t(norm)=0.133979, mflops=37.3193 1. CWP (best N) (N=15): elapsed time t=1.902 s, 262144 iters, t-(init.)=1.811 s t(norm)=0.242151, mflops=20.6482 2. FFTPACK (f2c): elapsed time t=1.572 s, 524288 iters, t-(init.)=1.442 s t(norm)=0.096406, mflops=51.864 (err=2.3e-016) FFTW_MEASURE plan: (cost = 1.300812e-006) FFTW_NOTW 9 3. FFTW: elapsed time t=1.322 s, 1048576 iters, t-(init.)=1.081 s t(norm)=0.0361355, mflops=138.368 (err=1.5e-016) FFTW_ESTIMATE plan: (cost = 4.851000e+002) FFTW_NOTW 9 4. 
FFTW_ESTIMATE: elapsed time t=1.342 s, 1048576 iters, t-(init.)=1.102 s t(norm)=0.0368375, mflops=135.731 (err=1.5e-016) 5. Frigo-old: elapsed time t=1.873 s, 262144 iters, t-(init.)=1.813 s t(norm)=0.242419, mflops=20.6255 (err=5.5e-016) 6. GSL: elapsed time t=1.953 s, 524288 iters, t-(init.)=1.823 s t(norm)=0.121878, mflops=41.0246 (err=1.1e-016) 7. Singleton (f2c): elapsed time t=1.322 s, 262144 iters, t-(init.)=1.262 s t(norm)=0.168744, mflops=29.6307 (err=9.3e-017) 8. Temperton (f2c): elapsed time t=1.342 s, 262144 iters, t-(init.)=1.282 s t(norm)=0.171418, mflops=29.1685 (err=1.1e-016) 9. Valkenburg: elapsed time t=1.001 s, 65536 iters, t-(init.)=0.98 s t(norm)=0.524149, mflops=9.53927 (err=4.7e-016) Top mflops for N=9 = 138.368 Normalized results and averages for N=9: fft 0: mflops = 37.3193 (norm. = 0.269711), norm. avg. (of 2) = 0.205642 fft 1: mflops = 20.6482 (norm. = 0.149227), norm. avg. (of 2) = 0.111406 fft 2: mflops = 51.864 (norm. = 0.374827), norm. avg. (of 2) = 0.31479 fft 3: mflops = 138.368 (norm. = 1), norm. avg. (of 2) = 1 fft 4: mflops = 135.731 (norm. = 0.980944), norm. avg. (of 2) = 0.986309 fft 5: mflops = 20.6255 (norm. = 0.149062), norm. avg. (of 2) = 0.155694 fft 6: mflops = 41.0246 (norm. = 0.296489), norm. avg. (of 2) = 0.283325 fft 7: mflops = 29.6307 (norm. = 0.214144), norm. avg. (of 2) = 0.163327 fft 8: mflops = 29.1685 (norm. = 0.210803), norm. avg. (of 2) = 0.153917 fft 9: mflops = 9.53927 (norm. = 0.0689413), norm. avg. (of 2) = 0.0648699 Benchmarking for array size = 12: 0. CWP (min N): elapsed time t=1.452 s, 262144 iters, t-(init.)=1.372 s t(norm)=0.12166, mflops=41.0981 1. CWP (best N) (N=15): elapsed time t=1.912 s, 262144 iters, t-(init.)=1.832 s t(norm)=0.16245, mflops=30.7787 2. FFTPACK (f2c): elapsed time t=1.853 s, 524288 iters, t-(init.)=1.703 s t(norm)=0.0755055, mflops=66.2203 (err=2.3e-016) FFTW_MEASURE plan: (cost = 1.415253e-006) FFTW_NOTW 12 3. 
FFTW: elapsed time t=1.533 s, 1048576 iters, t-(init.)=1.243 s t(norm)=0.0275553, mflops=181.453 (err=2.0e-016) FFTW_ESTIMATE plan: (cost = 4.920000e+002) FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.482 s, 1048576 iters, t-(init.)=1.191 s t(norm)=0.0264026, mflops=189.376 (err=2.0e-016) 5. Frigo-old: elapsed time t=1.692 s, 262144 iters, t-(init.)=1.622 s t(norm)=0.143829, mflops=34.7636 (err=3.3e-016) 6. GSL: elapsed time t=1.953 s, 524288 iters, t-(init.)=1.803 s t(norm)=0.0799392, mflops=62.5475 (err=2.0e-016) 7. Singleton (f2c): elapsed time t=1.812 s, 262144 iters, t-(init.)=1.742 s t(norm)=0.154469, mflops=32.3689 (err=2.1e-016) 8. Temperton (f2c): elapsed time t=1.833 s, 262144 iters, t-(init.)=1.763 s t(norm)=0.156332, mflops=31.9833 (err=1.6e-016) 9. Valkenburg: elapsed time t=1.522 s, 65536 iters, t-(init.)=1.502 s t(norm)=0.532751, mflops=9.38525 (err=6.1e-016) Top mflops for N=12 = 189.376 Normalized results and averages for N=12: fft 0: mflops = 41.0981 (norm. = 0.217019), norm. avg. (of 3) = 0.209435 fft 1: mflops = 30.7787 (norm. = 0.162527), norm. avg. (of 3) = 0.128447 fft 2: mflops = 66.2203 (norm. = 0.349677), norm. avg. (of 3) = 0.326419 fft 3: mflops = 181.453 (norm. = 0.958166), norm. avg. (of 3) = 0.986055 fft 4: mflops = 189.376 (norm. = 1), norm. avg. (of 3) = 0.990873 fft 5: mflops = 34.7636 (norm. = 0.18357), norm. avg. (of 3) = 0.164986 fft 6: mflops = 62.5475 (norm. = 0.330283), norm. avg. (of 3) = 0.298978 fft 7: mflops = 32.3689 (norm. = 0.170924), norm. avg. (of 3) = 0.16586 fft 8: mflops = 31.9833 (norm. = 0.168888), norm. avg. (of 3) = 0.158908 fft 9: mflops = 9.38525 (norm. = 0.0495589), norm. avg. (of 3) = 0.0597663 Benchmarking for array size = 15: 0. CWP (min N): elapsed time t=1.903 s, 262144 iters, t-(init.)=1.813 s t(norm)=0.118015, mflops=42.3677 1. CWP (best N): elapsed time t=1.902 s, 262144 iters, t-(init.)=1.812 s t(norm)=0.117949, mflops=42.3911 2. 
FFTPACK (f2c): elapsed time t=1.282 s, 262144 iters, t-(init.)=1.202 s t(norm)=0.0782424, mflops=63.904 (err=3.5e-016) FFTW_MEASURE plan: (cost = 2.143860e-006) FFTW_NOTW 15 3. FFTW: elapsed time t=1.142 s, 524288 iters, t-(init.)=0.972 s t(norm)=0.0316354, mflops=158.051 (err=1.6e-016) FFTW_ESTIMATE plan: (cost = 4.485000e+002) FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.142 s, 524288 iters, t-(init.)=0.962 s t(norm)=0.03131, mflops=159.694 (err=1.6e-016) 5. Frigo-old: elapsed time t=1.682 s, 131072 iters, t-(init.)=1.642 s t(norm)=0.213767, mflops=23.3899 (err=3.8e-016) 6. GSL: elapsed time t=1.622 s, 262144 iters, t-(init.)=1.542 s t(norm)=0.100374, mflops=49.8136 (err=2.0e-016) 7. Singleton (f2c): elapsed time t=1.111 s, 131072 iters, t-(init.)=1.071 s t(norm)=0.13943, mflops=35.8602 (err=2.4e-016) 8. Temperton (f2c): elapsed time t=1.092 s, 131072 iters, t-(init.)=1.052 s t(norm)=0.136957, mflops=36.5079 (err=2.2e-016) 9. Valkenburg: elapsed time t=1.102 s, 32768 iters, t-(init.)=1.092 s t(norm)=0.568657, mflops=8.79265 (err=4.0e-016) Top mflops for N=15 = 159.694 Normalized results and averages for N=15: fft 0: mflops = 42.3677 (norm. = 0.265306), norm. avg. (of 4) = 0.223402 fft 1: mflops = 42.3911 (norm. = 0.265453), norm. avg. (of 4) = 0.162698 fft 2: mflops = 63.904 (norm. = 0.400166), norm. avg. (of 4) = 0.344856 fft 3: mflops = 158.051 (norm. = 0.989712), norm. avg. (of 4) = 0.986969 fft 4: mflops = 159.694 (norm. = 1), norm. avg. (of 4) = 0.993155 fft 5: mflops = 23.3899 (norm. = 0.146468), norm. avg. (of 4) = 0.160356 fft 6: mflops = 49.8136 (norm. = 0.311933), norm. avg. (of 4) = 0.302217 fft 7: mflops = 35.8602 (norm. = 0.224556), norm. avg. (of 4) = 0.180534 fft 8: mflops = 36.5079 (norm. = 0.228612), norm. avg. (of 4) = 0.176334 fft 9: mflops = 8.79265 (norm. = 0.0550595), norm. avg. (of 4) = 0.0585896 Benchmarking for array size = 18: 0. CWP (min N): elapsed time t=1.082 s, 131072 iters, t-(init.)=1.032 s t(norm)=0.104898, mflops=47.6652 1. 
CWP (best N) (N=28): elapsed time t=1.432 s, 131072 iters, t-(init.)=1.352 s t(norm)=0.137425, mflops=36.3835 2. FFTPACK (f2c): elapsed time t=1.963 s, 262144 iters, t-(init.)=1.863 s t(norm)=0.094683, mflops=52.8078 (err=2.6e-016) FFTW_MEASURE plan: (cost = 3.356934e-006) FFTW_TWIDDLE 3 FFTW_NOTW 6 3. FFTW: elapsed time t=1.723 s, 524288 iters, t-(init.)=1.523 s t(norm)=0.0387016, mflops=129.194 (err=1.9e-016) FFTW_ESTIMATE plan: (cost = 1.168200e+003) FFTW_TWIDDLE 2 FFTW_NOTW 9 4. FFTW_ESTIMATE: elapsed time t=1.842 s, 524288 iters, t-(init.)=1.641 s t(norm)=0.0417002, mflops=119.904 (err=1.8e-016) 5. Frigo-old: elapsed time t=1.021 s, 65536 iters, t-(init.)=0.991 s t(norm)=0.201462, mflops=24.8186 (err=4.0e-016) 6. GSL: elapsed time t=1.503 s, 262144 iters, t-(init.)=1.403 s t(norm)=0.0713045, mflops=70.1218 (err=2.3e-016) 7. Singleton (f2c): elapsed time t=1.212 s, 131072 iters, t-(init.)=1.162 s t(norm)=0.118112, mflops=42.3326 (err=2.7e-016) 8. Temperton (f2c): elapsed time t=1.472 s, 131072 iters, t-(init.)=1.421 s t(norm)=0.144439, mflops=34.6168 (err=2.1e-016) 9. Valkenburg: elapsed time t=1.262 s, 32768 iters, t-(init.)=1.242 s t(norm)=0.504976, mflops=9.90146 (err=5.4e-016) Top mflops for N=18 = 129.194 Normalized results and averages for N=18: fft 0: mflops = 47.6652 (norm. = 0.368944), norm. avg. (of 5) = 0.252511 fft 1: mflops = 36.3835 (norm. = 0.28162), norm. avg. (of 5) = 0.186483 fft 2: mflops = 52.8078 (norm. = 0.408749), norm. avg. (of 5) = 0.357634 fft 3: mflops = 129.194 (norm. = 1), norm. avg. (of 5) = 0.989576 fft 4: mflops = 119.904 (norm. = 0.928093), norm. avg. (of 5) = 0.980142 fft 5: mflops = 24.8186 (norm. = 0.192104), norm. avg. (of 5) = 0.166706 fft 6: mflops = 70.1218 (norm. = 0.542766), norm. avg. (of 5) = 0.350326 fft 7: mflops = 42.3326 (norm. = 0.327668), norm. avg. (of 5) = 0.209961 fft 8: mflops = 34.6168 (norm. = 0.267945), norm. avg. (of 5) = 0.194656 fft 9: mflops = 9.90146 (norm. = 0.0766405), norm. avg. 
(of 5) = 0.0621998 Benchmarking for array size = 24: 0. CWP (min N): elapsed time t=1.252 s, 131072 iters, t-(init.)=1.192 s t(norm)=0.0826455, mflops=60.4993 1. CWP (best N) (N=28): elapsed time t=1.442 s, 131072 iters, t-(init.)=1.372 s t(norm)=0.0951255, mflops=52.5621 2. FFTPACK (f2c): elapsed time t=1.202 s, 131072 iters, t-(init.)=1.142 s t(norm)=0.0791788, mflops=63.1482 (err=2.5e-016) FFTW_MEASURE plan: (cost = 3.829956e-006) FFTW_TWIDDLE 2 FFTW_NOTW 12 3. FFTW: elapsed time t=1.011 s, 262144 iters, t-(init.)=0.891 s t(norm)=0.0308881, mflops=161.875 (err=1.5e-016) FFTW_ESTIMATE plan: (cost = 1.248000e+003) FFTW_TWIDDLE 2 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.011 s, 262144 iters, t-(init.)=0.881 s t(norm)=0.0305414, mflops=163.712 (err=1.5e-016) 5. Frigo-old: elapsed time t=1.723 s, 131072 iters, t-(init.)=1.663 s t(norm)=0.115302, mflops=43.3645 (err=3.7e-016) 6. GSL: elapsed time t=1.693 s, 262144 iters, t-(init.)=1.563 s t(norm)=0.0541841, mflops=92.278 (err=1.5e-016) 7. Singleton (f2c): elapsed time t=1.742 s, 131072 iters, t-(init.)=1.682 s t(norm)=0.116619, mflops=42.8747 (err=2.5e-016) 8. Temperton (f2c): elapsed time t=1.513 s, 131072 iters, t-(init.)=1.443 s t(norm)=0.100048, mflops=49.9759 (err=2.1e-016) 9. Valkenburg: elapsed time t=1.863 s, 32768 iters, t-(init.)=1.843 s t(norm)=0.511126, mflops=9.78231 (err=4.8e-016) Top mflops for N=24 = 163.712 Normalized results and averages for N=24: fft 0: mflops = 60.4993 (norm. = 0.369547), norm. avg. (of 6) = 0.272017 fft 1: mflops = 52.5621 (norm. = 0.321064), norm. avg. (of 6) = 0.208913 fft 2: mflops = 63.1482 (norm. = 0.385727), norm. avg. (of 6) = 0.362317 fft 3: mflops = 161.875 (norm. = 0.988777), norm. avg. (of 6) = 0.989442 fft 4: mflops = 163.712 (norm. = 1), norm. avg. (of 6) = 0.983452 fft 5: mflops = 43.3645 (norm. = 0.264883), norm. avg. (of 6) = 0.183069 fft 6: mflops = 92.278 (norm. = 0.56366), norm. avg. (of 6) = 0.385882 fft 7: mflops = 42.8747 (norm. = 0.261891), norm. avg. 
(of 6) = 0.218616 fft 8: mflops = 49.9759 (norm. = 0.305267), norm. avg. (of 6) = 0.213091 fft 9: mflops = 9.78231 (norm. = 0.0597531), norm. avg. (of 6) = 0.061792 Benchmarking for array size = 36: 0. CWP (min N): elapsed time t=1.872 s, 131072 iters, t-(init.)=1.781 s t(norm)=0.0730075, mflops=68.4862 1. CWP (best N): elapsed time t=1.863 s, 131072 iters, t-(init.)=1.773 s t(norm)=0.0726795, mflops=68.7952 2. FFTPACK (f2c): elapsed time t=1.762 s, 131072 iters, t-(init.)=1.672 s t(norm)=0.0685393, mflops=72.9509 (err=3.9e-016) FFTW_MEASURE plan: (cost = 6.713867e-006) FFTW_TWIDDLE 3 FFTW_NOTW 12 3. FFTW: elapsed time t=1.712 s, 262144 iters, t-(init.)=1.531 s t(norm)=0.0313797, mflops=159.339 (err=3.5e-016) FFTW_ESTIMATE plan: (cost = 1.803600e+003) FFTW_TWIDDLE 3 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.723 s, 262144 iters, t-(init.)=1.553 s t(norm)=0.0318306, mflops=157.082 (err=3.5e-016) 5. Frigo-old: elapsed time t=1.041 s, 32768 iters, t-(init.)=1.021 s t(norm)=0.167413, mflops=29.8663 (err=5.6e-016) 6. GSL: elapsed time t=1.352 s, 131072 iters, t-(init.)=1.262 s t(norm)=0.0517324, mflops=96.6512 (err=3.9e-016) 7. Singleton (f2c): elapsed time t=1.973 s, 131072 iters, t-(init.)=1.883 s t(norm)=0.0771887, mflops=64.7763 (err=4.7e-016) 8. Temperton (f2c): elapsed time t=1.031 s, 65536 iters, t-(init.)=0.981 s t(norm)=0.0804271, mflops=62.1681 (err=3.5e-016) 9. Valkenburg: elapsed time t=1.533 s, 16384 iters, t-(init.)=1.523 s t(norm)=0.499451, mflops=10.011 (err=4.8e-016) Top mflops for N=36 = 159.339 Normalized results and averages for N=36: fft 0: mflops = 68.4862 (norm. = 0.429815), norm. avg. (of 7) = 0.294559 fft 1: mflops = 68.7952 (norm. = 0.431754), norm. avg. (of 7) = 0.240747 fft 2: mflops = 72.9509 (norm. = 0.457835), norm. avg. (of 7) = 0.375962 fft 3: mflops = 159.339 (norm. = 1), norm. avg. (of 7) = 0.990951 fft 4: mflops = 157.082 (norm. = 0.985834), norm. avg. (of 7) = 0.983792 fft 5: mflops = 29.8663 (norm. = 0.187439), norm. avg. 
(of 7) = 0.183693 fft 6: mflops = 96.6512 (norm. = 0.606577), norm. avg. (of 7) = 0.41741 fft 7: mflops = 64.7763 (norm. = 0.406532), norm. avg. (of 7) = 0.245461 fft 8: mflops = 62.1681 (norm. = 0.390163), norm. avg. (of 7) = 0.238387 fft 9: mflops = 10.011 (norm. = 0.0628283), norm. avg. (of 7) = 0.06194 Benchmarking for array size = 80: 0. CWP (min N): elapsed time t=1.102 s, 32768 iters, t-(init.)=1.052 s t(norm)=0.0634784, mflops=78.7669 1. CWP (best N) (N=84): elapsed time t=1.231 s, 32768 iters, t-(init.)=1.19 s t(norm)=0.0718055, mflops=69.6326 2. FFTPACK (f2c): elapsed time t=1.983 s, 65536 iters, t-(init.)=1.893 s t(norm)=0.0571125, mflops=87.5465 (err=3.9e-016) FFTW_MEASURE plan: (cost = 1.708984e-005) FFTW_TWIDDLE 5 FFTW_NOTW 16 3. FFTW: elapsed time t=1.122 s, 65536 iters, t-(init.)=1.022 s t(norm)=0.0308341, mflops=162.158 (err=3.6e-016) FFTW_ESTIMATE plan: (cost = 2.600000e+003) FFTW_TWIDDLE 5 FFTW_NOTW 16 4. FFTW_ESTIMATE: elapsed time t=1.131 s, 65536 iters, t-(init.)=1.031 s t(norm)=0.0311056, mflops=160.743 (err=3.6e-016) 5. Frigo-old: elapsed time t=1.673 s, 32768 iters, t-(init.)=1.623 s t(norm)=0.097933, mflops=51.0553 (err=4.2e-016) 6. GSL: elapsed time t=1.002 s, 32768 iters, t-(init.)=0.962 s t(norm)=0.0580478, mflops=86.1359 (err=3.4e-016) 7. Singleton (f2c): elapsed time t=1.932 s, 65536 iters, t-(init.)=1.832 s t(norm)=0.0552721, mflops=90.4615 (err=4.7e-016) 8. Temperton (f2c): elapsed time t=1.342 s, 32768 iters, t-(init.)=1.292 s t(norm)=0.0779602, mflops=64.1353 (err=4.1e-016) 9. Valkenburg: elapsed time t=1.092 s, 4096 iters, t-(init.)=1.092 s t(norm)=0.527137, mflops=9.48521 (err=5.1e-016) Top mflops for N=80 = 162.158 Normalized results and averages for N=80: fft 0: mflops = 78.7669 (norm. = 0.485741), norm. avg. (of 8) = 0.318457 fft 1: mflops = 69.6326 (norm. = 0.429412), norm. avg. (of 8) = 0.26433 fft 2: mflops = 87.5465 (norm. = 0.539884), norm. avg. (of 8) = 0.396452 fft 3: mflops = 162.158 (norm. = 1), norm. avg. 
(of 8) = 0.992082 fft 4: mflops = 160.743 (norm. = 0.991271), norm. avg. (of 8) = 0.984727 fft 5: mflops = 51.0553 (norm. = 0.314849), norm. avg. (of 8) = 0.200088 fft 6: mflops = 86.1359 (norm. = 0.531185), norm. avg. (of 8) = 0.431632 fft 7: mflops = 90.4615 (norm. = 0.55786), norm. avg. (of 8) = 0.284511 fft 8: mflops = 64.1353 (norm. = 0.395511), norm. avg. (of 8) = 0.258028 fft 9: mflops = 9.48521 (norm. = 0.0584936), norm. avg. (of 8) = 0.0615092 Benchmarking for array size = 108: 0. CWP (min N) (N=110): elapsed time t=1.963 s, 32768 iters, t-(init.)=1.903 s t(norm)=0.0796062, mflops=62.8092 1. CWP (best N) (N=112): elapsed time t=1.483 s, 32768 iters, t-(init.)=1.413 s t(norm)=0.0591086, mflops=84.5901 2. FFTPACK (f2c): elapsed time t=1.582 s, 32768 iters, t-(init.)=1.512 s t(norm)=0.0632499, mflops=79.0515 (err=4.5e-016) FFTW_MEASURE plan: (cost = 2.575684e-005) FFTW_TWIDDLE 9 FFTW_NOTW 12 3. FFTW: elapsed time t=1.712 s, 65536 iters, t-(init.)=1.591 s t(norm)=0.0332773, mflops=150.252 (err=3.3e-016) FFTW_ESTIMATE plan: (cost = 4.633200e+003) FFTW_TWIDDLE 9 FFTW_NOTW 12 4. FFTW_ESTIMATE: elapsed time t=1.723 s, 65536 iters, t-(init.)=1.603 s t(norm)=0.0335283, mflops=149.128 (err=3.3e-016) 5. Frigo-old: elapsed time t=1.061 s, 8192 iters, t-(init.)=1.041 s t(norm)=0.174188, mflops=28.7046 (err=5.4e-016) 6. GSL: elapsed time t=1.252 s, 32768 iters, t-(init.)=1.192 s t(norm)=0.0498637, mflops=100.273 (err=3.0e-016) 7. Singleton (f2c): elapsed time t=1.693 s, 32768 iters, t-(init.)=1.633 s t(norm)=0.0683116, mflops=73.194 (err=3.8e-016) 8. Temperton (f2c): elapsed time t=1.673 s, 32768 iters, t-(init.)=1.613 s t(norm)=0.0674749, mflops=74.1016 (err=3.5e-016) 9. Valkenburg: elapsed time t=1.452 s, 4096 iters, t-(init.)=1.442 s t(norm)=0.482573, mflops=10.3611 (err=7.1e-016) Top mflops for N=108 = 150.252 Normalized results and averages for N=108: fft 0: mflops = 62.8092 (norm. = 0.418024), norm. avg. (of 9) = 0.32952 fft 1: mflops = 84.5901 (norm. 
= 0.562987), norm. avg. (of 9) = 0.297514 fft 2: mflops = 79.0515 (norm. = 0.526124), norm. avg. (of 9) = 0.41086 fft 3: mflops = 150.252 (norm. = 1), norm. avg. (of 9) = 0.992962 fft 4: mflops = 149.128 (norm. = 0.992514), norm. avg. (of 9) = 0.985592 fft 5: mflops = 28.7046 (norm. = 0.191042), norm. avg. (of 9) = 0.199082 fft 6: mflops = 100.273 (norm. = 0.667366), norm. avg. (of 9) = 0.457824 fft 7: mflops = 73.194 (norm. = 0.48714), norm. avg. (of 9) = 0.307025 fft 8: mflops = 74.1016 (norm. = 0.49318), norm. avg. (of 9) = 0.284156 fft 9: mflops = 10.3611 (norm. = 0.068958), norm. avg. (of 9) = 0.0623369 Benchmarking for array size = 210: 0. CWP (min N): elapsed time t=1.973 s, 16384 iters, t-(init.)=1.913 s t(norm)=0.0720746, mflops=69.3726 1. CWP (best N): elapsed time t=1.973 s, 16384 iters, t-(init.)=1.913 s t(norm)=0.0720746, mflops=69.3726 2. FFTPACK (f2c): elapsed time t=1.362 s, 8192 iters, t-(init.)=1.342 s t(norm)=0.101123, mflops=49.4447 (err=5.4e-016) FFTW_MEASURE plan: (cost = 6.372070e-005) FFTW_TWIDDLE 2 FFTW_TWIDDLE 7 FFTW_NOTW 15 3. FFTW: elapsed time t=1.092 s, 16384 iters, t-(init.)=1.032 s t(norm)=0.0388819, mflops=128.595 (err=4.1e-016) FFTW_ESTIMATE plan: (cost = 9.324000e+003) FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.112 s, 16384 iters, t-(init.)=1.052 s t(norm)=0.0396354, mflops=126.15 (err=4.0e-016) 5. Frigo-old: elapsed time t=1.301 s, 4096 iters, t-(init.)=1.291 s t(norm)=0.19456, mflops=25.699 (err=5.2e-016) 6. GSL: elapsed time t=1.522 s, 16384 iters, t-(init.)=1.462 s t(norm)=0.0550826, mflops=90.7727 (err=5.3e-016) 7. Singleton (f2c): elapsed time t=1.042 s, 8192 iters, t-(init.)=1.012 s t(norm)=0.0762567, mflops=65.568 (err=5.6e-016) 8. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 9. 
Valkenburg: elapsed time t=1.012 s, 1024 iters, t-(init.)=1.012 s t(norm)=0.610053, mflops=8.196 (err=6.8e-016) Top mflops for N=210 = 128.595 Normalized results and averages for N=210: fft 0: mflops = 69.3726 (norm. = 0.539467), norm. avg. (of 10) = 0.350515 fft 1: mflops = 69.3726 (norm. = 0.539467), norm. avg. (of 10) = 0.32171 fft 2: mflops = 49.4447 (norm. = 0.384501), norm. avg. (of 10) = 0.408224 fft 3: mflops = 128.595 (norm. = 1), norm. avg. (of 10) = 0.993665 fft 4: mflops = 126.15 (norm. = 0.980989), norm. avg. (of 10) = 0.985132 fft 5: mflops = 25.699 (norm. = 0.199845), norm. avg. (of 10) = 0.199159 fft 6: mflops = 90.7727 (norm. = 0.705882), norm. avg. (of 10) = 0.48263 fft 7: mflops = 65.568 (norm. = 0.509881), norm. avg. (of 10) = 0.327311 fft 8: mflops = -1 (norm. = -0.00777637), norm. avg. (of 9) = 0.284156 fft 9: mflops = 8.196 (norm. = 0.0637352), norm. avg. (of 10) = 0.0624767 Benchmarking for array size = 504: 0. CWP (min N): elapsed time t=1.052 s, 4096 iters, t-(init.)=1.012 s t(norm)=0.0546066, mflops=91.564 1. CWP (best N): elapsed time t=1.052 s, 4096 iters, t-(init.)=1.022 s t(norm)=0.0551462, mflops=90.6681 2. FFTPACK (f2c): elapsed time t=1.953 s, 4096 iters, t-(init.)=1.913 s t(norm)=0.103224, mflops=48.4385 (err=1.4e-015) FFTW_MEASURE plan: (cost = 1.953125e-004) FFTW_TWIDDLE 6 FFTW_TWIDDLE 7 FFTW_NOTW 12 3. FFTW: elapsed time t=1.582 s, 8192 iters, t-(init.)=1.512 s t(norm)=0.0407931, mflops=122.57 (err=1.4e-015) FFTW_ESTIMATE plan: (cost = 2.147040e+004) FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.642 s, 8192 iters, t-(init.)=1.572 s t(norm)=0.0424119, mflops=117.892 (err=1.4e-015) 5. Frigo-old: elapsed time t=1.602 s, 2048 iters, t-(init.)=1.592 s t(norm)=0.171806, mflops=29.1026 (err=1.4e-015) 6. GSL: elapsed time t=1.021 s, 4096 iters, t-(init.)=0.991 s t(norm)=0.0534735, mflops=93.5043 (err=1.4e-015) 7. 
Singleton (f2c): elapsed time t=1.232 s, 4096 iters, t-(init.)=1.192 s t(norm)=0.0643193, mflops=77.7372 (err=1.9e-015) 8. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 9. Valkenburg: elapsed time t=1.312 s, 512 iters, t-(init.)=1.302 s t(norm)=0.562038, mflops=8.89619 (err=1.7e-015) Top mflops for N=504 = 122.57 Normalized results and averages for N=504: fft 0: mflops = 91.564 (norm. = 0.747036), norm. avg. (of 11) = 0.386562 fft 1: mflops = 90.6681 (norm. = 0.739726), norm. avg. (of 11) = 0.359711 fft 2: mflops = 48.4385 (norm. = 0.395191), norm. avg. (of 11) = 0.407039 fft 3: mflops = 122.57 (norm. = 1), norm. avg. (of 11) = 0.994241 fft 4: mflops = 117.892 (norm. = 0.961832), norm. avg. (of 11) = 0.983014 fft 5: mflops = 29.1026 (norm. = 0.237437), norm. avg. (of 11) = 0.202639 fft 6: mflops = 93.5043 (norm. = 0.762866), norm. avg. (of 11) = 0.508106 fft 7: mflops = 77.7372 (norm. = 0.634228), norm. avg. (of 11) = 0.355212 fft 8: mflops = -1 (norm. = -0.00815862), norm. avg. (of 9) = 0.284156 fft 9: mflops = 8.89619 (norm. = 0.0725806), norm. avg. (of 11) = 0.0633952 Benchmarking for array size = 1000: 0. CWP (min N) (N=1001): elapsed time t=1.432 s, 2048 iters, t-(init.)=1.402 s t(norm)=0.0686921, mflops=72.7886 1. CWP (best N) (N=1008): elapsed time t=1.181 s, 2048 iters, t-(init.)=1.141 s t(norm)=0.0559042, mflops=89.4388 2. FFTPACK (f2c): elapsed time t=1.753 s, 2048 iters, t-(init.)=1.723 s t(norm)=0.0844197, mflops=59.2279 (err=1.0e-015) FFTW_MEASURE plan: (cost = 4.882813e-004) FFTW_TWIDDLE 5 FFTW_TWIDDLE 4 FFTW_TWIDDLE 5 FFTW_NOTW 10 3. FFTW: elapsed time t=1.012 s, 2048 iters, t-(init.)=0.982 s t(norm)=0.0481138, mflops=103.92 (err=9.1e-016) FFTW_ESTIMATE plan: (cost = 5.220000e+004) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_NOTW 10 4. FFTW_ESTIMATE: elapsed time t=1.952 s, 4096 iters, t-(init.)=1.882 s t(norm)=0.046105, mflops=108.448 (err=8.8e-016) 5. 
Frigo-old: elapsed time t=1.782 s, 1024 iters, t-(init.)=1.762 s t(norm)=0.172661, mflops=28.9585 (err=9.3e-016) 6. GSL: elapsed time t=1.553 s, 2048 iters, t-(init.)=1.513 s t(norm)=0.0741306, mflops=67.4485 (err=9.1e-016) 7. Singleton (f2c): elapsed time t=1.182 s, 2048 iters, t-(init.)=1.142 s t(norm)=0.0559532, mflops=89.3604 (err=1.2e-015) 8. Temperton (f2c): elapsed time t=1.282 s, 2048 iters, t-(init.)=1.252 s t(norm)=0.0613427, mflops=81.5093 (err=9.2e-016) 9. Valkenburg: elapsed time t=1.492 s, 256 iters, t-(init.)=1.492 s t(norm)=0.584813, mflops=8.54973 (err=1.0e-015) Top mflops for N=1000 = 108.448 Normalized results and averages for N=1000: fft 0: mflops = 72.7886 (norm. = 0.671184), norm. avg. (of 12) = 0.410281 fft 1: mflops = 89.4388 (norm. = 0.824715), norm. avg. (of 12) = 0.398461 fft 2: mflops = 59.2279 (norm. = 0.54614), norm. avg. (of 12) = 0.418631 fft 3: mflops = 103.92 (norm. = 0.958248), norm. avg. (of 12) = 0.991242 fft 4: mflops = 108.448 (norm. = 1), norm. avg. (of 12) = 0.984429 fft 5: mflops = 28.9585 (norm. = 0.267026), norm. avg. (of 12) = 0.208004 fft 6: mflops = 67.4485 (norm. = 0.621943), norm. avg. (of 12) = 0.517593 fft 7: mflops = 89.3604 (norm. = 0.823993), norm. avg. (of 12) = 0.394277 fft 8: mflops = 81.5093 (norm. = 0.751597), norm. avg. (of 10) = 0.3309 fft 9: mflops = 8.54973 (norm. = 0.0788371), norm. avg. (of 12) = 0.0646821 Benchmarking for array size = 1960: 0. CWP (min N) (N=1980): elapsed time t=1.632 s, 1024 iters, t-(init.)=1.512 s t(norm)=0.068883, mflops=72.5869 1. CWP (best N) (N=1980): elapsed time t=1.623 s, 1024 iters, t-(init.)=1.503 s t(norm)=0.068473, mflops=73.0215 2. FFTPACK (f2c): elapsed time t=1.773 s, 512 iters, t-(init.)=1.723 s t(norm)=0.156991, mflops=31.8489 (err=2.8e-015) FFTW_MEASURE plan: (cost = 1.171875e-003) FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 7 FFTW_NOTW 14 3. 
FFTW: elapsed time t=1.162 s, 1024 iters, t-(init.)=1.042 s t(norm)=0.0474709, mflops=105.328 (err=2.8e-015) FFTW_ESTIMATE plan: (cost = 9.662800e+004) FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 14 4. FFTW_ESTIMATE: elapsed time t=1.202 s, 1024 iters, t-(init.)=1.092 s t(norm)=0.0497488, mflops=100.505 (err=2.8e-015) 5. Frigo-old: elapsed time t=1.092 s, 256 iters, t-(init.)=1.072 s t(norm)=0.195351, mflops=25.595 (err=2.8e-015) 6. GSL: elapsed time t=1.672 s, 1024 iters, t-(init.)=1.562 s t(norm)=0.0711609, mflops=70.2633 (err=2.9e-015) 7. Singleton (f2c): elapsed time t=1.933 s, 1024 iters, t-(init.)=1.813 s t(norm)=0.0825958, mflops=60.5358 (err=4.2e-015) 8. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 9. Valkenburg: elapsed time t=1.823 s, 128 iters, t-(init.)=1.813 s t(norm)=0.660766, mflops=7.56697 (err=2.9e-015) Top mflops for N=1960 = 105.328 Normalized results and averages for N=1960: fft 0: mflops = 72.5869 (norm. = 0.689153), norm. avg. (of 13) = 0.431732 fft 1: mflops = 73.0215 (norm. = 0.69328), norm. avg. (of 13) = 0.42114 fft 2: mflops = 31.8489 (norm. = 0.30238), norm. avg. (of 13) = 0.409689 fft 3: mflops = 105.328 (norm. = 1), norm. avg. (of 13) = 0.991916 fft 4: mflops = 100.505 (norm. = 0.954212), norm. avg. (of 13) = 0.982105 fft 5: mflops = 25.595 (norm. = 0.243004), norm. avg. (of 13) = 0.210696 fft 6: mflops = 70.2633 (norm. = 0.667093), norm. avg. (of 13) = 0.529093 fft 7: mflops = 60.5358 (norm. = 0.574738), norm. avg. (of 13) = 0.408159 fft 8: mflops = -1 (norm. = -0.00949419), norm. avg. (of 10) = 0.3309 fft 9: mflops = 7.56697 (norm. = 0.0718423), norm. avg. (of 13) = 0.0652329 Benchmarking for array size = 4725: 0. CWP (min N) (N=5005): elapsed time t=1.232 s, 256 iters, t-(init.)=1.162 s t(norm)=0.0787023, mflops=63.5305 1. CWP (best N) (N=5040): elapsed time t=1.062 s, 256 iters, t-(init.)=0.992 s t(norm)=0.0671882, mflops=74.4178 2. 
FFTPACK (f2c): elapsed time t=1.773 s, 256 iters, t-(init.)=1.703 s t(norm)=0.115344, mflops=43.3485 (err=1.9e-015) FFTW_MEASURE plan: (cost = 3.125000e-003) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_NOTW 15 3. FFTW: elapsed time t=1.572 s, 512 iters, t-(init.)=1.432 s t(norm)=0.0484947, mflops=103.104 (err=1.9e-015) FFTW_ESTIMATE plan: (cost = 1.946700e+005) FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.592 s, 512 iters, t-(init.)=1.452 s t(norm)=0.049172, mflops=101.684 (err=1.9e-015) 5. Frigo-old: elapsed time t=1.823 s, 128 iters, t-(init.)=1.783 s t(norm)=0.241525, mflops=20.7018 (err=2.0e-015) 6. GSL: elapsed time t=1.022 s, 256 iters, t-(init.)=0.952 s t(norm)=0.064479, mflops=77.5446 (err=1.9e-015) 7. Singleton (f2c): elapsed time t=1.362 s, 256 iters, t-(init.)=1.292 s t(norm)=0.0875072, mflops=57.1381 (err=2.5e-015) 8. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 9. Valkenburg: elapsed time t=1.121 s, 32 iters, t-(init.)=1.111 s t(norm)=0.601985, mflops=8.30586 (err=1.9e-015) Top mflops for N=4725 = 103.104 Normalized results and averages for N=4725: fft 0: mflops = 63.5305 (norm. = 0.616179), norm. avg. (of 14) = 0.444907 fft 1: mflops = 74.4178 (norm. = 0.721774), norm. avg. (of 14) = 0.442614 fft 2: mflops = 43.3485 (norm. = 0.420435), norm. avg. (of 14) = 0.410456 fft 3: mflops = 103.104 (norm. = 1), norm. avg. (of 14) = 0.992493 fft 4: mflops = 101.684 (norm. = 0.986226), norm. avg. (of 14) = 0.982399 fft 5: mflops = 20.7018 (norm. = 0.200785), norm. avg. (of 14) = 0.209989 fft 6: mflops = 77.5446 (norm. = 0.752101), norm. avg. (of 14) = 0.545022 fft 7: mflops = 57.1381 (norm. = 0.55418), norm. avg. (of 14) = 0.418589 fft 8: mflops = -1 (norm. = -0.00969894), norm. avg. (of 10) = 0.3309 fft 9: mflops = 8.30586 (norm. = 0.0805581), norm. avg. (of 14) = 0.0663275 Benchmarking for array size = 10368: 0. 
CWP (min N) (N=10920): elapsed time t=1.412 s, 128 iters, t-(init.)=1.332 s t(norm)=0.0752399, mflops=66.4541 1. CWP (best N) (N=11088): elapsed time t=1.302 s, 128 iters, t-(init.)=1.222 s t(norm)=0.0690264, mflops=72.436 2. FFTPACK (f2c): elapsed time t=1.642 s, 128 iters, t-(init.)=1.572 s t(norm)=0.0887967, mflops=56.3084 (err=3.1e-015) FFTW_MEASURE plan: (cost = 6.250000e-003) FFTW_TWIDDLE 6 FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_NOTW 32 3. FFTW: elapsed time t=1.682 s, 256 iters, t-(init.)=1.532 s t(norm)=0.0432686, mflops=115.557 (err=3.1e-015) FFTW_ESTIMATE plan: (cost = 1.254528e+005) FFTW_TWIDDLE 9 FFTW_TWIDDLE 6 FFTW_TWIDDLE 6 FFTW_NOTW 32 4. FFTW_ESTIMATE: elapsed time t=1.732 s, 256 iters, t-(init.)=1.581 s t(norm)=0.0446525, mflops=111.976 (err=3.1e-015) 5. Frigo-old: elapsed time t=1.462 s, 64 iters, t-(init.)=1.422 s t(norm)=0.160647, mflops=31.1241 (err=3.2e-015) 6. GSL: elapsed time t=1.112 s, 128 iters, t-(init.)=1.032 s t(norm)=0.058294, mflops=85.7721 (err=3.0e-015) 7. Singleton (f2c): elapsed time t=1.743 s, 128 iters, t-(init.)=1.673 s t(norm)=0.0945018, mflops=52.9091 (err=4.4e-015) 8. Temperton (f2c): elapsed time t=1.593 s, 128 iters, t-(init.)=1.523 s t(norm)=0.0860288, mflops=58.1201 (err=3.0e-015) 9. Valkenburg: elapsed time t=1.202 s, 16 iters, t-(init.)=1.202 s t(norm)=0.543173, mflops=9.20516 (err=3.1e-015) Top mflops for N=10368 = 115.557 Normalized results and averages for N=10368: fft 0: mflops = 66.4541 (norm. = 0.575075), norm. avg. (of 15) = 0.453585 fft 1: mflops = 72.436 (norm. = 0.626841), norm. avg. (of 15) = 0.454896 fft 2: mflops = 56.3084 (norm. = 0.487277), norm. avg. (of 15) = 0.415578 fft 3: mflops = 115.557 (norm. = 1), norm. avg. (of 15) = 0.992994 fft 4: mflops = 111.976 (norm. = 0.969007), norm. avg. (of 15) = 0.981506 fft 5: mflops = 31.1241 (norm. = 0.269339), norm. avg. (of 15) = 0.213945 fft 6: mflops = 85.7721 (norm. = 0.742248), norm. avg. (of 15) = 0.55817 fft 7: mflops = 52.9091 (norm. = 0.45786), norm. avg. 
(of 15) = 0.421207 fft 8: mflops = 58.1201 (norm. = 0.502955), norm. avg. (of 11) = 0.346541 fft 9: mflops = 9.20516 (norm. = 0.0796589), norm. avg. (of 15) = 0.0672163 Benchmarking for array size = 27000: 0. CWP (min N) (N=27720): elapsed time t=1.853 s, 64 iters, t-(init.)=1.753 s t(norm)=0.0689145, mflops=72.5537 1. CWP (best N) (N=27720): elapsed time t=1.862 s, 64 iters, t-(init.)=1.762 s t(norm)=0.0692683, mflops=72.1831 2. FFTPACK (f2c): elapsed time t=1.112 s, 16 iters, t-(init.)=1.092 s t(norm)=0.171716, mflops=29.1178 (err=5.6e-015) FFTW_MEASURE plan: (cost = 3.000000e-002) FFTW_TWIDDLE 10 FFTW_TWIDDLE 3 FFTW_TWIDDLE 10 FFTW_TWIDDLE 6 FFTW_NOTW 15 3. FFTW: elapsed time t=1.912 s, 64 iters, t-(init.)=1.811 s t(norm)=0.0711946, mflops=70.23 (err=5.7e-015) FFTW_ESTIMATE plan: (cost = 1.231200e+006) FFTW_TWIDDLE 10 FFTW_TWIDDLE 10 FFTW_TWIDDLE 2 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.933 s, 64 iters, t-(init.)=1.833 s t(norm)=0.0720595, mflops=69.3871 (err=5.7e-015) 5. Frigo-old: elapsed time t=1.702 s, 16 iters, t-(init.)=1.672 s t(norm)=0.262921, mflops=19.0171 (err=5.8e-015) 6. GSL: elapsed time t=1.592 s, 32 iters, t-(init.)=1.542 s t(norm)=0.121239, mflops=41.2408 (err=5.6e-015) 7. Singleton (f2c): elapsed time t=1.242 s, 32 iters, t-(init.)=1.192 s t(norm)=0.0937206, mflops=53.3501 (err=7.8e-015) 8. Temperton (f2c): elapsed time t=1.082 s, 32 iters, t-(init.)=1.032 s t(norm)=0.0811406, mflops=61.6214 (err=5.7e-015) 9. Valkenburg: elapsed time t=1.051 s, 4 iters, t-(init.)=1.041 s t(norm)=0.654786, mflops=7.63608 (err=5.5e-015) Top mflops for N=27000 = 72.5537 Normalized results and averages for N=27000: fft 0: mflops = 72.5537 (norm. = 1), norm. avg. (of 16) = 0.487736 fft 1: mflops = 72.1831 (norm. = 0.994892), norm. avg. (of 16) = 0.488645 fft 2: mflops = 29.1178 (norm. = 0.401328), norm. avg. (of 16) = 0.414687 fft 3: mflops = 70.23 (norm. = 0.967973), norm. avg. (of 16) = 0.99143 fft 4: mflops = 69.3871 (norm. 
= 0.956356), norm. avg. (of 16) = 0.979934 fft 5: mflops = 19.0171 (norm. = 0.262111), norm. avg. (of 16) = 0.216956 fft 6: mflops = 41.2408 (norm. = 0.568418), norm. avg. (of 16) = 0.558811 fft 7: mflops = 53.3501 (norm. = 0.735319), norm. avg. (of 16) = 0.440839 fft 8: mflops = 61.6214 (norm. = 0.849322), norm. avg. (of 12) = 0.38844 fft 9: mflops = 7.63608 (norm. = 0.105247), norm. avg. (of 16) = 0.0695932 Benchmarking for array size = 75600: 0. CWP (min N) (N=80080): elapsed time t=1.172 s, 8 iters, t-(init.)=1.062 s t(norm)=0.108351, mflops=46.1462 1. CWP (best N) (N=80080): elapsed time t=1.172 s, 8 iters, t-(init.)=1.062 s t(norm)=0.108351, mflops=46.1462 2. FFTPACK (f2c): elapsed time t=1.392 s, 4 iters, t-(init.)=1.342 s t(norm)=0.273837, mflops=18.259 (err=1.1e-014) FFTW_MEASURE plan: (cost = 1.050000e-001) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 10 FFTW_TWIDDLE 7 FFTW_NOTW 12 3. FFTW: elapsed time t=1.673 s, 16 iters, t-(init.)=1.453 s t(norm)=0.0741217, mflops=67.4566 (err=1.1e-014) FFTW_ESTIMATE plan: (cost = 2.971080e+006) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 8 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.622 s, 16 iters, t-(init.)=1.412 s t(norm)=0.0720302, mflops=69.4154 (err=1.1e-014) 5. Frigo-old: elapsed time t=1.442 s, 4 iters, t-(init.)=1.392 s t(norm)=0.28404, mflops=17.6032 (err=1.1e-014) 6. GSL: elapsed time t=1.703 s, 8 iters, t-(init.)=1.593 s t(norm)=0.162527, mflops=30.7641 (err=1.1e-014) 7. Singleton (f2c): elapsed time t=1.092 s, 4 iters, t-(init.)=1.042 s t(norm)=0.212622, mflops=23.516 (err=1.5e-014) 8. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 9. Valkenburg: elapsed time t=1.863 s, 2 iters, t-(init.)=1.833 s t(norm)=0.748053, mflops=6.68402 (err=1.1e-014) Top mflops for N=75600 = 69.4154 Normalized results and averages for N=75600: fft 0: mflops = 46.1462 (norm. = 0.664783), norm. avg. (of 17) = 0.49815 fft 1: mflops = 46.1462 (norm. = 0.664783), norm. avg. 
(of 17) = 0.499006 fft 2: mflops = 18.259 (norm. = 0.26304), norm. avg. (of 17) = 0.405767 fft 3: mflops = 67.4566 (norm. = 0.971783), norm. avg. (of 17) = 0.990274 fft 4: mflops = 69.4154 (norm. = 1), norm. avg. (of 17) = 0.981115 fft 5: mflops = 17.6032 (norm. = 0.253592), norm. avg. (of 17) = 0.219111 fft 6: mflops = 30.7641 (norm. = 0.443189), norm. avg. (of 17) = 0.552009 fft 7: mflops = 23.516 (norm. = 0.338772), norm. avg. (of 17) = 0.434835 fft 8: mflops = -1 (norm. = -0.014406), norm. avg. (of 12) = 0.38844 fft 9: mflops = 6.68402 (norm. = 0.0962902), norm. avg. (of 17) = 0.0711636 Benchmarking for array size = 165375: 0. CWP (min N) (N=180180): elapsed time t=1.572 s, 4 iters, t-(init.)=1.431 s t(norm)=0.124789, mflops=40.0676 1. CWP (best N) (N=180180): elapsed time t=1.572 s, 4 iters, t-(init.)=1.432 s t(norm)=0.124876, mflops=40.0396 2. FFTPACK (f2c): elapsed time t=1.182 s, 1 iters, t-(init.)=1.142 s t(norm)=0.398348, mflops=12.5518 (err=2.7e-014) FFTW_MEASURE plan: (cost = 2.700000e-001) FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_NOTW 15 3. FFTW: elapsed time t=1.082 s, 4 iters, t-(init.)=0.952 s t(norm)=0.0830183, mflops=60.2277 (err=2.7e-014) FFTW_ESTIMATE plan: (cost = 8.367975e+006) FFTW_TWIDDLE 7 FFTW_TWIDDLE 5 FFTW_TWIDDLE 5 FFTW_TWIDDLE 7 FFTW_TWIDDLE 9 FFTW_NOTW 15 4. FFTW_ESTIMATE: elapsed time t=1.081 s, 4 iters, t-(init.)=0.95 s t(norm)=0.0828439, mflops=60.3545 (err=2.7e-014) 5. Frigo-old: elapsed time t=1.162 s, 1 iters, t-(init.)=1.132 s t(norm)=0.39486, mflops=12.6627 (err=2.7e-014) 6. GSL: elapsed time t=1.953 s, 4 iters, t-(init.)=1.823 s t(norm)=0.158973, mflops=31.4519 (err=2.7e-014) 7. Singleton (f2c): elapsed time t=1.342 s, 2 iters, t-(init.)=1.282 s t(norm)=0.223591, mflops=22.3622 (err=4.0e-014) 8. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 9. 
Valkenburg: elapsed time t=2.243 s, 1 iters, t-(init.)=2.213 s t(norm)=0.77193, mflops=6.47727 (err=2.7e-014) Top mflops for N=165375 = 60.3545 Normalized results and averages for N=165375: fft 0: mflops = 40.0676 (norm. = 0.663871), norm. avg. (of 18) = 0.507357 fft 1: mflops = 40.0396 (norm. = 0.663408), norm. avg. (of 18) = 0.50814 fft 2: mflops = 12.5518 (norm. = 0.207968), norm. avg. (of 18) = 0.394778 fft 3: mflops = 60.2277 (norm. = 0.997899), norm. avg. (of 18) = 0.990698 fft 4: mflops = 60.3545 (norm. = 1), norm. avg. (of 18) = 0.982164 fft 5: mflops = 12.6627 (norm. = 0.209806), norm. avg. (of 18) = 0.218594 fft 6: mflops = 31.4519 (norm. = 0.521119), norm. avg. (of 18) = 0.550293 fft 7: mflops = 22.3622 (norm. = 0.370515), norm. avg. (of 18) = 0.431262 fft 8: mflops = -1 (norm. = -0.0165688), norm. avg. (of 12) = 0.38844 fft 9: mflops = 6.47727 (norm. = 0.10732), norm. avg. (of 18) = 0.0731723 Benchmarking for array size = 362880: 0. CWP (min N) (N=720720): elapsed time t=1.652 s, 1 iters, t-(init.)=1.512 s t(norm)=0.225602, mflops=22.163 1. CWP (best N) (N=720720): elapsed time t=1.652 s, 1 iters, t-(init.)=1.502 s t(norm)=0.22411, mflops=22.3105 2. FFTPACK (f2c): elapsed time t=2.043 s, 1 iters, t-(init.)=1.973 s t(norm)=0.294386, mflops=16.9845 (err=1.1e-013) FFTW_MEASURE plan: (cost = 5.400000e-001) FFTW_TWIDDLE 64 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 6 FFTW_NOTW 15 3. FFTW: elapsed time t=1.081 s, 2 iters, t-(init.)=0.93 s t(norm)=0.0693815, mflops=72.0654 (err=1.1e-013) FFTW_ESTIMATE plan: (cost = 7.511616e+006) FFTW_TWIDDLE 10 FFTW_TWIDDLE 9 FFTW_TWIDDLE 9 FFTW_TWIDDLE 7 FFTW_TWIDDLE 2 FFTW_NOTW 32 4. FFTW_ESTIMATE: elapsed time t=1.252 s, 2 iters, t-(init.)=1.102 s t(norm)=0.0822133, mflops=60.8174 (err=1.1e-013) 5. Frigo-old: elapsed time t=2.133 s, 1 iters, t-(init.)=2.063 s t(norm)=0.307815, mflops=16.2435 (err=1.1e-013) 6. GSL: elapsed time t=1.072 s, 1 iters, t-(init.)=1.002 s t(norm)=0.149506, mflops=33.4435 (err=1.1e-013) 7. 
Singleton (f2c): elapsed time t=1.973 s, 1 iters, t-(init.)=1.903 s t(norm)=0.283942, mflops=17.6092 (err=1.6e-013) 8. Skipping fft (Temperton only handles N = 2^m 3^n 5^q). 9. Valkenburg: elapsed time t=5.728 s, 1 iters, t-(init.)=5.658 s t(norm)=0.844216, mflops=5.92266 (err=1.1e-013) Top mflops for N=362880 = 72.0654 Normalized results and averages for N=362880: fft 0: mflops = 22.163 (norm. = 0.30754), norm. avg. (of 19) = 0.49684 fft 1: mflops = 22.3105 (norm. = 0.309587), norm. avg. (of 19) = 0.49769 fft 2: mflops = 16.9845 (norm. = 0.235682), norm. avg. (of 19) = 0.386404 fft 3: mflops = 72.0654 (norm. = 1), norm. avg. (of 19) = 0.991187 fft 4: mflops = 60.8174 (norm. = 0.84392), norm. avg. (of 19) = 0.974888 fft 5: mflops = 16.2435 (norm. = 0.2254), norm. avg. (of 19) = 0.218952 fft 6: mflops = 33.4435 (norm. = 0.464072), norm. avg. (of 19) = 0.545755 fft 7: mflops = 17.6092 (norm. = 0.244351), norm. avg. (of 19) = 0.421424 fft 8: mflops = -1 (norm. = -0.0138763), norm. avg. (of 12) = 0.38844 fft 9: mflops = 5.92266 (norm. = 0.0821845), norm. avg. (of 19) = 0.0736467 ------------------------------------------------------ @@@@ bench.3d.p2.log Benchmarking for sizes: 4x4x4 (0.00128174 MB) 8x8x8 (0.00830078 MB) 16x16x16 (0.0633545 MB) 32x32x32 (0.501587 MB) 64x64x64 (4.00305 MB) 256x64x32 (8.01184 MB) 16x1024x64 (16.047 MB) 128x128x128 (32.006 MB) 512x128x64 (64.0236 MB) 256x128x256 (128.012 MB) Maximum array size N = 8388608 Benchmarking FFTs: 0. FFTW 1. HARM (f2c) 2. PDA (f2c) 3. Singleton (f2c) 4. Temperton (f2c) Computing normalized averages (5 transforms). Benchmarking for array size = 4x4x4 (power of 2): 0. FFTW: elapsed time t=1.732 s, 131072 iters, t-(init.)=1.581 s t(norm)=0.0314116, mflops=159.177 (err=1.7e-016) 1. Skipping fft (all dimensions must be > 4 for HARM). 2. PDA (f2c): elapsed time t=1.302 s, 16384 iters, t-(init.)=1.282 s t(norm)=0.203768, mflops=24.5377 (err=1.9e-016) 3. 
Singleton (f2c): elapsed time t=1.202 s, 65536 iters, t-(init.)=1.122 s t(norm)=0.0445843, mflops=112.147 (err=1.7e-016) 4. Temperton (f2c): elapsed time t=1.973 s, 65536 iters, t-(init.)=1.893 s t(norm)=0.0752211, mflops=66.4707 (err=1.7e-016) Top mflops for N=64 = 159.177 Normalized results and averages for N=64: fft 0: mflops = 159.177 (norm. = 1), norm. avg. (of 1) = 1 fft 1: mflops = -1 (norm. = -0.00628233), norm. avg. (of 0) = -1 fft 2: mflops = 24.5377 (norm. = 0.154154), norm. avg. (of 1) = 0.154154 fft 3: mflops = 112.147 (norm. = 0.704545), norm. avg. (of 1) = 0.704545 fft 4: mflops = 66.4707 (norm. = 0.417591), norm. avg. (of 1) = 0.417591 Benchmarking for array size = 8x8x8 (power of 2): 0. FFTW: elapsed time t=1.923 s, 16384 iters, t-(init.)=1.773 s t(norm)=0.0234842, mflops=212.909 (err=3.1e-016) 1. HARM (f2c): elapsed time t=1.682 s, 8192 iters, t-(init.)=1.612 s t(norm)=0.0427034, mflops=117.087 (err=3.2e-016) 2. PDA (f2c): elapsed time t=1.202 s, 2048 iters, t-(init.)=1.192 s t(norm)=0.126309, mflops=39.5855 (err=2.8e-016) 3. Singleton (f2c): elapsed time t=1.852 s, 8192 iters, t-(init.)=1.782 s t(norm)=0.0472069, mflops=105.917 (err=3.1e-016) 4. Temperton (f2c): elapsed time t=1.642 s, 8192 iters, t-(init.)=1.572 s t(norm)=0.0416438, mflops=120.066 (err=3.1e-016) Top mflops for N=512 = 212.909 Normalized results and averages for N=512: fft 0: mflops = 212.909 (norm. = 1), norm. avg. (of 2) = 1 fft 1: mflops = 117.087 (norm. = 0.549938), norm. avg. (of 1) = 0.549938 fft 2: mflops = 39.5855 (norm. = 0.185927), norm. avg. (of 2) = 0.17004 fft 3: mflops = 105.917 (norm. = 0.497475), norm. avg. (of 2) = 0.60101 fft 4: mflops = 120.066 (norm. = 0.563931), norm. avg. (of 2) = 0.490761 Benchmarking for array size = 16x16x16 (power of 2): 0. FFTW: elapsed time t=1.222 s, 512 iters, t-(init.)=1.102 s t(norm)=0.0437895, mflops=114.183 (err=3.7e-016) 1. 
HARM (f2c): elapsed time t=1.212 s, 512 iters, t-(init.)=1.092 s t(norm)=0.0433922, mflops=115.228 (err=3.7e-016) 2. PDA (f2c): elapsed time t=1.382 s, 256 iters, t-(init.)=1.322 s t(norm)=0.105063, mflops=47.5904 (err=4.0e-016) 3. Singleton (f2c): elapsed time t=1.952 s, 512 iters, t-(init.)=1.831 s t(norm)=0.0727574, mflops=68.7215 (err=3.8e-016) 4. Temperton (f2c): elapsed time t=1.482 s, 512 iters, t-(init.)=1.362 s t(norm)=0.054121, mflops=92.3856 (err=4.1e-016) Top mflops for N=4096 = 115.228 Normalized results and averages for N=4096: fft 0: mflops = 114.183 (norm. = 0.990926), norm. avg. (of 3) = 0.996975 fft 1: mflops = 115.228 (norm. = 1), norm. avg. (of 2) = 0.774969 fft 2: mflops = 47.5904 (norm. = 0.413011), norm. avg. (of 3) = 0.25103 fft 3: mflops = 68.7215 (norm. = 0.596395), norm. avg. (of 3) = 0.599472 fft 4: mflops = 92.3856 (norm. = 0.801762), norm. avg. (of 3) = 0.594428 Benchmarking for array size = 32x32x32 (power of 2): 0. FFTW: elapsed time t=1.341 s, 64 iters, t-(init.)=1.21 s t(norm)=0.0384649, mflops=129.989 (err=5.1e-016) 1. HARM (f2c): elapsed time t=1.803 s, 64 iters, t-(init.)=1.683 s t(norm)=0.0535011, mflops=93.456 (err=5.3e-016) 2. PDA (f2c): elapsed time t=1.712 s, 32 iters, t-(init.)=1.652 s t(norm)=0.105031, mflops=47.6048 (err=4.1e-016) 3. Singleton (f2c): elapsed time t=1.472 s, 32 iters, t-(init.)=1.412 s t(norm)=0.0897725, mflops=55.6963 (err=4.9e-016) 4. Temperton (f2c): elapsed time t=1.101 s, 32 iters, t-(init.)=1.041 s t(norm)=0.066185, mflops=75.5458 (err=4.6e-016) Top mflops for N=32768 = 129.989 Normalized results and averages for N=32768: fft 0: mflops = 129.989 (norm. = 1), norm. avg. (of 4) = 0.997731 fft 1: mflops = 93.456 (norm. = 0.718954), norm. avg. (of 3) = 0.756297 fft 2: mflops = 47.6048 (norm. = 0.366223), norm. avg. (of 4) = 0.279829 fft 3: mflops = 55.6963 (norm. = 0.42847), norm. avg. (of 4) = 0.556721 fft 4: mflops = 75.5458 (norm. = 0.581172), norm. avg. 
(of 4) = 0.591114 Benchmarking for array size = 64x64x64 (power of 2): 0. FFTW: elapsed time t=1.753 s, 4 iters, t-(init.)=1.543 s t(norm)=0.0817511, mflops=61.1613 (err=1.2e-015) 1. HARM (f2c): elapsed time t=1.232 s, 2 iters, t-(init.)=1.132 s t(norm)=0.119951, mflops=41.6837 (err=1.2e-015) 2. PDA (f2c): elapsed time t=1.683 s, 2 iters, t-(init.)=1.583 s t(norm)=0.167741, mflops=29.8079 (err=1.2e-015) 3. Singleton (f2c): elapsed time t=1.182 s, 1 iters, t-(init.)=1.132 s t(norm)=0.239902, mflops=20.8418 (err=1.7e-015) 4. Temperton (f2c): elapsed time t=1.302 s, 2 iters, t-(init.)=1.202 s t(norm)=0.127369, mflops=39.2562 (err=1.3e-015) Top mflops for N=262144 = 61.1613 Normalized results and averages for N=262144: fft 0: mflops = 61.1613 (norm. = 1), norm. avg. (of 5) = 0.998185 fft 1: mflops = 41.6837 (norm. = 0.681537), norm. avg. (of 4) = 0.737607 fft 2: mflops = 29.8079 (norm. = 0.487366), norm. avg. (of 5) = 0.321336 fft 3: mflops = 20.8418 (norm. = 0.340769), norm. avg. (of 5) = 0.513531 fft 4: mflops = 39.2562 (norm. = 0.641847), norm. avg. (of 5) = 0.601261 Benchmarking for array size = 256x64x32 (power of 2): 0. FFTW: elapsed time t=1.842 s, 2 iters, t-(init.)=1.641 s t(norm)=0.0823673, mflops=60.7037 (err=1.2e-015) 1. HARM (f2c): elapsed time t=1.372 s, 1 iters, t-(init.)=1.262 s t(norm)=0.126688, mflops=39.467 (err=1.2e-015) 2. PDA (f2c): elapsed time t=1.723 s, 1 iters, t-(init.)=1.613 s t(norm)=0.161924, mflops=30.8787 (err=1.2e-015) 3. Singleton (f2c): elapsed time t=2.644 s, 1 iters, t-(init.)=2.534 s t(norm)=0.25438, mflops=19.6556 (err=1.8e-015) 4. Temperton (f2c): elapsed time t=1.352 s, 1 iters, t-(init.)=1.242 s t(norm)=0.12468, mflops=40.1025 (err=1.2e-015) Top mflops for N=524288 = 60.7037 Normalized results and averages for N=524288: fft 0: mflops = 60.7037 (norm. = 1), norm. avg. (of 6) = 0.998488 fft 1: mflops = 39.467 (norm. = 0.650158), norm. avg. (of 5) = 0.720118 fft 2: mflops = 30.8787 (norm. = 0.508679), norm. avg. 
(of 6) = 0.35256 fft 3: mflops = 19.6556 (norm. = 0.323796), norm. avg. (of 6) = 0.481908 fft 4: mflops = 40.1025 (norm. = 0.660628), norm. avg. (of 6) = 0.611155 Benchmarking for array size = 16x1024x64 (power of 2): 0. FFTW: elapsed time t=1.893 s, 1 iters, t-(init.)=1.683 s t(norm)=0.0802517, mflops=62.304 (err=2.0e-015) 1. HARM (f2c): elapsed time t=2.784 s, 1 iters, t-(init.)=2.573 s t(norm)=0.12269, mflops=40.7531 (err=2.0e-015) 2. PDA (f2c): elapsed time t=4.307 s, 1 iters, t-(init.)=4.097 s t(norm)=0.19536, mflops=25.5938 (err=2.0e-015) 3. Singleton (f2c): elapsed time t=5.067 s, 1 iters, t-(init.)=4.856 s t(norm)=0.231552, mflops=21.5934 (err=2.8e-015) 4. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=1048576 = 62.304 Normalized results and averages for N=1048576: fft 0: mflops = 62.304 (norm. = 1), norm. avg. (of 7) = 0.998704 fft 1: mflops = 40.7531 (norm. = 0.6541), norm. avg. (of 6) = 0.709115 fft 2: mflops = 25.5938 (norm. = 0.410788), norm. avg. (of 7) = 0.360878 fft 3: mflops = 21.5934 (norm. = 0.346582), norm. avg. (of 7) = 0.462576 fft 4: mflops = -1 (norm. = -0.0160503), norm. avg. (of 6) = 0.611155 Benchmarking for array size = 128x128x128 (power of 2): 0. FFTW: elapsed time t=3.986 s, 1 iters, t-(init.)=3.565 s t(norm)=0.0809488, mflops=61.7675 (err=7.4e-016) 1. HARM (f2c): elapsed time t=6.329 s, 1 iters, t-(init.)=5.909 s t(norm)=0.134173, mflops=37.2654 (err=7.3e-016) 2. PDA (f2c): elapsed time t=7.23 s, 1 iters, t-(init.)=6.809 s t(norm)=0.154609, mflops=32.3397 (err=7.0e-016) 3. Singleton (f2c): elapsed time t=15.282 s, 1 iters, t-(init.)=14.852 s t(norm)=0.337237, mflops=14.8264 (err=8.9e-016) 4. Temperton (f2c): elapsed time t=8.071 s, 1 iters, t-(init.)=7.65 s t(norm)=0.173705, mflops=28.7844 (err=7.7e-016) Top mflops for N=2097152 = 61.7675 Normalized results and averages for N=2097152: fft 0: mflops = 61.7675 (norm. = 1), norm. avg. (of 8) = 0.998866 fft 1: mflops = 37.2654 (norm. = 0.603317), norm. avg. 
(of 7) = 0.694001 fft 2: mflops = 32.3397 (norm. = 0.523572), norm. avg. (of 8) = 0.381215 fft 3: mflops = 14.8264 (norm. = 0.240035), norm. avg. (of 8) = 0.434758 fft 4: mflops = 28.7844 (norm. = 0.466013), norm. avg. (of 7) = 0.590421 Benchmarking for array size = 512x128x64 (power of 2): 0. FFTW: elapsed time t=8.352 s, 1 iters, t-(init.)=7.501 s t(norm)=0.0812899, mflops=61.5083 (err=1.3e-015) 1. HARM (f2c): elapsed time t=13.149 s, 1 iters, t-(init.)=12.298 s t(norm)=0.133276, mflops=37.5161 (err=1.2e-015) 2. PDA (f2c): elapsed time t=15.172 s, 1 iters, t-(init.)=14.321 s t(norm)=0.1552, mflops=32.2166 (err=1.3e-015) 3. Singleton (f2c): elapsed time t=27.46 s, 1 iters, t-(init.)=26.609 s t(norm)=0.288367, mflops=17.339 (err=1.6e-015) 4. Skipping fft (Temperton can't handle dimensions > 256). Top mflops for N=4194304 = 61.5083 Normalized results and averages for N=4194304: fft 0: mflops = 61.5083 (norm. = 1), norm. avg. (of 9) = 0.998992 fft 1: mflops = 37.5161 (norm. = 0.609937), norm. avg. (of 8) = 0.683493 fft 2: mflops = 32.2166 (norm. = 0.523776), norm. avg. (of 9) = 0.397055 fft 3: mflops = 17.339 (norm. = 0.281897), norm. avg. (of 9) = 0.417774 fft 4: mflops = -1 (norm. = -0.016258), norm. avg. (of 7) = 0.590421 Benchmarking for array size = 256x128x256 (power of 2): 0. FFTW: elapsed time t=17.035 s, 1 iters, t-(init.)=15.333 s t(norm)=0.0794711, mflops=62.9159 (err=1.5e-015) 1. HARM (f2c): elapsed time t=26.007 s, 1 iters, t-(init.)=24.304 s t(norm)=0.125968, mflops=39.6926 (err=1.4e-015) 2. PDA (f2c): elapsed time t=30.454 s, 1 iters, t-(init.)=28.772 s t(norm)=0.149126, mflops=33.5288 (err=1.4e-015) 3. Singleton (f2c): elapsed time t=54.529 s, 1 iters, t-(init.)=52.847 s t(norm)=0.273907, mflops=18.2544 (err=2.1e-015) 4. Temperton (f2c): elapsed time t=31.105 s, 1 iters, t-(init.)=29.423 s t(norm)=0.1525, mflops=32.7869 (err=1.5e-015) Top mflops for N=8388608 = 62.9159 Normalized results and averages for N=8388608: fft 0: mflops = 62.9159 (norm. 
= 1), norm. avg. (of 10) = 0.999093
fft 1: mflops = 39.6926 (norm. = 0.630884), norm. avg. (of 9) = 0.677647
fft 2: mflops = 33.5288 (norm. = 0.532914), norm. avg. (of 10) = 0.410641
fft 3: mflops = 18.2544 (norm. = 0.290139), norm. avg. (of 10) = 0.40501
fft 4: mflops = 32.7869 (norm. = 0.521123), norm. avg. (of 8) = 0.581758
------------------------------------------------------
@@@@ bench.3d.np2.log
Benchmarking for sizes:
5x5x5 (0.0022583 MB)
6x6x6 (0.00369263 MB)
7x7x7 (0.00567627 MB)
9x9x9 (0.0116577 MB)
10x10x10 (0.0158386 MB)
11x11x11 (0.0209351 MB)
12x12x12 (0.0270386 MB)
13x13x13 (0.0342407 MB)
14x14x14 (0.0426331 MB)
15x15x15 (0.0523071 MB)
24x25x28 (0.257751 MB)
48x48x48 (1.68982 MB)
49x49x49 (1.79755 MB)
60x60x60 (3.29877 MB)
72x60x56 (3.69482 MB)
75x75x75 (6.44086 MB)
80x80x80 (7.81628 MB)
84x84x84 (9.04791 MB)
96x96x96 (13.5045 MB)
105x105x105 (17.6689 MB)
112x112x112 (21.4427 MB)
120x120x120 (26.3728 MB)
144x144x144 (45.5692 MB)
180x180x180 (88.9976 MB)
240x240x240 (210.949 MB)
Maximum array size N = 13824000
Benchmarking FFTs:
0. FFTW
1. PDA (f2c)
2. Singleton (f2c)
3. Temperton (f2c)
Computing normalized averages (4 transforms).

Benchmarking for array size = 5x5x5:
0. FFTW: elapsed time t=1.182 s, 32768 iters, t-(init.)=1.112 s t(norm)=0.038974, mflops=128.291 (err=2.8e-016)
1. PDA (f2c): elapsed time t=1.292 s, 8192 iters, t-(init.)=1.272 s t(norm)=0.178327, mflops=28.0384 (err=2.6e-016)
2. Singleton (f2c): elapsed time t=1.192 s, 32768 iters, t-(init.)=1.122 s t(norm)=0.0393245, mflops=127.147 (err=2.9e-016)
3. Temperton (f2c): elapsed time t=1.842 s, 32768 iters, t-(init.)=1.761 s t(norm)=0.0617205, mflops=81.0104 (err=2.1e-016)
Top mflops for N=125 = 128.291
Normalized results and averages for N=125:
fft 0: mflops = 128.291 (norm. = 1), norm. avg. (of 1) = 1
fft 1: mflops = 28.0384 (norm. = 0.218553), norm. avg. (of 1) = 0.218553
fft 2: mflops = 127.147 (norm. = 0.991087), norm. avg. (of 1) = 0.991087
fft 3: mflops = 81.0104 (norm. = 0.631459), norm. avg. (of 1) = 0.631459

Benchmarking for array size = 6x6x6:
0. FFTW: elapsed time t=1.612 s, 32768 iters, t-(init.)=1.492 s t(norm)=0.0271825, mflops=183.942 (err=2.6e-016)
1. PDA (f2c): elapsed time t=1.192 s, 4096 iters, t-(init.)=1.182 s t(norm)=0.172277, mflops=29.023 (err=2.9e-016)
2. Singleton (f2c): elapsed time t=1.602 s, 16384 iters, t-(init.)=1.542 s t(norm)=0.0561869, mflops=88.9887 (err=2.6e-016)
3. Temperton (f2c): elapsed time t=1.863 s, 16384 iters, t-(init.)=1.803 s t(norm)=0.0656972, mflops=76.1068 (err=2.5e-016)
Top mflops for N=216 = 183.942
Normalized results and averages for N=216:
fft 0: mflops = 183.942 (norm. = 1), norm. avg. (of 2) = 1
fft 1: mflops = 29.023 (norm. = 0.157783), norm. avg. (of 2) = 0.188168
fft 2: mflops = 88.9887 (norm. = 0.483787), norm. avg. (of 2) = 0.737437
fft 3: mflops = 76.1068 (norm. = 0.413755), norm. avg. (of 2) = 0.522607

Benchmarking for array size = 7x7x7:
0. FFTW: elapsed time t=1.082 s, 8192 iters, t-(init.)=1.032 s t(norm)=0.0436091, mflops=114.655 (err=4.0e-016)
1. PDA (f2c): elapsed time t=1.653 s, 2048 iters, t-(init.)=1.643 s t(norm)=0.277712, mflops=18.0043 (err=4.1e-016)
2. Singleton (f2c): elapsed time t=1.301 s, 8192 iters, t-(init.)=1.251 s t(norm)=0.0528633, mflops=94.5835 (err=6.4e-016)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=343 = 114.655
Normalized results and averages for N=343:
fft 0: mflops = 114.655 (norm. = 1), norm. avg. (of 3) = 1
fft 1: mflops = 18.0043 (norm. = 0.15703), norm. avg. (of 3) = 0.177789
fft 2: mflops = 94.5835 (norm. = 0.82494), norm. avg. (of 3) = 0.766605
fft 3: mflops = -1 (norm. = -0.00872182), norm. avg. (of 2) = 0.522607

Benchmarking for array size = 9x9x9:
0. FFTW: elapsed time t=1.041 s, 4096 iters, t-(init.)=0.991 s t(norm)=0.0348992, mflops=143.27 (err=4.7e-016)
1. PDA (f2c): elapsed time t=1.813 s, 2048 iters, t-(init.)=1.793 s t(norm)=0.126285, mflops=39.5929 (err=4.4e-016)
2. Singleton (f2c): elapsed time t=1.331 s, 4096 iters, t-(init.)=1.28 s t(norm)=0.0450767, mflops=110.922 (err=4.1e-016)
3. Temperton (f2c): elapsed time t=1.442 s, 4096 iters, t-(init.)=1.392 s t(norm)=0.0490209, mflops=101.997 (err=4.2e-016)
Top mflops for N=729 = 143.27
Normalized results and averages for N=729:
fft 0: mflops = 143.27 (norm. = 1), norm. avg. (of 4) = 1
fft 1: mflops = 39.5929 (norm. = 0.276352), norm. avg. (of 4) = 0.20243
fft 2: mflops = 110.922 (norm. = 0.774219), norm. avg. (of 4) = 0.768508
fft 3: mflops = 101.997 (norm. = 0.711925), norm. avg. (of 3) = 0.585713

Benchmarking for array size = 10x10x10:
0. FFTW: elapsed time t=1.412 s, 4096 iters, t-(init.)=1.342 s t(norm)=0.0328762, mflops=152.086 (err=3.3e-016)
1. PDA (f2c): elapsed time t=1.242 s, 1024 iters, t-(init.)=1.222 s t(norm)=0.119746, mflops=41.7552 (err=3.9e-016)
2. Singleton (f2c): elapsed time t=1.111 s, 2048 iters, t-(init.)=1.081 s t(norm)=0.0529644, mflops=94.403 (err=4.6e-016)
3. Temperton (f2c): elapsed time t=1.162 s, 2048 iters, t-(init.)=1.122 s t(norm)=0.0549733, mflops=90.9533 (err=3.6e-016)
Top mflops for N=1000 = 152.086
Normalized results and averages for N=1000:
fft 0: mflops = 152.086 (norm. = 1), norm. avg. (of 5) = 1
fft 1: mflops = 41.7552 (norm. = 0.27455), norm. avg. (of 5) = 0.216854
fft 2: mflops = 94.403 (norm. = 0.620722), norm. avg. (of 5) = 0.738951
fft 3: mflops = 90.9533 (norm. = 0.598039), norm. avg. (of 4) = 0.588795

Benchmarking for array size = 11x11x11:
0. FFTW: elapsed time t=1.763 s, 2048 iters, t-(init.)=1.613 s t(norm)=0.0570165, mflops=87.694 (err=4.4e-016)
1. PDA (f2c): elapsed time t=1.002 s, 256 iters, t-(init.)=0.982 s t(norm)=0.277695, mflops=18.0054 (err=4.9e-016)
2. Singleton (f2c): elapsed time t=1.892 s, 2048 iters, t-(init.)=1.742 s t(norm)=0.0615764, mflops=81.2 (err=4.6e-016)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=1331 = 87.694
Normalized results and averages for N=1331:
fft 0: mflops = 87.694 (norm. = 1), norm. avg. (of 6) = 1
fft 1: mflops = 18.0054 (norm. = 0.205321), norm. avg. (of 6) = 0.214932
fft 2: mflops = 81.2 (norm. = 0.925947), norm. avg. (of 6) = 0.770117
fft 3: mflops = -1 (norm. = -0.0114033), norm. avg. (of 4) = 0.588795

Benchmarking for array size = 12x12x12:
0. FFTW: elapsed time t=1.352 s, 2048 iters, t-(init.)=1.152 s t(norm)=0.0302672, mflops=165.195 (err=3.8e-016)
1. PDA (f2c): elapsed time t=1.041 s, 512 iters, t-(init.)=0.991 s t(norm)=0.104149, mflops=48.0083 (err=3.7e-016)
2. Singleton (f2c): elapsed time t=1.532 s, 1024 iters, t-(init.)=1.432 s t(norm)=0.0752477, mflops=66.4472 (err=4.0e-016)
3. Temperton (f2c): elapsed time t=1.082 s, 1024 iters, t-(init.)=0.982 s t(norm)=0.0516015, mflops=96.8965 (err=3.9e-016)
Top mflops for N=1728 = 165.195
Normalized results and averages for N=1728:
fft 0: mflops = 165.195 (norm. = 1), norm. avg. (of 7) = 1
fft 1: mflops = 48.0083 (norm. = 0.290616), norm. avg. (of 7) = 0.225744
fft 2: mflops = 66.4472 (norm. = 0.402235), norm. avg. (of 7) = 0.717562
fft 3: mflops = 96.8965 (norm. = 0.586558), norm. avg. (of 5) = 0.588347

Benchmarking for array size = 13x13x13:
0. FFTW: elapsed time t=1.783 s, 1024 iters, t-(init.)=1.663 s t(norm)=0.0665867, mflops=75.09 (err=4.5e-016)
1. PDA (f2c): elapsed time t=1.822 s, 256 iters, t-(init.)=1.782 s t(norm)=0.285406, mflops=17.5189 (err=8.8e-016)
2. Singleton (f2c): elapsed time t=1.983 s, 1024 iters, t-(init.)=1.853 s t(norm)=0.0741943, mflops=67.3906 (err=8.5e-016)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=2197 = 75.09
Normalized results and averages for N=2197:
fft 0: mflops = 75.09 (norm. = 1), norm. avg. (of 8) = 1
fft 1: mflops = 17.5189 (norm. = 0.233305), norm. avg. (of 8) = 0.226689
fft 2: mflops = 67.3906 (norm. = 0.897464), norm. avg. (of 8) = 0.74005
fft 3: mflops = -1 (norm. = -0.0133173), norm. avg. (of 5) = 0.588347

Benchmarking for array size = 14x14x14:
0. FFTW: elapsed time t=1.562 s, 1024 iters, t-(init.)=1.402 s t(norm)=0.0436837, mflops=114.459 (err=4.4e-016)
1. PDA (f2c): elapsed time t=1.572 s, 256 iters, t-(init.)=1.532 s t(norm)=0.190937, mflops=26.1866 (err=4.7e-016)
2. Singleton (f2c): elapsed time t=1.382 s, 512 iters, t-(init.)=1.302 s t(norm)=0.0811358, mflops=61.6251 (err=5.7e-016)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=2744 = 114.459
Normalized results and averages for N=2744:
fft 0: mflops = 114.459 (norm. = 1), norm. avg. (of 9) = 1
fft 1: mflops = 26.1866 (norm. = 0.228786), norm. avg. (of 9) = 0.226922
fft 2: mflops = 61.6251 (norm. = 0.538402), norm. avg. (of 9) = 0.717645
fft 3: mflops = -1 (norm. = -0.00873674), norm. avg. (of 5) = 0.588347

Benchmarking for array size = 15x15x15:
0. FFTW: elapsed time t=1.682 s, 1024 iters, t-(init.)=1.491 s t(norm)=0.0368088, mflops=135.837 (err=5.5e-016)
1. PDA (f2c): elapsed time t=1.042 s, 256 iters, t-(init.)=0.992 s t(norm)=0.0979592, mflops=51.0416 (err=5.4e-016)
2. Singleton (f2c): elapsed time t=1.552 s, 512 iters, t-(init.)=1.452 s t(norm)=0.0716919, mflops=69.7428 (err=6.6e-016)
3. Temperton (f2c): elapsed time t=1.222 s, 512 iters, t-(init.)=1.122 s t(norm)=0.0553983, mflops=90.2554 (err=5.2e-016)
Top mflops for N=3375 = 135.837
Normalized results and averages for N=3375:
fft 0: mflops = 135.837 (norm. = 1), norm. avg. (of 10) = 1
fft 1: mflops = 51.0416 (norm. = 0.375756), norm. avg. (of 10) = 0.241805
fft 2: mflops = 69.7428 (norm. = 0.51343), norm. avg. (of 10) = 0.697223
fft 3: mflops = 90.2554 (norm. = 0.664439), norm. avg. (of 6) = 0.601029

Benchmarking for array size = 24x25x28:
0. FFTW: elapsed time t=1.663 s, 128 iters, t-(init.)=1.533 s t(norm)=0.0507895, mflops=98.4455 (err=4.6e-016)
1. PDA (f2c): elapsed time t=1.733 s, 64 iters, t-(init.)=1.673 s t(norm)=0.110856, mflops=45.1037 (err=4.6e-016)
2. Singleton (f2c): elapsed time t=1.382 s, 64 iters, t-(init.)=1.322 s t(norm)=0.0875979, mflops=57.079 (err=5.5e-016)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=16800 = 98.4455
Normalized results and averages for N=16800:
fft 0: mflops = 98.4455 (norm. = 1), norm. avg. (of 11) = 1
fft 1: mflops = 45.1037 (norm. = 0.458159), norm. avg. (of 11) = 0.261474
fft 2: mflops = 57.079 (norm. = 0.579803), norm. avg. (of 11) = 0.686549
fft 3: mflops = -1 (norm. = -0.0101579), norm. avg. (of 6) = 0.601029

Benchmarking for array size = 48x48x48:
0. FFTW: elapsed time t=1.132 s, 8 iters, t-(init.)=0.952 s t(norm)=0.0642217, mflops=77.8553 (err=7.2e-016)
1. PDA (f2c): elapsed time t=1.943 s, 8 iters, t-(init.)=1.773 s t(norm)=0.119606, mflops=41.8039 (err=7.1e-016)
2. Singleton (f2c): elapsed time t=1.762 s, 4 iters, t-(init.)=1.681 s t(norm)=0.2268, mflops=22.0459 (err=6.7e-016)
3. Temperton (f2c): elapsed time t=1.772 s, 8 iters, t-(init.)=1.602 s t(norm)=0.108071, mflops=46.2661 (err=8.0e-016)
Top mflops for N=110592 = 77.8553
Normalized results and averages for N=110592:
fft 0: mflops = 77.8553 (norm. = 1), norm. avg. (of 12) = 1
fft 1: mflops = 41.8039 (norm. = 0.536943), norm. avg. (of 12) = 0.28443
fft 2: mflops = 22.0459 (norm. = 0.283165), norm. avg. (of 12) = 0.652933
fft 3: mflops = 46.2661 (norm. = 0.594257), norm. avg. (of 7) = 0.600062

Benchmarking for array size = 49x49x49:
0. FFTW: elapsed time t=1.432 s, 8 iters, t-(init.)=1.251 s t(norm)=0.0789097, mflops=63.3635 (err=6.2e-016)
1. PDA (f2c): elapsed time t=1.803 s, 4 iters, t-(init.)=1.713 s t(norm)=0.216103, mflops=23.1371 (err=7.0e-016)
2. Singleton (f2c): elapsed time t=1.412 s, 4 iters, t-(init.)=1.312 s t(norm)=0.165515, mflops=30.2088 (err=9.3e-016)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=117649 = 63.3635
Normalized results and averages for N=117649:
fft 0: mflops = 63.3635 (norm. = 1), norm. avg. (of 13) = 1
fft 1: mflops = 23.1371 (norm. = 0.365149), norm. avg. (of 13) = 0.290639
fft 2: mflops = 30.2088 (norm. = 0.476753), norm. avg. (of 13) = 0.639381
fft 3: mflops = -1 (norm. = -0.0157819), norm. avg. (of 7) = 0.600062

Benchmarking for array size = 60x60x60:
0. FFTW: elapsed time t=1.201 s, 4 iters, t-(init.)=1.03 s t(norm)=0.0672734, mflops=74.3236 (err=7.5e-016)
1. PDA (f2c): elapsed time t=1.012 s, 2 iters, t-(init.)=0.932 s t(norm)=0.121745, mflops=41.0694 (err=7.4e-016)
2. Singleton (f2c): elapsed time t=1.151 s, 1 iters, t-(init.)=1.111 s t(norm)=0.290255, mflops=17.2262 (err=1.0e-015)
3. Temperton (f2c): elapsed time t=1.843 s, 4 iters, t-(init.)=1.673 s t(norm)=0.10927, mflops=45.7581 (err=7.4e-016)
Top mflops for N=216000 = 74.3236
Normalized results and averages for N=216000:
fft 0: mflops = 74.3236 (norm. = 1), norm. avg. (of 14) = 1
fft 1: mflops = 41.0694 (norm. = 0.552575), norm. avg. (of 14) = 0.309348
fft 2: mflops = 17.2262 (norm. = 0.231773), norm. avg. (of 14) = 0.610266
fft 3: mflops = 45.7581 (norm. = 0.61566), norm. avg. (of 8) = 0.602012

Benchmarking for array size = 72x60x56:
0. FFTW: elapsed time t=1.402 s, 4 iters, t-(init.)=1.212 s t(norm)=0.0700329, mflops=71.395 (err=7.3e-016)
1. PDA (f2c): elapsed time t=1.292 s, 2 iters, t-(init.)=1.192 s t(norm)=0.137754, mflops=36.2965 (err=7.8e-016)
2. Singleton (f2c): elapsed time t=1.372 s, 1 iters, t-(init.)=1.332 s t(norm)=0.307867, mflops=16.2408 (err=9.5e-016)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=241920 = 71.395
Normalized results and averages for N=241920:
fft 0: mflops = 71.395 (norm. = 1), norm. avg. (of 15) = 1
fft 1: mflops = 36.2965 (norm. = 0.508389), norm. avg. (of 15) = 0.322618
fft 2: mflops = 16.2408 (norm. = 0.227477), norm. avg. (of 15) = 0.584747
fft 3: mflops = -1 (norm. = -0.0140066), norm. avg. (of 8) = 0.602012

Benchmarking for array size = 75x75x75:
0. FFTW: elapsed time t=1.181 s, 2 iters, t-(init.)=1.011 s t(norm)=0.0641225, mflops=77.9758 (err=7.1e-016)
1. PDA (f2c): elapsed time t=1.923 s, 2 iters, t-(init.)=1.753 s t(norm)=0.111184, mflops=44.9706 (err=7.4e-016)
2. Singleton (f2c): elapsed time t=1.853 s, 1 iters, t-(init.)=1.773 s t(norm)=0.224904, mflops=22.2317 (err=9.1e-016)
3. Temperton (f2c): elapsed time t=1.672 s, 2 iters, t-(init.)=1.502 s t(norm)=0.0952641, mflops=52.4857 (err=9.0e-016)
Top mflops for N=421875 = 77.9758
Normalized results and averages for N=421875:
fft 0: mflops = 77.9758 (norm. = 1), norm. avg. (of 16) = 1
fft 1: mflops = 44.9706 (norm. = 0.576726), norm. avg. (of 16) = 0.3385
fft 2: mflops = 22.2317 (norm. = 0.28511), norm. avg. (of 16) = 0.56602
fft 3: mflops = 52.4857 (norm. = 0.673103), norm. avg. (of 9) = 0.609911

Benchmarking for array size = 80x80x80:
0. FFTW: elapsed time t=1.592 s, 2 iters, t-(init.)=1.391 s t(norm)=0.0716236, mflops=69.8094 (err=7.4e-016)
1. PDA (f2c): elapsed time t=1.292 s, 1 iters, t-(init.)=1.192 s t(norm)=0.122754, mflops=40.7319 (err=7.1e-016)
2. Singleton (f2c): elapsed time t=2.454 s, 1 iters, t-(init.)=2.354 s t(norm)=0.242418, mflops=20.6255 (err=9.7e-016)
3. Temperton (f2c): elapsed time t=1.232 s, 1 iters, t-(init.)=1.132 s t(norm)=0.116575, mflops=42.8908 (err=7.7e-016)
Top mflops for N=512000 = 69.8094
Normalized results and averages for N=512000:
fft 0: mflops = 69.8094 (norm. = 1), norm. avg. (of 17) = 1
fft 1: mflops = 40.7319 (norm. = 0.583473), norm. avg. (of 17) = 0.35291
fft 2: mflops = 20.6255 (norm. = 0.295455), norm. avg. (of 17) = 0.550104
fft 3: mflops = 42.8908 (norm. = 0.614399), norm. avg. (of 10) = 0.610359

Benchmarking for array size = 84x84x84:
0. FFTW: elapsed time t=1.843 s, 2 iters, t-(init.)=1.603 s t(norm)=0.0705157, mflops=70.9062 (err=7.4e-016)
1. PDA (f2c): elapsed time t=1.942 s, 1 iters, t-(init.)=1.821 s t(norm)=0.160211, mflops=31.2088 (err=7.2e-016)
2. Singleton (f2c): elapsed time t=3.585 s, 1 iters, t-(init.)=3.464 s t(norm)=0.304762, mflops=16.4063 (err=9.6e-016)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=592704 = 70.9062
Normalized results and averages for N=592704:
fft 0: mflops = 70.9062 (norm. = 1), norm. avg. (of 18) = 1
fft 1: mflops = 31.2088 (norm. = 0.440143), norm. avg. (of 18) = 0.357756
fft 2: mflops = 16.4063 (norm. = 0.23138), norm. avg. (of 18) = 0.532397
fft 3: mflops = -1 (norm. = -0.0141031), norm. avg. (of 10) = 0.610359

Benchmarking for array size = 96x96x96:
0. FFTW: elapsed time t=1.412 s, 1 iters, t-(init.)=1.232 s t(norm)=0.0704892, mflops=70.9329 (err=8.0e-016)
1. PDA (f2c): elapsed time t=2.894 s, 1 iters, t-(init.)=2.714 s t(norm)=0.155282, mflops=32.1994 (err=6.4e-016)
2. Singleton (f2c): elapsed time t=5.638 s, 1 iters, t-(init.)=5.457 s t(norm)=0.312224, mflops=16.0142 (err=7.0e-016)
3. Temperton (f2c): elapsed time t=2.774 s, 1 iters, t-(init.)=2.594 s t(norm)=0.148416, mflops=33.689 (err=7.5e-016)
Top mflops for N=884736 = 70.9329
Normalized results and averages for N=884736:
fft 0: mflops = 70.9329 (norm. = 1), norm. avg. (of 19) = 1
fft 1: mflops = 32.1994 (norm. = 0.453943), norm. avg. (of 19) = 0.362819
fft 2: mflops = 16.0142 (norm. = 0.225765), norm. avg. (of 19) = 0.516259
fft 3: mflops = 33.689 (norm. = 0.474942), norm. avg. (of 11) = 0.598049

Benchmarking for array size = 105x105x105:
0. FFTW: elapsed time t=1.793 s, 1 iters, t-(init.)=1.553 s t(norm)=0.0666017, mflops=75.0732 (err=7.4e-016)
1. PDA (f2c): elapsed time t=3.875 s, 1 iters, t-(init.)=3.644 s t(norm)=0.156276, mflops=31.9947 (err=7.4e-016)
2. Singleton (f2c): elapsed time t=5.518 s, 1 iters, t-(init.)=5.288 s t(norm)=0.22678, mflops=22.0478 (err=8.2e-016)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=1157625 = 75.0732
Normalized results and averages for N=1157625:
fft 0: mflops = 75.0732 (norm. = 1), norm. avg. (of 20) = 1
fft 1: mflops = 31.9947 (norm. = 0.42618), norm. avg. (of 20) = 0.365987
fft 2: mflops = 22.0478 (norm. = 0.293684), norm. avg. (of 20) = 0.50513
fft 3: mflops = -1 (norm. = -0.0133203), norm. avg. (of 11) = 0.598049

Benchmarking for array size = 112x112x112:
0. FFTW: elapsed time t=2.363 s, 1 iters, t-(init.)=2.072 s t(norm)=0.0722164, mflops=69.2363 (err=5.5e-016)
1. PDA (f2c): elapsed time t=4.927 s, 1 iters, t-(init.)=4.647 s t(norm)=0.161964, mflops=30.871 (err=5.9e-016)
2. Singleton (f2c): elapsed time t=6.98 s, 1 iters, t-(init.)=6.69 s t(norm)=0.23317, mflops=21.4436 (err=7.0e-016)
3. Skipping fft (Temperton only handles N = 2^m 3^n 5^q).
Top mflops for N=1404928 = 69.2363
Normalized results and averages for N=1404928:
fft 0: mflops = 69.2363 (norm. = 1), norm. avg. (of 21) = 1
fft 1: mflops = 30.871 (norm. = 0.445879), norm. avg. (of 21) = 0.369791
fft 2: mflops = 21.4436 (norm. = 0.309716), norm. avg. (of 21) = 0.495824
fft 3: mflops = -1 (norm. = -0.0144433), norm. avg. (of 11) = 0.598049

Benchmarking for array size = 120x120x120:
0. FFTW: elapsed time t=2.724 s, 1 iters, t-(init.)=2.374 s t(norm)=0.066303, mflops=75.4114 (err=7.2e-016)
1. PDA (f2c): elapsed time t=4.767 s, 1 iters, t-(init.)=4.417 s t(norm)=0.123362, mflops=40.5313 (err=7.9e-016)
2. Singleton (f2c): elapsed time t=13.329 s, 1 iters, t-(init.)=12.988 s t(norm)=0.362739, mflops=13.784 (err=9.4e-016)
3. Temperton (f2c): elapsed time t=4.747 s, 1 iters, t-(init.)=4.407 s t(norm)=0.123082, mflops=40.6232 (err=7.0e-016)
Top mflops for N=1728000 = 75.4114
Normalized results and averages for N=1728000:
fft 0: mflops = 75.4114 (norm. = 1), norm. avg. (of 22) = 1
fft 1: mflops = 40.5313 (norm. = 0.537469), norm. avg. (of 22) = 0.377413
fft 2: mflops = 13.784 (norm. = 0.182784), norm. avg. (of 22) = 0.481595
fft 3: mflops = 40.6232 (norm. = 0.538688), norm. avg. (of 12) = 0.593102

Benchmarking for array size = 144x144x144:
0. FFTW: elapsed time t=5.088 s, 1 iters, t-(init.)=4.487 s t(norm)=0.0698607, mflops=71.571 (err=1.2e-015)
1. PDA (f2c): elapsed time t=9.214 s, 1 iters, t-(init.)=8.613 s t(norm)=0.134101, mflops=37.2854 (err=1.2e-015)
2. Singleton (f2c): elapsed time t=20.359 s, 1 iters, t-(init.)=19.758 s t(norm)=0.307624, mflops=16.2536 (err=1.6e-015)
3. Temperton (f2c): elapsed time t=8.803 s, 1 iters, t-(init.)=8.202 s t(norm)=0.127702, mflops=39.1538 (err=1.2e-015)
Top mflops for N=2985984 = 71.571
Normalized results and averages for N=2985984:
fft 0: mflops = 71.571 (norm. = 1), norm. avg. (of 23) = 1
fft 1: mflops = 37.2854 (norm. = 0.520957), norm. avg. (of 23) = 0.383654
fft 2: mflops = 16.2536 (norm. = 0.227098), norm. avg. (of 23) = 0.47053
fft 3: mflops = 39.1538 (norm. = 0.547062), norm. avg. (of 13) = 0.589561

Benchmarking for array size = 180x180x180:
0. FFTW: elapsed time t=10.115 s, 1 iters, t-(init.)=8.943 s t(norm)=0.0682268, mflops=73.285 (err=9.3e-016)
1. PDA (f2c): elapsed time t=17.335 s, 1 iters, t-(init.)=16.164 s t(norm)=0.123316, mflops=40.5461 (err=9.3e-016)
2. Singleton (f2c): elapsed time t=45.596 s, 1 iters, t-(init.)=44.425 s t(norm)=0.338922, mflops=14.7527 (err=1.2e-015)
3. Temperton (f2c): elapsed time t=15.513 s, 1 iters, t-(init.)=14.342 s t(norm)=0.109416, mflops=45.6971 (err=9.5e-016)
Top mflops for N=5832000 = 73.285
Normalized results and averages for N=5832000:
fft 0: mflops = 73.285 (norm. = 1), norm. avg. (of 24) = 1
fft 1: mflops = 40.5461 (norm. = 0.553267), norm. avg. (of 24) = 0.390721
fft 2: mflops = 14.7527 (norm. = 0.201306), norm. avg. (of 24) = 0.459313
fft 3: mflops = 45.6971 (norm. = 0.623553), norm. avg. (of 14) = 0.591989

Benchmarking for array size = 240x240x240:
0. FFTW: elapsed time t=27.669 s, 1 iters, t-(init.)=24.895 s t(norm)=0.0759192, mflops=65.8595 (err=1.5e-015)
1. PDA (f2c): elapsed time t=50.332 s, 1 iters, t-(init.)=47.558 s t(norm)=0.145032, mflops=34.4752 (err=1.5e-015)
2. Singleton (f2c): elapsed time t=104.901 s, 1 iters, t-(init.)=102.127 s t(norm)=0.311444, mflops=16.0543 (err=2.1e-015)
3. Temperton (f2c): elapsed time t=46.817 s, 1 iters, t-(init.)=44.043 s t(norm)=0.134312, mflops=37.2266 (err=1.5e-015)
Top mflops for N=13824000 = 65.8595
Normalized results and averages for N=13824000:
fft 0: mflops = 65.8595 (norm. = 1), norm. avg. (of 25) = 1
fft 1: mflops = 34.4752 (norm. = 0.523466), norm. avg. (of 25) = 0.396031
fft 2: mflops = 16.0543 (norm. = 0.243765), norm. avg. (of 25) = 0.450691
fft 3: mflops = 37.2266 (norm. = 0.565243), norm. avg. (of 15) = 0.590206
------------------------------------------------------
@@@@ bench.1d.p2.dat
N, Arndt DIF, Arndt DIT, Arndt Split-Radix, Arndt 4-step, Beauregard, Bergland, CWP (min N), CWP (best N), Edelblute, FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, Green, GSL, GSL DIT, GSL DIF, Krukar, Mayer (Buneman), Mayer (simple), Mayer (lookup), Ooura (C), Ransom, Singleton (f2c), Temperton (f2c), Valkenburg
2, 45.9902, 45.0516, 34.9234, 0.735121, 7.32246, 8.58082, 5.18584, 4.4356, , 13.0827, 51.0877, 49.8136, 58.9088, , 13.8701, 10.4858, 9.69109, 29.8953, , , , 49.8136, , 7.75001, 3.63584, 8.58082
4, 83.8861, 88.2083, 45.0033, 5.13002, 8.79678, 30.7726, 17.1898, 5.84817, 39.126, 37.3824, 122.194, 124, 182.163, , 36.7277, 19.3822, 18.3639, 95.1089, 54.7917, 57.733, 55.4435, 75.505, 3.37814, 23.9538, 12.7642, 9.02389
8, 141.381, 129.454, 44.5571, 5.73201, 12.7668, 47.2332, 39.4944, 17.4569, 32.4302, 50.6967, 188.933, 178.228, 234.975, 101.312, 57.0913, 29.9024, 27.0717, 153.077, 77.1484, 81.5484, 79.538, 145.973, 3.04347, 24.7695, 25.7425, 9.29589
16, 56.6033, 58.9916, 44.081, 15.3976, 14.6552, 65.028, 61.5904, 30.5707, 30.1315, 82.0803, 204.351, 211.967, 257.913, 87.1997, 83.8023, 41.0804, 38.0608, 190.218, 64.4286, 74.7648, 77.5287, 145.636, 15.6271, 57.733, 45.0758, 9.51521
32, 69.812, 70.2799, 47.6193, 15.9649, 15.4931, 87.9678, 54.4998, 55.6865, 32.912, 73.2246, 222.628, 227.704, 243.855, 105.703, 81.7922, 54.4998, 51.3002, 192.399, 73.7395, 93.456, 96.9109, 161.195, 13.2262, 72.216, 52.3764, 9.84024
64, 61.5602, 64.0678, 50.2512, 26.3903, 15.4051, 91.0486, 60.9637, 53.2272, 35.3056, 92.3856, 194.782, 199.412, 173.318, 132.312, 99.7061, 64.1331, 61.5602, 67.1805, 71.7793, 94.4663, 99.7061, 151.328, 35.3254, 93.0689, 71.7793, 9.99914
128, 71.1243, 73.9923, 55.5222, 24.598, 15.2663, 98.3918, 63.1672, 69.1152, 38.5708, 87.7469, 209.416, 210.92, 185.589, 111.891, 109.471, 72.53, 68.4704, 41.1668, 78.798, 106.998, 111.891, 162.84, 31.8855, 93.3846, 71.8906, 10.2343
256, 71.5752, 74.2355, 61.1415, 28.8864, 15.2854, 110.159, 73.4554, 77.6004, 41.9011, 97.4853, 200.805, 200.564, 186.103, 118.819, 118.066, 77.5287, 74.1698, 64.4286, 82.9734, 111.625, 117.159, 159.783, 54.4008, 114.677, 86.3026, 10.3614
512, 79.171, 82.6373, 61.2009, 29.6394, 15.1043, 112.953, 78.5123, 79.8408, 43.6099, 73.0432, 154.455, 154.455, 138.578, 129.899, 101.858, 77.8646, 77.8646, 54.4558, 89.7071, 120.757, 128.223, 172.842, 40.6425, 117.818, 75.3769, 10.4301
1024, 78.7219, 81.1591, 63.8597, 32.7067, 14.7853, 111.373, 79.3174, 78.7219, 45.9096, 60.5413, 134.261, 133.407, 85.1117, 102.6, 74.2618, 70.7064, 72.216, 45.9499, 90.2389, 121.715, 125.428, 144.333, 65.0885, 116.315, 72.216, 9.76692
2048, 43.9571, 43.2971, 30.6276, 27.178, 13.7183, 79.4376, 62.5846, 67.4128, 25.9315, 61.2226, 138.717, 105.626, 73.7961, 81.1135, 82.8616, 37.1357, 36.9217, 42.3123, 85.3131, 111.984, 93.6989, 106.602, 47.1945, 61.5821, 58.1368, 9.28392
4096, 38.5742, 38.3392, 28.807, 28.5716, 13.4203, 81.0233, 58.1465, 69.7888, 25.1256, 62.1685, 132.173, 105.561, 75.2117, 86.0076, 81.5484, 36.9434, 36.5357, 42.1679, 45.8561, 50.6558, 47.2332, 109.227, 64.0678, 74.8092, 61.6809, 9.24127
8192, 40.2584, 39.1035, 28.3517, 26.5825, 13.2912, 73.9636, 63.5797, 62.9921, 25.2062, 53.5829, 132.088, 106.33, 77.3198, 80.5168, 74.3671, 36.822, 36.3895, , 44.2007, 48.2701, 45.3778, 103.899, 57.6628, 69.4068, 56.7034, 9.15111
16384, 37.411, 36.3009, 28.1875, 31.8578, 13.3747, 74.8219, 66.6065, 66.0669, 25.2756, 37.3729, 88.8624, 74.8219, 53.4988, 75.5147, 54.2902, 36.6635, 36.6635, , 43.8997, 47.294, 44.7018, 108.5, 71.8203, 74.0669, 58.1619, 8.18469
32768, 36.3416, 34.1333, 26.355, 28.6601, 12.7585, 72.7504, 63.8338, 63.3198, 23.8024, 25.5004, 69.5342, 59.9415, 34.7671, 68.8644, 29.9707, 33.8396, 34.1333, , 42.6945, 46.452, 41.1529, 86.3736, 59.488, 57.741, 43.8612, 7.11317
65536, 12.8502, 12.6869, 10.0631, 23.7907, 10.5703, 30.3495, 46.5, 45.9902, 9.78149, 23.5239, 62.5083, 48.7426, 29.9379, 33.2354, 29.0867, 12.7642, 12.6107, , 37.3824, 39.9458, 34.8944, 43.1513, 38.0608, 26.1817, 24.6434, 6.50078
131072, 11.6478, 11.3569, 8.62316, 18.3847, 10.2968, 28.7142, 44.036, 44.0796, 8.55693, 20.056, 58.9867, 48.8646, 28.7142, 30.6918, 25.0081, 11.231, 11.3453, , 14.7272, 15.0352, 14.1654, 36.7694, 30.6707, 22.0398, 21.201, 6.25203
262144, 11.2723, 11.0557, 8.8231, 22.642, 10.2936, 30.0165, 32.2749, 32.2749, 8.75759, 22.4268, 57.7905, 46.6264, 26.1998, 31.2076, 26.0264, 11.3811, 11.4362, , 13.0204, 13.1657, 12.664, 37.3898, 39.9881, 23.1077, 22.2156, 5.91896
524288, 12.0134, 11.8986, 9.01001, 17.5131, 10.0662, 26.8793, 32.2796, 32.0717, 8.83579, 23.2419, 56.1843, 46.0753, 25.1172, 30.5005, 27.3216, 11.5428, 11.4843, , 12.8171, 12.9235, 12.4643, 36.0401, 32.2796, 19.812, 20.9099, 5.85074
1048576, 11.4075, 11.2714, 9.18434, 24.3459, 10.2751, 28.6967, , , 8.97907, 24.0665, 56.8951, 47.1906, 25.1699, , 29.823, 11.7382, 11.7501, , , 13.1203, 12.6456, 38.7644, 41.5442, 23.6325, 22.7112, 5.74027
2097152, 12.23, 12.1209, 9.19266, 19.0222, 10.2706, , , , 8.96766, 21.4328, 54.5727, 45.1602, 25.0428, , 26.1459, 11.6287, 11.4467, , , 13.3108, 12.8667, 36.7124, 29.878, 22.4351, 21.2447, 5.46554
Norm. Avg., 0.346193, 0.345039, 0.246683, 0.209149, 0.114507, 0.488865, 0.428231, 0.419063, 0.187623, 0.36949, 0.9269, 0.833232, 0.686509, 0.583767, 0.452623, 0.264688, 0.259802, 0.443165, 0.387162, 0.438596, 0.426395, 0.751014, 0.385842, 0.434025, 0.350265, 0.0748471
------------------------------------------------------
@@@@ bench.1d.np2.dat
N, CWP (min N), CWP (best N), FFTPACK (f2c), FFTW, FFTW_ESTIMATE, Frigo-old, GSL, Singleton (f2c), Temperton (f2c), Valkenburg
6, 21.478, 11.1636, 38.6482, 151.709, 150.446, 24.6263, 40.9858, 17.0688, 14.7205, 9.22367
9, 37.3193, 20.6482, 51.864, 138.368, 135.731, 20.6255, 41.0246, 29.6307, 29.1685, 9.53927
12, 41.0981, 30.7787, 66.2203, 181.453, 189.376, 34.7636, 62.5475, 32.3689, 31.9833, 9.38525
15, 42.3677, 42.3911, 63.904, 158.051, 159.694, 23.3899, 49.8136, 35.8602, 36.5079, 8.79265
18, 47.6652, 36.3835, 52.8078, 129.194, 119.904, 24.8186, 70.1218, 42.3326, 34.6168, 9.90146
24, 60.4993, 52.5621, 63.1482, 161.875, 163.712, 43.3645, 92.278, 42.8747, 49.9759, 9.78231
36, 68.4862, 68.7952, 72.9509, 159.339, 157.082, 29.8663, 96.6512, 64.7763, 62.1681, 10.011
80, 78.7669, 69.6326, 87.5465, 162.158, 160.743, 51.0553, 86.1359, 90.4615, 64.1353, 9.48521
108, 62.8092, 84.5901, 79.0515, 150.252, 149.128, 28.7046, 100.273, 73.194, 74.1016, 10.3611
210, 69.3726, 69.3726, 49.4447, 128.595, 126.15, 25.699, 90.7727, 65.568, , 8.196
504, 91.564, 90.6681, 48.4385, 122.57, 117.892, 29.1026, 93.5043, 77.7372, , 8.89619
1000, 72.7886, 89.4388, 59.2279, 103.92, 108.448, 28.9585, 67.4485, 89.3604, 81.5093, 8.54973
1960, 72.5869, 73.0215, 31.8489, 105.328, 100.505, 25.595, 70.2633, 60.5358, , 7.56697
4725, 63.5305, 74.4178, 43.3485, 103.104, 101.684, 20.7018, 77.5446, 57.1381, , 8.30586
10368, 66.4541, 72.436, 56.3084, 115.557, 111.976, 31.1241, 85.7721, 52.9091, 58.1201, 9.20516
27000, 72.5537, 72.1831, 29.1178, 70.23, 69.3871, 19.0171, 41.2408, 53.3501, 61.6214, 7.63608
75600, 46.1462, 46.1462, 18.259, 67.4566, 69.4154, 17.6032, 30.7641, 23.516, , 6.68402
165375, 40.0676, 40.0396, 12.5518, 60.2277, 60.3545, 12.6627, 31.4519, 22.3622, , 6.47727
362880, 22.163, 22.3105, 16.9845, 72.0654, 60.8174, 16.2435, 33.4435, 17.6092, , 5.92266
Norm. Avg., 0.49684, 0.49769, 0.386404, 0.991187, 0.974888, 0.218952, 0.545755, 0.421424, 0.38844, 0.0736467
------------------------------------------------------
@@@@ bench.3d.p2.dat
Array Dimensions, FFTW, HARM (f2c), PDA (f2c), Singleton (f2c), Temperton (f2c)
4x4x4, 159.177, , 24.5377, 112.147, 66.4707
8x8x8, 212.909, 117.087, 39.5855, 105.917, 120.066
16x16x16, 114.183, 115.228, 47.5904, 68.7215, 92.3856
32x32x32, 129.989, 93.456, 47.6048, 55.6963, 75.5458
64x64x64, 61.1613, 41.6837, 29.8079, 20.8418, 39.2562
256x64x32, 60.7037, 39.467, 30.8787, 19.6556, 40.1025
16x1024x64, 62.304, 40.7531, 25.5938, 21.5934,
128x128x128, 61.7675, 37.2654, 32.3397, 14.8264, 28.7844
512x128x64, 61.5083, 37.5161, 32.2166, 17.339,
256x128x256, 62.9159, 39.6926, 33.5288, 18.2544, 32.7869
Norm. Avg., 0.999093, 0.677647, 0.410641, 0.40501, 0.581758
------------------------------------------------------
@@@@ bench.3d.np2.dat
Array Dimensions, FFTW, PDA (f2c), Singleton (f2c), Temperton (f2c)
5x5x5, 128.291, 28.0384, 127.147, 81.0104
6x6x6, 183.942, 29.023, 88.9887, 76.1068
7x7x7, 114.655, 18.0043, 94.5835,
9x9x9, 143.27, 39.5929, 110.922, 101.997
10x10x10, 152.086, 41.7552, 94.403, 90.9533
11x11x11, 87.694, 18.0054, 81.2,
12x12x12, 165.195, 48.0083, 66.4472, 96.8965
13x13x13, 75.09, 17.5189, 67.3906,
14x14x14, 114.459, 26.1866, 61.6251,
15x15x15, 135.837, 51.0416, 69.7428, 90.2554
24x25x28, 98.4455, 45.1037, 57.079,
48x48x48, 77.8553, 41.8039, 22.0459, 46.2661
49x49x49, 63.3635, 23.1371, 30.2088,
60x60x60, 74.3236, 41.0694, 17.2262, 45.7581
72x60x56, 71.395, 36.2965, 16.2408,
75x75x75, 77.9758, 44.9706, 22.2317, 52.4857
80x80x80, 69.8094, 40.7319, 20.6255, 42.8908
84x84x84, 70.9062, 31.2088, 16.4063,
96x96x96, 70.9329, 32.1994, 16.0142, 33.689
105x105x105, 75.0732, 31.9947, 22.0478,
112x112x112, 69.2363, 30.871, 21.4436,
120x120x120, 75.4114, 40.5313, 13.784, 40.6232
144x144x144, 71.571, 37.2854, 16.2536, 39.1538
180x180x180, 73.285, 40.5461, 14.7527, 45.6971
240x240x240, 65.8595, 34.4752, 16.0543, 37.2266
Norm. Avg., 1, 0.396031, 0.450691, 0.590206
@@@@ end