FFT Software in the Benchmark
Here, we provide a listing of the FFT software that is included in the
benchmark (or referenced by it). Each program is listed in
alphabetical order under the abbreviation we give it in the benchmark
results (except for commercial software, which is listed at the end).
(Usually, the abbreviation is the author's last name, unless the code
is well-known under some other name.) Also, where appropriate, there
are links to sites from which you can download the software or find
more information on it.
Except where otherwise noted, all FFTs are free for non-commercial
use, perform one-dimensional, complex-complex transforms only, and are
not limited to sizes that are powers of two. All routines
return their results in normal order (as opposed to bit-reversed or
otherwise permuted orderings). For more detailed information regarding
their availability, consult the license
and copyright section of this manual.
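To make "normal order" concrete: a radix-2 FFT that skips its reordering pass leaves output element k at the index obtained by reversing the bits of k. The following is a minimal sketch in C of the permutation that restores normal order for a power-of-two length. The function names are hypothetical and are not taken from any of the benchmarked codes; real arrays are used for brevity, and complex data would be permuted in the same way.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the lowest `bits` bits of i (hypothetical helper). */
static size_t bit_reverse(size_t i, unsigned bits)
{
    size_t r = 0;
    for (unsigned b = 0; b < bits; ++b) {
        r = (r << 1) | (i & 1);
        i >>= 1;
    }
    return r;
}

/* Permute x[0..n-1] in place between bit-reversed and normal order.
   n must be a power of two. The permutation is its own inverse. */
static void bit_reverse_permute(double *x, size_t n)
{
    unsigned bits = 0;

    assert(n > 0 && (n & (n - 1)) == 0);
    while (((size_t)1 << bits) < n)
        ++bits;

    for (size_t i = 0; i < n; ++i) {
        size_t j = bit_reverse(i, bits);
        if (i < j) {            /* swap each pair only once */
            double tmp = x[i];
            x[i] = x[j];
            x[j] = tmp;
        }
    }
}
```

Because the permutation is an involution, calling bit_reverse_permute a second time restores the original ordering.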
Some of the codes are only in C, some are only in Fortran, some are
in both, and some have been converted to C via f2c. The languages
used are noted in the entry for each program.
Small, cosmetic modifications were made to the software in order to
make it work with the benchmark. This typically involved changing the
numeric types to match those of the benchmark, altering subroutine
names, and cleaning up f2c output.
FFT Software
- Arndt:
Many transforms from the "FXT" package by Jörg Arndt (1997). This
package includes several different FFT implementations (both
complex-complex and real-complex), FHT code, and number-theoretic
transforms. Not all of the code in FXT was written by Arndt himself;
much was adapted from other sources. We benchmarked only those
routines from this package that appeared to have been significantly
modified from their original versions; otherwise, we benchmarked just
the original versions. The FXT package is ostensibly written in C++.
Its use of C++, however, is purely cosmetic; we converted it back to
C so that we could compile everything in the benchmark with just a C
compiler. We included the following routines in the benchmark (all of
which work only for sizes that are powers of two):
- Arndt DIT is a radix-4 DIT FFT routine.
- Arndt DIF is a radix-4 DIF FFT routine.
- Arndt Split-Radix is a Duhamel-type split-radix transform,
based on the code by D. Edelblute (benchmarked separately). It was
compiled with the USE_SINCOS3 option turned off. (With the option
turned on, there are no essential differences from Edelblute, and the
performance is the same. In any case, the code is often faster with
the option turned off.)
- Arndt 4-step is not a new dance craze, but an implementation of
the so-called "four-step" FFT. (This is essentially a Cooley-Tukey
FFT where you use sqrt(n) as the radix and do transpose operations to
keep data accesses sequential.)
- Bailey: Fortran
implementation of the "4-step" FFT by David H. Bailey (1995).
(Actually taken from an arbitrary-precision arithmetic package, MPFUN,
by Bailey.) Only works for sizes that are powers of 2. (No f2c
version.) Modified slightly to work outside of MPFUN and to be
callable from C. [D. H. Bailey, Intl. J. of Supercomp. Appl.,
p. 82-87 (Spring 1988).] (For a description of MPFUN, see this
report from NAS.)
- Beauregard:
A C FFT by Gerry Beauregard (1991). Can only handle sizes that
are powers of two, and does not include an inverse transform.
- Bergland:
A radix-8 C FFT, translated by Dr. Richard L. Lachance
from a Fortran program by G. D. Bergland and M. T. Dolan. Works only
for powers of two, and does not include a true inverse transform. The
original source can be found in the book: Programs for Digital
Signal Processing, edited by the DSP Committee, IEEE Acoustics,
Speech, and Signal Processing Society (IEEE Press, 1979), Chapter 1.2,
"Fast Fourier Transform Algorithms." (Received in personal
communication with Dr. Lachance.)
- Brenner:
Fortran FFT by Norman Brenner, based on a program by Charles
Rader (June, 1967). Handles real-complex transforms and
multi-dimensional transforms in addition to 1D complex-complex
transforms. [IEEE Audio Transactions (June 1967), Special Issue on the FFT.]
- Burrus:
Fortran split-radix DIF FFT by C. S. Burrus (1984) (slightly modified by
Steve Kifowit, 1997). Only
works for powers of two. [Electronics Letters (Jan. 5, 1984).]
- CWP:
C FFT by David Hale (1989) in a numerical library from the Center for Wave Phenomena at the
Colorado School of
Mines. This code works only on a limited set of array sizes and
uses a Prime Factor Algorithm. Actually, this code does not work on
most of the transform sizes in the benchmark. Instead, we run it on a
size of its own choosing. In some sense, it is "cheating"--it gets to
pick an optimal (larger) transform size while the other codes are
stuck with whatever you give them. However, in some applications, the
specific size of the transform may not matter, so the fact that we are
comparing apples and oranges might not be an issue. You should nonetheless
keep this in mind when interpreting the results. We ran two
versions of this code:
- CWP (min N): uses the smallest allowed transform size greater than or equal to the transform size used by the rest of the codes.
- CWP (best N): uses the optimal allowed transform size (>= the size used by CWP (min N)).
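The "min N" selection amounts to a lower-bound search over a sorted table of allowed sizes. Here is a minimal sketch in C; the table contents and function name are hypothetical and chosen only for illustration, and they are not the actual size table or API of the CWP library. The "best N" variant additionally requires speed information for each candidate size and is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, illustrative table of allowed sizes in ascending order.
   This is NOT the actual table used by the CWP library. */
static const size_t allowed_sizes[] = { 16, 18, 20, 21, 24, 28, 30, 33, 35, 36 };
static const size_t n_allowed = sizeof allowed_sizes / sizeof allowed_sizes[0];

/* Return the smallest allowed size >= n, or 0 if n exceeds the table. */
static size_t min_allowed_size(size_t n)
{
    size_t lo = 0, hi = n_allowed;      /* search window is [lo, hi) */

    assert(n > 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (allowed_sizes[mid] < n)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < n_allowed ? allowed_sizes[lo] : 0;
}
```

With this illustrative table, a requested size of 17 would be rounded up to 18, while a requested size of 20 is already allowed and is returned unchanged.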
- Edelblute:
C FFT by Dave Edelblute
(1993), implementing a Duhamel-Holman split-radix FFT. Only works for
powers of two, and does not include an inverse transform.
- SCIPORT:
Fortran FFTs from the SCIPORT package (at Netlib), a portable implementation of
Cray's SCILIB library (see below). These routines were developed at
General Electric, probably by Scott
H. Lamson. Only works for powers of two, and includes real-complex
routines. This code is an implementation of the Stockham auto-sort FFT
algorithm.
- FFTPACK:
Fortran FFT from the popular FFTPACK package by
P. N. Swarztrauber. The code in the benchmark is from the bihar
package at Netlib, which includes
double and single precision versions of FFTPACK. A C version (via
f2c) is also included in the benchmark. FFTPACK contains
complex-complex, real-complex, sine, and cosine transforms (1D
only). [P. N. Swarztrauber, "Vectorizing the FFTs," Parallel
Computations, p. 51-83 (1982).]
- FFTW:
C FFT library by Matteo
Frigo and Steven
G. Johnson (1997). Includes real-complex, multi-dimensional, and
parallel transforms. Two FFTW variants are benchmarked: the one
labeled FFTW bases its strategy for the transform on actual run-time
timing measurements (corresponding to the FFTW_MEASURE option in the
library), and the one labeled FFTW_ESTIMATE uses a heuristic
strategy instead.
- Frigo-old:
C FFT written by Matteo
Frigo (one of the FFTW authors) (1996). This code is an early
precursor to FFTW. It is actually written in Cilk, and is designed for
efficient parallel execution. Optimized for powers of two, although
it will work on any size. It is included as an example program in the
Cilk distribution.
- Green:
C FFT (includes real-complex transform) by John Green (1996).
Only works for powers of two. Optimized for the PowerPC.
- GSL:
C FFT routines from the GNU
Scientific Library (GSL) version 0.3a. The FFT code was written by Brian Gough (1996). The benchmark
times three FFT implementations from GSL: GSL (mixed-radix),
GSL DIT (decimation-in-time radix-2), and GSL DIF
(decimation-in-frequency radix-2). (The latter two routines only work
for powers of two.)
- HARM:
Multi-dimensional Fortran FFT; an f2c version is also in the
benchmark. Only works for powers of 2. The author is
unknown, but we suspect that it derives from something by
J. W. Cooley himself, given the mentions of a "PK HARM"
multi-dimensional transform subroutine we found in his book.
[J. W. Cooley and P. A. W. Lewis and P. D. Welch, The Fast Fourier
Transform Algorithm and Its Applications (IBM Research, 1967).]
- Krukar:
C FFT by Richard H. Krukar (1990) (he
also has a home
page). Works only for powers of two <= 4096.
- Mayer:
C FFT by Ron Mayer (1993).
Only works for powers of two. This code actually performs the FFT
using the Fast Hartley Transform (FHT), and includes a real-complex
FFT. We benchmark three different versions
of this routine, corresponding to different methods of trigonometric
function generation (selected in Mayer's code by the TRIG_VERSION
macro). They are:
- Mayer (Buneman): Uses a stable trigonometric iteration
algorithm by O. Buneman (TRIG_VERSION = 1). [O. Buneman,
Proc. IEEE, vol. 75, no. 10, p. 1434-5 (Oct. 1987).]
- Mayer (simple): Uses a faster, naive trig. iteration which
is unstable (TRIG_VERSION = 5).
- Mayer (lookup): Uses a precomputed lookup table for trig.
functions (TRIG_VERSION = 7).
- MFFT:
Fortran FFT by A. Nobile and V. Roberto (1987). Includes 2, 3, and 4
dimensional transforms, with complex-complex and real-complex versions
(but no 1D transforms). Handles any power of 2, 3, and 5. Only works
when the size of the floating point type equals the size of the
integer type (due to an ugly hack in the code). This package was
optimized for the Cray X-MP. [A. Nobile and V. Roberto, "MFFT: A
package for two- and three-dimensional vectorized discrete Fourier
transforms," Computer Physics Communications 42 (1986)
p. 233.]
- Monro:
Fortran radix-4 FFT by D. Monro (Imperial College, London, 1971). Only
works for powers of two. [Appl. Statist., vol. 24, p. 153
(1975).]
- NAPACK:
Fortran FFT (converted via f2c) from the NAPACK package. The benchmark
only includes the f2c version.
[William W. Hager, Applied Numerical Linear Algebra,
Prentice-Hall, December 1987. ISBN: 0130412945.]
- Nielsen:
C FFT by Jens Jorgen Nielsen
(1996). (Mixed-radix Cooley-Tukey transform.)
- We are not permitted to distribute modified versions of this code
(see the license
information). If you wish to include this software in the
benchmark, you must make some small modifications to the (included)
original package, as described in the installation notes.
- NR (C) and NR (F):
C and Fortran FFTs from Numerical Recipes. Includes
multi-dimensional transforms, and only works for powers of two. This
is commercial software, and cannot be distributed with the
benchmark. Users who have the code may include it in the benchmark by
following the NR installation notes.
- Ooura (C) and Ooura (F):
C and Fortran FFTs by Takuya Ooura (1996). Only
works for sizes that are powers of two.
- QFT:
C implementation of the QFT ("Quick Discrete Fourier Transform")
algorithm, written by Gary
A. Sitton (1994). This code is limited to transform sizes that
are powers of two, and includes real-complex FFTs. [H. Guo, G. A. Sitton, C. S. Burrus, "The Quick
Discrete Fourier Transform," Proc. ICASSP, April 1994.]
- Ransom:
C FFT by Scott
M. Ransom. Uses the "6-step" FFT (a variant of the same method
used by Arndt 4-step; see also this
reference). Only works for sizes that are powers of two. Includes
real-complex transform and prototype mass-storage FFT. Received in
personal communication with the author.
- PDA:
Fortran FFTs (also in C via f2c) from the Public
Domain Algorithms (PDA) library. These routines perform both one
and multi-dimensional transforms. The one-dimensional transforms are
based on FFTPACK. (Because of this, we don't benchmark them
separately in the 1D case.)
- Singleton:
Fortran FFT by R. C. Singleton (Stanford Research Institute,
1968). This routine handles both one and multi-dimensional
transforms. It is included in the benchmark in both Fortran and C (via
f2c) forms. [R. C. Singleton, "An algorithm for computing the mixed
radix fast Fourier transform," IEEE Trans. on Audio and
Electroacoustics AU-17, no. 2, p. 93-103 (June, 1969).]
- Sorensen:
Fortran split-radix DIF FFT by H. V. Sorensen (1987). Includes
real-complex transforms, and only works for powers of two. (The
version which is benchmarked uses table-lookup for trig. functions and
bit reversal.) [Sorensen, Heideman, Burrus, "On computing the
split-radix FFT," IEEE Tran. ASSP, ASSP-34, No. 1,
p. 152-156 (Feb. 1986).] [Mitra & Kaiser, Digital Signal
Processing Handbook, Chap. 8, p. 491-610 (Wiley, 1993).]
- Sorensen DIT:
Fortran split-radix DIT FFT by H. V. Sorensen (1984) (slightly modified by
Steve Kifowit, 1997). Only
works for powers of two. [Electronics Letters (Jan. 5, 1984).]
- Temperton:
Fortran FFT by C. Temperton (f2c version is also benchmarked). Works
for any size that is a product of powers of 2, 3, and 5. Performs both one and
three-dimensional transforms. [C. Temperton, "A Generalized Prime
Factor FFT Algorithm For Any N = 2^P 3^Q 5^R," SIAM Journal on
Scientific and Statistical Computing 13, no. 3, p. 676-686
(1992).]
- Valkenburg:
C FFT by Peter Valkenburg (1987).
- ESSL: A highly optimized FFT for the RS/6000 from IBM's ESSL library. Includes
multi-dimensional transforms. (Not included with the benchmark, only
linked to from it. Commercial code.)
- SCILIB:
A Cray-provided
FFT from Cray's SCILIB
library. This routine is highly vectorized on the C90.
Includes multi-dimensional transforms. (Not included with the
benchmark, only linked to from it. Commercial code.)
- SCSL:
FFTs from the SGI Cray Scientific Library (SCSL) (on the Origin2000). Includes
multi-dimensional transforms. (This library is actually a replacement
for SCILIB.) (Not included with the benchmark, only linked to from
it. Commercial code.)
- SGIMATH:
FFTs from SGI's complib.sgimath library (the CHALLENGEcomplib™
library). (Not included with the benchmark, only linked to from
it. Commercial code.)
- SUNPERF:
One-dimensional FFT from the Sun
Performance Library, optimized for the UltraSPARC. This code was
developed by Dakota Scientific
Software. (Not included with the benchmark, only linked to from
it. Commercial code.)
- DXML:
One-dimensional FFT from the Digital Extended
Math Library (DXML), optimized for the Alpha. DXML also includes
3D transforms, but we were unable to get them to work on our machine.
We left the code that we tried in the benchmark, although it is
commented out--let us
know if you can get it running. (Not included with the benchmark,
only linked to from it. Commercial code.)
- IMSL:
FFT routines from the commercial IMSL library by Visual Numerics, Inc. (Some documentation
is on-line.) Includes multi-dimensional and real-complex
routines. (Not included with the benchmark, only linked to from
it. Commercial code.)
- NAG:
FFT routines from the commercial NAG library by The Numerical Algorithms Group, Inc.
Includes multi-dimensional and real-complex routines, but does not
contain inverse transforms. (Not included with the benchmark, only
linked to from it. Commercial code.)