This directory contains demonstration code that supplements the paper
_Construction of a High-Performance FFT_.

Each FFT implementation is contained in one module named demo<n>.cpp, where <n>
is a sequence number.  To build a test program, compile any one of the
demo<n>.cpp modules and link it with test.c.  (The header files common.h and
glue.h are also needed.)  The module test.c calls and tests the FFT
implementation and provides a simple user interface.  Run the resulting program
with the command-line argument "-h" for usage information.

Each successively numbered module uses more features from the paper.  The
primary feature of each module corresponds to a source code display in the
paper:

	demo0.cpp	FFT Directly From Mathematics.
	demo1.cpp	First FFT Kernel.
	demo2.cpp	Expanded FFT Kernel.
	demo3.cpp	FFT Kernel Using Specialized Butterfly Routines.
	demo4.cpp	FFT Kernel with Reordered Loops and Separated Loop for k0=0.
	demo5.cpp	FFT4_Final With Bit-Reversal Permutation.
	demo6.cpp	Multiple-Stage Kernel.
	demo7.cpp	Develops several subroutines for multiple-stage kernel.
	demo8.cpp	Cache-Block Clustering GenerateFinalIndices.
	demo9.cpp	Multiple-Stage Kernel with Scaling for Reverse Transform.
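
As a rough orientation for the starting point of the sequence, the sketch below
shows one way an FFT can be written almost directly from the mathematics.  It is
not the code from demo0.cpp or from the paper.  The names NaiveDFT and
RecursiveFFT, the use of std::complex<double>, and the sign of the exponent
(negative for the forward transform) are assumptions made for illustration.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Direct O(n^2) evaluation of the DFT definition (assumed sign convention):
//   X[k] = sum over j of x[j] * exp(-2*pi*i*j*k/n).
std::vector<Complex> NaiveDFT(const std::vector<Complex> &x)
{
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> X(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            // Reduce j*k modulo n so the angle stays in one revolution.
            const double angle = -2.0 * pi * double((j * k) % n) / double(n);
            sum += x[j] * std::polar(1.0, angle);
        }
        X[k] = sum;
    }
    return X;
}

// Recursive radix-2 decimation-in-time FFT.  The length must be a power
// of two.  Each level splits the input into even and odd elements,
// transforms each half, and combines them with butterflies.
std::vector<Complex> RecursiveFFT(const std::vector<Complex> &x)
{
    const std::size_t n = x.size();
    if (n <= 1)
        return x;

    const std::size_t half = n / 2;
    std::vector<Complex> even(half), odd(half);
    for (std::size_t i = 0; i < half; ++i) {
        even[i] = x[2 * i];
        odd[i] = x[2 * i + 1];
    }

    const std::vector<Complex> E = RecursiveFFT(even);
    const std::vector<Complex> O = RecursiveFFT(odd);

    const double pi = std::acos(-1.0);
    std::vector<Complex> X(n);
    for (std::size_t k = 0; k < half; ++k) {
        // Twiddle factor exp(-2*pi*i*k/n) applied to the odd half.
        const Complex t = std::polar(1.0, -2.0 * pi * double(k) / double(n)) * O[k];
        X[k] = E[k] + t;
        X[k + half] = E[k] - t;
    }
    return X;
}
```

For a power-of-two length, both functions should agree to within rounding
error.  The recursive form allocates temporary vectors at every level, so it is
a readable baseline rather than a fast implementation.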

The test program usually finishes with a few reported errors, especially for
the longer vectors.  A few small errors are usually just random noise exceeding
the test threshold, whereas real errors in the implementation generally cause
numerous large errors.  (There are more sophisticated ways to test the FFT.
This method is used for simplicity and clarity.)
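
The distinction between a few small errors and numerous large ones can be made
concrete with a sketch like the one below.  It is not the method used in test.c,
which is not described here.  It assumes a test that compares each output
element against a reference value using a fixed threshold, and the names
ErrorReport and CompareToReference are hypothetical.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Summary of a comparison: how many elements exceeded the threshold and
// how large the worst error was.
struct ErrorReport {
    std::size_t count;
    double largest;
};

// Compare a computed transform against a reference, element by element.
// Only the elements present in both vectors are compared.
ErrorReport CompareToReference(const std::vector<Complex> &result,
                               const std::vector<Complex> &reference,
                               double threshold)
{
    ErrorReport report = {0, 0.0};
    for (std::size_t k = 0; k < result.size() && k < reference.size(); ++k) {
        const double error = std::abs(result[k] - reference[k]);
        if (error > threshold)
            ++report.count;
        if (error > report.largest)
            report.largest = error;
    }
    return report;
}
```

Under these assumptions, a small count with a largest error only slightly above
the threshold suggests noise, while a count near the vector length with a
largest error comparable to the data itself suggests a real bug.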
