The Parallel Fast Multipole Algorithm in Three Dimensions
James F. Leathrum, Jr. and John A. Board, Jr.
Abstract
We have implemented the fast multipole algorithm (FMA) of Greengard
and Rokhlin on single processor computers and on a variety of parallel
and distributed computer systems. the serial FMA reduces the
complexity of the N-body problem of electrostatics and gravitation
from order N2 (where N is the number of interacting bodies)
in a naive
implementation to order N while preserving controllable accuracy in
the computation. the parallel FMA allows further reductions in
simulation times for N-body problems; for
uniform distributions of
particles in cubic simulation volumes we obtain nearly linear speedup
on small ()<- 16 processors) parallel systems. Parallel speedups are
reduced somewhat when the particle distribution becomes non-uniform.
We have completed parallel implementations of the FMA on several
different parallel computers, including shared memory machines
(Sequent Symmetry, Encore Multimax) and distributed memory machines
(INMOS Transputer). We have also done preliminary work on Intel
Touchstone Gamma and distributed workstation clusters (under PVM and
Linda). Each class of machine has unique implementation requirements,
but we are able to get good performance from all with moderate numbers
of processors. Sensible choice of the various parameters in the FMA
is essential for good performance. A number of optimizations to the
original algorithm improve the performance of the FMA significantly.,

Back to the Scientific Computions Group Home Page