The Parallel Fast Multipole Algorithm in Three Dimensions

James F. Leathrum, Jr. and John A. Board, Jr.

Abstract

We have implemented the fast multipole algorithm (FMA) of Greengard and Rokhlin on single processor computers and on a variety of parallel and distributed computer systems. the serial FMA reduces the complexity of the N-body problem of electrostatics and gravitation from order N2 (where N is the number of interacting bodies) in a naive implementation to order N while preserving controllable accuracy in the computation. the parallel FMA allows further reductions in simulation times for N-body problems; for uniform distributions of particles in cubic simulation volumes we obtain nearly linear speedup on small ()<- 16 processors) parallel systems. Parallel speedups are reduced somewhat when the particle distribution becomes non-uniform.

We have completed parallel implementations of the FMA on several different parallel computers, including shared memory machines (Sequent Symmetry, Encore Multimax) and distributed memory machines (INMOS Transputer). We have also done preliminary work on Intel Touchstone Gamma and distributed workstation clusters (under PVM and Linda). Each class of machine has unique implementation requirements, but we are able to get good performance from all with moderate numbers of processors. Sensible choice of the various parameters in the FMA is essential for good performance. A number of optimizations to the original algorithm improve the performance of the FMA significantly.,


Back to the Scientific Computions Group Home Page


This page last modified 7/12/95 by Daniel Paul