This is going to be real simplistic. It is not our intention to lead a discourse on the mathematics behind multipole algorithms. Many people have spent vast amounts of energy doing this already. The reader is urged to to read Leslie Greengard's excellent book/dissertation [6] for a description of the basic FMM(A) algorithm. Algorithmic enhancements to Greengard's FMM are described by John Board et. al.[2,3] as well as Elliott [5].
Probably the best overall description of DPMTA and its feature-set would be the author's dissertation[11] (available online: see the reference section).
The Fast Multipole Algorithm (FMA) is an algorithm for the numerical solution of the N-body problem. Simply put, the classical N-Body problem involves computing the net effect of the interactions of each pair of particles out of a set of N. In molecular dynamics (MD) work, the particles are atoms and the forces are electrostatic. In astrophysical simulations, the particles are stellar bodies and the forces are gravitational. In either case the amount of computation grows as the square of the number of particles, for the naive implementation.
The FMA process, however, uses a Multipole Expansion (MPE) to represent the effects of a group of distant particles as a single entity. By using the MPE when computing forces on a particle, and doing operations to combine multipole expansions, the overall amount of computation can be reduced to an almost linear relationship with the number of particles.
Other multipole tree codes include the work by Barnes and Hut [1] and Warren and Salmon [13]. These algorithms are similar to Greengard's FMA, in that they use oct-tree based spatial decomposition and multipoles, but do not utilize the concept of combining distant multipoles into local expansions prior to computing the particle forces.
Hybrid methods such as PMTA [2] combine the use of local multipole expansion from FMA with the Barnes and Hut tree-codes resulting in performance improvements over the straight tree-codes, while being less complex that a full FMA implementation. In addition, PMTA-based methods use a more aggressive metric for the use of multipole representation in nearby cells, resulting in a reduction in overall runtime without lose of accuracy.
Additional performance enhancements include the use of FFT methods during the multipole computation phase [4]. Since this method increases memory usage and may not provide any noticeable performance benefit for low-accuracy simulations, it is provided as a user specified option for processing.