PHAML is now available!
PHAML version 0.9.18 can now be downloaded as the file phaml-0.9.18.tar.gz (526K) for Unix systems. When unpacked, it will place everything in a directory named phaml-0.9.18.
PHAML is in the public domain and not subject to copyright. Please see the LICENSE file.
The primary goal of the PHAML project is to produce a parallel version of MGGHAT. The target architecture is distributed memory multiprocessors, such as networked workstations, the IBM SP2, or Beowolf-type PC clusters like JazzNet.
MGGHAT is a sequential program for the solution of 2D elliptic partial differential equations using low or high order finite elements, adaptive mesh refinement based on newest node bisection of triangles, and multigrid. All aspects of the method are based on the hierarchical basis functions.
Adaptive refinement, multigrid and parallel computing have each been shown to be effective means of vastly reducing the time required to solve differential equations. However, effectively combining all three techniques is not easy and is the subject of current research. PHAML is an attempt to solve this problem.
There are several subgoals that must be addressed to achieve the primary goal:
![]() |
PHAML solution on four processors of an equation with a singular boundary condition. The colors or shades of grey indicate the region assigned to each processor. |
Refinement-Tree Partition Algorithm
The partitioning of unstructured grids such that the computational load is balanced and interprocessor communication is minimized is not trivial, and has been the subject of much research in the past decade. As part of the PHAML project, two new partitioning algorithms were developed specifically for adaptive multilevel methods.
The Refinement Tree Recursive Bisection (RTRB) partition algorithm uses the adaptive refinement tree to obtain information about the grid that is not available to other partitioning algorithms. It is a fast algorithm that uses O(N) operations to compute subtree weights, and O(log N) operations to find a bisection path from the root of the tree to a leaf such that the weight of the subtrees on either side of the path is equal. These subtrees define a two-way partition of the grid. Each of them can be bisected with RTRB to produce four partitions, etc. There is O(1) communication steps in each bisection, for a total of O(k) communication steps for k partitions.
RTRB is limited to powers-of-two for the number of partitions. The Refinement Tree K-Way (RTK) partition algorithm is a modification of RTRB to partition into any number of partitions. In this approach, a truncated traversal of the tree is performed after the subtree weights are computed. At each node, the subtree rooted there is placed into the current set if it fits, and the traversal returns to the parent. If the subtree is too big for the current set, then the traversal visits the children. This traversal uses O(k log N) operations where k is the number of partitions, and O(1) communication steps.
The Refinement Tree partition algorithms have compared favorably to other partitioning algorithms for use in parallel adaptive multilevel methods. Results from numerical experiments comparing RTRB, RTK and the popular ParMETIS partitioning software can be seen here.
The RTK method is included in Sandia National Laboratory's dynamic load balancing library Zoltan as the method REFTREE. PHAML can optionally use Zoltan instead of its native implementation of RTK, which allows a choice of load balancing methods, and an environment in which to experimentally compare different load balancing methods.
Full Domain Partition (FuDoP)
The traditional data distribution for partitioned grids places a partition on each processor plus shadow copies of the nearest neighboring grid elements of adjacent processors. This is insufficient for an efficient parallel multigrid method, resulting in either too much communication or loss of optimal order convergence depending on how the multigrid is parallelized.
A new approach to data distribution has been developed, called the Full Domain Partition (FuDoP). Each processor receives a partition of the grid plus the minimum number of additional elements to cover the full domain, with the additional elements coming from coarser refinement levels. FuDoP can be interpreted as domain decomposition with overlapping subdomains on each level of the multigrid sequence. FuDoP with extended overlap, FuDoP(k), guarantees at least k levels of overlap on each grid level. With FuDoP(1) the multigrid algorithm maintains nearly the same convergence rate as the sequential algorithm, but only requires two communication steps per V-cycle. The following table shows the convergence rates (contractions factors) for 1-64 processors using FuDoP multigrid with Laplace's Equation on an adaptive grid with 60000 vertices.
Overlap | 1 | 2 | 4 | 8 | 16 | 32 | 64 |
FuDoP(0) | .093 | .178 | .203 | .221 | .332 | .317 | .273 |
FuDoP(1) | .093 | .093 | .094 | .099 | .098 | .101 | .098 |
FuDoP(2) | .093 | .094 | .094 | .098 | .099 | .180 | .163 |
FuDoP also facilitates the parallelization of adaptive refinement. Each processor treats the FuDoP grid like a sequential algorithm, but uses zero as the error estimate in triangles not in its partition. The processors exchange grid information after the refinement is complete.
PHAML code
The PHAML code is being developed as a prototype implementation. It is written in Fortran 90, which provides modules for modularity and data abstraction, optional arguments for a flexible user interface, and many other useful language features. Development is on a cluster of Linux PCs, with portability testing on networks of Sun workstations, SGI workstations, IBM RS6K workstations, DEC Alpha/Linux workstations, and an Origin 2000.
Visualization is obtained through the OpenGL(R) graphics library. Many viewing options are available for the refinement trees, grids and solutions, along with zooming, panning and rotating capabilities.
PHAML uses an object based approach with data abstraction obtained through private/public attributes in Fortran 90 modules. The user program uses the PHAML module, which allows the declaration of variables of type ``phaml_solution_type'' (a structure containing all the data required by PHAML) and a limited number of operations on those variables. These operations include create, destroy, solve_pde, evaluate, store and restore.
PHAML uses either a master/slave or SPMD model of parallel computation. Message passing is performed with either PVM or MPI.
In the master/slave model, the create operation spawns the slave processes that will operate in parallel when other operations, like solve_pde, are invoked. If graphics options are enabled, additional processes for visualization are also spawned. The processes are terminated by the destroy operation. This model is used with PVM and MPI 2, which have spawning capability.
In the SPMD model, the master, slave, and graphics programs of the master/slave model are combined into a single program. Unlike the master/slave model, the processes are not spawned during the create operation. Instead, a fixed number of processes runs for the entire program. This model is used with versions of MPI 1 that do not have spawning capability.
Performance measurements of PHAML have been made on several computer systems.
StopWatch was developed to
provide a portable means of measuring execution time in PHAML. It is a
Fortran 90 module that provides a ``watch'' data type and several operations
on watches, for example start_watch and stop_watch. It has been tested
for portability on over fifteen Fortran 90 computer systems. This software
has been released to the public, and is used in hundreds of locations
worldwide.
f90gl was developed for PHAML
as a Fortran 90 interface to the
OpenGL graphics library. The previous
Fortran 77 bindings for OpenGL were insufficient for portability due to
its reliance on extensions to the Fortran standard. New features of the
Fortran 90 language allowed a portable interface to be developed and
implemented. These Fortran 90 bindings for OpenGL were adopted by the
OpenGL
Architecture Review Board as the official OpenGL
Fortran 90 bindingsRelated projects
If you do not have the software to read the available formats, an alternate format or a paper copy of these documents will be mailed to you if requested from William Mitchell.
Mitchell, W.F., Unified multilevel adaptive finite element methods for elliptic problems, Ph.D. thesis, Technical report UIUCDCS-R-88-1436, Department of Computer Science, University of Illinois, Urbana, IL, 1988. ( gzipped postscript, 194k)
Mitchell, W.F., MGGHAT User's Guide Version 1.1, NISTIR 5948, 1997. (html) (postscript)
Mitchell, W.F., Optimal multilevel iterative methods for adaptive grids, SIAM J. Sci. Statist. Comput. 13 (1992), pp. 146-167.
Mitchell, W.F., Adaptive refinement for arbitrary finite element spaces with hierarchical bases, J. Comp. Appl. Math. 36 (1991), pp. 65-78.
Mitchell, W.F., Refinement Tree Based Partitioning for Adaptive Grids, Proceedings of the 7th SIAM Conference on Parallel Processing for Scientific Computing, SIAM, 1995, pp. 587-592. (gzipped postscript, 75k)
Mitchell, W.F., The Refinement-Tree Partition for Parallel Solution of Partial Differential Equations, NIST Journal of Research, 103 (1998), pp. 405-414. ( gzipped postscript, 96k)
Mitchell, W.F., The Full Domain Partition Approach to Distributing Adaptive Grids, Applied Numerical Mathematics, 26 (1998) pp. 265-275, special issue for the proceedings of Grid Adaptation in Computational PDEs: Theory and Applications. (gzipped postscript, 102k)
Mitchell, W.F., The Full Domain Partition Approach for Parallel Multigrid on Adaptive Grids, Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997. (gzipped postscript, 179k)
Mitchell, W.F., A Parallel Multigrid Method Using the Full Domain Partition, Electronic Transactions on Numerical Analysis, 6 (1998) pp. 224-233, special issue for proceedings of the 8th Copper Mountain Conference on Multigrid Methods. ( gzipped postscript, 100k)
Mitchell, W.F., The Full Domain Partition Approach to Parallel Adaptive Refinement, IMA Volumes in Mathematics and its Applications, 113, Springer-Verlag, 1998, pp. 151-162. Volume devoted to the IMA Workshop on Grid Generation and Adaptive Algorithms. ( gzipped postscript, 138k)
Mitchell, W.F., The K-way Refinement Tree Partitioning Method for Adaptive Grids ( gzipped postscript, 87k)
Mitchell, W.F., A Comparison of Three Fast Repartition Methods for Adaptive Grids, Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999. ( gzipped postscript, 50k)
Mitchell, W.F., Adaptive Grid Refinement and Multigrid on Cluster Computers, Proceedings of the 15th International Parallel and Distributed Processing Symposium, IEEE Computer Society Press, 2001. ( gzipped postscript, 200k)
Mitchell, W.F., The Design of a Parallel Adaptive Multi-Level Code in Fortran 90, Proceedings of the 2002 International Conference on Computational Science, 2002. ( gzipped postscript, 50k)
Mitchell, W.F., StopWatch User's Guide Version 1.0, NISTIR 5971. (html) (gzipped postscript, 78k)
Mitchell, W.F., A Fortran 90 Interface for OpenGL: Revised January 1998, NISTIR 6134, 1998. ( gzipped postscript, 45k)
Mitchell, W.F., The Fortran 90 Bindings for OpenGL, ACM Fortran Forum, 18 (1999). ( gzipped postscript, 69k)
Development status: Active Development Last change to this page: September 10, 2004 Date this page created: 1997 Contact: William Mitchell Home Page