A New Model for the Data Distribution Problem

Thomas Loos

October 15, 1996

A New Model for Solving the Data Distribution Problem

Solving large linear systems on parallel computers requires first solving the data distribution problem. For a sparse coefficient matrix A, that problem typically is to assign rows of A to processors, and is usually solved by partitioning the graph of the matrix. Since the graph partitioning problem is NP-hard, practical algorithms are based on heuristic approaches. Performance of graph partitioning algorithms is measured in terms of the number of edges cut (EC): the lower the number of EC, the better the algorithm. For the data distribution problem, an additional load-balancing constraint is also placed on the algorithm.

Application scientists are interested in another metric: the time it takes to solve the system. Comparing several different partitioning algorithms on matrices derived from irregular grid PDE problems, results indicate that the number of EC is a poor predictor of solver run-time. This is because current graph partitioning algorithms are not tailored for the data distribution problem; the linear system solver, preconditioner, matrix data structures, and computational environment are generally ignored.

A simple simulation-based approximate execution time (AET) metric that estimates the performance of a linear system solver based on combining cost estimates of the solver's computational, communications, and synchronization kernels is presented. Simulation of kernel performance provides a comparatively easy way to model solvers and preconditioners, while giving reasonably accurate estimates in a reasonable amount of simulator execution time. AET estimates have a relative error of 13.1% or less compared to the actual time per iteration of the HMPE linear system solver, and typically is within 5%. A data distribution algorithm is proposed using the AET metric as its cost function.