MPI_Gather(void *sendbuf, /* starting address of send buffer */
int sendcount, /* number of elements in send buffer */
MPI_Datatype sendtype, /* data type of send buffer elements */
void *recvbuf, /* address of receive buffer */
int recvcount, /* number of elements for any single receive */
MPI_Datatype recvtype, /* data type of recv buffer elements */
int root, /* rank of receiving process */
MPI_Comm comm) /* communicator group */
In almost all cases recvtype and sendtype are the same, and
sendcount and recvcount are equal. Both counts specify the
number of elements contributed by each process, not the total
amount of data received. For example, if sendcount = recvcount = 1
and the communicator has p processes, then p data items are
accumulated in recvbuf on the root. They are stored in rank
order: the item from process i lands in position i.
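As a minimal sketch of the counting rule above, the following program has each process contribute a single int, which the root gathers. The helper names root_buffer_elems and block_offset are illustrative and not part of MPI. The macro USE_MPI is an assumed build flag (for example, mpicc -DUSE_MPI gather.c), so the MPI driver is compiled only when it is defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of elements the root must have room for
   when each of p processes contributes recvcount elements. */
size_t root_buffer_elems(int p, int recvcount)
{
    assert(p > 0 && recvcount >= 0);
    return (size_t)p * (size_t)recvcount;
}

/* Hypothetical helper: element offset in recvbuf at which the block
   contributed by the given rank begins (blocks are in rank order). */
size_t block_offset(int rank, int recvcount)
{
    assert(rank >= 0 && recvcount >= 0);
    return (size_t)rank * (size_t)recvcount;
}

#ifdef USE_MPI /* assumed build flag, see the text above */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, p, root = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    int mine = 10 * rank; /* the one element this process contributes */
    int *all = NULL;
    if (rank == root) {   /* only the root needs receive storage */
        all = malloc(root_buffer_elems(p, 1) * sizeof *all);
        if (all == NULL)
            MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* sendcount = recvcount = 1, so the root ends up with p ints */
    MPI_Gather(&mine, 1, MPI_INT, all, 1, MPI_INT, root, MPI_COMM_WORLD);

    if (rank == root) {
        for (int i = 0; i < p; i++) /* entry i came from rank i */
            printf("all[%zu] = %d\n", block_offset(i, 1), all[i]);
        free(all);
    }

    MPI_Finalize();
    return 0;
}
#endif
```

Note that recvcount stays 1 on the root even though p values arrive. The total is implied by the size of the communicator.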
MPI_Allgather() does a similar thing, except that the data is accumulated into recvbuf on every process in the communicator. One shortcoming of MPI_Allgather(), shared with MPI_Gather(), is that sendcount must be identical on every process. For example, in an iterative solver for a linear system of order n with mod(n,p) ≠ 0, one or more segments of the gathered vector must be padded so that they all have the same size. To avoid the padding, there is a vector variant:
MPI_Gatherv(void *sendbuf,
int sendcount,
MPI_Datatype sendtype,
void *recvbuf,
int *recvcounts,
int *displs,
MPI_Datatype recvtype,
int root,
MPI_Comm comm)
Here recvcount has been replaced by an array recvcounts, where
recvcounts[i] is the number of entries sent by process i.
The displs[] array lets you place each contribution almost
anywhere in recvbuf: the data sent from process i is stored on
process root starting at an offset of displs[i] elements (in
units of recvtype) into recvbuf.
The receive arguments are ignored on all non-root processes, but
they must still be supplied in the call.
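The following sketch shows one way to fill recvcounts and displs for the case mentioned earlier, a vector of length n split over p processes with mod(n,p) ≠ 0. The helper block_layout and the choice to give the first mod(n,p) ranks one extra element are assumptions made for illustration, as is the build flag USE_MPI that guards the MPI driver.

```c
#include <assert.h>

/* Hypothetical helper: split n elements over p processes as evenly as
   possible. The first n % p ranks receive one extra element. The blocks
   are laid out contiguously in rank order, so no padding is needed. */
void block_layout(int n, int p, int *recvcounts, int *displs)
{
    assert(n >= 0 && p > 0);
    int base = n / p, extra = n % p, offset = 0;
    for (int i = 0; i < p; i++) {
        recvcounts[i] = base + (i < extra ? 1 : 0);
        displs[i] = offset;          /* block i starts where block i-1 ended */
        offset += recvcounts[i];
    }
}

#ifdef USE_MPI /* assumed build flag: mpicc -DUSE_MPI gatherv.c */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, p, root = 0;
    int n = 10; /* arbitrary example size */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    int *counts = malloc((size_t)p * sizeof *counts);
    int *displs = malloc((size_t)p * sizeof *displs);
    if (counts == NULL || displs == NULL)
        MPI_Abort(MPI_COMM_WORLD, 1);
    block_layout(n, p, counts, displs);

    /* each rank fills its own segment with its global indices */
    int mycount = counts[rank];
    double *seg = malloc((size_t)(mycount > 0 ? mycount : 1) * sizeof *seg);
    double *x = (rank == root) ? malloc((size_t)n * sizeof *x) : NULL;
    if (seg == NULL || (rank == root && x == NULL))
        MPI_Abort(MPI_COMM_WORLD, 1);
    for (int i = 0; i < mycount; i++)
        seg[i] = (double)(displs[rank] + i);

    /* counts, displs and x are only read on the root */
    MPI_Gatherv(seg, mycount, MPI_DOUBLE,
                x, counts, displs, MPI_DOUBLE, root, MPI_COMM_WORLD);

    if (rank == root) {
        for (int i = 0; i < n; i++)
            printf("x[%d] = %g\n", i, x[i]);
        free(x);
    }
    free(seg);
    free(displs);
    free(counts);
    MPI_Finalize();
    return 0;
}
#endif
```

Computing the layout on every rank is a convenience here: each process needs its own count for sendcount, and the arithmetic is cheap.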
The "All" version, MPI_Allgatherv, follows directly from the ideas above: it takes the same arguments except root, and every process receives the full gathered result.
Warning: The writes into recvbuf on root must not overlap (overlap is a form of aliasing). This rules out any specification of counts, types, and displacements that lets two different processes write part of their send data to the same location in recvbuf. Keep in mind that the writing of the data into that buffer is handled by the MPI run-time system, not by you. Unless you explicitly insert barriers or other forms of synchronization, the timing of those writes may differ widely from what serial programming experience would lead you to expect.
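Because MPI is not required to diagnose an overlapping layout, it can be worth checking recvcounts and displs on the root before the call. The function below is a sketch of such a check, not an MPI facility. The name layout_is_disjoint is hypothetical, and the check assumes every block uses the same contiguous recvtype, so that positions can be compared in units of elements.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check: returns true if no two nonempty blocks
   [displs[i], displs[i] + recvcounts[i]) share an element of recvbuf.
   Uses a simple O(p^2) pairwise comparison. */
bool layout_is_disjoint(int p, const int *recvcounts, const int *displs)
{
    assert(p > 0);
    for (int i = 0; i < p; i++) {
        if (recvcounts[i] == 0)
            continue;                /* an empty block writes nothing */
        for (int j = i + 1; j < p; j++) {
            if (recvcounts[j] == 0)
                continue;
            int i_end = displs[i] + recvcounts[i];
            int j_end = displs[j] + recvcounts[j];
            /* two half-open intervals overlap exactly when each one
               starts before the other one ends */
            if (displs[i] < j_end && displs[j] < i_end)
                return false;
        }
    }
    return true;
}
```

A typical use would be assert(layout_is_disjoint(p, recvcounts, displs)) on the root just before calling MPI_Gatherv in a debug build.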
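The type map of a datatype is a sequence of (basic type, displacement) pairs:
{(T0, d0), (T1, d1), (T2, d2), ..., (Tn-1, dn-1)}
Its type signature is the sequence of basic types alone:
{T0, T1, T2, ..., Tn-1}
A message sent with the type signature
{T0, T1, T2, ..., Tn-1}
is compared against the type signature given by the receiver:
{U0, U1, U2, ..., Um-1}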
When you specify only a single datatype, as with MPI_Send, but a count larger than 1, MPI forms the type signature by concatenating count copies of the specified datatype. For example, a count of 3 with MPI_DOUBLE gives the signature {MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE}.
The majority of scientific codes will naturally have type signatures that match on both sender and receiver. The main concern is collective communication, where the type signatures of all participating processes must be identical.