Sparse Matrix Representations


Over 20 different ways of storing a sparse matrix exist, but only six are defined below. Of those six, the first three are the most commonly occurring, and the third one of those is only a slight modification of the second one. So make sure you fully master those first two. The data structures described below will be illustrated for the following example matrix, where a blank indicates a zero which is not stored:

sparse matrix here

The nonzero entries A(i,j) = 10*i + j are chosen to make it easier to see where the entry is located; the first digit gives the row number and the second one gives the column number. For the problems and iterative methods we will use, the values are actually double precision floating point numbers. Links back to this example are in each data structure section, to make it easier to return to it. I recommend that you open two windows (or at least two tabs) to keep the example in view while reading about each data structure.

The first three of the data structures below require two 1D arrays of integers and one 1D array of doubles. In the tables displaying those arrays, to make it easier to keep track of the correct indices an extra table row proceeds them, in a greenish color. It is not part of the data structures, just a convenience similar to having column headers in a spreadsheet. The tables displaying the arrays of integers and doubles are split in COO and CSR, in case you have a narrow browser window. The names of the arrays used below are mostly standard, although the three (rowptr, col, val) in CSR are more commonly named (ia, ja, a). In the text empty parentheses are appended to "col" and "val" to emphasize that they are arrays. When those terms are embedded in text with a specific index (e.g., col(k)) , or are written as a triplet (row, col, val) the parentheses are omitted. So "col( )" and "val( )" just means they are arrays, not functions.

The example matrix above has nnz = 21 nonzeros, but some arrays are shown with an unused nnz+1 = 22nd entry. In practice the arrays are allocated with only nnz, not nnz+1, components. For debugging purposes only, you might allocate nnz+1 entries instead and then set row(nnz+1) = col(nnz+1) = -5 and val(nnz+1) = NaN; that will cause your code to stop with an error condition if you accidentally run off the end of the array. If you do this, be sure to change the codes back to using only nnz entries before handing anything in.


Coordinate-wise (COO)

For each entry in the matrix, store its row and column index alown with its value. This could be done by defining a structure which is a triplet (i,j,value), and then allocating an array of those triplets of length nnz. Instead COO uses three arrays (row, col, val) so that the kth entry in each array has the corresponding information about the kth nonzero. Other naming conventions use (I, J, val), or (ia, ja, A) or the plural form (rows, cols, vals) for the names of the three arrays.

COO features:

Initialize y(1:n) = 0

for k = 1 to nnz
   y(row(k)) = y(row(k)) + val(k)*x(col(k))
end for
In two places the loop requires indirect addressing, for y(row(k)) and x(col(k)). [Two instead of three, because although y(row(k)) is both read and written, its address only needs to be computed once per iteration.] Indirect addressing in innermost loops is deadly for performance.

For the example matrix listed above, the matrix would be stored as
 

Index 1 2 3 4 5 6 7 8 9 10 11
val 11 12 14 22 23 25 31 33 34 42 45
row 1 1 1 2 2 2 3 3 3 4 4
col 1 2 4 2 3 5 1 3 4 2 5

 
Index 12 13 14 15 16 17 18 19 20 21 22
val 46 55 65 66 67 75 77 78 87 88 -
row 4 5 6 6 6 7 7 7 8 8 -
col 6 5 5 6 7 5 7 8 7 8 -

Don't forget: in the table, the row labeled "Index" is for convenience, and is not part of the data structure. Also, for the example matrix the three arrays (row, col, val) in COO storage would be of length 21, not 22. Although the arrays show the nonzeros in row order, in practice the nonzeros could be placed in any order - row after row, column after column, or even randomly.


Compressed Sparse Row (CSR)


CSR stores the column indices and values just like COO, but entries must be listed one row after another. Doing that reduces the required information about row indices to just knowing where a row begins in the column index and value arrays col( ) and val( ). This is done using an integer array of length n+1, such that the k-th entry gives the index in col( ) and val( ) where the k-th row begins. That accounts for the first n entries in the row index information array (which is usually called "row pointers" or in the tables below, rowptr( ) ). The last entry, rowptr(n+1), has the value nnz + 1. That allows the loops implementing a matrix*vector product without having a test for the special case where the last row is being used, as shown below.

Beware that the term "row pointer" is common, but what the integer array row( ) really contains are the indices (usually 1-based) of where each row starts in the arrays col( ) and val( ). It does not contain a pointer in the usual C/C++/Fortran sense.

Features:

Initialize y(1:n) = 0
for i = 1 to n
   for k = row(i) to row(i+1)-1
       y(i) = y(i) + val(k)*x(col(k))
   end for
end for
What can you say about the efficiency of the matrix-vector product (pipelining, data reuse ...)?

For the example matrix at the top of the page, the CSR arrays are:

Index 1 2 3 4 5 6 7 8 9 10 11
val 11 12 14 22 23 25 31 33 34 42 45
col 1 2 4 2 3 5 1 3 4 2 5

rowptr 1 4 7 10 13 14 17 20 22

Index 12 13 14 15 16 17 18 19 20 21 22
val 46 55 65 66 67 75 77 78 87 88 -
col 6 5 5 6 7 5 7 8 7 8 -

The arrays val( ) and col( ) above have been split into two separate tables with the rowptr( ) array between them to make it easier to see how the rowptr( ) array of indices matches up. As a simple example, the fourth row of the matrix begins with the index row(4) = 10 and ends with the index row(4+1) - 1 = row(5) - 1 = 13 - 1 = 12. What would the entries in row( ) be if the fourth row was all zeros and none of them are explicitly stored? Will the matrix-vector product pseudo-code still work correctly?


Compressed Sparse column (CSC)

CSC is almost identical to CSR, but the nonzeros are stored column by column, not row by row. Like CSR, CSC uses three arrays but "column pointers" are stored in an array of length n+1, not row pointers. The row numbers are stored in an integer array of length nnz, and the array of doubles is still of length nnz. The names of the two integer arrays are usually changed to reflect the swapping of rows and columns in the representation: colptr( ) and row( ). As a self-check, you should create tables and write down the entries of the three arrays. One way to check if it is correct: the CSC representation of a matrix A is the same as the CSR representation of the transposed matrix AT. Also note that when the matrix A is symmetric, the CSR and CSC data structures should be identical.


Modified Sparse Row (MSR)

MSR is similar to CSR, except that the row pointers and column indices are both stored (confusingly) in a single integer array. All of the diagonal entries are stored first in the array of double values, which means that even if a diagonal entry is zero, it must be explicitly stored as 0.0. In the worse case where all of the diagonal entries are zero, this implies an additional n entries must be stored

The first n entries in the integer array bindx stores the indices where each row's off-diagonal entries begin. The n+1 entry contains nnz + 1 so that (as with CSR) the matrix-vector product can be written with a single nested loop and no special cases handled for the last row of the matrix. The array of values stores first the n diagonal entries, then skips a single unused entry in position n+1, followed by the off-diagonal entries, row-by-row.

For the example matrix at the top of the page, MSR storage is:
 

Indices 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
bindx 10 12 14 16 19 19 21 23 24 2 4 3 5 1 4 2 5 6 5 7 5 8 7
val 11 22 33 0 55 66 77 88 -- 12 14 23 25 31 34 42 45 46 65 67 75 78 87

You should try to write what the sparse matrix-vector multiplication looks like in MSR.

Something to beware of with COO, CSR, and MSR: all three of these were originally used with Fortran, and hence the indices into the arrays are sometimes 1-based. However, many modern packages are written in C/C++, so sometimes 0-based indexing is used. To make it even more confusing, to maintain compatibility with Fortran drivers, some C/C++ packages use 1-based indexing. You need to carefully check the documentation, and sometimes even the source code, to determine which is used.

Sometimes a C++ package will assume the indexing is 1-based, and shift the integer arrays accordingly to use the more convenient 0-based C indexing internally. Then the package shifts indices back to 1-based before returning to the calling program unit. Others have the user pass in a flag that indicates whether or not the data structure is using 0- or 1-based indexing. Most of the recent ones allow the data to be input with either 0 or 1 based indexing, and the user provides a flag indicating if the index is "C style" or "Fortran style". RTFM (which I believe is the acronym for "read the fine manual").


Compressed Diagonal Storage (CDS)

For matrices that are (nearly) banded, it is more efficient sometimes to store the matrix diagonals.  CDS stores the matrix by diagonals, for the entire bandwidth of matrix. Features:
Initialize y(1:n) = 0
for diag = -p to q
   for k = max(1,1-diag) to min(n,n-diag)
       y(k) = y(k) + val(k,diag)*x(k+diag)
   end for
end for
For the example matrix the storage is in the 2D array of dimensions (p+q+1) x n:
 
Diag 1 2 3 4 5 6 7 8  
-2 * * 31 42 0 0 75 0  
-1 * 0 0 0 0 65 0 87  
0 11 22 33 0 55 66 77 88  
1 12 23 34 45 0 67 78 *  
2 0 0 0 46 0 0 * *  
3 14 25 0 0 0 * * *  

The first row and column (labeled "Diag") in the table are not part of the storage scheme, and are just for convenience in reading the table.  Note that the numbering of the diagonals,  with the main diagonal as zero, subdiagonals with negative numbers, and superdiagonals positive numbers, matches what MATLAB does in its tril( ) and triu( ) operators.

Entries marked with an asterisk (*) are nonexistent ones. For example, the (1,1) entry in the array above would correspond to the (1,-1) entry in A, which does not exist. Similarly the (6,8) entry in the array would be the A(8,11) but A is only 8x8. Any algorithm should not reference those entries, or require them to be set to zero. That is the reason for the max( ) and min( ) operators in the loop controls in the matrix-vector product. It also provides an easy check for some indexing errors: set those entries to NaN, and set the compiler flags so that encountering NaN will trigger a stopping error.


Envelope or skyline storage

CDS wastes storage because it has to sweep up all of a diagonal if even a single nonzero appears there. Skyline stores entries in each row only from the first nonzero to the last. E.g., the row
ai = [0, 0, 0, 3.2, 0, 0, 1, 0.5, 0, -1.7, 0, 0 ]
requires storing the three zeros between 3.2 and -1.7, but not the leading three zeros or the trailing two zeros. Let nenv be the number of stored values of matrix. Note that nenv is greater than or equal to nnz, since some nonzeros may be swept up.

Three arrays are required: val(1:nenv) for nonzero values, rowptr(1:n+1) for row pointers, and fstcol(1:n) for column number of first nonzero in each row (fstcol is shorthand for "first column"). Matrix-vector products become

for k = 1:n
  y(k) = 0
  xcol = fstcol(k)
  istart = rowptr(i)
  iend = rowptr(i+1)-1
  for j = istart:iend
    y(k) = y(k) + val(j)*x(xcol+j-istart)
  end for
end for
A key feature is that the innermost look does not have any indirect addressing, meaning that it can generally run faster than if COO or CSR are used.

For the example matrix, skyline storage is :
 

Index 1 2 3 4 5 6 7 8 9 10 11
val 11 12 0 14 22 23 0 25 31 0 33

 
12 13 14 15 16 17 18 19 20 21 22 23
34 42 0 0 0 45 46 55 65 66 67 75

 
24 25 26 27 28
34 42 0 0 0

 
rowptr 1 5 9 13 19 20 23 27 29
Fstcol 1 2 1 2 5 5 5 7  


Back to the overview page