Numerical Properties of LU Factorization


To express the error properties of LU factorization, we need to know the norm of a matrix to easily state the standard results. Recall that we are not going to derive these - that is the task of a numerical analysis course. Instead, we will see the computational implications of the error analyses.

Recall the various norms of a vector x:

The norm of a matrix can also be defined, but is a bit more complex: For a m x n matrix A, we have

As mentioned earlier , ||AB|| <= ||A|| ||B|| does not hold for all matrix norms and the norm is called consistent if it does satisfy that property.

The 1-, 2-, and infinity-norms come from vector norms, and can be equivalently defined that way. For p = 1, 2, infinity we have

||A||p = max ||Ax||p / || x ||p, for x nonzero, so ||A|| is the maximum amount by which A can ``stretch'' a vector.

LU factorization was the original motivation for numerical analysis. Results generally use the condition number k(A) of A, defined for nonsingular matrices as k(A) = ||A|| ||A-1||. Note that

Now let e be machine precision; for us this means e ~ 10-16. We'll see later where that value comes from; for now take it as given.

Perturbation result: let x solve Ax = b, where A is nonsingular and b is given. If ||dA|| || A-1 || = r < 1, then the perturbed system (A + dA)y = b + db is nonsingular. If ||dA|| < h || A || and ||db|| < h || b ||, then

|| x - y || < 2 h k(A)|| x || / (1-r)

LU Solution Result: Let LU be the factors computed by any of the algorithms we have covered, and let x be the solution computed using the triangular solution algorithms. Then (A + E) x = b, where

|E| < n e (3|A| + 5|L| |U|) + O(e2)

This is a different creature than the perturbution result. First, note that this places bounds on the entries in E; the absolute value bars around E means to take the absolute value of each entry in E. So |E| is exactly the matrix returned by Matlab's abs(E) function call. Secondly, this gives you bounds on entries, not on sum totals that come from norms. Also, the bound does not say how close x is to the "true" solution, but instead says x exactly solves a "nearby" system - with explicit bounds on just how "nearby" that system is to the exact solution. Also note that ill-conditioning is irrelevant here; instead the critical quantity relies on the size of entries in A, L and U. Of course, the condition number will probably rear its ugly head in the size of the L and U factors.