# MA398.5: Simultaneous Linear Equations – Direct Methods

This part of MA398 Matrix Analysis and Algorithms concerns direct methods for the solution of systems of simultaneous linear equations (SLE); iterative methods will be covered in the next part. To recall, the (SLE) problem is, given A ∈ ℂⁿ×ⁿ and b ∈ ℂⁿ, to find x ∈ ℂⁿ such that Ax = b. It will be assumed throughout that A is invertible.

Since I’m using quite a bit of Python for other projects at the moment, my natural tendency is to write up the various algorithms in Python-like pseudo-code. So, in the code sections, vectors will be single-subscript arrays `x`, where the ith entry is denoted `x[i]`; however, I’ll stick to the mathematical convention of having the first index be 1 instead of 0. Matrices will have entries `A[i, j]`. An expression like `x[m:n]` denotes that portion of an array `x` that has index greater than or equal to `m` and strictly less than `n`. The n×n identity matrix will be denoted `eye(n)`, and the m×n zero matrix by `zero(m, n)`.

## Gaussian Elimination

The core of the Gaussian Elimination (GE) method for solving (SLE) is LU factorization combined with forward and backward substitution. Therefore, we need an algorithm for effecting LU factorization. The following algorithm takes as input a square matrix A ∈ ℂⁿ×ⁿ with non-singular principal submatrices A|ⱼ and outputs lower- and upper-triangular L, U ∈ ℂⁿ×ⁿ such that A = LU:

```
def LUFactorize(A):
    # A is an n-by-n complex matrix with all principal submatrices non-singular
    n = number of rows of A
    L = eye(n)
    U = copy(A)  # work on a copy so that A is not overwritten
    for k = 1, ..., n-1:
        for j = k+1, ..., n:
            L[j, k] = U[j, k] / U[k, k]
            U[j, k:n+1] -= L[j, k] * U[k, k:n+1]
    return L, U
```
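To make the pseudocode concrete, here is an executable plain-Python version (0-based indexing, lists of lists rather than a matrix type; a sketch, not a production routine):

```python
def lu_factorize(A):
    """Return L, U with A = L U, assuming every leading principal
    submatrix of A is non-singular (no pivoting is performed)."""
    n = len(A)
    U = [row[:] for row in A]  # work on a copy of A
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for j in range(k + 1, n):
            L[j][k] = U[j][k] / U[k][k]       # multiplier for row j
            for c in range(k, n):
                U[j][c] -= L[j][k] * U[k][c]  # eliminate below the pivot
    return L, U
```

For example, `lu_factorize([[4.0, 3.0], [6.0, 3.0]])` gives `L = [[1.0, 0.0], [1.5, 1.0]]` and `U = [[4.0, 3.0], [0.0, -1.5]]`.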

For backward substitution, we use the following algorithm, which takes as input a non-singular upper-triangular U ∈ ℂⁿ×ⁿ and b ∈ ℂⁿ and outputs x such that Ux = b:

```
def BackwardSubstitute(U, b):
    # U is an upper-triangular n-by-n complex matrix
    # b is a length-n complex vector
    n = length of b
    x = zero vector of length n
    for j = n, ..., 1:
        x[j] = (b[j] - sum(U[j, j+1:n+1] * x[j+1:n+1])) / U[j, j]
    return x
```

The corresponding algorithm for solving Ly = b for y when L is lower triangular is called forward substitution.
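Concretely, both triangular solves look like this in plain Python (0-based indexing, lists for vectors; a sketch rather than library code):

```python
def forward_substitute(L, b):
    """Solve L y = b for non-singular lower-triangular L."""
    n = len(b)
    y = [0.0] * n
    for j in range(n):  # top row first
        y[j] = (b[j] - sum(L[j][c] * y[c] for c in range(j))) / L[j][j]
    return y

def backward_substitute(U, b):
    """Solve U x = b for non-singular upper-triangular U."""
    n = len(b)
    x = [0.0] * n
    for j in range(n - 1, -1, -1):  # bottom row first
        x[j] = (b[j] - sum(U[j][c] * x[c] for c in range(j + 1, n))) / U[j][j]
    return x
```

For example, `backward_substitute([[2.0, 1.0], [0.0, 3.0]], [4.0, 6.0])` returns `[1.0, 2.0]`.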

Thus, the GE algorithm for solving Ax = b is

```
def GaussianElimination(A, b):
    # A is an n-by-n complex matrix with all principal submatrices non-singular
    # b is a length-n complex vector
    L, U = LUFactorize(A)
    y = ForwardSubstitute(L, b)
    x = BackwardSubstitute(U, y)
    return x
```

The LU factorization algorithm has cost ∼ ⅔n³, and both backward and forward substitution have cost ∼ n². Therefore, GE has cost ∼ ⅔n³.
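The ⅔n³ figure can be checked by counting operations: at step k the elimination performs one division for each of the n − k rows below the pivot, plus one multiplication and one subtraction for each of the (n − k)(n − k + 1) updated entries. A small sketch that tallies these flops and compares the total with ⅔n³:

```python
def lu_flops(n):
    """Exact flop count of the LU elimination loops for an n-by-n matrix."""
    flops = 0
    for k in range(1, n):          # elimination steps k = 1, ..., n-1
        rows = n - k               # rows j = k+1, ..., n below the pivot
        cols = n - k + 1           # columns k, ..., n updated in each row
        flops += rows              # one division per row (the multiplier)
        flops += 2 * rows * cols   # one multiply and one subtract per entry
    return flops

# The ratio lu_flops(n) / ((2/3) * n**3) tends to 1 as n grows;
# at n = 200 it is already within half a percent.
```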

The forward and backward substitution steps are numerically stable:

Theorem. For sufficiently small machine precision εₘ, the computed solution ŷ of a triangular system of equations Ty = b solves (T + ΔT)ŷ = b where, element-wise,

|ΔT| ≤ (n εₘ ⁄ (1 − n εₘ)) |T|.

Thus, by the results on the conditioning of (SLE), the relative error in the computation of y (in the vector ∞-norm) is bounded by some positive constant times n εₘ κ(T).

However, the calculation of the LU factorization is not numerically stable. The computed LU factorization is indeed the exact LU factorization of some perturbed matrix A + ΔA, but the bound on the norm of ΔA depends on the norm of the product of the (entry-wise) absolute values of the computed L and U factors, and this cannot be bounded in terms of the norm of A, even when A is a well-conditioned matrix. The root cause of the instability is the division by the potentially tiny pivot `U[k, k]` in the `LUFactorize` algorithm.
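The standard 2×2 illustration of this: give A a pivot of size 10⁻²⁰ and eliminate without pivoting. In IEEE double precision the huge multiplier swamps the (2, 2) entry entirely. A sketch:

```python
# A = [[eps, 1], [1, 1]] with a tiny pivot eps.
eps = 1e-20
a11, a12 = eps, 1.0
a21, a22 = 1.0, 1.0

# One elimination step without pivoting:
l21 = a21 / a11            # enormous multiplier, about 1e20
u22 = a22 - l21 * a12      # 1 - l21 rounds to exactly -l21 in doubles

# Reconstructing the (2, 2) entry of L * U loses A's value 1 completely:
a22_reconstructed = l21 * a12 + u22   # exactly 0.0 in double precision
```

With partial pivoting the rows would be swapped first, the multiplier would be tiny instead of huge, and no information would be lost.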

## Gaussian Elimination with Partial Pivoting

To overcome the instability of the classical GE method, we resort to permuting (“pivoting”) the rows and columns of A. In point of fact, we only swap rows and not columns, hence the name partial pivoting. The following algorithm performs an LU factorization with such partial pivoting: given a non-singular A ∈ ℂⁿ×ⁿ, it returns lower- and upper-triangular L, U ∈ ℂⁿ×ⁿ and a permutation matrix P ∈ {0, 1}ⁿ×ⁿ such that PA = LU.

```
def LUFactorizePP(A):
    # A is an n-by-n non-singular complex matrix
    n = number of rows of A
    L = eye(n)
    U = copy(A)
    P = eye(n)
    for k = 1, ..., n-1:
        let i in {k, ..., n} maximize abs(U[i, k])
        exchange L[i, 1:k] and L[k, 1:k]  # only the already-computed multipliers
        exchange rows i and k of U
        exchange rows i and k of P
        for j = k+1, ..., n:
            L[j, k] = U[j, k] / U[k, k]
            U[j, k:n+1] -= L[j, k] * U[k, k:n+1]
    return L, U, P
```
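An executable plain-Python sketch of the same idea (0-based indexing; for simplicity the permutation is returned as a list `p` with `(PA)[i] = A[p[i]]` rather than as a matrix):

```python
def lu_factorize_pp(A):
    """LU factorization with partial pivoting: P A = L U."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    p = list(range(n))                  # row permutation, encodes P
    for k in range(n - 1):
        # choose the entry of largest modulus in column k, rows k..n-1
        i = max(range(k, n), key=lambda r: abs(U[r][k]))
        U[i], U[k] = U[k], U[i]
        p[i], p[k] = p[k], p[i]
        for c in range(k):              # swap only the computed multipliers
            L[i][c], L[k][c] = L[k][c], L[i][c]
        for j in range(k + 1, n):
            L[j][k] = U[j][k] / U[k][k]
            for c in range(k, n):
                U[j][c] -= L[j][k] * U[k][c]
    return L, U, p
```

For example, on `[[0.0, 1.0], [1.0, 0.0]]` the first pivot search swaps the two rows, giving `p = [1, 0]` and `L = U = I`.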

Thus, the algorithm for Gaussian elimination with partial pivoting (GEPP) is

```
def GaussianEliminationPP(A, b):
    # A is an n-by-n non-singular complex matrix
    # b is a length-n complex vector
    L, U, P = LUFactorizePP(A)
    z = P * b
    y = ForwardSubstitute(L, z)
    x = BackwardSubstitute(U, y)
    return x
```

The additional cost of the comparisons made to maximize |U[i, k]| is Θ(n²), and so does not affect the overall ∼ ⅔n³ cost of Gaussian elimination with partial pivoting. In the case of complete pivoting (both row and column permutations), the additional cost is Θ(n³), and so cannot be neglected.

The error analysis for GEPP is non-trivial. However, under a couple of simplifying assumptions (that the computed permutation matrix is the exact one, and that the computed L and U matrices are close to the exact ones), the error bound is governed by a quantity called the growth factor of A, g(A), defined to be the maximum modulus of the elements of U divided by the maximum modulus of the elements of A. We get an error bound in the induced matrix ∞-norm of the form

‖ΔA‖∞ ≲ 3 n³ g(A) εₘ ‖A‖∞ ⁄ (1 − 3 n εₘ).

Since g(A) ≤ 2ⁿ⁻¹, and this bound is attained by some matrices A, the above error bound looks terrible. However, it is an empirical observation that GEPP works well in practice for “most” problems. Gaussian elimination with complete pivoting has better worst-case stability properties at an increased computational cost.
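The classic matrix attaining g(A) = 2ⁿ⁻¹ (Wilkinson's example) has 1s on the diagonal, −1s below it, and 1s in the last column: partial pivoting makes no row swaps on it, and the last column doubles at every elimination step. A sketch that computes the growth factor directly (the elimination is inlined; no pivoting is needed since every pivot is already maximal in its column):

```python
def worst_case_growth(n):
    """Growth factor of the classic GEPP worst-case matrix:
    1 on the diagonal, -1 below it, 1 in the last column."""
    A = [[1.0 if i == j or j == n - 1 else (-1.0 if i > j else 0.0)
          for j in range(n)] for i in range(n)]
    U = [row[:] for row in A]
    for k in range(n - 1):
        # partial pivoting would make no swaps here: |U[k][k]| is already
        # maximal in column k, so plain elimination gives the same U
        for j in range(k + 1, n):
            m = U[j][k] / U[k][k]
            for c in range(k, n):
                U[j][c] -= m * U[k][c]
    max_u = max(abs(e) for row in U for e in row)
    max_a = max(abs(e) for row in A for e in row)
    return max_u / max_a

# worst_case_growth(n) equals 2**(n - 1): the last column doubles each step.
```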

## Householder QR Factorization

With access to the QR factorization of A, which in this section is assumed to be real, we can solve (SLE) with better stability properties at the cost of doubling the number of operations.

```
def SLEbyQR(A, b):
    # A is an n-by-n non-singular real matrix
    # b is a length-n real vector
    Q, R = QRFactorize(A)
    y = adjoint(Q) * b  # for real Q, adjoint(Q) is just its transpose
    x = BackwardSubstitute(R, y)
    return x
```

The key, of course, is to be able to compute the QR factorization of A. The following algorithm performs a QR factorization of an m×n real matrix A, and has very desirable numerical stability properties.

```
def HouseholderQR(A):
    # A is an m-by-n real matrix
    m, n = numbers of rows and columns of A
    Q = eye(m)
    R = copy(A)
    for k = 1, ..., n:
        u = R[k:m+1, k]  # NB: u is of length m-k+1
        vbar = sign(u[1]) * 2norm(u) * [1, 0, ..., 0] + u
        v = vbar / 2norm(vbar)
        Hk = eye(m-k+1) - 2 * v * adjoint(v)
        Qk = [eye(k-1), zero(k-1, m-k+1); zero(m-k+1, k-1), Hk]
        # I.e. Qk has the identity and Hk as diagonal blocks, zeros elsewhere
        R = Qk * R
        Q = Q * Qk
    return Q, R
```

The intermediate matrices Hₖ are called Householder reflections: Hₖx is the reflection of x in the hyperplane orthogonal to v.

Note that in `SLEbyQR`, the matrix `Q` is only used to calculate `adjoint(Q) * b`. Therefore, a small efficiency saving can be made in solving (SLE) by QR factorization by replacing the line “`Q = Q * Qk`” by “`b = Qk * b`”, so that the final state of `b` will be `adjoint(Q) * b`. The computational cost of the algorithm `HouseholderQR` without calculating `Q` or `adjoint(Q) * b` is ∼ 2mn² − 2n³⁄3; in the regime m = Θ(n), the cost of solving (SLE) using Householder QR factorization is ∼ 4n³⁄3, twice that of GEPP.
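Here is an executable plain-Python sketch of Householder QR for a square real matrix, accumulating `Q` explicitly (0-based indexing; the reflections are applied directly to the trailing rows and columns rather than forming the block matrices):

```python
def householder_qr(A):
    """Householder QR of a square real matrix: A = Q R, Q orthogonal."""
    m = len(A)
    n = len(A[0])
    R = [row[:] for row in A]
    Q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for k in range(n):
        u = [R[i][k] for i in range(k, m)]
        norm_u = sum(t * t for t in u) ** 0.5
        if norm_u == 0.0:
            continue                    # column already zero below the pivot
        v = u[:]
        v[0] += norm_u if u[0] >= 0 else -norm_u  # sign(u[0])*||u||*e1 + u
        norm_v = sum(t * t for t in v) ** 0.5
        v = [t / norm_v for t in v]
        # apply H = I - 2 v v^T to the trailing rows of R ...
        for col in range(k, n):
            s = 2.0 * sum(v[i] * R[k + i][col] for i in range(m - k))
            for i in range(m - k):
                R[k + i][col] -= s * v[i]
        # ... and multiply Q on the right by the reflection's trailing block
        for row in range(m):
            s = 2.0 * sum(Q[row][k + i] * v[i] for i in range(m - k))
            for i in range(m - k):
                Q[row][k + i] -= s * v[i]
    return Q, R
```

For example, `householder_qr([[2.0, 0.0], [0.0, 3.0]])` gives `Q = [[-1.0, 0.0], [0.0, -1.0]]` and `R = [[-2.0, 0.0], [0.0, -3.0]]`: the sign convention flips even an already-triangular column, which is what keeps the reflection numerically safe.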

The error analysis for Householder QR factorization is best expressed in terms of the columns aj of A. The backward error estimate for the computed R factor is, for some positive constant c and sufficiently small machine precision,

‖Δaⱼ‖₂ ≤ (c n² εₘ ⁄ (1 − c n² εₘ)) ‖aⱼ‖₂.

From this we obtain:

Theorem. Let x solve Ax = b and let x̂ be the solution computed through Householder QR. Then (A + ΔA)x̂ = b + Δb, where, for some positive constant c and sufficiently small machine precision,

‖Δaⱼ‖₂ ≤ (c n² εₘ ⁄ (1 − c n² εₘ)) ‖aⱼ‖₂,
‖Δb‖₂ ≤ (c n² εₘ ⁄ (1 − c n² εₘ)) ‖b‖₂.

This is a great improvement over the worst-case estimate for GEPP, at the price of roughly doubling the operation count. However, to reiterate, GEPP is “usually” much more stable in practice than the worst-case analysis suggests, and hence is favoured over Householder QR in most applications.