# MA398.3: Stability and Conditioning

The topic of this third part of MA398 Matrix Analysis and Algorithms is the study of the (usually inevitable) differences between computed and mathematically exact results, particularly in our three prototypical problems of simultaneous linear equations (SLE), least squares (LSQ), and eigenvalue/eigenvector problems (EVP). The first key point is that we split the error analysis into two parts: conditioning, which is a property of the problem alone, and stability, which is a property of the algorithm used to (approximately) solve the problem.

A useful point of view is that of backward error analysis: the computed solution to the problem is viewed as the exact solution to a perturbed problem. If that perturbation in “problem space” is small then the algorithm is called backward stable, and unstable otherwise. Once the original and perturbed problems have been identified, we wish to understand the difference between their respective exact solutions; this is the issue of conditioning. The three prototypical problems can all be viewed as solving an equation of the form G(y, w) = 0, where w denotes the data that define the problem and y the solution.

| Problem | (SLE) | (LSQ) | (EVP) |
| --- | --- | --- | --- |
| Solution, y | x | x | (x, λ) |
| Data, w | (A, b) | (A, b) | A |
| G(y, w) | Ax − b | x − (A∗A)⁻¹A∗b | Ax − λx |

In the backward error analysis point of view, the computed solution ŷ solves G(ŷ, ŵ) = 0. Conditioning concerns the estimation of Δy ≔ ŷ − y in terms of Δw ≔ ŵ − w.

## Conditioning of (SLE)

Recall that the (SLE) problem is, given a matrix A ∈ ℂn×n and a vector b ∈ ℂn, to find x ∈ ℂn such that Ax = b.

The condition number in a norm ‖·‖ of a matrix A ∈ ℂn×n is κ(A) ≔ ‖A‖ ‖A⁻¹‖ if A is invertible, and ∞ otherwise. From now on, we fix a vector norm and its induced matrix norm, in which setting we have κ(A) ≥ 1 for all A ∈ ℂn×n. As an example, if A ∈ ℝn×n is symmetric and positive definite, then its condition number in the induced 2-norm is the ratio of its largest and smallest eigenvalues. The relevance of the condition number to (SLE) is made clear by the following result on perturbations of (SLE) with respect to b and A:

Conditioning Theorem for (SLE). (1) Suppose that Ax = b and A(x + Δx) = b + Δb, with b ≠ 0. Then

‖Δx‖ ⁄ ‖x‖ ≤ κ(A) ‖Δb‖ ⁄ ‖b‖.

(2) Suppose that Ax = b and (A + ΔA)(x + Δx) = b. If κ(A) ‖ΔA‖ < ‖A‖, then

‖Δx‖ ⁄ ‖x‖ ≤ κ(A) ‖ΔA‖ ⁄ ( ‖A‖ − κ(A) ‖ΔA‖ ).
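Both statements are easy to probe numerically. The sketch below (NumPy; the 3×3 symmetric positive definite matrix and the vectors are hypothetical examples, not from the notes) computes κ(A) in the 2-norm as the eigenvalue ratio and then checks part (1) of the theorem:

```python
import numpy as np

# A small symmetric positive definite matrix (hypothetical example).
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

# For an SPD matrix, the 2-norm condition number equals the eigenvalue ratio.
kappa = np.linalg.cond(A, 2)
eigs = np.linalg.eigvalsh(A)      # eigenvalues in ascending order, all positive
ratio = eigs[-1] / eigs[0]

# Check part (1): perturb b, re-solve, and compare with the bound.
b = np.array([1.0, 0.0, 1.0])
x = np.linalg.solve(A, b)
db = 1e-8 * np.array([1.0, -2.0, 0.5])
dx = np.linalg.solve(A, b + db) - x

rel_forward = np.linalg.norm(dx) / np.linalg.norm(x)
bound = kappa * np.linalg.norm(db) / np.linalg.norm(b)
```

In exact arithmetic rel_forward never exceeds bound; for a well-conditioned A the rounding errors of the two solves are far too small to disturb this.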

## Conditioning of (LSQ)

Recall that the (LSQ) problem is, given a matrix A ∈ ℂm×n with m ≥ n and a vector b ∈ ℂm, to find x ∈ ℂn that minimizes ‖ Ax − b ‖2.

Furthermore, assume that rank(A) = n, so that A has singular values σ₁ ≥ … ≥ σₙ > 0. Under this assumption A∗A is invertible and (LSQ) has a unique solution given by the normal equations

x = (A∗A)⁻¹A∗b.

The matrix A⁺ ≔ (A∗A)⁻¹A∗ ∈ ℂn×m is called the Moore–Penrose pseudo-inverse of A.

Thus, for A ∈ ℂm×n with m ≥ n, the condition number of A in a norm ‖·‖ is κ(A) ≔ ‖A‖ ‖A⁺‖ if rank(A) = n, and ∞ otherwise. Since a square, invertible matrix has A⁺ = A⁻¹, this definition of the condition number extends the previous one. In the induced 2-norm, the condition number is the ratio σ₁ ⁄ σₙ of the largest and smallest singular values.
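As an illustration (a sketch with random hypothetical data, NumPy), the normal equations, the pseudo-inverse, and a dedicated least-squares solver all produce the same minimizer, and the 2-norm condition number matches the singular value ratio:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 8, 3
A = rng.standard_normal((m, n))   # full column rank with probability 1
b = rng.standard_normal(m)

# Normal equations (fine for illustration; in practice a QR factorisation is
# preferred, since forming A*A squares the condition number).
x_ne = np.linalg.solve(A.conj().T @ A, A.conj().T @ b)

# The same solution via the pseudo-inverse and via a dedicated solver.
x_pinv = np.linalg.pinv(A) @ b
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]

# In the induced 2-norm, kappa(A) is the ratio of extreme singular values.
s = np.linalg.svd(A, compute_uv=False)
kappa = s[0] / s[-1]
```

All three solution vectors agree to machine precision here, but only because this random A is well conditioned; for an ill-conditioned A the normal equations lose roughly twice as many digits as the QR-based solver behind `lstsq`.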

Conditioning Theorem for (LSQ). Suppose that x solves (LSQ) for (A, b) and that x + Δx solves (LSQ) for (A, b + Δb). Let η ≔ ‖A‖₂ ‖x‖₂ ⁄ ‖Ax‖₂ ≥ 1, and let θ ∈ [0, π⁄2] be defined by cos(θ) = ‖Ax‖₂ ⁄ ‖b‖₂. Then

‖Δx‖₂ ⁄ ‖x‖₂ ≤ ( κ(A) ⁄ ( η cos(θ) ) ) ‖Δb‖₂ ⁄ ‖b‖₂.

The corresponding estimate for perturbations of A instead of b is

‖Δx‖₂ ⁄ ‖x‖₂ ≤ ( κ(A) + κ(A)² tan(θ) ⁄ η ) ‖ΔA‖₂ ⁄ ‖A‖₂.
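The first of these bounds can also be probed numerically; in the sketch below (random hypothetical data, NumPy), η and cos(θ) are computed directly from their definitions and the bound for perturbations of b is checked:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 10, 4
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x = np.linalg.lstsq(A, b, rcond=None)[0]

# Quantities appearing in the theorem.
kappa = np.linalg.cond(A, 2)
eta = np.linalg.norm(A, 2) * np.linalg.norm(x) / np.linalg.norm(A @ x)
cos_theta = np.linalg.norm(A @ x) / np.linalg.norm(b)

# Perturb b, re-solve, and compare with the bound for perturbations of b.
db = 1e-8 * rng.standard_normal(m)
dx = np.linalg.lstsq(A, b + db, rcond=None)[0] - x
rel_forward = np.linalg.norm(dx) / np.linalg.norm(x)
bound = kappa / (eta * cos_theta) * np.linalg.norm(db) / np.linalg.norm(b)
```

Note that cos(θ) ≤ 1 automatically, since Ax is the orthogonal projection of b onto the range of A.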

## Conditioning of (EVP)

Recall that the (EVP) problem is, given a matrix A ∈ ℂn×n, to find (x, λ) ∈ ℂn × ℂ such that Ax = λx and ‖x‖₂ = 1.

A first step towards understanding the conditioning of (EVP) is the following result on the continuity of eigenvalues:

Theorem. Let λ: ℂn×n → ℂn be the function that assigns to each square matrix the vector of its eigenvalues, ordered by decreasing absolute value and repeated according to algebraic multiplicity. Then λ is a continuous function.

As for the previous two problems, there is a notion of condition number associated to (EVP). The eigenvalue condition number for a matrix A ∈ ℂn×n and an eigenvalue λ ∈ ℂ of A is κA(λ) ≔ |〈y, x〉|⁻¹ if x and y are non-orthogonal, and ∞ otherwise, where x and y are normalized right and left eigenvectors for the eigenvalue λ.

Conditioning Theorem for (EVP). Let λ be a simple eigenvalue of A with right and left normalized eigenvectors x and y. If 〈y, x〉 ≠ 0, then, for all small enough ΔA ∈ ℂn×n, the matrix A + ΔA has an eigenvalue λ + Δλ with

Δλ = 〈y, x〉⁻¹ ( 〈y, ΔA x〉 + O(‖ΔA‖₂²) ),

and hence

λ| ≤ κA(λ) ( ‖ΔA2 + O(‖ΔA22) ).

## Stability of Algorithms

Returning now to the general “G(y, w) = 0” framework, suppose that we have a locally unique family of solutions y = g(w) as a function of the input data w, and that the algorithm returns some ŷ, generally different from y, that is the exact solution to a perturbed problem, ŷ = g(ŵ). The difference Δy ≔ ŷ − y is called the forward error, and Δw ≔ ŵ − w is called the backward error; the relative forward error is ‖Δy‖ ⁄ ‖y‖ and the relative backward error is ‖Δw‖ ⁄ ‖w‖. So, for example, the Conditioning Theorem for (SLE) shows that the relative forward error is bounded by the condition number times the relative backward error.
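For (SLE), one backward error is particularly easy to exhibit: whatever solution x̂ a solver returns, it exactly solves Ax = b + r with residual r ≔ Ax̂ − b, so ‖r‖ ⁄ ‖b‖ is a relative backward error for perturbations of b alone. A sketch with random hypothetical data (NumPy):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
x_hat = np.linalg.solve(A, b)        # computed, hence inexact, solution

# x_hat exactly solves the perturbed problem A x = b + r.
r = A @ x_hat - b
rel_backward = np.linalg.norm(r) / np.linalg.norm(b)

# The conditioning theorem then bounds the relative forward error.
forward_bound = np.linalg.cond(A, 2) * rel_backward
```

For a backward stable solver on a well-conditioned problem, both numbers are tiny: the backward error sits near machine precision, and the forward bound is larger only by the factor κ(A).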

At this stage, we introduce a (very simplified) model for computer arithmetic. The most significant simplifying assumption is to ignore the problems of numbers that are not representable because they are either too large (overflow) or too close to zero (underflow). In this simple model, every compact interval [a, b] of ℝ is approximated by a finite set of numbers F, with rounding function fl: [a, b] → F. We assume that there exists εm > 0, called machine epsilon, such that

1. for all x ∈ [a, b], there exists ε ∈ (−εm, +εm) such that fl(x) = x (1 + ε); and
2. for each operation ∗ ∈ {+, −, ×, ⁄} and each x, y ∈ F, there exists δ ∈ (−εm, +εm) such that x ⊛ y = (x ∗ y) (1 + δ), where ⊛ denotes the computed version of ∗.

An algorithm is called backward stable if the relative backward error is O(εm). For example, floating-point addition, subtraction, multiplication and division are backward stable. In practice, however, we want to know not just that the backward error is O(εm), but that it is at most Cm,nεm, where the constant Cm,n does not depend badly (e.g. exponentially) on the problem dimensions m and n.
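For concreteness, IEEE double precision (the format behind NumPy's float64) realises this model with εm = 2⁻⁵³ ≈ 1.1 × 10⁻¹⁶ under round-to-nearest. The sketch below illustrates both model properties; the choice of 1⁄10, which is not exactly representable in binary, is just a convenient example:

```python
import numpy as np
from fractions import Fraction

eps_m = np.finfo(np.float64).eps / 2   # unit roundoff for round-to-nearest

# Property 1: fl(x) = x (1 + eps) with |eps| < eps_m.  Fraction(0.1) recovers
# the exact rational value of the stored double fl(1/10).
exact = Fraction(1, 10)
stored = Fraction(0.1)
rel_err = abs(stored - exact) / exact  # relative rounding error of fl(1/10)

# Relative perturbations below eps_m can vanish entirely in fl arithmetic,
# while those somewhat above eps_m cannot.
tiny_lost = (1.0 + eps_m / 2 == 1.0)
big_kept = (1.0 + 2 * eps_m > 1.0)
```

The same experiment with any non-dyadic rational in place of 1⁄10 shows a relative rounding error below εm, which is exactly what property 1 of the model asserts.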