The second part of MA398 concerns matrix factorization. The idea of matrix factorization is to write a matrix A ∈ ℂm×n as a product of (usually just a few) matrices with a simpler structure than A itself. For example, one of the factor matrices might be a diagonal matrix, which would be lovely, because the eigenvalues and eigenvectors of diagonal matrices just stare you in the face: they’re the diagonal entries of the matrix, and the standard basis vectors.
In point of fact, diagonalizations, which reveal eigenvalues, are powerful theoretical tools but are not so good for calculations. The problem is that finding eigenvalues is equivalent to finding roots of arbitrary polynomials — and that, thanks to the Abel–Ruffini theorem, cannot in general be done exactly with a finite number of arithmetic operations. More useful in computations, then, are factorizations like the QR, LU and Cholesky factorizations, which are numerically easy to achieve and split a matrix into two simpler factors but do not reveal eigenvalues.
The first group of results concerns diagonalization of a matrix, i.e. finding a change-of-basis matrix S such that A = S−1DS with D diagonal. Ideally, we would like the change of basis to be unitary, so that A = S∗DS, since taking a conjugate transpose is almost trivial to compute compared to inverting a matrix. We begin with a result on triangularization, not diagonalization:
Schur Factorization Theorem. For any A ∈ ℂn×n, there exists a unitary Q ∈ ℂn×n and an upper-triangular T ∈ ℂn×n such that A = QTQ∗.
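As a quick numerical sanity check of the theorem, here is the Schur factorization computed with SciPy (the library call and the random test matrix are illustrative choices of mine, not part of the notes):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# output='complex' guarantees an upper-triangular T even when A has
# complex eigenvalues; schur returns T and the unitary Q with A = Q T Q*.
T, Q = schur(A, output='complex')

assert np.allclose(Q @ T @ Q.conj().T, A)   # A = Q T Q*
assert np.allclose(np.tril(T, -1), 0)       # T is upper triangular
```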
Since the determinant of a triangular matrix is the product of its diagonal entries, the characteristic polynomial of a triangular T is det(T − λI) = (t11 − λ)⋯(tnn − λ), and so the eigenvalues of a triangular matrix are precisely its diagonal entries. Thus, since similar matrices have the same eigenvalues, the Schur factorization reveals the eigenvalues of the original matrix A. From Schur’s theorem, we get two diagonalization theorems:
Normal Diagonalization Theorem. For any normal A ∈ ℂn×n (i.e. such that AA∗ = A∗A), there exists a unitary Q ∈ ℂn×n and a diagonal D ∈ ℂn×n such that A = QDQ∗. Furthermore, the diagonal entries of D are the eigenvalues of A, and the columns of Q are the corresponding eigenvectors of A.
Hermitian Diagonalization Theorem. For any Hermitian A ∈ ℂn×n, there exists a unitary Q ∈ ℂn×n and a diagonal Λ ∈ ℝn×n such that A = QΛQ∗. Furthermore, the diagonal entries of Λ are the eigenvalues of A, and the columns of Q are the corresponding eigenvectors of A.
(For me, personally, the last two theorems are the way that I remember the difference between normal and Hermitian matrices: both can be diagonalized, but it’s the Hermitian matrices that have real entries (eigenvalues) in their diagonalization. Of course, it’s the positive-definite ones that have positive diagonal entries.)
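A minimal NumPy illustration of the Hermitian case (the random matrix is just for demonstration): numpy.linalg.eigh returns the eigenvalues as a real array together with a unitary matrix of eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
A = (B + B.conj().T) / 2                  # Hermitian by construction

w, Q = np.linalg.eigh(A)                  # w: eigenvalues, Q: eigenvectors

assert np.allclose(Q @ np.diag(w) @ Q.conj().T, A)   # A = Q Λ Q*
assert np.allclose(Q.conj().T @ Q, np.eye(5))        # Q is unitary
assert w.dtype.kind == 'f'                # eigenvalues come back real
```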
Jordan Canonical Form
Jordan canonical form is almost as good as a diagonalization A = S−1DS, in which D has the eigenvalues of A down the diagonal and zeros elsewhere; in JCF, we settle for D also having a few 1s immediately above the diagonal.
To be more precise, the Jordan block Jk(λ) is the k×k matrix that has λ in all the diagonal entries, 1 in all the entries immediately above the diagonal, and 0 elsewhere. A Jordan matrix is a block diagonal matrix that has Jordan blocks down the diagonal and 0 elsewhere.
Theorem. Every square matrix is similar to a Jordan matrix: for every A ∈ ℂn×n there exists an invertible matrix S ∈ ℂn×n and a Jordan matrix J ∈ ℂn×n such that A = S−1JS, where the diagonal elements of the Jordan blocks are the eigenvalues of A.
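For a concrete (if small) example, SymPy can compute Jordan forms exactly; note that SymPy's convention is M = PJP−1, so the S of the theorem corresponds to P−1. (SymPy and the particular matrix are my choices for illustration.)

```python
import sympy as sp

# A defective matrix: its characteristic polynomial is (λ − 4)², but the
# eigenvalue 4 has only a one-dimensional eigenspace, so M is not
# diagonalizable and its Jordan form contains a single 2×2 block J_2(4).
M = sp.Matrix([[5, 1],
               [-1, 3]])

P, J = M.jordan_form()        # SymPy convention: M = P * J * P**-1

assert J == sp.Matrix([[4, 1],
                       [0, 4]])
assert M == P * J * P.inv()
```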
If, in the definition of Jordan blocks and Jordan matrices, we replace the super-diagonal 1s with δ, then we get δ-Jordan blocks and δ-Jordan matrices, which are useful for proving the following result on approximation of the spectral radius for a fixed matrix A:
Theorem. Fix A ∈ ℂn×n and δ > 0. Then there is a vector norm ‖·‖A,δ such that the induced matrix norm satisfies
ρ(A) ≤ ‖A‖A,δ ≤ ρ(A) + δ.
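To see both inequalities at work, here is a NumPy check (the matrix and δ are mine): any induced matrix norm dominates ρ(A), and for a Jordan block the diagonal scaling behind the δ-Jordan trick produces a weighted ∞-norm whose value is exactly ρ(A) + δ.

```python
import numpy as np

A = np.array([[0.5, 1.0],
              [0.0, 0.5]])                 # a Jordan block: ρ(A) = 0.5

rho = max(abs(np.linalg.eigvals(A)))

# Any induced norm is an upper bound for the spectral radius.
for ord_ in (1, 2, np.inf):
    assert rho <= np.linalg.norm(A, ord_) + 1e-12

# δ-Jordan scaling: conjugating by D = diag(1, δ) shrinks the super-diagonal
# 1 to δ, so the vector norm x ↦ ‖D⁻¹x‖∞ induces a matrix norm of value
# ‖D⁻¹AD‖∞ = ρ(A) + δ.
delta = 1e-3
D = np.diag([1.0, delta])
B = np.linalg.inv(D) @ A @ D               # the δ-Jordan block
assert np.allclose(np.linalg.norm(B, np.inf), rho + delta)
```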
Singular Value Decomposition
Singular value decomposition (SVD) of a rectangular matrix A ∈ ℂm×n is really as close as one can get to a diagonalization for a non-square matrix. For information on the SVD, see this separate post.
QR Factorization Theorem. For any A ∈ ℂm×n with m ≥ n, there exists a unitary Q ∈ ℂm×m and an upper-triangular R ∈ ℂm×n such that A = QR.
The QR factorization is proved essentially by a careful reading of the Gram–Schmidt orthonormalization procedure. In practice, however, this procedure is not used to calculate QR factorizations because it is numerically unstable. It is also useful to note that, since R is upper triangular, its last m−n rows are entirely 0, and so the last m−n columns of Q do not contribute to the factorization: they could be anything, so long as Q is unitary. Thus, the reduced QR factorization of A is
A = Q′R′,
where Q′ consists of the first n columns of Q, and R′ the first n rows of R. When A is square, so are Q and R; since |det(Q)| = 1, it follows that R is invertible if and only if A is.
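In NumPy, the full and reduced versions correspond to the mode argument of numpy.linalg.qr (the random tall matrix is chosen just for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))            # tall matrix, m = 5 > n = 3

Q, R = np.linalg.qr(A, mode='complete')    # Q: 5×5 unitary, R: 5×3
Qr, Rr = np.linalg.qr(A, mode='reduced')   # Q′: 5×3, R′: 3×3

assert np.allclose(Q @ R, A)
assert np.allclose(Qr @ Rr, A)
assert np.allclose(R[3:], 0)               # last m−n rows of R vanish
```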
For A ∈ ℂn×n and 1 ≤ j ≤ n, let A|j ∈ ℂj×j be the matrix that is the j×j top-left corner of A.
LU Factorization Theorem. Let A ∈ ℂn×n be such that A|j is invertible for every 1 ≤ j ≤ n. Then there exist unique matrices L, U ∈ ℂn×n such that
- L is lower triangular with all diagonal entries equal to 1;
- U is upper triangular and invertible;
- and A = LU.
If any one of the submatrices A|j is singular, then no such factorization exists.
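A bare-bones sketch of the underlying algorithm (Doolittle elimination without pivoting; my own illustrative code, not from the notes). The elimination breaks down with a zero pivot precisely when one of the leading submatrices A|j, j < n, is singular; invertibility of the resulting U additionally requires A|n = A to be invertible.

```python
import numpy as np

def lu_no_pivot(A):
    """Doolittle LU: return (L, U) with A = L @ U, L unit lower-triangular.

    Raises on a zero pivot, i.e. when some A|j, j < n, is singular.
    """
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float).copy()
    for k in range(n - 1):
        if U[k, k] == 0.0:
            raise ZeroDivisionError(f"A|{k + 1} is singular: zero pivot")
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]
            U[i, k:] -= L[i, k] * U[k, k:]
    return L, U

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])
L, U = lu_no_pivot(A)
assert np.allclose(L @ U, A)
assert np.allclose(L, np.tril(L)) and np.allclose(U, np.triu(U))
```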
It is of interest to study the effect of permutations on LU factorization, because while permutations do not change whether or not a matrix A is singular, they can change whether or not the submatrices A|j are singular, for example by making the (1, 1) entry zero or non-zero.
Theorem. Let A ∈ ℂn×n be invertible. Then there exists a permutation matrix P ∈ ℂn×n (i.e. a matrix for which each row and column contains n−1 0s and one 1) and L, U ∈ ℂn×n as in the previous theorem such that PA = LU.
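SciPy's pivoted LU illustrates this; one caveat (an observation of mine, worth checking against the documentation) is that scipy.linalg.lu uses the convention A = PLU, so the theorem's permutation is Pᵀ = P−1. The test matrix below has a zero (1, 1) entry, so no LU factorization exists without a row permutation.

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0.0, 1.0],       # zero (1,1) entry: A|1 is singular,
              [1.0, 1.0]])      # so pivoting is genuinely needed

P, L, U = lu(A)                 # SciPy convention: A = P @ L @ U

assert np.allclose(P @ L @ U, A)
assert np.allclose(P.T @ A, L @ U)       # PA = LU as in the theorem
assert np.allclose(np.diag(L), 1.0)      # L has unit diagonal
```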
Cholesky Factorization Theorem. For any positive-definite Hermitian A ∈ ℂn×n, there exists an upper-triangular R ∈ ℂn×n with positive diagonal elements such that A = R∗R.
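In NumPy (the random positive-definite matrix is illustrative), numpy.linalg.cholesky returns the lower-triangular factor L with A = LL∗, so the upper-triangular R of the theorem is R = L∗:

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B.conj().T @ B + 4 * np.eye(4)     # Hermitian positive-definite

L = np.linalg.cholesky(A)              # lower-triangular, A = L L*
R = L.conj().T                         # the theorem's upper-triangular factor

assert np.allclose(R.conj().T @ R, A)                 # A = R* R
assert np.all(np.diag(R).real > 0)                    # positive diagonal
assert np.allclose(np.diag(R).imag, 0)
```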