This first part of **MA398 Matrix Analysis and Algorithms** is devoted to recalling the key basic facts about vector and matrix norms, dual spaces, eigenvalue problems, and various special types of matrices. A lot of this introductory stuff can be (or even should be) done in the context of Banach and Hilbert spaces, and operators between them. The focus of MA398, though, is on *matrix* analysis and algorithms, which means that infinite-dimensional issues, while interesting, are basically a distraction. For all intents and purposes, then, the only vector spaces that we care about are

- *n*-dimensional complex space ℂ^{n} (or occasionally *n*-dimensional real space ℝ^{n}), which will usually be equipped with the Euclidean inner product and norm — but sometimes with other ones, or even norms that do not arise from an inner product; and
- the *mn*-dimensional space ℂ^{m×n} of complex matrices with *m* rows and *n* columns, representing linear maps from ℂ^{n} to ℂ^{m} with respect to the usual bases of these spaces, or the corresponding space ℝ^{m×n} of real matrices; there are plenty of interesting norms that can be put on these spaces.

Norms, Inner Products and Adjoints

A **norm** (or, more fully, **vector norm**) on a vector space *V* over the field *K* = ℝ or ℂ is a function ‖·‖: *V* → [0, ∞) such that ‖*x*‖ = 0 if and only if *x* = 0 ∈ *V*; ‖*α**x*‖ = |*α*|‖*x*‖ for all *α* ∈ *K* and *x* ∈ *V*; and ‖*x*+*y*‖ ≤ ‖*x*‖ + ‖*y*‖ for all *x*, *y* ∈ *V*. We call this a “vector” norm because later on we will call attention to particular norms on spaces of matrices, and call such things matrix norms.

An important fact is that if *V* is a *finite*-dimensional vector space over *K*, such as ℂ^{n}, then any two norms ‖·‖ and ‖·‖′ on *V* are **equivalent** in the sense that there exist constants 0 < *c* ≤ *C* < ∞ such that

*c* ‖*x*‖ ≤ ‖*x*‖′ ≤ *C* ‖*x*‖ for all *x* ∈ *V*.
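As a quick numerical sketch of norm equivalence (using NumPy; the specific constants 1 and √*n* relating the ∞- and 2-norms are a standard fact not derived above):

```python
import numpy as np

# For x in C^n, the classical equivalence constants between the 2-norm
# and the infinity-norm are:  ||x||_inf <= ||x||_2 <= sqrt(n) ||x||_inf.
rng = np.random.default_rng(0)
n = 5
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

norm_2 = np.linalg.norm(x, 2)
norm_inf = np.linalg.norm(x, np.inf)

assert norm_inf <= norm_2 + 1e-12
assert norm_2 <= np.sqrt(n) * norm_inf + 1e-12
```

The same two-sided bound holds (with different constants) for any pair of norms on a finite-dimensional space.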

An **inner product** on *V* is a function 〈·, ·〉: *V* × *V* → *K* such that 〈*x*, *x*〉 ≥ 0, with equality if and only if *x* = 0; 〈*x*, *y*〉 is the complex conjugate of 〈*y*, *x*〉; and 〈*x*, *α**y*+*β**z*〉 = *α*〈*x*, *y*〉 + *β*〈*x*, *z*〉. Every inner product induces a norm by ‖*x*‖ ≔ 〈*x*, *x*〉^{1⁄2}. Any inner product and its induced norm satisfy the **Cauchy–Schwarz inequality**

|〈*x*, *y*〉| ≤ ‖*x*‖ ‖*y*‖ for all *x*, *y* ∈ *V*,

with equality if and only if *x* and *y* are linearly dependent.
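A small numerical check of Cauchy–Schwarz in ℂ^{n} (a sketch with NumPy; `np.vdot` conjugates its first argument, matching the conjugate-linear-in-the-first-slot convention used here):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)

# Euclidean inner product <x, y> = sum of conj(x_i) * y_i
inner = np.vdot(x, y)
lhs = abs(inner)
rhs = np.linalg.norm(x) * np.linalg.norm(y)
assert lhs <= rhs + 1e-12

# Equality for linearly dependent vectors: z is a scalar multiple of x.
z = (2 - 3j) * x
assert np.isclose(abs(np.vdot(x, z)),
                  np.linalg.norm(x) * np.linalg.norm(z))
```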

On *V* = ℂ^{n}, the usual **Euclidean inner product** is

〈*x*, *y*〉 ≔ *x*̅_{1}*y*_{1} + … + *x*̅_{n}*y*_{n},

and the norm that it induces is the **Euclidean norm** ‖·‖_{2}, i.e. the square root of the sum of the squares of the absolute values of the components of *x*; in the real case, of course, the complex conjugate signs (over-bars) in the above definition are redundant. Similarly, for *p* ≥ 1, the ***p*-norm** ‖*x*‖_{p} of *x* is the *p*th root of the sum of the *p*th powers of the absolute values of the components of *x*; the **∞-norm** ‖*x*‖_{∞} of *x* is the maximum of the absolute values of the components of *x*.
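For concreteness, the *p*-norms for a specific vector (a sketch using NumPy, whose `np.linalg.norm` accepts any `ord` = *p* ≥ 1 for vectors):

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])

# p-norm: p-th root of the sum of p-th powers of the absolute values.
def p_norm(x, p):
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

assert np.isclose(p_norm(x, 1), 7.0)        # |3| + |-4| + |0| = 7
assert np.isclose(p_norm(x, 2), 5.0)        # sqrt(9 + 16) = 5
assert np.isclose(np.max(np.abs(x)), 4.0)   # infinity-norm = 4

# Agrees with NumPy's built-in vector norm for any p >= 1.
assert np.isclose(p_norm(x, 3), np.linalg.norm(x, 3))
```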

If *V* and *W* are inner product spaces and *T*: *V* → *W* is a linear map, then the **adjoint** of *T* is the linear map *T*^{∗}: *W* → *V* defined by the relation

〈*T**x*, *y*〉 = 〈*x*, *T*^{∗}*y*〉 for all *x* ∈ *V* and *y* ∈ *W*.

It’s an easy exercise to check that if *T* has matrix *A* ∈ ℂ^{m×n}, then the matrix of *T*^{∗} is the conjugate transpose of *A*, the *n*×*m* matrix whose (*i*, *j*)th element is the complex conjugate of the (*j*, *i*)th element of *A*. Therefore, we call the conjugate transpose of *A* the **adjoint** of *A*, and denote it by *A*^{∗} ∈ ℂ^{n×m}.
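The defining relation of the adjoint can be verified numerically for a random matrix (a sketch with NumPy; `A.conj().T` forms the conjugate transpose):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 4
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)

A_star = A.conj().T   # the adjoint: the n x m conjugate transpose of A

# <Ax, y> = <x, A* y>, with <u, v> = sum of conj(u_i) * v_i
assert np.isclose(np.vdot(A @ x, y), np.vdot(x, A_star @ y))
```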

A matrix *A* ∈ ℂ^{m×n} is said to be **unitary** if *A*^{∗}*A* = *I*; in the real case, *A* is called **orthogonal**. A matrix is unitary if and only if its columns are orthonormal with respect to the usual inner product, and hence cannot have more columns than rows. A square unitary matrix *A* satisfies *A*^{−1} = *A*^{∗}. Also, a square matrix *A* is unitary if and only if *A*^{∗} is unitary.

An important property of unitary matrices is that they preserve the Euclidean inner product (and hence the Euclidean norm as well); indeed, many authors use this as the definition of unitarity. A matrix *A* ∈ ℂ^{m×n} is unitary if and only if, for all *x*, *y* ∈ ℂ^{n},

〈*A**x*, *A**y*〉 = 〈*x*, *y*〉.
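This preservation property can be checked on a concrete unitary matrix; a convenient way to manufacture one (a sketch, assuming NumPy) is to take the Q factor of a QR factorization of a random matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q, _ = np.linalg.qr(M)   # Q has orthonormal columns, so Q* Q = I

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

assert np.allclose(Q.conj().T @ Q, np.eye(n))         # unitarity
assert np.isclose(np.vdot(Q @ x, Q @ y), np.vdot(x, y))  # inner product preserved
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))  # norm preserved
```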

A matrix *A* ∈ ℂ^{n×n} is called **Hermitian** (or **self-adjoint**) if *A*^{∗} = *A*; in the real case, *A* is called **symmetric**.

A (Hermitian) matrix *A* ∈ ℂ^{n×n} is called **positive definite** (resp. **positive semi-definite**) if 〈*A**x*, *x*〉 > 0 (resp. ≥ 0) whenever 0 ≠ *x* ∈ ℂ^{n}. A positive-definite Hermitian matrix *A* can be used to define a new inner product and norm by

〈*x*, *y*〉_{A} ≔ 〈*x*, *A**y*〉

‖*x*‖_{A} ≔ √〈*x*, *x*〉_{A}.
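The *A*-inner product and *A*-norm are easy to realise in code; a sketch with NumPy, where the Hermitian positive-definite *A* is manufactured as *B*^{∗}*B* + *I* for a random *B*:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = B.conj().T @ B + np.eye(n)   # Hermitian positive definite by construction

def inner_A(x, y):
    return np.vdot(x, A @ y)     # <x, y>_A := <x, A y>

def norm_A(x):
    return np.sqrt(inner_A(x, x).real)

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

assert norm_A(x) > 0
# Cauchy-Schwarz holds for the A-inner product as for any inner product:
assert abs(inner_A(x, y)) <= norm_A(x) * norm_A(y) + 1e-10
```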

Eigenvalues and Eigenvectors

For a matrix *A* ∈ ℂ^{n×n}, a non-zero *x* ∈ ℂ^{n} and *λ* ∈ ℂ are called a (**right**) **eigenvector** and **eigenvalue** for *A* if *A**x* = *λ**x*. Similarly, a non-zero *y* ∈ ℂ^{n} and *λ* ∈ ℂ are called a (**left**) **eigenvector** and **eigenvalue** for *A* if *y*^{∗}*A* = *λ**y*^{∗}. As it happens, the right and left eigenvalues of *A* are always the same, but a given eigenvalue may have different right and left eigenvectors — which are in any case only unique up to multiplication by non-zero scalars, so we often search for unit-norm eigenvectors.

The **characteristic polynomial** of *A* is

*ρ*_{A}(*z*) ≔ det(*A* − *z**I*),

which is a polynomial of degree *n* in *z*. A complex number *λ* is an eigenvalue of *A* if and only if it is a root of *ρ*_{A}; since, by the Fundamental Theorem of Algebra, *ρ*_{A} has *n* complex roots (possibly with multiplicity), *A* has *n* eigenvalues (possibly with multiplicity).

Actually, there are two notions of multiplicity for eigenvalues. The **algebraic multiplicity** of *λ* is the largest integer *q* such that (*z*−*λ*)^{q} is a factor of *ρ*_{A}(*z*). The **geometric multiplicity** of *λ* is the dimension of ker(*A*−*λ**I*). An eigenvalue is **simple** if both its algebraic and geometric multiplicities are 1. The algebraic and geometric multiplicities *q* and *r* always satisfy 1 ≤ *r* ≤ *q*.

Matrices *A*, *B* ∈ ℂ^{n×n} are said to be **similar** if there is an invertible *S* ∈ ℂ^{n×n} (a change of basis matrix) such that *B* = *S*^{−1}*A**S*.

If *A* ∈ ℂ^{n×n} has *n* linearly independent eigenvectors (and not all matrices do!), then the eigenvalue equation gives us the neat similarity relation

*A* = *V*Λ*V*^{−1},

where *V* is the matrix with the (say, unit-norm) eigenvectors as its columns, and Λ is the diagonal matrix that has the corresponding eigenvalues along the diagonal and zeros elsewhere. (For ideas about what to do when *A* doesn’t have *n* linearly independent eigenvectors, see the next part, and in particular Schur Triangularization and Jordan Canonical Form.) Regardless of linear independence issues, similar matrices always have the same eigenvalues with the same algebraic and geometric multiplicities, although not, in general, the same eigenvectors.
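The similarity relation can be checked directly (a sketch with NumPy; `np.linalg.eig` returns unit-norm eigenvectors as the columns of *V*, and a generic random matrix has distinct eigenvalues, hence *n* independent eigenvectors):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
A = rng.standard_normal((n, n))

# lam holds the eigenvalues, the columns of V the unit-norm eigenvectors.
lam, V = np.linalg.eig(A)
Lam = np.diag(lam)

# With n independent eigenvectors, A = V Lam V^{-1}.
assert np.allclose(A, V @ Lam @ np.linalg.inv(V))

# Each column satisfies the eigenvalue equation A v = lam v.
for j in range(n):
    assert np.allclose(A @ V[:, j], lam[j] * V[:, j])
```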

If *A* ∈ ℂ^{n×n} has *n* orthogonal eigenvectors, then *A* is called **normal**. Normality is equivalent to the condition that *A*^{∗}*A* = *A**A*^{∗}.

If *A* ∈ ℂ^{n×n} is Hermitian and positive definite, then the eigenvalues of *A* are all strictly positive, and give a nice estimate for the *A*-norm in terms of the original one: if the smallest and largest eigenvalues of *A* are, respectively, *λ*_{min} and *λ*_{max}, then

√*λ*_{min} ‖*x*‖ ≤ ‖*x*‖_{A} ≤ √*λ*_{max} ‖*x*‖ for all *x* ∈ ℂ^{n}.
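These two-sided bounds are easy to verify numerically (a sketch, assuming NumPy; `np.linalg.eigvalsh` returns the real eigenvalues of a Hermitian matrix in ascending order):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
B = rng.standard_normal((n, n))
A = B.T @ B + np.eye(n)          # symmetric positive definite

lam = np.linalg.eigvalsh(A)      # real eigenvalues, ascending order
lam_min, lam_max = lam[0], lam[-1]
assert lam_min > 0               # positive definiteness

x = rng.standard_normal(n)
norm_x = np.linalg.norm(x)
norm_A_x = np.sqrt(x @ A @ x)    # ||x||_A = sqrt(<x, A x>)

assert np.sqrt(lam_min) * norm_x <= norm_A_x + 1e-10
assert norm_A_x <= np.sqrt(lam_max) * norm_x + 1e-10
```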

The **spectral radius** of *A* ∈ ℂ^{n×n} is

*ρ*(*A*) ≔ max { |*λ*| | *λ* is an eigenvalue of *A* }.

Dual Spaces

In general, the **dual space** of a normed vector space (*V*, ‖·‖) is the space *V*′ of all continuous linear functions from *V* to its field, be that ℝ or ℂ — continuous, that is, with respect to the norm ‖·‖ on *V* and the absolute value on the field. The dual space is equipped with the **operator norm**

‖*y*‖′ ≔ max { | *y**x* | | *x* ∈ *V*, ‖*x*‖ = 1 }.

As a set, the dual of ℂ^{n} is always ℂ^{n}, but the norm on the dual space depends upon the norm on the primal space. (The dual pairing between the two copies of ℂ^{n} is just the standard inner product.)

**Theorem.** The spaces (ℂ^{n}, ‖·‖_{1}) and (ℂ^{n}, ‖·‖_{∞}) are duals of one another. For *p*, *q* ∈ (1, ∞) with 1⁄*p* + 1⁄*q* = 1, the spaces (ℂ^{n}, ‖·‖_{p}) and (ℂ^{n}, ‖·‖_{q}) are duals of one another. In particular, (ℂ^{n}, ‖·‖_{2}) is self-dual.
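For the 1/∞ pair, the duality is concrete: the dual norm of ‖·‖_{1} evaluated at *y* is the supremum of |⟨*y*, *x*⟩| over ‖*x*‖_{1} = 1, which is attained at a standard basis vector and equals ‖*y*‖_{∞}. A numerical sketch (assuming NumPy, real vectors for simplicity):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 5
y = rng.standard_normal(n)

# The supremum of |y . x| over ||x||_1 = 1 is attained at the standard
# basis vector picking out the largest-modulus component of y.
i = np.argmax(np.abs(y))
e = np.zeros(n)
e[i] = 1.0                        # ||e||_1 = 1
assert np.isclose(abs(y @ e), np.linalg.norm(y, np.inf))

# Hoelder's inequality: |y . x| <= ||y||_inf * ||x||_1 for every x.
x = rng.standard_normal(n)
assert abs(y @ x) <= np.linalg.norm(y, np.inf) * np.linalg.norm(x, 1) + 1e-12
```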

Matrix Norms

On the vector space ℂ^{n×n} of complex matrices with *n* rows and columns, we are sometimes interested in norms that interact nicely with matrix multiplication. A norm ‖·‖ on ℂ^{n×n} is called a **matrix norm** if, in addition to the usual norm axioms, it satisfies

‖*A**B*‖ ≤ ‖*A*‖ ‖*B*‖ for all *A*, *B* ∈ ℂ^{n×n}.

(Note that multiplying a matrix norm by a constant at least 1 yields a new matrix norm, whereas multiplying it by a positive constant less than 1 yields a new norm that is *not* a matrix norm.)

Given a (vector) norm ‖·‖_{v} on ℂ^{n}, the **induced matrix norm** (also known as the **operator norm**) is the matrix norm ‖·‖_{m} on ℂ^{n×n} defined by

‖*A*‖_{m} ≔ sup { ‖*A**x*‖_{v} | *x* ∈ ℂ^{n} with ‖*x*‖_{v} = 1 }.

Notably, the induced matrix norm always assigns the identity matrix the norm 1, a property that does not hold for arbitrary matrix norms. Usually, the same symbol is used for an induced matrix norm as for the vector norm that induces it. The matrix norm induced by the ∞-norm is the maximum row sum, i.e. the maximum over the rows of *A* of the 1-norm of each row; the matrix norm induced by the 1-norm is the maximum column sum, i.e. the maximum over the columns of *A* of the 1-norm of each column.
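Both row-sum and column-sum formulas can be checked against NumPy's built-in matrix norms (a sketch on a small concrete matrix):

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

# Induced infinity-norm: maximum row sum of absolute values.
max_row_sum = np.max(np.abs(A).sum(axis=1))   # max(1+2, 3+4) = 7
# Induced 1-norm: maximum column sum of absolute values.
max_col_sum = np.max(np.abs(A).sum(axis=0))   # max(1+3, 2+4) = 6

assert np.isclose(np.linalg.norm(A, np.inf), max_row_sum)
assert np.isclose(np.linalg.norm(A, 1), max_col_sum)

# Induced norms give the identity matrix norm 1.
assert np.isclose(np.linalg.norm(np.eye(3), 1), 1.0)
```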

Matrix norms interact nicely with the spectral radius (the maximum modulus of the eigenvalues):

**Theorem.** For any matrix norm, any square matrix *A* ∈ ℂ^{n×n}, and any natural *k*,

*ρ*(*A*)^{k} ≤ *ρ*(*A*^{k}) ≤ ‖*A*^{k}‖ ≤ ‖*A*‖^{k}.

Furthermore, if *A* is normal, then in the induced 2-norm, equality holds throughout the above expression.
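The chain of inequalities can be observed numerically in the induced 2-norm, which NumPy computes as the largest singular value via `np.linalg.norm(·, 2)` (a sketch; in fact ρ(*A*)^{k} = ρ(*A*^{k}) always, since the eigenvalues of *A*^{k} are the *k*th powers of those of *A*):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
A = rng.standard_normal((n, n))
k = 3

def rho(M):
    """Spectral radius: maximum modulus of the eigenvalues."""
    return np.max(np.abs(np.linalg.eigvals(M)))

Ak = np.linalg.matrix_power(A, k)

# rho(A)^k = rho(A^k) <= ||A^k||_2 <= ||A||_2^k
assert np.isclose(rho(A) ** k, rho(Ak))
assert rho(Ak) <= np.linalg.norm(Ak, 2) + 1e-8
assert np.linalg.norm(Ak, 2) <= np.linalg.norm(A, 2) ** k + 1e-8
```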

For non-square matrices *A* ∈ ℂ^{m×n}, the matrix norm induced by the Euclidean norm satisfies ‖*A*‖_{2} = √*ρ*(*A*^{∗}*A*). Also, just as the vector 2-norm is preserved by the action of unitary matrices, so too is the matrix 2-norm: for all *A* ∈ ℂ^{m×n} and all unitary *U* ∈ ℂ^{m×m} and *V* ∈ ℂ^{n×n},

‖*U**A*‖_{2} = ‖*A*‖_{2} = ‖*A**V*‖_{2}.
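Both the singular-value formula and the unitary invariance can be checked together (a sketch with NumPy, manufacturing unitary *U* and *V* from QR factorizations of random matrices):

```python
import numpy as np

rng = np.random.default_rng(8)
m, n = 3, 4
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# Unitary U (m x m) and V (n x n) from QR of random complex matrices.
U, _ = np.linalg.qr(rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

s = np.linalg.norm(A, 2)   # induced 2-norm = largest singular value

# ||A||_2 = sqrt(rho(A* A))
assert np.isclose(s, np.sqrt(np.max(np.abs(np.linalg.eigvals(A.conj().T @ A)))))

# Unitary invariance of the 2-norm:
assert np.isclose(np.linalg.norm(U @ A, 2), s)
assert np.isclose(np.linalg.norm(A @ V, 2), s)
```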