# MA398.4: Complexity of Algorithms

The topic of this part of MA398 Matrix Analysis and Algorithms is the computational cost of solving numerical problems, as measured by the number of algebraic operations — and, in particular, how that depends upon the size of the problem. For simplicity, we will neglect issues like memory usage, parallel communications costs and so on, and define the cost of an algorithm to be the total number of additions, subtractions, multiplications, divisions and square roots performed during one run of the algorithm.

To describe asymptotic costs as the problem size becomes large, we will use the following notation:

g = O(f) if limsupx→∞ g(x)f(x) <; ∞,
g = Ω(f) if liminfx→∞ g(x)f(x) >; 0,
g = Θ(f) if g = O(f) and g = Ω(f),
gf if limx→∞ g(x)f(x) = 1.

These notes will not attempt to formally define what constitutes “an algorithm”. This can be a tricky point, and is especially important if one wants to rigorously establish non-trivial lower bounds on computational costs: if simply guessing the right answer and returning it in one operation is a valid algorithm, then the lower bounds are trivially 1 in all cases.

Matrix-Vector and Matrix-Matrix Multiplication

The standard algorithm for the inner product of x, y ∈ ℂn is as follows:
1) Set s = x̅1y1.
2) For i = 2, …, n: replace s by s + x̅iyi.
3) Return s.

Theorem. The standard inner-product algorithm on ℂn has cost Θ(n); any algorithm for the inner product has cost Ω(n).

The standard algorithm for multiplication of a matrix A ∈ ℂn×n with a vector x ∈ ℂn is simply n copies of the standard inner product algorithm, one for each of the n rows of A.

Theorem. The standard algorithm for matrix-vector multiplication has cost Θ(n2); any algorithm for it has cost Ω(n).

Similarly, the standard algorithm for the multiplication of two matrices A, B ∈ ℂn×n is n2 copies of the standard inner product algorithm: the (i, j)th element of AB is the inner product of (the transpose of) the ith row of A with the jth column of B.

Theorem. The standard algorithm for matrix-matrix multiplication has cost Θ(n3); any algorithm for it has cost Ω(n2).

There are algorithms for matrix-matrix multiplication that have better computational cost than Θ(n3). Perhaps the simplest is the Strassen algorithm. Let n be even, and suppose that we are interested in multiplying n×n matrices A and B, and let DAB. Write each of these matrices X in terms of (n⁄2)×(n⁄2) blocks X11, X12, X21 and X22. Let

P1 ≔ (A11+A22)(B11+B22),
P2 ≔ (A21+A22)B11,
P3A11(B12B22),
P4A22(B21B11),
P5 ≔ (A11+A12)B22,
P6 ≔ (A21A11)(B11+B12),
P7 ≔ (A12A22)(B21+B22).

The above definitions may seem to be overly complicated, but note that the are only seven multiplications of (n⁄2)×(n⁄2) matrices involved. In general, this kind of reduction from a larger problem (n×n matrix multiplication) to multiple instances of a smaller one ((n⁄2)×(n⁄2) matrix multiplication) is called a divide and conquer strategy. The crucial point is that the Strassen algorithm reduces the larger problem to 7 smaller ones, whereas the naïve approach would use 8. The blocks of the DAB can be calculated from the P matrices using only sums and differences:

D11 = P1+P4P5+P7,
D12 = P3+P5,
D21 = P2+P4,
D22 = P1+P3P2+P6.

By applying this idea recursively, and if necessary padding matrices that are not power-of-2-dimensional with zeros, we find that the Strassen algorithm for matrix-matrix multiplication has cost Θ(nlog27).

There are also more efficient methods of matrix inversion than the naïve method of Cramer’s rule:

Theorem. If there is a method to multiply n×n matrices with cost O(nβ) for some β ≥ 2, then it is also possible to invert n×n matrices with cost O(nβ).

The idea of the proof is quite simple. Let A ∈ ℂn×n be invertible. By appropriate padding with zero and identity matrices if necessary, we can assume that n is a power of 2. Since A is invertible, A−1 exists and equals (AA)−1A, so A can be inverted with cost equal to that of two matrix multiplications and one inversion of the Hermitian positive-definite matrix AA. At this point, we resort to divide-and-conquer approach using Schur complements to show that the inversion of n×n Hermitian positive-definite matrices can be accomplished in O(nβ).

Fast Fourier Transform

The previous section considered the costs of general matrix multiplication. If the matrix has some additional structure, then further reductions in cost can be achieved. A good example is furnished by the discrete Fourier transform.

For a natural number n, let

ωne2πin,
φkn−1⁄2 (1, ωnk, ωn−2k, …, ωn−(n−1)k).

The vectors φk form an orthonormal basis of ℂn, i.e. 〈φk, φm〉 = δkm. This, any u ∈ ℂn can be expanded in this basis as

u = a0φ0 + a1φ1 + … + an−1φn−1,

where ak = 〈φk, u〉. Indeed, all the coefficients {ak} of u in this Fourier basis can be found at one by multiplying the vector u in the usual basis by the symmetric matrix An whose (i, j) entry is n−1⁄2ωni+j (here the indices of An run from 0 to n−1 instead of 1 to n as usual.

Theorem. For n a power of 2, matrix-vector multiplication by An can be accomplished with cost O(n log n).

Bidiagonal and Hessenberg Forms

The emphasis of the following two matrix factorization results is on their O(n3) cost. Recall that A is upper bidiagonal if its only non-zero entries aij have either i = j or i = j−1; A is upper Hessenberg if its only non-zero entries aij have ij+1

Theorem. For every A ∈ ℂn×n there exist unitary U, V ∈ ℂn×n and an upper bidiagonal B ∈ ℂn×n such that A = UBV, and this factorization can be achieved with cost O(n3).

Theorem. For every A ∈ ℂn×n there exists a unitary U ∈ ℂn×n and an upper Hessenberg T ∈ ℂn×n such that A = UTU, and this factorization can be achieved with cost O(n3).