SVD when everything goes right

This post is about a derivation of the Singular Value Decomposition of a matrix \(\mathrm{A}\) when several conditions are met. These conditions are quite restrictive, but make derivations very easy.

Symmetric PD matrices

Let us begin with a symmetric matrix \(\mathrm{S}\in\mathbb{R}^{n\times n}\) which is assumed positive-definite with \(n\) distinct eigenvalues \(\{\lambda_i\}_i\). It is easy to prove that the eigenvectors \(v_i\) are orthogonal :

$$0 < v_i^\mathsf{T}\mathrm{S}\,v_i = \lambda_i\lVert v_i \rVert^2 $$

gives that all eigenvalues are strictly positive, but we have for \(i \neq j\)

$$ \lambda_i\,v_i^\mathsf{T}v_j = (\mathrm{S}v_i)^\mathsf{T}v_j = v_i^\mathsf{T}\mathrm{S}^\mathsf{T}v_j = v_i^\mathsf{T}\mathrm{S}v_j = \lambda_j\,v_i^\mathsf{T}v_j$$

because \(\mathrm{S} = \mathrm{S}^\mathsf{T}\). As \(\lambda_i \neq \lambda_j\) we must have \(v_i^\mathsf{T}v_j = 0\). A symmetric PD matrix \(\mathrm{S}\) can therefore be decomposed as \(\mathrm{S} = \mathrm{Q}\mathrm{\Lambda}\mathrm{Q}^\mathsf{T}\) with \(\mathrm{Q}\) orthogonal and \(\mathrm{\Lambda}\) diagonal with positive entries.

Eigendecomposition of a product

Suppose \(\mathrm{A}\) is square, and further that the matrices \(\mathrm{A}^\mathsf{T} \mathrm{A}\) and \(\mathrm{A}\mathrm{A}^\mathsf{T}\) have distinct eigenvalues. Then, by our derivation in the previous section, we have that each of the two matrices can be decomposed as

$$ \mathrm{A}^\mathsf{T} \mathrm{A} = \mathrm{U}\mathrm{\Sigma}_1\mathrm{U}^\mathsf{T} \qquad\mathrm{and}\qquad \mathrm{A}\mathrm{A}^\mathsf{T} = \mathrm{V}\mathrm{\Sigma}_2\mathrm{V}^\mathsf{T} $$

because these two matrices are always symmetric and positive-definite. Further, the eigenvalues \(\sigma^1_i\) and \(\sigma^2_i\) stored in \(\mathrm{\Sigma}_1\) and \(\mathrm{\Sigma}_2\) are equal: if \(\mathrm{A}^\mathsf{T} \mathrm{A}u_i = \sigma^1_i u_i\) then

$$\mathrm{A}\mathrm{A}^\mathsf{T} (\mathrm{A}u_i) = \sigma^1_i (\mathrm{A}u_i) = \sigma^2_j v_j\qquad\text{for some}\;j $$

We can therefore always rearrange \(\mathrm{\Sigma}_1\) into \(\mathrm{\Sigma}_2\) by pre and post multiplying by a permutation: \(\mathrm{\Sigma}_1 = \mathrm{P}\mathrm{\Sigma}_2\mathrm{P}^\mathsf{T}\). By applying the same permutation to the matrix of eigenvectors we keep the decomposition

$$ \mathrm{A}\mathrm{A}^\mathsf{T} = \mathrm{V}\mathrm{\Sigma}_2\mathrm{V}^\mathsf{T} = (\mathrm{V}\mathrm{P}^\mathsf{T})\mathrm{P}\mathrm{\Sigma}_2\mathrm{P}^\mathsf{T}(\mathrm{V}\mathrm{P}^\mathsf{T})^\mathsf{T} = \tilde{\mathrm{V}}\mathrm{\Sigma}_1\tilde{\mathrm{V}}^\mathsf{T}$$

we can therefore assume \(\mathrm{\Sigma}_1 = \mathrm{\Sigma}_2 = \mathrm{\Sigma}\) for some appropriate orthogonal matrices \(\mathrm{U}, \mathrm{V}\).

Singular Value Decomposition

Because the eigenvalues stored in \(\mathrm{\Sigma}\) are strictly positive, we can take their square root \(\sqrt{\sigma_i}\) and store them in a new matrix \(\tilde{\mathrm{\Sigma}} = \mathrm{\Sigma}^{1 / 2}\) such that $$\tilde{\mathrm{\Sigma}}\tilde{\mathrm{\Sigma}} = \tilde{\mathrm{\Sigma}}^\mathsf{T}\tilde{\mathrm{\Sigma}} = \tilde{\mathrm{\Sigma}}\tilde{\mathrm{\Sigma}}^\mathsf{T} = \mathrm{\Sigma} $$

the trick behind the SVD is to construct a specififc \(\mathrm{V}\) from the \(\mathrm{U}\)-decomposition. Consider the equalities

\begin{align} \mathrm{A}\mathrm{A}^\mathsf{T} &= \mathrm{V}\mathrm{\Sigma}\mathrm{V}^\mathsf{T} \\ & = \left(\mathrm{A}\mathrm{U}\tilde{\mathrm{\Sigma}}^{-1}\right)\mathrm{\Sigma}\left(\tilde{\mathrm{\Sigma}}^{-1}\mathrm{U}^\mathsf{T} \mathrm{A}^\mathsf{T}\right) \\ & = \left(\mathrm{A}\mathrm{U}\tilde{\mathrm{\Sigma}}^{-1}\right)\mathrm{\Sigma}\left(\mathrm{A}\mathrm{U}\tilde{\mathrm{\Sigma}}^{-1}\right)^\mathsf{T} \end{align}

therefore, by setting \(\mathrm{V} \triangleq \mathrm{A}\mathrm{U}\tilde{\mathrm{\Sigma}}^{-1}\) we can do two things: orthogonally diagonalize the matrix \(\mathrm{A}\mathrm{A}^\mathsf{T}\) and ensure that \(\mathrm{A} = \mathrm{U}^\mathsf{T}\tilde{\mathrm{\Sigma}}\mathrm{V}\). The matrices \(\mathrm{U}, \mathrm{V}\) are often named differently, so that the SVD decomposition of \(\mathrm{A}\) is

$$ \mathrm{A} = \mathrm{U}\mathrm{\Sigma}\mathrm{V}^\mathsf{T} $$ for some unitary matrices \(\mathrm{U}, \mathrm{V}\) and a positive diagonal matrix \(\mathrm{\Sigma}\).