The composition of two linear maps
A linear map and its matrix
Suppose we have a linear map \(T\colon V \to W\) where \(V,W\) are two \(\mathbb{R}\)-vector spaces with bases \(\{v_i\}_{i=1}^n\) and \(\{w_j\}_{j=1}^m\). The matrix representation \(\mathrm{Mat}(T) = \mathrm{A}\in\mathbb{R}^{m\times n}\) of the map \(T\) can be found in three steps. First, expand an arbitrary \(v\in V\) in terms of the basis vectors of \(V\):
$$ T( v ) = T \left(\sum_{i=1}^n \alpha_i\,v_i\right) $$
Then use the linearity of the map \(T\) to obtain:
$$ T( v ) = \sum_{i=1}^n \alpha_i\,T(v_i) $$
Finally, expand each vector \(T(v_i)\) in terms of the basis vectors of \(W\):
$$ T(v_i) = \sum_{j=1}^m a_{ji} w_j $$
This gives us the equality for any \(v\in V\): $$ T(v) = \sum_{j=1}^m \sum_{i=1}^n \alpha_i a_{ji} w_j $$
The coefficients \(\vec{\alpha} = [\alpha_1\,,\, \dots \,,\, \alpha_n]\) uniquely characterize the vector \(v\), and similarly the coefficients \(a_{ij}\) uniquely characterize the map \(T\) once the bases of \(V,W\) are fixed. Because the expansion of \(T(v)\) in the basis \(\{w_j\}_j\) is unique, we obtain that \(T(v)\) has the coefficients \(\beta_j = \sum_{k=1}^n a_{jk}\alpha_k\). We can therefore think of a linear map purely in terms of multiplying the coefficients \(\alpha_i\) by the entries \(a_{ij}\) and adding the results.
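We can then define the array \(\mathrm{A} = \left[ a_{ij} \right]_{ij}\), whose \(i\)-th column holds the coefficients of \(T(v_i)\), and define the matrix-vector product \(\mathrm{A} \vec{\alpha}\) as the vector with entries \(\beta_j = \sum_{k=1}^n a_{jk}\alpha_k\).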
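The construction above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: both spaces are taken to be \(\mathbb{R}^n\) and \(\mathbb{R}^m\) with their standard bases, and the names `matrix_of`, `apply` and the particular map `T` are hypothetical choices made for the example.

```python
def matrix_of(T, n):
    """Build the m x n matrix of T (a list of rows) w.r.t. the standard bases.

    Column i holds the coefficients a_{1i}, ..., a_{mi} of T(v_i).
    """
    columns = []
    for i in range(n):
        v_i = [1 if k == i else 0 for k in range(n)]  # i-th basis vector
        columns.append(T(v_i))
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[i][j] for i in range(n)] for j in range(m)]


def apply(A, alpha):
    """Compute beta_j = sum_k a_{jk} * alpha_k for every row j of A."""
    return [sum(a_jk * alpha_k for a_jk, alpha_k in zip(row, alpha)) for row in A]


def T(v):
    """A sample linear map from R^2 to R^3 (assumed for illustration)."""
    x, y = v
    return [x + 2 * y, 3 * x, -y]


A = matrix_of(T, 2)
alpha = [5, -1]
beta = apply(A, alpha)

print(A)                 # the matrix of T
print(beta == T(alpha))  # the matrix reproduces the map on coefficients
```

The matrix is built only from the images of the basis vectors, yet `apply(A, alpha)` agrees with `T(alpha)` for any coefficient vector, which is exactly the content of the formula for \(\beta_j\).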
Composition as matrix multiplication
Let \(T\colon V\to W\) and \(S\colon W\to Z\) be linear maps with matrices \(\mathrm{A}\) and \(\mathrm{B}\), and let \(R = S\circ T\) be their composition, with matrix \(\mathrm{C}\). Given bases for the spaces \(V\xrightarrow{T} W \xrightarrow{S} Z\), of dimensions \(n\), \(m\) and \(p\) respectively, let us denote
- \(\vec{\alpha} = [\alpha_1\,,\, \dots \,,\, \alpha_n]\) the unique coefficients representing an arbitrary \(v\in V\) using the basis for \(V\),
- \(\vec{\beta} = [\beta_1\,,\, \dots \,,\, \beta_m]\) the unique coefficients representing \(w = T(v)\in W\) using the basis for \(W\),
- \(\vec{\gamma} = [\gamma_1\,,\, \dots \,,\, \gamma_p]\) the unique coefficients representing \(z = S(w)\in Z\) using the basis for \(Z\).
From the first section, we know that \(T\) has matrix representation \(\mathrm{A} = \left[a_{ij}\right]_{ij}\) so that the output coefficients are
$$ \beta_j = \sum_{k=1}^n a_{jk} \alpha_k $$
and that \(S\) has representation \(\mathrm{B} = \left[b_{ij}\right]_{ij}\) so that
$$ \gamma_q = \sum_{j=1}^m b_{qj} \beta_j $$
Substituting the first expression into the second and exchanging the order of the two finite sums, we obtain the equalities
\begin{align} \gamma_q &= \sum_{j=1}^m b_{qj} \sum_{k=1}^n a_{jk} \alpha_k \\ &= \sum_{k=1}^n \left(\sum_{j=1}^m b_{qj}a_{jk}\right) \alpha_k \\ &= \sum_{k=1}^n c_{qk}\alpha_k \end{align}
Thus the matrix \(\mathrm{C}\) has entries \(c_{qk} = \sum_{j=1}^m b_{qj}a_{jk} = \left[ \mathrm{BA} \right]_{qk}\). We have therefore proved that \(\mathrm{C} = \mathrm{Mat}(S\circ T) = \mathrm{BA} = \mathrm{Mat}(S)\mathrm{Mat}(T)\).
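As a numerical sanity check of this identity, here is a minimal plain-Python sketch. The matrices `A` and `B` and the helper names `matmul` and `apply` are assumptions made for the example; the code compares applying \(\mathrm{A}\) and then \(\mathrm{B}\) to a coefficient vector against applying the single product \(\mathrm{BA}\).

```python
def apply(M, coeffs):
    """Matrix-vector product: entry q is sum_k M[q][k] * coeffs[k]."""
    return [sum(m_qk * c_k for m_qk, c_k in zip(row, coeffs)) for row in M]


def matmul(B, A):
    """Matrix product with entries c_{qk} = sum_j b_{qj} * a_{jk}."""
    p, m, n = len(B), len(A), len(A[0])
    return [[sum(B[q][j] * A[j][k] for j in range(m)) for k in range(n)]
            for q in range(p)]


# Assumed example matrices: A represents T (n = 2, m = 3), B represents S (m = 3, p = 2).
A = [[1, 2], [3, 0], [0, -1]]
B = [[1, 0, 2], [0, 1, 1]]

C = matmul(B, A)  # candidate matrix of the composition S after T

alpha = [5, -1]
gamma_two_steps = apply(B, apply(A, alpha))  # first T, then S
gamma_one_step = apply(C, alpha)             # the composition in one product

print(C)
print(gamma_two_steps == gamma_one_step)
```

Note the order of the factors: `matmul(B, A)` mirrors \(S\circ T\), where \(T\) acts first. Here `matmul(A, B)` is also defined but yields a \(3\times 3\) matrix, the matrix of \(T\circ S\) in the case \(Z = V\), rather than the matrix of \(S\circ T\).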