5 Matrices and Coordinate Representations
5.1 The Problem of Computation
The theory developed in Chapters 1–4 establishes that a linear map T : \mathcal{V} \to \mathcal{W} is uniquely determined by its values on a basis. By Theorem 4.3, if \mathcal{B} = \{v_1, \ldots, v_n\} is a basis of \mathcal{V}, then specifying T(v_1), \ldots, T(v_n) \in \mathcal{W} determines T completely. Linearity forces T\left(\sum_{j=1}^{n} c_j v_j\right) = \sum_{j=1}^{n} c_j T(v_j) for any scalars c_1, \ldots, c_n \in \mathbb{F}.
This reduces the infinite problem—defining T on all vectors in \mathcal{V}—to the finite problem of specifying n vectors in \mathcal{W}. But “specifying a vector” is itself an infinite datum when \mathcal{W} is abstract. To perform calculations, we require numerical descriptions.
Consider the differentiation operator D : \mathcal{P}_2 \to \mathcal{P}_1 sending a polynomial to its derivative. We know D(1) = 0, D(x) = 1, and D(x^2) = 2x. These three equations determine D on all of \mathcal{P}_2. But to compute D(3 + 2x - x^2) systematically—especially when composing with other maps, inverting, or solving equations—we need more than abstract knowledge. We need a calculus of operations.
The solution is coordinatization. Once we choose a basis \mathcal{B} for \mathcal{V} and a basis \mathcal{C} for \mathcal{W}, every vector acquires a numerical address, and every linear map acquires a numerical encoding. The abstract theory becomes concrete algebra.
This chapter develops the machinery. We construct coordinate maps, define matrices as recordings of linear transformations in coordinates, derive the rules for manipulating these matrices, and establish the correspondence between abstract maps and their coordinate representations. The central tension—the interplay between intrinsic structure and coordinate description—governs everything that follows.
5.2 Coordinate Maps
Let \mathcal{V} be a finite-dimensional vector space over a field \mathbb{F} with \dim \mathcal{V} = n. Throughout this chapter, \mathbb{F} denotes either \mathbb{R} or \mathbb{C} unless otherwise specified, though the constructions apply to arbitrary fields.
Fix an ordered basis \mathcal{B} = \{v_1, \ldots, v_n\} of \mathcal{V}. By Definition 3.5, every vector v \in \mathcal{V} admits a unique representation v = \sum_{j=1}^{n} a_j v_j for some scalars a_1, \ldots, a_n \in \mathbb{F}. The coefficients (a_1, \ldots, a_n) identify v relative to \mathcal{B}. Different choices of v yield different coefficient sequences; different choices of basis yield different coefficients for the same v.
We require a standard numerical container for these coefficients. The natural candidate is \mathbb{F}^n, but we must specify how to arrange the n scalars. Convention dictates columns rather than rows.
Definition 5.1 (Column vector) An element of \mathbb{F}^n written as \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} is called a column vector. The entry a_i occupies the i-th position from the top.
The choice of columns over rows is not arbitrary. It ensures compatibility with matrix-vector multiplication, which we define shortly. Row vectors will appear later when we study dual spaces and linear functionals.
Definition 5.2 (Coordinate map) Let \mathcal{B} = \{v_1, \ldots, v_n\} be an ordered basis of \mathcal{V}. The coordinate map with respect to \mathcal{B} is the function [\cdot]_{\mathcal{B}} : \mathcal{V} \to \mathbb{F}^n defined as follows: if v = \sum_{j=1}^{n} a_j v_j, then [v]_{\mathcal{B}} = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}. The column vector [v]_{\mathcal{B}} is called the coordinate vector of v with respect to \mathcal{B}.
The ordering of the basis is essential. Reordering the vectors in \mathcal{B} permutes the entries of [v]_{\mathcal{B}}. This dependence on ordering distinguishes ordered bases from unordered sets.
The coordinate map translates abstract vectors into Euclidean space. It converts questions about \mathcal{V}—a space whose elements may be polynomials, functions, sequences, or formal symbols—into questions about \mathbb{F}^n, where calculation is explicit.
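For readers who want to experiment, the coordinate map can be computed numerically. The sketch below (Python with NumPy; the space \mathcal{P}_2, the basis \mathcal{B} = \{1,\ 1+x,\ 1+x+x^2\}, and the monomial-coefficient encoding are choices made for this illustration, not part of the definition) finds [v]_{\mathcal{B}} by solving the linear system v = \sum_{j} a_j v_j:

```python
import numpy as np

# Encode elements of P_2 by their monomial coefficients (constant, x, x^2).
# The columns below are the basis vectors 1, 1+x, 1+x+x^2 in that encoding.
B = np.array([[1.0, 1.0, 1.0],   # constant coefficients
              [0.0, 1.0, 1.0],   # x coefficients
              [0.0, 0.0, 1.0]])  # x^2 coefficients

def coords(v, basis_matrix):
    """Coordinate map [v]_B: solve basis_matrix @ a = v for the coefficients a."""
    return np.linalg.solve(basis_matrix, v)

v = np.array([3.0, 2.0, -1.0])   # the polynomial 3 + 2x - x^2
a = coords(v, B)                  # a == [1, 3, -1]

# Reconstruction check: v = sum_j a_j v_j
assert np.allclose(B @ a, v)
```

Indeed 3 + 2x - x^2 = 1 \cdot 1 + 3(1+x) + (-1)(1+x+x^2), so the coordinate vector is (1, 3, -1).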
Theorem 5.1 The coordinate map [\cdot]_{\mathcal{B}} : \mathcal{V} \to \mathbb{F}^n is linear.
Proof. Let u, v \in \mathcal{V} with u = \sum_{j=1}^{n} a_j v_j and v = \sum_{j=1}^{n} b_j v_j. Then u + v = \sum_{j=1}^{n} (a_j + b_j) v_j, so [u + v]_{\mathcal{B}} = \begin{pmatrix} a_1 + b_1 \\ \vdots \\ a_n + b_n \end{pmatrix} = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} + \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} = [u]_{\mathcal{B}} + [v]_{\mathcal{B}}. For scalar multiplication, if c \in \mathbb{F} then cu = \sum_{j=1}^{n} (ca_j) v_j, giving [cu]_{\mathcal{B}} = \begin{pmatrix} ca_1 \\ \vdots \\ ca_n \end{pmatrix} = c \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} = c[u]_{\mathcal{B}}. \quad \square
The linearity of [\cdot]_{\mathcal{B}} reflects a fundamental principle: coordinatization preserves vector space structure. Addition and scalar multiplication in \mathcal{V} correspond exactly to the componentwise operations in \mathbb{F}^n.
Theorem 5.2 The coordinate map [\cdot]_{\mathcal{B}} : \mathcal{V} \to \mathbb{F}^n is an isomorphism of vector spaces.
Proof. We verify bijectivity. For injectivity, suppose [v]_{\mathcal{B}} = 0. Then all coefficients a_j in the expansion v = \sum_{j=1}^{n} a_j v_j vanish, giving v = 0 by linear independence of \mathcal{B}. Thus \ker([\cdot]_{\mathcal{B}}) = \{0\}, and by Theorem 4.6 the map is injective.
For surjectivity, let \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} \in \mathbb{F}^n be arbitrary. Define v = \sum_{j=1}^{n} c_j v_j \in \mathcal{V}. Then [v]_{\mathcal{B}} = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} by construction, establishing surjectivity.
Combining injectivity, surjectivity, and linearity from Theorem 5.1 gives the result. \square
This theorem justifies the claim from Chapter 3: every n-dimensional vector space over \mathbb{F} is isomorphic to \mathbb{F}^n. But the isomorphism is not canonical—it depends on the choice of basis. Different bases produce different isomorphisms, all equally valid.
The coordinate map is our gateway to computation. Abstract problems in \mathcal{V} translate to concrete problems in \mathbb{F}^n, where algorithms and numerical methods apply. But we must remember: \mathcal{V} exists independently of coordinates. The basis \mathcal{B} is a tool we impose, not an intrinsic structure.
5.3 Matrices as Arrays of Scalars
Before representing linear maps, we establish notation for rectangular arrays of scalars. The development is systematic: we define the set of matrices, specify indexing conventions, introduce matrix-vector multiplication, and verify its linearity. Only then do we connect matrices to linear maps.
Definition 5.3 (Matrix space) Let m, n \in \mathbb{N}. The set M_{m \times n}(\mathbb{F}) consists of all rectangular arrays A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} where a_{ij} \in \mathbb{F} for all 1 \leq i \leq m and 1 \leq j \leq n. An element A \in M_{m \times n}(\mathbb{F}) is called an m \times n matrix over \mathbb{F}. The scalar a_{ij} is the (i,j)-entry of A.
The dimensions m \times n specify “m rows by n columns.” The first index i identifies the row; the second index j identifies the column. Thus a_{ij} resides in row i, column j. This convention—“row-column” ordering—is universal.
We often suppress the field \mathbb{F} when context makes it clear, writing M_{m \times n} for M_{m \times n}(\mathbb{R}) or leaving the field implicit. When m = n, the matrix is square, and we write M_n(\mathbb{F}) for M_{n \times n}(\mathbb{F}).
Matrices admit componentwise addition and scalar multiplication. For A, B \in M_{m \times n}(\mathbb{F}) and c \in \mathbb{F}, define (A + B)_{ij} = a_{ij} + b_{ij}, \quad (cA)_{ij} = c a_{ij}. The zero matrix 0 \in M_{m \times n}(\mathbb{F}) has all entries equal to zero. The additive inverse of A is -A with entries (-A)_{ij} = -a_{ij}.
Theorem 5.3 M_{m \times n}(\mathbb{F}) is a vector space under componentwise addition and scalar multiplication. Its dimension is mn.
Proof. Verification of the vector space axioms reduces to properties of \mathbb{F}. For instance, commutativity of addition follows from (A + B)_{ij} = a_{ij} + b_{ij} = b_{ij} + a_{ij} = (B + A)_{ij} for all i, j. The remaining axioms follow similarly.
For dimension, consider the matrices E_{ij} \in M_{m \times n}(\mathbb{F}) defined by (E_{ij})_{k\ell} = \delta_{ik}\delta_{j\ell}, so E_{ij} has entry 1 in position (i,j) and zeros elsewhere. There are mn such matrices. Every A \in M_{m \times n}(\mathbb{F}) expands uniquely as A = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} E_{ij}. Linear independence of \{E_{ij}\} is immediate: if \sum_{i,j} c_{ij} E_{ij} = 0, evaluating the (k,\ell)-entry gives c_{k\ell} = 0. Thus \{E_{ij}\} is a basis of M_{m \times n}(\mathbb{F}). \square
The j-th column of A is the column vector a_j = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix} \in \mathbb{F}^m. We often write A = \begin{pmatrix} | & | & & | \\ a_1 & a_2 & \cdots & a_n \\ | & | & & | \end{pmatrix} to emphasize the column structure. The i-th row of A consists of the entries a_{i1}, a_{i2}, \ldots, a_{in} read left to right.
Definition 5.4 (Matrix-vector product) Let A \in M_{m \times n}(\mathbb{F}) and x \in \mathbb{F}^n, with columns a_1, \ldots, a_n \in \mathbb{F}^m. Write x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}. The product Ax \in \mathbb{F}^m is defined by Ax = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n = \sum_{j=1}^{n} x_j a_j.
The i-th component of Ax is equivalently (Ax)_i = \sum_{j=1}^{n} a_{ij} x_j, obtained by expanding each column a_j in terms of its entries. Both formulations are equivalent; the column-based definition emphasizes that Ax is a linear combination of the columns of A.
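The equivalence of the two formulations admits a quick numerical check; the NumPy sketch below (the particular A and x are arbitrary examples) computes Ax both ways:

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])   # a 2x3 example matrix
x = np.array([2.0, 1.0, -1.0])

# Column view: Ax as a linear combination of the columns of A.
col_view = sum(x[j] * A[:, j] for j in range(A.shape[1]))

# Entry view: (Ax)_i = sum_j a_ij x_j.
entry_view = np.array([A[i, :] @ x for i in range(A.shape[0])])

assert np.allclose(col_view, A @ x)
assert np.allclose(entry_view, A @ x)
```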
Theorem 5.4 For fixed A \in M_{m \times n}(\mathbb{F}), the map \mathbb{F}^n \to \mathbb{F}^m defined by x \mapsto Ax is linear.
Proof. Let x, y \in \mathbb{F}^n and c \in \mathbb{F}. By Definition 5.4, A(x + y) = \sum_{j=1}^{n} (x_j + y_j) a_j = \sum_{j=1}^{n} x_j a_j + \sum_{j=1}^{n} y_j a_j = Ax + Ay. Similarly, A(cx) = \sum_{j=1}^{n} (cx_j) a_j = c \sum_{j=1}^{n} x_j a_j = c(Ax). \quad \square
Thus every matrix A \in M_{m \times n}(\mathbb{F}) determines a linear map \mathbb{F}^n \to \mathbb{F}^m via x \mapsto Ax. Conversely, every linear map \mathbb{F}^n \to \mathbb{F}^m arises this way.
Theorem 5.5 For every A \in M_{m \times n}(\mathbb{F}), the map T_A : \mathbb{F}^n \to \mathbb{F}^m defined by T_A(x) = Ax is linear. Conversely, every linear map T : \mathbb{F}^n \to \mathbb{F}^m has the form T = T_A for a unique A \in M_{m \times n}(\mathbb{F}).
Proof. The first claim is Theorem 5.4. For the converse, let T : \mathbb{F}^n \to \mathbb{F}^m be linear and let e_1, \ldots, e_n denote the standard basis vectors of \mathbb{F}^n. By Theorem 4.3, T is determined by T(e_1), \ldots, T(e_n) \in \mathbb{F}^m.
Define A \in M_{m \times n}(\mathbb{F}) by letting its j-th column be T(e_j): A = \begin{pmatrix} | & & | \\ T(e_1) & \cdots & T(e_n) \\ | & & | \end{pmatrix}. For any x = \sum_{j=1}^{n} x_j e_j, linearity gives T(x) = \sum_{j=1}^{n} x_j T(e_j) = \sum_{j=1}^{n} x_j a_j = Ax, so T = T_A. For uniqueness, if T = T_B then Ae_j = T(e_j) = Be_j for all j, so A and B have identical columns, giving A = B. \square
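The construction in this proof is directly implementable: stack the images of the standard basis vectors as columns. A NumPy sketch (the particular map T below is an arbitrary linear example, not fixed by the text):

```python
import numpy as np

def T(x):
    # An arbitrary linear map F^3 -> F^2, chosen for illustration.
    return np.array([x[0] + 2.0 * x[1], 3.0 * x[2] - x[0]])

n = 3
# Build A column by column: the j-th column is T(e_j).
A = np.column_stack([T(np.eye(n)[:, j]) for j in range(n)])

x = np.array([1.0, -2.0, 0.5])
assert np.allclose(T(x), A @ x)   # T = T_A on this input
```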
5.4 Matrix Representations of Linear Maps
We now extend the correspondence from Euclidean spaces to arbitrary finite-dimensional spaces. The key observation: coordinate maps convert abstract spaces into Euclidean spaces, and we already know how to represent linear maps between Euclidean spaces.
Let T : \mathcal{V} \to \mathcal{W} be linear, where \dim \mathcal{V} = n and \dim \mathcal{W} = m. Fix ordered bases \mathcal{B} = \{v_1, \ldots, v_n\} of \mathcal{V} and \mathcal{C} = \{w_1, \ldots, w_m\} of \mathcal{W}.
The diagram \begin{array}{ccc} \mathcal{V} & \xrightarrow{T} & \mathcal{W} \\ \downarrow\scriptstyle{[\cdot]_{\mathcal{B}}} & & \downarrow\scriptstyle{[\cdot]_{\mathcal{C}}} \\ \mathbb{F}^n & \xrightarrow{\widetilde{T}} & \mathbb{F}^m \end{array} displays four maps: the abstract map T at the top, the coordinate maps on the sides, and an unknown map \widetilde{T} at the bottom. We seek \widetilde{T} making the diagram commute—that is, making [\cdot]_{\mathcal{C}} \circ T and \widetilde{T} \circ [\cdot]_{\mathcal{B}} equal as functions \mathcal{V} \to \mathbb{F}^m.
By Theorem 5.2, the vertical maps are isomorphisms. This forces \widetilde{T} = [\cdot]_{\mathcal{C}} \circ T \circ ([\cdot]_{\mathcal{B}})^{-1}. Since \widetilde{T} is a composition of linear maps, it is linear. By Theorem 5.5, there exists a unique matrix A \in M_{m \times n}(\mathbb{F}) with \widetilde{T}(x) = Ax for all x \in \mathbb{F}^n.
To construct A explicitly, recall that T is determined by T(v_1), \ldots, T(v_n). Each image T(v_j) \in \mathcal{W} expands uniquely in basis \mathcal{C}: T(v_j) = \sum_{i=1}^{m} a_{ij} w_i for some scalars a_{ij} \in \mathbb{F}.
Definition 5.5 (Matrix of a linear map) Let T : \mathcal{V} \to \mathcal{W} be linear with bases \mathcal{B} = \{v_1, \ldots, v_n\} of \mathcal{V} and \mathcal{C} = \{w_1, \ldots, w_m\} of \mathcal{W}. The matrix of T with respect to \mathcal{B} and \mathcal{C}, denoted [T]_{\mathcal{B}}^{\mathcal{C}}, is the element of M_{m \times n}(\mathbb{F}) whose j-th column is [T(v_j)]_{\mathcal{C}}: [T]_{\mathcal{B}}^{\mathcal{C}} = \begin{pmatrix} | & & | \\ [T(v_1)]_{\mathcal{C}} & \cdots & [T(v_n)]_{\mathcal{C}} \\ | & & | \end{pmatrix}. Equivalently, the (i,j)-entry is the i-th coordinate of T(v_j) when expressed in basis \mathcal{C}.
Theorem 5.6 (Matrix action formula) Let T : \mathcal{V} \to \mathcal{W} be linear with matrix A = [T]_{\mathcal{B}}^{\mathcal{C}}. Then for every v \in \mathcal{V}, [T(v)]_{\mathcal{C}} = A [v]_{\mathcal{B}}.
Proof. Write v = \sum_{j=1}^{n} c_j v_j so that [v]_{\mathcal{B}} = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix}. By linearity, T(v) = \sum_{j=1}^{n} c_j T(v_j). Applying [\cdot]_{\mathcal{C}} and using linearity from Theorem 5.1, [T(v)]_{\mathcal{C}} = \sum_{j=1}^{n} c_j [T(v_j)]_{\mathcal{C}} = \sum_{j=1}^{n} c_j a_j = A[v]_{\mathcal{B}}, where the last step uses Definition 5.4, since the j-th column of A is [T(v_j)]_{\mathcal{C}} = a_j. \square
The commutative diagram \begin{array}{ccc} \mathcal{V} & \xrightarrow{T} & \mathcal{W} \\ \downarrow\scriptstyle{[\cdot]_{\mathcal{B}}} & & \downarrow\scriptstyle{[\cdot]_{\mathcal{C}}} \\ \mathbb{F}^n & \xrightarrow{A} & \mathbb{F}^m \end{array} commutes: [T(v)]_{\mathcal{C}} = A[v]_{\mathcal{B}} for all v \in \mathcal{V}, or equivalently [\cdot]_{\mathcal{C}} \circ T = A \circ [\cdot]_{\mathcal{B}} as functions \mathcal{V} \to \mathbb{F}^m.
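The matrix action formula can be tested on the differentiation operator from Section 5.1. A NumPy check, in the monomial bases \mathcal{B} = \{1, x, x^2\} and \mathcal{C} = \{1, x\} (coordinate vectors encode polynomials, an illustrative convention):

```python
import numpy as np

# Columns of A are [D(1)]_C = (0,0), [D(x)]_C = (1,0), [D(x^2)]_C = (0,2).
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])

p = np.array([3.0, 2.0, -1.0])   # [3 + 2x - x^2]_B
dp = A @ p                        # should equal [2 - 2x]_C
assert np.allclose(dp, [2.0, -2.0])
```

The matrix product reproduces exactly what D does abstractly: the derivative of 3 + 2x - x^2 is 2 - 2x.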
5.5 Subspaces Associated with Matrices
5.5.1 The Kernel of a Matrix
Definition 5.6 (Kernel of a matrix) Let A \in M_{m \times n}(\mathbb{F}). The kernel (or null space) of A is \ker(A) = \{x \in \mathbb{F}^n : Ax = 0\}.
Theorem 5.7 For any A \in M_{m \times n}(\mathbb{F}), the kernel \ker(A) is a subspace of \mathbb{F}^n.
Proof. Since A \cdot 0 = 0, we have 0 \in \ker(A). If x, y \in \ker(A), then A(x+y) = Ax + Ay = 0, so x + y \in \ker(A). If x \in \ker(A) and c \in \mathbb{F}, then A(cx) = c(Ax) = 0, so cx \in \ker(A). \square
5.5.2 The Column Space of a Matrix
Definition 5.7 (Column space) Let A \in M_{m \times n}(\mathbb{F}) have columns a_1, \ldots, a_n \in \mathbb{F}^m. The column space of A is \operatorname{Col}(A) = \operatorname{span}\{a_1, \ldots, a_n\}.
Theorem 5.8 \operatorname{Col}(A) = \{Ax : x \in \mathbb{F}^n\}.
Proof. By Definition 5.4, Ax = \sum_{j=1}^{n} x_j a_j, so every vector of the form Ax is a linear combination of the columns. Conversely, every linear combination \sum_{j=1}^{n} c_j a_j equals A\begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} and hence lies in \{Ax : x \in \mathbb{F}^n\}. \square
5.5.3 The Row Space of a Matrix
Definition 5.8 (Row space) Let A \in M_{m \times n}(\mathbb{F}) have rows r_1, \ldots, r_m viewed as vectors in \mathbb{F}^n. The row space of A is \operatorname{Row}(A) = \operatorname{span}\{r_1, \ldots, r_m\}.
Theorem 5.9 \operatorname{Row}(A) = \operatorname{Col}(A^T).
Proof. The rows of A are the columns of A^T. \square
5.5.4 Connection to Linear Maps
Theorem 5.10 (Kernel correspondence) Let A = [T]_{\mathcal{B}}^{\mathcal{C}}. Then [\cdot]_{\mathcal{B}} : \ker(T) \to \ker(A) is an isomorphism.
Proof. By Theorem 5.6, [T(v)]_{\mathcal{C}} = A[v]_{\mathcal{B}}. If v \in \ker(T) then T(v) = 0, so A[v]_{\mathcal{B}} = 0 and [v]_{\mathcal{B}} \in \ker(A). Conversely, if x \in \ker(A), let v = ([\cdot]_{\mathcal{B}})^{-1}(x); then [T(v)]_{\mathcal{C}} = Ax = 0, so T(v) = 0 by injectivity of [\cdot]_{\mathcal{C}}, giving v \in \ker(T). Thus [\cdot]_{\mathcal{B}} restricts to a bijection \ker(T) \to \ker(A), which is linear by Theorem 5.1. \square
Theorem 5.11 (Image correspondence) Let A = [T]_{\mathcal{B}}^{\mathcal{C}}. Then [\cdot]_{\mathcal{C}} : \operatorname{im}(T) \to \operatorname{Col}(A) is an isomorphism.
Proof. If w = T(v) \in \operatorname{im}(T), then [w]_{\mathcal{C}} = [T(v)]_{\mathcal{C}} = A[v]_{\mathcal{B}} \in \operatorname{Col}(A). Conversely, if y = Ax \in \operatorname{Col}(A), let v = ([\cdot]_{\mathcal{B}})^{-1}(x); then y = A[v]_{\mathcal{B}} = [T(v)]_{\mathcal{C}}, so y is the coordinate vector of T(v) \in \operatorname{im}(T). \square
5.5.5 Rank and Nullity for Matrices
Definition 5.9 (Rank of a matrix) The rank of A \in M_{m \times n}(\mathbb{F}) is \operatorname{rank}(A) = \dim \operatorname{Col}(A), and the nullity of A is \operatorname{nullity}(A) = \dim \ker(A).
Theorem 5.12 (Rank-nullity theorem for matrices) For A \in M_{m \times n}(\mathbb{F}), n = \operatorname{rank}(A) + \operatorname{nullity}(A).
Proof. Apply Theorem 4.5 to the linear map T : \mathbb{F}^n \to \mathbb{F}^m defined by T(x) = Ax. \square
5.5.6 Example: Differentiation Operator
Consider D : \mathcal{P}_2 \to \mathcal{P}_1 with bases \mathcal{B} = \{1, x, x^2\} and \mathcal{C} = \{1, x\}. Then A = [D]_{\mathcal{B}}^{\mathcal{C}} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}. Solving Ax = 0 gives x_2 = x_3 = 0 with x_1 free, so \ker(A) = \operatorname{span}\left\{\begin{pmatrix}1\\0\\0\end{pmatrix}\right\}—the coordinate image of the constant polynomials. The column space is \operatorname{Col}(A) = \mathbb{F}^2—the coordinate image of \mathcal{P}_1. The rank–nullity theorem confirms 3 = \operatorname{rank}(A) + \operatorname{nullity}(A) = 2 + 1.
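This example can be verified numerically (a NumPy sketch; matrix_rank uses an SVD with a tolerance, a numerical artifact not part of the theory):

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])      # matrix of D in the monomial bases

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank           # rank-nullity: nullity = n - rank
assert rank == 2 and nullity == 1

# (1,0,0), the coordinate vector of the constant polynomial 1, lies in ker(A).
assert np.allclose(A @ np.array([1.0, 0.0, 0.0]), 0.0)
```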
5.6 Matrix Multiplication from Composition
Linear maps compose naturally. If T : \mathcal{U} \to \mathcal{V} and S : \mathcal{V} \to \mathcal{W} are linear, the composition S \circ T : \mathcal{U} \to \mathcal{W} is linear by Theorem 4.9. We derive the matrix of S \circ T from the matrices of S and T.
Let \dim \mathcal{U} = p, \dim \mathcal{V} = n, \dim \mathcal{W} = m, with bases \mathcal{A}, \mathcal{B}, \mathcal{C}. Let B = [T]_{\mathcal{A}}^{\mathcal{B}} \in M_{n \times p}(\mathbb{F}) and A = [S]_{\mathcal{B}}^{\mathcal{C}} \in M_{m \times n}(\mathbb{F}).
For any u \in \mathcal{U}, [(S \circ T)(u)]_{\mathcal{C}} = [S(T(u))]_{\mathcal{C}} = A[T(u)]_{\mathcal{B}} = A(B[u]_{\mathcal{A}}). The expression A(B[u]_{\mathcal{A}}) must equal [S \circ T]_{\mathcal{A}}^{\mathcal{C}}[u]_{\mathcal{A}} for all u. This defines matrix multiplication: the product AB is the unique matrix representing S \circ T.
To compute the entries of AB, apply this to the k-th standard basis vector e_k of \mathbb{F}^p. Then Be_k = b_k, the k-th column of B, and (AB)e_k = Ab_k = A\begin{pmatrix}b_{1k}\\\vdots\\b_{nk}\end{pmatrix} = \sum_{j=1}^{n}b_{jk}a_j, whose i-th entry is \sum_{j=1}^{n} a_{ij}b_{jk}. This is the (i,k)-entry of AB.
Definition 5.10 (Matrix multiplication) Let A \in M_{m \times n}(\mathbb{F}) and B \in M_{n \times p}(\mathbb{F}). The product AB \in M_{m \times p}(\mathbb{F}) has entries (AB)_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk} for 1 \leq i \leq m and 1 \leq k \leq p.
The k-th column of AB is A times the k-th column of B. Matrix multiplication is defined only when the number of columns of A equals the number of rows of B.
Theorem 5.13 With notation as above, [S \circ T]_{\mathcal{A}}^{\mathcal{C}} = [S]_{\mathcal{B}}^{\mathcal{C}} \cdot [T]_{\mathcal{A}}^{\mathcal{B}} = AB.
Proof. For any u \in \mathcal{U}, applying Theorem 5.6 twice gives [(S \circ T)(u)]_{\mathcal{C}} = A[T(u)]_{\mathcal{B}} = A(B[u]_{\mathcal{A}}). By Definition 5.10, A(Bx) = (AB)x for any x \in \mathbb{F}^p: the i-th entry of A(Bx) is \sum_j a_{ij}(Bx)_j = \sum_j a_{ij}\sum_k b_{jk}x_k = \sum_k (AB)_{ik}x_k. Thus [(S \circ T)(u)]_{\mathcal{C}} = (AB)[u]_{\mathcal{A}} for all u \in \mathcal{U}. Since coordinate maps are isomorphisms, [S \circ T]_{\mathcal{A}}^{\mathcal{C}} = AB. \square
Matrix multiplication encodes composition. The order matters: AB corresponds to “first apply T (matrix B), then S (matrix A).” The products AB and BA generally differ.
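Both noncommutativity and the column description of AB admit quick checks on small examples (NumPy, with arbitrarily chosen A and B):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
B = np.array([[1.0, 0.0],
              [1.0, 1.0]])

AB, BA = A @ B, B @ A
assert not np.allclose(AB, BA)        # order matters: AB != BA here

# The k-th column of AB is A times the k-th column of B.
for k in range(B.shape[1]):
    assert np.allclose(AB[:, k], A @ B[:, k])
```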
Theorem 5.14 Matrix multiplication satisfies the following, where dimensions are assumed compatible:
1. (Associativity) (AB)C = A(BC)
2. (Distributivity) A(B + C) = AB + AC and (A + B)C = AC + BC
3. (Scalar compatibility) (cA)B = c(AB) = A(cB) for any c \in \mathbb{F}
Proof. All three identities are verified entry-by-entry using Definition 5.10.
For (1), the (i,\ell)-entry of (AB)C is \sum_k \left(\sum_j a_{ij}b_{jk}\right)c_{k\ell} = \sum_j a_{ij}\left(\sum_k b_{jk}c_{k\ell}\right), which is the (i,\ell)-entry of A(BC). The rearrangement is justified by finiteness of the sums.
For (2), the (i,k)-entry of A(B+C) is \sum_j a_{ij}(b_{jk}+c_{jk}) = \sum_j a_{ij}b_{jk} + \sum_j a_{ij}c_{jk}, which equals (AB)_{ik} + (AC)_{ik}. The second distributivity law follows similarly.
For (3), (cA)_{ij} = ca_{ij}, so ((cA)B)_{ik} = \sum_j ca_{ij}b_{jk} = c(AB)_{ik}. \square
Definition 5.11 (Identity matrix) The identity matrix I_n \in M_n(\mathbb{F}) has entries (I_n)_{ij} = \delta_{ij}—ones on the diagonal, zeros elsewhere.
Theorem 5.15 For any A \in M_{m \times n}(\mathbb{F}), we have I_m A = A and A I_n = A.
Proof. The (i,k)-entry of I_m A is \sum_j \delta_{ij}a_{jk} = a_{ik}. Similarly for AI_n. \square
5.7 Block Matrices and Partitioned Operations
A block matrix is a matrix subdivided into rectangular submatrices. For instance, an m \times n matrix M may be written as M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, where A \in M_{p \times q}(\mathbb{F}), B \in M_{p \times (n-q)}(\mathbb{F}), C \in M_{(m-p) \times q}(\mathbb{F}), and D \in M_{(m-p) \times (n-q)}(\mathbb{F}).
Theorem 5.16 Let M, N \in M_{m \times n}(\mathbb{F}) be partitioned compatibly. Then M + N = \begin{pmatrix} A + A' & B + B' \\ C + C' & D + D' \end{pmatrix}, \quad cM = \begin{pmatrix} cA & cB \\ cC & cD \end{pmatrix}.
Proof. Both identities follow immediately from componentwise addition and scalar multiplication. \square
Theorem 5.17 Let M \in M_{m \times n}(\mathbb{F}) and N \in M_{n \times p}(\mathbb{F}) be partitioned as M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \quad N = \begin{pmatrix} E & F \\ G & H \end{pmatrix}, where the column partition of M matches the row partition of N. Then MN = \begin{pmatrix} AE + BG & AF + BH \\ CE + DG & CF + DH \end{pmatrix}.
Proof. The (i,k)-entry of MN is \sum_j m_{ij}n_{jk}. Partition the sum at j = n_1 (the column cut of M, equivalently the row cut of N): (MN)_{ik} = \sum_{j=1}^{n_1}m_{ij}n_{jk} + \sum_{j=n_1+1}^{n}m_{ij}n_{jk}. The first sum is the (i,k)-entry of the product of the left block-column of M with the top block-row of N; the second sum is the product of the right block-column with the bottom block-row. Grouping by which output block (i,k) falls in gives AE+BG, AF+BH, CE+DG, CF+DH as claimed. \square
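Theorem 5.17 can be verified numerically by slicing a random product (a NumPy sketch; the matrix sizes and partition cuts are arbitrary choices for the illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 6))
N = rng.standard_normal((6, 5))
r, c, f = 2, 3, 2   # row cut of M, shared cut (cols of M = rows of N), column cut of N

A_, B_ = M[:r, :c], M[:r, c:]
C_, D_ = M[r:, :c], M[r:, c:]
E_, F_ = N[:c, :f], N[:c, f:]
G_, H_ = N[c:, :f], N[c:, f:]

# Assemble the block formula and compare with the ordinary product.
block = np.block([[A_ @ E_ + B_ @ G_, A_ @ F_ + B_ @ H_],
                  [C_ @ E_ + D_ @ G_, C_ @ F_ + D_ @ H_]])
assert np.allclose(block, M @ N)
```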
5.7.1 Block Diagonal Matrices
A block diagonal matrix M = A_1 \oplus \cdots \oplus A_k has square diagonal blocks A_i and zeros off the diagonal.
Theorem 5.18 Let M = A_1 \oplus \cdots \oplus A_k and N = B_1 \oplus \cdots \oplus B_k have matching block dimensions. Then:
1. M + N = (A_1 + B_1) \oplus \cdots \oplus (A_k + B_k)
2. MN = (A_1 B_1) \oplus \cdots \oplus (A_k B_k)
3. M is invertible if and only if each A_i is invertible, in which case M^{-1} = A_1^{-1} \oplus \cdots \oplus A_k^{-1}
Proof. (1) follows from Theorem 5.16. (2) follows from Theorem 5.17: the (i,j)-block of MN is \sum_\ell (M)_{i\ell}(N)_{\ell j}, and since M and N are block diagonal, the only nonzero contribution is \ell = i = j, giving (MN)_{ii} = A_iB_i. For (3), if each A_i is invertible, then by (2), (A_1^{-1} \oplus \cdots \oplus A_k^{-1})(A_1 \oplus \cdots \oplus A_k) = (A_1^{-1}A_1) \oplus \cdots \oplus (A_k^{-1}A_k) = I, and similarly in reverse order, so M is invertible.
Conversely, suppose M is invertible, so the map x \mapsto Mx is bijective on \mathbb{F}^n = \mathbb{F}^{d_1} \oplus \cdots \oplus \mathbb{F}^{d_k}. Fix any i and suppose A_i y = 0 for some y \in \mathbb{F}^{d_i}. Let x \in \mathbb{F}^n be the vector with y in the i-th block component and zeros elsewhere. Then Mx = 0, so injectivity of M forces x = 0, hence y = 0. Thus A_i is injective. Since A_i is a square matrix over \mathbb{F} (finite-dimensional), injectivity implies invertibility. \square
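A small NumPy check of the inverse formula for block diagonal matrices (the blocks A_1, A_2 are arbitrary invertible examples):

```python
import numpy as np

A1 = np.array([[2.0, 1.0],
               [1.0, 1.0]])           # invertible 2x2 block (det = 1)
A2 = np.array([[3.0]])                # invertible 1x1 block

Z = np.zeros((2, 1))
M = np.block([[A1, Z],
              [Z.T, A2]])             # M = A1 (+) A2, block diagonal

# The blockwise inverse agrees with the ordinary inverse.
Minv = np.block([[np.linalg.inv(A1), Z],
                 [Z.T, np.linalg.inv(A2)]])
assert np.allclose(M @ Minv, np.eye(3))
assert np.allclose(Minv, np.linalg.inv(M))
```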
5.7.2 Block Upper Triangular Matrices
A matrix is block upper triangular if all blocks below the diagonal are zero. The corresponding linear-algebraic structure is an invariant subspace.
Theorem 5.19 Let T : \mathcal{V} \to \mathcal{V} be a linear operator and \mathcal{U} \subseteq \mathcal{V} a subspace with T(\mathcal{U}) \subseteq \mathcal{U}. Choose a basis \{u_1, \ldots, u_k\} of \mathcal{U} and extend to a basis \{u_1, \ldots, u_k, v_1, \ldots, v_r\} of \mathcal{V}. Then [T] = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}, where A \in M_k(\mathbb{F}) represents T|_{\mathcal{U}}.
Proof. For j \leq k, T(u_j) \in \mathcal{U} by invariance, so T(u_j) = \sum_{i=1}^{k} a_{ij} u_i. The j-th column of [T] thus has zeros in rows k+1 through k+r, producing the lower-left zero block. \square
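Returning to differentiation, now viewed as an operator on \mathcal{P}_2: the subspace \mathcal{U} = \mathcal{P}_1 is D-invariant, since D(1) = 0 and D(x) = 1 both lie in \mathcal{U}, so Theorem 5.19 predicts a zero lower-left block. A NumPy check (monomial-coefficient encoding assumed, as in earlier illustrations):

```python
import numpy as np

# D as an operator P_2 -> P_2 in the basis {1, x, x^2}.
T = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])

k = 2                                  # dim U, with U = span{1, x}
assert np.allclose(T[k:, :k], 0.0)     # lower-left block is zero
A_block = T[:k, :k]                    # represents the restriction D|_U
assert np.allclose(A_block, [[0.0, 1.0], [0.0, 0.0]])
```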
5.8 Invertibility and Isomorphisms
Definition 5.12 (Invertible matrix) A matrix A \in M_n(\mathbb{F}) is invertible if there exists B \in M_n(\mathbb{F}) with AB = BA = I_n. The matrix B is the inverse of A, denoted A^{-1}.
Theorem 5.20 The inverse of an invertible matrix is unique.
Proof. Suppose AB = BA = I_n and AC = CA = I_n. Then B = BI_n = B(AC) = (BA)C = I_nC = C. \quad \square
Theorem 5.21 Let T : \mathcal{V} \to \mathcal{V} be a linear operator on a finite-dimensional space, \mathcal{B} a basis of \mathcal{V}, and A = [T]_{\mathcal{B}}^{\mathcal{B}}. Then T is an isomorphism if and only if A is invertible. Moreover, [T^{-1}]_{\mathcal{B}}^{\mathcal{B}} = A^{-1}.
Proof. Suppose T is an isomorphism with inverse S. Let B = [S]_{\mathcal{B}}^{\mathcal{B}}. By Theorem 5.13, the matrix of S \circ T = I_{\mathcal{V}} is BA, and the matrix of T \circ S = I_{\mathcal{V}} is AB. The identity map I_{\mathcal{V}} has matrix I_n relative to any basis (since I_{\mathcal{V}}(v_j) = v_j for all j). Thus BA = AB = I_n, so A is invertible with A^{-1} = B = [T^{-1}]_{\mathcal{B}}^{\mathcal{B}}.
Conversely, suppose A is invertible. Let S : \mathcal{V} \to \mathcal{V} be the unique linear map with [S]_{\mathcal{B}}^{\mathcal{B}} = A^{-1}. Then by Theorem 5.13, the matrices of S \circ T and T \circ S are A^{-1}A = I_n and AA^{-1} = I_n, so S \circ T = I_{\mathcal{V}} and T \circ S = I_{\mathcal{V}}. Thus T is an isomorphism with inverse S. \square
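Theorem 5.21 can be illustrated with the operator T(p) = p + p' on \mathcal{P}_1, which is invertible (a NumPy sketch; the choice of operator and of the basis \{1, x\} are assumptions of the example):

```python
import numpy as np

# T(1) = 1 and T(x) = 1 + x, so in the basis {1, x}:
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

Ainv = np.linalg.inv(A)
assert np.allclose(Ainv, [[1.0, -1.0], [0.0, 1.0]])
assert np.allclose(A @ Ainv, np.eye(2))

# Ainv represents T^{-1}: applied to [1 + x]_B it recovers [x]_B,
# consistent with T(x) = 1 + x.
assert np.allclose(Ainv @ np.array([1.0, 1.0]), [0.0, 1.0])
```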
Theorem 5.22 For A \in M_n(\mathbb{F}), the following are equivalent:
1. A is invertible
2. The map x \mapsto Ax is an isomorphism \mathbb{F}^n \to \mathbb{F}^n
3. The columns of A form a basis of \mathbb{F}^n
4. The columns of A are linearly independent
5. The columns of A span \mathbb{F}^n
6. \ker(A) = \{0\}
7. For every b \in \mathbb{F}^n, the equation Ax = b has a unique solution
Proof. (1) \Leftrightarrow (2) is Theorem 5.21. The columns of A are Ae_1,\ldots, Ae_n, the images of the standard basis \{e_1,\ldots,e_n\} under T : x \mapsto Ax.
(2) \Leftrightarrow (3): If T is an isomorphism then \ker(T) = \{0\} and \operatorname{im}(T) = \mathbb{F}^n, so the columns are linearly independent and span \mathbb{F}^n—hence a basis. Conversely, if the columns form a basis then they are independent (so \ker(T) = \{0\}, giving injectivity by Theorem 4.6) and spanning (so T is surjective), hence T is an isomorphism.
The equivalences (3) \Leftrightarrow (4) \Leftrightarrow (5) follow from Theorem 3.6. The equivalence (2) \Leftrightarrow (6) is Theorem 4.6. The equivalence (2) \Leftrightarrow (7) is bijectivity: existence of a solution is surjectivity, uniqueness is injectivity. \square
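Several conditions in the list reduce to a rank computation, which makes them convenient to test numerically (a NumPy sketch; matrix_rank uses a floating-point tolerance, so this is a numerical check, not a proof, and the helper function below is an illustration, not standard API):

```python
import numpy as np

def invertibility_checks(A):
    """Numerically test conditions (4), (5), (6) of the equivalence list."""
    n = A.shape[0]
    rank = np.linalg.matrix_rank(A)
    cols_independent = rank == n       # (4): columns linearly independent
    cols_span = rank == n              # (5): columns span F^n
    trivial_kernel = (n - rank) == 0   # (6): ker(A) = {0}, by rank-nullity
    return cols_independent and cols_span and trivial_kernel

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # invertible (det = -2)
S = np.array([[1.0, 2.0], [2.0, 4.0]])   # rank 1, not invertible

assert invertibility_checks(A)
assert not invertibility_checks(S)
```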
Definition 5.13 (General linear group) The set \mathrm{GL}_n(\mathbb{F}) = \{ A \in M_n(\mathbb{F}) : A \text{ is invertible} \} is called the general linear group of degree n over \mathbb{F}.
Theorem 5.23 \mathrm{GL}_n(\mathbb{F}) is closed under matrix multiplication, contains I_n, and every element has an inverse in \mathrm{GL}_n(\mathbb{F}). These properties make it a group under matrix multiplication.
Proof. If A, B \in \mathrm{GL}_n(\mathbb{F}), then (AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AA^{-1} = I_n, so AB is invertible with (AB)^{-1} = B^{-1}A^{-1}. The remaining group axioms are immediate. \square
5.9 The Transpose and Dual Maps
Definition 5.14 (Matrix transpose) Let A \in M_{m \times n}(\mathbb{F}). The transpose A^T \in M_{n \times m}(\mathbb{F}) has entries (A^T)_{ji} = a_{ij}.
Transposition interchanges rows and columns: the i-th row of A becomes the i-th column of A^T.
Theorem 5.24 For matrices of appropriate dimensions and c \in \mathbb{F}:
1. (A^T)^T = A
2. (A + B)^T = A^T + B^T
3. (cA)^T = cA^T
4. (AB)^T = B^T A^T
Proof. Properties (1)–(3) follow immediately from Definition 5.14. For (4), let A \in M_{m \times n}(\mathbb{F}) and B \in M_{n \times p}(\mathbb{F}). For any 1 \leq k \leq p and 1 \leq i \leq m, ((AB)^T)_{ki} = (AB)_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk} = \sum_{j=1}^{n} (A^T)_{ji}(B^T)_{kj} = (B^T A^T)_{ki}. \quad \square
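Property (4) is easy to confirm on random matrices (NumPy; the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# Transposition reverses the product and swaps the dimensions.
assert np.allclose((A @ B).T, B.T @ A.T)
assert (A @ B).T.shape == (2, 3)
```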
Note that transposition reverses products: (AB)^T = B^T A^T, mirroring the reversal (S \circ T)^* = T^* \circ S^* for dual maps.
Theorem 5.25 Let \mathcal{B} = \{v_1, \ldots, v_n\} be a basis of \mathcal{V} and \mathcal{B}^* = \{\varphi_1, \ldots, \varphi_n\} the dual basis. For any \psi \in \mathcal{V}^*, [\psi]_{\mathcal{B}^*} = \begin{pmatrix} \psi(v_1) \\ \vdots \\ \psi(v_n) \end{pmatrix}.
Proof. Write \psi = \sum_{i=1}^{n} c_i \varphi_i. Evaluating on v_j gives \psi(v_j) = \sum_i c_i \delta_{ij} = c_j. So the j-th coordinate of \psi in \mathcal{B}^* is \psi(v_j). \square
Theorem 5.26 Let T : \mathcal{V} \to \mathcal{W} be linear with A = [T]_{\mathcal{B}}^{\mathcal{C}}. Then [T^*]_{\mathcal{C}^*}^{\mathcal{B}^*} = A^T.
Proof. Let \psi_k \in \mathcal{C}^*. By Theorem 5.25, the j-th entry of [T^*\psi_k]_{\mathcal{B}^*} is (T^*\psi_k)(v_j) = \psi_k(T(v_j)). Since T(v_j) = \sum_i a_{ij}w_i, \psi_k(T(v_j)) = \sum_{i=1}^{m} a_{ij}\psi_k(w_i) = \sum_{i=1}^{m} a_{ij}\delta_{ki} = a_{kj} = (A^T)_{jk}. Thus the k-th column of [T^*]_{\mathcal{C}^*}^{\mathcal{B}^*} equals the k-th column of A^T. \square
5.10 The Space of Matrices and Linear Maps
Theorem 5.27 Fix bases \mathcal{B} of \mathcal{V} and \mathcal{C} of \mathcal{W}. The map \Phi : \mathcal{L}(\mathcal{V}, \mathcal{W}) \to M_{m \times n}(\mathbb{F}), \quad T \mapsto [T]_{\mathcal{B}}^{\mathcal{C}} is a vector space isomorphism.
Proof. For linearity, the j-th column of [S+T]_{\mathcal{B}}^{\mathcal{C}} is [(S+T)(v_j)]_{\mathcal{C}} = [S(v_j)]_{\mathcal{C}} + [T(v_j)]_{\mathcal{C}} by Theorem 5.1, so \Phi(S+T) = \Phi(S)+\Phi(T). Similarly \Phi(cT) = c\Phi(T).
For injectivity, if [T]_{\mathcal{B}}^{\mathcal{C}} = 0 then [T(v_j)]_{\mathcal{C}} = 0 for all j, so T(v_j) = 0 for all j by injectivity of [\cdot]_{\mathcal{C}}. By Theorem 4.3, T = 0.
For surjectivity, given A \in M_{m \times n}(\mathbb{F}), define T : \mathcal{V} \to \mathcal{W} by T(v_j) = ([\cdot]_{\mathcal{C}})^{-1}(a_j) and extend linearly. Then [T]_{\mathcal{B}}^{\mathcal{C}} = A by construction. \square
Corollary 5.1 \dim \mathcal{L}(\mathcal{V}, \mathcal{W}) = mn.
Proof. By Theorem 5.27, \mathcal{L}(\mathcal{V},\mathcal{W}) \cong M_{m \times n}(\mathbb{F}), which has dimension mn by Theorem 5.3. \square
5.11 Closing Remarks
This chapter constructed the bridge between abstract linear algebra and computational matrix algebra. Coordinate maps [\cdot]_{\mathcal{B}} convert vectors into column vectors; matrix representations [T]_{\mathcal{B}}^{\mathcal{C}} convert linear maps into rectangular arrays. The fundamental identity [T(v)]_{\mathcal{C}} = [T]_{\mathcal{B}}^{\mathcal{C}} [v]_{\mathcal{B}} ensures that applying T abstractly corresponds exactly to matrix multiplication in coordinates. This correspondence allows us to translate between the two perspectives: subspaces associated with T correspond to subspaces associated with [T]_{\mathcal{B}}^{\mathcal{C}}, composition of maps corresponds to matrix multiplication, and so on. The coming chapters will leverage this machinery to study determinants, eigenvalues, and diagonalization.