5  Matrices and Coordinate Representations

5.1 The Problem of Computation

The theory developed in Chapters 1–3 establishes that a linear map T : \mathcal{V} \to \mathcal{W} is uniquely determined by its values on a basis. By Theorem 4.3, if \mathcal{B} = \{v_1, \ldots, v_n\} is a basis of \mathcal{V}, then specifying T(v_1), \ldots, T(v_n) \in \mathcal{W} determines T completely. Linearity forces T\left(\sum_{j=1}^{n} c_j v_j\right) = \sum_{j=1}^{n} c_j T(v_j) for any scalars c_1, \ldots, c_n \in \mathbb{F}.

This reduces the infinite problem—defining T on all vectors in \mathcal{V}—to the finite problem of specifying n vectors in \mathcal{W}. But “specifying a vector” is itself an infinite datum when \mathcal{W} is abstract. To perform calculations, we require numerical descriptions.

Consider the differentiation operator D : \mathcal{P}_2 \to \mathcal{P}_1 sending a polynomial to its derivative. We know D(1) = 0, D(x) = 1, and D(x^2) = 2x. These three equations determine D on all of \mathcal{P}_2. But to compute D(3 + 2x - x^2) systematically—especially when composing with other maps, inverting, or solving equations—we need more than abstract knowledge. We need a calculus of operations.

The solution is coordinatization. Once we choose a basis \mathcal{B} for \mathcal{V} and a basis \mathcal{C} for \mathcal{W}, every vector acquires a numerical address, and every linear map acquires a numerical encoding. The abstract theory becomes concrete algebra.

This chapter develops the machinery. We construct coordinate maps, define matrices as recordings of linear transformations in coordinates, derive the rules for manipulating these matrices, and establish the correspondence between abstract maps and their coordinate representations. The central tension—the interplay between intrinsic structure and coordinate description—governs everything that follows.


5.2 Coordinate Maps

Let \mathcal{V} be a finite-dimensional vector space over a field \mathbb{F} with \dim \mathcal{V} = n. Throughout this chapter, \mathbb{F} denotes either \mathbb{R} or \mathbb{C} unless otherwise specified, though the constructions apply to arbitrary fields.

Fix an ordered basis \mathcal{B} = \{v_1, \ldots, v_n\} of \mathcal{V}. By Definition 3.5, every vector v \in \mathcal{V} admits a unique representation v = \sum_{j=1}^{n} a_j v_j for some scalars a_1, \ldots, a_n \in \mathbb{F}. The coefficients (a_1, \ldots, a_n) identify v relative to \mathcal{B}. Different choices of v yield different coefficient sequences; different choices of basis yield different coefficients for the same v.

We require a standard numerical container for these coefficients. The natural candidate is \mathbb{F}^n, but we must specify how to arrange the n scalars. Convention dictates columns rather than rows.

Definition 5.1 (Column vector) An element of \mathbb{F}^n written as \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} is called a column vector. The entry a_i occupies the i-th position from the top.

The choice of columns over rows is not arbitrary. It ensures compatibility with matrix-vector multiplication, which we define shortly. Row vectors will appear later when we study dual spaces and linear functionals.

Definition 5.2 (Coordinate map) Let \mathcal{B} = \{v_1, \ldots, v_n\} be an ordered basis of \mathcal{V}. The coordinate map with respect to \mathcal{B} is the function [\cdot]_{\mathcal{B}} : \mathcal{V} \to \mathbb{F}^n defined as follows: if v = \sum_{j=1}^{n} a_j v_j, then [v]_{\mathcal{B}} = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}. The column vector [v]_{\mathcal{B}} is called the coordinate vector of v with respect to \mathcal{B}.

The ordering of the basis is essential. Reordering the vectors in \mathcal{B} permutes the entries of [v]_{\mathcal{B}}. This dependence on ordering distinguishes ordered bases from unordered sets.

The coordinate map translates abstract vectors into Euclidean space. It converts questions about \mathcal{V}—a space whose elements may be polynomials, functions, sequences, or formal symbols—into questions about \mathbb{F}^n, where calculation is explicit.
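For a concrete basis, the coordinate map can be computed directly. The sketch below is an illustration, not part of the formal development: it uses a made-up triangular basis of \mathbb{R}^3 so that the coefficients a_j can be recovered by back-substitution; a general basis would require the linear-system machinery of Chapter 6.

```python
# Coordinate map [.]_B for an example triangular basis of R^3.
# The matrix whose columns are the basis vectors is upper triangular,
# so back-substitution (bottom component first) recovers the coefficients.

basis = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]   # ordered basis B = {v1, v2, v3}

def coords(v, basis):
    """Return [v]_B, i.e. solve a1*v1 + a2*v2 + a3*v3 = v."""
    n = len(basis)
    a = [0.0] * n
    for i in reversed(range(n)):            # components from bottom to top
        s = sum(a[j] * basis[j][i] for j in range(i + 1, n))
        a[i] = (v[i] - s) / basis[i][i]
    return a
```

For v = (6, 3, 1) this yields [v]_{\mathcal{B}} = (3, 2, 1), and one checks directly that 3v_1 + 2v_2 + v_3 = v.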

Theorem 5.1 The coordinate map [\cdot]_{\mathcal{B}} : \mathcal{V} \to \mathbb{F}^n is linear.

Proof. Let u, w \in \mathcal{V} with u = \sum_{j=1}^{n} a_j v_j and w = \sum_{j=1}^{n} b_j v_j. Then u + w = \sum_{j=1}^{n} (a_j + b_j) v_j, so [u + w]_{\mathcal{B}} = \begin{pmatrix} a_1 + b_1 \\ \vdots \\ a_n + b_n \end{pmatrix} = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} + \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} = [u]_{\mathcal{B}} + [w]_{\mathcal{B}}. For scalar multiplication, if c \in \mathbb{F} then cu = \sum_{j=1}^{n} (ca_j) v_j, giving [cu]_{\mathcal{B}} = \begin{pmatrix} ca_1 \\ \vdots \\ ca_n \end{pmatrix} = c \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} = c[u]_{\mathcal{B}}. \quad \square

The linearity of [\cdot]_{\mathcal{B}} reflects a fundamental principle: coordinatization preserves vector space structure. Addition and scalar multiplication in \mathcal{V} correspond exactly to the componentwise operations in \mathbb{F}^n.

Theorem 5.2 The coordinate map [\cdot]_{\mathcal{B}} : \mathcal{V} \to \mathbb{F}^n is an isomorphism of vector spaces.

Proof. We verify bijectivity. For injectivity, suppose [v]_{\mathcal{B}} = 0. Then all coefficients a_j in the expansion v = \sum_{j=1}^{n} a_j v_j vanish, giving v = 0 by linear independence of \mathcal{B}. Thus \ker([\cdot]_{\mathcal{B}}) = \{0\}, and by Theorem 4.6, the map is injective.

For surjectivity, let \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} \in \mathbb{F}^n be arbitrary. Define v = \sum_{j=1}^{n} c_j v_j \in \mathcal{V}. Then [v]_{\mathcal{B}} = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} by construction, establishing surjectivity.

Combining injectivity, surjectivity, and linearity from Theorem 5.1 gives the result. \square

This theorem justifies the claim from Chapter 3: every n-dimensional vector space over \mathbb{F} is isomorphic to \mathbb{F}^n. But the isomorphism is not canonical—it depends on the choice of basis. Different bases produce different isomorphisms, all equally valid.

The coordinate map is our gateway to computation. Abstract problems in \mathcal{V} translate to concrete problems in \mathbb{F}^n, where algorithms and numerical methods apply. But we must remember: \mathcal{V} exists independently of coordinates. The basis \mathcal{B} is a tool we impose, not an intrinsic structure.


5.3 Matrices as Arrays of Scalars

Before representing linear maps, we establish notation for rectangular arrays of scalars. The development is systematic: we define the set of matrices, specify indexing conventions, introduce matrix-vector multiplication, and verify its linearity. Only then do we connect matrices to linear maps.

Definition 5.3 (Matrix space) Let m, n \in \mathbb{N}. The set M_{m \times n}(\mathbb{F}) consists of all rectangular arrays A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} where a_{ij} \in \mathbb{F} for all 1 \leq i \leq m and 1 \leq j \leq n. An element A \in M_{m \times n}(\mathbb{F}) is called an m \times n matrix over \mathbb{F}. The scalar a_{ij} is the (i,j)-entry of A.

The dimensions m \times n specify “m rows by n columns.” The first index i identifies the row; the second index j identifies the column. Thus a_{ij} resides in row i, column j. This convention—“row-column” ordering—is universal.

We often suppress the field \mathbb{F} when context makes it clear, writing M_{m \times n} for M_{m \times n}(\mathbb{R}) or leaving the field implicit. When m = n, the matrix is square, and we write M_n(\mathbb{F}) for M_{n \times n}(\mathbb{F}).

Matrices admit componentwise addition and scalar multiplication. For A, B \in M_{m \times n}(\mathbb{F}) and c \in \mathbb{F}, define (A + B)_{ij} = a_{ij} + b_{ij}, \quad (cA)_{ij} = c a_{ij}. The zero matrix 0 \in M_{m \times n}(\mathbb{F}) has all entries equal to zero. The additive inverse of A is -A with entries (-A)_{ij} = -a_{ij}.

Theorem 5.3 M_{m \times n}(\mathbb{F}) is a vector space under componentwise addition and scalar multiplication. Its dimension is mn.

Proof. Verification of the vector space axioms reduces to properties of \mathbb{F}. For instance, commutativity of addition follows from (A + B)_{ij} = a_{ij} + b_{ij} = b_{ij} + a_{ij} = (B + A)_{ij} for all i, j. The remaining axioms follow similarly.

For dimension, consider the matrices E_{ij} \in M_{m \times n}(\mathbb{F}) defined by (E_{ij})_{k\ell} = \delta_{ik}\delta_{j\ell}.

Thus E_{ij} has entry 1 in position (i,j) and zeros elsewhere. There are mn such matrices. Every A \in M_{m \times n}(\mathbb{F}) expands uniquely as A = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} E_{ij}. Linear independence of \{E_{ij}\} is immediate: if \sum_{i,j} c_{ij} E_{ij} = 0, then evaluating the (k,\ell)-entry gives c_{k\ell} = 0. Thus \{E_{ij}\} is a basis of M_{m \times n}(\mathbb{F}). \square
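The expansion in the proof can be verified mechanically. A minimal sketch, using plain Python lists for 2 \times 3 matrices (the example matrix is an assumption for illustration):

```python
# The matrix units E_ij of M_{2x3}: entry 1 at position (i, j), zeros elsewhere.
# Any matrix A rebuilds from the expansion A = sum_{i,j} a_ij * E_ij.

m, n = 2, 3

def E(i, j):
    """Matrix unit E_ij as a list of rows."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(m)]

def expand(A):
    """Rebuild A from sum_{i,j} A[i][j] * E_ij."""
    S = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            Eij = E(i, j)
            for r in range(m):
                for c in range(n):
                    S[r][c] += A[i][j] * Eij[r][c]
    return S
```

Running `expand` on any 2 \times 3 matrix returns the matrix itself, which is exactly the statement that \{E_{ij}\} spans M_{2 \times 3}.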

The structure of M_{m \times n}(\mathbb{F}) as a vector space will prove useful later. For now, we focus on how matrices interact with column vectors.

The j-th column of A is the column vector a_j = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix} \in \mathbb{F}^m. We often write A = \begin{pmatrix} | & | & & | \\ a_1 & a_2 & \cdots & a_n \\ | & | & & | \end{pmatrix} to emphasize the column structure. This notation proves essential when defining matrix-vector multiplication.

The i-th row of A consists of the entries a_{i1}, a_{i2}, \ldots, a_{in} read from left to right. We denote the i-th row as (a_{i1} \; a_{i2} \; \cdots \; a_{in}), written horizontally to distinguish it from a column vector.

Definition 5.4 (Matrix-vector product) Let A \in M_{m \times n}(\mathbb{F}) and x \in \mathbb{F}^n. Write x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} and A = \begin{pmatrix} | & & | \\ a_1 & \cdots & a_n \\ | & & | \end{pmatrix} where a_j denotes the j-th column of A. The product Ax \in \mathbb{F}^m is defined by Ax = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n = \sum_{j=1}^{n} x_j a_j.

Alternatively, the i-th component of Ax is (Ax)_i = \sum_{j=1}^{n} a_{ij} x_j. Both formulations are equivalent; we prefer the column-based definition because it emphasizes that Ax is a linear combination of the columns of A.
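Both formulations translate directly into code. A sketch (an illustration, not a library routine) that computes Ax each way, so the two formulas can be compared:

```python
# Matrix-vector product two ways (Definition 5.4): as a linear combination
# of the columns of A, and componentwise via (Ax)_i = sum_j a_ij x_j.

def matvec_columns(A, x):
    """Ax = x_1 a_1 + ... + x_n a_n (combination of columns)."""
    m, n = len(A), len(A[0])
    y = [0] * m
    for j in range(n):
        for i in range(m):
            y[i] += x[j] * A[i][j]
    return y

def matvec_rows(A, x):
    """(Ax)_i = sum_j a_ij x_j (row-by-row dot products)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
```

For A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix} and x = (1, -1), both routines return (-1, -1, -1).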

The choice to place A on the left—writing Ax rather than xA—reflects the convention that linear maps act on column vectors. When we introduce matrix multiplication in Section 5.6, this convention ensures compatibility with composition of maps.

Theorem 5.4 For fixed A \in M_{m \times n}(\mathbb{F}), the map \mathbb{F}^n \to \mathbb{F}^m defined by x \mapsto Ax is linear.

Proof. Let x, y \in \mathbb{F}^n and c \in \mathbb{F}. For each component i, (A(x + y))_i = \sum_{j=1}^{n} a_{ij}(x_j + y_j) = \sum_{j=1}^{n} a_{ij}x_j + \sum_{j=1}^{n}a_{ij} y_j = (Ax)_i + (Ay)_i, so A(x + y) = Ax + Ay. For scalar multiplication, (A(cx))_i = \sum_{j=1}^{n} a_{ij}(c x_j) = c \sum_{j=1}^{n} a_{ij}x_j = c(Ax)_i, so A(cx) = c(Ax). \quad \square

Thus every matrix A \in M_{m \times n}(\mathbb{F}) determines a linear map \mathbb{F}^n \to \mathbb{F}^m via x \mapsto Ax. Conversely, every linear map \mathbb{F}^n \to \mathbb{F}^m arises this way, as we now establish.

Theorem 5.5 For every A \in M_{m \times n}(\mathbb{F}), the map T_A : \mathbb{F}^n \to \mathbb{F}^m defined by T_A(x) = Ax is linear. Conversely, every linear map T : \mathbb{F}^n \to \mathbb{F}^m has the form T = T_A for a unique A \in M_{m \times n}(\mathbb{F}).

Proof. The first claim is Theorem 5.4. For the converse, let T : \mathbb{F}^n \to \mathbb{F}^m be linear. Denote the standard basis vectors of \mathbb{F}^n by e_j = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \leftarrow \text{$j$-th position}, for j = 1, \ldots, n. By Theorem 4.3, T is determined by T(e_1), \ldots, T(e_n) \in \mathbb{F}^m.

Define A \in M_{m \times n}(\mathbb{F}) by letting its j-th column be T(e_j): A = \begin{pmatrix} | & & | \\ T(e_1) & \cdots & T(e_n) \\ | & & | \end{pmatrix}. For any x = \sum_{j=1}^{n} x_j e_j \in \mathbb{F}^n, linearity gives T(x) = \sum_{j=1}^{n} x_j T(e_j) = \sum_{j=1}^{n} x_j a_j = Ax, where a_j denotes the j-th column of A. Thus T = T_A.

For uniqueness, suppose T = T_B for some B \in M_{m \times n}(\mathbb{F}). Then Ae_j = Be_j for all j. But Ae_j is the j-th column of A and Be_j is the j-th column of B, so A and B have identical columns, giving A = B. \square

This theorem establishes a bijective correspondence between M_{m \times n}(\mathbb{F}) and \mathcal{L}(\mathbb{F}^n, \mathbb{F}^m), the space of linear maps from \mathbb{F}^n to \mathbb{F}^m. Every matrix encodes a linear map; every linear map on Euclidean spaces is encoded by a unique matrix.
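The construction in the proof of Theorem 5.5 is algorithmic: apply T to each standard basis vector and record the results as columns. A sketch below, where the particular map T is a made-up example, not one from the text:

```python
# Recover the matrix of a linear map T: F^n -> F^m (Theorem 5.5):
# the j-th column of A is T(e_j).

def matrix_of(T, n):
    """Build A (as a list of rows) whose columns are T(e_1), ..., T(e_n)."""
    cols = [T([1 if k == j else 0 for k in range(n)]) for j in range(n)]
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

# An example linear map R^2 -> R^3 (an assumption for illustration):
def T(x):
    return [x[0] + x[1], 2 * x[0], 3 * x[1]]
```

Here `matrix_of(T, 2)` yields \begin{pmatrix} 1 & 1 \\ 2 & 0 \\ 0 & 3 \end{pmatrix}, and multiplying this matrix by any x reproduces T(x), as Theorem 5.5 guarantees.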


5.4 Matrix Representations of Linear Maps

We now extend the correspondence from Euclidean spaces to arbitrary finite-dimensional spaces. The key observation: coordinate maps convert abstract spaces into Euclidean spaces, and we already know how to represent linear maps between Euclidean spaces.

Let T : \mathcal{V} \to \mathcal{W} be linear, where \dim \mathcal{V} = n and \dim \mathcal{W} = m. Fix ordered bases \mathcal{B} = \{v_1, \ldots, v_n\} of \mathcal{V} and \mathcal{C} = \{w_1, \ldots, w_m\} of \mathcal{W}.

The diagram \begin{array}{ccc} \mathcal{V} & \xrightarrow{T} & \mathcal{W} \\ \downarrow\scriptstyle{[\cdot]_{\mathcal{B}}} & & \downarrow\scriptstyle{[\cdot]_{\mathcal{C}}} \\ \mathbb{F}^n & \xrightarrow{\widetilde{T}} & \mathbb{F}^m \end{array} displays four maps: the abstract map T at the top, the coordinate maps on the sides, and an unknown map \widetilde{T} at the bottom. We seek \widetilde{T} making the diagram commute—that is, making the compositions [\cdot]_{\mathcal{C}} \circ T and \widetilde{T} \circ [\cdot]_{\mathcal{B}} equal as functions \mathcal{V} \to \mathbb{F}^m.

By Theorem 5.2, the vertical maps are isomorphisms. This forces \widetilde{T} = [\cdot]_{\mathcal{C}} \circ T \circ ([\cdot]_{\mathcal{B}})^{-1}. The map \widetilde{T} : \mathbb{F}^n \to \mathbb{F}^m is the coordinate representation of T: it acts on coordinate vectors in \mathbb{F}^n the same way T acts on the underlying vectors in \mathcal{V}.

Since \widetilde{T} is a composition of linear maps, it is linear. By Theorem 5.5, there exists a unique matrix A \in M_{m \times n}(\mathbb{F}) with \widetilde{T}(x) = Ax for all x \in \mathbb{F}^n. This matrix encodes T in coordinates.

To construct A explicitly, we exploit Theorem 4.3. The map T is determined by T(v_1), \ldots, T(v_n). Each image T(v_j) \in \mathcal{W} expands uniquely in basis \mathcal{C}: T(v_j) = \sum_{i=1}^{m} a_{ij} w_i for some scalars a_{ij} \in \mathbb{F}. The coefficients a_{ij} record how T transforms the j-th basis vector: the first index i labels which basis element w_i of \mathcal{W} appears, and the second index j labels which input vector v_j we applied T to.

Arranging these coefficients as an array gives a matrix.

Definition 5.5 (Matrix of a linear map) Let T : \mathcal{V} \to \mathcal{W} be linear with \dim \mathcal{V} = n and \dim \mathcal{W} = m. Fix bases \mathcal{B} = \{v_1, \ldots, v_n\} of \mathcal{V} and \mathcal{C} = \{w_1, \ldots, w_m\} of \mathcal{W}. The matrix of T with respect to \mathcal{B} and \mathcal{C}, denoted [T]_{\mathcal{B}}^{\mathcal{C}}, is the element of M_{m \times n}(\mathbb{F}) whose j-th column is [T(v_j)]_{\mathcal{C}}: [T]_{\mathcal{B}}^{\mathcal{C}} = \begin{pmatrix} | & & | \\ [T(v_1)]_{\mathcal{C}} & \cdots & [T(v_n)]_{\mathcal{C}} \\ | & & | \end{pmatrix}. Equivalently, the (i,j)-entry is the i-th coordinate of T(v_j) when expressed in basis \mathcal{C}.

The matrix [T]_{\mathcal{B}}^{\mathcal{C}} depends on both bases. Different choices of \mathcal{B} or \mathcal{C} yield different matrices, all representing the same linear map T. When the bases are understood from context, we write simply [T] or A.

The construction is natural. A linear map is determined by its action on basis vectors. We record that action by expressing each T(v_j) in coordinates and arranging these coordinate vectors as columns of a matrix. The resulting array is nothing more than an organized record of how T transforms the chosen basis of \mathcal{V}.

Theorem 5.6 (Matrix action formula) Let T : \mathcal{V} \to \mathcal{W} be linear with matrix A = [T]_{\mathcal{B}}^{\mathcal{C}} relative to bases \mathcal{B} and \mathcal{C}. Then for every v \in \mathcal{V}, [T(v)]_{\mathcal{C}} = A [v]_{\mathcal{B}}.

Proof. Write v = \sum_{j=1}^{n} c_j v_j so that [v]_{\mathcal{B}} = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix}. Linearity gives T(v) = \sum_{j=1}^{n} c_j T(v_j). Applying the coordinate map [\cdot]_{\mathcal{C}} and using its linearity from Theorem 5.1, [T(v)]_{\mathcal{C}} = \sum_{j=1}^{n} c_j [T(v_j)]_{\mathcal{C}}. The right side is precisely A[v]_{\mathcal{B}} by Definition 5.4, since the j-th column of A is [T(v_j)]_{\mathcal{C}}. \square

This theorem is the foundation of matrix representations. It states that applying the linear map T to a vector v is equivalent to multiplying the matrix A by the coordinate vector [v]_{\mathcal{B}}. The commutative diagram \begin{array}{ccc} \mathcal{V} & \xrightarrow{T} & \mathcal{W} \\ \downarrow\scriptstyle{[\cdot]_{\mathcal{B}}} & & \downarrow\scriptstyle{[\cdot]_{\mathcal{C}}} \\ \mathbb{F}^n & \xrightarrow{A} & \mathbb{F}^m \end{array} commutes: both paths from \mathcal{V} to \mathbb{F}^m yield the same result. Traversing the top edge then the right edge gives [T(v)]_{\mathcal{C}}. Traversing the left edge then the bottom edge gives A[v]_{\mathcal{B}}. Commutativity means [T(v)]_{\mathcal{C}} = A[v]_{\mathcal{B}} for all v \in \mathcal{V}.

Equivalently, the following identity of functions \mathcal{V} \to \mathbb{F}^m holds: [\cdot]_{\mathcal{C}} \circ T = A \circ [\cdot]_{\mathcal{B}}. The left side composes the abstract map with coordinatization. The right side applies coordinatization first, then matrix multiplication. They produce identical outputs.
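The identity can be checked on the differentiation operator from Section 5.1; its matrix relative to \mathcal{B} = \{1, x, x^2\} and \mathcal{C} = \{1, x\} is computed in Section 5.5.7, and here we take it as given. Representing polynomials by coefficient lists, a sketch:

```python
# Check [D(p)]_C = A [p]_B for D: P_2 -> P_1, with B = {1, x, x^2}, C = {1, x}.

A = [[0, 1, 0],
     [0, 0, 2]]          # columns are [D(1)]_C, [D(x)]_C, [D(x^2)]_C

def d_coords(p):
    """[D(p)]_C computed on the abstract side: differentiate a0 + a1*x + a2*x^2."""
    a0, a1, a2 = p
    return [a1, 2 * a2]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

p = [3, 2, -1]           # p(x) = 3 + 2x - x^2, so [p]_B = (3, 2, -1)
```

Both sides give (2, -2), the coordinate vector of p'(x) = 2 - 2x in basis \mathcal{C}.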


5.5 Subspaces Associated with Matrices

In Chapter 3, we associated subspaces \ker(T) and \operatorname{im}(T) with every linear map T : \mathcal{V} \to \mathcal{W}. These are intrinsic geometric objects—they exist independently of coordinates. When we represent T by a matrix A relative to chosen bases, these subspaces acquire coordinate representations as subspaces of Euclidean space.

This section makes the correspondence explicit. We define kernel, column space, and row space for matrices, then prove these are precisely the coordinate images of the abstract subspaces from Chapter 3.


5.5.1 The Kernel of a Matrix

Definition 5.6 (Kernel of a matrix) Let A \in M_{m \times n}(\mathbb{F}). The kernel (or null space) of A is \ker(A) = \{x \in \mathbb{F}^n : Ax = 0\}.

This is the set of all vectors that A sends to zero. Its elements are precisely the solutions of the homogeneous system Ax = 0 studied in elementary algebra.

Theorem 5.7 For any A \in M_{m \times n}(\mathbb{F}), the kernel \ker(A) is a subspace of \mathbb{F}^n.

Proof. Since A \cdot 0 = 0, we have 0 \in \ker(A). If x, y \in \ker(A), then A(x+y) = Ax + Ay = 0 + 0 = 0, so x + y \in \ker(A). If x \in \ker(A) and c \in \mathbb{F}, then A(cx) = c(Ax) = c \cdot 0 = 0, so cx \in \ker(A). \square

This proof is identical in structure to the proof that \ker(T) is a subspace for a linear map T (Chapter 3, Theorem 4.4). This is no coincidence.


5.5.2 The Column Space of a Matrix

Definition 5.7 (Column space) Let A \in M_{m \times n}(\mathbb{F}) have columns a_1, \ldots, a_n \in \mathbb{F}^m. The column space of A is \operatorname{Col}(A) = \operatorname{span}\{a_1, \ldots, a_n\}.

The column space consists of all linear combinations of the columns of A. Equivalently:

Theorem 5.8 \operatorname{Col}(A) = \{Ax : x \in \mathbb{F}^n\}.

Proof. By the definition of matrix-vector multiplication from Definition 5.4, Ax = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n for x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}. Thus Ax is a linear combination of the columns, so \{Ax : x \in \mathbb{F}^n\} \subseteq \operatorname{span}\{a_1, \ldots, a_n\}.

Conversely, every linear combination \sum_{j=1}^{n} c_j a_j equals A \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix}, so \operatorname{span}\{a_1, \ldots, a_n\} \subseteq \{Ax : x \in \mathbb{F}^n\}. \square

Since the span of any set is a subspace (Chapter 2, Theorem 3.2), \operatorname{Col}(A) is a subspace of \mathbb{F}^m.


5.5.3 The Row Space of a Matrix

Definition 5.8 (Row space) Let A \in M_{m \times n}(\mathbb{F}) have rows r_1, \ldots, r_m (viewed as vectors in \mathbb{F}^n). The row space of A is \operatorname{Row}(A) = \operatorname{span}\{r_1, \ldots, r_m\}.

The row space is a subspace of \mathbb{F}^n (the rows are vectors with n components). Note the asymmetry: the column space lives in \mathbb{F}^m, while the row space lives in \mathbb{F}^n.

Theorem 5.9 \operatorname{Row}(A) = \operatorname{Col}(A^T).

Proof. The rows of A are the columns of A^T. \square


5.5.4 Connection to Linear Maps

We now establish that matrix subspaces are coordinate representations of abstract subspaces.

Theorem 5.10 (Kernel correspondence) Let T : \mathcal{V} \to \mathcal{W} be linear with bases \mathcal{B} of \mathcal{V} and \mathcal{C} of \mathcal{W}. Let A = [T]_{\mathcal{B}}^{\mathcal{C}}. Then \ker(A) = \{[v]_{\mathcal{B}} : v \in \ker(T)\}. Equivalently, the coordinate map restricts to an isomorphism [\cdot]_{\mathcal{B}} : \ker(T) \to \ker(A).

Proof. By the matrix action formula (Theorem 5.6), [T(v)]_{\mathcal{C}} = A[v]_{\mathcal{B}} for all v \in \mathcal{V}.

If v \in \ker(T), then T(v) = 0, so [T(v)]_{\mathcal{C}} = [0]_{\mathcal{C}} = 0 \in \mathbb{F}^m. Thus A[v]_{\mathcal{B}} = 0, giving [v]_{\mathcal{B}} \in \ker(A).

Conversely, if x \in \ker(A), then Ax = 0. Let v = ([\cdot]_{\mathcal{B}})^{-1}(x) \in \mathcal{V} be the unique vector with [v]_{\mathcal{B}} = x. Then [T(v)]_{\mathcal{C}} = A[v]_{\mathcal{B}} = Ax = 0, so T(v) = 0 (since the coordinate map is an isomorphism by Theorem 5.2), giving v \in \ker(T).

Thus \ker(A) = \{[v]_{\mathcal{B}} : v \in \ker(T)\}. The coordinate map [\cdot]_{\mathcal{B}} is a bijection between \ker(T) and \ker(A), and it is linear by Theorem 5.1. \square

Theorem 5.11 (Image correspondence) Let T : \mathcal{V} \to \mathcal{W} be linear with bases \mathcal{B} of \mathcal{V} and \mathcal{C} of \mathcal{W}. Let A = [T]_{\mathcal{B}}^{\mathcal{C}}. Then \operatorname{Col}(A) = \{[w]_{\mathcal{C}} : w \in \operatorname{im}(T)\}. Equivalently, [\cdot]_{\mathcal{C}} : \operatorname{im}(T) \to \operatorname{Col}(A) is an isomorphism.

Proof. By definition, \operatorname{im}(T) = \{T(v) : v \in \mathcal{V}\} and \operatorname{Col}(A) = \{Ax : x \in \mathbb{F}^n\}.

If w \in \operatorname{im}(T), write w = T(v) for some v \in \mathcal{V}. Then [w]_{\mathcal{C}} = [T(v)]_{\mathcal{C}} = A[v]_{\mathcal{B}} \in \operatorname{Col}(A).

Conversely, if y \in \operatorname{Col}(A), write y = Ax for some x \in \mathbb{F}^n. Let v = ([\cdot]_{\mathcal{B}})^{-1}(x) \in \mathcal{V}, so [v]_{\mathcal{B}} = x. Then y = Ax = A[v]_{\mathcal{B}} = [T(v)]_{\mathcal{C}}, so y = [w]_{\mathcal{C}} where w = T(v) \in \operatorname{im}(T).

Thus \operatorname{Col}(A) is precisely the image of \operatorname{im}(T) under the coordinate map [\cdot]_{\mathcal{C}}. \square

These theorems establish that:

  • \ker(A) is the coordinate representation of \ker(T) in basis \mathcal{B}

  • \operatorname{Col}(A) is the coordinate representation of \operatorname{im}(T) in basis \mathcal{C}

The abstract subspaces from Chapter 3 become concrete subspaces of Euclidean space once we choose coordinates.


5.5.5 Rank and Nullity for Matrices

Definition 5.9 (Rank of a matrix) The rank of A \in M_{m \times n}(\mathbb{F}) is \operatorname{rank}(A) = \dim \operatorname{Col}(A). The nullity of A is \operatorname{nullity}(A) = \dim \ker(A).

By Theorem 5.11 and Theorem 5.10, if A = [T]_{\mathcal{B}}^{\mathcal{C}}, then \operatorname{rank}(A) = \dim \operatorname{im}(T), \quad \operatorname{nullity}(A) = \dim \ker(T).

The rank-nullity theorem immediately implies

Theorem 5.12 (Rank-nullity theorem for matrices) For A \in M_{m \times n}(\mathbb{F}), n = \operatorname{rank}(A) + \operatorname{nullity}(A).

Proof. This follows from Theorem 4.5 by taking T : \mathbb{F}^n \to \mathbb{F}^m to be the linear map T(x) = Ax. \square
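The identity can be checked computationally once we can compute rank. A small row-reduction sketch follows, a preview of the Chapter 6 algorithms written for small examples with exact entries; it counts pivots after Gaussian elimination:

```python
# Check n = rank(A) + nullity(A) for the map x |-> Ax (Theorem 5.12).

def rank(A):
    """Pivot count after Gaussian elimination (small examples only)."""
    M = [list(row) for row in A]
    m, n = len(M), len(M[0])
    r = 0                                     # current pivot row
    for c in range(n):
        pivot = next((i for i in range(r, m) if abs(M[i][c]) > 1e-12), None)
        if pivot is None:
            continue                          # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, m):
            f = M[i][c] / M[r][c]
            M[i] = [M[i][k] - f * M[r][k] for k in range(n)]
        r += 1
    return r
```

For the differentiation matrix A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} this returns rank 2, so the nullity is 3 - 2 = 1, consistent with the theorem.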


5.5.6 The Domain and Codomain in Coordinates

For completeness, we note the following:

  • The domain of A (as a linear map \mathbb{F}^n \to \mathbb{F}^m) is all of \mathbb{F}^n. In the abstract setting, if T : \mathcal{V} \to \mathcal{W} has matrix A = [T]_{\mathcal{B}}^{\mathcal{C}}, then \mathbb{F}^n is the coordinate representation of \mathcal{V} in basis \mathcal{B}.

  • The codomain of A is all of \mathbb{F}^m. This is the coordinate representation of \mathcal{W} in basis \mathcal{C}.

The commutative diagram from Theorem 5.6 can be labeled with subspaces:

\begin{array}{ccccccc} \ker(T) & \subset & \mathcal{V} & \xrightarrow{T} & \mathcal{W} & \supset & \operatorname{im}(T) \\ \downarrow\scriptstyle{[\cdot]_{\mathcal{B}}} & & \downarrow\scriptstyle{[\cdot]_{\mathcal{B}}} & & \downarrow\scriptstyle{[\cdot]_{\mathcal{C}}} & & \downarrow\scriptstyle{[\cdot]_{\mathcal{C}}} \\ \ker(A) & \subset & \mathbb{F}^n & \xrightarrow{A} & \mathbb{F}^m & \supset & \operatorname{Col}(A) \end{array}

Every subspace of \mathcal{V} becomes a subspace of \mathbb{F}^n under coordinatization; every subspace of \mathcal{W} becomes a subspace of \mathbb{F}^m. The coordinate map preserves all linear structure—dimension, linear independence, span, and inclusion relationships.


5.5.7 Example: Differentiation Operator

Consider D : \mathcal{P}_2 \to \mathcal{P}_1 sending p(x) \mapsto p'(x). With bases \mathcal{B} = \{1, x, x^2\} of \mathcal{P}_2 and \mathcal{C} = \{1, x\} of \mathcal{P}_1, the matrix is A = [D]_{\mathcal{B}}^{\mathcal{C}} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}.

Abstract kernel: \ker(D) = \{p \in \mathcal{P}_2 : p' = 0\} = \{c : c \in \mathbb{F}\} (constant polynomials). Dimension 1.

Matrix kernel: Solve Ax = 0: \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \implies x_2 = 0, \; x_3 = 0, \; x_1 \text{ free}. Thus \ker(A) = \left\{\begin{pmatrix} c \\ 0 \\ 0 \end{pmatrix} : c \in \mathbb{F}\right\}. Dimension 1.

Correspondence: A constant polynomial p(x) = c has coordinate vector [p]_{\mathcal{B}} = \begin{pmatrix} c \\ 0 \\ 0 \end{pmatrix}. Indeed, \ker(A) is the coordinate image of \ker(D).

Abstract image: \operatorname{im}(D) = \mathcal{P}_1 (all polynomials of degree \leq 1). Dimension 2.

Matrix column space: \operatorname{Col}(A) = \operatorname{span}\left\{\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 2 \end{pmatrix}\right\} = \operatorname{span}\left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 2 \end{pmatrix}\right\} = \mathbb{F}^2. Dimension 2.

Correspondence: Every polynomial a + bx \in \mathcal{P}_1 has coordinate vector [a + bx]_{\mathcal{C}} = \begin{pmatrix} a \\ b \end{pmatrix}. As expected, \operatorname{Col}(A) = \mathbb{F}^2 is the coordinate image of \operatorname{im}(D) = \mathcal{P}_1.

Rank-nullity check: n = 3, \operatorname{nullity}(A) = 1, \operatorname{rank}(A) = 2. Indeed, 3 = 1 + 2.
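The kernel and column-space claims above can be confirmed numerically. An illustrative sketch (the helper `preimage` reads a solution of Ax = y off the explicit entries of this particular A):

```python
# Verify the Section 5.5.7 computations for A = [D]_B^C:
# every (c, 0, 0) lies in ker(A), and every y in F^2 equals Ax for some x.

A = [[0, 1, 0],
     [0, 0, 2]]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def preimage(y):
    """One solution of Ax = y: x_2 = a and 2*x_3 = b force x = (0, a, b/2)."""
    a, b = y
    return [0, a, b / 2]
```

Any choice of x_1 in `preimage` would do equally well, which is exactly the one-dimensional kernel showing up in the solution set.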


5.5.8 Summary

The correspondence between the abstract objects of Chapter 3 and their coordinate representations:

  • Domain: \mathcal{V} corresponds to \mathbb{F}^n (basis \mathcal{B})
  • Codomain: \mathcal{W} corresponds to \mathbb{F}^m (basis \mathcal{C})
  • Kernel: \ker(T) \subseteq \mathcal{V} corresponds to \ker(A) \subseteq \mathbb{F}^n
  • Image: \operatorname{im}(T) \subseteq \mathcal{W} corresponds to \operatorname{Col}(A) \subseteq \mathbb{F}^m
  • Rank: \dim \operatorname{im}(T) = \operatorname{rank}(A) = \dim \operatorname{Col}(A)
  • Nullity: \dim \ker(T) = \operatorname{nullity}(A) = \dim \ker(A)

The coordinate map [\cdot]_{\mathcal{B}} : \mathcal{V} \to \mathbb{F}^n is an isomorphism that restricts to an isomorphism \ker(T) \to \ker(A). Similarly, [\cdot]_{\mathcal{C}} : \mathcal{W} \to \mathbb{F}^m restricts to an isomorphism \operatorname{im}(T) \to \operatorname{Col}(A).

Matrix operations are coordinate versions of abstract operations. Solving Ax = 0 in coordinates corresponds to finding \ker(T) abstractly. Computing \operatorname{Col}(A) in coordinates corresponds to finding \operatorname{im}(T) abstractly. The machinery developed in Chapter 6 (Gaussian elimination, echelon forms) provides algorithms for these computations.

5.6 Matrix Multiplication from Composition

Linear maps compose naturally: if T : \mathcal{U} \to \mathcal{V} and S : \mathcal{V} \to \mathcal{W} are linear, then S \circ T : \mathcal{U} \to \mathcal{W} is linear by Theorem 4.9. The composition (S \circ T)(u) = S(T(u)) applies T first, then S.

If we represent T and S by matrices relative to chosen bases, the composition S \circ T should correspond to some operation on matrices. This operation is matrix multiplication. We derive it from the requirement that matrix representations respect composition.

Let \mathcal{U}, \mathcal{V}, \mathcal{W} be finite-dimensional with \dim \mathcal{U} = p, \dim \mathcal{V} = n, \dim \mathcal{W} = m. Fix bases \mathcal{A}, \mathcal{B}, \mathcal{C} for these spaces respectively. Let T : \mathcal{U} \to \mathcal{V} have matrix B = [T]_{\mathcal{A}}^{\mathcal{B}} \in M_{n \times p}(\mathbb{F}) and S : \mathcal{V} \to \mathcal{W} have matrix A = [S]_{\mathcal{B}}^{\mathcal{C}} \in M_{m \times n}(\mathbb{F}).

By Theorem 5.6, for any u \in\mathcal{U}, [T(u)]_{\mathcal{B}} = B[u]_{\mathcal{A}}, \quad [S(v)]_{\mathcal{C}} = A[v]_{\mathcal{B}} for any v \in \mathcal{V}. The composition S \circ T : \mathcal{U} \to \mathcal{W} satisfies [(S \circ T)(u)]_{\mathcal{C}} = [S(T(u))]_{\mathcal{C}} = A[T(u)]_{\mathcal{B}} = A(B[u]_{\mathcal{A}}) = (AB)[u]_{\mathcal{A}}, where in the final step we have written AB to denote the result of whatever operation on matrices corresponds to the composition.

Since (AB)[u]_{\mathcal{A}} must equal [(S \circ T)(u)]_{\mathcal{C}} for all u \in \mathcal{U}, and since the coordinate map is an isomorphism by Theorem 5.2, the matrix AB must be [S \circ T]_{\mathcal{A}}^{\mathcal{C}}.

To determine the entries of AB explicitly, we examine its action on the standard basis vectors e_k of \mathbb{F}^p for k = 1, \ldots, p. Recall that Be_k is the k-th column of B, which we denote b_k \in \mathbb{F}^n. Then (AB)e_k = A(Be_k) = Ab_k. Thus the k-th column of AB is Ab_k. Writing b_k = \begin{pmatrix} b_{1k} \\ \vdots \\ b_{nk} \end{pmatrix}, we have Ab_k = \sum_{j=1}^{n} b_{jk} a_j, where a_j denotes the j-th column of A. The i-th component of this sum is (Ab_k)_i = \sum_{j=1}^{n} b_{jk} a_{ij} = \sum_{j=1}^{n} a_{ij} b_{jk}. This gives the (i,k)-entry of AB.

Definition 5.10 (Matrix multiplication) Let A \in M_{m \times n}(\mathbb{F}) and B \in M_{n \times p}(\mathbb{F}). The product AB \in M_{m \times p}(\mathbb{F}) has entries (AB)_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk} for 1 \leq i \leq m and 1 \leq k \leq p.

The (i,k)-entry of AB is computed by taking the i-th row of A, the k-th column of B, multiplying corresponding entries, and summing. Equivalently, the k-th column of AB is A times the k-th column of B: (AB)_k = A(B_k), where B_k denotes the k-th column of B viewed as a column vector in \mathbb{F}^n.

Matrix multiplication is only defined when the number of columns in the first matrix equals the number of rows in the second. The product AB exists if and only if A is m \times n and B is n \times p for some m, n, p. The resulting product is m \times p.
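Definition 5.10 transcribes directly into a triple loop. A reference sketch, not an efficient implementation:

```python
# Matrix multiplication straight from the entry formula (Definition 5.10):
# (AB)_ik = sum_j a_ij b_jk, for A of size m x n and B of size n x p.

def matmul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    assert len(A[0]) == n, "columns of A must match rows of B"
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
            for i in range(m)]
```

With A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} and B = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}, the products AB and BA come out different, a first glimpse of non-commutativity.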

Theorem 5.13 Let T : \mathcal{U} \to \mathcal{V} and S : \mathcal{V} \to \mathcal{W} be linear maps. Fix bases \mathcal{A} of \mathcal{U}, \mathcal{B} of \mathcal{V}, and \mathcal{C} of \mathcal{W}. If B = [T]_{\mathcal{A}}^{\mathcal{B}} and A = [S]_{\mathcal{B}}^{\mathcal{C}}, then [S \circ T]_{\mathcal{A}}^{\mathcal{C}} = AB.

Proof. This follows from the derivation preceding Definition 5.10. For any u \in \mathcal{U}, we have [(S \circ T)(u)]_{\mathcal{C}} = A[T(u)]_{\mathcal{B}} = A(B[u]_{\mathcal{A}}) = (AB)[u]_{\mathcal{A}} by associativity of matrix-vector multiplication. Since this holds for all u and coordinate maps are isomorphisms, the matrices [S \circ T]_{\mathcal{A}}^{\mathcal{C}} and AB coincide. \square

Matrix multiplication encodes composition of linear maps. The order matters: AB corresponds to “first apply T (represented by B), then apply S (represented by A).” The product BA generally differs from AB because composition is not commutative. When both products exist and AB \neq BA, we say the matrices do not commute.
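Non-commutativity is easy to witness numerically; a small sketch using NumPy, with matrices chosen only to make the failure visible:

```python
import numpy as np

A = np.array([[0, 1],
              [0, 0]])
B = np.array([[0, 0],
              [1, 0]])

AB = A @ B   # composition "first B, then A"
BA = B @ A   # composition "first A, then B"
print(AB.tolist())  # [[1, 0], [0, 0]]
print(BA.tolist())  # [[0, 0], [0, 1]]
assert not np.array_equal(AB, BA)
```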

Several algebraic properties follow from the correspondence with composition.

Theorem 5.14 Matrix multiplication satisfies the following properties, where dimensions are assumed compatible:

  1. (Associativity) (AB)C = A(BC)
  2. (Distributivity) A(B + C) = AB + AC and (A + B)C = AC + BC
  3. (Scalar compatibility) (cA)B = c(AB) = A(cB) for any c \in \mathbb{F}

Proof. For associativity, observe that both (AB)C and A(BC) represent the composition of three linear maps. By associativity of function composition, these are equal.

For distributivity, note that A(B + C) represents the composition of the map represented by A with the sum of the maps represented by B and C. Composition of linear maps distributes over pointwise sums: S \circ (T_1 + T_2) = S \circ T_1 + S \circ T_2 by linearity of S, and (S_1 + S_2) \circ T = S_1 \circ T + S_2 \circ T by the definition of the pointwise sum. Both distributive laws follow.

Scalar compatibility follows from the fact that cA represents c times a linear map. \square

The identity matrix plays the role of the identity map.

Definition 5.11 (Identity matrix) The identity matrix I_n \in M_n(\mathbb{F}) has entries (I_n)_{ij} = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}. Thus I_n has ones on the diagonal and zeros elsewhere.

Theorem 5.15 For any A \in M_{m \times n}(\mathbb{F}), we have I_m A = A and A I_n = A.

Proof. The identity matrix I_n represents the identity map on \mathbb{F}^n with respect to the standard basis. Composing with the identity leaves any map unchanged. \square

When working with square matrices A \in M_n(\mathbb{F}), we often write simply I for I_n when the dimension is clear.


5.7 Block Matrices and Partitioned Operations

When working with large matrices or matrices arising from structured problems, it is often useful to view them as composed of submatrices called blocks. This perspective not only simplifies notation but also reveals structural properties that may be obscured in the entry-by-entry view.

A block matrix (or partitioned matrix) is a matrix subdivided into rectangular submatrices. For instance, an m \times n matrix M might be written as M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, where A \in M_{p \times q}(\mathbb{F}), B \in M_{p \times (n-q)}(\mathbb{F}), C \in M_{(m-p) \times q}(\mathbb{F}), and D \in M_{(m-p) \times (n-q)}(\mathbb{F}) for some 1 \leq p < m and 1 \leq q < n. The horizontal line divides rows and the vertical line divides columns.

The division is chosen to highlight structure. A linear map preserving a direct sum decomposition naturally yields a block diagonal matrix. A composition of maps on different subspaces produces block products. We make these ideas precise.

5.7.1 Block Matrix Addition and Scalar Multiplication

If two matrices M and N are partitioned compatibly—meaning corresponding blocks have identical dimensions—then addition and scalar multiplication respect the block structure.

Theorem 5.16 Let M, N \in M_{m \times n}(\mathbb{F}) be partitioned as M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \quad N = \begin{pmatrix} A' & B' \\ C' & D' \end{pmatrix} with blocks of matching dimensions. Then M + N = \begin{pmatrix} A + A' & B + B' \\ C + C' & D + D' \end{pmatrix}. Similarly, for c \in \mathbb{F}, cM = \begin{pmatrix} cA & cB \\ cC & cD \end{pmatrix}.

Proof. This follows immediately from the componentwise definitions of matrix addition and scalar multiplication. Each entry (M+N)_{ij} equals m_{ij} + n_{ij}, and the block structure simply groups these entries. \square

5.7.2 Block Matrix Multiplication

Block multiplication extends the ordinary multiplication rule: to compute the (i,j)-block of a product, we form the “dot product” of the i-th block row and the j-th block column, treating blocks as if they were scalars—provided dimensions align.

Theorem 5.17 Let M \in M_{m \times n}(\mathbb{F}) and N \in M_{n \times p}(\mathbb{F}) be partitioned as M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \quad N = \begin{pmatrix} E & F \\ G & H \end{pmatrix} where the column partition of M matches the row partition of N. Specifically, if A is m_1 \times n_1 and B is m_1 \times n_2 with n_1 + n_2 = n, then E must be n_1 \times p_1 and G must be n_2 \times p_1 for some p_1 (so that AE and BG are defined and have matching dimensions). Under these compatibility conditions, MN = \begin{pmatrix} AE + BG & AF + BH \\ CE + DG & CF + DH \end{pmatrix}.

Proof. We verify the (1,1)-block; the remaining blocks follow by the same reasoning. The (1,1)-block of MN comprises rows 1 through m_1 and columns 1 through p_1, so for 1 \leq i \leq m_1 and 1 \leq k \leq p_1 its (i,k)-entry is (MN)_{ik} = \sum_{j=1}^{n} m_{ij} n_{jk}. Split the sum at j = n_1: \sum_{j=1}^{n_1} m_{ij} n_{jk} + \sum_{j=n_1+1}^{n} m_{ij} n_{jk}. In the first sum, the entries m_{ij} with i \leq m_1 and j \leq n_1 are precisely the entries of A, and the entries n_{jk} with j \leq n_1 and k \leq p_1 are precisely the entries of E, so the first sum is the (i,k)-entry of AE. Likewise, the second sum involves only entries of B and G and equals the (i,k)-entry of BG. Thus the (1,1)-block of MN is AE + BG. \square

This result is valuable for structured computations. Rather than multiplying two large matrices entry-by-entry, we can multiply their blocks—often yielding simpler expressions or revealing cancellations.
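Theorem 5.17 can be checked numerically for a particular compatible partition; a sketch using NumPy, with block sizes and entries chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# M is 5 x 4, partitioned with m1 = 2, n1 = 2; N is 4 x 3 with p1 = 1.
A, B = rng.integers(-3, 4, (2, 2)), rng.integers(-3, 4, (2, 2))
C, D = rng.integers(-3, 4, (3, 2)), rng.integers(-3, 4, (3, 2))
E, F = rng.integers(-3, 4, (2, 1)), rng.integers(-3, 4, (2, 2))
G, H = rng.integers(-3, 4, (2, 1)), rng.integers(-3, 4, (2, 2))

M = np.block([[A, B], [C, D]])
N = np.block([[E, F], [G, H]])

# Blockwise product per Theorem 5.17 agrees with the ordinary product.
blockwise = np.block([[A @ E + B @ G, A @ F + B @ H],
                      [C @ E + D @ G, C @ F + D @ H]])
assert np.array_equal(M @ N, blockwise)
```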

5.7.3 Block Diagonal Matrices

A matrix of the form M = \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_k \end{pmatrix} is called block diagonal. Each A_i is a square matrix, and all off-diagonal blocks are zero. We write M = A_1 \oplus A_2 \oplus \cdots \oplus A_k or M = \bigoplus_{i=1}^{k} A_i.

Theorem 5.18 Let M = A_1 \oplus \cdots \oplus A_k and N = B_1 \oplus \cdots \oplus B_k be block diagonal matrices with blocks of matching dimensions. Then:

  1. M + N = (A_1 + B_1) \oplus \cdots \oplus (A_k + B_k)

  2. MN = (A_1 B_1) \oplus \cdots \oplus (A_k B_k)

  3. If each A_i is invertible, then M is invertible and M^{-1} = A_1^{-1} \oplus \cdots \oplus A_k^{-1}

Proof. Property (1) follows from Theorem 5.16. For (2), apply Theorem 5.17: the (i,j)-block of MN is \sum_{\ell=1}^{k} (\text{block } (i,\ell) \text{ of } M) \cdot (\text{block } (\ell,j) \text{ of } N). When M and N are block diagonal, only the \ell = i = j term survives, giving (MN)_{ii} = A_i B_i for diagonal blocks and zero elsewhere.

For (3), if each A_i is invertible, then (A_1^{-1} \oplus \cdots \oplus A_k^{-1}) \cdot (A_1 \oplus \cdots \oplus A_k) = (A_1^{-1} A_1) \oplus \cdots \oplus (A_k^{-1} A_k) = I \oplus \cdots \oplus I = I by property (2). Similarly for the reverse product. \square

Block diagonal matrices correspond to linear maps that preserve a direct sum decomposition. If \mathcal{V} = \mathcal{V}_1 \oplus \cdots \oplus \mathcal{V}_k and T : \mathcal{V} \to \mathcal{V} satisfies T(\mathcal{V}_i) \subseteq \mathcal{V}_i for all i, then choosing a basis adapted to the decomposition yields a block diagonal matrix.

5.7.4 Block Upper Triangular Matrices

A matrix of the form M = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1k} \\ 0 & A_{22} & \cdots & A_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_{kk} \end{pmatrix} is block upper triangular: blocks below the diagonal are zero. If each diagonal block A_{ii} is square and invertible, then M is invertible, and the inverse is also block upper triangular. The proof uses backward substitution on blocks, analogous to solving triangular linear systems; we leave it as an exercise in a later chapter.
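For a 2 \times 2 block partition, back substitution yields a closed form: if M = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix} with A and C invertible, then M^{-1} = \begin{pmatrix} A^{-1} & -A^{-1} B C^{-1} \\ 0 & C^{-1} \end{pmatrix}. A sketch verifying this with NumPy (the specific blocks are arbitrary invertible choices):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])      # invertible diagonal block
C = np.array([[1.0, 2.0],
              [1.0, 1.0]])      # invertible diagonal block
B = np.array([[5.0, -1.0],
              [0.0, 4.0]])      # arbitrary off-diagonal block

M = np.block([[A, B],
              [np.zeros((2, 2)), C]])

Ainv, Cinv = np.linalg.inv(A), np.linalg.inv(C)
Minv = np.block([[Ainv, -Ainv @ B @ Cinv],
                 [np.zeros((2, 2)), Cinv]])   # block back-substitution formula

assert np.allclose(M @ Minv, np.eye(4))
assert np.allclose(Minv @ M, np.eye(4))
```

Note that M^{-1} is again block upper triangular, as claimed.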

5.7.5 Block Matrices and Subspaces

Block structure often encodes geometric information. Consider T : \mathcal{V} \to \mathcal{V} with an invariant subspace \mathcal{U} \subseteq \mathcal{V} (meaning T(\mathcal{U}) \subseteq \mathcal{U}). Choose a basis \mathcal{B}_1 of \mathcal{U} and extend it to a basis \mathcal{B} = \mathcal{B}_1 \cup \mathcal{B}_2 of \mathcal{V}. Relative to this ordered basis, the matrix [T]_{\mathcal{B}}^{\mathcal{B}} has the form [T]_{\mathcal{B}}^{\mathcal{B}} = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}, where A represents T restricted to \mathcal{U}, and the zero block reflects that T maps \mathcal{U} into itself.

Theorem 5.19 Let T : \mathcal{V} \to \mathcal{V} be a linear operator and \mathcal{U} \subseteq \mathcal{V} a subspace with T(\mathcal{U}) \subseteq \mathcal{U}. Choose a basis \{u_1, \ldots, u_k\} of \mathcal{U} and extend to a basis \{u_1, \ldots, u_k, v_1, \ldots, v_m\} of \mathcal{V}. Then the matrix of T relative to this ordered basis is block upper triangular: [T] = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}, where A \in M_k(\mathbb{F}) represents the restriction T|_{\mathcal{U}} : \mathcal{U} \to \mathcal{U}.

Proof. For j \leq k, we have T(u_j) \in \mathcal{U} by invariance, so T(u_j) = \sum_{i=1}^{k} a_{ij} u_i for some scalars a_{ij}. The j-th column of [T] records the coordinates of T(u_j), which has zero coefficients for v_1, \ldots, v_m. Thus columns 1 through k of [T] have zeros in rows k+1 through k+m, producing the lower-left zero block. \square

This theorem is the foundation for studying eigenspaces, generalized eigenvectors, and Jordan canonical forms in later chapters. Block triangular structure simplifies spectral analysis.

5.7.6 Example: Direct Sum Decomposition

Let \mathcal{V} = \mathbb{R}^4 with the standard basis, and define T : \mathbb{R}^4 \to \mathbb{R}^4 by T\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 2x_1 \\ 3x_2 \\ x_3 + x_4 \\ -x_3 + x_4 \end{pmatrix}. Let \mathcal{U} = \operatorname{span}\{e_1, e_2\} and \mathcal{W} = \operatorname{span}\{e_3, e_4\}. Then \mathbb{R}^4 = \mathcal{U} \oplus \mathcal{W}, and T preserves both subspaces: T(\mathcal{U}) \subseteq \mathcal{U} and T(\mathcal{W}) \subseteq \mathcal{W}.

Computing the matrix in the ordered basis \{e_1, e_2, e_3, e_4\}: [T] = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & -1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} \oplus \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}. The decomposition T = T|_{\mathcal{U}} \oplus T|_{\mathcal{W}} becomes transparent in this block form. Properties of T reduce to properties of the smaller blocks: for instance, T is invertible if and only if both blocks are invertible.
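The block decomposition of this example is easy to confirm numerically; a sketch with NumPy:

```python
import numpy as np

T = np.array([[2, 0,  0, 0],
              [0, 3,  0, 0],
              [0, 0,  1, 1],
              [0, 0, -1, 1]], dtype=float)

TU = T[:2, :2]   # block representing T restricted to U = span{e1, e2}
TW = T[2:, 2:]   # block representing T restricted to W = span{e3, e4}

# Off-diagonal blocks vanish: T preserves both subspaces.
assert np.all(T[:2, 2:] == 0) and np.all(T[2:, :2] == 0)

# Theorem 5.18(3): the inverse is the direct sum of the block inverses.
Tinv = np.linalg.inv(T)
assert np.allclose(Tinv[:2, :2], np.linalg.inv(TU))
assert np.allclose(Tinv[2:, 2:], np.linalg.inv(TW))
assert np.allclose(Tinv[:2, 2:], 0)
```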


5.8 Invertibility and Isomorphisms

In Chapter 3, we studied isomorphisms—bijective linear maps possessing inverses. The matrix representation of an isomorphism should reflect this structure: it should be an invertible matrix.

Definition 5.12 (Invertible matrix) A matrix A \in M_n(\mathbb{F}) is invertible if there exists B \in M_n(\mathbb{F}) such that AB = BA = I_n. The matrix B is called the inverse of A and is denoted A^{-1}.

Invertibility is defined only for square matrices. A rectangular matrix may admit a left inverse or a right inverse, but not both: if BA = I_n and AC = I_m for A \in M_{m \times n}(\mathbb{F}), then B = C and m = n, forcing A to be square.

Theorem 5.20 If A \in M_n(\mathbb{F}) is invertible, its inverse is unique.

Proof. Suppose B and C both satisfy AB = BA = I_n and AC = CA = I_n. Then B = BI_n = B(AC) = (BA)C = I_nC = C. \quad \square

Theorem 5.21 Let T : \mathcal{V} \to \mathcal{V} be a linear operator on a finite-dimensional space. Fix a basis \mathcal{B} of \mathcal{V} and let A = [T]_{\mathcal{B}}^{\mathcal{B}} be the matrix of T relative to \mathcal{B}. Then T is an isomorphism if and only if A is invertible. Moreover, if T is an isomorphism, then [T^{-1}]_{\mathcal{B}}^{\mathcal{B}} = A^{-1}.

Proof. Suppose T is an isomorphism. By Theorem 4.10 from Chapter 3, there exists a linear map S : \mathcal{V} \to \mathcal{V} with S \circ T = I_{\mathcal{V}} and T \circ S = I_{\mathcal{V}}, where I_{\mathcal{V}} denotes the identity map on \mathcal{V}. Let B = [S]_{\mathcal{B}}^{\mathcal{B}}.

By Theorem 5.13, the matrix of S \circ T is BA and the matrix of T \circ S is AB. The identity map I_{\mathcal{V}} has matrix I_n relative to any basis, since (I_{\mathcal{V}})(v_j) = v_j for all basis vectors v_j, giving [I_{\mathcal{V}}]_{\mathcal{B}}^{\mathcal{B}} = I_n.

Thus BA = I_n and AB = I_n, establishing that A is invertible with A^{-1} = B = [T^{-1}]_{\mathcal{B}}^{\mathcal{B}}.

Conversely, suppose A is invertible with AB = BA = I_n for some B \in M_n(\mathbb{F}). Let S : \mathcal{V} \to \mathcal{V} be the unique linear map with [S]_{\mathcal{B}}^{\mathcal{B}} = B (which exists by the correspondence between matrices and linear maps). Then the matrix of S \circ T is BA = I_n and the matrix of T \circ S is AB = I_n.

Since the matrix representation is injective—different maps have different matrices relative to a fixed basis—we conclude S \circ T = I_{\mathcal{V}} and T \circ S = I_{\mathcal{V}}. Thus T is an isomorphism with inverse S. \square

This theorem establishes that invertibility of a matrix is equivalent to the underlying linear map being an isomorphism. From Chapter 3, we know several equivalent characterizations of isomorphisms. Translating these to the matrix setting gives characterizations of invertibility.

Theorem 5.22 For A \in M_n(\mathbb{F}), the following are equivalent:

  1. A is invertible
  2. The linear map x \mapsto Ax is an isomorphism \mathbb{F}^n \to \mathbb{F}^n
  3. The columns of A form a basis of \mathbb{F}^n
  4. The columns of A are linearly independent
  5. The columns of A span \mathbb{F}^n
  6. \ker(A) = \{0\}, where \ker(A) = \{x \in \mathbb{F}^n : Ax = 0\}
  7. For every b \in \mathbb{F}^n, the equation Ax = b has a unique solution

Proof. These follow from the equivalences established in Theorem 5.21 combined with the characterizations of isomorphisms from Chapter 3. The columns of A are the images Ae_1, \ldots, Ae_n of the standard basis vectors under the map x \mapsto Ax. By Theorem 6.4, these form a basis of \mathbb{F}^n if and only if the map is an isomorphism. The remaining equivalences follow from Theorem 4.6 and Theorem 4.7. \square
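The equivalences of Theorem 5.22 can be illustrated with a small numerical sketch in NumPy; the matrices below are arbitrary examples of an invertible and a singular case:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # columns linearly independent, so invertible
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # second column is twice the first, so singular

# (7) For invertible A, Ax = b has a (unique) solution for every b.
b = np.array([1.0, 0.0])
x = np.linalg.solve(A, b)
assert np.allclose(A @ x, b)

# (6) fails for S: the kernel contains the nonzero vector (2, -1).
assert np.allclose(S @ np.array([2.0, -1.0]), 0)

# (1) fails for S: attempting to invert raises an error.
try:
    np.linalg.inv(S)
except np.linalg.LinAlgError:
    print("S is not invertible")
```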

The set of invertible n \times n matrices forms an algebraic structure.

Definition 5.13 (General linear group) The set \mathrm{GL}_n(\mathbb{F}) = \{ A \in M_n(\mathbb{F}) : A \text{ is invertible} \} is called the general linear group of degree n over \mathbb{F}.

The term “group” is justified by the following properties, which we state without proof as they belong to abstract algebra.

Theorem 5.23 \mathrm{GL}_n(\mathbb{F}) is closed under matrix multiplication, the identity matrix I_n belongs to \mathrm{GL}_n(\mathbb{F}), every element has an inverse in \mathrm{GL}_n(\mathbb{F}), and multiplication is associative. These properties make \mathrm{GL}_n(\mathbb{F}) a group under matrix multiplication.

Elements of \mathrm{GL}_n(\mathbb{F}) represent automorphisms of \mathbb{F}^n—isomorphisms from \mathbb{F}^n to itself. When \mathbb{F} = \mathbb{R}, the group \mathrm{GL}_n(\mathbb{R}) captures all invertible linear transformations of n-dimensional Euclidean space, including rotations, reflections, shears, and scalings.

Special subgroups of \mathrm{GL}_n(\mathbb{F}) correspond to transformations preserving additional structure. We will encounter these when we introduce inner products and orthogonality in later chapters.


5.9 The Transpose and Dual Maps

In Chapter 3, we introduced the dual space \mathcal{V}^* = \mathcal{L}(\mathcal{V}, \mathbb{F}) consisting of all linear functionals on \mathcal{V}. Every linear map T : \mathcal{V} \to \mathcal{W} induces a dual map T^* : \mathcal{W}^* \to \mathcal{V}^* acting on functionals. We now determine how T^* is represented in coordinates.

The dual map was defined in Definition 4.6 from Chapter 3 by (T^*\varphi)(v) = \varphi(T(v)) for \varphi \in \mathcal{W}^* and v \in \mathcal{V}. The transpose of a matrix will be defined shortly, and we will prove it represents the dual map.

Before connecting to linear maps, we define transposition purely as an operation on matrices.

Definition 5.14 (Matrix transpose) Let A \in M_{m \times n}(\mathbb{F}). The transpose of A, denoted A^T \in M_{n \times m}(\mathbb{F}), has entries (A^T)_{ji} = a_{ij} for all 1 \leq i \leq m and 1 \leq j \leq n.

Transposition interchanges rows and columns: the i-th row of A becomes the i-th column of A^T, and the j-th column of A becomes the j-th row of A^T. If A = \begin{pmatrix} a & b \\ c & d \\ e & f \end{pmatrix} is 3 \times 2, then A^T = \begin{pmatrix} a & c & e \\ b & d & f \end{pmatrix} is 2 \times 3.

Theorem 5.24 For matrices of appropriate dimensions and scalar c \in \mathbb{F}:

  1. (A^T)^T = A
  2. (A + B)^T = A^T + B^T
  3. (cA)^T = cA^T
  4. (AB)^T = B^T A^T

Proof. Properties (1) through (3) follow immediately from Definition 5.14. For (4), let A \in M_{m \times n}(\mathbb{F}) and B \in M_{n \times p}(\mathbb{F}). Then AB \in M_{m \times p}(\mathbb{F}) and (AB)^T \in M_{p \times m}(\mathbb{F}). For any 1 \leq k \leq p and 1 \leq i \leq m, ((AB)^T)_{ki} = (AB)_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk} = \sum_{j=1}^{n} (B^T)_{kj} (A^T)_{ji} = (B^T A^T)_{ki}, where the final equality uses Definition 5.10. Thus (AB)^T = B^T A^T. \square

Note that transposition reverses the order of products: (AB)^T = B^T A^T, not A^T B^T. This mirrors the reversal for dual maps, where (S \circ T)^* = T^* \circ S^*.
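The reversal rule is worth checking on rectangular matrices, where the wrong order is not even dimensionally defined; a NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# Theorem 5.24(4): transposition reverses the order of the factors.
assert np.allclose((A @ B).T, B.T @ A.T)

# The other order fails outright: A.T is 4 x 3 and B.T is 2 x 4,
# so A.T @ B.T has a dimension mismatch.
try:
    A.T @ B.T
except ValueError:
    print("A^T B^T is not defined")
```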

To connect transposition with dual maps, we must first establish how dual spaces admit coordinates. Recall from Chapter 3 that if \mathcal{B} = \{v_1, \ldots, v_n\} is a basis of \mathcal{V}, the dual basis \mathcal{B}^* = \{\varphi_1, \ldots, \varphi_n\} of \mathcal{V}^* is defined by \varphi_i(v_j) = \delta_{ij} for all i, j. The functional \varphi_i extracts the i-th coordinate relative to \mathcal{B}.

Theorem 5.25 Let \mathcal{B} = \{v_1, \ldots, v_n\} be a basis of \mathcal{V} and \mathcal{B}^* = \{\varphi_1, \ldots, \varphi_n\} the dual basis of \mathcal{V}^*. For any \psi \in \mathcal{V}^*, we have [\psi]_{\mathcal{B}^*} = \begin{pmatrix} \psi(v_1) \\ \vdots \\ \psi(v_n) \end{pmatrix}.

Proof. Write \psi = \sum_{i=1}^{n} c_i \varphi_i for some scalars c_i. Evaluating on v_j gives \psi(v_j) = \sum_{i=1}^{n} c_i \varphi_i(v_j) = \sum_{i=1}^{n} c_i \delta_{ij} = c_j. Thus [\psi]_{\mathcal{B}^*} = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} \psi(v_1) \\ \vdots \\ \psi(v_n) \end{pmatrix}. \square

The coordinates of a functional are obtained by evaluating it on the basis vectors. This differs from vector coordinates, which are the coefficients in the basis expansion.
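Theorem 5.25 can be made concrete in \mathcal{P}_2; a plain-Python sketch assuming the basis \{1, x, x^2\} and, purely for illustration, the evaluation functional \psi(p) = p(2):

```python
# Represent p in P_2 by its coefficient list [c0, c1, c2] relative to {1, x, x^2}.
def psi(p):
    """Evaluation functional psi(p) = p(2)."""
    c0, c1, c2 = p
    return c0 + 2 * c1 + 4 * c2

# Coordinates of psi in the dual basis are its values on the basis vectors.
basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # 1, x, x^2
coords = [psi(v) for v in basis]
print(coords)   # [1, 2, 4]

# Check: psi(p) equals the pairing of these coordinates with p's coordinates.
p = [3, 2, -1]   # p(x) = 3 + 2x - x^2, so p(2) = 3 + 4 - 4 = 3
assert psi(p) == sum(c * a for c, a in zip(coords, p)) == 3
```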

We now prove the fundamental relationship between transposition and duality.

Theorem 5.26 Let T : \mathcal{V} \to \mathcal{W} be linear with bases \mathcal{B} = \{v_1, \ldots, v_n\} of \mathcal{V} and \mathcal{C} = \{w_1, \ldots, w_m\} of \mathcal{W}. Let \mathcal{B}^* and \mathcal{C}^* denote the dual bases of \mathcal{V}^* and \mathcal{W}^* respectively. If A = [T]_{\mathcal{B}}^{\mathcal{C}}, then [T^*]_{\mathcal{C}^*}^{\mathcal{B}^*} = A^T.

Proof. Let \psi_k \in \mathcal{C}^* for some 1 \leq k \leq m. We compute the coordinates of T^*\psi_k in the dual basis \mathcal{B}^*. By Theorem 5.25, [T^*\psi_k]_{\mathcal{B}^*} = \begin{pmatrix} (T^*\psi_k)(v_1) \\ \vdots \\ (T^*\psi_k)(v_n) \end{pmatrix}. The j-th entry is (T^*\psi_k)(v_j) = \psi_k(T(v_j)) by definition of T^*.

Expand T(v_j) = \sum_{i=1}^{m} a_{ij} w_i where a_{ij} are the entries of A. Then \psi_k(T(v_j)) = \psi_k\left(\sum_{i=1}^{m} a_{ij} w_i\right) = \sum_{i=1}^{m} a_{ij} \psi_k(w_i) = \sum_{i=1}^{m} a_{ij} \delta_{ki} = a_{kj}. Thus the j-th entry of [T^*\psi_k]_{\mathcal{B}^*} is a_{kj} = (A^T)_{jk}. This shows that the k-th column of [T^*]_{\mathcal{C}^*}^{\mathcal{B}^*} equals the k-th column of A^T. \square

The transpose of a matrix represents the dual of a linear map in the dual bases. Note the reversal: T : \mathcal{V} \to \mathcal{W} induces T^* : \mathcal{W}^* \to \mathcal{V}^*, and correspondingly, A \in M_{m \times n}(\mathbb{F}) transposes to A^T \in M_{n \times m}(\mathbb{F}).
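Theorem 5.26 can be verified on the differentiation operator D : \mathcal{P}_2 \to \mathcal{P}_1 from Section 5.1; a NumPy sketch, where the functional \psi (evaluation at t = 2) is our choice for illustration:

```python
import numpy as np

# Matrix of D relative to the bases {1, x, x^2} and {1, x}:
# D(1) = 0, D(x) = 1, D(x^2) = 2x.
A = np.array([[0, 1, 0],
              [0, 0, 2]])

# psi in P_1^*, evaluation at t = 2; dual-basis coordinates [psi(1), psi(x)].
psi_coords = np.array([1, 2])

# D^* psi sends p to psi(Dp) = p'(2); its coordinates in the dual basis of P_2^*
# are its values on 1, x, x^2: derivatives 0, 1, 2x evaluated at 2 give 0, 1, 4.
dual_coords = np.array([0, 1, 4])

# Theorem 5.26: the matrix of D^* is A^T.
assert np.array_equal(A.T @ psi_coords, dual_coords)
```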

This reversal reflects contravariance. Vectors transform covariantly: if T : \mathcal{V} \to \mathcal{W} and v \in \mathcal{V}, then T(v) \in \mathcal{W} follows the direction of the arrow. Functionals transform contravariantly: if \varphi \in \mathcal{W}^*, then T^*\varphi \in \mathcal{V}^* reverses the arrow.

This distinction appears throughout mathematics and physics. In differential geometry, tangent vectors are contravariant while cotangent vectors (differentials) are covariant. In general relativity, upper indices denote contravariant components and lower indices denote covariant components. The transpose operation encodes this duality at the level of matrices.


5.10 The Space of Matrices and Linear Maps

We established in Theorem 5.3 that M_{m \times n}(\mathbb{F}) is a vector space of dimension mn. We also know from Chapter 3 that \mathcal{L}(\mathcal{V}, \mathcal{W}) is a vector space under pointwise operations. The matrix representation provides an isomorphism between these spaces.

Theorem 5.27 Fix bases \mathcal{B} of \mathcal{V} and \mathcal{C} of \mathcal{W}, where \dim \mathcal{V} = n and \dim \mathcal{W} = m. The map \Phi : \mathcal{L}(\mathcal{V}, \mathcal{W}) \to M_{m \times n}(\mathbb{F}), \quad T \mapsto [T]_{\mathcal{B}}^{\mathcal{C}} is a vector space isomorphism.

Proof. We verify that \Phi is linear. Let S, T \in \mathcal{L}(\mathcal{V}, \mathcal{W}) and c \in \mathbb{F}. For any basis vector v_j \in \mathcal{B}, [(S + T)(v_j)]_{\mathcal{C}} = [S(v_j) + T(v_j)]_{\mathcal{C}} = [S(v_j)]_{\mathcal{C}} + [T(v_j)]_{\mathcal{C}} by linearity of the coordinate map from Theorem 5.1. Thus the j-th column of [S + T]_{\mathcal{B}}^{\mathcal{C}} equals the sum of the j-th columns of [S]_{\mathcal{B}}^{\mathcal{C}} and [T]_{\mathcal{B}}^{\mathcal{C}}, giving \Phi(S + T) = [S + T]_{\mathcal{B}}^{\mathcal{C}} = [S]_{\mathcal{B}}^{\mathcal{C}} + [T]_{\mathcal{B}}^{\mathcal{C}} = \Phi(S) + \Phi(T). Similarly, [(cT)(v_j)]_{\mathcal{C}} = [cT(v_j)]_{\mathcal{C}} = c[T(v_j)]_{\mathcal{C}}, so \Phi(cT) = c\Phi(T).

For injectivity, suppose \Phi(T) = 0, meaning [T]_{\mathcal{B}}^{\mathcal{C}} is the zero matrix. Then [T(v_j)]_{\mathcal{C}} = 0 for all j, so T(v_j) = 0 for all j by the injectivity of coordinate maps from Theorem 5.2. By Theorem 4.3, this forces T = 0. Thus \ker(\Phi) = \{0\}, and by Theorem 4.6, \Phi is injective.

For surjectivity, let A \in M_{m \times n}(\mathbb{F}) be arbitrary. Write A = \begin{pmatrix} | & & | \\ a_1 & \cdots & a_n \\ | & & | \end{pmatrix} where a_j \in \mathbb{F}^m is the j-th column. Each a_j determines a vector w_j \in \mathcal{W} via ([\cdot]_{\mathcal{C}})^{-1}(a_j) = w_j. Define T : \mathcal{V} \to \mathcal{W} by specifying T(v_j) = w_j on the basis and extending linearly. Then [T(v_j)]_{\mathcal{C}} = a_j by construction, so [T]_{\mathcal{B}}^{\mathcal{C}} = A, establishing surjectivity. \square

This isomorphism depends on the choice of bases \mathcal{B} and \mathcal{C}. Different basis choices yield different isomorphisms, all relating the same abstract space \mathcal{L}(\mathcal{V}, \mathcal{W}) to the same matrix space M_{m \times n}(\mathbb{F}).

As an immediate consequence, we obtain the dimension formula.

Corollary 5.1 If \dim \mathcal{V} = n and \dim \mathcal{W} = m, then \dim \mathcal{L}(\mathcal{V}, \mathcal{W}) = mn.

Proof. By Theorem 5.27, \mathcal{L}(\mathcal{V}, \mathcal{W}) is isomorphic to M_{m \times n}(\mathbb{F}). By Theorem 4.8 from Chapter 3, isomorphic spaces have equal dimension. From Theorem 5.3, \dim M_{m \times n}(\mathbb{F}) = mn. \square

When \mathcal{V} = \mathcal{W} and n = \dim \mathcal{V}, the space \mathcal{L}(\mathcal{V}, \mathcal{V}) of linear operators has dimension n^2. Such operators are also called endomorphisms. The space of n \times n matrices correspondingly has dimension n^2.
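The dimension count can be made concrete: the matrix units E_{ij}, each with a single 1 in position (i, j) and zeros elsewhere, form the standard basis of M_{m \times n}(\mathbb{F}). A NumPy sketch (the construction via np.eye is merely one convenient way to produce them):

```python
import numpy as np

m, n = 2, 3
# The mn matrix units E_ij: a 1 in position (i, j), zeros elsewhere.
units = [np.eye(1, m * n, k).reshape(m, n) for k in range(m * n)]
assert len(units) == m * n   # dimension mn = 6

# Every matrix is the combination sum_ij a_ij E_ij of its own entries,
# so the units span; the coefficients are the entries, so they are unique.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
recon = sum(A[i, j] * units[i * n + j] for i in range(m) for j in range(n))
assert np.array_equal(recon, A)
```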


5.11 Closing Remarks

This chapter constructed the bridge between abstract linear algebra and computational matrix algebra. Coordinate maps [\cdot]_{\mathcal{B}} convert vectors into column vectors; matrix representations [T]_{\mathcal{B}}^{\mathcal{C}} convert linear maps into rectangular arrays. The fundamental identity [T(v)]_{\mathcal{C}} = [T]_{\mathcal{B}}^{\mathcal{C}} [v]_{\mathcal{B}} from Theorem 5.6 ensures that applying T in the abstract space corresponds precisely to matrix multiplication in coordinates.