12 The Spectral Theorem
12.1 The Diagonalization Problem
Not every operator is diagonalizable. Over \mathbb{R}, rotations by angles other than multiples of \pi have no real eigenvalues. Over \mathbb{C}, the Jordan block \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix} is not diagonalizable.
We identify a class of operators guaranteed to admit orthonormal eigenbases: self-adjoint operators satisfying T^* = T, where the adjoint T^* is defined by \langle T(v), w \rangle = \langle v, T^*(w) \rangle.
The Spectral Theorem. Every self-adjoint operator on a finite-dimensional inner product space has an orthonormal eigenbasis and real eigenvalues.
Throughout, \mathcal{V} denotes a finite-dimensional inner product space over \mathbb{R} or \mathbb{C}.
In the Linear Maps chapter, we used T^* for the transpose (or dual map) T^* : \mathcal{W}^* \to \mathcal{V}^* defined by T^*(\varphi) = \varphi \circ T. Here, T^* denotes the adjoint defined by \langle T(v), w \rangle = \langle v, T^*(w) \rangle. In finite dimensions with an orthonormal basis, the adjoint is represented by the conjugate transpose A^*, while the algebraic transpose acts on dual spaces. When an inner product is present, the Riesz representation theorem identifies \mathcal{V} with \mathcal{V}^*, and under this identification the two notions coincide. We use T^* for the adjoint throughout this chapter and the remainder of the book.
12.2 The Adjoint Operator
The adjoint generalizes the conjugate transpose of matrices. Given a linear operator T : \mathcal{V} \to \mathcal{V}, we seek an operator T^* : \mathcal{V} \to \mathcal{V} such that moving T from the first argument of the inner product to the second (or vice versa) introduces T^*.
Theorem 12.1 Let \mathcal{V} be a finite-dimensional inner product space and T : \mathcal{V} \to \mathcal{V} a linear operator. There exists a unique linear operator T^* : \mathcal{V} \to \mathcal{V} such that \langle T(v), w \rangle = \langle v, T^*(w) \rangle for all v, w \in \mathcal{V}.
Proof. For fixed w \in \mathcal{V}, the map v \mapsto \langle T(v), w \rangle is a linear functional on \mathcal{V}. By Theorem 10.11, there exists a unique vector u \in \mathcal{V} such that \langle T(v), w \rangle = \langle v, u \rangle for all v. Define T^*(w) = u.
We verify T^* is linear. For w_1, w_2 \in \mathcal{V} and \alpha, \beta \in \mathbb{F}, \begin{align*} \langle v, T^*(\alpha w_1 + \beta w_2) \rangle &= \langle T(v), \alpha w_1 + \beta w_2 \rangle \\ &= \overline{\alpha} \langle T(v), w_1 \rangle + \overline{\beta} \langle T(v), w_2 \rangle \\ &= \overline{\alpha} \langle v, T^*(w_1) \rangle + \overline{\beta} \langle v, T^*(w_2) \rangle \\ &= \langle v, \alpha T^*(w_1) + \beta T^*(w_2) \rangle. \end{align*} Since this holds for all v, we conclude T^*(\alpha w_1 + \beta w_2) = \alpha T^*(w_1) + \beta T^*(w_2).
Uniqueness follows from the defining property: if S also satisfies \langle T(v), w \rangle = \langle v, S(w) \rangle, then \langle v, T^*(w) \rangle = \langle v, S(w) \rangle for all v, w, forcing T^* = S by nondegeneracy of the inner product. \square
Definition 12.1 (Adjoint operator) The linear operator T^* : \mathcal{V} \to \mathcal{V} satisfying \langle T(v), w \rangle = \langle v, T^*(w) \rangle for all v, w is the adjoint of T.
In \mathbb{C}^n with \langle x, y \rangle = y^* x (linear in the first argument and conjugate-linear in the second, consistent with our conventions), if T is represented by the matrix A, then T^* is represented by A^*: \begin{align*} \langle Ax, y \rangle = y^* (A x) = (A^* y)^* x = \langle x, A^* y \rangle. \end{align*} In \mathbb{R}^n, the adjoint of A is A^T.
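For readers following along computationally, the identity \langle Ax, y \rangle = \langle x, A^* y \rangle can be checked numerically. The following minimal NumPy sketch is illustrative only (random matrix and vectors, arbitrary seed); note that np.vdot conjugates its first argument.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random complex matrix and vectors in C^3.
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

A_star = A.conj().T  # the conjugate transpose represents the adjoint

# <Ax, y> == <x, A* y>.  With np.vdot(a, b) = conj(a) . b, the inner
# product <u, v> (conjugate-linear in the second slot) is np.vdot(v, u).
lhs = np.vdot(y, A @ x)       # <Ax, y>
rhs = np.vdot(A_star @ y, x)  # <x, A* y>
assert np.isclose(lhs, rhs)
```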
Examples.
Identity: I^* = I since \langle I(v), w \rangle = \langle v, w \rangle = \langle v, I(w) \rangle.
Zero operator: 0^* = 0 since \langle 0(v), w \rangle = 0 = \langle v, 0(w) \rangle.
Projection onto a subspace: If P is orthogonal projection onto \mathcal{W} (see Section 12.3 below), then P^* = P. We verify this in Section 12.3.
Rotation in \mathbb{R}^2: For rotation by angle \theta, the matrix is R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, so R^* = R^T = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} = R^{-1}. Rotations are orthogonal but not self-adjoint (unless \theta = 0 or \pi).
Theorem 12.2 For operators S, T : \mathcal{V} \to \mathcal{V} and scalars \alpha \in \mathbb{F}:
(a) (S + T)^* = S^* + T^*
(b) (\alpha T)^* = \overline{\alpha} T^*
(c) (ST)^* = T^* S^*
(d) (T^*)^* = T
(e) I^* = I
Proof.
(a) For all v, w, \begin{align*} \langle (S+T)(v), w \rangle &= \langle S(v) + T(v), w \rangle \\ &= \langle S(v), w \rangle + \langle T(v), w \rangle \\ &= \langle v, S^*(w) \rangle + \langle v, T^*(w) \rangle \\ &= \langle v, (S^* + T^*)(w) \rangle. \end{align*}
(b) Using conjugate-linearity of the inner product in the second argument, \begin{align*} \langle (\alpha T)(v), w \rangle &= \alpha \langle T(v), w \rangle = \alpha \langle v, T^*(w) \rangle \\ &= \langle v, \overline{\alpha} T^*(w) \rangle. \end{align*}
(c) \begin{align*} \langle (ST)(v), w \rangle &= \langle T(v), S^*(w) \rangle \\ &= \langle v, T^*(S^*(w)) \rangle = \langle v, (T^* S^*)(w) \rangle. \end{align*}
(d) For all v, w, \begin{align*} \langle T^*(v), w \rangle &= \overline{\langle w, T^*(v) \rangle} = \overline{\langle T(w), v \rangle} = \langle v, T(w) \rangle. \end{align*} Thus T satisfies the defining property of the adjoint of T^*, so (T^*)^* = T by uniqueness of the adjoint.
(e) Shown in the Examples above. \square
Property (c) shows the adjoint reverses order in products: (ST)^* = T^* S^*, analogous to (AB)^T = B^T A^T for matrices. Property (b) involves conjugation: (\alpha T)^* = \overline{\alpha} T^*, reflecting that in complex spaces, the inner product is conjugate-linear in the second argument.
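These algebraic rules are easy to confirm numerically. The sketch below (random matrices, arbitrary seed, purely illustrative) checks properties (a) through (d) for the matrix adjoint, i.e. the conjugate transpose.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
S = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
T = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
alpha = 2.0 - 3.0j

adj = lambda M: M.conj().T  # adjoint = conjugate transpose

assert np.allclose(adj(S + T), adj(S) + adj(T))              # (S+T)* = S* + T*
assert np.allclose(adj(alpha * T), np.conj(alpha) * adj(T))  # (aT)* = conj(a) T*
assert np.allclose(adj(S @ T), adj(T) @ adj(S))              # (ST)* = T* S*
assert np.allclose(adj(adj(T)), T)                           # (T*)* = T
```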
Theorem 12.3 For any operator T : \mathcal{V} \to \mathcal{V}:
(a) \ker(T^*) = (\operatorname{im}(T))^\perp
(b) \operatorname{im}(T^*) = (\ker(T))^\perp
(c) \ker(T) = (\operatorname{im}(T^*))^\perp
(d) \operatorname{rank}(T) = \operatorname{rank}(T^*)
Proof.
(a) \begin{align*} w \in \ker(T^*) &\iff T^*(w) = 0 \\ &\iff \langle v, T^*(w) \rangle = 0 \text{ for all } v \\ &\iff \langle T(v), w \rangle = 0 \text{ for all } v \\ &\iff w \perp \operatorname{im}(T) \\ &\iff w \in (\operatorname{im}(T))^\perp. \end{align*}
(b) and (c): Applying (a) to T^* and using (T^*)^* = T gives \ker(T) = (\operatorname{im}(T^*))^\perp, which is (c). Taking orthogonal complements and using (\mathcal{W}^\perp)^\perp = \mathcal{W} then gives (\ker(T))^\perp = \operatorname{im}(T^*), which is (b).
(d) Let n = \dim \mathcal{V}. By rank-nullity (Theorem 4.5) applied to T^*, together with (a) and the orthogonal decomposition \mathcal{V} = \operatorname{im}(T) \oplus (\operatorname{im}(T))^\perp, \begin{align*} n &= \dim \operatorname{im}(T) + \dim (\operatorname{im}(T))^\perp \\ &= \operatorname{rank}(T) + \dim \ker(T^*) \\ &= \operatorname{rank}(T) + (n - \operatorname{rank}(T^*)). \end{align*} Solving gives \operatorname{rank}(T) = \operatorname{rank}(T^*). \square
These relations show the adjoint interchanges kernel and image orthogonally. Geometrically, T^* “reverses the direction” of T while preserving orthogonality structure.
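The kernel-image relations can be seen concretely on a rank-deficient matrix. The sketch below (an arbitrary rank-2 product, illustrative only) uses the SVD to extract a basis of \ker(A^*) and confirms it is orthogonal to \operatorname{im}(A), along with the rank equality (d).

```python
import numpy as np

rng = np.random.default_rng(2)
# Build a rank-2 complex 4x4 matrix A = B C with B: 4x2 and C: 2x4.
B = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
C = rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))
A = B @ C
A_star = A.conj().T

# rank(T) = rank(T*)
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A_star) == 2

# ker(A*) = (im A)^perp: the trailing left singular vectors span ker(A*).
U, s, Vh = np.linalg.svd(A)
null_Astar = U[:, 2:]  # columns spanning ker(A*)
# Each such vector is orthogonal to every column of A, i.e. to im(A).
assert np.allclose(null_Astar.conj().T @ A, 0)
```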
12.3 Self-Adjoint Operators
An operator equal to its own adjoint enjoys special properties.
Definition 12.2 (Self-adjoint operator) An operator T : \mathcal{V} \to \mathcal{V} is self-adjoint (or Hermitian in the complex case, symmetric in the real case) if T^* = T. Equivalently, \langle T(v), w \rangle = \langle v, T(w) \rangle for all v, w \in \mathcal{V}.
In terms of matrices: T is self-adjoint if [T]_{\mathcal{B}} = [T]_{\mathcal{B}}^* in any orthonormal basis \mathcal{B}. For real matrices, this means A = A^T (symmetry); for complex matrices, A = A^* (conjugate symmetry).
Examples.
Diagonal matrices: Any diagonal matrix D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n) with real entries is self-adjoint since D^* = D.
Symmetric real matrices: A = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix} is self-adjoint over \mathbb{R} since A^T = A.
Hermitian complex matrices: A = \begin{pmatrix} 1 & i \\ -i & 2 \end{pmatrix} is self-adjoint over \mathbb{C} since A^* = \begin{pmatrix} 1 & -i \\ i & 2 \end{pmatrix}^T = \begin{pmatrix} 1 & i \\ -i & 2 \end{pmatrix} = A.
Orthogonal projections: If P is orthogonal projection onto subspace \mathcal{W}, then P^* = P. For v \in \mathcal{V} with v = w + u where w \in \mathcal{W} and u \in \mathcal{W}^\perp, we have P(v) = w. Then \begin{align*} \langle P(v_1), v_2 \rangle &= \langle w_1, w_2 + u_2 \rangle = \langle w_1, w_2 \rangle \\ &= \langle w_1 + u_1, w_2 \rangle = \langle v_1, P(v_2) \rangle, \end{align*} using w_1 \perp u_2 and u_1 \perp w_2.
Non-example: Rotation by \pi/4 in \mathbb{R}^2 is not self-adjoint, since R^T = R^{-1} \neq R; more generally, a rotation is self-adjoint only for \theta = 0 or \pi.
Theorem 12.4 Every eigenvalue of a self-adjoint operator T : \mathcal{V} \to \mathcal{V} is real.
Proof. Let \lambda be an eigenvalue with eigenvector v \neq 0. Then T(v) = \lambda v, so \begin{align*} \langle T(v), v \rangle &= \langle \lambda v, v \rangle = \lambda \|v\|^2. \end{align*} Since T = T^*, \begin{align*} \langle T(v), v \rangle &= \langle v, T(v) \rangle = \overline{\langle T(v), v \rangle} \\ &= \overline{\lambda \|v\|^2} = \overline{\lambda} \|v\|^2. \end{align*} Thus \lambda \|v\|^2 = \overline{\lambda} \|v\|^2. Since v \neq 0, we have \|v\|^2 > 0, so \lambda = \overline{\lambda}, meaning \lambda \in \mathbb{R}. \square
This is the first major consequence of self-adjointness: eigenvalues are guaranteed to be real, even in complex vector spaces. This explains why observables in quantum mechanics (self-adjoint operators) yield real measurement outcomes.
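Theorem 12.4 is easy to observe numerically: even a general-purpose eigenvalue routine, which returns complex numbers, produces (numerically) real eigenvalues when fed a Hermitian matrix. The sketch below symmetrizes a random complex matrix to obtain a Hermitian one; the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
H = (M + M.conj().T) / 2          # H is Hermitian: H* = H
assert np.allclose(H, H.conj().T)

# The general eigenvalue routine returns complex numbers, but for a
# Hermitian matrix their imaginary parts vanish (up to roundoff).
eigvals = np.linalg.eigvals(H)
assert np.allclose(eigvals.imag, 0, atol=1e-10)
```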
Theorem 12.5 Eigenvectors of a self-adjoint operator corresponding to distinct eigenvalues are orthogonal.
Proof. Let T(v_1) = \lambda_1 v_1 and T(v_2) = \lambda_2 v_2 with \lambda_1 \neq \lambda_2. Then \begin{align*} \lambda_1 \langle v_1, v_2 \rangle &= \langle \lambda_1 v_1, v_2 \rangle = \langle T(v_1), v_2 \rangle \\ &= \langle v_1, T(v_2) \rangle = \langle v_1, \lambda_2 v_2 \rangle = \overline{\lambda_2} \langle v_1, v_2 \rangle. \end{align*} Since eigenvalues are real by Theorem 12.4, \overline{\lambda_2} = \lambda_2, so (\lambda_1 - \lambda_2) \langle v_1, v_2 \rangle = 0. Since \lambda_1 \neq \lambda_2, we have \langle v_1, v_2 \rangle = 0, so v_1 \perp v_2. \square
This guarantees that eigenspaces corresponding to different eigenvalues are mutually orthogonal—a property not shared by general operators. It is this orthogonality that enables us to construct orthonormal eigenbases.
Theorem 12.6 If \mathcal{W} \subseteq \mathcal{V} is T-invariant for a self-adjoint operator T, then \mathcal{W}^\perp is also T-invariant.
Proof. Let w \in \mathcal{W} and u \in \mathcal{W}^\perp. Since \mathcal{W} is T-invariant, T(w) \in \mathcal{W}. We verify T(u) \in \mathcal{W}^\perp by showing \langle T(u), w \rangle = 0: \langle T(u), w \rangle = \langle u, T(w) \rangle = 0 since u \in \mathcal{W}^\perp and T(w) \in \mathcal{W}. Thus T(u) \perp w for all w \in \mathcal{W}, so T(u) \in \mathcal{W}^\perp. \square
This property is crucial for the inductive proof of the spectral theorem: starting with a one-dimensional eigenspace, its orthogonal complement is invariant, allowing us to restrict to a lower-dimensional subspace and apply induction.
12.4 The Spectral Theorem
We now prove the central result.
Theorem 12.7 (The Spectral Theorem) Let T : \mathcal{V} \to \mathcal{V} be a self-adjoint operator on a finite-dimensional inner product space. Then \mathcal{V} has an orthonormal basis consisting of eigenvectors of T. Equivalently, there exists an orthonormal basis \mathcal{B} in which [T]_{\mathcal{B}} is diagonal with real entries.
Proof. We proceed by induction on n = \dim \mathcal{V}.
Base case: If n = 1, every operator is multiplication by a scalar, which is real by Theorem 12.4; any unit vector forms an orthonormal eigenbasis.
Inductive step: Assume the result holds for all self-adjoint operators on spaces of dimension < n. Let \dim \mathcal{V} = n.
We first show T has a real eigenvalue. If \mathbb{F} = \mathbb{C}, the fundamental theorem of algebra gives a root \lambda_0 of the characteristic polynomial \chi(\lambda) = \det(T - \lambda I), hence an eigenvalue, which is real by Theorem 12.4. Suppose instead \mathbb{F} = \mathbb{R}. In any orthonormal basis \mathcal{B}, [T]_{\mathcal{B}} is a real symmetric matrix; view T as acting on \mathbb{C}^n via this matrix, so that \chi has real coefficients. By the fundamental theorem of algebra, \chi has a root \lambda_0 \in \mathbb{C}, so there exists a nonzero v \in \mathbb{C}^n with Tv = \lambda_0 v. We claim \lambda_0 \in \mathbb{R}.
Compute \langle Tv, v \rangle in two ways. On one hand, \langle Tv, v \rangle = \langle \lambda_0 v, v \rangle = \lambda_0 \|v\|^2. On the other, since T = T^*, \langle Tv, v \rangle = \langle v, Tv \rangle = \overline{\langle Tv, v \rangle} = \overline{\lambda_0} \|v\|^2. Thus \lambda_0 \|v\|^2 = \overline{\lambda_0} \|v\|^2. Since v \neq 0, we have \|v\|^2 > 0, so \lambda_0 = \overline{\lambda_0}, meaning \lambda_0 \in \mathbb{R}.
It remains to produce a real eigenvector. Write v = u + iw with u, w \in \mathbb{R}^n. Taking real and imaginary parts of Tv = \lambda_0 v gives Tu = \lambda_0 u \quad \text{and} \quad Tw = \lambda_0 w. Since v \neq 0, at least one of u, w is nonzero, yielding a real eigenvector of T with eigenvalue \lambda_0 \in \mathbb{R}.
Let \lambda_1 = \lambda_0 be this eigenvalue and v_1 a corresponding real eigenvector. Normalize to \|v_1\| = 1. Let \mathcal{W}_1 = \operatorname{span}(v_1), a one-dimensional T-invariant subspace.
By Theorem 12.6, \mathcal{W}_1^\perp is also T-invariant. Moreover, \dim \mathcal{W}_1^\perp = n - 1 by Corollary 11.1.
Restrict T to \mathcal{W}_1^\perp: define T' = T|_{\mathcal{W}_1^\perp} : \mathcal{W}_1^\perp \to \mathcal{W}_1^\perp. We verify T' is self-adjoint on \mathcal{W}_1^\perp (with the restricted inner product). For u, w \in \mathcal{W}_1^\perp, \langle T'(u), w \rangle = \langle T(u), w \rangle = \langle u, T(w) \rangle = \langle u, T'(w) \rangle. Thus T' is self-adjoint on the (n-1)-dimensional space \mathcal{W}_1^\perp.
By the inductive hypothesis, \mathcal{W}_1^\perp has an orthonormal basis \{v_2, \ldots, v_n\} of eigenvectors of T'. Since T' = T|_{\mathcal{W}_1^\perp}, these are also eigenvectors of T.
The set \{v_1, v_2, \ldots, v_n\} is orthonormal: v_1 \in \mathcal{W}_1 and v_2, \ldots, v_n \in \mathcal{W}_1^\perp are mutually orthogonal, and \{v_2, \ldots, v_n\} is orthonormal by construction. This gives an orthonormal eigenbasis of \mathcal{V}. \square
Corollary 12.1 (Matrix form) A matrix A \in M_n(\mathbb{F}) is self-adjoint (i.e., A = A^*) if and only if there exists a unitary matrix U (orthogonal if \mathbb{F} = \mathbb{R}) such that U^* A U = D where D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n) with \lambda_i \in \mathbb{R}.
Proof. If A = A^*, apply Theorem 12.7 to obtain an orthonormal eigenbasis. The change-of-basis matrix U from the standard basis to this eigenbasis is unitary (its columns are orthonormal), and U^* A U = D by the diagonalization formula. Conversely, if U^* A U = D with D real diagonal and U unitary, then A = U D U^*, so A^* = (U D U^*)^* = U D^* U^* = U D U^* = A (using D^* = D for real diagonal). \square
Remark. The spectral theorem is sometimes called the principal axis theorem in geometry, where it states that every symmetric bilinear form (quadratic form) can be diagonalized by rotating to principal axes. In physics, it’s the basis for normal modes in classical mechanics: coupled oscillators decouple into independent modes along eigendirections.
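Corollary 12.1 is exactly what numerical libraries implement. The sketch below (random symmetric matrix, arbitrary seed) uses NumPy's Hermitian eigensolver, which returns real eigenvalues and an orthonormal eigenbasis as the columns of U, and checks U^* A U = D and A = U D U^*.

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                  # real symmetric, hence self-adjoint

# eigh is NumPy's routine for Hermitian/symmetric matrices.
lams, U = np.linalg.eigh(A)

assert np.allclose(U.T @ U, np.eye(4))          # U is orthogonal
assert np.allclose(U.T @ A @ U, np.diag(lams))  # U* A U = D
assert np.allclose(A, U @ np.diag(lams) @ U.T)  # A = U D U*
```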
12.5 Spectral Decomposition
The spectral theorem provides more than diagonalization—it yields a canonical decomposition of the operator as a sum of projections.
Let T : \mathcal{V} \to \mathcal{V} be self-adjoint with eigenvalues \lambda_1, \ldots, \lambda_k (distinct) and corresponding eigenspaces E_{\lambda_1}, \ldots, E_{\lambda_k}. By Theorem 12.5, these eigenspaces are pairwise orthogonal. By the spectral theorem, they span \mathcal{V}: \mathcal{V} = E_{\lambda_1} \oplus E_{\lambda_2} \oplus \cdots \oplus E_{\lambda_k}.
Let P_i : \mathcal{V} \to E_{\lambda_i} denote orthogonal projection onto E_{\lambda_i}. By the Orthogonality chapter, every v \in \mathcal{V} decomposes uniquely as v = \sum_{i=1}^k P_i(v) where P_i(v) \in E_{\lambda_i}.
Since T(P_i(v)) = \lambda_i P_i(v) (as P_i(v) is an eigenvector with eigenvalue \lambda_i), we have \begin{align*} T(v) &= T\left(\sum_{i=1}^k P_i(v)\right) = \sum_{i=1}^k T(P_i(v)) \\ &= \sum_{i=1}^k \lambda_i P_i(v) = \left(\sum_{i=1}^k \lambda_i P_i\right)(v). \end{align*}
Theorem 12.8 (Spectral decomposition) Let T : \mathcal{V} \to \mathcal{V} be self-adjoint with distinct eigenvalues \lambda_1, \ldots, \lambda_k and corresponding eigenspaces E_{\lambda_1}, \ldots, E_{\lambda_k}. Let P_i : \mathcal{V} \to E_{\lambda_i} be orthogonal projection onto E_{\lambda_i}. Then T = \sum_{i=1}^k \lambda_i P_i. Moreover:
(a) P_i P_j = \delta_{ij} P_i (mutually orthogonal projections)
(b) \sum_{i=1}^k P_i = I (resolution of identity)
(c) P_i^* = P_i (self-adjoint)
(d) P_i^2 = P_i (idempotent)
Proof. The formula T = \sum \lambda_i P_i was shown above.
(a) For i \neq j, \operatorname{im}(P_i) = E_{\lambda_i} and \operatorname{im}(P_j) = E_{\lambda_j} are orthogonal, so P_i P_j = 0 (projecting onto E_{\lambda_j} and then onto E_{\lambda_i} gives zero since E_{\lambda_j} \perp E_{\lambda_i}). For i = j, P_i^2 = P_i by idempotence of projections.
(b) Since \mathcal{V} = \bigoplus E_{\lambda_i}, decomposing v = \sum P_i(v) gives v = (\sum P_i)(v), so \sum P_i = I.
(c) and (d): These are properties of orthogonal projections established in the Orthogonality chapter. \square
This decomposition is called the spectral decomposition or spectral resolution of T. It expresses T as a weighted sum of orthogonal projections onto eigenspaces, with weights given by eigenvalues. The projections P_i are sometimes called spectral projections.
Matrix form. If \mathcal{B} is an orthonormal eigenbasis organized by eigenspaces (first d_1 vectors span E_{\lambda_1}, next d_2 span E_{\lambda_2}, etc.), then [T]_{\mathcal{B}} = \begin{pmatrix} \lambda_1 I_{d_1} & 0 & \cdots & 0 \\ 0 & \lambda_2 I_{d_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_k I_{d_k} \end{pmatrix} where d_i = \dim E_{\lambda_i}.
Applications. The spectral decomposition allows functional calculus: for any function f : \mathbb{R} \to \mathbb{R}, define f(T) = \sum_{i=1}^k f(\lambda_i) P_i. For instance, T^2 = \sum \lambda_i^2 P_i, e^T = \sum e^{\lambda_i} P_i, and T^{-1} = \sum \lambda_i^{-1} P_i (if \lambda_i \neq 0). This extends the notion of applying functions to operators beyond polynomials.
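The spectral decomposition and functional calculus can be exercised directly. In the sketch below (illustrative; a random symmetric matrix shifted by 5I, an arbitrary choice that keeps the eigenvalues away from zero so T^{-1} exists), the P_i are the rank-one projections u_i u_i^T onto the eigenvector lines, and f(A) = \sum f(\lambda_i) P_i is checked for f(t) = t^2 and f(t) = t^{-1}.

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4))
# Shift by 5I (illustrative) so all eigenvalues are nonzero and distinct.
A = (M + M.T) / 2 + 5 * np.eye(4)
lams, U = np.linalg.eigh(A)

# Rank-one spectral projections P_i = u_i u_i^T.
P = [np.outer(U[:, i], U[:, i]) for i in range(4)]

assert np.allclose(sum(P), np.eye(4))                         # resolution of identity
assert np.allclose(sum(l * Pi for l, Pi in zip(lams, P)), A)  # A = sum lam_i P_i
assert np.allclose(P[0] @ P[1], 0)                            # P_i P_j = 0 for i != j
assert np.allclose(P[2] @ P[2], P[2])                         # idempotent

# Functional calculus: f(A) = sum f(lam_i) P_i.
A_sq = sum(l**2 * Pi for l, Pi in zip(lams, P))
assert np.allclose(A_sq, A @ A)                               # f(t) = t^2

A_inv = sum(Pi / l for l, Pi in zip(lams, P))
assert np.allclose(A @ A_inv, np.eye(4))                      # f(t) = 1/t
```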
12.6 Quadratic Forms
A quadratic form on \mathcal{V} is a function Q : \mathcal{V} \to \mathbb{R} of the form Q(v) = \langle T(v), v \rangle where T : \mathcal{V} \to \mathcal{V} is a self-adjoint operator. In coordinates, if v = (x_1, \ldots, x_n) and A = [T]_{\mathcal{B}}, then Q(x) = x^T A x = \sum_{i,j} a_{ij} x_i x_j.
Quadratic forms arise in optimization (Hessian matrices at critical points), physics (kinetic and potential energy), geometry (curvature), and statistics (variance of random vectors).
Theorem 12.9 (Principal Axes Theorem) Let Q(v) = \langle T(v), v \rangle be a quadratic form with T self-adjoint. There exists an orthonormal basis \mathcal{B} = \{e_1, \ldots, e_n\} of eigenvectors of T such that Q(v) = \sum_{i=1}^n \lambda_i c_i^2 where v = \sum c_i e_i and \lambda_i are the eigenvalues of T.
Proof. By the spectral theorem, choose an orthonormal eigenbasis \{e_1, \ldots, e_n\} with T(e_i) = \lambda_i e_i. For v = \sum c_i e_i, \begin{align*} Q(v) &= \langle T(v), v \rangle = \left\langle T\left(\sum c_i e_i\right), \sum c_j e_j \right\rangle \\ &= \left\langle \sum \lambda_i c_i e_i, \sum c_j e_j \right\rangle \\ &= \sum_{i,j} \lambda_i c_i \overline{c_j} \langle e_i, e_j \rangle = \sum_i \lambda_i |c_i|^2. \quad \square \end{align*}
In the eigenbasis, the quadratic form has no cross terms—it is a weighted sum of squares. The eigenvectors e_i are the principal axes, and the eigenvalues \lambda_i are the principal coefficients.
Classification by definiteness. The eigenvalues determine the behavior of Q:
- Positive definite: Q(v) > 0 for all v \neq 0 \iff all \lambda_i > 0
- Positive semidefinite: Q(v) \geq 0 for all v \iff all \lambda_i \geq 0
- Negative definite: Q(v) < 0 for all v \neq 0 \iff all \lambda_i < 0
- Negative semidefinite: Q(v) \leq 0 for all v \iff all \lambda_i \leq 0
- Indefinite: Q takes both positive and negative values \iff some \lambda_i > 0 and some \lambda_j < 0
Example (Conic sections). Consider Q(x, y) = 2xy in \mathbb{R}^2. The matrix is A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} with eigenvalues \lambda_1 = 1, \lambda_2 = -1 and eigenvectors (1,1)/\sqrt{2}, (1,-1)/\sqrt{2}. The quadratic form is indefinite, and each level set Q(x,y) = c \neq 0 is a hyperbola. Rotating by 45° to the eigenvector axes yields Q = u^2 - v^2 in the new coordinates, the standard hyperbola form.
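The principal axes computation can be checked on the cross-term form Q(x, y) = 2xy, whose matrix is \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. The sketch below (test point arbitrary) confirms that Q evaluated directly equals the diagonal form \sum \lambda_i c_i^2 in eigenvector coordinates.

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])       # Q(x, y) = 2xy
lams, U = np.linalg.eigh(A)      # ascending: -1 then +1
assert np.allclose(lams, [-1.0, 1.0])

# In eigenvector coordinates c = U^T v, the cross term vanishes.
v = np.array([0.7, -1.3])        # arbitrary test point
Q_direct = v @ A @ v             # 2 * x * y
c = U.T @ v                      # coordinates along the principal axes
Q_diag = lams[0] * c[0]**2 + lams[1] * c[1]**2
assert np.isclose(Q_direct, Q_diag)
```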
12.7 Normal Operators
The spectral theorem generalizes beyond self-adjoint operators.
Definition 12.3 (Normal operator) An operator T : \mathcal{V} \to \mathcal{V} is normal if T^* T = T T^*. Equivalently, T commutes with its adjoint.
Self-adjoint operators are normal (since T = T^* implies T^* T = T T^* = T^2). Orthogonal and unitary operators are normal (since T^* = T^{-1} gives T^* T = I = T T^*). But not all normal operators are self-adjoint or orthogonal.
Example. Rotation in \mathbb{R}^2 by angle \theta \neq 0, \pi is normal but not self-adjoint. Over \mathbb{C}, it diagonalizes to \operatorname{diag}(e^{i\theta}, e^{-i\theta})—complex eigenvalues on the unit circle.
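The rotation example can be verified numerically: the sketch below (angle 0.6 chosen arbitrarily) checks that R commutes with its adjoint, is not symmetric, and has eigenvalues e^{\pm i\theta} on the unit circle.

```python
import numpy as np

theta = 0.6                       # arbitrary angle, not 0 or pi
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# R is normal (R^T R = R R^T = I) but not self-adjoint.
assert np.allclose(R.T @ R, R @ R.T)
assert not np.allclose(R, R.T)

# Over C its eigenvalues are e^{+-i theta}, on the unit circle.
lams = np.linalg.eigvals(R)
assert np.allclose(np.abs(lams), 1.0)
assert np.allclose(sorted(lams, key=lambda z: z.imag),
                   [np.exp(-1j * theta), np.exp(1j * theta)])
```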
Theorem 12.10 (Spectral theorem for normal operators) Let T : \mathcal{V} \to \mathcal{V} be a normal operator on a finite-dimensional complex inner product space. Then \mathcal{V} has an orthonormal basis of eigenvectors of T. Equivalently, if A is the matrix of T in any orthonormal basis, there exists a unitary matrix U such that U^* A U is diagonal.
Proof. (Sketch) The proof mirrors that for self-adjoint operators. The key steps:
1. Show T has an eigenvalue \lambda \in \mathbb{C} (by the fundamental theorem of algebra).
2. Show that if \mathcal{W} is T-invariant, then \mathcal{W}^\perp is T^*-invariant; for normal T, this implies \mathcal{W}^\perp is also T-invariant.
3. Apply induction on dimension as before. \square
For real normal operators, complexification may be required to obtain all eigenvalues, leading to pairs of complex conjugate eigenvalues with two-dimensional invariant real subspaces (as in rotations).
12.8 Computing Eigenvalues and Diagonalization
Algorithmic procedure for diagonalizing a self-adjoint operator T:
1. Find eigenvalues: Solve \det(T - \lambda I) = 0 for \lambda. All roots are real.
2. Find eigenvectors: For each eigenvalue \lambda_i, solve (T - \lambda_i I)v = 0 to obtain a basis of E_{\lambda_i}.
3. Orthonormalize eigenbases: Apply Gram-Schmidt within each eigenspace if necessary (though often a basis is already orthogonal).
4. Assemble orthonormal eigenbasis: Concatenate the orthonormal bases from all eigenspaces.
5. Form diagonalizing matrix: U = [e_1 \mid \cdots \mid e_n] where e_i are the orthonormal eigenvectors. Then U^* A U = D where D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n).
Example. Diagonalize A = \begin{pmatrix} 4 & 2 \\ 2 & 1 \end{pmatrix}.
Step 1: Characteristic polynomial: \begin{align*} \det(A - \lambda I) &= \det\begin{pmatrix} 4-\lambda & 2 \\ 2 & 1-\lambda \end{pmatrix} \\ &= (4-\lambda)(1-\lambda) - 4 = \lambda^2 - 5\lambda = \lambda(\lambda - 5). \end{align*} Eigenvalues: \lambda_1 = 0, \lambda_2 = 5.
Step 2: Eigenvectors.
For \lambda_1 = 0: (A - 0I)v = Av = 0 gives \begin{pmatrix} 4 & 2 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 0, so 4x + 2y = 0, yielding v_1 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}.
For \lambda_2 = 5: (A - 5I)v = 0 gives \begin{pmatrix} -1 & 2 \\ 2 & -4 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 0, so -x + 2y = 0, yielding v_2 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}.
Step 3: Verify orthogonality: \langle v_1, v_2 \rangle = 1 \cdot 2 + (-2) \cdot 1 = 0.
Normalize: \|v_1\| = \sqrt{1 + 4} = \sqrt{5}, \|v_2\| = \sqrt{4 + 1} = \sqrt{5}. \begin{align*} e_1 = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 \\ -2 \end{pmatrix}, \quad e_2 = \frac{1}{\sqrt{5}}\begin{pmatrix} 2 \\ 1 \end{pmatrix}. \end{align*}
Step 4: Form U = [e_1 \mid e_2] = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix}.
Step 5: Verify U^T A U = D: D = \begin{pmatrix} 0 & 0 \\ 0 & 5 \end{pmatrix}.
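The worked example can be confirmed in a few lines; the sketch below checks the eigenvalues 0 and 5 and the orthogonal diagonalization (NumPy may return the eigenvectors with opposite signs, which does not affect the result).

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 1.0]])
lams, U = np.linalg.eigh(A)      # ascending order: 0 then 5
assert np.allclose(lams, [0.0, 5.0])

# The columns of U match e1 = (1,-2)/sqrt(5) and e2 = (2,1)/sqrt(5)
# up to sign; in any case U is orthogonal and diagonalizes A.
assert np.allclose(U.T @ U, np.eye(2))
assert np.allclose(U.T @ A @ U, np.diag([0.0, 5.0]))
```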
Geometric interpretation. The eigenvectors define new orthogonal axes. In these coordinates, the quadratic form Q(x, y) = x^T A x = 4x^2 + 4xy + y^2 becomes Q = 0 \cdot u^2 + 5 v^2 = 5v^2; its graph z = 5v^2 is a parabolic cylinder whose rulings run along the u-axis, and each level set Q = c > 0 is a pair of parallel lines.
12.9 Applications and Further Directions
1. Positive definite matrices and matrix square roots. If A is positive definite (A = A^* with all \lambda_i > 0), the spectral decomposition A = U D U^* with D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n) yields a self-adjoint positive definite square root \begin{align*} A^{1/2} = U D^{1/2} U^*, \qquad A = A^{1/2} A^{1/2} = (A^{1/2})(A^{1/2})^*. \end{align*} Positive definite matrices also admit the Cholesky decomposition A = L L^*, with L lower triangular, computed directly by Gaussian elimination and central to numerical algorithms for positive definite systems. Note that the spectral factor A^{1/2} is self-adjoint rather than triangular, so the two factorizations are distinct.
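The two factorizations can be compared side by side. The sketch below (a random positive definite matrix, illustrative only) builds the symmetric square root from the spectral decomposition and the triangular factor from Cholesky; both reproduce A.

```python
import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((3, 3))
A = B @ B.T + 3 * np.eye(3)       # positive definite by construction

lams, U = np.linalg.eigh(A)
assert np.all(lams > 0)

# Spectral square root: symmetric S with S S^T = A (not triangular).
sqrtA = U @ np.diag(np.sqrt(lams)) @ U.T
assert np.allclose(sqrtA @ sqrtA, A)
assert np.allclose(sqrtA, sqrtA.T)

# Cholesky factor: lower triangular L with L L^T = A.
L = np.linalg.cholesky(A)
assert np.allclose(L @ L.T, A)
assert np.allclose(L, np.tril(L))
```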
2. Inertia and Sylvester's law. The inertia of a self-adjoint operator is the triple (n_+, n_-, n_0), where n_+ is the number of positive eigenvalues, n_- the number of negative eigenvalues, and n_0 the number of zero eigenvalues. Sylvester's law of inertia states that congruence transformations A \mapsto S^* A S (for invertible S) preserve inertia, making the inertia a complete invariant of quadratic forms under change of basis.
3. Rayleigh quotient and variational characterization. The eigenvalues of T can be characterized variationally: \lambda_{\max} = \max_{\|v\| = 1} \langle T(v), v \rangle, \quad \lambda_{\min} = \min_{\|v\| = 1} \langle T(v), v \rangle. The Rayleigh quotient R(v) = \langle T(v), v \rangle / \|v\|^2 attains its maximum and minimum at eigenvectors corresponding to \lambda_{\max} and \lambda_{\min}.
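The variational characterization is easy to probe empirically: random directions never exceed the extreme eigenvalues, and the extremes are attained at the corresponding eigenvectors. The sketch below is illustrative (random symmetric matrix, arbitrary seed and sample count).

```python
import numpy as np

rng = np.random.default_rng(7)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2
lams, U = np.linalg.eigh(A)       # ascending
lam_min, lam_max = lams[0], lams[-1]

def rayleigh(v):
    """Rayleigh quotient R(v) = <Av, v> / ||v||^2."""
    return (v @ A @ v) / (v @ v)

# Random directions stay within [lam_min, lam_max]...
for _ in range(1000):
    v = rng.standard_normal(5)
    r = rayleigh(v)
    assert lam_min - 1e-12 <= r <= lam_max + 1e-12

# ...and the extremes are attained at the extreme eigenvectors.
assert np.isclose(rayleigh(U[:, 0]), lam_min)
assert np.isclose(rayleigh(U[:, -1]), lam_max)
```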
4. Simultaneous diagonalization. If S, T are self-adjoint and commute (ST = TS), they share a common orthonormal eigenbasis—they are simultaneously diagonalizable. This underlies the theory of commuting observables in quantum mechanics.
5. Singular value decomposition (SVD). For any matrix A \in M_{m \times n}(\mathbb{F}) (not necessarily square or self-adjoint), the matrices A^* A and A A^* are self-adjoint and positive semidefinite. Diagonalizing them yields the singular value decomposition A = U \Sigma V^*, where U, V are orthogonal/unitary and \Sigma is diagonal with nonnegative entries (singular values). This is developed further in Chapter 14.
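The link between the SVD and the spectral theorem can be seen numerically: the singular values of A are the square roots of the eigenvalues of the self-adjoint matrix A^* A. The sketch below (random rectangular matrix, arbitrary seed) checks this and the reconstruction A = U \Sigma V^*.

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((4, 3))   # rectangular, not self-adjoint

# Singular values of A = square roots of the eigenvalues of A^T A.
sing = np.linalg.svd(A, compute_uv=False)   # descending order
eig = np.linalg.eigvalsh(A.T @ A)           # ascending, all >= 0
assert np.allclose(np.sort(sing**2), eig)

# Full SVD reconstructs A with orthogonal factors and nonnegative Sigma.
U, s, Vh = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U @ np.diag(s) @ Vh, A)
```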
12.10 Closing Remarks
The spectral theorem is the culmination of the theory of self-adjoint operators: every such operator decomposes into independent scalar actions along orthogonal directions. This reduces complex linear transformations to their simplest form, revealing intrinsic geometric structure independent of coordinate choices.
Self-adjoint operators enjoy remarkable properties not shared by general operators: real eigenvalues, orthogonal eigenspaces, and guaranteed diagonalizability. These properties are not accidental—they reflect deep connections between linear algebra and geometry, mediated by the inner product.
The spectral decomposition T = \sum \lambda_i P_i expresses the operator as a weighted combination of orthogonal projections, each projecting onto an eigenspace. This decomposition enables functional calculus, allowing us to apply arbitrary functions to operators via f(T) = \sum f(\lambda_i) P_i.
Applications span mathematics, physics, engineering, and data science. In quantum mechanics, the spectral theorem justifies the probabilistic interpretation: measurements correspond to self-adjoint operators, outcomes to eigenvalues, and probabilities to projections onto eigenspaces. In statistics, principal component analysis diagonalizes covariance matrices, extracting independent modes of variation. In differential equations, the spectral theorem yields normal modes and separation of variables for symmetric operators.
The extension to normal operators on complex spaces broadens the theory to include rotations (unitary operators) and shear transformations. The spectral theorem for normal operators asserts that commutativity with the adjoint—T^* T = T T^*—suffices for orthogonal diagonalizability over \mathbb{C}.
Beyond finite dimensions, the theory generalizes to compact self-adjoint operators on infinite-dimensional Hilbert spaces, yielding spectral decompositions with convergent sums or integrals (the spectral theorem for unbounded operators). This infinite-dimensional extension underpins functional analysis, quantum mechanics, and partial differential equations.
The SVD chapter extends the spectral theorem to arbitrary matrices via the singular value decomposition, the most important factorization in applied linear algebra.