2  Vector Spaces

2.1 Mathematical Structures

Mathematics organizes itself through layers of abstraction. At the foundation lie sets—collections of objects with no additional structure. We impose structure by specifying operations and requiring they satisfy axioms.

A metric space equips a set with a notion of distance, allowing us to speak of convergence and continuity without reference to coordinates or algebraic operations. The real line \mathbb{R} with d(x,y) = |x-y| is a metric space; so is any set with a well-defined distance function satisfying non-negativity, symmetry, and the triangle inequality.

A topological space generalizes further, defining “closeness” through open sets rather than explicit distance. Metric spaces induce topologies (open balls generate the open sets), but topologies need not arise from metrics. Topology studies properties preserved under continuous deformation—compactness, connectedness, separation—without measuring.

A vector space imposes algebraic structure: vectors can be added and scaled, with these operations satisfying natural axioms. No notion of distance or angle is required—only the ability to form linear combinations. The theory of vector spaces is linear algebra, the foundation for analyzing systems of equations, studying transformations, and understanding function spaces.

A normed space combines vector space structure with measurement: a norm \|v\| assigns each vector a non-negative real number representing its “size” or “length.” Every normed space is a metric space via d(u,v) = \|u-v\|, but not every metric space admits a norm compatible with vector operations. The norm respects scaling (\|cv\| = |c|\|v\|) and satisfies the triangle inequality (\|u+v\| \leq \|u\| + \|v\|), properties not guaranteed by arbitrary metrics.

An inner product space refines further: the inner product \langle u, v \rangle measures both magnitude and angle, inducing a norm via \|v\| = \sqrt{\langle v,v \rangle}. Euclidean space \mathbb{R}^n with the dot product is the canonical example. Inner products enable orthogonality, projections, and the geometric intuition underlying Fourier analysis and quantum mechanics.

The hierarchy flows naturally: \text{Inner Product Spaces} \subset \text{Normed Spaces} \subset \text{Metric Spaces} \subset \text{Topological Spaces}. Vector spaces stand apart—they provide algebraic rather than topological structure. When a vector space admits a norm compatible with its operations, it enters the metric hierarchy. When the norm arises from an inner product, we gain geometry.

This chapter develops vector spaces in isolation. We impose no topology, define no distance, specify no inner product. The theory proceeds purely algebraically, studying what can be expressed through addition and scaling alone. Subsequent chapters introduce norms and inner products, enriching the structure.


2.2 Notation and Conventions

On the representation of vectors. In elementary treatments, vectors are often denoted by bold lowercase letters \mathbf{v}, sometimes with arrows \vec{v}, to distinguish them from scalars.

We abandon this distinction. Vectors are elements of abstract spaces—functions, sequences, polynomials, or formal symbols. Context determines which is which.

When coordinates appear, we write column vectors as v = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} rather than row vectors, since linear maps naturally act on columns (matrix-vector products Av require v in column form). Row vectors appear as dual vectors or covectors when we study linear functionals.

The zero vector in any space is denoted 0, not \mathbf{0} or \vec{0}. The scalar zero and vector zero are distinguished by context. If ambiguity arises, we write 0_V to specify the zero element of space V.

Subscripts denote components or indices: v_1, v_2, \ldots are vectors in a sequence or basis elements, while v^{(1)}, v^{(2)}, \ldots indicate iteration or different instances of the same type. Superscripts in parentheses are labels, not exponents.


2.3 The Definition of a Vector Space

We define vector spaces axiomatically, isolating the essential properties of addition and scalar multiplication.

Definition 2.1 (Vector Space) A vector space over a field \mathbb{F} is a set \mathcal{V} equipped with two operations:

  • Addition: +:\mathcal{V} \times \mathcal{V} \to \mathcal{V}, denoted (u, v) \mapsto u + v

  • Scalar multiplication: \cdot:\mathbb{F} \times \mathcal{V} \to \mathcal{V}, denoted (c, v) \mapsto c\cdot v

satisfying, for all u, v, w \in \mathcal{V} and all a, b \in \mathbb{F}:

  1. (Commutativity) u + v = v + u
  2. (Associativity of addition) (u + v) + w = u + (v + w)
  3. (Additive identity) There exists 0 \in \mathcal{V} such that v + 0 = v for all v
  4. (Additive inverse) For each v \in \mathcal{V}, there exists -v \in \mathcal{V} with v + (-v) = 0
  5. (Distributivity over vector addition) a(u + v) = au + av
  6. (Distributivity over scalar addition) (a + b)v = av + bv
  7. (Associativity of scalar multiplication) a(bv) = (ab)v
  8. (Multiplicative identity) 1v = v for all v

Elements of \mathcal{V} are vectors; elements of \mathbb{F} are scalars.

A field \mathbb{F} is a set with addition and multiplication satisfying the usual arithmetic properties (commutativity, associativity, distributivity, existence of identities and inverses). The primary examples are:

  • \mathbb{R} (real numbers) — used throughout this course
  • \mathbb{C} (complex numbers) — essential for spectral theory and quantum mechanics
  • \mathbb{Q} (rational numbers) — important in number theory
  • \mathbb{Z}_p (integers modulo p for prime p) — finite fields used in coding theory, cryptography, and error correction

When we write “vector space” without qualification, we mean a vector space over \mathbb{R}. When the field matters, we specify: “\mathcal{V} is a vector space over \mathbb{F}” or “\mathcal{V} is an \mathbb{F}-vector space.”

Finite fields \mathbb{Z}_p enable linear algebra over finite sets, powering modern coding theory (Reed-Solomon codes), cryptographic protocols (elliptic curve cryptography), and network coding.
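Arithmetic in \mathbb{Z}_p is concrete enough to compute directly. A minimal Python sketch (the field p = 5 and the helper names add, mul, inv are our choices for illustration), using three-argument pow for modular inverses:

```python
# A minimal sketch of arithmetic in the finite field Z_p, here p = 5.
# Addition and multiplication are taken mod p; pow(a, -1, p) computes
# the multiplicative inverse (Python 3.8+).
p = 5

def add(a, b):
    return (a + b) % p

def mul(a, b):
    return (a * b) % p

def inv(a):
    # Multiplicative inverse of a nonzero element of Z_p.
    return pow(a, -1, p)

# Every nonzero element has an inverse, so Z_5 is a field:
for a in range(1, p):
    assert mul(a, inv(a)) == 1

# Scalar multiplication in the vector space Z_5^2 is componentwise:
v = (2, 4)
cv = tuple(mul(3, x) for x in v)   # (6 mod 5, 12 mod 5) = (1, 2)
assert cv == (1, 2)
```

The same code works for any prime p; for composite moduli inv fails for non-units, which is exactly why \mathbb{Z}_p is a field only when p is prime.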

The axioms specify closure (operations remain within \mathcal{V}), algebraic identities (commutativity, associativity, distributivity), and the existence of neutral/inverse elements. They make no reference to coordinates, dimension, or geometric intuition. Any set with operations satisfying these properties is a vector space.


2.4 Elementary Consequences

The axioms imply several immediate consequences. Though elementary, these results are essential—they justify familiar algebraic manipulations and establish uniqueness of key elements.

Theorem 2.1 (Uniqueness of Zero) The additive identity is unique.

Proof. Suppose 0 and 0' are both additive identities. Then 0 = 0 + 0' = 0', using first that 0' is an identity (so 0 + 0' = 0) and second that 0 is an identity (so 0 + 0' = 0'). \square

Theorem 2.2 (Uniqueness of Inverses) For each v \in \mathcal{V}, the additive inverse is unique.

Proof. Suppose w and w' both satisfy v + w = 0 and v + w' = 0. Then w = w + 0 = w + (v + w') = (w + v) + w' = 0 + w' = w', using associativity and commutativity. \square

The first two results establish uniqueness. The next three describe how the scalar zero, the zero vector, and the scalar -1 interact with the operations.

Theorem 2.3 (Scalar Zero Annihilates Vectors) For any v \in \mathcal{V}, we have 0v = 0.

Proof. Observe that 0v = (0 + 0)v = 0v + 0v. Subtracting 0v from both sides (adding -(0v)) gives 0 = 0v. \square

The dual statement holds for scalar multiplication of the zero vector.

Theorem 2.4 (Scalar Multiplication by Zero Vector) For any a \in \mathbb{F}, we have a \cdot 0 = 0.

Proof. As in Theorem 2.3: a \cdot 0 = a(0 + 0) = a \cdot 0 + a \cdot 0, hence 0 = a \cdot 0. \square

Finally, we confirm that scalar multiplication by -1 produces the additive inverse.

Theorem 2.5 (Scalar -1 Acts as Inverse) For any v \in \mathcal{V}, we have (-1)v = -v.

Proof. Compute v + (-1)v = 1v + (-1)v = (1 + (-1))v = 0v = 0, so (-1)v is the additive inverse of v. By uniqueness, (-1)v = -v. \square

These results confirm that the axioms impose precisely the structure needed for algebraic manipulation. The zero vector is unique, inverses are unique, and scalar multiplication interacts with the zero elements as expected.


2.5 Examples of Vector Spaces

The power of abstraction lies in recognizing common structure across diverse contexts. We present several fundamental examples.

2.5.1 Euclidean Space \mathbb{R}^n

The set of all n-tuples of real numbers: \mathbb{R}^n = \left\{(x_1, \ldots, x_n) : x_i \in \mathbb{R}\right\}.

Addition and scalar multiplication are defined componentwise: (x_1, \ldots, x_n) + (y_1, \ldots, y_n) = (x_1 + y_1, \ldots, x_n + y_n), c(x_1, \ldots, x_n) = (cx_1, \ldots, cx_n).

The zero vector is (0, 0, \ldots, 0), and the additive inverse of (x_1, \ldots, x_n) is (-x_1, \ldots, -x_n). Verification of the axioms reduces to properties of real number arithmetic.

For n = 1, we recover \mathbb{R} itself as a vector space over \mathbb{R}. For n = 2, 3, we obtain the familiar plane and three-dimensional space. For arbitrary n, we have a coordinate space with no immediate geometric interpretation—the algebraic structure suffices.
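The componentwise definitions can be spot-checked directly. A minimal Python sketch (the helper names vadd and smul are ours) verifying a few axioms of Definition 2.1 for sample vectors in \mathbb{R}^3:

```python
# Componentwise operations on R^n, with vectors represented as tuples;
# a sketch that spot-checks several vector space axioms on samples.
def vadd(u, v):
    return tuple(ui + vi for ui, vi in zip(u, v))

def smul(c, v):
    return tuple(c * vi for vi in v)

u, v, w = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)
zero = (0.0, 0.0, 0.0)

assert vadd(u, v) == vadd(v, u)                                    # axiom 1
assert vadd(vadd(u, v), w) == vadd(u, vadd(v, w))                  # axiom 2
assert vadd(v, zero) == v                                          # axiom 3
assert vadd(v, smul(-1.0, v)) == zero                              # axiom 4
assert smul(2.0, vadd(u, v)) == vadd(smul(2.0, u), smul(2.0, v))   # axiom 5
```

Of course, checking samples is not a proof; the proof reduces each axiom to the corresponding property of real arithmetic, as noted above.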

2.5.2 Spaces of Functions

Let X be any nonempty set. The collection of all functions f : X \to \mathbb{R} forms a vector space, denoted \mathbb{R}^X or \mathscr{F}(X, \mathbb{R}), under pointwise operations: (f + g)(x) = f(x) + g(x), \quad (cf)(x) = c \cdot f(x).

The zero vector is the constant function 0(x) = 0 for all x \in X. The additive inverse of f is -f, defined by (-f)(x) = -f(x).

Each axiom follows from the corresponding property of real numbers at each point x. For instance, commutativity: (f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x), so f + g = g + f as functions.
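The pointwise operations translate directly into code if we represent elements of \mathscr{F}(X, \mathbb{R}) as Python callables; a small sketch (the names fadd, fsmul, zero_fn are ours):

```python
import math

# Pointwise operations on functions f : X -> R, with functions
# represented as Python callables; a sketch of F(X, R).
def fadd(f, g):
    return lambda x: f(x) + g(x)

def fsmul(c, f):
    return lambda x: c * f(x)

zero_fn = lambda x: 0.0   # the zero vector of the function space

h = fadd(math.sin, fsmul(2.0, math.cos))   # the function x |-> sin x + 2 cos x
assert h(0.0) == 2.0                       # sin 0 + 2 cos 0 = 2
assert fadd(math.sin, zero_fn)(1.0) == math.sin(1.0)   # f + 0 = f pointwise
```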

Special cases:

  • Continuous functions: C([a, b]), the set of all continuous functions f : [a, b] \to \mathbb{R}, is a vector space. Sums and scalar multiples of continuous functions are continuous.

  • Differentiable functions: C^1([a, b]), the set of functions on [a, b] with continuous derivative, forms a vector space. Differentiation is linear: (f + g)' = f' + g' and (cf)' = cf'.

  • Integrable functions: \mathscr{R}([a, b]), the set of Riemann integrable functions on [a, b], forms a vector space. Integration is linear: \int (f + g) = \int f + \int g.

Function spaces are infinite-dimensional—we cannot describe every function with finitely many coordinates. This distinguishes them fundamentally from \mathbb{R}^n.

2.5.3 Polynomial Spaces

Let \mathcal{P} denote the set of all polynomials with real coefficients: p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n, \quad a_i \in \mathbb{R}.

Addition and scalar multiplication are defined in the natural way: (p + q)(x) = p(x) + q(x), \quad (cp)(x) = c \cdot p(x).

The zero vector is the zero polynomial 0(x) = 0 (all coefficients zero). The additive inverse of p is -p, with coefficients (-a_0, -a_1, \ldots).

Polynomials of degree at most n form a subspace \mathcal{P}_n \subset \mathcal{P}. The space \mathcal{P}_n has dimension n + 1—each polynomial is uniquely determined by n + 1 coefficients. The full space \mathcal{P} is infinite-dimensional.
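Representing a polynomial by its coefficient list [a_0, a_1, \ldots, a_n] makes the vector space operations concrete; a sketch (the helper names padd and psmul are ours):

```python
from itertools import zip_longest

# Polynomials as coefficient lists [a_0, a_1, ..., a_n]; a sketch of
# the vector space operations on P (and on each P_n).
def padd(p, q):
    # Pad the shorter list with zero coefficients, then add componentwise.
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def psmul(c, p):
    return [c * a for a in p]

p = [1, 0, 2]      # 1 + 2x^2
q = [0, 3]         # 3x
assert padd(p, q) == [1, 3, 2]        # 1 + 3x + 2x^2
assert psmul(-1, p) == [-1, 0, -2]    # the additive inverse -p
# P_2 has dimension 3: a degree <= 2 polynomial needs 3 coefficients.
assert len(p) == 3
```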

Polynomial spaces provide a tractable infinite-dimensional setting. Unlike general function spaces, polynomials are algebraically manipulable—we can multiply them, factor them, and study their roots. They appear throughout analysis as approximations to more complicated functions (Taylor series, interpolation).

2.5.4 Sequence Spaces

The set of all real sequences (x_1, x_2, x_3, \ldots) forms a vector space under componentwise addition and scalar multiplication: (x_n) + (y_n) = (x_n + y_n), \quad c(x_n) = (cx_n).

The zero vector is the sequence (0, 0, 0, \ldots). The additive inverse of (x_n) is (-x_n).

Subspaces of interest:

  • Convergent sequences: Those (x_n) for which \lim_{n \to \infty} x_n exists. This is a vector space because limits of sums are sums of limits, and limits of scalar multiples are scalar multiples of limits.

  • Bounded sequences: Those (x_n) for which there exists M > 0 with |x_n| \leq M for all n. Sums and scalar multiples of bounded sequences are bounded.

  • Sequences converging to zero: The subspace c_0 = \{(x_n) : x_n \to 0\}. This is strictly smaller than the space of convergent sequences but still infinite-dimensional.

Sequence spaces, as we’ve seen, model discrete-time processes and provide the setting for studying series. The partial sums of a series \sum a_n form a sequence (s_n) in the space of sequences.
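Sequences can be modeled as functions \mathbb{N} \to \mathbb{R}, which makes both the componentwise operations and the partial-sum construction easy to sketch in Python (the helper names sadd and partial_sums are ours):

```python
# Sequences represented as functions N -> R; componentwise operations
# and the partial-sum sequence of a series, as a sketch.
def sadd(x, y):
    return lambda n: x(n) + y(n)

def ssmul(c, x):
    return lambda n: c * x(n)

x = lambda n: 1.0 / n        # the sequence (1, 1/2, 1/3, ...)
y = lambda n: (-1.0) ** n    # the sequence (-1, 1, -1, ...)
assert sadd(x, y)(2) == 1.5  # 1/2 + (-1)^2

# The partial sums s_n = a_1 + ... + a_n of a series form a sequence:
def partial_sums(a):
    return lambda n: sum(a(k) for k in range(1, n + 1))

geom = lambda n: 0.5 ** n    # geometric series, sum converges to 1
s = partial_sums(geom)
assert abs(s(30) - 1.0) < 1e-8
```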

2.5.5 Non-Examples

Not every set with natural operations forms a vector space.

Natural numbers \mathbb{N} under addition. There is no additive identity (no zero in \mathbb{N} under the usual interpretation) and no additive inverses (negatives of positive integers are not natural numbers). The structure fails axioms 3 and 4.

Positive reals \mathbb{R}_{>0} under usual addition. There is no zero in \mathbb{R}_{>0} (since 0 \notin \mathbb{R}_{>0}), and negatives lie outside the set. Axioms 3 and 4 fail again.

The set \{(x, y) \in \mathbb{R}^2 : x, y > 0\} under componentwise addition. Same issue—no zero, no negatives. The axioms fail.

In each case the failure is structural: without a neutral element and inverses inside the set, the axioms cannot be satisfied. The axioms are substantive constraints, not formalities.


2.6 Subspaces

Many vector spaces naturally contain smaller vector spaces within them. Recognizing this structure simplifies analysis.

Definition 2.2 (Subspace) A subset \mathcal{W} \subseteq \mathcal{V} is a subspace of \mathcal{V} if:

  1. 0 \in \mathcal{W} (contains the zero vector)
  2. If u, w \in \mathcal{W}, then u + w \in \mathcal{W} (closed under addition)
  3. If w \in \mathcal{W} and c \in \mathbb{R}, then cw \in \mathcal{W} (closed under scalar multiplication)

A subspace inherits the vector space structure from \mathcal{V}—the operations are the same, and the axioms hold automatically because they hold in the ambient space. The conditions ensure \mathcal{W} is closed under these operations and contains the required neutral element.

Theorem 2.6 (Subspaces Are Vector Spaces) If \mathcal{W} is a subspace of \mathcal{V}, then \mathcal{W} is a vector space under the operations inherited from \mathcal{V}.

Proof. The axioms for a vector space are identities (commutativity, associativity, distributivity) or existence claims (zero, inverses). The identities hold in \mathcal{W} because they hold in \mathcal{V} for any elements. The zero vector exists by condition 1. For inverses: if w \in \mathcal{W}, then -w = (-1)w \in \mathcal{W} by closure under scalar multiplication. \square

2.6.0.1 Examples of Subspaces

Lines through the origin in \mathbb{R}^2. Let v \neq 0 in \mathbb{R}^2. The set \mathcal{W} = \{cv : c \in \mathbb{R}\} is a subspace. It contains 0 = 0v. It is closed under addition: c_1 v + c_2 v = (c_1 + c_2)v \in \mathcal{W}. It is closed under scalar multiplication: a(cv) = (ac)v \in \mathcal{W}.

Geometrically, \mathcal{W} is the line through the origin in the direction of v.

Polynomials of degree at most n within \mathcal{P}. The set \mathcal{P}_n \subset \mathcal{P} is a subspace. The zero polynomial has degree -\infty (or undefined), conventionally less than any n. Sums of degree-n polynomials have degree at most n. Scalar multiples preserve degree.

Solution sets of homogeneous linear equations. Consider the equation ax + by + cz = 0 in \mathbb{R}^3. The solution set is a plane through the origin, which is a subspace. The zero vector (0,0,0) satisfies the equation. Sums of solutions are solutions (linearity of the equation). Scalar multiples of solutions are solutions.

More generally, the solution set of any homogeneous linear system Av = 0 is a subspace of \mathbb{R}^n. This is the null space or kernel of the matrix A, a central object in linear algebra that we will study further in later chapters.
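The closure argument for a plane through the origin can be checked numerically. A minimal sketch using the example plane x + 2y - z = 0 (the coefficients and test vectors are our choices):

```python
# A closure check for the solution set of ax + by + cz = 0 in R^3,
# sketched for the example plane x + 2y - z = 0.
a, b, c = 1.0, 2.0, -1.0

def solves(v):
    x, y, z = v
    return a * x + b * y + c * z == 0.0

u = (1.0, 0.0, 1.0)    # 1 + 0 - 1 = 0
w = (0.0, 1.0, 2.0)    # 0 + 2 - 2 = 0
assert solves(u) and solves(w)
assert solves((0.0, 0.0, 0.0))                           # contains zero
assert solves(tuple(ui + wi for ui, wi in zip(u, w)))    # closed under +
assert solves(tuple(5.0 * ui for ui in u))               # closed under scaling
```

The check mirrors the proof: the equation is linear, so sums and scalar multiples of solutions are again solutions.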

Continuous functions within all functions. C([a,b]) \subset \mathscr{F}([a,b], \mathbb{R}). The constant zero function is continuous. Sums of continuous functions are continuous. Scalar multiples of continuous functions are continuous.

2.6.1 Intersection and Sum of Subspaces

Subspaces combine in natural ways. Intersection preserves the subspace property; sums produce the smallest subspace containing given subspaces.

Theorem 2.7 (Intersection of Subspaces) If \mathcal{W}_1 and \mathcal{W}_2 are subspaces of \mathcal{V}, then \mathcal{W}_1 \cap \mathcal{W}_2 is a subspace.

Proof. The zero vector lies in both \mathcal{W}_1 and \mathcal{W}_2, hence in their intersection. If u, v \in \mathcal{W}_1 \cap \mathcal{W}_2, then u, v \in \mathcal{W}_1 and u, v \in \mathcal{W}_2. Since \mathcal{W}_1 is closed under addition, u + v \in \mathcal{W}_1. Since \mathcal{W}_2 is closed under addition, u + v \in \mathcal{W}_2. Thus u + v \in \mathcal{W}_1 \cap \mathcal{W}_2. Closure under scalar multiplication follows similarly. \square

The intersection of arbitrarily many subspaces is a subspace. The union, however, typically is not—closure under addition fails unless one subspace contains the other.

Definition 2.3 (Sum of Subspaces) If \mathcal{W}_1, \mathcal{W}_2 \subseteq \mathcal{V} are subspaces, their sum is \mathcal{W}_1 + \mathcal{W}_2 = \{w_1 + w_2 : w_1 \in \mathcal{W}_1, w_2 \in \mathcal{W}_2\}.

This is the smallest subspace containing both \mathcal{W}_1 and \mathcal{W}_2. Verification of the subspace axioms is straightforward: 0 = 0 + 0 \in \mathcal{W}_1 + \mathcal{W}_2; if u = u_1 + u_2 and v = v_1 + v_2, then u + v = (u_1 + v_1) + (u_2 + v_2) \in \mathcal{W}_1 + \mathcal{W}_2; scalar multiples behave similarly.
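A concrete instance in \mathbb{R}^3: with \mathcal{W}_1 = \text{span}\{(1,0,0)\} and \mathcal{W}_2 = \text{span}\{(0,1,0)\}, the sum \mathcal{W}_1 + \mathcal{W}_2 is the xy-plane. A minimal Python sketch (the membership helpers are ours) exhibiting the decomposition v = w_1 + w_2:

```python
# A sketch of the sum of subspaces in R^3: W1 = span{(1,0,0)} and
# W2 = span{(0,1,0)}, whose sum W1 + W2 is the xy-plane.
def in_W1(v):
    return v[1] == 0 and v[2] == 0

def in_W2(v):
    return v[0] == 0 and v[2] == 0

def in_sum(v):
    return v[2] == 0          # exactly the vectors (x, y, 0)

v = (3.0, -2.0, 0.0)
assert in_sum(v)
# Explicit decomposition v = w1 + w2 with w1 in W1, w2 in W2:
w1, w2 = (v[0], 0.0, 0.0), (0.0, v[1], 0.0)
assert in_W1(w1) and in_W2(w2)
assert tuple(p + q for p, q in zip(w1, w2)) == v
```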


2.7 Applications Across Disciplines

Vector spaces provide the basic language for many areas of mathematics and its applications, including analysis, probability, physics, data science, and economics.

In later chapters, we will see how linear maps, eigenvalues, inner products, and orthogonal decompositions make these connections precise.

2.7.1 Machine Learning and Data Science

High-dimensional data as vectors. In machine learning, data points are vectors. An image with n pixels is a vector in \mathbb{R}^n (or \mathbb{R}^{3n} for RGB). A document represented by word frequencies is a vector in \mathbb{R}^d, where d is the vocabulary size. Training a classifier means finding a linear separator (hyperplane) in this vector space.

Linear models. Linear regression fits data by minimizing \|Ax - b\|^2 over vectors x \in \mathbb{R}^n. Neural networks compose linear maps (matrix multiplications) with nonlinearities. The backpropagation algorithm computes gradients—covectors in the dual space.
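In the one-feature case, the least-squares minimizer has a closed form. A minimal sketch in plain Python (the sample data y = 2x + 1 is our invention), fitting y \approx mx + c via the normal equations:

```python
# A minimal least-squares sketch: fit y ~ m*x + c by minimizing
# sum((m*x_i + c - y_i)^2), using the closed-form normal equations
# for a single feature (plain Python, no libraries).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # exactly y = 2x + 1

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Slope: covariance of (x, y) divided by variance of x.
m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
c = mean_y - m * mean_x            # intercept from the means
assert abs(m - 2.0) < 1e-12 and abs(c - 1.0) < 1e-12
```

With many features, the same minimization is solved by the matrix normal equations A^T A x = A^T b, which we will be equipped to analyze once linear maps and inner products are developed.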

Dimensionality reduction. Principal Component Analysis (PCA) finds the subspace of maximal variance by computing eigenvectors of the covariance matrix. Data projected onto this subspace retains essential structure while reducing dimension.

2.7.2 Coding Theory and Cryptography

Error-correcting codes. Linear codes are subspaces of \mathbb{Z}_2^n (vectors over the field \mathbb{Z}_2 = \{0,1\}). Encoding embeds messages into a subspace; decoding maps received (possibly corrupted) vectors to the nearest codeword. Reed-Solomon codes, used in CDs, DVDs, and QR codes, are subspaces of vector spaces over finite fields.

Cryptographic protocols. Lattice-based cryptography relies on lattices, discrete subgroups of \mathbb{R}^n, together with the norm geometry of the ambient space. Elliptic curve cryptography uses the group of points on a curve defined over a finite field. The security of many systems depends on the computational hardness of problems in high-dimensional lattices.

2.7.3 Probability and Statistics

Random variables as vectors. The set of square-integrable random variables L^2(\Omega, \mathbb{P}) is a vector space (in fact, a Hilbert space with inner product \langle X, Y \rangle = \mathbb{E}[XY]). Orthogonal projections minimize mean-squared error, making least squares the natural estimator.

Covariance matrices. The covariance structure of n random variables is encoded in an n \times n positive semi-definite matrix, a linear operator on \mathbb{R}^n. Diagonalizing this matrix reveals independent components—the basis of principal component analysis and factor models.

2.7.4 Economics and Game Theory

Commodity spaces. An economy with n goods is modeled as \mathbb{R}^n, where vectors represent bundles of goods. Prices are covectors (linear functionals) assigning a cost to each bundle. Budget constraints define hyperplanes; utility maximization is optimization over a convex subset of \mathbb{R}^n.

Nash equilibria. Mixed strategies in game theory are vectors in probability simplices (convex subsets of \mathbb{R}^n whose entries are non-negative and sum to one). Computing equilibria involves fixed-point theorems for linear and nonlinear maps.

2.7.5 Physics and Engineering

Quantum mechanics. States of quantum systems are vectors in complex Hilbert spaces. Observables are self-adjoint operators. Measurement is orthogonal projection onto eigenspaces (spectral theorem). The entire framework rests on linear algebra over \mathbb{C}.

Signal processing. Signals are functions in L^2([0,T]) or sequences in \ell^2(\mathbb{Z})—infinite-dimensional vector spaces. Fourier analysis decomposes signals into orthogonal basis functions (sines and cosines). Filtering, compression, and transmission leverage vector space structure.

Control theory. Linear dynamical systems \dot{x} = Ax + Bu describe everything from airplane dynamics to electrical circuits. Stability, controllability, and observability are properties of subspaces (kernel, image, invariant subspaces).


This chapter establishes the axiomatic foundation of linear algebra. We have defined vector spaces, verified basic consequences of the axioms, and examined representative examples and subspaces.

No coordinates, bases, or dimensions have yet been introduced. These notions, which refine and organize the structure developed here, are the subject of the following chapters.