2  Vector Spaces

2.1 Mathematical Structures

Mathematics organizes itself through layers of abstraction. At the foundation lie sets—collections of objects with no additional structure. We impose structure by specifying operations and requiring they satisfy axioms.

A metric space equips a set with a notion of distance, allowing us to speak of convergence and continuity without reference to coordinates or algebraic operations. The real line \mathbb{R} with d(x,y) = |x-y| is a metric space; so is any set with a well-defined distance function satisfying non-negativity, symmetry, and the triangle inequality.

A topological space generalizes further, defining “closeness” through open sets rather than explicit distance. Metric spaces induce topologies (open balls generate the open sets), but topologies need not arise from metrics. Topology studies properties preserved under continuous deformation—compactness, connectedness, separation—without measuring.

A vector space imposes algebraic structure: vectors can be added and scaled, with these operations satisfying natural axioms. No notion of distance or angle is required—only the ability to form linear combinations. The theory of vector spaces is linear algebra, the foundation for analyzing systems of equations, studying transformations, and understanding function spaces.

A normed space combines vector space structure with measurement: a norm \|v\| assigns each vector a non-negative real number representing its “size” or “length.” Every normed space is a metric space via d(u,v) = \|u-v\|, but not every metric space admits a norm compatible with vector operations. The norm respects scaling (\|cv\| = |c|\|v\|) and satisfies the triangle inequality (\|u+v\| \leq \|u\| + \|v\|), properties not guaranteed by arbitrary metrics.

An inner product space refines further: the inner product \langle u, v \rangle measures both magnitude and angle, inducing a norm via \|v\| = \sqrt{\langle v,v \rangle}. Euclidean space \mathbb{R}^n with the dot product is the canonical example. Inner products enable orthogonality, projections, and the geometric intuition underlying Fourier analysis and quantum mechanics.

The hierarchy flows naturally: \text{Inner Product Spaces} \subset \text{Normed Spaces} \subset \text{Metric Spaces} \subset \text{Topological Spaces}. Vector spaces stand apart—they provide algebraic rather than topological structure. When a vector space admits a norm compatible with its operations, it enters the metric hierarchy. When the norm arises from an inner product, we gain geometry.

This chapter develops vector spaces in isolation. We impose no topology, define no distance, specify no inner product. The theory proceeds purely algebraically, studying what can be expressed through addition and scaling alone. Subsequent chapters introduce norms and inner products, enriching the structure.


2.2 Notation and Conventions

On the representation of vectors. In elementary treatments, vectors are often denoted by bold lowercase letters \mathbf{v}, sometimes with arrows \vec{v}, to distinguish them from scalars.

We abandon this distinction. Vectors are elements of abstract spaces—functions, sequences, polynomials, or formal symbols. Context determines which is which.

When coordinates appear, we write column vectors as v = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} rather than row vectors, since linear maps naturally act on columns (matrix-vector products Av require v in column form). Row vectors appear as dual vectors or covectors when we study linear functionals.

The zero vector in any space is denoted 0, not \mathbf{0} or \vec{0}. The scalar zero and vector zero are distinguished by context. If ambiguity arises, we write 0_V to specify the zero element of space V.

Subscripts denote components or indices: v_1, v_2, \ldots are vectors in a sequence or basis elements, while v^{(1)}, v^{(2)}, \ldots indicate iteration or different instances of the same type. Superscripts in parentheses are labels, not exponents.


2.3 The Definition of a Vector Space

We define vector spaces axiomatically, isolating the essential properties of addition and scalar multiplication.

Definition 2.1 (Vector Space) A vector space over a field \mathbb{F} is a set \mathcal{V} equipped with two operations:

  • Addition: +:\mathcal{V} \times \mathcal{V} \to \mathcal{V}, denoted (u, v) \mapsto u + v

  • Scalar multiplication: \cdot:\mathbb{F} \times \mathcal{V} \to \mathcal{V}, denoted (c, v) \mapsto c\cdot v

satisfying, for all u, v, w \in \mathcal{V} and all a, b \in \mathbb{F}:

  1. (Commutativity) u + v = v + u
  2. (Associativity of addition) (u + v) + w = u + (v + w)
  3. (Additive identity) There exists 0 \in \mathcal{V} such that v + 0 = v for all v
  4. (Additive inverse) For each v \in \mathcal{V}, there exists -v \in \mathcal{V} with v + (-v) = 0
  5. (Distributivity over vector addition) a(u + v) = au + av
  6. (Distributivity over scalar addition) (a + b)v = av + bv
  7. (Associativity of scalar multiplication) a(bv) = (ab)v
  8. (Multiplicative identity) 1v = v for all v

Elements of \mathcal{V} are vectors; elements of \mathbb{F} are scalars.

A field \mathbb{F} is a set with addition and multiplication satisfying the usual arithmetic properties (commutativity, associativity, distributivity, existence of identities and inverses). The primary examples are:

  • \mathbb{R} (real numbers) — used throughout this course
  • \mathbb{C} (complex numbers) — essential for spectral theory and quantum mechanics
  • \mathbb{Q} (rational numbers) — important in number theory
  • \mathbb{Z}_p (integers modulo p for prime p) — finite fields used in coding theory, cryptography, and error correction

When we write “vector space” without qualification, we mean a vector space over \mathbb{R}. When the field matters, we specify: “\mathcal{V} is a vector space over \mathbb{F}” or “\mathcal{V} is an \mathbb{F}-vector space.”

Finite fields \mathbb{Z}_p enable linear algebra over finite sets, powering modern coding theory (Reed-Solomon codes), cryptographic protocols (elliptic curve cryptography), and network coding.
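Arithmetic in \mathbb{Z}_p is concrete enough to compute directly. A minimal Python sketch (the field p = 5 and the helper names add, mul, inv are our choices for illustration), using three-argument pow for modular inverses:

```python
# A minimal sketch of arithmetic in the finite field Z_p, here p = 5.
# Addition and multiplication are taken mod p; pow(a, -1, p) computes
# the multiplicative inverse (Python 3.8+).
p = 5

def add(a, b):
    return (a + b) % p

def mul(a, b):
    return (a * b) % p

def inv(a):
    # Multiplicative inverse of a nonzero element of Z_p.
    return pow(a, -1, p)

# Every nonzero element has an inverse, so Z_5 is a field:
for a in range(1, p):
    assert mul(a, inv(a)) == 1

# Scalar multiplication in the vector space Z_5^2 is componentwise:
v = (2, 4)
cv = tuple(mul(3, x) for x in v)   # (6 mod 5, 12 mod 5) = (1, 2)
assert cv == (1, 2)
```

The same code works for any prime p; for composite moduli inv fails for non-units, which is exactly why \mathbb{Z}_p is a field only when p is prime.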

The axioms specify closure (operations remain within \mathcal{V}), algebraic identities (commutativity, associativity, distributivity), and the existence of neutral/inverse elements. They make no reference to coordinates, dimension, or geometric intuition. Any set with operations satisfying these properties is a vector space.


2.4 Elementary Consequences

The axioms imply several immediate consequences. Though elementary, these results are essential—they justify familiar algebraic manipulations and establish uniqueness of key elements.

Theorem 2.1 (Uniqueness of Zero) The additive identity is unique.

Proof. Suppose 0 and 0' are both additive identities. Then 0 = 0 + 0' = 0', using first that 0' is an identity (so 0 + 0' = 0) and second that 0 is an identity (so 0 + 0' = 0'). \square

Theorem 2.2 (Uniqueness of Inverses) For each v \in \mathcal{V}, the additive inverse is unique.

Proof. Suppose w and w' both satisfy v + w = 0 and v + w' = 0. Then w = w + 0 = w + (v + w') = (w + v) + w' = 0 + w' = w', using associativity and commutativity. \square

The first two results establish uniqueness. The next three describe how the scalar zero, the zero vector, and the scalar -1 interact with the operations.

Theorem 2.3 (Scalar Zero Annihilates Vectors) For any v \in \mathcal{V}, we have 0v = 0.

Proof. Observe that 0v = (0 + 0)v = 0v + 0v. Subtracting 0v from both sides (adding -(0v)) gives 0 = 0v. \square

The dual statement holds for scalar multiplication of the zero vector.

Theorem 2.4 (Scalar Multiplication by Zero Vector) For any a \in \mathbb{F}, we have a \cdot 0 = 0.

Proof. As in Theorem 2.3: a \cdot 0 = a(0 + 0) = a \cdot 0 + a \cdot 0, hence 0 = a \cdot 0. \square

Finally, we confirm that scalar multiplication by -1 produces the additive inverse.

Theorem 2.5 (Scalar -1 Acts as Inverse) For any v \in \mathcal{V}, we have (-1)v = -v.

Proof. Compute v + (-1)v = 1v + (-1)v = (1 + (-1))v = 0v = 0, so (-1)v is the additive inverse of v. By uniqueness, (-1)v = -v. \square

These results confirm that the axioms impose precisely the structure needed for algebraic manipulation. The zero vector is unique, inverses are unique, and scalar multiplication interacts with the zero elements as expected.


2.5 Examples of Vector Spaces

The power of abstraction lies in recognizing common structure across diverse contexts. We present several fundamental examples.

2.5.1 Euclidean Space \mathbb{R}^n

The set of all n-tuples of real numbers: \mathbb{R}^n = \left\{(x_1, \ldots, x_n) : x_i \in \mathbb{R}\right\}.

Addition and scalar multiplication are defined componentwise: (x_1, \ldots, x_n) + (y_1, \ldots, y_n) = (x_1 + y_1, \ldots, x_n + y_n), c(x_1, \ldots, x_n) = (cx_1, \ldots, cx_n).

The zero vector is (0, 0, \ldots, 0), and the additive inverse of (x_1, \ldots, x_n) is (-x_1, \ldots, -x_n). Verification of the axioms reduces to properties of real number arithmetic.

For n = 1, we recover \mathbb{R} itself as a vector space over \mathbb{R}. For n = 2, 3, we obtain the familiar plane and three-dimensional space. For arbitrary n, we have a coordinate space with no immediate geometric interpretation—the algebraic structure suffices.
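The componentwise definitions can be spot-checked directly. A minimal Python sketch (the helper names vadd and smul are ours) verifying a few axioms of Definition 2.1 for sample vectors in \mathbb{R}^3:

```python
# Componentwise operations on R^n, with vectors represented as tuples;
# a sketch that spot-checks several vector space axioms on samples.
def vadd(u, v):
    return tuple(ui + vi for ui, vi in zip(u, v))

def smul(c, v):
    return tuple(c * vi for vi in v)

u, v, w = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)
zero = (0.0, 0.0, 0.0)

assert vadd(u, v) == vadd(v, u)                                    # axiom 1
assert vadd(vadd(u, v), w) == vadd(u, vadd(v, w))                  # axiom 2
assert vadd(v, zero) == v                                          # axiom 3
assert vadd(v, smul(-1.0, v)) == zero                              # axiom 4
assert smul(2.0, vadd(u, v)) == vadd(smul(2.0, u), smul(2.0, v))   # axiom 5
```

Of course, checking samples is not a proof; the proof reduces each axiom to the corresponding property of real arithmetic, as noted above.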

2.5.2 Spaces of Functions

Let X be any nonempty set. The collection of all functions f : X \to \mathbb{R} forms a vector space, denoted \mathbb{R}^X or \mathscr{F}(X, \mathbb{R}), under pointwise operations: (f + g)(x) = f(x) + g(x), \quad (cf)(x) = c \cdot f(x).

The zero vector is the constant function 0(x) = 0 for all x \in X. The additive inverse of f is -f, defined by (-f)(x) = -f(x).

Each axiom follows from the corresponding property of real numbers at each point x. For instance, commutativity: (f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x), so f + g = g + f as functions.
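The pointwise operations translate directly into code if we represent elements of \mathscr{F}(X, \mathbb{R}) as Python callables; a small sketch (the names fadd, fsmul, zero_fn are ours):

```python
import math

# Pointwise operations on functions f : X -> R, with functions
# represented as Python callables; a sketch of F(X, R).
def fadd(f, g):
    return lambda x: f(x) + g(x)

def fsmul(c, f):
    return lambda x: c * f(x)

zero_fn = lambda x: 0.0   # the zero vector of the function space

h = fadd(math.sin, fsmul(2.0, math.cos))   # the function x |-> sin x + 2 cos x
assert h(0.0) == 2.0                       # sin 0 + 2 cos 0 = 2
assert fadd(math.sin, zero_fn)(1.0) == math.sin(1.0)   # f + 0 = f pointwise
```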

Special cases:

  • Continuous functions: C([a, b]), the set of all continuous functions f : [a, b] \to \mathbb{R}, is a vector space. Sums and scalar multiples of continuous functions are continuous.

  • Differentiable functions: C^1([a, b]), the set of functions on [a, b] with continuous derivative, forms a vector space. Differentiation is linear: (f + g)' = f' + g' and (cf)' = cf'.

  • Integrable functions: \mathscr{R}([a, b]), the set of Riemann integrable functions on [a, b], forms a vector space. Integration is linear: \int (f + g) = \int f + \int g.

Function spaces are infinite-dimensional—we cannot describe every function with finitely many coordinates. This distinguishes them fundamentally from \mathbb{R}^n.

2.5.3 Polynomial Spaces

Let \mathcal{P} denote the set of all polynomials with real coefficients: p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n, \quad a_i \in \mathbb{R}.

Addition and scalar multiplication are defined in the natural way: (p + q)(x) = p(x) + q(x), \quad (cp)(x) = c \cdot p(x).

The zero vector is the zero polynomial 0(x) = 0 (all coefficients zero). The additive inverse of p is -p, with coefficients (-a_0, -a_1, \ldots).

Polynomials of degree at most n form a subspace \mathcal{P}_n \subset \mathcal{P}. The space \mathcal{P}_n has dimension n + 1—each polynomial is uniquely determined by n + 1 coefficients. The full space \mathcal{P} is infinite-dimensional.
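Representing a polynomial by its coefficient list [a_0, a_1, \ldots, a_n] makes the vector space operations concrete; a sketch (the helper names padd and psmul are ours):

```python
from itertools import zip_longest

# Polynomials as coefficient lists [a_0, a_1, ..., a_n]; a sketch of
# the vector space operations on P (and on each P_n).
def padd(p, q):
    # Pad the shorter list with zero coefficients, then add componentwise.
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def psmul(c, p):
    return [c * a for a in p]

p = [1, 0, 2]      # 1 + 2x^2
q = [0, 3]         # 3x
assert padd(p, q) == [1, 3, 2]        # 1 + 3x + 2x^2
assert psmul(-1, p) == [-1, 0, -2]    # the additive inverse -p
# P_2 has dimension 3: a degree <= 2 polynomial needs 3 coefficients.
assert len(p) == 3
```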

Polynomial spaces provide a tractable infinite-dimensional setting. Unlike general function spaces, polynomials are algebraically manipulable—we can multiply them, factor them, and study their roots. They appear throughout analysis as approximations to more complicated functions (Taylor series, interpolation).

2.5.4 Sequence Spaces

The set of all real sequences (x_1, x_2, x_3, \ldots) forms a vector space under componentwise addition and scalar multiplication: (x_n) + (y_n) = (x_n + y_n), \quad c(x_n) = (cx_n).

The zero vector is the sequence (0, 0, 0, \ldots). The additive inverse of (x_n) is (-x_n).

Subspaces of interest:

  • Convergent sequences: Those (x_n) for which \lim_{n \to \infty} x_n exists. This is a vector space because limits of sums are sums of limits, and limits of scalar multiples are scalar multiples of limits.

  • Bounded sequences: Those (x_n) for which there exists M > 0 with |x_n| \leq M for all n. Sums and scalar multiples of bounded sequences are bounded.

  • Sequences converging to zero: The subspace c_0 = \{(x_n) : x_n \to 0\}. This is strictly smaller than the space of convergent sequences but still infinite-dimensional.

Sequence spaces, as we’ve seen, model discrete-time processes and provide the setting for studying series. The partial sums of a series \sum a_n form a sequence (s_n) in the space of sequences.
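Sequences can be modeled as functions \mathbb{N} \to \mathbb{R}, which makes both the componentwise operations and the partial-sum construction easy to sketch in Python (the helper names sadd and partial_sums are ours):

```python
# Sequences represented as functions N -> R; componentwise operations
# and the partial-sum sequence of a series, as a sketch.
def sadd(x, y):
    return lambda n: x(n) + y(n)

def ssmul(c, x):
    return lambda n: c * x(n)

x = lambda n: 1.0 / n        # the sequence (1, 1/2, 1/3, ...)
y = lambda n: (-1.0) ** n    # the sequence (-1, 1, -1, ...)
assert sadd(x, y)(2) == 1.5  # 1/2 + (-1)^2

# The partial sums s_n = a_1 + ... + a_n of a series form a sequence:
def partial_sums(a):
    return lambda n: sum(a(k) for k in range(1, n + 1))

geom = lambda n: 0.5 ** n    # geometric series, sum converges to 1
s = partial_sums(geom)
assert abs(s(30) - 1.0) < 1e-8
```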

2.5.5 Non-Examples

Not every set with natural operations forms a vector space.

Natural numbers \mathbb{N} under addition. There is no additive identity (no zero in \mathbb{N} under the usual interpretation) and no additive inverses (negatives of positive integers are not natural numbers). The structure fails axioms 3 and 4.

Positive reals \mathbb{R}_{>0} under usual addition. There is no zero in \mathbb{R}_{>0} (since 0 \notin \mathbb{R}_{>0}), and negatives lie outside the set. Axioms 3 and 4 fail again.

The set \{(x, y) \in \mathbb{R}^2 : x, y > 0\} under componentwise addition. Same issue—no zero, no negatives. The axioms fail.

In each case the failure is structural: without a neutral element and inverses inside the set, the axioms cannot be satisfied. The axioms are substantive constraints, not formalities.


2.6 Subspaces

Many vector spaces naturally contain smaller vector spaces within them. Recognizing this structure simplifies analysis.

Definition 2.2 (Subspace) A subset \mathcal{W} \subseteq \mathcal{V} is a subspace of \mathcal{V} if:

  1. 0 \in \mathcal{W} (contains the zero vector)
  2. If u, w \in \mathcal{W}, then u + w \in \mathcal{W} (closed under addition)
  3. If w \in \mathcal{W} and c \in \mathbb{R}, then cw \in \mathcal{W} (closed under scalar multiplication)

A subspace inherits the vector space structure from \mathcal{V}—the operations are the same, and the axioms hold automatically because they hold in the ambient space. The conditions ensure \mathcal{W} is closed under these operations and contains the required neutral element.

Theorem 2.6 (Subspaces Are Vector Spaces) If \mathcal{W} is a subspace of \mathcal{V}, then \mathcal{W} is a vector space under the operations inherited from \mathcal{V}.

Proof. The axioms for a vector space are identities (commutativity, associativity, distributivity) or existence claims (zero, inverses). The identities hold in \mathcal{W} because they hold in \mathcal{V} for any elements. The zero vector exists by condition 1. For inverses: if w \in \mathcal{W}, then -w = (-1)w \in \mathcal{W} by closure under scalar multiplication. \square

2.6.0.1 Examples of Subspaces

Lines through the origin in \mathbb{R}^2. Let v \neq 0 in \mathbb{R}^2. The set \mathcal{W} = \{cv : c \in \mathbb{R}\} is a subspace. It contains 0 = 0v. It is closed under addition: c_1 v + c_2 v = (c_1 + c_2)v \in \mathcal{W}. It is closed under scalar multiplication: a(cv) = (ac)v \in \mathcal{W}.

Geometrically, \mathcal{W} is the line through the origin in the direction of v.

Polynomials of degree at most n within \mathcal{P}. The set \mathcal{P}_n \subset \mathcal{P} is a subspace. The zero polynomial has degree -\infty (or undefined), conventionally less than any n. Sums of degree-n polynomials have degree at most n. Scalar multiples preserve degree.

Solution sets of homogeneous linear equations. Consider the equation ax + by + cz = 0 in \mathbb{R}^3. The solution set is a plane through the origin, which is a subspace. The zero vector (0,0,0) satisfies the equation. Sums of solutions are solutions (linearity of the equation). Scalar multiples of solutions are solutions.

More generally, the solution set of any homogeneous linear system Av = 0 is a subspace of \mathbb{R}^n. This is the null space or kernel of the matrix A, a central object in linear algebra that we will study further in later chapters.
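The closure argument for a plane through the origin can be checked numerically. A minimal sketch using the example plane x + 2y - z = 0 (the coefficients and test vectors are our choices):

```python
# A closure check for the solution set of ax + by + cz = 0 in R^3,
# sketched for the example plane x + 2y - z = 0.
a, b, c = 1.0, 2.0, -1.0

def solves(v):
    x, y, z = v
    return a * x + b * y + c * z == 0.0

u = (1.0, 0.0, 1.0)    # 1 + 0 - 1 = 0
w = (0.0, 1.0, 2.0)    # 0 + 2 - 2 = 0
assert solves(u) and solves(w)
assert solves((0.0, 0.0, 0.0))                           # contains zero
assert solves(tuple(ui + wi for ui, wi in zip(u, w)))    # closed under +
assert solves(tuple(5.0 * ui for ui in u))               # closed under scaling
```

The check mirrors the proof: the equation is linear, so sums and scalar multiples of solutions are again solutions.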

Continuous functions within all functions. C([a,b]) \subset \mathscr{F}([a,b], \mathbb{R}). The constant zero function is continuous. Sums of continuous functions are continuous. Scalar multiples of continuous functions are continuous.

2.6.1 Intersection and Sum of Subspaces

Subspaces combine in natural ways. Intersection preserves the subspace property; sums produce the smallest subspace containing given subspaces.

Theorem 2.7 (Intersection of Subspaces) If \mathcal{W}_1 and \mathcal{W}_2 are subspaces of \mathcal{V}, then \mathcal{W}_1 \cap \mathcal{W}_2 is a subspace.

Proof. The zero vector lies in both \mathcal{W}_1 and \mathcal{W}_2, hence in their intersection. If u, v \in \mathcal{W}_1 \cap \mathcal{W}_2, then u, v \in \mathcal{W}_1 and u, v \in \mathcal{W}_2. Since \mathcal{W}_1 is closed under addition, u + v \in \mathcal{W}_1. Since \mathcal{W}_2 is closed under addition, u + v \in \mathcal{W}_2. Thus u + v \in \mathcal{W}_1 \cap \mathcal{W}_2. Closure under scalar multiplication follows similarly. \square

The intersection of arbitrarily many subspaces is a subspace. The union, however, typically is not—closure under addition fails unless one subspace contains the other.

Definition 2.3 (Sum of Subspaces) If \mathcal{W}_1, \mathcal{W}_2 \subseteq \mathcal{V} are subspaces, their sum is \mathcal{W}_1 + \mathcal{W}_2 = \{w_1 + w_2 : w_1 \in \mathcal{W}_1, w_2 \in \mathcal{W}_2\}.

This is the smallest subspace containing both \mathcal{W}_1 and \mathcal{W}_2. Verification of the subspace axioms is straightforward: 0 = 0 + 0 \in \mathcal{W}_1 + \mathcal{W}_2; if u = u_1 + u_2 and v = v_1 + v_2, then u + v = (u_1 + v_1) + (u_2 + v_2) \in \mathcal{W}_1 + \mathcal{W}_2; scalar multiples behave similarly.
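A concrete instance in \mathbb{R}^3: with \mathcal{W}_1 = \text{span}\{(1,0,0)\} and \mathcal{W}_2 = \text{span}\{(0,1,0)\}, the sum \mathcal{W}_1 + \mathcal{W}_2 is the xy-plane. A minimal Python sketch (the membership helpers are ours) exhibiting the decomposition v = w_1 + w_2:

```python
# A sketch of the sum of subspaces in R^3: W1 = span{(1,0,0)} and
# W2 = span{(0,1,0)}, whose sum W1 + W2 is the xy-plane.
def in_W1(v):
    return v[1] == 0 and v[2] == 0

def in_W2(v):
    return v[0] == 0 and v[2] == 0

def in_sum(v):
    return v[2] == 0          # exactly the vectors (x, y, 0)

v = (3.0, -2.0, 0.0)
assert in_sum(v)
# Explicit decomposition v = w1 + w2 with w1 in W1, w2 in W2:
w1, w2 = (v[0], 0.0, 0.0), (0.0, v[1], 0.0)
assert in_W1(w1) and in_W2(w2)
assert tuple(p + q for p, q in zip(w1, w2)) == v
```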


2.7 Applications Across Disciplines

Vector spaces provide the basic language for many areas of mathematics and its applications, including analysis, probability, physics, data science, and economics.

In later chapters, we will see how linear maps, eigenvalues, inner products, and orthogonal decompositions make these connections precise.

2.7.1 Machine Learning and Data Science

High-dimensional data as vectors. In machine learning, data points are vectors. An image with n pixels is a vector in \mathbb{R}^n (or \mathbb{R}^{3n} for RGB). A document represented by word frequencies is a vector in \mathbb{R}^d, where d is the vocabulary size. Training a classifier means finding a linear separator (hyperplane) in this vector space.

Linear models. Linear regression fits data by minimizing \|Ax - b\|^2 over vectors x \in \mathbb{R}^n. Neural networks compose linear maps (matrix multiplications) with nonlinearities. The backpropagation algorithm computes gradients—covectors in the dual space.
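In the one-feature case, the least-squares minimizer has a closed form. A minimal sketch in plain Python (the sample data y = 2x + 1 is our invention), fitting y \approx mx + c via the normal equations:

```python
# A minimal least-squares sketch: fit y ~ m*x + c by minimizing
# sum((m*x_i + c - y_i)^2), using the closed-form normal equations
# for a single feature (plain Python, no libraries).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # exactly y = 2x + 1

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Slope: covariance of (x, y) divided by variance of x.
m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
c = mean_y - m * mean_x            # intercept from the means
assert abs(m - 2.0) < 1e-12 and abs(c - 1.0) < 1e-12
```

With many features, the same minimization is solved by the matrix normal equations A^T A x = A^T b, which we will be equipped to analyze once linear maps and inner products are developed.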

Dimensionality reduction. Principal Component Analysis (PCA) finds the subspace of maximal variance by computing eigenvectors of the covariance matrix. Data projected onto this subspace retains essential structure while reducing dimension.

2.7.2 Coding Theory and Cryptography

Error-correcting codes. Linear codes are subspaces of \mathbb{Z}_2^n (vectors over the field \mathbb{Z}_2 = \{0,1\}). Encoding embeds messages into a subspace; decoding maps received (possibly corrupted) vectors to the nearest codeword. Reed-Solomon codes, used in CDs, DVDs, and QR codes, are subspaces of vector spaces over finite fields.

Cryptographic protocols. Lattice-based cryptography relies on lattices, discrete subgroups of \mathbb{R}^n, together with the norm geometry of the ambient space. Elliptic curve cryptography uses the group of points on a curve defined over a finite field. The security of many systems depends on the computational hardness of problems in high-dimensional lattices.

2.7.3 Probability and Statistics

Random variables as vectors. The set of square-integrable random variables L^2(\Omega, \mathbb{P}) is a vector space (in fact, a Hilbert space with inner product \langle X, Y \rangle = \mathbb{E}[XY]). Orthogonal projections minimize mean-squared error, making least squares the natural estimator.

Covariance matrices. The covariance structure of n random variables is encoded in an n \times n positive semi-definite matrix, a linear operator on \mathbb{R}^n. Diagonalizing this matrix reveals independent components—the basis of principal component analysis and factor models.

2.7.4 Economics and Game Theory

Commodity spaces. An economy with n goods is modeled as \mathbb{R}^n, where vectors represent bundles of goods. Prices are covectors (linear functionals) assigning a cost to each bundle. Budget constraints define hyperplanes; utility maximization is optimization over a convex subset of \mathbb{R}^n.

Nash equilibria. Mixed strategies in game theory are vectors in probability simplices (convex subsets of \mathbb{R}^n whose entries are non-negative and sum to one). Computing equilibria involves fixed-point theorems for linear and nonlinear maps.

2.7.5 Physics and Engineering

Quantum mechanics. States of quantum systems are vectors in complex Hilbert spaces. Observables are self-adjoint operators. Measurement is orthogonal projection onto eigenspaces (spectral theorem). The entire framework rests on linear algebra over \mathbb{C}.

Signal processing. Signals are functions in L^2([0,T]) or sequences in \ell^2(\mathbb{Z})—infinite-dimensional vector spaces. Fourier analysis decomposes signals into orthogonal basis functions (sines and cosines). Filtering, compression, and transmission leverage vector space structure.

Control theory. Linear dynamical systems \dot{x} = Ax + Bu describe everything from airplane dynamics to electrical circuits. Stability, controllability, and observability are properties of subspaces (kernel, image, invariant subspaces).


This chapter establishes the axiomatic foundation of linear algebra. We have defined vector spaces, verified basic consequences of the axioms, and examined representative examples and subspaces.

No coordinates, bases, or dimensions have yet been introduced. These notions, which refine and organize the structure developed here, are the subject of the following chapters.