Chapter 14: Discrete Dynamical Systems and Markov Chains
This chapter studies what happens when you apply a matrix repeatedly: x_0, Ax_0, A^2x_0, \ldots. The question—does this sequence converge, diverge, or oscillate?—turns out to depend entirely on the eigenvalues of A. We develop this theory carefully, then apply it to Markov chains, population models, and discrete approximations of continuous systems.
Core material (Sections 1–4) is central to the course and should be read in full. Sections marked ★ introduce advanced applications—mixing times, quantum evolution, financial models—that illustrate the reach of spectral theory. These may be treated as enrichment or assigned selectively; no subsequent core material depends on them.
14.1 Sequences of Vectors and Linear Recurrences
A sequence of vectors in \mathbb{F}^d is a function \mathbb{N} \to \mathbb{F}^d, written (x_n)_{n=0}^\infty or simply (x_n) when the index set is understood. We write x_n = (x_n^{(1)}, \ldots, x_n^{(d)})^T \in \mathbb{F}^d to emphasize the d components of the n-th term.
Definition 14.1 (Convergence of vector sequences) A sequence (x_n) in \mathbb{F}^d converges to x \in \mathbb{F}^d if for every \varepsilon > 0 there exists N \in \mathbb{N} such that n \ge N \implies \|x_n - x\| < \varepsilon. We write \lim_{n \to \infty} x_n = x or x_n \to x.
The norm \|\cdot\| may be any norm on \mathbb{F}^d. Since all norms on finite-dimensional spaces are equivalent (a result from analysis), convergence in one norm implies convergence in any other. We typically use the Euclidean norm \|x\| = \sqrt{\sum_{i=1}^d |x_i|^2} unless stated otherwise.
Theorem 14.1 A sequence (x_n) in \mathbb{F}^d converges to x if and only if each component sequence (x_n^{(i)}) converges to x^{(i)} in \mathbb{F}.
Proof. The inequality |x_n^{(i)} - x^{(i)}| \le \|x_n - x\| shows convergence of (x_n) implies componentwise convergence. Conversely, componentwise convergence gives \|x_n - x\|^2 = \sum_{i=1}^d |x_n^{(i)} - x^{(i)}|^2 \to 0. \quad \square
Definition 14.2 (Cauchy sequence) A sequence (x_n) in \mathbb{F}^d is Cauchy if for every \varepsilon > 0 there exists N \in \mathbb{N} such that m, n \ge N \implies \|x_m - x_n\| < \varepsilon.
Theorem 14.2 Every Cauchy sequence in \mathbb{F}^d converges. That is, \mathbb{F}^d is complete.
Proof. By Theorem 14.1, it suffices to show each component sequence (x_n^{(i)}) is Cauchy in \mathbb{F}. This follows from |x_m^{(i)} - x_n^{(i)}| \le \|x_m - x_n\|. Since \mathbb{R} and \mathbb{C} are complete, each (x_n^{(i)}) converges, hence (x_n) converges. \square
A linear recurrence is a sequence (x_n) in \mathbb{F}^d satisfying x_{n+1} = A x_n, \quad n \ge 0, for some A \in M_d(\mathbb{F}) and initial condition x_0 \in \mathbb{F}^d. Iteration yields the explicit formula x_n = A^n x_0.
The behavior of (x_n) is therefore determined entirely by the matrix powers A^n. When A is diagonalizable with A = P D P^{-1} where D = \operatorname{diag}(\lambda_1, \ldots, \lambda_d), we obtain A^n = P D^n P^{-1} = P \operatorname{diag}(\lambda_1^n, \ldots, \lambda_d^n) P^{-1}.
Expanding x_0 = \sum_{i=1}^d c_i v_i in the eigenbasis \{v_1, \ldots, v_d\} gives x_n = \sum_{i=1}^d c_i \lambda_i^n v_i.
This is the key formula: the system decomposes into d independent scalar recurrences, one per eigendirection. The asymptotic behavior is controlled by the spectral radius \rho(A) = \max_{1 \le i \le d} |\lambda_i|.
(See Chapter 8, Definition 9.7 for the definition and basic properties of spectral radius.)
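The decomposition x_n = \sum_i c_i \lambda_i^n v_i is easy to check numerically. The sketch below, with a small illustrative matrix of our own choosing, computes the eigenbasis coefficients c_i and confirms that the spectral formula agrees with direct iteration:

```python
import numpy as np

# An illustrative 2x2 diagonalizable matrix with real, distinct eigenvalues.
A = np.array([[0.5, 0.4],
              [0.1, 0.6]])
x0 = np.array([1.0, 2.0])

# Eigendecomposition A = P D P^{-1}; columns of P are the eigenvectors v_i.
eigvals, P = np.linalg.eig(A)

# Coordinates c_i of x0 in the eigenbasis: solve P c = x0.
c = np.linalg.solve(P, x0)

n = 20
# Key formula: x_n = sum_i c_i lambda_i^n v_i.
x_n_spectral = P @ (c * eigvals**n)

# Compare against direct iteration x_{k+1} = A x_k.
x_n_direct = x0.copy()
for _ in range(n):
    x_n_direct = A @ x_n_direct

assert np.allclose(x_n_spectral, x_n_direct)
```

Since \rho(A) < 1 for this matrix, both computations also show the orbit decaying toward zero.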
Theorem 14.3 Let A \in M_d(\mathbb{C}) be diagonalizable. Then A^n \to 0 as n \to \infty if and only if \rho(A) < 1.
Proof. If \rho(A) < 1, then |\lambda_i| < 1 for all i, so \lambda_i^n \to 0. Since A^n = P D^n P^{-1}, we have \|A^n\| \le \|P\| \|P^{-1}\| \max_i |\lambda_i|^n \to 0.
Conversely, if some |\lambda_i| \ge 1, let v_i be a corresponding eigenvector. Then A^n v_i = \lambda_i^n v_i does not converge to 0, so A^n \not\to 0. \square
For non-diagonalizable matrices, the Jordan canonical form (Chapter 8, Section 9.8) shows that the same result holds: A^n \to 0 if and only if \rho(A) < 1. The Jordan blocks J_k(\lambda) = \lambda I + N contribute polynomial factors n^{k-1} that cannot overcome exponential decay when |\lambda| < 1. This is a good preview of why Jordan form matters: it handles exactly the cases where diagonalization fails.
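A small numerical check of this claim, using a single Jordan block of our own choosing: the powers of J_3(0.9) first grow (the polynomial factor n^{k-1} at work) but eventually decay to zero, since \rho(J) = 0.9 < 1.

```python
import numpy as np

# A single 3x3 Jordan block J_3(0.9): not diagonalizable, rho(J) = 0.9 < 1.
lam = 0.9
J = lam * np.eye(3) + np.diag([1.0, 1.0], k=1)

# Entries of J^n contain terms like C(n,2) * lam^(n-2) from the nilpotent
# part, so the norm grows at first ...
norms = [np.linalg.norm(np.linalg.matrix_power(J, n)) for n in (10, 100, 300)]
assert norms[0] > np.linalg.norm(J)      # transient growth above ||J||

# ... but exponential decay eventually beats polynomial growth.
assert norms[0] > norms[1] > norms[2]
assert norms[2] < 1e-6
```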
14.2 Stability of Discrete Dynamical Systems
Consider the autonomous system x_{n+1} = A x_n, \quad x_0 \in \mathbb{F}^d, where A \in M_d(\mathbb{F}) is the evolution operator. A point x^* \in \mathbb{F}^d is an equilibrium if A x^* = x^*, equivalently (A - I)x^* = 0. The zero vector is always an equilibrium.
Definition 14.3 (Stability of equilibrium) The zero equilibrium of x_{n+1} = A x_n is:
Stable if for every \varepsilon > 0 there exists \delta > 0 such that \|x_0\| < \delta implies \|x_n\| < \varepsilon for all n \ge 0
Asymptotically stable if it is stable and additionally x_n \to 0 for all x_0 in a neighborhood of 0
Unstable if it is not stable
Theorem 14.4 Consider x_{n+1} = A x_n with A \in M_d(\mathbb{C}).
The zero equilibrium is asymptotically stable if and only if \rho(A) < 1
The zero equilibrium is unstable if \rho(A) > 1
If \rho(A) = 1 and A is diagonalizable, the system is stable but not asymptotically stable
Proof.
Suppose \rho(A) < 1. By Theorem 14.3, A^n \to 0, so x_n = A^n x_0 \to 0 for every x_0. Moreover, a convergent sequence of matrices is bounded: \|A^n\| \le M for all n and some M > 0. Given \varepsilon > 0, take \delta = \varepsilon/M; then \|x_0\| < \delta implies \|x_n\| \le \|A^n\| \|x_0\| < \varepsilon for all n \ge 0, which is the \varepsilon-\delta condition for stability. Together with x_n \to 0, this gives asymptotic stability.
Conversely, if \rho(A) \ge 1, there exists an eigenvalue \lambda with |\lambda| \ge 1 and eigenvector v \neq 0. Then \|A^n v\| = |\lambda|^n \|v\| does not decay, contradicting asymptotic stability.
If \rho(A) > 1, there exists \lambda with |\lambda| > 1. For the corresponding eigenvector v, we have \|A^n v\| = |\lambda|^n \|v\| \to \infty, so the system is unstable.
If \rho(A) = 1 and A is diagonalizable with all eigenvalues on the unit circle, the system is stable but not asymptotically stable (orbits remain bounded but do not decay). For non-diagonalizable matrices with \rho(A) = 1, Jordan blocks can introduce polynomial growth, leading to instability. \square
This theorem reduces the stability question to a spectral problem: compute eigenvalues and check |\lambda_i| < 1 for all i.
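The reduction can be packaged in a few lines. This is a sketch: the function name and tolerance handling are our own, and the \rho(A) = 1 case is reported as "marginal" since the outcome there depends on the Jordan structure.

```python
import numpy as np

def classify_equilibrium(A, tol=1e-12):
    """Classify the zero equilibrium of x_{n+1} = A x_n by the spectral
    radius, following Theorem 14.4.  (Illustrative helper; the rho = 1
    case depends on the Jordan structure, so it is merely 'marginal'.)"""
    rho = max(abs(lam) for lam in np.linalg.eigvals(A))
    if rho < 1 - tol:
        return "asymptotically stable"
    if rho > 1 + tol:
        return "unstable"
    return "marginal"

assert classify_equilibrium(np.array([[0.5, 0.0], [0.0, -0.9]])) == "asymptotically stable"
assert classify_equilibrium(np.array([[1.1, 0.0], [0.0, 0.2]])) == "unstable"
assert classify_equilibrium(np.array([[0.0, -1.0], [1.0, 0.0]])) == "marginal"  # rotation, |lambda| = 1
```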
14.3 Predator–Prey Dynamics
The Lotka–Volterra model describes interaction between predators and prey. Let x_n denote prey population and y_n predator population at time n. The discrete-time linearization near an equilibrium yields \begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = \begin{pmatrix} 1 + a & -b \\ c & 1 - d \end{pmatrix} \begin{pmatrix} x_n \\ y_n \end{pmatrix}, where a, b, c, d > 0 are parameters encoding growth rates, predation efficiency, and mortality. Stability of the equilibrium is determined by whether the eigenvalues of this 2 \times 2 matrix lie inside or outside the unit circle.
The characteristic polynomial is \det(A - \lambda I) = \lambda^2 - (2 + a - d)\lambda + (1 + a)(1 - d) + bc.
For the continuous-time model \frac{dx}{dt} = ax - bxy, \frac{dy}{dt} = -dy + cxy, linearization at the positive equilibrium x^* = d/c, y^* = a/b yields the Jacobian J = \begin{pmatrix} 0 & -b x^* \\ c y^* & 0 \end{pmatrix}.
The eigenvalues are \lambda = \pm i\sqrt{bc x^* y^*}, purely imaginary, indicating neutral stability: the system exhibits periodic oscillations rather than convergence. This is a fundamental feature of predator-prey dynamics — populations cycle indefinitely around the equilibrium.
The discrete model approximates this via Euler’s method with step size \Delta t: if B is the continuous Jacobian, then A \approx I + \Delta t \cdot B.
Theorem 14.5 For the discretization A = I + \Delta t \cdot B of a continuous system \frac{dx}{dt} = Bx, stability of the discrete system requires |1 + \Delta t \cdot \mu| < 1 for all eigenvalues \mu of B.
When \operatorname{Re}(\mu) < 0, this holds for sufficiently small \Delta t. When \mu is purely imaginary, \mu = i\omega with \omega \neq 0, the condition becomes |1 + i \Delta t \omega|^2 = 1 + (\Delta t \omega)^2 < 1, which fails for every \Delta t > 0.
This shows Euler’s method is conditionally stable: it preserves stability of strictly dissipative systems but fails for conservative ones, including predator-prey models. Implicit methods or symplectic integrators are required for Hamiltonian systems.
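The failure is visible numerically. Below we apply forward Euler to the neutral center \frac{dx}{dt} = Bx with B the rotation generator — a hypothetical instance of the linearized predator-prey Jacobian with bx^* = cy^* = 1. The orbit radius, conserved by the true flow, drifts outward by exactly \sqrt{1 + (\Delta t)^2} per step.

```python
import numpy as np

# Rotation generator: purely imaginary eigenvalues +-i (neutral center),
# standing in for the linearized predator-prey Jacobian.
B = np.array([[0.0, -1.0],
              [1.0,  0.0]])
dt = 0.01
A = np.eye(2) + dt * B          # forward Euler propagator

x = np.array([1.0, 0.0])
for _ in range(10_000):         # integrate to t = 100
    x = A @ x

# Each step multiplies the radius by |1 + i*dt| = sqrt(1 + dt^2) > 1,
# so the "conserved" radius grows by (1 + dt^2)^(n/2) ~ e^{0.5} ~ 1.65.
assert np.linalg.norm(x) > 1.5
assert np.isclose(np.linalg.norm(x), (1 + dt**2) ** (10_000 / 2), rtol=1e-6)
```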
14.3.1 Example: Competing Species
Consider two species with populations x_n and y_n competing for resources. The Leslie–Gower model gives \begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = \begin{pmatrix} r_1 & 0 \\ 0 & r_2 \end{pmatrix} \begin{pmatrix} 1 - x_n/K_1 - \alpha y_n/K_1 \\ 1 - y_n/K_2 - \beta x_n/K_2 \end{pmatrix} \circ \begin{pmatrix} x_n \\ y_n \end{pmatrix}, where \circ denotes the entrywise (Hadamard) product, r_i are intrinsic growth rates, K_i carrying capacities, and \alpha, \beta competition coefficients. Linearizing near an equilibrium yields a 2 \times 2 system whose eigenvalues determine coexistence versus competitive exclusion.
14.4 Stochastic Matrices and Probability Vectors
A matrix P \in M_d(\mathbb{R}) is column-stochastic if P_{ij} \ge 0 for all i, j and \sum_{i=1}^d P_{ij} = 1 \quad \text{for all } j.
Equivalently, \mathbf{1}^T P = \mathbf{1}^T where \mathbf{1} = (1, \ldots, 1)^T.
A vector x \in \mathbb{R}^d is a probability vector if x_i \ge 0 for all i and \sum_{i=1}^d x_i = 1.
Lemma 14.1 If P is column-stochastic and x is a probability vector, then Px is a probability vector.
Proof. Nonnegativity: (Px)_i = \sum_j P_{ij} x_j \ge 0 since P_{ij}, x_j \ge 0. Normalization: \sum_i (Px)_i = \sum_i \sum_j P_{ij} x_j = \sum_j x_j \sum_i P_{ij} = \sum_j x_j \cdot 1 = 1. \quad \square
Lemma 14.2 Every column-stochastic matrix has 1 as an eigenvalue.
Proof. Since \mathbf{1}^T P = \mathbf{1}^T, we have P^T \mathbf{1} = \mathbf{1}, so 1 is an eigenvalue of P^T. Since P and P^T share eigenvalues, 1 is an eigenvalue of P. \square
Definition 14.4 (Steady-state distribution) A vector \pi \in \mathbb{R}^d is a steady-state distribution for P if \pi is a probability vector and P\pi = \pi.
Existence is guaranteed by Lemma 14.2. Uniqueness requires additional hypotheses — specifically, that the chain is irreducible and aperiodic, as established below.
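Concretely, a steady state can be computed from the eigenvector for eigenvalue 1. The 3-state matrix below is an illustrative example of our own; any irreducible column-stochastic matrix would do.

```python
import numpy as np

# An illustrative 3-state column-stochastic matrix (columns sum to 1).
P = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.6, 0.3],
              [0.1, 0.2, 0.6]])
assert np.allclose(P.sum(axis=0), 1.0)

# Steady state: the eigenvector of P for eigenvalue 1 (Lemma 14.2),
# rescaled to a probability vector.
eigvals, eigvecs = np.linalg.eig(P)
k = np.argmin(np.abs(eigvals - 1.0))
pi = np.real(eigvecs[:, k])
pi = pi / pi.sum()              # normalization also fixes the sign

assert np.all(pi > 0)           # strictly positive (Perron-Frobenius)
assert np.allclose(P @ pi, pi)  # P pi = pi
```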
14.5 Markov Chains and Transition Matrices
Let (X_n)_{n=0}^\infty be a sequence of random variables taking values in a finite state space \mathcal{S} = \{1, \ldots, d\}. The sequence is a Markov chain if \mathbb{P}(X_{n+1} = j \mid X_n = i, X_{n-1}, \ldots, X_0) = \mathbb{P}(X_{n+1} = j \mid X_n = i) for all n \ge 0 and all states i, j \in \mathcal{S}. This is the Markov property: the future depends on the present but not the past. Given where you are now, knowing how you got there provides no additional predictive information.
The transition probabilities P_{ij} = \mathbb{P}(X_{n+1} = j \mid X_n = i) form the transition matrix P = (P_{ij}), which in the probabilistic convention is row-stochastic: P_{ij} \ge 0 and \sum_j P_{ij} = 1.
Convention. We adopt the column-stochastic convention throughout: we redefine P_{ij} = \mathbb{P}(X_{n+1} = i \mid X_n = j), so column j of P is the distribution of the next state given current state j. This ensures the state evolution is x^{(n+1)} = Px^{(n)}, making the connection to matrix powers and eigenvalue theory transparent. Let x^{(n)} \in \mathbb{R}^d denote the distribution at time n with x_i^{(n)} = \mathbb{P}(X_n = i). Then x^{(n+1)} = P x^{(n)}.
Iterating gives x^{(n)} = P^n x^{(0)} for any initial distribution x^{(0)}.
Definition 14.5 (Irreducibility) A Markov chain with transition matrix P is irreducible if for every pair of states i, j \in \mathcal{S}, there exists n \ge 0 such that (P^n)_{ij} > 0.
Irreducibility means every state is accessible from every other state. Graphically, the directed graph with adjacency matrix P is strongly connected. Without irreducibility, the chain can get “trapped” in a subset of states.
Definition 14.6 (Aperiodicity) The period of state i is d_i = \gcd\{n \ge 1 : (P^n)_{ii} > 0\}. The chain is aperiodic if d_i = 1 for all i.
Periodicity arises when states are partitioned into cycles. For instance, a bipartite graph induces period 2: the walker alternates between vertex sets. Aperiodicity ensures eventual return to every state at arbitrary times, not just multiples of the period.
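A minimal example of why aperiodicity matters: the two-state chain that deterministically swaps states has period 2, and P^n x oscillates forever instead of converging.

```python
import numpy as np

# Two-state chain that deterministically swaps states: period 2.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])
x = np.array([1.0, 0.0])

# P^n x alternates between (1,0) and (0,1) and never converges, even
# though the chain is irreducible with steady state (1/2, 1/2).
assert np.allclose(np.linalg.matrix_power(P, 2) @ x, x)      # back after 2 steps
assert not np.allclose(np.linalg.matrix_power(P, 3) @ x, x)  # swapped after 3
```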
Theorem 14.6 Let P be a column-stochastic matrix corresponding to an irreducible, aperiodic Markov chain. Then:
There exists a unique probability vector \pi with P\pi = \pi and \pi_i > 0 for all i
For any initial distribution x^{(0)}, we have \lim_{n \to \infty} P^n x^{(0)} = \pi
The convergence is exponential: there exist constants C > 0 and \lambda \in (0, 1) such that \|P^n x^{(0)} - \pi\| \le C \lambda^n.
This result follows from the Perron–Frobenius theorem, established in the next section. The steady state \pi represents the long-run proportion of time spent in each state — regardless of where the chain starts, it eventually “forgets” its initial condition.
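The "forgetting" is easy to observe. In the sketch below, using an illustrative 3-state chain of our own choosing, two very different initial distributions converge to the same \pi, and the error shrinks geometrically at rate |\lambda_2| (roughly 0.56 for this particular matrix).

```python
import numpy as np

# An illustrative column-stochastic, irreducible, aperiodic chain.
P = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.6, 0.3],
              [0.1, 0.2, 0.6]])

# Steady state via a long power iteration (|lambda_2| ~ 0.56 here, so
# 200 steps are far more than enough).
pi = np.linalg.matrix_power(P, 200) @ np.array([1.0, 0.0, 0.0])

# Two very different starting distributions reach the same pi.
for x0 in (np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])):
    x = x0
    for _ in range(100):
        x = P @ x
    assert np.allclose(x, pi)

# Geometric decay: ten extra steps shrink the error by roughly
# |lambda_2|^10 ~ 3e-3.
e5  = np.linalg.norm(np.linalg.matrix_power(P, 5)  @ np.array([1.0, 0, 0]) - pi)
e15 = np.linalg.norm(np.linalg.matrix_power(P, 15) @ np.array([1.0, 0, 0]) - pi)
assert e15 < 0.05 * e5
```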
14.5.1 Example: Gambler’s Ruin
A gambler starts with k dollars and bets repeatedly: each round, she wins 1 dollar with probability p or loses 1 dollar with probability q = 1 - p. The game ends when she reaches 0 (ruin) or N (target). Let X_n denote her fortune at time n. This is a Markov chain on \{0, 1, \ldots, N\}; in the column-stochastic convention of Section 14.5, the transition matrix has P_{i+1,i} = p, \quad P_{i-1,i} = q \quad \text{for } 1 \le i \le N-1, and absorbing states P_{0,0} = P_{N,N} = 1.
The probability of reaching N before 0 starting from k satisfies the recurrence u_k = p u_{k+1} + q u_{k-1}, \quad u_0 = 0, \quad u_N = 1.
For p \neq q, the solution is u_k = \frac{1 - (q/p)^k}{1 - (q/p)^N}.
For p = q = 1/2 (fair game), u_k = k/N. The closed form reflects the geometric structure of the recurrence: substituting u_k = r^k into p u_{k+1} - u_k + q u_{k-1} = 0 gives the characteristic equation p r^2 - r + q = 0, whose roots are r = 1 and r = q/p, and the boundary conditions u_0 = 0, u_N = 1 select the unique linear combination of these two geometric solutions.
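The closed form can be cross-checked against the recurrence directly; exact rational arithmetic removes any floating-point doubt. The helper name below is our own.

```python
from fractions import Fraction

def ruin_win_prob(k, N, p):
    """P(reach N before 0 | start at k), via the closed form u_k.
    Exact arithmetic with Fractions avoids round-off entirely."""
    p = Fraction(p)
    q = 1 - p
    if p == q:
        return Fraction(k, N)   # fair game: u_k = k/N
    r = q / p
    return (1 - r**k) / (1 - r**N)

# Cross-check against the recurrence u_k = p u_{k+1} + q u_{k-1}.
p, N = Fraction(3, 5), 10
u = [ruin_win_prob(k, N, p) for k in range(N + 1)]
assert u[0] == 0 and u[N] == 1
for k in range(1, N):
    assert u[k] == p * u[k + 1] + (1 - p) * u[k - 1]

# Fair game: u_k = k/N exactly.
assert ruin_win_prob(3, 10, Fraction(1, 2)) == Fraction(3, 10)
```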
14.6 The Perron–Frobenius Theorem
We state the Perron–Frobenius theorem for non-negative matrices, the foundation for analyzing stochastic systems.
Theorem 14.7 (Perron–Frobenius theorem) Let A \in M_d(\mathbb{R}) satisfy A_{ij} \ge 0 for all i, j. Assume A is irreducible: for every i, j, there exists k \ge 1 such that (A^k)_{ij} > 0.
Then:
A has a positive real eigenvalue r = \rho(A) where \rho(A) = \max\{|\lambda| : \lambda \text{ eigenvalue of } A\} is the spectral radius
r has algebraic and geometric multiplicity 1
There exists an eigenvector v > 0 (meaning v_i > 0 for all i) such that Av = rv
Any other eigenvalue \lambda satisfies |\lambda| < r, provided A is aperiodic in the sense that \gcd\{k \ge 1 : (A^k)_{ii} > 0\} = 1 for all i
The full proof of the Perron–Frobenius theorem requires tools from topology and functional analysis that lie beyond this course:
- Brouwer fixed-point theorem: Every continuous map from a compact convex set to itself has a fixed point
- Compactness arguments: The set of probability vectors is compact in \mathbb{R}^d
- Positive operator theory: Properties of linear maps that preserve the positive cone
These belong to a graduate course in functional analysis or operator theory. We accept the theorem and apply its consequences.
Proof. Omitted. See Perron (1907) and Frobenius (1912) for the original proofs. \square
Corollary 14.1 Let P be column-stochastic, irreducible, and aperiodic. Then:
There exists a unique probability vector \pi with P\pi = \pi
All components of \pi are strictly positive
For any probability vector x, \lim_{n \to \infty} P^n x = \pi
Proof. By Theorem 14.7 applied to P, the eigenvalue r = 1 (from Lemma 14.2) is simple with positive eigenvector v. Normalize \pi = v / \|v\|_1 to obtain a probability vector. Uniqueness follows from simplicity of r = 1.
For the convergence claim, decompose x = \pi + w where \mathbf{1}^T w = 0 (both x and \pi are probability vectors, so their difference sums to zero). The component w lies in the P-invariant complement of \operatorname{span}(\pi) and decays exponentially under iteration, since all other eigenvalues satisfy |\lambda| < 1 by aperiodicity. \square
This result underlies the convergence of Markov chain Monte Carlo methods, the PageRank algorithm, and equilibrium distributions in statistical mechanics.
14.7 Continuous-Time Systems and the Matrix Exponential
A continuous-time linear system is \frac{dx}{dt} = B x, \quad x(0) = x_0 \in \mathbb{F}^d, where B \in M_d(\mathbb{F}). The solution is x(t) = e^{Bt} x_0 where the matrix exponential e^{Bt} is defined by the power series (see Chapter 8, Section 9.9): e^{Bt} = \sum_{k=0}^\infty \frac{(Bt)^k}{k!} = I + Bt + \frac{(Bt)^2}{2!} + \cdots.
Theorem 14.8 For B \in M_d(\mathbb{C}):
The series defining e^{Bt} converges absolutely for all t \in \mathbb{R}
\frac{d}{dt} e^{Bt} = B e^{Bt} = e^{Bt} B
If B = P D P^{-1} is diagonalizable, then e^{Bt} = P e^{Dt} P^{-1} where e^{Dt} = \operatorname{diag}(e^{\lambda_1 t}, \ldots, e^{\lambda_d t})
e^{B(t+s)} = e^{Bt} e^{Bs} for all t, s \in \mathbb{R}
These properties are proved in Chapter 8, Section 9.9. The key tool is that submultiplicative matrix norms ensure absolute convergence of the series.
Theorem 14.9 The system \frac{dx}{dt} = Bx satisfies x(t) \to 0 as t \to \infty for all x_0 if and only if \operatorname{Re}(\lambda) < 0 for all eigenvalues \lambda of B.
Proof. If B = PDP^{-1} is diagonalizable, then e^{Bt} = P e^{Dt} P^{-1} with e^{Dt} = \operatorname{diag}(e^{\lambda_1 t}, \ldots, e^{\lambda_d t}). Since |e^{\lambda t}| = e^{\operatorname{Re}(\lambda) t}, each term tends to 0 iff \operatorname{Re}(\lambda) < 0. For non-diagonalizable B, the Jordan form contributes polynomial factors t^k e^{\lambda t}, which do not change the criterion: exponential decay absorbs polynomial growth when \operatorname{Re}(\lambda) < 0. \square
The parallel with discrete-time stability is clean: discrete requires |\lambda| < 1 (eigenvalues inside the unit disk); continuous requires \operatorname{Re}(\lambda) < 0 (eigenvalues in the left half-plane).
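A sketch of the diagonalizable case, using an illustrative damped oscillator of our own choosing (eigenvalues -0.1 \pm i, so \operatorname{Re}(\lambda) < 0). Production code would use a general-purpose routine such as scipy.linalg.expm rather than assuming diagonalizability.

```python
import numpy as np

def expm_diag(B, t):
    """e^{Bt} via B = P D P^{-1} (Theorem 14.8, item 3).  A minimal
    sketch that assumes B is diagonalizable."""
    eigvals, P = np.linalg.eig(B)
    return (P @ np.diag(np.exp(eigvals * t)) @ np.linalg.inv(P)).real

# Damped oscillator: eigenvalues -0.1 +- i, both in the left half-plane.
B = np.array([[-0.1, -1.0],
              [ 1.0, -0.1]])

x0 = np.array([1.0, 1.0])
x_t = expm_diag(B, 50.0) @ x0

# Decay like e^{-0.1 t}: at t = 50 the factor is e^{-5} ~ 6.7e-3.
assert np.linalg.norm(x_t) < 0.1 * np.linalg.norm(x0)

# Semigroup property: e^{B(t+s)} = e^{Bt} e^{Bs}.
assert np.allclose(expm_diag(B, 3.0), expm_diag(B, 1.0) @ expm_diag(B, 2.0))
```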
14.7.1 Connection to Discrete Systems
Discretizing \frac{dx}{dt} = Bx via Euler’s method with step \Delta t gives x_{n+1} = x_n + \Delta t \, B x_n = (I + \Delta t \, B) x_n.
Thus A = I + \Delta t \, B. If \mu is an eigenvalue of B, then \lambda = 1 + \Delta t \mu is an eigenvalue of A.
Stability comparison:
- Continuous: requires \operatorname{Re}(\mu) < 0
- Discrete: requires |1 + \Delta t \mu| < 1
For \operatorname{Re}(\mu) < 0 and small \Delta t, we have |1 + \Delta t \mu| \approx 1 + \Delta t \operatorname{Re}(\mu) < 1, so stability is preserved. For purely imaginary \mu = i\omega, we get |1 + i\Delta t \omega|^2 = 1 + (\Delta t \omega)^2 > 1, showing Euler’s method fails to preserve stability of conservative systems.
14.8 ★ Mixing Time and the Spectral Gap
This section is advanced enrichment. The main ideas — that larger spectral gap means faster convergence — are important intuition, but the formal bounds require probability theory beyond this course.
For an irreducible, aperiodic Markov chain with transition matrix P, the mixing time quantifies convergence speed to the steady state \pi.
Definition 14.7 (Total variation distance) The total variation distance between two probability distributions \mu, \nu on a finite set \{1, \ldots, d\} is d_{\text{TV}}(\mu, \nu) = \frac{1}{2} \sum_{i=1}^d |\mu_i - \nu_i|.
This measures the maximum difference in probability that \mu and \nu assign to any event. The factor 1/2 normalizes so that d_{\text{TV}} \in [0, 1], with d_{\text{TV}}(\mu, \nu) = 0 if and only if \mu = \nu, and d_{\text{TV}}(\mu, \nu) = 1 if \mu and \nu have disjoint support.
Properties:
- d_{\text{TV}}(\mu, \nu) = \max_{A \subseteq \{1, \ldots, d\}} |\mu(A) - \nu(A)|
- Triangle inequality: d_{\text{TV}}(\mu, \rho) \le d_{\text{TV}}(\mu, \nu) + d_{\text{TV}}(\nu, \rho)
- Contraction under stochastic matrices: d_{\text{TV}}(P\mu, P\nu) \le d_{\text{TV}}(\mu, \nu)
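These properties are straightforward to verify numerically; the distributions and the stochastic matrix below are illustrative choices of our own.

```python
import numpy as np

def tv_distance(mu, nu):
    """Total variation distance (Definition 14.7)."""
    return 0.5 * np.abs(np.asarray(mu) - np.asarray(nu)).sum()

mu = np.array([0.5, 0.3, 0.2])
nu = np.array([0.2, 0.3, 0.5])
assert np.isclose(tv_distance(mu, nu), 0.3)   # 0.5 * (0.3 + 0 + 0.3)

# Disjoint supports give distance 1; equal distributions give 0.
assert tv_distance([1, 0, 0], [0, 0, 1]) == 1.0
assert tv_distance(mu, mu) == 0.0

# Contraction under a column-stochastic matrix.
P = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.6, 0.3],
              [0.1, 0.2, 0.6]])
assert tv_distance(P @ mu, P @ nu) <= tv_distance(mu, nu) + 1e-12
```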
Definition 14.8 (Mixing time) For \varepsilon > 0, the \varepsilon-mixing time is t_{\text{mix}}(\varepsilon) = \min\{t \ge 0 : d_{\text{TV}}(P^t x, \pi) \le \varepsilon \text{ for all probability vectors } x\}.
The mixing time measures how long the chain takes to “forget” its initial condition and converge within tolerance \varepsilon of the steady state.
Definition 14.9 (Spectral gap) For a stochastic matrix P with eigenvalues 1 = \lambda_1 \ge |\lambda_2| \ge \cdots \ge |\lambda_d|, the spectral gap is \Delta = 1 - |\lambda_2|.
Intuition: The steady state \pi is the eigenvector for \lambda_1 = 1. All other eigenvectors decay at rate |\lambda_i|^t under iteration. The slowest-decaying non-stationary component has rate |\lambda_2|^t. When \Delta is large, |\lambda_2| is small, and this component decays quickly. Random walks on well-connected graphs mix quickly; loosely connected graphs mix slowly.
Theorem 14.10 For an irreducible, aperiodic Markov chain, \frac{1}{\Delta} \log\left(\frac{1}{2\varepsilon}\right) \le t_{\text{mix}}(\varepsilon) \le \frac{1}{\Delta} \log\left(\frac{1}{\varepsilon \pi_{\min}}\right), where \pi_{\min} = \min_i \pi_i > 0.
This bound is proved by decomposing P^t x - \pi using the spectral decomposition of P, then bounding the total variation distance via the triangle inequality and norm estimates. The argument is clean in outline but requires careful handling of the relationship between total variation distance and the \ell^2 norm. For a complete proof accessible to advanced undergraduates, see Levin, Peres, and Wilmer, Markov Chains and Mixing Times (2009), Chapter 12.
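The sketch below estimates the mixing time empirically for an illustrative 3-state chain of our own choosing and checks it against the upper bound of Theorem 14.10 (natural logarithms throughout).

```python
import numpy as np

# Illustrative column-stochastic, irreducible, aperiodic chain.
P = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.6, 0.3],
              [0.1, 0.2, 0.6]])

# Spectral gap Delta = 1 - |lambda_2|.
mods = sorted(np.abs(np.linalg.eigvals(P)), reverse=True)
gap = 1.0 - mods[1]
assert 0 < gap < 1

# Steady state via a long power iteration.
pi = np.linalg.matrix_power(P, 200) @ np.array([1.0, 0.0, 0.0])

def t_mix(P, pi, eps):
    """Empirical eps-mixing time: iterate all point-mass starting
    vectors (columns of the identity) until the worst-case total
    variation distance to pi drops below eps."""
    d = P.shape[0]
    t, X = 0, np.eye(d)
    while max(0.5 * np.abs(X[:, j] - pi).sum() for j in range(d)) > eps:
        X = P @ X
        t += 1
    return t

eps = 1e-3
# The observed mixing time respects the theorem's upper bound
# (1/Delta) * log(1 / (eps * pi_min)).
assert t_mix(P, pi, eps) <= np.log(1.0 / (eps * pi.min())) / gap
```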
14.9 ★ Advanced Applications
The following sections illustrate how the spectral theory developed in this chapter reaches into finance, quantum physics, and numerical analysis. Students without background in stochastic calculus or quantum mechanics should focus on the linear-algebraic structure and treat the physics as motivation.
14.9.1 Portfolio Dynamics
Consider a portfolio distributed across d assets with allocation vector x \in \mathbb{R}^d where x_i \ge 0 and \sum x_i = 1. Let R \in M_d(\mathbb{R}) be the return matrix where R_{ij} represents the fraction of asset j that transitions to asset i after one period.
If R is column-stochastic, the portfolio after one period is x^{(1)} = Rx^{(0)}, and the long-term allocation is x^{(\infty)} = \lim_{n \to \infty} R^n x^{(0)} = \pi, the Perron–Frobenius steady state. The expected portfolio value at time n is \mathbb{E}[V_n] = \mathbf{r}^T x^{(n)}, where \mathbf{r} \in \mathbb{R}^d collects the per-unit asset values, converging to \mathbb{E}[V_\infty] = \mathbf{r}^T \pi as the allocation reaches equilibrium.
14.9.2 Discretized Diffusion and Black–Scholes ★★
This subsection uses stochastic differential equations and requires probability beyond this course. Focus on the deterministic part if the stochastic term is unfamiliar.
The Black–Scholes model for a stock price S(t) assumes geometric Brownian motion: \frac{dS}{dt} = \mu S + \sigma S \frac{dW}{dt}, where W(t) is Brownian motion (a random process). Focusing only on the deterministic drift \frac{dS}{dt} = \mu S and discretizing via Euler’s method gives S_{n+1} = (1 + \mu \Delta t) S_n.
This is exactly a scalar linear recurrence with A = 1 + \mu \Delta t. Stability requires |1 + \mu \Delta t| < 1. For \mu < 0 (mean-reverting), the deterministic part is stable for small \Delta t; for \mu > 0 (growing), the price trajectory diverges. The random term \sigma S \frac{dW}{dt} adds noise around this deterministic skeleton but does not change the stability analysis of the drift.
14.9.3 Quantum State Evolution ★★
This subsection is for students with physics background. No quantum mechanics is assumed beyond treating |\psi_n\rangle as a vector in \mathbb{C}^d.
In quantum mechanics, the state of a system is a unit vector |\psi\rangle \in \mathbb{C}^d (normalized: \langle \psi | \psi \rangle = 1). Time evolution under a Hamiltonian H (a Hermitian matrix, covered in the Spectral Theorem chapter) is given by |\psi(t)\rangle = e^{-iHt/\hbar} |\psi(0)\rangle.
For discrete time steps, the propagator U = e^{-iH\Delta t/\hbar} is a unitary matrix satisfying U^* U = I. The evolution is |\psi_{n+1}\rangle = U |\psi_n\rangle.
Since U is unitary, all its eigenvalues satisfy |\lambda_i| = 1: they lie on the unit circle. By the third part of Theorem 14.4, the system is stable but not asymptotically stable — quantum states do not converge but oscillate. If \{|i\rangle\} are eigenstates of H with eigenvalues E_i and |\psi_0\rangle = \sum_i c_i |i\rangle, then |\psi_n\rangle = \sum_i c_i e^{-iE_i n\Delta t/\hbar} |i\rangle.
The probability distribution oscillates with frequencies determined by energy differences E_i - E_j. This is the same spectral structure as any diagonalizable linear recurrence — what distinguishes quantum mechanics is that all eigenvalues are constrained to the unit circle, reflecting conservation of probability.
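The unit-circle constraint is easy to verify numerically. The two-level Hamiltonian below is a hypothetical example of our own (with \hbar = \Delta t = 1): the propagator is unitary, and the state norm is conserved over many steps instead of decaying.

```python
import numpy as np

# A hypothetical 2-level Hamiltonian (Hermitian), with hbar = dt = 1.
H = np.array([[1.0, 0.5],
              [0.5, 2.0]])
E, V = np.linalg.eigh(H)                       # real energies, unitary V
U = V @ np.diag(np.exp(-1j * E)) @ V.conj().T  # propagator e^{-iH}

# U is unitary: U*U = I and all eigenvalues lie on the unit circle.
assert np.allclose(U.conj().T @ U, np.eye(2))
assert np.allclose(np.abs(np.linalg.eigvals(U)), 1.0)

# Norm (total probability) is conserved: no decay, only oscillation.
psi = np.array([1.0, 0.0], dtype=complex)
for _ in range(1000):
    psi = U @ psi
assert np.isclose(np.linalg.norm(psi), 1.0)
```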
14.10 Closing Remarks
The central lesson of this chapter is that the long-run behavior of a linear recurrence x_{n+1} = Ax_n is a spectral problem. Diagonalization reduces the system to d independent scalar recurrences, each governed by a single eigenvalue. The spectral radius \rho(A) determines stability: below 1 gives exponential decay, above 1 gives growth, and \rho(A) = 1 produces bounded oscillation or marginal behavior depending on the Jordan structure (developed in Chapter 8).
Stochastic matrices refine this picture: the Perron–Frobenius theorem guarantees that irreducible, aperiodic chains have a unique positive steady state that attracts all initial distributions exponentially fast. The rate of convergence is controlled by the spectral gap 1 - |\lambda_2|, the key quantity connecting spectral theory to the applied question of how long it takes a Markov chain to “mix.”
Continuous-time systems \frac{dx}{dt} = Bx fit the same framework, with eigenvalues now classified by their real parts rather than moduli: stability requires \operatorname{Re}(\lambda) < 0. The matrix exponential e^{Bt} is the continuous analogue of A^n, and their spectral theories are parallel. Chapter 8 completes this picture by treating the non-diagonalizable case through Jordan canonical form, making the connection between algebraic structure and dynamical behavior fully precise.