14 Optimization
14.1 Fermat’s Principle and the Calculus of Extremes
In 1662, Pierre de Fermat proposed that light travels between two points along the path that takes the least time. This simple principle—now known as Fermat’s Principle—explains both reflection and refraction. When light reflects off a mirror, the angle of incidence equals the angle of reflection precisely because this minimizes travel time. When light enters water and bends, the specific angle follows from minimizing time through media of different speeds.
Fermat didn’t have calculus when he formulated this principle, but the connection is deep. If we model the travel time as a function T(\theta) of the angle \theta, then the optimal angle must occur where T'(\theta) = 0. At a minimum, the function cannot be increasing or decreasing—the tangent line must be horizontal.
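Readers who want to see the principle in action can check it numerically. The Python sketch below uses an illustrative geometry (light travels from a point at height a above a flat interface to a point at depth b below it, with speed v_1 above and v_2 below; the helper names travel_time and minimize are ours, not part of the text): minimizing the travel time over the crossing point recovers Snell's law, \frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2}.

```python
import math

# Travel time for light going from (0, a) above the interface y = 0 to (d, -b)
# below it, crossing the interface at (x, 0).  Speed v1 above, v2 below.
def travel_time(x, a, b, d, v1, v2):
    return math.hypot(x, a) / v1 + math.hypot(d - x, b) / v2

# Crude ternary search: sufficient here because the travel time is unimodal in x.
def minimize(f, lo, hi, iters=200):
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

a, b, d, v1, v2 = 1.0, 1.0, 2.0, 1.0, 0.75
x_star = minimize(lambda x: travel_time(x, a, b, d, v1, v2), 0.0, d)

# At the optimal crossing point, sin(theta1)/v1 should equal sin(theta2)/v2.
sin1 = x_star / math.hypot(x_star, a)
sin2 = (d - x_star) / math.hypot(d - x_star, b)
print(sin1 / v1, sin2 / v2)   # the two ratios agree to several decimal places
```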

This observation extends beyond optics. A soap bubble minimizes surface area subject to fixed volume. A cable hanging between two poles assumes the shape—a catenary—that minimizes potential energy. In economics, firms maximize profit by producing where marginal revenue equals marginal cost, which is to say, where the derivative of profit vanishes.
14.2 Critical Points
Where can a function achieve its maximum or minimum value? If the function is smooth, the answer is constrained.
Theorem 14.1 (Fermat’s Theorem) Let f be defined on an interval and suppose f has a local extremum at an interior point c. If f is differentiable at c, then f'(c) = 0.
Suppose f has a local maximum at c. Then there exists \delta > 0 such that f(c) \geq f(x) for all x \in (c - \delta, c + \delta).
For h > 0 sufficiently small, f(c + h) \leq f(c) \implies \frac{f(c+h) - f(c)}{h} \leq 0.
Taking h \to 0^+, we obtain f'(c) \leq 0.
For h < 0 with |h| small, f(c + h) \leq f(c) \implies \frac{f(c+h) - f(c)}{h} \geq 0.
Taking h \to 0^-, we obtain f'(c) \geq 0.
Thus f'(c) = 0. The minimum case is analogous. \square
The converse is false. A point where f'(c) = 0 need not be an extremum. Consider f(x) = x^3 at x = 0: the derivative vanishes, but the function has neither a maximum nor minimum there—it passes through horizontally but continues to increase.
Still, this theorem dramatically narrows our search. Instead of checking every point in an interval, we need only examine points where the derivative vanishes or fails to exist.
Definition 14.1 (Critical Point) A point c in the domain of f is a critical point if either f'(c) = 0 or f'(c) does not exist.
To find global extrema on a closed interval [a, b], we evaluate f at:
- All critical points in the open interval (a, b)
- The endpoints a and b
The largest value is the global maximum; the smallest is the global minimum. The Extreme Value Theorem (Theorem 7.5) guarantees that continuous functions on closed intervals attain both.
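As a concrete illustration of this recipe, here is a short Python sketch for f(x) = x^3 - 3x on [-2, 3]; the example function is our own choice, and the critical points \pm 1 are found by hand from f'(x) = 3x^2 - 3 = 0.

```python
# Closed-interval method for f(x) = x^3 - 3x on [-2, 3].
# The critical points x = -1 and x = 1 come from solving f'(x) = 3x^2 - 3 = 0 by hand;
# the code only performs the final step: compare f at all candidates.
def f(x):
    return x**3 - 3*x

candidates = [-1.0, 1.0, -2.0, 3.0]      # critical points, then the endpoints
values = {x: f(x) for x in candidates}

x_max = max(values, key=values.get)
x_min = min(values, key=values.get)
print("global max:", x_max, values[x_max])   # x = 3, value 18
print("global min:", x_min, values[x_min])   # value -2, attained at both x = 1 and x = -2
```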
14.3 Distinguishing Maxima from Minima
The condition f'(c) = 0 is necessary for an interior extremum of a differentiable function, but it is not sufficient to determine whether c is a maximum, a minimum, or neither. We need additional information.
14.3.1 The First Derivative Test
The behavior of f'(x) near c reveals the nature of the critical point.
Theorem 14.2 (First Derivative Test) Let c be a critical point of f. Suppose f is continuous at c and differentiable on (c - \delta, c) \cup (c, c + \delta) for some \delta > 0.
(1) If f'(x) > 0 for x \in (c - \delta, c) and f'(x) < 0 for x \in (c, c + \delta), then f has a local maximum at c.
(2) If f'(x) < 0 for x \in (c - \delta, c) and f'(x) > 0 for x \in (c, c + \delta), then f has a local minimum at c.
(3) If f' does not change sign at c, then f has no local extremum at c.
We prove (1); the proof of (2) is analogous.
Suppose f'(x) > 0 for x \in (c - \delta, c) and f'(x) < 0 for x \in (c, c + \delta).
For any x \in (c - \delta, c), the function f is continuous on [x, c] and differentiable on (x, c). By the Mean Value Theorem (Theorem 9.2), there exists \xi \in (x, c) such that f(c) - f(x) = f'(\xi)(c - x).
Since \xi \in (x, c) \subset (c - \delta, c), we have f'(\xi) > 0. Also, c - x > 0. Therefore, f(c) - f(x) = f'(\xi)(c - x) > 0 \implies f(c) > f(x).
For any x \in (c, c + \delta), apply MVT on [c, x] to obtain \xi \in (c, x) such that f(x) - f(c) = f'(\xi)(x - c).
Since \xi \in (c, x) \subset (c, c + \delta), we have f'(\xi) < 0. Also, x - c > 0. Therefore, f(x) - f(c) = f'(\xi)(x - c) < 0 \implies f(x) < f(c).
Thus f(c) \geq f(x) for all x \in (c - \delta, c + \delta), so f has a local maximum at c. \square
If f' > 0 to the left of c, the function is climbing. If f' < 0 to the right, it’s descending. The critical point is a peak. Conversely, descending then ascending produces a valley. The Mean Value Theorem makes this precise: the sign of f' controls whether f increases or decreases.
This test works even when f is not differentiable at c, provided it is continuous there. Consider f(x) = |x| at x = 0: the derivative fails to exist at 0, but f' changes sign from negative to positive across it, revealing a minimum.
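The test also translates into a simple numerical procedure: sample the sign of f' just to the left and just to the right of the critical point. The Python sketch below does this with a central-difference approximation of f'; the name classify and the step sizes delta and h are illustrative choices, not part of the theory.

```python
# First-derivative-test sketch: classify a critical point c by sampling the sign
# of a central-difference approximation of f' just to the left and right of c.
def classify(f, c, delta=1e-3, h=1e-6):
    def fprime(x):
        return (f(x + h) - f(x - h)) / (2 * h)
    left, right = fprime(c - delta), fprime(c + delta)
    if left > 0 > right:
        return "local maximum"
    if left < 0 < right:
        return "local minimum"
    return "no extremum (no sign change)"

print(classify(abs, 0.0))             # local minimum, even though |x| is not differentiable at 0
print(classify(lambda x: x**3, 0.0))  # no extremum: x^3 merely flattens out at 0
```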
14.3.2 The Second Derivative Test
When f is twice differentiable, concavity offers an alternative. The second derivative measures how the slope changes—equivalently, the curvature of the graph (see Definition 12.3). Positive curvature means the function bends upward, creating a valley.
Theorem 14.3 (Second Derivative Test) Let c be a critical point where f'(c) = 0. Suppose f''(c) exists.
(1) If f''(c) > 0, then f has a local minimum at c.
(2) If f''(c) < 0, then f has a local maximum at c.
(3) If f''(c) = 0, the test is inconclusive.
We prove (1); the proof of (2) is analogous.
Suppose f''(c) > 0. By definition of the derivative, f''(c) = \lim_{h \to 0} \frac{f'(c + h) - f'(c)}{h} > 0.
Since this limit is positive, there exists \delta > 0 such that for all h with 0 < |h| < \delta, \frac{f'(c + h) - f'(c)}{h} > 0.
Since f'(c) = 0, this becomes \frac{f'(c + h)}{h} > 0.
For h > 0, we have f'(c + h) > 0, so f' is positive on (c, c + \delta).
For h < 0, we have f'(c + h) < 0, so f' is negative on (c - \delta, c).
Alternatively, if f'' exists on a neighborhood of c and is continuous at c, one can apply the Mean Value Theorem to f': for any x \in (c, c + \delta), there exists \xi \in (c, x) such that f'(x) - f'(c) = f''(\xi)(x - c).
Since f'(c) = 0 and f''(\xi) > 0 for \xi sufficiently close to c, we get f'(x) > 0 for x > c (nearby). Similarly, f'(x) < 0 for x < c (nearby).
By the First Derivative Test (Theorem 14.2), f has a local minimum at c. \square
The second derivative f''(c) measures the curvature of the graph at c (modulo the factor (1 + (f'(c))^2)^{3/2}, which equals 1 when f'(c) = 0). If f''(c) > 0, the graph curves upward—the tangent line lies below the graph, forming a valley. If f''(c) < 0, the graph curves downward, forming a peak.
When the test fails: If f''(c) = 0, the critical point might be anything. The functions f(x) = x^4, g(x) = -x^4, and h(x) = x^3 all have vanishing first and second derivatives at x = 0, yet the first has a minimum, the second a maximum, and the third neither. In such cases, consult the first derivative test or examine higher derivatives.
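Numerically, the same trichotomy appears when f''(0) is approximated by a second difference. In the Python sketch below (the threshold 10^{-6} and the helper name second_derivative are illustrative choices), the estimate is decisive for x^2 but falls below the threshold for x^4, -x^4, and x^3, exactly the cases where the test is silent.

```python
# Second-derivative-test sketch: approximate f''(0) by a second difference.
# For x**4, -x**4 and x**3 the estimate falls below the threshold, so the test is silent;
# for x**2 it is clearly positive and correctly signals a local minimum.
def second_derivative(f, c, h=1e-4):
    return (f(c + h) - 2 * f(c) + f(c - h)) / h**2

tests = {"x^2": lambda x: x**2,
         "x^4": lambda x: x**4,
         "-x^4": lambda x: -x**4,
         "x^3": lambda x: x**3}

for name, f in tests.items():
    d2 = second_derivative(f, 0.0)
    if d2 > 1e-6:
        verdict = "local minimum"
    elif d2 < -1e-6:
        verdict = "local maximum"
    else:
        verdict = "inconclusive"
    print(name, round(d2, 8), verdict)
```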
14.4 A Classical Problem: The Isoperimetric Inequality
Among all closed curves of fixed perimeter, which encloses the greatest area? The circle. This is the isoperimetric inequality, known to the ancient Greeks but not rigorously proved until the 19th century.
We consider a simpler version: among all rectangles of fixed perimeter, which has the greatest area?
Problem. A rectangle has perimeter P. What dimensions maximize its area?
Let x and y be the side lengths. The constraints are 2x + 2y = P and x, y > 0. The area is A = xy.
From the perimeter constraint, y = \frac{P}{2} - x. Substituting, A(x) = x\left(\frac{P}{2} - x\right) = \frac{P}{2}x - x^2.
The domain is x \in \left(0, \frac{P}{2}\right). Differentiate: A'(x) = \frac{P}{2} - 2x.
Setting A'(x) = 0 gives x = \frac{P}{4}, hence y = \frac{P}{4}. The rectangle is a square.
To verify this is a maximum, note A''(x) = -2 < 0, so the critical point is indeed a maximum. Alternatively, observe that A(x) \to 0 as x \to 0^+ or x \to (P/2)^-, so the critical point must be a maximum.
Among all rectangles of perimeter P, the square of side \frac{P}{4} has the greatest area, A = \frac{P^2}{16}.
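A brute-force check in Python, with P = 4 and a uniform grid of side lengths (both illustrative choices), confirms the conclusion:

```python
# Brute-force check of the rectangle result for P = 4: sweep side lengths x in (0, P/2)
# and confirm that the area x * (P/2 - x) peaks at x = P/4 = 1, the square.
P = 4.0
grid = [i / 10000 * (P / 2) for i in range(1, 10000)]
best_x = max(grid, key=lambda x: x * (P / 2 - x))
print(best_x, best_x * (P / 2 - best_x))   # 1.0 and 1.0, i.e. x = P/4 and A = P**2/16
```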
This result extends to higher dimensions. Among all rectangular boxes of fixed surface area, the cube has the greatest volume. The pattern is clear: symmetry often accompanies optimality.
14.5 Optimizing with Constraints: The Cylinder Problem
A cylindrical can must hold a fixed volume V. What dimensions minimize the surface area (and thus the material cost)?
Let r be the radius and h the height. The volume is \pi r^2 h = V.
The surface area is S = 2\pi r^2 + 2\pi r h,
consisting of two circular ends and the rectangular side (when unrolled).
From the volume constraint, h = \frac{V}{\pi r^2}. Substitute: S(r) = 2\pi r^2 + 2\pi r \cdot \frac{V}{\pi r^2} = 2\pi r^2 + \frac{2V}{r}.
The domain is r > 0. Differentiate: S'(r) = 4\pi r - \frac{2V}{r^2}.
Set S'(r) = 0: 4\pi r = \frac{2V}{r^2} \implies 4\pi r^3 = 2V \implies r^3 = \frac{V}{2\pi} \implies r = \sqrt[3]{\frac{V}{2\pi}}.
The corresponding height is h = \frac{V}{\pi r^2} = \frac{V}{\pi \left(\frac{V}{2\pi}\right)^{2/3}} = 2\sqrt[3]{\frac{V}{2\pi}} = 2r. The optimal can is exactly as tall as it is wide: its height equals its diameter.
To confirm this is a minimum, note that S(r) \to \infty as r \to 0^+ (tall, narrow can) or r \to \infty (flat, wide can), so the critical point must be a minimum.
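The Python sketch below checks both conclusions numerically for an illustrative volume of V = 330 (think of a 330 ml can, with lengths in centimeters); the grid of radii and the helper name surface_area are our own choices.

```python
import math

# Can-design sketch: for a fixed volume V, scan radii and confirm that the
# surface-area minimizer satisfies r = (V / (2*pi))**(1/3) and h = 2*r.
V = 330.0   # illustrative volume

def surface_area(r):
    h = V / (math.pi * r**2)
    return 2 * math.pi * r**2 + 2 * math.pi * r * h

radii = [i / 1000 for i in range(100, 10000)]        # 0.1 cm to 10 cm in 0.001 cm steps
r_best = min(radii, key=surface_area)
h_best = V / (math.pi * r_best**2)

print(r_best, (V / (2 * math.pi)) ** (1 / 3))   # both about 3.75 cm
print(h_best / r_best)                          # about 2: height equals diameter
```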

14.6 Distance to a Curve
Problem. Find the point on the parabola y = x^2 closest to (0, 1).
Let (x, x^2) be a point on the parabola, and let D(x) denote its distance to (0, 1). The squared distance is D(x)^2 = x^2 + (x^2 - 1)^2.
We minimize D^2 rather than D to avoid the square root. Expand: D(x)^2 = x^2 + x^4 - 2x^2 + 1 = x^4 - x^2 + 1.
Differentiate: \frac{d(D^2)}{dx} = 4x^3 - 2x = 2x(2x^2 - 1).
Critical points: x = 0 or x = \pm \frac{1}{\sqrt{2}}. Since D(x)^2 \to \infty as |x| \to \infty, the global minimum must occur at one of these points. Evaluate D^2:
At x = 0: D^2(0) = 1
At x = \pm \frac{1}{\sqrt{2}}: D^2\left(\pm\frac{1}{\sqrt{2}}\right) = \frac{1}{4} - \frac{1}{2} + 1 = \frac{3}{4}
The minimum distance is D = \sqrt{3/4} = \frac{\sqrt{3}}{2}, achieved at \left(\pm \frac{1}{\sqrt{2}}, \frac{1}{2}\right).
The closest points lie where the line from (0, 1) to the parabola is perpendicular to the tangent. At x = \frac{1}{\sqrt{2}}, the slope of the parabola is y' = 2x = \sqrt{2}. The line from \left(\frac{1}{\sqrt{2}}, \frac{1}{2}\right) to (0, 1) has slope m = \frac{1 - 1/2}{0 - 1/\sqrt{2}} = \frac{1/2}{-1/\sqrt{2}} = -\frac{1}{\sqrt{2}}.
Indeed, \sqrt{2} \cdot \left(-\frac{1}{\sqrt{2}}\right) = -1, so the two slopes are negative reciprocals and the lines are perpendicular.
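Both the minimizer and the perpendicularity claim can be checked numerically. The Python sketch below works on a grid over [-2, 2]; the grid resolution and the name D2 are illustrative choices.

```python
import math

# Distance-to-parabola sketch: minimize the squared distance D2(x) from (x, x**2)
# to the point (0, 1), then check the minimizer, the distance, and perpendicularity.
def D2(x):
    return x**2 + (x**2 - 1)**2

xs = [i / 100000 - 2 for i in range(400001)]      # grid on [-2, 2]
x_star = min(xs, key=D2)

print(x_star, 1 / math.sqrt(2))                   # about 0.7071 (or -0.7071, by symmetry)
print(math.sqrt(D2(x_star)), math.sqrt(3) / 2)    # minimum distance, about 0.8660

# Perpendicularity: tangent slope 2*x_star times the slope of the segment to (0, 1)
segment_slope = (1 - x_star**2) / (0 - x_star)
print(2 * x_star * segment_slope)                 # about -1
```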
