Optimizing linear objective with linear constraints. Simplex method.
Multi-session curriculum - substantial prior knowledge and complex material. Use mastery gates and deliberate practice.
Linear programming (LP) is the rare sweet spot where modeling power meets algorithmic reliability: you can express many real allocation and planning problems with linear constraints and a linear objective—and then solve them efficiently by exploiting geometry (polyhedra) and algebra (pivoting).
A linear program optimizes a linear objective cᵀx subject to linear constraints (typically Ax ≤ b, plus x ≥ 0). The feasible set is a convex polyhedron; if an optimum exists, there is an optimal vertex (basic feasible solution). The simplex method walks from vertex to vertex by pivoting in a tableau, improving the objective until optimality conditions hold.
Many decision problems are about choosing amounts: how much to produce, ship, invest, schedule, or mix. These decisions interact through resource limits (labor hours, budgets, capacities) and requirements (demand, minimum levels). When those interactions are well-approximated by linear relationships, you get a model that is both expressive and mathematically tractable.
LP is foundational in optimization because it sits at the intersection of geometry (convex polyhedra), algebra (solving and pivoting linear systems), and economics (duality and shadow prices).
An LP in common “inequality form” is:
Maximize (or minimize) cᵀx
Subject to Ax ≤ b, x ≥ 0
where x ∈ ℝⁿ collects the decision variables, c ∈ ℝⁿ the objective coefficients, A ∈ ℝ^{m×n} the constraint matrix, and b ∈ ℝᵐ the right-hand sides.
You might also see equalities, ≥ inequalities, or variables not restricted to be nonnegative. These are equivalent up to standard transformations (we’ll do those carefully soon).
Here is the key geometric promise of LP:
If the feasible region is nonempty and bounded and the objective is linear, then an optimal solution exists at a vertex of the feasible polyhedron.
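For tiny two-variable instances you can verify the vertex claim by brute force. The sketch below (plain Python; `enumerate_vertices` is an illustrative helper, and nonnegativity is written as two extra ≤ rows) intersects each pair of constraint boundaries, keeps the feasible intersections, and picks the best objective value:

```python
from itertools import combinations

def enumerate_vertices(A, b, tol=1e-9):
    """Candidate vertices of {x in R^2 : A x <= b}: intersections of
    pairs of constraint boundaries that satisfy every constraint."""
    vertices = []
    for (i, j) in combinations(range(len(A)), 2):
        a1, a2 = A[i], A[j]
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:          # parallel boundaries: no unique intersection
            continue
        # Solve a1 . x = b[i], a2 . x = b[j] by Cramer's rule.
        x = ((b[i] * a2[1] - b[j] * a1[1]) / det,
             (a1[0] * b[j] - a2[0] * b[i]) / det)
        if all(A[k][0] * x[0] + A[k][1] * x[1] <= b[k] + tol for k in range(len(A))):
            vertices.append(x)
    return vertices

# Example: maximize 3x1 + 2x2  s.t.  x1 + x2 <= 4, 2x1 + x2 <= 5, x >= 0
A = [(1, 1), (2, 1), (-1, 0), (0, -1)]   # last two rows encode x1 >= 0, x2 >= 0
b = [4, 5, 0, 0]
c = (3, 2)
best = max(enumerate_vertices(A, b), key=lambda x: c[0] * x[0] + c[1] * x[1])
print(best)  # → (1.0, 3.0), objective value 9
```

Scanning all pairwise intersections is exponential in general and only sensible for illustration; simplex visits vertices far more selectively.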
Here are the basic terms you will keep using:
| Term | Meaning | How to recognize it |
|---|---|---|
| Feasible solution | Any x satisfying all constraints | Ax ≤ b and x ≥ 0 |
| Feasible region | Set of all feasible solutions | Intersection of half-spaces |
| Infeasible LP | No feasible solutions exist | Constraints contradict |
| Unbounded LP | Objective can improve without limit | You can move along a feasible ray increasing cᵀx |
| Optimal solution | Feasible x with best objective value | No other feasible point improves cᵀx |
| Vertex / extreme point | “Corner” of polyhedron | Intersection of active constraints |
We’ll now build the mechanics that connect this geometry to the simplex algorithm.
The simplex method works because it only needs to look at a small subset of points: the vertices. To understand why, you need a clear picture of the feasible region and what linear objectives do on convex sets.
Each inequality constraint aᵢᵀx ≤ bᵢ defines a half-space {x : aᵢᵀx ≤ bᵢ}.
The feasible region is:
P = {x ∈ ℝⁿ : Ax ≤ b, x ≥ 0}
This set P is convex because it is an intersection of convex sets (half-spaces and the nonnegative orthant).
If x₁ and x₂ are feasible, then for any λ ∈ [0, 1], the convex combination
x(λ) = λx₁ + (1 − λ)x₂
is also feasible because:
Ax(λ) = A(λx₁ + (1 − λ)x₂)
= λAx₁ + (1 − λ)Ax₂
≤ λb + (1 − λ)b
= b
and x(λ) ≥ 0 if x₁, x₂ ≥ 0.
Convexity is the reason there are no “local maxima” different from the global max for LP; but more importantly, it supports the vertex optimality claim.
A vertex is a point that cannot be expressed as a nontrivial convex combination of two other feasible points. Operationally in LP, a vertex occurs where enough constraints are tight (active) to pin down a unique point.
In ℝ², a vertex is where two non-parallel lines meet (and all constraints hold). In ℝⁿ, you typically need n linearly independent active constraints.
Consider maximizing cᵀx. The sets {x : cᵀx = α} are hyperplanes. Increasing α slides the hyperplane in direction c.
The last point where this sliding hyperplane still touches the feasible region is a face of the polyhedron, and every face contains a vertex. So you can restrict the search to vertices.
Two subtle but common phenomena:
1) Multiple optimal solutions
If the objective hyperplane is parallel to a face of the polyhedron, then the last-contact set is a whole edge/face. There are infinitely many optimal solutions, but still at least one optimal vertex.
2) Degeneracy
A vertex is degenerate if more than n constraints are active there, causing some basic variables to be zero in simplex form. Degeneracy can make simplex “stall” (no objective improvement on a pivot) and is the source of cycling (rare in practice, but real in theory).
The simplex method is most naturally described for standard form:
Maximize cᵀx
Subject to Ax = b, x ≥ 0
How do we convert Ax ≤ b into equalities? Add slack variables.
If you have:
aᵢᵀx ≤ bᵢ
introduce slack sᵢ ≥ 0 such that:
aᵢᵀx + sᵢ = bᵢ
Stacking all constraints:
Ax + s = b, i.e. [A  I][x; s] = b
Now the full variable vector might be z = [x; s] with z ≥ 0.
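As a minimal sketch of this conversion (plain Python; the function name `to_standard_form` is just for illustration), appending an identity block for the slacks turns A into [A | I]:

```python
def to_standard_form(A, b):
    """Turn A x <= b, x >= 0 into [A | I] z = b with z = [x; s] >= 0."""
    m = len(A)
    A_std = [list(row) + [1 if i == j else 0 for j in range(m)]
             for i, row in enumerate(A)]
    return A_std, list(b)

# Two constraints: x1 + x2 <= 4 and 2x1 + x2 <= 5
A_std, b_std = to_standard_form([[1, 1], [2, 1]], [4, 5])
print(A_std)  # [[1, 1, 1, 0], [2, 1, 0, 1]]
print(b_std)  # [4, 5]
```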
In standard form with m equations, a basic solution chooses m variables to solve for (the “basic” variables) and sets the rest to 0 (the “nonbasic” variables). If the resulting basic variables are ≥ 0, the solution is a basic feasible solution (BFS), which corresponds to a vertex (under mild non-degeneracy assumptions).
This is the bridge: basic feasible solutions of the equality system correspond to vertices of the original polyhedron.
Next we’ll make that bridge explicit with simplex pivoting.
If an optimum exists at a vertex, we can:
1) Start at some vertex (a BFS).
2) Move along edges to adjacent vertices.
3) Improve the objective each step.
4) Stop when no adjacent vertex improves.
The simplex method implements this using linear algebra on a changing set of basic variables.
Assume standard form:
Maximize cᵀx subject to Ax = b, x ≥ 0,
with A ∈ ℝ^{m×n} and rank m (constraints independent).
Choose an index set B (|B| = m) for basic variables and N for nonbasic variables. Reorder columns so:
A = [A_B  A_N], x = [x_B; x_N]
Then constraints become:
A_B x_B + A_N x_N = b
If A_B is invertible, solve:
x_B = A_B⁻¹(b − A_N x_N)
A basic solution sets x_N = 0, giving:
x_B = A_B⁻¹b
This is feasible iff A_B⁻¹b ≥ 0.
Write the objective as cᵀx = c_Bᵀx_B + c_Nᵀx_N.
Substitute x_B = A_B⁻¹(b − A_N x_N):
cᵀx = c_BᵀA_B⁻¹(b − A_N x_N) + c_Nᵀx_N
Work it step-by-step:
= c_BᵀA_B⁻¹b − c_BᵀA_B⁻¹A_N x_N + c_Nᵀx_N
= (c_BᵀA_B⁻¹b) + (c_Nᵀ − c_BᵀA_B⁻¹A_N) x_N
Define the reduced costs:
r_Nᵀ = c_Nᵀ − c_BᵀA_B⁻¹A_N
Then:
cᵀx = c_BᵀA_B⁻¹b + r_Nᵀx_N
At a BFS, x_N = 0. If all reduced costs satisfy:
rⱼ ≤ 0 for every nonbasic j,
then increasing any nonbasic variable from 0 cannot increase the objective (because it would contribute rⱼ xⱼ with rⱼ ≤ 0 and xⱼ ≥ 0). Therefore the BFS is optimal.
If some reduced cost is positive, that variable is a candidate to enter the basis to improve the objective.
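The optimality test can be made concrete for the m = 2 case. The sketch below (plain Python with exact `fractions` arithmetic; `reduced_costs` is an illustrative helper that assumes an invertible 2×2 basis matrix) computes r_N = c_Nᵀ − c_BᵀA_B⁻¹A_N for two different bases of a small LP:

```python
from fractions import Fraction as F

def reduced_costs(A, c, basis):
    """r_j = c_j - c_B^T A_B^{-1} A_j for each nonbasic j.
    Sketch for m = 2 constraints; assumes the 2x2 basis matrix is invertible."""
    a, b_ = A[0][basis[0]], A[0][basis[1]]
    c_, d = A[1][basis[0]], A[1][basis[1]]
    det = F(a * d - b_ * c_)
    inv = [[d / det, -b_ / det], [-c_ / det, a / det]]   # A_B^{-1}, exact
    # Simplex multipliers y^T = c_B^T A_B^{-1}
    y = [c[basis[0]] * inv[0][k] + c[basis[1]] * inv[1][k] for k in range(2)]
    nonbasic = [j for j in range(len(c)) if j not in basis]
    return {j: c[j] - (y[0] * A[0][j] + y[1] * A[1][j]) for j in nonbasic}

# Standard-form data for: max 3x1 + 2x2, with x1 + x2 + s1 = 4, 2x1 + x2 + s2 = 5
A = [[1, 1, 1, 0], [2, 1, 0, 1]]
c = [3, 2, 0, 0]
print(reduced_costs(A, c, [2, 3]))  # slack basis: r = 3, 2 (positive: not optimal)
print(reduced_costs(A, c, [0, 1]))  # basis {x1, x2}: r = -1, -1 (all <= 0: optimal)
```

The slack basis has positive reduced costs, so it can be improved; the basis {x₁, x₂} has all reduced costs ≤ 0, which certifies optimality.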
Suppose variable x_k (currently nonbasic) has reduced cost r_k > 0 and we try to increase it. From:
x_B = A_B⁻¹b − (A_B⁻¹A_k) x_k
(holding the other nonbasic variables at 0), the basic variables change linearly with x_k. To maintain feasibility, we require x_B ≥ 0. This yields a maximum step size (the ratio test).
Let d = A_B⁻¹ A_k be the direction in basic-variable space when x_k increases by 1. Then:
x_B(θ) = x_B − θd
where θ is the amount we increase x_k.
Feasibility requires:
x_B − θd ≥ 0
For components where d_i > 0:
θ ≤ (x_B)_i / d_i
So choose:
θ* = min { (x_B)_i / d_i : d_i > 0 }
The minimizer index i* tells you which basic variable hits zero first; that variable leaves the basis.
This produces a new basis B′ where x_k enters and x_{B,i*} leaves—an adjacent vertex.
If for an entering variable x_k with r_k > 0 we have d_i ≤ 0 for all i, then increasing x_k never makes any basic variable negative. Then θ can grow without bound and the objective increases without bound.
That is the simplex certificate of unboundedness.
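Both the ratio test and the unboundedness certificate fit in a few lines. A sketch (plain Python; `ratio_test` is an illustrative name):

```python
def ratio_test(x_B, d):
    """Largest step theta keeping x_B - theta * d >= 0.
    Returns (theta, leaving_index), or None if no d_i > 0 (unbounded ray)."""
    candidates = [(x_B[i] / d[i], i) for i in range(len(d)) if d[i] > 0]
    if not candidates:
        return None                    # simplex certificate of unboundedness
    return min(candidates)

print(ratio_test([4, 5], [1, 2]))      # (2.5, 1): step 5/2, second basic variable leaves
print(ratio_test([4, 5], [-1, 0]))     # None: nothing limits the step
```

Ties in the minimum ratio signal degeneracy; anti-cycling rules such as Bland's rule resolve them safely.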
In practice simplex is often described via a tableau (a compact way to maintain A_B⁻¹A and A_B⁻¹b). Pivot operations perform the basis change without explicitly inverting matrices each step.
Conceptually: each pivot applies row operations that turn the entering column into a unit vector, which simultaneously updates A_B⁻¹A, A_B⁻¹b, and the reduced-cost row.
A hidden requirement so far: we need an initial BFS. If the model is in the form:
Ax ≤ b, x ≥ 0
and b ≥ 0 with slack variables forming an identity matrix, we get a trivial BFS by setting original variables to 0 and slacks to b.
But if constraints include equalities or ≥ inequalities (with positive right-hand sides),
then a BFS is not obvious.
Phase I introduces artificial variables and minimizes their sum to drive them to 0; if the minimum is > 0, the LP is infeasible. If it reaches 0, the resulting basis gives a feasible start for Phase II, which optimizes the real objective.
We’ll solidify all of this with worked examples where the algebra and geometry line up clearly.
The power of LP comes from turning messy language into three precise pieces: decision variables, linear constraints, and a linear objective.
A good LP model is faithful to the real constraints, genuinely (or acceptably) linear, and small enough to solve at the scale you need.
Production planning. Variables: xⱼ = quantity of product j. Constraints: ∑ⱼ aᵢⱼ xⱼ ≤ bᵢ (capacity of resource i). Objective: maximize profit ∑ⱼ pⱼ xⱼ.
Diet/blending. Variables: xⱼ = amount of ingredient j. Constraints: nutrition minimums ∑ⱼ nᵢⱼ xⱼ ≥ reqᵢ. Objective: minimize cost ∑ⱼ cⱼ xⱼ.
Transportation. Variables: x_{i→j} = amount shipped from source i to destination j. Constraints: supply and demand balance, arc capacities. Objective: minimize shipping cost.
Many of these become special LPs that admit faster algorithms (e.g., network flow), but LP is the unifying language.
Even before a full duality theorem, it helps to internalize the dual’s meaning:
For a primal in max form:
Maximize cᵀx
Subject to Ax ≤ b, x ≥ 0
A corresponding dual is:
Minimize bᵀy
Subject to Aᵀy ≥ c, y ≥ 0
Interpretation: yᵢ is the shadow price of resource i, the marginal value of one more unit of bᵢ; dual feasibility Aᵀy ≥ c says each activity's resource usage is priced at no less than its profit.
Weak duality says: for any primal-feasible x and dual-feasible y, cᵀx ≤ bᵀy.
So the dual provides an upper bound on primal profit (for max problems). At optimum, the bounds meet.
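A quick numeric check on a small instance (plain Python; the specific optimal points shown were computed for this instance, under the stated primal/dual pair, and are not taken from the text above):

```python
# Primal: max 3x1 + 2x2   s.t. x1 + x2 <= 4,  2x1 + x2 <= 5,  x >= 0
# Dual:   min 4y1 + 5y2   s.t. y1 + 2y2 >= 3, y1 + y2 >= 2,   y >= 0
A, b, c = [[1, 1], [2, 1]], [4, 5], [3, 2]

def primal_feasible(x):
    return all(v >= 0 for v in x) and all(
        sum(A[i][j] * x[j] for j in range(2)) <= b[i] for i in range(2))

def dual_feasible(y):
    return all(v >= 0 for v in y) and all(
        sum(A[i][j] * y[i] for i in range(2)) >= c[j] for j in range(2))

# Any feasible pair obeys c^T x <= b^T y (weak duality).
x0, y0 = (1, 0), (3, 0)
assert primal_feasible(x0) and dual_feasible(y0)
assert 3 * 1 + 2 * 0 <= 4 * 3 + 5 * 0          # 3 <= 12

# At an optimal pair the bounds meet (strong duality).
x_opt, y_opt = (1, 3), (1, 1)
assert primal_feasible(x_opt) and dual_feasible(y_opt)
primal_val = c[0] * x_opt[0] + c[1] * x_opt[1]   # 9
dual_val = b[0] * y_opt[0] + b[1] * y_opt[1]     # 9
print(primal_val, dual_val)
```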
This matters because duality provides optimality certificates (matching bounds) and sensitivity information (shadow prices), and it underlies results like minimax and max-flow min-cut.
A finite zero-sum game can be written as an LP: with payoff matrix P, the row player chooses a mixed strategy p and a guaranteed value v:
Maximize v subject to ∑ᵢ pᵢ Pᵢⱼ ≥ v for every opponent action j, ∑ᵢ pᵢ = 1, p ≥ 0.
Constraints enforce that expected payoff against each opponent action is ≥ v. That is linear in p and v, and probabilities sum to 1.
Duality aligns with the minimax theorem:
max_p min_q pᵀPq = min_q max_p pᵀPq
This is exactly primal-dual equality: the row player's LP and the column player's LP are duals, so their optimal values coincide.
You’ll use LP duality as the algebraic backbone for Zero-Sum Games.
Max-flow can be expressed as an LP: maximize the total flow out of the source, subject to capacity bounds 0 ≤ fₑ ≤ uₑ on each edge and flow conservation at every intermediate node.
The dual corresponds to a min-cut-like object. Strong duality yields the max-flow min-cut theorem.
This is why Network Flow feels “LP-shaped” even when solved with specialized algorithms.
Despite worst-case exponential examples, simplex is extremely fast on many real problems and remains a workhorse in operations research, often enhanced with revised-simplex implementations (sparse LU factorizations with efficient updates), pricing rules for choosing entering variables, and anti-cycling safeguards such as Bland's rule or perturbation.
For very large, sparse LPs, interior-point methods are also common, but simplex is still essential for sensitivity analysis and re-optimization.
Consider the LP:
Maximize 3x₁ + 2x₂
Subject to
(1) x₁ + x₂ ≤ 4
(2) 2x₁ + x₂ ≤ 5
x₁ ≥ 0, x₂ ≥ 0
We’ll convert to standard form and find a BFS corresponding to a vertex.
Add slack variables s₁, s₂ ≥ 0 to convert ≤ to =:
x₁ + x₂ + s₁ = 4
2x₁ + x₂ + s₂ = 5
Now variables are (x₁, x₂, s₁, s₂) ≥ 0.
Write in matrix form Ax = b with x = (x₁, x₂, s₁, s₂)ᵀ:
A = [[1, 1, 1, 0],
[2, 1, 0, 1]]
b = (4, 5)ᵀ
Objective coefficients c = (3, 2, 0, 0)ᵀ.
Find an obvious basis: choose slack variables as basic.
Take B = {s₁, s₂} and N = {x₁, x₂}. Setting x₁ = x₂ = 0 gives:
s₁ = 4
s₂ = 5
All variables are ≥ 0, so this is a basic feasible solution.
Interpret geometrically:
Setting x₁ = x₂ = 0 corresponds to the origin in the (x₁, x₂) plane.
At that point, both constraints are slack (not tight): s₁ = 4 and s₂ = 5 represent the distances to the constraint boundaries.
Insight: Slack variables do more than “fix” inequalities: they expose an initial BFS (often) and encode which constraints are tight at a vertex (slack = 0) versus loose (slack > 0).
Use the same LP:
Maximize 3x₁ + 2x₂
s.t. x₁ + x₂ + s₁ = 4
2x₁ + x₂ + s₂ = 5
All variables ≥ 0.
Start at the slack-basis BFS: x₁ = x₂ = 0, s₁ = 4, s₂ = 5.
We’ll perform one improvement step in simplex logic (entering/leaving variables).
Current basis B = {s₁, s₂}. Nonbasic N = {x₁, x₂} with x₁ = x₂ = 0.
Objective value is 0.
Compute how increasing a nonbasic variable affects basic variables.
Because the current basis matrix A_B is the identity (columns of s₁ and s₂), we have:
A_B = [[1, 0],
[0, 1]]
So A_B⁻¹ is also identity, and directions are easy:
For entering x₁: column A_{x₁} = (1, 2)ᵀ ⇒ d = A_B⁻¹A_{x₁} = (1, 2)ᵀ
For entering x₂: column A_{x₂} = (1, 1)ᵀ ⇒ d = (1, 1)ᵀ
Reduced costs in this initial slack basis equal the original objective coefficients because c_B = (0, 0).
So r_{x₁} = 3 and r_{x₂} = 2.
Both are positive, so either can enter; pick x₁ (largest reduced cost).
Ratio test for x₁:
Current basic variables are (s₁, s₂) = (4, 5).
Direction d = (1, 2).
Feasibility requires:
s₁(new) = 4 − 1·θ ≥ 0 ⇒ θ ≤ 4
s₂(new) = 5 − 2·θ ≥ 0 ⇒ θ ≤ 5/2
So θ* = min(4, 5/2) = 5/2.
The leaving variable is s₂ (it hits 0 first).
Update values at the new BFS:
Increase x₁ from 0 to θ* = 5/2.
Then:
s₁ = 4 − 1·(5/2) = 3/2
s₂ = 5 − 2·(5/2) = 0
x₂ remains 0.
New BFS: (x₁, x₂, s₁, s₂) = (5/2, 0, 3/2, 0).
Objective improvement:
Old objective = 3·0 + 2·0 = 0
New objective = 3·(5/2) + 2·0 = 15/2 = 7.5
So the pivot moved to an adjacent vertex and strictly improved the objective.
Insight: A simplex step is exactly: choose an entering variable with positive reduced cost (can improve), then move as far as feasibility allows (ratio test). The leaving variable identifies which constraint becomes tight at the new vertex.
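The loop that repeats such pivots until optimality can be sketched as a compact dense-tableau simplex (plain Python; an illustrative sketch, not production code: it assumes b ≥ 0 so the slack basis is a valid start, uses floating-point tolerances, and has no anti-cycling rule):

```python
def simplex_max(c, A, b, tol=1e-9):
    """Dense tableau simplex for: max c^T x  s.t.  A x <= b (with b >= 0), x >= 0."""
    m, n = len(A), len(c)
    # Rows: [A | I | b]. Objective row z holds reduced costs; z[-1] is -(value).
    T = [list(A[i]) + [1 if k == i else 0 for k in range(m)] + [b[i]]
         for i in range(m)]
    z = list(c) + [0] * m + [0]
    basis = list(range(n, n + m))                    # start from the slack basis
    while True:
        k = max(range(n + m), key=lambda j: z[j])    # entering: most positive r_j
        if z[k] <= tol:
            break                                    # all reduced costs <= 0: optimal
        rows = [i for i in range(m) if T[i][k] > tol]
        if not rows:
            raise ValueError("LP is unbounded")      # no positive entries in column k
        r = min(rows, key=lambda i: T[i][-1] / T[i][k])   # ratio test picks leaver
        piv = T[r][k]
        T[r] = [v / piv for v in T[r]]               # scale pivot row
        for i in range(m):                           # zero out column k elsewhere
            if i != r:
                f = T[i][k]
                T[i] = [T[i][j] - f * T[r][j] for j in range(n + m + 1)]
        f = z[k]
        z = [z[j] - f * T[r][j] for j in range(n + m + 1)]
        basis[r] = k
    x = [0.0] * n
    for i, bi in enumerate(basis):
        if bi < n:
            x[bi] = T[i][-1]
    return x, -z[-1]

x, val = simplex_max([3, 2], [[1, 1], [2, 1]], [4, 5])
print(x, val)   # [1.0, 3.0] 9.0
```

On the LP above it performs two pivots: the first is exactly the step worked through here (entering x₁, reaching (5/2, 0) with value 7.5), and the second reaches the optimum (1, 3) with value 9.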
Consider:
Maximize x₁ + x₂
Subject to
x₁ − x₂ ≥ 1
x₁, x₂ ≥ 0
We’ll show it’s unbounded by exhibiting a feasible direction that increases the objective forever.
Rewrite the constraint:
x₁ − x₂ ≥ 1
Equivalently:
x₁ ≥ 1 + x₂
Pick a feasible starting point.
Let x₂ = 0, then x₁ ≥ 1. Choose (x₁, x₂) = (1, 0). This is feasible.
Find a direction v that keeps feasibility while increasing objective.
If we increase both x₁ and x₂ by the same amount t:
x₁(t) = 1 + t
x₂(t) = 0 + t
Check feasibility:
x₁(t) − x₂(t) = (1 + t) − t = 1 ≥ 1
So all t ≥ 0 remain feasible.
Objective along this ray:
f(t) = x₁(t) + x₂(t) = (1 + t) + t = 1 + 2t
As t → ∞, f(t) → ∞.
Therefore the LP is unbounded: there is no finite optimal solution.
Insight: Unboundedness is not “no constraints”; it means constraints still permit moving along some feasible ray that continually improves cᵀx. Simplex detects this when an entering variable has no limiting ratio test (no positive pivot entries in its direction).
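The ray argument can be checked mechanically (plain Python, sampling a few points along the ray):

```python
# The LP: max x1 + x2  s.t.  x1 - x2 >= 1,  x1, x2 >= 0.
def feasible(x1, x2):
    return x1 - x2 >= 1 and x1 >= 0 and x2 >= 0

# Walk along the ray (x1, x2) = (1 + t, t): feasible for every t >= 0.
values = []
for t in [0, 10, 1000, 10**6]:
    x1, x2 = 1 + t, t
    assert feasible(x1, x2)        # constraint x1 - x2 = 1 >= 1 holds exactly
    values.append(x1 + x2)         # objective 1 + 2t

print(values)   # [1, 21, 2001, 2000001] -- growing without bound
```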
An LP optimizes a linear objective cᵀx over a feasible region defined by linear constraints; the feasible region is a convex polyhedron.
If a feasible LP has a finite optimum, at least one optimal solution occurs at a vertex (extreme point).
Converting Ax ≤ b to standard form uses slack variables: aᵢᵀx + sᵢ = bᵢ with sᵢ ≥ 0.
A basic feasible solution (BFS) is obtained by choosing m basic variables (a basis) and setting the remaining nonbasic variables to 0; BFS correspond to vertices.
Simplex improves the objective by pivoting: pick an entering nonbasic variable with positive reduced cost (for maximization), then pick a leaving basic variable via the ratio test to keep feasibility.
Reduced costs r tell you whether increasing a nonbasic variable can improve the objective; optimality occurs when all reduced costs are ≤ 0 (maximization).
Simplex can certify unboundedness when a variable can enter (improve objective) but no constraint limits its increase (ratio test has no valid bound).
Phase I/Phase II handles cases where no obvious initial BFS exists; Phase I finds feasibility or proves infeasibility.
Forgetting that simplex needs standard form (equalities + nonnegativity), and mishandling ≥ constraints (which often require surplus + artificial variables).
Assuming the optimum is always unique or always at a single vertex; in LP, an entire edge/face can be optimal when the objective is parallel to that face.
Misapplying the ratio test: only rows with positive direction coefficient (d_i > 0) constrain the step size; including d_i ≤ 0 can incorrectly suggest boundedness.
Confusing infeasible with unbounded: infeasible means no point satisfies constraints; unbounded means feasible points exist but objective can improve without limit.
Convert to standard form by introducing slack/surplus variables:
Minimize 4x₁ + x₂
Subject to
(1) 3x₁ + x₂ ≥ 6
(2) x₁ + 2x₂ ≤ 8
x₁, x₂ ≥ 0
Write the resulting equality constraints and list all variables with their nonnegativity requirements.
Hint: A ≤ constraint gets a +slack. A ≥ constraint becomes an equality by subtracting a surplus variable; to run simplex you would typically add an artificial variable (Phase I), but here just do the structural conversion.
For (1) 3x₁ + x₂ ≥ 6, introduce surplus u₁ ≥ 0:
3x₁ + x₂ − u₁ = 6
(If doing simplex Phase I, you would also add an artificial variable a₁ ≥ 0: 3x₁ + x₂ − u₁ + a₁ = 6.)
For (2) x₁ + 2x₂ ≤ 8, introduce slack s₂ ≥ 0:
x₁ + 2x₂ + s₂ = 8
Variables: x₁ ≥ 0, x₂ ≥ 0, u₁ ≥ 0, s₂ ≥ 0 (and optionally a₁ ≥ 0 if building a Phase I starting basis).
Consider the LP:
Maximize 2x₁ + 3x₂
Subject to
x₁ + x₂ ≤ 4
x₁ + 3x₂ ≤ 6
x₁, x₂ ≥ 0
(a) List the corner points (vertices) of the feasible region.
(b) Evaluate the objective at each vertex and identify an optimal solution.
Hint: Vertices occur at intersections of constraint boundaries, including with axes: x₁ = 0 or x₂ = 0. Check feasibility for each intersection.
(a) Constraint boundaries:
C1: x₁ + x₂ = 4
C2: x₁ + 3x₂ = 6
Axes: x₁ = 0, x₂ = 0.
Candidate intersections:
1) (0,0) feasible.
2) With x₁ = 0: C1 gives x₂ = 4 ⇒ (0, 4), but C2 fails (0 + 12 ≤ 6 is false), so infeasible. C2 with x₁ = 0 gives x₂ = 2 ⇒ (0, 2); C1 holds (0 + 2 ≤ 4), so feasible.
3) With x₂ = 0: C1 gives x₁ = 4 ⇒ (4, 0); C2 holds (4 ≤ 6), so feasible. C2 gives x₁ = 6 ⇒ (6, 0), but C1 fails (6 ≤ 4 is false), so infeasible.
4) Intersection C1 and C2:
Solve:
x₁ + x₂ = 4
x₁ + 3x₂ = 6
Subtract first from second:
2x₂ = 2 ⇒ x₂ = 1
Then x₁ = 3
So (3,1) is feasible.
Vertices: (0,0), (0,2), (3,1), (4,0).
(b) Objective 2x₁ + 3x₂:
(0,0) → 0
(0,2) → 6
(3,1) → 2·3 + 3·1 = 9
(4,0) → 8
Optimal is (3,1) with value 9.
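A short script can double-check part (b) (plain Python; `feasible` is an illustrative helper):

```python
# Evaluate the objective 2x1 + 3x2 at each vertex found in part (a).
def feasible(x1, x2):
    return x1 + x2 <= 4 and x1 + 3 * x2 <= 6 and x1 >= 0 and x2 >= 0

vertices = [(0, 0), (0, 2), (3, 1), (4, 0)]
assert all(feasible(*v) for v in vertices)      # every listed vertex is feasible

scores = {v: 2 * v[0] + 3 * v[1] for v in vertices}
best = max(scores, key=scores.get)
print(best, scores[best])   # (3, 1) 9
```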
Unbounded or bounded?
Determine whether the LP is unbounded, infeasible, or has a finite optimum:
Maximize x₁
Subject to
x₁ − x₂ ≤ 1
−x₁ + x₂ ≤ 2
x₁, x₂ ≥ 0
Justify your conclusion using geometry or an improving feasible ray.
Hint: Try rewriting the two inequalities as bounds relating x₁ and x₂. Then see if x₁ can grow while maintaining feasibility by increasing x₂ appropriately.
Rewrite constraints:
1) x₁ − x₂ ≤ 1 ⇒ x₁ ≤ 1 + x₂
2) −x₁ + x₂ ≤ 2 ⇒ x₂ ≤ 2 + x₁
Neither constraint places an absolute upper bound on x₁: as long as x₂ grows along with x₁, both inequalities stay satisfied.
Construct a feasible ray:
Pick any t ≥ 0 and set x₁ = t, x₂ = t (and note x₁, x₂ ≥ 0).
Check constraints:
1) x₁ − x₂ = t − t = 0 ≤ 1 OK
2) −x₁ + x₂ = −t + t = 0 ≤ 2 OK
Objective is x₁ = t → ∞ as t → ∞.
Therefore the LP is feasible and unbounded.
Next nodes you can unlock/apply LP to: Zero-Sum Games (via LP duality and the minimax theorem) and Network Flow (max-flow min-cut as LP duality).
Helpful background/adjacent nodes: linear algebra (rank, matrix inverses, solving linear systems) and basic convexity.