Dual Problem
Duality turns the transport problem into a search for potentials rather than couplings. This chapter explains why potentials certify optimality, how -transforms regularize them, and why the quadratic case reveals convex analysis behind Brenier maps. Linear-programming duality gives the discrete picture Bertsimas & Tsitsiklis, 1997, while the continuous form is one of the central theorems of optimal transport Villani, 2003Santambrogio, 2015.
Discrete Dual¶
The discrete dual gives finite-dimensional certificates of optimality. Its complementary-slackness conditions identify where an optimal coupling can put mass.
The two vectors play the role of source and target prices; admissibility means that no transported pair is priced above its travel cost. The discrete Kantorovich problem is a linear program. Hence its value can be computed either by minimizing over couplings or by maximizing over a pair of dual vectors.
Proof
The primal-dual optimality relation locates the support of an optimal plan. If is optimal and is optimal, then
Thus potentials are not transport maps themselves. They are certificates, and their equality set with the cost matrix is where an optimal coupling is allowed to place mass.

Discrete Kantorovich dual potentials for the quadratic cost . The upper strip shows the fixed source histogram in red and the target histogram in blue. The lower strip shows optimal dual vectors , with a gauge chosen so that . Complementary slackness states that mass can be transported only through entries where .
The interactive demo varies the target law and the number of bins. The lower curves are reconstructed from the active monotone transport graph, so the equality set moves as the coupling support changes.
Interactive panel. Use the geometry and slack controls to see feasible dual potentials touch the cost surface exactly on matched pairs.
The formula (2) also shows that is convex, being a supremum of linear functions. From the primal formulation, is concave.
Auction Algorithm and Dual Prices¶
Assignment algorithms become more transparent once one has dual variables. The auction algorithm is a dual-price method: it updates target prices and maintains an approximate complementary-slackness certificate. The tolerance removes ties, stabilizes price updates, and gives a quantitative optimality certificate Bertsekas, 1981Bertsekas, 1992Mérigot & Thibert, 2020.
Consider the square assignment problem with costs and rewrite it as a profit maximization problem with . The auction algorithm keeps prices on the target points and a partial assignment. For an unassigned source , define the best and second-best reduced profits
Source bids for and increases its price by the gap to the second-best target, plus a margin:
The target is assigned to , and its previous owner, if any, becomes unassigned. The iteration stops when all sources are assigned.

Matrix view of actual auction iterates on a diagonally dominant one-dimensional squared-distance assignment. Each panel records the current ownership state: unassigned bidders are shown as flat rows, while assigned bidders are shown as one-hot rows at their currently held target. The snapshots show initialization, intermediate price updates, and the final identity assignment satisfying complementary slackness.
The interactive auction view exposes the price dynamics directly. Increasing makes bids coarser and the final certificate looser; decreasing it makes the price landscape closer to exact complementary slackness.
Interactive panel. Use the step and gap controls to inspect how auction prices and assignments evolve toward complementary slackness.
For fixed prices , eliminating the bidder utilities in the dual minimization gives the convex objective
which comes from the dual constraints . The auction update should therefore be viewed as a price-adjustment method on this nonsmooth dual landscape, rather than as a generic gradient step.
Proof
Let be any assignment. By -complementary slackness,
Summing over cancels the prices, since both and are permutations:
No assignment has profit more than above that of . Since , the cost of is at most above the minimum cost. If the costs are integers, all assignment costs are integers; a gap strictly smaller than one forces the gap to be zero.
During the bidding process, the last bid made by a source makes its chosen target better, up to the margin , than all alternatives. Subsequent price increases can only make targets less attractive, and one checks by induction that currently assigned pairs satisfy -complementary slackness. A standard finite-termination proof normalizes prices by subtracting their minimum, observes that each bid increases one target price by at least , and bounds the normalized price spread in terms of the range of the profits.
General Formulation¶
The continuous dual is the analytic counterpart of the discrete linear program. It uses continuous potentials because measures are naturally probed through integration. The pairing is .
Proof
Weak duality is immediate. If and , then
Taking the supremum over admissible potentials and the infimum over couplings gives the inequality ``’'.
For the reverse inequality, view the primal problem as a linear program over Radon measures, paired with . The affine marginal map is continuous for the weak topology, the feasible set is nonempty because it contains , and the cost is continuous and bounded on compact sets. Since probability measures on a compact product are weakly compact, the set of attainable cost-marginal triples is closed after adding an epigraph variable. A separating-hyperplane argument applied to the convex set
gives continuous functions and a scalar multiplier, normalized so that . The separating inequality states that the supremum over such potentials is at least the primal value. This is the infinite-dimensional analogue of the finite linear-programming proof.
The discrete case corresponds to dual vectors that are samples of continuous potentials, . The primal-dual support condition becomes
For the one-dimensional quadratic cost, the continuous potentials can be read from the monotone map : on the active graph, and .

Continuous Kantorovich potentials for the same source and target families as the discrete potential figure. The upper strips show the source density in red and the target density in blue. The lower strips show potentials and for the quadratic cost . The equality set contains the monotone transport graph.
The interactive view computes the monotone map from numerical quantiles and then integrates . It makes clear that the potentials change smoothly with the target law, even though the equality set remains a thin transport graph.
Interactive panel. Use the regularity and time controls to view continuous Kantorovich potentials and their active transport contacts.
In contrast to the primal problem, showing existence of solutions to the continuous dual is nontrivial: the constraint set is not compact and the objective is not coercive. The machinery of -transforms repairs this by replacing arbitrary potentials with regular best responses.
c-Transforms¶
The -transform is the operation that improves potentials without changing feasibility. It is both a proof device for dual attainment and the route from duality to Brenier’s convex potentials.
Best-Response Potentials¶
Keeping a dual potential fixed, one can maximize in closed form over the second potential in the dual problem:
The constraint is equivalent to .
Since is positive, maximizing is achieved by taking on the support of , equivalently -almost everywhere.

Discrete -transform as a lower envelope for costs . The red circles are four source atoms with potential values ; the gray curves are ; the colored curve is their lower envelope . This is the semi-discrete situation where the source space is finite.
The interactive envelope view exposes the exponent, the number of atoms, and the potential amplitude. This is the local mechanism behind many dual regularity statements: taking a pointwise minimum of translated costs inherits regularity from the cost.
Interactive panel. Use the support and curvature controls to see the c-transform as a lower envelope of shifted cost functions.
Proof
The constraint for all is equivalent, for each fixed , to
Since is nonnegative, the largest possible value of is obtained by saturating this pointwise upper bound on the support of . The proof for is identical after exchanging the two marginals.
For the primal formula, disintegrate any feasible plan as . Then
Equality is obtained by choosing conditional laws supported on minimizers of , or by approximate measurable selections when exact selections are unavailable.
The map replaces dual potentials by better ones: it preserves feasibility and improves the dual objective. Functions of the form and are called -concave and -concave functions. These partial minimizations define maximizers on the supports of and , while the transform itself defines functions on the whole ambient spaces. This gives a canonical extension of dual solutions beyond their active supports.
Proof
For each , set and . Since all functions are -Lipschitz,
This stability is crucial for dual attainment. When is Lipschitz on compact spaces, one can replace arbitrary admissible potentials by -transformed ones with a uniform Lipschitz bound; after fixing the harmless additive gauge, compactness follows from Arzela--Ascoli.
Euclidean Case¶
The Euclidean quadratic cost is the model case where -transforms become ordinary convex conjugates after removing quadratic terms. This is the algebraic bridge between Kantorovich duality and Brenier maps.
The cost on is central because it reduces the quadratic Wasserstein problem to convex duality. Indeed, for any ,
For ,
Thus -concave functions are negatives of convex functions. In the one-dimensional bilinear model case, the hard double -transform is an operation of taking concave envelopes.
Failure of Alternating Optimization¶
A crucial property of the Legendre transform is that , and that is the convex envelope of . These properties carry over to -transforms Rockafellar, 2015.
Proof
This invariance property shows that one can improve only once by exact alternating maximization:
Thus the method reaches a stationary point immediately. This is the classical failure mode of alternating maximization on a nonsmooth problem, where the nonsmooth constraint mixes the two variables. The workaround is to introduce smoothing, which the next chapters develop through entropic regularization and Sinkhorn scaling.
For the bilinear cost on a compact interval, -concave functions are ordinary concave functions and is the smallest concave majorant of . A hard transform removes non-concave oscillations in one closure step instead of producing a gradual ascent.

Hard -transforms for the bilinear cost . Dark curves are the double-transform closures and , which are concave majorants, while dashed lighter curves are the one-sided best responses and after a harmless vertical gauge shift. Exact alternating best responses are useful for certificates but do not give the smooth iterative dynamics later provided by entropic regularization.
The final interactive demo turns this algebra into a visible operation: change the roughness of the starting potential and observe that the hard double transform jumps directly to its concave closure.
Interactive panel. Use the iteration and asymmetry controls to see why alternating c-transforms can stall or fail without the right assumptions.
- Bertsimas, D., & Tsitsiklis, J. N. (1997). Introduction to Linear Optimization. Athena Scientific.
- Villani, C. (2003). Topics in Optimal Transportation (Vol. 58). American Mathematical Society.
- Santambrogio, F. (2015). Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling. Birkhäuser.
- Bertsekas, D. P. (1981). A new algorithm for the assignment problem. Mathematical Programming, 21(1), 152–171.
- Bertsekas, D. P. (1992). Auction algorithms for network flow problems: a tutorial introduction. Computational Optimization and Applications, 1(1), 7–66.
- Mérigot, Q., & Thibert, B. (2020). Optimal transport: discretization and algorithms. arXiv Preprint arXiv:2003.00855.
- Rockafellar, R. T. (2015). Convex Analysis. Princeton university press.