Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Optimal Matching between Point Clouds

This opening chapter isolates the simplest form of optimal transport: pairing two finite point clouds. The stakes are algorithmic and geometric at once: one sees the combinatorial nature of transport, the special simplicity of the line, and the first hints that convex relaxation will be necessary in higher dimension. Classical assignment algorithms such as the Hungarian method and auction methods Kuhn, 1955Bertsekas, 1992 provide the computational backdrop, while the geometric examples prepare the Kantorovich relaxation.

Monge Problem for Discrete Points

This section formulates matching as Monge’s deterministic transport problem on two equally weighted clouds. The one-dimensional case is a transparent reference case where the optimal map can be read off by sorting.

Matching Problem

Given a cost matrix (Ci,j)i{1,,n},j{1,,m}(C_{i,j})_{i \in \{1,\ldots,n\}, j \in \{1,\ldots,m\}} and assuming n=mn=m, the optimal assignment problem aims to find a bijection σ\sigma within the set Perm(n)\Perm(n) of permutations of nn elements that solves

minσPerm(n)1ni=1nCi,σ(i).\min_{\sigma \in \Perm(n)} \frac{1}{n}\sum_{i=1}^n C_{i,\sigma(i)}.

One could naively evaluate the cost above for all permutations in Perm(n)\Perm(n). However, this set has size n!n!, which becomes enormous even for small values of nn. In general, the optimal σ\sigma is not unique.

One-Dimensional Case

In one dimension, convex costs select monotone matchings.

Proof

If this property is violated, there exists (i,i)(i,i') such that (xixi)(yσ(i)yσ(i))<0(x_i-x_{i'})(y_{\sigma(i)}-y_{\sigma(i')}) < 0. One can then define a permutation σ~\tilde\sigma by swapping the match: σ~(i)=σ(i)\tilde\sigma(i)=\sigma(i') and σ~(i)=σ(i)\tilde\sigma(i')=\sigma(i). This gives a strictly better cost by the following elementary inequality.

Let h:RRh:\RR\to\RR be strictly convex and let x<xx<x' and y<yy<y'. Then

h(xy)+h(xy)<h(xy)+h(xy).h(x-y)+h(x'-y') < h(x-y')+h(x'-y).

Set d:=yy>0d:=y'-y>0 and define, for every sRs\in\RR,

D(s):=h(s)h(sd)dD(s):=\frac{h(s)-h(s-d)}{d}

and

Δ=h(xy)+h(xy)h(xy)h(xy)=d(D(xy)D(xy)).\Delta =h(x-y')+h(x'-y)-h(x-y)-h(x'-y') =d\bigl(D(x'-y)-D(x-y)\bigr).

Because hh is strictly convex, DD is strictly increasing. Since xy<xyx-y<x'-y, monotonicity gives D(xy)<D(xy)D(x-y)<D(x'-y), hence Δ>0\Delta>0.

This property extends by continuity to convex costs that are not strictly convex, such as xy|x-y|, but then the matching is not necessarily unique. For convex costs, the algorithm to compute an optimal transport is to sort the points: find permutations σX,σY\sigma_X,\sigma_Y such that

xσX(1)xσX(2)andyσY(1)yσY(2),x_{\sigma_X(1)} \leq x_{\sigma_X(2)} \leq \cdots \qquad\text{and}\qquad y_{\sigma_Y(1)} \leq y_{\sigma_Y(2)} \leq \cdots,

and then map xσX(k)x_{\sigma_X(k)} to yσY(k)y_{\sigma_Y(k)}. Equivalently, an optimal transport is σ=σYσX1\sigma=\sigma_Y\circ\sigma_X^{-1}. The computational cost is therefore O(nlogn)O(n\log n), using for instance quicksort.

If the distance profile is concave instead of convex, the geometry changes. For costs such as cp(x,y)=xypc_p(x,y)=|x-y|^p with 0<p<10<p<1, splitting a displacement into several smaller displacements is expensive, so optimal matchings tend to create long exchanges rather than the monotone equal-rank pairing. This is the strictly concave regime studied by Gangbo and McCann Gangbo & McCann, 1996.

The real line still gives special structure. After sorting all red and blue points together, the ordered sequence decomposes into maximal alternating chains, and local matching indicators can certify pairs that must be matched in an optimum. Removing such certified pairs and repeating yields an exact hierarchical algorithm for the unit-mass balanced assignment problem, with worst-case complexity O(n2)O(n^2) in the framework of Delon, Salomon and Sobolevski Delon et al., 2012. Very concave costs also motivate simpler greedy heuristics, studied for instance by Ottolini and Steinerberger Ottolini & Steinerberger, 2023. The point is that these methods are not generic linear-programming solvers; they use the one-dimensional order and the concavity of the distance profile.

<IPython.core.display.Image object>

One-dimensional assignments for ordered source and target clouds with costs cp(x,y)=xypc_p(x,y)=|x-y|^p. The top row uses single-Gaussian source and target clouds; the bottom row uses a denser two-component source and three-component target. For the convex quadratic cost, equal ranks are matched and the segments do not cross. For the concave cost, the optimum creates long crossing exchanges; the ordered line remains useful, but through the alternating-chain structure of concave transport rather than through monotone rearrangement.

Interactive panel. Use the sliders to change the two cost exponents and see how convex costs preserve sorted, non-crossing matches while concave costs favor longer crossing exchanges.

The next figure shows the monotone case more explicitly. The red and blue curves are smooth laws used to generate equal-weight empirical measures; the dots are inverse-CDF samples at common quantile levels. The monotone assignment connects equal ranks.

<IPython.core.display.Image object>

One-dimensional optimal matching by quantile sorting. The red and blue curves are smooth laws used to generate equal-weight empirical measures; the dots are inverse-CDF samples at common quantile levels. The monotone assignment connects equal ranks, both for two Gaussian mixtures and for the transport from one central Gaussian toward a three-mode target law.

The interactive panel exposes the point count and the two laws while keeping the monotone equal-rank construction in the background.

<Figure size 700x300 with 1 Axes>

Interactive panel. Use the point-count slider and the source/target menus to redraw the one-dimensional monotone assignment. The dots move, but the rule remains equal-rank matching after sorting.

If ϕ:RR\phi:\RR\to\RR is increasing, the same technique applies to costs of the form h(ϕ(x)ϕ(y))h(|\phi(x)-\phi(y)|) after a change of variable. A typical application is grayscale histogram equalization. The empirical cumulative distribution of image luminance values is transported to a prescribed target histogram, for instance a concentrated or reference-image histogram. In one dimension, the monotone rearrangement gives the exact transport map, so the operation is both computationally simple and geometrically faithful: it matches distributions of intensities rather than individual pixels.

<IPython.core.display.Image object>

Histogram equalization as one-dimensional Monge transport on pixel intensities. The map is the monotone rearrangement T=QβFαT=Q_\beta\circ F_\alpha; here β\beta is a truncated Gaussian concentrated near dark intensities. The images are interpolated pointwise by It=(1t)I+tT(I)I_t=(1-t)I+tT(I), and all histograms share the same vertical scale.

The interactive view below exposes the target mean, target standard deviation, and interpolation time.

<Figure size 820x320 with 2 Axes>

Interactive panel. Use the mean, standard-deviation, and time sliders to move the target intensity law and follow the resulting image equalization and histogram deformation.

If hh is strictly convex, then all optimal assignments are increasing, and if the points are all distinct, this increasing map is unique. If hh is not strictly convex, for instance c(x,y)=xyc(x,y)=|x-y|, non-increasing optimal assignments can also exist. This happens, for example, in the book-shifting problem with overlapping uniform distributions, where the mass in the intersection can stay fixed.

Optimal Transport on the Circle

The sorting rule on the line has a periodic analogue. Identify the circle with S1=R/Z\mathbb S^1=\RR/\mathbb Z, let

dS1(x,y):=minkZxy+k,cp(x,y):=dS1(x,y)p,p>1.d_{\mathbb S^1}(x,y):=\min_{k\in\mathbb Z}|x-y+k|, \qquad c_p(x,y):=d_{\mathbb S^1}(x,y)^p, \qquad p>1.

The only extra datum, compared with the line, is where one opens the circle. Once a cut has been chosen, the circle is unfolded into an interval and the one-dimensional monotone assignment can be used. In the discrete case, changing the cut is the same as applying a cyclic shift to one of the two circular orderings.

Proof

Call two matched pairs cyclically inverted if the circular order of their source endpoints is opposite to the circular order of their target endpoints. Among optimal assignments, choose one with the smallest number of such inversions. The elementary exchange step is the circular analogue of the line argument above: if two matched pairs are inverted, then cutting the circle in a gap which does not meet the four endpoints and choosing integer lifts realizes the four geodesic distances involved in the exchange as ordinary distances between two ordered source lifts and two oppositely ordered target lifts. The one-dimensional Monge inequality for the strictly convex function rrpr\mapsto |r|^p then shows that swapping the two targets cannot increase the cost, and decreases it unless the four endpoints are in a degenerate tie configuration.

Thus an optimal assignment can be chosen with no cyclic inversion. A bijection between two finite cyclically ordered sets with no cyclic inversion is a rotation of the order, hence a cyclic shift. This shift specifies how the two cyclic orderings should be opened; after this cut, the rotation becomes an ordinary linear order and the matching is the equal-rank monotone assignment on the unfolded interval. Conversely, each cut gives one such cyclic shift, so minimizing over the finitely many shifts gives an optimal discrete circle assignment. Repeated points or ties are obtained by the same argument after an arbitrarily small perturbation and a limiting passage. This is the discrete form of the fast circle-Monge construction of Delon et al., 2010.

<IPython.core.display.Image object>

Optimal transport on the circle by cutting and unfolding. Purple segments show the optimal matching and the green radius marks the chosen cut. The red and blue atoms live on two copies of the circle; the denser point clouds make the cyclic ordering visible. Once the circle is opened at this angle, the same matching appears as a monotone one-dimensional assignment on the interval, with the two green endpoints identified.

Interactive panel. Use the number of points, exponent, and shift controls to open the circle at different cuts and compare the induced cyclic assignments.

<IPython.core.display.Image object>

Optimal assignments between the same two point clouds for three powers of the Euclidean distance. The source atoms are semi-regular samples in a central disk, while the target atoms are semi-regular samples on a thin annulus; this canonical geometry is reused in later coupling and regularization figures. The feasible set is unchanged, but increasing pp penalizes the longest edges more strongly and changes the global organization of the permutation.

The interactive panel reuses the same disk-to-annulus geometry and exposes the number of points, the data geometry, and the cost exponents pp in c(x,y)=xypc(x,y)=\norm{x-y}^p.

<Figure size 915x345 with 3 Axes>

Interactive panel. Use the exponent sliders to compare how different powers of the distance reshape the same two-dimensional assignment problem.

Rational Weights

The strict assignment model is also tied to equal cardinalities and equal weights. As soon as the target resolution changes or the weights are not uniform, a permutation no longer describes the feasible transports. One instead needs a nonnegative transport matrix with prescribed row and column sums; this is the finite-dimensional Kantorovich relaxation developed in the next chapters.

<IPython.core.display.Image object>

From assignments to transport plans, using the same disk-to-annulus geometry. In the balanced equal-weight case, each source atom is matched to one target atom. With a target cloud that has half as many atoms, or with strongly nonuniform target weights, the coupling matrix can merge or split mass; segment thickness and opacity encode its nonzero entries, and blue marker areas encode the prescribed target masses.

The interactive panel below exposes the target resolution, target weights, and regularization level. The first displayed plan is sparse, while positive regularization values show the entropic smoothing used later in the Sinkhorn chapter.

<Figure size 915x345 with 3 Axes>

Interactive panel. Use the source and target sizes, weight pattern, and regularization sliders to see how unequal masses and finite resolution change the matching picture.

Proof

Any assignment between the duplicated source and target clouds defines integers nijn_{ij} counting how many copied particles of type xix_i are matched to copied particles of type yjy_j. These counts satisfy jnij=ki\sum_j n_{ij}=k_i and inij=j\sum_i n_{ij}=\ell_j, and the associated coupling Pij=nij/NP_{ij}=n_{ij}/N has marginals ki/Nk_i/N and j/N\ell_j/N. The assignment cost is

1Ni,jnijc(xi,yj)=i,jPijc(xi,yj).\frac1N\sum_{i,j} n_{ij}c(x_i,y_j) = \sum_{i,j}P_{ij}c(x_i,y_j).

Conversely, any nonnegative integer count matrix with those row and column sums can be realized by matching the corresponding duplicated particles. Finally, the transportation constraint matrix is totally unimodular, so the linear problem with integer supplies and demands has an optimal integer count matrix. Thus the optimum of the rational-weight Kantorovich problem is the same as the optimum of the duplicated uniform assignment problem.

<IPython.core.display.Image object>

Rational weights as duplicated uniform matchings, using the same disk-to-annulus geometry with fewer displayed atoms. The red and blue locations are kept fixed, while disk areas encode the integer multiplicities kik_i and j\ell_j. Solving the assignment problem after duplicating particles produces several collapsed segments attached to high-multiplicity atoms; this is the integer count matrix of the proposition.

Interactive panel. Use the site and multiplicity sliders to see how rational weights can be represented by duplicated unit masses before solving an ordinary matching problem.

Two-Dimensional Assignments

This efficient strategy to compute the OT in one dimension does not extend to higher dimensions. In two dimensions, as already noted by Monge, optimal segments for the Euclidean distance cannot cross.

Proof

If two segments [xi,yσ(i)][x_i,y_{\sigma(i)}] and [xj,yσ(j)][x_j,y_{\sigma(j)}] cross at an interior point zz, then the triangle inequality gives

xiyσ(j)+xjyσ(i)<xiyσ(i)+xjyσ(j).\norm{x_i-y_{\sigma(j)}}+\norm{x_j-y_{\sigma(i)}} < \norm{x_i-y_{\sigma(i)}}+\norm{x_j-y_{\sigma(j)}}.

The assignment obtained by swapping (σ(i),σ(j))(\sigma(i),\sigma(j)) therefore has a strictly smaller cost, contradicting optimality.

This property alone is not enough to lead to an efficient algorithm. Non-crossing is only a necessary local test, not a compact certificate of optimality. For instance, if nn sources and nn targets are placed alternately on the boundary of a convex polygon, the number of non-crossing perfect matchings is the Catalan number

Cn=1n+1(2nn)4nπn3/2.C_n=\frac{1}{n+1}\binom{2n}{n} \sim \frac{4^n}{\sqrt{\pi}n^{3/2}}.

Thus even after forbidding crossings, an exhaustive search remains exponential. The two-segment swap in the proof above is nevertheless useful: it explains why a crossing matching cannot be optimal, but it does not select among the exponentially many planar matchings that survive this local test.

Matching Algorithms

This section briefly locates matching within classical combinatorial optimization. Its main point is that efficient algorithms exist, but their cleanest analysis is obtained only after introducing the linear-programming viewpoint.

Efficient algorithms exist to solve the optimal matching problem. The most well-known are the Hungarian method and auction algorithms Kuhn, 1955Bertsekas, 1981Bertsekas, 1992. Auction algorithms use prices on the target points: each source bids for the target with largest reduced profit, the target price is increased, and the process terminates when the ϵ\epsilon-complementary slackness conditions are satisfied. For integer costs, choosing ϵ<1/n\epsilon<1/n gives an exact optimum after a finite number of bids Bertsekas, 1992. The dual chapter revisits this algorithm after Kantorovich duality and explains why it is a dual price method, parallel in spirit to Sinkhorn scaling.

Hungarian Primal-Dual Method

The Hungarian method is best understood as a certificate-building algorithm for the assignment linear program. It maintains a partial matching MM and dual prices (ui,vj)(u_i,v_j) satisfying

ui+vjCi,ji,j.u_i+v_j\leq C_{i,j} \qquad \forall i,j.

The equality graph E(u,v)={(i,j):ui+vj=Ci,j}E(u,v)=\{(i,j):u_i+v_j=C_{i,j}\} contains the edges whose reduced cost is zero. The algorithm only augments MM along alternating paths made of equality edges. Starting from an unmatched source, it grows an alternating tree with source set SS and target set TT. If the tree reaches an unmatched target, the matching is augmented along the path. If no such edge exists, the dual variables are shifted by the smallest slack

δ=miniS, jT(Ci,juivj),uiui+δ (iS),vjvjδ (jT).\delta=\min_{i\in S,\ j\notin T}\bigl(C_{i,j}-u_i-v_j\bigr), \qquad u_i\leftarrow u_i+\delta\ (i\in S), \qquad v_j\leftarrow v_j-\delta\ (j\in T).

This update preserves all inequalities ui+vjCi,ju_i+v_j\leq C_{i,j}, keeps the current alternating tree tight, and creates at least one new equality edge leaving SS. Maintaining these slacks incrementally gives the standard O(n3)O(n^3) implementation for an n×nn\times n assignment problem.

The following figure summarizes actual iterates by displaying only the evolving partial assignment: unmatched rows are shown as flat rows to keep a fixed matrix format, and matched rows are shown as one-hot rows.

<IPython.core.display.Image object>

Matrix view of actual Hungarian primal-dual iterates on a diagonally dominant ordered one-dimensional squared-distance assignment. Each panel records the current partial assignment state: unassigned rows are kept flat, while assigned rows are one-hot. The snapshots are taken at initialization and after two, four, six and eight augmentations; for this pedagogical instance the partial assignments grow along the diagonal, and the final matrix is the identity assignment certified by complementary slackness.

Interactive panel. Use the size, jitter, and seed controls to regenerate the assignment instance and inspect snapshots of the Hungarian augmentation process.

Proof

For any permutation τ\tau, dual feasibility gives

iCi,τ(i)i(ui+vτ(i))=iui+jvj.\sum_i C_{i,\tau(i)} \geq \sum_i (u_i+v_{\tau(i)}) = \sum_i u_i+\sum_j v_j.

This is the weak duality lower bound. If σ\sigma is contained in the equality graph, then

iCi,σ(i)=iui+jvj,\sum_i C_{i,\sigma(i)} = \sum_i u_i+\sum_j v_j,

so the primal cost of σ\sigma reaches the dual lower bound and is optimal.

It remains to justify finite termination and the complexity bound. At each successful augmentation, the matching cardinality increases by one, so there are at most nn augmentations. During one augmentation phase, the algorithm grows an alternating tree in the equality graph. If no augmenting path is available, the dual update uses the smallest slack of an edge leaving the current tree. For edges inside the tree, adding δ\delta to source labels and subtracting δ\delta from target labels preserves tightness; for edges from SS to TcT^c, the definition of δ\delta preserves feasibility and makes at least one new edge tight; all other inequalities are unchanged or become looser. Thus the reachable sets strictly grow between two failed augmentation attempts, and they can grow at most nn times within one phase. If the current slacks miniS(Ci,juivj)\min_{i\in S}(C_{i,j}-u_i-v_j) are updated when a source enters SS, each tree expansion costs O(n)O(n). A phase therefore costs O(n2)O(n^2), and the nn phases give O(n3)O(n^3) operations. Hence the method reaches a perfect optimal matching.

References
  1. Kuhn, H. W. (1955). The Hungarian Method for the Assignment Problem. Naval Research Logistics Quarterly, 2(1–2), 83–97. 10.1002/nav.3800020109
  2. Bertsekas, D. P. (1992). Auction algorithms for network flow problems: a tutorial introduction. Computational Optimization and Applications, 1(1), 7–66.
  3. Gangbo, W., & McCann, R. J. (1996). The geometry of optimal transportation. Acta Mathematica, 177(2), 113–161.
  4. Delon, J., Salomon, J., & Sobolevski, A. (2012). Local matching indicators for transport problems with concave costs. SIAM Journal on Discrete Mathematics, 26(2), 801–827.
  5. Ottolini, A., & Steinerberger, S. (2023). Greedy Matching in Optimal Transport with Concave Cost. arXiv Preprint arXiv:2307.03140.
  6. Delon, J., Salomon, J., & Sobolevski, A. (2010). Fast transport optimization for Monge costs on the circle. SIAM Journal on Applied Mathematics, 70(7), 2239–2258.
  7. Bertsekas, D. P. (1981). A new algorithm for the assignment problem. Mathematical Programming, 21(1), 152–171.