Transportation theory (mathematics)

In mathematics and economics, transportation theory or transport theory is a name given to the study of optimal transportation and allocation of resources. The problem was formalized by the French mathematician Gaspard Monge in 1781.[1]

In the 1920s A.N. Tolstoi was one of the first to study the transportation problem mathematically. In 1930, in the collection Transportation Planning Volume I for the National Commissariat of Transportation of the Soviet Union, he published a paper "Methods of Finding the Minimal Kilometrage in Cargo-transportation in space".[2][3]

Major advances were made in the field during World War II by the Soviet mathematician and economist Leonid Kantorovich.[4] Consequently, the problem as it is stated is sometimes known as the Monge–Kantorovich transportation problem.[5] The linear programming formulation of the transportation problem is also known as the Hitchcock–Koopmans transportation problem.[6]

Motivation

Mines and factories

Suppose that we have a collection of m mines mining iron ore, and a collection of n factories which use the iron ore that the mines produce. Suppose for the sake of argument that these mines and factories form two disjoint subsets M and F of the Euclidean plane R². Suppose also that we have a cost function c : R² × R² → [0, ∞), so that c(x, y) is the cost of transporting one shipment of iron from x to y. For simplicity, we ignore the time taken to do the transporting. We also assume that each mine can supply only one factory (no splitting of shipments) and that each factory requires precisely one shipment to be in operation (factories cannot work at half- or double-capacity). Having made the above assumptions, a transport plan is a bijection T : M → F. In other words, each mine m ∈ M supplies precisely one target factory T(m) ∈ F and each factory is supplied by precisely one mine. We wish to find the optimal transport plan, the plan T whose total cost

c(T):=\sum _{m\in M}c(m,T(m))

is the least of all possible transport plans from M to F. This motivating special case of the transportation problem is an instance of the assignment problem. More specifically, it is equivalent to finding a minimum weight matching in a bipartite graph.
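For a handful of mines and factories, the optimal bijection can be found by checking every permutation. The following is a minimal sketch under stated assumptions: the coordinates are hypothetical, the cost is taken to be Euclidean distance, and the function name `optimal_plan` is introduced here for illustration only.

```python
from itertools import permutations
from math import dist

def optimal_plan(mines, factories, cost):
    """Return (best_cost, plan), where plan[i] is the index of the factory fed by mine i."""
    best_cost, best_plan = float("inf"), None
    # Every bijection M -> F corresponds to one permutation of the factory indices.
    for perm in permutations(range(len(factories))):
        total = sum(cost(mines[i], factories[j]) for i, j in enumerate(perm))
        if total < best_cost:
            best_cost, best_plan = total, perm
    return best_cost, best_plan

# Hypothetical positions in the plane (not taken from the text).
mines = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
factories = [(4.0, 1.0), (0.0, 4.0), (1.0, 0.0)]
best_cost, plan = optimal_plan(mines, factories, dist)
```

Exhaustive search examines all permutations, so it is only practical for very small instances; larger assignment problems are normally handled by polynomial-time methods such as the Hungarian algorithm.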

Moving books: the importance of the cost function

The following simple example illustrates the importance of the cost function in determining the optimal transport plan. Suppose that we have n books of equal width on a shelf (the real line), arranged in a single contiguous block. We wish to rearrange them into another contiguous block, but shifted one book-width to the right. Two obvious candidates for the optimal transport plan present themselves:

move all n books one book-width to the right ("many small moves");
move the left-most book n book-widths to the right and leave all other books fixed ("one big move").

If the cost function is proportional to Euclidean distance (c(x, y) = α|x − y|) then these two candidates are both optimal. If, on the other hand, we choose the strictly convex cost function proportional to the square of Euclidean distance (c(x, y) = α|x − y|²), then the "many small moves" option becomes the unique minimizer.
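The comparison between the two candidate plans can be checked numerically. In this small sketch the number of books `n`, the constant `alpha` and the helper `plan_cost` are illustrative assumptions; books are placed at integer positions measured in book-widths.

```python
def plan_cost(moves, cost):
    """Total cost of a plan given as (start, end) positions, one pair per book."""
    return sum(cost(x, y) for x, y in moves)

n = 5          # number of books (hypothetical)
alpha = 1.0    # proportionality constant

# Books start at positions 0..n-1 and must end up occupying positions 1..n.
many_small_moves = [(i, i + 1) for i in range(n)]
one_big_move = [(0, n)] + [(i, i) for i in range(1, n)]

def linear(x, y):
    return alpha * abs(x - y)

def quadratic(x, y):
    return alpha * abs(x - y) ** 2

linear_costs = (plan_cost(many_small_moves, linear), plan_cost(one_big_move, linear))
quadratic_costs = (plan_cost(many_small_moves, quadratic), plan_cost(one_big_move, quadratic))
```

With these values both plans cost 5 under the linear cost, whereas under the quadratic cost the "many small moves" plan costs 5 and the "one big move" plan costs 25.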

Note that the above cost functions consider only the horizontal distance traveled by the books, not the horizontal distance traveled by a device used to pick each book up and move the book into position. If the latter is considered instead, then, of the two transport plans, the second is always optimal for the Euclidean distance, while, provided there are at least 3 books, the first transport plan is optimal for the squared Euclidean distance.

Hitchcock problem

The following transportation problem formulation is credited to F. L. Hitchcock:[7]

Suppose there are $m$ sources $x_{1},\ldots ,x_{m}$ for a commodity, with $a(x_{i})$ units of supply at $x_{i}$, and $n$ sinks $y_{1},\ldots ,y_{n}$ for the commodity, with demand $b(y_{j})$ at $y_{j}$. If $a(x_{i},y_{j})$ is the unit cost of shipment from $x_{i}$ to $y_{j}$, find a flow that satisfies demand from supplies and minimizes the flow cost. This challenge in logistics was taken up by D. R. Fulkerson[8] and in the book Flows in Networks (1962) written with L. R. Ford Jr.[9]
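A small instance of this problem can be solved as a minimum-cost flow. The sketch below uses successive shortest augmenting paths found with Bellman-Ford, a standard textbook technique that is not claimed to be the method of Hitchcock or of Ford and Fulkerson. It assumes integer data and a balanced problem (total supply equal to total demand); the function name `solve_hitchcock` and the example numbers are hypothetical.

```python
from math import inf

def solve_hitchcock(supply, demand, cost):
    """Return (total_cost, flow); flow[i][j] is the amount shipped from source i to sink j."""
    m, n = len(supply), len(demand)
    assert sum(supply) == sum(demand), "this sketch assumes a balanced problem"
    size = m + n + 2
    s, t = 0, size - 1                      # super-source and super-sink
    graph = [[] for _ in range(size)]       # edge = [head, residual capacity, cost, reverse index]

    def add_edge(u, v, cap, c):
        graph[u].append([v, cap, c, len(graph[v])])
        graph[v].append([u, 0, -c, len(graph[u]) - 1])

    for i in range(m):
        add_edge(s, 1 + i, supply[i], 0)
    for j in range(n):
        add_edge(1 + m + j, t, demand[j], 0)
    route = {}
    for i in range(m):
        for j in range(n):
            route[i, j] = (1 + i, len(graph[1 + i]))
            add_edge(1 + i, 1 + m + j, min(supply[i], demand[j]), cost[i][j])

    total = 0
    while True:
        # Bellman-Ford: cheapest s-t path in the residual network.
        dist_to, prev = [inf] * size, [None] * size
        dist_to[s] = 0
        for _ in range(size - 1):
            changed = False
            for u in range(size):
                if dist_to[u] == inf:
                    continue
                for k, (v, cap, c, _rev) in enumerate(graph[u]):
                    if cap > 0 and dist_to[u] + c < dist_to[v]:
                        dist_to[v], prev[v], changed = dist_to[u] + c, (u, k), True
            if not changed:
                break
        if dist_to[t] == inf:
            break                            # no augmenting path left
        push, v = inf, t
        while v != s:                        # bottleneck capacity along the path
            u, k = prev[v]
            push, v = min(push, graph[u][k][1]), u
        v = t
        while v != s:                        # send the flow and update residual capacities
            u, k = prev[v]
            edge = graph[u][k]
            edge[1] -= push
            graph[v][edge[3]][1] += push
            v = u
        total += push * dist_to[t]

    flow = [[0] * n for _ in range(m)]
    for (i, j), (u, k) in route.items():
        # The residual capacity of the reverse edge equals the flow sent on the route.
        flow[i][j] = graph[1 + m + j][graph[u][k][3]][1]
    return total, flow

# Hypothetical instance: two sources, two sinks.
total_cost, flow = solve_hitchcock([10, 30], [20, 20], [[1, 4], [3, 2]])
```

For this instance the total cost of any feasible plan is 120 minus four times the amount shipped from the first source to the first sink, so the optimum ships all 10 units of the first source along that route.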

Tjalling Koopmans is also credited with formulations of transport economics and allocation of resources.

Numerical solution in Excel

With large numbers of routes, the problem is solved numerically.

Inputs: The transportation cost cells are T. The supply data cells are S. The demand data cells are D.

Think of each unit of supply as a large box (a shipping container).

Outputs: The shipment plan is X.

The current shipping cost is K.

Objective: Maximize the cost reduction

MAX R(X)=K-T·X

The shipment plan, X, must satisfy three types of constraint

(1) Non-negativity constraints X >= 0

(2) Supply constraints S-1•X >= 0

(3) Demand constraints X•1-D >= 0

One way to set up the problem in Excel is depicted in the table below.

The total shipping cost T·X is the sum of the products of the cost and shipment terms in the array [E2:H3].

R-V solution method (an update of the simplex method):

For a small number of routes, the problem can be solved rather like a beginner's crossword puzzle or Sudoku.

The R-V Solution Method introduces Virtual unit Costs c, Virtual Prices p and a Virtual Trader.

The Virtual Trader provides Real implications.

Crucially, the V-trader is a price taker.

Then there will be excess demand on any strictly profitable route and demand will be zero on any strictly unprofitable route.

VIRTUAL PROFIT MAXIMIZATION VPM

The unit profit on each route is p_j - t_ij - c_i. These are calculated in the V-PROFIT Box at the bottom right of the Table.

(If you are working with Excel, enter these formulas and then use Solver for the numerically computed maximum.)

The profit must be zero on all utilized routes and no route is strictly profitable.
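The VPM conditions can be checked mechanically once a shipment plan and a set of virtual prices are given. The sketch below is illustrative only: the data are hypothetical (they are not the table from the text), and the function names are introduced here.

```python
def virtual_profits(t, c, p):
    """Unit profit p_j - t_ij - c_i on every route (i = source, j = destination)."""
    return [[p[j] - t[i][j] - c[i] for j in range(len(p))] for i in range(len(c))]

def satisfies_vpm(x, t, c, p, tol=1e-9):
    """True if no route is strictly profitable and every utilized route breaks even."""
    profit = virtual_profits(t, c, p)
    return all(
        profit[i][j] <= tol and (x[i][j] == 0 or abs(profit[i][j]) <= tol)
        for i in range(len(c))
        for j in range(len(p))
    )

# Hypothetical data.
t = [[1, 4], [3, 2]]        # unit transport costs t_ij
x = [[10, 0], [10, 20]]     # shipment plan
c = [2, 0]                  # V-costs at the sources
p = [3, 2]                  # V-prices at the destinations

profits = virtual_profits(t, c, p)
plan_is_consistent = satisfies_vpm(x, t, c, p)
```

In this example the only unused route, from the first source to the second destination, has a strictly negative unit profit, while the three utilized routes break even.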

STEP 1: Build a table like the one below. In the table, small numbers are data points. Large bold numbers are variables.

In each column the V-PRICE must be at least the minimum cost to satisfy VPM.


	A	B	D			E	F	G	H
1			V-prices			5		5
				V- costs		P1		P2
				V- costs						V-Profit
2	S1	10	Supply	1	C1	4	10	6	0	0	-1
3	S2	30	Supply	0	C2	3	20	5	10	0	0
4				Demands			20		20
4							D1		D2

STEP 2: Make the lowest cost supplier the #1 supplier (top row).

STEP 3: Fill the orders sequentially. The first route to be filled should be in the top row [S1:D1]. Then fill sequentially by cost, so [S2:D1] is filled next.

STEP 4: The last order to be filled is in italics. The source in this row is the less valuable source, so C2 is zero. Fill in the cell to the left of C2.

STEP 5: Solve for the V-Prices and V-costs.

On each route select V-COSTS and V-PRICES so that the V-Trader breaks even on all the active routes.

Start with the column that has the fewest entries (Column 2)

V-SUDOKU

The V-Costs are initially left blank (zero). To break even in Column 2, P2 = C2 + T22 = 0 + 5 = 5.

In Column 1 both routes are used. Since C2 is zero, C1 = 1. Then P1 = C1 + T11 = 5.

V-CHECK If you set this V-PUZZLE up on a spreadsheet, the profit BOX will already be filled in.
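The break-even step can be sketched as a simple propagation: starting from the source whose V-cost is set to zero, each active route fixes the one unknown at its other end through p_j = c_i + t_ij. The function name, the argument layout and the data below are assumptions made for illustration; the data are hypothetical and are not the table from the text.

```python
def solve_virtual_prices(t, active, zero_source):
    """Return (c, p) such that p[j] == c[i] + t[i][j] on every active route (i, j)."""
    m, n = len(t), len(t[0])
    c, p = [None] * m, [None] * n
    c[zero_source] = 0
    pending = list(active)
    while pending:
        progress = False
        for i, j in list(pending):
            if c[i] is not None and p[j] is None:
                p[j] = c[i] + t[i][j]        # price that lets the trader break even
            elif p[j] is not None and c[i] is None:
                c[i] = p[j] - t[i][j]        # V-cost implied by a known price
            elif c[i] is None:
                continue                     # neither end known yet; retry later
            pending.remove((i, j))
            progress = True
        if not progress:
            raise ValueError("active routes do not link every source and destination")
    return c, p

# Hypothetical data: unit costs and the routes used by an optimal plan.
t = [[1, 4], [3, 2]]
active_routes = [(0, 0), (1, 0), (1, 1)]
v_costs, v_prices = solve_virtual_prices(t, active_routes, zero_source=1)
```

Here the second source is taken as the one with zero V-cost, which yields V-prices 3 and 2 and a V-cost of 2 at the first source.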

The real value of the V-prices

Supply:

If you add a unit of supply at S1 you can lower the transportation cost by adding 1 to cell [S1:C2] and subtracting 1 from cell [S2:C2].

This lowers shipping costs by 1. This is the meaning of C1. If the firm can rent an additional container at less than 1 (think "one thousand") there are additional cost savings.

If you try this at S2, the additional container does not lower shipping costs. This is the meaning of C2 = 0.

Demand:

What would be the reduction in shipping costs if another unit of the product could be obtained locally (at the Destination)?

Try reducing D1 by one unit. The shipping cost falls by...? Yes, by the V-PRICE.

Using the virtual trader method therefore yields virtual prices and costs of real importance.

Programming note:

If you use a canned maximizing program like Excel's Solver add-in, it will get to the correct answer in a flash.

If you look at the "Lagrange Multipliers" or "shadow prices" that may appear in a sensitivity report, they can be confusing.

Since Solver provides the solution, all you have to do is Sudoku your way to the V-Costs and V-prices.

Here is the set-up for 3 suppliers and 3 destinations. I suggest that you set S3 = 0 initially and Sudoku your way to the solution.


		V-prices			3		5		6
			V- costs		p₁		P₂		P₃
			V- costs								V-Profit
S1	10	Supply	1	C₁	8		1	+	6
S2	30			c₂	3		5	+	7
S3	20			C₃	4		9	0	2	+
			Demands			15		25		20
						D1		D2		D3

Abstract formulation of the problem

Monge and Kantorovich formulations

The transportation problem as it is stated in modern or more technical literature looks somewhat different because of the development of Riemannian geometry and measure theory. The mines-factories example, simple as it is, is a useful reference point when thinking of the abstract case. In this setting, we allow the possibility that we may not wish to keep all mines and factories open for business, and allow mines to supply more than one factory, and factories to accept iron from more than one mine.

Let $X$ and $Y$ be two separable metric spaces such that any probability measure on $X$ (or $Y$ ) is a Radon measure (i.e. they are Radon spaces). Let $c:X\times Y\to [0,\infty ]$ be a Borel-measurable function. Given probability measures $\mu$ on $X$ and $\nu$ on $Y$ , Monge's formulation of the optimal transportation problem is to find a transport map $T:X\to Y$ that realizes the infimum

\inf \left\{\left.\int _{X}c(x,T(x))\,\mathrm {d} \mu (x)\;\right|\;T_{*}(\mu )=\nu \right\},

where $T_{*}(\mu )$ denotes the push forward of $\mu$ by $T$ . A map $T$ that attains this infimum (i.e. makes it a minimum instead of an infimum) is called an "optimal transport map".

Monge's formulation of the optimal transportation problem can be ill-posed, because sometimes there is no $T$ satisfying $T_{*}(\mu )=\nu$ : this happens, for example, when $\mu$ is a Dirac measure but $\nu$ is not.

We can improve on this by adopting Kantorovich's formulation of the optimal transportation problem, which is to find a probability measure $\gamma$ on $X\times Y$ that attains the infimum

\inf \left\{\left.\int _{X\times Y}c(x,y)\,\mathrm {d} \gamma (x,y)\right|\gamma \in \Gamma (\mu ,\nu )\right\},

where $\Gamma (\mu ,\nu )$ denotes the collection of all probability measures on $X\times Y$ with marginals $\mu$ on $X$ and $\nu$ on $Y$ . It can be shown[10] that a minimizer for this problem always exists when the cost function $c$ is lower semi-continuous and $\Gamma (\mu ,\nu )$ is a tight collection of measures (which is guaranteed for Radon spaces $X$ and $Y$ ). (Compare this formulation with the definition of the Wasserstein metric $W_{1}$ on the space of probability measures.) A gradient descent formulation for the solution of the Monge–Kantorovich problem was given by Sigurd Angenent, Steven Haker, and Allen Tannenbaum.[11]

Duality formula

The minimum of the Kantorovich problem is equal to

\sup \left(\int _{X}\varphi (x)\,\mathrm {d} \mu (x)+\int _{Y}\psi (y)\,\mathrm {d} \nu (y)\right),

where the supremum runs over all pairs of bounded and continuous functions $\varphi :X\rightarrow \mathbf {R}$ and $\psi :Y\rightarrow \mathbf {R}$ such that

\varphi (x)+\psi (y)\leq c(x,y).
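That the supremum cannot exceed the minimum (weak duality) follows in one line. As a sketch: for any $\gamma \in \Gamma (\mu ,\nu )$ and any admissible pair $(\varphi ,\psi )$, the marginal conditions on $\gamma$ together with the constraint $\varphi (x)+\psi (y)\leq c(x,y)$ give

\int _{X}\varphi (x)\,\mathrm {d} \mu (x)+\int _{Y}\psi (y)\,\mathrm {d} \nu (y)=\int _{X\times Y}\left(\varphi (x)+\psi (y)\right)\,\mathrm {d} \gamma (x,y)\leq \int _{X\times Y}c(x,y)\,\mathrm {d} \gamma (x,y).

The substance of the duality formula is that this inequality becomes an equality at the optimum.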

Economic interpretation

The economic interpretation is clearer if signs are flipped. Let ${\textstyle x\in X}$ stand for the vector of characteristics of a worker, ${\textstyle y\in Y}$ for the vector of characteristics of a firm, and ${\textstyle \Phi (x,y)=-c(x,y)}$ for the economic output generated by worker ${\textstyle x}$ matched with firm ${\textstyle y}$. Setting ${\textstyle u(x)=-\varphi (x)}$ and ${\textstyle v(y)=-\psi (y)}$, the Monge–Kantorovich problem can be rewritten as

\sup \left\{\int _{X\times Y}\Phi (x,y)d\gamma (x,y),\gamma \in \Gamma (\mu ,\nu )\right\}

which has dual:

\inf \left\{\int _{X}u(x)\,d\mu (x)+\int _{Y}v(y)\,d\nu (y):u(x)+v(y)\geq \Phi (x,y)\right\}

where the infimum runs over bounded and continuous functions ${\textstyle u:X\rightarrow \mathbf {R} }$ and ${\textstyle v:Y\rightarrow \mathbf {R} }$. If the dual problem has a solution, one can see that:

v(y)=\sup _{x}\left\{\Phi (x,y)-u(x)\right\}

so that ${\textstyle u(x)}$ is interpreted as the equilibrium wage of a worker of type ${\textstyle x}$, and ${\textstyle v(y)}$ as the equilibrium profit of a firm of type ${\textstyle y}$.[12]

Solution of the problem

Optimal transportation on the real line

[Figure: Optimal transportation matrix]

[Figure: Continuous optimal transport]

For $1\leq p<\infty$ , let ${\mathcal {P}}_{p}(\mathbf {R} )$ denote the collection of probability measures on $\mathbf {R}$ that have finite $p$ -th moment. Let $\mu ,\nu \in {\mathcal {P}}_{p}(\mathbf {R} )$ and let $c(x,y)=h(x-y)$ , where $h:\mathbf {R} \rightarrow [0,\infty )$ is a convex function.

If $\mu$ has no atom, i.e., if the cumulative distribution function $F_{\mu }:\mathbf {R} \rightarrow [0,1]$ of $\mu$ is a continuous function, then $F_{\nu }^{-1}\circ F_{\mu }:\mathbf {R} \to \mathbf {R}$ is an optimal transport map. It is the unique optimal transport map if $h$ is strictly convex.
We have

\min _{\gamma \in \Gamma (\mu ,\nu )}\int _{\mathbf {R} ^{2}}c(x,y)\,\mathrm {d} \gamma (x,y)=\int _{0}^{1}c\left(F_{\mu }^{-1}(s),F_{\nu }^{-1}(s)\right)\,\mathrm {d} s.

The proof of this solution appears in Rachev & Rüschendorf (1998).[13]
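For two empirical measures with the same number of equally weighted points, the quantile formula above reduces to pairing the k-th smallest point of one sample with the k-th smallest point of the other. The sketch below compares that monotone pairing with an exhaustive search over all pairings. The samples and function names are hypothetical, and since empirical measures have atoms this illustrates the value formula rather than the atomless-map statement.

```python
from itertools import permutations

def monotone_cost(xs, ys, h):
    """Average cost of pairing the k-th smallest x with the k-th smallest y."""
    return sum(h(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

def brute_force_cost(xs, ys, h):
    """Minimum average cost over all one-to-one pairings (tiny inputs only)."""
    return min(
        sum(h(x - y) for x, y in zip(xs, perm)) for perm in permutations(ys)
    ) / len(xs)

def squared(d):
    return d * d              # a strictly convex choice of h

xs = [0.3, -1.2, 2.5, 0.9]    # hypothetical sample defining mu
ys = [1.1, 4.0, -0.5, 2.2]    # hypothetical sample defining nu

sorted_value = monotone_cost(xs, ys, squared)
exhaustive_value = brute_force_cost(xs, ys, squared)
```

Sorting costs O(n log n), which is why the one-dimensional problem is so much cheaper than the general assignment problem.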

Discrete version and linear programming formulation

In the case where the margins ${\textstyle \mu }$ and ${\textstyle \nu }$ are discrete, let ${\textstyle \mu _{x}}$ and ${\textstyle \nu _{y}}$ be the probability masses respectively assigned to ${\textstyle x\in \mathbf {X} }$ and ${\textstyle y\in \mathbf {Y} }$ , and let ${\textstyle \gamma _{xy}}$ be the probability of an ${\textstyle xy}$ assignment. The objective function in the primal Kantorovich problem is then

\sum _{x\in \mathbf {X} ,y\in \mathbf {Y} }\gamma _{xy}c_{xy}

and the constraint ${\textstyle \gamma \in \Gamma \left(\mu ,\nu \right)}$ is expressed as

\sum _{y\in \mathbf {Y} }\gamma _{xy}=\mu _{x},\forall x\in \mathbf {X}

and

\sum _{x\in \mathbf {X} }\gamma _{xy}=\nu _{y},\forall y\in \mathbf {Y} .

To input this into a linear programming problem, we need to vectorize the matrix ${\textstyle \gamma _{xy}}$ by stacking either its columns or its rows; we call this operation ${\textstyle \operatorname {vec} }$. In column-major order, the constraints above can be rewritten as

\left(1_{1\times \left\vert \mathbf {Y} \right\vert }\otimes I_{\left\vert \mathbf {X} \right\vert }\right)\operatorname {vec} \left(\gamma \right)=\mu

and

\left(I_{\left\vert \mathbf {Y} \right\vert }\otimes 1_{1\times \left\vert \mathbf {X} \right\vert }\right)\operatorname {vec} \left(\gamma \right)=\nu

where ${\textstyle \otimes }$ is the Kronecker product, ${\textstyle 1_{n\times m}}$ is a matrix of size ${\textstyle n\times m}$ with all entries equal to one, and ${\textstyle I_{n}}$ is the identity matrix of size ${\textstyle n}$. As a result, setting ${\textstyle z=\operatorname {vec} \left(\gamma \right)}$, the linear programming formulation of the problem is

{\begin{aligned}&{\text{Minimize }}\operatorname {vec} (c)^{\top }z\\[4pt]&{\text{subject to:}}\\[4pt]&z\geq 0\\[4pt]&{\begin{pmatrix}1_{1\times \left\vert \mathbf {Y} \right\vert }\otimes I_{\left\vert \mathbf {X} \right\vert }\\I_{\left\vert \mathbf {Y} \right\vert }\otimes 1_{1\times \left\vert \mathbf {X} \right\vert }\end{pmatrix}}z={\binom {\mu }{\nu }}\end{aligned}}

which can be readily input into a large-scale linear programming solver (see chapter 3.4 of Galichon (2016)[12]).
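The Kronecker-product form of the constraints can be assembled and checked without any numerical library. The following sketch builds the stacked constraint matrix for small hypothetical marginals and verifies it on the independent coupling, which is always feasible. All helper names are illustrative assumptions, and no solver is invoked.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def vec(gamma):
    """Column-major vectorization: stack the columns of gamma."""
    return [gamma[x][y] for y in range(len(gamma[0])) for x in range(len(gamma))]

def matvec(a, z):
    return [sum(a_ij * z_j for a_ij, z_j in zip(row, z)) for row in a]

mu = [0.5, 0.3, 0.2]     # hypothetical marginal on X
nu = [0.6, 0.4]          # hypothetical marginal on Y
nx, ny = len(mu), len(nu)

row_sum_block = kron([[1] * ny], identity(nx))   # 1_{1 x |Y|} (x) I_{|X|}
col_sum_block = kron(identity(ny), [[1] * nx])   # I_{|Y|} (x) 1_{1 x |X|}
constraint_matrix = row_sum_block + col_sum_block

gamma = [[m * v for v in nu] for m in mu]        # independent coupling mu (x) nu
lhs = matvec(constraint_matrix, vec(gamma))      # should reproduce (mu, nu) stacked
```

The resulting matrix has one row per point of X and one per point of Y, and one column per entry of the coupling, which is the shape a linear programming solver expects for equality constraints.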

Semi-discrete case

In the semi-discrete case, ${\textstyle X=Y=\mathbf {R} ^{d}}$ and ${\textstyle \mu }$ is a continuous distribution over ${\textstyle \mathbf {R} ^{d}}$, while ${\textstyle \nu =\sum _{j=1}^{J}\nu _{j}\delta _{y_{j}}}$ is a discrete distribution which assigns probability mass ${\textstyle \nu _{j}}$ to site ${\textstyle y_{j}\in \mathbf {R} ^{d}}$. In this case, we can see[14] that the primal and dual Kantorovich problems respectively boil down to:

\inf \left\{\int _{X}\sum _{j=1}^{J}c(x,y_{j})\,d\gamma _{j}(x),\gamma \in \Gamma (\mu ,\nu )\right\}

for the primal, where ${\textstyle \gamma \in \Gamma \left(\mu ,\nu \right)}$ means that ${\textstyle \int _{X}d\gamma _{j}\left(x\right)=\nu _{j}}$ and ${\textstyle \sum _{j}d\gamma _{j}\left(x\right)=d\mu \left(x\right)}$ , and:

\sup \left\{\int _{X}\varphi (x)d\mu (x)+\sum _{j=1}^{J}\psi _{j}\nu _{j}:\psi _{j}+\varphi (x)\leq c\left(x,y_{j}\right)\right\}

for the dual, which can be rewritten as:

\sup _{\psi \in \mathbf {R} ^{J}}\left\{\int _{X}\inf _{j}\left\{c\left(x,y_{j}\right)-\psi _{j}\right\}d\mu (x)+\sum _{j=1}^{J}\psi _{j}\nu _{j}\right\}

which is a finite-dimensional convex optimization problem that can be solved by standard techniques, such as gradient descent.
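The finite-dimensional dual can be illustrated in one dimension. The partial derivative of the objective with respect to ${\textstyle \psi _{j}}$ is ${\textstyle \nu _{j}}$ minus the ${\textstyle \mu }$-mass of the set of points whose minimizing index is ${\textstyle j}$, so a gradient step raises ${\textstyle \psi _{j}}$ for sites that currently receive too little mass. The sketch below makes several assumptions for illustration: ${\textstyle \mu }$ is the uniform distribution on [0, 1] approximated by a regular grid, the cost is ${\textstyle |x-y|^{2}/2}$, and the step size, iteration count and function names are arbitrary choices.

```python
def cell_masses(psi, sites, grid):
    """Fraction of grid points x for which c(x, y_j) - psi_j is smallest at site j."""
    counts = [0] * len(sites)
    for x in grid:
        j = min(range(len(sites)), key=lambda k: 0.5 * (x - sites[k]) ** 2 - psi[k])
        counts[j] += 1
    return [count / len(grid) for count in counts]

def solve_semi_discrete(sites, nu, grid, step=0.1, iterations=200):
    """Gradient ascent on the concave dual; the gradient is nu_j minus the cell mass."""
    psi = [0.0] * len(sites)
    for _ in range(iterations):
        masses = cell_masses(psi, sites, grid)
        psi = [p + step * (v - m) for p, v, m in zip(psi, nu, masses)]
    return psi

grid = [(k + 0.5) / 1000 for k in range(1000)]   # grid approximation of uniform mu
sites = [0.2, 0.8]                               # hypothetical sites y_j
nu = [0.3, 0.7]                                  # hypothetical target masses

psi = solve_semi_discrete(sites, nu, grid)
masses = cell_masses(psi, sites, grid)           # should be close to nu
```

In this example the boundary between the two cells sits at ${\textstyle x=0.5+(\psi _{1}-\psi _{2})/0.6}$, so matching the masses requires ${\textstyle \psi _{1}-\psi _{2}}$ to be close to −0.12.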

In the case when ${\textstyle c\left(x,y\right)=\left\vert x-y\right\vert ^{2}/2}$ , one can show that the set of ${\textstyle x\in \mathbf {X} }$ assigned to a particular site ${\textstyle j}$ is a convex polyhedron. The resulting configuration is called a power diagram.[15]

Quadratic normal case

Assume the particular case ${\textstyle \mu ={\mathcal {N}}\left(0,\Sigma _{X}\right)}$ , ${\textstyle \nu ={\mathcal {N}}\left(0,\Sigma _{Y}\right)}$ , and ${\textstyle c(x,y)=\left\vert y-Ax\right\vert ^{2}/2}$ where ${\textstyle A}$ is invertible. One then has

\varphi (x)=-x^{\top }\Sigma _{X}^{-1/2}\left(\Sigma _{X}^{1/2}A^{\top }\Sigma _{Y}A\Sigma _{X}^{1/2}\right)^{1/2}\Sigma _{X}^{-1/2}x/2

\psi (y)=-y^{\top }A\Sigma _{X}^{1/2}\left(\Sigma _{X}^{1/2}A^{\top }\Sigma _{Y}A\Sigma _{X}^{1/2}\right)^{-1/2}\Sigma _{X}^{1/2}Ay/2

T(x)=(A^{\top })^{-1}\Sigma _{X}^{-1/2}\left(\Sigma _{X}^{1/2}A^{\top }\Sigma _{Y}A\Sigma _{X}^{1/2}\right)^{1/2}\Sigma _{X}^{-1/2}x

The proof of this solution appears in Galichon (2016).[12]
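As a sanity check on the formula for ${\textstyle T}$, consider the one-dimensional case, writing ${\textstyle \Sigma _{X}=\sigma _{X}^{2}}$, ${\textstyle \Sigma _{Y}=\sigma _{Y}^{2}}$ and ${\textstyle A=a\neq 0}$ (this scalar notation is introduced here for illustration only). The inner square root reduces to ${\textstyle |a|\sigma _{X}\sigma _{Y}}$, so that

T(x)={\frac {1}{a}}\cdot {\frac {1}{\sigma _{X}}}\cdot |a|\sigma _{X}\sigma _{Y}\cdot {\frac {1}{\sigma _{X}}}\,x=\operatorname {sgn}(a)\,{\frac {\sigma _{Y}}{\sigma _{X}}}\,x,

a linear map that rescales a centered normal variable of variance ${\textstyle \sigma _{X}^{2}}$ into one of variance ${\textstyle \sigma _{Y}^{2}}$, as a transport map from ${\textstyle \mu }$ to ${\textstyle \nu }$ must.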

Separable Hilbert spaces

Let $X$ be a separable Hilbert space. Let ${\mathcal {P}}_{p}(X)$ denote the collection of probability measures on $X$ that have finite $p$ -th moment; let ${\mathcal {P}}_{p}^{r}(X)$ denote those elements $\mu \in {\mathcal {P}}_{p}(X)$ that are Gaussian regular: if $g$ is any strictly positive Gaussian measure on $X$ and $g(N)=0$ , then $\mu (N)=0$ also.

Let $\mu \in {\mathcal {P}}_{p}^{r}(X)$ , $\nu \in {\mathcal {P}}_{p}(X)$ , $c(x,y)=|x-y|^{p}/p$ for $p\in (1,\infty ),p^{-1}+q^{-1}=1$ . Then the Kantorovich problem has a unique solution $\kappa$ , and this solution is induced by an optimal transport map: i.e., there exists a Borel map $r\in L^{p}(X,\mu ;X)$ such that

\kappa =(\mathrm {id} _{X}\times r)_{*}(\mu )\in \Gamma (\mu ,\nu ).

Moreover, if $\nu$ has bounded support, then

r(x)=x-|\nabla \varphi (x)|^{q-2}\,\nabla \varphi (x)

for $\mu$ -almost all $x\in X$ for some locally Lipschitz, c-concave and maximal Kantorovich potential $\varphi$ . (Here $\nabla \varphi$ denotes the Gateaux derivative of $\varphi$ .)

Entropic regularization

Consider a variant of the discrete problem above, where we have added an entropic regularization term to the objective function of the primal problem

{\begin{aligned}&{\text{Minimize }}\sum _{x\in \mathbf {X} ,y\in \mathbf {Y} }\gamma _{xy}c_{xy}+\varepsilon \gamma _{xy}\ln \gamma _{xy}\\[4pt]&{\text{subject to: }}\\[4pt]&\gamma \geq 0\\[4pt]&\sum _{y\in \mathbf {Y} }\gamma _{xy}=\mu _{x},\forall x\in \mathbf {X} \\[4pt]&\sum _{x\in \mathbf {X} }\gamma _{xy}=\nu _{y},\forall y\in \mathbf {Y} \end{aligned}}

One can show that the dual regularized problem is

\max _{\varphi ,\psi }\sum _{x\in \mathbf {X} }\varphi _{x}\mu _{x}+\sum _{y\in \mathbf {Y} }\psi _{y}\nu _{y}-\varepsilon \sum _{x\in \mathbf {X} ,y\in \mathbf {Y} }\exp \left({\frac {\varphi _{x}+\psi _{y}-c_{xy}}{\varepsilon }}\right)

where, compared with the unregularized version, the "hard" constraint in the former dual (${\textstyle \varphi _{x}+\psi _{y}-c_{xy}\leq 0}$) has been replaced by a "soft" penalization of that constraint (the sum of the ${\textstyle \varepsilon \exp \left((\varphi _{x}+\psi _{y}-c_{xy})/\varepsilon \right)}$ terms). The optimality conditions in the dual problem can be expressed as

Eq. 5.1:

\mu _{x}=\sum _{y\in \mathbf {Y} }\exp \left({\frac {\varphi _{x}+\psi _{y}-c_{xy}}{\varepsilon }}\right)~\forall x\in \mathbf {X}

Eq. 5.2:

\nu _{y}=\sum _{x\in \mathbf {X} }\exp \left({\frac {\varphi _{x}+\psi _{y}-c_{xy}}{\varepsilon }}\right)~\forall y\in \mathbf {Y}

Denoting ${\textstyle A}$ as the ${\textstyle \left\vert \mathbf {X} \right\vert \times \left\vert \mathbf {Y} \right\vert }$ matrix of term ${\textstyle A_{xy}=\exp \left(-c_{xy}/\varepsilon \right)}$ , solving the dual is therefore equivalent to looking for two diagonal positive matrices ${\textstyle D_{1}}$ and ${\textstyle D_{2}}$ of respective sizes ${\textstyle \left\vert \mathbf {X} \right\vert }$ and ${\textstyle \left\vert \mathbf {Y} \right\vert }$ , such that ${\textstyle D_{1}AD_{2}1_{\left\vert \mathbf {Y} \right\vert }=\mu }$ and ${\textstyle \left(D_{1}AD_{2}\right)^{\top }1_{\left\vert \mathbf {X} \right\vert }=\nu }$ . The existence of such matrices generalizes Sinkhorn's theorem and the matrices can be computed using the Sinkhorn–Knopp algorithm,[16] which simply consists of iteratively looking for ${\textstyle \varphi _{x}}$ to solve Equation 5.1, and ${\textstyle \psi _{y}}$ to solve Equation 5.2. Sinkhorn–Knopp's algorithm is therefore a coordinate descent algorithm on the dual regularized problem.
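The alternating updates can be sketched in a few lines. In the code below the vectors `a` and `b` play the role of the diagonals of ${\textstyle D_{1}}$ and ${\textstyle D_{2}}$, that is ${\textstyle a_{x}=\exp(\varphi _{x}/\varepsilon )}$ and ${\textstyle b_{y}=\exp(\psi _{y}/\varepsilon )}$. The marginals, the cost matrix, the value of ${\textstyle \varepsilon }$ and the fixed iteration count are hypothetical choices made for illustration; practical implementations usually add a convergence test and work in the log domain when ${\textstyle \varepsilon }$ is small.

```python
from math import exp

def sinkhorn(mu, nu, cost, eps, iterations=500):
    """Sinkhorn-Knopp iterations; returns the coupling diag(a) K diag(b)."""
    nx, ny = len(mu), len(nu)
    kernel = [[exp(-cost[x][y] / eps) for y in range(ny)] for x in range(nx)]
    a, b = [1.0] * nx, [1.0] * ny
    for _ in range(iterations):
        # Row update: choose a so that the row sums equal mu (Equation 5.1).
        a = [mu[x] / sum(kernel[x][y] * b[y] for y in range(ny)) for x in range(nx)]
        # Column update: choose b so that the column sums equal nu (Equation 5.2).
        b = [nu[y] / sum(kernel[x][y] * a[x] for x in range(nx)) for y in range(ny)]
    return [[a[x] * kernel[x][y] * b[y] for y in range(ny)] for x in range(nx)]

mu = [0.5, 0.3, 0.2]                           # hypothetical marginal on X
nu = [0.6, 0.4]                                # hypothetical marginal on Y
cost = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0]]    # hypothetical cost matrix c_xy
gamma = sinkhorn(mu, nu, cost, eps=0.5)

row_sums = [sum(row) for row in gamma]
col_sums = [sum(gamma[x][y] for x in range(len(mu))) for y in range(len(nu))]
```

Because the column update is performed last, the column sums match ${\textstyle \nu }$ up to rounding after every sweep, while the row sums approach ${\textstyle \mu }$ as the iterations converge.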

Applications

The Monge–Kantorovich optimal transport has found applications in a wide range of fields. Among them are:

Image registration and warping[17]
Reflector design[18]
Retrieving information from shadowgraphy and proton radiography[19]
Seismic tomography and reflection seismology[20]
The broad class of economic modelling that involves the gross substitutes property (among others, models of matching and discrete choice).

References

G. Monge. Mémoire sur la théorie des déblais et des remblais. Histoire de l’Académie Royale des Sciences de Paris, avec les Mémoires de Mathématique et de Physique pour la même année, pages 666–704, 1781.
Schrijver, Alexander, Combinatorial Optimization, Berlin ; New York : Springer, 2003. ISBN 3540443894. Cf. p. 362
Grattan-Guinness, Ivor, Companion encyclopedia of the history and philosophy of the mathematical sciences, Volume 1, JHU Press, 2003. Cf. p. 831
L. Kantorovich. On the translocation of masses. C.R. (Doklady) Acad. Sci. URSS (N.S.), 37:199–201, 1942.
Cédric Villani (2003). Topics in Optimal Transportation. American Mathematical Soc. p. 66. ISBN 978-0-8218-3312-4.
Singiresu S. Rao (2009). Engineering Optimization: Theory and Practice (4th ed.). John Wiley & Sons. p. 221. ISBN 978-0-470-18352-6.
Frank L. Hitchcock (1941) "The distribution of a product from several sources to numerous localities", MIT Journal of Mathematics and Physics 20:224–230 MR 0004469.
D. R. Fulkerson (1956) Hitchcock Transportation Problem, RAND corporation.
L. R. Ford Jr. & D. R. Fulkerson (1962) § 3.1 in Flows in Networks, page 95, Princeton University Press
L. Ambrosio, N. Gigli & G. Savaré. Gradient Flows in Metric Spaces and in the Space of Probability Measures. Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel. (2005)
Angenent, S.; Haker, S.; Tannenbaum, A. (2003). "Minimizing flows for the Monge–Kantorovich problem". SIAM J. Math. Anal. 35 (1): 61–97. CiteSeerX 10.1.1.424.1064. doi:10.1137/S0036141002410927.
Galichon, Alfred. Optimal Transport Methods in Economics. Princeton University Press, 2016.
Rachev, Svetlozar T., and Ludger Rüschendorf. Mass Transportation Problems: Volume I: Theory. Vol. 1. Springer, 1998.
Santambrogio, Filippo. Optimal Transport for Applied Mathematicians. Birkhäuser Basel, 2016. In particular chapter 6, section 4.2.
Aurenhammer, Franz (1987), "Power diagrams: properties, algorithms and applications", SIAM Journal on Computing, 16 (1): 78–96, doi:10.1137/0216006, MR 0873251.
Peyré, Gabriel and Marco Cuturi (2019), "Computational Optimal Transport: With Applications to Data Science", Foundations and Trends in Machine Learning: Vol. 11: No. 5-6, pp 355–607. DOI: 10.1561/2200000073.
Haker, Steven; Zhu, Lei; Tannenbaum, Allen; Angenent, Sigurd (1 December 2004). "Optimal Mass Transport for Registration and Warping". International Journal of Computer Vision. 60 (3): 225–240. CiteSeerX 10.1.1.59.4082. doi:10.1023/B:VISI.0000036836.66311.97. ISSN 0920-5691. S2CID 13261370.
Glimm, T.; Oliker, V. (1 September 2003). "Optical Design of Single Reflector Systems and the Monge–Kantorovich Mass Transfer Problem". Journal of Mathematical Sciences. 117 (3): 4096–4108. doi:10.1023/A:1024856201493. ISSN 1072-3374. S2CID 8301248.
Kasim, Muhammad Firmansyah; Ceurvorst, Luke; Ratan, Naren; Sadler, James; Chen, Nicholas; Sävert, Alexander; Trines, Raoul; Bingham, Robert; Burrows, Philip N. (16 February 2017). "Quantitative shadowgraphy and proton radiography for large intensity modulations". Physical Review E. 95 (2): 023306. arXiv:1607.04179. Bibcode:2017PhRvE..95b3306K. doi:10.1103/PhysRevE.95.023306. PMID 28297858. S2CID 13326345.
Metivier, Ludovic (24 February 2016). "Measuring the misfit between seismograms using an optimal transport distance: application to full waveform inversion". Geophysical Journal International. 205 (1): 345–377. Bibcode:2016GeoJI.205..345M. doi:10.1093/gji/ggw014.