Euclidean Distance Degree

By Rekha R. Thomas

Many models in the sciences and engineering arise as the set of real solutions to systems of multivariate polynomial equations with real coefficients. In practice, model observations are often noisy and may not satisfy all the equations. In such situations, the problem usually solved is that of finding the maximum likelihood estimate of the noisy sample. If the distribution of noise is Gaussian, the problem reduces to finding the closest point in Euclidean distance from the noisy sample to the model. For example, in the triangulation problem in computer vision, which is concerned with the reconstruction of three-dimensional scenes from camera images, the model is the set of possible images of points in three-dimensional space observed by a given set of cameras. This space of images can be described by certain quadratic and cubic polynomials in the coordinates of the images. The task here is to find a set of true images that are closest in Euclidean distance to a given set of noisy images.

Figure 1. Minimizing Euclidean distance from u to X.

The set of real solutions to a system of polynomial equations in \(n\) variables, called a real algebraic variety, is a subset \(X\) of \(\mathbb{R}^{n}\). The application described above is a special instance of the following general optimization problem on \(X\): Given a data point \(u \in \mathbb{R}^n\), find \(u^{*} \in X\) that minimizes \(d_{u}(x)= \parallel u – x \parallel ^{2}_{2}\) , the square of the Euclidean distance from \(u\) to \(X\). Geometrically, \(u^*\) is the first point on \(X\) that is touched by a ball of growing radius, centered at \(u\) (see Figure 1).

As simple as it sounds, this is in general a difficult (NP-hard) problem. The set of critical points of the problem (which contains the minimizer) is typically finite; the points are the solutions to a system of polynomial equations associated to those defining \(X\). We are concerned with this set of critical points, whose size is a measure of the algebraic complexity of writing \(u^*\) as a function of \(u\).

The critical points of the objective function \(d_{u}(x)\) on \(X\) are exactly those points \(x \in X\) at which \(u – x\) is perpendicular to the tangent space of \(X\) at \(x\). Geometrically, these are all the points on \(X\) that tangentially meet a growing ball centered at \(u\). The tangent space is well defined only at the smooth points of \(X\), so we consider only that subset.

To simplify the computation, however, we consider all complex solutions to the polynomial equations that define \(X\), rather than just the real solutions. Miraculously, for generic \(u \in \mathbb{R}^n\), the number of smooth complex critical points of \(d_{u}(x)\) in \(X\) is finite and constant. We call this number the Euclidean distance degree (EDdegree) of \(X\). As shown below, EDdegree(\(X\)) has surprisingly nice formulas in many instances.

Figure 2. Real critical points of an ellipse and its ED discriminant.

As a quick check, we can consider the circle \(X\) in \(\mathbb{R}^2\) defined by \(x^{2}_{1} + x^{2}_{2} = 1\). For generic \(u = (u_{1}, u_{2}) \in \mathbb{R}^2\), there are two real points on \(X\) at which the tangent is perpendicular to \(u – x\), and, indeed, EDdegree(\(X\)) = 2. For the special choice of \(u = (0, 0)\), all points in \(X\) are critical. Similarly, the EDdegree of a parabola is 3, that of an ellipse 4.

The plane curves mentioned above carry with them a second curve, called the EDdiscriminant, that breaks \(\mathbb{R}^2\) into regions in which the number of real critical points stays constant. The EDdiscriminant of an ellipse is the asteroid curve shown in Figure 2. Inside the asteroid, all four critical points are real. As \(u\) passes through the asteroid curve, two of these real points come together and emerge as two complex conjugate points, reducing the number of real critical points to two.

For plane curves, these discriminants (long known under the names evolute or caustic) appeared as early as 200 BC, in the book of Appollonius. In general, every real algebraic variety \(X\) has an EDdiscriminant that is a hypersurface dividing \(\mathbb{R}^n\) into regions that correspond to different numbers of real critical points.

We now come to some of the formulas for EDdegree that arise in applications. In linear regression, we are interested in finding the closest point from a sample to an affine space. In this case, there is a unique critical point and the EDdegree is 1. If \(X\) is the set of all \(s \times t\) matrices \((s \leq t)\) of rank at most \(r\), then EDdegree(\(X\) ) = C\((s,r)\), where C\((s,r)\) is the number of ways of choosing \(r\) items from \(s\) items. The Eckart–Young theorem from linear algebra says that given an \(s \times t\) matrix \(U\), the closest matrix to \(U\) in \(X\) is obtained via a singular value decomposition of \(U = A \Sigma B\) (where \(\Sigma = \mathrm{diag}(\sigma_{1}, \ldots , \sigma_{s})\) and \(\sigma_{1} \geq \sigma_{2} \geq \ldots\) are the singular values of \(U\)), by setting to zero the singular values \(\sigma_{r +1}, \sigma_{r +2}, \ldots\) in the decomposition. While this yields \(U^{*}\), the remaining critical points correspond to zeroing out any other choice of \(s – r\) singular values of \(U\), yielding the formula for EDdegree.

In the triangulation problem from computer vision, a general formula for the EDdegree of the space of images from \(n\) cameras is unknown. Based on values for small \(n\), it is conjectured to be \(9/2~n^{3} – 21/2~ n^{2} + 8n – 4\). An important concern in control theory is the stability of a univariate polynomial; \(p(t) = \sum^{n}_{i=0} a_{i} t^{i}\) is stable if all its complex roots have negative real parts. In this case, the algebraic boundary of the region of stability has EDdegree \(4n – 7\) if n is odd and \(2n – 3\) if \(n\) is even. This carries important information about the closest stable polynomial to one that is not.

Figure 3. The bijection between critical points on X and critical points on X*.

Many additional examples, full proofs, and references can be found in [1], in which we also develop many mathematical properties of EDdegree. A striking one is that EDdegree remains the same under projection. We also develop the notion of average real EDdegree, which counts the average number of smooth real critical points of the squared Euclidean distance function with respect to an underlying distribution on the space of data. For instance, if \(\mathbb{R}^{2}\) is equipped with the standard multivariate Gaussian distribution, then the average real EDdegree of the ellipse is 2.8375, which lies between 2 and 4, the possible number of real critical points for the ellipse. We also show how algebraic geometry offers formulas for EDdegree via invariants of \(X\), when \(X\) satisfies certain smoothness and intersection conditions. This is at first glance rather surprising––one would think that the variety cannot possibly know about the Euclidean distance function. But this objective function is intimately related to tangent spaces of the variety and hence EDdegree(\(X\)) is hardwired into \(X\) itself.

The close connection to tangent spaces leads to a natural appearance of duality in the theory of EDdegree. If the polynomials defining our algebraic variety are homogeneous, then the complex variety \(X\) has a dual variety defined as the closure of all tangent normals to \(X\). For instance, if \(X\) is a linear space, \(X^{*}\) is the orthogonal complement \(X^{\perp}\). If \(X\) is the set of all \(s \times t\) matrices of rank at most \(r\), then \(X^{*}\) is the set of \(s \times t\) matrices of rank at most \( s – r\). In the first pair, both \(X\) and \(X^{*}\) have EDdegree equal to 1, as both are linear spaces. In the second pair, EDdegree(\(X^{*}\)) = C\((s,s – r)\) by the formula mentioned earlier, but this equals C\((s,r)\) = EDdegree(\(X\)). In fact, for any \(X\), EDdegree(\(X\)) = EDdegree(\(X^{*}\)) and if \(x \in X\) is a critical point of \(u\), then \(u – x \in X^{*}\) is also a critical point of \(u\). Critical points on \(X\) and \(X^{*}\) thus come in pairs and are “proximity reversing,” in the sense that if \(x\) is the closest critical point in \(X\) to \(u\), then \(u – x\) is the farthest critical point in \(X^{*}\) from \(u\). Figure 3 [1] illustrates these facts for the variety \(X\) consisting of a pair of lines.

The work in [1] shows the power of using methods from algebraic geometry to understand optimization problems involving polynomials. Researchers have traditionally studied such problems using analytic techniques, without exploiting the algebraic features of polynomials. Minimizing Euclidean distance is a natural objective function that appears in many applications, and the notion of EDdegree captures the complexity of this problem.

This article is based on the author's joint work in [1] and her talk at the 2014 SIAM Conference on Optimization.

References
[1] Jan Draisma, Emil Horobeţ, Giorgio Ottaviani, Bernd Sturmfels, and Rekha R. Thomas, The Euclidean distance degree of an algebraic variety; arXiv:1309.0049.

Rekha R. Thomas, a professor of mathematics at the University of Washington in Seattle, works in optimization and computational algebra.