SIAM News Blog

# The Final Touch? Solving Smale’s 17th Challenge Problem

In 1998, Stephen Smale published a list of 18 challenge problems for the 21st century. Two of them also appeared on David Hilbert’s 1900 list of 23 challenge problems for the 20th century. Others were unmistakably of 20th century origin, including #17, which seeks an algorithm capable of computing approximately—in low-order polynomial average time—a single zero of a system of $$n$$ polynomial equations in $$n$$ complex unknowns.

In the late 1970s, seeing the need for a complexity theory applicable to problems in numerical analysis, Smale determined that the most useful methods would yield only approximate solutions and be prone to occasional failure. So he looked for algorithms $$\alpha$$ capable of providing—with high probability—an approximate solution of a particular problem $$P$$ of a given type in an acceptable amount of time. Specifically, if $$T(P, \alpha)$$ denotes the amount of time required by algorithm $$\alpha$$ to find an approximate solution for a particular $$P$$ in the class of interest, if it is possible to impose a manifold structure on that class, and if a probability measure is defined on the resulting manifold $$M,$$ Smale asked whether the “average computation time” $$E(T(P, \alpha))$$ is bounded by a low-order polynomial in $$|P|,$$ the “size” of the problem $$P \in M,$$ and the reciprocal $$r$$ of the probability of algorithm failure.

In 1980, Smale wrote a paper he called “The Fundamental Theorem of Algebra,” in which he sought something like a constructive proof of the classical result. To that end, he defined, for any complex polynomial $$f$$ of degree $$d,$$ the problem

$(P) \qquad \textrm{Find} \enspace \zeta \enspace \textrm{such that} \enspace f(\zeta) = 0.$

The class of all such $$f$$ forms a manifold $$M,$$ indeed a vector space, in an obvious fashion. He next defined $$z$$ to be an approximate solution of $$P$$ if $$z$$ is close enough to an actual solution $$\zeta$$ for Newton’s method to converge rapidly (i.e. quadratically) to $$\zeta$$ when starting from $$z$$. Each actual solution $$\zeta$$ of $$P$$ is thus surrounded by a neighborhood composed of approximate solutions, which typically fill only a small portion of the basin of attraction in which $$\zeta$$ lies.
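
The quadratic convergence that characterizes an approximate solution is easy to observe numerically. A minimal sketch in Python (the polynomial $$z^3 - 1$$ and the starting point are illustrative choices, not taken from Smale's paper):

```python
# Newton's method for f(z) = z^3 - 1, started near the root zeta = 1.
# From an approximate zero, the error roughly squares at every step.
def newton_step(z):
    f = z**3 - 1
    df = 3 * z**2
    return z - f / df

z = 1.2 + 0.1j          # close enough to 1 to act as an approximate zero
errors = []
for _ in range(6):
    z = newton_step(z)
    errors.append(abs(z - 1))

print(errors)           # errors shrink roughly quadratically
```

Printing the successive errors shows each one close to the square of its predecessor, until the iterates hit machine precision.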

Basins of attraction are topologically complicated, even for a simple equation like $$z^4 = 1,$$ and need not fill the entire domain of definition. The basins themselves are typically open sets, separated by highly irregular (fractal) “curves” of measure zero, consisting of starting points from which Newton’s method fails to converge. Moreover, because Newton’s method can admit attracting periodic points, there may on occasion be open sets of starting points from which the method fails to converge.
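
The four basins for $$z^4 = 1$$ can be explored directly; the sketch below (starting points chosen for illustration) classifies a start by the fourth root of unity that Newton's iteration eventually reaches:

```python
# Newton iteration for z^4 = 1: label a starting point by the root it reaches.
def newton_root(z, iters=60):
    for _ in range(iters):
        z = z - (z**4 - 1) / (4 * z**3)
    return min([1, -1, 1j, -1j], key=lambda r: abs(z - r))

# Deep inside each basin the iteration finds the adjacent root ...
labels = [newton_root(z0) for z0 in (0.9, -0.9, 0.9j, -0.9j)]
print(labels)

# ... but near the fractal boundaries (e.g. points straddling the line
# y = x) nearby starting points can end up at different roots entirely.
```

Sampling a fine grid of starting points and coloring each by its label reproduces the familiar fractal basin pictures.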

Smale employed homotopy methods for finding approximate roots of univariate polynomials. Such methods work by embedding a given problem $$P \in M$$ in a continuum of problems $$P_t \in M,$$ defined for $$0 \le t \le 1,$$ of the form

$(P_t) \qquad \textrm{Find} \enspace \zeta(t) \enspace \textrm{such that} \enspace F(\zeta(t), t) = 0,$

where $$F(z,0) = g(z)$$ has an obvious root $$\zeta(0)$$ and $$F(z,1) = f(z).$$ The methods then choose a partition $$\pi = \left\{0 = t_0 < t_1 < \cdots < t_m = 1 \right\}$$ of the unit interval and attempt to solve the problems $$P_{t_i},\ i = 1, \dots, m,$$ successively, using $$\zeta(t_{i-1})$$ as a starting point in the search for $$\zeta(t_i).$$ The combination $$F(z,t) = t\:f(z) +\: (1-t)g(z)$$ seems the most obvious choice for $$F(\cdot\:, \cdot),$$ and is doubtless the one most frequently chosen. Indeed, Smale makes this choice when developing an upper bound on the number of intermediate problems $$P_{t_i}$$ to be solved approximately while solving $$P_1.$$ Each solution curve $$\zeta(t),\ 0 \le t \le 1,$$ of $$P_t$$ is surrounded by a (variable-width) strip of approximate solutions. There are at most $$d$$ such strips, which can merge and/or bifurcate repeatedly as $$t$$ advances from $$0$$ to $$1.$$

Smale subsequently investigated the computational complexity of various other problems, including the average computation time of the simplex method of linear programming. His results were later improved by Nimrod Megiddo [6]. He also sought, in collaboration with Michael Shub, to extend his results on the fundamental theorem of algebra to systems of complex polynomials.

Let $$f = (f_1, \dots, f_n)$$ be a system of $$n$$ real or complex polynomials in the $$n$$ unknowns $$z_1, \dots, z_n$$ and consider the class of problems

$(P') \qquad \textrm{Find} \enspace \zeta = (\zeta_1, \dots, \zeta_n) \enspace \textrm{such that} \enspace f(\zeta) = 0.$

It is easy enough to impose a manifold structure on the set of all such systems $$f,$$ and to define a probability measure on the resulting $$M.$$

In order to take advantage of Bezout’s theorem on the number of zeros of a system of complex homogeneous polynomials, they began by letting $$d_1, \dots, d_n$$ be the degrees of $$f_1, \dots, f_n,$$ and homogenizing each $$f_i$$ by multiplying each constituent monomial by the power of an auxiliary variable $$z_0$$ needed to increase its degree to $$d_i.$$ The result is a system of $$n$$ homogeneous polynomials of degrees $$d = (d_1, \dots, d_n)$$ in the $$n + 1$$ variables $$z_0, z_1, \dots, z_n.$$ A variant of Newton’s method, known as the projective Newton’s method, applies to such systems.
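
The homogenization step is purely mechanical. A small sketch, with polynomials stored as dictionaries mapping exponent tuples to coefficients (the example polynomial is illustrative):

```python
# Homogenize a polynomial given as {exponent tuple: coefficient}: each
# monomial gains the power of z0 that raises its total degree to d.
def homogenize(poly, d):
    return {(d - sum(e),) + e: c for e, c in poly.items()}

f = {(2, 0): 1, (1, 1): 1, (0, 0): 3}   # f(z1, z2) = z1^2 + z1*z2 + 3
F = homogenize(f, 2)                    # exponents now begin with z0's power
print(F)

# Every monomial of F now has total degree 2, and setting z0 = 1 recovers f.
```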

If $$\mathcal{H}(d)$$ is the vector space of all homogeneous systems $$f=0$$ of degree $$d,$$ then $$\textrm{dim}\,\mathcal{H}(d)$$ can be shown to be $$N = \sum_{i=1}^{n} \binom{n+d_i}{n},$$ and the number of solutions $$\zeta = (\zeta_1, \dots, \zeta_n)$$ is known by Bezout’s theorem to be $$B = \prod_i d_i.$$ In particular, if each $$d_i$$ is $$2,$$ then $$N \sim n^3/2$$ while $$B = 2^n.$$ The number of solutions $$\zeta$$ is thus exponential in $$n,$$ dashing any hope for a polynomial-time algorithm that produces all solutions.
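
The contrast between the two counts is easy to verify directly (the choice $$n = 20$$ is illustrative):

```python
from math import comb, prod

# N = sum_i C(n + d_i, n): dimension of the space of homogeneous systems;
# B = prod_i d_i: the Bezout number. With every d_i = 2, N ~ n^3 / 2
# grows polynomially while B = 2^n grows exponentially.
def size_and_bezout(degrees):
    n = len(degrees)
    N = sum(comb(n + d, n) for d in degrees)
    B = prod(degrees)
    return N, B

N, B = size_and_bezout([2] * 20)
print(N, B)     # N = 4620 for n = 20, but B = 2^20 = 1048576
```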

With only a few equations of high degree, the problem $$P'$$ can be reduced via exact symbolic techniques (Gröbner bases, resultants, and the like) to that of solving a univariate polynomial of degree $$B$$ in polynomial time. But that fact alone does not lead to a solution of Smale’s problem #17, even in combination with the aforementioned homotopy methods.

In a series of five papers co-authored with Shub—which came to be known as the Bezout series—Smale established the surprising fact [7] that a zero of $$n$$ complex polynomial equations in $$n$$ complex unknowns can be found approximately, on average, in time polynomial in $$N$$ alone. But the proof demonstrated only existence. It was neither constructive nor uniform.

In 2008–2009, a pair of papers by Carlos Beltrán and Luis Miguel Pardo [1, 2] described a probabilistic algorithm, in which the machine first selects at random a point of departure and then launches an algorithm $$\alpha$$ that solves $$P'$$ with high probability in polynomial average time. This led some to consider #17 solved. Others disagreed, on the grounds that Smale was seeking a deterministic method guaranteed to succeed.

Beltrán and Pardo select at random an $$n \times (n+1)$$ matrix $$A$$ and solve $$A\zeta = 0$$ for $$\zeta \not= 0.$$ They then select, again at random, a homogeneous system $$g$$ of higher degree that vanishes at $$\zeta.$$ The resulting pair $$(g, \zeta)$$ then serves as a starting point for the required homotopy, and the computation time $$T(P', \alpha)$$ is averaged over the two random choices.
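
The mechanics of producing such a pair $$(g, \zeta)$$ can be sketched at toy scale. Note the caveat: the actual Beltrán–Pardo construction draws $$g$$ from a specific Gaussian distribution conditioned to vanish at $$\zeta$$; the simple $$g$$ below only illustrates the shape of the starting pair, for $$n = 2$$:

```python
import random

# Draw a random 2 x 3 matrix A and take a nonzero zeta with A zeta = 0;
# for a 2 x 3 matrix, the cross product of the rows is such a null vector.
random.seed(0)
A = [[random.gauss(0, 1) for _ in range(3)] for _ in range(2)]

(a1, a2, a3), (b1, b2, b3) = A
zeta = (a2*b3 - a3*b2, a3*b1 - a1*b3, a1*b2 - a2*b1)   # null vector of A

# An illustrative homogeneous quadratic system vanishing at zeta:
# g_i(z) = (A z)_i * L(z) for a linear form L, so each g_i(zeta) = 0.
# (NOT the Beltran-Pardo distribution -- just the right kind of object.)
def g(z):
    L = z[0] + z[1] + z[2]
    return tuple(sum(A[i][j] * z[j] for j in range(3)) * L for i in range(2))

print(zeta, g(zeta))    # g vanishes at zeta up to rounding error
```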

In 2011, Felipe Cucker and Peter Bürgisser announced a deterministic variant of the Beltrán-Pardo algorithm for solving $$P',$$ applicable only to a restricted range of degrees and dimensions [4]. And finally, in July of 2015, Pierre Lairez revealed a complete solution [5] of Smale’s problem.

Instead of choosing $$A$$ and $$g$$ at random, Lairez uses the input $$f$$ itself as a random element. He chooses a precision $$\varepsilon,$$ and rounds off the coefficients of the polynomial system $$f$$ to that precision, thereby producing an approximation $$g$$ of $$f.$$ Lairez then fixes $$g$$ and considers the set of all $$f$$ that round to the same $$g.$$ The coefficients of $$h = f - g$$ are then effectively random, and may be used to produce a random matrix $$A$$ and a random non-linear system $$\bar{g}$$ vanishing at a non-zero solution $$\zeta$$ of $$A\zeta = 0.$$ By doing so, Lairez reduces the problem $$P'$$ to a member of the class shown by Beltrán and Pardo to be solvable in average polynomial time.
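
The rounding step itself is elementary; a toy sketch (the coefficients and the precision $$\varepsilon$$ are illustrative):

```python
# Round each coefficient of f to precision eps, producing g; the remainder
# h = f - g is then available as the algorithm's source of randomness.
eps = 2.0 ** -10
f_coeffs = [0.7234981, -1.1190542, 3.0017762]
g_coeffs = [round(c / eps) * eps for c in f_coeffs]
h_coeffs = [fc - gc for fc, gc in zip(f_coeffs, g_coeffs)]
print(g_coeffs, h_coeffs)   # each |h| is at most eps / 2
```

The point of the construction is that averaging over all $$f$$ rounding to the same $$g$$ plays the role that averaging over the random choices played for Beltrán and Pardo.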

Lairez spoke about all this at the Simons Institute in Berkeley, CA, last December. He ended his talk by posing a problem, 17bis, asking for proof that $$P'$$ can be solved approximately by an algorithm $$\alpha$$ whose average running time $$E(T(P',\alpha))$$ is a low-order polynomial in the two most natural measures of the size and tractability of a given $$P' \in M$$: the length of the input sequence and the condition number of the derivative matrix $$Df$$ at a root. Such an estimate has already been developed [3] under the hypothesis that geodesics in $$M$$ can be adequately approximated with respect to a metric that arises naturally in the study of homotopy methods.

References

[1] Beltrán, C., & Pardo, L.M. (2009). Smale’s 17th problem: Average polynomial time to compute affine and projective solutions. Journal of the American Mathematical Society, 22, 363–385.

[2] Beltrán, C., & Pardo, L.M. (2011). Fast Linear Homotopy to Find Approximate Zeros of Polynomial Systems. Foundations of Computational Mathematics, 11(1), 95–129.

[3] Beltrán, C., & Shub, M. (2009). Complexity of Bezout’s Theorem VII: Distance Estimates in the Condition Metric. Foundations of Computational Mathematics, 9, 179–195. doi:10.1007/s10208-007-9018-5.

[4] Cucker, F., & Bürgisser, P. (2011). On a problem posed by Steve Smale. Annals of Mathematics, 174(3), 1785–1836. doi:10.4007/annals.2011.174.3.8.

[5] Lairez, P. (2015). A deterministic algorithm to compute approximate roots of polynomial systems in polynomial average time. Foundations of Computational Mathematics. Preprint, arXiv:1507.05485.

[6] Megiddo, N. (1986). Improved Asymptotic Analysis of the Average Number of Steps Performed by the Self-Dual Simplex Algorithm. Mathematical Programming, 35, 140–172.

[7] Shub, M., & Smale, S. (1994). Complexity of Bezout’s Theorem V: Polynomial Time. Theoretical Computer Science, 133, 141–164.

James Case writes from Baltimore, Maryland. Michael Shub works in dynamical systems and the theory of complexity of real number algorithms. Steve Smale was his 1967 Ph.D. thesis advisor.