Recent Advances in Dimensionality Reduction with Provable Guarantees
By Jelani Nelson
The following is a short introduction to an invited lecture to be presented at the upcoming 2018 SIAM Annual Meeting (AN18) in Portland, Ore., from July 9-13. Look for feature articles by other AN18 invited speakers introducing the topics of their talks in future issues.
Jelani Nelson, Harvard University.
Researchers have used dimensionality reduction for over a hundred years: first Francis Galton, then Karl Pearson, and later Harold Hotelling in the late 19th and early 20th centuries. Pearson and Hotelling developed principal component analysis (PCA); a number of related methods, including multidimensional scaling, kernel PCA, and isometric feature mapping, soon followed. By approximately fitting high-dimensional data to low-dimensional models (extracting a few variables that "explain" most of the data), one can employ these methods in statistics, data mining, machine learning, and many other areas.
Orthogonal but complementary is the study of dimensionality reduction from the perspective of low-distortion embeddings into low-dimensional normed spaces. Researchers utilize such methods in high-dimensional computational geometry, signal processing, algorithms for large-scale linear algebra problems, identification of motifs in computational biology, machine learning, etc. Algorithms used in high-dimensional computational geometry problems typically suffer from the curse of dimensionality, wherein the running time and/or memory usage of the best-known algorithms scale poorly (even exponentially) with dimension. Such problems include nearest neighbor search and various clustering problems. The idea is that, given some input set of data points \(X = \{x_1,\ldots,x_n\}\subset\mathbb{R}^d\), one can quickly compute some \(f:X\rightarrow\mathbb{R}^m\) for \(m\ll d\), such that \(f(X)\) has roughly the same geometry as \(X\) (e.g., preservation of pairwise distances, angles, volumes of simplices defined by subsets of points, etc.). A cornerstone result in this line of work is the so-called Johnson-Lindenstrauss (JL) lemma, proven in 1984:
Lemma 1 (JL lemma). For any \(X\subset \ell_2\) and any \(0<\varepsilon<1\), there exists an embedding \(f:X\rightarrow\mathbb{R}^m\) with \(m = O(\varepsilon^{-2}\log |X|)\), such that
\[ \forall x,y\in X, \quad (1-\varepsilon)\|x-y\|_2 \le \|f(x) - f(y)\|_2 \le (1+\varepsilon)\|x-y\|_2.\]
That is, the JL lemma preserves pairwise Euclidean distances, and the embedded dimension depends only on the number of points (and only logarithmically), not on their original dimension! Furthermore, all known proofs of the JL lemma provide an \(f\) that is linear, and that can even be chosen at random from an appropriate probability distribution, oblivious to the actual point set \(X\) to be embedded. Thus, one defines \(f\) as the map \(x\mapsto \Pi x\) for a random matrix \(\Pi\in\mathbb{R}^{m\times d}\) chosen from an appropriate distribution. Both facts have proven useful for a variety of applications.
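To make the random-map formulation concrete, here is a minimal numerical sketch in Python with NumPy, assuming the simplest instantiation in which \(\Pi\) has independent Gaussian entries scaled by \(1/\sqrt{m}\); the constant 8 in the target dimension is an illustrative choice, not a tight bound:

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, eps = 50, 1000, 0.5
m = int(np.ceil(8 * np.log(n) / eps**2))   # m = O(eps^-2 log n); illustrative constant

X = rng.normal(size=(n, d))                 # n arbitrary points in R^d
Pi = rng.normal(size=(m, d)) / np.sqrt(m)   # scaling makes E||Pi x||^2 = ||x||^2
Y = X @ Pi.T                                # embedded points f(x) = Pi x

def pairwise_dists(Z):
    """All pairwise Euclidean distances between rows of Z."""
    diffs = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))

iu = np.triu_indices(n, k=1)                # each unordered pair once
ratios = pairwise_dists(Y)[iu] / pairwise_dists(X)[iu]
print(ratios.min(), ratios.max())           # w.h.p. within [1 - eps, 1 + eps]
```

Note that the embedding never looks at the data: `Pi` is drawn before `X` is consulted, illustrating the obliviousness described above.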
Since the introduction of the JL lemma and its use in computer science applications beginning in the mid-1990s, a variety of questions concerning the lemma have been asked and recently answered, either optimally or near-optimally. These include the optimal dimension \(m\) that can be achieved by a Euclidean dimensionality-reducing map in terms of \(\varepsilon\) and \(|X|\) (i.e., is the \(m\) achieved in the JL lemma optimal?), and whether one can choose the aforementioned \(\Pi\) in such a way that the map \(x\mapsto \Pi x\) can be computed quickly. Researchers have also found new generalizations of the JL lemma and its applications, such as speeding up low-rank approximation computations for the PCA method and other large-scale linear algebra problems, like approximate least squares regression. They have used a generalization of the JL lemma called subspace embeddings, pioneered by Tamás Sarlós in 2006, to achieve these.
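As a rough illustration of how subspace embeddings speed up least squares, the following Python/NumPy sketch compresses an overdetermined regression problem before solving it, using a dense Gaussian embedding for simplicity (practical subspace embeddings are sparse or structured for speed, and the sketch size \(m = 400\) here is an illustrative choice, not an optimized bound):

```python
import numpy as np

rng = np.random.default_rng(1)

n, d = 5000, 20
A = rng.normal(size=(n, d))                            # tall, skinny design matrix
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)  # noisy observations

# Exact least squares: minimize ||Ax - b||_2 over the full n-row problem.
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# Sketch-and-solve: compress with a random Gaussian subspace embedding Pi,
# whose row count m depends on d and the accuracy, not on n, then solve
# the much smaller m-row problem.
m = 400
Pi = rng.normal(size=(m, n)) / np.sqrt(m)
x_sketch, *_ = np.linalg.lstsq(Pi @ A, Pi @ b, rcond=None)

res_exact = np.linalg.norm(A @ x_exact - b)
res_sketch = np.linalg.norm(A @ x_sketch - b)
print(res_sketch / res_exact)   # near 1: the sketched solution is near-optimal
```

The point of the guarantee is that the residual of the sketched solution is within a \(1 + O(\varepsilon)\) factor of optimal with high probability, while the solve itself touches only an \(m \times d\) system.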
My talk at the 2018 SIAM Annual Meeting will touch on recent developments in dimensionality reduction.

Jelani Nelson is an associate professor of computer science and the John L. Loeb Associate Professor of Engineering and Applied Sciences in the John A. Paulson School of Engineering and Applied Sciences at Harvard University. 