| July 13, 2020

Solving Combinatorial Optimization Problems on Quantum Computers

By Yuri Alexeev, Jeffrey Larson, Sven Leyffer, and Ruslan Shaydulin

The rapid solution of combinatorial optimization problems benefits numerous applications. Quantum computing has recently attracted considerable attention due to numerous algorithms with exponential speedup over state-of-the-art classical algorithms. However, no demonstrably faster quantum algorithm currently exists for combinatorial optimization. The quantum approximation optimization algorithm (QAOA) is a candidate quantum algorithm for combinatorial optimization on gate-model quantum computers, such as those developed by IBM, Google, and Rigetti Computing. Here we overview the fundamentals, advantages, disadvantages, and current state of QAOA.

Edward Farhi and collaborators first introduced QAOA in 2014 to improve the best-known approximation ratio for a specific maximum satisfiability problem (Max E3LIN2) [2], though researchers have since developed a better classical algorithm. While QAOA has yet to theoretically improve upon the best classical algorithms for any problem class, it continues to attract interest within the quantum computing and optimization communities.

The reasons for this interest are twofold. QAOA is one of few algorithms that can reliably run on near-term quantum devices. QAOA and its generalization—the quantum alternating operator ansatz [4]—can also tackle a wide class of combinatorial optimization problems that are computationally difficult and ubiquitous in applied mathematics. To disentangle the hype from reality, we first recap the mathematics of quantum computation and QAOA. Consider the combinatorial optimization problem over the Boolean hypercube and its reformulation:

\[\underset{{\mathbf{y} \in \{0,1\}^n}} \min \; f(\mathbf{y}) \quad \Leftrightarrow \quad
\underset{\mathbf{y} \in \{0,1\}^n} \min \; \sum_{\mathbf{w}\in \{0,1\}^n} f(\mathbf{w}) \unicode{x1D7D9}_{\mathbf{w}}(\mathbf{y}), \tag1 \]

where the second formulation is an equivalent representation of \(f(\mathbf{y})\) and \(\unicode{x1D7D9}_{\mathbf{w}}(\mathbf{y})\) is the indicator function that takes the value \(1\) if \(\mathbf{w}=\mathbf{y}\) and \(0\) otherwise.¹ Note that construction of such an indicator polynomial is simple but might require \(2^n\) terms in the sum.

This technique can convert the objective function \(f\)—which acts on the Boolean hypercube—into an operator \(\hat{f}\) that acts on the \(2^n\)-dimensional space \(\mathbb{C}^{2^n}\) in the following way: \(\hat{f}\cdot\mathbf{e}_i = f(\mathbf{y})\mathbf{\mathbf{e}}_i \;\forall i\), where \(\mathbf{e}_i\in \mathbb{C}^{2^n}\) is an element of an orthonormal basis \(\{\mathbf{e}_i\}_{i=1}^{2^n}\) of \(\mathbb{C}^{2^n}\) that encodes a binary string \(\mathbf{y}\in \{0,1\}^{n}\). Because \(\dim(\mathbb{C}^{2^n})=2^n\), one can use any basis of \(\mathbb{C}^{2^n}\) to encode all binary strings \(\mathbf{y}\in \{0,1\}^{n}\). We can construct \(\hat{f}\) by replacing the indicator function \(\bf{1}_{\bf{e}_{\bf_i}}\)\((\bf{y}\)\()\) with the projector operator \(\mathbf{e}_i(\mathbf{e}_i^*)^T\), where \(\mathbf{e}_{\bf{i}}^*\) denotes the complex conjugate of \(\mathbf{e}_\bf{i}\). Since \(\{\mathbf{e}_i\}_{i=1}^{2^n}\) is an orthonormal basis, this projector behaves like the indicator function: \(\mathbf{e}_i(\mathbf{e}_i^*)^T\mathbf{e}_j = \mathbf{e}_j\) if \(i=j\) (operator acts as identity) and is otherwise \(0\). While illustrative, this view does not provide a recipe for efficient construction of \(\hat{f}\) for a given \(f\). A comprehensive list of rules on constructing operators \(\hat{f}\) is available in [3].

We now connect the combinatorial optimization problem to a quantum computer. We represent the state of an \(n\)-qubit quantum computer with a norm-1 vector in \(\mathbb{C}^{2^n}\). By selecting the basis in \(\mathbb{C}^{2^n}\) to embody all binary strings \(\mathbf{y}\in \{0,1\}^{n}\), we can convert our cost function into a quantum operator that acts on a quantum computer’s state space. Note that this operator is simply a diagonal matrix. QAOA considers the following family of parameterized vectors:

\[\mathbf{x}(\mathbf{\beta}, \mathbf{\gamma}) = \prod_{i=1}^p (e^{-i\beta_i \hat{m}}e^{-i\gamma_i \hat{f}})\unicode{x1D7D9} \in \mathbb{C}^{2^n},\]

where \(p\in \mathbb{N}\), \(\mathbf{\beta}\in [0,\pi]^{p},\) \(\mathbf{\gamma}\in [0,2\pi]^{p}\) are free parameters and \(\unicode{x1D7D9}=\frac{1}{2^n}\Sigma_i\mathbf{e}_i\). The mixing operator \(\hat{m}\) is a nondiagonal matrix that “mixes together” terms that correspond to different basis vectors. Researchers typically employ a “transverse field” operator as the mixing operator [4].

Figure 1. Optimization landscape for MAXCUT on an 8-node graph (brighter is better). The landscape is nonconvex with two low-quality local optima,\(\beta\in[0,\frac{\pi}{2}]\), \(\gamma \in [0, \pi]\). Figure courtesy of Ruslan Shaydulin.

We use a classical optimizer to vary the free parameters \(\mathbf{\beta}, \mathbf{\gamma}\) and bring \(\mathbf{x}(\mathbf{\beta}, \mathbf{\gamma})\) as close as possible to the basis vector \(\mathbf{e}_i\) that encodes the binary string \(\mathbf{y}\in \{0,1\}^{n}\), which is the original problem’s solution. We have thus replaced the combinatorial optimization over \(\mathbf{y}\) with a nonconvex optimization over \(\mathbf{\beta},\mathbf{\gamma}\). Figure 1 illustrates the objective landscape. One can read the optimization result from the quantum computer by performing a measurement, which is equivalent to sampling from a probability distribution; when measuring vector \(\mathbf{x}(\mathbf{\beta}, \mathbf{\gamma})\), the result is \(\mathbf{e}_i\) with the probability \(|(\mathbf{e}_i^*)^T\mathbf{x}(\mathbf{\beta}, \mathbf{\gamma})|^2\). If \(\mathbf{x}(\mathbf{\beta}, \mathbf{\gamma}) = \mathbf{e}_i\), the measurement result will be \(\mathbf{e}_i\) with probability \(1\).

Research has shown that one cannot efficiently sample from the output of QAOA under reasonable complexity theory assumptions, even with only one step \((p=1)\). However, the fact that QAOA is classically hard to simulate does not speak to its potential for solving optimization problems. Nevertheless, a number of obstacles prevent people from understanding and realizing its potential, defined as the approximation ratios that it can achieve.

First, the following elements impede the ability to understand QAOA’s potential:

(i) the lack of analytical results about QAOA behavior in depths other than \(p=1\) or \(p=\infty\)
(ii) the lack of empirical results about QAOA scaling with problem size and QAOA performance on problems of realistic size.

Second, the subsequent factors hinder the realization of QAOA’s potential (i.e., achieving quantum advantage through optimization):

(i) the need for parameter optimization (i.e., optimizing over \(\beta,\gamma\))
(ii) the mismatch between the capabilities of available hardware and QAOA’s hardware requirements, in terms of the number of qubits, speed, and fidelity of gates.

QAOA’s potential is difficult to analyze in general, but two recent results provide analytical upper bounds on QAOA performance. Matthew Hastings demonstrated that classical local one-step algorithms achieve the same (or better) performance for \(p=1\) as one-step QAOA [5]. Another group analyzed QAOA’s locality and symmetry—with the practical corollary that the classical Goemans-Williamson algorithm for MAXCUT outperforms it in constant depth (i.e., constant \(p\))—and provided an upper bound for arbitrary \(p\) [1]. While these results are interesting, a near-term potential of QAOA is its use as a heuristic. In this case, the worst-case QAOA performance may not predict its average-case performance. Unfortunately, QAOA’s average-case analysis is impeded by its analytical complexity and the impossibility of numerical experiments for problems of nontrivial size, due to the lack of appropriately capable hardware and the computational cost of classical simulation. The greatest potential for better understanding lies at this point.

The dearth of methods to reliably and efficiently solve the parameter optimization within each QAOA iteration severely limits the realization of its potential. Any given problem and depth has no a priori knowledge of the parameter values \(\mathbf{\beta}\) and \(\mathbf{\gamma}\) that result in the distribution \(\mathbf{x}(\mathbf{\beta}, \mathbf{\gamma})\) that maximizes \(f\). We must therefore search for these values, often with classical numerical optimization routines. While the optimization appears to be simple in higher depths \(p > n\) and is trivial in the limit \(p\rightarrow\infty\), QAOA’s most promising regime is shallow (small constant) depth because it can run on near-term hardware. For small depth, the objective over \(\mathbf{\beta}\) and \(\mathbf{\gamma}\) is often nonconvex and contains many poor local optima. Figure 1 is an example objective landscape in the \(p=1\) case.

QAOA’s performance depends critically on the quality of the parameters \(\mathbf{\beta}\) and \(\mathbf{\gamma}\). However, a single local optimization run is the current default in multiple QAOA packages, even though the objective has many suboptimal local optima. One can gain considerable improvement via multistart optimization methods [9] or machine learning approaches [6], yet these techniques also struggle when \(p\) is a moderate size. The best numerical optimization routine for identifying parameters within QAOA is largely an open question.

The second limiting consideration is the required large number of qubits and fast, high-fidelity gates to execute QAOA circuits with \(p \gg 1\). Classical algorithms can achieve excellent approximation ratios in minutes—if not seconds—for most practically interesting binary optimization problems with less than a few hundred variables. Therefore, until quantum hardware with hundreds of qubits is accessible, QAOA will not be able to compete with classical state-of-the-art methods under the current approach. Such devices may become available in the next several years. Hardware size is growing steadily, but simply increasing the qubit count is not enough. As QAOA gate count requirements grow linearly with the problem description and depth \(p\), they quickly surpass current hardware's capabilities.

For example, QAOA requires execution of \(d \times n \times p\) 2-qubit gates for MAXCUT on a \(d\)-regular graph on \(n\) nodes with depth \(p\) (assuming full connectivity between qubits and “controlled NOT” as the native 2-qubit gate). This in turn necessitates the execution of hundreds of 2-qubit gates for moderate-sized problems, which is beyond the capacity of existing hardware.

In our opinion, ion-trap architectures are one of the most promising near-term quantum computers for faithful execution of QAOA circuits due to the high fidelity of gates — albeit at low frequency. Several researchers have demonstrated QAOA simulations with 40 qubits and depth \(p=1\) and \(p=2\) [8]. They found that increasing \(p=2\) did not improve the solution’s quality because the error that resulted from the execution of extra gates negated any benefits. These results underpin the importance of high-fidelity quantum hardware to make significant progress. We hope that the ion trap and other technologies will improve in the midterm. Such advances will allow for large-scale evaluations of quantum optimization in realistic settings, paving the way for a better understanding of quantum optimization heuristics and the development of improved versions for near-term applications.

¹ This representation is a version of the Fourier expansion of \(f\). Further discussion of Fourier analysis of Boolean functions is available in [7].

Acknowledgments: This material is based on work that is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research under contract DE-AC02-06CH11357.

References
[1] Brayvi, S., Kliesch, A., Koenig, R., & Tang, E. (2019). Obstacles to state preparation and variational optimization from symmetry protection. Preprint, arXiv:1910.08980.
[2] Fahri, E., Goldstone, J., & Gutmann, S. (2014). A quantum approximate optimization algorithm applied to a bounded occurrence constraint problem. Preprint, arXiv:1412.6062.
[3] Hadfield, S. (2018). On the representation of Boolean and real functions as Hamiltonians for quantum computing. Preprint, arXiv:1804.09130.
[4] Hadfield, S., Wang, Z., O’Gorman, B., Riefel, E., Venturelli, D., & Biswas, R. (2019). From the quantum approximate optimization algorithm to a quantum alternating operator ansatz. Algorithms, 12.
[5] Hastings, M.B. (2019). Classical and quantum bounded depth approximation algorithms. Preprint, arXiv:1905.07047.
[6] Khairy, S., Shaydulin, R., Cincio, L., Alexeev, Y., & Balaprakash, P. (2019). Learning to optimize variational quantum circuits to solve combinatorial problems. In Proceedings of the AAAI Conference on Artificial Intelligence. Association for the Advancement of Artificial Intelligence.
[7] O’Donnell, R. (2014). Analysis of Boolean Functions. New York, NY: Cambridge University Press.
[8] Pagano, G., Bapat, A., Becker, P., Collins, K., De, A., Hess, P., …, Monroe, C. (2019). Quantum approximate optimization with a trapped-ion quantum simulator. Preprint, arXiv:1906.02700.
[9] Shaydulin, R., Safro, I., & Larson, J. (2019). Multistart methods for quantum approximate optimization. In Proceedings of the IEEE High Performance Extreme Computing Conference. Institute of Electrical and Electronics Engineers.

Yuri Alexeev is a principal project specialist at Argonne National Laboratory and a senior scientist at the University of Chicago. He works in the field of quantum information science. Jeffrey Larson is a computational mathematician at Argonne, where he develops algorithms for solving difficult numerical optimization problems. Sven Leyffer is a senior computational mathematician in the Mathematics and Computer Science Division at Argonne. He works on nonlinear optimization. Ruslan Shaydulin is a Ph.D. candidate at Clemson University. His dissertation focuses on quantum optimization, hybrid quantum-classical algorithms, and combinatorial optimization.