
Convergence of Artificial Intelligence, High-performance Computing, and Simulations

By Anima Anandkumar

Many scientific applications heavily rely on brute-force numerical methods that are implemented on high-performance computing infrastructures. Over the years, hardware advances—especially with graphics processing unit (GPU)-accelerated computing and efficient scaling on powerful supercomputers—have paved the way for extremely large scientific simulations. However, severe limitations still exist. For example, accurately modeling climate change throughout the next few decades requires that we capture the turbulent dynamics of stratocumulus clouds; computing these dynamics with numerical methods necessitates 100 billion times more computing power than is currently available [5]. Similarly, researchers estimate that using today’s computers to calculate the electronic wave function for a single configuration of a simple molecule with about 100 electrons via the solution of Schrödinger’s equation [4] would take longer than the age of the universe. Can artificial intelligence (AI) methods augment or even entirely replace these brute-force calculations to obtain significant speedups? Can we make groundbreaking new discoveries as a result of such speedups? And can we employ data-driven approaches to reduce or entirely eliminate modeling errors in large-scale scientific applications? Answering these and other questions could inspire paradigm shifts in the way that we conduct scientific research.

Partial differential equations (PDEs) have long been the workhorse of scientific simulations. Mathematician Steven Strogatz did not exaggerate when he stated that differential equations “represent the most powerful tool humanity has ever created for making sense of the material world” [6]. Nevertheless, two significant challenges persist: (i) identifying the governing model for complex systems and (ii) efficiently solving large-scale, nonlinear systems of PDEs. Classifying or formulating the underlying PDEs that appropriately model a specific problem usually requires extensive prior knowledge of the field in question. For example, modeling the deformation and fracture of solid structures demands detailed knowledge of the relationship between stress and strain in the constituent material [2]. Such knowledge is often elusive for complex systems such as living cells, so formulating the governing PDEs remains prohibitive — or else these PDEs are too simplistic to be informative. The possibility of directly obtaining such knowledge from data could revolutionize these fields.

Many researchers have attempted to use machine learning—and deep learning in particular—to replace or augment existing PDE solvers. However, standard neural networks are designed for a fixed discretization and can only learn mappings between finite-dimensional spaces. They are therefore unsuitable for modeling complex phenomena in continuous media—e.g., fluid flows and wave propagation—that do not have discrete, fixed-size inputs and outputs.
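To make this limitation concrete, the short sketch below (a hypothetical PyTorch example; the layer sizes and the 64-point grid are illustrative choices, not a model from this article) shows how a standard fully connected network bakes a single discretization into its weights.

```python
# Illustrative sketch only: a standard network that maps a function sampled on
# a fixed grid of n points to an output on the same grid.
import torch
import torch.nn as nn

n = 64  # the network is hard-wired to this grid size
model = nn.Sequential(
    nn.Linear(n, 128), nn.ReLU(),
    nn.Linear(128, n),
)

coarse_input = torch.randn(1, n)     # a function sampled at 64 grid points
output = model(coarse_input)         # works: the shape matches the training grid

fine_input = torch.randn(1, 2 * n)   # the same function sampled at 128 points
# model(fine_input) raises a shape error: the learned mapping lives between
# fixed finite-dimensional spaces and cannot be queried at a new resolution.
```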

Figure 1. Illustrative example of a global near-surface wind forecast generated by FourCastNet at a resolution of 0.25 degrees. FourCastNet is based on the Fourier neural operator (FNO), is trained on ERA5 weather data, and is able to forecast wind speeds 96 hours in advance with remarkable fidelity and correct fine-scale features. The forecast accurately captures the formation and track of Typhoon Mangkhut, which begins to form at roughly 10° N, 210° W (see Inset 1). Furthermore, our model captures the typhoon’s intensification and track over a period of four days. During this forecast, the model also identifies three named hurricanes (Florence, Isaac, and Helene) that are forming in the Atlantic Ocean and approaching the eastern coast of North America (see Inset 2). Figure courtesy of [3].

Neural Operator for Learning PDEs

We recently developed a foundational approach to learn complex, multiscale, and chaotic phenomena [2]. Our proposed neural operator learns operators: mappings between infinite-dimensional spaces that arise in many scientific domains wherein the inputs and outputs are continuous functions, such as in the solution of PDEs. The neural operator has no dependence on either the grid or the resolution of the discretized training data. It thus allows for zero-shot super-resolution, meaning that it trains on low-resolution data and evaluates on high-resolution points. This design makes the neural operator efficient for large-scale simulations. We also developed a hybrid approach in which the operator exploits both observed data and physical laws/constraints to reduce modeling errors and enable effective generalization [1].
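To illustrate the central mechanism, here is a minimal sketch of a one-dimensional spectral convolution of the kind used in Fourier neural operators: the input function is transformed to Fourier space, a learned linear operator acts on a fixed number of low-frequency modes, and the result is transformed back. The class name, channel counts, and mode count are assumptions for illustration; this is not the released FNO implementation.

```python
# Sketch of a 1D spectral-convolution layer: because the learned weights act on
# Fourier modes rather than on grid points, the same layer can be evaluated on
# any discretization of the input function.
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    def __init__(self, channels: int, modes: int):
        super().__init__()
        self.modes = modes  # number of retained low-frequency Fourier modes
        scale = 1.0 / (channels * channels)
        self.weights = nn.Parameter(
            scale * torch.randn(channels, channels, modes, dtype=torch.cfloat)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, grid_points), samples of a continuous function
        x_ft = torch.fft.rfft(x)                          # to Fourier space
        out_ft = torch.zeros_like(x_ft)
        out_ft[:, :, : self.modes] = torch.einsum(
            "bcm,com->bom", x_ft[:, :, : self.modes], self.weights
        )                                                 # mix channels per retained mode
        return torch.fft.irfft(out_ft, n=x.size(-1))      # back to physical space

# The same layer applies to a 64-point or a 256-point discretization of the
# input function, which is what enables zero-shot super-resolution.
layer = SpectralConv1d(channels=8, modes=12)
print(layer(torch.randn(2, 8, 64)).shape)    # torch.Size([2, 8, 64])
print(layer(torch.randn(2, 8, 256)).shape)   # torch.Size([2, 8, 256])
```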

We can train the operator on challenging problems—like turbulent fluid flows—with only low-resolution data. We also incorporated a Fourier neural operator (FNO) into FourCastNet: our AI-based weather forecasting model, which makes predictions at a spatial resolution eight times higher than that of previous AI models [3]. These high-resolution predictions allow us to accurately forecast challenging variables, such as precipitation or wind, up to a week in advance (see Figure 1). This capability is especially relevant as climate change causes extreme weather events that inflict increasingly devastating damage. The high computational cost of traditional methods that involve numerical, physics-based modeling hampers our ability to predict and mitigate such events. For the first time, we demonstrate that an AI-based model can predict global weather and generate medium-range forecasts with accuracy comparable to that of numerical weather models — all while being four to five orders of magnitude faster.
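At inference time, such a forecast can be viewed as an autoregressive rollout of a learned one-step surrogate, which is where the speedup over a full numerical solve comes from. The sketch below is purely illustrative: the `surrogate` model, the six-hour step, and the `rollout` helper are assumptions for demonstration, not FourCastNet's actual interface.

```python
# Sketch of autoregressive rollout: a learned model maps the atmospheric state
# at time t to the state one step later and is applied repeatedly.
import torch

def rollout(surrogate, state: torch.Tensor, steps: int) -> list:
    """Feed each prediction back in as the next input."""
    trajectory = []
    with torch.no_grad():
        for _ in range(steps):
            state = surrogate(state)   # one learned time step (e.g., six hours)
            trajectory.append(state)
    return trajectory

# A 96-hour forecast, as in Figure 1, would then be 16 six-hour steps; each step
# is a single network evaluation rather than a full numerical solve.
# forecast = rollout(trained_model, initial_condition, steps=16)
```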

We have also applied FNO to many other challenging multiscale problems, such as models of multiphase flows (\(10^4\) speedup) and inelastic impact (\(10^5\) speedup) [2]. Multiphase flow modeling is important for a variety of geosciences applications, including contaminant transport, carbon capture and storage, hydrogen storage, oil and gas extraction, and nuclear waste storage.

We currently live in the golden age of AI, and AI for science is a revolution in the making. Principled algorithms like neural operators account for the unique requirements of scientific domains while simultaneously introducing expressive and flexible data-driven models that can capture complex, multiscale phenomena. By further combining such data-driven models with domain knowledge and constraints, we can enable extrapolation to new scenarios beyond the training distribution. Doing so leads to speedups of several orders of magnitude that allow us to run scientific simulations at unprecedented scales.


Anima Anandkumar presented this research during an invited talk at the 2021 SIAM Annual Meeting, which took place virtually last year.

References
[1] Li, Z., Zheng, H., Kovachki, N., Jin, D., Chen, H., Liu, B., … Anandkumar, A. (2021). Physics-informed neural operator for learning partial differential equations. Preprint, arXiv:2111.03794.
[2] Liu, B., Kovachki, N., Li, Z., Azizzadenesheli, K., Anandkumar, A., Stuart, A.M., & Bhattacharya, K. (2021). A learning-based multiscale method and its application to inelastic impact problems. J. Mech. Phys. Solids, 158, 104668.
[3] Pathak, J., Subramanian, S., Harrington, P., Raja, S., Chattopadhyay, A., Mardani, M., … Anandkumar, A. (2022). FourCastNet: A global data-driven high-resolution weather model using adaptive Fourier neural operators. Preprint, arXiv:2202.11214.
[4] Qiao, Z., Welborn, M., Anandkumar, A., Manby, F.R., & Miller III, T.F. (2020). OrbNet: Deep learning for quantum chemistry using symmetry-adapted atomic-orbital features. J. Chem. Phys., 153(12), 124111.
[5] Schneider, T., Teixeira, J., Bretherton, C.S., Brient, F., Pressel, K.G., Schär, C., & Siebesma, A.P. (2017). Climate goals and computing the future of clouds. Nature Clim. Change, 7(1), 3-5.
[6] Strogatz, S. (2009). Loves me, loves me not (do the math). Opinionator: The New York Times. Retrieved from https://opinionator.blogs.nytimes.com/2009/05/26/guest-column-loves-me-loves-me-not-do-the-math.

Anima Anandkumar is a Bren Professor at the California Institute of Technology and director of Machine Learning Research at NVIDIA. She was previously a principal scientist at Amazon Web Services and is part of the World Economic Forum's Expert Network. Anandkumar is passionate about designing principled artificial intelligence (AI) algorithms and applying them in interdisciplinary applications. Her research focuses on unsupervised AI, optimization, and tensor methods. 