Machine Learning for the Global Forecast

By Jillian Kunze

What if dynamics are the easiest part of the global atmospheric system for machines to learn? Dale Durran of the University of Washington posed this question during his minisymposium presentation at the 2020 SIAM Conference on Mathematics of Planet Earth, which is taking place virtually this week. Durran collaborated with graduate student Jonathan Weyn and Microsoft senior researcher Rich Caruana to investigate whether deep learning weather prediction (DLWP) could replace numerical weather prediction (NWP) — weather models that are based on equations of atmospheric dynamics and form the current standard for weather forecasting. They focused on the use of machine learning to forecast the atmosphere’s future state given certain initial conditions.

Forecasting the weather with machine learning could address key predictability issues with NWP and make forecasting more probabilistic. In addition, DLWP could offer a huge increase in ensemble size—the number of forecasts in a set that predicts the future range of atmospheric variables—which might make it better suited for extreme event modeling. Durran and his team set out to make a machine learning model that is exclusively trained on past weather data without incorporating any information about atmospheric dynamics or physics. They focused on predicting future changes in just four atmospheric variables instead of trying to generate a complete weather forecast, as this was a sufficient first step in determining the success of the machine learning method.

Researchers used this cubed sphere grid to divide the surface of the globe into square faces.

Researchers have recently advanced DLWP in three major ways. One improvement is modeling the Earth’s surface with a cubed sphere grid, which divides the surface of the globe into four square faces centered on the equator and two polar faces. This layout is well suited to convolutional neural networks, a type of deep neural network that works well for forecast modeling. A second advancement came from the recently published result that using a six-hour resolution over 24 forecast hours can minimize the loss function; Durran described this finding as essential for the stable performance of his team’s model. The third key improvement resulted from developments in the network architecture for DLWP.
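To make the six-face layout concrete, the following minimal sketch assigns a latitude-longitude point to one face of a cube circumscribed around the globe. It assumes the simplest possible rule (the face is chosen by the largest component of the point's unit vector); the function name and face labels are hypothetical, and the researchers' actual grid mapping may differ.

```python
import math

def cube_face(lat_deg, lon_deg):
    """Return the cube face that contains a point on the sphere.

    Assumption: the face is picked by the dominant Cartesian component
    of the unit vector. Labels are illustrative only.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Unit vector of the point on the sphere.
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    ax, ay, az = abs(x), abs(y), abs(z)
    # The two polar faces capture points whose vertical component dominates.
    if az > ax and az > ay:
        return "north" if z > 0 else "south"
    # The remaining four faces are centered on the equator.
    if ax >= ay:
        return "equatorial_0E" if x > 0 else "equatorial_180E"
    return "equatorial_90E" if y > 0 else "equatorial_90W"
```

Under this rule, a point at 10°N, 30°E falls on the face centered at 0°E, while a point at 60°N on the same meridian already belongs to the northern polar face.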

Durran and his team built their DLWP model with a deep learning U-Net architecture, which can incorporate information from large scales while still preserving fine-scale data. The model had a resolution of about 2° × 2° at the equator and proceeded in 12-hour time steps. After each convolution operation in the network, the researchers applied the capped leaky ReLU function to increase the model’s stability. They trained the model with weather data from the years 1975 to 2012, then used 2013 to 2016 as a validation set. Three fields were fed into the model: the incoming solar radiation at the top of the atmosphere, a land-sea mask, and the topographic height of the Earth’s surface. The model produced four prognostic variables as output: the 1000-hectopascal (hPa) height, or height above the Earth’s surface where the atmospheric pressure is 1000 hPa; the 500-hPa height; the 300-700 hPa thickness; and the 2-meter temperature, or temperature two meters above the Earth’s surface.
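A capped leaky ReLU can be sketched in a few lines. The version below is illustrative only: the article does not state the slope or the cap, so the default values of `negative_slope` and `cap` are assumptions, and the function name is hypothetical. Negative inputs are scaled down rather than zeroed, and large positive inputs are clipped at the cap, which bounds how large an activation can grow.

```python
def capped_leaky_relu(x, negative_slope=0.1, cap=10.0):
    """Leaky ReLU with an upper bound on the output.

    Assumed defaults: negative_slope and cap are illustrative values,
    not parameters confirmed by the article.
    """
    # Leaky part: pass positive values through, shrink negative ones.
    leaky = x if x >= 0.0 else negative_slope * x
    # Cap: clip large activations so they cannot grow without bound.
    return min(leaky, cap)
```

With these assumed defaults, an input of 3.0 passes through unchanged, an input of -2.0 is shrunk to about -0.2, and an input of 25.0 is clipped to 10.0.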

The group first tested the model qualitatively by integrating it forward in a free run over one year (2018) and visually comparing its predictions to actual atmospheric data from that same year. They then performed quantitative testing to measure the model’s quality based on two metrics: root-mean-square error (RMSE), which is sensitive to the magnitude of error in the model’s output, and anomaly correlation coefficient (ACC), which is sensitive to how well the forecast reproduces the pattern of departures from climatology.
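The two metrics can be illustrated with a short sketch. It treats each field as a flat list of grid-point values and applies no area weighting, which is a simplifying assumption; operational verification typically weights grid points by the area they represent. The function names are hypothetical. The anomaly correlation coefficient is computed here as the correlation between forecast and observed departures from a climatological value at each point.

```python
import math

def rmse(forecast, observed):
    """Root-mean-square error between two equally sized fields."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))

def anomaly_correlation(forecast, observed, climatology):
    """Unweighted anomaly correlation coefficient.

    Anomalies are departures from climatology at each grid point.
    """
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    o_anom = [o - c for o, c in zip(observed, climatology)]
    numerator = sum(fa * oa for fa, oa in zip(f_anom, o_anom))
    denominator = math.sqrt(
        sum(fa ** 2 for fa in f_anom) * sum(oa ** 2 for oa in o_anom)
    )
    if denominator == 0.0:
        # A field identical to climatology has no anomaly pattern to score.
        raise ValueError("anomalies are all zero; correlation is undefined")
    return numerator / denominator
```

A forecast with the right anomaly pattern but the wrong amplitude can score well on the correlation while still showing a large RMSE, which is why the two metrics are reported together.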

Snapshot from a free-running, one-year forecast that compares the DLWP model to weather data and a climatology model.

Durran and his collaborators compared the results for 500-hPa height and 2-m temperature over a two-week period to several benchmark NWP models from the Integrated Forecast System (IFS) of the European Centre for Medium-Range Weather Forecasts. Unlike DLWP, the IFS models included full physical dynamics and even coupled them with ocean and sea ice models. These models computed variables at many different levels above the Earth’s surface, employed a one-hour time step rather than a 12-hour one, and incorporated numerous prognostic fields related to clouds and precipitation. The team used three IFS models for comparison: one with a slightly coarser resolution than their model, one with the same resolution, and one with a much finer resolution. Since the IFS models were much more complicated than the DLWP model, the team only considered the IFS output for the specific variables that their DLWP model also produced.

Despite using much less information to make its forecast, the team’s model performed better than the coarse-resolution IFS model, though it did not outperform the other two. Their model also ran faster, computing a two-week forecast in 0.1 seconds as opposed to 12 minutes for a particular IFS model. However, it is difficult to draw firm conclusions from this speed comparison because the two models run on very different types of computers.

Overall, the DLWP model was able to create stable forecasts in one-year, free-running simulations and could reasonably predict the seasonal cycle of atmospheric variables. Durran expressed his surprise at the model’s good performance despite its lack of knowledge of dynamics or physics. “I’m in shock at how well it worked!” he said. Since DLWP is so computationally efficient, it may be very useful for ensemble forecasting in the future, which could allow scientists to better predict extreme weather events.

Jillian Kunze is the associate editor of SIAM News.