# Stochastic Modeling for Weather and Climate Prediction

Regardless of the time or location, people seemingly always want to know the future weather. As late as the mid-20th century, the favored approach for weather forecasting involved *analogues* that were based on a large historical dataset of past weather reports. One would simply examine the record to find a day that was similar to the present day in question, then issue the historical evolution of the atmosphere as the forecast for the coming week. However, this method does not work in practice because the atmosphere is chaotic, meaning that its evolution is very sensitive to small details in the initial state. This was the central message of meteorologist Edward Lorenz’s groundbreaking 1963 paper: analogue forecasting is doomed to fail since one simply cannot find a historical match to the current weather with sufficient accuracy [3].
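The sensitivity that undermines analogue forecasting is easy to reproduce in the three-variable system from Lorenz's 1963 paper [3]. The sketch below is illustrative only. It assumes the classic parameter values (σ = 10, ρ = 28, β = 8/3), a fourth-order Runge-Kutta time step, and a hypothetical "analogue" that differs from the true state by 10⁻⁸ in a single variable; the function names are invented for this example.

```python
import numpy as np

def lorenz63_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz (1963) equations."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    k1 = lorenz63_rhs(state)
    k2 = lorenz63_rhs(state + 0.5 * dt * k1)
    k3 = lorenz63_rhs(state + 0.5 * dt * k2)
    k4 = lorenz63_rhs(state + dt * k3)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def separation_over_time(initial_state, perturbation, n_steps=4000, dt=0.01):
    """Distance between a reference run and a slightly perturbed 'analogue'."""
    truth = np.array(initial_state, dtype=float)
    analogue = truth + np.array(perturbation, dtype=float)
    distances = np.empty(n_steps + 1)
    distances[0] = np.linalg.norm(truth - analogue)
    for i in range(1, n_steps + 1):
        truth = rk4_step(truth, dt)
        analogue = rk4_step(analogue, dt)
        distances[i] = np.linalg.norm(truth - analogue)
    return distances

# Two states that agree to eight decimal places in x and exactly in y and z.
distances = separation_over_time([1.0, 1.0, 1.0], [1e-8, 0.0, 0.0])
print(f"initial separation: {distances[0]:.1e}")
print(f"largest separation: {distances.max():.1e}")
```

Plotting `distances` on a logarithmic axis should show roughly exponential growth until the separation saturates at about the size of the attractor, which is why even an excellent historical match loses its value as a forecast.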

Instead of using analogues, meteorologists now generate forecasts by combining the Navier-Stokes equations with equations that describe radiation, thermodynamics, water phase changes, and other phenomena in order to build a computer model of the atmosphere. Numerically solving these equations involves setting a discretization scale, which should be as fine as possible. However, we must also produce weather forecasts in a timely manner. Despite access to some of the world’s largest supercomputers, this stipulation puts a hard limit on how fine the discretization scale can be. For weather forecasts that extend one or two weeks ahead, this scale is around 10 kilometers. We must include the effects of all processes that occur below the discretization scale (including clouds, convection, and turbulence) in the model, but can only do so in an approximate manner via so-called “parametrization schemes.” A key assumption is that one can successfully approximate the unresolved scales’ impact on the resolved scale flow with a deterministic function of the resolved scales.

Two problems are immediately apparent. First, the Navier-Stokes equations show strong evidence of scaling symmetries [5]. In other words, if \(\mathbf{u}(\mathbf{x}, t)\), \(p(\mathbf{x}, t)\) is a solution to the Navier-Stokes equations, then \(\mathbf{u}_r(\mathbf{x},t)=r\mathbf{u}(r\mathbf{x}, r^2t)\), \(p_r(\mathbf{x},t)=r^2p(r\mathbf{x}, r^2t)\) is also a solution for any scaling parameter \(r>0\). This scaling symmetry is consistent with the power-law behavior that is evident in atmospheric observations [6]. However, truncating the equations of motion at the discretization scale and replacing the unresolved scales in computer models with a deterministic parametrization scheme violate these scaling symmetries. Deterministic parametrizations essentially assume the presence of a spectral gap between resolved and unresolved scales, which does not exist in reality. The parametrization process is therefore a source of error in our forecasts. The second problem is that small-scale forecast errors will not remain confined to the smallest scales. Instead, they will exponentially grow in time and cascade upscale in space, thus causing our forecasts to diverge from the atmosphere’s true evolution.
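To see why the rescaled fields remain a solution, it helps to substitute them into the equations. The check below assumes the incompressible, unforced form with constant kinematic viscosity \(\nu\) and with density absorbed into the pressure; these are simplifying assumptions made for illustration, not the full set of atmospheric equations.

\[
\partial_t \mathbf{u} + (\mathbf{u}\cdot\nabla)\mathbf{u} = -\nabla p + \nu \nabla^2 \mathbf{u}, \qquad \nabla\cdot\mathbf{u} = 0 .
\]

Writing \(\mathbf{x}' = r\mathbf{x}\) and \(t' = r^2 t\), and evaluating \(\mathbf{u}\) and \(p\) at \((\mathbf{x}', t')\), the chain rule gives

\[
\begin{aligned}
\partial_t \mathbf{u}_r &= r^3\, \partial_{t'} \mathbf{u}, &
(\mathbf{u}_r\cdot\nabla)\mathbf{u}_r &= r^3\, (\mathbf{u}\cdot\nabla')\mathbf{u}, \\
\nabla p_r &= r^3\, \nabla' p, &
\nu \nabla^2 \mathbf{u}_r &= r^3\, \nu \nabla'^2 \mathbf{u},
\end{aligned}
\qquad
\nabla\cdot\mathbf{u}_r = r^2\, \nabla'\cdot\mathbf{u} = 0 .
\]

Every term in the momentum equation picks up the same factor \(r^3\), which cancels, so \((\mathbf{u}_r, p_r)\) satisfies the same equations as \((\mathbf{u}, p)\). No length scale is singled out, which is the property that a fixed truncation scale breaks.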

One solution to these two issues is to replace conventional, deterministic parametrizations with stochastic parametrization schemes [7]. We recognize that the grid-scale variables cannot fully constrain subgrid motions without a spectral gap. We therefore choose to describe the subgrid in terms of a probability density function (PDF) that is constrained by the resolved scale flow, then randomly draw from this evolving PDF to step our computer model forward. For example, instead of including the effects of the most likely arrangement of clouds, we include the effect of just one possible cloud field on the forecast’s evolution.
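A toy sketch can make the idea of drawing from a state-dependent distribution concrete. Everything in it is an assumption chosen for illustration: `deterministic_tendency` is a hypothetical stand-in for a conventional scheme, `resolved_forcing` is a made-up constant, and the distribution is represented by multiplying the deterministic tendency by one plus a first-order autoregressive (AR(1)) random factor. That multiplicative form is only one possible choice, loosely in the spirit of perturbed-tendency schemes, and is not taken from any operational model.

```python
import numpy as np

def deterministic_tendency(resolved_state):
    """Hypothetical conventional scheme: relax the state toward zero."""
    return -0.5 * resolved_state

def step_noise(noise, rng, noise_std, memory):
    """AR(1) update; the stationary standard deviation equals noise_std."""
    innovation = rng.standard_normal(noise.shape)
    return memory * noise + np.sqrt(1.0 - memory**2) * noise_std * innovation

def run_model(initial_state, n_steps, dt, seed, noise_std=0.3, memory=0.9,
              resolved_forcing=1.0):
    """Forward-Euler integration with a randomly perturbed subgrid tendency."""
    rng = np.random.default_rng(seed)
    state = np.array(initial_state, dtype=float)
    noise = np.zeros_like(state)
    for _ in range(n_steps):
        noise = step_noise(noise, rng, noise_std, memory)
        # One random draw around the deterministic estimate, not the estimate itself.
        subgrid = (1.0 + noise) * deterministic_tendency(state)
        state = state + dt * (resolved_forcing + subgrid)
    return state

# Different seeds give different, equally plausible realizations.
for seed in (1, 2, 3):
    print(seed, run_model([1.0, 2.0], n_steps=200, dt=0.05, seed=seed))
```

Setting `noise_std=0.0` recovers the deterministic scheme, which makes it straightforward to compare the two approaches within the same toy model.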

To derive an appropriate form for the stochastic parametrization, we can characterize small-scale variability using high-resolution simulations that resolve the small-scale phenomena of interest. We do this by coarse graining these simulations before comparing them to a low-resolution forecasting model. Measurements of the “true” PDF of subgrid motions that are conditioned on the large-scale state not only provide further evidence that parametrization schemes should be stochastic, but also motivate the form of the stochastic parametrizations themselves [1] (see Figure 1).

**Figure 1.** Coarse-graining studies motivate and constrain stochastic parametrizations.

**1a.** The coarse-graining approach.

**1b.** True subgrid temperature tendency distribution (blue) compared to the estimate from a deterministic parametrization (grey rectangle and panel titles). Figure adapted from [1].
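The coarse-graining step can be sketched in a few lines. The example below is a simplified illustration rather than the procedure used in [1]: it assumes a two-dimensional field on a regular grid, uses plain block averaging, and replaces real simulation output with synthetic random data. The helper names `coarse_grain` and `conditional_spread` are hypothetical.

```python
import numpy as np

def coarse_grain(field, factor):
    """Block-average a 2D high-resolution field onto a coarser grid."""
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))

def conditional_spread(predictor, target, n_bins=5):
    """Mean and standard deviation of target within quantile bins of predictor."""
    predictor = np.ravel(predictor)
    target = np.ravel(target)
    edges = np.quantile(predictor, np.linspace(0.0, 1.0, n_bins + 1))
    # Interior edges only, so the bin indices run from 0 to n_bins - 1.
    bin_index = np.clip(np.digitize(predictor, edges[1:-1]), 0, n_bins - 1)
    means = np.array([target[bin_index == k].mean() for k in range(n_bins)])
    stds = np.array([target[bin_index == k].std() for k in range(n_bins)])
    return means, stds

# Synthetic stand-ins for a high-resolution state and its tendency.
rng = np.random.default_rng(0)
high_res_state = rng.standard_normal((64, 64))
high_res_tendency = -high_res_state + rng.standard_normal((64, 64))

coarse_state = coarse_grain(high_res_state, factor=8)
coarse_tendency = coarse_grain(high_res_tendency, factor=8)
means, stds = conditional_spread(coarse_state, coarse_tendency)
print(coarse_state.shape, means, stds)
```

A deterministic scheme would return only something like `means` for a given large-scale state. A nonzero `stds` in each bin corresponds to the spread of the blue distribution in panel 1b, which is the quantity a stochastic scheme attempts to represent.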

To address the second aforementioned problem—the fact that small-scale forecast errors will not remain confined to the smallest scales—we transition from making a single forecast for an upcoming period to making a set of forecasts. The forecasts originate from different but equally likely starting conditions, which we estimate based on our measurements of the atmosphere. Each forecast also utilizes different random numbers in the stochastic parametrization scheme, thereby indicating various possible realizations of the small-scale processes. By skillfully accounting for all sources of error in our forecasts, we can ensure that they are reliable — i.e., statistically consistent with the observed evolution of the atmosphere. For instance, if we collect all of the days for which we predicted a 10 percent chance of rain, it should rain on 10 percent of those days. Stochastic parametrizations are clearly necessary; we cannot produce reliable forecasts without them.
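The reliability criterion described above can be checked directly by grouping forecasts according to their stated probability and comparing each group with the observed frequency. The sketch below uses synthetic data in place of real forecasts and observations, and the function names, bin count, and rain threshold are assumptions made for illustration.

```python
import numpy as np

def ensemble_probability(ensemble_rain, threshold=0.0):
    """Fraction of members exceeding threshold; input shape (n_members, n_days)."""
    return (np.asarray(ensemble_rain) > threshold).mean(axis=0)

def reliability_table(forecast_prob, observed_event, n_bins=10):
    """Rows of (mean forecast probability, observed frequency, count) per bin."""
    forecast_prob = np.asarray(forecast_prob, dtype=float)
    observed_event = np.asarray(observed_event, dtype=bool)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_index = np.clip(np.digitize(forecast_prob, edges[1:-1]), 0, n_bins - 1)
    rows = []
    for k in range(n_bins):
        in_bin = bin_index == k
        if in_bin.any():
            rows.append((forecast_prob[in_bin].mean(),
                         observed_event[in_bin].mean(),
                         int(in_bin.sum())))
    return rows

# Synthetic example: the forecast probability equals the true chance of rain,
# so the forecasts are reliable by construction.
rng = np.random.default_rng(42)
true_chance = rng.uniform(0.0, 1.0, size=20000)
it_rained = rng.uniform(0.0, 1.0, size=20000) < true_chance
rows = reliability_table(true_chance, it_rained)
for mean_forecast, observed_frequency, count in rows:
    print(f"{mean_forecast:.2f}  {observed_frequency:.2f}  {count}")
```

For a reliable system, the first two columns should agree to within sampling noise. An overconfident ensemble, for example one that neglects model uncertainty, would typically show observed frequencies that lie closer to the climatological rate than the forecast probabilities do.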

Nowadays, however, we are not only interested in predicting the weather. Climate prediction is extremely important because it provides guidance for policymakers and enables a range of sectors to prepare for the future. But predicting the climate is a different problem than predicting the weather. In fact, Lorenz referred to weather prediction as a “prediction of the first kind” [4]. Such problems are initial value problems — the skill in the forecast comes primarily from accurate specification of the starting conditions and the system’s resulting evolution away from these conditions. Climate prediction, on the other hand, is a “prediction of the second kind.” In this context, we are interested in predicting a system’s response to an external forcing. We cannot hope to predict the specifics of the weather on any given day, but rather seek to predict the weather’s changing statistics.

Despite these differences, we produce climate predictions much like weather forecasts, though now we use a computer model of the *entire* Earth system, including the atmosphere, oceans, biosphere, and cryosphere, among other components. We also incorporate an estimate of how anthropogenic greenhouse gases and other emissions will evolve in the future (based on a range of policy-driven “emission pathways”) to assess possible forthcoming changes to the Earth’s climate. The added complexity of a climate model, coupled with the need to produce predictions on century-long timescales, means that we must substantially coarsen the discretization scale to the order of 100 kilometers.

While the weather forecasting community has readily adopted stochastic parametrizations because of their measurable positive impact on forecast skill, the climate modeling community generally still uses deterministic models. However, our recent work demonstrates the potential of stochastic parametrizations to transform climate modeling much like they have transformed weather prediction. We show that the presence of stochasticity in climate models can alleviate long-standing systematic biases, such as mean state biases—like the distribution of precipitation [8]—and biases in modes of variability, like the El Niño–Southern Oscillation [2] (see Figure 2). Despite concerted efforts from the community, these stubborn biases in deterministic models have long resisted improvement.

**Figure 2.** Power spectra of modeled (black) and observed (grey) El Niño–Southern Oscillation time series in three climate models, namely the Community Climate System Model (CCSM), EC-Earth, and the Met Office Unified Model (MetUM), with and without the stochastically perturbed parametrization tendencies (SPPT) scheme. The power spectrum for each model with stochastic parametrization better matches the observed data. Figure adapted from [2] and [9].

Unpicking the way in which stochasticity leads to such dramatic improvements is nontrivial, and we generally must assess the mechanism for each phenomenon of interest. For example, while researchers can understand El Niño’s basic existence as a deterministic coupling between atmosphere and ocean, its variability stems from high-frequency atmospheric wind stress forcing on the ocean surface. Simulations that include a stochastic parametrization reveal an improved distribution of atmospheric winds. In the Community Climate System Model (CCSM) and Met Office Unified Model (MetUM), the parametrization dampens an excessively active El Niño. But in the EC-Earth climate model, it enhances a too-weak El Niño (see Figure 2). If we assume that the underlying coupling strength between atmosphere and ocean differs among the various climate models, then a very simple delayed oscillator model of El Niño predicts this extraordinary result [9].
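For readers who wish to experiment, the sketch below integrates one classic nondimensional delayed-oscillator form, \(dT/dt = T - T^3 - \alpha T(t-\delta)\), with an added white-noise term. This form, commonly attributed to Suarez and Schopf, is an assumption here and may differ from the formulation used in [9]. The parameter values, the additive noise, and the Euler-Maruyama time stepping are likewise illustrative choices, with \(\alpha\) standing in for the strength of the delayed feedback.

```python
import numpy as np

def delayed_oscillator(alpha, delay, noise_std, n_steps=20000, dt=0.01, seed=0):
    """Euler-Maruyama integration of dT/dt = T - T**3 - alpha * T(t - delay) + noise."""
    rng = np.random.default_rng(seed)
    lag = int(round(delay / dt))
    # Constant small anomaly as the history for t <= 0.
    temperature = np.full(lag + n_steps + 1, 0.1)
    for i in range(lag, lag + n_steps):
        current = temperature[i]
        delayed = temperature[i - lag]
        drift = current - current**3 - alpha * delayed
        kick = noise_std * np.sqrt(dt) * rng.standard_normal()
        temperature[i + 1] = current + dt * drift + kick
    return temperature[lag:]

# Compare the variability with and without noise for two feedback strengths.
for alpha in (0.6, 0.9):
    quiet = delayed_oscillator(alpha, delay=6.0, noise_std=0.0)
    noisy = delayed_oscillator(alpha, delay=6.0, noise_std=0.2)
    print(alpha, quiet.std(), noisy.std())
```

Whether the noise amplifies or damps the variability in this toy model depends on the chosen parameters, so the printed values are a starting point for exploration and not a reproduction of the published result.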

By improving the statistics of the fast “weather” in climate models, we enable the simulated Earth system to explore its attractor in a more realistic way and thus improve the model’s fidelity. As I write this article, the Intergovernmental Panel on Climate Change is producing its Sixth Assessment Report. This report will collate the state of the art in climate prediction and compare coordinated climate change experiments that were created with the world’s leading climate models. For the first time, one of these models includes stochastic parametrization schemes. This is an exciting development, and I trust that many climate centers will soon adopt these techniques.

*This article is based on Hannah Christensen’s invited presentation at the 2020 SIAM Conference on Mathematics of Planet Earth, which took place virtually last year. Christensen’s talk is available on SIAM’s YouTube Channel.*

**References**

[1] Christensen, H.M. (2020). Constraining stochastic parametrisation schemes using high-resolution simulations. *Quart. J. Roy. Meteorol. Soc.*, *146*(727), 938-962.

[2] Christensen, H.M., Berner, J., Coleman, D.R.B., & Palmer, T.N. (2017). Stochastic parameterization and the El Niño–Southern Oscillation. *J. Clim.*, *30*(1), 17-38.

[3] Lorenz, E.N. (1963). Deterministic nonperiodic flow. *J. Atmos. Sci.*, *20*(2), 130-141.

[4] Lorenz, E.N. (1975). Climatic predictability. In *The physical basis of climate and climate modelling* (GARP Publication Series No. 16) (pp. 132-136). Stockholm, Sweden.

[5] Lovejoy, S., & Schertzer, D. (2013). *The weather and climate: Emergent laws and multifractal cascades*. Cambridge, U.K.: Cambridge University Press.

[6] Nastrom, G.D., & Gage, K.S. (1985). A climatology of atmospheric wavenumber spectra observed by commercial aircraft. *J. Atmos. Sci.*, *42*(9), 950-960.

[7] Palmer, T.N. (2001). A nonlinear dynamical perspective on model error: A proposal for non-local stochastic-dynamic parameterization in weather and climate prediction. *Quart. J. Roy. Meteorol. Soc.*, *127*(572), 279-304.

[8] Strommen, K., Christensen, H.M., MacLeod, D., Juricke, S., & Palmer, T.N. (2019). Progress towards a probabilistic Earth system model: Examining the impact of stochasticity in the atmosphere and land component of EC-Earth v3.2. *Geosci. Model Dev.*, *12*(7), 3099-3118.

[9] Yang, C., Christensen, H.M., Corti, S., von Hardenberg, J., & Davini, P. (2019). The impact of stochastic physics on the El Niño Southern Oscillation in the EC-Earth coupled model. *Clim. Dyn.*, *53*, 2843-2859.