SIAM News Blog

Data-driven Hydrological Models Predict Mississippi River Flooding

By Lina Sorg

Flooding is responsible for more fatalities than any other severe weather phenomenon. Floodwaters damage personal property, cause injury and loss of life, and burden economic and public health infrastructure. Accurate flood prediction models are therefore necessary to minimize flooding’s impact on populations that reside near a floodplain.

In recent years, researchers have taken an increased interest in flood models due to relevant advances in computing power. Unfortunately, many flood models are still very computationally expensive due to the large amounts of required data. During a contributed presentation at the 2021 SIAM Annual Meeting, which is taking place virtually this week, Haley Dozier of the U.S. Army Engineer Research and Development Center presented several data-driven models that predict river discharge in the Mississippi/Atchafalaya River Basin (MARB). The MARB is the third-largest river basin in the world and thus a source of frequent flooding. For example, floods along the Mississippi River in 2019 caused at least 12 deaths and billions of dollars of economic losses. 

Dozier’s research group is currently working to create a continental-scale flood prediction model, though her presentation focused solely on the MARB. She began by introducing an existing NOAA tool that uses gauges to describe river stages and make subsequent predictions. However, gauge data is limited to certain rivers and is not easily accessible. “These kinds of model are incredibly useful, but they’re based on information or field measurements that are unavailable to the public or outdated,” Dozier said. “When we want to scale up to continental-scale models, we can’t have models that are based on gauge information.”

Dozier and her team thus explored standoff models that rely on satellite data rather than in-field gauge measurements. They used several different data sources for their models, including the European Union’s Copernicus Programme, the European Centre for Medium-Range Weather Forecasts (ECMWF), and the Global Flood Awareness System (GloFAS). Copernicus is an Earth observation program that captures satellite images; its Sentinel-2 mission utilizes two polar-orbiting satellites to monitor variability in land surface conditions like vegetation, soil and water cover, inland waterways, and so forth. ECMWF reanalysis data comprises hourly estimates of atmospheric, land, and oceanic climate variables. Finally, GloFAS data describes river discharge (the volume of water that flows through a particular river channel) and any accompanying flooding.

Dozier then briefly explained the nuances of a forecasting regression model. “Forecasting with regression is pretty simple,” Dozier said. Instead of predicting the target variable at the same time as the input measurements, one predicts it at a subsequent timestep. Input variables include convective precipitation, large-scale precipitation, humidity, and relative topography, all of which are measured at a specific point. “We aren’t necessarily predicting regional flooding, just point flooding,” Dozier said. In this case, the group used 11 days of data to predict the next day.
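The windowing idea (11 days of inputs, one day of output) can be sketched in a few lines of Python. This is a minimal illustration rather than the group's code. The function name `make_windows`, the single-variable input, and the synthetic discharge values are all assumptions made for the example; the models described in the talk also take precipitation, humidity, and topography as inputs.

```python
from typing import List, Tuple


def make_windows(series: List[float], n_lags: int = 11) -> Tuple[List[List[float]], List[float]]:
    """Turn a daily series into (features, target) pairs for one-step-ahead forecasting.

    Each feature row holds `n_lags` consecutive days; the target is the day that follows.
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    features: List[List[float]] = []
    targets: List[float] = []
    for start in range(len(series) - n_lags):
        features.append(series[start:start + n_lags])
        targets.append(series[start + n_lags])
    return features, targets


# Synthetic daily values (hypothetical, for illustration only).
discharge = [float(day) for day in range(15)]
X, y = make_windows(discharge, n_lags=11)
print(len(X), "training rows")
print("first window:", X[0], "-> next day:", y[0])
```

Any regression model can then be fit on the rows of `X` against `y`. Forecasting further ahead would mean pairing each window with a later day instead of the one that immediately follows it.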

Figure 1. Results of the linear regression and random forest method for a point near New Orleans, La. The blue line depicts the actual discharge from the Global Flood Awareness System (GloFAS) data and the orange line represents the forecasted discharge. The bottom images are a subset of the GloFAS data from July and August 2014. Neither method’s forecast effectively captures the data.

Dozier and her team utilized many different models, but her two favorites are extremely randomized trees (ExtraTrees) and Gaussian process regression. ExtraTrees is a tree-based ensemble method that trains several randomized decision trees as base estimators and averages their predictions. “I’m a big fan of tree-based methods because they are very explainable, especially if you have a method where you can actually visualize the tree,” Dozier said. Gaussian process regression, a supervised learning method that returns an uncertainty estimate alongside each prediction, works well too.
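To make the ExtraTrees idea concrete, here is a toy, from-scratch sketch in Python that uses only the standard library. It is not the implementation that Dozier's team used, and every name in it (`fit_extra_trees`, `predict`, `k_features`, and so on) is hypothetical. It assumes the usual formulation of the method: each tree sees the full training set, each candidate split uses a threshold drawn uniformly at random, and the ensemble prediction is the average over trees. In practice, a library implementation such as scikit-learn's `ExtraTreesRegressor` would be the more likely choice.

```python
import random
from typing import List, Optional, Sequence


class Node:
    """A tree node: a leaf when `feature` is None, otherwise an internal split."""

    def __init__(self, value: float) -> None:
        self.value = value  # mean target of the training samples in this node
        self.feature: Optional[int] = None
        self.threshold = 0.0
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def _sse(values: Sequence[float]) -> float:
    """Sum of squared deviations from the mean (lower means a purer node)."""
    m = _mean(values)
    return sum((v - m) ** 2 for v in values)


def _build(X, y, rng: random.Random, min_samples: int, k_features: int) -> Node:
    node = Node(_mean(y))
    if len(y) < min_samples:
        return node
    best = None
    n_features = len(X[0])
    for f in rng.sample(range(n_features), min(k_features, n_features)):
        column = [row[f] for row in X]
        lo, hi = min(column), max(column)
        if lo == hi:
            continue  # feature is constant in this node, so it cannot split
        t = rng.uniform(lo, hi)  # random cut point: the "extremely randomized" step
        left = [i for i, v in enumerate(column) if v < t]
        right = [i for i, v in enumerate(column) if v >= t]
        if not left or not right:
            continue
        score = _sse([y[i] for i in left]) + _sse([y[i] for i in right])
        if best is None or score < best[0]:
            best = (score, f, t, left, right)
    if best is None:
        return node
    _, node.feature, node.threshold, left, right = best
    node.left = _build([X[i] for i in left], [y[i] for i in left], rng, min_samples, k_features)
    node.right = _build([X[i] for i in right], [y[i] for i in right], rng, min_samples, k_features)
    return node


def fit_extra_trees(X, y, n_trees: int = 50, min_samples: int = 2,
                    k_features: int = 1, seed: int = 0) -> List[Node]:
    """Grow `n_trees` randomized trees, each on the full training set (no bootstrap)."""
    rng = random.Random(seed)
    return [_build(X, y, rng, min_samples, k_features) for _ in range(n_trees)]


def predict(trees: List[Node], row: Sequence[float]) -> float:
    """Average the predictions of all trees for one input row."""
    outputs = []
    for node in trees:
        while node.feature is not None:
            node = node.left if row[node.feature] < node.threshold else node.right
        outputs.append(node.value)
    return _mean(outputs)


# Synthetic two-feature data (hypothetical): the target is a simple function of the inputs.
X = [[float(i), float(i % 3)] for i in range(20)]
y = [2.0 * a + b for a, b in X]
trees = fit_extra_trees(X, y, n_trees=25, k_features=2, seed=0)
print("prediction at an unseen point:", predict(trees, [5.5, 1.0]))
```

Because every split is a simple threshold on one feature, any individual tree in the ensemble can be printed or drawn, which is the kind of explainability that the quote above refers to.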

To test their models, Dozier and her colleagues selected a precise point on the Mississippi River at which to measure discharge. They picked a spot near New Orleans, La. First, Dozier displayed the results of a linear regression and a random forest (see Figure 1). The linear regression forecast did not accurately match the actual discharge data from GloFAS. Although the random forest performed a bit better, it struggled to represent large changes in discharge.

Next, Dozier contrasted these methods with the aforementioned ExtraTrees and Gaussian process (see Figure 2). Both methods performed remarkably well — their forecasts matched the GloFAS data exactly. Dozier did note that the team’s use of 11 previous days to predict one future day is likely responsible for the results’ precision. “Predicting one day in the future isn’t as complex as a lot of models,” Dozier said. “Usually you’re trying to predict a month, two months, three months in the future. But a result like this shows us that the method is at least promising so we can try to push those forecasts out further.”

Figure 2. Results of extremely randomized trees (ExtraTrees) and the Gaussian process regression for a point near New Orleans, La. The forecasted discharge exactly matches the data from the Global Flood Awareness System (GloFAS).

Because GloFAS data is based on satellite imagery, Dozier compared it to gauge discharge data from several locations along the Mississippi River. Interestingly, the group found that the GloFAS and gauge data diverged at the Belle Chasse Gauge due to a dam (a control structure) that is upriver from the site in question.

As a next step, Dozier wants to determine the locations of control structures that cause GloFAS data and gauge data to diverge. “This is really important for the Army in relief missions,” Dozier said. “If we go into somewhere like Africa on the Niger River, we need to figure out where a dam is and where managed water is. If we can figure out where these control structures are, we can make better decisions.”

Lina Sorg is the managing editor of SIAM News.