October 02, 2023

Lowering the Entry Barrier to Uncertainty Quantification

Uncertainties in data are omnipresent, encompassing factors such as measurement errors, incomplete information, and random processes. The goal of uncertainty quantification (UQ) is to determine the effect of uncertain data on model predictions or inferences. UQ’s myriad applications include safe aircraft design, medical decision-making, and nuclear waste disposal, among many others.

UQ problems typically fall into two categories: propagation problems and inverse problems. Propagation problems start from a model parameter \(\theta\), which we treat as random with density \(\pi\) in order to represent a range of plausible parameter values. We use \(F\) to denote our model map that takes parameters to observations, then aim to find the distribution of \(F(\theta)\). Conversely, inverse problems originate from uncertain observations of a physical process and seek to determine the underlying model parameters.

Solving UQ problems on realistic models can be computationally challenging and often necessitates a large number of costly model evaluations. Doing so therefore requires a combination of advanced UQ methods, efficient numerical model solvers, and possibly even high-performance computing (HPC) support [5].

UQ and Realistic Models: Simple in Theory, Complex in Practice

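The Monte Carlo method is a basic but often inefficient approach to uncertainty propagation. First, we generate a number of independent samples \(\theta_1, \ldots, \theta_N\) from \(\pi\). Applying the model map to each sample then yields \(F(\theta_1), \ldots, F(\theta_N)\), which are samples from the desired distribution. The inefficiency stems from the slow \(O(N^{-1/2})\) convergence of sample-based estimates, so an accurate result can demand many costly model evaluations.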
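The two Monte Carlo steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy map `forward_model` stands in for a costly solver, the density \(\pi\) is taken to be uniform on \([0,1]^2\), and all function names are hypothetical.

```python
import random
import statistics


def forward_model(theta):
    """Toy model map F(theta) = theta_1 + theta_2 (stand-in for a costly solver)."""
    return theta[0] + theta[1]


def sample_prior(rng):
    """Draw one parameter from pi, assumed here to be uniform on [0, 1]^2."""
    return (rng.random(), rng.random())


def monte_carlo_propagation(model, sampler, num_samples, seed=0):
    """Return F(theta_1), ..., F(theta_N) for independent draws theta_i from pi."""
    rng = random.Random(seed)
    return [model(sampler(rng)) for _ in range(num_samples)]


# The outputs are samples of the distribution of F(theta); any statistic of
# interest can then be estimated from them.
outputs = monte_carlo_propagation(forward_model, sample_prior, num_samples=10_000)
print("estimated mean:", statistics.fmean(outputs))
print("estimated standard deviation:", statistics.stdev(outputs))
```

Every entry of `outputs` costs one model evaluation, which is why this approach becomes expensive as soon as `forward_model` is a realistic simulation.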

Despite the wide variety of more sophisticated UQ methods—which mainly differ in the amount of information they offer about the desired distributions, the number of assumptions they make on model \(F\), and their level of efficiency—many of these methods require similar information about the model in question. In fact, they often come down to a subset of the following pointwise operations: evaluations \(F(\theta)\), Jacobian actions \(J(\theta)v\), gradients \(v^\top J(\theta)\), and Hessian actions.
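To make these pointwise operations concrete, the following sketch writes them out by hand for a small example map. The map \(F(\theta) = (\theta_1 + \theta_2, \theta_1 \theta_2)\) and all function names are our own choices for illustration. We also assume one common reading of "Hessian action," namely the directional derivative of \(\theta \mapsto s^\top J(\theta)\) along \(v\) for a fixed sensitivity vector \(s\).

```python
def evaluate(theta):
    """Model evaluation F(theta) = (theta_1 + theta_2, theta_1 * theta_2)."""
    t1, t2 = theta
    return [t1 + t2, t1 * t2]


def jacobian(theta):
    """Dense Jacobian J(theta); real models rarely form this matrix explicitly."""
    t1, t2 = theta
    return [[1.0, 1.0],
            [t2, t1]]


def apply_jacobian(theta, v):
    """Jacobian action J(theta) v."""
    J = jacobian(theta)
    return [sum(J[i][j] * v[j] for j in range(2)) for i in range(2)]


def gradient(theta, sens):
    """Gradient sens^T J(theta) for a sensitivity vector sens."""
    J = jacobian(theta)
    return [sum(sens[i] * J[i][j] for i in range(2)) for j in range(2)]


def apply_hessian(theta, sens, v):
    """Hessian action: derivative of theta -> sens^T J(theta) in direction v.

    Only the second output is nonlinear; its Hessian is [[0, 1], [1, 0]].
    """
    return [sens[1] * v[1], sens[1] * v[0]]


theta, v, sens = [0.0, 10.0], [1.0, 4.0], [1.0, 4.0]
print(evaluate(theta))
print(apply_jacobian(theta, v))
print(gradient(theta, sens))
print(apply_hessian(theta, sens, v))
```

A UQ method that needs only a subset of these four functions can work with any model that provides that subset, regardless of how the model computes them internally.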

Figure 1. UM-Bridge links uncertainty quantification (UQ) and model codes via behind-the-scenes HTTP-based network communication that is inspired by microservice architectures. Figure courtesy of Anne Reinarz.

An emerging ecosystem of software packages is expediting the implementation of advanced UQ methods. Examples include CUQIpy, Lagun, MUQ, PyMC, QMCPy, Sparse Grids Matlab Kit, tinyDA, UQLab, UQ Toolkit, and TT-Toolbox.

Considering the simple mathematical “interface” and availability of software packages, the application of UQ methods to even complex models should be straightforward. So why is UQ not nearly as ubiquitous or well-integrated as deterministic numerical simulations, even though information about uncertainty is crucial to a variety of applications?

First, combining state-of-the-art UQ methods with advanced numerical solvers and HPC capabilities entails a high level of technical complexity, since the relevant communities tend to use very different tools (for good reason). Second, experts from each field often must collaborate closely throughout the entire project; technical issues and a lack of separation of concerns are frequent limiting factors.

Figure 2. Recorded water height at a buoy during a tsunami event (red), and forward model evaluations of a numerical simulation of the tsunami event with uncertain input parameters (blue). Figure courtesy of [5].

UM-Bridge: A Universal Link between UQ and Models

UM-Bridge breaks down the technical complexity by providing a link between any UQ code and any model software that is as universal as the aforementioned mathematical operations [3]. We utilize a network protocol to transfer inputs to and outputs from these operations.

The result is the architecture in Figure 1. UQ (the “client”) and the numerical model (the “server”) run as separate applications and use whatever programming languages, dependencies, or data that they each require. Due to the unified interface, we can easily interchange both the model and UQ code with alternatives. Furthermore, we can containerize UM-Bridge models for portability and even scale them to large clusters.
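The client/server split itself can be illustrated with Python's standard library alone. The sketch below is a generic stand-in and does not reproduce the actual UM-Bridge protocol: the `/evaluate` path, the JSON field names, and all function names are assumptions made for this example.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def forward_model(theta):
    """Toy model F(theta) = theta_1 + theta_2 running on the server side."""
    return [theta[0] + theta[1]]


class ModelHandler(BaseHTTPRequestHandler):
    """Answers POST requests carrying {"input": theta} with {"output": F(theta)}."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        request = json.loads(self.rfile.read(length))
        body = json.dumps({"output": forward_model(request["input"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


def evaluate_remotely(host, port, theta):
    """Client side: send theta over HTTP and return the model output."""
    connection = http.client.HTTPConnection(host, port, timeout=10)
    try:
        connection.request("POST", "/evaluate",
                           body=json.dumps({"input": theta}).encode(),
                           headers={"Content-Type": "application/json"})
        reply = json.loads(connection.getresponse().read())
    finally:
        connection.close()
    return reply["output"]


def demo_round_trip(theta):
    """Start the server in a background thread, query it once, and shut it down."""
    server = HTTPServer(("127.0.0.1", 0), ModelHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return evaluate_remotely("127.0.0.1", server.server_address[1], theta)
    finally:
        server.shutdown()
        server.server_close()


print(demo_round_trip([0.0, 10.0]))
```

Because the client only sees the HTTP interface, the server could just as well be written in another language or run on another machine.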

UM-Bridge provides straightforward integrations for C++, Julia, Python, MATLAB, and R. Framework-specific integrations also exist for emcee, MUQ, PyMC, QMCPy, Sparse Grids Matlab Kit, tinyDA, and TT-Toolbox. In addition, a selection of benchmark problems is available as ready-to-run Docker container images [2]. For example, we can download and run the tsunami simulation from Figure 2 with a single Docker command:

docker run -it -p 4242:4242 linusseelinger/model-exahype-tsunami

Now we can connect to the model server from any UM-Bridge client.

Figure 3. Python example that points UM-Bridge to a model named “forward” and requests a model evaluation \(F((0,10)^\top)\) and Jacobian action \(J((0,10)^\top) (1,4)^\top\).

In the Python example in Figure 3, we point UM-Bridge to a model named “forward” that runs on the same machine, and proceed to request a model evaluation \(F((0,10)^\top)\). We also request a Jacobian action \(J((0,10)^\top) (1,4)^\top\) from the model, provided that it is supported. As such, defining a model comes down to defining the mathematical operations in an UM-Bridge server.
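A client along these lines might look like the following sketch. It relies on the `umbridge` Python package as we understand its interface (`HTTPModel`, `supports_apply_jacobian`, and `apply_jacobian`); the helper names `connect` and `evaluate_and_linearize`, as well as the offline stand-in `LocalSumModel`, are hypothetical and only serve to make the example runnable without a server.

```python
def connect(url="http://localhost:4242", name="forward"):
    """Open a connection to a running UM-Bridge model server."""
    import umbridge  # third-party package, imported lazily (pip install umbridge)
    return umbridge.HTTPModel(url, name)


class LocalSumModel:
    """Offline stand-in (hypothetical) with the same call pattern as the client.

    Implements F(theta) = theta_1 + theta_2 without any network traffic.
    """

    def __call__(self, parameters, config=None):
        theta = parameters[0]
        return [[theta[0] + theta[1]]]

    def supports_apply_jacobian(self):
        return True

    def apply_jacobian(self, out_wrt, in_wrt, parameters, vec, config=None):
        # J(theta) = [1, 1] for every theta, so J(theta) v = v_1 + v_2.
        return [vec[0] + vec[1]]


def evaluate_and_linearize(model, theta, direction):
    """Request F(theta) and, if the model supports it, the action J(theta) v."""
    value = model([theta])  # inputs are passed as a list of vectors
    action = None
    if model.supports_apply_jacobian():
        # Derivative of output vector 0 with respect to input vector 0.
        action = model.apply_jacobian(0, 0, [theta], direction)
    return value, action


value, action = evaluate_and_linearize(LocalSumModel(), [0.0, 10.0], [1.0, 4.0])
print(value, action)
```

Once a model server is listening on port 4242, passing `connect()` instead of `LocalSumModel()` sends the same two requests over the network.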

Figure 4 depicts a minimal Python example that implements \(F : \mathbb{R}^2 \rightarrow \mathbb{R}\), \(F(\theta)=\theta_1+\theta_2\). We can connect this Python model to any UM-Bridge client, e.g., in place of the tsunami model in Figure 2.

Figure 4. Minimal Python example that implements \(F : \mathbb{R}^2 \rightarrow \mathbb{R}\), \(F(\theta)=\theta_1+\theta_2\).

Finally, we demonstrate the application of UQ packages to UM-Bridge models. The QMCPy example in Figure 5 propagates a uniform distribution through the model via the quasi-Monte Carlo (QMC) method [1]. The code focuses solely on the setup of QMC. Instead of defining the model, it connects to an UM-Bridge model and passes it to QMCPy’s UM-Bridge wrapper. We can apply the exact same QMCPy code to both the aforementioned custom model and the tsunami simulation (of course, the latter is much more costly).

Figure 5. QMCPy example that propagates a uniform distribution through the model via the quasi-Monte Carlo (QMC) method.
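For readers unfamiliar with QMC, the following sketch shows the underlying idea without QMCPy: a hand-rolled Halton sequence replaces random samples with deterministic, evenly spread points. This is not the QMCPy API, which uses more refined point sets and adaptive stopping criteria; the function names and the choice of a uniform distribution on a box are assumptions for illustration.

```python
PRIMES = (2, 3, 5, 7, 11, 13)


def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(num_points, dimension):
    """First num_points points of the Halton sequence in [0, 1)^dimension."""
    bases = PRIMES[:dimension]
    return [[radical_inverse(i, b) for b in bases]
            for i in range(1, num_points + 1)]


def forward_model(theta):
    """Toy model F(theta) = theta_1 + theta_2."""
    return theta[0] + theta[1]


def qmc_mean(model, num_points, lower, upper):
    """Estimate E[F(theta)] for theta uniform on the box [lower, upper]."""
    total = 0.0
    for u in halton_points(num_points, len(lower)):
        # Map the unit-cube point to the box before evaluating the model.
        theta = [lo + (hi - lo) * ui for lo, hi, ui in zip(lower, upper, u)]
        total += model(theta)
    return total / num_points


print(qmc_mean(forward_model, 1024, lower=[0.0, 0.0], upper=[1.0, 1.0]))
```

As in the QMCPy setting, `model` is just a callable, so a remote model could take the place of `forward_model` without changes to the QMC code.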

Scaling to Clusters

Scaling up UQ applications to large clusters brings its own set of challenges, as many UQ packages are not prepared for that setting and thus require costly reimplementations. The few that are able to run on large clusters are typically tied to a specific parallelization technology, such as message passing interface (MPI) or Ray.

UM-Bridge includes an easy-to-use Kubernetes configuration that runs parallel instances of arbitrary model containers in the cloud (see Figure 6). A predefined load balancer distributes model evaluation requests from the UQ client among these instances. As a result, UQ applications based on UM-Bridge are easy to scale up; no model modifications are necessary because each model instance runs in a portable container exactly as before. Likewise, the UQ software still makes evaluation requests through UM-Bridge. The only difference is that the software may now make parallel requests via any parallelization technique [4].

Many UQ algorithms themselves are rather inexpensive when compared to costly model evaluations. Even prototype-grade, thread-parallel UQ codes that run on a laptop can transparently offload model evaluations to remote clusters with thousands of processor cores.
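Thread parallelism on the client side can be as simple as the sketch below, which uses Python's standard `concurrent.futures` module. The function `slow_model` is a hypothetical stand-in that mimics the latency of a remote evaluation; in practice, the callable would forward each request to the load balancer.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_model(theta):
    """Stand-in for a remote model call: waits briefly, then returns F(theta)."""
    time.sleep(0.01)  # mimics network and solver latency
    return theta[0] + theta[1]


def evaluate_batch(model, thetas, max_workers=8):
    """Evaluate the model at every theta concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(model, thetas))


thetas = [[float(i), 10.0] for i in range(32)]
results = evaluate_batch(slow_model, thetas)
print(results[:4])
```

Threads suffice here because the client spends nearly all of its time waiting for responses, while the actual computation happens on the model instances in the cluster.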

Figure 6. Cloud cluster configuration that distributes evaluation requests across many instances of any UM-Bridge model. Figure courtesy of Anne Reinarz.

Discussion

UM-Bridge enables complex UQ applications by breaking down complexity and accelerating development — from prototypes all the way to large-scale runs on clusters. Support for Slurm-based HPC systems will become available soon. The separation of concerns facilitates efficient collaboration between experts, shifting the focus from technical issues to truly relevant mathematical questions. Model specialists gain access to a wide variety of UQ packages and can easily share models with their collaborators. Furthermore, UQ method developers can readily apply UQ codes to any numerical model (including standardized benchmark problems) and offer UQ software to a wide audience. They also do not need to develop dedicated HPC versions of their codes.

Although UM-Bridge is a young project, it is already gaining traction. Multiple UQ packages have added support for UM-Bridge, and more than 15 collaborators from over 10 institutions are working to develop the UM-Bridge benchmark library. Additionally, we received a Google Open Source Peer Bonus award in 2023, and early industry adoption is ongoing.

Documentation, tutorials, and benchmark problems for UM-Bridge are all available online. Feel free to contact us via email or join our Slack workspace from the documentation’s main page. We are still working to grow the community around UM-Bridge and are happy to actively support new applications and integrations in UQ packages. In fact, we are organizing an online, two-day workshop for UM-Bridge on December 11-12 and invite any interested individuals to attend. Registration is open until December 1.

References
[1] Choi, S.-C.T., Hickernell, F.J., McCourt, M., Rathinavel, J., & Sorokin, A. (2020). QMCPy: A quasi-Monte Carlo Python library. Retrieved from https://qmcsoftware.github.io/QMCSoftware.
[2] Merkel, D. (2014). Docker: Lightweight Linux containers for consistent development and deployment. Linux J., 2014(239), 2.
[3] Seelinger, L., Cheng-Seelinger, V., Davis, A., Parno, M., & Reinarz, A. (2023). UM-Bridge: Uncertainty quantification and modeling bridge. J. Open Source Soft., 8(83), 4748.
[4] Seelinger, L., Reinarz, A., Benezech, J., Lykkegaard, M.B., Tamellini, L., & Scheichl, R. (2023). Lowering the entry bar to HPC-scale uncertainty quantification. Preprint, arXiv:2304.14087.
[5] Seelinger, L., Reinarz, A., Rannabauer, L., Bader, M., Bastian, P., & Scheichl, R. (2021). High performance uncertainty quantification with parallelized multilevel Markov chain Monte Carlo. In SC’21: Proceedings of the international conference for high performance computing, networking, storage and analysis. St. Louis, MO: Association for Computing Machinery.

	Linus Seelinger is a postdoctoral researcher in the Institute for Mathematics at Heidelberg University. He has contributed to the DUNE numerical framework, is a core developer of the MIT Uncertainty Quantification Library, and started the UM-Bridge project.
	Anne Reinarz is an assistant professor of computer science in the Scientific Computing research group at Durham University and director of the Durham MSc in Scientific Computing and Data Analysis. She has worked on numerous open-source software projects, including UM-Bridge and the hyperbolic partial differential equation solver ExaHyPE.