SIAM News Blog

The True Logic for This World is the Calculus of Probabilities

By Tan Bui-Thanh

Dedicated to Professor Ivo Babuška on the occasion of his 95th birthday.

In 1850, James Clerk Maxwell shared the following statement:

“...The actual science of logic is conversant at present only with things either certain, impossible, or entirely doubtful, none of which (fortunately) we have to reason on. Therefore the true logic for this world is the calculus of probabilities, which takes account of the magnitude of the probability which is, or ought to be, in a reasonable man’s mind.”

The question here is not whether we agree with this statement—as there is no unique answer, only subjective ones—but rather how precisely it reflects the language of people, science, and computation 170 years later. If experiments, theories, and computations are the three pillars of science, we argue that the languages of people, of science, and of computation are the three pillars of probability.1 In particular, we adapt Frank Lad's [5] and Edwin T. Jaynes' [3] arguments on why probability is the language of people and the language of science. We then show that we can adapt—and in fact have already adapted—the language of probability to our advantage in order to simplify and speed up scientific computations and advance scientific progress.

Probability as the Language of People 

We are accustomed to the logic of certainty, in which a (mathematical) statement is either true or false. However, statements in daily real-life situations are rarely simply true or false. This reality is reflected in the phrases that we use to express uncertainty about situations — such as “I am not sure,” “most likely,” “pretty sure,” and “quite,” to name a few. Why are our lives so uncertain? We can infer an answer from the inherently “uncertain” atomic structure of matter: by Heisenberg’s uncertainty principle, one cannot precisely determine the position and momentum of a particle (such as an electron) simultaneously. In other words, we can understand Mother Nature as a random generating machine. For this reason, it is not surprising that our lives are surrounded by random events whose outcomes are undetermined or uncertain. Furthermore, parts of our languages—the principal methods of human communication—are designed to reflect these random events and their uncertain outcomes. The language of uncertainty is thus real-life probability. Uncertainty and/or randomness do not necessarily imply negativity, though most people do prefer certainty and stability. Indeed, even though the basic atomic structure is random and uncertain, Earth has existed for 4.5 billion years.

Probability as the Language of Science

Real-life probability is necessarily personal probability because it subjectively depends upon the way in which we operate (believe, think, reason, learn, etc.) in a world (as well as in science) that is full of uncertainties. Inspired by real-life probability, Bruno de Finetti formalized the language of subjective uncertainty into operational and computational subjective probability [2]. Regardless of objectivity or subjectivity, the calculus of probabilities is rigorous under the axiomatic probability of Andrey Kolmogorov [4]. The beauty here is that the calculus of probabilities includes not only the logic of uncertainty, but also the logic of certainty as a special case. Under Edwin T. Jaynes’ umbrella of plausible reasoning [3], the calculus of probabilities consists of generally valid principles of logic because it does not refer to random variables or chance. Probability is thus the language of science [3].

Probability as the Language of Scientific Computation? 

Many engineering and scientific applications are stochastic in nature (e.g., Brownian motion, epidemic outbreaks, earthquakes, etc.). For such applications, we have no choice but to operate in stochastic regimes with random computations. Surprisingly, however, we can also use stochastic calculations to our advantage when solving challenging deterministic problems. This concept may sound counterintuitive, but randomized algorithms are typically easier to understand and faster to execute than their deterministic counterparts [6]. They are also particularly attractive for high-performance computing since they are typically embarrassingly parallel and robust. A four-day workshop on Randomized Algorithms for Scientific Computing [1], which took place virtually last year, highlighted the paramount importance of randomized algorithms in the past, present, and future of scientific computations at scale.

Figure 1. Visual depiction of an inverse problem that inverts for the initial condition of a convection-diffusion problem and shows the power of randomized methods. 1a. The reference inverse solution via a traditional deterministic approach. 1b. The randomized inverse solution with left sketching. 1c. The randomized inverse solution with the randomized misfit approach. 1d. The randomized inverse solution with ensemble Kalman filtering. 1e. The randomized inverse solution with right sketching. Figure courtesy of [7].

Randomization is in fact a classical idea. One of the most famous instances is perhaps the Monte Carlo method, with which Enrico Fermi first experimented in the 1930s. The method is rooted in the law of large numbers, whose first proof traces back to Jacob Bernoulli in 1713. Here we argue that most randomized methods originate from the Monte Carlo approach or variants thereof. The statement of the law of large numbers—and hence of the Monte Carlo method itself—is very simple. If \(x_i\), \(i=1,...,N\), are independent and identically distributed (i.i.d.) according to a distribution \(\pi(x)\) over a domain \(\Omega\) with finite mean \(\bar{x}\) and variance \(\sigma^2\), then the following asymptotic result holds almost surely:

\[\lim_{N\rightarrow\infty}\frac{x_1+...+x_N}{N}=\bar{x} := \int_\Omega x \pi(x)dx.\]
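
As a small, self-contained illustration of this statement, the following Python snippet shows the sample mean settling onto the true mean; the choice of \(\pi\) as a standard exponential distribution (true mean equal to one) is purely an assumption for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw i.i.d. samples x_i ~ pi(x); here pi is the standard exponential
# distribution (true mean 1.0), an arbitrary illustrative choice.
N = 100_000
x = rng.exponential(scale=1.0, size=N)

# Law of large numbers: the sample mean approaches the true mean as N grows.
print(f"sample mean with N = {N}: {x.mean():.4f}  (true mean = 1.0)")
```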

The critical role of Monte Carlo simulations in science, engineering, and mathematics has been indisputable since the Manhattan Project, thanks to this asymptotic convergence. However, most modern randomized methods exploit the non-asymptotic side of Monte Carlo methods to solve complex and/or computationally expensive problems. If \(x\sim\pi(x)\) is a sub-Gaussian random variable with variance proxy \(\sigma^2\), the Chernoff inequality shows that for any \(t>0\),

\[\mathbb{P}\Bigg[\Bigg|\frac{x_1+...+x_N}{N}-\bar{x}\Bigg|<t\Bigg] \ge 1-2e^{-N\frac{t^2}{2\sigma^2}}.\]

That is, the sample mean is very (exponentially) concentrated around the mean \(\bar{x}\). In fact, the error in the sample mean is no larger than \(t\) with a probability of at least \(1-2e^{-N\frac{t^2}{2\sigma^2}}\). We can make the error as small as desired and the success probability as high as we want by increasing the number of samples \(N\). In particular, we can obtain the confidence interval \((-t,t)\) for the error between the sample mean and the true mean with a confidence level of \(1-\delta\) if we pick \(N \ge \frac{2\sigma^2}{t^2}\ln\Big(\frac{2}{\delta}\Big)\). One may object that, unlike deterministic approaches, randomized methods can fail; this is indeed true, albeit with small probability. In practice, the failure probability can be made so small that randomized methods are essentially deterministic for practical purposes. This concept is the premise of most—if not all—modern randomized algorithms, including randomized linear algebra. Another crucial feature of the Monte Carlo method (and of randomized methods in general) is that we can carry out the computations that are associated with each \(x_i\) in a completely parallel fashion since the \(x_i\) are i.i.d. Randomized methods are thus particularly well suited for current and future extreme-scale parallel computing systems.
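
To make the sample-size rule concrete, the short sketch below plugs in illustrative numbers; the use of a standard normal variable (variance proxy \(\sigma^2=1\)) and the chosen tolerances are assumptions for the example only.

```python
import numpy as np

# Sample-size rule N >= (2*sigma^2 / t^2) * ln(2/delta); the standard normal
# below (variance proxy sigma^2 = 1) and the tolerances are illustrative choices.
sigma2, t, delta = 1.0, 0.05, 1e-3

N = int(np.ceil(2.0 * sigma2 / t**2 * np.log(2.0 / delta)))
print(f"samples needed for error < {t} with confidence {1 - delta}: N = {N}")

# Empirical check: over many repetitions, the sample mean of N standard normals
# misses the true mean (zero) by more than t only very rarely.
rng = np.random.default_rng(0)
trials = 1_000
failures = sum(abs(rng.standard_normal(N).mean()) >= t for _ in range(trials))
print(f"empirical failure rate over {trials} trials: {failures / trials:.4f}")
```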

Intriguingly, we can derive many modern, fast, scalable, and robust randomized algorithms from Monte Carlo approximations. We have recently shown that one can employ Monte Carlo methods to unify many randomized algorithms [7], including sketching, the randomized singular value decomposition, stochastic gradient descent, ensemble Kalman filtering, simultaneous source encoding, the randomized Kaczmarz method, and randomized coordinate descent (see Figure 1). The beauty of this unification lies in the following takeaways: (i) any result—whether asymptotic or non-asymptotic—for the framework is immediately valid for any of the methods that the framework produces; and (ii) the framework both recovers existing methods and provides a systematic way for practitioners to discover new ones.
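
To convey the flavor of this Monte Carlo viewpoint (without reproducing any method from [7]), the toy example below applies Gaussian left sketching to a least-squares problem: because the sketch \(S\) satisfies \(\mathbb{E}[S^TS]=I\), the sketched misfit \(\|S(Ax-b)\|^2\) is a sample-average (Monte Carlo) estimate of the true misfit \(\|Ax-b\|^2\). All problem sizes and the sketch dimension are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tall, overdetermined least-squares problem min_x ||A x - b||^2.
# All dimensions below are hypothetical choices for illustration.
m, n, k = 5000, 50, 400            # k is the sketch size, with n < k << m
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true + 0.01 * rng.standard_normal(m)

# Gaussian left sketching: S has i.i.d. N(0, 1/k) entries, so E[S^T S] = I and
# ||S r||^2 is a Monte Carlo (sample-average) estimate of ||r||^2 for r = A x - b.
S = rng.standard_normal((k, m)) / np.sqrt(k)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

# Compare against the deterministic least-squares solution.
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
rel_err = np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact)
print(f"relative error of the sketched solution: {rel_err:.2e}")
```

Increasing the sketch size \(k\) plays the role of increasing the Monte Carlo sample count \(N\): it shrinks the (already small) probability that the sketched solution strays far from the deterministic one.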

We thus argue that probability is the language of the people, the language of science, and the language of modern scientific computation. 

If Mother Nature is random, so are our algorithms.


1 Though there are other mathematical approaches that convey uncertainty—like decision theory, fuzzy theory, etc.—we restrict ourselves to probability.

 

Acknowledgments: This work was partially funded by the National Science Foundation awards NSF-1808576, NSF-2108320, and NSF-CAREER-1845799; the Defense Threat Reduction Agency award DTRAM1802962; the Department of Energy awards DE-SC0018147 and 26-0839-89; and the 2020 ConTex Award. The author is grateful for the support.

References
[1] Buluc, A., Kolda, T.G., Wild, S.M., Anitescu, M., DeGennaro, A., Jakeman, J., … Zwart, P. (2021). Randomized algorithms for scientific computing (RASC). Preprint, arXiv:2104.11079.
[2] De Finetti, B. (1970). Theory of probability: A critical introductory treatment. New York, NY: Wiley.
[3] Jaynes, E.T. (2003). Probability theory: The logic of science. Cambridge, U.K.: Cambridge University Press.
[4] Kolmogorov, A.N. (1950). Foundations of the theory of probability. New York, NY: AMS Chelsea Publishing.
[5] Lad, F. (2003). Probability: The language of the people! ... The language of science?? In Conference on Science and Democracy 2. Napoli, Italy. 
[6] Motwani, R., & Raghavan, P. (1995). Randomized algorithms. Cambridge, U.K.: Cambridge University Press.
[7] Wittmer, J., Krishnanunni, C.G., Nguyen, H.V., & Bui-Thanh, T. (2021). On unifying randomized methods for inverse and optimization problems. In preparation.

Tan Bui-Thanh is an associate professor and the William J. Murray, Jr. Fellow in Engineering No. 4 in the Department of Aerospace Engineering and Engineering Mechanics, as well as the Oden Institute for Computational Engineering and Sciences, at the University of Texas at Austin. He is secretary of the SIAM Activity Group on Computational Science and Engineering and vice president of the SIAM Texas-Louisiana Section.
