SIAM News Blog

Computational Mechanics: Three Decades of Applications and Extensions

By James P. Crutchfield

In part I of this article, published in the October issue of SIAM News, the author provided a thorough introduction to computational mechanics that documented the field’s multifaceted origin.

My interest in computational mechanics began as a fascination with mainframe computers in the 1960s and information theory in the 1970s. I worked in Silicon Valley for a number of years, first at IBM (at what was to become its Almaden Research Center) on information storage technology — specifically magnetic bubble devices. This was followed by an assignment at Xerox’s Palo Alto Research Center, which at the time was busily inventing our current computing environment of packet-based networks (Ethernet), internet protocols, graphical user interfaces, file servers, bitmap displays, mice, and personal workstations. An active member of the Homebrew Computer Club, I hand-built a series of microcomputers: 4-bit, 8-bit, and eventually 16-bit machines. I suggested and helped code the first cellular automaton simulator on a prototype 6502 (8-bit) microcomputer, which ultimately became the Apple I. During this time, I met many people who would later become titans of modern information technology.

As a student at the University of California, Santa Cruz (UCSC), I learned about the mathematics of computers and communication theory directly from information theory pioneer David Huffman, who was well known for his 1950s work on minimal machines — specifically machine synthesis. Huffman’s pioneering research was an integral part of his discrete mathematics and information theory courses. I learned computer architecture from Harry Huskey, a chief engineer for the first U.S. digital computers (ENIAC and EDVAC) who also taught at UCSC. In short, thinking about computing and its physical substrates went hand in hand with my physics training in statistical mechanics and my mathematics training in dynamical systems theory. This theme drove the bulk of my research on chaotic dynamics.

With this background in mind, I will now address the immediate concerns of nonlinear physics in the 1980s. As computers decreased in size and cost, they became increasingly accessible research tools. In the late 1970s and early 1980s, this revolution inspired the burgeoning field of nonlinear dynamics. Computer simulations, unlike abstract existence proofs, allowed us to directly examine and interact with the solutions of complex nonlinear systems. The new tools thus revealed an unexplored universe of exquisitely complex, highly ramified structures and unpredictable behaviors that had remained relatively abstract throughout most of the 20th century.

Randomness emerged spontaneously — though paradoxically, we knew and had programmed the underlying equations of motion. This presented several deep challenges. What is randomness? Can we quantify it? Can we extract the underlying equations of motion from observations? Was each and every nonlinear system, in the vast space of all systems, going to require its own “theory”?

In the 1970s, the nonlinear physics community identified a target problem—fluid turbulence—with which to probe these questions, along with a testable hypothesis: the Ruelle-Takens conjecture that strange attractors were the internal mechanism driving turbulence. This formalized meteorologist Edward Lorenz’s earlier proposal on deterministic nonperiodic flow: nonlinear instability was generally responsible for the weather’s unpredictability and fluid turbulence.

But a confounding problem remained. On the one hand, we had time series of measurements of the fluid velocity at a point in a flow. On the other, we had the abstract mathematics of strange attractors — complicated manifolds that circumscribed a system’s instability. How could we connect them? Proposals to use the measured time series to “reconstruct” the system’s effective state space solved the conundrum. This concept involved extracting the attractor’s geometry from a time series. Such reconstruction methods created an effective state space in which to look at the chaotic attractors and quantitatively measure their degrees of instability and attendant complexity. This was finally verified experimentally in 1983, overthrowing the decades-old Landau-Lifshitz multiple incommensurate-oscillator view of turbulence.
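The reconstruction idea can be made concrete with delay-coordinate embedding, the standard technique behind these methods. A minimal sketch, assuming an illustrative scalar series from the logistic map and illustrative embedding parameters (not the settings of the original experiments):

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Map a scalar series x into delay vectors
    (x[t], x[t+tau], ..., x[t+(dim-1)*tau])."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

# A chaotic scalar "measurement" series: the logistic map x -> 4x(1-x).
x = np.empty(1000)
x[0] = 0.4
for t in range(len(x) - 1):
    x[t + 1] = 4.0 * x[t] * (1.0 - x[t])

# Two-dimensional reconstructed state space; plotting these points
# traces out the attractor's geometry from measurements alone.
pts = delay_embed(x, dim=2, tau=1)
```

In practice the embedding dimension and delay must be chosen from the data; here they are fixed by hand for brevity.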

Figure 1. Varieties of the \(\epsilon\)-machine. 1a. Finite-state hidden Markov model (HMM). 1b. Infinite mixed state: countable HMM and uncountable HMM (fractal and continuum). 1c. \(\epsilon\)-machine for discrete-event, continuous-time hidden semi-Markov processes. Figure courtesy of James Crutchfield.

Reconstructing a chaotic attractor from a time series became a widely used technique for identifying and quantifying deterministic chaotic behavior, thus inspiring the field of nonlinear time-series modeling. Yet reconstruction fell short of concisely expressing a system’s internal structure. Could we extend reconstruction to extract the system’s very equations of motion, enabling us to robustly predict chaotic behavior? A method to reconstruct equations of motion from a data series provided the answer.
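In the simplest setting, reconstructing equations of motion amounts to regressing the next state onto a chosen function basis of the current state. A hedged sketch, assuming a logistic-map series and a polynomial basis (both illustrative choices, not the original method's):

```python
import numpy as np

# "Observed" series from the logistic map x -> r x (1 - x) with r = 3.9.
x = np.empty(500)
x[0] = 0.3
for t in range(len(x) - 1):
    x[t + 1] = 3.9 * x[t] * (1.0 - x[t])

# Regress x[t+1] onto the polynomial basis {1, x, x^2} by least squares.
A = np.column_stack([np.ones(len(x) - 1), x[:-1], x[:-1] ** 2])
coef, *_ = np.linalg.lstsq(A, x[1:], rcond=None)
# coef recovers the map's coefficients, approximately (0, 3.9, -3.9).
```

With a mismatched basis the fit degrades badly and offers no hint of a better one, which is exactly the representation-dependence at issue.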

This method worked quite well when one happened to choose a mathematical representation that matched the class of nonlinear dynamics generating the behavior. But without the correct representational “basis,” it both failed miserably and fell short of revealing how and where to look for a better technique. Thus, even this approach to modeling complex systems was inherently subjective in the choice of representation. Structural complexity remained elusive.

How could one remove this subjectivity? The solution came from a metaphor to the classification scheme for automata developed in discrete computation theory. In the 1950s and 60s, the mathematics of formal languages and automata led to a structural hierarchy of representations that ranged from devices that used finite memory to infinite memories organized in different architectures — tapes, stacks, queues, counters, and the like.

Could we do this for continuous chaotic systems? The answer was framed as a predictive equivalence relation developed from the geometry-of-a-time-series concept of reconstructed state space and adapted to an automata-theoretic setting. The equivalence relation defined a new kind of state, a distribution of futures conditioned on past trajectories in the reconstructed state space. These were the causal states whose resulting probabilistic automata were called \(\epsilon\)-machines. In this way, many notions of information processing and computing became applicable to nonlinear physics.
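The causal-state construction can be illustrated on the Golden Mean process (binary sequences in which a 1 is never followed by a 1), a standard example in the computational-mechanics literature. The sketch below groups length-3 pasts by their empirical next-symbol distributions; the history length, sample size, and rounding tolerance are illustrative choices:

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)

# Sample the Golden Mean process: a 1 is never followed by a 1;
# otherwise 0 and 1 are emitted with equal probability.
n = 200_000
s = np.empty(n, dtype=int)
s[0] = 0
for t in range(1, n):
    s[t] = 0 if s[t - 1] == 1 else rng.integers(0, 2)

# Empirical next-symbol distribution conditioned on each length-3 past.
L = 3
counts = defaultdict(lambda: [0, 0])
for t in range(L, n):
    counts[tuple(s[t - L : t])][s[t]] += 1

# Causal states: pasts with the same predictive distribution are
# equivalent.  Rounding absorbs sampling noise.
states = defaultdict(list)
for past, (c0, c1) in counts.items():
    states[round(c1 / (c0 + c1), 1)].append(past)
# Two causal states emerge: pasts ending in 1 (next symbol surely 0)
# and pasts ending in 0 (next symbol fair-coin distributed).
```

Only the final symbol of the past matters here, so the many distinct histories collapse to two causal states, which is the compression the equivalence relation delivers.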


Figure 2. \(\epsilon\)-machines in space. 2a. Raw spacetime diagram of elementary cellular automaton 18 evolving from a random initial condition. 2b. Local causal-state spacetime fields calculated from the predictive equivalence relation over past and future lightcones reveal domains and particles. Figure courtesy of Adam Rupe.

Pithy notation, apt representation, and even the right C++ object class can greatly simplify problem solving. As minimal sufficient statistics, the \(\epsilon\)-machine causal states play just this role when working with and analyzing complex processes. Figure 1 displays a variety of \(\epsilon\)-machines, from discrete-time discrete-state to infinite-state and continuous-time processes. Local causal states are adapted to spatiotemporal processes and detect spacetime invariant sets (domains) whose broken symmetries (particles) propagate (see Figure 2).
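The raw spacetime data behind Figure 2 is easy to regenerate. A minimal sketch of elementary cellular automaton 18 evolving from a random initial condition (the local causal-state filtering that reveals domains and particles is beyond this snippet; lattice size and run length are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
width, steps, rule = 64, 32, 18

# Lookup table: output bit for each 3-cell neighborhood (l, c, r),
# read as the 3-bit index 4*l + 2*c + r.
lut = np.array([(rule >> i) & 1 for i in range(8)])

field = np.empty((steps, width), dtype=int)
field[0] = rng.integers(0, 2, width)
for t in range(1, steps):
    l = np.roll(field[t - 1], 1)    # left neighbor (periodic boundary)
    r = np.roll(field[t - 1], -1)   # right neighbor
    field[t] = lut[4 * l + 2 * field[t - 1] + r]
# Each row of `field` is one time slice of the spacetime diagram.
```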

In the mysterious realm of quantum physics, determining a system’s quantum causal states allows one to measure quantum structural complexity. Comparison to classical statistical complexity reveals a profound ambiguity: quantum physics and classical physics do not agree on the meaning of structure. That said, quantum \(\epsilon\)-machines are typically smaller, so future quantum computers will likely provide helpful compression and reduced communication costs.

When combined with recent advances in nanoscale nonequilibrium thermodynamics, \(\epsilon\)-machines offer the most efficient way to physically implement information processing (see Figure 3). Moreover, the degree of chaos (dynamical instability measured by the Kolmogorov-Sinai entropy rate) controls the rate at which an engine converts disorganized thermal energy into useful work. Additionally, the second law of thermodynamics—when properly extended to account for structure in information reservoirs—determines the minimum amount of energy required to perform a given computation.
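The last point generalizes Landauer's principle, whose original bound is easy to state numerically. A minimal illustration (room temperature is an arbitrary choice):

```python
import math

k_B = 1.380649e-23              # Boltzmann constant, J/K (exact in SI)
T = 300.0                       # room temperature, K

# Landauer bound: erasing one bit of information dissipates
# at least k_B * T * ln(2) of energy as heat.
E_min = k_B * T * math.log(2)   # ~2.87e-21 J per erased bit
```

The structure-aware extensions mentioned above tighten or loosen this bound depending on the correlations stored in the information reservoir.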

Conveniently, originally vague intuitive notions about complex systems can now be clearly spelled out. For instance, we now say that a system is unpredictable if it has a positive entropy rate: \(h_u>0\). A system is considered complex if it has positive structural complexity: \(C_\mu>0\). And a system is emergent if its structural complexity increases over time: \(C_\mu(t^\prime)>C_\mu(t)\), if \(t^\prime>t\). Finally, we can monitor the amount of internal structure hidden from us if its crypticity is positive: \(\chi=C_\mu-\mathbf{E}>0\), where excess entropy \(\mathbf{E}\) is the shared information between past and future.
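For the Golden Mean process (no consecutive 1s), a standard example, these quantities can be computed exactly from its two-state \(\epsilon\)-machine. A sketch, assuming the usual parameterization in which state A emits 0 or 1 equiprobably and state B, reached after a 1, must emit 0:

```python
import numpy as np

def H(p):
    """Shannon entropy in bits of a probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Golden Mean epsilon-machine: A -> A on 0 (p=1/2), A -> B on 1 (p=1/2),
# B -> A on 0 (p=1).  State-to-state transition matrix:
T = np.array([[0.5, 0.5],
              [1.0, 0.0]])

# Stationary state distribution pi solves pi T = pi.
w, v = np.linalg.eig(T.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()                       # pi = (2/3, 1/3)

C_mu = H(pi)                         # statistical complexity, ~0.918 bits
h_u = pi[0] * H([0.5, 0.5]) + pi[1] * H([1.0])  # entropy rate, 2/3 bit/symbol
```

Since \(C_\mu>0\) and \(h_u>0\), this process is both structured and unpredictable in the article's sense.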

Figure 3. Thermodynamic computing. A stochastic Turing engine consisting of an information ratchet that processes symbols on a tape using an input-output \(\epsilon\)-machine. Figure courtesy of Alexander Boyd.

Most constructive, and perhaps most telling about causal states, is the manner in which they gave rise to exact methods. Spectral decomposition techniques recently derived from functional calculus yield exact, closed-form expressions for all informational and computational properties of processes generated by \(\epsilon\)-machines.

Pattern Discovery

Computational mechanics was thus extended and applied far beyond its initial conception 30 years ago. Even so, laying the foundations of a fully-automated “artificial science”—in which theories are automatically built from raw data—remains a challenge. Though the benefits are tantalizing, doing so remains an ambitious goal and its research program is still incomplete. For example, delineating the way in which the causal equivalence relation induces a structural hierarchy of complex processes, rather analogous to the Chomsky hierarchy in discrete symbolic computation theory, is largely unexplored.

That said, the biggest scientific challenges are the recent discoveries that classical and quantum simplicity are ambiguous and that the most fundamental difficulties arise from our lack of knowledge of statistical dependency beyond pairwise correlation. In short, Shannon-Kolmogorov information cannot capture all types of statistical dependency and is mute on many facets of complex system organization. Though these are open problems, I believe that they are also answerable challenges. Perhaps past successes and the style of computational mechanics will inspire their solution.

Acknowledgments: I thank the Santa Fe Institute, the Institute for Advanced Study at the University of Amsterdam, the Telluride Science Research Center, and the California Institute of Technology for their hospitality during visits. This material is based on work supported by, or in part by, Foundational Questions Institute grant FQXi-RFP-1609, the U.S. Army Research Laboratory and the U.S. Army Research Office under contract W911NF-13-1-0390 and grant W911NF-18-1-0028, and via Intel Corporation support of CSC as an Intel Parallel Computing Center.

James P. Crutchfield teaches nonlinear physics at the University of California, Davis, directs its Complexity Sciences Center, and promotes science interventions in nonscientific settings. He is mostly concerned with patterns: what they are, how they are created, and how intelligent agents discover them. His website is
