# Mathematics at a Historic Transition in Biology

By Guo-Wei Wei

Biology concerns the structure, function, development, and evolution of living organisms. The field underwent a dramatic transformation from macroscopic to microscopic (i.e., “molecular”) in the 1960s and assumed an omics dimension around the dawn of the millennium. Understanding the rules of life is the major mission of the biological sciences in the 21st century. Technological advances have fueled the exponential growth of biological data, which in turn has paved the way for biology to undertake another historic transition: from qualitative, phenomenological, and descriptive to quantitative, analytical, and predictive. Such a transition provides both unprecedented opportunities and grand challenges for mathematicians.

Comprehending structure-function relationships in biomolecules, which is the holy grail of biophysics, is also a recurring challenge in biology. Mathematical apparatuses, including simplicial geometry, differential geometry, differential topology, algebraic topology, geometric topology, knot theory, spectral graph theory, and topological graph theory, are essential for deciphering such relationships [1]. Geometric modeling bridges the gap between biological data and mathematical models, and is paramount for the conceptualization of biomolecules and their interactions. Topology dramatically simplifies biological complexity and provides insightful high-level abstractions of biological data (see Figure 1) [2, 3]. Graph theory goes beyond topological connectivity and incorporates harmonic analysis and optimization theory to explore biomolecular structure-function relationships.

**Figure 1.** Basic simplexes (left) and protein persistent barcodes (right). Image credit: Zixuan Cang.
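To make the idea of a persistent barcode concrete, the following is a minimal sketch of the simplest case: 0-dimensional persistence (connected components) of a point cloud under a growing distance threshold. It is not the method used to produce Figure 1; the function name `h0_barcode` and the toy coordinates are hypothetical, chosen only for illustration.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return 0-dimensional persistence bars (birth, death) for a point cloud.

    Every point is born at filtration value 0. As the distance threshold
    grows, each edge that joins two different connected components ends
    (kills) one of them at that edge's length.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length (the filtration order).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j      # merge the two components
            bars.append((0.0, length))   # one component dies at this scale
    if n > 0:
        bars.append((0.0, math.inf))     # the last component never dies
    return bars


# Hypothetical toy data: two tight clusters that are far apart.
toy_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 0.0), (11.0, 0.0)]
barcode = h0_barcode(toy_points)
for birth, death in barcode:
    print(f"[{birth}, {death})")
```

In this toy example the short bars reflect points merging within a cluster, while the single long finite bar records the scale at which the two clusters join. Higher-dimensional bars (loops, cavities) require a full simplicial complex and boundary-matrix reduction, which this sketch does not attempt.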

A striking feature of living organisms is their multiscale nature and tremendous complexity. Subcellular organelles, molecular machines, and the dynamics and transport of biomolecules in living organisms (such as membrane transport, signal transduction, transcription, and translation) are vital to cellular functions and cannot be described simply by atom-free or molecule-free phenomenological models. However, at the atomic scale these systems have intractable numbers of degrees of freedom. Multiscale modeling and analysis using quantum mechanics (QM), molecular mechanics (MM), and continuum mechanics (CM) offer an effective reduction in their dimensionality [4, 5]. The differential geometry theory of surfaces gives rise to a natural separation between microscopic and macroscopic domains [6]. Partial differential equations (PDEs) such as the Schrödinger equation, the Poisson-Boltzmann equation, and the elasticity equation, together with Newton’s equations of motion, variational analysis, homogenization, differential geometry, and persistently stable manifolds, underpin multiscale QM/MM/CM modeling of extremely large biological systems [6, 7]. One can utilize conservation laws, stochastic analysis, and uncertainty quantification to reveal how individual biomolecular behavior is related to experimental measurements.
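As one concrete instance of the PDEs mentioned above, the nonlinear Poisson-Boltzmann equation for a solvated biomolecule in a 1:1 electrolyte is commonly written in the dimensionless form below. The notation is an assumption made for illustration (Gaussian units, following a widely used convention) and is not taken from this article.

$$
-\nabla \cdot \bigl(\epsilon(\mathbf{r}) \nabla u(\mathbf{r})\bigr) + \bar{\kappa}^2(\mathbf{r}) \sinh u(\mathbf{r}) = \frac{4\pi e_c^2}{k_B T} \sum_{i=1}^{N} z_i\, \delta(\mathbf{r} - \mathbf{r}_i)
$$

Here $u = e_c \phi / (k_B T)$ is the dimensionless electrostatic potential, $\epsilon(\mathbf{r})$ is the dielectric coefficient (which jumps across the molecular surface), $\bar{\kappa}(\mathbf{r})$ is the modified Debye-Hückel parameter (zero inside the solute), and $z_i$ are the partial charges, in units of the elementary charge $e_c$, located at atomic positions $\mathbf{r}_i$. The atomic detail enters only through the point charges and the surface that separates the two dielectric regions, which is how the continuum description reduces the number of degrees of freedom.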

Researchers do not sufficiently understand how various macromolecular complexes interact and give rise to cellular functions and biological pathways, e.g., metabolic, genetic, and signal transduction pathways [8]. Differential equations, combinatorics, probabilistic graphs, random matrix theory, statistical models, and algebraic geometry are the main workhorses for describing interactive biological networks, such as protein-protein interaction, gene regulation, and enzyme kinetics networks. Systems biology approaches often involve mechanistic models, like flux balance analysis and chemical kinetics, to reconstruct dynamical systems from the quantitative properties of their elementary building blocks. Structural bioinformatics and computational biophysics predict reaction flux, rate, and equilibrium constants of biological networks. Mathematical techniques using geometry, topology, and graph theory have a competitive edge in structural bioinformatics [9, 10].
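The chemical kinetics approach mentioned above can be sketched with a small mass-action model. The example below integrates the textbook enzyme reaction E + S ⇌ ES → E + P with a classical fourth-order Runge-Kutta step; the rate constants, initial concentrations, and function names are hypothetical values chosen for illustration, not data from any study cited here.

```python
def rhs(state, k_on, k_off, k_cat):
    """Mass-action rates for E + S <-> ES -> E + P."""
    e, s, es, p = state
    bind = k_on * e * s      # E + S -> ES
    unbind = k_off * es      # ES -> E + S
    convert = k_cat * es     # ES -> E + P
    return (
        -bind + unbind + convert,  # d[E]/dt
        -bind + unbind,            # d[S]/dt
        bind - unbind - convert,   # d[ES]/dt
        convert,                   # d[P]/dt
    )


def rk4_step(state, dt, *rates):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def shifted(base, slope, h):
        return tuple(b + h * m for b, m in zip(base, slope))

    k1 = rhs(state, *rates)
    k2 = rhs(shifted(state, k1, dt / 2), *rates)
    k3 = rhs(shifted(state, k2, dt / 2), *rates)
    k4 = rhs(shifted(state, k3, dt), *rates)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, t_end, dt, k_on, k_off, k_cat):
    """Integrate from t = 0 to t_end with a fixed step size dt."""
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt, k_on, k_off, k_cat)
    return state


# Hypothetical parameters: (E, S, ES, P) concentrations in arbitrary units.
initial = (1.0, 10.0, 0.0, 0.0)
final = simulate(initial, t_end=50.0, dt=0.01, k_on=1.0, k_off=0.5, k_cat=1.0)
print("E, S, ES, P at t = 50:", final)
```

Two quantities are conserved by construction, total enzyme (E + ES) and total substrate (S + ES + P), which gives a quick sanity check on the integration. Larger networks follow the same pattern with more species and reactions, although stiff systems generally call for an implicit solver rather than fixed-step RK4.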

In the post-omics era, the availability of high-throughput sequencing strategies has given rise to genomics, proteomics, and metabolomics. Omics aims at the integrative study of a whole set of biomolecular information that translates into the structure, function, and evolution of living organisms. One major challenge comes with predicting phenomics from genomics aided by ChIA-PET and/or trait information. Another challenge is understanding genome evolution due to gene-gene and gene-environment interactions. Statistical methods, such as longitudinal study, causal analysis, statistical inference, fuzzy logic, boosting, and regression, play a vital role in analyzing omic data sets. Machine learning is another powerful tool for revealing the genotype-to-phenotype mapping.

The development of accurate, efficient, and robust computational algorithms, methods, and schemes is a prerequisite for the implementation of mathematical approaches to biological modeling, analysis, and prediction. The importance of numerical methods in solving PDEs is gradually being appreciated by the biological community [11, 12]. Computational geometry is an important aspect of structural biology and biophysics. Computational topology analyzes the intriguing topology of complex biomolecules, such as topological invariants of proteins and knot invariants of nucleosomes and chromosomes [13]. The development of efficient graph theory algorithms is crucial for the description of biomolecular binding [14].

Rational drug design is an imperative life science problem that ultimately tests our understanding of biological systems. Designing efficient drugs for curing diseases is one of the most challenging tasks in the biological sciences. Mathematics plays a vital role in hot-spot prediction, drug pose analysis, binding affinity prediction, structure optimization, toxicity analysis, and pharmacokinetic simulation. For example, the integration of machine learning with multiscale weighted colored graphs and multicomponent persistent homology provided the best free energy ranking for Set 1 (Stage 2) in D3R Grand Challenge 2, a worldwide competition in computer-aided drug design. It is expected that most new drugs in the next decade will be initiated by artificial intelligence.

**References**

[1] Xia, K.L., & Wei, G.W. (2016). A review of geometric, topological and graph theory apparatuses for the modeling and analysis of biomolecular data. *arXiv:1612.01735 [q-bio.BM]*, 1-76.

[2] Darcy, I.K., & Vazquez, M. (2013). Determining the topology of stable protein-DNA complexes. *Biochemical Society Transactions, 41*, 601-605.

[3] Heitsch, C., & Poznanovic, S. (2014). Combinatorial insights into RNA secondary structure. In N. Jonoska and M. Saito (Eds.), *Discrete and Topological Models in Molecular Biology*, 145-166.

[4] Eisenberg, B.S., Hyon, Y.K., & Liu, C. (2010). Energy variational analysis of ions in water and channels: Field theory for primitive models of complex ionic fluids. *Journal of Chemical Physics, 133*, 104104.

[5] Baker, K., Chen, D., & Cai, W. (2016). Investigating the Selectivity of KcsA Channel by an Image Charge Solvation Method (ICSM) in Molecular Dynamics Simulations. *Communications in Computational Physics, 19*, 92-943.

[6] Wei, G.W. (2010). Differential geometry based multiscale models. *Bulletin of Mathematical Biology, 72*, 1562-1622.

[7] Zhou, Y.C., Holst, M.J., & McCammon, J.A. (2008). A nonlinear elasticity model of macromolecular conformational change induced by electrostatic forces. *Journal of Mathematical Analysis and Applications, 340*, 135-164.

[8] Komarova, N.L., Zou, X., Nie, Q., & Bardwell, L. (2005). A theoretical framework for specificity in cell signaling. *Molecular Systems Biology, 1*(1).

[9] Demerdash, O.M.A., Daily, M.D., & Mitchell, J.C. (2009). Structure-based predictive models for allosteric hot spots. *PLOS Computational Biology, 5*, e1000531.

[10] Cang, Z.X., & Wei, G.W. (2017). TopologyNet: Topology based deep convolutional and multi-task neural networks for biomolecular property predictions. *PLOS Computational Biology, 13*(7), e1005690.

[11] Zhao, S. (2012). Pseudo-time-coupled nonlinear models for biomolecular surface representation and solvation analysis. *International Journal for Numerical Methods in Biomedical Engineering, 27*, 1964-1981.

[12] Geng, W.H., & Krasny, R. (2013). A treecode-accelerated boundary integral Poisson-Boltzmann solver for continuum electrostatics of solvated biomolecules. *Journal of Computational Physics, 247*, 62-87.

[13] Schlick, T., & Olson, W.K. (1992). Trefoil knotting revealed by molecular dynamics simulations of supercoiled DNA. *Science, 257*(5073), 1110-1115.

[14] Nguyen, D.D., Xiao, T., Wang, M.L., & Wei, G.W. (2017). Rigidity strengthening: A mechanism for protein-ligand binding. *Journal of Chemical Information and Modeling, 57*, 1715-1721.

Guo-Wei Wei is a professor of mathematics at Michigan State University.