| November 12, 2020

Can Scientific Computing and Combinatorial Mathematics Work Together?

Solving Real-world Problems in Computational and Data Science

Many challenges are associated with the modern mathematical problems that arise from different application areas of computational and data science. In the era of large-scale computing and big data analysis, one prominent difficulty is the sheer size of the problems. In addition, these problems typically have roots in multiple scientific disciplines, meaning that the process of finding efficient solutions requires a high degree of cross-disciplinary thinking. But instead of extending this potentially long list of challenges that arise during the solution of real-world problems, we will address a topic that researchers do not discuss as frequently: the pressing need to combine elements from scientific computing with techniques from combinatorial mathematics, which could alleviate some issues in computational and data science.

Unfortunately, numerical analysis and combinatorics are not currently interwoven in any meaningful way. This disparate gap exists in both research and education. It is difficult to find a suitable mathematical venue that publishes advancements in the combination of numerical methods and discrete structures, or a university with a curriculum that deliberately exploits the synergies between continuous and discrete mathematics. However, an emerging scientific discipline known as combinatorial scientific computing (CSC) is attempting to ameliorate this unfortunate situation. CSC aims to link combinatorial structures, graph theory, and algorithmic computer science with numerical analysis, differential equations, and scientific computing.

Figure 1. Cevdet Aykanat illustrated a recursive bipartitioning tree that partitions the current hypergraph \(H_\textrm{cur}\) into two hypergraphs \(H_L\) and \(H_R\).

CSC-based research typically involves elements from the following cyclic workflow:

Identifying a scientific computing problem in a relevant application area
Representing this problem via a combinatorial or graph model
Describing the corresponding graph problem
Finding an algorithmic approach to its solution
Executing a theoretical and/or experimental analysis of this approach.

The SIAM workshop series on CSC is a top-tier forum during which participants present original research on the design, implementation, application, and evaluation of combinatorial algorithms and data structures that arise from scientific computing problems in computational science, computational engineering, and data science. The 2020 SIAM Workshop on Combinatorial Scientific Computing (CSC20) took place this past February in Seattle, Wash. The workshop was co-located with the SIAM Conference on Parallel Processing for Scientific Computing.

CSC20 featured three invited talks over the course of the meeting’s three days. Cevdet Aykanat (Bilkent University) presented a survey entitled “Models for Optimizing Multiple Communication Cost Metrics in Parallelizing Irregular Applications.” The goal of these hypergraph models is to adequately represent and then minimize the communication time on distributed-memory computers. For irregular applications in computational and data science, the models capture both the volume of communicated data (bandwidth cost) and the number of messages that users send and receive between processes (latency cost). In essence, this process reduces the problem of minimizing communication costs in parallel computing to the problem of partitioning different hypergraph models with suitable objective functions and constraints (see Figure 1).

Figure 2. Rob Bisseling spoke about a transparent thermoplastic in a computed tomography scanner at Centrum Wiskunde & Informatica in Amsterdam, the Netherlands. The scanner (left) and detector (right) can move horizontally and vertically while the object (middle) can move along all coordinate axes.

During a talk entitled “Parallel Tomographic Reconstruction – Where Combinatorics Meets Geometry,” Rob Bisseling (Utrecht University) explored X-ray computed tomography, wherein researchers use a finite number of projections of an object—taken by a scanner—to reconstruct an estimate of that object (see Figure 2). If one employs a simultaneous iterative reconstruction technique and high spatial resolution, the resulting sparse matrices require storage on the order of petabytes. A better approach therefore relies on repeated matrix-vector products without storing the matrices explicitly. Different matrix partitionings handle the balance of computational work between parallel processes. Despite their potential to result in low communication cost, combinatorial partitioning methods tend to be computationally intractable for this tomography application. However, partitioning techniques that are inspired by combinatorial ideas and account for the underlying geometry incur slightly larger communication costs but are considered to be practically feasible.

The third invited speaker, Bruno Ribeiro (Purdue University), presented an overview on “Unearthing the Relationship Between Graph Isomorphism, Graph Neural Networks, and Matrix Factorization” (see Figure 3). He hopes to extend neural networks from input with a canonical orientation (such as images, text, and speech) to input with relational graph data that have no frame of reference (like molecules and social and biological networks). An important question in this endeavor concerns how to find suitable graph representations and their relations to probability distributions. One possible approach is to consider graph representations that are defined through invariant theory. In particular, a technique called relational pooling provides a framework for representations that are not only invariant to graph isomorphism but also capture maximal expressiveness.

Figure 3. Bruno Ribeiro demonstrated a low-distortion embedding that approximately preserves the distances between vertices in a graph \(d_A(u,v)\) and in a metric space \(d_e(Z_u, Z_v)\).

The CSC20 Best Paper Award went to “A Multilevel Mesh Partitioning Algorithm Driven by Memory Constraints,” which was co-authored by Cédric Chevalier, Franck Ledoux, and Sébastien Morais. The paper tackles a new problem in scientific computing, introduces a novel combinatorial model, designs and implements fresh algorithms, conducts extensive comparison with other approaches, and includes convincing experimental results.

The detailed workshop program, as well as the workshop proceedings and links to the slides, are available online. The CSC workshop series began in 2004 and will continue as an integral part of the new SIAM Conference on Applied and Computational Discrete Algorithms, which will debut in July 2021 and is co-located with the 2021 SIAM Annual Meeting. This conference is sponsored by the new SIAM Activity Group on Applied and Computational Discrete Algorithms.

Attendees of the 2020 SIAM Workshop on Combinatorial Scientific Computing, which took place in Seattle, Wash., this February.

Martin Bücker is a professor of computer science at Friedrich Schiller University Jena in Germany. Sherry Li is a senior scientist and group leader of the Scalable Solvers Group in the Computational Research Division at Lawrence Berkeley National Laboratory.