| December 01, 2014

SIAM’s CSC Workshop Series Marks 10th Year

The 2014 SIAM Workshop on Combinatorial Scientific Computing, held at the École Normale Supérieure in Lyon, France, July 21–23, was the sixth in a series that began ten years ago in San Francisco. True to CSC tradition, the 2014 workshop program comprised a wide range of combinatorial topics arising from many corners of scientific computing. Presenting recent results were a diverse set of speakers––PhD students, postdocs, early-career researchers, and well-established researchers, from academia, national laboratories, and industry; speakers from industry accounted for more than 10% of the talks.

The origins of CSC, which go back to well before 2004, are found in data analysis, sparse matrix computations, meshing, automatic differentiation, partitioning, parallel computing, and related areas. The 2014 workshop sessions covered many intriguing problems from these domains. The three invited speakers focused on two interesting research areas: randomized algorithms for numerical linear algebra and network analysis.

Network Analysis

Tamara Kolda (Sandia National Laboratories) opened the workshop with an invited talk on large-scale networks. Much research has gone into the development of theoretical models that generate graphs with properties similar to those of social networks, the web, and other applications. Different synthetic graphs do well on different similarity metrics; Kolda focused on reproducing degree distributions and clustering coefficients from empirical observations.

One newer model, the block two-level Erdős–Rényi (BTER) graph, hypothesizes that high clustering coefficients observed in real-world networks can be captured well by a collection of dense Erdős–Rényi graphs of various sizes and connectivity. The ER subgraphs form the fine level of the BTER graphs; on the coarse level, the nodes in the ER subgraphs are connected according to the Chung and Lu (CL) graph model, which captures the heavy-tailed degree distribution. CL connects the fine components without adversely affecting the community structure of any of the ER subgraphs [4].

In closing, Kolda considered the matching of BTER graphs to real-world data sets, as well as the analysis of time-dependent graphs. The subject of network analysis was by no means exhausted with Kolda’s talk. Subsequent speakers elaborated on the theme of network analysis, with discussions of scalable graph spectrum estimation, anomaly detection, \(b\)-matching, centrality computation, coarsening and partitioning methods, as well as software implementing many of these algorithms.

Randomized Algorithms

In the second invited talk Petros Drineas (Rensselaer Polytechnic Institute) discussed randomization in numerical linear algebra. Especially useful in the handling of large-scale matrices, random sampling can greatly reduce the size of the input problem. Drineas began by discussing matrix sampling methods that randomly select rows from a large input matrix to build a matrix of smaller dimension that retains the important information of the original. Early methods select rows uniformly at random in independent identically distributed trials; an example is the Kaczmarz method, as sketched out below.

More effective methods, however, do not select rows according to a uniform distribution. Instead, a row is selected with a probability proportional to its Euclidean norm or, even better, with respect to its so-called leverage score. One application, from a data analysis perspective, provides an alternative to the well-known singular value decomposition \(A = U{\Sigma}V^{T}\), where the singular vectors forming \(U\), \(V\) are hard to interpret in terms of the original data represented by the \(m \times n\) input matrix \(A\). Instead, columns from \(A\) can be sampled to form an \(m \times c\) matrix \(C, c \ll n\), for which an approximation \(A \approx CX\) can be constructed that strives to keep the error \(|| A – CX ||_F\) as close as possible to the error of an optimal low-rank representation, as given by a truncated SVD. The big advantage is that the matrix \(C\), formed from the original columns of \(A\), has a more natural interpretation than the SVD [1, 3].

The sampling of matrix columns (or rows) with preference for influential ones can also be applied to the solution of linear systems \(Ax = b\). The randomized Kaczmarz method iteratively refines an approximation of \(x\), using only information from a randomly chosen row \(a_i\) of \(A\). In line with the above, the original algorithm selects rows i.i.d. uniformly; sampling the \(i\)th row with probabilities proportional to \(|| a_{i} ||\), however, leads to faster convergence.

Bridging algorithms like the randomized combinatorial Kaczmarz algorithm and scientific computing, Sivan Toledo (Tel Aviv University) gave the final invited talk. The Kaczmarz algorithm solves symmetric and diagonally dominant linear systems \(Sx = b\) in near-linear time. In [2] Kelner et al. proposed a variant of the Kaczmarz method, describing their method from a physical point of view. The initial linear system \(S\) is rewritten in a graph formulation, such that the solution vector \(x\) can be retrieved from the solution of \(Lu = v\) for \(u\), where \(L\) is the Laplacian of the graph formulation of \(S\). With the graph formulation interpreted as an electrical flow problem, the system of equations \(Lu = v\) can be solved efficiently.

Kelner et al. recognized that the vector \(v\) in the above formulation has a very specific structure, which can be exploited for efficient solution by the randomized Kaczmarz algorithm through the use of a specifically constructed spanning tree of the derived graph. Toledo gave an alternative interpretation of the method, while relying solely on the graph formulation, column and null spaces, and orthogonal projections. In this way, he related the new Kaczmarz algorithm to earlier combinatorial preconditioning techniques.

Parallel Computing

Randomized algorithms like the Kaczmarz method often lend themselves well to parallelization, and parallel computing has always been one of the focus areas of combinatorial scientific computing. For the randomized combinatorial Kaczmarz algorithm, a parallel approach was presented after Toledo’s talk; indeed, about a third of the presentations at the workshop were related to parallel computing. Today, with the increasing core counts of processor architectures, including those in our hand-held devices, such a focus is hardly a surprise.

Social Computing

Kamesh Madduri (Pennsylvania State University) reminded us that a conference has both scientific and social aspects, by illustrating parallel shortest-path algorithms with a graph of the CSC workshop participants (Figure 1). Outside the busy workshop schedule, the breaks and lunches were indeed lively events and provided plenty of opportunities for discussions, scientific and otherwise. These intermissions were catered well with drinks and refreshments, while the lunches provided by ENS lived up to the French reputation for excellent cuisine.

Figure 1. CSC14 participants, part of the network shown in the graph at right, by Kamesh Madduri.

A book of abstracts of the workshop presentations is publicly available at http://hal.inria.fr/hal-01054876. More information about the workshop is available at http://www.siam.org/meetings/csc14/.

The workshop was supported financially by the LABEX MILYON (ANR-10-LABX-0070) of Université de Lyon within the program “Investissements d’Avenir” (ANR-11-IDEX-0007), operated by the French National Research Agency, and SIAM.

References
[1] C. Boutsidis, P. Drineas, and M. Magdon-Ismail, Near-optimal column-based matrix reconstruction, SIAM J. Comput., 43 (2014), 687–717.
[2] J.A. Kelner, L. Orecchia, A. Sidford, and Z.A. Zhu, A simple, combinatorial algorithm for solving SDD systems in nearly-linear time, Proc. Forty-Fifth Annual ACM Symposium on Theory of Computing, STOC ’13, 2013, 911–920.
[3] M.W. Mahoney and P. Drineas, CUR matrix decompositions for improved data analysis, Proc. Natl. Acad. Sci. USA, 106 (2009), 697–702.
[4] C. Seshadhri, T.G. Kolda, and A. Pinar, Community structure and scale-free collections of Erdős-Rényi graphs, Phys. Rev. E, 85 (2012), 056109.

Albert-Jan N. Yzelman is a researcher at the Exascience Lab in Leuven, Belgium. Bora Uçar is a CNRS researcher at the Laboratoire de l’Informatique du Parellélisme, École Normale Supérieure, Lyon.