SIAM News Blog

Two New SciDAC Institutes Promote Mathematical Tools and Software Technology for High-Performance Computing

By Gail Pieper, Karen Devine, Esmond G. Ng, Leonid Oliker, and Robert Ross

Bigger is often said to be better, and the newest extreme-scale computers—which have millions of processing units—are certainly bigger. Moreover, the breadth of science performed at the U.S. Department of Energy’s (DOE) computing facilities continues to expand with the emergence of new technologies like artificial intelligence (AI). While these exciting advances create new opportunities for scientific discovery, they also raise novel questions for researchers who want to exploit these advances and tackle more complex problems. Will my simulation code be able to utilize the accelerators in extreme-scale computing systems? Can I take advantage of the deepening memory hierarchy in heterogeneous processors? Is there a way around the bottlenecks that are caused by the widening ratio of peak floating-point operations per second to input/output bandwidth? How can I effectively manage huge amounts of data? Can I analyze data in situ, or must I transfer it to offline storage for later analysis?

To address such questions, DOE announced that it will provide 57.5 million dollars over the next five years for two multidisciplinary teams—FASTMath and RAPIDS2—to harness supercomputers for scientific discovery. The teams are part of the Scientific Discovery through Advanced Computing (SciDAC) program and are thus called SciDAC institutes.

A Brief Background of SciDAC

The SciDAC program began in 2001 with the goal of accelerating scientific discovery using high-performance computing (HPC). SciDAC is a joint effort that involves DOE’s Office of Nuclear Energy and the six major program offices within the Office of Science (SC): Advanced Scientific Computing Research, Basic Energy Sciences, Biological and Environmental Research, Fusion Energy Sciences, High Energy Physics, and Nuclear Physics. It addresses problems in disciplines such as high energy and nuclear physics, condensed matter physics, materials science, chemistry, fusion energy sciences, and Earth systems research.

SciDAC aims to ensure that scientists from national laboratories, universities, and other research organizations take full advantage of DOE’s HPC resources. Its key objective is to bridge the gap between mathematics and computer science research and domain science research, potentially enabling significant advances in scientific discovery. Since its establishment, the SciDAC program has enjoyed tremendous success and produced significant achievements that range from new insights into the actions of supernovae to combustion improvements that reduce pollution.

Now in its fifth five-year cycle, SciDAC is recognized worldwide as a leading force in accelerating the use of HPC to advance the state of scientific knowledge.

The Two New SciDAC Institutes

In March 2020, as SciDAC-4 approached its end, DOE announced a plan to establish multidisciplinary teams to develop new tools and techniques in mathematics and computer science. These teams, which are part of SciDAC-5, will harness state-of-the-art supercomputers for scientific discovery and take advantage of DOE supercomputing facilities at Argonne, Oak Ridge, and Lawrence Berkeley National Laboratories. Following an open competition, DOE revealed in August 2020 that it had selected two institutes to be funded under the SciDAC-5 program: FASTMath and RAPIDS2.

Both FASTMath and RAPIDS2 involve large research collaborations between academia and national laboratories. Members of the two institutes have a significant record of successful collaboration within both SciDAC and the larger scientific community over the last 15 years; indeed, strong collaboration between FASTMath and RAPIDS was a highlight of SciDAC-4.

Impact on scientific applications remains the primary focus of both SciDAC-5 institutes. As such, FASTMath and RAPIDS2 researchers will engage with application developers and domain experts in science-focused SciDAC partnerships and DOE SC projects to address some of the most complex computational problems that are of interest to DOE. They will build on their suites of high-quality software, placing particular focus on lowering the barriers to achieving high performance and high productivity on DOE computers.

Both institutes are active in outreach to the broader scientific community. Through activities such as summer schools, tutorials, and workshops, the teams train the scientific computing community to leverage their software and help educate the next generation of computational mathematicians and scientists.

The FASTMath Institute

The FASTMath Institute, led by Esmond Ng (Lawrence Berkeley National Laboratory) and Karen Devine (Sandia National Laboratories), is committed to providing robust mathematical techniques and expertise that enhance the performance and effectiveness of scientific simulations. FASTMath pursues three key goals:

  • Deliver highly performant software with strong software engineering that runs efficiently on current and next-generation advanced computer architectures at major DOE computing facilities
  • Work closely with domain scientists to share the FASTMath team’s mathematical and machine learning knowledge and deploy its software in large-scale modeling and simulation codes
  • Build and support the broader computational mathematics and computational science communities across the DOE complex.

Several mathematical and computational challenges require the proficiency of a team like FASTMath. For example, the integration of graphics processing units into emerging exascale computers means that computational scientists must redesign their software in order to take full advantage of these accelerators. To tackle multiscale multiphysics problems, researchers need to quantify the uncertainty in their computations and be assured of higher fidelity. Domain scientists also have to leverage new technologies, such as machine learning, in modeling and simulation workflows to quickly and accurately analyze the torrents of generated data. FASTMath’s efforts to address these issues will span eight technical areas: structured mesh discretization, unstructured mesh discretization, time integration, linear and nonlinear equation solvers, eigensolvers, numerical optimization, uncertainty quantification, and data analytics. Machine learning is a cross-cutting theme among these topical fields, with planned activities that include numerical methods for machine learning and employment of machine learning techniques to optimize the usage of FASTMath software in applications.

The FASTMath team comprises more than 50 mathematicians from five national laboratories (Argonne, Lawrence Berkeley, Lawrence Livermore, Oak Ridge, and Sandia) and five universities (Massachusetts Institute of Technology, Rensselaer Polytechnic Institute, Southern Methodist University, University of Colorado Boulder, and University of Southern California). Many FASTMath researchers also participate in SciDAC-4 partnerships and DOE SC base math projects, thus enabling them to incorporate research developments into new tools for deployment in scientific applications.

The RAPIDS2 Institute

The RAPIDS2 Institute, led by Robert Ross (Argonne National Laboratory) and Leonid Oliker (Lawrence Berkeley National Laboratory), seeks to provide high-performance computer science and data management tools to help DOE SC’s application teams, which use leadership computing resources to achieve scientific breakthroughs. To accomplish this objective, the institute has identified the following goals:

  • Solve computer science, data, and AI technical challenges for SciDAC and DOE science teams
  • Engage and work directly with SC scientists and facilities to identify needs and deploy new technologies
  • Coordinate with other DOE computer science and applied mathematics activities, as well as the DOE Exascale Computing Project, to maximize impact on DOE science.

RAPIDS2 will build on the successes of the SciDAC-4 RAPIDS project and expand into several new areas of broad impact. It specifically addresses four technology thrusts:

  1. Data understanding, including ensemble analysis and feature detection 
  2. HPC platform readiness, including heterogeneous programming and autotuning 
  3. Scientific data management, including workflow automation, storage systems, and input/output  
  4. AI, including representation learning and surrogate modeling. 

These thrusts provide a toolbox of advanced computation, information, and data science technologies that focus on the common challenges of scientific applications. AI is a particularly exciting cross-cutting technology because of its potential to transform numerous scientific domains that utilize HPC, such as materials science, high energy physics, and chemistry.

RAPIDS2 brings together researchers from five national laboratories (Argonne, Lawrence Berkeley, Lawrence Livermore, Los Alamos, and Oak Ridge); six universities (Northwestern University, Ohio State University, Rutgers University, University of Delaware, University of Florida, and University of Oregon); and one software research and development company (Kitware, Inc.). These collaborators offer a broad range of expertise and a strong history of success in engagement with DOE scientists.

The Power of Two

By combining the knowledge and experience of mathematicians and computer scientists who work jointly with domain scientists, the FASTMath and RAPIDS2 institutes complement each other and collectively cover a wide spectrum of scientific computing needs. The groups’ researchers are confident that they possess the prowess, algorithms, software, and computational tools to provide end-to-end solutions that will help application developers satisfy the SciDAC mission of advancing scientific discoveries through modeling and simulation on DOE’s most advanced computers. Scientists who are interested in collaborating with the FASTMath and RAPIDS2 teams should contact the institute leaders or any institute members. 

The DOE press announcement is available online, as is a list of lead and partner institutions for the two teams. 

Gail Pieper is senior coordinator of technical editing and writing in the Mathematics and Computer Science Division at Argonne National Laboratory (ANL). Karen Devine is deputy director of the FASTMath Institute and a distinguished member of technical staff in the Center for Computing Research at Sandia National Laboratories. Esmond G. Ng is director of the FASTMath Institute and a senior scientist in the Computational Research Division at Lawrence Berkeley National Laboratory (LBNL). Leonid Oliker is deputy director of the RAPIDS2 Institute and a senior scientist in the Computational Research Division at LBNL. Robert Ross is director of the RAPIDS2 Institute and a senior computer scientist in the Mathematics and Computer Science Division at ANL.
