Parallel Sparse and Conventional FFTs: Applications and Implementation

By Samar Aseeri

Researchers utilize the fast Fourier transform (FFT) algorithm for a variety of applications. However, it is not widely used on emerging platforms such as ARM processors, graphics processing units (GPUs), and distributed-memory systems. The FFT has a high communication-to-computation ratio, which may make it difficult to achieve a high floating-point operation rate. Nevertheless, it remains effective for many problems where time to solution is the metric of importance.
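The communication-to-computation ratio can be made concrete with a rough back-of-the-envelope estimate. The sketch below is illustrative only and rests on stated assumptions: it uses the conventional 5 N log2 N flop estimate for a complex FFT of N points, 16 bytes per double-precision complex value, and the pessimistic simplification that each global transpose sends the entire array across the network once. The function names are hypothetical and do not come from any of the libraries discussed here.

```python
import math


def fft3d_flops(n):
    """Conventional flop estimate, 5 * N * log2(N), for a complex FFT of N = n**3 points."""
    total_points = n ** 3
    return 5.0 * total_points * math.log2(total_points)


def transpose_bytes(n, transposes=2, bytes_per_element=16):
    """Bytes sent over the network, assuming every global transpose moves the
    whole n**3 array once (an upper bound: in practice the share of data that
    stays on its own process is not sent)."""
    return transposes * (n ** 3) * bytes_per_element


def flops_per_byte(n, transposes=2):
    """Arithmetic performed per byte communicated, under the assumptions above."""
    return fft3d_flops(n) / transpose_bytes(n, transposes)


if __name__ == "__main__":
    for n in (256, 1024, 4096):
        print(n, round(flops_per_byte(n), 2))
```

Under these assumptions, a 1024-cubed transform with two global transposes performs fewer than five floating-point operations per byte communicated, and the figure grows only logarithmically with problem size. This is one way to see why network bandwidth, rather than arithmetic throughput, tends to limit distributed FFTs.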

A two-part minisymposium at the 2020 SIAM Conference on Parallel Processing for Scientific Computing (PP20), which took place this February in Seattle, Wash., featured presentations by eight researchers who are working on parallel FFT algorithms and implementations. Entitled "Parallel Sparse and Conventional FFTs, Applications and Implementation," the minisymposium facilitated an exchange of information between relevant parties, addressed challenges in the field, and considered the development of more efficient parallel FFT software for current and upcoming hardware.

Slides for most of the talks are available online. The speakers represent institutions and companies that operate notable computing facilities and/or are involved in scientific computing projects. The following is an overview of the minisymposium speakers and the material they presented.

Franz Franchetti (Carnegie Mellon University) discussed the design of FFTX, which extends and updates FFTW for the exascale era and is used in the SpectralPack code generators that target spectral applications. SpectralPack is in its preliminary phase and is part of the Department of Energy’s Exascale Computing Project. It has been trained to deliver optimal performance on large platforms.

Andrew Canning (Lawrence Berkeley National Laboratory) spoke about his 20 years of experience with FFT codes from a materials science and chemistry perspective. He presented the best methods for performing three-dimensional (3D) FFTs in materials science and chemistry codes on exascale architectures. Canning also demonstrated the development of 3D FFTs for plane wave codes in the FFTX project.

Dmitry Pekurovsky (University of California, San Diego) focused on frameworks for highly scalable multidimensional spectral transforms. He talked about the background of this work, P3DFFT++ design choices, and performance considerations that are necessary to build a typical use scenario.

Daisuke Takahashi (University of Tsukuba) reported the implementation results of a 3D real FFT with two-dimensional decomposition on Intel Xeon Phi clusters. This approach has achieved a promising level of performance.
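To illustrate what a two-dimensional ("pencil") decomposition of a 3D real FFT involves, the following sketch computes the local array shape that one process would hold at each of the three transform stages. It is a generic layout under stated assumptions, not the implementation presented in the talk: the process grid is `p_rows` by `p_cols`, the real-to-complex transform is applied along x first (leaving `nx // 2 + 1` complex values because of Hermitian symmetry), and all function names are hypothetical.

```python
def block_size(n, parts, index):
    """Size of block `index` when n items are split into `parts` near-equal blocks."""
    base, extra = divmod(n, parts)
    return base + (1 if index < extra else 0)


def pencil_shapes(nx, ny, nz, p_rows, p_cols, row, col):
    """Local (x, y, z) shapes held by process (row, col) at each stage.

    Stage 1: x-pencils of real input; x is complete, so 1D real-to-complex FFTs run along x.
    Stage 2: after a transpose among the processes that share `col`, y is complete.
    Stage 3: after a transpose among the processes that share `row`, z is complete.
    """
    nxh = nx // 2 + 1  # complex length along x after the real-to-complex transform
    x_pencil = (nx, block_size(ny, p_rows, row), block_size(nz, p_cols, col))
    y_pencil = (block_size(nxh, p_rows, row), ny, block_size(nz, p_cols, col))
    z_pencil = (block_size(nxh, p_rows, row), block_size(ny, p_cols, col), nz)
    return x_pencil, y_pencil, z_pencil


if __name__ == "__main__":
    # An 8 x 8 x 8 real grid on a 2 x 2 process grid, viewed from process (0, 0).
    for stage in pencil_shapes(8, 8, 8, 2, 2, 0, 0):
        print(stage)
```

Each change of stage corresponds to an all-to-all exchange within one row or column of the process grid, so a forward transform requires two such exchanges. In exchange, the layout can use up to roughly N squared processes for an N-cubed grid, whereas a one-dimensional (slab) decomposition is limited to about N.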

Samar Aseeri (King Abdullah University of Science and Technology) demonstrated Dragonfly network configurations, which she tested on the Shaheen II supercomputer, that align the network's all-to-all connections with the all-to-all Message Passing Interface (MPI) communications of the FFTK library.

Lisandro Dalcin (King Abdullah University of Science and Technology) presented a novel method that performs the communication of parallel FFTs within a single collective call by using advanced MPI features.

John Bowman (University of Alberta) discussed the key features of FFTW++, the package he built on top of FFTW that simplifies work in C++. It offers more efficient implicit dealiasing of Fourier-based convolutions, and better MPI support, than the package used for FFTW. FFTW++ also employs a general multithreaded implementation that accepts an arbitrary number of input and output vectors.
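For readers unfamiliar with dealiasing, the sketch below shows the conventional explicit approach that implicit dealiasing is designed to improve upon: both inputs are zero-padded before they are transformed, so that the circular convolution produced by the Fourier method equals the desired linear convolution. This is not the FFTW++ algorithm or API. It is a minimal illustration with hypothetical function names, and it uses a naive O(n^2) discrete Fourier transform in place of an FFT to stay self-contained.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [
        sum(values[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def circular_convolution(f, g):
    """Fourier-based convolution of equal-length inputs with no padding (wraps around)."""
    product = [a * b for a, b in zip(dft(f), dft(g))]
    return [v.real for v in dft(product, inverse=True)]


def padded_convolution(f, g):
    """Linear convolution via explicit zero padding to length len(f) + len(g) - 1."""
    n = len(f) + len(g) - 1
    f_padded = list(f) + [0] * (n - len(f))
    g_padded = list(g) + [0] * (n - len(g))
    return circular_convolution(f_padded, g_padded)


if __name__ == "__main__":
    print([round(v, 6) for v in padded_convolution([1, 2, 3], [4, 5])])
    print([round(v, 6) for v in circular_convolution([1, 2, 3], [4, 5, 0])])
```

Without padding, the last term of the linear result wraps around and contaminates the first entry, which is the aliasing error. Explicit padding removes that error at the cost of transforming and storing the added zeros.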

Malcolm Roberts (AMD, Canada) highlighted the design, features, and use of rocFFT, AMD’s open-source GPU FFT library. He identified projects that are currently utilizing rocFFT, which is primarily targeted at applications that run on exascale systems like Frontier.

Eight speakers shared their work in the "Parallel Sparse and Conventional FFTs, Applications and Implementation" minisymposium at the 2020 SIAM Conference on Parallel Processing for Scientific Computing, which took place earlier this year in Seattle, Wash.

Samar Aseeri is a computational scientist at King Abdullah University of Science and Technology.