Parallel Implementations and Applications of Next-generation FFT Algorithms at PP22

By Samar Aseeri

During the 2022 SIAM Conference on Parallel Processing for Scientific Computing (PP22), which took place virtually in February, Daisuke Takahashi (University of Tsukuba), Franz Franchetti (Carnegie Mellon University), and Samar Aseeri (King Abdullah University of Science and Technology) organized a minisymposium about the fast Fourier transform (FFT). The session was titled “Next Generation FFT Algorithms in Theory and Practice: Parallel Implementations and Applications” (see Figure 1). The trio has been coordinating minisymposia at SIAM meetings for the last five years, alternating between PP and the Conference on Computational Science and Engineering, as part of the parallel FFT library community formation initiative that started in 2017. This initiative aims to facilitate the exchange of information between researchers who are working on various FFT algorithms and to examine the algorithms’ performance on different parallel hardware. More information about the session, including slides from the presentations, is available online.

PP22 was initially scheduled as a hybrid event, but the Omicron variant of COVID-19 meant that the conference took place fully virtually. Despite the change of plans, the PP22 Organizing Committee put together a successful virtual meeting. One advantage of a virtual event is its flexibility; attendees were able to join multiple sessions and interact with many other speakers and participants.

Takahashi opened the minisymposium by introducing its intentions. He then presented a talk on FFT computation with the number-theoretic transform, which works in modular integer arithmetic rather than with the complex or real numbers that researchers commonly utilize. Such transforms may be beneficial for encryption and multi-precision computations. His approach employed OpenMP and vector intrinsics to parallelize the algorithm kernels. Takahashi concluded his talk by demonstrating a good speed-up of the algorithm on the Intel Xeon Platinum 8280M Processor, which has 28 cores (see Figure 2).
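To make the idea of a number-theoretic transform concrete, the following is a minimal serial sketch in Python. It is not the implementation from the talk, which relied on OpenMP and vector intrinsics; the modulus `MOD`, the primitive root `ROOT`, and the helper names `ntt` and `multiply` are illustrative assumptions chosen because they are common textbook parameters. The butterflies have the same structure as in a radix-2 complex FFT, but every operation is exact integer arithmetic modulo a prime.

```python
# Illustrative sketch only: a radix-2 number-theoretic transform (NTT).
# MOD and ROOT are assumed textbook parameters, not values from the talk.
MOD = 998244353  # prime equal to 119 * 2**23 + 1
ROOT = 3         # a primitive root modulo MOD


def ntt(values, invert=False):
    """Return the NTT (or inverse NTT) of a list whose length is a power of two."""
    a = list(values)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")

    # Bit-reversal permutation, as in an iterative Cooley-Tukey FFT.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    length = 2
    while length <= n:
        # w_len is a primitive length-th root of unity modulo MOD.
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)  # modular inverse via Fermat
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % MOD
                a[k] = (u + v) % MOD
                a[k + half] = (u - v) % MOD
                w = w * w_len % MOD
        length <<= 1

    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        a = [x * n_inv % MOD for x in a]
    return a


def multiply(p, q):
    """Multiply two polynomials with coefficients taken modulo MOD."""
    out_len = len(p) + len(q) - 1
    size = 1
    while size < out_len:
        size <<= 1
    fa = ntt(p + [0] * (size - len(p)))
    fb = ntt(q + [0] * (size - len(q)))
    product = ntt([x * y % MOD for x, y in zip(fa, fb)], invert=True)
    return product[:out_len]
```

For example, `multiply([1, 2, 3], [4, 5])` computes the coefficients of (1 + 2x + 3x^2)(4 + 5x). Because no floating-point rounding is involved, this kind of convolution is a natural fit for the multi-precision and encryption workloads mentioned above.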

Franchetti offered an update on FFTX, which builds upon the SPIRAL code generator tool that was presented in earlier editions of this minisymposium (see Figure 3). He introduced FFTX’s first official release and explained how it enables a tuned, performance-portable FFT on a wide range of platforms. Tutorials about its usage should appear on SPIRAL’s website in the coming months.

Aseeri discussed the overhead of a variety of performance monitoring tools on the Shaheen II Cray XC40 system when profiling an FFT-based solver for the Klein-Gordon equation [1]. The solvers in question were based on 2DECOMP&FFT and FFTE to allow a comparison of the two libraries’ performance. Aseeri also displayed initial results from a similar study on the Fugaku supercomputer, which ranks as #1 on the Top500, Graph500, and HPCG lists.

Doru Thom Popovici (Lawrence Berkeley National Laboratory) closed out the minisymposium by providing a detailed account of the implementation of the pencil decomposition for three-dimensional FFTs. This decomposition is more scalable than the slab decomposition for scientific applications on large systems with thousands of nodes. Popovici also discussed the elemental cyclic distribution with partial computation and presented performance results, which demonstrated that this approach is even more scalable than the pencil decomposition (see Figure 4). More details of this work are available in [2].
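The scalability difference between slab and pencil decompositions can be illustrated with a small index calculation. The sketch below is not taken from Popovici's framework and does not cover the elemental cyclic distribution; the function names and the simple block partitioning are assumptions made for illustration. For an n x n x n grid, a slab decomposition splits only one axis, so at most n ranks can hold data, whereas a pencil decomposition splits two axes over a process grid and can use up to n * n ranks.

```python
# Illustrative sketch only: which part of an n x n x n grid each rank owns.
# All function names here are hypothetical, not part of any FFT library.

def block_range(n, parts, index):
    """Half-open range owned by block `index` when n points are split into `parts`."""
    base, extra = divmod(n, parts)
    start = index * base + min(index, extra)
    return start, start + base + (1 if index < extra else 0)


def slab_extent(n, ranks, rank):
    """Slab: split only the first axis; each rank holds complete 2D planes."""
    return block_range(n, ranks, rank), (0, n), (0, n)


def pencil_extent(n, p_rows, p_cols, rank):
    """Pencil: split the first two axes over a p_rows x p_cols process grid.

    The third axis stays entirely local, so 1D FFTs along it need no communication.
    """
    row, col = divmod(rank, p_cols)
    return block_range(n, p_rows, row), block_range(n, p_cols, col), (0, n)


def max_useful_ranks(n):
    """Upper bound on the number of ranks that can own at least one grid line."""
    return {"slab": n, "pencil": n * n}
```

As a usage example, `pencil_extent(8, 2, 4, 5)` returns the index ranges owned by rank 5 on a 2 x 4 process grid, and `max_useful_ranks(512)` shows that a 512^3 problem is limited to 512 ranks with slabs but can in principle be spread over 262144 ranks with pencils.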

We thank both the participants and the attendees of this minisymposium for a successful session.

[1] Leu, B., Aseeri, S., & Muite, B. (2021). A comparison of parallel profiling tools for programs utilizing the FFT. In The international conference on high performance computing in Asia-Pacific region companion (HPC Asia 2021) (pp. 36-45). Association for Computing Machinery. 
[2] Popovici, D.T., Schatz, M.D., Franchetti, F., & Low, T.M. (2020). A flexible framework for multidimensional DFTs. SIAM J. Sci. Comput., 42(5), C245-C264. 

Samar Aseeri is a computational scientist at King Abdullah University of Science and Technology (KAUST) in Saudi Arabia.