Long-Range MD Electrostatics Force Computation on FPGAs
Published in: | IEEE transactions on parallel and distributed systems Vol. 35; no. 10; pp. 1690 - 1707 |
---|---|
Format: | Journal Article |
Language: | English |
Published: | IEEE, 01-10-2024 |
Summary: | Strong scaling of long-range electrostatic force computation, which is a central concern of long timescale molecular dynamics simulations, is challenging for CPUs and GPUs due to its complex communication structure and global communication requirements. The scalability challenge is seen especially in small simulations of tens to hundreds of thousands of atoms that are of interest to many important applications such as physics-driven drug discovery. FPGA clusters, with their direct, tightly coupled, low-latency interconnects, are able to address these requirements. For FPGA MD clusters to be effective, however, single device performance must also be competitive. In this work, we leverage the inherent benefits of FPGAs to implement a long-range electrostatic force computation architecture. We present an overall framework with numerous algorithmic, mapping, and architecture innovations, including a unified interleaved memory, a spatial scheduling algorithm, and a design for seamless integration with the larger MD system. We examine a number of alternative configurations based on different resource allocation strategies and user parameters. We show that the best configuration of this architecture, implemented on an Intel Agilex FPGA, can achieve 2124 ns and 287 ns of simulated time per day of wall-clock time for the two molecular dynamics benchmarks DHFR and ApoA1, simulating 23K and 92K particles, respectively. |
---|---|
ISSN: | 1045-9219 1558-2183 |
DOI: | 10.1109/TPDS.2024.3434347 |