Scaling soft matter physics to thousands of graphics processing units in parallel

Gray, Alan and Hart, Alistair and Henrich, Oliver and Stratford, Kevin (2015) Scaling soft matter physics to thousands of graphics processing units in parallel. International Journal of High Performance Computing Applications, 29 (3). pp. 274-283. ISSN 1741-2846 (https://doi.org/10.1177/1094342015576848)

Accepted Author Manuscript

Abstract

We describe a multi-graphics processing unit (GPU) implementation of the Ludwig application, which specialises in simulating a variety of complex fluids via lattice Boltzmann fluid dynamics coupled to additional physics describing complex fluid constituents. We describe our methodology in augmenting the original central processing unit (CPU) version with GPU functionality in a maintainable fashion. We present several optimisations that maximise performance on the GPU architecture through tuning for the GPU memory hierarchy. We describe how we implement particles within the fluid in such a way as to avoid a major divergence of the CPU and GPU codebases, whilst minimising data transfer at each time step. We detail our halo-exchange communication phase for the code, which exploits overlapping to allow efficient parallel scaling to many GPUs. We present results showing that the application demonstrates excellent scaling to at least 8192 GPUs in parallel, the largest system tested at the time of writing. The GPU version (on NVIDIA K20X GPUs) is around 3.5-5 times faster than the CPU version (on fully utilised AMD Opteron 6274 16-core CPUs), comparing equal numbers of CPUs and GPUs.
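
The overlapped halo exchange mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from Ludwig: the 1D periodic lattice, the three-point averaging update, and the function names step_global and step_decomposed are all assumptions chosen for illustration, and the code runs serially. It only shows the ordering that makes overlap possible: interior sites need no halo data, so they can be updated while the halo values are still in transit, and only the boundary sites wait for them.

```python
def step_global(u):
    """One periodic three-point averaging step on the whole 1D lattice."""
    n = len(u)
    return [(u[(i - 1) % n] + u[i] + u[(i + 1) % n]) / 3.0 for i in range(n)]


def step_decomposed(u, nparts):
    """The same step with the lattice split into nparts subdomains.

    Hypothetical illustration: each subdomain stands in for one GPU.
    """
    n = len(u)
    size = n // nparts
    # Assumption: equal-sized subdomains with at least two sites each.
    assert size * nparts == n and size >= 2
    parts = [u[p * size:(p + 1) * size] for p in range(nparts)]

    # Phase 1: start the halo exchange. A parallel code would post
    # non-blocking sends and receives here; this serial sketch just
    # records the edge values each periodic neighbour would send.
    halos = []
    for p in range(nparts):
        left = parts[(p - 1) % nparts][-1]
        right = parts[(p + 1) % nparts][0]
        halos.append((left, right))

    # Phase 2: update interior sites, which do not depend on halo data.
    # In a parallel code this work would overlap with the communication.
    new_parts = []
    for old in parts:
        new = [None] * size
        for i in range(1, size - 1):
            new[i] = (old[i - 1] + old[i] + old[i + 1]) / 3.0
        new_parts.append(new)

    # Phase 3: halos have arrived; update the two boundary sites.
    for old, new, (left, right) in zip(parts, new_parts, halos):
        new[0] = (left + old[0] + old[1]) / 3.0
        new[-1] = (old[-2] + old[-1] + right) / 3.0

    return [x for part in new_parts for x in part]


if __name__ == "__main__":
    lattice = [float(i * i % 7) for i in range(12)]
    # The decomposed update should reproduce the single-domain result.
    print(step_decomposed(lattice, 3) == step_global(lattice))
```

Splitting the update into interior and boundary phases is what allows the communication cost to be hidden; the more interior work there is per subdomain, the more of the exchange it can cover.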