Molecular Modeling & Dynamics

Molecular modeling and dynamics seeks to derive, represent, and manipulate the structures and reactions of molecules, as well as the properties that depend on their three-dimensional structures.


A GPU-accelerated code for calculating non-bonded forces (specifically, electrostatic surface potential) via algorithmic re-factoring and hardware/software co-design.
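As a point of reference, the quantity being accelerated can be written as a direct Coulomb sum over all atoms for every sample point on the surface. The sketch below is a plain, unoptimized baseline and not the GPU code itself; the function name `coulomb_potential`, the data layout, and the use of reduced units (Coulomb constant of 1) are assumptions made for illustration.

```python
import math

def coulomb_potential(points, atoms, k=1.0):
    """Direct-sum electrostatic potential at each sample point.

    points: list of (x, y, z) positions, e.g. on a molecular surface.
    atoms:  list of (x, y, z, q) atom positions and partial charges.
    k:      Coulomb constant; 1.0 means reduced units (an assumption).
    """
    potentials = []
    for point in points:
        # Each point's sum is independent of every other point's sum,
        # which is what makes this kind of calculation a good GPU candidate.
        phi = 0.0
        for ax, ay, az, q in atoms:
            r = math.dist(point, (ax, ay, az))
            phi += k * q / r
        potentials.append(phi)
    return potentials

# Two opposite charges at distances 1 and 2 from the origin:
# 1/1 + (-1)/2 = 0.5
phi = coulomb_potential([(0.0, 0.0, 0.0)],
                        [(1.0, 0.0, 0.0, 1.0), (0.0, 2.0, 0.0, -1.0)])
```

The cost grows with the number of points times the number of atoms, which is why both parallel hardware and algorithmic approximations matter for large structures.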


A GPU-accelerated version of NAB, which constructs models for helical and non-helical nucleic acids and provides a combination of rigid body transformations and distance geometry to create candidate structures that match input criteria.
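To illustrate what a rigid body transformation does when building a helical model, the sketch below stacks copies of a repeating unit, rotating each copy about the z-axis and raising it along that axis. It is not NAB code and does not use NAB's API; the function names are hypothetical, and the default twist (36 degrees) and rise (3.4 Å) are illustrative values typical of idealized B-form DNA.

```python
import math

def rigid_transform(coords, angle_deg, rise):
    """Rotate points about the z-axis by angle_deg, then shift them by rise along z."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y, s * x + c * y, z + rise) for x, y, z in coords]

def build_helix(unit, n_units, twist_deg=36.0, rise=3.4):
    """Stack n_units copies of `unit`, each twisted and raised relative to the previous copy."""
    helix = []
    current = list(unit)
    for _ in range(n_units):
        helix.extend(current)
        current = rigid_transform(current, twist_deg, rise)
    return helix

# One atom per unit, 10 Å from the helix axis; 11 units span one full turn.
helix = build_helix([(10.0, 0.0, 0.0)], 11)
```

Because the transformation is rigid, distances within each copied unit are preserved; only the unit's position and orientation change.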


The timescales and structure sizes accessible to atomistic molecular dynamics (MD) simulations can be advanced substantially by two independent techniques: (1) many-core parallelization with graphics processing units (GPUs) and (2) multiscale approximation with hierarchical charge partitioning (HCP). Efficient many-core parallelization on the GPU generally requires highly synchronized and regular computation. Multiscale methods, however, can result in highly asynchronous and irregular processing. One might therefore expect that running such multiscale algorithms on the GPU would cost performance, so that the total speedup would be less than the product of the individual speedups of the two techniques, i.e., a less-than-multiplicative speedup.
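The source of the irregularity can be seen in a deliberately simplified sketch of a multiscale charge approximation. This is not the authors' HCP implementation: it uses a single level of grouping, replaces each distant group by one point charge carrying the group's net charge at its geometric center, and works in reduced units. The function names, the data layout, and the `cutoff` parameter are assumptions for illustration.

```python
import math

def group_summary(group):
    """Return (geometric center, net charge) for a group of (x, y, z, q) atoms."""
    n = len(group)
    cx = sum(a[0] for a in group) / n
    cy = sum(a[1] for a in group) / n
    cz = sum(a[2] for a in group) / n
    return (cx, cy, cz), sum(a[3] for a in group)

def potential_multiscale(point, groups, cutoff):
    """Potential at `point`: exact sum for near groups, one point charge for far groups."""
    phi = 0.0
    for group in groups:
        center, q_net = group_summary(group)
        d = math.dist(point, center)
        if d > cutoff:
            # Far group: a single term replaces len(group) terms.
            phi += q_net / d
        else:
            # Near group: every atom contributes individually.
            for x, y, z, q in group:
                phi += q / math.dist(point, (x, y, z))
    return phi

far_group = [(100.0, 0.0, 0.5, 0.5), (100.0, 0.0, -0.5, 0.5)]
approx = potential_multiscale((0.0, 0.0, 0.0), [far_group], cutoff=10.0)
exact = potential_multiscale((0.0, 0.0, 0.0), [far_group], cutoff=1000.0)
```

The amount of work per group depends on a distance test, so neighboring threads on a GPU may take different branches and loop over different numbers of atoms. That data-dependent control flow is the kind of irregular processing the paragraph above refers to.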

To test this expectation in the context of atomistic MD, we designed and implemented our HCP multiscale method on NVIDIA GPU platforms. The HCP code was implemented in NAB (nucleic acid builder), the molecular dynamics module in the open-source AmberTools v1.4, and tested using the distance-dependent-dielectric implicit solvent model. We show that, for the HCP multiscale approximation and the common MD simulation model considered here, the degradation in performance due to asynchronous and irregular processing is mostly offset by a corresponding reduction in other asynchronous operations and slow global memory accesses. As a result, we realize near-multiplicative speedups. For example, for a 475,000-atom virus capsid we achieved an 11,071-fold combined speedup, only slightly less than the 11,706-fold multiplicative limit: 48.0-fold from parallelization on the GPU times 243.9-fold from the multiscale approximation. The overall speedup depends on structure size, with smaller structures seeing lower speedups. An additional benefit of the HCP implementation on the GPU is its reduced memory requirement, which allows the processing of structures much larger than would otherwise fit in the limited memory of the GPU.
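The arithmetic behind the capsid example can be checked in a few lines. The sketch below uses only the figures quoted above; the function and variable names are illustrative. Multiplying the two rounded factors gives about 11,707.2 rather than the reported 11,706, presumably because the quoted factors are themselves rounded.

```python
def multiplicative_limit(gpu_speedup, multiscale_speedup):
    """Upper bound on the combined speedup if the two techniques compose perfectly."""
    return gpu_speedup * multiscale_speedup

def fraction_of_limit(combined_speedup, limit):
    """Share of the multiplicative limit that the combined implementation achieves."""
    return combined_speedup / limit

# Figures quoted for the 475,000-atom virus capsid.
limit_from_factors = multiplicative_limit(48.0, 243.9)  # about 11707.2
achieved = fraction_of_limit(11071, 11706)              # about 0.946
```

By this measure, the combined implementation reaches roughly 94.6% of the reported multiplicative limit.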



Affiliated Sites: Virginia Tech · SEEC · Synergy

Last updated: January 2016