Nvidia updates CUDA programming platform for GPUs

New features aim to make better use of GPUs for scientific and engineering simulation.

Nvidia today released a major update to the CUDA parallel computing platform for its graphics processing units (GPUs). Nvidia says the new version offers three key enhancements that make parallel programming with GPUs easier, more accessible, and faster, helping computational biologists, chemists, physicists, geophysicists, other researchers, and engineers advance their simulations and research.

Key features:

  • A redesigned Visual Profiler with automated performance analysis, providing an easier path to application acceleration;
  • A new compiler based on the widely used LLVM open-source compiler infrastructure, which Nvidia says delivers a speedup of up to 10% in application performance;
  • Hundreds of new imaging and signal processing functions, doubling the size of the NVIDIA Performance Primitives (NPP) library.

The new Nvidia CUDA Visual Profiler simplifies performance optimization. (Source: Nvidia)

Developers and researchers who worked with the beta release of the update are enthusiastic about the new features:

  • “The new visual profiler is amazing,” said Joshua Anderson, lead developer of the HOOMD-blue open source molecular dynamics project. “With just a few clicks, it performs an automated performance analysis of your application, highlights likely problem areas, and then provides links to best-practice suggestions on improving them. It makes it quick and easy for virtually all developers to accelerate a broad range of applications.”
  • “The LLVM compiler gave me an almost immediate 10% performance speed up, just by recompiling my existing real-time financial risk analysis code,” said Gilles Civario, senior software architect at the Irish Centre for High-End Computing. “I can only imagine the additional performance gains I can achieve with additional tuning using the new CUDA release.”

The update is available at no charge on the Nvidia developer website.