Nvidia adds ARM platform support to CUDA GPU programming

The ARM-based device market is now roughly 10 times the size of the x86 market.

Nvidia continues its expansion of the CUDA parallel computing environment for graphics processors with a new version that works with ARM-based computers. Primarily used for mobile devices, the ARM-based ecosystem is the fastest growing computing category and is now approximately 10 times the size of the x86 CPU-based market.

Available today as a free download at http://developer.nvidia.com/cuda-toolkit, the new CUDA release gives programmers a platform for developing advanced science, engineering, mobile and high-performance computing (HPC) applications on both ARM and x86 CPU-based systems.

Nvidia says combining high-performance CUDA-enabled GPU accelerators with low-power ARM-based systems on a chip (SoCs) will enable ARM-based systems to penetrate new markets requiring the highest levels of energy-efficient compute performance. These market segments include defense systems, automotive, energy exploration, mobile computing, robotics, scientific research, HPC and others.

In addition to providing native support for ARM platforms, the CUDA 5.5 release delivers a variety of new performance and productivity features, including:

Enhanced Hyper-Q support – Now supported across multiple MPI processes on all Linux systems
MPI Workload Prioritization – Allows application developers to give CUDA streams on the critical path higher priority, optimizing overall application run time
New guided performance analysis – Visual Profiler and Nsight Eclipse Edition now walk developers step-by-step through the process of identifying performance bottlenecks and applying optimizations
Fast cross-compile on x86 – Reduces development time for large applications by enabling developers to compile ARM code on fast x86 processors, then transfer the compiled application to ARM
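
To make the stream prioritization item more concrete, here is a minimal sketch of what prioritizing a critical-path stream might look like. It assumes the feature is exposed through the CUDA runtime calls cudaDeviceGetStreamPriorityRange and cudaStreamCreateWithPriority; the kernel scale and the function runPrioritizedStreams are hypothetical names invented for illustration, not part of Nvidia's announcement.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only: scales each element in place.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Launches the same kernel on a high-priority and a low-priority stream.
// Returns true if every CUDA call succeeded.
bool runPrioritizedStreams() {
    // Query the supported range. Numerically lower values mean higher
    // priority; devices without priority support report the same value twice.
    int leastPriority = 0, greatestPriority = 0;
    if (cudaDeviceGetStreamPriorityRange(&leastPriority, &greatestPriority) != cudaSuccess)
        return false;

    cudaStream_t critical, background;
    if (cudaStreamCreateWithPriority(&critical, cudaStreamNonBlocking, greatestPriority) != cudaSuccess)
        return false;
    if (cudaStreamCreateWithPriority(&background, cudaStreamNonBlocking, leastPriority) != cudaSuccess) {
        cudaStreamDestroy(critical);
        return false;
    }

    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *a = NULL, *b = NULL;
    bool ok = cudaMalloc(&a, bytes) == cudaSuccess &&
              cudaMalloc(&b, bytes) == cudaSuccess;

    if (ok) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;

        // Background work is queued first; the scheduler may still favor
        // blocks from the critical stream. Priority is a hint, not a guarantee.
        cudaMemsetAsync(b, 0, bytes, background);
        scale<<<blocks, threads, 0, background>>>(b, n, 2.0f);

        cudaMemsetAsync(a, 0, bytes, critical);
        scale<<<blocks, threads, 0, critical>>>(a, n, 2.0f);

        ok = cudaGetLastError() == cudaSuccess &&
             cudaDeviceSynchronize() == cudaSuccess;
    }

    cudaFree(a);  // cudaFree(NULL) is a harmless no-op
    cudaFree(b);
    cudaStreamDestroy(critical);
    cudaStreamDestroy(background);
    return ok;
}

int main() {
    printf("prioritized streams %s\n", runPrioritizedStreams() ? "ok" : "failed");
    return 0;
}
```

Built with nvcc and run on a CUDA-capable GPU, the program only reports whether the calls succeeded. Any actual scheduling benefit depends on the hardware and on how much the two streams contend for the GPU.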

Our take

Since developers started using CUDA in 2006, successive generations of better, exponentially faster CUDA GPUs have boosted the performance of applications on x86-based systems. The number and variety of high-performance computing (HPC) applications using CUDA continue to grow. The addition of the ARM ecosystem means millions more processors can be harnessed for the HPC environment.

The addition of ARM to the CUDA environment is also a bit self-serving, since Nvidia’s Tegra is an ARM-based SoC. But we’ll give them a pass on this one, since the expansion of CUDA will benefit other systems as well.