An early introduction to general purpose computing on graphics processing units, as envisioned in 2007 by Nvidia and AMD. Originally published in Engineering Automation Report.
By Randall S. Newton
FEBRUARY 2007—For at least the last 18 months, the same message has been repeated by various CAD company executives when asked about future trends: “Get ready for multicore processors and get ready for Graphics Processing Units.” The field of graphics card vendors has dwindled in recent years, leaving Nvidia and ATI as both volume leaders and innovators. The intense competition between the two firms has propelled an explosion of graphics capability in recent years, leading to the current generation of GPUs.
In the last 30 days this competitive fervor took off in a new direction, as both ATI (now a division of CPU maker Advanced Micro Devices) and Nvidia introduced programmable environments that allow their graphics processing units (GPUs) to work on a wider variety of tasks. The new field of interest is called General Purpose Computing on Graphics Processing Units (GPGPU), and it promises to bring High Performance Computing (HPC) into the realm of common desktop and server-based computing.
The foundation for GPGPU lies in the competitive nature of GPU product development. Both Nvidia and ATI have created programmable vertex and pixel shaders. The initial purpose was to increase realism for gaming, and the technology was closely allied to Microsoft’s DirectX video computing environment. But the new programmability and accessibility brought side benefits, which will be harnessed with DirectX 10, to be released with Microsoft Windows Vista. DirectX 10 unifies the previously separate programming specifications for vertex, pixel, and geometry shaders, allowing for a single pool of computational resources. (Geometry shaders are new to DirectX 10.)
ATI went public with its plans first, introducing the Stream Computing Initiative on September 29, 2006. As the name indicates, ATI was announcing a new direction for the company, not a specific product. At the conference announcing Stream Computing, results from a risk assessment simulation showed a 16x performance gain, while an oil and gas seismic research study achieved a 20x gain. These gains are impressive, but they require a tight coupling of GPU, CPU, and software. The broad availability of programs using Stream Computing is still in the future.
Nvidia speaks second
Nvidia spoke second in this game of high-performance leap-frog, offering more than research lab results and a roadmap. On November 8, 2006, Nvidia introduced CUDA (Compute Unified Device Architecture). Like ATI’s Stream Computing, CUDA allows developers to combine the resources of the GPU with the CPU more effectively than is currently possible.
As with ATI, the obvious catch is that CUDA is an Nvidia-specific environment. Right now the only Nvidia graphics board that supports CUDA is the GeForce 8800, though more are on the way. Nvidia is nonetheless providing tools to make its proprietary approach more accessible than ATI’s, by introducing the first C compiler built specifically for a GPU. Nvidia claims its implementation of C makes existing streaming languages for GPU computing obsolete, a direct reference to the ATI approach.
A CUDA-enabled GPU operates either as a flexible thread processor, where thousands of computing programs called threads work together to solve complex problems, or as a streaming processor in specific applications such as imaging, where threads do not communicate. CUDA-enabled applications use the GPU for fine-grained, data-intensive processing, and the multicore CPU for complicated coarse-grained tasks such as control and data management. Thus a workstation with a CUDA-compliant GPU could throw much more computational horsepower at a problem; Nvidia says a CUDA-enabled computer could solve complex tasks up to 100 times faster than a typical workstation.
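The split between fine-grained GPU work and coarse-grained CPU work can be illustrated with a small sketch. This is a serial emulation in Python of the streaming case, where threads do not communicate; it is not CUDA code and does not use any Nvidia API. The names `scale_pixel_kernel`, `launch`, and `brighten` are hypothetical, chosen only for this illustration, and the pixel-brightening task is an assumed example rather than one taken from Nvidia.

```python
# Serial emulation of the streaming model; NOT CUDA code.
# All function names here are hypothetical and exist only for illustration.


def scale_pixel_kernel(thread_id, pixels, gain, out):
    """Fine-grained work: one thread handles exactly one pixel."""
    out[thread_id] = min(255, int(pixels[thread_id] * gain))


def launch(kernel, n_threads, *args):
    """Stand-in for the GPU scheduler: run the kernel once per thread index.

    A GPU would run these invocations in parallel. This loop runs them one
    after another, which gives the same result here because no thread reads
    another thread's output.
    """
    for thread_id in range(n_threads):
        kernel(thread_id, *args)


def brighten(pixels, gain):
    """Coarse-grained host (CPU) side: allocate the buffer, launch, collect."""
    out = [0] * len(pixels)
    launch(scale_pixel_kernel, len(pixels), pixels, gain, out)
    return out


if __name__ == "__main__":
    print(brighten([10, 100, 200], 1.5))  # prints [15, 150, 255]
```

The point of the sketch is the division of labor: the kernel contains only the per-element arithmetic, while the host function owns allocation and control flow. In the thread-processor mode the article also mentions, threads would additionally share intermediate results, which this emulation does not attempt to show.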
At one test site, researchers at the University of Massachusetts used the CUDA environment to composite a series of 2D X-ray images into 3D. The job took five hours on standard workstation hardware, but only five minutes when a CUDA-enabled GPU was included, a 60-fold speedup. Nvidia says future CUDA-enabled GPUs will include a Parallel Data Cache, the first generation of which will allow 128 processor cores, each running at 1.35GHz, to work as one.
While most of the buzz surrounding this technology will come from the gaming industry, ATI and Nvidia are both quick to point out that their respective approaches to GPU-enhanced high performance computing are solutions for enterprise applications.
For now, Nvidia seems to be the leader in the race to bring high performance computing to the desktop. The Nvidia implementation of the C programming language makes its approach more accessible to the software industry. Referring to the new GeForce 8800 and the CUDA initiative, graphics market analyst Jon Peddie told the Wall Street Journal today, “Nvidia has just done everything right.” In the meantime, ATI seems to be tying its HPC fortunes closely to CPU advancements from new owner AMD.