Vendors are adding power with partners such as Intel, Fusion-io, AMD, and Nvidia.
There was a time when we thought add-in boards for speeding up rendering, analysis, or physics had little future, but now the reverse seems true. Ironically, Nvidia was one company that seemed poised to kill the accelerator board market as it positioned the GPU for all tasks. Now, it’s just a matter of definition.
The big movie companies dominate Siggraph these days, but the work done behind the scenes in the papers benefits the graphics industry in general. There were many, many papers dedicated to new approaches to taking advantage of multicore processing and using OpenCL to get the most out of the hardware. Co-processor acceleration was promised by the hardware companies five years ago. Now the market is really ready for it. As has been the case every year for the past few years, there are new rendering companies springing up. They’re the direct outgrowth of the use of GPUs for ray tracing. It’s a really challenging market for all these companies, but they’re helping make rendering a more practical capability for more users. People who wouldn’t think of taking the time to render a part they were working on are trying it out because they’re getting tools that make it push-button easy. We’re seeing that technology leap ahead, and right behind it the same kinds of optimizations are being applied to particle systems and fluid simulation. 2012 is truly a watershed year.
Nvidia’s Maximus systems are a variation on the theme of refinement. At Siggraph 2012, the company was bringing out its baddest, meanest Kepler processors for use on its Maximus systems first. Maximus is a software smart bridge that couples Quadro GPUs with Tesla GPUs in workstations so the Teslas can go to work on CUDA-enhanced applications such as physics, simulation, and analysis, while the Quadros handle more traditional graphics duties. Maximus is a multi-purpose accelerator. Nvidia calls it its deskside supercomputing product.
Maximus got a cautious roll-out last year as Nvidia encouraged partners to step up and optimize for the platform. This year Maximus 2 is built with a Quadro K5000 and a Tesla K20; both are Kepler-based and built on TSMC’s 28-nm process. The company says it has gone from seven early adopters for Maximus to 22 partners for Maximus 2. The major hardware OEMs have signed up, including Dell, Hewlett-Packard, Lenovo, and Fujitsu, as have specialist companies like Boxx Technologies and Supermicro. At Siggraph, we didn’t see any new applications over those we saw at NAB. The lineup includes Adobe’s video tools, Ansys analysis, Autodesk’s content creation software, Chaos’s rendering, Bunkspeed’s rendering, Dassault’s Catia, MathWorks, and Paradigm. There will be plenty more products out on this front when Nvidia is actually shipping products. Maximus will officially ship in December with the arrival of the Tesla K20. The Quadro K5000 products will be available in October.
Intel in the wings with the Xeon Phi
Intel has been working on its answer to Maximus, now named the Xeon Phi, which is due to arrive in the very near future. It has been working hard to understand this market as it emerges, and, truthfully, all the companies are. There is much to be done on the hardware side and the software side, but by the end of 2012, we will see powerful hardware options lined up and waiting for the software companies to take advantage of what’s available. At the heart of Intel’s approach is, of course, x86 technology. It had threatened its graphics processor competitors with the Larrabee processor for graphics, but it has retrenched with a more general-purpose approach in its coming Knights Corner co-processor.
Intel has been working hard on its Many Integrated Core (MIC) architecture, which it describes as a design with 50+ cores capable of one teraflops of real-world performance. Intel revealed its strategy and product branding at the International Supercomputing Conference held last June in Hamburg. It showed the Xeon Phi as an AIB that fits into one PCIe slot. The system had two Xeon E5 processors and a Knights Corner co-processor running the Linpack benchmark and hitting the magic one teraflops number.
The product is expected to make its official debut at the Supercomputing Conference (SC12) coming up in mid-November in Salt Lake City.
Fusion-io finds opportunity for optimization
Fusion-io came on the scene a couple of years ago at the Supercomputing show, where it was demonstrating big gains from optimizing memory. Its pitch was to offer supercomputing in small, highly efficient packages. The company recognized that faster access to memory is as important to unlocking performance as faster processors, and its various products provide that access. The company’s products make high-capacity, persistent memory available for application acceleration through the combination of hardware and software technologies. More specifically, Fusion-io brought its ioFX board to Siggraph to demonstrate how its exploitation of NAND memory and PCI Express 2.0 can significantly improve application performance. It is optimized for multi-threaded applications, and oh, before we forget, the ioFX comes with 420 GB of memory.
At Siggraph this year the company had wins with Boxx Technologies, Thinkbox Software, NextComputing, Maingear Computers, Supermicro, and ProMAX. Adobe is supporting the platform to make 4K compositing practical for After Effects. Likewise, Assimilate’s Scratch DI tool was shown playing 4K uncompressed video content in real time. Scratch is a prime example of a product that has evolved with filmmaking’s slow transition from analog to digital technologies. The work Scratch does (conform, color grading, and finishing) is the last part of the puzzle to be accomplished digitally, and there are still productions that maintain this process in film, but not very many. This could be the last year for film. Sometimes when the dominoes fall, they fall all at once. Fusion-io says it can accelerate products like Scratch to enable 4K playback with headroom.
Apparently, Fusion-io is on the right track. In its most recent earnings report, it announced 89% growth with $359.3 million in revenue.
AMD workstation boards
In comparison to Intel and Nvidia, AMD seems to be taking a more straightforward approach. AMD needed to fill out its FirePro workstation board lineup, and that’s exactly what it did at Siggraph. AMD’s 28-nm Graphics Core Next (GCN) architecture is the platform for its new products. The company says its 28-nm process enables it to fit up to 4.3 billion transistors in the GPU, up from 2.6 billion in the previous generation. The boards support PCI Express 3.0, which in a 16-lane slot is capable of roughly 32 GBps (counting both directions) compared to 16 GBps for PCIe 2.0 generation products.
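For readers who want to see where the PCIe numbers come from, here is a rough sketch of the arithmetic. It assumes the figures quoted are theoretical totals for a 16-lane slot with both directions counted, and it uses the published per-lane signaling rates and line encodings for each generation. The function name `bandwidth_gbytes` and the table layout are ours, purely for illustration; real-world throughput will be lower.

```python
# Theoretical PCIe bandwidth from per-lane signaling rate and line encoding.
# Assumption: the article's figures are x16, both directions combined.

GENERATIONS = {
    # name: (gigatransfers/s per lane, payload bits, total bits on the wire)
    "PCIe 2.0": (5.0, 8, 10),      # 8b/10b encoding
    "PCIe 3.0": (8.0, 128, 130),   # 128b/130b encoding
}


def bandwidth_gbytes(gen, lanes=16, bidirectional=True):
    """Return theoretical bandwidth in gigabytes per second."""
    rate_gt, payload_bits, total_bits = GENERATIONS[gen]
    # Usable gigabits per second on one lane, one direction
    per_lane_gbits = rate_gt * payload_bits / total_bits
    # Scale to the full slot and convert bits to bytes
    one_way = per_lane_gbits * lanes / 8
    return one_way * 2 if bidirectional else one_way


if __name__ == "__main__":
    for gen in GENERATIONS:
        each_way = bandwidth_gbytes(gen, bidirectional=False)
        both_ways = bandwidth_gbytes(gen)
        print(f"{gen}: {each_way:.2f} GB/s each way, {both_ways:.2f} GB/s both ways")
```

Under these assumptions PCIe 2.0 works out to 16 GB/s and PCIe 3.0 to about 31.5 GB/s, which is usually rounded to 32. The doubling comes less from the raw signaling rate (5 to 8 GT/s) than from the more efficient encoding in the newer generation.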
With GCN, AMD says it plans to get serious about GPU computing. We’ll be forgiven for observing that it’s about time, but the company has been seriously investing in OpenCL. At Siggraph the emphasis was on their new workstation add-in boards (AIBs), beginning with the high-end FirePro W9000. The stats for the FirePro W9000 include 2048 stream processors coupled with a scalar coprocessor. It can handle 1.95 billion triangles per second.
The FirePro W9000 has 6 GB GDDR5 memory, supports ECC memory, and costs $3,999. The W8000 has 4 GB GDDR5 ECC RAM and costs $1,599. The W7000 comes in at $899, and the W5000 is $599.
AMD also launched its first FirePro APU, the AMD FirePro A300 Series. Designed for entry-level and mainstream desktop workstations, the APU features AMD’s Eyefinity multi-display technology. AMD says the FirePro A300 Series is designed for users working with CAD and media and entertainment (M&E) workflows.
Nvidia brings Kepler to workstations
Nvidia used Siggraph to announce its new line of Quadro professional graphics AIBs. Designed, says Nvidia, for engineers, industrial designers, animators, and film and video editors who need to take their work with them, these new Quadro AIBs feature Nvidia’s new Kepler GPU.
The company also introduced a new line of Quadro professional graphics GPUs for the latest leading mobile workstations with at least double the number of CUDA cores of previous generations. The new mobile lineup includes the Quadro K5000M, Quadro K4000M, Quadro K3000M, Quadro K2000M, Quadro K1000M, and Quadro K500M. These GPUs incorporate a number of key features, including large frame buffers, GPU memories up to 4 GB, and resolutions up to 4K x 2K (3840 x 2160 @ 60 Hz).