AMD’s new Kaveri APU changes the rules

AMD reaches an important milestone with Kaveri, its strongest integrated processor to date. AMD’s APUs are being designed for the next generation of computing: people want to use lightweight machines and tablets for bigger tasks.

AMD demonstrated its TrueAudio acceleration to press and analysts at CES (Source: Jon Peddie Research)

You can tell AMD feels it accomplished much of what it set out to do in 2013, a year in which the company enjoyed a steady run of big launches. AMD rolled out its latest Hawaii GCN (Graphics Core Next) GPU, it celebrated HSA (Heterogeneous System Architecture) at its developer conference in the fall of 2013, and it used CES as the launch pad for its latest A-Series APU, Kaveri, which combines AMD’s Steamroller CPUs and Radeon R7 graphics from the Hawaii family. The Kaveri chip represents a major step towards proving AMD’s argument that an integrated processor with a strong graphics component can have a big advantage over an integrated chip with a merely “good” graphics component. Although we’re seeing a burst of new marketing, the company has been following this path since it acquired ATI in 2006. And even though AMD spent most of its time taking aim at Intel and that company’s Core i5 integrated processor, AMD is also gunning for higher-end users.

The tech world is changing, and AMD wants to use the industry’s inflection points as a slingshot to get ahead of the curve rather than forever playing catch-up to its bigger, meaner competitors.

At CES, the company spoke about realities. Can you play serious games on an integrated chip? How about creating videos and music, or editing photos? The answer is, of course you can. People are doing it right now. An analysis of data from the Steam site reveals that 35% of people playing Steam games are doing it on machines less powerful than what Kaveri can provide. Sure, there are hardcore gamers who wouldn’t think of tackling Battlefield 4 without sitting astride a big honkin’ PC equipped with no fewer than eight CPUs and the latest, greatest discrete GPUs. But in reality, the allure of the game itself is much stronger than the willingness to drop big bucks on hardware, and so most players get by on good enough. AMD’s message is clear: the 35% of gamers who are getting by on very mainstream platforms can expect to see a significant increase in performance with a low-cost Kaveri-based machine.

The situation for video is even more dramatic because people are creating videos on even more meager resources. Although the trend towards massive video creation and uploading is obvious, it’s equally obvious that not a lot of work is going into editing the video, and the content is being captured on whatever device comes to hand. A study by Pew Research found that 40% of people uploading video captured it on their mobile phones, and 20% use their phones to upload it. It seems likely that people will become interested in working more with their video as time goes on, and the semiconductor companies see this as an opportunity.

AMD is using the term compute cores to describe both CPU and GPU processors to make the point that from now on, they’ll be able to work concurrently in AMD’s APUs.

It’s all about balance

One of the key messages at the Kaveri launch was that the graphics in this processor do not take a back seat to the CPUs. AMD has settled on Compute Cores as its term for the processing features of APUs. The aim is to take the emphasis off either the CPUs or the GPUs. So, AMD says, Kaveri can have up to 12 Compute Cores (4 CPUs and 8 GPUs).

AMD credits the Heterogeneous System Architecture, which has been developed as an open resource with other software and hardware companies, as the driver for more efficient use of system resources. HSA creates an environment in which software developers can take advantage of the appropriate processors, i.e., the right processor for the right workload. In addition, the new HSA development tools enable programmers to write code that dynamically takes advantage of the processing resources without dramatically changing the way they have traditionally written code.

AMD CTO and Fellow Phil Rogers said that the first response to the challenge of multiple processors, and of using graphics processors for compute tasks such as physics, rendering, and video in addition to graphics, has been to change programming methods to accommodate the GPU and to hand off data between CPU and GPU in ways that are not necessarily efficient. Instead, he says, HSA enables the hardware to adapt to the software. Programmers can direct their code to the proper resources by using semaphores in the code, and the time the CPU or GPU has to wait for data is dramatically reduced.
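The signaling idea Rogers describes can be illustrated with a small analogy. The sketch below is not HSA code and uses no AMD API. It assumes two ordinary Python threads stand in for a CPU producer and a GPU consumer that share a single buffer, and the names `shared`, `data_ready`, and `slot_free` are hypothetical. Semaphores tell each side when it may proceed, so the data is written once in place rather than copied back and forth.

```python
import threading

# Hypothetical stand-ins: one buffer visible to both "processors".
shared = [0] * 4
data_ready = threading.Semaphore(0)   # the consumer waits on this
slot_free = threading.Semaphore(1)    # the producer waits on this
results = []

def cpu_producer(batches):
    """Plays the CPU: fills the shared buffer, then signals."""
    for batch in batches:
        slot_free.acquire()           # wait until the buffer may be overwritten
        shared[:] = batch             # write in place; no separate copy is handed over
        data_ready.release()          # tell the consumer the data is ready

def gpu_consumer(count):
    """Plays the GPU: waits for the signal, then runs a toy 'kernel'."""
    for _ in range(count):
        data_ready.acquire()          # block until the producer has signaled
        results.append(sum(x * x for x in shared))  # sum of squares
        slot_free.release()           # hand the buffer back to the producer

def run(batches):
    """Run both workers to completion and return the consumer's results."""
    results.clear()
    producer = threading.Thread(target=cpu_producer, args=(batches,))
    consumer = threading.Thread(target=gpu_consumer, args=(len(batches),))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return list(results)
```

Calling `run([[1, 2, 3, 4], [0, 0, 0, 1]])` returns `[30, 1]`. Each side blocks only while the other is actually using the buffer, which is the kind of wait the article says HSA is meant to shrink.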

Adobe’s senior manager of programming Eric Berdahl told attendees at AMD’s CES event that he does not need a special team at Adobe for situations where heterogeneous computing is called for. All 2,000 of his programmers can work in much the same way they’ve always worked. More people on the problem are likely to find more opportunities for optimization.

Accelerators

Another aspect of AMD’s strategy, and one that represents a trend in computer technology, is the embrace of accelerator processors for specific tasks. This technique has been used more frequently in the mobile and set-top box markets, where power constraints and high media demands have dictated a trade-off between programmable and fixed-function processors. Yes, perhaps the system processor can do it all, but is that a good thing? Dedicated accelerators have been used for media processing to take the load off the CPUs and save power without sacrificing performance; in fact, they usually enhance it. In this new world we’ve arrived in, everyone is constrained, at least digitally speaking: power comes at a premium, and no one wants to pay that premium in performance or money. AMD introduced its VCE (Video Coding Engine) and UVD (Unified Video Decoder) a couple of years ago, and now it is putting the spotlight on its TrueAudio Technology, which can deliver 32-channel surround audio. With a nod to the next generation of video, Kaveri is ready for UltraHD 4K resolution and H.265 (HEVC, High Efficiency Video Coding).

AMD says Kaveri is the first integrated processor in which the GPU is equal to the CPU. Previous architectures favored the CPU over the GPU. (Source: AMD)

Kaveri also represents an advance in fab process for AMD, which partners with GlobalFoundries. The Kaveri CPU technology, codenamed Steamroller, is the third generation of AMD’s Bulldozer line of CPUs and has gotten a process shrink. With Kaveri, AMD has moved from GlobalFoundries’ 32nm High-K Metal Gate SOI process to its 28nm SHP (Super High Performance) process. First of all, this means that AMD is able to get a lot more transistors on the chip, but also, says AMD, it’s a more balanced process. GlobalFoundries’ SOI process was “CPU-biased”: it allowed higher frequencies at the expense of density. In the GPU world, the more transistors the better, and so the shift to SHP enables more density at the expense of CPU frequency. AMD says Steamroller’s improved instructions per clock (IPC) help mitigate the hit to CPU frequencies.

The end game

The bar is getting lower for resource-heavy applications like gaming, 4K video entertainment, video content creation, imaging, and large data problems. We know that in the real world, people tend to buy machines to handle play as well as work. This has been true in a large segment of CAD, where independents and SMBs rule the largest part of the market. Gaming machines are often bought for design because the end users figure that if a machine is good enough to handle big 3D games, it can handle CAD or their Photoshop work. Now that BYOD (bring your own device) has become a way of life in many companies, the same dynamic comes into play: if it’s practical, people like the idea of grabbing a light little machine to do a job instead of trudging off to the desk to work. It’s also more likely that people work with several machines: one at work, maybe two, and one at home, maybe more.

The Kaveri processors will sell for less than $200 and AMD is selling three versions first:

  • A10-7850K with R7 Graphics, a 95W chip with 12 compute cores (4 CPUs and 8 GPUs). The CPU will run at 3.7GHz and the GPU at 720MHz. It ships Jan. 14, 2014.
  • A10-7700K with R7 Graphics, a 95W chip with 10 compute cores (4 CPUs and 6 GPUs). The CPU will run at 3.4GHz and the GPU at 720MHz. It ships Jan. 14, 2014.
  • A8-7600 with R7 Graphics, a lower-power 65W/45W chip with 10 compute cores (4 CPUs and 6 GPUs). The CPU will run at 3.3GHz and the GPU at 720MHz. It will come out in the 4th quarter of 2014.
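For readers who want to check the compute-core arithmetic in the list above, here is a minimal sketch that tabulates the three parts by model number. The core counts and clocks come from the list. The GPU throughput estimate does not: it assumes 64 shader lanes per GPU compute core and 2 floating-point operations per lane per clock (one fused multiply-add), which are general GCN figures rather than numbers stated in this article, so treat the GFLOPS result as a theoretical ceiling and not a benchmark.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Apu:
    model: str        # model number only, e.g. "7850"
    cpu_cores: int
    gpu_cores: int
    cpu_ghz: float
    gpu_mhz: int

    @property
    def compute_cores(self) -> int:
        # AMD's "compute cores" figure is simply CPU cores plus GPU cores.
        return self.cpu_cores + self.gpu_cores

    def gpu_peak_gflops(self, lanes_per_gpu_core: int = 64,
                        flops_per_lane: int = 2) -> float:
        # ASSUMPTION: the default lane and FLOP counts are not from the article.
        ops_per_clock = self.gpu_cores * lanes_per_gpu_core * flops_per_lane
        return ops_per_clock * self.gpu_mhz / 1000  # MHz -> GFLOPS

# Figures taken from the launch list above.
LINEUP = [
    Apu("7850", cpu_cores=4, gpu_cores=8, cpu_ghz=3.7, gpu_mhz=720),
    Apu("7700", cpu_cores=4, gpu_cores=6, cpu_ghz=3.4, gpu_mhz=720),
    Apu("7600", cpu_cores=4, gpu_cores=6, cpu_ghz=3.3, gpu_mhz=720),
]

def summarize(lineup=LINEUP):
    """Map model -> (compute cores, assumed theoretical GPU GFLOPS)."""
    return {a.model: (a.compute_cores, round(a.gpu_peak_gflops(), 2))
            for a in lineup}

if __name__ == "__main__":
    for model, (cores, gflops) in summarize().items():
        print(f"{model}: {cores} compute cores, "
              f"~{gflops} GFLOPS GPU peak (assumed)")
```

Under those assumptions, the 8-GPU-core part works out to about 737 GFLOPS of theoretical GPU throughput and the 6-core parts to about 553, which gives a rough sense of how much of each chip's budget sits on the graphics side.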

In 2014, Intel and AMD are both going to be making the point that machines with integrated processors can do work that was once delegated to more expensive machines. But, to add one caveat: we are seeing strength and stability in the workstation market because, as people use a variety of machines, they become much clearer on what type of machine they need for different jobs. Advances like Kaveri mean we’re making fewer compromises as we grab for a lightweight, low-power computer.