What’s Arm up to? Anything you want to throw at it

Tackling the next 300 billion devices from the IoT up to supercomputers.

Arm showed its first new architecture in a decade at its recent one-day virtual conference. The company introduced its v9 design and gave its macro view of future semiconductor development. As it turns out, from Arm’s perspective, Arm devices will control the world: all data will be generated, processed, or transferred by an Arm device. Arm says it meets, or will meet, the needs of modern computing, from the IoT up to supercomputers. Shipments of Arm-based chips continue to accelerate, with more than 100 billion devices shipped over the last five years. The new Armv9 architecture, said Simon Segars, Arm’s CEO, will form the leading edge of the next 300 billion Arm-based chips.

Arm highlighted three priorities for its new architecture: security, AI processing, and compute power.

Security

Part of the responsibility of handling all of the world’s data is making sure it is secure, which Arm recognizes as the greatest technology challenge today. The Armv9 roadmap introduces the Arm Confidential Compute Architecture (CCA).

Confidential computing shields portions of code and data from access or modification while in use, even from privileged software, by performing computation in a hardware-based secure environment. The Arm CCA, says the company, will introduce the concept of dynamically created Realms, usable by all applications, in a region that is separate from both the secure and non-secure worlds.

Arm has introduced the idea of Realms to communicate the cordoning off of sensitive data even when it’s being processed. (Source: Arm)

For example, in business applications, Realms can protect commercially sensitive data and code from the rest of the system while it is in use, at rest, and in transit. In a recent Pulse survey of enterprise executives, more than 90% of the respondents said they believe that if confidential computing were available, the cost of security could come down, enabling them to dramatically increase their investment in engineering innovation.

From micro to mighty; supporting AI

Arm processors can be found in everything from the smallest IoT devices and appliances to mighty supercomputers. The ubiquity and range of AI workloads are demanding more diverse and specialized solutions. For example, it is estimated there will be more than eight billion AI-enabled voice-assisted devices in use by the mid-2020s, and 90% or more of on-device applications will contain AI elements along with AI-based interfaces like vision or voice.

To address this need, Arm partnered with Fujitsu to create the Scalable Vector Extension (SVE) technology, which is at the heart of Fugaku, the world’s fastest supercomputer. Building on that work, Arm has developed SVE2 for Armv9 to enable enhanced machine learning (ML) and digital signal processing (DSP) capabilities across a wider range of applications.

SVE2, says Arm, enhances the processing ability of 5G systems, virtual and augmented reality, and ML workloads running locally on CPUs, such as image processing and smart home applications. Over the next few years, Arm says it will further extend the AI capabilities of its technology with enhancements in matrix multiplication within the CPU, in addition to ongoing AI innovations in its Mali GPUs and Ethos NPUs.

Brian Kelleher, senior vice president of hardware engineering at Nvidia, said, “Nvidia sees enormous opportunities to bring the transformative powers of AI deeper into gaming, autonomous vehicles, enterprise data centers, and embedded devices. Through our ongoing collaboration with Arm, we look forward to using Armv9 to deliver a wide range of once unimaginable computing possibilities.”

Power to the process

Arm promises to increase performance through continuous development that focuses on efficiency and on optimizing performance throughout the entire design. The company says it has increased CPU performance every year for the past five years, and it expects Armv9 to improve CPU performance by more than 30% over the next two generations.

According to Paul Williamson, Arm’s VP and GM, client business, improving performance along a continuum of process improvements and efficiency isn’t going to be enough. As innovation continues to push the demands on computing infrastructure, Arm is going to be challenged to change and adapt.

The company is seeing increased opportunity in focused optimization. Just as SVE2 adds support for AI, DSP, and XR workloads, and the graphics upgrades boost v9’s AI capabilities with improved matrix processing, Arm will look for ways to accelerate performance with specialized IP blocks while balancing power demands.

**Changing workloads mean new demands on CPUs and GPUs. (Source: Arm)**

During the Armv9 announcement, one of the slides listed new features Arm is planning for its Mali GPUs. Arm said that ray tracing and variable-rate shading, now available on the PC via DirectX 12 Ultimate, will one day be available in Arm-powered smartphones and tablets as part of Armv9. Chips using the new v9 architecture are expected to be released in late 2021, which means chip builders have had the IP for a while.

Arm says its long experience in mobile phones gives it an edge in understanding how to drive processing power while living within the tight boundaries of mobile phone battery demands. Arm has identified the workloads of the future: XR, AR, VR, and better gaming with AI, image processing, and more.

In his blog, Arm’s senior director of marketing programs says that diverse blocks of IP have to be integrated into the SoC without increasing the active die area, which would increase thermal and power demands. “Each IP block is developed with a common underlying architectural approach for performance, efficiency, and data exchange.”

Arm is also emphasizing the importance of developer access via straightforward tools.

What do we think?

Arm has a whole lot to say about how they’re improving their processors, but there’s actually not a lot of specific data. You can see why. The whole philosophy of Arm is to get more out of less so they can maintain their low-power advantage. The company has gotten help proving its case from the performance of the new Apple M1, its instances being put to work in the AWS cloud, its use in supercomputers, and the fact that Nvidia wants to buy them.

Arm says future Mali GPUs will do ray tracing.

If the platform target is Android phones, then the OEMs that buy chips and assemble phones will have to go to companies like MediaTek or Qualcomm to get parts. Those companies will have to buy Arm’s IP and then find an API that will expose the ray tracing and variable-rate shading (VRS) techniques. We assume what Arm means by advanced rendering techniques is mesh shaders.

Android devices use Khronos’ OpenGL ES for their API, and some are moving to Khronos’ Vulkan. Vulkan supports ray tracing, VRS, and mesh shaders. OpenGL ES also supports mesh shaders (via an extension that uses Nvidia code).

However, executing ray tracing is no trivial task. One of Arm’s selling points for Mali is its low power and gate count. If Arm adds computationally heavy ray tracing, the low-power claim goes out the door as more processors and memory are added. It’s pretty clear Arm isn’t going to license Imagination’s ray tracing IP, and it’s unlikely they will license SiliconArts’. It’s doubtful they will license AMD’s either, so they may have to do it themselves. However, if they want to offer ray tracing and keep the power down, they should check out Adshir’s LocalRay software solution.