Apple rolls out the chips—M1 Pro and Max

Apple knows how to make transitions. The company proved it by moving from the PowerPC platform to Intel, and now it is showing the same discipline as it moves away from Intel and on to its own Arm-based family of chips. Tim Cook, CEO of Apple, says the company is a year into a two-year transition that started with the all-Apple M1 SoC. Now Apple has revealed its M1 Pro and M1 Max SoCs, and the lineup is impressive. Apple also knows how to roll out products.

The M1 Pro memory bus is twice as wide as the M1’s, providing 200 GB/s, nearly 3x the M1’s bandwidth. It supports up to 32 GB of unified memory, and the SoC has 33.7 billion transistors, roughly 2x the M1.

Built using TSMC’s 5nm process, the 10-core CPU has 8 high-performance cores and 2 high-efficiency cores, and Apple claims up to 70% more CPU performance than the M1.

The CPUs of the M1 Pro (Source: Apple)

The 16-core GPU is equally impressive, and Apple claims it offers up to twice the performance of the M1’s GPU.

The GPU of the M1 Pro (Source: Apple)

The SoC has three media engines (upper right blocks) that run multiple streams of 4K and 8K video, and it has a ProRes (RAW) codec.

The general layout of the M1 Pro SoC (Source: Apple)

The M1 Max, based on the M1 Pro, is even more impressive. It doubles the M1 Pro’s bandwidth to 400 GB/s and has a whopping 57 billion transistors, about 1.7x the count of the M1 Pro and roughly 3.5x the M1.
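The transistor ratios are easy to sanity-check with simple division. The short sketch below does that; the 33.7 and 57 billion figures come from the text, while the 16 billion count for the original M1 is Apple's published figure and is supplied here as an outside assumption. The names are illustrative only.

```python
# Transistor counts in billions.
# M1_PRO and M1_MAX are the figures quoted in the article;
# M1 (16 billion) is Apple's published number, added here as an assumption.
TRANSISTORS_B = {"M1": 16.0, "M1 Pro": 33.7, "M1 Max": 57.0}


def ratio(chip: str, baseline: str) -> float:
    """Return how many times more transistors `chip` has than `baseline`."""
    return TRANSISTORS_B[chip] / TRANSISTORS_B[baseline]


pro_vs_m1 = ratio("M1 Pro", "M1")
max_vs_pro = ratio("M1 Max", "M1 Pro")
max_vs_m1 = ratio("M1 Max", "M1")

print(f"M1 Pro vs M1:     {pro_vs_m1:.2f}x")
print(f"M1 Max vs M1 Pro: {max_vs_pro:.2f}x")
print(f"M1 Max vs M1:     {max_vs_m1:.2f}x")
```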

The M1 Max offers 4x faster GPU performance than M1 (Source: Apple)

The M1 Max doubles the embedded unified memory to as much as 64 GB.

The M1 Max with its unified embedded memory (Source: Apple)

Apple says its new processors are incredibly power efficient. Compared to laptops with a discrete GPU, Apple claims the M1 Pro reaches the same performance at 70% less power, or 7x the performance at the same power.

The M1 Pro SoC will power the new 14-inch MacBook Pro notebook, which has a 14.2-inch active diagonal screen with 3024 × 1964 resolution (5.9 Mpix) and a refresh rate of up to 120 Hz that dynamically adjusts to the content: if the image is static, the refresh rate slows down.

The M1 Max will power the 16-inch MacBook Pro, whose 16.2-inch diagonal Liquid Retina XDR display offers 3456 × 2234 pixels (7.7 Mpix), 1 billion colors (10-bit), and 1000 nits of brightness (1600 peak). It has thousands of LEDs in the backlight with dozens of zones and offers a 1,000,000:1 contrast ratio.
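The megapixel figures for the two panels follow directly from the resolutions, and the pixel density can be estimated from the diagonal. The sketch below is a rough check rather than a specification: the 14-inch panel height is taken as 1964 pixels per Apple's published spec, and the function name is hypothetical.

```python
import math


def panel_stats(width_px: int, height_px: int, diagonal_in: float) -> tuple[float, float]:
    """Return (megapixels, pixels per inch) for a panel."""
    megapixels = width_px * height_px / 1e6
    # Diagonal length in pixels divided by diagonal length in inches.
    ppi = math.hypot(width_px, height_px) / diagonal_in
    return megapixels, ppi


mp14, ppi14 = panel_stats(3024, 1964, 14.2)  # 14-inch model
mp16, ppi16 = panel_stats(3456, 2234, 16.2)  # 16-inch model

print(f"14-inch: {mp14:.1f} Mpix, {ppi14:.0f} ppi")
print(f"16-inch: {mp16:.1f} Mpix, {ppi16:.0f} ppi")
```

Both panels come out at nearly the same pixel density, which suggests the larger notebook simply gets more of the same display rather than a coarser one.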

What do we think?

Apple has clearly achieved world-class semiconductor design status and has succeeded in being the first such company to get the benefits of TSMC’s 5nm process. Intel’s Pat Gelsinger acknowledged Apple’s accomplishment and has taken it as a challenge to build better chips, acknowledging that it will take time to beat Apple. The power-performance curves are an engineer’s dream and the epitome of Moore’s law.

Performance comparison (Source: Apple)

In the fine print, Apple says the test systems were a 4-core MSI Prestige 14 Evo PC laptop with an iGPU and an 8-core MSI GP66 Leopard 11UG, which use the Intel Core i7-1185G7 and the Core i7-11800H respectively, the 4-core and 8-core models of Intel’s Tiger Lake 10nm SuperFin CPUs.

The M1 has five GPU cores in an 8-256 organization: 8 TMUs and 256 shaders per core, or 40 TMUs and 1280 shaders in total.

Apple is now FP32-centric, and FP16 runs at the same rate (no speedup, but versus the previous generation this means FP32 runs at 2x the rate).

Apple has not said how many shaders are in a GPU core. Looking at the die shot, one could convince oneself that there are 16 memories inside a block. Below them, you can see two blocks that are likely ALU banks sharing a texture unit above them, with rasterization and other support logic to the right.

Die shot of Apple’s M1 Pro GPU array (Source: Apple)

The M1 Pro has 14 to 16 GPU cores, so 16 such cores would be a 128-4096 design, likely running at similar or higher clocks than the mobile design, which runs at up to 1.3GHz.

The M1 Max has 24 to 32 GPU cores, so 32 cores would be a 256-8192 design, again likely running at up to 1.3GHz.

If that guess is correct, then at 1.3GHz those numbers give about 166 GTexels/s and 5.32 TFLOPS FP32 for the M1 Pro (counting one FP32 operation per shader per clock). Those numbers are slightly high versus the slide, so the clock is likely just below 1.3GHz.
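The estimate above is simple multiplication, and a small sketch makes the assumptions explicit. Everything here is the guess described in the text, not a confirmed Apple specification: 8 TMUs and 256 shaders per core, a 1.3GHz clock, and one FP32 operation per shader per clock (chosen because it reproduces the figure in the text; counting a fused multiply-add as two operations would double the TFLOPS result). The function and parameter names are hypothetical.

```python
def gpu_peak(cores: int,
             tmus_per_core: int = 8,        # guessed per-core organization
             shaders_per_core: int = 256,   # guessed per-core organization
             clock_ghz: float = 1.3,        # assumed upper-bound clock
             fp32_ops_per_clock: int = 1):  # assumption matching the text's arithmetic
    """Estimate peak texture and FP32 throughput for a GPU of `cores` cores."""
    tmus = cores * tmus_per_core
    shaders = cores * shaders_per_core
    gtexels_per_s = tmus * clock_ghz  # one texel per TMU per clock
    tflops = shaders * fp32_ops_per_clock * clock_ghz / 1000.0
    return {
        "tmus": tmus,
        "shaders": shaders,
        "gtexels_per_s": gtexels_per_s,
        "tflops_fp32": tflops,
    }


m1_pro = gpu_peak(16)
print(m1_pro)
```

Lowering `clock_ghz` slightly is the easiest way to see what clock would line the estimate up with a published slide figure.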

The M1 Max would have 8192 shaders. A smaller process size and a potentially higher clock rate, along with a huge local memory and a tightly coupled sea of memory, give the GPU every available advantage.

The design looks very scalable from the die shots, and Apple’s rate of product introduction is convincing. We’re looking forward to seeing the M1 Pro find its way into a new generation of Apple iPads and iPhones.