Everyone is going ARM these days—Amazon and Apple are both shipping in-house CPUs with crazy performance increases, with Microsoft rumored to be developing their own. ARM has historically been used for low power mobile chips, so why exactly is ARM crushing x86 on the desktop and server space?
Everyone Is Going ARM These Days
The processor world is a complex industry, and only a few designs from a few companies are able to compete on the high end of performance. Usually, Intel or AMD holds the crown of performance, with both of them manufacturing x86 CPUs. But recently, CPUs from Apple and Amazon based on ARM have been giving Intel (and the x86 architecture) a run for their money.
Amazon has their Graviton2 CPU, which isn’t faster than Intel’s server counterparts, but is more cost effective and uses less power. With how much of an improvement it was over Graviton1, their next iteration will likely be fierce competition in the server space.
Apple hit it out of the park with their first non-mobile CPU, the Apple Silicon M1 processor, which runs faster than desktop Intel CPUs and nearly as fast as AMD’s Ryzen 5000 series, the current performance leader. It’s custom silicon that makes Apple MacBooks the current fastest laptops in the world, much to the chagrin of PC enthusiasts (myself included).
In fact, they’re so far ahead in the laptop space that Windows on the M1 MacBook runs faster than it does on the Surface Pro X, despite Windows on ARM running on the Mac only inside a virtual machine. And as if that wasn’t humiliating enough, the M1 absolutely crushes it with a Geekbench Single-Core score of 1,390 compared to the Surface’s 802, which is laughably bad in comparison. The M1 is seriously fast.
Microsoft is also rumored to be developing their own in-house ARM processor, and though that rumor comes from the Azure server space, they’d likely use the same chip for the Surface if they can match Apple’s performance.
What’s The Difference Between ARM and x86?
At the end of the day, there isn’t too much of a difference between ARM and x86 for the average user. You can still run Google Chrome and watch YouTube on either one. In fact, you may be doing so right now, as nearly all Android phones and every iPhone use an ARM-based processor.
The biggest difference for most people is that older applications meant for x86 will need to be recompiled to run on ARM as well. For some things this is easy, but not everything will be supported, especially legacy software. However, even that can usually run through x86 emulation, which Windows is starting to support.
For developers, there are a lot of differences in how applications get compiled, but these days, most compilers do a good job of supporting the major instruction sets, and you won’t really have to make many changes to get it compiling for multiple platforms.
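If you’re curious which instruction set your own code is running on, here is a minimal Python sketch. `platform.machine()` is part of the standard library; the `isa_family` helper and its alias lists are made up for illustration and are not exhaustive. Python itself is interpreted, so the script doesn’t need recompiling, but the interpreter running it was built for one of these architectures.

```python
import platform

# Illustrative alias lists (an assumption, not exhaustive): operating systems
# report the same architecture under different names.
ARM_NAMES = {"arm64", "aarch64", "armv7l", "armv8l"}
X86_NAMES = {"x86_64", "amd64", "i386", "i686", "x86"}


def isa_family(machine: str) -> str:
    """Map a machine string to a rough instruction-set family (hypothetical helper)."""
    name = machine.strip().lower()
    if name in ARM_NAMES:
        return "ARM"
    if name in X86_NAMES:
        return "x86"
    return "other"


if __name__ == "__main__":
    machine = platform.machine()
    print(machine, "->", isa_family(machine))
```

On an M1 Mac this would typically report `arm64`, while most Intel and AMD machines report `x86_64` or `AMD64`.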
But How is ARM Running Faster?
To answer this question, we’ll have to delve deeper into how CPUs work under the hood.
ARM and x86 are both instruction sets, also known as instruction set architectures (ISAs), which are basically the list of instructions that the CPU understands. This is why you don’t need to worry about running a Windows app on a specific AMD or Intel CPU; they’re both x86 CPUs, and while the exact designs are different (and perform differently), they both support the same instructions. This means any program compiled for x86 will, in general, run on both CPUs.
CPUs basically execute operations sequentially, like a machine given a list of tasks to do. Each instruction is identified by an opcode, and architectures like x86 have a lot of opcodes, especially considering they’ve been around for decades. Because of this complexity, x86 is known as a “Complex Instruction Set Computer,” or CISC.
CISC architectures generally take the design approach of packing a lot of stuff into a single instruction. For example, an instruction for multiplication may move data from a memory bank to a register, then perform the steps for the multiplication, and shuffle the results around in memory. All in one instruction.
Under the hood though, this instruction gets unpacked into many “micro-ops,” which the CPU executes. The benefit of CISC is code size: programs need fewer instructions and so take up less memory, and since memory was at a premium back in the day, CISC used to be the better choice.
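To make the unpacking idea concrete, here is a toy Python sketch. Everything in it is invented for illustration: the `MUL_MEM` instruction, the micro-op names, and the register names do not correspond to any real x86 encoding. It only shows the shape of the idea, where one memory-to-memory multiply becomes two loads, a multiply, and a store.

```python
def expand(instruction):
    """Expand one CISC-style instruction into simple micro-ops (toy model)."""
    op, *args = instruction
    if op == "MUL_MEM":  # hypothetical: memory[dst] = memory[a] * memory[b]
        dst, a, b = args
        return [
            ("LOAD", "r0", a),          # memory -> register
            ("LOAD", "r1", b),          # memory -> register
            ("MUL", "r2", "r0", "r1"),  # register-only arithmetic
            ("STORE", dst, "r2"),       # register -> memory
        ]
    return [instruction]  # already a simple operation


def run(micro_ops, memory):
    """Execute micro-ops against a dict standing in for memory."""
    regs = {}
    for op, *args in micro_ops:
        if op == "LOAD":
            reg, addr = args
            regs[reg] = memory[addr]
        elif op == "MUL":
            dst, x, y = args
            regs[dst] = regs[x] * regs[y]
        elif op == "STORE":
            addr, reg = args
            memory[addr] = regs[reg]
        else:
            raise ValueError(f"unknown micro-op: {op}")
    return memory


memory = {"a": 6, "b": 7, "out": 0}
micro_ops = expand(("MUL_MEM", "out", "a", "b"))
run(micro_ops, memory)
print(len(micro_ops), memory["out"])  # 4 42
```

The program only had to store one instruction, which is the memory saving CISC was after, but the CPU still did four steps of work.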
However, memory isn’t the bottleneck anymore, and this is where RISC comes into play. RISC, or Reduced Instruction Set Computer, basically does away with complex multi-part instructions. Most instructions are designed to execute in a single clock cycle, though longer operations will still need to wait on results from other areas of the CPU or memory.
While this seems like going backwards, it has huge implications for CPU design. CPUs need to load all their instructions from RAM and execute them as fast as possible. It turns out it’s far easier to do that when you have many simple instructions rather than fewer complex ones. The CPU runs faster when the instruction buffer can be kept full, and that’s a lot easier to do when the instructions are simpler and easier to decode.
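One commonly cited reason simple instructions are easier to process is their length: 64-bit ARM instructions are always 4 bytes, while x86 instructions vary from 1 to 15 bytes. The Python sketch below illustrates why that matters for finding where each instruction starts. The variable-length format here is a toy assumption (the first byte of each instruction stores its total length); real x86 encoding is far more involved.

```python
def fixed_boundaries(code: bytes, width: int = 4):
    """Fixed-width instructions: every start offset is known up front,
    so several decoders could each grab their own instruction at once."""
    return list(range(0, len(code), width))


def variable_boundaries(code: bytes):
    """Toy variable-length format (an assumption): the first byte of each
    instruction holds its total length. Each start offset depends on having
    looked at the previous instruction, so the walk is sequential."""
    offsets, pos = [], 0
    while pos < len(code):
        offsets.append(pos)
        length = code[pos]
        if length == 0:
            raise ValueError("zero-length instruction")
        pos += length
    return offsets


fixed_code = bytes(16)  # four 4-byte instructions
variable_code = bytes([1, 3, 0, 0, 2, 0, 5, 0, 0, 0, 0])  # lengths 1, 3, 2, 5

print(fixed_boundaries(fixed_code))        # [0, 4, 8, 12]
print(variable_boundaries(variable_code))  # [0, 1, 4, 6]
```

In the fixed-width case, the fourth instruction’s position is known without reading the first three, which is what makes it practical to put many decoders side by side.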
RISC designs also make it easier to get the most out of something called Out-of-Order execution, or OoOE. (Modern x86 CPUs use it too, but simpler instructions are easier to feed into it.) Essentially, the CPU has a unit inside of it that reorders and optimizes instructions coming into it. For example, if an application needs to calculate two things, but they don’t depend on each other, the CPU can execute both in parallel. Usually, parallel code is very complicated for developers to write, but at the lowest levels, the CPU can find this parallelism on its own to speed things up. The Apple M1 chip uses OoOE to great effect.
If you’re interested in the inner workings, you should read this fantastic write-up by Erik Engheim on what makes the Apple M1 chip so fast. In short, it makes heavy use of specialized silicon and out-of-order execution, and it has way more instruction decoders to support its speed.
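Here is a rough Python sketch of the reordering idea, under heavy simplifying assumptions: every instruction takes one cycle, each result name is written only once, and the CPU can issue `width` instructions per cycle. The `schedule` function and the instruction format are invented for illustration and are nothing like how a real scheduler is built in hardware.

```python
def schedule(instructions, width=2):
    """Greedy toy scheduler: each cycle, issue up to `width` instructions
    whose inputs are already available.

    instructions: list of (dest, sources) pairs in program order.
    Sources that no instruction produces are treated as ready inputs.
    Returns a list of cycles, each a list of dest names issued together.
    """
    produced = {dest for dest, _ in instructions}
    done = set()
    pending = list(instructions)
    cycles = []
    while pending:
        ready = [
            ins for ins in pending
            if all(src in done or src not in produced for src in ins[1])
        ]
        issued = ready[:width]
        if not issued:
            raise ValueError("circular dependency")
        cycles.append([dest for dest, _ in issued])
        for ins in issued:
            pending.remove(ins)
        # Results become visible only to later cycles.
        done.update(dest for dest, _ in issued)
    return cycles


program = [
    ("a", ["x", "y"]),  # a = x + y
    ("b", ["a"]),       # b depends on a
    ("c", ["x"]),       # c is independent of a and b
    ("d", ["b", "c"]),  # d needs both
]
print(schedule(program))  # [['a', 'c'], ['b'], ['d']]
```

Run strictly in program order, one at a time, these four instructions would take four cycles. Because `c` doesn’t depend on `a`, it can be pulled forward and run alongside it, finishing the whole sequence in three.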
Is ARM Going To Replace x86?
The honest answer is, probably. Intel has been feeling the end of Moore’s law for years now, and while AMD has been able to make performance leaps in recent years, they’re not far ahead.
This isn’t to say that x86 will die off anytime soon, but it’s clear that ARM has more potential than just being a mobile architecture, a stigma which is no longer valid given the current direction of the industry. The benefits of RISC architectures are clear, and with how big a leap the Apple M1 chip already is, the future of the industry looks promising.
Plus, ARM isn’t the only RISC architecture out there. ARM itself is still proprietary, though Arm licenses its designs to third-party designers, like Qualcomm, Samsung, and Apple. RISC-V is an open standard, and is similarly promising. It’s a standardized instruction set architecture that is free to use, leaving the exact implementations up to the manufacturer. If the industry does move towards RISC in general, there will be open and closed source implementations available.