SiFive’s brand-new P550 is one of the world’s fastest RISC-V CPUs

SiFive’s “Essential” family is stripped all the way down to the minimal configurations and performance necessary for traditional microcontroller duty. “Intelligence” offers AI/ML acceleration, and the new “Performance” family offers just what it says on the tin.

Today, RISC-V CPU design company SiFive launched a new processor family with two core designs: P270 (a Linux-capable CPU with full support for RISC-V’s vector extension 1.0 release candidate) and P550 (the highest-performing RISC-V CPU to date).

A quick RISC-V overview

For those not immediately familiar with RISC-V, it’s a relatively new CPU architecture that takes advantage of Reduced Instruction Set Computer (RISC) principles. RISC-V is an open standard specifically designed to be forward-looking and to avoid as much legacy cruft as possible. One example of this design is RISC-V’s dynamic-width vector instruction set, which allows developers to execute vector instructions on data of arbitrary size with maximum efficiency.

In traditional processor designs, a vector instruction has a fixed width tied to the hardware register size of the processor. For example, SSE and SSE2 use the 128-bit registers introduced with the Pentium III, while making full use of an i7-4770’s 256-bit registers requires an entirely separate instruction set (AVX2) for the same mathematical operations. Moving up to an i7-1065G7’s 512-bit registers requires yet another instruction set, AVX-512, again for the same underlying mathematical operations.

In sharp contrast, RISC-V vector math allows a single set of CPU instructions to perform the same set of mathematical operations as efficiently as possible, using whatever size registers the current CPU design has available. This means a developer can simply write a single routine that will process vector operations as efficiently as possible on a phone with 64-bit registers or on a supercomputer with 1,024-bit registers.
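To make the idea concrete, here is a minimal Python sketch of the "strip-mining" loop that width-agnostic vector code relies on. It is a simulation, not RISC-V code: the function name `vector_add` and the parameters `vlen_bits` and `elem_bits` are hypothetical names chosen for illustration. In real V-extension code, the per-pass element count would come from the hardware (via the `vsetvli` instruction) rather than from a function argument.

```python
def vector_add(a, b, vlen_bits, elem_bits=32):
    """Add two equal-length lists in register-sized chunks.

    vlen_bits stands in for the (hypothetical) hardware vector register
    width. The loop body never hard-codes it, so the same routine works
    unchanged on narrow and wide "hardware".
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    vlmax = max(1, vlen_bits // elem_bits)  # elements that fit in one register
    out = []
    passes = 0
    i = 0
    while i < len(a):
        # Ask how many elements to process this pass: at most vlmax,
        # fewer on the final pass if the data does not divide evenly.
        vl = min(vlmax, len(a) - i)
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
        passes += 1
    return out, passes


a = list(range(10))
b = [10 * x for x in a]
narrow, narrow_passes = vector_add(a, b, vlen_bits=64)    # 2 elements per pass
wide, wide_passes = vector_add(a, b, vlen_bits=1024)      # 32 elements per pass
```

Both calls return the same sums; only the number of passes differs (five for the 64-bit case, one for the 1,024-bit case). With a fixed-width instruction set, the equivalent routine would typically need a separate code path per register width.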

In addition to forward-looking features built into the RISC-V spec, the architecture is designed to offer flexibility that its designers didn’t or couldn’t think of ahead of time. Generic RISC-V designs feature reserved opcodes, which designers of specific RISC-V CPUs may then take over to offer additional, arbitrary functionality.

The ability to “take over” reserved opcodes allows for vastly streamlined ASIC design, since both specialized instructions and general controller functionality can be provided on a single die, without CPU architects needing to reinvent any wheels to offer the generic controller functionality.
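As a rough illustration of how reserved opcode space can coexist with standard instructions, the sketch below classifies a 32-bit instruction word by its low seven bits. The four opcode values are, to the best of our understanding, the custom-0 through custom-3 slots in the base opcode map of the RISC-V unprivileged specification; the function names `classify` and `execute`, the handler table, and the sample accelerator are hypothetical and exist only for this example.

```python
# Major opcodes (instruction bits 6:0) set aside for custom extensions,
# as we understand the RISC-V base opcode map. Treat as an assumption.
CUSTOM_OPCODES = {
    0b0001011: "custom-0",
    0b0101011: "custom-1",
    0b1011011: "custom-2",
    0b1111011: "custom-3",
}


def classify(instr):
    """Return the custom slot name for a 32-bit instruction word, if any."""
    opcode = instr & 0x7F  # low seven bits hold the major opcode
    return CUSTOM_OPCODES.get(opcode, "standard-or-reserved")


def execute(instr, custom_handlers):
    """Send custom-space instructions to a vendor handler; everything
    else falls through to the generic core (represented by a string)."""
    handler = custom_handlers.get(classify(instr))
    if handler is not None:
        return handler(instr)
    return "generic core"


# Hypothetical vendor logic claiming the custom-0 slot.
handlers = {"custom-0": lambda instr: "vendor accelerator"}

nop = 0x00000013          # ADDI x0, x0, 0: a standard instruction
vendor_op = 0x0000000B    # made-up word whose opcode lands in custom-0
```

Here `execute(nop, handlers)` falls through to the generic core, while `execute(vendor_op, handlers)` reaches the vendor handler, which mirrors how one die can serve both generic controller code and specialized instructions.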

For the moment, RISC-V isn’t a serious competitor to either Arm or x86 in the general-purpose processor space, but it’s heavily used in the microcontroller space, due in part to its extensibility and inexpensive licensing. We do broadly expect RISC-V to become a third major player when it comes to general-purpose CPUs (the kind that provide the “main brain” for phones, tablets, and traditional computers), but that’s still some years away.

What’s new in the SiFive Performance family?

The two new designs announced today are P270 and P550. P270 is SiFive’s first CPU to fully support the optional RISC-V vector extension 1.0 release candidate, and P550 is SiFive’s highest-performing RISC-V processor to date, which also makes it, as far as we know, the highest-performing RISC-V processor available.

P270 and “V” 1.0-rc1

SiFive's Recode automatically translates legacy SIMD source to SiFive vector assembly—in this case, beginning with source code written for Arm's Neon instruction set.

As you’d expect from the “release candidate” rider, RISC-V’s optional “V” instruction set isn’t yet a frozen standard. When the V spec reaches 1.0 without the “release candidate” rider, it will be considered stable enough to freeze the feature set. This will allow developers to begin work on long-term projects using it for toolchains, functional simulators, and so on, with some degree of certainty that the code they have written will “just work” on future CPU designs.

It’s worth noting that even once the release candidate tag is removed, the 1.0 version of the V instructions will still only be considered ready for public ratification. The first true production version of V will be 2.0, a version number awarded after public ratification is considered complete, with no major functionality changes necessary.

SiFive also offers a translation utility called Recode, which automatically converts legacy SIMD code to V-spec vector assembly.

P550 high performance

This somewhat confusing trio of bar graphs shows a single P550 core significantly outperforming an equivalent Cortex A75 core (top two graphs) while blowing it out of the water in performance per on-die square millimeter (bottom graph).

Both P270 and P550 are Linux-capable designs, but the P270 is limited to a dual-issue, in-order pipeline with only eight stages. While the P270’s full V extension support should make it a formidable processor for heavily vector-math-dependent applications, the P550 should prove far more powerful for applications closer to those currently handled by general-purpose CPUs.

SiFive’s new Performance P550 core features a 13-stage, triple-issue, out-of-order pipeline. SiFive claims that a four-core P550-based CPU takes up roughly the same on-die area as a single Arm Cortex-A75, with a significant performance advantage over that competing Arm design. SiFive says the P550 delivers 8.65 SPECInt 2006 per GHz, based on internal engineering test results. That is a laudable result when compared to Cortex-A75, and not too far behind an i9-10900K’s 11.08/GHz, but it’s well behind an Apple A14’s 21.1/GHz.
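For a sense of scale, the short sketch below turns the three per-GHz figures quoted above into ratios against the P550. The scores are taken from the text; the function name `relative_to` is a hypothetical helper. Because these are per-GHz numbers, they say nothing about absolute performance at each chip's actual clock speed, which the figures here do not include.

```python
# SPECInt 2006 per GHz, as quoted in the article.
scores_per_ghz = {
    "SiFive P550": 8.65,
    "Intel i9-10900K": 11.08,
    "Apple A14": 21.1,
}


def relative_to(baseline, scores):
    """Express every score as a multiple of the baseline chip's score."""
    base = scores[baseline]
    return {name: round(score / base, 2) for name, score in scores.items()}


ratios = relative_to("SiFive P550", scores_per_ghz)
```

On these numbers, the i9-10900K does roughly 1.28 times the P550's work per clock, and the A14 roughly 2.44 times.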

Intel adopts P550 for use in its Horse Creek platform

First and foremost, we need to make one thing clear: we’re almost certainly not talking about Intel ditching the x86_64 architecture for RISC-V! Modern x86_64 CPUs from Intel and AMD include management and supervisory cores, which aren’t directly accessible to end users. These are often Arm CPU cores; for example, AMD’s first APUs used Cortex-A5 for their platform security processor.

The joint announcement from Intel and SiFive is unclear on just what Horse Creek will be. Intel generally reserves the “Creek” names for socketed platforms rather than all-in-one system on chip (SoC) boards. This suggests that, in all likelihood, the P550 will be limited to supervisory or management tasks inside x86_64 Horse Creek CPUs rather than directly processing instructions from software running on that platform.

AnandTech’s Ian Cutress points out that building the P550 directly into Horse Creek, which will be built on Intel’s newest 7nm process node, might provide Intel with simpler testing and more rapid development of the new 7nm process itself.