The two companies revealed that they are now working together on an AI-focused server that runs models rather than training them, pairing Qualcomm's Cloud AI 100 Ultra inferencing chips with Ampere's CPUs.
Ampere, like every other chip maker, wants to cash in on the AI wave. Although it could use Arm IP to add some of these capabilities to its chips, the company's focus has always been fast, power-efficient server processors, so AI acceleration is not a core competency. Instead, Ampere chose to collaborate with Qualcomm.
Ampere CTO Jeff Wittich said, "The idea here is that while I'll show you some great performance for Ampere CPUs running AI inferencing on just the CPUs, if you want to scale out to even bigger models—multi-100 billion parameter models, for instance—just like all the other workloads, AI isn't one size fits all. We've been working with Qualcomm on this solution, combining our super efficient Ampere CPUs to do a lot of the general purpose tasks that you're running in conjunction with inferencing, and then using their really efficient cards, we've got a server-level solution."
The Qualcomm collaboration is part of Ampere's annual roadmap update, as is the new AmpereOne chip, which packs 256 cores and is built on a modern 3nm manufacturing process. The chips are not yet widely available on the market, but Ampere says they are ready at the fab and should launch later in 2024.
Beyond the extra cores, what sets the new generation of AmpereOne chips apart is its 12-channel DDR5 RAM, which lets Ampere's data center customers tailor memory access to their users' requirements.