IPU-POD systems remove the barriers to new breakthroughs in machine intelligence with real business impact. Get ready for production with IPU-POD64 and take advantage of a new approach to operationalising your AI projects.
IPU-POD64 delivers the flexibility to make full use of the available space and power in your datacenter, no matter how it is provisioned. It provides 16 petaFLOPS of AI compute for both training and inference, so you can develop and deploy on the same powerful system.
Simplify and streamline your AI datacenter scale-out with pre-configured, pre-approved IPU-POD reference designs
Challenge the status quo by choosing powerful, parallel AI compute and differentiate your business with new AI breakthroughs
Poplar flexibly and simply compiles AI models across any number of IPUs, giving you back precious development time
IPU-Fabric is designed from the ground up for AI to provide close-to-constant communication latency, ready to extend with near-limitless scale
Start with one and scale to AI supercomputer size with flexible, pre-configured reference designs and approved technology ecosystem partners. Take advantage of systems integration skills from our elite partner network to build out your IPU-based dedicated AI infrastructure.
Get world-class results, whether you want to explore innovative models and new possibilities, shorten time to train, or increase throughput and performance per TCO dollar.
BERT-Large: Training
EfficientNet: Training
Designed from the ground up for AI, the Intelligence Processing Unit (IPU) is a fine-grained, massively parallel processor. Each GC200 Mk2 IPU has 1,472 independent cores and 900MB of local In-Processor Memory. IPU-POD64 has 64 GC200 IPUs, packing a powerful 16 petaFLOPS of AI compute for an entirely new approach to machine intelligence compute.
IPU-Fabric is designed from the ground up for AI, using an innovative, ultra-efficient, low-level point-to-point protocol that is compiled in, eliminating the overhead of message passing. The fabric enables collectives and all-reduce operations that are managed and pre-determined at compile time. This provides near-constant communication latency, independent of the number of IPUs and IPU-PODs.
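For readers new to collectives, here is a minimal Python sketch of what a sum all-reduce computes: every device contributes a vector and every device ends up holding the element-wise total. It illustrates the semantics only. It is not how IPU-Fabric or Poplar implement the operation, and the function name `all_reduce_sum` is hypothetical.

```python
from typing import List


def all_reduce_sum(per_device: List[List[float]]) -> List[List[float]]:
    """Return the values each device holds after a sum all-reduce.

    Illustrative only: a plain-Python model of the result, not an
    IPU-Fabric or Poplar API.
    """
    if not per_device:
        return []
    length = len(per_device[0])
    if any(len(vector) != length for vector in per_device):
        raise ValueError("all devices must contribute vectors of equal length")
    # Reduce: element-wise sum across devices.
    total = [sum(column) for column in zip(*per_device)]
    # Broadcast: every device receives its own copy of the total.
    return [list(total) for _ in per_device]


if __name__ == "__main__":
    gradients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    for device, values in enumerate(all_reduce_sum(gradients)):
        print(f"device {device}: {values}")
```

In data-parallel training this is the step that leaves every replica with the same summed gradients before the weight update.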
IPU-Fabric's all-to-all IPU communication has 2.8Tbps low-latency bandwidth between each of the 16 IPU-M2000s that comprise the IPU-POD64. Scale-out is supported using Gateway Links with 2 x 100GbE bi-directional communication between each IPU-M2000.
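The headline figures quoted on this page can be cross-checked with a little arithmetic. The sketch below derives per-unit and aggregate numbers from them. It assumes the 900MB In-Processor Memory figure is per IPU and uses decimal units (1GB = 1000MB); the function name `pod64_totals` is hypothetical.

```python
# Figures quoted on this page.
IPUS_PER_POD64 = 64        # GC200 IPUs in an IPU-POD64
M2000S_PER_POD64 = 16      # IPU-M2000 units in an IPU-POD64
POD64_PETAFLOPS = 16       # AI compute of an IPU-POD64
CORES_PER_IPU = 1472       # independent cores per GC200
MEMORY_PER_IPU_MB = 900    # In-Processor Memory, assumed to be per IPU


def pod64_totals() -> dict:
    """Derive per-unit and aggregate figures from the quoted specs."""
    return {
        "ipus_per_m2000": IPUS_PER_POD64 // M2000S_PER_POD64,
        "petaflops_per_m2000": POD64_PETAFLOPS / M2000S_PER_POD64,
        "teraflops_per_ipu": POD64_PETAFLOPS * 1000 / IPUS_PER_POD64,
        "total_cores": CORES_PER_IPU * IPUS_PER_POD64,
        # Assumption: decimal units, 1 GB = 1000 MB.
        "total_memory_gb": MEMORY_PER_IPU_MB * IPUS_PER_POD64 / 1000,
    }


if __name__ == "__main__":
    for name, value in pod64_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions, each IPU-M2000 holds 4 IPUs and contributes 1 petaFLOPS, and the whole IPU-POD64 has 94,208 cores and 57.6GB of In-Processor Memory.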
Memory operations play a fundamental role in all AI applications. The more efficiently they are performed, the more likely your application is to perform optimally. The IPU uses In-Processor Memory, where every core has its own independent, ultra-low-latency, high-bandwidth SRAM. This means that model parameters, weights and activations can be operated on directly on the same silicon as the processor cores, without costly fetches from and stores to external memory.
IPU-POD64 is supported by Graphcore's Poplar SDK. Poplar is co-designed from scratch with the IPU to implement our graph toolchain. Poplar also implements compiled-in communications, which ensure reliable, deterministic communications and memory operations during execution.
At a high level, Poplar is fully integrated with standard machine learning frameworks so developers can port existing models easily, and get up and running out-of-the-box with new applications in a familiar environment.
Below these frameworks sits Poplar. For developers who want full control to exploit maximum performance from the IPU, Poplar enables direct IPU programming in Python and C++.
Pre-configured as a 4 petaFLOPS AI system, IPU-POD16 is where you experience the power and flexibility of larger IPU systems.
Our core building block for AI infrastructure. The IPU-M2000 packs 1 petaFLOPS of AI compute in a slim 1U blade.
A secure IPU cloud service to add state-of-the-art AI compute on demand, with no on-premise infrastructure deployment required.
Connect with our experts to assess your AI infrastructure requirements and solution fit.