DSS8440 IPU Server



Make new breakthroughs in machine intelligence with 8 Graphcore C2 PCIe cards connected with high-speed IPU-Links™ in an industry-standard OEM system

Download White Paper

The Dell DSS8440 Graphcore IPU Server

The Dell DSS8440 IPU Server is a 4U rack-mounted chassis with eight Graphcore C2 PCIe cards, fully connected with high speed IPU-Links™.

Designed for both training and inference, this IPU Server is ideal for experimentation, pre-production pilots and commercial deployment.

Read DSS8440 product brief
C2 PCIe Cards

High Performance AI compute

Tackle your most challenging machine learning workloads with 16 Colossus™ Mk1 GC2 IPUs all working together to deliver 1.6 PetaFlops of AI compute.

Each eight-card IPU Server gives you 4.8GB of In-Processor Memory™, plus access to even more Streaming Memory™ via the Exchange-Memory management features of the Poplar® software stack.

Learn more about memory
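The per-IPU figures implied by these totals can be checked with simple arithmetic. The sketch below derives them from the numbers quoted above (8 cards per server, 2 IPUs per card, 1.6 petaflops of AI compute, 4.8GB of In-Processor Memory); it is an illustrative back-of-the-envelope calculation, not Graphcore-published per-chip data.

```python
# Back-of-the-envelope check of the per-IPU figures implied by the
# DSS8440 totals quoted above (assumed: 8 C2 cards, 2 GC2 IPUs per card).
cards_per_server = 8
ipus_per_card = 2
total_ipus = cards_per_server * ipus_per_card        # 16 IPUs per server

total_compute_pflops = 1.6                           # stated total AI compute
per_ipu_tflops = total_compute_pflops * 1000 / total_ipus

total_memory_gb = 4.8                                # stated In-Processor Memory
per_ipu_memory_mb = total_memory_gb * 1000 / total_ipus

print(f"{total_ipus} IPUs, "
      f"{per_ipu_tflops:.0f} TFLOPS and "
      f"{per_ipu_memory_mb:.0f} MB In-Processor Memory per IPU")
# → 16 IPUs, 100 TFLOPS and 300 MB In-Processor Memory per IPU
```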
Microsoft Azure

Azure IPU Preview

The Graphcore IPU preview on Microsoft Azure is open for customers focused on developing new breakthroughs in machine intelligence.

Register now

Dell IPU Server

The Dell EMC DSS8440 machine intelligence server with Graphcore technology for enterprise customers building out on-premise AI compute.

Register now

Cirrascale IPU Cloud

Cirrascale offers an IPU bare-metal cloud service and Dell EMC DSS8440 IPU servers for sale for on-premise customer applications.

Buy now

Natural Language Processing - BERT

The IPU delivers over 25% faster time-to-train with the BERT language model, training BERT-Base in 36.3 hours on seven C2 IPU-Processor PCIe cards, each with two IPUs, in an IPU Server system. For BERT inference, the IPU delivers more than 2x higher throughput at the lowest latency.


Image Classification - ResNeXt

The Graphcore C2 IPU-Processor PCIe card achieves 7x higher throughput at 24x lower latency compared to a leading alternative processor. High throughput at the lowest possible latency is key to many of today's most important use cases.

Faster Time to Results

Each individual IPU delivers a new level of fine-grained, parallel processing across thousands of independent processing threads. The whole machine intelligence model is held inside the IPU's In-Processor Memory to maximise memory bandwidth, delivering high throughput for faster time to train and the lowest-latency inference.

Enabling Innovation

See record-breaking time to train with modern high-accuracy computer vision models, like ResNeXt and EfficientNet. Explore new, large Natural Language Processing models that take full advantage of the IPU's native sparsity support.

Training & Inference Support

High-performance training and low-latency inference capability on the same hardware improves utilisation and flexibility in the cloud and on-premise, significantly reducing total cost of ownership.

Designed for the Future

The IPU is designed to scale. Models are getting larger and demand for AI compute is growing exponentially. High-bandwidth IPU-Links™ allow tight integration of 16 IPUs within the server, while InfiniBand support allows IPU Servers to work together in a datacenter.


The Intelligence Processing Unit (IPU)

Learn more

Poplar® Software Stack

Learn more