The Graphcore 740 IPU Server is built for AI inference and deployment


Introducing the Graphcore 740 IPU Server

The Graphcore 740 IPU Server is a 2U high-performance server for machine learning inference. It is ideally suited to the AI workload requirements of industries such as finance and healthcare.

The server is built with inference use cases in mind, but each C2 card in the server can also be used for training, as the IPU efficiently supports both training and inference.


C2 IPU-Processor PCIe Cards

The Graphcore 740 IPU Server is a 2U rack-mounted chassis housing two Graphcore C2 PCIe IPU cards, each of which carries two Graphcore Colossus MK1 GC2 IPUs. The IPUs are served by two Intel® Xeon host CPUs.

It provides 400 TeraFLOPS of IPU compute, with 1.2 GB of In-Processor Memory™.

The platform is fully supported by Graphcore’s Poplar® software to provide a complete platform for accelerated machine intelligence deployment.

LSTM - INFERENCE

Time Series Analysis

The IPU in the 740 platform delivers >300 higher throughput over the alternative power-equivalent GPU platform on LSTM Inference for Time Series Analysis problems.

This is a strong indicator of the IPU's inherent advantage over the GPU architecture. LSTM is representative of the time series analysis models used in the finance industry for feature generation and alpha estimation.

RESNEXT-101 - INFERENCE

Computer Vision

The IPU excels at next-generation models designed to use small group convolutions, thanks to its fine-grained architecture and specific features in the Poplar SDK.

Just one C2 card in the 740 IPU Server delivers huge performance gains with modern computer vision models like ResNeXt.


Probabilistic Modelling

Probabilistic modelling is an example of a training workload that is well suited to the 740 IPU Server, using just one C2 card.

Probabilistic models using Markov chain Monte Carlo (MCMC) are often used for alpha estimation in finance. This kind of complex modelling approach is too computationally intensive for legacy processor architectures to perform well.

Faster Performance

The IPU’s unique architecture supports a new level of fine-grained, parallel processing across thousands of independent processing threads on each individual IPU. Combined with unparalleled Exchange-Memory bandwidth, this delivers unrivalled performance.

Enabling Innovation

The IPU has been designed to support complex data access patterns efficiently and at much higher speeds. By contrast, legacy processors struggle with non-aligned and sparse data accesses, which are critical for emerging machine learning models.

Inference and Training

Although targeted at inference applications, each C2 card in the 740 IPU Server can also be used for training, because the IPU inherently supports both training and inference. Our MCMC probabilistic model training benchmark highlights this.

Designed for the Future

Machine learning model development is already shifting towards the pursuit of improved accuracy with fewer parameters and enhanced efficiency with sparse, non-aligned data. These next-generation models exploit smaller kernels and sparse data, making them ideally suited to the IPU’s fine-grained, parallel processing architecture.

