Our Poplar SDK accelerates machine learning training and inference with high-performance optimisations that deliver world-leading performance on IPUs across domains such as natural language processing, probabilistic modelling, computer vision, recommenders and more. This page presents a selection of the latest IPU performance results and will be updated regularly. To replicate our benchmarks, visit the Graphcore GitHub site for public code examples and applications.
Natural Language Processing
BERT (Bidirectional Encoder Representations from Transformers) is one of the best-known NLP models in use today. The IPU accelerates both BERT training and inference, delivering 25x faster time to train and 2x faster inference at extremely low latency.
Deep Voice from Baidu is a prominent text-to-speech (TTS) model family for high-quality, end-to-end speech synthesis. The IPU's capacity to significantly accelerate fully convolutional TTS models like Deep Voice 3, delivering 6.8x higher throughput than a GPU, opens up the opportunity to create entirely new classes of TTS models.
Deep Voice: Training
The IPU excels with models designed to leverage small, grouped convolutions, thanks to its fine-grained architecture and specific features in the Poplar SDK. We deliver performance gains in both training and inference for newer computer vision models like EfficientNet and ResNeXt.
EfficientNet: Inference
EfficientNet: Training
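Grouped convolutions, the building block that ResNeXt popularised, split the input and output channels into independent groups so that each filter only sees a fraction of the channels. As a rough illustration of the idea (the function name and shapes below are our own, and this is not IPU or Poplar code), a minimal NumPy sketch:

```python
import numpy as np

def grouped_conv2d(x, w, groups):
    """Grouped 2D convolution (stride 1, no padding).

    x: (C_in, H, W); w: (C_out, C_in // groups, kH, kW).
    Input and output channels are split into `groups` independent
    convolutions, as in ResNeXt; groups == C_in gives a depthwise conv.
    """
    c_in, H, W = x.shape
    c_out, c_per_g, kH, kW = w.shape
    assert c_in % groups == 0 and c_out % groups == 0
    assert c_per_g == c_in // groups
    out = np.zeros((c_out, H - kH + 1, W - kW + 1))
    out_per_g = c_out // groups
    for g in range(groups):
        xs = x[g * c_per_g:(g + 1) * c_per_g]   # this group's input channels
        for oc in range(g * out_per_g, (g + 1) * out_per_g):
            for i in range(out.shape[1]):
                for j in range(out.shape[2]):
                    out[oc, i, j] = np.sum(xs[:, i:i + kH, j:j + kW] * w[oc])
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5, 5))       # 4 input channels
w = rng.standard_normal((6, 2, 3, 3))    # 6 output channels in 2 groups
y = grouped_conv2d(x, w, groups=2)       # output shape (6, 3, 3)
```

A convolution with g groups uses 1/g of the weights and multiply-accumulates of a full convolution, so the work decomposes into many small independent pieces, the pattern the fine-grained architecture rewards.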
Probabilistic models based on the Markov Chain Monte Carlo (MCMC) method iteratively sample from an implicit distribution, using Hamiltonian Monte Carlo (HMC) schemes to manage noise and uncertainty in data. The IPU delivers 15x faster time to train for MCMC using standard TensorFlow Probability.
MCMC Probabilistic Model: Training
TensorFlow Probability Model - Representative finance workload for alpha estimation
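For readers unfamiliar with HMC: the scheme proposes new samples by simulating Hamiltonian dynamics with a leapfrog integrator and then applying a Metropolis accept/reject correction. A minimal pure-NumPy sketch on a toy 1-D standard-normal target (illustrative only; the step size, trajectory length and target are our own choices, not the TensorFlow Probability benchmark workload):

```python
import numpy as np

def hmc_sample(logp, grad_logp, init, n_samples=2000,
               step=0.1, n_leapfrog=20, seed=0):
    """Minimal 1-D Hamiltonian Monte Carlo sampler."""
    rng = np.random.default_rng(seed)
    q = float(init)
    samples = np.empty(n_samples)
    for k in range(n_samples):
        p = rng.standard_normal()              # resample momentum
        q_new, p_new = q, p
        # leapfrog integration of the Hamiltonian dynamics
        p_new += 0.5 * step * grad_logp(q_new)
        for _ in range(n_leapfrog - 1):
            q_new += step * p_new
            p_new += step * grad_logp(q_new)
        q_new += step * p_new
        p_new += 0.5 * step * grad_logp(q_new)
        # Metropolis accept/reject on the total energy
        h_old = -logp(q) + 0.5 * p * p
        h_new = -logp(q_new) + 0.5 * p_new * p_new
        if rng.random() < np.exp(h_old - h_new):
            q = q_new
        samples[k] = q
    return samples

# toy target: standard normal, log p(q) = -q^2 / 2 up to a constant
chain = hmc_sample(lambda q: -0.5 * q * q, lambda q: -q, init=1.0)
```

Because the leapfrog steps nearly conserve energy, almost every proposal is accepted while still moving far across the distribution, which is what makes HMC effective for the noisy, high-dimensional posteriors mentioned above.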
Variational Inference (VI) is another common way of managing probabilistic inference: an approximate distribution is introduced, then sampled and optimised to get as close as possible to the target. In a TensorFlow-based Variational Autoencoder (VAE) model combining both approaches, the IPU delivers over 4.8x faster time to train.
VAE Probabilistic Model: Training
TensorFlow Variational Autoencoder Model - MCMC & VI combination
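The VI recipe above (approximate distribution, sampling, optimisation) can be sketched with the reparameterisation trick that VAEs rely on. Everything below is a toy assumption rather than the benchmark model: a Gaussian q = N(mu, sigma²) is fitted to a known 1-D Gaussian target by stochastic gradient ascent on the ELBO:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical target posterior: N(3, 0.5^2); in practice only an
# unnormalised log-density and its gradient would be available
target_mu, target_sigma = 3.0, 0.5

def grad_log_p(z):
    # d/dz log N(z; target_mu, target_sigma^2)
    return -(z - target_mu) / target_sigma**2

# variational approximation q = N(mu, sigma^2), optimised to maximise
# the ELBO via the reparameterisation trick: z = mu + sigma * eps
mu, log_sigma = 0.0, 0.0
lr = 0.05
for _ in range(2000):
    eps = rng.standard_normal(64)
    sigma = np.exp(log_sigma)
    z = mu + sigma * eps
    g = grad_log_p(z)
    grad_mu = g.mean()                               # pathwise gradient wrt mu
    grad_log_sigma = (g * eps).mean() * sigma + 1.0  # + entropy gradient
    mu += lr * grad_mu
    log_sigma += lr * grad_log_sigma
# q should now closely match the target: mu near 3, sigma near 0.5
```

Because the target here is itself Gaussian, q can match it exactly; with a non-Gaussian target the same loop converges to the closest Gaussian approximation in KL divergence.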
The IPU is well suited to time series analysis applications. Here, Multi-Layer Perceptron (MLP) networks are used for sales forecasting, showing a 6x training throughput advantage.
Time Series Analysis: Training
Sales Forecasting Model | Multi-Layer Perceptron (MLP) + Embedding
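As a rough illustration of this forecasting setup (the synthetic weekly-seasonal series and all hyperparameters below are our own assumptions, not the benchmark's data or model), a small NumPy MLP trained on sliding windows of the series:

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical "sales" series with weekly seasonality, a stand-in for
# real sales data, standardised to zero mean and unit variance
t = np.arange(400)
series = 10 + 3 * np.sin(2 * np.pi * t / 7) + 0.3 * rng.standard_normal(t.size)
series = (series - series.mean()) / series.std()

# sliding-window features: predict the next value from the previous 7
X = np.stack([series[i:i + 7] for i in range(series.size - 7)])
y = series[7:]

# one-hidden-layer MLP trained with full-batch gradient descent
W1 = 0.5 * rng.standard_normal((7, 16)); b1 = np.zeros(16)
W2 = 0.5 * rng.standard_normal(16);      b2 = 0.0
lr, losses = 0.05, []
for _ in range(500):
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y
    losses.append(float((err ** 2).mean()))
    # backpropagation: compute every gradient before updating any weight
    g = 2 * err / y.size
    gW2, gb2 = h.T @ g, g.sum()
    gh = np.outer(g, W2) * (1 - h ** 2)
    gW1, gb1 = X.T @ gh, gh.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2
# the training loss should fall steadily as the MLP learns the seasonality
```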
Autoencoders are an efficient approach to recommendation and ranking. In this dense autoencoder model, trained on a public Netflix dataset, the IPU more than doubles training throughput.
Dense Autoencoder: Training
For content recommendation and ranking
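To illustrate the technique (not the benchmark model: the low-rank toy matrix and hyperparameters below are our own assumptions, and a real recommender such as the Netflix workload would mask the loss to observed ratings only), a dense autoencoder that reconstructs user rating rows in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
# toy low-rank "ratings" matrix (users x items), a stand-in for real
# data such as the Netflix ratings; standardised to unit variance
n_users, n_items, rank, hidden = 200, 50, 4, 16
R = rng.standard_normal((n_users, rank)) @ rng.standard_normal((rank, n_items))
R /= R.std()

# dense autoencoder: encode each user's rating row, decode a full row
W1 = 0.1 * rng.standard_normal((n_items, hidden))
W2 = 0.1 * rng.standard_normal((hidden, n_items))
lr, losses = 0.05, []
for _ in range(800):
    H = np.tanh(R @ W1)            # encoder
    out = H @ W2                   # decoder: predicted scores for every item
    err = out - R
    losses.append(float((err ** 2).mean()))
    g = 2 * err / n_users          # gradient of the per-user summed loss
    gW2 = H.T @ g
    gH = g @ W2.T * (1 - H ** 2)
    gW1 = R.T @ gH
    W1 -= lr * gW1
    W2 -= lr * gW2
# each row of `out` scores all items for one user; sorting a row
# ranks the items for recommendation
```

The appeal for recommenders is that a single forward pass produces scores for every item at once, so ranking reduces to sorting the decoder output.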