MK1 PERFORMANCE BENCHMARKS
Our Poplar SDK accelerates machine learning training and inference with high-performance optimisations, delivering world-leading performance on IPUs across workloads such as natural language processing, probabilistic modelling, computer vision, recommenders and more. We have provided a selection of our MK1 IPU performance results on this page. All benchmarks were generated using our examples on the Graphcore GitHub page.
Natural Language Processing
BERT (Bidirectional Encoder Representations from Transformers) is one of the best-known NLP models in use today. The IPU accelerates both training and inference on BERT, delivering 25x faster time to train and 2x faster inference at extremely low latency.
BERT-Base: Inference
BERT-Base: Training
Speech Processing
Deep Voice from Baidu is a prominent text-to-speech (TTS) model family for high-quality, end-to-end speech synthesis. The IPU's capacity to significantly accelerate fully convolutional TTS models like Deep Voice 3, with 6.8x higher throughput than a GPU, opens up the opportunity to create entirely new classes of TTS models.
Deep Voice: Training
Computer Vision
The IPU excels with models designed to leverage small group convolutions, thanks to its fine-grained architecture and specific features in the Poplar SDK. We deliver performance gains in both training and inference for newer computer vision models like EfficientNet and ResNeXt.
EfficientNet: Inference
EfficientNet: Training
ResNeXt-101: Inference
ResNeXt-50: Training
Probabilistic Modelling
Probabilistic models based on the Markov Chain Monte Carlo (MCMC) method iteratively sample an implicit distribution, using Hamiltonian Monte Carlo (HMC) schemes to manage noise and uncertainty in data. The IPU delivers 15x faster time to train for MCMC using standard TensorFlow Probability.
MCMC Probabilistic Model: Training
TensorFlow Probability Model - Representative finance workload for alpha estimation
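To make the HMC scheme mentioned above more concrete, here is a minimal sketch in plain Python. It is not the benchmarked TensorFlow Probability workload: the standard normal target, the step size, the number of leapfrog steps and all function names are assumptions chosen purely for illustration.

```python
import math
import random


def log_prob(x):
    """Log-density (up to a constant) of a toy standard normal target (assumed)."""
    return -0.5 * x * x


def grad_log_prob(x):
    """Gradient of the toy log-density above."""
    return -x


def hmc_step(x, rng, step_size=0.2, num_leapfrog=10):
    """One HMC transition: sample momentum, integrate, then accept or reject."""
    p = rng.gauss(0.0, 1.0)  # fresh momentum for this trajectory
    x_new, p_new = x, p

    # Leapfrog integration: half momentum step, full position steps, half momentum step.
    p_new += 0.5 * step_size * grad_log_prob(x_new)
    for i in range(num_leapfrog):
        x_new += step_size * p_new
        if i < num_leapfrog - 1:
            p_new += step_size * grad_log_prob(x_new)
    p_new += 0.5 * step_size * grad_log_prob(x_new)

    # Metropolis correction based on the change in total energy (Hamiltonian).
    h_old = -log_prob(x) + 0.5 * p * p
    h_new = -log_prob(x_new) + 0.5 * p_new * p_new
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return x_new, True
    return x, False


def run_hmc(num_samples=5000, burn_in=500, seed=0):
    """Run a single chain and return the kept samples and the acceptance rate."""
    rng = random.Random(seed)
    x, samples, accepted = 0.0, [], 0
    for step in range(num_samples + burn_in):
        x, ok = hmc_step(x, rng)
        if step >= burn_in:
            samples.append(x)
            accepted += ok
    return samples, accepted / num_samples


if __name__ == "__main__":
    samples, rate = run_hmc()
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    print(f"mean={mean:.3f} var={var:.3f} acceptance={rate:.2f}")
```

Each transition is a short sequential chain of gradient evaluations, and a real workload would typically run many such chains in parallel over a far richer model than this one-dimensional toy.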
Variational Inference (VI) is another common approach to probabilistic inference: it introduces an approximate distribution, which is then sampled and optimised to get as close as possible to the target. In a TensorFlow-based Variational Autoencoder (VAE) model combining both approaches, the IPU delivers over 4.8x faster time to train.
VAE Probabilistic Model: Training
TensorFlow Variational Autoencoder Model - MCMC & VI combination
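The idea behind VI, sampling an approximate distribution and optimising it towards the target, can be sketched in a few lines of plain Python. This is not the benchmarked VAE: the Gaussian target, its parameters, the learning rate and the function names are all assumptions made for illustration.

```python
import math
import random

# Hypothetical toy target: a Gaussian with these parameters (assumed for illustration).
TARGET_MEAN = 3.0
TARGET_STD = 0.5


def grad_log_target(z):
    """Gradient of the target log-density with respect to z."""
    return -(z - TARGET_MEAN) / TARGET_STD ** 2


def fit_gaussian_vi(num_steps=2000, batch=32, lr=0.01, seed=0):
    """Fit q(z) = N(mu, sigma^2) to the target by stochastic gradient ascent on the ELBO."""
    rng = random.Random(seed)
    mu, log_sigma = 0.0, 0.0
    for _ in range(num_steps):
        sigma = math.exp(log_sigma)
        g_mu, g_log_sigma = 0.0, 0.0
        for _ in range(batch):
            # Reparameterisation: draw z from q as a deterministic function of noise.
            eps = rng.gauss(0.0, 1.0)
            z = mu + sigma * eps
            g = grad_log_target(z)
            g_mu += g                      # dz/dmu = 1
            g_log_sigma += g * sigma * eps  # dz/dlog_sigma = sigma * eps
        g_mu /= batch
        # The entropy of q contributes log(sigma) + const, whose gradient in log_sigma is 1.
        g_log_sigma = g_log_sigma / batch + 1.0
        mu += lr * g_mu
        log_sigma += lr * g_log_sigma
    return mu, math.exp(log_sigma)


if __name__ == "__main__":
    mu, sigma = fit_gaussian_vi()
    print(f"fitted mu={mu:.3f} sigma={sigma:.3f}")
```

Because the toy target is itself Gaussian, the fitted parameters should approach the target's mean and standard deviation; with a more complex target, the approximation would only get as close as the chosen family allows.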
Sales Forecasting
The IPU is well suited to time series analysis applications. Here, Multi-Layer Perceptron (MLP) networks are used for sales forecasting, showing a 6x training throughput advantage.
Time Series Analysis: Training
Sales Forecasting Model | Multi-Layer Perceptron (MLP) + Embedding
Recommenders
Autoencoders are efficient for recommendation and ranking. In this dense autoencoder model, using a public Netflix dataset, the IPU more than doubles training throughput.
Dense Autoencoder: Training
Dense autoencoder model for content recommendation and ranking