MK2 PERFORMANCE BENCHMARKS
Our Poplar SDK accelerates machine learning training and inference with high-performance optimisations that deliver world-leading performance on IPUs across domains such as natural language processing, probabilistic modelling and computer vision. This page provides a selection of the latest MK2 IPU performance benchmark charts and will be updated regularly. We also now provide detailed MK2 training and inference performance data in table format. You can reproduce all of these benchmarks using the code in the examples repository on the Graphcore GitHub page.
Natural Language Processing
BERT-Large (Bidirectional Encoder Representations from Transformers) is one of the most well-known NLP models in use today. The IPU accelerates both training and inference on BERT-Large, delivering faster time-to-train and significantly higher inference throughput at extremely low latency.
BERT-Large: Inference
BERT-Large: TTT (time-to-train)
Computer Vision
The IPU excels with models designed to leverage small group convolutions, thanks to its fine-grained architecture and unique Poplar features. We deliver unparalleled performance for both training and inference with newer computer vision models such as EfficientNet and ResNeXt, which offer higher accuracy and improved efficiency, as well as with traditional computer vision models such as ResNet-50. A short grouped-convolution sketch follows the charts below.
EfficientNet-B0: Inference
EfficientNet-B4: Training
ResNeXt-101: Inference
ResNeXt-101: Training
ResNet-50: Inference
ResNet-50: Training
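To make the pattern concrete, here is a minimal sketch of a grouped convolution in TensorFlow. It is an illustrative fragment under assumed shapes and group count, not code from our benchmarks; the groups argument to Conv2D requires TensorFlow 2.3 or later.

# A minimal sketch of a grouped convolution, assuming TensorFlow 2.3+.
# The shapes and group count are illustrative, not from any benchmark.
import tensorflow as tf

# A batch of 8 feature maps: 32x32 spatial, 64 channels.
x = tf.random.normal([8, 32, 32, 64])

# Standard convolution: every output channel sees all 64 input channels.
dense_conv = tf.keras.layers.Conv2D(filters=64, kernel_size=3, padding="same")

# Grouped convolution: the 64 channels are split into 32 groups of 2,
# so each filter mixes channels only within its own small group.
group_conv = tf.keras.layers.Conv2D(
    filters=64, kernel_size=3, padding="same", groups=32)

print(dense_conv(x).shape)  # (8, 32, 32, 64)
print(group_conv(x).shape)  # (8, 32, 32, 64), with far fewer weights

Because each filter mixes only the channels within its own small group, a grouped convolution needs far fewer weights than a dense one at the same channel count, which is the efficiency trade-off models like ResNeXt exploit.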
Probabilistic Modelling
Probabilistic models based on the Markov Chain Monte Carlo (MCMC) method draw iterative samples from an implicit distribution, often using Hamiltonian Monte Carlo (HMC) schemes, to manage noise and uncertainty in data. The IPU delivers faster time-to-result for MCMC using standard TensorFlow Probability. A minimal TensorFlow Probability sketch follows the charts below.
MCMC Probabilistic Model: Training
TensorFlow Probability Model: Representative finance workload for alpha estimation
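The following is a minimal sketch of HMC sampling with standard TensorFlow Probability. The toy Gaussian target and the sampler settings are illustrative assumptions, not the finance workload charted above.

# A minimal sketch of HMC sampling with standard TensorFlow Probability.
# The target distribution and sampler settings are illustrative only.
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Toy target: a 2-D Gaussian posterior over model parameters.
target = tfd.MultivariateNormalDiag(loc=[0., 0.], scale_diag=[1., 2.])

kernel = tfp.mcmc.HamiltonianMonteCarlo(
    target_log_prob_fn=target.log_prob,
    step_size=0.1,
    num_leapfrog_steps=5)

samples, is_accepted = tfp.mcmc.sample_chain(
    num_results=1000,
    num_burnin_steps=500,
    current_state=tf.zeros(2),
    kernel=kernel,
    trace_fn=lambda _, results: results.is_accepted)

print(tf.reduce_mean(samples, axis=0))                    # posterior mean estimate
print(tf.reduce_mean(tf.cast(is_accepted, tf.float32)))   # acceptance rate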
Time Series Analysis
The IPU is well suited to time series analysis applications. Here, an LSTM inference model shows lower latency and considerably higher throughput. A minimal two-layer LSTM sketch follows the charts below.
LSTM: Inference
2-layer LSTM model
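As a reference point, here is a minimal sketch of a 2-layer LSTM inference model in Keras. The layer sizes, sequence length and forecast head are illustrative assumptions, not the benchmarked configuration.

# A minimal sketch of a 2-layer LSTM inference model in Keras.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(128, 32)),                   # 128 timesteps, 32 features
    tf.keras.layers.LSTM(256, return_sequences=True),  # layer 1: full sequence out
    tf.keras.layers.LSTM(256),                         # layer 2: final state only
    tf.keras.layers.Dense(1),                          # e.g. one-step-ahead forecast
])

batch = tf.random.normal([16, 128, 32])
prediction = model(batch, training=False)  # inference pass
print(prediction.shape)                    # (16, 1)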
Speech Processing
Deep Voice from Baidu is a prominent text-to-speech (TTS) model family for high-quality, end-to-end speech synthesis. The IPU's ability to accelerate fully convolutional TTS models such as Deep Voice 3, with notably higher throughput than a GPU, opens up the opportunity to create entirely new classes of TTS models. A minimal sketch of the causal convolution at the heart of such models follows.
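For illustration, here is a minimal sketch of the causal 1-D convolution stack that makes such models fully convolutional. It is a hypothetical fragment, not Baidu's Deep Voice 3 architecture or our benchmark code.

# A minimal sketch of a causal 1-D convolution stack, the kind of
# building block used by fully convolutional TTS models. Illustrative only.
import tensorflow as tf

def causal_conv_block(channels, kernel_size, dilation):
    # "causal" padding means each output frame depends only on past frames,
    # so the whole stack can replace a recurrent network.
    return tf.keras.layers.Conv1D(
        filters=channels, kernel_size=kernel_size,
        dilation_rate=dilation, padding="causal", activation="relu")

seq = tf.random.normal([4, 200, 80])  # batch of 200-frame, 80-dim spectrograms
for d in (1, 2, 4, 8):                # growing dilation widens the receptive field
    seq = causal_conv_block(80, kernel_size=3, dilation=d)(seq)
print(seq.shape)                      # (4, 200, 80)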