
Performance Results

Here we provide performance results for the new Bow Pod platforms, results from our MLPerf Training v2.0 submission, and results from our own benchmarking across a wider range of models for both training and inference.

Bow Platform - Training

Here we provide Training performance results for the new Bow Pod platforms. Throughput in this context is defined as the number of input data points (sequences, images, or rows) processed by the model per second. 
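As an illustration of the throughput definition above, the following minimal sketch computes data points per second from a sample count and an elapsed time. The function name and the example numbers (steps, batch size, duration) are hypothetical and are not taken from any result on this page.

```python
def throughput(num_samples: int, elapsed_seconds: float) -> float:
    """Input data points (sequences, images, or rows) processed per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_samples / elapsed_seconds


# Hypothetical run: 100 training steps at a global batch size of 512,
# completed in 20 seconds of wall-clock time.
steps, global_batch_size, elapsed = 100, 512, 20.0
samples_per_second = throughput(steps * global_batch_size, elapsed)
print(samples_per_second)  # 2560.0
```

Counting samples rather than steps keeps the figure comparable across configurations that use different batch sizes.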

 

The results below give the throughput obtained for each referenced model in the specified configuration.

Bow Platform - Inference

Model inference in this context refers to running a trained model on input data to infer output. Inference performance in production setups is typically measured by two metrics: throughput (as defined previously) and latency, which is defined in this context as the amount of time taken for the model to provide an output given an input.
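To make the two inference metrics concrete, here is a minimal sketch that times repeated calls to a model at a fixed batch size and reports latency and throughput. It is not the harness used to produce the published results: `benchmark_inference` and `dummy_model` are hypothetical names, and the stand-in model simply sleeps for a millisecond to simulate work.

```python
import statistics
import time
from typing import Callable, Sequence


def benchmark_inference(
    model: Callable[[Sequence], Sequence],
    batch: Sequence,
    iterations: int = 50,
    warmup: int = 5,
) -> dict:
    """Time repeated inference calls on one batch and summarise the results."""
    # Warm-up calls are excluded from the measurement.
    for _ in range(warmup):
        model(batch)

    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        model(batch)
        latencies.append(time.perf_counter() - start)

    total_seconds = sum(latencies)
    return {
        "batch_size": len(batch),
        # Latency: time for the model to produce an output given an input batch.
        "mean_latency_s": statistics.mean(latencies),
        "max_latency_s": max(latencies),
        # Throughput: input data points processed per second.
        "throughput_per_s": len(batch) * iterations / total_seconds,
    }


def dummy_model(batch: Sequence) -> list:
    """Hypothetical stand-in for a trained model (assumption, not a real API)."""
    time.sleep(0.001)  # simulate a small amount of compute
    return [x * 2 for x in batch]


result = benchmark_inference(dummy_model, list(range(16)))
print(result)
```

Larger batch sizes usually raise throughput at the cost of higher latency, which is why the two metrics are reported together for a given batch size.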

 

Below we provide results for the new Bow-2000 platform, reported as throughput and latency for a given batch size.

MLPerf v2.0 Training Performance

For MLPerf Training version 2.0, we chose to submit in the popular application benchmark categories of Image Classification (ResNet-50) and Natural Language Processing (BERT), and made a new entry as an Open submission in the Speech Transcription category for RNN-T.

 

There are two divisions for submissions. The Closed division requires submitters to use exactly the same model and optimizer implementation, including the defined hyperparameter state and training epochs. The Open division fosters innovation by allowing different model implementations that are better tuned to different processor capabilities or, as in this case, more closely aligned with customer requirements.

MLPerf v2.0 Training Results | MLPerf ID: 2.0-2045, 2.0-2049, 2.0-2051, 2.0-2053

MLPerf v2.0 Training Results | MLPerf ID: 2.0-2047, 2.0-2050, 2.0-2052, 2.0-2054

The MLPerf name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved.
Unauthorized use strictly prohibited. See www.mlperf.org for more information.

IPU-POD Classic - Training

Training a machine learning model involves running the algorithm over an input dataset (training data) until the model converges - meaning that it has learned to produce the desired output to a specified accuracy. Throughput in this context is defined as the number of input data points (sequences, images, or rows) processed by the model per second. Throughput is often used as a measure of hardware performance as it is directly related to the time for the model to train to a specified accuracy.
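The relationship between throughput and training time can be sketched with simple arithmetic. The example below assumes, as a simplification, that the number of epochs needed to reach the target accuracy is known and that throughput stays constant over the run; the function name and all numbers are hypothetical.

```python
def estimated_training_seconds(
    dataset_size: int, epochs: int, samples_per_second: float
) -> float:
    """Estimate wall-clock training time from a sustained throughput.

    Assumes a fixed epoch count to convergence and constant throughput,
    and ignores evaluation, checkpointing, and compilation time.
    """
    if samples_per_second <= 0:
        raise ValueError("samples_per_second must be positive")
    return dataset_size * epochs / samples_per_second


# Hypothetical workload: 1,000,000 samples, 40 epochs, 25,000 samples/s.
seconds = estimated_training_seconds(1_000_000, 40, 25_000.0)
print(seconds)       # 1600.0
print(seconds / 60)  # minutes
```

Under these assumptions, doubling throughput halves the estimated time to train, which is why throughput is commonly used as a proxy for hardware performance.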


The results provided below detail the obtained throughput values for each of the referenced models in the specified configuration. 

IPU-POD Classic - Time to Result

IPU-POD Classic - Inference

Model inference in this context refers to running a model on input data to infer output. Inference performance in production setups is typically measured on two metrics: throughput (as defined previously) and latency, which is defined as the time taken to execute an inference. 

Precision terminology: X.Y is defined as follows: X is the precision for storing the activations and gradients, and Y is the precision for storing the weights. When training with 16.16 precision, we may still use FP32 for other variables (such as norms or momentum) and include stochastic rounding.
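As a small illustration of the X.Y notation, the sketch below splits a precision label into its two parts. The `Precision` type and `parse_precision` helper are hypothetical names introduced here for illustration; they are not part of any Graphcore library.

```python
from typing import NamedTuple


class Precision(NamedTuple):
    activations_and_gradients_bits: int  # X in the X.Y label
    weights_bits: int                    # Y in the X.Y label


def parse_precision(label: str) -> Precision:
    """Split an 'X.Y' precision label into its two bit widths."""
    parts = label.split(".")
    if len(parts) != 2:
        raise ValueError(f"expected a label of the form 'X.Y', got {label!r}")
    x, y = parts
    return Precision(int(x), int(y))


half = parse_precision("16.16")
print(half.activations_and_gradients_bits, half.weights_bits)  # 16 16

mixed = parse_precision("16.32")
print(mixed.activations_and_gradients_bits, mixed.weights_bits)  # 16 32
```

The label describes storage precision only; as noted above, other variables such as norms or momentum may still be held in FP32.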

Benchmarks were generated using our examples on the Graphcore GitHub.

This page was last updated on Wednesday, June 29, 2022
