Qwant, a leading European search engine, has published a new paper evaluating the performance of Graphcore's IPU processors on deep neural network inference. The paper focuses on deep vision models such as ResNeXt, including analysis of the observed latency, throughput and energy efficiency.
The paper summarizes the performance of a single C2 card performing inference with the recent image-based deep learning model ResNeXt-101. For the evaluation, researchers at Qwant used the full compute capacity of the C2 card, accessed via the Microsoft Azure cloud, by running one inference session in parallel on each of its two IPU processors. Innovators can request Preview Access to Graphcore IPU Azure VMs (NDv3) here.
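The two-sessions-in-parallel setup described above can be sketched in plain Python. This is only an illustrative skeleton: `run_inference` is a hypothetical stand-in for a real per-IPU inference session (e.g. a PopART session pinned to one device), and here it simply counts the images it is handed.

```python
from concurrent.futures import ThreadPoolExecutor

def run_inference(device_id, batches):
    # Hypothetical stand-in for one inference session bound to one IPU;
    # a real implementation would run the model on each batch.
    return sum(len(batch) for batch in batches)

# Two IPUs on one C2 card, each consuming its own stream of batches
# (here: two batches of four dummy images per device).
batches_per_device = {0: [[0] * 4, [0] * 4], 1: [[0] * 4, [0] * 4]}

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = {dev: pool.submit(run_inference, dev, data)
               for dev, data in batches_per_device.items()}
    images_done = {dev: f.result() for dev, f in futures.items()}

print(images_done)  # one image count per IPU: {0: 8, 1: 8}
```

The point of the sketch is the concurrency pattern: each device gets an independent session and an independent work stream, so the card's two processors are kept busy simultaneously.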
The implementation used a PyTorch model exported to the industry-standard Open Neural Network eXchange (ONNX) format, which was then run in PopART (Poplar Advanced Runtime). Qwant evaluated the C2 card for real-time applications using three performance metrics: latency (time per batch), throughput (images per second) and energy efficiency (images per second per Watt).
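The three metrics above are straightforward to derive from wall-clock timing and a power reading. A minimal sketch, assuming a caller-supplied `run_batch` callable (a hypothetical stand-in for one inference step, e.g. a PopART session run) and a measured board power in Watts:

```python
import time

def benchmark(run_batch, num_batches, batch_size, power_watts):
    """Time repeated batch inference and derive latency, throughput
    and energy efficiency as defined in the paper."""
    start = time.perf_counter()
    for _ in range(num_batches):
        run_batch()
    elapsed = time.perf_counter() - start

    latency = elapsed / num_batches        # seconds per batch
    throughput = batch_size / latency      # images per second
    efficiency = throughput / power_watts  # images per second per Watt
    return latency, throughput, efficiency

# Usage with a dummy workload (batch size and power are illustrative):
lat, thr, eff = benchmark(lambda: None, num_batches=10,
                          batch_size=8, power_watts=60.0)
```

Note that efficiency divides by measured power, so a device that is slightly slower but draws far less power can still win on images per second per Watt.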