Major PopVision Update released

Major Update Released for Poplar Application Analysis Tool

Written by Simon Long

Posted Oct 05, 2020

Since launching our Poplar® application analysis tool six months ago, we’ve been rapidly expanding its capabilities. We’ve added more than 30 new features to the PopVision™ Graph Analyser for versions 2.0 and 2.1. This update will give developers deeper insight into how the IPU processes their applications, helping them to optimise their models for state-of-the-art performance. Version 2.1 is being released as part of our new Poplar SDK 1.3 release.

Graphcore created the PopVision™ Graph Analyser to give developers detailed information about how their code functions and provide them with the tools to build better machine intelligence applications. This transparency is extremely useful for developers when programming the IPU, primarily because the processor has a completely different architecture to CPUs and GPUs. The PopVision tool includes visual reports on both memory usage and program execution to show developers how their applications run on the IPU and give essential information that can inform future model development and optimisation.

The four main reports are:

Summary Report: View essential program information

Memory Report: Analyse program memory usage and layout on one or multiple IPUs

Liveness Report: Explore temporary peaks in memory and their impact

Execution Trace Report: View program execution

If you’re not already familiar with the PopVision™ Graph Analyser, here’s a quick introduction:

 

What’s New in PopVision 2.0 and 2.1?

Here is a selection of the PopVision™ Graph Analyser’s top new features:

  • New report comparison views (2.0)
  • Show memory usage per IPU (2.1)
  • New memory tile map view (2.1)
  • View liveness per IPU (2.0) and per Tile (2.0)
  • New visualisation of Bulk Synchronous Parallel (BSP) (2.0)
  • Visualisation of overlapped IO in Execution Trace (2.1)

Comparing Reports

Often when optimising a machine learning model, developers will make changes to their model and want to understand how those changes have affected the Poplar program. To help with this, we have added the ability to open two Poplar reports at once; the tool then highlights the differences between them.

[Image: PopVision 2 report comparison]

All of the reports in PopVision 1.0 have been updated to support comparing two Poplar reports. New graphs have been added to the memory report to show the differences in memory usage between the two reports.

During internal testing of PopVision 2.0, we were surprised to learn that Graphcore developers use the new report comparison feature more frequently than they open a single report. Following this feedback, we’ve placed more focus on the ability to compare two reports generated with different parameters. This has accelerated the optimisation of the models that are being analysed with the tool.
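To make the idea of a memory delta concrete, here is a minimal sketch of comparing two reports by category. It does not read real Poplar report files: the category names and byte counts are invented for illustration, and `memory_deltas` is a hypothetical helper rather than part of PopVision or the Poplar SDK.

```python
from typing import Dict


def memory_deltas(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return (after - before) in bytes for every category in either report.

    A category missing from one report is treated as using 0 bytes there.
    """
    categories = sorted(set(before) | set(after))
    return {name: after.get(name, 0) - before.get(name, 0) for name in categories}


# Hypothetical per-category totals (bytes) from two runs of the same model.
baseline = {"vertex_code": 4096, "variables": 20480, "exchange_code": 2048}
optimised = {"vertex_code": 4096, "variables": 16384, "stream_buffers": 1024}

deltas = memory_deltas(baseline, optimised)
for name, delta in deltas.items():
    print(f"{name:>15}: {delta:+d} bytes")
```

A negative value means the second report uses less memory in that category, which is the kind of change a developer would be looking for after tuning a parameter.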

New Memory Features

In PopVision 1.0, we allowed users to see how the memory for a variable was allocated on a single tile. For PopVision 2.0, we have added the option of plotting the layout of a variable across all tiles. Now developers can see which tiles a tensor has been allocated to and what part of the memory address space is used.

[Image: PopVision 2 memory features (1)]

When analysing large models that span multiple IPUs, PopVision users have told us that they would like to see memory usage per IPU to help them understand the allocation of memory between IPUs. For PopVision 2.1, we have added a new memory option to plot memory usage by IPU.
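As a rough illustration of what a per-IPU view summarises, the sketch below rolls per-tile usage up into per-IPU totals. It assumes tiles are numbered contiguously across IPUs, so that a tile's IPU is its index divided by the number of tiles per IPU. The helper name and the toy data are hypothetical and are not taken from the tool.

```python
from collections import defaultdict
from typing import Dict


def memory_per_ipu(tile_bytes: Dict[int, int], tiles_per_ipu: int) -> Dict[int, int]:
    """Sum per-tile memory usage (bytes) into per-IPU totals.

    Assumes global tile numbering: tile t lives on IPU t // tiles_per_ipu.
    """
    if tiles_per_ipu <= 0:
        raise ValueError("tiles_per_ipu must be positive")
    totals: Dict[int, int] = defaultdict(int)
    for tile, used in tile_bytes.items():
        totals[tile // tiles_per_ipu] += used
    return dict(totals)


# Toy example with 4 tiles per IPU so the arithmetic is easy to follow.
tile_usage = {0: 100, 1: 150, 3: 50, 4: 300, 7: 25}
per_ipu = memory_per_ipu(tile_usage, tiles_per_ipu=4)
print(per_ipu)
```

For a real device, `tiles_per_ipu` would be the tile count of the IPU in question rather than 4.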

[Image: PopVision 2 memory features (2)]

We received some early feedback from developers using the report comparison feature, who asked for a way to compare multiple tiles or IPUs within a single report. In PopVision 2.0, users can hold down the shift key to select multiple tiles or IPUs and compare their details. The image above shows a visualisation comparing the amount of vertex memory on two different IPUs.

For PopVision 2.1, a new tile memory map view has been added to show how memory is allocated to physical tiles on an IPU. This helps developers to understand more about tile positions on an IPU chip, with orange indicating the most memory used and blue indicating the least. The tile map has different scaling options and can break memory down into interleaved and non-interleaved regions, shown with or without gaps.

As with all the PopVision reports, the tile memory map supports both MK1 and MK2 processors.

[Image: PopVision 2 tile memory map]

New Liveness Features

In PopVision 2.0, we have added the option to break down liveness per IPU, allowing developers to understand how their model is split over the IPUs and how the variables are allocated across those IPUs.

[Image: PopVision 2 liveness]

For PopVision 2.1, we have added the option to display the maximum memory line in total, per IPU and per tile, helping developers to see which steps in their program have exceeded the maximum memory available.
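The check that a maximum memory line makes visible can be sketched in a few lines. The example below is only an illustration under simple assumptions: liveness is given as one live-memory figure per program step, the limit is a single number, and both the helper name and the values are made up.

```python
from typing import List, Tuple


def steps_over_limit(live_bytes: List[int], max_bytes: int) -> List[Tuple[int, int]]:
    """Return (step index, bytes over the limit) for each step exceeding max_bytes."""
    return [
        (step, used - max_bytes)
        for step, used in enumerate(live_bytes)
        if used > max_bytes
    ]


# Hypothetical live memory (bytes) at each program step, with a limit of 250 bytes.
live = [120, 180, 260, 240, 310, 90]
over = steps_over_limit(live, max_bytes=250)
print(over)
```

The same function could be applied to totals, to a single IPU or to a single tile, as long as the limit passed in matches that scope.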

New Execution Trace Features

A lot of work has gone into improving the execution trace. This is one of the most popular reports in the tool.

In PopVision 2.0, we added visualisation of Bulk Synchronous Parallel (BSP) execution. Developers now have the option to see the BSP execution alongside the flat/flame graph. In the BSP trace, tiles are on the y axis (tile 0 at the top, running down to the last of an IPU's 1,216 or 1,472 tiles at the bottom) and users can see how many cycles each tile takes to execute a compute set or exchange.
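Per-tile cycle counts matter in a BSP model because every tile waits at the sync point for the slowest one. The sketch below shows that general relationship for a single step. It is not how PopVision computes its trace, and the helper name and cycle counts are hypothetical.

```python
from typing import Dict, List, Union


def bsp_step_summary(tile_cycles: List[int]) -> Dict[str, Union[int, float, List[int]]]:
    """Summarise one BSP step from per-tile cycle counts.

    The step lasts as long as the slowest tile; every other tile idles
    for the difference while waiting at the synchronisation point.
    """
    if not tile_cycles:
        raise ValueError("tile_cycles must not be empty")
    slowest = max(tile_cycles)
    idle = [slowest - cycles for cycles in tile_cycles]
    # Fraction of available tile-cycles spent doing work rather than waiting.
    busy = sum(tile_cycles)
    utilisation = busy / (slowest * len(tile_cycles)) if slowest else 1.0
    return {"step_cycles": slowest, "idle_cycles": idle, "utilisation": utilisation}


# Hypothetical cycle counts for one compute set on four tiles.
summary = bsp_step_summary([400, 1000, 700, 900])
print(summary)
```

A low utilisation figure for a step suggests that work is unevenly balanced across tiles, which is the pattern a per-tile view of the trace helps to spot.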

[Image: PopVision 2 execution trace (1)]

PopVision users can select program steps in the BSP or flame/flat graph and view details on that particular program step, including a graph of the same tile execution information, now with tiles on the x axis.

For our Poplar SDK 1.3 release, the poplar::Engine::run API has been updated to take a debug string. This string is used to label engine runs on the execution trace, for example "OptimizerFromHost". This allows users to see which initialisation programs the frameworks have called before executing the main loop.

Additionally, within the execution trace summary, users can view the amount of data streamed into and out of the IPUs from the host, and between IPUs, in the currently visible portion of the execution trace.

[Image: PopVision 2 execution trace (2)]

Support has been added to visualise a new Poplar SDK 1.3 feature called “overlapped IO”. This new capability allows an IPU to stream data on and off the IPU (StreamCopy) while concurrently executing other steps in a program. This means that while one iteration of a model is executing, the inputs for the next iteration can already be streamed onto the IPU. The flame and flat execution graphs have been updated to display this concurrent execution.
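The benefit of overlapping IO with compute can be estimated with a simple model. The sketch below assumes every iteration has the same input-streaming cost and the same compute cost, ignores output streaming, and assumes the input for iteration i+1 can be fully hidden behind the compute of iteration i when compute is the longer of the two. The function names and cycle counts are illustrative only and do not come from Poplar.

```python
def serial_cycles(io: int, compute: int, iterations: int) -> int:
    """Total cycles when each iteration streams its input, then computes."""
    return iterations * (io + compute)


def overlapped_cycles(io: int, compute: int, iterations: int) -> int:
    """Total cycles when the next iteration's input streams during compute.

    The first input cannot be hidden, and the last compute has nothing
    to overlap with; in between, each stage costs the longer of the two.
    """
    if iterations <= 0:
        return 0
    return io + (iterations - 1) * max(io, compute) + compute


# Hypothetical costs: 200 cycles of input streaming, 1000 cycles of compute.
serial = serial_cycles(io=200, compute=1000, iterations=10)
overlapped = overlapped_cycles(io=200, compute=1000, iterations=10)
print(serial, overlapped)
```

Under these assumptions the saving approaches the full IO cost per iteration when compute dominates, and shrinks as streaming becomes the bottleneck.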

[Image: PopVision 2 execution trace (3)]

Additional Features

We added context-sensitive “Help” popups into PopVision 2.0 which appear when users hover their mouse over a button or certain text and provide the relevant section of the Help module.

An option to reload a report has also been added, so users don’t have to close and reopen a report when the Poplar report files change.

We have also improved the SSH connection process to remote hosts in PopVision 2.1, which now supports using aliases defined in ssh_config as the addresses of remote hosts.

Support for IPU Developers

The PopVision Graph Analyser is one of many resources Graphcore customers can use to learn more about programming the IPU and enhance their models’ performance on IPU systems. Our Developer Portal contains a wide range of content dedicated to new and experienced users of the Poplar SDK and IPU technology, including developer documentation, tutorials, video walkthroughs and application examples plus links to our Support platform and GitHub.

Learn more about the Poplar SDK
