Using IPUs from Docker

Introduction

This guide explains how you can run applications in Docker on a Linux machine with one or more physical IPU devices.

Prerequisites:

  • A machine with IPU devices
  • Ubuntu 18.04

Initial setup

First check whether the IPU device driver is installed on your machine. You can do this with the following command:

 $ modinfo ipu_driver

If the driver is installed, you should see something similar to:

 $ modinfo ipu_driver
filename:       /lib/modules/4.15.0-55-generic/updates/dkms/ipu_driver.ko
version:        1.0.39
description:    IPU PCI Driver
author:         Graphcore Limited
license:        GPL
srcversion:     49FFB7D8556EB58899AE41A
alias:          pci:v00001D95d00000003sv*sd*bc*sc*i*
alias:          pci:v00001D95d00000002sv*sd*bc*sc*i*
alias:          pci:v00001D95d00000001sv*sd*bc*sc*i*
depends:
retpoline:      Y
name:           ipu_driver
vermagic:       4.15.0-55-generic SMP mod_unload
parm:           memmap_start:array of ulong
parm:           memmap_size:array of ulong

If so, proceed to the next section. If instead it returns an error such as the following, you will need to install the driver:

 $ modinfo ipu_driver
modinfo: ERROR: Module ipu_driver not found.

See the Getting Started Guide for your IPU system for more information.
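The check above can be scripted, for example as a pre-flight step before starting containers. The following is a minimal sketch and not part of the Graphcore tools: the function name check_ipu_driver is hypothetical, and it assumes the module is named ipu_driver as in the output above. Because modinfo only reports whether the module is installed for the running kernel, the sketch also looks in /proc/modules to see whether the module is currently loaded.

```shell
# Hypothetical helper: report whether the ipu_driver kernel module is
# installed (modinfo can find it) and loaded (listed in /proc/modules).
check_ipu_driver() {
    module="ipu_driver"
    if ! modinfo "$module" >/dev/null 2>&1; then
        echo "$module: not installed"
        return 1
    fi
    # Each line of /proc/modules starts with the module name and a space.
    if grep -q "^$module " /proc/modules 2>/dev/null; then
        echo "$module: installed and loaded"
    else
        echo "$module: installed but not loaded"
        return 2
    fi
}

check_ipu_driver || echo "See the Getting Started Guide for your IPU system."
```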

Using gc-docker

The Graphcore driver package includes some command line tools for managing the IPU system.

The gc-docker command is a small wrapper around docker run that adds the flags needed to use a set of IPU devices inside a running container.

If gc-docker is not on your PATH, you will need to go to the driver installation directory and enable the host runtime tools:

 $ cd [gc-driver-path]
$ source enable.sh

This must be done in each shell. Alternatively, you can run the following command to automatically source it in all new Bash login shells:

 $ echo 'source [full-path-to-extracted-driver]/enable.sh' >> ~/.bash_profile
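Running the echo command above more than once appends a duplicate line each time. A minimal sketch of an idempotent alternative follows. The function name append_once is hypothetical, and [full-path-to-extracted-driver] remains a placeholder that you must replace with your own path.

```shell
# Hypothetical helper: append a line to a file only if the file does not
# already contain exactly that line.
append_once() {
    line="$1"
    file="$2"
    # -x matches whole lines only, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (replace the placeholder with the real driver path first):
# append_once 'source [full-path-to-extracted-driver]/enable.sh' ~/.bash_profile
```

After opening a new login shell, running command -v gc-docker should print the location of the tool if the setup worked.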

Loading Docker images

First, download the Poplar image bundle from the Graphcore customer support portal.

Then load the bundle into your local Docker daemon:

 $ docker load --input=poplar-docker-images-1.0.136.tar.gz

Check that the images have loaded and been tagged. For example (output trimmed):

 $ docker images
REPOSITORY             TAG                IMAGE ID          CREATED         SIZE
graphcore/tools        1.0.136            aacb8f36ceab      2 hours ago     219MB
graphcore/tensorflow   1                  a03d5dba07f6      2 hours ago     1.73GB
graphcore/tensorflow   2                  2bd47e37f15b      2 hours ago     1.81GB
graphcore/poplar       1.0.136            7d9d9638136e      2 hours ago     622MB
ubuntu                 bionic-20200112    ccc6e87d482b      7 weeks ago     64.2MB

The Graphcore images are:

  • graphcore/tools: contains only tools to interact with IPU devices.
  • graphcore/poplar: contains Poplar, PopART and the tools to interact with IPU devices.
  • graphcore/tensorflow: contains everything in graphcore/poplar, with TensorFlow installed on top. These images are tagged with 1 and 2 to choose between using TensorFlow 1 or 2.
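To confirm in a script that the images have loaded, you can compare the output of docker images with the list you expect. The sketch below is illustrative: the function name missing_images is hypothetical, and the image names and tags are taken from the example output above, so adjust them to match your bundle. It reads repository:tag pairs on standard input, so the check itself does not depend on Docker.

```shell
# Hypothetical helper: read "repository:tag" lines on stdin and print each
# expected image (given as arguments) that is not among them.
missing_images() {
    loaded=$(cat)
    for image in "$@"; do
        printf '%s\n' "$loaded" | grep -qxF -- "$image" || printf '%s\n' "$image"
    done
}

# Example, using the tags from the listing above:
# docker images --format '{{.Repository}}:{{.Tag}}' |
#     missing_images graphcore/tools:1.0.136 graphcore/poplar:1.0.136 \
#                    graphcore/tensorflow:1 graphcore/tensorflow:2
```

No output means that every expected image is present.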

Note

This tarball method of container image delivery will be replaced with a Docker registry in future, which will enable docker pull to be used instead.

Verifying IPU access from inside a container

First check you have access to the IPU devices on the host. To do this, run gc-inventory and check the output contains a list of devices.

Next, do the same but inside the context of a container:

 $ gc-docker -- -ti graphcore/tools gc-inventory

The output should be the same.
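A sketch of how this comparison could be automated follows. The helper same_output is hypothetical: it runs two command lines and succeeds only if their standard output is identical. The -ti options are omitted in the example, since an interactive terminal is not needed when the output is captured.

```shell
# Hypothetical helper: succeed if two command lines print the same stdout.
same_output() {
    [ "$(sh -c "$1")" = "$(sh -c "$2")" ]
}

# Example:
# same_output 'gc-inventory' 'gc-docker -- graphcore/tools gc-inventory' \
#     && echo "container sees the same devices" \
#     || echo "host and container inventories differ"
```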

Check you can run a TensorFlow container with gc-docker, and make sure the IPUs are visible to TensorFlow:

 $ gc-docker -- -ti graphcore/tensorflow:2 python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
>>> tensorflow.config.list_physical_devices("IPU")
[PhysicalDevice(name='/physical_device:IPU:0', device_type='IPU')]
>>>
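The same check can be run without an interactive session, which is convenient in scripts. In the sketch below, count_ipus is a hypothetical helper that counts the IPU entries in the printed device list. It assumes the list is formatted as in the session above.

```shell
# Hypothetical helper: count IPU entries in a device list read from stdin.
count_ipus() {
    grep -o "device_type='IPU'" | wc -l | tr -d ' '
}

# Example:
# gc-docker -- graphcore/tensorflow:2 python3 -c \
#     'import tensorflow; print(tensorflow.config.list_physical_devices("IPU"))' |
#     count_ipus
```

A result of 0 would suggest that TensorFlow cannot see any IPU devices from inside the container.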

The syntax for running an image with gc-docker is similar to using docker run, which is:

 $ docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

The main difference is that docker run is replaced with gc-docker --. So, in the TensorFlow example above, we used the graphcore/tensorflow:2 image and ran python3 as the command. No arguments were passed to python3.

The -- part of this command tells gc-docker that the rest of the arguments should be passed directly to docker run. gc-docker also has a few options which can be used before this. For example, you can pass a subset of IPU devices using --device-id n:

 $ gc-docker --device-id 4 -- -ti graphcore/tools gc-inventory

The --echo option is also useful. This makes gc-docker print the Docker command it would have run. For example:

 $ gc-docker --echo --device-id 4 -- -ti graphcore/tools gc-inventory
docker run --device=/dev/ipu4:/dev/ipu4 --device=/dev/ipu4_ex:/dev/ipu4_ex -ti graphcore/tools gc-inventory

For more information, use the --help option or refer to the IPU Command Line Tools document.
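Judging from the echoed command above, each selected device is exposed to the container through two --device flags, one for /dev/ipuN and one for /dev/ipuN_ex. The following sketch reproduces that mapping for illustration only. It is inferred from the single example shown and is not the gc-docker implementation; the function name ipu_device_flags is hypothetical.

```shell
# Hypothetical helper: print the docker --device flags for the given IPU
# device IDs, following the pattern seen in the --echo output.
ipu_device_flags() {
    flags=""
    for id in "$@"; do
        flags="$flags --device=/dev/ipu${id}:/dev/ipu${id}"
        flags="$flags --device=/dev/ipu${id}_ex:/dev/ipu${id}_ex"
    done
    # Strip the leading space before printing.
    printf '%s\n' "${flags# }"
}

ipu_device_flags 4
```

For device 4 this prints the same two --device flags that appear in the echoed docker run command.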

Mounting directories from the host

You can mount volumes to share data between the host machine and the Docker container environment. This is useful when the container needs to read input data or write results.

Volumes are mounted using the -v option. The basic syntax is -v <path_on_host>:<path_in_container>. For example, to mount /home/me/cat_pics from your host machine as /cats in the container, you could run the following command:

 $ gc-docker -- -ti -v /home/me/cat_pics:/cats graphcore/tensorflow:2 ls -a /cats
.  ..  mog.jpg
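Giving the host path in absolute form ensures that Docker treats it as a directory to bind mount rather than as a volume name. The sketch below is a hypothetical helper, bind_mount_arg, that resolves an existing directory to an absolute path and builds the value for -v. The optional third argument ro appends Docker's read-only suffix, which is useful for input data that the container should not modify.

```shell
# Hypothetical helper: print a -v argument value for an existing host
# directory, resolving it to an absolute path first.
bind_mount_arg() {
    host_dir=$(cd "$1" && pwd) || return 1
    if [ "${3:-}" = "ro" ]; then
        printf '%s:%s:ro\n' "$host_dir" "$2"
    else
        printf '%s:%s\n' "$host_dir" "$2"
    fi
}

# Example:
# gc-docker -- -ti -v "$(bind_mount_arg ~/cat_pics /cats ro)" \
#     graphcore/tensorflow:2 ls -a /cats
```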

Setting environment variables

If you need some environment variables set inside the Docker environment, add -e VAR_NAME="var value" to your Docker options.

For example:

 $ gc-docker -- -ti -e POPLAR_LOG_LEVEL=TRACE graphcore/tensorflow:2 python3
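When several variables are needed, repeating -e becomes unwieldy. The docker run option --env-file reads NAME=value lines from a file instead. The sketch below writes such a file. The file name ipu.env and the helper write_env_file are hypothetical, and POPLAR_LOG_LEVEL is the only variable taken from this guide.

```shell
# Hypothetical helper: write NAME=value pairs, one per line, to a file in
# the format read by docker run --env-file.
write_env_file() {
    file="$1"
    shift
    : > "$file"                     # create or truncate the file
    for pair in "$@"; do
        printf '%s\n' "$pair" >> "$file"
    done
}

# Example:
# write_env_file ipu.env POPLAR_LOG_LEVEL=TRACE
# gc-docker -- -ti --env-file ipu.env graphcore/tensorflow:2 python3
```

Values in an env file are used literally, so they should not be wrapped in quotes.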

Running a TensorFlow application on an IPU

To demonstrate the workflow for running a TensorFlow application on IPUs in a Docker development environment, we will use one of the TensorFlow applications from the Graphcore public examples repository. First, get the code:

 $ git clone https://github.com/graphcore/examples.git
$ cd examples

A common pattern when working with a Docker-based development environment is to mount the current directory into the container (as described in Mounting directories from the host), then set the working directory inside the container with -w <dir name>. For example, -v "$(pwd):/app" -w /app.

Applying this, you can run the LSTM example with the following command:

 $ gc-docker -- -ti -v "$(pwd):/app" -w /app graphcore/tensorflow:1 python3 code_examples/tensorflow/kernel_benchmarks/lstm.py
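If you use this pattern often, it can be wrapped in a shell function. The following is a sketch only: ipu_run and the DRY_RUN variable are hypothetical names, not part of the Graphcore tools. With DRY_RUN=1 the function prints the command line instead of executing it, which makes the result easy to inspect.

```shell
# Hypothetical wrapper: run a command in the given image with the current
# directory mounted at /app and used as the working directory.
ipu_run() {
    set -- gc-docker -- -ti -v "$(pwd):/app" -w /app "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"          # show the command without running it
    else
        "$@"
    fi
}

# Example:
# ipu_run graphcore/tensorflow:1 python3 code_examples/tensorflow/kernel_benchmarks/lstm.py
```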

Extending the images

These base images can be used to create new images for more specialised purposes, or to package an application for deployment to platforms such as Kubernetes or Kubeflow.

As an example, here is a simple Dockerfile that creates a Jupyter notebook environment with TensorFlow and access to IPUs:

 FROM graphcore/tensorflow:2

RUN pip3 install notebook

CMD ["jupyter", "notebook", "--allow-root", "--ip=0.0.0.0", "--port=8080"]

You can build and run this with the following commands:

 $ docker build -t notebook .
$ gc-docker -- -p 8080:8080 notebook

Further reading

You can find documentation for the Graphcore software products on the Developer page of the Graphcore website.