Machine learning processors for both training and inference

My answer to the question “Is the IPU for training or inference?” takes some people by surprise. Let me explain.

Instead of talking about machine intelligence hardware in terms of training and inference, we should focus on hardware that can support continuous learning in the core or at the edge of the network.

Of course, Graphcore IPUs do support today’s machine learning training and inference approaches. However, we have designed IPUs to help innovators develop the next generation of machine intelligence systems. Next-generation systems will continue to learn and get better over time and, in future systems, we will see less of a distinction between training and inference.

Training or Learning?

When babies are born we do not sit them down and train them. Humans don’t need to be shown hundreds of thousands of labeled pictures before they can recognize a cat. A toddler will recognize a cat almost immediately when they meet one for the first time. When a small dog walks in they will very quickly learn the differences and can distinguish between the two.

It is clear that today’s machine learning systems are still unsophisticated by comparison. We are now able to train classification systems that recognize objects or words in a narrow domain more accurately than an average human, but they typically need a huge amount of labeled data to do this and do not continue to learn from new experiences.

Machine intelligence systems are being developed that can understand context and that reuse knowledge that has already been learned. We can think of this like cognitive memory. Humans do not store a complete pixel bit-map of every scene captured by our eyes. Instead we scan the scene continuously using our foveal vision system and our brain builds a representation of the scene in front of us. It seems logical that we also don’t store away this picture as a complete bit-map. Instead we most likely store memories as high level representations and concepts – then re-compute the scenes as we receive some trigger that causes us to recall this particular memory.

To make these types of systems possible, machine intelligence applications will need much more flexible computer hardware. This new hardware should allow innovators to build systems that can learn and use learnt knowledge in different ways. It will then become possible to have machine intelligence systems that can improve and deliver better and better results. These systems will not just be trained but they will become learning systems that can evolve rapidly even after they are deployed.

Inference or Deployment?

Feed-forward convolutional neural networks (CNNs) are being used today to build more accurate classification systems, for example in image recognition. This is an inference task and so we tend to think of machine intelligence systems as doing inference.

Inference, however, is a very narrow description of what could quickly develop into much more complex machine intelligence systems. Next-generation machine intelligence systems will have cognitive memory, will perform different kinds of prediction and inference tasks in parallel and will make knowledge-based judgments.

For example, in the autonomous driving market, we want cars to continue to learn from new situations they encounter out on the road. If an autonomous car sees a ball roll into the road followed by a child, we want it to learn that a ball on the road means a child is highly likely to follow.

Rather than keeping this new knowledge in one car, we would want it to share this knowledge, perhaps by connecting to a learning system in the cloud overnight, updating the wider knowledge system so that this new understanding is captured in an updated model by the next morning for all the cars to share. Ideally, we would then want another car to be able to work out that if it sees a toy tractor, rolling out of a driveway onto the road, it might also be followed by a child, and it should react in the same way as if it saw a ball.

Inference is too narrow a term for the application of machine intelligence. We should instead think in terms of deploying machine intelligence.

Machine Intelligence in the Core and at the Edge

If training and inference workloads are replaced by more complex machine intelligence systems that are deployed and then continuously learn, how should we think about the hardware that will support these new systems? The most obvious segmentation is between machine intelligence compute happening in the cloud, at the core of the internet, and intelligence processing happening at the edge, either embedded into products or in edge servers that locally support embedded products with specific intelligence tasks.

Machine intelligence in the core needs to be flexible and scalable. Core workloads will vary significantly. There may be very large amounts of compute required to support very specific learning tasks or a much smaller amount of intelligence processing might be required to support a very specific deployment.

The cloud will need to support many different users and many different intelligence tasks, all operating in parallel. Intelligence machines that can scale up and scale out will be important at the core.

While low latency and high availability will be necessary at the core, if guaranteed real-time responses are critical, the intelligence processing may be better placed closer to the edge or may need to be embedded inside a device.

Again, an autonomous car is a perfect example. Full level-five autonomy will need large amounts of local intelligence processing. A mobile internet connection to a cloud will just not be practical. The system will need to react in real-time, to deal with complex situations and to react rapidly in the event of danger. We will need lots of intelligence processing happening inside the car.

In other consumer products it may be possible to split the intelligence between the device at the edge and the core. A floor cleaning robot, for example, may come across a new object that it has never seen before. The robot has time to return to its charging station, connect to the cloud and get help to learn what this new object is. Once the knowledge model has been updated it can then continue with its task, now avoiding the wine glass that was left on the floor. These consumer devices will still need some level of intelligence processing inside so that they can operate autonomously, but this can be reduced to optimize for cost, power and functionality.
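The split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CoreService`, `EdgeRobot`, and the lookup tables standing in for models are hypothetical names invented for this example, not a Graphcore API, and a real system would use learned models rather than dictionaries. The point is the control flow: act locally when the object is known, fall back to a safe action and queue the observation when it is not, then update the local knowledge from the core when a connection is available.

```python
from dataclasses import dataclass, field


@dataclass
class CoreService:
    """Hypothetical stand-in for a learning system hosted in the cloud."""
    knowledge: dict = field(default_factory=lambda: {"wine glass": "avoid"})

    def identify(self, observation: str) -> str:
        # A real core would run a large model; a lookup keeps the sketch small.
        return self.knowledge.get(observation, "unknown")


@dataclass
class EdgeRobot:
    """Hypothetical edge device with a small local knowledge model."""
    local_model: dict = field(
        default_factory=lambda: {"dust": "clean", "chair leg": "avoid"}
    )
    pending: list = field(default_factory=list)

    def act(self, observation: str) -> str:
        action = self.local_model.get(observation)
        if action is None:
            # Unfamiliar object: remember it and take a conservative action.
            if observation not in self.pending:
                self.pending.append(observation)
            return "stop"
        return action

    def sync(self, core: CoreService) -> int:
        """Ask the core about queued observations; return how many were learned."""
        learned = 0
        for observation in self.pending:
            action = core.identify(observation)
            if action != "unknown":
                self.local_model[observation] = action
                learned += 1
        # The queue is emptied even for objects the core could not identify.
        self.pending.clear()
        return learned
```

In this sketch the robot stops at the wine glass the first time it sees one, calls `sync` once it is back at its charging station, and afterwards avoids the glass using only its local model.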

Looking ahead

Over the last two years machine intelligence has taken off at a very rapid pace. The internet is moving from text and static pictures to voice and video. This massive increase in unstructured data is creating challenges for the leading internet players. As machine learning techniques have surpassed the traditional algorithmic approaches for classification, machine intelligence has quickly taken over and been deployed by these firms. This has led to massive investments in new machine learning approaches which are now starting to have an impact in nearly every industry.

Innovation continues at a fast pace, and we will soon have new hardware platforms that will allow innovators in machine learning to push back the boundaries. The current approach of training followed by inference will give way to machine intelligence systems that can learn and then continue to improve once they are deployed. Machine intelligence will dramatically change computing. We will come to think in terms of machine intelligence being deployed on flexible and scalable systems in the core and on efficient and responsive systems at the edge, as well as embedded inside consumer products.

The potential for flexible, efficient intelligence processors is enormous.
