Over the past few years there has been an explosion of interest in specialised processors for Artificial Intelligence and Machine Learning. This is understandable, given the potential of the field and the heavy compute requirements of its applications, but it does prompt the question of where the current storm of innovation is heading.
If we consider what the best AI chips might look like going forward, I sincerely believe that the biggest factor in a chip's effectiveness and success is going to be its software stack.
Having the best software not only makes AI processors much easier for developers to use; it can also harness the full potential of the underlying hardware. I recently spoke at GMIC Live about the importance of this relationship between AI software and hardware. The event is Asia's largest and leading technology conference, and this virtual edition featured cross-border livestreams from six global cities. You can watch my full presentation "The Future of AI Chips" from GMIC London Live here:
Machine intelligence demands more from its software to achieve efficiency because, in Artificial Intelligence (AI) and Machine Learning (ML), the compute itself is fundamentally different.
What makes Machine Intelligence Compute different?
While there are many properties of AI algorithms that are unique to the field, here are a few that are particularly worth highlighting:
- Modern AI and ML is all about dealing with uncertain information. The variables inside a model represent something uncertain – a probability distribution. This has a big effect on the kind of work the chip is doing: to cover the wide range of values that could occur, you need both the fine precision of fractional numbers and a wide dynamic range. From a software point of view, this requires a variety of floating-point number techniques, and algorithms which manipulate those numbers in probabilistic ways (the first sketch after this list makes the precision/range trade-off concrete).
- As well as being probabilistic, the data we're dealing with comes from a very high-dimensional space, including content such as images, sentences, video, or even just abstract knowledge concepts. These are not the straight vectors of data that you see, for example, in graphics processing. As the number of dimensions in the data becomes higher, data access becomes more irregular and sparse. This means that many of the techniques typically used in hardware and software, such as buffering, caching and vectorisation, don't apply here (the second sketch after this list shows the difference in access patterns).
- Add to this the fact that machine intelligence compute deals with both big data (huge datasets for training) and big compute (a large number of computing operations per item of data processed) and the extent of the processing challenge becomes clear.
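To make the precision/dynamic-range trade-off concrete, here is a minimal NumPy sketch (illustrative only, not Graphcore code). It shows how half-precision floats both overflow on large values and fail to distinguish nearby small probabilities, which is why AI software needs careful control of number formats:

```python
import numpy as np

# A small probability and a large activation, both stored as float32 first.
p32 = np.float32(1e-5)
a32 = np.float32(70000.0)

# float16 has a narrow dynamic range (max ~65504), so the activation overflows...
print(np.float16(a32))                                 # -> inf
# ...and so few mantissa bits that nearby probabilities become indistinguishable.
print(np.float16(1e-5) == np.float16(1.00004e-5))      # -> True

# float32 keeps the value in range and the probabilities distinct.
print(a32)                                             # -> 70000.0
print(p32 == np.float32(1.00004e-5))                   # -> False
```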
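And here is an equally minimal sketch of the access-pattern point. The table size and token indices are invented for illustration, but they show the difference between the contiguous reads that caches and vector units are built for, and the scattered gathers that high-dimensional models produce:

```python
import numpy as np

# Dense, regular access: one contiguous stride that caches and vector units love.
embedding_table = np.random.rand(100_000, 128).astype(np.float32)
dense_block = embedding_table[0:64]                # contiguous rows

# Sparse, irregular access: fetching embeddings only for the tokens that occur
# in a sentence. The rows touched are scattered across memory, so there is no
# stride for a prefetcher or a vector unit to exploit.
token_ids = np.array([91_203, 17, 54_990, 3_008])  # invented indices
gathered = embedding_table[token_ids]              # a gather, not a stream
print(gathered.shape)                              # -> (4, 128)
```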
What does this mean for Software?
Because machine intelligence computing is so different, software has to work harder in AI and ML than it does in many other areas. The AI software stack needs to combine developer productivity, ease of use and flexibility with efficiency at scale.
To resolve the efficiency challenge, AI software must communicate with the hardware at a lower level. Decisions that would otherwise be deferred until the hardware is running can then be made ahead of time, improving efficiency. The probabilistic and higher-order data structures in AI algorithms make it harder to predict what is going to happen at runtime, so the software has to provide more information up front about the structure of the algorithm and of the machine learning model being executed.
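As a toy illustration of what providing this information up front buys you (a generic sketch using Python's standard library, not the Poplar API): if the full computation graph is known before execution, the software can fix the execution order once, at compile time, rather than discovering it step by step while the hardware runs.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# A hypothetical four-op model, expressed as op -> set of inputs it depends on.
graph = {
    "matmul_1": {"input", "weights_1"},
    "matmul_2": {"input", "weights_2"},
    "add":      {"matmul_1", "matmul_2"},
    "softmax":  {"add"},
}

# The execution order is computed once, before the model ever touches the
# device, so nothing has to be decided while the hardware is running.
schedule = list(TopologicalSorter(graph).static_order())
print(schedule)
# e.g. ['input', 'weights_1', 'weights_2', 'matmul_1', 'matmul_2', 'add', 'softmax']
```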
In machine intelligence, software needs to control things like number representation and explicit memory movement, which are specific to particular AI algorithms, in order to optimise efficiency. Hardware must also be receptive to these optimisations.
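A familiar example of this kind of software control is mixed precision. The sketch below is a generic pattern, not Graphcore-specific code: the software decides explicitly which tensors live in float32 and which computations run in float16, rather than leaving the choice to the hardware.

```python
import numpy as np

# Master weights kept in float32 for accuracy; bulk compute done in float16.
master_weights = np.random.randn(1024, 1024).astype(np.float32)
activations = np.random.randn(32, 1024).astype(np.float16)

compute_weights = master_weights.astype(np.float16)   # explicit down-cast
output = activations @ compute_weights                # half-precision matmul
print(output.dtype)                                   # -> float16

# In a full training loop, gradients would be accumulated back into the
# float32 master copy -- again an explicit, software-made decision.
```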
The Benefits of Software/Hardware Co-Design
In future, more hardware/software co-design will be required, with software algorithms and AI hardware designed together from the start. This will enable a greater degree of co-operation between hardware and software, helping developers to organise things like memory placement and thread scheduling effectively.
At Graphcore, we have been developing our Poplar® software stack in tandem with the IPU processor since the very early days of the company. To maximise processor efficiency, we’ve given Poplar more advanced software control than exists in other systems.
One example of this is how we manage memory. Our IPU-Machine M2000 has off-chip DDR memory, but there is no hardware cache or other automatic mechanism to move or buffer data at runtime between that external streaming memory and the on-chip In-Processor Memory. All of this movement is controlled in software, based on the computation graph. Memory management is just one part of the software stack where we optimise use of the hardware through advanced analysis, and this is key to our approach.
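To give a flavour of what graph-based, software-controlled memory movement looks like, here is a deliberately simplified sketch. The function and buffer names are invented for illustration and bear no relation to Poplar's actual API; the point is that the copy/compute interleaving is planned entirely ahead of time from the known sequence of layers, doing in software the job a cache would otherwise do.

```python
def build_schedule(layers):
    """Plan an explicit, double-buffered copy/compute schedule ahead of time."""
    buffers = ["buffer_a", "buffer_b"]
    steps = [("copy_to_chip", layers[0], buffers[0])]
    for i, layer in enumerate(layers):
        if i + 1 < len(layers):
            # Prefetch the next layer's weights into the other buffer while
            # the current layer computes.
            steps.append(("copy_to_chip", layers[i + 1], buffers[(i + 1) % 2]))
        steps.append(("compute", layer, buffers[i % 2]))
    return steps

print(build_schedule(["layer_1", "layer_2", "layer_3"]))
# [('copy_to_chip', 'layer_1', 'buffer_a'),
#  ('copy_to_chip', 'layer_2', 'buffer_b'), ('compute', 'layer_1', 'buffer_a'),
#  ('copy_to_chip', 'layer_3', 'buffer_a'), ('compute', 'layer_2', 'buffer_b'),
#  ('compute', 'layer_3', 'buffer_a')]
```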
When software and hardware work together seamlessly from the outset, it is much easier to improve performance and efficiency. And by increasing software control, we can shed light on how hardware can process different machine intelligence models. Perhaps we can even learn to build new AI models that inherently perform better, leveraging advanced techniques such as sparsity.
In future, the best AI chips will be those with the best software: we believe that Graphcore is going to provide these chips. We are a software company as much as we are a hardware company and it is clear to us that this is the way forward.