
Huge “foundation models” are turbo-charging AI progress

The “Good computer” which Graphcore, a British chip designer, intends to build over the next few years might seem to be suffering from a ludicrous case of nominal understatement. Its design calls for it to carry out 10¹⁹ calculations per second. If your laptop can do 100bn calculations a second—which is fair for an average laptop—then the Good computer will be 100m times faster. That makes it ten times faster than Frontier, a behemoth at America’s Oak Ridge National Laboratory which came top of the most recent “Top500” list of powerful supercomputers and cost $600m. Its four-petabyte memory will hold the equivalent of 2trn pages of printed text, or a pile of A4 paper high enough to reach the Moon. “Good” hardly seems to cut it.
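For readers who want to check the arithmetic, the sketch below reruns the comparisons above using the round numbers quoted in the text; the Frontier figure of roughly 1.1 exaflops is an assumption based on its Top500 result, not a number given here.

```python
# Back-of-the-envelope check of the figures quoted above.
# All inputs are round numbers from the text, not precise benchmarks.

good_ops_per_s = 1e19        # the Good computer's design target, calculations per second
laptop_ops_per_s = 1e11      # "100bn calculations a second" for an average laptop
frontier_ops_per_s = 1.1e18  # assumed: Frontier's roughly 1.1-exaflop Top500 result

print(good_ops_per_s / laptop_ops_per_s)    # ~1e8, i.e. 100m times faster than a laptop
print(good_ops_per_s / frontier_ops_per_s)  # ~9, i.e. roughly ten times faster than Frontier

memory_bytes = 4e15          # four petabytes of memory
pages = 2e12                 # "2trn pages of printed text"
print(memory_bytes / pages)  # ~2,000 bytes per page, a plausible size for a page of plain text
```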
But the word is not being used as a qualitative assessment: it is honouring an intellectual heritage. The computer is named after Jack Good, who worked with Alan Turing as a codebreaker during the second world war and followed him into computer science. In 1965 Good wrote an influential, if off-the-wall, article about what the field could lead to: “Speculations concerning the first ultraintelligent machine”. Graphcore wants its Good computer to be that ultraintelligent machine, or at least to be a big step in its direction.
That means building and running artificial intelligence (ai) models with an eye-watering number of “parameters”—coefficients applied to different calculations within the program. Four years ago the 110m parameters boasted by a game-changing model called bert made it a big model. Today’s most advanced ai programs are 10,000 times larger, with over a trillion parameters. The Good computer’s incredibly ambitious specifications are driven by the desire to run programs with something like 500trn parameters.
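The parameter counts quoted above hang together arithmetically, as the short check below shows; the figures are the round numbers from the text.

```python
# Rough check of the parameter counts quoted above (round figures only).
bert_params = 110e6                     # BERT's 110m parameters, four years ago
todays_params = bert_params * 10_000    # "10,000 times larger"
print(todays_params)                    # ~1.1e12, i.e. just over a trillion parameters

good_target = 500e12                    # "something like 500trn parameters"
print(good_target / todays_params)      # roughly 450 times larger again than today's biggest models
```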
One of the remarkable things about this growth is that, until it started, there was a widespread belief that adding parameters to models was reaching a point of diminishing returns. Experience with models like bert showed that the reverse was true. As you make such models larger, by feeding them more data and increasing the number of parameters, they become better and better. “It was flabbergasting,” says Oren Etzioni, who runs the Allen Institute for ai, a research outfit.
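The “bigger keeps getting better” pattern Mr Etzioni describes is often summarised as a smooth curve in which a model’s error falls steadily as its parameter count grows. The sketch below is purely illustrative: the power-law form and its constants are assumptions for the sake of the picture, not fitted to any real model.

```python
# Illustrative only: a curve of the kind used to describe why larger models
# keep improving. The power-law form and constants here are made up.

def illustrative_loss(parameters: float, alpha: float = 0.08, scale: float = 10.0) -> float:
    """Hypothetical test loss that falls as a power of the parameter count."""
    return scale * parameters ** -alpha

for n in (110e6, 1e9, 1e12, 500e12):
    print(f"{n:>10.0e} parameters -> illustrative loss {illustrative_loss(n):.2f}")
```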