A huge AI breakthrough has been achieved by the creator of the world’s largest chip

Cerebras Systems, maker of the world’s biggest processor, has set a new record for the largest AI model ever trained on a single device.

Cerebras can now train AI models with up to 20 billion parameters on a single CS-2 system, powered by the company’s Wafer Scale Engine 2 (WSE-2), a processor the size of an entire wafer.

According to the company, the need to split large-scale models across thousands of GPUs has long been a source of frustration for AI developers. Removing that requirement means new models can be developed and trained in a fraction of the time it normally takes.

Cerebras makes AI accessible to everyone.

In sub-disciplines such as natural language processing (NLP), model performance scales with the number of parameters. That is to say, the bigger the model, the better the final output.
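The trend the article describes can be sketched with a simple power law, the form commonly fitted in published scaling-law studies; the constants below are illustrative assumptions, not figures from Cerebras or from any real model family:

```python
# Hypothetical sketch of a parameter-count scaling law: test loss falls
# as a power law in model size N, i.e. L(N) = (scale / N) ** alpha.
# The constants are illustrative placeholders, not fitted values.

def loss(num_params: float, alpha: float = 0.076, scale: float = 8.8e13) -> float:
    """Hypothetical test loss as a function of parameter count."""
    return (scale / num_params) ** alpha

for n in (1e8, 1e9, 2e10):
    print(f"{n:.0e} params -> loss ~ {loss(n):.2f}")
```

The only point the sketch makes is directional: as the parameter count grows by orders of magnitude, the modelled loss keeps falling, which is why labs keep building bigger models.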

Developing large-scale artificial intelligence (AI) products has therefore always required the use of many GPUs or accelerators, either because there are too many parameters to fit in memory or because a single device’s compute performance cannot keep up with the training workload.
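A back-of-envelope calculation shows why the memory side of this problem bites. The byte counts below are common rules of thumb (fp16 weights plus fp32 Adam optimizer state), assumed for illustration rather than taken from Cerebras:

```python
# Rough training-time memory estimate for a large model.
# Assumes fp16 weights plus fp32 Adam optimizer state: a master copy of
# the weights and two moment buffers -- roughly 2 + 4 + 4 + 4 = 14 bytes
# per parameter. Activations and gradients would add more on top.

BYTES_PER_PARAM = 2 + 4 + 4 + 4  # fp16 weight + fp32 master, momentum, variance

def training_memory_gb(num_params: int) -> float:
    """Approximate memory (GB) for weights and optimizer state alone."""
    return num_params * BYTES_PER_PARAM / 1e9

for n in (1_300_000_000, 20_000_000_000):
    print(f"{n/1e9:>5.1f}B params -> ~{training_memory_gb(n):,.0f} GB")
```

Under these assumptions, a 20-billion-parameter model needs on the order of 280 GB just for weights and optimizer state, several times the memory of a typical accelerator card, which is why such models are normally sharded across many devices.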

As Cerebras explains, this “painful” procedure “sometimes takes months” to complete. To complicate matters further, the work is unique to each pairing of neural network and compute cluster, so the results cannot be transferred to other clusters or other networks. As the company puts it: “It’s made just for you.”

By training relatively large-scale AI models on a single CS-2 device, Cerebras removes these obstacles for many organisations, accelerating existing players and democratising access for those previously unable to enter the AI sector.

Cerebras’ capacity to bring big language models to the general public, cheaply and with simple access, marks the start of a new era in artificial intelligence. According to Intersect360’s Chief Research Officer Dan Olds, “It enables firms who can’t invest tens of millions of dollars a simple and affordable on-ramp to big league NLP.”

It will be exciting to see the new applications and discoveries CS-2 customers make as they train GPT-3- and GPT-J-class models on big datasets.

Another selling point of Cerebras’ CS-2 system is the headroom it promises: the company suggests it could one day handle models with “even trillions of parameters.” And by linking numerous CS-2 systems together, AI networks bigger than the human brain might become possible.