Nvidia offers tools for scaling up data centre computing
Following the acquisition of Mellanox, Nvidia is driving changes in the data centre as it becomes the new unit of computing. In what should have been a keynote at the company’s conference, but which was moved to a virtual presentation due to the Covid-19 pandemic, CEO Jensen Huang introduced a GPU architecture optimised for this new scale of data centre computing.
He announced the Nvidia A100 GPU, based on the Nvidia Ampere architecture, which is claimed to provide “the greatest generational performance leap” of Nvidia’s eight generations of GPUs. It is built for data analytics, scientific computing and cloud graphics, and is in full production and shipping to customers worldwide.
The A100, and the Nvidia Ampere architecture on which it is built, boost performance by up to 20x over their predecessors. It is claimed to be the world’s largest 7nm processor, with over 54bn transistors.
It also features third-generation Tensor Cores with TF32, a new maths format that accelerates single-precision artificial intelligence (AI) training, and structural sparsity acceleration, an efficiency technique for AI maths. Another feature is multi-instance GPU (MIG), which allows a single A100 processor to be partitioned into up to seven independent GPUs, each with its own resources.
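The idea behind TF32 is that it keeps the 8-bit exponent (and hence the range) of standard float32 but only 10 mantissa bits of precision, like FP16. The sketch below, a simplification that truncates rather than rounds as the hardware does, emulates that reduced precision in plain Python:

```python
import struct

def tf32_round(x: float) -> float:
    """Emulate TF32's reduced precision: keep only the 10 high mantissa
    bits of a float32 value by zeroing the 13 low mantissa bits.
    (Real Tensor Cores round; simple truncation is used here for clarity.)"""
    # Reinterpret the float32 bit pattern as an unsigned integer
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= ~0x1FFF  # clear the 13 low mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# A value needing only 10 mantissa bits survives intact...
print(tf32_round(1.0 + 2**-10))  # 1.0009765625
# ...but the 11th mantissa bit is lost
print(tf32_round(1.0 + 2**-11))  # 1.0
```

The trade-off is that, for training neural networks, range matters far more than the last few bits of precision, which is why TF32 can stand in for float32 without code changes.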
Also contributing to the increased performance is the third-generation NVLink technology, doubling high-speed connectivity between GPUs. The result is that A100 servers can act as one giant GPU, says Nvidia.
Nvidia also unveiled the third generation of its Nvidia DGX AI system, based on the A100. The Nvidia DGX A100 is believed to be the world’s first 5PetaFLOPS server. Each DGX A100 combines eight A100 GPUs and, with seven MIG partitions per GPU, can be divided into as many as 56 instances, all running applications independently.
This allows a single server either to scale up, racing through computationally intensive tasks such as AI training, or to scale out for AI deployment, or inference, Huang said.
The A100 will also be available to cloud providers and partner server makers as the HGX A100.
A data centre powered by five DGX A100 systems for AI training and inference running on just 28kW can do the work of a typical data centre with 50 DGX-1 systems for AI training and 600 CPU systems consuming 630kW, Huang explained.
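Huang's power comparison can be sanity-checked with simple arithmetic, using only the figures quoted above:

```python
# Figures from the announcement: a legacy cluster of 50 DGX-1 systems
# plus 600 CPU systems, versus five DGX A100 systems doing the same work
legacy_power_kw = 630
dgx_a100_power_kw = 28

ratio = legacy_power_kw / dgx_a100_power_kw
print(f"Roughly {ratio:.1f}x less power for the same workload")  # ~22.5x
```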
Another announcement was the next-generation DGX SuperPOD, powered by 140 DGX A100 systems and Mellanox networking technology. It offers 700PetaFLOPS of AI performance, the equivalent of one of the 20 fastest computers in the world.
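The headline SuperPOD figure is consistent with the per-system rating quoted earlier:

```python
systems = 140              # DGX A100 systems in the SuperPOD
pflops_per_system = 5      # AI PetaFLOPS per DGX A100, as quoted

total_pflops = systems * pflops_per_system
print(total_pflops)  # 700
```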