Nvidia and Intel Xeon-based servers accelerate AI, HPC and cloud computing
Servers introduced by Super Micro Computer to its GPU (graphics processing unit) system portfolio are based on Nvidia’s HGX A100 4-GPU and third generation Intel Xeon Scalable processors.
The servers are designed for demanding AI (artificial intelligence) applications. The 2U Nvidia HGX A100 4-GPU system is suitable for deploying modern AI training clusters at scale, with high speed CPU-GPU and GPU-GPU interconnects. The Supermicro 2U 2-Node system reduces energy usage and costs by sharing power supplies and cooling fans, which also lowers carbon emissions. It supports a range of discrete GPU accelerators that can be matched to the workload. Both systems include hardware security features enabled by the latest Intel Software Guard Extensions (Intel SGX).
“We can offer customers Nvidia HGX A100 (code name Redstone) 4-GPU accelerators for AI and HPC workloads in dense 2U form factors,” said Charles Liang, president and CEO, Supermicro. He added that the 2U 2-Node system is designed to share power and cooling components, which reduces opex and the impact on the environment.
The 2U Nvidia HGX A100 server is based on third generation Intel Xeon Scalable processors with Intel Deep Learning Boost technology. It is optimised for analytics, training, and inference workloads. The system can deliver up to 2.5 petaflops of AI performance, with four A100 GPUs interconnected via Nvidia NVLink and providing up to 320Gbytes of GPU memory. The system is up to four times faster than the previous generation of GPUs for complex conversational AI models such as BERT large inference, and delivers up to a three times performance boost for BERT large AI training.
Thermal and cooling designs make these systems suitable for high performance clusters where node density and power efficiency are priorities, says the company. Liquid cooling is also available, resulting in further opex savings. Intel Optane Persistent Memory (PMem) is also supported, enabling significantly larger models to be held in memory, close to the CPU, before processing on the GPUs.
For applications that require multi-system interaction, the system can also be equipped with four Nvidia ConnectX-6 200Gbits per second InfiniBand cards to support GPUDirect RDMA with a 1:1 GPU-to-DPU ratio.
The new 2U 2-Node is an energy-efficient, resource-saving architecture in which each node supports up to three double-width GPUs. Each node also features a single third generation Intel Xeon Scalable processor with up to 40 cores and built-in AI and HPC (high performance computing) acceleration. AI, rendering, and VDI applications will benefit from this balance of CPUs and GPUs, says Supermicro. Equipped with the company's Advanced I/O Module (AIOM) expansion slots, the system can also process massive data flows for demanding AI/ML (machine learning) applications, deep learning training, and inferencing, while securing the workload and learning models. It is also suitable for multi-instance high-end cloud gaming and many other compute-intensive VDI applications. In addition, virtual content delivery networks (vCDNs) will be able to satisfy increasing demands for streaming services, said Supermicro. Power supply redundancy is built in, as either node can use the adjacent node’s power supply in the event of a failure.