Nearly 70 per cent of the top 500 supercomputers use NVIDIA GPUs, and the latest innovations in the NVIDIA HGX™ AI supercomputing platform bring together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA InfiniBand networking, and a fully optimized NVIDIA AI and HPC software stack. These technologies are the NVIDIA A100 80GB PCIe GPU, Quantum-2 fixed-configuration switch systems, and Magnum IO GPUDirect Storage. QCT systems not only support these technologies but embrace them at a time when supercomputing is expanding from simulations to data-driven approaches that require real-time information exchange for large batch sizes.
NVIDIA A100 80GB PCIe GPUs
The A100 80GB PCIe GPU combines FP64 Tensor Cores with faster, larger memory in a single node, delivering speedups on large datasets on mainstream servers and increasing GPU utilization for data analytics workloads and scientific computing applications. Some deep learning models must be retrained hourly to reflect the most up-to-date data, and training runs that are spread across multiple nodes can be time consuming and slow, lengthening development cycles. By keeping datasets in GPU memory and minimizing the time spent waiting on data loads, researchers can achieve higher throughput and faster results, maximizing their ROI.
Next Generation Quantum-2 NDR 400Gb/s InfiniBand Switch Systems
Complex workloads demand ultra-fast processing of extreme-size datasets with highly parallelized algorithms. As these computing requirements continue to grow exponentially, NVIDIA NDR 400Gb/s InfiniBand, the world’s only fully offloadable, in-network computing platform, provides a dramatic leap in performance for HPC, AI, and hyperscale cloud infrastructures, achieving unmatched performance with less cost and complexity.
The NVIDIA Quantum-2 1U fixed-configuration switch systems deliver an unprecedented 64 ports of NDR 400Gb/s InfiniBand (or 128 ports of NDR200 at 200Gb/s), providing 3x higher port density versus HDR InfiniBand.
Magnum IO GPUDirect Storage
As datasets increase in size, the time spent loading data can impact application performance. Providing unrivalled performance for complex workloads, Magnum IO GPUDirect Storage provides direct memory access (DMA) between GPU memory and storage. The direct path enables applications to benefit from lower IO latency, utilizing the full bandwidth of the network adapters while decreasing the utilization load on the CPU and managing the impact of increased data consumption.
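To make the direct path concrete, the following is a minimal sketch of the cuFile API that underlies Magnum IO GPUDirect Storage, reading a file straight into GPU memory with no bounce buffer in host RAM. It requires a GDS-enabled system with CUDA and libcufile, error handling is abbreviated, and the file path and transfer size are illustrative assumptions, so treat it as a sketch rather than a drop-in program.

```c
// Sketch: reading from storage directly into GPU memory via the cuFile API
// (Magnum IO GPUDirect Storage). Requires CUDA and libcufile; the path and
// size below are illustrative assumptions.
#include <cuda_runtime.h>
#include <cufile.h>
#include <fcntl.h>
#include <stdio.h>

int main(void) {
    const size_t size = 1 << 20;            // 1 MiB, arbitrary example size
    void *dev_buf = NULL;

    cuFileDriverOpen();                     // initialize the GDS driver
    cudaMalloc(&dev_buf, size);             // destination buffer in GPU memory
    cuFileBufRegister(dev_buf, size, 0);    // register the buffer for DMA

    // O_DIRECT bypasses the page cache; /data/sample.bin is a hypothetical path
    int fd = open("/data/sample.bin", O_RDONLY | O_DIRECT);
    CUfileDescr_t descr = {0};
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;

    CUfileHandle_t fh;
    cuFileHandleRegister(&fh, &descr);

    // DMA from storage into GPU memory, skipping the CPU bounce buffer
    ssize_t n = cuFileRead(fh, dev_buf, size, /*file_offset=*/0, /*buf_offset=*/0);
    printf("read %zd bytes into GPU memory\n", n);

    cuFileHandleDeregister(fh);
    cuFileBufDeregister(dev_buf);
    cudaFree(dev_buf);
    cuFileDriverClose();
    return 0;
}
```

The key difference from a conventional read is that the destination of `cuFileRead` is a device pointer, so the transfer lands in GPU memory without passing through a host-side staging buffer, which is what lowers IO latency and CPU utilization.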
QCT QuantaGrid Servers Supporting NVIDIA Technologies
As AI/HPC applications traditionally need to perform an enormous number of calculations per second, NVIDIA A100 80GB PCIe GPUs can increase the compute density of each server node and dramatically reduce the number of servers required, resulting in substantial savings in cost, power, and data center space. For in-network computing, network architects can use Quantum-2 switch systems to accelerate communication frameworks and improve application performance. Finally, loading times can be shortened with Magnum IO GPUDirect Storage, which creates a direct data path between local or remote storage and GPU memory.
Previously, during Computex 2021, we also announced our latest QCT NVIDIA-Certified servers – QuantaGrid D43N-3U, QuantaGrid D53XQ-2U, and QuantaGrid D43KQ-2U – which also support the latest NVIDIA A100 80GB PCIe GPU. Built for AI, with highly configurable I/O slots for NVIDIA accelerators and flexible storage configurations, these systems are tailored for a diversity of data-intensive workloads. QCT currently offers these systems; support for Magnum IO and NVIDIA Quantum-2 systems is expected later in 2021.

Follow QCT on Facebook, LinkedIn, and Twitter to receive the latest news and announcements.