Quanta Cloud Technology (QCT) once again achieved leadership performance in the latest MLPerf Training v6.0 benchmark suite. As the AI landscape continues to evolve, MLPerf Training is shifting its focus from classical deep learning benchmarks to real-world, generative AI training pipelines. By including modern and open source-based benchmarks, MLPerf Training allows all participating organizations, including QCT, to test and optimize their infrastructure against the most up-to-date workloads. This not only reinforces the benchmark suite’s impact, but also provides customers with a clear reference to the workload performance results they need the most.
In this round, QCT evaluated two NVIDIA accelerated systems, NVIDIA GB300 NVL72 and the QuantaGrid D75H-10U accelerated by NVIDIA HGX B300, for a variety of workloads. Top-tier results were achieved for LLM pre-training/Generative AI (Llama 3.1 8B, GPT-OSS 20B), LLM fine-tuning (Llama 2 70B LoRA), text-to-image (FLUX.1), and recommendation systems (DLRM-DCNv2).
NVIDIA GB300 NVL72 Built by QCT

NVIDIA GB300 NVL72 with the QuantaGrid D75U-1U built by QCT brings enhanced compute and memory capabilities to power next-generation AI. Composed of 18 QuantaGrid D75U-1U compute trays accelerated by NVIDIA Blackwell Ultra GPUs and 9 NVIDIA NVLink Switch trays, it features 72 interconnected NVIDIA Blackwell Ultra GPUs acting as one gigantic GPU, delivering exceptional 1,440 PFLOPS performance for FP4 and 720 PFLOPS for FP8/FP6. Enhanced with 130 TB/s of NVLink scale-up bandwidth, 20 TB of HBM3e memory per rack and 800 Gb/s scale-out connectivity per GPU provided by NVIDIA ConnectX-8 SuperNIC, it delivers unmatched performance for large-scale AI training, generative AI, and high-throughput inference workloads.
For MLPerf Training v6.0, QCT tested both 64-GPU and 8-GPU configurations on the NVIDIA GB300 NVL72 platform. This allowed QCT to measure the scaling efficiency to understand workload scalability across different cluster sizes.
QuantaGrid D75H-10U

Fig. 2 QuantaGrid D75H-10U accelerated by NVIDIA HGX B300
This air-cooled server based on the NVIDIA HGX B300 platform is accelerated by NVIDIA Blackwell Ultra GPUs, delivering extreme performance for large-scale AI training and high-throughput inference. With eight NVIDIA Blackwell Ultra GPUs interconnected via NVIDIA NVLink and eight OSFP ports supporting speeds up to 800Gb/s with NVIDIA Spectrum-X Ethernet or NVIDIA Quantum-X800 InfiniBand, the D75H-10U delivers maximum bandwidth for multi-node scaling.
QCT’s MLPerf Training v6.0 results further reinforce its position as a leader in advanced AI infrastructure. Leveraging the capabilities of NVIDIA GB300 NVL72 built by QCT and the QuantaGrid D75H-10U accelerated by NVIDIA HGX B300, QCT delivers outstanding performance, scalability, and reliability for training large-scale models and supporting demanding AI workloads. From complex language model training to enabling real-time generative and agentic AI, QCT continues to provide robust, production-ready solutions. In collaboration with NVIDIA, QCT empowers organizations with the infrastructure needed to confidently build, train, and scale AI—driving innovation and accelerating the adoption of next-generation AI technologies. For the latest MLPerf Training v6.0 results, visit: https://mlcommons.org/benchmarks/training/

