QCT Achieved Performance Gains Across Diverse Workloads in MLPerf Training v3.1

Machine learning is advancing at an unprecedented pace. The results of MLPerf Training v3.1, the latest round of the MLPerf Training and HPC benchmarks, show 49X performance gains in just five years. As a member of MLCommons, QCT contributed to this progress with two submissions in the closed division. QCT’s submissions covered Image Classification, Object Detection, Natural Language Processing, Speech Recognition, and Recommendation tasks, each of which met its prescribed quality target (see below), using its QuantaGrid D54U-3U and QuantaGrid D74H-7U servers.

| Area | Benchmark | Dataset | Quality Target | Reference Implementation Model | Latest Version Available |
|---|---|---|---|---|---|
| Vision | Image classification | ImageNet | 75.90% classification | ResNet-50 v1.5 | v3.1 |
| Vision | Image segmentation (medical) | KiTS19 | 0.908 Mean DICE score | 3D U-Net | v3.1 |
| Vision | Object detection (light weight) | Open Images | 34.0% mAP | RetinaNet | v3.1 |
| Vision | Object detection (heavy weight) | COCO | 0.377 Box min AP and 0.339 Mask min AP | Mask R-CNN | v3.1 |
| Language | Speech recognition | LibriSpeech | 0.058 Word Error Rate | RNN-T | v3.1 |
| Language | NLP | Wikipedia 2020/01/01 | 0.72 Mask-LM accuracy | BERT-large | v3.1 |
| Commerce | Recommendation | Criteo 4TB multi-hot | 0.8032 AUC | DLRM-dcnv2 | v3.1 |

Fig. 1. MLPerf Training v3.1 benchmarks that QCT submitted
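
As a rough illustration of what "meeting the quality target" means, the thresholds listed above can be expressed as a small lookup and comparison. This is a minimal sketch, not MLPerf's or QCT's tooling: the dictionary keys, the `meets_target` helper, and the `higher_is_better` flag are hypothetical names introduced here, and the measured values in the usage lines are made-up inputs, not submitted results. The one assumption worth noting is that word error rate is treated as lower-is-better, while every other metric is treated as higher-is-better.

```python
# Quality targets copied from the benchmark list above.
# Each entry is (target, higher_is_better). The keys and the flag are
# hypothetical names for this sketch, not identifiers used by MLPerf.
TARGETS = {
    "resnet50_accuracy": (0.7590, True),   # 75.90% classification
    "unet3d_mean_dice": (0.908, True),
    "retinanet_map": (0.340, True),        # 34.0% mAP
    "maskrcnn_box_min_ap": (0.377, True),
    "maskrcnn_mask_min_ap": (0.339, True),
    "rnnt_word_error_rate": (0.058, False),  # error rate: lower is better
    "bert_mask_lm_accuracy": (0.72, True),
    "dlrm_dcnv2_auc": (0.8032, True),
}


def meets_target(metric: str, measured: float) -> bool:
    """Return True if a measured value satisfies the listed quality target."""
    target, higher_is_better = TARGETS[metric]
    if higher_is_better:
        return measured >= target
    return measured <= target


if __name__ == "__main__":
    # Illustrative inputs only; these are not actual benchmark results.
    print(meets_target("rnnt_word_error_rate", 0.057))
    print(meets_target("resnet50_accuracy", 0.75))
```

Because Mask R-CNN lists two thresholds, a run in this sketch would need to pass both the `maskrcnn_box_min_ap` and `maskrcnn_mask_min_ap` checks.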

The QuantaGrid D74H-7U is an 8-way GPU server equipped with the NVIDIA HGX H100 8-GPU Hopper SXM5 module, making it an ideal choice for compute-intensive AI training. With innovative hardware design and software optimization, the QuantaGrid D74H-7U server consistently delivers cutting-edge training performance.


Fig. 2. QuantaGrid D74H-7U


The QuantaGrid D54U-3U, powered by 4th Gen Intel Xeon Scalable processors, is a 3U system that accommodates up to four dual-width or up to eight single-width accelerator cards, along with 32 DIMM slots. This provides a comprehensive and flexible architecture that can be tailored to optimize various AI/HPC applications. In this round, the QuantaGrid D54U-3U server, configured with four NVIDIA H100-PCIe-80GB accelerator cards, achieved outstanding performance.


Fig. 3. QuantaGrid D54U-3U with the lid open


QCT remains committed to delivering comprehensive hardware systems, solutions, and services to academic and industrial users. Moreover, we are dedicated to maintaining transparency by openly sharing our MLPerf results with the public, encompassing both training and inference benchmarks. For more detailed information, see the official MLPerf Training results.
