QCT's Next Leap in Accelerated Computing

NVIDIA GB200 Grace™ Blackwell Superchip

The NVIDIA GB200 Grace Blackwell Superchip arrives as the flagship of the Blackwell architecture, catapulting generative AI to trillion-parameter scale and delivering 30X faster real-time large language model (LLM) inference, 25X lower TCO, and 25X less energy. It combines two Blackwell GPUs and a Grace CPU and scales up to the GB200 NVL72, a 72-GPU NVIDIA® NVLink®-connected system that acts as a single massive GPU.

QCT NVIDIA GB200 Grace Blackwell Superchip & NVIDIA GB200 NVL72 Products (NVIDIA MGX™ Architecture)

The NVIDIA GB200 Grace Blackwell Superchip supercharges next-generation AI and accelerated computing. Acting as the heart of a much larger system, the NVIDIA GB200 Grace Blackwell Superchip can scale up to the NVIDIA GB200 NVL72, the first architecture with rack-scale fifth-generation NVIDIA® NVLink®, connecting 72 high-performance NVIDIA Blackwell Tensor Core GPUs and 36 NVIDIA Grace™ CPUs; NVIDIA® NVLink®-C2C links each Grace CPU to its GPUs with 900 GB/s of bidirectional bandwidth.

With NVIDIA® NVLink® Chip-to-Chip (C2C), applications have coherent access to a unified memory space to eliminate complexity and speed up deployment. This simplifies programming and supports the larger memory needs of trillion-parameter LLMs, transformer models for multimodal tasks, models for large-scale simulations, and generative models for 3D data.
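To put those memory needs in perspective, here is a quick back-of-the-envelope sketch of how much memory a trillion-parameter model needs just for its weights (illustrative arithmetic only; the precision choices are assumptions for illustration, not QCT or NVIDIA specifications):

```python
# Illustrative sizing: memory needed to hold the weights of a
# trillion-parameter LLM at two storage precisions. The precision
# choices are assumptions, not vendor specifications.

def model_weight_memory_tb(num_params: float, bytes_per_param: float) -> float:
    """Weight memory in terabytes (1 TB = 1e12 bytes)."""
    return num_params * bytes_per_param / 1e12

fp16_tb = model_weight_memory_tb(1e12, 2.0)   # FP16: 2 bytes per parameter
fp4_tb = model_weight_memory_tb(1e12, 0.5)    # FP4: 0.5 bytes per parameter

print(f"FP16 weights: {fp16_tb:.1f} TB")  # 2.0 TB
print(f"FP4 weights:  {fp4_tb:.1f} TB")   # 0.5 TB
```

Even before activations and KV caches, the weights alone run to terabytes, which is why a coherent unified CPU-plus-GPU memory space matters for deployment.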

Additionally, the NVIDIA GB200 NVL72 uses NVIDIA® NVLink® and cold-plate-based liquid cooling to create a single massive 72-GPU rack that can overcome thermal challenges, increase compute density, and facilitate high-bandwidth, low-latency GPU communication.

Performance Results

To accelerate performance for multitrillion-parameter and mixture-of-experts AI models, the latest iteration of NVIDIA® NVLink® delivers a groundbreaking 1.8 TB/s of bidirectional throughput per GPU, ensuring seamless high-speed communication among up to 576 GPUs for the most complex LLMs.

(Source: NVIDIA®)
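As a simple sanity check, the aggregate NVLink bandwidth inside one 72-GPU NVL72 domain can be sketched from the per-GPU figure quoted above (illustrative arithmetic only):

```python
# Illustrative arithmetic: aggregate bidirectional NVLink bandwidth
# across one 72-GPU GB200 NVL72 domain, from the 1.8 TB/s per-GPU figure.

PER_GPU_TBPS = 1.8       # fifth-generation NVLink, bidirectional, per GPU
GPUS_PER_NVL72 = 72      # GPUs in one NVL72 rack-scale domain

aggregate_tbps = PER_GPU_TBPS * GPUS_PER_NVL72
print(f"Aggregate NVLink bandwidth: {aggregate_tbps:.1f} TB/s")  # 129.6 TB/s
```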

Liquid-cooled GB200 NVL72 racks reduce a data center’s carbon footprint and energy consumption. Liquid cooling increases compute density, reduces the amount of floor space used, and facilitates high-bandwidth, low-latency GPU communication with large NVLink domain architectures. Compared to NVIDIA H100 air-cooled infrastructure, GB200 delivers 25X more performance at the same power while reducing water consumption.

(Source: NVIDIA®)
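Taking the quoted "25X more performance at the same power" at face value, the implied energy per unit of inference work can be sketched as follows (illustrative arithmetic; the normalization is an assumption, not a measured result):

```python
# Illustrative arithmetic: relative energy per unit of inference work for
# GB200 NVL72 vs. an air-cooled H100 baseline, assuming the quoted
# "25X more performance at the same power" figure.

baseline_perf = 1.0   # H100 air-cooled baseline, normalized
gb200_perf = 25.0     # 25X performance at the same power envelope
power_ratio = 1.0     # same power draw as the baseline

# Energy per unit of work scales as power / performance.
relative_energy = power_ratio / (gb200_perf / baseline_perf)
print(f"Energy per unit of work vs. baseline: {relative_energy:.2%}")  # 4.00%
```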

QCT Servers Powered by NVIDIA

QCT NVIDIA MGX™-based Systems

QCT NVIDIA MGX™-based systems such as the QuantaGrid S74G-2U, the QuantaEdge EGX77GE-2U, and upcoming NVIDIA Grace™ Blackwell servers allow different configurations of GPUs, CPUs, and DPUs, shortening the time frame for building future-compatible solutions. Based on the modular reference design, these configurations can not only support future accelerators but also meet the requirements of diverse workloads, including those that incorporate liquid cooling, shortening the development journey and reducing time to market.

  • Accelerates Time to Market
  • Multiple Form Factors to Offer Maximum Flexibility
  • Runs Full NVIDIA Software Stack to Drive Acceleration Further

QCT NVIDIA GB200 Grace Blackwell Superchip

  • New class of rack-scale architecture interconnecting 36 NVIDIA Grace™ CPUs and 72 Blackwell GPUs
  • Supports up to 2x NVIDIA GB200 Grace Blackwell Superchips in a 2U form factor
  • Cold plate-based liquid cooling design to meet the thermal challenge from high-powered GPUs
  • Designed to deliver 30X faster real-time inference for trillion-parameter LLMs


NVIDIA MGX™ Modular Architecture

  • Powered by the NVIDIA Grace™ Hopper Superchip
  • NVIDIA® NVLink®-C2C high-bandwidth and low-latency interconnect
  • Optimized for memory-intensive inference and AI workloads


QuantaEdge EGX77GE-2U

  • Powered by the NVIDIA Grace™ Hopper Superchip
  • NVIDIA® NVLink®-C2C high-bandwidth and low-latency interconnect
  • Thermal enhancements for critical environments
  • O-RAN-compliant 5G vRAN system
  • Multi-Access Edge Computing (MEC) server


NVIDIA MGX™ Modular Architecture

QuantaGrid D74S-1U

  • New class of rack-scale architecture interconnecting 32 NVIDIA Grace Hopper Superchips via NVIDIA® NVLink®
  • Supports up to 2x 1000W TDP NVIDIA® GH200 144GB Grace Hopper™ Superchips in a 1U form factor
  • Cold plate-based liquid cooling design to meet the thermal challenge from high-powered GPUs
  • Designed to handle terabyte-class AI models


QCT and NVIDIA have worked together to push the boundaries of innovation with a variety of accelerated infrastructures powered by NVIDIA, enabling multiple use cases across different verticals.

For smart manufacturing, QCT has integrated NVIDIA technologies such as NVIDIA CloudXR™, NVIDIA Omniverse™, and NVIDIA® CUDA® with the QCT OmniPOD Enterprise 5G Solution to pre-validate virtual tour, production-line simulation, and object detection use cases.

For HPC and AI infrastructure, QCT POD leverages NVIDIA GPUs, InfiniBand, and CUDA to deliver better performance, allowing HPC and AI technologies to run under one system architecture with a cloud-native scheduler and data-tiering tools.

QCT has also developed the QCT NVIDIA AI Enterprise Solution (NVAIE), which takes advantage of the NVIDIA AI Enterprise suite and a virtualization platform to help enterprises run AI applications.

Watch Videos

QCT NVIDIA MGX™ Systems

QCT Cutting-Edge Infrastructures and Solutions Powered by NVIDIA®

QCT QoolRack - Liquid-to-Air Cooling Solution

Contact QCT
