QCT's Next Leap in Accelerated Computing

NVIDIA Vera Rubin Platform

The NVIDIA Vera Rubin platform fuels the future of the AI factory. It delivers breakthrough performance for AI training, AI reasoning, and agentic AI workloads at a significantly lower cost per million tokens than NVIDIA Blackwell, enabling large-scale AI deployments.

The NVIDIA Vera Rubin platform marks a performance leap with optimized compute and 288GB of HBM4 memory per GPU. Compared with NVIDIA Blackwell, it offers almost 3x the memory bandwidth with HBM4 and 2x the scale-up bandwidth with NVIDIA NVLink™ 6, improving GPU-to-GPU communication efficiency at low latency.

NVIDIA Vera Rubin NVL72

NVIDIA Vera Rubin NVL72, unifying 72 NVIDIA Rubin GPUs and 36 NVIDIA Vera CPUs, delivers massive compute density and up to 20.7 TB of HBM4 memory. It leverages NVIDIA NVLink™ 6 switches to interconnect GPUs for scale-up intelligence, integrating the NVIDIA ConnectX®-9 SuperNIC™ and NVIDIA BlueField®-4 DPU for elevated networking capabilities.
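
The 20.7 TB figure can be reproduced from the per-GPU capacity quoted earlier. The following is a minimal check, assuming decimal units (1 TB = 1,000 GB) and that all 72 GPUs carry the full 288GB; the variable names are illustrative only.

```python
# Aggregate HBM4 capacity of one Vera Rubin NVL72 rack.
# Assumptions: decimal units (1 TB = 1000 GB), every GPU fully populated.
GPUS_PER_RACK = 72
HBM4_GB_PER_GPU = 288

total_gb = GPUS_PER_RACK * HBM4_GB_PER_GPU
total_tb = total_gb / 1000

print(f"{total_gb} GB = {total_tb:.1f} TB")  # 20736 GB = 20.7 TB
```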

NVIDIA Vera Rubin NVL72 is also optimized for scale-out deployments and AI factories. By integrating NVIDIA Quantum-X800 InfiniBand and NVIDIA Spectrum™-X Ethernet, it delivers breakthrough performance for LLM, agentic AI, AI reasoning and video inferencing applications.

Gen-to-Gen Comparison

QCT Servers Accelerated by NVIDIA

Discover the power of QCT’s cutting-edge infrastructure accelerated by the NVIDIA Blackwell Ultra GPU, the NVIDIA Grace Blackwell Superchip, and the latest NVIDIA data center PCIe GPUs. These QuantaGrid systems not only support future accelerators and liquid cooling to meet diverse workload needs, but also feature flexible, modular designs to speed up development and reduce time to market.

  • Accelerates Time to Market
  • Multiple Form Factors to Offer Maximum Flexibility
  • Runs Full NVIDIA Software Stack to Drive Acceleration Further
NVIDIA Vera Rubin NVL72

Feature | Performance | Compared with Blackwell
NVFP4 Inference | 3.6 EFLOPS | 5x
NVFP4 Training | 2.5 EFLOPS | 3.5x
LPDDR5X Capacity | 54TB | 2.5x
HBM4 Capacity | 20.7TB | 1.5x
HBM4 Bandwidth | 1.6PB/s | 2.8x
Scale-up Bandwidth | 260 TB/s | 2x
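
To read the multipliers in this comparison, it can help to back out the baseline each one implies. The sketch below simply divides each Vera Rubin NVL72 figure by its multiplier. The results are only approximate, since the multipliers are rounded, and they should not be taken as official Blackwell specifications. The function and variable names are hypothetical.

```python
# (feature, Vera Rubin NVL72 value, unit, multiplier vs. Blackwell), as listed above.
ROWS = [
    ("NVFP4 Inference", 3.6, "EFLOPS", 5.0),
    ("NVFP4 Training", 2.5, "EFLOPS", 3.5),
    ("LPDDR5X Capacity", 54.0, "TB", 2.5),
    ("HBM4 Capacity", 20.7, "TB", 1.5),
    ("HBM4 Bandwidth", 1.6, "PB/s", 2.8),
    ("Scale-up Bandwidth", 260.0, "TB/s", 2.0),
]


def implied_baseline(value, multiplier):
    """Back out the baseline that an 'Nx' claim implies (value / N)."""
    return value / multiplier


# Approximate baselines only: the published multipliers are rounded.
baselines = {name: implied_baseline(value, mult) for name, value, _, mult in ROWS}

for name, value, unit, mult in ROWS:
    print(f"{name:<20} {value:>6} {unit:<6} / {mult}x -> ~{baselines[name]:.2f} {unit}")
```
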
#GenAI
#HPC
#EDA
#CFD
#Storage
QuantaGrid D76V-1U
NVIDIA MGX™ Architecture-based System
NVIDIA Vera Rubin Platform
  • Enhanced Compute & HBM4 Bandwidth
  • Network Upgrades
  • Optimized Serviceability
  • Efficient Cooling
Platform: NVIDIA Vera Rubin
CPU: (2) NVIDIA Vera CPUs
GPU: (4) NVIDIA Rubin GPUs
Memory:
  CPU: Up to 768GB LPDDR5X per CPU
  GPU: Up to 288GB HBM4 per GPU
Storage:
  (4) E1.S 9.5mm PCIe data SSDs
  (1) E1.S 9.5mm PCIe boot SSD
Networking:
  (1) NVIDIA® BlueField®-4 B4240V with dual 400Gb/s QSFP112 ports
  (8) NVIDIA ConnectX®-9 800G OSFP ports
Power: 48-54V DC busbar clip
Dimensions: (W) 438 x (H) 43.6 x (D) 766mm
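
The per-system figures above can be related to the rack-level NVL72 numbers with some simple arithmetic. The sketch below assumes a rack is populated only with systems of this configuration, which this page does not state explicitly. The per-GPU network share is a plain division and says nothing about how ports are actually wired.

```python
# Per-system figures from the QuantaGrid D76V-1U specifications.
GPUS_PER_SYSTEM = 4
CPUS_PER_SYSTEM = 2
HBM4_GB_PER_GPU = 288
CX9_PORTS = 8
CX9_PORT_GBPS = 800  # Gb/s per OSFP port

# Rack-level figures quoted for NVIDIA Vera Rubin NVL72.
RACK_GPUS = 72
RACK_CPUS = 36

# Assumption: the rack holds only systems of this configuration.
systems_by_gpu = RACK_GPUS // GPUS_PER_SYSTEM
systems_by_cpu = RACK_CPUS // CPUS_PER_SYSTEM

system_hbm4_gb = GPUS_PER_SYSTEM * HBM4_GB_PER_GPU
system_network_gbps = CX9_PORTS * CX9_PORT_GBPS
network_gbps_per_gpu = system_network_gbps / GPUS_PER_SYSTEM

print(f"Systems per rack: {systems_by_gpu} (by GPU), {systems_by_cpu} (by CPU)")
print(f"HBM4 per system: {system_hbm4_gb} GB")
print(f"ConnectX-9 bandwidth per system: {system_network_gbps} Gb/s")
print(f"ConnectX-9 bandwidth per GPU: {network_gbps_per_gpu:.0f} Gb/s")
```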
NVIDIA CMX Context Memory Storage Platform

The platform introduces a specialized G3.5 context memory tier between node-local SSDs and shared storage, extending effective GPU KV-cache capacity across the pod. By treating KV cache as a first-class AI data type and optimizing pod-level context management, it delivers up to 5x higher tokens-per-second and up to 5x better power efficiency for agentic AI workloads.

This AI-native architecture uses the NVIDIA® B4480SP Dual BF4 Storage Processor, NVIDIA DOCA™ Memos, NVIDIA Dynamo, and NVIDIA Spectrum-X™ Ethernet to enable seamless, low-latency context sharing across nodes. By rapidly pre-staging KV-cache context back to the GPU during prefill, it minimizes inference stalls and maximizes end-to-end throughput. This ensures that large-scale AI factories remain high performing, power-efficient, and cost-effective.
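
The idea of a context tier sitting between GPU memory and shared storage can be illustrated with a toy model. The sketch below is a hypothetical three-tier cache with least-recently-used demotion and promotion on a hit. It is not an NVIDIA API and does not reflect how CMX, DOCA, or Dynamo are implemented. The class name, method names, and slot counts are all assumptions made for illustration.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy KV-cache lookup across GPU HBM, a context tier, and shared storage.

    Hypothetical illustration only; not an NVIDIA API.
    """

    def __init__(self, hbm_slots, context_slots):
        self.hbm = OrderedDict()      # fastest and smallest tier, kept in LRU order
        self.context = OrderedDict()  # intermediate context memory tier
        self.shared = {}              # slowest tier, treated as unbounded here
        self.hbm_slots = hbm_slots
        self.context_slots = context_slots

    def put(self, key, blocks):
        """Place KV blocks in HBM, demoting least recently used entries."""
        # Drop stale copies so each key lives in exactly one tier.
        self.context.pop(key, None)
        self.shared.pop(key, None)
        self.hbm[key] = blocks
        self.hbm.move_to_end(key)
        while len(self.hbm) > self.hbm_slots:
            old_key, old_blocks = self.hbm.popitem(last=False)
            self._demote(old_key, old_blocks)

    def _demote(self, key, blocks):
        """Move an entry into the context tier, spilling overflow to shared storage."""
        self.context[key] = blocks
        self.context.move_to_end(key)
        while len(self.context) > self.context_slots:
            old_key, old_blocks = self.context.popitem(last=False)
            self.shared[old_key] = old_blocks

    def get(self, key):
        """Return (blocks, tier_name); hits in lower tiers are staged back to HBM."""
        if key in self.hbm:
            self.hbm.move_to_end(key)
            return self.hbm[key], "hbm"
        for name, tier in (("context", self.context), ("shared", self.shared)):
            if key in tier:
                blocks = tier.pop(key)
                self.put(key, blocks)  # stage the context back into HBM
                return blocks, name
        return None, "miss"


cache = TieredKVCache(hbm_slots=2, context_slots=2)
for session in ("a", "b", "c", "d", "e"):
    cache.put(session, f"kv-{session}")

print(cache.get("e")[1])  # hbm
print(cache.get("c")[1])  # context
print(cache.get("a")[1])  # shared
```

In this toy model, a hit in the context tier avoids a trip to shared storage, which is the effect the pre-staging described above is meant to achieve.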

QuantaGrid D66F-2U
NVIDIA MGX™ Architecture-based System
NVIDIA BlueField®-4 STX Storage Processor
  • Adds a dedicated, RDMA-attached storage tier for KV cache between GPU HBM and enterprise storage.
  • Enables scalable, shared context reuse across nodes to support long-context, multi-agent inference and agentic AI workloads.
  • Delivers 5x more tokens per second than traditional storage, 4x higher energy efficiency than traditional CPU architectures for high-performance storage, and 2x more pages ingested per second for enterprise AI data.
HPM: B4480SP Dual BlueField®-4 STX Storage Processors, powered by NVIDIA Vera
  (2) NVIDIA Vera CPU, 450W each
  (4) NVIDIA ConnectX®-9
Memory: (16) SOCAMM DIMM slots, (4) populated with SOCAMM for CMX
Networking: (4) NVIDIA ConnectX®-9 800G OSFP ports
Storage:
  (24) E3.S NVMe data drives
  (2) M.2 2280 boot drives
Cooling: Full liquid cooling
Power: 48-54V DC busbar
Form Factor: 2U
Dimensions: (W) 438 x (H) 87.5 x (D) 766mm
NVIDIA ARC-Pro

QuantaEdge EGN77C-2U is a carrier-grade server system derived from NVIDIA Aerial RAN Computer Pro (ARC-Pro), an AI-RAN platform for software-defined, AI-native 5G and 6G networks. Built on the NVIDIA Grace Blackwell architecture, it integrates the NVIDIA RTX PRO™ 4500 Blackwell Server Edition GPU, NVIDIA Grace CPUs, and embedded NVIDIA ConnectX®-8 networking to support time synchronization and RAN Layer 1–3 workloads. Optimized for serviceability, reliability, and telco deployments, the EGN77C-2U enables operators to evolve from traditional RAN to a scalable, software-defined AI-RAN architecture.

QuantaEdge EGN77C-2U
NVIDIA MGX™ Architecture-based System
NVIDIA Aerial RAN Computer Pro (ARC-Pro)
  • NVIDIA ARC-Pro platform tailored for AI-RAN
  • Short-depth, carrier-grade server for the telco edge scenario
  • 2U2N form factor optimized for high availability
  • Networking and SyncE powered by NVIDIA ConnectX®-8 SuperNIC™
  • Accelerated by NVIDIA RTX PRO™ 4500 Blackwell Server Edition
Processor: (1) NVIDIA Grace™ CPU per node
Memory: (16) LPDDR5X chip-down per node
Networking: (16) 25GbE SFP28 + (2) 400GbE QSFP112 per node
Onboard Storage: (2) PCIe 5.0 M.2 2280 per node
Expansion Slots: (1) PCIe 5.0 x16 FHFL single-width slot per node
Dimensions: (W) 448 x (H) 87.5 x (D) 420mm
Form Factor: 2U2N EIA rackmount server
NVIDIA GB300 NVL72

The NVIDIA GB300 NVL72 brings enhanced compute and memory capabilities to the next generation of AI and accelerated computing with 72 interconnected NVIDIA Blackwell Ultra GPUs acting as one gigantic GPU.  

This rack-level solution can be seamlessly integrated with existing NVIDIA GB200 NVL72 infrastructure, delivering higher performance with efficient liquid cooling. The design reduces energy use while maximizing compute density and optimizing floor space.

QuantaGrid D75U-1U
NVIDIA MGX™ Architecture-based System
NVIDIA GB300 Grace™ Blackwell Ultra Superchip
  • Accelerated by NVIDIA GB300 Grace™ Blackwell Ultra Superchips
  • 279GB of HBM3e memory capacity allows for larger batch sizing and maximum throughput
  • NVIDIA ConnectX®-8 SuperNIC delivers 800 Gb/s connectivity per GPU
  • NVIDIA ConnectX®-8 provides a PCIe switch function, removing the IPEX board and cables for a simpler system design
  • GB200 and GB300 compute trays unified on the Bianca design, leveraging the NVIDIA MGX™ modular design
Platform: (2) NVIDIA GB300 Grace Blackwell Ultra Superchip
Processor: (2) NVIDIA Grace™ CPU
GPU: (4) NVIDIA Blackwell Ultra GPUs
Memory:
  CPU: Up to 480GB LPDDR5X per CPU
  GPU: Up to 279GB HBM3e per GPU
Storage: (4) E1.S 15mm PCIe SSDs, (8) slots available
Onboard Storage: (1) PCIe M.2 22110/2280 SSDs
Networking:
  (1) NVIDIA® BlueField®-3 B3240 dual port 400G DPU
  (4) NVIDIA® ConnectX®-8 800Gb OSFP ports
Power: 48-54V DC bus bar
Dimensions: (W) 438 x (H) 43.6 x (D) 766mm
NVIDIA GB200 NVL72

The NVIDIA GB200 NVL72 is a powerhouse that connects 36 NVIDIA Grace™ CPUs and 72 Blackwell GPUs via the NVIDIA NVLink™-C2C interconnect. Functioning as a single, colossal GPU, this liquid-cooled, rack-scale system is designed to navigate the complexities of trillion-parameter AI models with unprecedented ease.

QuantaGrid D75B-1U
NVIDIA MGX™ Architecture-based System
NVIDIA GB200 Grace™ Blackwell Superchips
  • Accelerated by dual NVIDIA GB200 Grace™ Blackwell Superchips 
  • CPU and GPU Connected by NVIDIA NVLink™ 
  • Optimized serviceability with a front-access, tool-less, hot-pluggable design 
  • Liquid-cooled NVIDIA MGX architecture 
Platform: (2) NVIDIA GB200 Grace Blackwell Superchip
Processor: (2) NVIDIA Grace™ CPUs
GPU: (4) NVIDIA Blackwell GPUs
Memory:
  CPU: Up to 480GB LPDDR5x per CPU
  GPU: Up to 186GB HBM3e per GPU
Storage: (8) hot-swappable E1.S 15mm PCIe SSDs
Onboard Storage: (1) PCIe M.2 22110/2280 SSDs
Networking:
  (2) NVIDIA® BlueField®-3 B3240 dual port 400G DPUs
  (4) NVIDIA ConnectX®-7 400Gb OSFP ports
Cooling:
  CPU/GPU: Liquid cooling cold plate
  Peripheral: (8) 4056 dual rotor fans
Power: 48-54V DC bus bar clip
Dimensions: (W) 438 x (H) 43.6 x (D) 766mm
NVIDIA RTX PRO™ 6000 Blackwell Server Edition

The QuantaGrid D75E-4U is more than just an x86-based system built on the Intel® Xeon® platform. It adheres to the NVIDIA MGX™ architecture, offering a modular design that meets diverse AI applications and customer demands. This system is compatible with a full range of NVIDIA data center PCIe GPUs, including NVIDIA RTX PRO™ 6000 Blackwell Server Edition, NVIDIA H200 NVL, NVIDIA H100 NVL, NVIDIA L40S GPU, NVIDIA L4 GPU, NVIDIA A10 GPU, and NVIDIA A16 GPU, enabling unparalleled flexibility and performance.

The NVIDIA H200 NVL is particularly suited for organizations with data centers seeking low-power, air-cooled enterprise rack designs. It delivers versatile acceleration for AI and HPC workloads of all sizes, making it an ideal choice for enterprises prioritizing efficiency and scalability.

With the QuantaGrid D75E-4U, customers can maximize computing power in compact spaces. The system supports flexible GPU configurations (1, 2, 4, or 8 GPUs), allowing companies to optimize their existing rack infrastructure and tailor performance to their specific requirements.

QuantaGrid D75E-4U
NVIDIA MGX™ Architecture-based System
  • Supports NVIDIA next-gen PCIe GPUs, up to 8x DW AC 600W 
  • All PCIe 5.0 expansion slots are designed to support up to 150W
  • Remote heatsink solution for improved thermal performance
  • Enhanced serviceability with tool-less, hot-pluggable designs
  • Offers flexible configurations to support a wide range of AI and HPC workloads
Processor: (2) Intel® Xeon® 6 processors, up to 350W TDP
Memory: (32) DDR5 RDIMM up to 6,400 MHz
Networking: (1) Dedicated 1GbE management port
Storage:
  SKU1 - (4) DW GPUs: (12) hot-swappable E1.S SSDs
  SKU2 - (8) DW GPUs: (24) hot-swappable E1.S SSDs
Expansion Slot:
  SKU1 - (4) DW GPUs
    • (4) DW FHFL PCIe 5.0 x16 slots for GPU
    • (3) SW FHFL PCIe 5.0 x16 slots for networking
  SKU2 - (8) DW GPUs
    • (8) DW FHFL PCIe 5.0 x16 slots for GPU
    • (4) SW FHFL PCIe 5.0 x16 slots for networking
    • (1) SW FHHL PCIe 5.0 x16 slot for networking
    • (1) SW HHHL PCIe 5.0 x16 slot
GPU Expansion:
  SKU1 - (4) DW GPUs: (4) NVIDIA RTX PRO™ 6000, NVIDIA H200, NVIDIA L40S
  SKU2 - (8) DW GPUs: (8) NVIDIA RTX PRO™ 6000, NVIDIA H200, NVIDIA L40S
Cooling: Air cooling (design reserved for liquid cooling)
Power: (3+1) 2700W/3200W CRPS Titanium PSUs
Form Factor: 4U Rackmount Server
Dimensions: (W) 438 x (H) 176 x (D) 800mm
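
As a rough sanity check on the power figures above, the sketch below compares PSU capacity with the peak draw of the CPUs and GPUs in the 8-GPU SKU. It assumes that "(3+1)" means three load-bearing PSUs plus one redundant unit, and it ignores PSU efficiency, derating, and input-voltage limits, so treat the results as indicative only. The names are illustrative.

```python
# Figures taken from the QuantaGrid D75E-4U specifications.
ACTIVE_PSUS = 3                   # assumption: "(3+1)" = 3 load-bearing + 1 redundant
PSU_WATTS_OPTIONS = (2700, 3200)
GPU_COUNT = 8                     # SKU2
GPU_WATTS = 600                   # per-GPU maximum listed in the feature bullets
CPU_COUNT = 2
CPU_TDP_WATTS = 350


def headroom_watts(psu_watts):
    """Watts left for memory, drives, NICs, and fans after CPUs and GPUs."""
    capacity = ACTIVE_PSUS * psu_watts
    compute = GPU_COUNT * GPU_WATTS + CPU_COUNT * CPU_TDP_WATTS
    return capacity - compute


for watts in PSU_WATTS_OPTIONS:
    print(f"{watts}W PSUs: {headroom_watts(watts)} W headroom")
# 2700W PSUs: 2600 W headroom
# 3200W PSUs: 4100 W headroom
```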
NVIDIA GH200 Grace Hopper™ Superchip

QCT systems based on the NVIDIA MGX™ architecture, such as the QuantaGrid S74G-2U, the QuantaEdge EGX77GE-2U, and the new NVIDIA Grace™ Blackwell servers, allow different configurations of GPUs, CPUs, and DPUs, shortening the time needed to build future-compatible solutions. Based on the modular reference design, these configurations not only support future accelerators but also meet the requirements of diverse workloads, including those that incorporate liquid cooling, shortening development and reducing time to market.

QuantaGrid S74G-2U
NVIDIA MGX™ Architecture-based System
NVIDIA Grace Hopper™ Superchip
  • Accelerated by the NVIDIA Grace Hopper™ Superchip
  • First-gen NVIDIA MGX™ architecture with a modular design
  • Optimized for memory-intensive inference and HPC workloads
Processor: NVIDIA GH200 Grace Hopper™ Superchip, 1000W TDP
Memory:
  CPU: Up to 480GB LPDDR5X embedded
  GPU: 144GB HBM3e memory
  Coherent memory between CPU and GPU with NVIDIA NVLink™-C2C interconnect with a speed of 900GB/s
Storage: (4) hot-swappable E1.S NVMe SSDs
Networking: (1) Dedicated 1GbE management port
Expansion Slot: (3) FHFL DW PCIe 5.0 x16 slots
Dimensions: (W) 438 x (H) 87.5 x (D) 900mm
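
The memory figures above can be combined into two back-of-the-envelope numbers: the size of the coherent CPU+GPU memory pool and the best-case time to move data across NVLink-C2C. The sketch below treats the quoted 900GB/s as a sustained rate, which real transfers are unlikely to reach, so the times are lower bounds. The names are illustrative.

```python
# Figures from the QuantaGrid S74G-2U specifications.
CPU_MEM_GB = 480
GPU_MEM_GB = 144
C2C_GB_PER_S = 900  # quoted NVLink-C2C speed

coherent_pool_gb = CPU_MEM_GB + GPU_MEM_GB


def ideal_transfer_seconds(size_gb, bandwidth_gb_per_s=C2C_GB_PER_S):
    """Best-case transfer time, assuming the quoted rate is fully sustained."""
    return size_gb / bandwidth_gb_per_s


print(f"Coherent pool: {coherent_pool_gb} GB")                           # 624 GB
print(f"Fill GPU memory: {ideal_transfer_seconds(GPU_MEM_GB):.2f} s")    # 0.16 s
print(f"Stream CPU memory: {ideal_transfer_seconds(CPU_MEM_GB):.2f} s")  # 0.53 s
```
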
QuantaEdge EGX77GE-2U
NVIDIA MGX™ Architecture-based System
NVIDIA Grace Hopper™ Superchip
Processor: NVIDIA GH200 Grace Hopper™ Superchip, 1000W TDP
Storage:
  Internal Storage: (2) SATA/NVMe M.2 22110/2280 SSDs
  External Storage: (2) E1.S SSDs
Expansion Slot: (3) FHFL PCIe 5.0 x16 slots
Dimensions: (W) 447.8 x (H) 86.8 x (D) 400mm
