Last month NVIDIA announced its Ampere A100 GPUs, which will power the DGX A100 universal system for AI infrastructure. These 7nm GPUs are designed for workloads such as cloud computing, AI, and scientific research.
Today AMD announced that its second-generation EPYC processors will be the CPUs NVIDIA uses to power the DGX A100 universal system for AI infrastructure.
DGX A100 system specifications
NVIDIA DGX™ A100 is the universal system for all AI workloads—from analytics to training to inference. DGX A100 sets a new bar for compute density, packing 5 petaFLOPS of AI performance into a 6U form factor and replacing legacy compute infrastructure with a single, unified system. DGX A100 also offers the unprecedented ability to deliver fine-grained allocation of computing power, using the Multi-Instance GPU capability in the NVIDIA A100 Tensor Core GPU, which enables administrators to assign resources that are right-sized for specific workloads. This ensures that the largest and most complex jobs are supported, along with the simplest and smallest. Running the DGX software stack with optimized software from NGC, the combination of dense compute power and complete workload flexibility makes DGX A100 an ideal choice for both single-node deployments and large-scale Slurm and Kubernetes clusters deployed with NVIDIA DeepOps.
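To make the "right-sized" allocation concrete, here is a minimal sketch of the partitioning math behind Multi-Instance GPU on a single 40 GB A100. The profile names and sizes follow NVIDIA's published MIG profiles, but treat the specific figures as assumptions for illustration rather than output from a real system:

```python
# Illustrative MIG (Multi-Instance GPU) partitioning math for one 40 GB A100.
# Profile name -> (compute slices, memory in GB); assumed from NVIDIA docs.
MIG_PROFILES = {
    "1g.5gb": (1, 5),
    "2g.10gb": (2, 10),
    "3g.20gb": (3, 20),
    "7g.40gb": (7, 40),
}

def plan_fits(plan, total_slices=7, total_mem_gb=40):
    """Check whether a list of profile names fits on a single A100."""
    slices = sum(MIG_PROFILES[p][0] for p in plan)
    mem_gb = sum(MIG_PROFILES[p][1] for p in plan)
    return slices <= total_slices and mem_gb <= total_mem_gb

# Seven small inference instances fit on one GPU; across the system's
# eight GPUs that allows up to 8 * 7 = 56 isolated instances.
assert plan_fits(["1g.5gb"] * 7)
assert not plan_fits(["3g.20gb", "3g.20gb", "2g.10gb"])  # 8 slices > 7
```

In practice administrators create these instances with `nvidia-smi` rather than in code; the sketch only shows why a mix of large training jobs and many small inference jobs can share the same hardware.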
| Specification | DGX A100 |
|---|---|
| GPUs | 8x NVIDIA A100 Tensor Core GPUs |
| GPU Memory | 320 GB total |
| Performance | 5 petaFLOPS AI; 10 petaOPS INT8 |
| System Power Usage | 6.5 kW max |
| CPU | Dual AMD Rome 7742, 128 cores total, 2.25 GHz (base), 3.4 GHz (max boost) |
| Networking | 8x single-port Mellanox ConnectX-6 VPI, 200 Gb/s HDR InfiniBand; 1x dual-port Mellanox ConnectX-6 VPI, 10/25/50/100/200 Gb/s Ethernet |
| Storage | OS: 2x 1.92 TB M.2 NVMe drives; internal storage: 15 TB (4x 3.84 TB) U.2 NVMe drives |
| Software | Ubuntu Linux OS |
| System Weight | 271 lbs (123 kg) |
| Packaged System Weight | 315 lbs (143 kg) |
| System Dimensions | Height: 10.4 in (264.0 mm); width: 19.0 in (482.3 mm) max; length: 35.3 in (897.1 mm) max |
| Operating Temperature Range | 5 °C to 30 °C (41 °F to 86 °F) |
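The headline aggregate figures in the table can be sanity-checked with back-of-envelope arithmetic. The per-GPU peaks below are the TF32 and INT8 numbers (both with structural sparsity) from NVIDIA's A100 datasheet; treat them as assumptions for this check:

```python
# Back-of-envelope check of the table's aggregate performance numbers.
# Per-A100 peaks assumed from NVIDIA's datasheet (with structural sparsity).
TF32_SPARSE_TFLOPS = 624   # per A100, TF32 with sparsity
INT8_SPARSE_TOPS = 1248    # per A100, INT8 with sparsity
GPUS = 8

ai_pflops = GPUS * TF32_SPARSE_TFLOPS / 1000   # 4.992 ~ "5 petaFLOPS AI"
int8_pops = GPUS * INT8_SPARSE_TOPS / 1000     # 9.984 ~ "10 petaOPS INT8"
print(f"{ai_pflops:.1f} petaFLOPS AI, {int8_pops:.1f} petaOPS INT8")
```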
NVIDIA DGX A100 leverages the high-performance capabilities—128 cores, DDR4-3200 MHz memory, and PCIe® 4.0 support—of two AMD EPYC 7742 processors running at speeds up to 3.4 GHz. The 2nd Gen AMD EPYC processor is the first and only current x86-architecture server processor that supports PCIe® 4.0, providing leadership high-bandwidth I/O that is critical for high performance computing and for connections between the CPU and other devices like GPUs.
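The bandwidth advantage is simple arithmetic. The raw signaling rates and 128b/130b encoding below come from the PCIe 3.0/4.0 specifications; this sketch only computes theoretical per-direction link bandwidth, not a measurement:

```python
# Why PCIe 4.0 matters for CPU<->GPU links: per-direction bandwidth of an
# x16 link, using the raw rate (GT/s) and 128b/130b encoding from the spec.
def x16_bandwidth_gb_s(gt_per_s):
    """Theoretical per-direction bandwidth of an x16 link in GB/s."""
    return gt_per_s * 16 * (128 / 130) / 8  # 16 lanes, 8 bits per byte

gen3 = x16_bandwidth_gb_s(8)    # PCIe 3.0: 8 GT/s  -> ~15.8 GB/s
gen4 = x16_bandwidth_gb_s(16)   # PCIe 4.0: 16 GT/s -> ~31.5 GB/s
print(f"Gen3 x16: {gen3:.1f} GB/s, Gen4 x16: {gen4:.1f} GB/s")
```

This is the "effectively doubled" I/O bandwidth referenced in AMD's statement below: PCIe 4.0 doubles the signaling rate, so an x16 link roughly doubles its per-direction throughput.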
“Only 2nd Gen AMD EPYC processors can provide up to 64 cores and 128 lanes of PCIe 4 interconnectivity in a single x86 data center processor, and we’re excited to see how the power of the NVIDIA DGX A100 system enables the I/O bandwidth to be effectively doubled,” said Raghu Nambiar, corporate vice president, data center ecosystems and application engineering, AMD. “With 2nd Gen AMD EPYC processors, our partners and customers can maximize performance and cost efficiencies in heterogeneous computing, virtualized and hyper converged infrastructure workloads, providing teams with the flexibility and capability to stay at the forefront of innovation.”
“The NVIDIA DGX A100 delivers a tremendous leap in performance and capabilities,” said Charlie Boyle, vice president and general manager, DGX systems at NVIDIA. “The 2nd Gen AMD EPYC processors used in DGX A100 provide high performance and support for PCIe Gen4. NVIDIA has put those features to work to create the world’s most powerful AI system while maintaining compatibility with the GPU-optimized software stack used across the entire DGX family.”