The ThinkSystem NVIDIA A800 PCIe 4.0 GPU accelerates AI, data analytics, and high-performance computing (HPC) applications in elastic data centers at every scale. The NVIDIA A800 GPU can scale up to serve a single large workload or be partitioned into as many as seven isolated GPU instances with Multi-Instance GPU (MIG), providing a unified platform that lets elastic data centers adjust dynamically to shifting workload demands.
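To make the MIG partitioning concrete, the following is a minimal Python sketch of a capacity check, assuming the smallest partitioning listed in the specifications for this card (up to 7 instances of 10 GB each). The function name `plan_smallest_instances` and the job names are hypothetical, chosen only for illustration; the sketch does not call any NVIDIA tool or API and ignores larger MIG profiles.

```python
# Figures taken from the specifications for this card.
MAX_MIG_INSTANCES = 7      # "Up to 7 GPU instances"
INSTANCE_MEMORY_GB = 10    # "10GB each"


def plan_smallest_instances(job_memory_gb):
    """Sort jobs into placed / waiting / too_large for 10 GB MIG instances.

    job_memory_gb maps a (hypothetical) job name to its memory need in GB.
    Each placed job gets one isolated instance; at most 7 can run at once.
    """
    placed, waiting, too_large = [], [], []
    for name, need_gb in job_memory_gb.items():
        if need_gb > INSTANCE_MEMORY_GB:
            too_large.append(name)      # needs a larger slice or the whole GPU
        elif len(placed) < MAX_MIG_INSTANCES:
            placed.append(name)         # fits in one free 10 GB instance
        else:
            waiting.append(name)        # fits, but all 7 instances are taken
    return placed, waiting, too_large


if __name__ == "__main__":
    jobs = {"inference-a": 6, "inference-b": 9.5, "training-c": 24}
    placed, waiting, too_large = plan_smallest_instances(jobs)
    print("placed:", placed)
    print("waiting:", waiting)
    print("too large for a 10 GB instance:", too_large)
```

A job that lands in `too_large` is the case where the GPU would be used unpartitioned, or with fewer, larger instances, rather than split seven ways.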

NVIDIA A800 Tensor Core technology supports a broad range of math precisions, from FP64 to INT4, providing a single accelerator for every workload. The latest-generation A800 80GB doubles GPU memory and delivers close to 2 terabytes per second (TB/s) of memory bandwidth (1,935 GB/s), speeding time to solution for the largest models and most massive datasets.
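As a rough illustration of what this bandwidth means in practice, the sketch below estimates how long one pass over data held in GPU memory would take, assuming the quoted 1,935 GB/s is sustained as an ideal peak rate (real kernels typically achieve less). The function name `sweep_time_ms` is hypothetical.

```python
# Figures taken from the specifications for this card.
GPU_MEMORY_GB = 80
MEMORY_BANDWIDTH_GB_PER_S = 1935


def sweep_time_ms(data_gb, bandwidth_gb_per_s=MEMORY_BANDWIDTH_GB_PER_S):
    """Ideal time in milliseconds to read data_gb once from GPU memory."""
    if data_gb < 0 or bandwidth_gb_per_s <= 0:
        raise ValueError("data size must be >= 0 and bandwidth must be > 0")
    return data_gb / bandwidth_gb_per_s * 1000.0


if __name__ == "__main__":
    # One full pass over the entire 80 GB of GPU memory.
    print(f"{sweep_time_ms(GPU_MEMORY_GB):.1f} ms")
```

Under these assumptions, a full pass over all 80 GB takes roughly 41 ms, which is a lower bound on the runtime of any step that must touch every byte of a model or dataset resident in GPU memory.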

The A800 is part of the complete NVIDIA data center solution, which incorporates building blocks across hardware, networking, software, libraries, and optimized AI models and applications from the NVIDIA NGC™ catalog. This end-to-end AI and HPC platform for data centers allows researchers to deliver real-world results and deploy solutions into production at scale.

GPU Architecture                NVIDIA Ampere
NVIDIA Tensor Cores             512 third-generation Tensor Cores per GPU
NVIDIA CUDA Cores               8192 FP32 CUDA Cores per GPU
Double-Precision Performance    FP64: 9.7 TFLOPS; FP64 Tensor Core: 19.5 TFLOPS
Single-Precision Performance    FP32: 19.5 TFLOPS; Tensor Float 32 (TF32): 156 TFLOPS, 312 TFLOPS*
Half-Precision Performance      312 TFLOPS, 624 TFLOPS*
Bfloat16                        312 TFLOPS, 624 TFLOPS*
Integer Performance             INT8: 624 TOPS, 1,248 TOPS*; INT4: 1,248 TOPS, 2,496 TOPS*
GPU Memory                      80 GB HBM2
Memory Bandwidth                1,935 GB/s
ECC                             Yes
Interconnect Bandwidth          NVLink: 400 GB/s; PCIe: 64 GB/s
System Interface                PCIe Gen 4, x16 lanes
Form Factor                     PCIe full height/length, double width
Multi-Instance GPU (MIG)        Up to 7 GPU instances, 10 GB each
Max Power Consumption           300 W
Thermal Solution                Passive
Compute APIs                    CUDA, DirectCompute, OpenCL, OpenACC


* With structured sparsity enabled
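The peak compute and memory bandwidth figures above can be combined into a simple roofline estimate. The sketch below is a minimal illustration, assuming the quoted dense (non-sparse) peaks and the 1,935 GB/s memory bandwidth are ideal upper bounds; the function names `ridge_point` and `attainable_tflops` are hypothetical.

```python
# Dense (non-sparse) peak figures taken from the specifications above, in TFLOPS.
PEAK_TFLOPS = {
    "FP64": 9.7,
    "FP64 Tensor Core": 19.5,
    "FP32": 19.5,
    "TF32": 156.0,
    "Half precision": 312.0,
}
MEMORY_BANDWIDTH_TB_PER_S = 1.935  # 1,935 GB/s


def ridge_point(peak_tflops, bandwidth_tb_per_s=MEMORY_BANDWIDTH_TB_PER_S):
    """Arithmetic intensity (FLOP per byte) above which a kernel is compute-bound."""
    return peak_tflops / bandwidth_tb_per_s


def attainable_tflops(flop_per_byte, peak_tflops,
                      bandwidth_tb_per_s=MEMORY_BANDWIDTH_TB_PER_S):
    """Roofline bound: the lower of the compute peak and the memory-fed rate."""
    return min(peak_tflops, flop_per_byte * bandwidth_tb_per_s)


if __name__ == "__main__":
    for name, peak in PEAK_TFLOPS.items():
        print(f"{name:18s} ridge point: {ridge_point(peak):7.1f} FLOP/byte")
    # A kernel performing 4 FLOP per byte moved, run in FP32:
    print(f"{attainable_tflops(4, PEAK_TFLOPS['FP32']):.2f} TFLOPS attainable")
```

Under these assumptions, an FP32 kernel needs about 10 FLOP per byte of memory traffic before the 19.5 TFLOPS peak becomes the limit; below that, memory bandwidth sets the ceiling. The Tensor Core precisions have much higher ridge points, so they benefit most from kernels that reuse data heavily.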



