GPU

NVIDIA A40 PCIe

Integrated Memory (VRAM)
Capacity: 48 GB (GDDR6, 384-bit)
Bandwidth: 695 GB/s

99 Token/s
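
The page does not say how the 99 Token/s figure is derived. One reading that happens to match it is a bandwidth-bound decode estimate: every generated token streams all model weights from VRAM once, so tokens per second is roughly bandwidth divided by weight size. The sketch below assumes a hypothetical 7 GB of weights (for example, a 7B-parameter model at 8 bits per weight); that size is an assumption, not something stated here. It also back-computes the per-pin data rate implied by 695 GB/s on a 384-bit bus.

```python
# Listed on this page.
BANDWIDTH_GB_S = 695.0
BUS_WIDTH_BITS = 384

# Assumption (not from the page): size of the model weights read per token.
ASSUMED_WEIGHTS_GB = 7.0


def decode_tokens_per_second(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode speed when each token reads all weights once."""
    return bandwidth_gb_s / weights_gb


def implied_pin_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate (Gbit/s) implied by total bandwidth and bus width."""
    return bandwidth_gb_s * 8 / bus_width_bits


if __name__ == "__main__":
    tokens = decode_tokens_per_second(BANDWIDTH_GB_S, ASSUMED_WEIGHTS_GB)
    pin_rate = implied_pin_rate_gbps(BANDWIDTH_GB_S, BUS_WIDTH_BITS)
    print(f"Bandwidth-bound estimate: {tokens:.1f} tokens/s")
    print(f"Implied per-pin rate: {pin_rate:.2f} Gbit/s")
```

Real throughput depends on batch size, KV-cache traffic, and kernel efficiency, so this is best treated as a ceiling for single-stream decoding.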

Vector Compute
FP64: 584.60 G
FP32: 37.42 T
FP16: 37.42 T
BF16: 37.42 T
INT32: 18.71 T
INT8: X

NVIDIA A40 PCIe General-Purpose Floating-Point performance (Vector Performance / Scalar Performance)

FP64: 584.60 GFLOPS

FP32: 37.42 TFLOPS

FP16: 37.42 TFLOPS

BF16: 37.42 TFLOPS

INT32: 18.71 TOPS
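
The FP32 and INT32 figures can be reproduced from the core count and boost clock listed under Hardware Specs. The sketch below assumes each CUDA core retires one fused multiply-add (counted as 2 operations) per clock, and that only half of the lanes can issue INT32; both are assumptions used to match the listed numbers, not statements from this page.

```python
# Listed on this page.
CUDA_CORES = 10752
BOOST_CLOCK_MHZ = 1740

# Assumption: one fused multiply-add per core per clock counts as 2 operations.
OPS_PER_CORE_PER_CLOCK = 2


def peak_tera_ops(cores: int, clock_mhz: float,
                  ops_per_core_per_clock: int = OPS_PER_CORE_PER_CLOCK) -> float:
    """Theoretical peak in tera-operations per second."""
    return cores * clock_mhz * 1e6 * ops_per_core_per_clock / 1e12


fp32_tflops = peak_tera_ops(CUDA_CORES, BOOST_CLOCK_MHZ)
# Assumption: only half of the lanes can issue INT32 instructions.
int32_tops = peak_tera_ops(CUDA_CORES // 2, BOOST_CLOCK_MHZ)

if __name__ == "__main__":
    print(f"FP32 peak:  {fp32_tflops:.2f} TFLOPS")   # 37.42
    print(f"INT32 peak: {int32_tops:.2f} TOPS")      # 18.71
```

These are theoretical peaks at the boost clock; sustained throughput is lower whenever the card runs below 1740 MHz.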

Matrix Compute (dense / with sparsity)
FP64: X
FP32: X
FP16: 149.67 T / 299.34 T
FP8: X
TF32: 74.83 T / 149.67 T
BF16: 149.67 T / 299.34 T
INT16: X
INT8: 299.34 T / 598.67 T
INT4: 598.67 T / 1197.34 T

NVIDIA A40 PCIe AI performance (Tensor Performance / Matrix Performance)

FP16: 149.67 TFLOPS, with sparsity: 299.34 TFLOPS

TF32: 74.83 TFLOPS, with sparsity: 149.67 TFLOPS

BF16: 149.67 TFLOPS, with sparsity: 299.34 TFLOPS

INT8: 299.34 TOPS, with sparsity: 598.67 TOPS

INT4: 598.67 TOPS, with sparsity: 1197.34 TOPS
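
The tensor figures above follow a regular pattern: TF32 runs at half the FP16 rate, INT8 at twice, INT4 at four times, and sparsity doubles each value. The sketch below reproduces them from the 336 Tensor cores and the 1740 MHz boost clock listed under Hardware Specs. The per-core rate of 256 dense FP16 operations per clock is an assumption chosen because it matches the listed numbers; it is not stated on this page.

```python
# Listed on this page.
TENSOR_CORES = 336
BOOST_CLOCK_MHZ = 1740

# Assumption: dense FP16 operations per Tensor core per clock.
ASSUMED_FP16_OPS_PER_CORE_PER_CLOCK = 256

# Throughput of each precision relative to dense FP16, as implied by the list above.
RELATIVE_RATE = {"TF32": 0.5, "FP16": 1.0, "BF16": 1.0, "INT8": 2.0, "INT4": 4.0}


def tensor_peak_tera_ops(precision: str, sparse: bool = False) -> float:
    """Theoretical Tensor-core peak in tera-operations per second."""
    dense = (TENSOR_CORES * BOOST_CLOCK_MHZ * 1e6
             * ASSUMED_FP16_OPS_PER_CORE_PER_CLOCK
             * RELATIVE_RATE[precision] / 1e12)
    # Structured sparsity doubles the listed rate.
    return dense * 2 if sparse else dense


if __name__ == "__main__":
    for name in RELATIVE_RATE:
        dense = tensor_peak_tera_ops(name)
        sparse = tensor_peak_tera_ops(name, sparse=True)
        print(f"{name}: {dense:.2f} T dense, {sparse:.2f} T with sparsity")
```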

Hardware Specs
NVIDIA A40 PCIe is an 8 nm chip with 28,300 million transistors, launched by NVIDIA in 2020. It has 48 GB of on-board memory with bandwidth of up to 695 GB/s, 10752 general-purpose ALUs (CUDA cores/shader cores), and 336 matrix cores (Tensor cores).
Process Node: 8 nm
Launch Year: 2020
Vector (CUDA) Cores: 10752
Matrix (Tensor) Cores: 336
Core Frequency: 1305 ~ 1740 MHz
Cache: 6 MB
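
Taken together, the compute peaks and the 695 GB/s memory bandwidth give a rough idea of when a kernel is limited by memory rather than arithmetic. The sketch below applies a simple roofline model to the numbers on this page; the function names are illustrative, and the model ignores caches, clocks below boost, and launch overhead.

```python
# Listed on this page.
FP32_PEAK_TFLOPS = 37.42
FP16_TENSOR_PEAK_TFLOPS = 149.67
BANDWIDTH_GB_S = 695.0


def ridge_point(peak_tflops: float, bandwidth_gb_s: float) -> float:
    """Arithmetic intensity (FLOP per byte) at which compute becomes the limit."""
    return peak_tflops * 1e12 / (bandwidth_gb_s * 1e9)


def attainable_tflops(intensity_flop_per_byte: float, peak_tflops: float,
                      bandwidth_gb_s: float) -> float:
    """Roofline bound: the lower of the compute peak and the memory-fed rate."""
    memory_bound = intensity_flop_per_byte * bandwidth_gb_s * 1e9 / 1e12
    return min(peak_tflops, memory_bound)


if __name__ == "__main__":
    print(f"FP32 ridge point: {ridge_point(FP32_PEAK_TFLOPS, BANDWIDTH_GB_S):.1f} FLOP/byte")
    print(f"FP16 tensor ridge point: "
          f"{ridge_point(FP16_TENSOR_PEAK_TFLOPS, BANDWIDTH_GB_S):.1f} FLOP/byte")
    # A kernel doing 1 FLOP per byte moved is far below either ridge point.
    print(f"At 1 FLOP/byte: {attainable_tflops(1.0, FP32_PEAK_TFLOPS, BANDWIDTH_GB_S):.3f} TFLOPS")
```

Under this model, a kernel needs roughly 54 FP32 operations per byte read from VRAM before the vector units, rather than memory, become the bottleneck.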
