GPU

NVIDIA GeForce RTX 4090


Integrated Memory (VRAM)

Capacity: 24 GB (GDDR6X 384-bit)

Bandwidth: 1008 GB/s

144 Token/s
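
The bandwidth figure can be reproduced from the bus width with a little arithmetic. The sketch below is illustrative only: the 21 Gbit/s per-pin data rate is an assumption that does not appear on this page, and the page does not say how the 144 Token/s value was obtained. Dividing the bandwidth by a hypothetical 7 GB model footprint (a rough stand-in for memory-bound decoding, where every weight is read once per token) happens to give the same number, but that is one possible reading, not a documented method.

```python
# Figure from the page: memory bus width in bits.
BUS_WIDTH_BITS = 384

# Assumption (not on the page): effective GDDR6X data rate per pin, in Gbit/s.
DATA_RATE_GBPS = 21

# Assumption (not on the page): size of the model weights read per token, in GB.
HYPOTHETICAL_MODEL_SIZE_GB = 7


def memory_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: pins * Gbit/s per pin, divided by 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


def decode_tokens_per_s(bandwidth_gb_s, model_size_gb):
    """Upper bound on tokens/s if each token requires reading all weights once."""
    return bandwidth_gb_s / model_size_gb


bandwidth = memory_bandwidth_gb_s(BUS_WIDTH_BITS, DATA_RATE_GBPS)
tokens = decode_tokens_per_s(bandwidth, HYPOTHETICAL_MODEL_SIZE_GB)

print(f"Bandwidth: {bandwidth:.0f} GB/s")
print(f"Memory-bound decode estimate: {tokens:.0f} Token/s")
```

Real decoding throughput depends on the model, quantization, batch size, and software stack, so this bound should be read as a ceiling under the stated assumptions rather than a benchmark result.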

Vector Compute
FP64     1.29 T
FP32     82.58 T
FP16     82.58 T
BF16     82.58 T
INT32    41.29 T
INT8     X

NVIDIA GeForce RTX 4090 General-Purpose Floating-Point performance (Vector Performance / Scalar Performance)

FP64: 1.29 TFLOPS

FP32: 82.58 TFLOPS

FP16: 82.58 TFLOPS

BF16: 82.58 TFLOPS

INT32: 41.29 TOPS
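
These vector figures are consistent with the usual theoretical-peak formula: cores × clock × operations per core per cycle. The sketch below uses the core count and the upper core frequency listed in the hardware specs, and assumes one fused multiply-add (counted as 2 FLOPs) per core per cycle. The FP64 and INT32 ratios are read off the listed numbers, not stated by the page.

```python
# Figures taken from the page's hardware specs.
CUDA_CORES = 16384
BOOST_CLOCK_MHZ = 2520

# Assumption: each core retires one fused multiply-add (2 FLOPs) per cycle.
FLOPS_PER_CORE_PER_CYCLE = 2


def peak_tera_ops(cores, clock_mhz, ops_per_core_per_cycle):
    """Theoretical peak throughput in tera-operations per second."""
    return cores * clock_mhz * 1e6 * ops_per_core_per_cycle / 1e12


fp32 = peak_tera_ops(CUDA_CORES, BOOST_CLOCK_MHZ, FLOPS_PER_CORE_PER_CYCLE)

# Ratios relative to FP32, inferred from the listed figures (an assumption).
vector_peaks = {
    "FP64": fp32 / 64,
    "FP32": fp32,
    "FP16": fp32,
    "BF16": fp32,
    "INT32": fp32 / 2,
}

for name, value in vector_peaks.items():
    print(f"{name}: {value:.2f} T")
```

These are upper bounds at the maximum listed clock; sustained throughput in real workloads is typically lower.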

Matrix Compute
Precision    Dense        With sparsity
FP64         X
FP32         X
FP16         165.15 T     330.30 T
FP8          330.30 T     660.60 T
TF32         82.58 T      165.15 T
BF16         165.15 T     330.30 T
INT16        X
INT8         660.60 T     1321.21 T
INT4         1321.21 T    2642.41 T

NVIDIA GeForce RTX 4090 AI performance (Tensor Performance / Matrix Performance)

FP16: 165.15 TFLOPS, with sparsity: 330.30 TFLOPS

FP8: 330.30 TFLOPS, with sparsity: 660.60 TFLOPS

TF32: 82.58 TFLOPS, with sparsity: 165.15 TFLOPS

BF16: 165.15 TFLOPS, with sparsity: 330.30 TFLOPS

INT8: 660.60 TOPS, with sparsity: 1321.21 TOPS

INT4: 1321.21 TOPS, with sparsity: 2642.41 TOPS
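
The matrix figures follow a simple pattern relative to the FP32 vector peak: each listed precision is a power-of-two multiple of it, and the sparsity figure is twice the dense one. The sketch below reproduces the list that way. The multipliers are inferred from the numbers on this page and are not a statement about how the hardware is organized internally.

```python
# Figures taken from the page's hardware specs.
CUDA_CORES = 16384
BOOST_CLOCK_MHZ = 2520

# FP32 vector peak in tera-ops, assuming 2 FLOPs (one FMA) per core per cycle.
fp32_vector = CUDA_CORES * BOOST_CLOCK_MHZ * 1e6 * 2 / 1e12

# Dense matrix throughput as a multiple of the FP32 vector peak.
# Assumption: these ratios are inferred from the listed figures.
DENSE_MULTIPLIER = {
    "FP16": 2,
    "FP8": 4,
    "TF32": 1,
    "BF16": 2,
    "INT8": 8,
    "INT4": 16,
}

# The page lists "with sparsity" figures at twice the dense rate.
SPARSITY_FACTOR = 2


def matrix_peaks(base):
    """Return {precision: (dense, with_sparsity)} in tera-ops."""
    return {
        name: (base * mult, base * mult * SPARSITY_FACTOR)
        for name, mult in DENSE_MULTIPLIER.items()
    }


peaks = matrix_peaks(fp32_vector)
for name, (dense, sparse) in peaks.items():
    print(f"{name}: {dense:.2f} T, with sparsity: {sparse:.2f} T")
```

The sparsity figures only apply when the workload actually uses a supported sparse weight layout; dense workloads should be compared against the dense column.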

Hardware Specs
The NVIDIA GeForce RTX 4090 is a 5 nm chip with 76.3 billion transistors, launched by NVIDIA in 2022. It has 24 GB of on-board memory with bandwidth of up to 1008 GB/s, 16384 general-purpose ALUs (CUDA cores/shader cores), and 512 matrix cores (Tensor cores).
Process Node: 5 nm

Launch Year: 2022

Vector (CUDA) Cores: 16384

Matrix (Tensor) Cores: 512

Core Frequency: 2235 ~ 2520 MHz

Cache: 72 MB
