NVIDIA Ampere GA100 GPU: 8192 CUDA Cores and 54 Billion Transistors

NVIDIA's boss has unveiled, from his kitchen, the new A100 GPU based on the Ampere architecture. The A100 is the Tensor Core GPU implementation of the full GA100 GPU: a cut-down version of the chip with 108 of its 128 SMs enabled. The A100 has no RT cores (ray tracing cores) and is aimed at datacenters. Whether the full GA100 includes RT cores, and how many, is not known yet.

Full Ampere GA100 GPU specifications:

GA100 GPU built on a 7 nm manufacturing process
54 billion transistors
8192 CUDA cores
128 SMs (64 CUDA cores per SM)
Tensor cores: 512 (4 tensor cores per SM)
Third Generation Tensor Core (TensorFloat-32 TF32 Tensor Core)
New Bfloat16 (BF16)/FP32 mixed-precision Tensor Core operations
FP32 performance: 23 TFLOPS
FP64 performance: 11.5 TFLOPS (FP64 = 1/2 * FP32)
Memory: 48GB HBM2 – memory bus width: 6144-bit (6 HBM2 stacks, 12 512-bit memory controllers)
CUDA Compute Capability: 8.0
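
The totals in the list above follow from the per-unit figures (64 CUDA cores and 4 tensor cores per SM, 512-bit memory controllers). Here is a minimal Python sketch of that arithmetic. The `GpuConfig` class and its field names are hypothetical, made up for this illustration, and the ratio of two memory controllers per HBM2 stack is simply inferred from the "6 stacks, 12 controllers" figure.

```python
from dataclasses import dataclass

# Per-unit figures taken from the specification list above.
CUDA_CORES_PER_SM = 64
TENSOR_CORES_PER_SM = 4
MEM_CONTROLLER_WIDTH_BITS = 512
# Inferred from the list: 12 memory controllers for 6 HBM2 stacks.
CONTROLLERS_PER_HBM2_STACK = 2


@dataclass(frozen=True)
class GpuConfig:
    """Hypothetical helper: a GPU described by its SM and HBM2 stack counts."""
    name: str
    sms: int
    hbm2_stacks: int
    fp32_tflops: float

    @property
    def cuda_cores(self) -> int:
        return self.sms * CUDA_CORES_PER_SM

    @property
    def tensor_cores(self) -> int:
        return self.sms * TENSOR_CORES_PER_SM

    @property
    def memory_controllers(self) -> int:
        return self.hbm2_stacks * CONTROLLERS_PER_HBM2_STACK

    @property
    def bus_width_bits(self) -> int:
        return self.memory_controllers * MEM_CONTROLLER_WIDTH_BITS

    @property
    def fp64_tflops(self) -> float:
        # The list states FP64 = 1/2 * FP32.
        return self.fp32_tflops / 2


full_ga100 = GpuConfig("Full GA100", sms=128, hbm2_stacks=6, fp32_tflops=23.0)

print(full_ga100.name)
print("CUDA cores:    ", full_ga100.cuda_cores)
print("Tensor cores:  ", full_ga100.tensor_cores)
print("Bus width:     ", full_ga100.bus_width_bits, "bit")
print("FP64 (TFLOPS): ", full_ga100.fp64_tflops)
```

With 128 SMs and 6 stacks, the sketch gives back the 8192 CUDA cores, 512 tensor cores, 6144-bit bus and 11.5 TFLOPS FP64 listed above.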

[Figure: NVIDIA Ampere GA100 full GPU architecture]

[Figure: NVIDIA Ampere GA100 streaming multiprocessor (SM)]

 
A100 Tensor Core GPU specifications:

GA100 GPU built on a 7 nm manufacturing process
54 billion transistors
6912 CUDA cores
108 SMs (64 CUDA cores per SM)
Tensor cores: 432 (4 tensor cores per SM)
FP32 performance: 19.5 TFLOPS
FP64 performance: 9.7 TFLOPS (FP64 = 1/2 * FP32)
Memory: 40GB HBM2 – memory bus width: 5120-bit (5 HBM2 stacks, 10 512-bit memory controllers)
CUDA Compute Capability: 8.0
TDP: 400W
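
To put the two specification lists side by side, the rough Python sketch below works out how much of the full GA100 is enabled on the A100 and which clock speed the FP32 figures imply. The dictionary layout and function names are hypothetical. The clock estimate assumes the usual peak-throughput convention of 2 FLOPs (one fused multiply-add) per CUDA core per clock, which the lists above do not state, so read the result as a back-of-the-envelope estimate and not an official clock.

```python
# Figures copied from the two specification lists.
FULL_GA100 = {"sms": 128, "hbm2_stacks": 6, "memory_gb": 48, "fp32_tflops": 23.0}
A100 = {"sms": 108, "hbm2_stacks": 5, "memory_gb": 40, "fp32_tflops": 19.5}

CUDA_CORES_PER_SM = 64
# Assumption (not in the lists): peak FP32 counts one FMA = 2 FLOPs per core per clock.
FLOPS_PER_CORE_PER_CLOCK = 2


def enabled_fraction(part: dict, full: dict, key: str) -> float:
    """Share of the full chip's resource `key` that is enabled on `part`."""
    return part[key] / full[key]


def gb_per_stack(cfg: dict) -> float:
    """Memory capacity per HBM2 stack, in GB."""
    return cfg["memory_gb"] / cfg["hbm2_stacks"]


def implied_clock_ghz(cfg: dict) -> float:
    """Clock speed implied by the FP32 figure, under the FMA assumption above."""
    cores = cfg["sms"] * CUDA_CORES_PER_SM
    flops_per_clock = cores * FLOPS_PER_CORE_PER_CLOCK
    return cfg["fp32_tflops"] * 1e12 / flops_per_clock / 1e9


print(f"SMs enabled on A100:    {enabled_fraction(A100, FULL_GA100, 'sms'):.1%}")
print(f"HBM2 stacks enabled:    {enabled_fraction(A100, FULL_GA100, 'hbm2_stacks'):.1%}")
print(f"GB per stack (A100):    {gb_per_stack(A100):.0f}")
print(f"Implied clock, A100:    {implied_clock_ghz(A100):.2f} GHz")
print(f"Implied clock, GA100:   {implied_clock_ghz(FULL_GA100):.2f} GHz")
```

Both configurations come out at 8 GB per HBM2 stack and an implied clock of about 1.4 GHz, so the FP32 gap between the two lists tracks the SM count and not a clock difference.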

 