CUDA
Definition
NVIDIA's parallel computing platform and programming model that enables developers to use GPUs for general-purpose processing, including AI model training and inference.
CUDA (Compute Unified Device Architecture) is the software ecosystem that made GPUs practical for AI research. Released by NVIDIA in 2007, CUDA provides the compiler, libraries, tools, and APIs that let developers write kernels: functions that execute in parallel across thousands of GPU threads. Key AI libraries built on CUDA include cuDNN (deep learning primitives), cuBLAS (linear algebra), and NCCL (multi-GPU communication). PyTorch and TensorFlow use CUDA under the hood for GPU acceleration. CUDA's head start of more than a decade has created a massive moat for NVIDIA: most AI software is written against CUDA, making it difficult for competitors like AMD (with ROCm) to gain traction despite competitive hardware. This software lock-in is a major factor in NVIDIA's dominant market position in AI compute.
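The programming model described above can be sketched with a minimal CUDA C++ example: a kernel runs the same function across many GPU threads, while the host code manages device memory and launches the kernel. This is an illustrative sketch (the `vecAdd` name and sizes are arbitrary); compiling it requires NVIDIA's `nvcc` compiler and a CUDA-capable GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: each GPU thread adds one pair of elements.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    const size_t bytes = n * sizeof(float);

    // Host-side data
    float *h_a = new float[n], *h_b = new float[n], *h_c = new float[n];
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Allocate device memory and copy inputs to the GPU
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes); cudaMalloc(&d_b, bytes); cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // Copy the result back to the host
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %.1f\n", h_c[0]);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    delete[] h_a; delete[] h_b; delete[] h_c;
    return 0;
}
```

Frameworks like PyTorch hide this boilerplate entirely, but the same pattern (allocate on device, copy in, launch kernel, copy out) underlies their GPU operations.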
Related Terms
GPU
Graphics Processing Unit — a specialized processor originally designed for rendering graphics but no...
TPU
Tensor Processing Unit — Google's custom-designed AI accelerator chip, optimized specifically for ne...
Inference
The process of using a trained AI model to generate predictions or outputs on new data, as opposed t...