TPU
Definition
Tensor Processing Unit — Google's custom-designed AI accelerator chip, optimized specifically for neural network training and inference workloads.
TPUs are application-specific integrated circuits (ASICs) developed by Google to accelerate machine learning workloads. Unlike more general-purpose GPUs, TPUs are designed specifically for tensor operations — the matrix and vector computations underlying neural networks. Google first deployed TPUs internally in 2015 and has iterated through multiple generations (TPU v1 through v5p).

TPUs are available through Google Cloud and are particularly cost-effective for training large transformer models. They power many of Google's internal AI systems and were used to train the Gemini family of models. TPUs connect into large pods for distributed training and offer competitive performance with NVIDIA GPUs on many workloads, though the maturity of NVIDIA's CUDA ecosystem gives GPUs a broader software-compatibility advantage.
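To make "tensor operations" concrete: the workhorse computation a TPU's matrix unit accelerates in hardware is a dense matrix multiply, the core of every neural-network layer. The sketch below is purely illustrative (plain Python, not Google code, and vastly slower than real hardware); it shows the operation itself, not how a TPU implements it.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows.

    This is the dense tensor operation that accelerators like TPUs
    perform in dedicated hardware, here written out for clarity.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    assert all(len(row) == inner for row in a), "shape mismatch"
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]

# A tiny example shaped like a neural-network layer:
# a 2x3 batch of activations times a 3x2 weight matrix.
x = [[1, 2, 3],
     [4, 5, 6]]
w = [[1, 0],
     [0, 1],
     [1, 1]]
print(matmul(x, w))  # → [[4, 5], [10, 11]]
```

In a real workload this multiply is executed billions of times over much larger matrices, which is why hardware specialized for exactly this pattern can outperform general-purpose processors.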
Related Terms
Cloud AI
AI services and infrastructure provided through cloud computing platforms, allowing organizations to...
GPU
Graphics Processing Unit — a specialized processor originally designed for rendering graphics but no...
Inference
The process of using a trained AI model to generate predictions or outputs on new data, as opposed t...