
Edge AI

Definition

Running AI models directly on local devices (phones, IoT sensors, vehicles) rather than in the cloud, enabling real-time processing without internet connectivity.

Edge AI brings inference to the device where data is generated, eliminating network latency and cloud dependency. This is critical for applications requiring real-time responses (autonomous vehicles), operating in connectivity-limited environments (remote sensors), or handling sensitive data that shouldn't leave the device (medical devices, security cameras).

Running large models on edge hardware requires aggressive optimization: quantization (reducing precision from 32-bit to 8-bit or even 4-bit), pruning, and specialized model architectures designed for efficiency. Apple's Neural Engine, Qualcomm's AI Engine, and Google's Edge TPU are hardware solutions designed for on-device AI. As models become more efficient and edge hardware more powerful, an increasing share of AI inference is moving from cloud to edge.
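To make the precision-reduction idea concrete, here is a minimal sketch of symmetric per-tensor int8 quantization, the simplest form of the 32-bit-to-8-bit compression described above. The function names and the NumPy-based approach are illustrative, not any specific framework's API; production toolchains (e.g. quantization-aware training) are considerably more involved.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights to int8 using a single symmetric scale factor.

    The largest-magnitude weight maps to +/-127; everything else is
    rounded to the nearest representable step of size `scale`.
    """
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights for computation."""
    return q.astype(np.float32) * scale

# Example: a toy 256x256 weight matrix (hypothetical layer)
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)

# int8 storage is 4x smaller than float32, and the worst-case
# rounding error per weight is half a quantization step.
compression = w.nbytes / q.nbytes          # 4.0
max_error = np.abs(w - dequantize(q, scale)).max()
```

The trade-off is visible directly: a 4x memory reduction in exchange for a bounded rounding error of at most `scale / 2` per weight, which is why quantized models run faster on edge hardware with only a small accuracy loss.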
