Core Concepts

Multimodal AI

Definition

AI models that can process, and in some cases generate, multiple types of data (text, images, audio, video) within a single system. GPT-4o and Gemini 1.5 are multimodal models. Multimodal AI enables applications such as visual question answering, image-to-code generation, and video understanding.

