Core Concepts
Multimodal AI
Definition
AI models that can process, and in some cases generate, multiple types of data (text, images, audio, video) within a single system. GPT-4o and Gemini 1.5 are examples of multimodal models. Multimodal AI enables applications such as visual question answering, image-to-code generation, and video understanding.