Architecture

Vision-Language Model

Definition

An AI model that takes both images and text as input and reasons over them jointly. Vision-language models are used for image captioning, visual question answering, document analysis, and automated UI testing. Examples include OpenAI's GPT-4V, Anthropic's Claude 3.5 Sonnet, and Google's PaLI.

