Glossary of AI Terminology

What Does Multimodal Mean in the Context of Machine Learning?

Multimodal Model

Multimodal models process and relate information from more than one type of input, such as text and images. They are often used for tasks that require understanding both visual and textual content, for example image captioning or visual question answering.

They are akin to a museum tour guide who explains an artifact by combining what they see (the image) with what they know (the text).
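
To make the idea of combining modalities concrete, here is a minimal sketch of one common approach: encode each input separately, then concatenate the resulting vectors into a single joint representation. It is a toy under stated assumptions, not how any particular model works. The functions `embed_text`, `embed_image`, and `fuse` are hypothetical stand-ins written for this illustration; a real system would use learned neural encoders in their place.

```python
from typing import List


def embed_text(text: str, dim: int = 4) -> List[float]:
    """Hypothetical stand-in for a text encoder: folds character codes
    into a fixed-size vector. A real model would use a learned encoder."""
    vec = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


def embed_image(pixels: List[List[int]], dim: int = 4) -> List[float]:
    """Hypothetical stand-in for an image encoder: averages normalized
    grayscale pixel intensities (0-255) into `dim` buckets."""
    flat = [p for row in pixels for p in row]
    sums = [0.0] * dim
    counts = [0] * dim
    for i, p in enumerate(flat):
        sums[i % dim] += p / 255.0
        counts[i % dim] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def fuse(text_vec: List[float], image_vec: List[float]) -> List[float]:
    """Fusion by concatenation: the joint vector carries both modalities,
    so a downstream classifier can relate what is seen to what is said."""
    return text_vec + image_vec


# What the guide "knows" (text) and what the guide "sees" (image).
caption = "a vase"
image = [[0, 255], [255, 0]]  # tiny 2x2 grayscale image

joint = fuse(embed_text(caption), embed_image(image))
print(len(joint))  # 4 text features followed by 4 image features
```

Concatenation is only one design choice. Other multimodal models mix the modalities earlier, for instance by feeding image and text features into one shared network.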

Example:

MMBT (Multimodal Bitransformer)
