Skip to main content

One doc tagged with "multimodal"

View all tags

Vision Language Models for Robotics

Vision Language Models (VLAs) combine visual and linguistic understanding to enable robots to understand and execute natural language instructions.