News

Visual Commonsense Reasoning (VCR) is a cognitive task that challenges models to answer visual questions and to explain the rationale behind their answers. While Large Language Models (LLMs) offer ...