Despite impressive perceptual and reasoning capabilities, vision-language models (VLMs) face challenges in systematic generalization, sample efficiency, commonsense reasoning, and trustworthy decision-making. The CogVL workshop provides a forum for researchers across computer vision, natural language processing, and cognitive science to explore how cognitively inspired frameworks can address these limitations.
The workshop is motivated by emerging interest in whether cognitive principles such as counterfactual thinking, theory of mind, compositional reasoning, and causal inference can offer a blueprint for more adaptable, robust, and context-aware multimodal intelligence. The half-day program features invited keynote talks, a panel discussion with leading experts, and selected papers. CogVL will also host the BlackSwan Challenge, which evaluates abductive reasoning (inferring hidden causes) and defeasible reasoning (revising conclusions in light of new visual evidence) in unexpected video events.
Announcements
- The submission site is now open! Submit your papers via OpenReview. Deadline: March 1, 2026.
- The BlackSwan Challenge has launched! See the submission instructions page for details on how to participate.
- Follow us on X! CogVL Workshop on X