Video-MME-v2: Top AI Video Models Still Trail Humans by a Wide Margin

A new 800-video benchmark scores Gemini 3 Pro at just 49.4% under group-based grading and shows that thinking mode often hurts accuracy on purely visual tasks.
artificial-intelligence
Author

Kabui, Charles

Published

2026-04-19

Keywords

video-understanding, multimodal-benchmark, gemini-3-pro, video-mme-v2, thinking-mode