NVIDIA Cosmos 3: One Open Model for Text, Images, Video, Audio, and Robot Actions

NVIDIA’s Cosmos 3 is an open family of world models that handle text, images, video, audio, and robot actions in a single system, available in 16B and 64B sizes. NVIDIA reports that it tops open-source rankings for image generation, video generation, and robot control.
artificial-intelligence
Author

Kabui, Charles

Published

2026-06-10

Keywords

nvidia-cosmos-3, world-models, physical-ai, omnimodal-ai, robot-policy