NVIDIA Nemotron 3 Nano Omni: One Open Model for Vision, Audio, and Text at 9x the Throughput

NVIDIA’s 30B-A3B open multimodal model unifies vision, audio, and text in a single architecture, delivering 9x higher throughput than comparable open omni models for agentic workloads.
artificial-intelligence
Author

Kabui, Charles

Published

2026-05-11

Keywords

nvidia-nemotron, multimodal-ai, mixture-of-experts, agentic-ai, edge-inference