Google DeepMind · Preview
Gemini Omni
Native frontier video generation folded into the standard Gemini surface.
Context: 1M
Max output: not listed
Input /1M: not listed
Output /1M: not listed
Best for
- Text/audio/image/video → dynamic video
- Natural-language video editing
- Enterprise post-production, virtual try-on
Watch out
Preview release. No independent head-to-head comparison against Veo 3 or Sora 2 has been run yet.
For creators. The one to watch for agentic video pipelines: author, narrate, and produce in one tool-use sequence.
Capabilities
- Text, audio, image, and video inputs for video creation/editing
- Rolling out via Gemini API and Agent Platform API