Google DeepMind · Preview

Gemini Omni

Native frontier video generation folded into the standard Gemini surface.

Context

1M

Max output

Input /1M

Output /1M

Best for

  • Text/audio/image/video → dynamic video
  • Natural-language video editing
  • Enterprise post-production, virtual try-on

Watch out

Preview release. Head-to-head comparisons against Veo 3 and Sora 2 have not yet been independently run.

For creators. The one to watch for agentic video pipelines: author, narrate, and produce in one tool-use sequence.

Capabilities

  • Text, audio, image, and video inputs for video creation/editing
  • Rolling out via Gemini API and Agent Platform API

Where it runs
