Google's official Gemini Omni and Gemini 3.5 article card

Google rolls out Gemini Omni Flash for video generation

Gemini Omni Flash turns mixed inputs into video and is rolling into Gemini, Flow, YouTube Shorts, and YouTube Create before the API arrives.

Google is rolling out Gemini Omni Flash, the first model in its Gemini Omni family. The model can combine images, audio, video, and text as inputs, then generate or edit video through conversation. Google announced the model at I/O 2026 and followed with a May 29 demo post showing the product direction in practice.

The distribution plan is as important as the model. Google says Gemini Omni Flash is rolling out globally to Google AI Plus, Pro, and Ultra subscribers through the Gemini app and Google Flow. It is also rolling out at no cost to users on YouTube Shorts and the YouTube Create app starting this week, with API access for developers and enterprise customers planned in the coming weeks.

The product is broader than text-to-video

Google’s framing is not only “type a prompt, get a clip.” The company says Omni can use multiple references and edit through natural language. In the demo post, Google emphasizes character consistency, physics, scene memory, style transfer, and multi-turn edits where each instruction builds on the previous one.

That matters because the hard part of generated video is rarely the first clip. The hard part is control after the first clip: keeping a character consistent, changing only one part of a scene, preserving motion, or using an image and an audio reference together without collapsing into mush. If Omni can make those edits predictable, it becomes less like a toy generator and more like a production assistant.

Google also says Omni will start with video and later support other output modalities such as image and audio. For now, it is best read as a video product with multimodal inputs, not as a fully general any-output model.

YouTube is the adoption channel

The YouTube rollout is the clearest strategic move. Google can charge subscribers in Gemini and Flow while letting creators encounter the model inside Shorts and Create, where the output has an obvious publishing destination. That gives Google a large consumer testing surface and a way to make generated video part of normal creator tooling.

The API timing also matters. Developers and enterprise customers are not first in line. Google is letting the consumer and creator products carry the early usage, then opening APIs in the following weeks. That sequence tells builders not to assume the first public release will have stable developer economics, usage limits, or production-grade controls.

For teams building with generated video, the practical next step is to watch the API release, not just the demos. Pricing, latency, content controls, input limits, and rights handling will decide whether Omni is useful for real workflows.

Provenance is part of the product

Google says videos created with Omni include SynthID digital watermarking and can be verified through the Gemini app, Gemini in Chrome, and Google Search. The company also says it is taking a cautious approach to audio and speech editing beyond voice avatars.

That caution is not decorative. Video models are judged on capability, but they are adopted based on trust, controls, and downstream policy. A creator tool that can transform footage through conversation needs clear provenance, especially when it is integrated into YouTube.

The open question is how those controls work once API access arrives. Enterprise users will want consistent metadata, policy hooks, and auditability. Creators will want speed and fewer surprises. Those needs can pull in different directions.

What to watch next

The next checkpoint is the API rollout. If Google ships Omni with usable pricing and strong edit controls, it could become the first Gemini video model that developers can plan around. If the API arrives with narrow limits, the near-term story stays inside Google’s own apps.

For readers tracking Google’s model stack, Omni sits beside the separate Gemini 3.5 Flash release rather than replacing it. Omni handles video creation and editing; 3.5 Flash targets agentic and coding work. See our Google company profile and AI model leaderboard for the wider model context.

The AI Feed Desk

Editorial desk

The AI Feed Desk tracks AI provider updates, model releases, agent tooling, and enterprise adoption, turning fast-moving announcements into source-linked context for builders and operators.
