xAI releases Grok Imagine Video 1.5 Preview

xAI's Grok Imagine Video 1.5 Preview is available through the API for image-to-video generation, with 720p support and pricing that rises by resolution.

xAI released Grok Imagine Video 1.5 Preview on June 3, 2026. The model ID is grok-imagine-video-1.5-preview, and xAI says it is available through the xAI API in preview for image-to-video generation.

The product boundary is the story. The docs do not describe this as a text-to-video model: it turns a still image into video, lets the user describe motion with a prompt, and generates clips at up to 720p. API pricing also varies by resolution. The docs headline $0.08 per second, but the details table lists 720p output at $0.14 per second.

This is image-to-video first

xAI’s launch post says the model turns a single still image into “fluid, cinematic video” while staying faithful to the source image. The API example uses an image_url, a motion prompt, a duration, and a resolution. That tells developers how to read the release: start with a frame, then direct what happens to it.

That is a narrower product than a general video model, and the narrowness is useful. Image-to-video is often the controllable part of generated video because the first frame pins the subject, composition, lighting, and style. The prompt can then specify camera movement, pacing, atmosphere, and action.

For production workflows, that means Grok Imagine Video 1.5 Preview is better evaluated as an animation and shot-extension tool than as a blank-page story generator.

The pricing caveat belongs near the top

xAI’s docs show a simple headline price of $0.08 per second. The details table is more specific: image input is $0.01, 480p video output is $0.08 per second, and 720p output is $0.14 per second. Billing is per second generated, and xAI says image or video input can be charged as well.

That makes resolution and duration the practical cost controls. A 10-second 720p generation is not priced like a 10-second 480p generation. A workflow that chains shots into longer scenes will compound cost across every clip.

Image input: $0.01 (xAI Docs)
480p output: $0.08 per second (xAI Docs)
720p output: $0.14 per second (xAI Docs)
Listed rate limit: 60 requests per minute (xAI Docs)
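
The listed prices can be turned into a quick estimate. The sketch below is a rough calculator, not xAI's billing logic: it assumes one image input is charged per clip at the listed $0.01, that duration is billed in whole seconds, and that the preview prices above still hold. The function names are illustrative.

```python
from decimal import Decimal

# Prices as listed in the docs summary above; preview pricing may change.
PRICE_PER_SECOND = {"480p": Decimal("0.08"), "720p": Decimal("0.14")}
IMAGE_INPUT_PRICE = Decimal("0.01")  # assumed to apply once per source image


def clip_cost(seconds: int, resolution: str, image_inputs: int = 1) -> Decimal:
    """Estimated USD cost of one generated clip, including image input."""
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"no listed price for resolution {resolution!r}")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return PRICE_PER_SECOND[resolution] * seconds + IMAGE_INPUT_PRICE * image_inputs


def scene_cost(clips: list[tuple[int, str]]) -> Decimal:
    """Estimated USD cost of a scene chained from (seconds, resolution) clips."""
    return sum((clip_cost(s, r) for s, r in clips), Decimal("0"))


print(clip_cost(10, "480p"))           # 0.81
print(clip_cost(10, "720p"))           # 1.41
print(scene_cost([(10, "720p")] * 3))  # 4.23
```

Under these assumptions, the same 10-second clip costs roughly 74% more at 720p than at 480p, and a three-shot 720p scene costs three times the single clip before any retries.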

The developer surface is small but clear

The example in xAI’s announcement is straightforward: create a client, call client.video.generate, pass the model name, image URL, prompt, duration, and resolution, then print the returned URL. That is enough for early experimentation.
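
As a rough illustration of that call shape, the sketch below wraps the parameters in a small validated request object. Only the method name client.video.generate and the model ID come from the announcement. The keyword names (image_url, prompt, duration, resolution, model), the url attribute on the response, and the VideoRequest and generate_video_url helpers are assumptions to check against xAI's current docs.

```python
from dataclasses import asdict, dataclass

MODEL_ID = "grok-imagine-video-1.5-preview"  # model ID given in the release


@dataclass(frozen=True)
class VideoRequest:
    """Parameters the announcement's example passes; field names are assumed."""

    image_url: str
    prompt: str
    duration: int  # seconds; the article does not state the allowed range
    resolution: str = "480p"
    model: str = MODEL_ID

    def __post_init__(self) -> None:
        # Only the two resolutions with listed prices are accepted here.
        if self.resolution not in ("480p", "720p"):
            raise ValueError(f"unsupported resolution: {self.resolution!r}")
        if self.duration <= 0:
            raise ValueError("duration must be a positive number of seconds")


def generate_video_url(client, request: VideoRequest) -> str:
    """Call client.video.generate and return the resulting video URL.

    `client` is any object exposing video.generate(**kwargs). The `url`
    attribute on the response is an assumption, not a confirmed field.
    """
    response = client.video.generate(**asdict(request))
    return response.url
```

Passing the client in as an argument keeps the wrapper testable with a stub, which helps while the preview API shape is still moving.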

The missing details are the important ones for production. The docs page lists the model and pricing, but teams still need to verify duration limits, latency, content controls, retention behavior, regional availability beyond us-east-1, and whether outputs have provenance or watermarking metadata.

The preview label matters. A preview model can be useful without being stable enough for a production contract. Builders should expect the API shape, pricing, safety controls, and quality trade-offs to keep moving.

What to test first

Start with source-frame fidelity. Give the model product shots, character frames, UI stills, and scenes with text or logos, then check what changes between the input and output. Generated video often fails by drifting from the source rather than by failing to move.
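
One crude way to put a number on drift is to compare each output frame with the source frame. The sketch below assumes frames have already been decoded elsewhere into equally sized grayscale grids (lists of rows of 0-255 values). It is a rough proxy only: intended motion raises the score too, so it is more useful for ranking candidate clips than as a pass/fail threshold. The function names are illustrative.

```python
def mean_abs_diff(frame_a, frame_b) -> float:
    """Mean absolute pixel difference between two same-sized grayscale frames."""
    if len(frame_a) != len(frame_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(frame_a, frame_b)
    ):
        raise ValueError("frames must have the same dimensions")
    count = sum(len(row) for row in frame_a)
    if count == 0:
        raise ValueError("frames must not be empty")
    total = sum(
        abs(a - b)
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
    )
    return total / count


def drift_curve(source, frames) -> list[float]:
    """Difference from the source frame for each output frame, in order."""
    return [mean_abs_diff(source, frame) for frame in frames]


source = [[0, 0], [0, 0]]
frames = [[[0, 0], [0, 0]], [[10, 0], [0, 10]]]
print(drift_curve(source, frames))  # [0.0, 5.0]
```

A curve that jumps on the first output frame suggests the model altered the source itself, while a slow rise is more consistent with ordinary motion.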

Next, test cost against iteration count. Video workflows rarely succeed on the first generation. Price the work by accepted clip and include the cost of discarded drafts.
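
Pricing by accepted clip is simple arithmetic: total spend across all drafts divided by the number of clips kept. The sketch below uses the listed per-second prices and assumes one $0.01 image input per generation. The draft and acceptance counts in the example are hypothetical.

```python
from decimal import Decimal

# Listed preview prices; treat as subject to change.
PRICE_PER_SECOND = {"480p": Decimal("0.08"), "720p": Decimal("0.14")}
IMAGE_INPUT_PRICE = Decimal("0.01")  # assumed to apply once per generation


def cost_per_accepted_clip(
    seconds: int, resolution: str, generations: int, accepted: int
) -> Decimal:
    """Total spend on all drafts divided by the number of clips kept."""
    if accepted <= 0 or generations < accepted:
        raise ValueError("need at least one accepted clip and generations >= accepted")
    per_generation = PRICE_PER_SECOND[resolution] * seconds + IMAGE_INPUT_PRICE
    return per_generation * generations / accepted


# Hypothetical run: ten 6-second 720p drafts, two of which are kept.
print(cost_per_accepted_clip(6, "720p", generations=10, accepted=2))  # 4.25
```

In that hypothetical run, each kept clip costs $4.25 rather than the $0.85 a single generation would suggest.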

Finally, test editability. If each clip has to be regenerated from scratch to fix a small movement, the tool is expensive. If the model can reliably preserve a look across chained shots, it becomes more useful for actual production.

For broader model context, see our AI model leaderboard. For xAI company coverage, see our xAI company profile.

The AI Feed Desk


Editorial desk

The AI Feed Desk tracks AI provider updates, model releases, agent tooling, and enterprise adoption, turning fast-moving announcements into source-linked context for builders and operators.
