xAI released Grok Imagine Video 1.5 Preview on June 3, 2026. The model ID is grok-imagine-video-1.5-preview, and xAI says it is available through the xAI API in preview for image-to-video generation.
The product boundary is the story. The docs do not present this as a text-to-video model. It turns a still image into video, lets the user describe motion with a prompt, and generates clips at up to 720p. API pricing also varies by resolution: the docs headline $0.08 per second, but the details table lists 720p output at $0.14 per second.
This is image-to-video first
xAI’s launch post says the model turns a single still image into “fluid, cinematic video” while staying faithful to the source image. The API example uses an image_url, a motion prompt, a duration, and a resolution. That tells developers how to read the release: start with a frame, then direct what happens to it.
That is a narrower product than a general video model, and the narrowness is useful. Image-to-video is often the controllable part of generated video because the first frame pins the subject, composition, lighting, and style. The prompt can then specify camera movement, pacing, atmosphere, and action.
For production workflows, that means Grok Imagine Video 1.5 Preview is better evaluated as an animation and shot-extension tool than as a blank-page story generator.
The pricing caveat belongs near the top
xAI’s docs show a simple headline price of $0.08 per second. The details table is more specific: image input is $0.01, 480p video output is $0.08 per second, and 720p output is $0.14 per second. Billing is per second of generated video, and xAI says image or video inputs can be charged as well.
That makes resolution and duration the practical cost controls. A 10-second 720p generation is not priced like a 10-second 480p generation. A workflow that chains shots into longer scenes will compound cost across every clip.
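To make that gap concrete, here is a minimal cost sketch using only the prices quoted above. It assumes the $0.01 image-input fee applies once per generation and that billing scales linearly with seconds; neither assumption is confirmed by the docs, and the function name is ours, not part of any xAI SDK.

```python
# Prices quoted in this article (USD). How the image fee is applied is an assumption.
RATE_PER_SECOND = {"480p": 0.08, "720p": 0.14}
IMAGE_INPUT_FEE = 0.01  # assumed: charged once per source image


def clip_cost(seconds: float, resolution: str, image_inputs: int = 1) -> float:
    """Estimate the cost of one generation under the assumptions above."""
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"no listed price for resolution {resolution!r}")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    total = seconds * RATE_PER_SECOND[resolution] + image_inputs * IMAGE_INPUT_FEE
    return round(total, 4)


print(clip_cost(10, "480p"))  # 0.81
print(clip_cost(10, "720p"))  # 1.41
```

Under these assumptions the same 10-second clip costs about 74% more at 720p, and that multiplier carries through every draft and every shot in a chained scene.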
The developer surface is small but clear
The example in xAI’s announcement is straightforward: create a client, call client.video.generate, pass the model name, image URL, prompt, duration, and resolution, then print the returned URL. That is enough for early experimentation.
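A rough sketch of how a team might assemble and sanity-check those inputs before sending them follows. The model ID and the image_url field come from xAI's materials; the other field names, the validation rules, and the shape of the response are assumptions for illustration, so the actual call is left as a comment rather than presented as the SDK's real signature.

```python
from dataclasses import dataclass, asdict

MODEL_ID = "grok-imagine-video-1.5-preview"  # model ID from the release
LISTED_RESOLUTIONS = {"480p", "720p"}  # resolutions that appear in the pricing table


@dataclass(frozen=True)
class VideoRequest:
    """Hypothetical container for the inputs the announcement example passes."""
    image_url: str
    prompt: str
    duration: int  # seconds; field name is an assumption
    resolution: str
    model: str = MODEL_ID


def build_request(image_url: str, prompt: str, duration: int,
                  resolution: str = "480p") -> dict:
    """Validate inputs locally and return keyword arguments for the API call."""
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image_url should be an http(s) URL")
    if not prompt.strip():
        raise ValueError("prompt should describe the motion")
    if duration <= 0:
        raise ValueError("duration must be a positive number of seconds")
    if resolution not in LISTED_RESOLUTIONS:
        raise ValueError(f"unlisted resolution: {resolution!r}")
    return asdict(VideoRequest(image_url, prompt, duration, resolution))


params = build_request(
    image_url="https://example.com/frame.png",  # placeholder source frame
    prompt="Slow push-in as fog drifts across the street",
    duration=5,
    resolution="720p",
)
print(params)

# With a configured client, the call described in the announcement would look
# roughly like this (unverified; check the current docs before relying on it):
#   response = client.video.generate(**params)
#   print(response.url)
```

Keeping validation separate from the network call makes it cheap to catch a bad resolution or an empty prompt before paying for a generation.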
The missing details are the important ones for production. The docs page lists the model and pricing, but teams still need to verify duration limits, latency, content controls, retention behavior, regional availability beyond us-east-1, and whether outputs have provenance or watermarking metadata.
The preview label matters. A preview model can be useful without being stable enough for a production contract. Builders should expect the API shape, pricing, safety controls, and quality trade-offs to keep moving.
What to test first
Start with source-frame fidelity. Give the model product shots, character frames, UI stills, and scenes with text or logos, then check what changes between the input and output. Generated video often fails by drifting from the source rather than by failing to move.
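One simple way to put a number on drift is to compare each output frame against the source image. The sketch below assumes frames have already been decoded into equal-size grayscale grids (lists of rows of 0-255 values); decoding the video is out of scope here, and mean absolute difference is a crude stand-in for a perceptual metric, not a method xAI recommends.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute per-pixel difference between two equal-size grayscale frames."""
    if len(frame_a) != len(frame_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(frame_a, frame_b)
    ):
        raise ValueError("frames must have identical dimensions")
    pixel_count = sum(len(row) for row in frame_a)
    if pixel_count == 0:
        raise ValueError("frames must not be empty")
    total = sum(
        abs(a - b)
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
    )
    return total / pixel_count


def drift_profile(source, frames):
    """Difference of every output frame from the source, in playback order."""
    return [mean_abs_diff(source, frame) for frame in frames]


# Toy 2x2 example: the first frame matches the source, later frames move away.
source = [[100, 100], [100, 100]]
frames = [
    [[100, 100], [100, 100]],
    [[104, 100], [100, 96]],
    [[120, 110], [90, 80]],
]
print(drift_profile(source, frames))  # [0.0, 2.0, 15.0]
```

Intended motion raises this score too, so it is most informative on the first frame and on cropped regions that should stay fixed, such as a logo or on-screen text.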
Next, test cost against iteration count. Video workflows rarely succeed on the first generation. Price the work by accepted clip and include the cost of discarded drafts.
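A small sketch of that accounting: total spend across every attempt, divided by the clips that were actually kept. The per-second rates are the ones quoted earlier; the draft counts in the example are invented for illustration, and input fees are ignored for simplicity.

```python
import math


def cost_per_accepted_clip(generation_costs, accepted_count):
    """Total spend across all generations divided by the number of clips kept."""
    if accepted_count <= 0:
        raise ValueError("accepted_count must be at least 1")
    return math.fsum(generation_costs) / accepted_count


# Illustrative run: five 10-second 720p drafts at $0.14/s, one of which is kept.
draft_cost = 10 * 0.14
effective = cost_per_accepted_clip([draft_cost] * 5, accepted_count=1)
print(f"${effective:.2f} per accepted clip")
```

If only one draft in five is usable, the effective price per accepted second is five times the listed rate, which is the number to budget against.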
Finally, test editability. If each clip has to be regenerated from scratch to fix a small movement, every correction costs a full generation and the tool gets expensive quickly. If the model can reliably preserve a look across chained shots, it becomes more useful for actual production.
For broader model context, see our AI model leaderboard. For xAI company coverage, see our xAI company profile.