Mistral 3 raises the open-model bar with Apache-licensed dense and MoE releases

Mistral 3 includes three small dense models and Mistral Large 3, a 675B-parameter sparse MoE with 41B active parameters released under Apache 2.0.

May 22, 2026

Mistral AI announced Mistral 3, a new model family that includes three small dense models and Mistral Large 3, its most capable model to date. For builders, the headline is not only performance: the models are released under the Apache 2.0 license.

Mistral Large 3 is a sparse mixture-of-experts model with 41 billion active parameters and 675 billion total parameters. Mistral says it was trained from scratch on 3,000 NVIDIA H200 GPUs.
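To put those two parameter counts in perspective, here is a rough back-of-the-envelope sketch. It uses only the 41B and 675B figures quoted above. The bytes-per-parameter widths are illustrative assumptions, not precisions Mistral has announced, and the memory estimate covers weights only.

```python
# Back-of-the-envelope figures for a sparse MoE, using the parameter
# counts quoted in the article. The storage widths are assumptions.

TOTAL_PARAMS = 675e9   # total parameters (from the article)
ACTIVE_PARAMS = 41e9   # active parameters per token (from the article)

# Assumed storage widths in bytes per parameter (hypothetical choices).
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters used for any single token."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Weight storage in decimal GB.

    Ignores KV cache, activations, and runtime overhead.
    """
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"Active share per token: {frac:.1%}")
    for label, width in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(TOTAL_PARAMS, width)
        print(f"{label} weights: about {gb:,.0f} GB")
```

The takeaway is that sparsity mainly cuts compute per token: only about 6% of the parameters are active for a given token. In a typical serving setup, though, all 675B parameters still have to be stored somewhere the runtime can reach them.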

What changed

Mistral says Large 3 is its first mixture-of-experts model since the Mixtral series and that it debuts at number two in the OSS non-reasoning models category on LMArena. The company also released smaller Ministral models at 14B, 8B, and 3B.

Availability is broad. The models can be accessed through Mistral AI Studio, Amazon Bedrock, Azure Foundry, Hugging Face, Modal, IBM watsonx, OpenRouter, Fireworks, Unsloth AI, Together AI, and others.

Why this matters

Open models matter when teams need customization, local deployment, auditability, or cost control. Apache 2.0 licensing makes the release more useful for commercial builders than open-weight releases that ship under more restrictive licenses.

The smaller models are also important. Not every AI workload needs a frontier-scale model. If the small models are strong enough, they can push more inference to edge devices, private environments, and cost-sensitive applications.

What to watch next

Watch for the reasoning version Mistral says is coming soon, plus real deployment reports from teams serving Large 3 with vLLM, TensorRT-LLM, SGLang, and compressed checkpoints. Open-model quality only matters if the model can be served economically.
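One way to read those deployment reports is to reduce them to a cost per million generated tokens. The sketch below is a minimal illustration. The function name and every input in the example (GPU count, hourly price, throughput) are hypothetical placeholders, not measured figures for Large 3 or any provider.

```python
def cost_per_million_tokens(
    gpu_count: int,
    gpu_hourly_usd: float,
    tokens_per_second: float,
) -> float:
    """Estimate serving cost in USD per one million generated tokens.

    Assumes the deployment runs at the given aggregate throughput for
    the whole hour; real utilization is usually lower.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    hourly_cost = gpu_count * gpu_hourly_usd
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Placeholder inputs for illustration only; substitute reported numbers.
    estimate = cost_per_million_tokens(
        gpu_count=8, gpu_hourly_usd=3.0, tokens_per_second=2000.0
    )
    print(f"Estimated cost: ${estimate:.2f} per million tokens")
```

Plugging in throughput and hardware figures from a published deployment report gives a number that can be compared directly against hosted API pricing.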

Sources

Mistral AI: Introducing Mistral 3