Microsoft releases MAI-Thinking-1 and expands its agent platform

Microsoft used Build 2026 to release MAI-Thinking-1, its first in-house reasoning model from the Microsoft AI Superintelligence Team, and to fold that model into a broader enterprise agent platform. The model has 35 billion active parameters and a 256K context window, and it is now in private preview on Foundry.

The model is only one piece of the announcement. Microsoft also introduced Microsoft IQ, Web IQ, Microsoft Scout, Agent 365 for local agents, Frontier Tuning, MDASH, Surface RTX Spark Dev Box, and new hosted agent infrastructure in Foundry. The message is that Microsoft wants the enterprise agent stack to run through GitHub, Foundry, Microsoft 365, Windows, and Microsoft Security.

The model release sits inside the platform

MAI-Thinking-1 is the cleanest headline. Microsoft says it was trained from scratch with zero distillation on enterprise-grade, clean, commercially licensed data. It is built for complex multi-step instructions, long-context reasoning, and code generation.

Microsoft also claims independent blind raters preferred MAI-Thinking-1 to Sonnet 4.6, and that it matches Opus 4.6 on SWE-Bench Pro coding ability. Those are Microsoft-reported comparisons and should be treated as such until outside evals are available.

The rest of the MAI family broadens the pitch. Microsoft says MAI-Image-2.5 and a flash variant serve text-to-image and image-to-image workloads; MAI Transcribe 1.5 supports 43 languages; MAI-Voice-2 adds more than 15 languages and new voice options; and MAI-Code-1 is available in Copilot and VS Code.

Key figures, per Microsoft: MAI-Thinking-1 has 35B active parameters and a 256K context window; the MAI family adds 7 new models across reasoning, coding, image, voice, and transcription.

Microsoft IQ is the context layer

The enterprise-agent argument starts with context. Microsoft says Microsoft IQ is generally available across GitHub Copilot, Microsoft Foundry, and Copilot Studio. Work IQ captures how work happens across Microsoft 365, organizational systems, and external sources. Fabric IQ handles structured business data. Foundry IQ ties enterprise knowledge and the live web together for retrieval planning.

Web IQ is the new piece. Microsoft calls it an AI-first web search stack that is model-agnostic and MCP-native, and says it returns relevant passages at nearly 2.5x the speed of the next best alternative. That is another Microsoft claim, but it points at the real problem: agents need grounded context, and the enterprise version of that context is not a single document store.

The practical read is that Microsoft is building around the same idea it already sells in productivity software. The data lives in Microsoft 365, security lives in Microsoft identity and compliance systems, and the developer flow lives in GitHub and Foundry.

Governance is the real enterprise wedge

Agent 365 for local agents is where the announcement turns from model news into IT architecture. Microsoft says Agent 365 extends Entra, Defender, and Purview into one control plane to observe, govern, and secure agents across an organization, regardless of where they are hosted or what framework they use.

That framing matters because agents are harder to govern than chatbots. They call tools, run workflows, hold memory, coordinate with other agents, and may touch local systems. Microsoft is pairing Agent 365 with an open trust stack: ASSERT for policy-driven safety evaluation and regression testing, plus the Agent Control Specification for applying controls in the agent loop.

MDASH is the security proof point. Microsoft says the multi-model agentic security system uses more than 100 agents to find exploitable bugs by reasoning about data flow, business logic, and exploit chains, with fixes delivered through Defender Portal.

The local hardware story is back

Surface RTX Spark Dev Box is the hardware edge of the platform. Microsoft says it is powered by NVIDIA RTX Spark, offers up to one petaflop of AI compute and 128GB of unified memory, and can run up to 120B-parameter LLMs with up to 1 million tokens of context locally. The Microsoft footnote says the petaflop claim is based on NVIDIA’s theoretical FP4 TOPS using sparsity.

That caveat matters. “One petaflop” is not a universal application-speed guarantee; it is a hardware throughput claim under specific conditions. The useful question for developers is whether local agents can run with acceptable latency, isolation, and governance on Windows without cloud GPU instances.
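A quick back-of-the-envelope check helps put the 120B-parameter and 128GB figures side by side. The sketch below is not Microsoft's or NVIDIA's sizing method. It counts only the model weights, and the bit widths are assumptions, since the announcement as reported here does not say which precision the 120B claim uses. The KV cache for long contexts, activations, and the operating system all need memory on top of this.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


UNIFIED_MEMORY_GB = 128  # unified memory figure quoted by Microsoft
PARAMS_BILLION = 120     # largest local model size quoted by Microsoft

# Bit widths are illustrative assumptions, not figures from the announcement.
for bits in (16, 8, 4):
    need = weight_memory_gb(PARAMS_BILLION, bits)
    headroom = UNIFIED_MEMORY_GB - need
    print(f"{bits}-bit weights: {need:.0f} GB needed, {headroom:.0f} GB left over")
```

Under these assumptions, 16-bit weights do not fit at all, 8-bit weights leave almost no room for context, and only a 4-bit format leaves meaningful headroom. That is why the precision behind a headline number is worth asking about.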

What teams should test first

The first test is model routing. If MAI-Thinking-1 is cheaper or more efficient for long-context reasoning and coding inside Foundry, route representative agent tasks to it and compare against the models already in production.
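One way to structure that comparison is a small harness that runs the same tasks through each model and records pass rate and latency. This is a minimal sketch, not a Foundry API: `compare_models`, the task format, and the stand-in model callables are all hypothetical, and a real test would replace the lambdas with calls to whatever client a team already uses.

```python
import time
from typing import Callable, Dict, List, Tuple

# A task is a prompt plus a checker that decides whether the output is acceptable.
Task = Tuple[str, Callable[[str], bool]]


def compare_models(
    models: Dict[str, Callable[[str], str]],
    tasks: List[Task],
) -> Dict[str, Dict[str, float]]:
    """Run every task through every model; report pass rate and mean latency."""
    if not tasks:
        raise ValueError("at least one task is required")
    results: Dict[str, Dict[str, float]] = {}
    for name, call in models.items():
        passed = 0
        total_seconds = 0.0
        for prompt, check in tasks:
            start = time.perf_counter()
            output = call(prompt)
            total_seconds += time.perf_counter() - start
            passed += 1 if check(output) else 0
        results[name] = {
            "pass_rate": passed / len(tasks),
            "mean_latency_s": total_seconds / len(tasks),
        }
    return results


# Stand-in callables (hypothetical); swap in real model clients here.
models = {
    "candidate": lambda prompt: prompt.upper(),
    "baseline": lambda prompt: prompt,
}
tasks = [("summarize ticket 1", lambda out: out.isupper())]
report = compare_models(models, tasks)
```

The checker functions carry most of the weight. Using tasks drawn from production agents, with pass criteria the team already trusts, keeps the comparison from turning into another vendor benchmark.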

The second test is context quality. Microsoft IQ only matters if it retrieves the right business context without exposing too much. Test it against real contracts, tickets, documents, meetings, and business-system records with permission boundaries intact.
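A context test of this kind needs two numbers per query: how much of the relevant material the user is allowed to see came back, and whether anything came back that the user should not see. The sketch below assumes a hypothetical `retrieve(user, query)` function that returns document IDs and a hand-built access list. Neither is a Microsoft IQ interface; they stand in for whatever retrieval call and permission source a team actually has.

```python
from typing import Callable, Dict, List, Set


def audit_retrieval(
    retrieve: Callable[[str, str], List[str]],
    user: str,
    query: str,
    acl: Dict[str, Set[str]],
    relevant: Set[str],
) -> Dict[str, object]:
    """Check one query for permission leaks and for recall on permitted documents."""
    returned = retrieve(user, query)
    # Any returned document the user is not listed for counts as a leak.
    leaked = [doc for doc in returned if user not in acl.get(doc, set())]
    # Recall is measured only against relevant documents the user may see.
    permitted_relevant = {doc for doc in relevant if user in acl.get(doc, set())}
    hits = permitted_relevant & set(returned)
    recall = len(hits) / len(permitted_relevant) if permitted_relevant else 1.0
    return {"leaked": leaked, "recall": recall}


# Hypothetical fixture: three documents, one of them restricted to legal.
acl = {
    "contract-1": {"alice", "legal"},
    "ticket-7": {"alice", "bob"},
    "board-memo": {"legal"},
}


def fake_retrieve(user: str, query: str) -> List[str]:
    # Deliberately over-shares to show what a leak looks like in the report.
    return ["ticket-7", "board-memo"]


result = audit_retrieval(
    fake_retrieve, "bob", "renewal terms", acl, {"contract-1", "ticket-7"}
)
```

Here the fixture reports `board-memo` as leaked for `bob`, while recall is still perfect on the one relevant document he is allowed to read. Tracking the two numbers separately keeps a good retrieval score from hiding a permission failure.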

The third test is governance. Before deploying many agents, check whether Agent 365 can show what is running, what each agent can access, which actions it took, and which policies apply. That is where Microsoft’s existing enterprise stack either becomes a practical advantage or another layer to configure.
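Those four questions can be turned into a simple gap check over whatever agent inventory a team can export. The sketch below is an illustration only: the record fields and the `governance_gaps` helper are hypothetical and do not reflect Agent 365's data model, which is not described in the announcement as reported here.

```python
from typing import Dict, List

# One field per question: who owns it, what it can access,
# which policies apply, and what it has done. Names are hypothetical.
REQUIRED_FIELDS = ("owner", "allowed_resources", "policies", "recent_actions")


def governance_gaps(inventory: List[Dict]) -> Dict[str, List[str]]:
    """Map each agent name to the required fields that are absent or None."""
    gaps: Dict[str, List[str]] = {}
    for agent in inventory:
        missing = [field for field in REQUIRED_FIELDS if agent.get(field) is None]
        if missing:
            gaps[agent.get("name", "<unnamed>")] = missing
    return gaps


# Hypothetical export: the second agent has no policies or action log recorded.
inventory = [
    {
        "name": "invoice-bot",
        "owner": "finance-ops",
        "allowed_resources": ["erp:read"],
        "policies": ["pii-redaction"],
        "recent_actions": [],
    },
    {
        "name": "local-dev-helper",
        "owner": "platform",
        "allowed_resources": ["repo:write"],
    },
]
gaps = governance_gaps(inventory)
```

An empty action list passes the check because a new agent may legitimately have done nothing yet; a missing field does not. If the control plane cannot populate one of these fields for an agent, that is the finding worth raising before rollout.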
