A research agent routes private document fragments into public search queries while privacy filters monitor the outgoing trail

MosaicLeaks makes research-agent privacy a query-log problem

ServiceNow researchers introduce MosaicLeaks, a benchmark showing how deep research agents can leak private enterprise facts through ordinary-looking web queries.

34 minutes ago

ServiceNow researchers have published MosaicLeaks, a benchmark for a privacy problem that becomes more important as companies give agents both private documents and external tools. The core risk is not that the agent pastes a secret into one obvious outbound message. It is that ordinary-looking web queries can accumulate into a private fact.

The Hugging Face write-up frames the failure mode around deep research agents that combine local enterprise documents with web retrieval. An outside observer does not see the private documents or the agent’s hidden reasoning. They see the external queries. MosaicLeaks asks whether that query log is enough to infer private research intent, answer private questions, or reconstruct verifiable private claims.

That is a sharper privacy model than “tell the agent not to leak.” It treats the agent’s tool use as the attack surface.
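
The accumulation effect can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the query strings, the sensitive terms, and the helper `exposed_terms` are hypothetical and are not taken from the benchmark, which evaluates inference by an observer rather than keyword overlap.

```python
def exposed_terms(query_log: list[str], sensitive_terms: set[str]) -> set[str]:
    """Return the sensitive terms that appear anywhere in the outgoing queries."""
    seen: set[str] = set()
    for query in query_log:
        words = set(query.lower().split())
        seen |= sensitive_terms & words
    return seen


# Hypothetical query trail: no single query contains every sensitive term.
query_log = [
    "acme storage outage postmortem",
    "storage vendor contract renewal pricing",
    "enterprise storage churn risk q3",
]
sensitive = {"acme", "renewal", "churn"}

# What an observer learns from each query alone versus the whole log.
per_query = [exposed_terms([q], sensitive) for q in query_log]
combined = exposed_terms(query_log, sensitive)

print(per_query)
print(combined)
```

Each query exposes one term on its own, while the full log exposes all three. A real observer would reason over meaning instead of matching words, so this understates what a query trail can reveal.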

The benchmark is built around mixed context

MosaicLeaks contains 1,001 multi-hop research chains. Each chain interleaves local enterprise documents with a controlled public web corpus. The point is to create tasks where an agent needs private context to decide what public information to retrieve next.

That is exactly the pattern enterprises want from research agents. A useful agent might read an internal account note, search public market filings, compare a vendor’s public claims with private support tickets, and produce a recommendation. The same pattern can leak sensitive context if the external queries reveal too much about the internal side of the task.

The authors measure three leakage types. Intent leakage asks whether the query log reveals what private question the agent was trying to answer. Answer leakage asks whether an observer with a private question can answer it from the query log. Full-information leakage asks whether the observer can state true private claims without being given the question.
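
One way to keep the three leakage types separate when reporting results is to record them per chain and aggregate afterward. This is a minimal sketch under stated assumptions: the `ChainLeakage` record and `leakage_rates` helper are hypothetical names, and the judgments themselves would come from whatever observer procedure an evaluation uses, which this sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainLeakage:
    """Per-chain leakage judgments (hypothetical record, one per research chain)."""
    intent: bool            # log reveals which private question was being asked
    answer: bool            # observer given the question can answer it from the log
    full_information: bool  # observer can state true private claims without the question


def leakage_rates(results: list[ChainLeakage]) -> dict[str, float]:
    """Fraction of chains that leak under each of the three definitions."""
    n = len(results)
    if n == 0:
        return {"intent": 0.0, "answer": 0.0, "full_information": 0.0}
    return {
        "intent": sum(r.intent for r in results) / n,
        "answer": sum(r.answer for r in results) / n,
        "full_information": sum(r.full_information for r in results) / n,
    }


# Hypothetical judgments for four chains.
results = [
    ChainLeakage(intent=True, answer=False, full_information=False),
    ChainLeakage(intent=True, answer=True, full_information=False),
    ChainLeakage(intent=False, answer=False, full_information=False),
    ChainLeakage(intent=True, answer=True, full_information=True),
]
rates = leakage_rates(results)
print(rates)
```

Keeping the three rates apart matters because a control that lowers answer leakage may leave intent leakage untouched.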

Those distinctions matter. Many privacy controls focus on direct disclosure. MosaicLeaks is about inference from fragments.

Prompting is not enough

The uncomfortable finding is that privacy prompting does not eliminate the problem. The paper says models across families and sizes leak frequently, and that training only for task performance can make leakage worse.

That is plausible because a task-only agent is rewarded for retrieving useful evidence. If the fastest route to an answer is a query that exposes a private bridge entity, meaning the internal name or fact that links one research hop to the next, capability-focused optimization may choose the revealing route more often. Better research behavior can become worse privacy behavior unless the training objective accounts for both.

ServiceNow’s write-up proposes Privacy-Aware Deep Research, or PA-DR, as a reinforcement-learning approach that combines task success with leakage penalties. In the headline result, the authors report strict chain success rising from 48.7% to 58.7% while answer/full-information leakage falls from 34.0% to 9.9%.

That result should be treated as a benchmark result, not proof that a production system is safe. It is still useful because it shows the right shape of mitigation: measure leakage at the tool-call level and reward the agent for solving the task without exposing private context.
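
The general shape of a reward that combines task success with a leakage penalty can be sketched as follows. This is not the PA-DR objective, whose exact form is not given here; the function name `shaped_reward`, the binary signals, and the `leak_penalty` weight are all assumptions chosen for illustration.

```python
def shaped_reward(task_success: bool, leaked: bool, leak_penalty: float = 1.0) -> float:
    """Toy reward: credit for solving the task, minus a penalty if the query log leaked.

    Assumption: both signals are binary per episode. A real training setup
    could use graded scores or per-query penalties instead.
    """
    reward = 1.0 if task_success else 0.0
    if leaked:
        reward -= leak_penalty
    return reward


# The four outcomes under the default penalty weight.
for success in (True, False):
    for leaked in (True, False):
        print(success, leaked, shaped_reward(success, leaked))
```

With the default weight, solving the task while leaking scores no better than failing without leaking, so the agent has no incentive to buy accuracy with a revealing query. A smaller weight would trade some privacy for task performance.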

Enterprise agent logs become sensitive data

MosaicLeaks also changes how teams should think about observability. Agent logs are often treated as debugging material: prompts, traces, tool calls, retrieval queries, actions, and outputs. For enterprise research agents, those logs may become sensitive data in their own right.

If an external search provider, proxy, plugin, or monitoring layer can observe the query trail, it may learn more than any single query reveals. The same concern applies internally. A broad logging system that stores every agent query may collect sensitive business context even if the agent never prints the private document in its final answer.

The practical implication is that agent governance needs data-minimization rules for tool calls, not only final responses. Teams should ask which tools receive private context, whether queries can be rewritten to remove sensitive bridge entities, how long traces are retained, and who can inspect them.
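
A data-minimization rule for tool calls could start with a filter that strips known sensitive entities from a query before it leaves the trust boundary. The sketch below is an assumption-laden illustration: `redact_query` and the entity list are hypothetical, and simple string removal does not address inference from the remaining words.

```python
import re


def redact_query(query: str, sensitive_entities: list[str]) -> str:
    """Remove listed entities (case-insensitive, whole-token) from an outgoing query."""
    redacted = query
    # Remove longer entities first so "Acme Corp" is handled before "Acme".
    for entity in sorted(sensitive_entities, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(entity) + r"(?!\w)"
        redacted = re.sub(pattern, "", redacted, flags=re.IGNORECASE)
    # Collapse the whitespace left behind by removals.
    return " ".join(redacted.split())


# Hypothetical entities drawn from an internal document.
entities = ["Acme Corp", "Project Lantern"]
safe_query = redact_query("acme corp storage outage Project Lantern timeline", entities)
print(safe_query)
```

A filter like this only covers entities someone has already listed. It is a first line of defense and works best paired with retention limits and access controls on the traces themselves.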

What to watch next

The next checkpoint is whether privacy-aware research-agent evaluation becomes standard in enterprise deployments. A vendor claiming “secure deep research” should be able to answer more than where documents are stored. It should explain what information leaves the trust boundary through search, retrieval, connectors, browsing, plugins, and telemetry.

MosaicLeaks does not say every research agent is unusable. It says the privacy budget is spent across many small actions. That is the right warning for products that promise to connect internal knowledge with the public web.

The AI Feed Desk

Editorial desk

The AI Feed Desk tracks AI provider updates, model releases, agent tooling, and enterprise adoption, turning fast-moving announcements into source-linked context for builders and operators.
