Security Researchers Broke Into Claude Cowork's Sandbox — and Anthropic Said It Doesn't Count

Security researchers at Armadin have demonstrated a multi-stage exploit chain that escalates from DLL sideloading in an Anthropic-signed Windows binary to root command execution inside the Hyper-V-isolated Ubuntu VM that underpins Claude Cowork — and Anthropic’s official verdict is that this is not a security issue, because an attacker would already need code execution on the host to trigger it.

🔍 THE BOTTOM LINE: Armadin chained three weaknesses (a DLL search-order bug in claude.exe, the trust that the binary’s Authenticode signature confers on anything running inside it, and a named-pipe RPC channel into the sandbox VM) to land arbitrary root code execution inside Claude Cowork’s most isolated layer, with no network egress restrictions once inside. Armadin validated the chain against Claude Desktop for Windows version 1.9255.2.0 and notified Anthropic on 20 March 2026. Anthropic declined to treat it as a vulnerability on the grounds that the chain requires pre-existing host code execution.

What the sandbox actually defends

Claude Cowork on Windows is built around a defence-in-depth model. Claude Code is wrapped in a Hyper-V-isolated Ubuntu VM, with each session running as an unprivileged user inside a bubblewrap namespace. A seccomp filter constrains which syscalls the inner process can issue, and an egress proxy restricts outbound traffic to a domain allowlist. Communication between the host-side claude.exe and the inner VM flows over a named-pipe RPC channel gated by an Authenticode signature check. It is exactly the kind of layered model that fails in the gaps between layers, which is where Armadin went looking.

The chain: DLL sideloading, then RPC, then root

Stage 1 — sideload into claude.exe. Windows resolves DLLs by searching the application’s own directory before system directories. claude.exe loads USERENV.dll using the standard search order, so a malicious USERENV.dll dropped next to the binary is loaded first. The catch: claude.exe is signed by Anthropic, so the sideloaded code runs inside a process that other components trust.
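The search-order behaviour behind Stage 1 can be modelled in a few lines. This is a minimal Python sketch, not Windows loader code: `resolve_dll`, `demo`, and the directory layout are hypothetical, and the real loader applies extra rules (KnownDLLs, manifests, API-level overrides) that are ignored here. It only shows why a file placed in the application directory wins over the system copy.

```python
import tempfile
from pathlib import Path
from typing import List, Optional, Tuple


def resolve_dll(name: str, search_dirs: List[Path]) -> Optional[Path]:
    """Return the first file matching `name` (case-insensitively) in search order."""
    wanted = name.lower()
    for directory in search_dirs:
        if not directory.is_dir():
            continue
        for entry in sorted(directory.iterdir()):
            if entry.is_file() and entry.name.lower() == wanted:
                return entry
    return None


def demo() -> Tuple[Path, Path]:
    """Build a throwaway layout and resolve before and after planting a file."""
    root = Path(tempfile.mkdtemp())
    app_dir = root / "app"            # stands in for the install directory
    system_dir = root / "system32"    # stands in for the system directory
    app_dir.mkdir()
    system_dir.mkdir()
    (system_dir / "USERENV.dll").write_text("legitimate copy")

    order = [app_dir, system_dir]     # application directory is searched first
    before = resolve_dll("USERENV.dll", order)
    (app_dir / "USERENV.dll").write_text("planted copy")
    after = resolve_dll("userenv.dll", order)
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("before planting:", before.parent.name)
    print("after planting: ", after.parent.name)
```

Nothing about the system copy changes between the two lookups. The only thing that moves the result is a write to the application directory, which is the precondition Anthropic's threat-model argument turns on.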

Stage 2 — abuse the signed process to call the named pipe. The sideloaded code issues commands into the VM as if it were Anthropic’s own client. The Authenticode check passes, the bubblewrap namespace does not know the difference, and seccomp does not flag the call because it came from the right place.
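The gap in Stage 2 is easier to see as a toy model. The sketch below assumes the pipe's gate verifies only the signature of the connecting process's main executable, which the writeup implies but does not spell out. All names (`Module`, `Process`, `image_only_gate`, `all_modules_gate`) are hypothetical, and the stricter variant is there for contrast, not as a description of any fix.

```python
from dataclasses import dataclass, field
from typing import List, Optional

TRUSTED_PUBLISHER = "Anthropic"  # hypothetical publisher string for the model


@dataclass
class Module:
    name: str
    signer: Optional[str]  # None means the file carries no signature


@dataclass
class Process:
    image: Module                                  # the main executable
    loaded: List[Module] = field(default_factory=list)  # DLLs mapped into it


def image_only_gate(proc: Process) -> bool:
    """Accept the caller if its main executable is signed by the trusted publisher."""
    return proc.image.signer == TRUSTED_PUBLISHER


def all_modules_gate(proc: Process) -> bool:
    """Contrast case: also require every loaded module to carry some signature."""
    return image_only_gate(proc) and all(m.signer is not None for m in proc.loaded)


clean = Process(Module("claude.exe", "Anthropic"), [Module("USERENV.dll", "Microsoft")])
sideloaded = Process(Module("claude.exe", "Anthropic"), [Module("USERENV.dll", None)])

if __name__ == "__main__":
    for label, proc in (("clean", clean), ("sideloaded", sideloaded)):
        print(label, image_only_gate(proc), all_modules_gate(proc))
```

Under the image-only rule the clean and sideloaded processes are indistinguishable, because the signature describes the file on disk rather than the code that ends up running inside the process.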

Stage 3 — root in the VM, no egress filter. Once inside, commands execute as root and the per-session unprivileged user model is bypassed because the RPC channel is designed to run elevated actions. The inner process also reaches the network without the domain allowlist applying — the egress proxy is configured for normal user activity, not for trusted RPC traffic. The end state is full root code execution with arbitrary outbound connections, the worst possible outcome for a model whose pitch is that the inner VM cannot be escaped or used as a pivot.
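Why the allowlist stops mattering can be shown with an equally small model. How the real egress proxy decides is not described in the source, so everything here is an assumption for illustration: the allowlist entry, the function names, and the idea that enforcement depends on whether traffic is routed through the proxy at all.

```python
from typing import Iterable

ALLOWLIST = ("example.com",)  # hypothetical allowlist entry


def host_allowed(host: str, allowlist: Iterable[str] = ALLOWLIST) -> bool:
    """Allow an exact match or any subdomain of an allowlisted domain."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in allowlist)


def connection_permitted(host: str, routed_via_proxy: bool) -> bool:
    """The allowlist only constrains traffic that actually passes through the proxy."""
    return host_allowed(host) if routed_via_proxy else True


if __name__ == "__main__":
    print(connection_permitted("attacker.test", routed_via_proxy=True))
    print(connection_permitted("attacker.test", routed_via_proxy=False))
```

The policy itself is unchanged in both calls. What differs is whether the connection is subject to it, which matches the writeup's description of a proxy configured for normal user activity rather than for commands arriving over the trusted RPC path.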

A coding agent built the RPC client

One pointed detail in the Armadin writeup is how the RPC client was built. The team did not reverse-engineer the protocol by hand — they handed logs to a coding agent and let it fuzz parameters against the live named pipe, iterating until commands succeeded. The same category of agent Anthropic sells as a productivity product is also a capable offensive security tool when pointed at the vendor’s own infrastructure.

Anthropic’s “not a security issue” framing

Anthropic’s classification rests on one line of reasoning: by the time an attacker can sideload a DLL into claude.exe, they already have arbitrary code execution on the host, so the sandbox bypass is not a privilege escalation. In their threat model, the sandbox is a containment layer for the AI agent, not a security boundary against a compromised host.

That framing is defensible in isolation. It is also a significant narrowing of what “sandboxed” means in product marketing. A user installing Claude Cowork on Windows to keep agent actions contained is getting a thinner guarantee than the architecture diagram suggests — the boundary holds against the agent, not against any process that can write to the application’s install directory. For an enterprise buyer who chose the product specifically because the inner VM was supposed to be a hard wall, that distinction is the entire product.

What it means outside the US market

The exploit is Windows-specific, but the architectural pattern is not. Any AI agent product that wraps a coding model in a VM and connects the VM to the host via a signed-binary RPC channel is making the same bet. New Zealand enterprises evaluating Claude Cowork or its competitors — see our coverage of broader agent deployment risks and the Anthropic export-controls dispute — should read the Armadin disclosure as a structural warning. The sandbox will hold against the model. It will not hold against anything that can write a file to the application’s directory, which is a much wider threat surface than most procurement documents acknowledge.

🔍 THE BOTTOM LINE: Armadin produced a clean, reproducible root-in-VM exploit against Claude Cowork and Anthropic responded with a threat-model rebuttal rather than a patch. The technical work is solid, the classification is defensible, and the marketing claim is now narrower than it was a week ago. The right question for any buyer is no longer “can the model escape?” — it is “what does the sandbox actually contain, and against whom?”

❓ FAQ

Q: Is Claude Cowork currently vulnerable to this exploit? A: Armadin validated the chain against Claude Desktop for Windows version 1.9255.2.0 in March 2026 and published full details including proof-of-concept code. Anthropic has not released a patch because the company does not classify the issue as a security vulnerability.

Q: Why does Anthropic say this is not a security issue? A: Anthropic’s position is that the attack requires an attacker to already have code execution on the host, so the sandbox bypass is not a privilege escalation under their threat model. The sandbox is treated as a containment layer for the agent, not a security boundary against a compromised host.

Q: What is DLL sideloading in this context? A: It exploits Windows’ default DLL search order, which checks an application’s own directory before system directories. By placing a malicious USERENV.dll next to claude.exe, an attacker can hijack the process at load time — and because claude.exe is signed by Anthropic, the sideloaded code runs inside a process that other components trust.

Q: Did AI tooling help build the exploit? A: Yes. Armadin used a coding agent to read logs, fuzz RPC parameters, and iteratively construct the exploit client against the live named pipe — the same kind of agentic workflow that the targeted product is itself sold on.

Q: Should organisations using Claude Cowork on Windows be concerned? A: If your threat model assumes a trusted host and an untrusted agent, the sandbox still works as designed. If your threat model assumes the host may be compromised (the realistic case for any enterprise endpoint), then the Authenticode-gated RPC channel is a softer boundary than the architecture suggests, and host-level controls (application allowlisting, restrictive write permissions on the install directory, EDR) are doing the real work. Windows’ SafeDllSearchMode setting does not help here: it only moves the current working directory later in the search order, and the application’s own directory is still searched first.
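For defenders who want a quick first look, the two host-side questions raised above can be sketched as a small audit: does the install directory contain DLLs whose names shadow system DLLs, and can the current user create files there? This is a rough Python sketch with hypothetical function names and placeholder paths, not a substitute for EDR or a proper ACL review. Legitimate applications sometimes ship their own copies of DLLs, so a hit is a lead to investigate rather than proof of tampering.

```python
import tempfile
from pathlib import Path
from typing import List


def shadowing_dlls(app_dir: Path, system_dir: Path) -> List[str]:
    """Names of DLLs in app_dir that also exist in system_dir (case-insensitive)."""
    system_names = {p.name.lower() for p in system_dir.iterdir() if p.is_file()}
    return sorted(
        p.name
        for p in app_dir.iterdir()
        if p.is_file()
        and p.suffix.lower() == ".dll"
        and p.name.lower() in system_names
    )


def can_create_file(directory: Path) -> bool:
    """Probe writability by actually creating (and auto-deleting) a temp file."""
    try:
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder paths: point these at the real install and system directories.
    app = Path(r"C:\path\to\install\dir")
    system = Path(r"C:\Windows\System32")
    if app.is_dir() and system.is_dir():
        print("shadowing DLLs:", shadowing_dlls(app, system))
    print("current user can write to install dir:", can_create_file(app))
```

Run it as the ordinary user account rather than as an administrator, since the point is to learn what an unprivileged process on the endpoint could do.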