Anthropic Claude Vulnerability Exposes Cowork AI to Data Exfiltration via Prompt Injection

Published
Written by:
Lore Apostol
Lore Apostol
Cybersecurity Writer
Key Takeaways
  • Critical Flaw: A vulnerability in Anthropic’s Cowork AI permits attackers to exfiltrate files via prompt injection without additional user approval
  • API Exploitation: The attack vector leverages the Files API, allowing a malicious document to trick the AI into transmitting sensitive local data to an external account.
  • Vendor Response: Anthropic acknowledged the risk and is deploying updates to the Cowork virtual machine to restrict file access.

An unremediated Anthropic Claude bug has been identified extending to the company's newly released Cowork productivity tool, which automates office tasks by scanning local files. This flaw allows threat actors to utilize hidden instructions within a document to manipulate the AI's behavior. 

This incident highlights the escalating severity of AI cybersecurity risks associated with Large Language Model (LLM) integration in enterprise environments. 

Persistent Prompt Injection Attack Risks

Security firm PromptArmor disclosed that once the AI analyzes a compromised file, it can be coerced into uploading sensitive user data to an attacker-controlled API key, effectively bypassing standard user authorization protocols once initial access is granted.

This attack leverages the Anthropic API allowlisting to achieve data egress from Claude's VM environment | Source: PromptArmor
This attack leverages the Anthropic API allowlisting to achieve data egress from Claude's VM environment | Source: PromptArmor

A specific attack chain mirrors a files API exfiltration method previously reported by researcher Johann Rehberger in October regarding Claude Code that was not fixed. The PromptArmor analysis shows that a user uploads a .docx file that appears to be a Claude Skill file (similar to Markdown).

The 'skill' document uploaded by the user contains a concealed prompt injection | Source: PromptArmor
The 'skill' document uploaded by the user contains a concealed prompt injection | Source: PromptArmor

The core issue with the Cowork AI vulnerability lies in prompt injection attack vectors, where the AI interprets malicious data as executable instructions. Injection sources include, but are not limited to, web data from Claude for Chrome, connected MCP servers, and other sources.

Example of Opus 4.5 exfiltrating customer records to the attacker's Anthropic account | Source: PromptArmor
Example of Opus 4.5 exfiltrating customer records to the attacker's Anthropic account | Source: PromptArmor

The analysis exploit was demonstrated against Claude Haiku, but the researchers also tested Claude Opus 4.5, known to be more resilient against injections, which was ultimately “successfully manipulated via indirect prompt injection to leverage the same file upload vulnerability to exfiltrate data.

Researchers also noted that attackers could perform denial-of-service (DOS) attacks via malformed files. Security experts say such risks are amplified when agentic AI tools are used by non-technical users who may struggle to recognize or mitigate abnormal behavior.

Dr. Margaret Cunningham, Vice President of Security and AI Strategy at Darktrace, said technical literacy plays a significant role in safely using agentic AI tools.

“People with strong technical literacy know how to do things like sandbox agents or avoid risky connectors, and they may be more likely to recognize dangerous patterns. Non-technical users don't usually understand how to do this, and CoWork is targeted at non-technical or less technical users.”

“In my opinion, this is an example of where we can see the capability gap illustrated such that those with high AI literacy are protected and those without it are disproportionately exposed to risks,” Cunningham added.

Gal Moyal, from the CTO Office at Noma Security, said integrating Cowork with broader Claude capabilities increases exposure if guardrails are insufficient. "It’s not only my local drive which I provide access to, but also all the integrations are now at risk for sensitive data exfiltration, data removal or alteration, and sending emails or publishing posts under your name."

Moyal said. “Without proper guardrails, your identity which you have delegated to Claude can be used for anything.”

Remediation and Security Posture

While Anthropic initially emphasized user responsibility in monitoring AI interactions, it has confirmed plans to update the Cowork virtual machine (VM) to minimize the platform's unrestricted access to sensitive directories and files and improve its interaction with the vulnerable API.

Because Cowork is marketed towards general office workers rather than technical developers, the reliance on user vigilance to detect "suspicious actions" presents a substantial security gap in the tool's deployment.

As a precaution, users are advised to exercise caution when configuring Connectors.

In November, researchers noticed that the Chinese state-sponsored GTG-1002 leveraged Claude AI for cyberespionage targeting tens of organizations. This month, a reported IBM Bob prompt injection vulnerability allowed researchers to bypass the AI agent's security measures.


For a better user experience we recommend using a more modern browser. We support the latest version of the following browsers: For a better user experience we recommend using the latest version of the following browsers: