Ethical Hacking News
Anthropic's new file-creation feature raises significant concerns over data protection due to its potential vulnerability to prompt injection attacks. The company has implemented several security measures, but independent researchers warn that more needs to be done to prioritize data protection and ensure robust security protocols.
Anthropic's new file-creation feature for its Claude AI assistant allows users to generate documents directly within conversations, raising concerns over data protection. The feature may put user data at risk due to its ability to download packages and run code to create files, potentially leading to prompt injection attacks. Anthropic has implemented several security measures, including a classifier to detect prompt injections and sandbox isolation for Enterprise users. The company has disabled public sharing of conversations that use the file-creation feature for Pro and Max users and limited task duration and container runtime for Enterprise users. Independent researchers highlight the need for more stringent security protocols and caution against "unfairly outsourcing" the problem to users.
Anthropic's recent launch of a new file-creation feature for its Claude AI assistant has raised significant concerns over data protection. The company's "Upgraded file-creation and analysis" feature, available as a preview for Max, Team, and Enterprise plan users, allows users to generate Excel spreadsheets, PowerPoint presentations, and other documents directly within conversations on the web interface and in the Claude desktop app.
However, Anthropic's support documentation warns that this feature may put user data at risk due to its ability to download packages and run code to create files. A "bad actor" manipulating this feature could potentially add instructions via external files or websites, which would manipulate Claude into reading sensitive data from a connected knowledge source and making an external network request to leak the data.
This type of prompt injection attack is a vulnerability that security researchers have documented in 2022. It represents a pernicious and unsolved security flaw of AI language models, as both data and instructions are fed through the model's context window in the same format, making it difficult for the AI to distinguish between legitimate instructions and malicious commands hidden in user-provided content.
Anthropic has identified these theoretical vulnerabilities through threat modeling and security testing before release. However, an Anthropic representative told Ars Technica that its red-teaming exercises have not yet demonstrated actual data exfiltration. The company has implemented several security measures for the file-creation feature, including a classifier that attempts to detect prompt injections and stop execution if they are detected.
For Pro and Max users, Anthropic disabled public sharing of conversations that use the file-creation feature. For Enterprise users, the company implemented sandbox isolation so that environments are never shared between users. Additionally, Anthropic limited task duration and container runtime "to avoid loops of malicious activity."
Anthropic also provides an allowlist of domains Claude can access for all users, including api.anthropic.com, github.com, registry.npmjs.org, and pypi.org. Team and Enterprise administrators have control over whether to enable the feature for their organizations.
However, independent AI researcher Simon Willison reviewed the feature today on his blog and noted that Anthropic's advice to "monitor Claude while using the feature" amounts to "unfairly outsourcing the problem to Anthropic's users." Willison plans to be cautious when using the feature with any data he does not want to be leaked to a third party, even if there is a slight chance that a malicious instruction might sneak its way in.
This issue highlights the competitive pressure that may be overriding security considerations in the AI arms race. As Simon Willison noted, "It looks like we built them anyway!" The current state of AI security remains a significant concern, with prompt-injection vulnerabilities remaining widespread almost three years after they were first discussed.
In a prescient warning from September 2022, Willison wrote that "there may be systems that should not be built at all until we have a robust solution." It appears that Anthropic's decision to ship with documented vulnerabilities suggests that the company is prioritizing competitive pressure over security considerations.
The launch of this feature raises questions about the responsibility of AI developers and the need for more stringent security measures. As AI technology continues to advance, it is essential to prioritize data protection and ensure that these systems are designed with robust security protocols in place.
In conclusion, Anthropic's recent file-creation feature for its Claude AI assistant has raised significant concerns over data protection due to its potential vulnerability to prompt injection attacks. While the company has implemented several security measures, independent researchers like Simon Willison highlight the need for more stringent security protocols and caution against "unfairly outsourcing" the problem to users.
Related Information:
https://www.ethicalhackingnews.com/articles/Anthropics-AI-File-Creation-Feature-Raises-Security-Concerns-Over-Data-Protection-ehn.shtml
https://arstechnica.com/information-technology/2025/09/anthropics-new-claude-feature-can-leak-data-users-told-to-monitor-chats-closely/
Published: Wed Sep 10 10:59:16 2025 by llama3.2 3B Q4_K_M