Policy

AI Agents Store Credentials in the Same Box as Untrusted Code. Two New Architectures Show How to Fix It.

A new VentureBeat analysis examines two emerging architectural patterns — from Anthropic and NVIDIA's NemoClaw — that attempt to isolate AI agent credentials from the untrusted code agents execute, addressing what security researchers describe as one of the most underappreciated risks in enterprise agentic deployments.

D.O.T.S AI Newsroom

AI News Desk

4 min read
Enterprise AI agents have a structural security problem that has received surprisingly little public attention relative to its severity: in most current deployments, the credentials an agent uses to take actions — API keys, OAuth tokens, database passwords, cloud IAM roles — are stored or accessible in the same runtime environment where the agent processes and executes arbitrary, often untrusted code. This co-location creates what security researchers call a "blast radius" problem: if an agent's execution environment is compromised through prompt injection, malicious tool output, or adversarial input, the attacker gains access not just to the agent's current context but to its full suite of credentials. VentureBeat's analysis of two emerging architectural responses to this problem — one from Anthropic, one from NVIDIA's NemoClaw team — illuminates how the industry is beginning to think more rigorously about agent-native security design.

The Credential Co-location Problem

The problem is architectural rather than implementation-specific. When an agent is given access to tools — a web browser, a code interpreter, a database, an email client — it typically receives those tools' credentials at initialization. Those credentials live in the agent's memory or in environment variables accessible to the agent's runtime throughout the task session. If the agent is manipulated into executing malicious code (through prompt injection in a webpage it browses, for instance) or if its tool outputs contain adversarial instructions that alter its behavior, the attacker's payload has direct access to the same credential store the agent uses for legitimate operations. The attack surface is large and the boundary between trusted agent behavior and attacker-controlled agent behavior is not architecturally enforced.
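The anti-pattern described above can be made concrete with a minimal sketch. Everything here is illustrative — the variable names, the `run_generated_code` helper, and the credential values are hypothetical, not drawn from any real agent framework — but it shows why co-location fails: the model-generated code runs in the same process that holds the agent's secrets, so an injected payload needs no exploit at all, only a dictionary read.

```python
import os

# Illustrative sketch of the co-location anti-pattern: credentials are
# loaded into the same process environment that later executes untrusted,
# model-generated code. All names and values here are hypothetical.
os.environ["DB_PASSWORD"] = "s3cret"  # loaded once at agent startup

def run_generated_code(snippet: str) -> dict:
    """Execute model-produced code in the agent's own process.

    There is no isolation boundary, so an injected payload inherits
    every credential the agent holds via a plain os.environ lookup.
    """
    namespace = {"os": os}
    exec(snippet, namespace)
    return namespace

# A prompt-injected payload does not need to break anything --
# it simply reads the environment the agent shares with it:
ns = run_generated_code("stolen = os.environ['DB_PASSWORD']")
print(ns["stolen"])  # the attacker now holds the agent's credential
```

The point of the sketch is that nothing in the runtime distinguishes the agent's legitimate tool use from the injected read: both execute with identical access.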

Two Architectural Responses

Anthropic's approach involves a credential proxy layer that sits between the agent's reasoning process and its tool execution environment. Rather than giving the agent direct access to credentials, the proxy mediates all tool calls — the agent requests an action, the proxy validates it against a policy specification, and only then presents the credential to the underlying tool. The agent never has direct credential access; it has access to an action interface that the proxy controls. NVIDIA's NemoClaw takes a complementary approach focused on runtime isolation: each tool call is executed in an ephemeral sandbox that receives only the minimum credential scope necessary for that specific operation, with the sandbox destroyed after execution and credentials rotated before the next call. Neither approach is a complete solution — sophisticated prompt injection can still manipulate agent behavior within whatever action space the policy allows — but both substantially reduce the blast radius of a successful compromise.
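A minimal sketch can illustrate the proxy pattern described above. This is not Anthropic's actual implementation — the `Policy` and `CredentialProxy` classes, the allow-list shape, and the demo credential are all assumptions for illustration — but it captures the core property: credentials stay inside the proxy, every requested action is checked against a policy first, and the agent only ever sees an action interface.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Policy:
    """Allow-list of (tool, action) pairs the agent may request.

    Hypothetical policy shape, chosen for illustration only.
    """
    allowed: frozenset

class CredentialProxy:
    """Mediates tool calls so the agent never touches raw credentials.

    Sketch of the proxy pattern, not any vendor's real design. The
    credential store is private to the proxy; the agent interacts only
    through invoke(), and secrets are injected at call time.
    """
    def __init__(self, policy: Policy, credentials: dict):
        self._policy = policy
        self._credentials = credentials  # never exposed to the agent

    def invoke(self, tool: str, action: str, call: Callable, **kwargs):
        # Policy check happens before any credential is even looked up.
        if (tool, action) not in self._policy.allowed:
            raise PermissionError(f"{tool}.{action} denied by policy")
        # Credential is scoped to this one tool and this one call.
        return call(credential=self._credentials[tool], **kwargs)

# Usage: the agent requests an action; it never holds the API key.
def send_email(credential: str, to: str) -> str:
    return f"sent to {to} using key ending {credential[-4:]}"

proxy = CredentialProxy(
    Policy(frozenset({("email", "send")})),
    {"email": "sk-demo-1234"},
)
print(proxy.invoke("email", "send", send_email, to="ops@example.com"))
```

A disallowed pair — say, `("db", "drop")` — raises `PermissionError` before any credential is touched, which is the architectural point: a prompt-injected agent can still request actions, but only within the action space the policy permits, and it can never exfiltrate the credentials themselves.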

Why This Matters Now

The urgency of this architectural work is increasing as enterprise AI agents take on higher-stakes tasks. An agent that summarizes documents has limited blast radius even if compromised. An agent that can send emails, make purchases, provision cloud resources, or modify production databases has a blast radius that rivals a compromised human employee with the same permissions. The security industry has decades of experience designing privilege-separation architectures for human users; applying those principles to AI agents requires rethinking fundamental assumptions about how credentials are scoped, stored, and audited in agentic workflows. The two architectures described by VentureBeat are early attempts to do that rethinking at the infrastructure level — a necessary step before enterprise AI agents can be deployed at the scale and permission level that would make them genuinely transformative.

Related Stories

Musk Updates His OpenAI Lawsuit to Route Any $150 Billion Damages Award to the Nonprofit Foundation
Policy

Elon Musk has amended his lawsuit against OpenAI with a strategic addition: any damages recovered — potentially up to $150 billion — should be redirected to OpenAI's nonprofit foundation rather than awarded to Musk personally. The update reframes the litigation from a personal grievance into a structural argument about OpenAI's obligations to its original charitable mission.

D.O.T.S AI Newsroom
OpenAI's Child Safety Blueprint Confronts AI's Role in the Surge of Child Sexual Exploitation
Policy

OpenAI has released a Child Safety Blueprint outlining its approach to detecting, preventing, and reporting AI-generated child sexual abuse material. The document arrives as law enforcement agencies globally report a sharp increase in CSAM volume, with AI tools enabling the production of synthetic material at scale. It is the company's most detailed public statement on the problem it helped create.

D.O.T.S AI Newsroom
Anthropic's Claude Mythos Found Thousands of Zero-Days — So They're Not Releasing It
Policy

Anthropic has quietly restricted its most capable new model, Claude Mythos, after the system autonomously discovered thousands of critical vulnerabilities in major operating systems and browsers — including a 27-year-old OpenBSD bug and a 16-year-old FFmpeg flaw. The model is being deployed exclusively through Project Glasswing with 11 vetted security partners. It is the most concrete case yet of an AI lab withholding a model because of genuinely demonstrated risk.

D.O.T.S AI Newsroom