Anthropic Launches Claude Managed Agents in Public Beta

Building a production AI agent has always meant building two things: the agent itself, and all the infrastructure around it. Secure sandboxes, credential management, stateful sessions, error recovery, context engineering for long-horizon tasks, scaling under load. Months of backend work before a user sees a single feature. Claude Managed Agents, launched in public beta on April 8, 2026, is Anthropic's answer to that problem.

The pitch is direct: developers define what they want the agent to do — its tasks, tools, and guardrails. Anthropic runs everything else on its infrastructure. The result, Anthropic claims, is getting from prototype to production in days rather than months.

What Claude Managed Agents Actually Is

Claude Managed Agents is a suite of composable APIs available on the Claude Platform that provides managed, cloud-hosted infrastructure for running AI agents at scale. It targets developers and enterprise teams who need production-grade agent infrastructure without building secure execution environments, state management, or custom orchestration from scratch.

The core architecture is built around three decoupled components — a design Anthropic's engineering team calls "decoupling the brain from the hands." The session is an append-only durable log of everything that happened in an agent run. The harness is the loop that calls Claude and routes its tool calls to the relevant infrastructure. The sandbox is the execution environment where Claude runs code and edits files. Each component can fail or be replaced independently without disturbing the others — a deliberate departure from earlier agent architectures that bundled everything into a single container and created "pet" infrastructure that was fragile to debug and impossible to scale.

In the previous coupled design, if a container failed, the session was lost. If a container was unresponsive, engineers had to nurse it back to health with no clear visibility into whether the failure was in the harness, the event stream, or the container itself. The new architecture treats containers as cattle: if one dies, the harness catches the failure as a tool-call error, and a new container is reinitialized from a standard recipe. If the harness itself fails, it is rebooted via wake(sessionId), uses getSession(id) to retrieve the event log, and resumes from the last recorded event.
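The recovery behavior described above can be sketched in a few lines. This is an illustrative toy, not Anthropic's implementation: the Session, Sandbox, and call_tool names are hypothetical, the session store is an in-memory dict standing in for durable storage, and get_session and wake only mirror the getSession(id) and wake(sessionId) calls the article mentions.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Append-only log of events for one agent run (hypothetical shape)."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> None:
        self.events.append(event)


# In-memory stand-in for durable session storage (assumption for the sketch).
SESSIONS: dict = {}


def get_session(session_id: str) -> Session:
    """Mirrors the article's getSession(id): fetch the durable event log."""
    return SESSIONS[session_id]


class Sandbox:
    """Disposable execution environment that may become unreachable."""

    def __init__(self, healthy: bool = True):
        self.healthy = healthy

    def run(self, command: str) -> str:
        if not self.healthy:
            raise RuntimeError("container unreachable")
        return f"ran: {command}"


def call_tool(session: Session, sandbox: Sandbox, command: str) -> Sandbox:
    """Route one tool call; a dead container becomes a logged tool-call error."""
    try:
        output = sandbox.run(command)
        session.append({"type": "tool_result", "command": command, "output": output})
        return sandbox
    except RuntimeError as exc:
        session.append({"type": "tool_error", "command": command, "error": str(exc)})
        return Sandbox()  # replace the container from a standard recipe


def wake(session_id: str) -> int:
    """Mirrors the article's wake(sessionId): find where a new harness resumes."""
    session = get_session(session_id)
    return len(session.events)  # index of the next event to record
```

Because the log lives outside both the harness and the sandbox, losing either one leaves the record of the run intact, which is the property the decoupled design is after.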

What the Infrastructure Handles For You

Secure sandboxed code execution. Claude runs code in an isolated sandbox. Credentials — Git tokens, OAuth tokens for external services — are never reachable from the sandbox where Claude's generated code executes. Git repositories are cloned using access tokens during sandbox initialization and wired into the local git remote, so push and pull work without the agent ever handling the token directly. For MCP tools, OAuth tokens are stored in a secure vault outside the sandbox. Claude calls MCP tools through a dedicated proxy that fetches credentials from the vault and makes the external service call — the harness is never aware of the credentials at any point.
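A minimal sketch of the vault-and-proxy pattern, assuming a simple in-memory vault and a pluggable transport function. Vault, McpProxy, and fake_transport are hypothetical names invented for illustration and do not correspond to any published Anthropic or MCP API. The point is only that the caller supplies a service name and tool arguments, and the token is attached on the far side of the proxy.

```python
class Vault:
    """Holds OAuth tokens outside the sandbox (in-memory stand-in)."""

    def __init__(self, tokens: dict):
        self._tokens = dict(tokens)

    def token_for(self, service: str) -> str:
        return self._tokens[service]


class McpProxy:
    """Attaches credentials to outbound tool calls on behalf of the caller."""

    def __init__(self, vault: Vault, transport):
        self._vault = vault
        self._transport = transport

    def call(self, service: str, tool: str, arguments: dict) -> dict:
        # The token is fetched here and never returned to the caller.
        token = self._vault.token_for(service)
        return self._transport(service, tool, arguments, token)


def fake_transport(service: str, tool: str, arguments: dict, token: str) -> dict:
    """Pretend external service: reports that auth happened, never echoes the token."""
    return {
        "service": service,
        "tool": tool,
        "authorized": bool(token),
        "echo": arguments,
    }
```

Code running in the sandbox would only ever hold a reference to something like proxy.call, so even generated code that dumps its whole environment has no token to leak.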

Long-running stateful sessions. Agents can run autonomously for extended periods. Sessions persist through disconnections — outputs are durable even if the client drops the connection mid-task. This is a practical necessity for enterprise workflows that might run for hours across multiple tool calls. The session log serves as a context object that lives outside Claude's context window, allowing the harness to retrieve any slice of the event stream, rewind to a specific moment before an action, or re-read context before a decision. This separates durable storage from context management, so future improvements to context engineering can be made in the harness without changing the session interface.
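To make the separation between durable storage and context management concrete, here is a small sketch. SessionLog and build_context are hypothetical names, and the "keep the most recent events" policy is just one assumed example of a context strategy, not the one Anthropic uses.

```python
class SessionLog:
    """Append-only event store that lives outside the model's context window."""

    def __init__(self):
        self._events = []

    def append(self, event: str) -> int:
        """Record an event and return its position in the log."""
        self._events.append(event)
        return len(self._events) - 1

    def slice(self, start: int, stop: int) -> list:
        """Return any contiguous slice of the event stream."""
        return self._events[start:stop]

    def before(self, index: int) -> list:
        """Rewind view: everything recorded before the given position."""
        return self._events[:index]

    def __len__(self) -> int:
        return len(self._events)


def build_context(log: SessionLog, max_events: int) -> list:
    """One possible context policy: pass only the most recent events to the model."""
    start = max(0, len(log) - max_events)
    return log.slice(start, len(log))
```

Swapping build_context for a different policy (summaries, retrieval, selective rewind) would not touch SessionLog at all, which is the sense in which the session interface can stay fixed while context engineering changes.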

Error recovery and orchestration. The harness manages when to call tools, how to manage context across a long task, and how to recover from errors — including errors in the tools themselves. Developers do not need to write their own retry logic, context window management, or failure detection.

Multi-agent coordination (research preview). Agents can spin up and direct other agents to parallelize complex work. This feature is in research preview, a step behind the public beta, and Anthropic flags clearly that meaningful instability should be expected. Teams should hold off on building production dependencies on it until it stabilizes.

Self-evaluation (research preview). Developers can define success criteria, and Claude iterates toward meeting them. This is useful for tasks where quality requires judgment rather than binary pass/fail tests. This feature is also in research preview.

Session tracing in the Claude Console. Every agent run is viewable in the Claude Console. Developers can inspect the full event stream, see which tools were called, review intermediate outputs, and debug failures without needing shell access to the underlying containers.

MCP integration. The harness natively supports the Model Context Protocol for connecting agents to external tools and data sources. Given that MCP crossed 97 million installs in March 2026 and has become the default mechanism for agent tool connectivity across the industry, first-class MCP support is a significant practical advantage.

How to Define an Agent

Developers define agents either by describing the agent's behavior in natural language or through a YAML configuration file. The configuration specifies the agent's tasks, the tools it has access to, the guardrails that constrain its behavior, and the MCP servers it can connect to. Anthropic's infrastructure handles the rest — tool orchestration, context management, sandbox provisioning, and error recovery.

All Managed Agents API endpoints require the managed-agents-2026-04-01 beta header. The SDK sets this header automatically. Behaviors may be refined between releases as the beta evolves. Certain features — outcomes, multi-agent coordination, and memory — are in research preview and require separate access requests.
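For teams calling the API without the SDK, the header has to be set by hand. The sketch below assumes Managed Agents follows the same convention as other Claude API betas, where beta flags travel in an anthropic-beta header as a comma-separated list alongside x-api-key and anthropic-version. Only the managed-agents-2026-04-01 value comes from the article; the build_headers helper is hypothetical, and no endpoint is shown because the article does not name one.

```python
BETA_FLAG = "managed-agents-2026-04-01"  # beta identifier quoted in the article


def build_headers(api_key: str, extra_betas: tuple = ()) -> dict:
    """Assemble request headers, assuming the usual Claude API header names."""
    betas = (BETA_FLAG, *extra_betas)
    return {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        # Assumption: multiple beta flags are sent as a comma-separated list.
        "anthropic-beta": ",".join(betas),
        "content-type": "application/json",
    }
```

Check the official documentation for the exact header requirements before relying on this, since behaviors may be refined during the beta.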

Who Is Already Building With It

Anthropic shared early adopter use cases at launch:

Notion deployed Claude directly into workspaces through Custom Agents, currently in private alpha. Engineers can ship code while knowledge workers generate presentations and websites — dozens of parallel tasks while teams collaborate on outputs — all without leaving their workspace.

Rakuten deployed specialist agents across product, sales, marketing, finance, and HR departments within approximately a week per deployment. The agents plug into Slack and Teams, accept task assignments from employees, and return deliverables including spreadsheets and slide decks.

Asana built what they call AI Teammates — agents that work alongside humans inside project management workflows, picking up tasks and drafting deliverables. Asana CTO Amritansh Raghav said the product helped his team add advanced capabilities dramatically faster than previous approaches allowed, freeing engineers to focus on user experience rather than infrastructure.

Sentry paired their existing Seer debugging agent with a Claude-powered counterpart that writes patches and opens pull requests. Developers can go from a flagged bug to a reviewable code fix in a single automated flow.

Performance Claims and What They Mean

Anthropic claims up to 10x faster development time for teams moving from prototype to production. In internal tests, Managed Agents improved structured file generation success rates by up to 10 points over standard prompting methods. Early beta tester feedback cited in industry coverage suggests development cycles reduced by up to 80 percent in some cases.

The 10x speed claim specifically refers to the infrastructure work that previously blocked production deployment — the months spent building sandboxes, managing credentials, handling long-running tasks, and reworking everything each time the underlying model was updated. Teams that previously spent that time on plumbing can now focus on what their agent actually does.

Whether those claims hold across different use cases will become clearer as the public beta expands. Anthropic is transparent that the beta is still being refined — behaviors may change between releases.

Pricing

Claude Managed Agents is priced on consumption. Standard Claude token rates apply for model usage. On top of that, active runtime is billed at $0.08 per session-hour — charged only while the agent is actively running, not for idle time between sessions.

The pricing is positioned for enterprise use rather than for individual developers or small teams. A complex agent running for 10 hours across a week costs $0.80 in session fees on top of token costs. That is manageable for business workflows, but the fee scales with every concurrent session, so it becomes an important factor for high-volume or long-running consumer applications. Anthropic has framed this explicitly as an enterprise pitch aimed at teams that prioritize time-to-production over price.
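The session-fee arithmetic is easy to check with a short calculation. The $0.08 rate comes from the article; the function names and the fleet example are illustrative assumptions, and token costs are passed in as a separate figure because they depend on the model and usage.

```python
SESSION_HOUR_RATE_USD = 0.08  # active-runtime rate quoted in the article


def session_fee(active_hours: float, rate: float = SESSION_HOUR_RATE_USD) -> float:
    """Runtime fee for active session-hours only; idle time is not billed."""
    return round(active_hours * rate, 2)


def total_cost(active_hours: float, token_cost_usd: float) -> float:
    """Runtime fee plus whatever the model usage cost at standard token rates."""
    return round(session_fee(active_hours) + token_cost_usd, 2)


def fleet_fee(sessions: int, hours_per_session: float) -> float:
    """Runtime fee across many sessions, to show how the fee scales with volume."""
    return round(sessions * hours_per_session * SESSION_HOUR_RATE_USD, 2)
```

Under these assumptions, session_fee(10) reproduces the article's $0.80 figure, while fleet_fee(1000, 10) shows the same workload across a thousand sessions reaching $800 in runtime fees before any token costs.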

What This Means for the Agent Market

Until this launch, Anthropic's approach to agents was primarily model-level — providing Claude as the intelligence layer that developers built their own agent infrastructure around. Outside of Claude Code and Cowork for end users, Anthropic had not made its own infrastructure available for running third-party agents. Claude Managed Agents changes that positioning significantly.

The competitive landscape now includes OpenAI's agent frameworks, Google's Gemini agent tooling on Google Cloud, Salesforce AgentForce, and Microsoft Copilot Studio — all providing managed infrastructure for building and deploying agents. Anthropic's differentiation is its safety-first positioning, the architectural decisions around credential isolation and session durability, and the direct integration with MCP as the standard agent connectivity protocol.

The engineering philosophy Anthropic has applied — decoupling session, harness, and sandbox so that each can evolve independently — is also notable. The explicit goal is that the interfaces outlast any particular implementation. As Claude models improve, assumptions baked into harnesses go stale. Anthropic's example is instructive: Claude Sonnet 4.5 would wrap up tasks prematurely as it sensed its context limit approaching — a behavior they addressed with context resets in the harness. When they used the same harness with Claude Opus 4.5, the behavior was gone and the resets had become dead weight. By separating the interfaces from the implementations, Managed Agents is designed to absorb those model improvements without requiring developers to rewrite their agent infrastructure every time a new Claude version ships.

Claude Managed Agents is available now at platform.claude.com. Documentation is at platform.claude.com/docs/en/managed-agents/overview.
