How sandboxed execution isolates agents

The container that bounds every agent run in Disco Parrot. A disposable container per run, a gateway every action crosses, a working directory the agent cannot escape, hardening that drops privilege, and the reason the boundary is the container rather than the agent's good behavior.

When an agent in Disco Parrot runs a command, edits a file, or reaches a tool, it does so inside a container built for that one run. The container is the security boundary. It is the answer to the question a security team asks about any system that lets software act: what stops it from reaching past the task it was given? Here, the stop is not a rule the agent agrees to follow. It is a wall the agent runs inside and cannot see past.

This page is the full account of that boundary: the container per run, the gateway every action crosses, the working directory the agent stays inside, the hardening underneath, and the reason the whole thing is built around containing the agent rather than trusting it. The sandbox concept is the everyday version; this is the version for the person who has to vouch for it. It is one of six independent layers in the security overview; the others are why a slip here would still be contained.

A container for every run

Every unit of agent work gets its own container. A chat turn runs in one, a flow step in one, a background task in one, and no two runs share. What this protects is the thing a security team checks first: one workspace's work does not land in another workspace's container, and two conversations never cross. Each container is reserved for one specific piece of work, named by an identifier unique and unguessable enough that nothing else can be routed to it, and the durable record that ties a run to its container is partitioned per workspace, so the lookup that reconnects a run to its container stays inside one workspace. Even two workspaces that happen to pick the same internal name for a conversation stay in separate containers.

The container is disposable. It is created when the work starts and torn down when it ends, so a run does not inherit the leftovers of an unrelated one. When a run pauses for idleness and later resumes, the platform reconnects to the same container only when its key still matches, which it judges against a closed list: the profile, the template, the image tag, the host, and the runtime-identity fingerprint behind the run. If any of those has changed, the old container is destroyed and a fresh one takes its place. What survives a teardown is the durable record of the session and any work the agent committed to a branch, never the running container itself.
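The resume rule above can be sketched as a comparison over the closed list of key fields. This is a minimal illustration, not the platform's implementation: the `ContainerKey` class, its field names, and the `resume_decision` helper are hypothetical names chosen to mirror the five items the text lists.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ContainerKey:
    """Hypothetical key: one field per item in the closed list from the text."""
    profile: str
    template: str
    image_tag: str
    host: str
    identity_fingerprint: str


def changed_fields(stored: ContainerKey, current: ContainerKey) -> List[str]:
    """Names of the key fields that differ between the stored and current key."""
    return [
        f.name
        for f in fields(ContainerKey)
        if getattr(stored, f.name) != getattr(current, f.name)
    ]


def resume_decision(stored: ContainerKey, current: ContainerKey) -> str:
    """'reconnect' only when every field matches; otherwise 'replace'."""
    return "replace" if changed_fields(stored, current) else "reconnect"
```

Because the list is closed, a change in any single field (a new image tag, for example) is enough to choose a fresh container over the old one.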

Every unit of agent work gets its own disposable container, named for the exact work it holds. Two workspaces and two conversations never share one, and the container is created for the run and torn down after it.

The effect is that the unit of isolation is the unit of work. There is no long-lived shared machine that many runs pass through and might leave traces on. Each run is born into a clean container, does its work, and the container goes away.

When the run ends, the container is gone

A run does not accumulate. When the work finishes, or the run sits idle long enough, or you destroy it from the Sandboxes page, the container and everything written inside it are removed together. The agent's scratch files, anything it cloned, anything it wrote to /workspace that it did not commit, all of it goes when the container goes. On the managed and Kubernetes hosts the working disk is allocated fresh for the run and gone with it, so nothing is left behind for a later run to find.

Containers are born, do work, and end. A run is created, runs, and pauses when idle, resuming only if its key still matches. When the work is done, or the profile, image, host, or identity behind it changed, the container is destroyed. Only the session transcript and any committed branch outlive it; the disk is gone, and a new run starts clean.

What you keep is what you chose to keep. The durable record of the session, the transcript of every message, tool call, and file edit, is written to the platform's audit store rather than to the container, so it survives the teardown for review. Work the agent meant to land is a commit it pushed to a branch under your identity, which lives in your repository, not in the container. Everything else is ephemeral on purpose: the container is a place to do work, not a place to store it.

One choice makes a workspace deliberately durable, and it is yours to make. A profile on a Local Docker host can keep its /workspace in a reused volume across runs, under a retention window, so a developer who wants a warm checkout between sessions has one; where that is turned on, the persistence is the point, and everywhere else the disk is ephemeral. That option lives on sandbox profiles.

Credentials never enter this picture as something to wipe. A run that needs a key receives a short-lived lease at the moment it acts; for a runtime that needs the credential as a file, it is written to a scoped path readable only by that run, used, and deleted, never baked into the image or left for the next run. The Credentials and the secret policy page covers this in full; the point here is that the thing a security review worries about finding on a disposed disk was scoped to the moment and removed after it.
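The write, use, delete lifecycle for a file-based credential might look like the sketch below. It assumes a POSIX host; the `scoped_credential_file` helper and the run directory are hypothetical, and the real platform may scope and remove the file differently.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def scoped_credential_file(secret: bytes, run_dir: str):
    """Write a secret to an owner-only file, yield its path, then delete it.

    Hypothetical helper: run_dir stands in for the run's scoped path.
    """
    # mkstemp creates the file readable and writable only by the creating user.
    fd, path = tempfile.mkstemp(dir=run_dir)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(secret)
        yield path
    finally:
        # Remove the file even if the caller raised while using it.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
```

The `finally` clause is the part that matters for a review: the file is removed whether the work that used it succeeded or failed.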

Everything goes through the gateway

Inside the container, the agent does not reach the outside world directly. Alongside it runs a small platform-owned service, a sidecar, and every action that leaves the agent's own process (reading a file, running a command, calling a tool, returning a result) crosses that sidecar. The sidecar is the one door, and it is locked.

What this door protects against is one run reaching into another. It is locked with a per-container key: a fresh secret the platform issues to one container alone, so a request carrying another run's token, or no token, is refused. A token taken from one run is worthless against any other container, which is the property that keeps runs from ever using each other's gateway. The routes that answer without the key are the harmless ones: a health check, and the preview of the app the agent is running, which is the agent's own output rather than a privileged route. Anything that moves a file, runs a command, or reaches a tool always demands the key. On the managed and customer-hosted operator hosts the key is always issued at launch, so this is the live behavior of a real run, not a setting to remember.
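As an illustration of the per-container key, the sketch below issues one secret per gateway instance and refuses any privileged route that does not present it. The `Gateway` class and the route names are hypothetical; only the behavior (keyless health check and preview, key required everywhere else) comes from the text.

```python
import hmac
import secrets
from typing import Optional

# Hypothetical route names for the two keyless routes the text describes.
OPEN_ROUTES = {"/health", "/preview"}


class Gateway:
    """One instance per container, holding a secret issued to it alone."""

    def __init__(self) -> None:
        self._token = secrets.token_urlsafe(32)

    @property
    def token(self) -> str:
        return self._token

    def authorize(self, route: str, presented: Optional[str]) -> bool:
        """Allow keyless routes; everything else must present this container's key."""
        if route in OPEN_ROUTES:
            return True
        if presented is None:
            return False
        # Constant-time comparison, so timing does not leak how much matched.
        return hmac.compare_digest(presented.encode(), self._token.encode())
```

A token copied from one run fails against any other gateway because each instance compares only against its own secret.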

Inside the container the agent reaches the outside only through a platform-owned sidecar, locked with a key minted for that one container. A token from one run is useless against another, and the control plane sits on a network the container cannot cross.

The result is that a run cannot act on the platform, and cannot act on another run. The agent's only path outward is the sidecar, and a sidecar answers only to its own container's key, so a request to another container's gateway is refused without that container's secret, even when the two containers can see each other on the network. The platform's own control plane and data stores are out of reach the same way: a container holds no credential for them, and the platform's interfaces answer only to authenticated callers, which a sandbox is not. On the managed Docker host the gateway is bound to the local loopback, so it is not exposed on any outside interface to begin with. An agent can reach its workspace and its gateway, and the gateway mediates the tools and destinations its task is given; reaching the platform's own systems, or another run, takes a credential a run does not have.

The agent stays in its workspace

What this guarantees is that your repository and the files of the task are the only filesystem the agent can touch, and nothing on the host or in another user's space is reachable from inside the run. The agent's working directory is /workspace, and it cannot get out of it. Every file path the agent hands to the gateway is checked twice before anything happens. First the path is resolved and measured against the workspace root, and a path that points outside is refused with a plain "Path escapes workspace." Then the real location is resolved through any symbolic links and checked again, so a link that tries to point out of the workspace is caught at the destination rather than the name. A path also has to match a strict shape, with none of the parent-directory hops or malformed forms that escape attempts usually rely on, so the common tricks for slipping past a path check never get the chance to run.

Every file path is checked twice before anything happens: resolved and measured against the workspace root, then re-checked through any symbolic links at its real destination. A path that points outside is refused, so the agent works in its workspace and the rest of the filesystem is not addressable.

The point for you is concrete: an agent cannot read /etc, cannot wander into another user's home directory, cannot reach the host's filesystem through the container. It works in the directory it was given, with your repository and the files of its task, and the rest of the filesystem is not addressable from where it stands.
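The two checks and the shape rule described in this section can be sketched as follows. This is a sketch under stated assumptions, not the gateway's code: the `check_path` function, the `PathEscapesWorkspace` exception, and the exact shape rule (no empty paths, no NUL bytes, no `..` segments) are illustrative; only the refusal message is taken from the text. It assumes a POSIX filesystem.

```python
import os


class PathEscapesWorkspace(Exception):
    """Raised when a path resolves outside the workspace root."""


def _inside(root: str, candidate: str) -> bool:
    # True when candidate is the root itself or sits beneath it.
    return os.path.commonpath([root, candidate]) == root


def check_path(user_path: str, root: str = "/workspace") -> str:
    """Return the real path if it stays inside root; raise otherwise."""
    root_real = os.path.realpath(root)

    # Shape rule (assumed): reject empty paths, NUL bytes, and parent hops.
    if not user_path or "\x00" in user_path or ".." in user_path.split("/"):
        raise ValueError("Malformed path.")

    # Check 1: resolve the name lexically and measure it against the root.
    lexical = os.path.normpath(os.path.join(root_real, user_path))
    if not _inside(root_real, lexical):
        raise PathEscapesWorkspace("Path escapes workspace.")

    # Check 2: follow symbolic links and measure the real destination.
    real = os.path.realpath(lexical)
    if not _inside(root_real, real):
        raise PathEscapesWorkspace("Path escapes workspace.")

    return real
```

The second check is what catches a link that lives inside the workspace but points out of it: the name passes the lexical check, and the destination does not.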

Opening a sandbox in your IDE

A developer can attach a live sandbox to their own editor to watch a run or pair with it, and that handoff keeps the same boundary rather than reaching around it. Opening a session mints a signed ticket bound to one sandbox, one person, and one workspace, with a short window to connect and a session that times out on its own. Whoever opens it picks inspect or write up front, so a read-only look stays read-only, and the choice is recorded. The connection answers only to that session's own token, which is a separate credential from the one the gateway uses for the agent's own calls: a tunnel request without it, or with another session's, is refused the same way the gateway refuses a stranger.

What this protects is the obvious worry. Attaching an editor does not hand a person, or anything riding along with their editor, a wider reach than the run already has: the session lands inside the same /workspace, behind the same sidecar, under the same hardening. And because opening the session is written to the audit trail, the question "who attached to this run, when, and could they write" has an answer on the record rather than in memory. The everyday version of attaching is covered on the Open a sandbox in your IDE page; here the point is that the door has the same lock as every other one.
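One common way to build a ticket bound to a sandbox, a person, a workspace, and a mode is to sign those claims with a server-side key. The sketch below uses HMAC-SHA256 for that; the text does not say which signing scheme the platform uses, so the scheme, the claim names, and the `mint_ticket` and `verify_ticket` helpers are all assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def mint_ticket(signing_key: bytes, sandbox: str, person: str, workspace: str,
                mode: str, ttl_seconds: float = 60.0,
                now: Optional[float] = None) -> str:
    """Sign claims binding the ticket to one sandbox, person, and workspace."""
    if mode not in ("inspect", "write"):
        raise ValueError("mode must be 'inspect' or 'write'")
    issued = time.time() if now is None else now
    claims = {"sandbox": sandbox, "person": person, "workspace": workspace,
              "mode": mode, "exp": issued + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(signing_key, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + signature


def verify_ticket(signing_key: bytes, ticket: str, sandbox: str,
                  now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the ticket is authentic, unexpired, and for this sandbox."""
    body, sep, signature = ticket.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(signing_key, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    if claims["sandbox"] != sandbox or current > claims["exp"]:
        return None
    return claims
```

Because the mode is inside the signed claims, a ticket minted as inspect cannot be edited into a write ticket without invalidating the signature.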

When Priya Patel opens a running sandbox in her editor to pair with an agent mid-task, she picks inspect, looks, and closes it. The session shows up in the audit log as hers, marked inspect, and times out on its own when she steps away. Nothing about attaching widened what the run could reach, and the record of who looked is there without her having to log it.

Screenshot: the audit log view under Settings, filtered to sandbox events, with the 'IDE session created' row expanded to show 'mode: inspect'. A run's lifecycle is on the record: created, attached to, destroyed, each with who did it and when. The container is disposable, but the account of it is not.

Why the agent never stops to ask permission

One choice looks, at first glance, like a relaxed control: the agent runs without interactive permission prompts. It does not stop to ask "may I run this command?" the way a developer tool might. For a security reviewer that is the right thing to question, so here is why it is the opposite of a loosening.

The safety in Disco Parrot does not come from the agent pausing to ask permission and honoring the answer. It comes from the container. The agent is already inside a boundary it cannot escape and a gateway that enforces what it can touch, so the meaningful limits are applied at the wall, not negotiated with the agent. Asking the agent to also self-police would add a prompt that means nothing, because a misbehaving agent is exactly the one that would answer its own prompt "yes." The no-prompt mode is set deliberately everywhere a run starts, precisely because the real enforcement lives one layer down, in the container, where the agent cannot reach it and cannot talk its way around it.

So the agent is free to act inside the container, and the container is what makes that free action safe. What bounds the agent is not its restraint but the walls, and the walls do not depend on the agent agreeing to them. This is the heart of the model: containment over trust.

Hardened underneath

If anything ever reached past the workspace, the worry is what it could do next, and the answer here is built to be almost nothing. Even an agent that somehow slipped its working directory finds no privilege to seize, no host to climb onto, and no cluster to command. Across host types the sandbox runs as a non-root user that cannot escalate its privileges, with its Linux capabilities dropped. On the customer-hosted Kubernetes path, where the footprint is fully specified, it also runs under the default restricted system-call profile and the no-new-privileges flag, and its identity inside the cluster carries no access to the cluster's own control surface, so a sandbox cannot turn around and operate the Kubernetes it runs on. The container also runs under fixed CPU and memory limits, so a runaway or wedged agent is bounded to its own slice and cannot starve the host or its neighbors. The working directory and the gateway are the only things it owns.

Hardening like this is the floor, not the feature. It means that even the unlikely escape, an agent that somehow slipped its working directory, would land as an unprivileged user in a stripped container with nothing to climb, rather than with the run of the machine. The boundary is the container; the hardening makes the inside of the container a poor place to stand if anything ever reached it.
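For the customer-hosted Kubernetes path, a reviewer could check a container spec against the hardening above with a small audit like the one below. The field names are standard Kubernetes container fields, but mapping the text's controls onto them is an assumption (for instance, treating the no-new-privileges flag as `allowPrivilegeEscalation: false` and the default system-call profile as seccomp `RuntimeDefault`), and `audit_container` is a hypothetical helper.

```python
from typing import List


def audit_container(container: dict) -> List[str]:
    """Return a list of hardening findings; an empty list means none were found."""
    findings = []
    sc = container.get("securityContext", {})
    caps = sc.get("capabilities", {})

    if sc.get("runAsNonRoot") is not True:
        findings.append("container must run as a non-root user")
    if sc.get("allowPrivilegeEscalation") is not False:
        findings.append("privilege escalation must be disabled")
    if "ALL" not in caps.get("drop", []):
        findings.append("all Linux capabilities must be dropped")
    if caps.get("add"):
        findings.append("no capabilities may be added back")
    if sc.get("seccompProfile", {}).get("type") != "RuntimeDefault":
        findings.append("default seccomp profile must be applied")

    limits = container.get("resources", {}).get("limits", {})
    for resource in ("cpu", "memory"):
        if resource not in limits:
            findings.append(f"{resource} limit must be set")
    return findings
```

The check covers only the container spec; the claim that the sandbox's cluster identity cannot operate the cluster is a matter of role bindings and would need its own review.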

What a run can and cannot reach

The boundary is easiest to hold in mind as a short list of what a run can and cannot do from inside a running container.

What a run can and cannot reach, as one picture. Inside the boundary the agent owns its workspace and its keyed gateway. Outside it, the host filesystem, another run, and the platform's own control plane and data stores are not addressable: the reach stops at the container wall.

From inside a run, the agent can	The agent cannot
Read and write its own `/workspace` and the files of its task	Reach the host filesystem outside `/workspace`
Use its container's gateway, with the right key	Use another container's gateway: a request without that container's key is refused
Reach the tools and destinations its task is given	Act on the platform's control plane or its data stores
Do the work the gateway mediates for it	Pass as one of the platform's authenticated callers

Each row is a wall described earlier: the workspace check, the per-container key, the gateway that mediates tools, and the plain fact that a container holds no credential for the platform's own systems. Together they draw a small, legible box. The agent is powerful inside the box and has no usable reach into your platform, your data, or another run, which is exactly the trade a security team wants from software that acts on its behalf.

Verifying the boundary yourself

The model is built to be checked, not taken on faith. When a security owner at a regulated customer sits down to review it, the questions are concrete, and each one resolves to something they can confirm rather than a claim they have to trust.

Can one run reach another? They confirm the per-container key: a token issued to one container, refused everywhere else. Can a run reach the platform's own data? They confirm that the container holds no credential for it and the gateway mediates every outward call. Can a run read the host? They confirm the workspace check and walk a path that points outside it, watching it refused. On the customer-hosted path, they review the container footprint (non-root, no added capabilities, a cluster identity that cannot touch the cluster) against their own hardening standard before the first agent runs.

The point is that none of these answers depends on the agent behaving. Each one is a wall the platform holds, so the reviewer is checking a boundary they can see rather than a behavior they have to hope for. That is what makes the model something a security team can sign off on, not just something it agrees to live with.

Screenshot: the Sandboxes list page under Platform, one row per active run, each with a status badge, the profile and the work it belongs to, a branch, a last-active time, and a per-row action menu including Destroy. Every run is a separate container you can see and stop. The isolation is not just something we tell you about; it is something you can watch and act on from one page.

Why isolation works this way

The design follows from treating the agent as untrusted code that does useful work. If you trust the agent, you give it broad access and rely on its behavior, and your security is only as good as the agent's worst moment: a bad instruction, a poisoned input, a confident mistake. If you contain the agent, you give it a small box and strong walls, and its worst moment is bounded by the box rather than by your whole environment. Disco Parrot takes the second path without apology, because it is the only one that scales to letting agents do real work.

Building the boundary as the container, rather than as a set of rules the agent follows, is what makes it trustworthy. A rule the agent enforces on itself is only as reliable as the agent; a wall the platform enforces around the agent holds regardless of what the agent does. So the container is disposable, the gateway is keyed, the workspace is sealed, and the prompts are turned off on purpose, all so that the question "what stops the agent from reaching past its task" has the same answer every time: the container does, and the container does not depend on the agent.

For the person who owns security, the isolation model is a small set of independent mechanisms, each of which you can verify: a container per run, a keyed gateway, a sealed workspace, a hardened image. None of them rests on the agent's good behavior, which is the property you are checking for.

For an engineering leader, containment is what lets you adopt agents broadly. The boundary is the same for every run, so a new agent or a new flow inherits it rather than needing its own security design.

For a platform engineer, the customer-hosted footprint is explicit: a non-root, capability-dropped container whose cluster identity cannot operate the cluster, reachable only outbound. It is a footprint you can review against your own standards before a single agent runs.

For a team lead, the practical version is simple. The agent works in a clean container that holds only its task, it cannot reach your other systems from there, and when the work is done the container is gone.

Sandboxed execution (concept)

The everyday version: what a sandbox is and how it runs your task.

Credentials and the secret policy

What reaches the container, and the keys that never do.

Network boundary

The outbound-only channel, and running the container in your own network.

Access control

The named permissions and the route check that refuses an unguarded path.