Reviewable autonomy

The operating model behind Disco Parrot. Agents do real engineering work, you approve it at defined points, and every step is recorded and reversible.

Disco Parrot runs software work through AI agents, but it does not hand the work over and hope for the best. Every agent runs inside an isolated sandbox, carries out a defined piece of work, and stops at the points where a person decides whether the work moves forward. The platform records what happened at each step, separates the agent's edits from your team's, and lets you roll any change back.

We call this operating model reviewable autonomy. The agent does the labor. You own the decisions, and you own the record. This page explains the model and the parts that make it work. The rest of the Core concepts section goes deep on each part.

The loop

Every piece of agent work follows the same loop. The third step is where you decide.

Capture intent

You describe what you want as an Initiative and break it into Plans. An agent can draft both from your codebase, and you edit them until they say what you mean.

Run the work in a sandbox

A Flow carries out the work as an ordered set of steps inside an isolated sandbox. The sandbox is where code is checked out, commands run, and changes are produced.

Review at checkpoints

The run pauses at checkpoints you control. You approve, reject, or skip before the run continues. Nothing past a checkpoint happens until you decide.

Record and trace

Each step keeps a transcript in Sessions with the tools it called, the number of turns it took, and the cost. Changes the agent makes to your work are saved with an ai source tag, kept in version history, and recorded in the audit log, so the work is always traceable back to the requirement that started it.
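One way to picture the traceability described above is as a chain of records, each pointing at the record that caused it. The sketch below is illustrative only: the `Record` type, the `parent_id` link, the `kind` names, and the IDs are all assumptions made for this example, not the platform's actual schema. Only the `user`, `ai`, and `system` source values come from this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    id: str
    kind: str                  # hypothetical kinds: "initiative", "plan", "flow_run", "change"
    parent_id: Optional[str]   # hypothetical link to the record that caused this one
    source: str = "user"       # edit source; changes made by the agent carry "ai"


def trace(records: dict, record_id: str) -> list:
    """Walk parent links from a record back to the requirement at the root."""
    chain = []
    current = records.get(record_id)
    while current is not None:
        chain.append(f"{current.kind}:{current.id}")
        current = records.get(current.parent_id) if current.parent_id else None
    return chain


records = {
    r.id: r
    for r in [
        Record("init-1", "initiative", None),
        Record("plan-7", "plan", "init-1"),
        Record("run-3", "flow_run", "plan-7", source="system"),
        Record("chg-9", "change", "run-3", source="ai"),
    ]
}

# Follow an agent-made change back to the initiative that started it.
print(trace(records, "chg-9"))
```

The point of the sketch is the shape, not the field names: as long as every change keeps a link to the run and plan that produced it, the question "why does this change exist?" has a mechanical answer.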

Why you review intent, not a pile of output

A common way to use AI for software work is to describe a goal, wait, and receive a large change to review after the fact. By the time you see it, the decisions are already made. The only real choice left is to accept the result or start over.

Disco Parrot inverts that. Work is broken into steps, and the moments that matter (a plan before it gets built, a change before it ships) are exactly where you weigh in. You review decisions while they are still cheap to change, and you direct the work instead of grading it at the end. A wrong approach caught at the plan checkpoint costs a sentence. The same mistake caught after the code is written costs a rewrite.

The parts of the system

Reviewable autonomy is the product of a few pieces working together. Each has its own concept page.

The SDLC work model holds the work itself: portfolios, projects, initiatives, plans, bugs, sprints, and goals. These are structured records, not freeform notes, which is what lets agents read and update them and lets you query across them.
Flows define multi-step agent work: the steps, the inputs, the checkpoints, and what happens on success or failure.
Sandboxes isolate every run. The container is the boundary, so the agent works against a checkout of your code rather than your live systems.
Agents are shaped by the model they run on and the instructions they follow, both of which you control.
Human checkpoints are the control points. At each one you approve, reject, or skip, with a comment when you reject.
Approved actions bound what an agent can do: which tools it may call, which credentials it can use and for how long, and whether it may change a given environment or must only propose changes.
Audit trails and version history record who changed what, separate the agent's edits from people's edits, and let you restore an earlier version without losing the later one.

How the agent's reach is bounded

An agent in Disco Parrot is powerful inside its sandbox and constrained at every edge of it. Three mechanisms set the limits.

Tool allowlists. An agent can only call the tools it has been given. External tools arrive through MCP, and each connection declares exactly which of its tools are enabled. A tool that is not on the list is not available to the agent, and the sandbox is the wall around everything the enabled tools can touch.
Credentials it never holds. Long-lived keys and managed credentials, such as your model provider keys and your Git tokens, stay on the server and are blocked from entering the sandbox. When a step genuinely needs to act, for example to push a branch, the platform either performs the action itself or hands the sandbox a short-lived, tightly scoped credential that expires in minutes and is confined to a single command. The standing secrets are never in the agent's environment.
Environments that only accept proposals. An environment carries a change policy for each kind of change (code, schema, infrastructure, and so on). Under a policy of "propose only," the agent prepares the change as reviewable source and stops short of applying it; the platform raises a blocking issue and pauses the run so the change is reviewed before anything treats it as live. You decide, per environment, where an agent may act and where it may only suggest.
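The three limits above can be sketched as three small checks. This is a minimal illustration under stated assumptions, not the platform's implementation: the function names, the tool names, the policy values `"apply"` and `"propose_only"`, the five-minute lifetime, and the choice to treat an unlisted change kind as propose-only are all hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedCredential:
    command: str        # the single command this credential is valid for
    expires_at: float   # absolute expiry, in seconds since the epoch

    def permits(self, command: str, now: float) -> bool:
        """Valid only for the exact command it was issued for, and only until expiry."""
        return command == self.command and now < self.expires_at


def call_tool(tool: str, enabled_tools: frozenset) -> str:
    """Refuse any tool that is not on the connection's allowlist."""
    if tool not in enabled_tools:
        raise PermissionError(f"tool not enabled: {tool}")
    return f"called {tool}"


def apply_change(kind: str, policy: dict) -> str:
    """Report what happens to a change of this kind under an environment's policy."""
    # Assumption: a kind with no explicit policy falls back to propose-only.
    if policy.get(kind, "propose_only") == "propose_only":
        return "proposed: run paused for review"
    return "applied"


# Hypothetical configuration for one MCP connection and one environment.
enabled = frozenset({"git.diff", "tests.run"})
staging_policy = {"code": "apply", "schema": "propose_only"}

now = time.time()
push = "git push origin feature-branch"
cred = ScopedCredential(command=push, expires_at=now + 300)  # assumed 5-minute lifetime

print(call_tool("tests.run", enabled))
print(apply_change("schema", staging_policy))
print(cred.permits(push, now), cred.permits("git push --force", now))
```

Each check denies by default: a tool, a command, or a change kind has to be named explicitly before the agent can use it, which is the property the three mechanisms share.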

Where you make the decisions

Checkpoints are the points in a run where a person signs off. A Flow decides which steps pause, and you act on each one from the run's timeline: approve to continue, reject to stop with a comment, or skip to move past a step.

This holds even when you are not watching. A run you send to the background advances on its own only through the checkpoints a step has explicitly been set to auto-approve. Every other checkpoint pauses the run and waits for you, and each auto-approval is written onto the step so you can see what was approved without you. Human checkpoints covers this in full.
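The checkpoint behavior described here can be modeled as a small state machine. The sketch below is an assumption-laden illustration: the `Step` and `Run` classes, their field names, and the status strings are invented for this example, and it assumes a rejection must carry a comment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    name: str
    checkpoint: bool = False
    auto_approve: bool = False      # only honored when the run is in the background
    outcome: Optional[str] = None   # "ran", "skipped", or "rejected"
    note: Optional[str] = None      # comment or auto-approval marker


@dataclass
class Run:
    steps: List[Step]
    background: bool = False
    position: int = 0
    status: str = "running"         # "running", "paused", "stopped", or "done"

    def advance(self) -> str:
        """Run steps until a checkpoint needs a person or the steps run out."""
        while self.position < len(self.steps):
            step = self.steps[self.position]
            if step.checkpoint:
                if self.background and step.auto_approve:
                    step.note = "auto-approved"   # written onto the step
                else:
                    self.status = "paused"
                    return self.status
            step.outcome = "ran"
            self.position += 1
        self.status = "done"
        return self.status

    def decide(self, decision: str, comment: str = "") -> str:
        """Apply a person's decision to the step the run is paused on."""
        if self.status != "paused":
            raise RuntimeError("run is not waiting at a checkpoint")
        step = self.steps[self.position]
        if decision == "reject":
            if not comment:
                raise ValueError("a rejection needs a comment")
            step.outcome, step.note = "rejected", comment
            self.status = "stopped"
            return self.status
        if decision == "approve":
            step.outcome = "ran"
        elif decision == "skip":
            step.outcome = "skipped"
        else:
            raise ValueError(f"unknown decision: {decision}")
        step.note = comment or None
        self.position += 1
        self.status = "running"
        return self.advance()


run = Run(
    steps=[
        Step("draft plan"),
        Step("write code", checkpoint=True),
        Step("run tests", checkpoint=True, auto_approve=True),
        Step("open pull request", checkpoint=True),
    ],
    background=True,
)
print(run.advance())                               # pauses before "write code"
print(run.decide("approve"))                       # auto-approves "run tests", pauses again
print(run.decide("reject", "wrong target branch"))
```

In this model the background flag never bypasses a checkpoint by itself. Only a step that was explicitly marked `auto_approve` passes without a person, and the marker stays on the step afterward.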

Screenshot placeholder: a Flow run timeline paused at a checkpoint, with the completed step's summary above and the pending step's Approve / Reject / Skip actions and comment box below. Caption: A run paused at a checkpoint, waiting for a decision.

Everything is recorded and reversible

A run is never a black box, and a change is never one-way.

The audit log records changes with the actor and an edit source of user, ai, or system, so you can filter the agent's work apart from your team's. Anyone with the audit-export permission can export it to CSV, subject to your subscription plan.
Sessions holds the per-execution transcript for every chat, Flow run, and background task: the agent's messages, tool calls, and file edits, plus the cost in dollars and metadata about where the work ran.
Version history keeps a full snapshot of an initiative, plan, skill, or set of agent instructions on every meaningful save. Restoring an older version creates a new version rather than deleting the newer one, so nothing is lost, and the restore itself is recorded.
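Two of the properties above are easy to show in miniature: a restore that appends rather than deletes, and an audit log that can be filtered by edit source. The sketch below is a simplified model with hypothetical class, method, and actor names, and it assumes a restore is logged with the `user` source. It is not how the platform stores versions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class AuditEntry:
    actor: str
    source: str   # "user", "ai", or "system"
    action: str


@dataclass
class VersionedDocument:
    versions: List[str] = field(default_factory=list)      # full snapshots, oldest first
    audit: List[AuditEntry] = field(default_factory=list)

    def save(self, content: str, actor: str, source: str) -> int:
        """Store a full snapshot and record who saved it."""
        self.versions.append(content)
        number = len(self.versions)
        self.audit.append(AuditEntry(actor, source, f"saved v{number}"))
        return number

    def restore(self, number: int, actor: str) -> int:
        """Restore by appending a copy of an old snapshot as the newest version."""
        if not 1 <= number <= len(self.versions):
            raise IndexError(f"no such version: {number}")
        self.versions.append(self.versions[number - 1])
        new_number = len(self.versions)
        # Assumption: a restore performed by a person is logged with source "user".
        self.audit.append(AuditEntry(actor, "user", f"restored v{number} as v{new_number}"))
        return new_number


plan = VersionedDocument()
plan.save("Add rate limiting", actor="dana", source="user")
plan.save("Add rate limiting with Redis", actor="agent", source="ai")
plan.restore(1, actor="dana")

# The agent's version is still in history, and its edits can be filtered out.
ai_entries = [entry for entry in plan.audit if entry.source == "ai"]
print(len(plan.versions), [entry.action for entry in ai_entries])
```

Because `restore` only ever appends, rolling back the agent's edit and then changing your mind are both ordinary, recorded operations.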

What this means

For the people doing the work, reviewable autonomy means you spend your attention where it pays off: shaping the plan and approving the change, not babysitting a terminal. You can send a long run to the background and pick it up later from Sessions, and you can always see what the agent did and why.

For the people who have to sign off on how AI is used, it means a person is in the loop at the points that matter, an agent's reach is bounded by tool allowlists and a strict credential policy, changes to protected environments arrive as proposals, and there is a complete, exportable record of every change with the agent's edits clearly marked. The same model that makes the product pleasant to use is the model that makes it defensible to adopt. The governance section covers this in depth: human oversight and approvals for the control points, and approved actions and least privilege for what an agent and a person are each allowed to do.

Flows

How multi-step agent work is defined and run.

Human checkpoints

The control points where you approve, reject, or skip.

Sandboxed execution

Why every agent run is isolated.