PR review loop
What happens to a pull request after an agent opens it. Your reviewers comment on GitHub, the agent reviews its own diff, and both streams become findings you resolve with Fix this, which writes new commits and rechecks its own work until the fix converges or the agent hands the finding to a person.
A pull request is where agent work meets your team. The agent opens it under your identity, your engineers read the diff and comment on GitHub the way they always have, and Disco Parrot automates the half that usually drags: the part where a comment becomes a change becomes another comment. It pulls your reviewers' comments in, lets the agent review its own diff alongside them, and turns each one into a finding you can hand back to an agent with one click. The agent answers with code, rechecks its own work, and keeps going until the finding is resolved or it has reached the point where a person should look.
This page is the loop itself: where the findings come from, what their severities and statuses mean, how Fix this dispatches an agent, and how the recheck decides when the work has converged. For how a pull request gets opened in the first place, and why GitHub stays the place you merge, read ship code.
Where the review lives
Every implementation plan that has opened a pull request gets a review surface of its own. From the plan's PR badge you open the in-app review, a full page built for reading a change and acting on it, laid out as three panels you can resize and collapse to taste:
- Plan context on the left, so the change is read against the work it was meant to do.
- The diff in the center: the pull request's files in a tree you can filter, shown split or unified, with a Viewed checkbox per file so a long review remembers where you stopped. Click a finding on the right and the diff jumps to the line it is about, which is how the two panels work as one.
- The conversation on the right, the stream of findings and comments the rest of this page is about. Its header keeps a live count of what is open and what is resolved, groups the stream by file, by severity, or by time, and hides resolved findings by default so the open work is what you see.
The surface is a page, not a popup, because reviewing a change is real work that deserves the room.
A plan gets its review surface the moment a pull request is opened against it. Until then the page shows a No PR yet prompt that points you to open one from the plan's Flows launcher or with /create-pr in chat.
The page stays in step with GitHub on its own. Once the pull request is merged or closed there, a banner marks it and the conversation goes read-only, so the in-app picture always matches the real one. If the pull request is deleted on GitHub, the page says so and offers a re-sync rather than guessing. GitHub stays the source of truth.
Where findings come from
A finding is one thing to address on the pull request. Two streams feed the conversation, and they sit side by side because to the work it does not matter which one raised the point.
Your reviewers' GitHub comments are read in through a webhook the moment they land. When a reviewer submits a review or leaves a comment on a line, the platform mirrors it into a finding attached to the plan, so the request shows up in the app next to the diff it is about without anyone copying anything across. The flow is one direction only, from GitHub into Disco Parrot. The platform never writes back to the pull request, so your reviewers' threads, approvals, and merges stay entirely theirs.
The agent's own review is the second stream. An agent can review the diff it produced and raise findings against it, the same shape of finding a human reviewer would leave, so the obvious problems are caught and queued before a person spends attention on them. The agent's review is held to the same bar as a person's, not a softer one. It raises findings against the diff it just wrote, blocking ones included, and it cannot close them by saying so. A finding the agent raises is one the agent then has to answer with code, the same as a finding a reviewer left.
What a GitHub review becomes depends on what the reviewer did. An approval raises nothing; there is no work in a thumbs-up. A review that requests changes becomes a blocking finding, because the reviewer has said as much. A review left as a comment, and any inline comment on a line, becomes a question, the softest severity, because a comment is a prompt to look rather than a demand to change. The mapping is deliberate: it carries the reviewer's own weight into the app rather than flattening every comment into the same urgency.
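The mapping above can be sketched as a small function. This is a minimal illustration, not Disco Parrot's actual code: the type names, the `kind` discriminator, and the lowercase state strings are assumptions chosen for the example.

```typescript
// Hypothetical types: names are illustrative, not the platform's real schema.
type Severity = "blocking" | "suggestion" | "nit" | "question";

type ReviewEvent =
  | { kind: "review"; state: "approved" | "changes_requested" | "commented" }
  | { kind: "inline_comment" };

// Returns the severity of the mirrored finding, or null when nothing is raised.
function severityFor(event: ReviewEvent): Severity | null {
  // Any inline comment on a line is a prompt to look, so it becomes a question.
  if (event.kind === "inline_comment") return "question";
  switch (event.state) {
    case "approved":
      return null; // an approval raises no finding
    case "changes_requested":
      return "blocking";
    case "commented":
      return "question";
  }
}
```

Suggestion and Nit never come out of this function, which matches the text: those severities are set elsewhere, by the agent's own review or by hand.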
Reading a finding
Every finding is a card in the conversation, and the card is built to answer the reviewer's first three questions at a glance: who raised it, how much it matters, and where it stands.
Who is the author chip. A finding from your reviewer carries their GitHub name and avatar; a finding from the agent's own review carries an AI Review mark, so you always know whether you are reading a teammate or the agent.
How much it matters is the severity, shown as a colored pill:
| Severity | What it means |
|---|---|
| Blocking | Has to be addressed before the change should land. A review that requests changes becomes this. |
| Suggestion | Worth doing, not a gate. An improvement the reviewer would like to see. |
| Nit | A small, optional polish. Take it or leave it. |
| Question | A prompt to look or explain. Every inline comment and every commented review becomes this. |
Blocking and Question are the only severities that arrive from GitHub. Suggestion and Nit come from the agent's own review, which can also raise Blocking findings, or from a person who sets a finding's severity by hand, so the softer end of the scale is where the agent's self-review and your team's judgment live.
Where it stands is the status, and the status is where the loop shows itself. A finding moves through a small, legible set of states:
| Status | What it means |
|---|---|
| Open | Raised, not yet worked. The starting state for every finding. |
| Fix in progress | An agent is on it. The card reads Running · Attempt N/3 with a spinner while the loop runs. |
| Resolved | The fix was made and the recheck confirmed it. The card collapses and strikes through. |
| Regressed | A fix made something else break. The loop pauses and the card explains what regressed. |
| Needs human | The agent tried and did not converge. The card asks you to step in. |
| Won't fix | A person decided to leave it. A deliberate close, on the record. |
Alongside those, a card carries the file and line it is about, the comment body, any acceptance criterion or test case it is linked to, and a suggested fix when the reviewer offered one. It is a complete picture of one point of review, in one place, whether the point came from a person or the agent.
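One way to picture what a card carries is as a single record. The sketch below is an assumption about shape only: every field name is hypothetical, and the example values reuse the export comment used as an illustration on this page.

```typescript
// Hypothetical shape of a finding card; field names are illustrative assumptions.
type Severity = "blocking" | "suggestion" | "nit" | "question";
type Status =
  | "open" | "fix_in_progress" | "resolved"
  | "regressed" | "needs_human" | "wont_fix";

type Author =
  | { kind: "reviewer"; githubName: string }
  | { kind: "ai_review" };

interface Finding {
  author: Author;
  severity: Severity;
  status: Status;
  file: string;
  line: number;
  body: string;
  linkedTo?: string;     // acceptance criterion or test case, when linked
  suggestedFix?: string; // present only when the reviewer offered one
  attempt: number;       // 0 until a fix has been dispatched
}

// Renders the status text described in the table above.
function statusLabel(f: Finding, maxAttempts = 3): string {
  const labels: Record<Status, string> = {
    open: "Open",
    fix_in_progress: `Running · Attempt ${f.attempt}/${maxAttempts}`,
    resolved: "Resolved",
    regressed: "Regressed",
    needs_human: "Needs human",
    wont_fix: "Won't fix",
  };
  return labels[f.status];
}

const example: Finding = {
  author: { kind: "reviewer", githubName: "tom" }, // hypothetical name
  severity: "blocking",
  status: "fix_in_progress",
  file: "src/export/csv.ts",
  line: 54,
  body: "The export should respect the column filter",
  attempt: 2,
};
```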
What a good finding looks like
A finding is only as useful as the change it points at. The ones that move the work share a shape, and it is the same shape whether a reviewer left it or the agent's own review raised it: a specific place, a specific problem, and enough of the why that an agent can act without guessing.
A good finding names the line, not the file. "The export should respect the column filter" attached to src/export/csv.ts:54 tells an agent exactly where to read and what to change. The same point left as "exports look wrong" attached to nothing sends the agent hunting, and a hunt is where a fix goes sideways. The file-and-line citation on every card is not decoration; it is the difference between a finding an agent can answer in one pass and one it has to interpret first.
A good finding carries a reason, not just a verdict. "Extract this into a helper, it is duplicated three times" tells an agent what done looks like. "Clean this up" does not, and an agent that does not know what done looks like is the agent that takes all three attempts and still lands on Needs human. When a reviewer offers a suggested fix, the card keeps it, and the agent reads it as the target rather than reverse-engineering one.
This is also why the agent's self-review is held to the human bar rather than a softer one. A review that raised vague findings against its own diff would be noise the team learned to scroll past. By raising the same specific, line-anchored, reasoned findings a careful reviewer would, the agent's review earns the same attention, and the obvious problems are caught and queued before a person spends a minute on them. The bar is what makes a second stream worth having instead of a second thing to ignore.
Fix this: a comment becomes a commit
A finding with a Fix this button is one you can hand straight to an agent. It appears on the findings worth automating, the blocking ones and the suggestions, the points that ask for a code change rather than a conversation.
Click it and the loop starts. An agent picks the finding up in a fresh sandbox, makes the change it asks for, and pushes it as new commits on the same pull request. The card flips to Fix in progress the instant you click, so you can watch the work rather than wonder if it took. If the dispatch cannot start, the card rolls back to Open and offers a Retry, so a click never quietly does nothing.
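The click behavior described above, flip immediately and roll back if the dispatch cannot start, can be sketched as two pure transitions. The names here are hypothetical, and the guard that only Open findings can be dispatched is an assumption made for the example.

```typescript
// Hypothetical card state for illustration; not the platform's real model.
type Severity = "blocking" | "suggestion" | "nit" | "question";
type Status = "open" | "fix_in_progress";

interface Card {
  severity: Severity;
  status: Status;
  attempt: number;
  canRetry: boolean;
}

// Fix this appears on blocking findings and suggestions.
function showsFixThis(card: Card): boolean {
  return card.severity === "blocking" || card.severity === "suggestion";
}

// Optimistic update: the card flips the instant the button is clicked.
function onFixClicked(card: Card): Card {
  // Assumption: only an Open finding with a Fix this button can be dispatched.
  if (!showsFixThis(card) || card.status !== "open") return card;
  return { ...card, status: "fix_in_progress", attempt: 1, canRetry: false };
}

// Rollback: if the dispatch could not start, return to Open and offer Retry.
function onDispatchFailed(card: Card): Card {
  if (card.status !== "fix_in_progress") return card;
  return { ...card, status: "open", attempt: 0, canRetry: true };
}
```

Keeping the transitions pure makes the rollback easy to reason about: the card either moves forward on the click or returns to a state that visibly offers Retry.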
The fix arrives the way the original change did, as commits your team reviews and merges on GitHub. The agent answers review by doing more work, never by editing the conversation about it: it does not reply to the thread, resolve the comment, or approve its own change. The request stays human and stays on GitHub; the only thing the agent adds is code.
The recheck loop: how the work converges
This is the part that makes the loop a loop. An agent that makes a change is not done; an agent that makes a change and then checks whether it actually worked is. After the agent applies a fix, it rechecks its own work, and what the recheck finds decides what happens next.
You do not run the recheck; it runs itself, after every fix, and the finding's status is how you watch it. The card updates live as the loop turns, with no refresh, so the attempt count climbing, a regression, or a hand-off back to you appears the moment it happens.
There are four ways a recheck can land, and each one moves the finding somewhere you can read:
- Resolved. The fix worked and nothing else broke. The finding goes to Resolved, the card collapses, and that point of review is closed. When the last finding resolves, the loop is done.
- Still open. The fix helped but did not finish the job. The agent queues another attempt, a fresh implement-and-recheck pass, and goes again. The card counts up, Attempt 2/3, so you can see it working rather than guessing.
- Regressed. The fix made something else break, caught because the recheck runs your test cases, not just the one the finding named. This is the difference between a fix and a regression you find next week: an agent that rechecked only the line it was asked about would close findings while quietly breaking the file around them. The loop pauses rather than digging the hole deeper, and the card shows a banner naming what regressed. From there the move is yours: Retry to send the agent back in with the regression named, Won't fix to close the finding on the record, or fix the line yourself and Mark resolved. The loop hands you the controls; it does not pick for you.
- Needs human. The agent took its attempts, three by default, and did not converge. Rather than loop forever, it stops and hands the finding back marked Needs human. For the people who can dispatch fixes, the card carries Retry, Mark resolved, and Won't fix right there, so the next move is one click. The agent knowing when to stop is as much a part of the design as the agent knowing how to fix.
The cap is the point. An agent that retries without limit is a worse problem than the bug it was chasing. Three attempts are enough to clear the fixes that are genuinely mechanical, and the cap sits short of the number where more tries stop helping. When it is reached, the work lands in front of a person with its history attached rather than disappearing into a retry that never ends.
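The four outcomes and the attempt cap amount to a small state machine. The sketch below is one way to express it under stated assumptions: the type and function names are hypothetical, and only the default of three attempts comes from the text.

```typescript
// Hypothetical names; only the outcomes and the default cap come from this page.
type Status =
  | "open" | "fix_in_progress" | "resolved"
  | "regressed" | "needs_human" | "wont_fix";
type Verdict = "resolved" | "still_open" | "regressed";

interface LoopState {
  status: Status;
  attempt: number; // 1-based while a fix is in progress
}

const MAX_ATTEMPTS = 3;

// Applies one recheck verdict and returns where the finding lands.
function afterRecheck(
  state: LoopState,
  verdict: Verdict,
  maxAttempts: number = MAX_ATTEMPTS,
): LoopState {
  if (state.status !== "fix_in_progress") return state; // nothing to recheck
  switch (verdict) {
    case "resolved":
      return { ...state, status: "resolved" };
    case "regressed":
      return { ...state, status: "regressed" }; // pause; a person chooses next
    case "still_open":
      return state.attempt < maxAttempts
        ? { status: "fix_in_progress", attempt: state.attempt + 1 }
        : { ...state, status: "needs_human" }; // cap reached, hand off
  }
}
```

A finding that stays open through every attempt moves 1/3, 2/3, 3/3, then Needs human, so the loop always terminates after a bounded number of rechecks.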
Reading the attempts
Every finding that went through the loop keeps its work on the record. A Show attempts control on the card opens the timeline: one entry per attempt, each with the commit the agent pushed (linked to GitHub), the tests it ran and how many passed, and the verdict the recheck reached. When a fix took three tries, you can read all three, see what each commit changed, and see exactly where the recheck turned. The loop is auditable by construction, because every turn of it left a commit and a result behind.
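As a rough picture of what one timeline entry holds, the sketch below models an attempt as a record and formats it as a single line. The field names and the output format are assumptions for illustration, not the app's actual rendering.

```typescript
// Hypothetical record for one entry in the attempts timeline.
interface Attempt {
  number: number;
  commitSha: string; // the commit the agent pushed for this attempt
  testsRun: number;
  testsPassed: number;
  verdict: "resolved" | "still_open" | "regressed";
}

// Formats one entry; the short SHA is the first seven characters.
function timelineLine(a: Attempt): string {
  const shortSha = a.commitSha.slice(0, 7);
  return `Attempt ${a.number}: commit ${shortSha}, ` +
    `${a.testsPassed}/${a.testsRun} tests passed, verdict ${a.verdict}`;
}
```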
A finding, end to end
Tom reviews Maya's CSV download button pull request, PR #142, and leaves one comment on a line in src/export/csv.ts: the new button should respect the existing column filter before exporting. He submits the review as Request changes.
The webhook fires. The comment lands in the app as a finding on the plan, marked Blocking, because Tom requested changes rather than only commenting. Maya sees it on the review surface next to the diff it is about, reads it, and clicks Fix this.
The card flips to Fix in progress. An agent picks the finding up in a fresh sandbox, reads the filter code the comment points at, makes the change, and pushes it as a new commit on PR #142. Then it rechecks, running the plan's test cases rather than only the line Tom named. One of them fails, an export test that assumed no filter. The fix helped but did not finish, so the card counts up to Attempt 2/3 and the agent goes again, this time carrying the filter through the header row too. The recheck passes clean. The finding goes Resolved and the card collapses.
Maya never wrote the fix and never argued in the thread. Tom re-reviews the two commits on GitHub, sees his comment answered in code, approves, and merges. The whole turn, both attempts and the test runs that decided them, sits behind Show attempts on the finding for anyone who reads the pull request later.
Many findings, many reviewers
One finding on one pull request is the easy case. A real review has a dozen findings across a handful of reviewers, some blocking and some not, some answered in one attempt and some that land back on a person, and the surface is built so the dozen reads as cleanly as the one.
The conversation stays one stream no matter how many people fed it. A blocking comment from Tom, a question from Priya, and three suggestions from the agent's own review sit in the same panel, each carrying its author chip, so you read the change rather than chase who-said-what across tabs.
When the stream gets long, the header narrows it. Group by severity to put the blocking findings first, group by file to read a change one file at a time, or leave resolved findings hidden so only the open work shows. A reviewer who left one comment on GitHub can open the surface, find it as a finding, and watch it resolve without learning a new tool.
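The header's grouping and the hidden-by-default rule for resolved findings can be sketched as one function. Everything named here is hypothetical, including the order in which severities are listed; the text only says that grouping by severity puts blocking findings first.

```typescript
// Hypothetical types for illustration only.
type Severity = "blocking" | "suggestion" | "nit" | "question";
type GroupBy = "file" | "severity" | "time";

interface Finding {
  id: string;
  file: string;
  severity: Severity;
  resolved: boolean;
  createdAt: number; // timestamp used for ordering within a group
}

// Assumed display order: blocking first, then the softer severities.
const SEVERITY_ORDER: Severity[] = ["blocking", "suggestion", "nit", "question"];

function groupFindings(
  findings: Finding[],
  groupBy: GroupBy,
  showResolved = false, // resolved findings are hidden by default
): Map<string, Finding[]> {
  const visible = findings
    .filter((f) => showResolved || !f.resolved)
    .sort((a, b) => a.createdAt - b.createdAt);

  const groups = new Map<string, Finding[]>();
  if (groupBy === "time") {
    groups.set("all", visible); // one chronological stream
    return groups;
  }

  const keyOf = (f: Finding): string =>
    groupBy === "severity" ? f.severity : f.file;
  const keys: string[] =
    groupBy === "severity"
      ? SEVERITY_ORDER
      : Array.from(new Set(visible.map((f) => f.file))).sort();

  for (const key of keys) {
    const members = visible.filter((f) => keyOf(f) === key);
    if (members.length > 0) groups.set(key, members);
  }
  return groups;
}
```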
Findings resolve independently, not in a line. Fix this on one finding sends one agent into one sandbox, and dispatching a fix on a second finding does not wait on the first. Each finding runs its own implement-and-recheck loop and reports back on its own card.
Several findings can be in progress at once, each pushing its own commits, each rechecking against the plan's test cases, so the slow part is the reviewing you wanted to do anyway rather than a queue. The loop is per-finding by design, which is what lets a heavy review clear in close to the time its heaviest single finding takes.
What does not parallelize is the judgment, and that is the point. The agents do the typing on as many findings as you dispatch; the decisions about what to merge stay with the reviewers, on GitHub, in the order they choose.
Who can review and who can fix
Everything above is what the loop does. This is who gets to drive it: reading a review is open to your whole team, and dispatching an agent at it is held a little closer.
Anyone who can see the plan can open its review, read the diff, and read every finding. That includes viewers and members, so the engineer who left the comment on GitHub can watch it become a resolved finding without any extra grant. Dispatching a Fix this, editing a finding, or marking one resolved or won't-fix takes the review-findings.manage scope, which rides on your planners, admins, and owners. The split matches the weight of the action: reading a review is something everyone does, and starting agent work against your codebase is something a smaller, accountable set of people does.
That same scope is what lets a person take a finding in hand directly rather than through the loop. Anyone who holds it can resolve or re-open any finding by hand at any time, not only after a hand-off, and can reclassify a finding's severity when the automatic mapping read it heavier or lighter than the team would: downgrade a blocking comment to a nit, or mark a question won't-fix and move on. The loop is the fast path; the manual controls are always there underneath it.
| You want to | Scope |
|---|---|
| Open the review, read the diff and the findings | pr-review.read + review-findings.read |
| Dispatch Fix this, edit a finding, mark resolved or won't-fix | review-findings.manage |
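The table reads naturally as a scope check. The sketch below assumes the five roles named above and the three scope strings from the table; the function and action names are hypothetical, and a real deployment may assign scopes differently.

```typescript
// Scope strings come from the table above; everything else is illustrative.
type Role = "viewer" | "member" | "planner" | "admin" | "owner";
type Scope = "pr-review.read" | "review-findings.read" | "review-findings.manage";
type Action =
  | "read_review" | "dispatch_fix" | "edit_finding"
  | "mark_resolved" | "mark_wont_fix";

const READ_SCOPES: Scope[] = ["pr-review.read", "review-findings.read"];

// Assumption: everyone who can see the plan holds the read scopes, and the
// manage scope rides on planners, admins, and owners.
function scopesFor(role: Role): Scope[] {
  const manages = role === "planner" || role === "admin" || role === "owner";
  return manages ? [...READ_SCOPES, "review-findings.manage"] : [...READ_SCOPES];
}

function requiredScopes(action: Action): Scope[] {
  return action === "read_review" ? READ_SCOPES : ["review-findings.manage"];
}

// True when the role holds every scope the action requires.
function can(role: Role, action: Action): boolean {
  const held = scopesFor(role);
  return requiredScopes(action).every((s) => held.indexOf(s) !== -1);
}
```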
Why the loop works this way
The hard part of agent-written code was never the first draft. It was the review, the back-and-forth that turns one comment into a week: a reviewer asks for a change, someone makes it, the reviewer checks it, something else broke, and around again. Disco Parrot keeps the half of that loop that belongs to people exactly where it is, on GitHub, in your reviewers' own words, and automates the half that was always mechanical: making the change, checking it, and trying again until it holds.
For a planner, the review surface is the one place a pull request stops being a thing happening somewhere else: the plan you wrote, the comment your reviewer left, and the commit that answered it all read on one page, so "where does PR #142 stand" is a glance, not a standup question.
For an engineer, Fix this answers the comment you left with commits you review, not a thread the agent argues in, and the recheck runs your test cases so a fix that breaks something else is caught before you see it.
For a lead, the loop stops itself: three attempts, then a person, with every commit and every test result on the record, so an agent never grinds unattended against a problem it cannot solve.
For the person who has to trust agents near your codebase, the line is bright: the agent reads your reviewers' comments and answers them with code, and it never writes a word back to the pull request, never resolves a thread, never approves itself. Review stays human; the agent does the typing.
How the pull request opens under your identity, and why GitHub holds the merge.
The operating model the loop is one expression of: the agent proposes, a person decides.
What the recheck runs to confirm a fix held and nothing else broke.
Where a reviewer's comment that the docs should have caught becomes a signal a later review acts on.