Real-time architecture

How the browser and the agent stay in sync without a refresh. One server-sent-events stream carries change signals to the app, a separate streaming path carries an agent's turn token by token and survives a dropped connection, and a different channel entirely talks to the sandbox operators.

Open Disco Parrot in two tabs, move a plan to a new status in one, and watch it move in the other. Start an agent on a long task, close the laptop, come back, and the run is still there with its output intact. That liveness is not a polling loop hammering the server every few seconds; it is a small set of real-time channels, each built for the kind of thing it carries. This page is how they work.

It is the architecture view, not the network-security view. How those channels are authenticated, why nothing inbound reaches your compute, and how the operator dials out rather than in are covered on network and connectivity and the network boundary. Here the question is how the browser, the server, and the agent stay in step.

One stream to the browser

The app keeps a single server-sent-events stream open to the server, at one endpoint, /__/events. Server-sent events are a one-way channel the browser holds open and the server writes to when something happens, which fits the shape of the problem: the app mostly needs to hear "something you care about changed," not to chat back over the same wire. This one stream replaced four separate ones the product used to run, so the browser holds a single connection instead of several, and everything live flows through it.

Named events flow over it: a plan changed status, a sprint was updated, a background task made progress, a sandbox came up, a notification arrived. Each event is tagged with the workspace and, where it matters, the person it belongs to, and the server only writes an event to a connection that is allowed to see it. The stream is authenticated and scoped the same as any other endpoint, and the connection captures who you are and what you can read at the moment it opens, so the server does not re-check your permissions on every event it sends. That snapshot decides who hears an event, not what you may read: because the data behind every signal comes from a fresh, fully checked fetch, a change to your access takes effect on the next refetch even while the same stream stays open.
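The delivery check described above can be pictured as a pure function over the scope snapshot. This is a minimal sketch, not the platform's implementation: the ChangeEvent and ConnectionScope shapes, the field names, and the event name "plan.changed" are all assumptions made for illustration.

```typescript
// Hypothetical shapes: the real event schema is not published on this page.
interface ChangeEvent {
  name: string;        // illustrative, e.g. "plan.changed"
  workspaceId: string; // every event is tagged with its workspace
  userId?: string;     // present only when the event concerns one person
  recordId: string;    // the record a subscriber should refetch
}

// Captured once, when the stream opens; not re-evaluated per event.
interface ConnectionScope {
  userId: string;
  workspaceIds: ReadonlySet<string>;
}

// Decide who HEARS an event. What the listener may READ is decided later,
// by the permission-checked fetch the event triggers.
function mayHear(scope: ConnectionScope, event: ChangeEvent): boolean {
  if (!scope.workspaceIds.has(event.workspaceId)) return false;
  if (event.userId !== undefined && event.userId !== scope.userId) return false;
  return true;
}
```

Because the function only gates delivery, a stale snapshot can at worst cause a refetch that the store then refuses; it cannot leak record contents.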

The connection looks after itself in two small ways. First, the server sends a periodic heartbeat down the stream, so a quiet connection is known to be alive rather than assumed dead, and a browser whose stream actually dropped reopens it on its own. Second, a connection that stops keeping up (a tab that froze, a client that went away) is closed rather than allowed to back up the server; the app reconnects and picks up from the live state. The stream is cheap to hold and cheap to lose, which is the property the rest of this page depends on.
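Both behaviours are small enough to sketch. The frame layout below follows the standard server-sent-events text format, in which a line starting with a colon is a comment that clients ignore, which makes it a common choice for heartbeats. Whether the platform uses exactly these frames is an assumption, as are the OutboundQueue class and its maxPending limit.

```typescript
// Standard SSE framing: "event:" names the event, "data:" carries the body,
// and a blank line ends the frame.
function formatEvent(name: string, data: unknown): string {
  return `event: ${name}\ndata: ${JSON.stringify(data)}\n\n`;
}

// A comment frame: keeps the connection visibly alive, dispatches no event.
const HEARTBEAT = ": heartbeat\n\n";

// Hypothetical per-connection queue. A client that stops draining is
// closed instead of being allowed to accumulate frames without bound.
class OutboundQueue {
  private pending: string[] = [];
  closed = false;

  constructor(private readonly maxPending: number) {}

  // Returns false when the frame was not accepted.
  enqueue(frame: string): boolean {
    if (this.closed) return false;
    if (this.pending.length >= this.maxPending) {
      this.closed = true; // slow consumer: drop the connection
      this.pending = [];  // nothing to preserve, a refetch recovers state
      return false;
    }
    this.pending.push(frame);
    return true;
  }

  // Called when the socket is writable; hands over everything queued.
  drain(): string[] {
    const out = this.pending;
    this.pending = [];
    return out;
  }
}
```

Discarding the queue on close is safe only because events are signals: the reconnecting client refetches current state rather than needing the frames it missed.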

Signals, not data

The most useful thing to understand about that stream is what an event actually carries. An event is a nudge, not a payload. When a plan changes, the event says "this plan changed," and the part of the app showing that plan re-reads the plan from the server through a normal request. The event names the record that changed, so the refetch is aimed at that record rather than reloading the whole view. The authoritative data always comes from a fetch against the store; the event only tells the app when to go and get it.

That choice is what keeps the live layer dependable. Because the event is a signal and not the record, a brief disconnect cannot corrupt what you see: if the app misses an event during a network blip, the next event, or the next time you touch the data, refetches the truth. There is no diverging local copy to reconcile, because the app never treated the event as the source of truth in the first place. The stream tells the app when; the store remains the what.
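A small sketch makes the recovery property concrete. RecordView and its readRecord callback are hypothetical names; the callback stands in for a permission-checked request to the store and is synchronous here only to keep the example short, where a real fetch would be asynchronous.

```typescript
// Stand-in for a permission-checked read against the authoritative store.
type ReadRecord<T> = (recordId: string) => T;

class RecordView<T> {
  private shown = new Map<string, T>();

  constructor(private readonly readRecord: ReadRecord<T>) {}

  // A signal names a record and nothing more. The new state always
  // comes from the read, never from the signal itself.
  onSignal(recordId: string): void {
    this.shown.set(recordId, this.readRecord(recordId));
  }

  get(recordId: string): T | undefined {
    return this.shown.get(recordId);
  }
}

// Usage: a missed signal leaves the view briefly stale, never wrong forever.
const store = new Map<string, string>([["plan-1", "In progress"]]);
const view = new RecordView<string>((id) => store.get(id) ?? "unknown");

view.onSignal("plan-1");          // view shows "In progress"
store.set("plan-1", "In review"); // change happens, signal is lost in a blip
store.set("plan-1", "Done");      // another change, this signal arrives
view.onSignal("plan-1");          // view jumps straight to "Done"
```

The view never applies a delta, so there is no ordering or merge logic to get wrong: any later signal for the record brings it fully current.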

A live event says what changed. The data still comes from the store.

Anatomy of one event

Follow a single change. Priya moves a plan from In progress to In review on her screen. The platform writes one event onto the stream, named for what happened and tagged with the workspace and, where it matters, the person it concerns. Sarah has that workspace's board open in another tab, and her board is subscribed to plan-changed events, so it hears this one. It does not trust the event for the new state. It does what every live surface does: it fires a fresh, fully permission-checked fetch for that one plan, gets the authoritative record back, and reorders the row.

Three things make that safe. The event named the record, so the refetch was aimed rather than a full reload. The data Sarah saw came from a checked read against the store, not from the wire, so she could only ever see what she is allowed to see. And if her connection had missed the event entirely, the next event or the next time she touched the board would have refetched the same truth. One event, one aimed fetch, no shared state to drift.

One change, traced. The event points; the data comes from a fresh, checked read of the store.

What stays live

Liveness in Disco Parrot is a property of the whole product, not a feature bolted onto one page. Most of the moving parts of the app subscribe to that one stream:

The Command Center shows the live state of every chat and flow run you have in flight, advancing as steps start and finish.
Report panels refetch when their underlying work changes, keeping the previous numbers on screen while the new ones load so the panel never flashes empty.
Dashboard panels refresh on the same signals, so a board left open on a wall display stays current.
The notifications bell updates its unseen count the moment something arrives.
Any list or detail page can subscribe to the lifecycle events for the records it shows, so a board reorders itself as work moves without anyone reloading.

None of these poll. They hold the one stream and react to the signals on it, which is why the app feels live without a tab full of refresh spinners. The exact event names a page can subscribe to are catalogued in the automation reference.

Screenshot to capture

The Command Center under Working in Disco Parrot, dark theme. Several in-flight run cards, each a chat or flow run showing a live status, a step label like 'Editing plan' advancing, a small running indicator, an elapsed timer, and the work it belongs to. One card is mid-stream with a partial agent reply visible. Surface #131316, border #27272a.

save as: public/docs-media/command-center-live-runs.png

Caption when added: The Command Center rides this stream: every run you have in flight advances on screen as its steps start and finish, with nothing to refresh.

A streaming agent turn that survives a drop

A chat turn is the one place where the wire carries the actual payload rather than a signal. When an agent works, its output (the tokens of its reply and the tool calls it makes) streams to the browser as it happens, so you watch the work rather than wait for it. That stream needs a stronger resilience model than the signal stream, because the tokens are the content and missing them would lose the reply.

So a turn's stream carries a sequence cursor: a counter that advances with each chunk the server sends. The browser remembers the last cursor it received. If the connection drops (a laptop sleeps, a tab is closed, a deploy rolls), the browser reconnects and hands back the cursor it last saw, and the server replays from exactly there, then continues live. You rejoin a turn already in progress without a gap and without a duplicate. The server holds a turn's recent output while the turn is live, so a reconnect within the turn rejoins cleanly; if the turn has already finished and been cleaned up, the browser starts fresh instead. When the turn is genuinely finished, the stream sends a single terminal marker, and the client stops trying to resume.
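The cursor mechanism can be sketched as a server-side buffer plus a client that tracks the last cursor it applied. The class names, the choice to start cursors at 1, and the in-memory array are assumptions for illustration; the page does not specify how the server stores a turn's recent output.

```typescript
interface Chunk {
  cursor: number; // advances by one with each chunk sent
  data: string;
}

// Hypothetical server-side holder for one live turn's output.
class TurnBuffer {
  private chunks: Chunk[] = [];
  private nextCursor = 1;
  private finished = false;

  append(data: string): Chunk {
    if (this.finished) throw new Error("turn already finished");
    const chunk: Chunk = { cursor: this.nextCursor++, data };
    this.chunks.push(chunk);
    return chunk;
  }

  finish(): void {
    this.finished = true;
  }

  // Everything the client has not seen, in order. lastSeen = 0 replays all.
  replayAfter(lastSeen: number): { chunks: Chunk[]; terminal: boolean } {
    return {
      chunks: this.chunks.filter((c) => c.cursor > lastSeen),
      terminal: this.finished,
    };
  }
}

// Hypothetical client: applies each chunk once, remembers where it is.
class TurnClient {
  lastCursor = 0;
  text = "";

  receive(chunk: Chunk): void {
    if (chunk.cursor <= this.lastCursor) return; // already applied
    this.lastCursor = chunk.cursor;
    this.text += chunk.data;
  }
}

// Usage: the connection drops after the first chunk and resumes later.
const turn = new TurnBuffer();
const client = new TurnClient();
client.receive(turn.append("Hel"));
turn.append("lo ");   // sent while the client was disconnected
turn.append("world"); // also missed
turn.finish();
const resumed = turn.replayAfter(client.lastCursor);
resumed.chunks.forEach((c) => client.receive(c));
// client.text is now "Hello world" and resumed.terminal is true
```

The duplicate guard in receive means a replay that overlaps what the client already holds is harmless, which keeps the reconnect path simple on both sides.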

The states a turn's stream moves through are explicit: streaming while it runs, reconnecting after a drop while it replays, and then one of three resolutions, completed, failed, or a resync that starts the view fresh when resuming is not the safe move. The chat page is where you meet this every day, the part where your run survives a blip; this is the mechanism under it.
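Those states read naturally as a small transition table. The five state names come from the description above; the signal names and the exact set of allowed transitions are assumptions, sketched here to show the shape rather than to document the client.

```typescript
type TurnState = "streaming" | "reconnecting" | "completed" | "failed" | "resync";

// Hypothetical inputs that move the stream between states.
type TurnSignal = "drop" | "replayed" | "terminal" | "error" | "cannot_resume";

const transitions: Record<TurnState, Partial<Record<TurnSignal, TurnState>>> = {
  streaming: { drop: "reconnecting", terminal: "completed", error: "failed" },
  reconnecting: {
    replayed: "streaming",   // cursor accepted, live again
    terminal: "completed",   // turn finished while we were away
    error: "failed",
    cannot_resume: "resync", // resuming is not safe: start the view fresh
  },
  // Resolutions are final: no signal moves the stream out of them.
  completed: {},
  failed: {},
  resync: {},
};

// Signals that do not apply in the current state leave it unchanged.
function step(state: TurnState, signal: TurnSignal): TurnState {
  return transitions[state][signal] ?? state;
}
```

Making the three resolutions absorbing is what lets the client stop retrying: once the state is completed, failed, or resync, a late drop event cannot restart the reconnect loop.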

A turn streams chunk by chunk. After a drop, the cursor rejoins it at the exact chunk.

Screenshot to capture

A single chat turn mid-stream under Work with agents, dark theme. The agent's reply is partway written, the last line ending mid-sentence, and one tool call rendered inline above it as a labeled step ('Read plan', 'Edit plan') with a running indicator. A subtle 'reconnecting' chip is shown in a second stacked variant of the same turn. Surface #131316, border #27272a.

save as: public/docs-media/chat-turn-streaming.png

Caption when added: An agent's turn streams in token by token, and a dropped connection rejoins the same turn from where it left off rather than starting over.

Two channels, not one

There are two real-time channels in the platform, and keeping them separate is a choice that matters. The one this page has described so far is browser-facing: server-sent events from the platform to your app. The other is operator-facing: a secure WebSocket channel, over Azure Web PubSub, between the platform and the sandbox operators that run your compute. That second channel carries the commands and replies that drive a sandbox: an agent's file reads, its command output, and its turn stream on its way to the host.

The browser never speaks the operator channel, and the operators never speak server-sent events. A sandbox's output reaches your screen by traveling the operator channel to the host, the platform's own server, and then the browser stream from the host, two hops over two different transports, with the host in the middle. The reason that split matters for a reader is in network and connectivity: the operator channel is outbound-only and reaches nothing in your network from outside. For this page, the architecture point is that the channel your browser holds and the channel your compute holds are not the same channel.

The two channels also recover differently, because they carry different things: the signal stream reconnects and moves on, while the turn stream replays from its cursor.

Two transports that meet only at the host. The browser never holds the operator channel.

Bursts are coalesced, not hammered

When a flow run is moving fast, it can fire many change signals in a short window: every step start, every record it touches. If each one triggered an immediate refetch, a busy run would hammer the data store. The live surfaces that watch a lot of records at once, the dashboards in particular, coalesce a burst into one refresh. They subscribe to the relevant event types and debounce them on a short trailing window, so a flurry of events in half a second turns into a single refresh wave rather than a dozen. The effect is that a busy dashboard stays current without turning every burst of activity into a storm of reads against the store.
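A trailing debounce of this kind fits in a few lines. The sketch below is an illustration rather than the dashboards' actual code: the coalesce function, the injectable Schedule type, and the 250 ms window in the usage lines are all assumptions. The scheduler is injectable only so the behaviour can be exercised without real timers.

```typescript
type Cancel = () => void;
type Schedule = (fn: () => void, ms: number) => Cancel;

// Default scheduler backed by the runtime's timers.
const timerSchedule: Schedule = (fn, ms) => {
  const id = setTimeout(fn, ms);
  return () => clearTimeout(id);
};

// Returns an event handler that fires `refresh` once, windowMs after the
// LAST event in a burst. Each new event restarts the window.
function coalesce(
  refresh: () => void,
  windowMs: number,
  schedule: Schedule = timerSchedule,
): () => void {
  let cancel: Cancel | undefined;
  return () => {
    if (cancel) cancel(); // a newer event supersedes the pending refresh
    cancel = schedule(() => {
      cancel = undefined;
      refresh();
    }, windowMs);
  };
}

// Usage with a fake scheduler: twelve events, one refresh.
const queued: Array<() => void> = [];
const fakeSchedule: Schedule = (fn) => {
  queued.push(fn);
  return () => {
    const i = queued.indexOf(fn);
    if (i >= 0) queued.splice(i, 1);
  };
};

let refreshCount = 0;
const onEvent = coalesce(() => { refreshCount++; }, 250, fakeSchedule);
for (let i = 0; i < 12; i++) onEvent();
const pendingAfterBurst = queued.length; // only the last timer survives
queued.splice(0).forEach((fn) => fn());  // the window elapses
```

A trailing window trades a little latency for far fewer reads: the dashboard refreshes shortly after the burst ends rather than at its first event, which also means the single refetch sees the final state.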

What is live and what is a daily snapshot

Not every number on a dashboard is meant to move by the second, and the platform is precise about which is which. The panels that show current state, a sprint's health, estimate accuracy and health, an explore query, a scorecard, are live and refresh on the signals above. The panels that show a trend over time, a burndown, a velocity line, a team-trends or comparison chart, a time-window report, read from history that is captured once a day, and they are labeled "updated daily" so the distinction is visible rather than implied.
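One way to keep that distinction explicit in code is to make the refresh mode a property of the panel kind rather than a per-panel decision. The identifiers below are hypothetical names derived from the panel types listed above, not the product's actual panel keys.

```typescript
// Hypothetical identifiers for the panel kinds named in the text.
type PanelKind =
  | "sprint_health" | "estimate_accuracy" | "explore_query" | "scorecard"
  | "burndown" | "velocity" | "team_trends" | "comparison" | "time_window_report";

type RefreshMode = "live" | "daily";

// Trend panels read once-a-day history; everything else follows the signals.
const DAILY_PANELS: ReadonlySet<PanelKind> = new Set<PanelKind>([
  "burndown", "velocity", "team_trends", "comparison", "time_window_report",
]);

function refreshMode(kind: PanelKind): RefreshMode {
  return DAILY_PANELS.has(kind) ? "daily" : "live";
}

// The visible label follows from the mode, so it cannot drift from behaviour.
function panelLabel(kind: PanelKind): string | undefined {
  return refreshMode(kind) === "daily" ? "updated daily" : undefined;
}
```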

This is a design decision, not a shortfall. A burndown is a record of how a sprint went day by day; recomputing it every few seconds would cost more and mean less, because the line for yesterday does not change when a plan moves this afternoon. The live panels track the present, the snapshot panels track the trend, and the label tells you which lens you are looking through.

Screenshot to capture

A dashboard under Reports, dark theme, with two panels side by side. The left panel is a live sprint-health scorecard showing current numbers. The right panel is a burndown line chart with a small 'Updated daily' label in its header. Surface #131316, border #27272a.

save as: public/docs-media/dashboard-live-and-snapshot.png

Caption when added: Two panels, two lenses: the live scorecard tracks the present, the burndown reads from history captured once a day and says so.

When the network does not cooperate

The live layer is built to absorb the ordinary failures of a real network, and collecting what each one looks like in one place is the fastest way to trust it. A single dropped signal costs at most a moment of staleness: the browser reopens the stream, and the next event or the next read brings the view current, because no signal carried state a fetch cannot recover. A laptop that sleeps mid-turn loses nothing of an agent's output: the turn's cursor lets the browser rejoin exactly where it left off when it wakes. A deploy rolling under an open stream is the same case as a drop: the browser reconnects to the new instance and continues. A burst of changes does not become a storm of reads, because the busy live surfaces coalesce a flurry into a single refresh.

The pattern behind all four is the one this page keeps returning to: the high-volume traffic is signals a fetch can always recover, and the one stream that carries real state, an agent's turn, carries a cursor so it can be replayed. There is no failure in this list where the client and the server are left disagreeing about the world, because the store is the authority and the live layer only ever points at it.

Every ordinary network failure, and why the view stays correct through it.

Watching a run while a teammate works

When Sarah kicks off a flow on the CSV Export initiative and opens the Command Center, the run advances on her screen as each step starts and finishes, because the Command Center is reacting to the signal stream. While she watches, Priya moves two plans to In review in another tab, and Sarah's board reorders itself a moment later, because her board subscribed to the same plan-changed signals and refetched. Sarah opens the agent's chat to read its reasoning, and the tokens stream in live; when her wifi drops in the elevator and comes back, the turn picks up from where it was rather than starting over. None of this asked her to refresh, and none of it depended on a single connection staying perfect, because each piece recovers by the route its own traffic needs.

Why the real-time layer works this way

The temptation in a live app is to push everything over one fat socket and treat the wire as the source of truth. That is the design that breaks in the field, because the day a connection drops, the client and the server disagree about the world and nobody can say who is right. Disco Parrot avoids that by drawing a line between signals and payloads. The high-volume, low-stakes traffic, "something changed, go look", rides a one-way stream where a lost message costs nothing, because the store is still the truth. The low-volume, high-stakes traffic, an agent's actual output, rides a stream with a cursor so not a token is lost. And the channel your browser holds is kept entirely separate from the channel your compute holds, so the liveness you see in the app never depends on opening a path into your network. The whole thing stays correct when the network does not cooperate, because each channel uses only the recovery it actually needs.

For a developer, the model is two channels with two contracts. The unified server-sent-events stream at /__/events carries scoped, named change signals that your hooks turn into refetches against the authoritative store. The agent turn stream carries the payload and is resumable: it advances a sequence cursor, replays from the cursor on reconnect, and ends on a terminal marker. Treat the signal stream as a cache-invalidation bus, not a data feed.

For a platform engineer, the operational notes are that the app holds one stream rather than several, that bursty live surfaces debounce on a short trailing window so a fast run does not turn into a read storm against the store, and that trend panels read once-a-day snapshots rather than recomputing on every signal. The browser channel and the operator channel are separate transports with separate recovery models.

For an enterprise reviewer, the separation is the point to check: the channel your browser holds is server-sent events from the platform, and the channel that drives your compute is a different, outbound-only transport your browser never touches. The live stream is authenticated and scoped like any other endpoint, and each event is filtered to the workspace and person allowed to see it before it is sent.

For a prospect, the experience is the headline: the app is live without a refresh button, an agent's work streams in as it happens, and a long run survives a closed laptop and a lost connection. That liveness holds when the network does not, which is what separates an app that appears live from one you can leave running on a wall display all day.

Command Center

The live panel that rides this stream.

Chat

The agent turn whose stream survives a drop.

The agent runtime and tool surface

What produces the turn that streams to you.

Network and connectivity

The operator channel and why nothing reaches in.