Network and connectivity

How a bring-your-own host connects to Disco Parrot without opening a single inbound port. The operator dials out over a managed channel. Only small control messages and agent events cross it; your code and secrets never do. Tokens expire in minutes, or in about an hour for the operator's own connection, and work already running on your compute survives a dropped connection.

When you run a bring-your-own host (a Kubernetes cluster or a Docker engine you control), the obvious question a security team asks is the right one: what did we have to expose to let it talk to the platform? This page is the answer, in full. The short version is that you expose nothing inbound. The longer version covers the channel, what travels it, the tokens that secure it, and how it recovers.

For where a sandbox runs, read sandbox hosts and deployment options; for what an agent is allowed to do once it is running, read approved actions. For the governance lens on all of this (what you can attest to a reviewer about running compute inside your own boundary), read network boundary and customer-hosted compute. This page is the wire between your host and the platform.

The connection runs outward, never inward

The operator you install does not listen for connections; it makes one. On startup it opens an outbound connection to a managed messaging channel (Azure Web PubSub) and holds it open. The platform sends commands over the channel the operator opened, and the operator answers back over the same channel. There is no service to expose, no ingress to route, no port to forward, and nothing on your side that accepts an incoming connection.

This is why a bring-your-own host is safe to run behind a firewall with no exceptions for inbound traffic: it behaves like any other outbound client your network already allows, the way a CI runner, a monitoring agent, or a backup tool does. The operator bundle you apply has no Service, no Ingress, and no listening port anywhere in it, on Kubernetes or on Docker. The only direction that ever carries traffic is out.

What crosses the channel, and what never does

The channel carries small messages, not your work. Two kinds of thing travel it: lifecycle commands (launch this sandbox, tear that one down, report your health) and agent events (the stream of an agent's progress as it works, so you can watch a run live). Both are compact control messages.

What does not travel the channel is the part that matters most.

Your code and your files stay inside the sandbox. A checkout, a build, a file an agent reads or writes: all of it happens on the compute inside your boundary. It is never shipped over the messaging channel to the platform and back.
Secrets never cross the channel in the clear. A credential an agent needs for a real action (a token to push a branch, a key for an external tool) is leased at the moment the step needs it and handed straight to the process that uses it, through a credential helper or a temporary file. It is never embedded in a message on the wire. The lease is scoped to one operation and expires in minutes. Approved actions covers the credential model in full.

So the channel is a control plane for orchestration, not a pipe your source code or your secrets flow through. That distinction is the heart of the security story: the platform tells your host what to do and watches it happen, while the work, and the sensitive material the work touches, stays on your side.
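The lease behaviour described above (scoped to one operation, expiring in minutes) can be pictured with a small sketch. The Lease record, the grant function, and the ten-minute lifetime used here are hypothetical names and values chosen for illustration; they are not the platform's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Lease:
    """Hypothetical lease record: one scope, one operation, short expiry."""
    scope: str
    expires_at: datetime
    consumed: bool = False

    def use(self, scope: str, now: datetime) -> bool:
        """Allow exactly one use, only for the leased scope, only before expiry."""
        if self.consumed or scope != self.scope or now >= self.expires_at:
            return False
        self.consumed = True
        return True


def grant(scope: str, ttl: timedelta, now: datetime) -> Lease:
    # Minted at the moment the step needs it, not ahead of time.
    return Lease(scope=scope, expires_at=now + ttl)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
lease = grant("git-push", timedelta(minutes=10), now)
first_use = lease.use("git-push", now + timedelta(minutes=1))   # accepted
second_use = lease.use("git-push", now + timedelta(minutes=2))  # already consumed
```

The point of the shape is that a lease that leaks after its step has run is already spent, and one that leaks before it is used stops working within minutes.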

Commands, results, and the sidecar

Every message is a round trip you can follow. A command carries a correlation id and a reply address; the operator does the work and publishes its result back tagged with the same id, so a response always maps to its request. The operator handles the lifecycle of sandboxes: creating one, tearing one down, inspecting or listing them, health-checking, and repairing a connection after a reconnect. The per-run work inside a sandbox is handled by a small sidecar that runs alongside the agent: it receives the commands for that run, executes them locally, and publishes the results back. The sidecar's commands are the work itself: reading and writing files, running a process, driving the agent turn by turn, and, when someone opens a sandbox in an editor, carrying the IDE session and any SSH tunnel.
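A minimal sketch of the correlation pattern, assuming hypothetical Command and Result shapes (the field names are illustrative and not taken from the platform's wire format). It shows only how a result is matched back to its request by id.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Command:
    action: str
    reply_to: str  # reply address the result is published to
    correlation_id: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class Result:
    correlation_id: str
    ok: bool
    payload: dict


class PendingCommands:
    """Sender side: tracks commands awaiting a result, keyed by correlation id."""

    def __init__(self) -> None:
        self._pending = {}

    def send(self, command: Command) -> None:
        self._pending[command.correlation_id] = command

    def resolve(self, result: Result) -> Optional[Command]:
        # A result whose id matches no outstanding request resolves to None.
        return self._pending.pop(result.correlation_id, None)


def handle(command: Command) -> Result:
    """Receiver side: do the work, then echo the same id on the result."""
    return Result(command.correlation_id, ok=True, payload={"action": command.action})
```

Because the id is echoed rather than inferred from ordering, results can arrive out of order or after a reconnect and still land on the right request.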

That last point is worth stating plainly for a review: an interactive IDE or SSH session into a sandbox rides this same outbound channel. It does not open a port on your side or reach inward; it tunnels over the connection the operator already holds, framed and chunked like any other message. A large result is split into chunks and reassembled rather than streamed wholesale, which is why even a big file read stays a series of small messages on the wire.
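The split-and-reassemble idea can be sketched as follows. The Chunk fields and the chunk size are assumptions made for illustration; the actual framing on the channel is not described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    correlation_id: str  # ties every chunk to the command it answers
    index: int           # position of this chunk in the result
    total: int           # number of chunks in the whole result
    data: bytes


def split(correlation_id: str, payload: bytes, chunk_size: int) -> list:
    """Cut a large result into chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    pieces = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)] or [b""]
    return [Chunk(correlation_id, i, len(pieces), piece) for i, piece in enumerate(pieces)]


def reassemble(chunks: list) -> bytes:
    """Rebuild the payload; chunks may arrive in any order."""
    if not chunks:
        raise ValueError("no chunks")
    total = chunks[0].total
    by_index = {chunk.index: chunk.data for chunk in chunks}
    if len(chunks) != total or set(by_index) != set(range(total)):
        raise ValueError("missing or duplicate chunks")
    return b"".join(by_index[i] for i in range(total))
```

Carrying the index and total on every chunk lets the receiver detect a gap instead of silently joining an incomplete result.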

The sidecar is deliberately limited on the channel. It is granted no ability to read another sandbox's traffic, so even the component sitting closest to the agent can see only its own run. The hub it connects to is closed and platform-owned, and a sidecar's place on it is to speak for its own sandbox and read nothing else.

Inside the closed, platform-owned hub, traffic is partitioned into groups: a global registration group, a reply channel per host, a command channel per sandbox, and a group per open session. Each speaker reaches only its own group; a sidecar cannot read another sandbox's channel.
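One way to picture the partitioning is as a naming scheme plus a read check. The group name formats below are invented for this sketch; only the four kinds of group and the rule that a sidecar reads nothing but its own sandbox's channel come from the description above.

```python
def registration_group() -> str:
    return "registration"                      # the single global registration group


def host_reply_group(host_id: str) -> str:
    return f"host.{host_id}.reply"             # one reply channel per host


def sandbox_command_group(sandbox_id: str) -> str:
    return f"sandbox.{sandbox_id}.commands"    # one command channel per sandbox


def session_group(session_id: str) -> str:
    return f"session.{session_id}"             # one group per open session


def sidecar_can_read(sidecar_sandbox_id: str, group: str) -> bool:
    """A sidecar may read only the command channel of its own sandbox."""
    return group == sandbox_command_group(sidecar_sandbox_id)
```

For example, a sidecar for sandbox "sb-1" passes the check for its own command group and fails it for the group of "sb-2" or for any host reply channel.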

The tokens that secure it

The connection is held together by tokens that are short-lived by design and refreshed before they lapse, so a leaked one is worth little and a stale one just stops working.

The tokens get shorter the closer they sit to the agent's actual work. The connection runs on hour-long tokens that renew; the credentials for real actions are leased for minutes and gone when the step ends.

The operator's connection token lasts about an hour. The operator trades its one-time install key for a short-lived token to the messaging channel, and renews it on its own before it expires. The install key itself is single-use and only ever appears in the bundle you download; the platform stores only a hash of it.
Each sandbox's sidecar gets its own short-lived token, on the order of a few minutes, and the platform refreshes it about two minutes before it would lapse, so a long-running task never falls off the channel mid-flight. The token is scoped so the sidecar can read nothing but its own run.
The credentials an agent uses for actual work are leased per operation and measured in minutes, not hours: a token to push to Git lives about ten minutes, and a token handed to an external tool can be as short as one. They are minted when the step runs and gone when it finishes.

Nothing in this chain is a long-lived secret sitting on your host or in a sandbox. The connection runs on tokens that age out fast, and the credentials for work age out faster.
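The refresh timing for a sidecar token can be sketched as a small calculation. The two-minute margin comes from the description above; the five-minute lifetime in the usage note is an assumed value standing in for "a few minutes", and the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

REFRESH_MARGIN = timedelta(minutes=2)  # renew about two minutes before expiry


def refresh_at(issued_at: datetime, lifetime: timedelta,
               margin: timedelta = REFRESH_MARGIN) -> datetime:
    """Time to renew: one margin before expiry, never earlier than issue time."""
    expires_at = issued_at + lifetime
    return max(issued_at, expires_at - margin)


def needs_refresh(now: datetime, issued_at: datetime, lifetime: timedelta,
                  margin: timedelta = REFRESH_MARGIN) -> bool:
    return now >= refresh_at(issued_at, lifetime, margin)
```

With an assumed five-minute lifetime, a token issued at 12:00 is due for renewal at 12:03, which leaves two minutes of overlap so a long-running task always holds a valid token.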

How the operator gets on the channel

The handshake that puts an operator on the channel is built to give nothing away. The operator presents its install key (an opk_ token the platform keeps only as a hash) to a single platform endpoint, and gets back a short-lived connection token and the address of the channel to join. The platform checks the key against the stored hash in constant time, and every failure (an unknown host, a wrong key, a retired key) returns the same flat rejection, so a probe learns nothing from trying. The endpoint is rate-limited to a handful of attempts a minute per key, refuses a key that a later download has retired, and hands back nothing usable until your workspace is actually set up for bring-your-own hosts. There is no other way onto the channel.
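A minimal sketch of the server-side check, assuming SHA-256 as the hash and an in-memory store; the real hashing scheme, storage, and response format are not specified here, and rate limiting is left out. It shows two properties from the description: a constant-time comparison and one identical rejection for every failure.

```python
import hashlib
import hmac

REJECTED = {"status": 401, "error": "rejected"}  # identical for every failure
DUMMY_HASH = "0" * 64  # compared against when the host is unknown


def hash_key(install_key: str) -> str:
    return hashlib.sha256(install_key.encode()).hexdigest()


class KeyStore:
    """Hypothetical store that keeps only a hash of each host's current key."""

    def __init__(self) -> None:
        self._hashes = {}  # host id -> hash of the current install key

    def issue(self, host_id: str, install_key: str) -> None:
        # Issuing a new bundle replaces the hash, which retires the old key.
        self._hashes[host_id] = hash_key(install_key)

    def exchange(self, host_id: str, presented_key: str) -> dict:
        stored = self._hashes.get(host_id)
        expected = stored if stored is not None else DUMMY_HASH
        # compare_digest takes the same time wherever the first mismatch is.
        matches = hmac.compare_digest(hash_key(presented_key), expected)
        if stored is None or not matches:
            return dict(REJECTED)
        return {"status": 200, "token": "<short-lived token>", "channel": "<channel address>"}
```

Running the comparison even for an unknown host keeps the unknown-host path from returning faster than the wrong-key path, so timing gives a probe no more than the response body does.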

When the connection drops

Networks blip, nodes get drained, a laptop sleeps. The model is built so that a dropped connection is a pause, not a loss.

When the channel drops, the operator reconnects on its own, backing off between retries instead of hammering the endpoint. The retry delay starts at a fraction of a second and grows to a few seconds; after about a minute without success, a stuck request fails cleanly instead of hanging. A live agent run that was streaming picks up where it left off: the platform asks the sidecar to replay from the last event it received, so you do not lose the tail of a run to a reconnect.
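The retry shape can be sketched as a capped exponential backoff with an overall time budget. The specific numbers (a 0.25 second start, a 4 second cap, a 60 second budget) are assumptions chosen to match "a fraction of a second", "a few", and "about a minute"; the operator's real values and any jitter it applies are not stated here.

```python
def backoff_delays(initial: float = 0.25, factor: float = 2.0,
                   cap: float = 4.0, budget: float = 60.0):
    """Yield waits that grow from initial up to cap until budget seconds are spent."""
    elapsed = 0.0
    delay = initial
    while elapsed < budget:
        wait = min(delay, cap, budget - elapsed)  # never overshoot the budget
        yield wait
        elapsed += wait
        delay = min(delay * factor, cap)


def reconnect(connect, sleep, delays) -> bool:
    """Try to connect, waiting between attempts; give up once delays run out."""
    for wait in delays:
        if connect():
            return True
        sleep(wait)
    return False  # fail cleanly instead of hanging


# The first few waits under the assumed defaults.
first_waits = list(backoff_delays())[:5]
```

Passing connect and sleep in as callables keeps the loop testable without a network or a real clock.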

The piece that decides whether you can run this behind a firewall is what happens to work already running. The sandboxes are on your compute (a pod in your cluster, a container on your engine), so when the operator's connection drops, that work keeps running. It does not get killed because the operator stepped away. The host shows the connection as dropped, and when the operator comes back it reconciles with the sandboxes that were running: it confirms each one is still alive, re-points its sidecar at a fresh token, and carries on. A sandbox whose run had already finished while the connection was down is cleaned up rather than left lingering. A brief outage costs you a brief wait, not the work in flight.
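The reconciliation step after a reconnect can be sketched as a pass over the sandboxes the operator was tracking. The State values, the Action record, and the observe callback are hypothetical; the sketch covers only the two outcomes described above.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    RUNNING = "running"
    FINISHED = "finished"


@dataclass(frozen=True)
class Action:
    sandbox_id: str
    kind: str  # "refresh_token" or "cleanup"


def reconcile(known_sandboxes, observe) -> list:
    """Decide what to do with each sandbox tracked before the drop.

    observe(sandbox_id) returns the sandbox's current State on the host.
    """
    actions = []
    for sandbox_id in known_sandboxes:
        if observe(sandbox_id) is State.RUNNING:
            # Still alive: re-point its sidecar at a fresh token and carry on.
            actions.append(Action(sandbox_id, "refresh_token"))
        else:
            # Its run finished while the connection was down: clean it up.
            actions.append(Action(sandbox_id, "cleanup"))
    return actions
```

Nothing in the pass restarts or kills a running sandbox, which is the property that makes a dropped connection a pause.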

The egress to allow

Because everything is outbound, a security team can allowlist exactly what a host needs and nothing more. From inside your boundary, the operator reaches three destinations:

the platform endpoint where it exchanges its key for a connection token,
the managed messaging channel (Azure Web PubSub) it holds its connection open over,
the container registry it pulls the sandbox image from.

Beyond that, a sandbox reaches only what its own work and its leased credentials allow: your Git provider, a package registry, the specific cloud resource a workload identity grants it. There is no inbound rule to write, on any host kind. You are allowing a short, known list of outbound destinations, the same shape of rule you already write for a build agent.
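As an illustration of how short the operator's rule set is, the sketch below checks a destination against a three-entry allowlist. Every hostname is a placeholder: substitute the platform endpoint, Web PubSub resource, and registry that apply to your workspace. Restricting the scheme to https and wss is an assumption of this sketch, not a stated requirement.

```python
from urllib.parse import urlsplit

# Placeholder hostnames, one per destination the operator reaches.
OPERATOR_EGRESS = {
    "platform.example.com",         # endpoint for the key-for-token exchange
    "example.webpubsub.azure.com",  # managed messaging channel
    "registry.example.com",         # container registry for the sandbox image
}


def is_allowed(url: str, allowlist=OPERATOR_EGRESS) -> bool:
    """True when the URL uses an encrypted scheme and targets an allowlisted host."""
    parts = urlsplit(url)
    return parts.scheme in {"https", "wss"} and parts.hostname in allowlist
```

The same three entries translate directly into firewall or network-policy egress rules; there is no matching inbound list to maintain.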

Everything is on the record

The connection is not only bounded, it is observed. Issuing an operator's bundle, the act that mints its key, is recorded as a security event with a fingerprint of the key, never the key itself. Every credential an agent is leased moves through the audit trail as it is granted, consumed, and expired, and every lease a policy denies is recorded too, with the secret itself redacted in every case. You can answer, after the fact, which host was issued what, what each run was granted, and what it reached, from the record rather than from memory.
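A sketch of what "a fingerprint, never the key" and "the secret redacted" can look like in an audit record. The event names, the field layout, and the choice of a truncated SHA-256 digest as the fingerprint are illustrative assumptions rather than the platform's documented schema.

```python
import hashlib


def key_fingerprint(install_key: str, length: int = 8) -> str:
    """Short, non-reversible identifier for a key (truncated SHA-256, assumed)."""
    return hashlib.sha256(install_key.encode()).hexdigest()[:length]


def bundle_issued_event(host: str, install_key: str) -> dict:
    return {
        "event": "host.operator_bundle",
        "host": host,
        "detail": f"bundle issued, key fingerprint {key_fingerprint(install_key)}",
    }


def lease_event(kind: str, scope: str, secret: str) -> dict:
    # The secret is accepted so callers cannot forget it exists,
    # but it is never written to the record.
    return {"event": f"credential-lease.{kind}", "scope": scope, "secret": "redacted"}
```

A reviewer can match a fingerprint in the trail against a fingerprint computed from a key they hold, without the trail ever containing the key.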

None of this has to be taken on faith. A security owner can confirm the shape of it directly:

The operator exposes no service and no listening port. On a Kubernetes host, kubectl get svc,ingress -n disco-parrot reports that no resources were found.
The operator manifest you applied contains no Service, no Ingress, and no inbound port, so there is nothing to scan.
The egress your network allows for the host is the three outbound destinations above, with no inbound rule anywhere.
The audit trail, filtered to host connections and credential leases, shows every grant with its scope and expiry and the secret redacted.

Each of those is a fact you can produce, which is the whole point of the design.
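The manifest check in the list above can also be scripted. This is a rough text scan, assuming a plain multi-document YAML manifest; it is a sketch for a reviewer, not a substitute for a real YAML parser, and the function name is invented for illustration.

```python
import re

FORBIDDEN_KINDS = {"Service", "Ingress"}
# Top-level "kind:" lines, one per YAML document in the manifest.
KIND_RE = re.compile(r"^kind:[ \t]*(\S+)[ \t]*$", re.MULTILINE)
# Port declarations that would indicate something listening.
PORT_RE = re.compile(r"^[ \t]*-?[ \t]*(containerPort|hostPort):", re.MULTILINE)


def inbound_findings(manifest_text: str) -> list:
    """List anything in the manifest that would expose an inbound surface."""
    findings = []
    for kind in KIND_RE.findall(manifest_text):
        if kind.strip("\"'") in FORBIDDEN_KINDS:
            findings.append(f"kind {kind}")
    for field_name in PORT_RE.findall(manifest_text):
        findings.append(f"field {field_name}")
    return findings
```

An empty list from inbound_findings on the operator manifest is the scripted form of "there is nothing to scan".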

Figure: Audit trail view filtered to host and credential events. Bundle issuance and every leased credential land in the audit trail, the host event carrying a fingerprint of the key rather than the key, each lease its scope and expiry, and the secret itself redacted, so a reviewer reads what happened off the record itself.

A security review, end to end

Marcus has run the Insights team's Kubernetes host for a month when the annual security review comes around, and the reviewer's checklist is the usual one: inbound exposure, data in transit, secret handling, blast radius.

Inbound exposure is the quick one. The operator manifest has no Service and no Ingress; the operator dials out. There is nothing to scan because there is nothing listening. Data in transit is next, and the answer reframes the question: the Insights source never leaves the cluster, because the work happens in the pod and only control messages cross the channel. Secret handling is the one the reviewer expected to be hard, and it is the easiest: there are no standing secrets in the sandboxes to handle, the connection runs on hour-long tokens that refresh, and the credentials for actual work are leased for minutes and audited one by one. Blast radius is the operator's RBAC, scoped to a single namespace, plus an egress list of three outbound destinations.

The review that the team braced for takes an afternoon, because every answer is something Marcus can point at in the manifest, the audit trail, or this page, rather than something he has to argue. That is the whole intent of the design: to make "we run agents safely inside our boundary" a set of facts, not a promise.

Why the connectivity model works this way

The tempting way to connect a customer's compute to a platform is to have the platform reach in: open a port, expose an endpoint, hold a tunnel. It is simpler to build and it is exactly what a security team cannot allow, because every inbound door is a door someone else might walk through. So the model inverts it. Your host reaches out, the way everything else in your network that talks to the outside world already does, and the platform never holds a connection into your boundary.

From that one decision the rest follows. The channel carries orchestration, not your code or your secrets, because the work belongs on your compute and the sensitive material belongs with the process that uses it, not on a wire. The tokens are short because a credential that ages out in an hour, or in a minute, is a small thing to lose. The recovery is forgiving because work running on your own compute should not die when a connection blips. And all of it is audited because, for the people who answer for the boundary, safe has to be provable, not asserted. The point is to make running agents on your own infrastructure a decision you can defend in a review, line by line.

For a security owner, this is the page you bring to the review. No inbound exposure, code and secrets that never cross the wire, short-lived tokens, a three-destination egress list, and an audit trail for all of it. Every claim is one you can verify.

For a platform engineer, the operator is a well-behaved outbound client: it connects out, reconnects on its own, survives a node blip without taking running work with it, and reaches a short, allowlistable set of destinations. It runs like the other agents already on your network.

For a lead, the connectivity model is what lets you say yes to running agents inside your boundary without a months-long security argument. The hard questions have concrete answers, so the decision is about whether you want the host, not whether it is safe.

For a planner, none of this surfaces in your day. You hand off work and watch it happen; the channel, the tokens, and the recovery are the machinery underneath that keeps it both live and safe.

Set up a BYO Kubernetes host

Run the operator in your cluster, with the egress and identity this page describes.

Set up a Local Docker host

The same outbound model on a Docker engine you control.

Approved actions

The leased-credential model behind "secrets never cross the channel."